Redis Explained

How Does a Single Thread Handle 100K Ops/Sec?

Here is a question that trips up even experienced engineers: how does Redis, running on a single thread, benchmark at over 100,000 operations per second on a single CPU core? That sounds impossible. Modern web servers use thread pools, async workers, and all sorts of concurrency tricks. Redis uses... one thread.

The answer is counterintuitive: single-threading is the optimization, not the limitation. Think about what happens in a multithreaded data store. Every shared data structure needs a lock. Every lock means potential contention. And every context switch between threads evicts warm L1/L2 cache lines and TLB entries, forcing the CPU to reload state before resuming. On a modern CPU, a context switch costs roughly 1-5 microseconds. That does not sound like much, but when you are trying to serve a request in 100 microseconds total, burning 5 of them on a context switch is a 5% tax on every operation.

Redis sidesteps all of this. No locks, no mutexes, no atomic compare-and-swap operations on shared data. Every command executes sequentially, which means every operation is inherently atomic. When you run INCR pageviews, there is zero chance of a race condition. The CPU cache stays hot because the same thread is accessing the same data structures over and over. The branch predictor gets comfortable. The prefetcher knows what is coming next.
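The sequential-execution guarantee is easy to see in a toy model. Here is a minimal sketch (plain Python, not Redis source) of a single-threaded command executor: because commands run strictly one at a time, INCR's read-modify-write can never be interleaved with another command, so it is atomic without any locks.

```python
# Toy single-threaded command executor. One thread, one dict, commands
# processed strictly in sequence -- so INCR is atomic for free.
store = {}

def execute(cmd, *args):
    if cmd == "SET":
        key, val = args
        store[key] = val
        return "OK"
    if cmd == "INCR":
        key = args[0]
        # Read-modify-write: nothing can interleave with this on one thread.
        store[key] = int(store.get(key, 0)) + 1
        return store[key]
    if cmd == "GET":
        return store.get(args[0])
    raise ValueError(f"unknown command {cmd}")

for _ in range(3):
    execute("INCR", "pageviews")
```

In a multithreaded store, that INCR would need a lock or an atomic compare-and-swap; here the single-threaded design makes the question moot.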

Redis benchmarks at 100,000+ operations per second on a single core. A GET operation completes in approximately 0.1 milliseconds end to end, one to two orders of magnitude faster than a typical PostgreSQL query that hits disk.

But if there is only one thread, how does Redis handle thousands of simultaneous client connections without blocking? That is where the event loop comes in.

The Event Loop: Juggling 10,000 Connections Without Breaking a Sweat

Redis uses an I/O multiplexing event loop built on top of the operating system's most efficient polling mechanism: epoll on Linux, kqueue on macOS and BSD. The idea is simple but powerful. Instead of spawning a thread per connection (the Apache model) or using async/await coroutines (the Node.js model), Redis registers all client sockets with the kernel's event notification system and says: "Wake me up when any of these sockets have data ready to read."

When data arrives on, say, 50 connections simultaneously, the kernel returns all 50 file descriptors in a single system call. Redis then processes them one by one: read the command, execute it against the in-memory data structures, write the response back to the socket buffer. Each individual command execution is blazingly fast because it is just pointer lookups and memory copies. There is no disk I/O, no network wait, no blocking of any kind.
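Python's selectors module wraps the same epoll/kqueue machinery, so the shape of this loop can be sketched in a few lines. This is a minimal illustration using a socketpair in place of real client connections; the callback and reply are invented for the example.

```python
import selectors
import socket

# DefaultSelector picks the best mechanism available: epoll on Linux,
# kqueue on macOS/BSD -- the same choice Redis makes.
sel = selectors.DefaultSelector()

def serve_once(sock):
    # One iteration of the Redis pattern: read, execute, write response.
    data = sock.recv(4096)
    if data:
        sock.send(b"+PONG\r\n")  # respond inline; no thread handoff

# A socketpair stands in for a client connection accepted by the server.
client, server_side = socket.socketpair()
client.setblocking(False)
server_side.setblocking(False)
sel.register(server_side, selectors.EVENT_READ, serve_once)

client.send(b"PING\r\n")

# The event loop body: "which sockets are ready?" then handle each one.
for key, _events in sel.select(timeout=1):
    key.data(key.fileobj)

reply = client.recv(4096)
```

In real Redis the loop runs forever, the ready list may contain thousands of file descriptors per iteration, and the handler parses RESP and executes commands, but the control flow is exactly this.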


Figure 1: The Redis event loop. Multiple client connections are multiplexed through epoll/kqueue. A single thread reads commands, executes them against in-memory data structures, and writes responses. The entire cycle for one command takes roughly 0.1ms.

This architecture has a fascinating implication: the bottleneck is rarely the CPU. Redis typically saturates your network bandwidth long before it saturates a CPU core. A 10 Gbps NIC becomes the limiting factor, not compute. That is why Redis 6.0 introduced I/O threading, but only for reading from sockets and writing responses. The actual command execution still happens on a single thread, preserving the lock-free guarantee.

What Happens in Those 0.1 Milliseconds?

Let us trace exactly what happens when you run GET user:42:name. The entire journey takes about 100 microseconds, and most of that is network latency, not Redis compute.

Your client serializes the command into the RESP (Redis Serialization Protocol) wire format: *2\r\n$3\r\nGET\r\n$12\r\nuser:42:name\r\n. The bytes travel over the TCP socket to Redis. The event loop picks it up, parses the RESP tokens, and identifies the command as GET with one argument.
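The serialization step is simple enough to write out. Here is a sketch of a RESP encoder (a hypothetical helper, not taken from any real client library): a command becomes an array header followed by one length-prefixed bulk string per argument.

```python
def encode_resp(*args):
    """Encode a command as a RESP array of bulk strings.

    *N\r\n  -- array of N elements
    $L\r\n  -- bulk string of L bytes, followed by the bytes and \r\n
    """
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)
```

Note the length prefixes: because every string declares its byte count up front, the parser never scans for delimiters inside values, which is part of why RESP parsing is so cheap.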

Now the actual lookup. Redis stores all key-value pairs in a global hash table (a dict in the source code). The key user:42:name gets hashed using SipHash (chosen for its resistance to hash-flooding attacks). The hash maps to a bucket, and Redis walks the bucket's chain of dictEntry structs, comparing keys using a simple memcmp. In the common case with a good hash function and a load factor under 1.0, this is a single pointer dereference. The value is found, serialized back into RESP format, and written to the client's output buffer.

The actual CPU work is measured in nanoseconds. The remaining microseconds are network round-trip time. This is why Redis benchmarks look dramatically different with Unix sockets versus TCP: you strip away the network overhead and see the raw speed of in-memory operations.


Figure 2: How Redis stores a key-value pair in memory. The global hash table (dict) maps to buckets of dictEntry structs. Each entry points to an SDS string for the key and a Redis object (robj) for the value. SDS stores the string length in a header, giving O(1) length checks and binary safety.

Why does Redis use SDS instead of plain C strings? Three reasons. First, strlen on a C string is O(n) because it scans for the null terminator. SDS stores the length in its header, making it O(1). Second, SDS is binary-safe: you can store arbitrary bytes including null bytes. Third, SDS pre-allocates extra space, so appending to a string does not require a realloc every time. These seemingly small optimizations compound when you are doing millions of operations per second.
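The length-header and pre-allocation ideas can be modeled in a few lines. This is an illustrative sketch, not the actual C layout (real SDS packs the header into a handful of bytes directly before the character data):

```python
class SDS:
    """Sketch of an SDS-like string: length lives in a header field
    (so len is O(1)), and the buffer over-allocates so appends rarely
    need a fresh allocation."""

    def __init__(self, data=b""):
        self.len = len(data)                 # O(1) length, unlike strlen()
        self.alloc = max(8, self.len * 2)    # pre-allocate slack space
        self.buf = bytearray(self.alloc)
        self.buf[:self.len] = data           # binary-safe: \0 bytes are fine

    def append(self, more):
        needed = self.len + len(more)
        if needed > self.alloc:              # grow only when slack runs out
            self.alloc = needed * 2
            new = bytearray(self.alloc)
            new[:self.len] = self.buf[:self.len]
            self.buf = new
        self.buf[self.len:needed] = more
        self.len = needed

    def value(self):
        return bytes(self.buf[:self.len])
```

Repeated appends amortize to O(1) because most of them land in the pre-allocated slack instead of triggering a reallocation, which is exactly the win the SDS design is after.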

Beyond Strings: Skip Lists, Ziplists, and Why Your Data Structure Choice Matters

When you use a Sorted Set (ZSET), Redis does not just throw everything into a hash table. Under the hood, sorted sets use a skip list combined with a hash table. The skip list gives you O(log n) range queries (like "give me the top 10 leaderboard entries"), while the hash table gives you O(1) lookups by member name. This dual structure is why ZRANGEBYSCORE and ZSCORE are both fast.
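The dual-structure idea is easy to model. This sketch substitutes Python's bisect over a sorted list for the skip list (the O(log n) ordered-search plus O(1) score-lookup shape is the same); method names mirror the Redis commands but the implementation is purely illustrative.

```python
import bisect

class MiniZSet:
    """Sorted-set sketch: a dict gives O(1) ZSCORE, a sorted list of
    (score, member) pairs gives O(log n) range queries. Redis uses a
    skip list for the ordered half; bisect plays that role here."""

    def __init__(self):
        self.scores = {}      # member -> score  (the ZSCORE path)
        self.ordered = []     # (score, member), kept sorted (the ZRANGE path)

    def zadd(self, score, member):
        if member in self.scores:
            old = (self.scores[member], member)
            self.ordered.pop(bisect.bisect_left(self.ordered, old))
        self.scores[member] = score
        bisect.insort(self.ordered, (score, member))

    def zscore(self, member):
        return self.scores.get(member)

    def zrangebyscore(self, lo, hi):
        i = bisect.bisect_left(self.ordered, (lo,))
        j = bisect.bisect_right(self.ordered, (hi, chr(0x10FFFF)))
        return [member for _score, member in self.ordered[i:j]]

z = MiniZSet()
z.zadd(3, "alice")
z.zadd(1, "bob")
z.zadd(2, "carol")
```

Maintaining both structures on every write costs a little extra memory and time, but it means neither ZSCORE nor ZRANGEBYSCORE ever has to do the other one's work.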

But here is a trick Redis plays for small collections. If a hash has fewer than 128 fields (configurable via hash-max-ziplist-entries) and all values are under 64 bytes, Redis stores it as a ziplist instead of a full hash table. A ziplist is a compact, contiguous block of memory. No pointers, no per-entry overhead. This is how Instagram stored 300 million user-to-photo-ID mappings in Redis using only 21GB of RAM. They chunked the mappings into small hashes, each small enough to trigger ziplist encoding, cutting memory usage to roughly a third of what naive key-per-mapping storage would have required.

Instagram stored 300 million user-to-photo mappings in Redis, using only 21GB of RAM. The key insight: small hashes use ziplist encoding, which is dramatically more memory-efficient than individual keys.

The Fork That Saves Your Data

If Redis keeps everything in RAM, what happens when the power goes out? Redis offers two persistence mechanisms, and understanding them requires understanding a Unix system call: fork().

RDB snapshots work like this: Redis calls fork(), creating a child process that is an exact copy of the parent. Thanks to the operating system's copy-on-write semantics, the fork duplicates page tables rather than the data itself, so it is fast even for large datasets. The child process sees a frozen snapshot of all the data and writes it to disk as a compact binary file. Meanwhile, the parent continues serving requests. When the parent modifies a memory page, the OS transparently copies that page so the child still sees the old data. Elegant, but you can lose data between snapshots.
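The copy-on-write trick works in any POSIX process. Here is a minimal demonstration with JSON standing in for the RDB binary format (POSIX-only, since Windows lacks fork): the parent mutates the data immediately after forking, yet the child's snapshot still reflects the state at fork time.

```python
import json
import os
import tempfile

# The "dataset": in Redis this would be the global hash table.
data = {"counter": 1}

fd, path = tempfile.mkstemp(suffix=".rdb")
os.close(fd)

pid = os.fork()
if pid == 0:
    # Child: its copy-on-write view of `data` is frozen at fork time.
    with open(path, "w") as f:
        json.dump(data, f)
    os._exit(0)  # child must not fall through into the parent's code

# Parent: keeps serving "writes" while the child persists the snapshot.
data["counter"] = 999

os.waitpid(pid, 0)
with open(path) as f:
    snapshot = json.load(f)
os.unlink(path)
```

The parent's mutation never reaches the child: the kernel copies the affected page on write, so the child keeps reading the pre-fork value. That is the entire RDB consistency story in miniature.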

AOF (Append-Only File) takes a different approach. Every write command is appended to a log file. On restart, Redis replays the entire log to reconstruct the dataset. The critical configuration is appendfsync:

  • always — fsync after every write. Maximum durability, but every command pays the cost of a disk sync. Throughput drops to whatever your SSD can handle (maybe 10,000-50,000 fsyncs per second).
  • everysec — fsync once per second. The recommended default. You lose at most one second of data on a crash. This is what most production deployments use.
  • no — let the OS decide when to flush. Fast, but you might lose 30 seconds of data.

Over time, the AOF file grows. Redis solves this with AOF rewrite: it forks a child process (same copy-on-write trick) that writes a new, minimal AOF file representing the current dataset state. A million INCR counter commands become a single SET counter 1000000. The parent buffers any new writes during the rewrite and appends them to the new file when the child finishes.
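A toy model makes the replay-and-rewrite relationship concrete. This is illustrative only: an in-memory list of tuples stands in for the log file, and only two commands are modeled.

```python
def replay(log):
    """Reconstruct the dataset by replaying the command log in order --
    what Redis does with the AOF on restart."""
    state = {}
    for cmd, key, *args in log:
        if cmd == "SET":
            state[key] = args[0]
        elif cmd == "INCR":
            state[key] = int(state.get(key, 0)) + 1
    return state

def rewrite(log):
    """AOF rewrite: emit the minimal command stream that reproduces the
    current state. A thousand INCRs collapse into one SET."""
    state = replay(log)
    return [("SET", key, val) for key, val in state.items()]

log = [("INCR", "counter")] * 1000
compact = rewrite(log)
```

The invariant that makes rewrite safe is visible here: replaying the compact log yields exactly the same state as replaying the full history, so the old file can be discarded.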

Most production deployments run both RDB and AOF. RDB gives you fast restarts and easy backups (just copy the file). AOF gives you minimal data loss. On restart, Redis prefers the AOF because it is more complete.

Redis Cluster: Splitting Your Data Across 16,384 Hash Slots

A single Redis instance tops out when your dataset exceeds available RAM. Enter Redis Cluster, which automatically shards your data across multiple nodes. The mechanism is surprisingly simple: Redis defines exactly 16,384 hash slots. Every key is assigned to a slot by computing CRC16(key) mod 16384. Each node in the cluster owns a subset of those slots.
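The slot computation is small enough to write out in full. Redis Cluster uses the CRC16-CCITT (XMODEM) polynomial; this sketch omits the {hash tag} rule that real Redis applies when a key contains braces, which lets related keys share a slot.

```python
def crc16(data):
    """CRC16-CCITT, XMODEM variant: polynomial 0x1021, initial value 0,
    no reflection -- the checksum Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key):
    """HASH_SLOT = CRC16(key) mod 16384."""
    return crc16(key.encode()) % 16384
```

Because the slot count is fixed at 16,384 forever, clients and nodes can compute slot ownership with no coordination: only the slot-to-node mapping changes during resharding, never the key-to-slot mapping.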

Why 16,384 specifically? Antirez (Redis's creator) has explained this. The cluster nodes exchange heartbeat packets via a gossip protocol, and each packet includes a bitmap of which slots that node owns. 16,384 bits is exactly 2KB, a reasonable size for packets exchanged multiple times per second. Double it to 65,536 slots and you are sending 8KB bitmaps, which starts to matter in large clusters.


Figure 3: Redis Cluster with three master nodes. Each owns roughly one-third of the 16,384 hash slots. When a client sends a command to the wrong node, it receives a MOVED redirect and caches the correct mapping. Each master has a replica for failover.

When a client sends a command to the wrong node, that node responds with MOVED 9947 192.168.1.2:6379, telling the client exactly which node owns that slot. Smart clients cache these mappings, so subsequent requests go directly to the right node. During resharding (moving slots between nodes), clients may receive ASK redirections, which are temporary and do not update the cache.

Live resharding is one of Redis Cluster's most impressive features. You can migrate slots between nodes while the cluster is serving traffic. Redis migrates keys one by one, briefly locking each key during transfer. Twitter processes 600+ million tweets per day with Redis as a core caching layer, and resharding operations happen without downtime.

Why Instagram Chose Redis Over Memcached

This is a story worth knowing because it illustrates how understanding Redis internals leads to massive wins. Instagram needed to map 300 million photo IDs to the user IDs who posted them. The naive approach would be one key per mapping: SET photo:123 user:42. At roughly 90 bytes of overhead per key (the dict entry, two SDS strings, the robj wrapper), that is 27GB just for overhead on 300 million keys, plus the actual data.

Instead, Instagram bucketed the mappings into small hashes. They computed photo_id / 1000 to get a bucket key, then stored each photo-to-user mapping as a field in that hash: HSET media:123 123456 789 (photo 123456 lands in bucket 123456 / 1000 = 123, and 789 is the user ID). Each hash held at most 1,000 entries, which kept it under the hash-max-ziplist-entries threshold (raised from its default of 128 for this purpose). The result: ziplist encoding for every hash, and the entire 300 million mapping set fit in 21GB instead of the estimated 60+ GB with individual keys.
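The bucketing scheme is a few lines of arithmetic. Here is a sketch with a plain dict standing in for Redis hashes; the key names follow the article's example but are illustrative, not Instagram's actual schema.

```python
BUCKET_SIZE = 1000  # max fields per hash, kept under the ziplist threshold

store = {}  # bucket key -> {photo_id: user_id}; stands in for Redis hashes

def bucket_key(photo_id):
    """photo_id // 1000 picks the bucket: photos 123000-123999 -> media:123."""
    return "media:%d" % (photo_id // BUCKET_SIZE)

def hset_mapping(photo_id, user_id):
    # Equivalent of: HSET media:<bucket> <photo_id> <user_id>
    store.setdefault(bucket_key(photo_id), {})[photo_id] = user_id

def hget_mapping(photo_id):
    # Equivalent of: HGET media:<bucket> <photo_id>
    return store.get(bucket_key(photo_id), {}).get(photo_id)

hset_mapping(123456, 789)
```

Integer division guarantees every bucket covers a fixed, contiguous ID range, so no bucket can exceed 1,000 fields and every hash stays in ziplist encoding.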

Memcached could not do this. It only supports flat string key-value pairs. No hashes, no ziplist encoding, no memory optimization tricks. Redis's richer data structures were not just a convenience feature; they were the reason the entire system was feasible.

Production Gotchas That Will Bite You at 3 AM

Running Redis in production is not just about knowing the commands. Here are the things that catch experienced teams off guard:

Memory fragmentation. Redis allocates and frees memory constantly. Over time, your allocator (usually jemalloc) may have plenty of free bytes scattered across memory pages, but no contiguous block large enough for a new allocation. You will see used_memory at 10GB but used_memory_rss at 14GB. That 4GB gap is fragmentation. Redis 4.0+ includes an active defragmentation feature (activedefrag yes), but it adds CPU overhead. Monitor mem_fragmentation_ratio and keep it under 1.5.

The thundering herd problem. Imagine a popular cache key expires, and 500 requests simultaneously discover the cache miss. All 500 hit your database to regenerate the value. Your database melts. Solutions include using a mutex (only one request regenerates while others wait), probabilistic early expiration (each request has a small chance of regenerating the value before TTL expires), or never letting hot keys expire and refreshing them asynchronously in the background.
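The mutex approach can be sketched with a double-checked lock. This is illustrative: an in-process dict plays the cache, a sleep plays the slow database query, and all names are invented for the example.

```python
import threading
import time

cache = {}
regen_lock = threading.Lock()
db_calls = 0  # how many times the "database" was actually hit

def load_from_db(key):
    global db_calls
    db_calls += 1              # safe: only ever called while holding the lock
    time.sleep(0.05)           # simulate an expensive query
    return "value-for-%s" % key

def get_or_regen(key):
    """Cache read with a regeneration mutex: on a miss, exactly one
    caller rebuilds the value while the rest block briefly and reuse it."""
    val = cache.get(key)
    if val is not None:
        return val
    with regen_lock:
        val = cache.get(key)   # double-check: another caller may have won
        if val is None:
            val = load_from_db(key)
            cache[key] = val
    return val

# 50 requests discover the miss "simultaneously" -- the herd.
threads = [threading.Thread(target=get_or_regen, args=("hot",)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The re-check after acquiring the lock is the crucial line: without it, every waiter would regenerate the value in turn, serializing the herd instead of collapsing it to one database hit.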

Hot keys. In a Redis Cluster, all requests for a single key go to a single node. If you have a key that gets 50,000 reads per second (say, a trending topic counter), that node becomes a bottleneck while others sit idle. Solutions include client-side caching (Redis 6.0 added server-assisted client caching), read replicas for that specific node, or splitting the key into multiple sub-keys and aggregating on the client.

The KEYS * command. Never run this in production. It scans the entire keyspace in a single blocking operation. With millions of keys, this can freeze Redis for seconds. Use SCAN instead, which iterates incrementally.
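The cursor contract behind SCAN can be modeled simply. Redis's real cursor is a reverse-binary counter over hash-table buckets (which stays correct across resizes); this sketch uses a plain list offset just to show the incremental shape, with cursor 0 meaning "iteration complete."

```python
def scan(keys, cursor, count=10):
    """One SCAN step: return the next cursor and a small batch of keys.
    Each call does a bounded amount of work, never a full blocking sweep."""
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0            # 0 signals the iteration is finished
    return next_cursor, batch

keys = ["user:%d" % i for i in range(25)]

# Client-side loop: keep calling until the cursor wraps back to 0.
cursor, seen = 0, []
while True:
    cursor, batch = scan(keys, cursor)
    seen.extend(batch)
    if cursor == 0:
        break
```

Each call returns after a bounded slice of work, so the event loop gets control back between batches: that is the whole difference between SCAN and a multi-second KEYS * freeze.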

Fork latency on large datasets. When Redis forks for RDB or AOF rewrite, the fork() system call itself is fast, but on a machine with 64GB of Redis data, the kernel needs to copy all the page table entries. This can take 10-20 milliseconds per 10GB, during which Redis is frozen. On transparent huge pages (THP), this gets worse because the copy-on-write granularity jumps from 4KB to 2MB. Disable THP for Redis. Always.

Twitter processes 600+ million tweets per day with Redis as a core caching layer. At that scale, every gotcha listed above is not hypothetical; it is a page-worthy incident waiting to happen.

The Bottom Line: When Redis Is the Right Call

Redis is not just "a cache." It is a programmable, in-memory data structure server that happens to be extraordinarily fast. Use it when you need sub-millisecond latency and your working set fits in memory. Use it for caching, session storage, rate limiting, leaderboards, real-time analytics, pub/sub messaging, and distributed locking. Pair it with a durable database like PostgreSQL for data that cannot afford to be lost.

Do not use Redis as your sole primary database for critical business data (despite the persistence features, it is optimized for speed, not durability). Do not store data that will grow unboundedly without configuring eviction policies. And do not assume that because it is "just a cache," it does not need monitoring, replication, or capacity planning.

The teams that get the most out of Redis are the ones who understand what is happening below the API surface: the event loop, the memory encoding optimizations, the fork-and-copy-on-write persistence model, and the hash-slot-based cluster topology. Now you are one of those teams.

References and Further Reading