Redis Complete Guide | Event Loop, Internals, RDB/AOF, Replication & Production

Key points of this article

Hands-on Redis (data types, caching, Pub/Sub, Streams) plus how it works inside: the ae event loop, string and zset implementations, RDB snapshots vs AOF logs, PSYNC replication, Sentinel, Cluster, and production-grade patterns.

Core takeaways

Redis is an in-memory data structure store used for caching, sessions, messaging, and more. This guide pairs everyday APIs with internals: the single-threaded event loop, physical encodings (SDS, quicklist, skiplist + hash table), RDB vs AOF, PSYNC replication, and production patterns (Sentinel, Cluster, memory policies, observability).

Operations reality: Redis is fast, but KEYS, huge SORT/SUNION, and other O(N) or bulky work can monopolize the main thread and stall every other client. Understanding the loop helps you interpret spikes in SLOWLOG and latency.

Introduction: “Redis got slow”

Real-world scenarios

Scenario 1: We scanned every key and everything froze

Never use KEYS in production—use SCAN. Long commands block the event loop.
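
A minimal sketch of cursor-based iteration as an alternative to KEYS. It assumes a client whose scan call returns `{ cursor, keys }` (the shape node-redis v4 uses; other clients and versions differ). `ScanFn` and `scanAll` are illustrative names, not part of any library.

```typescript
// Hypothetical minimal shape of a SCAN call; adapt it to your client library.
type ScanReply = { cursor: number; keys: string[] };
type ScanFn = (
  cursor: number,
  opts: { MATCH: string; COUNT: number }
) => Promise<ScanReply>;

// Walks the keyspace in small batches so no single call holds the main thread for long.
async function* scanAll(
  scan: ScanFn,
  match: string,
  count = 100
): AsyncGenerator<string> {
  let cursor = 0;
  do {
    const reply = await scan(cursor, { MATCH: match, COUNT: count });
    cursor = reply.cursor;
    // SCAN may return empty batches and may repeat keys; callers should tolerate both.
    for (const key of reply.keys) yield key;
  } while (cursor !== 0); // a returned cursor of 0 means the iteration is complete
}
```

With node-redis v4 the adapter might be `(cursor, opts) => redis.scan(cursor, opts)`. Treat that as an assumption and check your client version, since cursor types differ across releases.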

Scenario 2: After restart, data did not match expectations

RDB only recovers to the last snapshot; AOF behavior depends on appendfsync and rewrite timing.

Scenario 3: Replicas keep full-resyncing

Network, disk, backlog sizing, and write volume all affect replication stability—check INFO replication.


1. What is Redis?

Key characteristics

Redis (Remote Dictionary Server) is primarily a memory-backed key–value store with rich types: strings, hashes, lists, sets, sorted sets, streams, bitmaps, HyperLogLog, and more.

Typical use cases:

  • Cache: database/API responses, object cache
  • Sessions & token deny lists: fast reads/writes
  • Pub/Sub & Streams: messaging and append-only logs
  • Rate limiting: atomic counters and sliding windows
  • Leaderboards: sorted sets
  • Distributed locks (careful design): SET NX patterns

Throughput can reach hundreds of thousands of ops/sec in benchmarks, but command mix, payload size, persistence, and replication change real numbers.


2. The single-threaded event loop (internals)

Why “one thread” for commands

In the classic model, Redis executes commands on a single main thread. That avoids lock contention, simplifies reasoning, and makes each individual command atomic. Many concurrent clients are still served, multiplexed onto one event loop.

The ae loop

Redis sits on a small event library (ae) that uses epoll (Linux), kqueue (BSD/macOS), or select as available. When a socket is readable, Redis reads/parses the query and dispatches the command handler.

I/O threads (Redis 6+)

Redis 6 introduced optional I/O threads to parallelize network reads/writes. Command execution remains on the main thread. If your workload is CPU-bound on commands, I/O threads alone will not fix it.

What blocks the loop

  • Slow commands: KEYS, large SORT, huge unions, massive serializations
  • Disk: AOF fsync policy—appendfsync always is durable but expensive
  • Forked work: BGSAVE, AOF rewrite—fork() plus Copy-on-Write can spike RSS under write load
  • Lua scripts: long scripts block other commands for their duration
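
To see why one slow command hurts everyone, here is a toy model of a single execution thread. It is a sketch with made-up timings, not a measurement of Redis; `Command` and `simulateSingleThread` are illustrative names.

```typescript
// Toy model: commands are served strictly one at a time, in arrival order.
type Command = { name: string; arrivalMs: number; costMs: number };

// Returns per-command latency (queueing + execution) on a single execution thread.
function simulateSingleThread(
  commands: Command[]
): { name: string; latencyMs: number }[] {
  let clock = 0;
  return [...commands]
    .sort((a, b) => a.arrivalMs - b.arrivalMs)
    .map((cmd) => {
      const start = Math.max(clock, cmd.arrivalMs); // wait until the thread is free
      clock = start + cmd.costMs;
      return { name: cmd.name, latencyMs: clock - cmd.arrivalMs };
    });
}

// Hypothetical numbers: one 500 ms command followed by two 1 ms commands.
const latencies = simulateSingleThread([
  { name: "KEYS *", arrivalMs: 0, costMs: 500 },
  { name: "GET a", arrivalMs: 1, costMs: 1 },
  { name: "GET b", arrivalMs: 2, costMs: 1 },
]);
console.log(latencies); // each GET observes about 500 ms, almost all of it queueing
```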

Operational tools

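  • SLOWLOG: records commands whose execution time exceeds slowlog-log-slower-than (in microseconds); the measured time excludes network I/O and time spent queued behind other commands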
  • LATENCY DOCTOR / latency tooling: helps pinpoint stalls
  • Prefer splitting huge values or moving blobs to object storage when appropriate

3. Data structure implementations (encodings)

Redis chooses among multiple physical representations for each logical type. Names evolve across releases (e.g., ziplist → listpack), so focus on the ideas: compact encodings for small values, hash tables and skiplists when data grows.

SDS (Simple Dynamic String)

String values use SDS, not raw C strings, keeping length in O(1), avoiding buffer overflows, and supporting binary-safe payloads.
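
The idea can be sketched as a length-prefixed buffer. This is an illustration in the spirit of SDS, not Redis's actual memory layout or growth policy; `ToySds` is a hypothetical name and the doubling strategy is an assumption.

```typescript
// Illustrative only: a length-tracking, binary-safe buffer in the spirit of SDS.
class ToySds {
  private buf: Uint8Array;
  private len = 0; // bytes in use, tracked explicitly so length() is O(1)

  constructor(initial: Uint8Array = new Uint8Array(0)) {
    this.buf = new Uint8Array(Math.max(initial.length, 1));
    this.append(initial);
  }

  length(): number {
    return this.len;
  }

  append(bytes: Uint8Array): void {
    const needed = this.len + bytes.length;
    if (needed > this.buf.length) {
      // Grow with spare capacity so repeated appends do not reallocate every time.
      const grown = new Uint8Array(needed * 2);
      grown.set(this.buf.subarray(0, this.len));
      this.buf = grown;
    }
    this.buf.set(bytes, this.len); // bounds are checked against the tracked capacity
    this.len = needed;
  }

  // Returns a copy of the used bytes; embedded zero bytes are preserved.
  bytes(): Uint8Array {
    return this.buf.slice(0, this.len);
  }
}
```

Because the length is stored rather than discovered by scanning for a terminator, a payload containing zero bytes is handled like any other.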

Hash

Small hashes may be stored in a compact sequential encoding; larger hashes promote to a hash table. Many fields increase memory and rehashing cost—schema your hashes thoughtfully.

List

Lists are typically quicklists—a linked list of listpack chunks—balancing pointer overhead with compact storage.

Sorted set

Sorted sets combine a hash table (member → score) with a skiplist for ordered traversal and range queries—why ZSET excels at leaderboards.
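
A rough model of that dual structure, with a sorted array standing in for the skiplist. The array costs O(N) per insert where a skiplist costs O(log N), so this only illustrates the two-index idea; `ToyZset` is a hypothetical name.

```typescript
// Illustrative model: a Map for O(1) score lookup plus an ordered index for range queries.
class ToyZset {
  private scores = new Map<string, number>();
  private ordered: { member: string; score: number }[] = [];

  add(member: string, score: number): void {
    if (this.scores.has(member)) {
      // Updating a score means removing the old entry from the ordered index first.
      this.ordered = this.ordered.filter((e) => e.member !== member);
    }
    this.scores.set(member, score);
    // Binary search for the insertion point, ordering by score and then by member.
    let lo = 0;
    let hi = this.ordered.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      const e = this.ordered[mid];
      if (e.score < score || (e.score === score && e.member < member)) lo = mid + 1;
      else hi = mid;
    }
    this.ordered.splice(lo, 0, { member, score });
  }

  // Point lookup goes through the hash table, never the ordered index.
  score(member: string): number | undefined {
    return this.scores.get(member);
  }

  // Top-N by descending score, similar in spirit to a reverse range query.
  top(n: number): string[] {
    const start = Math.max(0, this.ordered.length - n);
    return this.ordered.slice(start).reverse().map((e) => e.member);
  }
}
```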

Set

Integer-only small sets may use intset; larger or mixed types move to hash-based representations.

Debugging encodings

  • OBJECT ENCODING key: inspect how a key is stored
  • Version upgrades: encoding defaults can change—read release notes when migrating

4. Install and connect

docker run -d --name redis -p 6379:6379 redis:7-alpine
npm install redis

// lib/redis.ts
import { createClient } from "redis";

export const redis = createClient({
  url: process.env.REDIS_URL || "redis://localhost:6379",
  socket: {
    reconnectStrategy: (retries) => Math.min(retries * 50, 500),
  },
});

redis.on("error", (err) => console.error("Redis Client Error", err));
await redis.connect();

5. Core data types (APIs)

String

await redis.set("user:1:name", "John");
const name = await redis.get("user:1:name");
await redis.setEx("session:abc", 3600, "user-data");
await redis.incr("page:views");

Hash

await redis.hSet("user:1", {
  name: "John",
  email: "john@example.com", // placeholder address
  age: "30",
});
const user = await redis.hGetAll("user:1");

Sorted set (leaderboards)

await redis.zAdd("leaderboard", { score: 100, value: "user1" });
const top = await redis.zRangeWithScores("leaderboard", 0, 2, { REV: true });

6. Caching strategies

Cache-aside

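// `db` is assumed to be an already-configured Prisma-style client.
async function getUser(id: number) {
  const cacheKey = `user:${id}`;
  // 1. Try the cache first.
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);
  // 2. On a miss, load from the database.
  const user = await db.user.findUnique({ where: { id } });
  // 3. Populate the cache with a one-hour TTL; a missing user is cached as "null".
  await redis.setEx(cacheKey, 3600, JSON.stringify(user));
  return user;
}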

Combine TTL with explicit invalidation on writes. Mitigate cache stampede with locks, single-flight, or probabilistic early expiration.
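
Single-flight, one of the stampede mitigations mentioned above, can be sketched in a few lines. This version only deduplicates within one Node.js process; coordinating across processes would need a lock in Redis itself. `singleFlight` and `inFlight` are illustrative names.

```typescript
// In-process single-flight: concurrent callers for the same key share one in-flight load.
const inFlight = new Map<string, Promise<unknown>>();

async function singleFlight<T>(key: string, load: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>; // join the load already in progress

  // Remove the entry once the load settles so later calls trigger a fresh load.
  const pending = load().finally(() => inFlight.delete(key));
  inFlight.set(key, pending);
  return pending;
}
```

On a cache miss, wrapping the database read in `singleFlight(cacheKey, ...)` means a burst of identical requests produces one query instead of many.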


7. Pub/Sub and Streams

Pub/Sub is fire-and-forget broadcast—if no subscriber is connected, messages disappear. For durability and replay, use Streams with consumer groups.


8. Persistence: RDB and AOF

RDB snapshots

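RDB periodically writes a compact binary snapshot. BGSAVE forks a child process that initially shares memory pages with the parent through Copy-on-Write; each page the parent modifies during the save is duplicated, so heavy write load can inflate memory use while a snapshot is in progress.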

Pros: fast restarts, simple backups
Cons: data since the last snapshot may be lost

AOF (append-only file)

AOF appends every mutating command to a log that is compacted by periodic rewrites. Common appendfsync modes:

  • always: strongest durability, lowest throughput
  • everysec: popular compromise—up to ~1s exposure window
  • no: OS-buffered—fastest, largest loss window

AOF rewrite rewrites the log compactly; it also uses fork(), so plan RAM and disk headroom.

Practical choices

  • Cache-only: often no persistence or RDB-only snapshots
  • Durable sidecar: AOF (everysec) + periodic RDB backups
  • Test restores regularly—an untested backup is not a backup

9. Replication protocol

Full sync vs partial resync

New or lagging replicas may need a full resync (an RDB stream over the wire). After that, replicas catch up incrementally. PSYNC (and its successor PSYNC2, introduced in Redis 4.0) tracks a replication offset and uses a backlog buffer on the master to attempt a partial resync after short disconnects.

Replication backlog

If a replica is disconnected longer than the backlog retains writes, partial resync fails and a full resync is required. Tune repl-backlog-size for peak write rate and expected outage windows.
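
As a back-of-the-envelope sizing aid, the backlog has to hold everything written while a replica is away. The sketch below encodes that rule; the 2x safety factor and the example numbers are assumptions, not recommendations from Redis.

```typescript
// Rough sizing rule: the backlog must hold all writes made during the longest tolerated disconnect.
function backlogBytes(
  peakWriteBytesPerSec: number,
  outageSeconds: number,
  safetyFactor = 2 // assumed headroom for bursts above the measured peak
): number {
  return Math.ceil(peakWriteBytesPerSec * outageSeconds * safetyFactor);
}

// Hypothetical workload: 2 MiB/s of replicated writes, tolerate 60 s disconnects.
const bytes = backlogBytes(2 * 1024 * 1024, 60);
console.log(`repl-backlog-size ${Math.ceil(bytes / (1024 * 1024))}mb`); // repl-backlog-size 240mb
```

Measure the write rate from the growth of master_repl_offset over time rather than guessing it.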

Monitoring

Watch per-replica lag, master_repl_offset, and network errors. Slow disks or saturated links cause persistent lag.
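
One way to turn INFO replication output into an alertable number is to subtract each replica's acknowledged offset from master_repl_offset. The parser below is a sketch that assumes the text comes from the master and that replica lines look like `slave0:ip=...,port=...,state=...,offset=...,lag=...`; `replicaLagBytes` is an illustrative name.

```typescript
type ReplicaLag = { replica: string; state: string; lagBytes: number };

// Parses the text of INFO replication (as returned by the master) into per-replica byte lag.
function replicaLagBytes(info: string): ReplicaLag[] {
  const fields = new Map<string, string>();
  for (const line of info.split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (line.startsWith("#") || idx <= 0) continue; // skip section headers and blank lines
    fields.set(line.slice(0, idx), line.slice(idx + 1).trim());
  }

  const masterOffset = Number(fields.get("master_repl_offset"));
  const result: ReplicaLag[] = [];
  fields.forEach((value, key) => {
    if (!/^slave\d+$/.test(key)) return; // only per-replica lines
    const kv = new Map<string, string>();
    for (const pair of value.split(",")) {
      const eq = pair.indexOf("=");
      kv.set(pair.slice(0, eq), pair.slice(eq + 1));
    }
    result.push({
      replica: `${kv.get("ip")}:${kv.get("port")}`,
      state: kv.get("state") ?? "unknown",
      lagBytes: masterOffset - Number(kv.get("offset")),
    });
  });
  return result;
}
```

A lag in bytes that keeps growing is a stronger signal than a single large reading, since short bursts are normal.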

Diskless replication

Depending on version and flags, the master can stream the RDB directly to replicas over sockets without writing it to its own disk first. This is useful when tuned correctly for your network and security model.


10. Production Redis patterns

Sentinel (HA)

Sentinel monitors masters, performs failover, and publishes the new master. Understand quorum, SDOWN vs ODOWN, and ensure clients subscribe to topology changes.

Cluster (sharding)

Redis Cluster maps keys to 16,384 hash slots. Hash tags {user123}:a and {user123}:b co-locate keys, but abusing tags can create hot shards.
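
The slot mapping can be reproduced client-side: CRC16 (XMODEM variant) of the key, or of the hash tag if one is present, modulo 16384. The sketch below follows that rule; `crc16` and `hashSlot` are illustrative names, and real clients usually ship a table-driven version.

```typescript
// CRC16-CCITT, XMODEM variant: polynomial 0x1021, initial value 0, computed bit by bit.
function crc16(data: Uint8Array): number {
  let crc = 0;
  for (let i = 0; i < data.length; i++) {
    crc ^= data[i] << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// If the key contains a non-empty {...} section, only that section is hashed.
function hashSlot(key: string): number {
  const open = key.indexOf("{");
  if (open !== -1) {
    const close = key.indexOf("}", open + 1);
    if (close > open + 1) key = key.slice(open + 1, close);
  }
  return crc16(new TextEncoder().encode(key)) % 16384;
}

// Keys sharing a hash tag land in the same slot.
console.log(hashSlot("{user123}:a") === hashSlot("{user123}:b")); // true
```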

Memory policies

Set maxmemory and maxmemory-policy (allkeys-lru, volatile-lru, allkeys-lfu, etc.) to match whether Redis is a cache or holds required data.

Security and ops

  • TLS, ACLs, network isolation
  • rename-command to disable dangerous commands in shared environments (recent redis.conf files mark it deprecated in favor of ACLs)
  • Metrics: INFO, latency, memory, replication lag, disk I/O
  • Rolling upgrades: plan client reconnects and failover drills

11. Performance tooling

Pipelines / transactions

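// multi() queues commands locally; exec() sends them together, wrapped in MULTI/EXEC.
// For plain pipelining without a transaction, node-redis v4 also offers execAsPipeline().
const pipeline = redis.multi();
pipeline.set("key1", "value1");
pipeline.set("key2", "value2");
await pipeline.exec();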

Use WATCH / MULTI / EXEC or Lua when you need atomic multi-key updates—mind slot constraints in Cluster.


Summary

  • Event loop: single-threaded execution; optional I/O threads
  • Encodings: SDS, quicklist, skiplist + dict, intset—chosen per size/shape
  • Persistence: RDB snapshots vs AOF logs; fork/COW/rewrite costs
  • Replication: PSYNC + backlog; monitor offsets and lag
  • Production: Sentinel, Cluster, memory policy, security, SLOWLOG

Checklist

  • Configure SLOWLOG and latency monitoring
  • Choose RDB/AOF/none deliberately per role
  • Alert on replication lag and backlog limits
  • Define TTL and invalidation for caches
  • Ban KEYS in prod; use SCAN
  • Practice failover and restore drills


Keywords covered

Redis, event loop, SDS, skiplist, RDB, AOF, replication, Sentinel, Cluster, cache


FAQ

Q. Redis vs Memcached?

A. Redis wins on data structures, persistence, replication, and ecosystem. Memcached can still win for extremely simple string caching at huge scale, but Redis is the default choice for most backends.

Q. When should I use Lua?

A. When you must touch multiple keys atomically on the server. Keep scripts short—they block the main thread while running.

Q. Why do multi-key transactions fail in Cluster?

A. Every key a transaction touches must live in the same hash slot. Hash tags are the way to co-locate related keys, so design key names up front.