By DUY HUYNH in System Design — Dec 6, 2024

Build your own Redis (2/5) - Redis Persistence

Following the previous post, now, we will talk about redis persistence. While primarily known as an in-memory database, Redis offers a sophisticated durability model through two complementary strategies: RDB snapshots and AOF logs. Let's dive deep into how these work, implementing our own RDB parser along the way.

Imagine you're building a high-performance in-memory database that processes thousands of operations per second. You face a critical challenge: how do you maintain this blazing speed while ensuring data isn't lost when the server restarts? This is where Redis's elegant dual-strategy approach comes into play.

Redis provides four persistence strategies:

RDB (Redis Database): RDB persistence performs point-in-time snapshots of your dataset at specified intervals.
AOF (Append Only File): AOF persistence logs every write operation received by the server. These operations can then be replayed again at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself.
No persistence: You can disable persistence completely. This is sometimes used when caching.
RDB + AOF: You can also combine both AOF and RDB in the same instance.

Since our challenge focuses on RDB implementation, I'll dive deep into this approach while providing guidance on AOF persistence.

RDB: Point-in-Time Snapshots

RDB creates point-in-time snapshots of your dataset, effectively taking a full-system photograph at precisely timed intervals. This means every key, value, and data structure in your Redis instance is captured in its exact state at that moment—crucial for consistent backups and system recovery. Let's break down the data file format:

The RDB file starts with a simple magic string "REDIS" followed by a four-byte version number. This straightforward header lets Redis quickly validate the file format and version compatibility. For detailed format specifications, refer to the RDB File Format Documentation.

RDB File Format

Here's how we implement the basic RDB parser:

The RDB format uses special operation types to control the parsing flow:

One of the clever aspects of RDB's format is its variable-length integer encoding:

This encoding achieves impressive space efficiency:

Small numbers (0-63) use just 1 byte
Medium numbers (64-16383) use 2 bytes
Large numbers use 4 bytes
Special encodings handle various integer types

RDB advantages

RDB (Redis Database) offers a powerful snapshot-based persistence mechanism that creates point-in-time representations of your Redis dataset in a single compact file. This approach brings notable advantages: it enables efficient backup strategies (think hourly snapshots for the last day, daily for a month), excels in disaster recovery through easy file transfer to remote locations. RDB also maximizes Redis performances since the only work the Redis parent process needs to do in order to persist is forking a child that will do all the rest. The parent process will never perform disk I/O or alike. It particularly shines during large dataset restarts and supports partial resynchronization for replicas.

RDB disadvantages

However, RDB's snapshot-based nature comes with tradeoffs. While it's optimized for performance, it's less ideal for scenarios requiring minimal data loss risk – you might lose several minutes of data during unexpected shutdowns. Additionally, the fork() operation, crucial for RDB's functionality, can temporarily impact Redis's responsiveness, especially with large datasets on resource-constrained systems. While these brief interruptions are typically milliseconds-long, they can extend to a second in more demanding scenarios.

AOF: Command Logging

While RDB provides efficient snapshots, the Append-Only File (AOF) offers a different approach to persistence: logging every write operation. Think of it as keeping a detailed diary of all changes, rather than just taking periodic photos.

The AOF uses the same RESP protocol we explored in Part 1, making it human-readable and easy to parse:

*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n
*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$7\r\nmyvalue\r\n
*3\r\n$3\r\nSET\r\n$5\r\ncount\r\n$2\r\n42\r\n

I won't show you the actual code, but here is some guidance on how to implement this feature.

First, to store operations in AOF, follow these key steps:

Command Capture: Intercept all write commands (SET, DEL, INCR, etc.)
RESP Encoding: Convert the command and its arguments to RESP format
Append Operation: Write the encoded command to the AOF file
Durability Control: Handle fsync based on your configuration

Redis offers three fsync policies for AOF:

Always (appendfsync always)
- Sync after every write operation
- Maximum durability
- Significant performance impact
Every Second (appendfsync everysec)
- Sync once per second
- Good balance of durability and performance
- Default and recommended setting
No Explicit Sync (appendfsync no)
- Let the OS handle syncing
- Best performance
- Risk of data loss on crashes

When recovering data from AOF, follow this sequence:

File Validation: Check AOF file integrity
Command Replay: Parse and execute commands sequentially
State Reconstruction: Build in-memory state from replayed commands
Error Handling: Handle truncated or corrupted entries

If you encounter a corrupted AOF file:

Make a backup of the corrupted file
Use redis-check-aof --fix to repair
Verify the fixed file's contents
Restart Redis with the repaired file

To prevent unbounded growth, Redis can rewrite the AOF file:

Fork Process: Create a child process for rewriting
State Capture: Write current dataset as SET commands
Buffer Changes: Accumulate new changes during rewrite
File Switch: Atomically replace old file with new one

Redis 7.0 introduced a significant improvement to AOF persistence:

appenddirname "appendonlydir"
├── base.aof     # Base snapshot (RDB or AOF format)
├── incr_1.aof   # Incremental changes
├── incr_2.aof   # More incremental changes
└── manifest.json # Tracks file relationships

This multi-part approach offers several advantages:

Better manageability of large datasets
Reduced memory pressure during rewrites
Improved recovery performance

AOF advantages

AOF represents Redis's approach to write-ahead logging, offering a robust durability mechanism through configurable fsync policies. At its core, AOF maintains system durability by recording each write operation sequentially, giving you granular control over the trade-off between performance and data safety. You can fine-tune this balance through three fsync strategies: immediately, every second (the default sweet spot), or delegate to the operating system. With the default one-second fsync policy, Redis achieves impressive write performance while limiting potential data loss to just a one-second window.

What makes AOF particularly elegant is its append-only nature. By eliminating random seeks and maintaining a linear write pattern, it provides excellent resilience against power failures and corruption. Even in edge cases where commands are partially written, Redis includes a sophisticated self-healing mechanism through its redis-check-aof utility. The system also implements a clever background rewrite mechanism that compacts the log file while maintaining operational continuity – Redis continues serving requests with the original file while constructing an optimized version containing just the essential commands to reconstruct the current dataset.

AOF disadvantages

However, this robustness comes with specific trade-offs. AOF files typically consume more storage space compared to RDB snapshots of the same dataset. While performance remains strong with the default fsync settings, it may not match RDB's consistent latency profile under extreme write loads. This becomes especially noticeable when fsync policies are set to their most conservative settings.

This balance of features makes AOF particularly valuable in scenarios where data durability is paramount and you need the ability to inspect or recover individual operations from your persistence log.

Conclusion

We've explored Redis's sophisticated persistence mechanisms through RDB snapshots and AOF logging, each serving distinct operational needs. RDB provides efficient point-in-time snapshots perfect for backups, while AOF offers fine-grained durability through command logging. Understanding these mechanisms – and their tradeoffs – is crucial for building robust Redis deployments.

In the next part, we'll talk about Redis replication, exploring how Redis scales across multiple servers while maintaining data consistency.

The complete implementation is available in this GitHub repository, which provide a practical foundation for exploring these concepts further.

RDB: Point-in-Time Snapshots

AOF: Command Logging

Conclusion

Subscribe to The Technician