24 How to Perform Data Recovery After Redis Collapse

Hello, I’m your cache course teacher, Chen Bo. Welcome to lesson 24, “How to recover data after Redis crashes.” In this lesson, we will mainly learn about data recovery using RDB, AOF, and hybrid persistence.

Redis persistence is the process of dumping in-memory data to disk. Currently, Redis supports three persistence modes: RDB, AOF, and hybrid persistence.

RDB

Redis RDB persistence stores memory data to disk in the form of snapshots. When RDB persistence is required, Redis saves all data in memory to disk in a binary format. Each data item stored includes expiration time, data type, key, and value. When Redis is restarted, if appendonly is turned off, it will read the binary file generated by RDB persistence for data recovery.

There are four common scenarios triggering RDB creation:

The first scenario is a manual trigger: the caller executes the save or bgsave command to create an RDB snapshot.
The second scenario is automatic snapshot generation through the save m n configuration: if at least n keys are inserted or modified within m seconds, Redis automatically triggers a bgsave. Several save lines can be configured together, and a snapshot is triggered as soon as any one of them is satisfied. During peak periods Redis is already under heavy load and many keys change, so building an RDB at that point further increases the machine load and affects callers’ requests; use this setting with caution in production.
The third scenario is master-slave replication. When a slave needs a full replication, the master performs a bgsave to generate an RDB snapshot.
The fourth scenario is when the system administrator executes flushall to clear all data or shutdown to stop the service. Both also cause Redis to create an RDB snapshot automatically, provided at least one save rule is configured.
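The save m n rule from the second scenario can be sketched as a small check. This is a minimal illustration, not Redis's own code: the function name `should_trigger_bgsave`, its parameters, and the three example rules are assumptions made up for this sketch.

```python
from typing import List, Tuple

# Each save point is (seconds, changes), mirroring one "save m n" line.
SavePoint = Tuple[int, int]


def should_trigger_bgsave(save_points: List[SavePoint],
                          dirty_keys: int,
                          seconds_since_last_save: float) -> bool:
    """Return True as soon as any configured save point is satisfied."""
    for seconds, changes in save_points:
        if dirty_keys >= changes and seconds_since_last_save > seconds:
            return True
    return False


# Illustrative rules only: "save 900 1", "save 300 10", "save 60 10000".
rules = [(900, 1), (300, 10), (60, 10000)]

print(should_trigger_bgsave(rules, dirty_keys=12, seconds_since_last_save=301))  # True
print(should_trigger_bgsave(rules, dirty_keys=12, seconds_since_last_save=120))  # False
```

With 12 changed keys, only the (300, 10) rule can fire, and it does so once more than 300 seconds have passed since the last save.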

save performs RDB persistence in the main process, which blocks Redis so that it cannot handle any client requests until the snapshot finishes; for this reason it is rarely used. bgsave, on the other hand, forks a child process to create the RDB snapshot. Creating the snapshot this way does not directly block user access, but it still increases the machine load. For online Redis snapshot backups, it is generally recommended to trigger bgsave during low-traffic periods, such as the early morning.

The RDB snapshot file consists of three parts:

The first part is the RDB header, which includes the RDB version, Redis version, creation date, memory occupation, and other auxiliary information.
The second part is the data of each RedisDB. For each RedisDB, Redis first records its DBID, then the number of records in the main dictionary and in the expiration dictionary, and finally stores each data record in a loop. For each record, the expiration time is written first if the data has one. If the eviction policy configured in maxmemory-policy is LRU or LFU, the LRU or LFU value of the key is stored as well. Finally, the data type, key, and value are recorded.
The third part is the RDB footer. The RDB footer first stores auxiliary information such as Lua scripts in Redis. Then it stores the EOF marker, which is a character with a value of 255. Finally, it stores the checksum of the RDB.
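To make the three-part layout concrete, the following sketch inspects only the outer framing of an RDB file: the header magic and version at the start, and the EOF marker (255) plus checksum at the end. It assumes the file begins with the ASCII string `REDIS` followed by a four-digit version, and that the checksum is 8 bytes stored little-endian. It does not parse any data records, and the sample bytes are synthetic rather than a real dump.

```python
import struct
from typing import NamedTuple


class RdbEnvelope(NamedTuple):
    version: int
    has_eof_marker: bool
    checksum: int


def read_rdb_envelope(data: bytes) -> RdbEnvelope:
    """Inspect the header and footer framing only, not the records."""
    # 9 header bytes ("REDIS" + 4 digits) + 1 EOF byte + 8 checksum bytes.
    if len(data) < 18 or data[:5] != b"REDIS":
        raise ValueError("not an RDB file: missing REDIS magic string")
    version = int(data[5:9].decode("ascii"))
    # The last 9 bytes are assumed to be the EOF marker plus the checksum.
    has_eof_marker = data[-9] == 0xFF
    (checksum,) = struct.unpack("<Q", data[-8:])
    return RdbEnvelope(version, has_eof_marker, checksum)


# Synthetic example: a fake header, four filler bytes, and a footer.
sample = b"REDIS0009" + b"\x00" * 4 + b"\xff" + struct.pack("<Q", 1234)
print(read_rdb_envelope(sample))
# RdbEnvelope(version=9, has_eof_marker=True, checksum=1234)
```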

With that, the RDB file is complete.

RDB stores memory data in binary format, resulting in smaller files and faster recovery speed during startup. However, the RDB snapshot file can only store the memory data at the moment it is created and cannot record subsequent data changes. The process of creating an RDB is CPU-intensive, even if it is performed in a child process. Moreover, each snapshot contains the entire data set, which takes a long time and cannot be performed at any time, especially during peak hours. RDB uses binary storage, which has poor readability, and there may be compatibility issues between different versions due to the fixed format.

AOF

Redis AOF persistence logs data to disk by appending commands. With the appendonly configuration enabled, Redis appends every write instruction to the AOF file on disk, effectively recording the latest state of the in-memory data. Even if Redis crashes or shuts down unexpectedly, it can recover the latest full data set by loading the AOF, which minimizes data loss. Write instructions are stored in the AOF file in the multibulk format. This is the standard Redis protocol format, so the file can be parsed and processed by different versions of Redis with good compatibility.
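As an illustration of the multibulk format, the sketch below encodes one write instruction the way it would appear in an AOF file: an argument count line starting with `*`, then each argument as a length line starting with `$` followed by its bytes. The helper name `encode_command` is made up for this example.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a multibulk request."""
    parts = [b"*%d\r\n" % len(args)]          # number of arguments
    for arg in args:
        raw = arg.encode("utf-8")
        parts.append(b"$%d\r\n" % len(raw))   # byte length of this argument
        parts.append(raw + b"\r\n")
    return b"".join(parts)


print(encode_command("SET", "course", "redis"))
# b'*3\r\n$3\r\nSET\r\n$6\r\ncourse\r\n$5\r\nredis\r\n'
```

Because every argument is length-prefixed, the format is binary-safe while still being readable in a text editor.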

However, because Redis records every write instruction in the AOF, the file accumulates a large amount of intermediate state and even data that has since expired or been deleted, so redundancy is high. In addition, data recovery has to load and execute each instruction one by one, which can be time-consuming.
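The cost of replaying redundant commands can be seen in a toy recovery loop. This sketch assumes a log made only of multibulk `SET` and `DEL` commands and rebuilds a plain dictionary from it; `iter_commands` and `replay` are hypothetical helpers, far simpler than Redis's own loader.

```python
from typing import Dict, Iterator, List


def iter_commands(data: bytes) -> Iterator[List[bytes]]:
    """Yield each multibulk command found in an AOF-style byte string."""
    pos = 0
    while pos < len(data):
        end = data.index(b"\r\n", pos)
        assert data[pos:pos + 1] == b"*", "expected a multibulk header"
        argc = int(data[pos + 1:end])
        pos = end + 2
        args = []
        for _ in range(argc):
            end = data.index(b"\r\n", pos)
            assert data[pos:pos + 1] == b"$", "expected a bulk string header"
            length = int(data[pos + 1:end])
            start = end + 2
            args.append(data[start:start + length])
            pos = start + length + 2  # skip the trailing \r\n
        yield args


def replay(data: bytes) -> Dict[bytes, bytes]:
    """Rebuild a toy key space by executing every logged command in order."""
    store: Dict[bytes, bytes] = {}
    for args in iter_commands(data):
        name = args[0].upper()
        if name == b"SET":
            store[args[1]] = args[2]
        elif name == b"DEL":
            for key in args[1:]:
                store.pop(key, None)
    return store


log = (
    b"*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\n1\r\n"
    b"*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\n2\r\n"
    b"*3\r\n$3\r\nSET\r\n$1\r\nb\r\n$1\r\n9\r\n"
    b"*2\r\n$3\r\nDEL\r\n$1\r\nb\r\n"
)
print(replay(log))  # {b'a': b'2'}
```

Four commands are parsed and executed, yet only one key survives: the first `SET a` is intermediate state and everything about `b` is dead data.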

AOF data is persisted as follows. After processing a write instruction, Redis first writes the instruction to the AOF buffer, and then periodically writes the AOF buffer to the file buffer through serverCron. Finally, the file buffer is synced to disk according to the configured strategy.

Redis uses appendfsync to set three different file buffer synchronization strategies.

The first strategy is “no”, which means Redis never calls fsync itself. It only writes to the file and leaves it to the operating system to decide when the data is actually synced to disk. On Linux, this synchronization happens roughly every 30 seconds. If Redis crashes in the meantime, a large amount of data may be lost.
The second strategy is “always”, which means that every time the AOF buffer is written to the file, fsync is called to force the kernel to write the data to disk. This strategy is the safest, but the frequent I/O lowers performance and can shorten the lifespan of the disk.
The third strategy is “everysec”, which means that fsync is called every second through the BIO thread. This strategy strikes a good balance between security, performance, and disk lifespan, and can meet the needs of online business well.
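The three strategies can be compared with a toy writer that buffers commands, writes them to a file, and calls fsync according to the chosen policy. This is a simplified sketch under stated assumptions: the class `AofWriter` and its fields are invented for illustration, and the everysec fsync runs inline here, whereas Redis hands it to a background thread.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy AOF writer: buffer commands, write them, fsync per policy."""

    def __init__(self, path: str, appendfsync: str = "everysec") -> None:
        if appendfsync not in ("no", "always", "everysec"):
            raise ValueError("unknown appendfsync policy")
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = appendfsync
        self.buffer = bytearray()
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0

    def feed(self, command: bytes) -> None:
        """Step 1: the write instruction lands in the in-memory AOF buffer."""
        self.buffer += command

    def flush(self) -> None:
        """Step 2: write the buffer to the file, then fsync if the policy says so."""
        wrote = bool(self.buffer)
        if wrote:
            os.write(self.fd, bytes(self.buffer))  # partial writes ignored here
            self.buffer.clear()
        now = time.monotonic()
        if self.policy == "always":
            due = wrote
        elif self.policy == "everysec":
            due = now - self.last_fsync >= 1.0
        else:  # "no": the operating system decides when to sync
            due = False
        if due:
            os.fsync(self.fd)
            self.last_fsync = now
            self.fsync_calls += 1

    def close(self) -> None:
        os.close(self.fd)


with tempfile.TemporaryDirectory() as tmp:
    writer = AofWriter(os.path.join(tmp, "appendonly.aof"), "always")
    writer.feed(b"*1\r\n$4\r\nPING\r\n")
    writer.flush()
    print(writer.fsync_calls)  # 1
    writer.close()
```

Switching the policy to "no" in the same run leaves `fsync_calls` at 0, which is exactly the window in which a crash can lose data.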

Over time, the AOF will continue to record all write instructions, making it larger and filled with a lot of intermediate and expired data. To reduce invalid data and improve recovery time, periodic AOF rewrite operations can be performed.

AOF rewrite can be performed by executing the bgrewriteaof command, or triggered automatically by Redis according to the configured rewrite policy. During an AOF rewrite, the main process forks a child process. The child process iterates over its snapshot of every RedisDB, converts all in-memory data to commands, and writes them to a temporary file. While the child is rewriting the AOF, the main process continues to serve user requests; after executing each write instruction, it appends the instruction to both the old AOF file and the AOF rewrite buffer. Once the child process finishes persisting the RedisDB data, it notifies the main process. The main process then appends the contents of the AOF rewrite buffer to the temporary file, replaces the old AOF file with the new one, and finally closes the old AOF file asynchronously through the BIO thread. With that, the AOF rewrite process is complete.
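The interplay between the old AOF, the rewrite buffer, and the temporary file can be sketched with a small in-memory model. Everything here is a stand-in: `ToyServer` and `child_rewrite` are hypothetical names, the fork is replaced by a dictionary copy, and files are replaced by Python lists of command strings.

```python
from typing import Dict, List


class ToyServer:
    """In-memory model of the main process during an AOF rewrite."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.aof: List[str] = []             # the current ("old") AOF
        self.rewrite_buffer: List[str] = []  # filled only during a rewrite
        self.rewriting = False

    def set(self, key: str, value: str) -> None:
        self.data[key] = value
        command = f"SET {key} {value}"
        self.aof.append(command)             # always logged to the old AOF
        if self.rewriting:
            self.rewrite_buffer.append(command)

    def start_rewrite(self) -> Dict[str, str]:
        """Stand-in for fork(): give the 'child' a frozen copy of the data."""
        self.rewriting = True
        return dict(self.data)

    def finish_rewrite(self, temp_file: List[str]) -> None:
        """Append buffered writes to the temp file, then swap it in."""
        temp_file.extend(self.rewrite_buffer)
        self.aof = temp_file
        self.rewrite_buffer = []
        self.rewriting = False


def child_rewrite(snapshot: Dict[str, str]) -> List[str]:
    """The 'child' turns the snapshot into one command per key."""
    return [f"SET {key} {value}" for key, value in snapshot.items()]


server = ToyServer()
server.set("a", "1")
server.set("a", "2")
server.set("b", "1")
snapshot = server.start_rewrite()
server.set("c", "3")                 # arrives while the rewrite is running
temp = child_rewrite(snapshot)
server.finish_rewrite(temp)
print(server.aof)  # ['SET a 2', 'SET b 1', 'SET c 3']
```

The old AOF held four commands; the rewritten one holds three, and the write that arrived during the rewrite is not lost because it was replayed from the rewrite buffer.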

The AOF rewrite persists the snapshot of each RedisDB one by one. For each DB, it first records the DBID to be persisted with a select $db command, and then records each key/value pair as a command. A value of SDS type can be persisted directly. If the value is an aggregate type, its elements are written in batches using the add command that matches the type.

For list data type, all list elements are persisted using the RPUSH command. For set data type, all set elements are persisted using the SADD command. For sorted set data type, all elements are persisted using the ZADD command. For hash data type, all hash elements are persisted using the HMSET command. If the data has an expiration time, it is also recorded using the pexpireat command.
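The per-type mapping can be sketched as a function that turns a toy database into the commands a rewrite would emit. The record layout (a type name, a value, and an optional absolute expiry in milliseconds) and the name `rewrite_db` are assumptions for this example. For brevity it emits one command per key, whereas the text above describes elements being added in batches.

```python
from typing import Any, Dict, List, Optional, Tuple

# A toy record: (type name, value, optional absolute expiry in milliseconds).
Record = Tuple[str, Any, Optional[int]]


def rewrite_db(db_id: int, records: Dict[str, Record]) -> List[List[str]]:
    """Convert one toy database into rewrite commands."""
    commands: List[List[str]] = [["SELECT", str(db_id)]]
    for key, (kind, value, expire_at_ms) in records.items():
        if kind == "string":
            commands.append(["SET", key, value])
        elif kind == "list":
            commands.append(["RPUSH", key, *value])
        elif kind == "set":
            commands.append(["SADD", key, *sorted(value)])
        elif kind == "zset":
            # value maps member -> score; ZADD expects score before member.
            pairs = [part for member, score in value.items()
                     for part in (str(score), member)]
            commands.append(["ZADD", key, *pairs])
        elif kind == "hash":
            pairs = [part for field, val in value.items()
                     for part in (field, val)]
            commands.append(["HMSET", key, *pairs])
        else:
            raise ValueError(f"unsupported type: {kind}")
        if expire_at_ms is not None:
            commands.append(["PEXPIREAT", key, str(expire_at_ms)])
    return commands


db = {
    "greeting": ("string", "hello", None),
    "queue": ("list", ["a", "b"], None),
    "rank": ("zset", {"alice": 1, "bob": 2}, 1700000000000),
}
for command in rewrite_db(0, db):
    print(" ".join(command))
# SELECT 0
# SET greeting hello
# RPUSH queue a b
# ZADD rank 1 alice 2 bob
# PEXPIREAT rank 1700000000000
```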

The advantage of AOF persistence is that it records all of the latest in-memory data, so at most 1-2 seconds of data are lost. Because AOF appends records in the Redis protocol format, it is highly compatible, and appending is a lightweight way to keep saving the latest data continuously. Finally, because the data is stored directly in the Redis protocol format, the file is easy to read.

The disadvantage of AOF persistence is that as time goes on, redundant data increases, the file gets larger, and the data recovery process needs to read and execute all the commands, which makes the recovery relatively slow.

Hybrid Persistence

Starting from Redis version 4.0, Redis introduced the Hybrid Persistence mode, which is enabled by default in version 5.0. As mentioned earlier, RDB has fast loading speed but slow construction time and lacks the most recent data. AOF continuously appends the latest write records and can contain all data, but it is redundant and has a slow loading speed. The Hybrid mode combines the advantages of RDB and AOF, allowing it to include full data and load relatively quickly. The configuration aof-use-rdb-preamble can be used to explicitly enable Hybrid Persistence mode.

The Hybrid Persistence mode is also implemented through bgrewriteaof. When the Hybrid mode is enabled and bgrewriteaof is performed, the main process first forks a child process. The child process writes the in-memory data to the AOF temporary file in the RDB binary format. Then the write instructions that were buffered while this was happening are appended to the temporary file as commands. Afterwards, the child notifies the main process that the persistence is complete. The main process renames the temporary file to the AOF file and closes the old AOF file. In this way, the bulk of the data is stored in the RDB format, and the new instructions are appended after it as commands. Subsequent write instructions are appended to the new AOF file as they are executed, as in normal AOF operation.
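Since a hybrid file starts with an RDB section and continues with plain commands, a loader can tell the two layouts apart from the first bytes. The sketch below assumes the single-file layout described above, that an RDB section begins with the ASCII string `REDIS`, and that a command-only AOF begins with a multibulk `*` header. The function name `aof_layout` and its return labels are made up for illustration.

```python
def aof_layout(data: bytes) -> str:
    """Classify an AOF file's layout from its leading bytes."""
    if not data:
        return "empty"
    if data.startswith(b"REDIS"):
        return "rdb-preamble"      # hybrid: RDB snapshot, then commands
    if data.startswith(b"*"):
        return "commands-only"     # classic AOF: multibulk commands only
    return "unknown"


print(aof_layout(b"REDIS0009..."))             # rdb-preamble
print(aof_layout(b"*1\r\n$4\r\nPING\r\n"))      # commands-only
```

For a file with an RDB preamble, recovery loads the binary section first and then replays the commands that follow it, which is where the faster loading comes from.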

Hybrid Persistence combines the advantages of RDB and AOF while minimizing disadvantages. Its advantages are containing full data and having a fast loading speed. The disadvantage is that the RDB format at the beginning of the file is less compatible and less readable.