# 03 Redis Persistence RDB

Both reading and writing in Redis are done in memory, so it has high performance. However, the data in memory will be lost when the server restarts. In order to ensure that data is not lost, we need to store the data in memory to disk, so that Redis can recover the original data from the disk when it restarts. This whole process is called Redis persistence.

Redis persistence is also one of the main differences between Redis and Memcached, as Memcached does not have persistence functionality.

### 1. Several Ways of Persistence

Redis supports the following three persistence methods:

  * Snapshotting (RDB, Redis DataBase) writes the in-memory data as of a certain moment to disk in binary format.
  * Append-Only File (AOF) records every write command and appends it to a file in a text-based format.
  * Hybrid persistence, introduced in Redis 4.0, combines the advantages of RDB and AOF. When the AOF file is rewritten, the current data is first written to the file in RDB format, and subsequent write commands are then appended in AOF format. This keeps restarts fast while reducing the risk of data loss.
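
To make the difference between the first two approaches concrete, here is a minimal Python sketch. It is not how Redis is implemented: `TinyStore`, its JSON snapshot, and its in-memory command log are hypothetical stand-ins for the RDB file and the AOF file.

```python
import json


class TinyStore:
    """Toy key-value store; all names here are illustrative, not Redis APIs."""

    def __init__(self):
        self.data = {}
        self.log = []  # AOF-style: every write command, in order

    def set(self, key, value):
        self.data[key] = value
        self.log.append(("SET", key, value))

    def snapshot(self):
        # RDB-style: serialize the whole dataset as it is right now.
        return json.dumps(self.data)

    @classmethod
    def from_snapshot(cls, blob):
        store = cls()
        store.data = json.loads(blob)
        return store

    @classmethod
    def from_log(cls, log):
        # AOF-style recovery: replay every recorded command.
        store = cls()
        for op, key, value in log:
            if op == "SET":
                store.set(key, value)
        return store


store = TinyStore()
store.set("a", 1)
snap = store.snapshot()   # point-in-time copy
store.set("a", 2)         # happens after the snapshot

from_rdb = TinyStore.from_snapshot(snap)
from_aof = TinyStore.from_log(store.log)
```

Restoring from the snapshot loses the write made after it was taken, while replaying the log recovers that write at the cost of re-executing every command.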

Since each persistence solution has specific use cases, let’s start with RDB persistence.

### 2. Brief Introduction to RDB

RDB (Redis DataBase) is the process of writing a memory snapshot of a certain moment to disk in binary format.

### 3. Triggering Persistence

RDB persistence can be triggered in two ways: manually and automatically.

#### 1) Manual Triggering

There are two operations for manually triggering persistence: save and bgsave, and the main difference between them lies in whether they block the execution of the Redis main thread.

##### ① `save` command

Executing the save command in a client triggers RDB persistence, but it also blocks the Redis main thread: other clients' commands are not answered until the RDB file has been written. It should therefore be used with caution in production environments.

After the save command is executed, the modification time of the persistence file dump.rdb changes, indicating that the save command has successfully triggered RDB persistence.

##### ② `bgsave` command

bgsave (background save) means saving in the background. The biggest difference from the save command is that bgsave forks a child process to perform the persistence. The Redis main process is blocked only briefly, while the fork itself takes place. Once the child process has been created, the main process can respond to other clients' requests again. Compared to the save command, which blocks the main process for the whole duration of the save, bgsave is clearly the better choice for production use.

Running bgsave returns immediately with the message `Background saving started`, while the child process writes the RDB file in the background.
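
The fork-based approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it runs on Unix-like systems (it relies on `os.fork`), it uses JSON where Redis uses its binary RDB format, and `background_save` and the file names are hypothetical.

```python
import json
import os
import tempfile


def background_save(data, path):
    """Fork a child that writes `data` to `path`; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child process: works on its own copy-on-write view of memory,
        # frozen at the moment of the fork.
        exit_code = 1
        try:
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
            os.replace(tmp_path, path)  # swap the finished file into place
            exit_code = 0
        finally:
            os._exit(exit_code)  # never return into the parent's code path
    return pid


store = {"counter": 1}
snapshot_path = os.path.join(tempfile.mkdtemp(), "dump.json")

child_pid = background_save(store, snapshot_path)
store["counter"] = 2                    # the parent keeps accepting writes
_, status = os.waitpid(child_pid, 0)    # wait only so the demo can read the file

with open(snapshot_path) as f:
    saved = json.load(f)
```

The snapshot contains the value from the moment of the fork, even though the parent changed it immediately afterwards. That is the property that lets the main process keep serving clients while the child writes the file.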

#### 2) Automatic Triggering

After discussing the manual triggering methods of RDB, let’s take a look at how to trigger RDB persistence automatically. RDB automatic persistence mainly comes from the following situations.

##### ① `save m n`

`save m n` means that persistence is triggered automatically if at least n keys change within m seconds. The parameters m and n are set in the Redis configuration file. For example, `save 60 1` means that RDB persistence is triggered if at least one key changes within 60 seconds.

Automatic triggering essentially means that Redis executes a bgsave command on its own once a configured condition is met.

Note: When multiple `save m n` directives are set, persistence is triggered as soon as any one of the conditions is met. For example, suppose we set the following two `save m n` directives:

    save 60 10
    save 600 1

If at least 10 key-value changes occur within 60 seconds, persistence is triggered. If fewer than 10 changes occur within 60 seconds, Redis still checks whether at least one key-value change has occurred within 600 seconds; if so, persistence is triggered.
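
The "any one rule is enough" behaviour can be modeled as follows. This is a simplified sketch, not Redis source code: `SaveRule`, `should_snapshot`, and the idea of passing in a change counter and the seconds elapsed since the last save are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SaveRule:
    seconds: int   # the "m" in `save m n`
    changes: int   # the "n" in `save m n`


def should_snapshot(rules, changes_since_save, seconds_since_save):
    """Return True if at least one rule is satisfied."""
    return any(
        changes_since_save >= rule.changes and seconds_since_save >= rule.seconds
        for rule in rules
    )


# Equivalent of:  save 60 10  /  save 600 1
rules = [SaveRule(seconds=60, changes=10), SaveRule(seconds=600, changes=1)]

busy = should_snapshot(rules, changes_since_save=10, seconds_since_save=60)
quiet_early = should_snapshot(rules, changes_since_save=3, seconds_since_save=60)
quiet_late = should_snapshot(rules, changes_since_save=3, seconds_since_save=600)
idle = should_snapshot(rules, changes_since_save=0, seconds_since_save=1000)
```

With 10 changes the first rule fires after 60 seconds. With only 3 changes nothing happens at 60 seconds, but the second rule fires once 600 seconds have passed. With no changes at all, no rule ever fires.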

##### ② `flushall` command

The flushall command clears all data in Redis, so it should be used with caution in production environments. When Redis executes flushall, RDB persistence is triggered automatically, so the RDB file is overwritten with an empty dataset.

##### ③ Master-Slave Synchronization Triggering

In Redis master-slave replication, when the slave node performs a full replication operation, the master node executes the bgsave command and sends the RDB file to the slave node, automatically triggering Redis persistence.

### 4. Configuration Explanation

Reasonably setting the RDB configuration can ensure efficient and stable operation of Redis. Let’s take a look at the configuration items for RDB.

RDB configuration parameters can be found in the Redis configuration file, and the specific content is as follows:

    # Conditions that trigger an RDB save
    save 900 1
    save 300 10
    save 60 10000

    # Whether Redis stops accepting writes after a bgsave failure. "yes" means writes are refused until a save succeeds, "no" means the error is ignored and writes continue.
    stop-writes-on-bgsave-error yes

    # RDB file compression
    rdbcompression yes

    # Whether to add a checksum when writing the RDB file and verify it when reading, to check for corruption. If corruption is detected during startup, startup stops.
    rdbchecksum yes

    # RDB file name
    dbfilename dump.rdb

    # RDB file directory
    dir ./

The important parameters are as follows:

**① save parameter**. It is used to configure the triggering conditions for RDB persistence. When a save condition is met, the data is persisted to the hard disk. The default configuration is as follows:

  * save 900 1: If at least 1 key value changes within 900 seconds, the data will be persisted to the hard disk;
  * save 300 10: If at least 10 key values change within 300 seconds, the data will be persisted to the hard disk;
  * save 60 10000: If at least 10000 key values change within 60 seconds, the data will be persisted to the hard disk.



**② rdbcompression parameter**. Its default value is `yes`, which means that RDB file compression is enabled and Redis compresses the file with the LZF algorithm. If you don't want to spend CPU time on compression, you can disable it, at the cost of a larger file on disk.

**③ rdbchecksum parameter**. Its default value is `yes`, which means that a checksum is written with the RDB file and verified when the file is read, in order to detect corruption. If corruption is detected during startup, Redis stops the startup.
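
The idea behind `rdbchecksum` can be illustrated with a small Python sketch. Redis places a CRC64 checksum at the end of the RDB file; the sketch below substitutes CRC32 from Python's standard `zlib` module as a stand-in, and the function names `dump_with_checksum` and `load_with_checksum` are hypothetical.

```python
import struct
import zlib


def dump_with_checksum(payload: bytes) -> bytes:
    """Append a 4-byte little-endian CRC32 of the payload (stand-in for CRC64)."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def load_with_checksum(blob: bytes) -> bytes:
    """Verify the trailing checksum and return the payload, or raise ValueError."""
    payload, trailer = blob[:-4], blob[-4:]
    (expected,) = struct.unpack("<I", trailer)
    if zlib.crc32(payload) != expected:
        raise ValueError("checksum mismatch: file is corrupted")
    return payload


blob = dump_with_checksum(b"key=value")
restored = load_with_checksum(blob)

# Flip one bit in the payload to simulate on-disk corruption.
corrupted = bytes([blob[0] ^ 0x01]) + blob[1:]
try:
    load_with_checksum(corrupted)
    corruption_detected = False
except ValueError:
    corruption_detected = True
```

The trade-off is the same as in Redis: computing the checksum costs some CPU time on every save and load, in exchange for refusing to load a damaged file.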

### 5. Configuration Query

In Redis, you can query the current configuration parameters with a command of the form `config get xxx`. For example, to get the file name configured for the RDB file, use `config get dbfilename`. The result is shown in the figure below:

![image.png](../images/2020-02-24-122630.png)

To query the directory where the RDB file is located, use `config get dir`. The result is shown in the figure below:

![image.png](../images/2020-02-24-122633.png)

### 6. Configuration Setting

To set the configuration of RDB, you can use the following two methods:

  * Manually modify the Redis configuration file;
  * Use the `config set` command, for example `config set dir "/usr/data"` to change the storage directory of the RDB file.



**Note**: Settings made in the Redis configuration file survive a restart, but they only take effect after the Redis server has been restarted. Settings made with `config set` take effect immediately without a restart, but they are lost when Redis restarts.

> Tips: The Redis configuration file is located at the root path of the Redis installation directory, with the default name redis.conf.

### 7. RDB File Recovery

When the Redis server starts, if the RDB file dump.rdb exists in the Redis working directory (the directory set by `dir`, which with the default `./` is the directory Redis is started from), Redis automatically loads it to restore the persisted data. If dump.rdb is somewhere else, move it to that directory first.

**Verify whether the RDB file is loaded**. Redis prints log information on startup that shows whether the RDB file was loaded. We execute the Redis start command `src/redis-server redis.conf`, as shown in the figure below:

![image.png](../images/2020-02-24-122634.png)

From the log, we can see that the Redis service has successfully loaded the RDB file during startup.

> Tips: Redis server will be in a blocked state during the RDB file loading process and will remain so until the loading work is completed.
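
Before moving a file into place, it can be useful to confirm that it really is an RDB file. An RDB file begins with the ASCII magic string `REDIS` followed by a four-digit format version. The Python sketch below checks only that header; the helper name `rdb_version` and the demo files are hypothetical, and the check says nothing about whether the rest of the file is intact.

```python
import os
import re
import tempfile


def rdb_version(path):
    """Return the RDB format version from the file header, or None if absent."""
    with open(path, "rb") as f:
        header = f.read(9)  # 5-byte magic "REDIS" + 4 ASCII digits
    match = re.fullmatch(rb"REDIS(\d{4})", header)
    return int(match.group(1)) if match else None


# Demo with hand-made files; a real dump.rdb carries the dataset after the header.
workdir = tempfile.mkdtemp()
good = os.path.join(workdir, "dump.rdb")
bad = os.path.join(workdir, "notes.txt")
with open(good, "wb") as f:
    f.write(b"REDIS0009" + b"\xff")
with open(bad, "wb") as f:
    f.write(b"hello world")

good_version = rdb_version(good)
bad_version = rdb_version(bad)
```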

### 8. Pros and Cons of RDB

#### 1) Advantages of RDB

  * RDB files store binary data, take up less disk space, are more compact, and are therefore well suited as backup files;
  * RDB is very useful for disaster recovery. It is a single compact file that can be transferred quickly to a remote server to restore the Redis service;
  * RDB has little impact on the running speed of Redis. For each persistence run the main process only needs to fork() a child process; the child performs all the disk I/O, so the main process never executes disk I/O operations itself;
  * Compared with AOF, RDB allows Redis to restart faster.



#### 2) Disadvantages of RDB

  * Because RDB only saves data at intervals, if the Redis service terminates unexpectedly, all changes made since the last snapshot are lost;
  * RDB needs to fork() frequently so that a child process can persist the data to disk. If the data set is large, fork() can be time-consuming; combined with poor CPU performance, it may cause Redis to stop serving clients for a few milliseconds or even a second.



### 9. Disable Persistence

Disabling persistence can improve the execution efficiency of Redis. If you are not sensitive to data loss, you can disable automatic RDB persistence by running `config set save ""` from a connected client, as shown in the figure below:

![image.png](../images/2020-02-24-122636.png)

### 10. Summary

In this article, we learned that RDB persistence can be triggered manually or automatically. Its advantages are that the stored file is small and that data can be restored quickly when Redis starts. Its disadvantage is the risk of data loss. Recovering from an RDB file is also simple: move the RDB file to the directory configured by `dir`, and it is loaded automatically when Redis starts.

Finally, here is a question to think about: what might cause high CPU usage of the Redis server? You are welcome to write down your answers in the comments.

**Reference & Acknowledgement**:

  * <https://redis.io/topics/persistence>
  * <https://blog.csdn.net/qq_36318234/article/details/79994133>
  * <https://www.cnblogs.com/ysocean/p/9114268.html>
  * <https://www.cnblogs.com/wdliu/p/9377278.html>