27 How Does Redis Perform Master-Slave Replication #

In this lesson, we will primarily learn about the principles of Redis replication and analyze the replication process.

Redis Replication Principle #

To avoid a single point of failure, data must be stored in more than one copy. In addition, because Redis executes commands on a single thread, the throughput of a single instance, measured in queries per second (QPS), is limited. For both reasons, Redis has offered replication since its early versions and has continued to optimize the replication strategy.

Through data replication, a Redis master can have multiple slaves, and each slave can in turn have its own nested slaves. All write operations are performed on the master. After the master executes a write instruction, it distributes the instruction to the slaves attached to it, and a slave that has nested slaves forwards the received write instructions to them in the same way. With multiple slaves, Redis keeps several copies of the data, so the failure of a single node does not by itself destroy the dataset (because replication is asynchronous, the most recent writes may still be lost), and read capacity grows with the number of slaves. In a typical read/write-splitting deployment, the master handles the writes while the slaves serve the reads, which relieves the master and raises the overall capacity of the system.

When distributing write requests, the master also copies the write instructions to a replication backlog, which allows a slave that has reconnected within a short time to continue replication from the last replication position in the backlog. This improves replication efficiency.
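The backlog can be pictured as a fixed-size circular buffer indexed by a global replication offset. The following is a minimal sketch under that assumption; the class name `ReplicationBacklog`, its methods, and the zero-based offsets are hypothetical simplifications for illustration and are not Redis internals.

```python
class ReplicationBacklog:
    """Fixed-size circular buffer holding the most recent replication bytes."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.buf = bytearray(size)
        self.master_offset = 0  # total number of bytes ever appended
        self.histlen = 0        # number of bytes still retained in the buffer

    def feed(self, data: bytes) -> None:
        # Append bytes, overwriting the oldest ones once the buffer is full.
        for b in data:
            self.buf[self.master_offset % self.size] = b
            self.master_offset += 1
        self.histlen = min(self.size, self.histlen + len(data))

    def first_offset(self) -> int:
        # Oldest offset that can still be served from the buffer.
        return self.master_offset - self.histlen

    def read_from(self, offset: int):
        # Return everything from `offset` onward, or None if that position
        # has already been overwritten (or lies in the future).
        if offset < self.first_offset() or offset > self.master_offset:
            return None
        return bytes(self.buf[i % self.size]
                     for i in range(offset, self.master_offset))


backlog = ReplicationBacklog(8)
backlog.feed(b"SET a 1;")
backlog.feed(b"INCR")
print(backlog.read_from(8))  # the slave stopped at offset 8: resumable
print(backlog.read_from(2))  # offset 2 was overwritten: not resumable
```

A slave whose last position is still inside the retained range can resume from the buffer; otherwise `read_from` returns `None`, which corresponds to the case where a full synchronization is needed.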

The master and a slave are matched by replication ID, which prevents a slave from continuing replication against the wrong master. Redis replication can be divided into full synchronization and incremental synchronization.

During full synchronization, the master saves its in-memory data through bgsave and, while the snapshot is being created, stores the incoming write instructions in the replication buffer. After the RDB snapshot is completed, the master sends both the RDB snapshot and the data in the replication buffer to the slave, and the slave rebuilds a complete copy of the data from them. This process places a considerable load on the master, and the slave also needs a relatively long time to construct the data. Moreover, transmitting the RDB snapshot consumes a large amount of bandwidth, which can noticeably affect the performance of the entire system.

In contrast, during incremental synchronization the master only sends the write instructions that follow the slave's last replication position, and no RDB snapshot has to be built. The amount of data transmitted is small, so the extra load on the master and slave and the bandwidth consumption are usually negligible, and the impact on the system as a whole is minimal.
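The interplay between the snapshot and the replication buffer during full synchronization can be illustrated with a toy model. This is only a sketch: the `Master` class, the dictionary standing in for the RDB snapshot, and the `full_sync` helper are hypothetical names invented for the example, and a real bgsave forks a child process rather than copying a dictionary.

```python
class Master:
    def __init__(self) -> None:
        self.data = {}
        self.repl_buffer = None  # a list only while a snapshot is in progress

    def write(self, key, value) -> None:
        self.data[key] = value
        if self.repl_buffer is not None:
            # Writes arriving during the snapshot are also buffered.
            self.repl_buffer.append((key, value))

    def begin_snapshot(self) -> dict:
        self.repl_buffer = []
        return dict(self.data)  # stands in for the point-in-time RDB

    def finish_snapshot(self) -> list:
        buffered, self.repl_buffer = self.repl_buffer, None
        return buffered


def full_sync(snapshot: dict, buffered: list) -> dict:
    # The slave loads the snapshot first, then replays the buffered writes.
    replica = dict(snapshot)
    for key, value in buffered:
        replica[key] = value
    return replica


master = Master()
master.write("a", 1)
snapshot = master.begin_snapshot()
master.write("b", 2)  # happens while the snapshot is being produced
master.write("a", 3)
replica = full_sync(snapshot, master.finish_snapshot())
print(replica == master.data)
```

The snapshot alone would miss the two writes made while it was being produced; replaying the buffer afterwards brings the slave to the same state as the master.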

Before Redis 2.8, Redis only supported full replication: whenever the connection between a slave and the master was lost, or a slave restarted, a full replication had to be performed. Starting from version 2.8, Redis introduced psync together with a replication backlog. When the master synchronizes write instructions to its slaves, it also writes a copy to the replication backlog. When a slave reconnects after a brief disconnection, it reports the master's run ID and its replication offset. If the run ID matches the master's and the offset is still within the master's replication backlog, the master performs an incremental synchronization.

However, psync still requires a full synchronization in two cases: when the slave restarts, because the master's run ID it had recorded is lost, and when the master is switched, because the new master has a different run ID. Therefore, Redis 4.0 introduced psync2 to strengthen the psync mechanism. In psync2, Redis uses the replid (i.e., the replication ID) instead of the run ID as the basis for the replication decision. Moreover, when a Redis instance creates an RDB snapshot, it stores the replid as auxiliary information in the RDB. After a restart, the slave can therefore recover the master's replication ID while loading the RDB, which allows incremental synchronization to continue.

In psync2, each Redis instance holds two replication IDs, replid and replid2. After Redis starts, it creates a random 40-character string as the initial value of replid. After establishing a master-slave connection, the slave replaces its replid with the master's replid, and the replid of the previous master is kept in replid2. As a result, even if the replication ID reported by a slave differs from the new master's replid, incremental replication is still possible as long as it equals the new master's replid2 and the replication offset is still within the replication backlog.
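The bookkeeping of the two IDs can be sketched as follows. The `ReplState` class and its method names are hypothetical, and the `promote_to_master` step encodes an assumption about a master switch: the promoted node keeps the old ID in replid2 and generates a fresh replid.

```python
import secrets
from dataclasses import dataclass


def new_replid() -> str:
    # 20 random bytes rendered as 40 hexadecimal characters.
    return secrets.token_hex(20)


@dataclass
class ReplState:
    replid: str
    replid2: str = "0" * 40  # placeholder meaning "no previous ID"

    def adopt(self, master_replid: str) -> None:
        # Slave side: take over the master's ID, remember the previous one.
        self.replid2 = self.replid
        self.replid = master_replid

    def promote_to_master(self) -> None:
        # Assumption: on promotion the old ID moves to replid2 and a new
        # replid is generated.
        self.replid2 = self.replid
        self.replid = new_replid()

    def accepts(self, slave_replid: str) -> bool:
        # Either ID counts as a match for the incremental-sync decision.
        return slave_replid in (self.replid, self.replid2)


old_master = ReplState(new_replid())
slave_b = ReplState(new_replid())
slave_c = ReplState(new_replid())
slave_b.adopt(old_master.replid)
slave_c.adopt(old_master.replid)

slave_b.promote_to_master()  # B becomes the new master
print(slave_b.accepts(slave_c.replid))  # C still reports the old master's ID
```

Because the promoted node remembers the old master's ID in replid2, a slave that still reports that ID is not rejected on the ID check alone; the offset check against the backlog still applies.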

Redis Replication Analysis #

To set up a master and a slave in Redis, first make a node the master, either through configuration or with the command slaveof no one. Other nodes can then be attached to the master as slaves with the command slaveof <master_ip> <master_port> (since Redis 5.0 the same command is also available as replicaof). The same method can be used to attach a slave to an existing slave. Before data replication starts, the slave first establishes a connection with the master and reports its information. The specific process is as follows.

After the slave establishes a connection with the master, it sends a ping command. If the master returns pong instead of an error, the master is available. If the master is password protected, the slave sends an auth $masterauth command for authentication. Once authenticated, the slave sends its own port and IP to the master using the replconf command. The slave then uses replconf again to send capa eof and capa psync2, announcing the replication capabilities it supports. If the master accepts them, the slave sends its replication ID and replication offset to the master using psync, and data synchronization officially begins.
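The command sequence above can be written out as the bytes a slave would put on the wire. This is a sketch that only builds the messages and does not open a connection. The function names are hypothetical, the encoding follows the Redis serialization protocol (RESP) for commands, and two details are assumptions rather than statements from this lesson: the IP is sent only when an announce address is configured, and a slave with no replication history sends `PSYNC ? -1`.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def handshake_commands(listening_port: int, replid: str = "?",
                       offset: int = -1, masterauth=None, announce_ip=None):
    """Return the handshake as a list of argument tuples, in sending order."""
    cmds = [("PING",)]
    if masterauth is not None:
        cmds.append(("AUTH", masterauth))
    cmds.append(("REPLCONF", "listening-port", str(listening_port)))
    if announce_ip is not None:
        cmds.append(("REPLCONF", "ip-address", announce_ip))
    cmds.append(("REPLCONF", "capa", "eof", "capa", "psync2"))
    cmds.append(("PSYNC", replid, str(offset)))
    return cmds


for cmd in handshake_commands(6380, masterauth="secret"):
    print(encode_command(*cmd))
```

In a real exchange the slave reads the master's reply after each step before continuing; the sketch leaves that out to keep the ordering of the commands visible.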

When the master receives the psync command from the slave, it decides which synchronization method to use. As mentioned earlier, a Redis instance stores two replication IDs, replid and replid2. If the replication ID received from the slave equals either the master's replid or its replid2, and the replication offset is still in the replication backlog, incremental synchronization can take place. The master sends a continue response that carries its replid. The slave replaces its own replid with the master's replid and keeps the previous replication ID as replid2. After that, the master sends the slave the instructions that follow the replication offset, completing data synchronization.

If the master finds that the replication ID received from the slave differs from both its replid and its replid2, or that the replication offset is no longer in the replication backlog, it determines that full replication is required. The master sends a fullresync response, along with its replid and replication offset. Then, as needed, the master builds an RDB and sends the RDB, followed by the write commands buffered in the meantime, to the slave.
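The master's decision between the two responses can be condensed into a small function. This is a simplified sketch of the rule as described here: the function and parameter names are hypothetical, and a real server applies further checks that are omitted, for example a bound on how far the secondary ID remains valid.

```python
def handle_psync(slave_replid: str, slave_offset: int, *, replid: str,
                 replid2: str, backlog_first: int, master_offset: int) -> str:
    """Return the reply line the master would send for a PSYNC request."""
    id_matches = slave_replid in (replid, replid2)
    offset_in_backlog = backlog_first <= slave_offset <= master_offset
    if id_matches and offset_in_backlog:
        # Incremental synchronization: resume from the slave's offset.
        return f"+CONTINUE {replid}"
    # Full synchronization: the slave restarts from the master's state.
    return f"+FULLRESYNC {replid} {master_offset}"


master = dict(replid="a" * 40, replid2="b" * 40,
              backlog_first=1000, master_offset=5000)

print(handle_psync("b" * 40, 4200, **master))  # old ID, offset retained
print(handle_psync("a" * 40, 10, **master))    # offset already overwritten
print(handle_psync("?", -1, **master))         # slave without any history
```

Only the first request satisfies both conditions and is answered with a continue response; the other two fall through to fullresync.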

For incremental replication, the slave then waits to receive the replication backlog and newly added write commands from the master for data synchronization.

For full synchronization, the slave first performs cleanup work for nested replication. For example, if the slave currently has nested sub-slaves, it closes all of their connections and clears its own replication backlog. Then, the slave creates a temporary RDB file and writes the data read from the master connection into it. While writing the RDB file, the slave performs an fsync after every 8 MB to flush the file buffer to disk. Once the RDB has been fully received, the temporary file is renamed to the actual RDB file name.
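The receive-then-rename pattern on the slave can be sketched with ordinary file operations. The function `receive_rdb` and its parameters are hypothetical, the chunks stand in for data read from the master socket, and the final flush before the rename is a choice made for this sketch.

```python
import os
import tempfile

FSYNC_EVERY = 8 * 1024 * 1024  # flush to disk after every 8 MB written


def receive_rdb(chunks, final_path: str, fsync_every: int = FSYNC_EVERY) -> int:
    """Write incoming chunks to a temporary file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    written = 0
    since_sync = 0
    try:
        with os.fdopen(fd, "wb") as f:
            for chunk in chunks:
                f.write(chunk)
                written += len(chunk)
                since_sync += len(chunk)
                if since_sync >= fsync_every:
                    # Spread the disk I/O instead of one large flush at the end.
                    f.flush()
                    os.fsync(f.fileno())
                    since_sync = 0
            f.flush()
            os.fsync(f.fileno())
        # Rename within the same directory, so readers never see a partial file.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return written


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "dump.rdb")
total = receive_rdb([b"REDIS", b"0009", b"x" * 100], target, fsync_every=64)
print(total, os.listdir(workdir))
```

Writing to a temporary name first means that an interrupted transfer leaves the previous RDB file untouched, and flushing periodically avoids one large burst of disk I/O at the end of the transfer.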

Next, the slave clears its old data by deleting everything in its local databases, and it temporarily stops reading from the master connection. Then, the slave loads the RDB to restore the data into memory. Once the RDB is loaded, the slave reuses the socket connection with the master to create a client for it and registers a read event, so that it can start receiving write commands from the master. At this point, the slave also sets its own replication ID and replication offset to the master's replid and replication offset, and clears its own replid2, because all nested sub-slaves of the slave also need to perform full replication. Finally, if AOF is enabled, the slave reopens the AOF file, and from then on it executes the write commands received from the master and appends them to its own AOF.

Compared to previous versions, the optimization of psync2 is evident. In scenarios such as short disconnection, slave restart, or master switch, as long as the delay is not too long and the replication offset is still in the replication backlog, incremental synchronization can still be performed. This reduces the burden on the master and the network bandwidth. At the same time, the slave can achieve data synchronization through lightweight incremental replication, quickly recovering the service and reducing system jitter.

However, psync2 still relies heavily on the replication backlog. If the backlog is too large, it occupies too much memory; if it is too small, full replication is triggered frequently. Moreover, because memory is limited, even a relatively large replication backlog can be overwritten completely while a slave is disconnected for a long time, which again results in full replication.
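A common way to reason about this trade-off is to size the backlog from the master's write rate and the longest disconnection that should still be recoverable incrementally. The helper below is a rough rule-of-thumb sketch: the function name and the safety factor of 2 are assumptions made for illustration, and the result would be applied through the `repl-backlog-size` configuration directive.

```python
import math


def min_backlog_bytes(write_bytes_per_sec: float, max_disconnect_sec: float,
                      safety_factor: float = 2.0) -> int:
    """Estimate a backlog size that survives the given disconnection time."""
    return math.ceil(write_bytes_per_sec * max_disconnect_sec * safety_factor)


# Example: about 1 MB/s of replicated writes, tolerate a 60-second outage.
size = min_backlog_bytes(1_000_000, 60)
print(size)  # 120000000 bytes, roughly 120 MB
```

The estimate also shows why long outages remain a problem: tolerating an hour at the same write rate would need several gigabytes of memory for the backlog alone.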