19 Impact of AOF Rewrite Trigger Timing and Rewriting #

We know that besides using memory snapshot RDB to ensure data reliability, Redis can also use AOF logs. However, while the RDB file saves the memory data at a certain moment as a file, the AOF log records all the received write operations. If there are a lot of write requests to the Redis server, the recorded operations in the AOF log will increase, resulting in larger AOF log files.

Therefore, to avoid generating excessively large AOF log files, Redis performs AOF rewriting, which means recording the insert operations of the latest content for each key-value pair in the current database, instead of recording its historical write operations. As a result, the size of the rewritten AOF log file can be reduced.

So when will AOF rewriting be triggered? Will the process of writing files for AOF rewriting block the main thread of Redis and affect its performance?

In today’s lesson, I will introduce the code implementation process of AOF rewriting. By understanding its code implementation, we can have a clear understanding of the performance of the AOF rewriting process and its impact on the Redis server. Therefore, when you encounter a slow Redis server issue, you can check whether it is caused by AOF rewriting.

Alright, next, let’s take a look at the AOF rewriting function and its triggering timing.

AOF Rewrite Function and Trigger Timing #

First, the function that implements AOF rewriting is rewriteAppendOnlyFileBackground, which is implemented in the aof.c file. In this function, the fork function is called to create a child process for AOF rewriting to perform the actual rewrite operation. I will explain the specific implementation of this function later. Now, let’s see which functions call this function so that we can understand the timing of AOF rewriting.

In fact, the rewriteAppendOnlyFileBackground function is called in three functions.

The first one is the bgrewriteaofCommand function. This function is implemented in the aof.c file and corresponds to the bgrewriteaof command executed on the Redis server. This means that we manually trigger the execution of AOF rewriting.

However, even if we manually execute the bgrewriteaof command, the bgrewriteaofCommand function determines whether to actually perform AOF rewriting based on the following two conditions:

  * Condition 1: Whether an AOF rewrite child process is already running. If so, the bgrewriteaofCommand function does not start another rewrite.
  * Condition 2: Whether an RDB child process is currently running. If so, the bgrewriteaofCommand function sets the aof_rewrite_scheduled member of the global variable server to 1. This flag marks an AOF rewrite as scheduled, so the Redis server performs it later, once the required conditions are met (we will see below which conditions those are).

This means that the bgrewriteaofCommand function immediately calls the rewriteAppendOnlyFileBackground function and actually performs AOF rewriting only when there is neither an AOF rewrite child process nor an RDB child process.

The following code shows the basic execution logic of the bgrewriteaofCommand function:

void bgrewriteaofCommand(client *c) {
    if (server.aof_child_pid != -1) {
        ... // AOF rewrite child process exists, so no rewriting is performed
    } else if (server.rdb_child_pid != -1) {
        server.aof_rewrite_scheduled = 1; // RDB child process exists, set AOF rewriting to be scheduled
        ...
    } else if (rewriteAppendOnlyFileBackground() == C_OK) { // Perform AOF rewriting
        ...
    } 
    ...
}
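To make the three branches easier to reason about, here is a minimal, self-contained sketch of the same decision. The enum, its labels, and the function name decide_bgrewriteaof are hypothetical names chosen for illustration and do not appear in aof.c; only the branch order is taken from the snippet above.

```c
#include <sys/types.h>

/* Hypothetical outcome labels for this sketch; not identifiers from aof.c. */
typedef enum {
    REWRITE_REJECTED,   /* an AOF rewrite child is already running */
    REWRITE_SCHEDULED,  /* an RDB child is running, so the rewrite is deferred */
    REWRITE_STARTED     /* no child is running, so a rewrite can be attempted now */
} rewrite_decision;

/* Follows the branch order of bgrewriteaofCommand. A pid of -1 means
 * "no such child process". The real function additionally checks that
 * rewriteAppendOnlyFileBackground() returned C_OK before reporting success. */
rewrite_decision decide_bgrewriteaof(pid_t aof_child_pid, pid_t rdb_child_pid,
                                     int *aof_rewrite_scheduled) {
    if (aof_child_pid != -1) return REWRITE_REJECTED;
    if (rdb_child_pid != -1) {
        *aof_rewrite_scheduled = 1;  /* picked up later by serverCron */
        return REWRITE_SCHEDULED;
    }
    return REWRITE_STARTED;
}
```

Note that the AOF child check comes first, so a running AOF rewrite never sets the scheduled flag.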

The second one is the startAppendOnly function. This function is also implemented in the aof.c file, and it is called by the configSetCommand function (in the config.c file) and the restartAOFAfterSYNC function (in the replication.c file).

First, for the configSetCommand function, it corresponds to enabling AOF functionality by executing the config command in Redis, as shown below:

config set appendonly yes

Once AOF functionality is enabled, the configSetCommand function calls the startAppendOnly function to perform AOF rewriting once.

As for the restartAOFAfterSYNC function, it is called during the replication process between master and slave nodes. In simple terms, when the AOF option is enabled on the slave node, the AOF option will be turned off when loading and parsing the RDB file. Then, regardless of whether the slave node successfully loads the RDB file, the restartAOFAfterSYNC function will be called to restore the closed AOF functionality.

During this process, the restartAOFAfterSYNC function calls the startAppendOnly function and further calls the rewriteAppendOnlyFileBackground function to perform AOF rewriting once.

Here, you should note that, similar to the bgrewriteaofCommand function, the startAppendOnly function also checks whether there is an RDB child process currently executing. If there is, it sets the AOF rewriting to be scheduled. In addition, if the startAppendOnly function detects that an AOF rewrite child process is executing, it kills the child process first and then calls the rewriteAppendOnlyFileBackground function to perform AOF rewriting.
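The difference from bgrewriteaofCommand can be sketched as a small decision function. This is only an illustration of the behavior described above: the enum and the name decide_start_append_only are hypothetical, and checking the RDB child before the AOF child is an assumption of this sketch.

```c
#include <sys/types.h>

/* Hypothetical outcome labels for this sketch; not identifiers from aof.c. */
typedef enum {
    START_AOF_SCHEDULED,         /* RDB child running: defer the rewrite */
    START_AOF_KILL_THEN_REWRITE, /* AOF child running: kill it, then rewrite */
    START_AOF_REWRITE_NOW        /* nothing running: rewrite immediately */
} start_aof_decision;

/* A pid of -1 means "no such child process". The check order (RDB child
 * first) is an assumption made for this sketch. */
start_aof_decision decide_start_append_only(pid_t aof_child_pid,
                                            pid_t rdb_child_pid,
                                            int *aof_rewrite_scheduled) {
    if (rdb_child_pid != -1) {
        *aof_rewrite_scheduled = 1;
        return START_AOF_SCHEDULED;
    }
    if (aof_child_pid != -1) return START_AOF_KILL_THEN_REWRITE;
    return START_AOF_REWRITE_NOW;
}
```

Unlike the manual command, a running AOF rewrite child does not block the new rewrite here; it is replaced by it.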

So, at this point, we can actually find that, whether it is the bgrewriteaofCommand function or the startAppendOnly function, when they detect that there is an RDB child process executing, they will set the aof_rewrite_scheduled variable to 1, indicating that the AOF rewriting operation will be executed when the conditions are met.

So, when will the Redis server recheck whether the conditions for AOF rewriting are met? This is related to the serverCron function, which is called periodically while the Redis server is running.

The third one is the serverCron function. The serverCron function is periodically executed during the operation of the Redis server. During the execution, it checks twice to decide whether to perform AOF rewriting.

First, the serverCron function checks three things: that no RDB child process is running, that no AOF rewrite child process is running, and that an AOF rewrite has been scheduled (i.e., the aof_rewrite_scheduled variable is 1). If all three conditions are met, the serverCron function calls the rewriteAppendOnlyFileBackground function to perform AOF rewriting. The execution logic of this part in the serverCron function is as follows:

// If there is no RDB child process, no AOF rewrite child process, and AOF rewrite is scheduled to be executed, 
// call the rewriteAppendOnlyFileBackground function to perform AOF rewriting
if (server.rdb_child_pid == -1 && server.aof_child_pid == -1 &&
    server.aof_rewrite_scheduled)
{
    rewriteAppendOnlyFileBackground();
}

In fact, the above code also answers the question we just mentioned: when will the scheduled AOF rewrite be executed?

Actually, if the AOF rewrite cannot be executed immediately, we don’t need to worry. Once the aof_rewrite_scheduled variable is set to 1, the serverCron function, which by default runs every 100 milliseconds, checks the value of this variable on each run. So, once the running RDB child process or AOF rewrite child process ends, the scheduled AOF rewrite is executed quickly.
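The 100-millisecond figure follows from the server's hz setting, which defaults to 10 runs per second. A tiny sketch of that arithmetic, where the function name cron_period_ms is hypothetical and only the 1000/hz relationship reflects how Redis derives the period:

```c
/* Hypothetical helper: milliseconds between two serverCron runs for a given
 * hz value (runs per second). Returns -1 for a non-positive hz. */
int cron_period_ms(int hz) {
    if (hz <= 0) return -1;
    return 1000 / hz;
}
```

With the default hz of 10 this gives 100 milliseconds, so a scheduled rewrite waits at most a few cron periods after the blocking child process exits.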

In addition, even if the AOF rewrite operation is not scheduled to be executed, the serverCron function will periodically check whether AOF rewrite needs to be executed. There are mainly three conditions for this judgment: AOF is enabled, AOF file size ratio exceeds the threshold, and the absolute value of AOF file size exceeds the threshold.

As a result, when all three conditions are met and there are no RDB child process and AOF child process running, the serverCron function will call the rewriteAppendOnlyFileBackground function to perform AOF rewriting. The code logic of this part is as follows:

// If AOF is enabled, there is no RDB child process and AOF rewrite child process running,
// AOF file size ratio is set and AOF file size exceeds the minimum size,
// further judge whether the AOF file size ratio exceeds the threshold
if (server.aof_state == AOF_ON &&
    server.rdb_child_pid == -1 &&
    server.aof_child_pid == -1 &&
    server.aof_rewrite_perc &&
    server.aof_current_size > server.aof_rewrite_min_size) {
   // Calculate the ratio by which the current AOF file size exceeds the base size
   long long base = server.aof_rewrite_base_size ? server.aof_rewrite_base_size : 1;
   long long growth = (server.aof_current_size*100/base) - 100;
   // If the ratio by which the current AOF file size exceeds the base size exceeds the preset threshold, perform AOF rewriting
   if (growth >= server.aof_rewrite_perc) {
      ...
      rewriteAppendOnlyFileBackground();
   }
}
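The growth calculation is easier to follow with concrete numbers. The sketch below isolates the arithmetic from the snippet above into two standalone functions. The function names are hypothetical, and the checks on aof_state and on running child processes are deliberately left out so that only the size logic remains.

```c
#include <stdbool.h>

/* Percentage by which current_size exceeds base_size, using the same integer
 * arithmetic as the serverCron snippet (a base of 0 is treated as 1). */
long long aof_growth_percent(long long current_size, long long base_size) {
    long long base = base_size ? base_size : 1;
    return (current_size * 100 / base) - 100;
}

/* Size-related part of the automatic trigger only. rewrite_perc and
 * rewrite_min_size stand in for server.aof_rewrite_perc and
 * server.aof_rewrite_min_size. */
bool should_auto_rewrite(long long current_size, long long base_size,
                         int rewrite_perc, long long rewrite_min_size) {
    if (rewrite_perc == 0) return false;                 /* trigger disabled */
    if (current_size <= rewrite_min_size) return false;  /* file still small */
    return aof_growth_percent(current_size, base_size) >= rewrite_perc;
}
```

For example, with a base size of 64MB, a threshold of 100 and a minimum size of 64MB, a 96MB file has grown by 50 percent and is left alone, while a 128MB file has grown by 100 percent and qualifies for a rewrite.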

From this code, you can see that in order to avoid the AOF file becoming too large and occupying too much disk space, as well as increasing the recovery time, you can let Redis server automatically rewrite the AOF file by setting the following two thresholds in the redis.conf file.

  • auto-aof-rewrite-percentage: The percentage by which the AOF file size must exceed the base size, with a default value of 100, which means the file has grown to twice the base size.
  • auto-aof-rewrite-min-size: The minimum absolute value of AOF file size, with a default value of 64MB.
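For reference, a redis.conf fragment that spells out the two default values described above:

```
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
```

Because the serverCron check requires server.aof_rewrite_perc to be non-zero, setting auto-aof-rewrite-percentage to 0 turns the automatic trigger off.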

Well, here we have learned about the four triggering conditions for AOF rewriting. Let me summarize them for you to review.

  • Condition 1: The bgrewriteaof command is executed.
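  • Condition 2: AOF is enabled through the startAppendOnly function, either by executing config set appendonly yes or when the replication process completes parsing and loading the RDB file (regardless of success or failure).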
  • Condition 3: AOF rewriting is set to be scheduled for execution.
  • Condition 4: AOF is enabled, the AOF file size ratio exceeds the threshold, and the AOF file size exceeds the threshold.

In addition, you also need to note that, for any of these four conditions, there must be no running RDB child process or AOF rewrite child process; otherwise, the AOF rewrite cannot be executed right away.

So next, let’s learn about the basic execution process of AOF rewriting.

Basic Process of AOF Rewriting #

First, let’s take another look at the rewriteAppendOnlyFileBackground function introduced earlier. The logic of this function is relatively simple. On one hand, it creates a child process by calling the fork function, and then calls the rewriteAppendOnlyFile function to rewrite the AOF file in the child process.

The rewriteAppendOnlyFile function is implemented in the aof.c file. It mainly calls the rewriteAppendOnlyFileRio function (also in the aof.c file) to complete the rewriting of the AOF log file. Specifically, the rewriteAppendOnlyFileRio function will iterate through each database in the Redis server, retrieve each key-value pair, and record the corresponding insert command for the type of the key-value pair, as well as the content of the key-value pair itself.

For example, if a key-value pair of type String is retrieved, the rewriteAppendOnlyFileRio function will record the SET command and the content of the key-value pair; if a key-value pair of type Set is retrieved, it will record the SADD command and the content of the key-value pair. In this way, when we need to restore the Redis database, we can replay the command operations recorded in the AOF rewrite log to sequentially insert all the key-value pairs.
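To give a feel for what such a recorded command looks like, here is a minimal sketch that encodes a SET command for one String key-value pair in the Redis protocol's multi-bulk request format, which is the format AOF files store commands in. The function name emit_set_command is hypothetical, and the sketch assumes NUL-terminated keys and values, whereas Redis itself handles binary-safe strings.

```c
#include <stdio.h>
#include <string.h>

/* Writes "SET key val" into buf in multi-bulk format: an argument count line
 * ("*3"), then a length line ("$<n>") and a data line for each argument.
 * Returns the number of bytes written, or -1 if buf is too small. */
int emit_set_command(char *buf, size_t buflen, const char *key, const char *val) {
    int n = snprintf(buf, buflen,
                     "*3\r\n$3\r\nSET\r\n$%zu\r\n%s\r\n$%zu\r\n%s\r\n",
                     strlen(key), key, strlen(val), val);
    if (n < 0 || (size_t)n >= buflen) return -1;
    return n;
}
```

Replaying the file then amounts to feeding these encoded commands back to the server one after another.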

On the other hand, in the parent process, the rewriteAppendOnlyFileBackground function sets the aof_rewrite_scheduled variable to 0 and records the start time of the AOF rewriting, as well as the process ID of the AOF child process.

In addition, the rewriteAppendOnlyFileBackground function also calls the updateDictResizePolicy function to disable rehashing during the AOF rewriting. This is because rehashing involves a lot of data movement, which means that the parent process modifies many memory pages that it still shares with the AOF child process. Each of these modifications triggers a copy-on-write page copy, so more rehashing means more memory copying while the AOF file is being written, which can have a negative impact on the performance of the Redis system.

The following code shows the basic execution logic of the rewriteAppendOnlyFileBackground function:

int rewriteAppendOnlyFileBackground(void) {
   ...
   if ((childpid = fork()) == 0) {  // Create child process
      ...
      // Child process calls rewriteAppendOnlyFile to rewrite AOF
      if (rewriteAppendOnlyFile(tmpfile) == C_OK) {
           size_t private_dirty = zmalloc_get_private_dirty(-1);
           ...
           exitFromChild(0);
       } else {
           exitFromChild(1);
       }
   }
   else { // Logic executed by parent process
      ...
      server.aof_rewrite_scheduled = 0;
      server.aof_rewrite_time_start = time(NULL);
      server.aof_child_pid = childpid; // Record the process ID of the rewriting child process
      updateDictResizePolicy(); // Disable rehashing
   }
}
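The reason a forked child can write a consistent file while the parent keeps serving requests is that the child works on the memory contents as they were at the moment of fork. The following sketch demonstrates this with a single integer standing in for the dataset. It is a generic POSIX illustration rather than Redis code, and the function name child_keeps_fork_snapshot is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child still saw the pre-fork value after the parent
 * changed its own copy, 0 if it did not, and -1 if a system call failed. */
int child_keeps_fork_snapshot(void) {
    volatile int value = 1;  /* stands in for the dataset */
    int ready[2];            /* parent -> child: "my copy has been modified" */
    if (pipe(ready) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) { close(ready[0]); close(ready[1]); return -1; }

    if (pid == 0) {          /* child: plays the role of the rewrite process */
        char byte;
        close(ready[1]);
        if (read(ready[0], &byte, 1) != 1) _exit(2);
        _exit(value == 1 ? 0 : 1);  /* exit code 0: snapshot is intact */
    }

    close(ready[0]);
    value = 2;               /* parent write: only the parent's copy changes */
    if (write(ready[1], "x", 1) != 1) {
        close(ready[1]);
        waitpid(pid, NULL, 0);
        return -1;
    }
    close(ready[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

The pipe is used only to make sure the child reads the value after the parent has modified its own copy, so the result does not depend on scheduling.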

From here, you can see that AOF rewriting is similar to RDB creation in that they both create a child process to iterate through all databases and record each key-value pair to a file. However, there are two differences between AOF rewriting and RDB file creation:

  • First, the AOF file records each key-value pair’s insert operation in the form of “command + key-value pair”, while the RDB file records the key-value pair data itself.
  • Second, during AOF rewriting or RDB creation, the main process can still handle client write requests. The RDB file only needs to record the data in the database at a certain point in time, so it can ignore these writes. AOF rewriting, however, needs to record as many of the write operations received by the main process as possible in the rewrite log file. Therefore, the AOF child process needs a mechanism to communicate with the main process and receive those write operations.
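As a rough idea of what such a communication mechanism can look like, the sketch below passes one write operation from a parent process to a forked child through a POSIX pipe. It is a generic illustration under simplifying assumptions, not the Redis implementation: the function name send_write_to_child is hypothetical, and the child reports the number of bytes it received through its exit status, which limits the message to fewer than 255 bytes.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends msg to a forked child through a pipe. Returns the number of bytes
 * the child received, or -1 on failure. */
int send_write_to_child(const char *msg) {
    size_t len = strlen(msg);
    int fds[2];
    if (len >= 255 || pipe(fds) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) { close(fds[0]); close(fds[1]); return -1; }

    if (pid == 0) {                 /* child: read until the parent closes */
        char buf[256];
        size_t total = 0;
        ssize_t n;
        close(fds[1]);
        while ((n = read(fds[0], buf, sizeof(buf))) > 0) total += (size_t)n;
        _exit(n == 0 ? (int)total : 255);  /* 255 signals a read error */
    }

    close(fds[0]);                  /* parent: write the whole message */
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = write(fds[1], msg + sent, len - sent);
        if (n <= 0) break;
        sent += (size_t)n;
    }
    close(fds[1]);                  /* lets the child see end-of-file */

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) return -1;
    if (sent != len || WEXITSTATUS(status) == 255) return -1;
    return WEXITSTATUS(status);
}
```

For instance, passing the 9-byte string "SET k v\r\n" makes the child report 9 received bytes.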

The following diagram shows the basic logic of the rewriteAppendOnlyFileBackground function, the execution of the main process and the AOF child process, and the communication between the main process and the child process. Take a look for an overall review.

So far, we have roughly understood the basic process of AOF rewriting. But at this point, you may still have questions, such as how the communication between the AOF child process and the parent process works.

In fact, this communication process is achieved through the operating system’s pipe mechanism, but don’t worry, I will explain this in detail in the next lecture.

Summary #

In today’s lesson, I introduced you to the implementation of Redis AOF rewrite mechanism. You need to pay attention to the following two key points:

  • Triggering AOF rewrite. This includes both manually executing the bgrewriteaof command and the Redis server automatically triggering the rewrite based on the size of the AOF file. In addition, during the process of master-slave replication, the slave node also initiates an AOF rewrite to create a complete AOF log for future recovery. However, you should also note that the Redis server does not run an RDB subprocess and an AOF rewrite subprocess at the same time, so a triggered AOF rewrite has to wait while an RDB subprocess is running.
  • The basic process of AOF rewrite. AOF rewrite is similar to RDB creation in that it creates a subprocess to complete the rewrite job. This is because the AOF rewrite operation actually needs to traverse all the databases on the Redis server, and write each key-value pair into the log file in the form of insertion operations, which requires writing to disk. Therefore, the Redis source code uses subprocesses to implement AOF rewrite, avoiding blocking the main thread and reducing the impact on the overall performance of Redis.

However, you need to be aware that although AOF rewrite and RDB creation both use subprocesses, they also have differences. During the AOF rewrite process, the parent process needs to write as many of the received write operations as possible to the AOF rewrite log. For this purpose, the Redis source code uses the pipe mechanism to implement communication between the parent process and the AOF rewrite subprocess. In the next lecture, I will focus on introducing how Redis uses pipes to achieve communication between the parent and child processes, and what data or information is passed through the pipe.

One question per lesson #

The creation of RDB files is done by a subprocess, and AOF rewriting is also done by a subprocess. These two subprocesses can run independently. So please consider why, in the Redis source code, the AOF rewriting subprocess is not started when there is an RDB subprocess running.