26 How to Significantly Improve Redis Processing Performance #

In this lesson, we will mainly learn how to greatly improve performance through Redis multi-threading. This involves the main thread and IO thread, command processing flow, as well as the advantages and disadvantages of multi-threading solutions.

Main Thread #

Redis has been widely praised and widely adopted since its inception. Compared with Memcached, however, there is still an order-of-magnitude gap: a single Memcached instance can handle millions of TPS (Transactions Per Second) in benchmarks and stably sustains 200,000 to 400,000 TPS online, while a single Redis instance tops out at 100,000 to 120,000 TPS and usually reaches only 20,000 to 40,000 TPS online.

The main reason Redis is slower is its single-process, single-threaded model. A few heavyweight operations have been offloaded: RDB snapshot construction runs in a child process, while file closing, file-buffer syncing, and the cleanup of large keys are handled asynchronously by BIO threads. But this is far from sufficient. In production, tens of thousands of clients may be connected to a single Redis instance, and all event handling, request reading, command parsing, command execution, and response writing are done by the main thread. However aggressively that one thread is optimized, its processing capacity has a hard limit. Most current server CPUs have 16 to 32 cores or more, yet Redis effectively uses only one of them for its core work, leaving the remaining cores idle, so its processing performance cannot improve substantially. Memcached, by contrast, can be configured with dozens of worker threads matching the server's core count, and those threads perform IO reads, writes, and task processing concurrently, yielding far higher throughput.
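As context for the BIO threads mentioned above: since Redis 4.0, the cleanup of large keys can be pushed to those background threads with the `UNLINK` command (which returns immediately, unlike `DEL`) and the lazy-free options in `redis.conf`:

```
# redis.conf — offload deletion of large objects to a background (BIO) thread
lazyfree-lazy-eviction   yes   # evictions under maxmemory pressure
lazyfree-lazy-expire     yes   # deletion of expired keys
lazyfree-lazy-server-del yes   # implicit DELs done by commands such as RENAME
```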

IO Thread #

Faced with this performance ceiling, the Redis author has long maintained that it is not a serious problem: a similar multi-threading effect can be achieved by deploying multiple Redis instances. Deploying multiple instances, however, creates complex operational issues, and multiple instances on one machine also interfere with each other, further increasing operational complexity. As a result, there has long been a voice in the community hoping for a multi-threaded version of Redis.

For this reason, Redis is about to introduce a multi-threaded model in version 6.0. The code currently lives in the unstable branch, and version 6.0 is expected to be released next year. Redis's multi-threaded model consists of a main thread and multiple IO threads.

Handling a command request has several time-consuming steps: reading the request, parsing the protocol, executing the command, and writing the response. Redis therefore introduces IO multi-threading to perform the reading and parsing of requests, as well as the writing of responses, concurrently. Everything else, including event triggering, command execution, IO task distribution, and other core operations, remains on the main thread, i.e., is still handled by a single thread. In this way, multiple threads are introduced to the greatest extent possible without changing the original processing flow.
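In the 6.0 code, this behavior is exposed through two `redis.conf` directives: one sets the number of IO threads, and the other opts request reading/parsing into them (by default only response writing is offloaded):

```
# redis.conf — enable IO threads in Redis 6.0
io-threads 4             # number of IO threads (1 = classic single-threaded IO)
io-threads-do-reads yes  # also offload request reading and parsing, not just writes
```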

Command Processing Flow #

The multi-threaded processing flow of Redis 6.0 is shown in the diagram below. The main thread listens on the port and registers connection (accept) events. When a new connection arrives, the main thread accepts it, creates a client, and registers a request read event for that connection.

(Figure: Redis 6.0 multi-threaded command processing flow)

When a command request arrives, a read event fires in the main thread. The main thread does not perform the network read itself; instead, it adds the connection's client to a pending-read queue. In each iteration of Redis's ae event loop, if the pending-read queue is not empty, the main thread dispatches the pending clients to the IO threads one by one and then enters a spin-check wait until the IO threads have read all the network data. "Spin-check wait" means the main thread loops continuously, checking whether the IO threads are done, and performs no other work in the meantime; only when it finds that the IO threads have read all the network data does it leave the loop and continue with subsequent processing.
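The dispatch-and-spin pattern can be sketched as follows. This is a minimal illustration in Python, not Redis source: the names (`io_lists`, `io_pending`, and so on) are invented for the sketch, but the shape mirrors the description above, with per-thread task lists, per-thread pending counters, and a main thread that busy-waits until every counter drops to zero.

```python
import threading
import time

NUM_IO_THREADS = 4
io_lists = [[] for _ in range(NUM_IO_THREADS)]   # per-IO-thread client lists
io_pending = [0] * NUM_IO_THREADS                # per-IO-thread task counters

def io_thread(tid):
    """Each IO thread waits for work, then drains its own list."""
    while True:
        if io_pending[tid] == 0:
            time.sleep(0)                        # stand-in for the real thread's wait
            continue
        for client in io_lists[tid]:
            client["parsed"] = client["raw"].split()   # "read + parse" the request
        io_lists[tid].clear()
        io_pending[tid] = 0                      # signal the main thread: done

clients = [{"raw": f"GET key:{i}"} for i in range(8)]
for tid in range(NUM_IO_THREADS):
    threading.Thread(target=io_thread, args=(tid,), daemon=True).start()

# Main thread: dispatch pending-read clients round-robin to the IO threads...
for i, c in enumerate(clients):
    io_lists[i % NUM_IO_THREADS].append(c)
for tid in range(NUM_IO_THREADS):
    io_pending[tid] = len(io_lists[tid])

# ...then spin-check wait: loop doing nothing until all IO threads finish.
while sum(io_pending) != 0:
    pass

print(all("parsed" in c for c in clients))       # True
```

Note how the main thread only moves on when the *sum* of all pending counters reaches zero, which is exactly why the whole batch advances at the speed of the slowest IO thread.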

Multiple IO threads can be configured, for example 4 to 8. When the IO threads find tasks in their pending-read lists, they begin processing concurrently: each IO thread takes tasks from its own list, reads the request data from the client connection, and parses the command. Once the IO threads have finished reading and parsing all the requests, the pending-read task count drops to 0, and the main thread leaves its spin loop and executes the parsed commands one by one. After executing each command, it writes the response into that client's write buffer; these clients become pending-reply clients and are added to a pending-reply list. The main thread then assigns the pending-reply clients to the IO threads round-robin, and spins again, checking and waiting.

The IO threads then run concurrently once more, flushing each client's response buffer back to that client. When all responses have been written, the pending-reply task count drops to 0, and the main thread ends its spin check and moves on to subsequent tasks and new read requests.
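Putting the two barriers together, one event-loop iteration under this model can be summarized with the sequential sketch below. The function names are illustrative, not Redis source; the point is the strict phase ordering, where each phase must complete for the whole batch before the next begins.

```python
# One event-loop iteration of the Redis 6.0 IO-threading model, as three
# batch phases. In real Redis, phases 1 and 3 run on the IO threads and the
# main thread spin-waits at each barrier; here they run inline for clarity.

def handle_reads(clients):             # phase 1: IO threads, in parallel
    for c in clients:
        c["cmd"] = c["raw"].split()    # read from socket + parse the protocol

def execute_commands(clients, store):  # phase 2: main thread ONLY
    for c in clients:
        op, key = c["cmd"][0], c["cmd"][1]
        c["reply"] = store.get(key, "(nil)") if op == "GET" else "+OK"

def handle_writes(clients):            # phase 3: IO threads, in parallel
    for c in clients:
        c["sent"] = c["reply"]         # flush the reply buffer to the socket

store = {"key:1": "v1"}
batch = [{"raw": "GET key:1"}, {"raw": "GET key:2"}]
handle_reads(batch)                    # barrier 1: wait until all reads parsed
execute_commands(batch, store)
handle_writes(batch)                   # barrier 2: wait until all replies sent
print([c["sent"] for c in batch])      # ['v1', '(nil)']
```

Because execution (phase 2) is serialized on the main thread and the barriers operate on whole batches, a fast request in the batch always waits for the slowest one — a limitation the next section returns to.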

The multi-threaded model introduced in Redis 6.0 thus refers mainly to the configurable IO threads, which are responsible specifically for request reading, parsing, and response writing. With IO multi-threading, Redis's throughput can be roughly doubled.

Pros and Cons of Multithreading Solution #

Although the multithreading solution can roughly double performance, the overall design is still fairly crude:

- All commands are still executed in the main thread, which remains a performance bottleneck.
- All event triggering also happens in the main thread, so multiple cores still cannot be used effectively.
- IO reads and writes are done in batches: all IO threads read all the requests together, and only after the whole batch has been read, parsed, and executed do all IO threads write all the responses together. Different requests must wait for one another, which is inefficient.
- During each IO batch, the main thread detects completion by spin-waiting, which is even less efficient: even under a light workload, it can easily saturate a CPU core.

Because the scheme is so primitive, the gain is limited to roughly a factor of 1 to 2. To achieve a larger boost, command execution and event triggering would need to be split into separate threads, and the threading model itself would need rework: each thread should perform its own IO reads and writes and execute commands independently, without waiting on or competing with the others, in order to truly and efficiently exploit the server's many cores and achieve a substantial performance improvement.