35 Practical Redis Performance Optimization Solutions #

Redis is built on a single-threaded execution model: one thread handles all client commands. Although Redis uses non-blocking I/O and most commands run in O(1) time, that single thread means one slow operation stalls every connected client, so Redis is especially sensitive to inefficient usage. In this article, we will apply a set of optimization techniques to make Redis run more efficiently.

In this article, we will use the following methods to improve the running speed of Redis:

  1. Shorten the storage length of key-value pairs;
  2. Use the lazy free feature;
  3. Set expiration time for key-value pairs;
  4. Disable time-consuming query commands;
  5. Use slowlog to optimize time-consuming commands;
  6. Use Pipeline for batch data operations;
  7. Avoid simultaneous expiration of a large amount of data;
  8. Optimize client usage;
  9. Limit the memory size of Redis;
  10. Install Redis service on physical machines instead of virtual machines;
  11. Check the data persistence strategy;
  12. Use distributed architecture to increase read and write speeds.

Shorten the storage length of key-value pairs #

Performance falls as key-value pairs grow longer. For example, a write performance test gives results like the following:

| Data Size | Key Size | Value Size | Average Time for string `set` | Average Time for hash `hset` |
| --- | --- | --- | --- | --- |
| 1 million | 20 bytes | 512 bytes | 1.13 microseconds | 10.28 microseconds |
| 1 million | 20 bytes | 200 bytes | 0.74 microseconds | 8.08 microseconds |
| 1 million | 20 bytes | 100 bytes | 0.65 microseconds | 7.92 microseconds |
| 1 million | 20 bytes | 50 bytes | 0.59 microseconds | 6.74 microseconds |
| 1 million | 20 bytes | 20 bytes | 0.55 microseconds | 6.60 microseconds |
| 1 million | 20 bytes | 5 bytes | 0.53 microseconds | 6.53 microseconds |

From the above data, we can see that for the same key size, the larger the value, the slower the operation. This is because Redis stores the same data type with different internal encodings. Strings, for example, have three: int (integer encoding), embstr (an optimized embedded encoding for short strings), and raw (a plain dynamic string used for longer values). The author of Redis balances efficiency and space through these encodings, but the larger the data, the more complex the internal encoding required, and the more complex encodings are slower to store.
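You can inspect the encoding Redis picked with the OBJECT ENCODING command. A quick redis-cli session (assuming a local instance; key names are illustrative) looks like this:

```shell
redis-cli SET counter 123
redis-cli OBJECT ENCODING counter   # "int": stored as an integer
redis-cli SET greeting hello
redis-cli OBJECT ENCODING greeting  # "embstr": short string, single allocation
redis-cli SET blob "................................................................"
redis-cli OBJECT ENCODING blob      # "raw": longer string, plain SDS
```

Since Redis 3.2, strings up to 44 bytes use embstr; longer ones fall back to raw.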

This is only the speed during writing. When the content of the key-value pair is large, it will also bring several other problems:

  • The larger the content, the longer persistence takes and the longer the pauses it causes, which lowers Redis performance.
  • The larger the content, the more content needs to be transmitted over the network, and the longer the transmission time, which will lower the overall operation speed.
  • The larger the content, the more memory it occupies, and it will trigger the memory eviction mechanism more frequently, thus bringing more operational burden to Redis.

Therefore, while ensuring the complete semantics, we should try to shorten the storage length of key-value pairs as much as possible, and if necessary, serialize and compress the data before storage. Taking Java as an example, we can use protostuff or kryo for serialization, and snappy for compression.
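As a minimal, dependency-free sketch of the serialize-and-compress idea (the text suggests snappy; here the JDK's built-in Deflater stands in so the example is self-contained):

```java
import java.util.Arrays;
import java.util.zip.Deflater;

public class CompressExample {
    // Compress a serialized value before writing it to Redis. Deflater is
    // used in place of snappy so the sketch needs no third-party library.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[input.length + 64]; // worst case is slight expansion
        int len = deflater.deflate(buf);
        deflater.end();
        return Arrays.copyOf(buf, len);
    }

    public static void main(String[] args) {
        // Repetitive payloads (JSON, logs) compress very well.
        String value = "{\"status\":\"ok\",\"items\":[]}".repeat(50);
        byte[] compressed = compress(value.getBytes());
        System.out.println(value.getBytes().length + " bytes -> " + compressed.length + " bytes");
    }
}
```

The compressed bytes would then be stored with a binary-safe call such as Jedis's set(byte[], byte[]) and decompressed with Inflater after reading.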

Use the lazy free feature #

The lazy free feature, added in Redis 4.0, is very useful. It can be understood as lazy or delayed deletion: when a key is deleted, the release of its value is performed asynchronously in a separate BIO (Background I/O) thread instead of on the main thread. This reduces deletion-induced blocking of the Redis main thread and effectively avoids the performance and availability problems caused by deleting big keys.

There are four scenarios corresponding to lazy free, which are off by default:

lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
slave-lazy-flush no

They represent the following meanings:

  • lazyfree-lazy-eviction: Indicates whether to enable lazy free mechanism for deletion when Redis’s memory usage exceeds maxmemory.
  • lazyfree-lazy-expire: Indicates whether to enable lazy free mechanism for expiration of keys with set expiration time.
  • lazyfree-lazy-server-del: Some instructions have an implicit del key operation when processing existing keys, such as the rename command. When the target key already exists, Redis will delete the target key first. If these target keys are big keys, it will cause blocking deletion. This configuration indicates whether to enable lazy free mechanism for deletion in this scenario.
  • slave-lazy-flush (renamed to replica-lazy-flush in Redis 5.0): when a slave node performs full synchronization with the master, it flushes its own data before loading the master’s RDB file. This configuration indicates whether that flush uses the lazy free mechanism.

It is recommended to enable the lazyfree-lazy-eviction, lazyfree-lazy-expire, and lazyfree-lazy-server-del configurations to effectively improve the execution efficiency of the main thread.
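In redis.conf, the recommended settings look like this (the replica flush stays at its default here):

```
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes
lazyfree-lazy-server-del yes
slave-lazy-flush no
```

In recent Redis versions these can also be changed at runtime, e.g. config set lazyfree-lazy-expire yes.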

Setting the expiration time for key-value pairs #

We should set a reasonable expiration time for key-value pairs based on the actual business situation. This way, Redis will automatically remove expired key-value pairs to save memory usage and avoid excessive accumulation of key-value pairs, which can trigger frequent memory eviction strategies.

Disabling time-consuming query commands #

Most of Redis’s read and write commands have a time complexity ranging from O(1) to O(N). The official documentation provides the time complexity of each command, which can be found at:

https://redis.io/commands


Among them, O(1) commands are safe to use, while O(N) commands should be used with caution: N is unbounded, so queries get slower as the data grows. Because Redis executes commands on a single thread, a long-running command blocks everything else and causes significant delays.

To avoid the impact of O(N) commands on Redis, you can make the following improvements:

  • Forbid the keys command in production environments.
  • Avoid querying all members at once and use the scan command for iterative and cursor-based traversal.
  • Strictly control the number of elements in hash, set, sorted set, and other collection structures through business-level limits.
  • Perform sorting, unions, intersections, and other operations on the client side to reduce the load on the Redis server.
  • Deleting (del) a large key can take a long time, so it is recommended to use asynchronous deletion with unlink instead: the key is removed from the keyspace immediately, and the memory is reclaimed in a background thread without blocking the Redis main thread.
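For example, a full-keyspace lookup with keys can be replaced by cursor-based traversal from redis-cli (assuming a local instance; the user:* pattern is illustrative):

```shell
redis-cli KEYS 'user:*'                    # blocks the server on large keyspaces; avoid
redis-cli SCAN 0 MATCH 'user:*' COUNT 100  # returns a cursor plus one small batch
redis-cli --scan --pattern 'user:*'        # lets redis-cli drive the cursor loop
```

Repeat SCAN with the returned cursor until it comes back as 0; each call touches only a bounded slice of the keyspace, so the server stays responsive.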

Optimizing time-consuming commands with slowlog #

We can use the slowlog function to find the most time-consuming Redis commands and optimize them accordingly to improve the performance of Redis. There are two important configuration items for slow queries:

  • slowlog-log-slower-than: Sets the threshold for slow queries; any command that takes longer than this value is recorded in the slow query log. The unit is microseconds (1 second = 1,000,000 microseconds).
  • slowlog-max-len: Used to configure the maximum number of records in the slow query log.

We can configure these items according to the actual business situation. Entries are stored in reverse order of insertion, so slowlog get n retrieves the n most recent slow queries; use them to locate and optimize the corresponding business logic.
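A redis.conf sketch of the two settings (the 10 ms threshold and 128-entry cap are illustrative values, not recommendations):

```
# Record any command that takes longer than 10,000 microseconds (10 ms)
slowlog-log-slower-than 10000
# Keep at most the 128 most recent slow-query entries
slowlog-max-len 128
```

At runtime, slowlog get 10 returns the ten most recent entries and slowlog reset clears the log.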

Using Pipeline for batch data operations #

Pipeline is a client-side batch processing technology that allows multiple Redis commands to be processed at once, thereby improving the performance of the entire interaction.

Let’s test the performance comparison between Pipeline and normal operations using Java code. Here is an example of Pipeline code:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class PipelineExample {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        // Record the start time of execution
        long beginTime = System.currentTimeMillis();
        // Get the Pipeline object
        Pipeline pipe = jedis.pipelined();
        // Queue multiple Redis commands without waiting for replies
        for (int i = 0; i < 100000; i++) {
            pipe.set("key" + i, "val" + i);
            pipe.del("key" + i);
        }
        // Flush the queued commands and wait for all replies at once
        pipe.sync();
        // Record the end time of execution
        long endTime = System.currentTimeMillis();
        System.out.println("Execution time: " + (endTime - beginTime) + " milliseconds");
        jedis.close();
    }
}

The result of the above program execution is:

Execution time: 297 milliseconds

The code for regular operation is as follows:

import redis.clients.jedis.Jedis;

public class NormalExample {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        // Record the start time
        long beginTime = System.currentTimeMillis();
        // Each command waits for its reply before the next is sent
        for (int i = 0; i < 100000; i++) {
            jedis.set("key" + i, "val" + i);
            jedis.del("key" + i);
        }
        // Record the end time
        long endTime = System.currentTimeMillis();
        System.out.println("Execution time: " + (endTime - beginTime) + " milliseconds");
        jedis.close();
    }
}

The execution result of the program above is:

Execution time: 17276 milliseconds

From the above result, we can see that the execution time of the pipeline is 297 milliseconds, while the execution time of the regular commands is 17276 milliseconds. The pipeline technique is about 58 times faster than the regular execution.

Avoiding simultaneous expiration of a large amount of data #

Redis deletes expired key-value pairs with a periodic sampling strategy: 10 scan cycles per second (configurable in redis.conf via hz, default hz 10). In each cycle, Redis randomly samples 20 keys from those with an expiration time set and deletes any that have expired; if more than 25% of the sample turned out to be expired, it immediately repeats the cycle.

If a large number of caches in a large-scale system expire at the same time, it will cause Redis to continuously scan and delete expired key-value pairs until they become sparse in the expired dictionary. This entire process will cause significant latency in Redis read and write operations. Another reason for this latency is that the memory manager needs to frequently reclaim memory pages, which consumes CPU resources.

To prevent this latency, we need to prevent a large number of caches from expiring at the same time. The simplest solution is to add a random number within a specified range to the expiration time.
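A minimal sketch of the jitter idea in Java (the helper name and the one-hour/five-minute figures are illustrative):

```java
import java.util.concurrent.ThreadLocalRandom;

public class ExpireJitter {
    // Add a random offset to the base TTL so keys written together
    // do not all expire in the same instant.
    static int ttlWithJitter(int baseSeconds, int maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextInt(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        // Base TTL of 1 hour, spread over an extra 0-300 seconds.
        int ttl = ttlWithJitter(3600, 300);
        // With Jedis this would be applied as: jedis.setex(key, ttl, value)
        System.out.println("ttl = " + ttl + " seconds");
    }
}
```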

Client-side optimization #

In addition to using the Pipeline technique as much as possible, we should also try to use a Redis connection pool instead of creating and destroying Redis connections frequently. This reduces the number of network transmissions and unnecessary command invocations.

Limiting Redis memory size #

On a 64-bit operating system, Redis memory is unlimited by default (the configuration item maxmemory <bytes> is commented out), which means that when physical memory runs short, the operating system will page Redis memory out to swap space. Swapping blocks the Redis process and causes delays, hurting overall performance, so we need to limit the memory size of Redis to a fixed value. When Redis reaches this limit, it triggers the memory eviction policy. Redis has 8 memory eviction policies in total; the first 6 are:

  • noeviction: Does not evict any data. When memory is insufficient, new operations will result in an error. This is the default memory eviction policy in Redis.
  • allkeys-lru: Evicts the least recently used key-value pair from all key-value pairs.
  • allkeys-random: Randomly evicts any key-value pair.
  • volatile-lru: Evicts the least recently used key-value pair from all key-value pairs with an expiration time set.
  • volatile-random: Randomly evicts any key-value pair with an expiration time set.
  • volatile-ttl: Among key-value pairs with an expiration time set, evicts the ones closest to expiring (smallest remaining TTL).

Redis 4.0 added 2 additional memory eviction policies:

  • volatile-lfu: Evicts the least frequently used key-value pair from all key-value pairs with an expiration time set.
  • allkeys-lfu: Evicts the least frequently used key-value pair from all key-value pairs.

In the above memory eviction policies, allkeys-xxx evicts data from all key-value pairs, while volatile-xxx evicts data from key-value pairs with expiration time set.

We can set the appropriate memory eviction policy based on the actual business requirements. The default eviction policy does not evict any data and throws an error for new operations.
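In redis.conf this typically takes two lines (the 2gb cap is a placeholder; size it below the machine's physical memory for your workload):

```
maxmemory 2gb
maxmemory-policy allkeys-lru
```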

Use physical machines instead of virtual machines #

Running Redis server on virtual machines has poor performance in terms of memory usage and network latency because virtual machines share a physical network interface with the physical machine, and a physical machine may have multiple virtual machines running. You can use the command ./redis-cli --intrinsic-latency 100 to check the latency. If you have high performance requirements for Redis, it is recommended to deploy the Redis server directly on a physical machine.

Checking Data Persistence Strategy #

The persistence strategy of Redis is to copy the data in memory to the disk, which is necessary for disaster recovery or data migration. However, maintaining this persistence feature requires a significant performance overhead.

After Redis 4.0, there are three persistence methods available:

  • RDB (Redis Database, snapshot method) writes the memory data of a specific moment to the disk in a binary format.
  • AOF (Append Only File, file append method) records all operation commands and appends them to a file in text format.
  • Hybrid persistence, a new method introduced in Redis 4.0, combines the advantages of RDB and AOF. In this method, the current data is first written to the file in RDB format, and then the subsequent operation commands are stored in the file in AOF format. This approach ensures both the speed of Redis restart and reduces the risk of data loss.

RDB and AOF have their pros and cons. RDB may cause data loss within a certain time period, while AOF, due to its larger file size, will affect the startup speed of Redis. To enjoy the benefits of both RDB and AOF, the hybrid persistence method was introduced in Redis 4.0. Therefore, when it is necessary to perform persistence operations, the hybrid persistence method should be chosen.

To check whether hybrid persistence is enabled, use the command config get aof-use-rdb-preamble. A value of yes in the reply means hybrid persistence is enabled; no means it is disabled. In Redis 5.0, the default value is yes.

If you are using a different version of Redis, first check whether hybrid persistence is enabled. If it is disabled, you can enable it in the following two ways:

  • Enable via the command line
  • Enable by modifying the Redis configuration file

Enabling via the Command Line #

Use the command config set aof-use-rdb-preamble yes.

A disadvantage of setting configurations via the command line is that the configuration will become invalid after restarting the Redis service.

Enabling by Modifying the Redis Configuration File #

In the root directory of Redis, find the redis.conf file and change the configuration aof-use-rdb-preamble no to aof-use-rdb-preamble yes.

After configuring, restart the Redis server for the configuration to take effect. The trade-off versus the command-line approach is reversed: the setting now survives restarts, but a restart is required to apply it in the first place.

It should be noted that for non-essential persistence operations, persistence can be disabled to effectively improve the running speed of Redis and avoid intermittent freezing issues.

Using Distributed Architecture to Increase Read/Write Speed #

Redis distributed architecture has three important means:

  • Master-slave synchronization
  • Sentinel mode
  • Redis Cluster

Using the master-slave synchronization function, write operations can be executed on the master node while read operations can be performed on the slave nodes. This way, more requests can be processed in a unit of time, thereby improving the overall running speed of Redis.

Sentinel mode is an upgrade to the master-slave function. After the master node crashes, Redis can be automatically restored to normal operation without manual intervention.

Redis Cluster was officially introduced in Redis 3.0. In Redis clustering, data is stored in multiple nodes to balance the load pressure of each node.

Redis Cluster uses virtual hash slot partitioning, and all keys are mapped to integer slots from 0 to 16383 based on a hash function. The calculation formula is as follows:

slot = CRC16(key) & 16383

Each node is responsible for maintaining a portion of the slots and the key-value data mapped by these slots. This way, Redis can distribute the read/write pressure to multiple servers, thus greatly improving performance.
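The slot mapping is easy to reproduce. Here is a self-contained sketch of the CRC16 variant the Redis Cluster specification uses (CRC16-CCITT/XModem; the spec's reference value is CRC16("123456789") = 0x31C3):

```java
public class SlotExample {
    // CRC16-CCITT (XModem): polynomial 0x1021, initial value 0x0000,
    // most-significant bit first -- the variant Redis Cluster specifies.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // slot = CRC16(key) & 16383, i.e. CRC16(key) mod 16384
    static int slot(String key) {
        return crc16(key.getBytes()) & 16383;
    }

    public static void main(String[] args) {
        System.out.println("slot(123456789) = " + slot("123456789")); // 12739
    }
}
```

One detail worth knowing: if a key contains a {tag} section, only the tag is hashed, which lets related keys (e.g. {user:1000}.profile and {user:1000}.cart) land in the same slot for multi-key operations.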

Among these three features, we generally only need to use one. Redis Cluster should undoubtedly be the preferred solution, as it automatically distributes read/write pressure across more servers and has automatic failover built in.