41 Case Study Redis Problem Summary and Resolution Solutions #

This article collects common problems encountered when using Redis and their corresponding solutions. These issues not only arise in day-to-day work but also come up frequently in interviews. Let’s take a look.

Cache Avalanche #

Cache avalanche refers to a situation where a large number of cache entries expire at the same time within a short period, so a flood of requests queries the database directly and puts it under huge load. In severe cases, it may even cause the database to crash.

Let’s first take a look at the flowchart of program execution under normal circumstances and during a cache avalanche. The flowchart under normal circumstances is shown in the following diagram:

[Figure: normal access flow]

The flowchart during a cache avalanche is shown in the following diagram:

[Figure: cache avalanche flow]

From the above comparison, we can see the impact of a cache avalanche on the system. So how do we solve the problem of cache avalanche?

The common solutions to cache avalanche include the following.

Locking and Queueing #

Locking and queueing acts as a buffer that prevents a large number of requests from hitting the database simultaneously. The drawback is that it increases the system’s response time, reduces its throughput, and degrades the user experience somewhat.

The code implementation of locking and queueing is as follows:

import org.apache.commons.lang3.StringUtils;

import redis.clients.jedis.Jedis;

public String getUserList(Jedis jedis) {
    // Cache key
    String cacheKey = "userlist";
    // Query the cache first
    String data = jedis.get(cacheKey);
    if (StringUtils.isNotBlank(data)) {
        // Cache hit: return the result directly
        return data;
    }
    // Cache miss: queue up to query the database, then populate the cache.
    // Note: synchronized only serializes threads within this JVM; interning
    // the key ensures all threads lock on the same String instance.
    synchronized (cacheKey.intern()) {
        // Double check: another thread may have filled the cache
        // while we were waiting for the lock
        data = jedis.get(cacheKey);
        if (StringUtils.isBlank(data)) {
            // Query the database
            data = findUserInfo();
            // Put the result into the cache
            jedis.set(cacheKey, data);
        }
        return data;
    }
}

The above code is an example of locking and queueing; readers can adapt it to their own projects.

Randomizing Expiration Time #

To avoid cache entries expiring at the same time, you can add a random offset to each entry’s expiration time when setting the cache, which greatly reduces the chance of a large number of entries expiring simultaneously.

Here’s an example code:

// Base expiration time of the cache, in seconds (10 minutes)
int exTime = 10 * 60;
// Random number generator
Random random = new Random();
// Set the cache, adding a random offset of up to 1000 seconds to the TTL
jedis.setex(cacheKey, exTime + random.nextInt(1000), value);

Setting Up a Level 2 Cache #

A Level 2 cache adds another caching layer besides Redis. When the Redis cache misses or becomes unavailable, the system first checks the Level 2 cache before falling back to the database.

For example, you can set up a local in-process cache, so that when the Redis cache becomes invalid, the system queries the local cache first instead of going straight to the database.

After adding the Level 2 cache, the program execution flow is shown in the following diagram:

[Figure: execution flow with a Level 2 cache]
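
Here is a minimal sketch of a Level 2 cache, assuming a simple in-process ConcurrentHashMap as the local layer; findUserInfo() and the TTL are illustrative placeholders:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import redis.clients.jedis.Jedis;

public class TwoLevelCache {
    // Local (Level 2) cache shared by all threads in this JVM; a real
    // implementation would add its own expiration (e.g., a caching library)
    private static final Map<String, String> LOCAL_CACHE = new ConcurrentHashMap<>();

    public String get(Jedis jedis, String cacheKey) {
        // 1. Try Redis first
        String data = jedis.get(cacheKey);
        if (data != null) {
            return data;
        }
        // 2. Redis missed: fall back to the local cache
        data = LOCAL_CACHE.get(cacheKey);
        if (data != null) {
            return data;
        }
        // 3. Both layers missed: query the database and repopulate both
        data = findUserInfo(); // illustrative database query
        LOCAL_CACHE.put(cacheKey, data);
        jedis.setex(cacheKey, 10 * 60, data);
        return data;
    }

    private String findUserInfo() {
        return "...";
    }
}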

Cache Penetration #

Cache penetration refers to the situation where the requested data exists in neither the cache nor the database. For fault-tolerance reasons, the system does not cache the empty result of the database query, so every request for that key ends up querying the database. This is called cache penetration.

The execution flow of cache penetration is shown in the following diagram:

[Figure: cache penetration flow]

The red path represents the execution path of cache penetration, and it can be seen that cache penetration puts a lot of pressure on the database.

There are several solutions to the cache penetration problem.

Using filters #

We can use filters to reduce the number of database queries. For example, we can use the Bloom Filter we learned about in the previous chapter. Its principle is to hash the keys that exist in the database into a bitmap. Before each query, the Bloom Filter can tell us when a key definitely does not exist (it may produce false positives, but never false negatives), so requests for nonexistent keys can be rejected before they ever reach the database.
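
As a minimal sketch, assuming Guava’s BloomFilter is on the classpath (loadAllKeysFromDatabase() and the sizing parameters are illustrative):

import java.nio.charset.StandardCharsets;

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

// Build the filter from the keys that actually exist in the database
// (expected insertions and false-positive rate are illustrative)
BloomFilter<String> filter = BloomFilter.create(
        Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);
for (String key : loadAllKeysFromDatabase()) { // illustrative loader
    filter.put(key);
}

// Before touching the cache or database, reject keys that definitely do not exist
if (!filter.mightContain(requestedKey)) {
    return null; // guaranteed absent: skip the database entirely
}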

Cache empty results #

Another approach is to write the result of every database query into the cache, even when it is empty. To improve the front-end user experience (so that a stale “no data” answer does not linger after real data appears), we can set a shorter cache time for empty results, such as 3 to 5 minutes.
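
A minimal sketch of caching empty results, using an empty string as a sentinel for “no data” (the sentinel, TTLs, and findUserInfo() are illustrative choices):

// Sentinel stored in the cache to mark "the database has no data for this key"
final String EMPTY_MARKER = "";

String data = jedis.get(cacheKey);
if (data != null) {
    // Cache hit: the empty marker means the database has no data
    return EMPTY_MARKER.equals(data) ? null : data;
}
data = findUserInfo(); // illustrative database query
if (data == null) {
    // Cache the empty result with a short TTL (3 minutes) to absorb repeated misses
    jedis.setex(cacheKey, 3 * 60, EMPTY_MARKER);
    return null;
}
// Cache real data with the normal TTL (10 minutes)
jedis.setex(cacheKey, 10 * 60, data);
return data;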

Cache Breakdown #

Cache breakdown refers to the situation where a hot cache entry expires at a certain moment just as a large number of concurrent requests for it arrive; those requests all hit the database at once, putting it under huge pressure.

The execution flow of cache breakdown is shown in the following diagram:

[Figure: cache breakdown flow]

There are two solutions to cache breakdown.

Lock-based queuing #

This solution is similar to the locking-and-queueing method used for cache avalanche: a lock is acquired before querying the database, buffering the concurrent requests and reducing the pressure on the database server.
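
Besides the JVM-local synchronized example shown earlier, a common variant (an addition here, not something this article prescribes) is a Redis-based mutex built on SET NX EX, so that only one process cluster-wide rebuilds the hot entry. This sketch assumes Jedis 3.x and an illustrative findUserInfo():

import redis.clients.jedis.params.SetParams;

String lockKey = "lock:" + cacheKey;
// Try to take a short-lived mutex; only one caller across the cluster succeeds.
// The 10-second expiry prevents a crashed holder from blocking everyone forever.
String ok = jedis.set(lockKey, "1", SetParams.setParams().nx().ex(10));
if ("OK".equals(ok)) {
    try {
        // We hold the lock: rebuild the hot entry from the database
        String data = findUserInfo(); // illustrative database query
        jedis.setex(cacheKey, 10 * 60, data);
    } finally {
        jedis.del(lockKey); // release the mutex
    }
} else {
    // Another caller is rebuilding the entry: back off briefly,
    // then re-read the cache instead of hitting the database
}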

Setting never expire #

For certain hot cache entries, we can set them to never expire in order to keep the cache stable. However, these entries must be updated promptly whenever the underlying data changes; otherwise queries will return stale results.
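
A minimal sketch of the never-expire approach: the hot entry is written without a TTL, and the write path refreshes it whenever the data changes (updateUserInfo() and the key are illustrative):

// Initial load: write the hot entry with no expiration time
jedis.set("hot:userlist", findUserInfo());

// Write path: whenever the underlying data changes, update the database
// and immediately overwrite the cached value so queries never go stale
updateUserInfo(newValue); // illustrative database update
jedis.set("hot:userlist", newValue);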

Cache Preheating #

First of all, cache preheating is not a problem but an optimization technique for cache usage; it improves the front-end user experience.

Cache preheating means loading query results into the cache in advance when the system starts, so that subsequent user queries can be served directly from the cache, saving waiting time.

The execution flow of cache preheating is shown in the following diagram:

[Figure: cache preheating flow]

There are three implementation approaches for cache preheating:

  1. Write the methods that need caching into the system’s initialization logic, so that the data is loaded and cached automatically at startup (see the sketch after this list);
  2. Attach the methods that need caching to a certain page or backend interface, and manually trigger cache preheating;
  3. Set up a scheduled task to perform cache preheating automatically at regular intervals.
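
As a sketch of the first approach, assuming a Spring Boot application (CommandLineRunner runs once at startup; the Redis address and findHotUserList() are illustrative):

import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

import redis.clients.jedis.Jedis;

@Component
public class CachePreheater implements CommandLineRunner {

    @Override
    public void run(String... args) {
        // Executed once when the application finishes starting up
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Load hot data from the database and push it into the cache
            String hotData = findHotUserList(); // illustrative DAO call
            jedis.setex("userlist", 10 * 60, hotData);
        }
    }

    private String findHotUserList() {
        return "...";
    }
}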

Summary #

This article introduced cache avalanche, caused by a large number of cache entries expiring within a short period and sending a flood of requests directly to the database; solutions include locking and queueing, randomizing expiration times, and setting up a Level 2 cache. It also covered cache penetration, which occurs when the requested data exists in neither the cache nor the database; solutions include using a Bloom Filter and caching empty results. It then discussed cache breakdown, caused by a hot cache entry expiring under high concurrency, along with solutions such as locking and setting entries to never expire. Finally, it presented cache preheating as a means of optimizing system performance.