34 | Lectures 23-33: Post-Lecture Thinking Problems, Answers, and Common Questions #

Today, it’s time for us to answer questions and clear up doubts. Let’s take a look at the post-lecture questions from Lectures 23 through 33. In addition, I will explain two typical questions.

Answers to Post-Lecture Questions #

Lecture 23 #

Question: What is the difference between Redis’s read-only cache and its read-write cache with write-through strategy, which both synchronize data to the backend database?

Answer: The main difference lies in how they handle modified cached data. In a read-only cache, the business application directly modifies the database and marks the data in the cache as invalid. In a read-write cache, the business application needs to modify both the cache and the database.

I have summarized the advantages and disadvantages of these two types of cache in the following table:
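To make the difference concrete, here is a minimal Python sketch of the two update paths, using plain dicts to stand in for the Redis cache and the backend database (all names here are illustrative):

```python
# Plain dicts stand in for the Redis cache and the backend database.
cache = {}
db = {"stock": 100}

def update_read_only(key, value):
    """Read-only cache: write the database, then invalidate the cached copy."""
    db[key] = value
    cache.pop(key, None)          # next read will miss and backfill from the DB

def update_write_through(key, value):
    """Read-write cache, write-through: update cache and database together."""
    cache[key] = value            # cache always holds the latest value
    db[key] = value               # synchronous write keeps the DB consistent

cache["stock"] = 100
update_read_only("stock", 90)
print("stock" in cache)            # False: the cached copy was invalidated

update_write_through("stock", 80)
print(cache["stock"], db["stock"]) # 80 80: both copies updated together
```

The read-only path trades a cache miss on the next read for simpler consistency; the write-through path keeps the latest value readable from the cache at the cost of two writes per update.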

Lecture 24 #

Question: Suppose that when using Redis as a cache, we not only modify the cached data directly but also write the dirty data back to the database. We have learned about Redis’s read-only cache mode and its two read-write cache modes (write-through with synchronous writes and write-behind with asynchronous writes). Which of these modes does this Redis cache correspond to?

Answer: If we need to write dirty data back to the database when using Redis cache, it means that the cached data in Redis can be directly modified. This corresponds to the read-write cache mode. Further analysis shows that dirty data is written back to the backend database when it is evicted from the cache, which corresponds to the read-write cache mode with an asynchronous write-back strategy.

Lecture 25 #

Question: When deleting or updating data in a read-only cache, we need to delete the corresponding cached value in the cache. Instead of deleting the cached value, what are the benefits and drawbacks of directly updating the cached value?

Answer: If we directly update the cached value in the cache, the next time the data is accessed, the business application can directly read it from the cache, which is a major benefit.

The drawback is that when there are data update operations, we need to ensure that the data in the cache and the database are consistent. This can be achieved through the retry or delayed double deletion methods I introduced in Lecture 25. However, this requires adding additional code to the business application, which incurs some overhead.
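As a rough illustration of the delayed double deletion idea, here is a Python sketch with dicts standing in for the cache and the database; the delay value is arbitrary here, and in practice it would be tuned to exceed a typical read-plus-backfill round trip:

```python
import threading

cache = {"stock": 100}
db = {"stock": 100}

def update_with_delayed_double_delete(key, value, delay=0.05):
    cache.pop(key, None)      # first delete, before touching the DB
    db[key] = value           # update the source of truth
    # second delete after a short delay, to evict any stale value that a
    # concurrent reader may have backfilled in the meantime
    t = threading.Timer(delay, lambda: cache.pop(key, None))
    t.start()
    return t

t = update_with_delayed_double_delete("stock", 90)
cache["stock"] = 100          # simulate a concurrent reader backfilling stale data
t.join()                      # wait for the delayed second delete to fire
print("stock" in cache)       # False: the stale backfill was removed
```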

Lecture 26 #

Question: When discussing cache avalanche, I mentioned three methods: service circuit breaking, service degradation, and request rate limiting. Can these three methods also be used to deal with cache penetration?

Answer: Regarding this question, student @徐培 gave an excellent answer. He recognized the essence of cache penetration and how it differs from the avalanche and breakdown scenarios. Let me answer this question once more. The essence of cache penetration is that both Redis and the database are queried for data that does not exist. The essence of service circuit breaking, service degradation, and request rate limiting, on the other hand, is to cope with situations where the Redis instance stops functioning as the cache layer; cache avalanche and cache breakdown fall into that category.

In a cache penetration scenario, the business application is reading data that exists in neither Redis nor the database. In this case, unless we intervene manually, Redis cannot play its role as a cache.

One feasible solution is to intercept in advance and prevent requests for data that does not exist in Redis and the database from being sent to the database layer.

Using a Bloom filter is another option. When a Bloom filter determines that data does not exist, that judgment is always correct (a Bloom filter can misjudge only in the other direction, by reporting that data exists when it does not), and the check is very fast. Once the filter determines that the data does not exist, the result can be returned to the client immediately. The benefit of a Bloom filter is that it reduces query pressure on Redis and avoids wasted accesses to the database.
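For illustration, here is a minimal Bloom filter sketch in Python; the bit-array size and hash count are arbitrary here, whereas a production filter would derive them from the expected item count and the target false-positive rate:

```python
import hashlib

M = 1024   # number of bits (illustrative)
K = 3      # number of hash functions (illustrative)

bits = [0] * M

def _hashes(item):
    # derive K positions from salted SHA-256 digests
    for i in range(K):
        h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
        yield int(h, 16) % M

def add(item):
    for pos in _hashes(item):
        bits[pos] = 1

def might_contain(item):
    # False here is definitive ("definitely absent");
    # True may occasionally be a false positive.
    return all(bits[pos] for pos in _hashes(item))

add("user:1001")
print(might_contain("user:1001"))   # True: all K bits were set by add()
print(might_contain("user:9999"))   # expected False; a tiny false-positive chance remains
```

A request for a key the filter reports as absent can be rejected before it ever reaches Redis or the database.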

In addition, for cache avalanche and cache breakdown problems, service circuit breaking, service degradation, and request rate limiting are all lossy methods: they reduce business throughput, slow down system responses, and degrade the user experience. However, with these methods in place, as data gradually refills Redis, Redis can gradually recover its role as the cache layer.

Lecture 27 #

Question: Will the cache still be contaminated when using the LFU strategy?

Answer: In Redis, cache contamination can still occur even when using the LFU strategy. @yeek gave a good answer, and I’ll share it with you.

In some extreme cases, the counter used by the LFU strategy may climb to a large value within a short time; if the counter’s decay setting (lfu-decay-time) is also set to a large value, the counter decays very slowly. In that case, such data may stay in the cache for a long time even under the LFU strategy, causing cache contamination.
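As a rough, simplified model (not Redis’s exact implementation), the following sketch shows how a large decay setting keeps a burst-inflated counter high long after the accesses have stopped:

```python
# Simplified model of LFU counter decay: the counter loses 1 for every
# elapsed period of lfu_decay_time minutes (assumes lfu_decay_time > 0).
def decayed_counter(counter, idle_minutes, lfu_decay_time):
    return max(0, counter - idle_minutes // lfu_decay_time)

# A burst of accesses drove the counter up to 200;
# the data has now been idle for 60 minutes.
print(decayed_counter(200, 60, 1))    # 140: with decay time 1, it cools off quickly
print(decayed_counter(200, 60, 60))   # 199: with decay time 60, it has barely decayed
```

With the large decay setting, the burst-inflated data still looks "hot" and keeps crowding out genuinely hot data, which is exactly the contamination described above.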

Lecture 28 #

Question: In this lecture, I introduced using SSDs as an extension of memory capacity to increase the data storage capacity of Redis instances. Could we also use mechanical hard disks to extend an instance’s capacity? What would the advantages and disadvantages be?

Answer: Many students (such as @Lemon, @Kaito) analyzed this question correctly, and I’ll summarize the advantages and disadvantages of using mechanical hard disks.

In terms of capacity, mechanical hard disks are more cost-effective: the cost per GB is approximately 0.1 yuan, versus approximately 0.4-0.6 yuan per GB for SSDs.

In terms of performance, the latency of mechanical hard disks (such as SAS disks) is approximately 3-5ms, while the read latency of enterprise-grade SSDs is approximately 60-80us, and the write latency is around 20us. The load characteristics of caches are generally fine-grained data and high-concurrency requests, requiring low access latency. Therefore, if mechanical hard disks are used as the underlying storage devices of Pika, the access performance of the cache will be reduced.

So, my suggestion is, if the business application requires caching a large amount of data but does not have high performance requirements for the cache, mechanical hard disks can be used. Otherwise, it is better to use SSDs.

Lecture 29 #

Question: When executing a Lua script, Redis can guarantee its atomicity. In the Lua script example (lua.script) mentioned in the course, do you think the logic that reads the client IP’s access count (GET(ip)) and checks whether it exceeds 20 also needs to be included in the Lua script? The script is as follows:

local current
-- atomically increment the access count for this IP
current = redis.call("incr",KEYS[1])
-- on the first access in the window, start the 60s expiration timer
if tonumber(current) == 1 then
    redis.call("expire",KEYS[1],60)
end

Answer: In this example, there are three atomic operations that need to be ensured, which are INCR, checking if the visit count is 1, and setting the expiration time. As for the operations of getting the IP and checking if the visit count exceeds 20, they are only read operations. Even if the client has multiple threads executing these operations concurrently, they will not change any values, so atomicity does not need to be ensured. Therefore, we don’t need to include them in the Lua script.
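To illustrate this division of labor, here is a Python sketch of the client-side flow, with a dict of (count, expire_at) pairs standing in for Redis; only the increment-and-expire step would need to run as the atomic Lua script in a real deployment, while the pre-check is plain read-only code:

```python
import time

store = {}    # ip -> (count, expire_at), standing in for Redis
LIMIT = 20

def get_count(ip):
    """Read-only: mimics GET(ip), treating expired entries as absent."""
    entry = store.get(ip)
    if entry is None or entry[1] <= time.time():
        return 0
    return entry[0]

def incr_with_window(ip, window=60):
    """Mimics the Lua script: INCR, and set the expiry on the first hit.
    In real Redis this whole function must run atomically (hence the script);
    in this single-threaded sketch plain Python is enough."""
    count = get_count(ip) + 1
    expire_at = store[ip][1] if count > 1 else time.time() + window
    store[ip] = (count, expire_at)
    return count

def allow_request(ip):
    if get_count(ip) > LIMIT:   # read-only pre-check: no atomicity needed
        return False
    incr_with_window(ip)        # only this step needs to be atomic
    return True

results = [allow_request("203.0.113.7") for _ in range(25)]
print(results.count(True))      # 21: requests 1-21 pass, the remaining 4 are rejected
```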

Lecture 30 #

Question: In the course, I mentioned that we can use the SET command with the NX and EX/PX options for locking operations. Can we use the following approach to implement locking?

// Lock
SETNX lock_key unique_value
EXPIRE lock_key 10
// Business logic
DO THINGS

Answer: With this locking method, although the SETNX and EXPIRE commands individually do their jobs correctly (SETNX atomically checks and sets the lock variable, and EXPIRE sets its expiration time), the two operations are not atomic when executed one after the other.

If a client fails after executing the SETNX command but before setting the expiration time of the lock variable, the lock will not be released on the instance, causing other clients to be unable to perform locking operations. Therefore, we cannot use this method for locking.
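This failure mode can be sketched in Python with a dict of (value, expire_at) entries standing in for Redis; the client names and the simulated crash are purely illustrative:

```python
import time

store = {}    # key -> (value, expire_at), standing in for Redis

def setnx(key, value):
    """Mimics SETNX: set only if absent (or expired); no TTL is attached."""
    entry = store.get(key)
    if entry is not None and entry[1] > time.time():
        return False
    store[key] = (value, float("inf"))   # no expiry yet!
    return True

def expire(key, seconds):
    value, _ = store[key]
    store[key] = (value, time.time() + seconds)

def set_nx_ex(key, value, seconds):
    """Mimics SET key value NX EX seconds: check, set, and attach the TTL
    as one step (real Redis guarantees this atomically)."""
    if not setnx(key, value):
        return False
    expire(key, seconds)
    return True

# Fragile path: client A "crashes" after SETNX, before EXPIRE runs...
setnx("lock_key", "client-A")
# ...so the lock has no TTL and will block every other client forever.
print(setnx("lock_key", "client-B"))     # False, and it will stay False

# Atomic path: even if the holder crashes, the TTL eventually frees the lock.
store.clear()
set_nx_ex("lock_key", "client-A", 10)
print(setnx("lock_key", "client-B"))     # False now, but only until the TTL fires
```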

Lecture 31 #

Question: When executing a transaction, if the Redis instance fails and Redis uses the RDB mechanism, can the atomicity of the transaction still be guaranteed?

Answer: When Redis uses the RDB mechanism to ensure data reliability, Redis will periodically perform memory snapshots.

During the execution of a transaction, its modifications are not written into the RDB file in real time, and Redis will not start an RDB snapshot in the middle of an executing transaction. We can therefore discuss the atomicity guarantee based on when the failure occurs and whether an RDB snapshot has been generated.

  1. If a failure occurs when a transaction is halfway through, the previous RDB snapshot will not include the modifications made by the transaction, and the next RDB snapshot has not been executed. Therefore, after the instance recovers, the data modified by the transaction will be lost, and the atomicity of the transaction will be guaranteed.
  2. If a failure occurs after the transaction is completed and the RDB snapshot has been generated, the data modified by the transaction can be recovered from the RDB, and the atomicity of the transaction is guaranteed.
  3. If a failure occurs after the transaction has completed but before the next RDB snapshot is generated, all of the data modified by the transaction will be lost after recovery. Since the transaction’s effects are lost in their entirety, the all-or-nothing property of atomicity still holds; what is not guaranteed here is the durability of the completed transaction.

Lecture 32 #

Question: In a master-slave cluster, we set slave-read-only to no, allowing the slave to also delete data directly to avoid reading expired data. Do you think this is a good method?

Answer: The key point of this question is whether letting the slave delete expired data (i.e., perform write operations) is a good method. In fact, what I want to remind you of through this question is that in master-slave replication, write operations need to be performed on the master. Even if the slave is able to delete data, do not delete on the slave; otherwise it will cause data inconsistency between the master and the slave.

For example, suppose both the master and the slave hold a key called a:stock. If client A sends a SET command to the master to modify a:stock while client B sends a SET command to the slave to modify a:stock, the same key will end up with different values on the two instances. So if the slave can perform write operations, it will cause data inconsistency between the master and the slave.

@Kaito provided a good analysis in the comments, and I will summarize and share his comment. Even if the slave can delete expired data, there is still a risk of inconsistency in two situations.

The first situation: for a key that already has an expiration time, the master resets that expiration with the EXPIRE command shortly before the key expires. For example, a key was originally set to expire after 10s; with only 1s left, the master uses EXPIRE to extend it by another 60s. If a network delay means the slave receives the EXPIRE command late (say, 3s later), the slave will already have deleted the key according to the original expiration time, leaving the master and the slave inconsistent.

The second situation is that the clocks of the master and the slave are not synchronized, resulting in the deletion times of the master and the slave being inconsistent.

In addition, when slave-read-only is set to no, if data with an expiration time is written directly to the slave, Redis versions before 4.0 will never delete that data even after it expires, while Redis 4.0 and later will delete it once it expires. Either way, the slave still does not actively delete expired data that was synchronized from the master.

Lecture 33 #

Question: Suppose we set min-slaves-to-write to 1 and min-slaves-max-lag to 15s, set the sentinel’s down-after-milliseconds to 10s, and the sentinel’s master-slave switch takes 5s, but the master gets stuck for 12s for some reason. Will split brain occur in this case? Will data be lost after the master-slave switch?

Answer: The master is stuck for 12s, which exceeds the 10s threshold set by the sentinel’s down-after-milliseconds, so the sentinel marks the master as objectively offline and starts a master-slave switch. The switch itself takes 5s, so the original master recovers while the switch is still in progress. Because min-slaves-max-lag is set to 15s and the master was only stuck for 12s, the recovered master is not blocked from accepting requests, and clients can send writes to it again. Once the new master comes online at the end of the switch, there are two masters accepting writes, so split brain occurs. Any write requests the original master accepts between recovering and being demoted to a slave will be lost after the switch.

FAQ: Typical Questions and Answers #

In Lecture 23, we learned about the working principle of Redis cache, and I mentioned that Redis is a side cache, which can be divided into read-only mode and read-write mode. I have noticed some common questions in the comments area: how to understand Redis as a side cache, and what mode does Redis usually use? Now, let me explain these two questions.

How to understand Redis as a side cache? #

Some students mentioned that the side cache they usually see handles write requests by updating the database directly and deleting the cached data, and handles read requests by querying the cache first and, on a miss, reading the data from the database and writing it into the cache. Is this the kind of side cache that Redis was described as in the course?

Actually, what these students described are the typical features of a read-only cache. When I call Redis a side cache, I am speaking from the perspective of how business applications use the Redis cache: the cache operation logic has to be added explicitly in the business code.

For example, a basic cache operation is that once a cache miss occurs, the business application needs to read the database by itself, instead of the cache itself reading the data from the database and returning it.
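A minimal sketch of that point, with dicts standing in for Redis and the database (names are illustrative): the miss-handling logic lives entirely in the application’s own read function.

```python
cache = {}
db = {"user:1": "alice"}
db_reads = 0

def read(key):
    global db_reads
    value = cache.get(key)          # 1. query the cache first
    if value is None:
        db_reads += 1
        value = db.get(key)         # 2. on a miss, the APPLICATION reads the DB
        if value is not None:
            cache[key] = value      # 3. ...and backfills the cache itself
    return value

print(read("user:1"), db_reads)     # alice 1  (miss: the application hit the DB)
print(read("user:1"), db_reads)     # alice 1  (hit: no extra DB read)
```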

To help you understand better, contrast this with the CPU cache and the page cache in computer systems, which are the opposite of a side cache: they sit by default on the application’s access path to memory and disk, so the applications we write can use these two caches without any extra code.

The reason why I emphasize that Redis is a side cache is also to remind you that when using Redis cache, we need to modify the business code.

Which mode should be used when using Redis cache? #

I mentioned that there are three common cache modes: read-only mode, read-write mode with a synchronous write-through strategy, and read-write mode with an asynchronous write-back strategy.

In general, we use Redis as a read-only cache. The operations a read-only cache involves are: querying the cache; on a miss, reading the database and backfilling the cache; and deleting the cached data when the data is updated. All of these operations can be added to the business application. Moreover, because an update simply deletes the cached copy, consistency between the cache and the database is relatively easy to maintain.

Of course, sometimes we also use Redis as a read-write cache while adopting the synchronous direct write strategy. In this case, the cache operations can also be added to the business application. And compared to the read-only cache, there is an advantage that the latest value after data modification can be directly read from the cache.

As for the read-write cache mode with asynchronous write-back strategy, the cache system needs to write the data back to the database by itself when dirty data is evicted. However, Redis cannot achieve this, so we don’t use this mode when using Redis cache.

Conclusion #

Alright, that’s all for this Q&A session. If you encounter any problems during your learning process, feel free to leave me a message.

Lastly, I want to say, “To learn without thinking is futile, to think without learning is perilous.” When using Redis in your daily work, don’t just focus on your current problems, but also try to understand the underlying principles and accumulate relevant solutions. Of course, when studying the course’s operations and configurations, consciously practice them yourself. Only by combining learning and thinking can you truly enhance your Redis practical skills.