29 How Redis Responds to Concurrent Access With Lock Free Atomic Operations

29 How Redis Responds to Concurrent Access with Lock-free Atomic Operations #

When using Redis, we inevitably encounter the problem of concurrent access. For example, if multiple users place orders simultaneously, the inventory of goods stored in Redis may be updated concurrently. Once there are concurrent write operations, data might be modified. If we don’t control concurrent write requests properly, data may be altered incorrectly, affecting the normal use of the business (e.g., inventory data errors leading to order exceptions).

To ensure the correctness of concurrent access, Redis provides two methods: locking and atomic operations.

Locking is a commonly used method. Before reading data, the client needs to acquire a lock, otherwise, it cannot perform operations. Once a client acquires a lock, it holds the lock until it completes the data update and then releases it.

It seems like a good solution, but there are two issues here: First, if there are many locking operations, it will reduce the system’s concurrency performance. Second, when a Redis client needs to lock, it requires the use of distributed locks, which are complex to implement and require an additional storage system to provide lock and unlock operations. I will introduce this in the next lesson.

Atomic operations are another method for providing concurrent access control. Atomic operations refer to operations that maintain atomicity during execution and do not require additional locking, achieving lock-free operations. In this way, we can ensure both concurrent control and reduce the impact on system concurrency performance.

In this lesson, I will talk to you about the atomic operations in Redis. The goal of atomic operations is to achieve concurrent access control. So, what specific aspects do we need to control when there are concurrent access requests? Next, I will first introduce the content of concurrent control to you.

What needs to be controlled in concurrent access? #

When we talk about concurrent access control, we refer to the process of controlling the access of multiple clients to the same data, in order to ensure mutual exclusion when executing operations sent by any client on the Redis instance. For example, when client A is performing an access operation, client B’s operation cannot be executed and needs to wait until A’s operation is finished before it can be executed.

The operations corresponding to concurrent access control mainly involve data modification. When a client needs to modify data, the basic process consists of two steps:

  • The client first reads the data into the local memory and performs the modification locally.
  • After the client has finished modifying the data, it writes the data back to Redis.

We refer to this process as “Read-Modify-Write” operation (abbreviated as RMW operation). When multiple clients perform RMW operations on the same data, we need to ensure that the code involved in the RMW operation is executed atomically. The code that performs RMW operations on the same data is called critical section code.

However, when multiple clients execute critical section code concurrently, there may be some potential issues. Next, let me explain with an example of multiple clients updating the stock quantity.

Let’s first take a look at the critical section code. Suppose a client wants to perform a decrement operation on the stock quantity of a product, the pseudocode looks like this:

current = GET(id)
current--
SET(id, current)

As you can see, the client first reads the current stock value (GET(id)) from Redis based on the product ID (corresponding to Read), then it decrements the stock value by 1 (corresponding to Modify), and finally writes the new stock value back to Redis (corresponding to Write). When multiple clients execute this code, it becomes a critical section code.

If we don’t have any control mechanism for the execution of critical section code, data update errors can occur. In the example above, let’s say we have two clients, A and B, both executing the critical section code simultaneously. This can lead to an error, as shown in the following diagram:

As you can see, client A reads the stock value as 10 and decrements it by 1 at t1. At t2, before client A has written back the decremented stock value of 9 to Redis, client B reads the stock value as 10 and also decrements it by 1, resulting in both A and B having a recorded stock value of 9. At t3, client A writes back the stock value of 9 to Redis, and at t4, client B also writes back the stock value of 9.

If processed correctly, both client A and B should have decremented the stock value by 1, resulting in a stock value of 8. Therefore, the stock value is clearly updated incorrectly in this case.

The reason for this phenomenon is that the client’s operations of reading data, modifying data, and writing back data in the critical section code involve three separate operations that are not executed atomically. Multiple clients modify the value based on the same initial value, rather than based on the value modified by the previous client.

To ensure the correctness of concurrent data modifications, we can use locks to turn parallel operations into serial operations, which have mutual exclusion. When a client holds a lock, other clients can only acquire the lock after it is released to perform their modifications.

The pseudocode below shows the use of locks to control the execution of critical section code:

LOCK()
current = GET(id)
current--
SET(id, current)
UNLOCK()

Although locking ensures mutual exclusion, locking also leads to a decrease in system concurrency performance.

As shown in the following diagram, when client A acquires a lock and performs operations, clients B and C need to wait. After A releases the lock, let’s say B acquires the lock, and C still needs to wait. Therefore, only A can access the shared data during time period t1, and only B can access the shared data during time period t2. This naturally leads to a decrease in system concurrency performance.

Similar to locking, atomic operations can also achieve concurrency control, but atomic operations have a smaller impact on system concurrency performance. Next, let’s learn about atomic operations in Redis.

Two Atomic Operation Methods in Redis #

To achieve the requirement of mutual exclusion in critical section code, Redis utilizes two methods for atomic operations:

  • Implement multiple operations as a single command in Redis, also known as single-command operation.
  • Write multiple operations into a Lua script and execute the script atomically.

Let’s start with Redis’ single-command operations.

Redis processes client-requested commands in a single thread, which means that while Redis is executing a command, other commands cannot be executed. This ensures mutual exclusion for command operations. However, certain operations in Redis, such as generating snapshots and AOF rewriting, can be executed concurrently with the main thread using background threads or child processes. These operations only involve data reading and do not modify data, so there is no need for concurrency control.

You may have noticed that although individual commands in Redis can be executed atomically, in practical applications, data modifications may involve multiple operations, including reading data, modifying data, and writing back data. This clearly goes beyond the scope of a single command operation. So what should we do in such cases?

Don’t worry, Redis provides the INCR/DECR commands, which allow us to combine these three operations into a single atomic operation. The INCR/DECR commands can perform increment / decrement operations on data and they themselves are single-command operations, so they inherently possess mutual exclusion when executed by Redis.

For example, in the inventory deduction example mentioned earlier, clients can directly decrement the stock value of the product ID using the following code. Even if multiple clients execute this code simultaneously, there is no need to worry about incorrect stock deduction.

DECR id 

Therefore, if the RMW operation we are performing involves incrementing or decrementing data, Redis’ atomic operations INCR and DECR can help us achieve concurrency control directly.

However, if the operation we need to perform is not just a simple increment or decrement, but involves more complex conditional logic or other operations, Redis’ single-command operations can no longer guarantee mutual exclusion of multiple operations. In such cases, we need to use the second method - Lua scripts.

Redis executes the entire Lua script as a whole, without being interrupted by other commands, ensuring the atomicity of operations in the script. If we have multiple operations to execute but cannot be implemented using commands like INCR/DECR, we can write these operations into a Lua script. Then, we can use Redis’ EVAL command to execute the script, which ensures that these operations are executed atomically.

Let me give you another example to explain the usage of Lua in more detail.

When the number of users accessing a business application increases, sometimes we need to limit the number of accesses from a single client within a certain time range, such as limiting the purchase rate of hot-selling products or the number of likes per minute in a social network.

So how do we impose such limitations? We can use the client’s IP as the key and the number of accesses from that client as the value, stored in Redis. After each access, we increment the access count using INCR.

However, in this scenario, client rate limiting actually involves both access count and time range limitations. For example, the access count per minute should not exceed 20. Therefore, when the client accesses for the first time, we set an expiration time for the corresponding key-value pair, such as setting it to expire after 60 seconds. At the same time, for each subsequent access, we read the current access count of the client. If the count exceeds the threshold, we throw an error, thereby restricting the client from accessing further. You can see the code below, which implements the limitation of 20 accesses per minute for a client.

// Get the access count for the IP
current = GET(ip)
// If the access count exceeds 20, throw an error
IF current != NULL AND current > 20 THEN
    ERROR "exceed 20 accesses per second"
ELSE
    // If the access count is below 20, increment the count
    value = INCR(ip)
    // If it's the first access, set the key-value pair to expire after 60s
    IF value == 1 THEN
        EXPIRE(ip,60)
    END
    // Perform other operations
    DO THINGS
END

As you can see in this example, we have already used INCR to increment the count atomically. However, the client rate limiting logic not only involves the count but also access count checking and expiration time setting.

For these operations, we also need to ensure their atomicity. Otherwise, if the client uses multiple threads to access and the count is initially 0, when the first thread executes the INCR(ip) operation, the second thread immediately follows and also executes INCR(ip). As a result, the access count for the IP is incremented to 2, and we cannot set an expiration time for this IP anymore. This would lead to the situation where the client corresponding to this IP cannot access anymore after reaching 20 accesses. Even after 60 seconds have passed, the client still cannot continue accessing, which clearly does not meet the business requirement.

Therefore, the operations in this example cannot be achieved using a single Redis command. In such cases, we can use Lua scripts to ensure concurrency control. We can write the logic of incrementing the access count, checking the count, and setting the expiration time as a Lua script, as shown below:

local current
current = redis.call("incr",KEYS[1])
if tonumber(current) == 1 then
    redis.call("expire",KEYS[1],60)
end

Assuming we have named the script lua.script, we can use the Redis client with the eval option to execute the script. The parameters required by the script will be passed through the keys and args in the following command:

redis-cli  --eval lua.script  keys , args

This way, the three operations of incrementing the access count, checking the count, and setting the expiration time can be executed atomically. Even if the client has multiple threads executing this script simultaneously, Redis will execute the script code sequentially, avoiding data errors caused by concurrent operations.

Summary #

During concurrent access, concurrent RMW (Read-Modify-Write) operations can lead to data errors, so concurrent control is needed. Concurrency control ensures the mutual execution of critical section code.

Redis provides two methods for atomic operations to achieve concurrency control: single command operations and Lua scripts. Atomic operations themselves do not restrict access to a lot of resources, allowing for high system concurrency performance.

However, the scope of application for single command atomic operations is small, and not all RMW operations can be transformed into single command atomic operations (for example, the INCR/DECR command can only increment or decrement after reading data). When we need to make more judgments on the data read, or when our modification of the data is not a simple increment or decrement, single command operations are not suitable.

On the other hand, Redis Lua scripts can include multiple operations that are executed atomically, bypassing the restrictions of single command operations. However, including many operations in a Lua script will increase the time it takes for Redis to execute the script and will also decrease Redis’s concurrency performance. Therefore, here’s a little suggestion for you: when writing Lua scripts, you should avoid including operations that do not require concurrency control *** if possible ***.

Of course, locks can also achieve the mutual execution of critical section code, but if multiple clients need to lock, distributed lock support is required. So, next class, I will discuss with you the implementation of distributed locks.

One Question for Each Lesson #

As usual, I have a small question for you. When Redis executes a Lua script, it can ensure atomicity. So, in the Lua script example I provided (lua.script), do you think it is necessary to include the logic for reading the number of times the client IP has been accessed (GET(ip)) and checking if the access count exceeds 20?

Feel free to share your thoughts and answers in the comments. Let’s discuss together. If you find today’s content helpful, please feel free to share it with your friends or colleagues. See you in the next lesson.