30 Atomic Operations Part 2

Hello, I’m Haolin. Today we continue our discussion of atomic operations.

Let’s pick up where the previous article left off. There we saw that the functions in the sync/atomic package provide five kinds of atomic operations: add, compare and swap (CAS), load, store, and swap. We also worked through the first two questions that follow from this.

Today, let’s look at the third derived question: how does compare and swap (CAS) differ from swap, and what advantages does it have?

The answer is: CAS is a conditional exchange operation, which means it only performs value swapping when a condition is met.

A plain swap, by contrast, is unconditional: it assigns a new value to a variable and returns the variable’s old value.

When performing a CAS operation, the function first checks if the current value of the variable being operated on is equal to the expected old value. If they are equal, the new value is assigned to the variable, and true is returned to indicate that the swap operation has been performed. Otherwise, the swap operation is ignored, and false is returned.
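The contrast between the two operations can be seen in a small sketch. The helper names swapDemo and casDemo are made up for this illustration; only atomic.SwapInt32, atomic.CompareAndSwapInt32, and atomic.LoadInt32 come from the sync/atomic package.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// swapDemo (hypothetical helper) shows that Swap always writes the new value
// and hands back the old one.
func swapDemo() (old, final int32) {
	var n int32 = 10
	old = atomic.SwapInt32(&n, 20) // unconditional write
	return old, atomic.LoadInt32(&n)
}

// casDemo (hypothetical helper) tries to replace expected with 20 and reports
// whether the swap actually happened.
func casDemo(start, expected int32) (swapped bool, final int32) {
	n := start
	swapped = atomic.CompareAndSwapInt32(&n, expected, 20)
	return swapped, atomic.LoadInt32(&n)
}

func main() {
	fmt.Println(swapDemo())      // 10 20
	fmt.Println(casDemo(10, 10)) // true 20
	fmt.Println(casDemo(10, 5))  // false 10
}
```

Swap tells you what was there before, while CAS tells you whether anything changed at all.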

As you can see, CAS is not a single step but a compound operation: a comparison followed by a conditional swap, carried out as one atomic unit. This sets it apart from the other atomic operations and gives it a wider range of uses. For example, combined with a for loop, CAS can be used to implement a simple spinlock.

for {
    // Succeeds only when num2 is currently 10; num2 is then reset to 0.
    if atomic.CompareAndSwapInt32(&num2, 10, 0) {
        fmt.Println("The second number has gone to zero.")
        break
    }
    time.Sleep(time.Millisecond * 500) // wait before checking again
}

In the for loop, the CAS operation can continuously check a condition that needs to be satisfied. Once the condition is met, the for loop is exited. This is equivalent to continuously “blocking” the current flow until the condition is met.
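As a rough sketch of the spinlock idea, the following program wraps the CAS loop in a small lock type. The spinLock type and the countWithSpinLock function are assumptions made for this example, not part of the standard library, and yielding with runtime.Gosched is just one possible way to wait.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// spinLock is a hypothetical minimal spinlock: 0 means unlocked, 1 means locked.
type spinLock struct {
	state int32
}

// Lock spins until it manages to flip state from 0 to 1.
func (l *spinLock) Lock() {
	for !atomic.CompareAndSwapInt32(&l.state, 0, 1) {
		runtime.Gosched() // let other goroutines run instead of burning CPU
	}
}

// Unlock releases the lock by storing 0.
func (l *spinLock) Unlock() {
	atomic.StoreInt32(&l.state, 0)
}

// countWithSpinLock increments a plain int from several goroutines,
// using the spinlock to guard the increment.
func countWithSpinLock(goroutines, perGoroutine int) int {
	var (
		lock  spinLock
		total int
		wg    sync.WaitGroup
	)
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				lock.Lock()
				total++
				lock.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(countWithSpinLock(8, 1000)) // 8000
}
```

In real code sync.Mutex is almost always the better choice; the sketch only shows that a CAS loop is enough to build lock-like behavior.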

In effect, this is similar to a mutex lock. However, the two suit different situations. When using a mutex lock, we take the pessimistic view that the state of the shared resource will be changed frequently by other goroutines.

On the other hand, the assumption when using a for loop with CAS is often that the change in the shared resource’s state is not frequent or it will eventually become the expected state. This is a more optimistic or relaxed approach.
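The optimistic style usually takes the form of a retry loop: read the current value, compute the desired one, and try to install it with CAS, repeating only if another goroutine got there first. The sketch below keeps a running maximum this way; updateMax and maxOf are names invented for the example, and it assumes the inputs are non-negative because the maximum starts at zero.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// updateMax (hypothetical helper) raises *addr to v if v is larger.
// It retries only when another goroutine changed *addr in between.
func updateMax(addr *int64, v int64) {
	for {
		old := atomic.LoadInt64(addr)
		if v <= old {
			return // nothing to do
		}
		if atomic.CompareAndSwapInt64(addr, old, v) {
			return // our value was installed
		}
		// Someone else changed *addr first; loop and re-check.
	}
}

// maxOf feeds every value to updateMax from its own goroutine.
// Assumes non-negative inputs, since the maximum starts at 0.
func maxOf(values []int64) int64 {
	var (
		highest int64
		wg      sync.WaitGroup
	)
	for _, v := range values {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			updateMax(&highest, v)
		}(v)
	}
	wg.Wait()
	return atomic.LoadInt64(&highest)
}

func main() {
	fmt.Println(maxOf([]int64{3, 17, 5, 42, 8})) // 42
}
```

When conflicts are rare, most calls succeed on the first attempt and no goroutine ever waits on a lock.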

Now, let’s move on to the fourth derived question: Assuming I have ensured that write operations on a variable are atomic, such as addition or subtraction, store, swap, etc., is it still necessary to use atomic operations when performing read operations on it?

The answer is: Yes, it is necessary. You can compare this to a read-write lock. Why are write operations and read operations protected by a read-write lock mutually exclusive? It is to prevent read operations from reading a partially modified value, right?

If a read happens while a write is still in progress, it may observe a value that has been only partially modified. That destroys the integrity of the value, and what is read can be simply wrong. Mixing atomic writes with plain reads of the same variable is also a data race under the Go memory model.

Therefore, once you decide to protect a shared resource, you need to provide complete protection. Incomplete protection is essentially no protection at all.
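A minimal sketch of complete protection, in which both the writers and the reader go through sync/atomic, might look like this. The runCounter function and its parameters are assumptions for the example only.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// runCounter (hypothetical helper) lets several writers add to a shared
// counter while one reader polls it. It returns the final value and whether
// the reader ever saw a value outside [0, writers*perWriter].
func runCounter(writers, perWriter int) (final int64, sawInvalid bool) {
	var (
		counter int64
		wg      sync.WaitGroup
	)
	limit := int64(writers * perWriter)
	stop := make(chan struct{})
	readerDone := make(chan struct{})

	go func() {
		defer close(readerDone)
		for {
			select {
			case <-stop:
				return
			default:
			}
			// Atomic read, matching the atomic writes below.
			if v := atomic.LoadInt64(&counter); v < 0 || v > limit {
				sawInvalid = true
			}
			runtime.Gosched()
		}
	}()

	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWriter; j++ {
				atomic.AddInt64(&counter, 1)
			}
		}()
	}
	wg.Wait()
	close(stop)
	<-readerDone
	return atomic.LoadInt64(&counter), sawInvalid
}

func main() {
	fmt.Println(runCounter(4, 1000)) // 4000 false
}
```

Replacing the atomic.LoadInt64 call in the reader with a plain read of counter would be reported by the race detector (go run -race).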

Now, the main question and the related derived questions have covered the usage, principles, comparisons, and some best practices of atomic operation functions. I hope you have understood them.

Since atomic operation functions only support a very limited range of data types, in many scenarios, a mutex lock is often more suitable.

However, once we determine that atomic operation functions can be used in a specific scenario, such as involving concurrent read and write of a single integer type value or multiple independent integer type values, there is no need to consider a mutex lock.

This is mainly because atomic operation functions usually execute much faster than a mutex lock. They are also easier to use: there is no critical section to delimit and no risk of deadlock. We still need to be careful with CAS, however, because it can be used to mimic a lock and may “block” the flow.

Knowledge Expansion

Question: How to use sync/atomic.Value effectively?

In order to expand the scope of atomic operations, Go introduced a new type called Value to the sync/atomic package in version 1.4. The value of this type acts as a container and can be used to atomically store and load any value.

The atomic.Value type is ready to use out of the box: once we declare a variable of this type (hereinafter referred to as an atomic variable), we can use it directly. The type is very simple. Its two core pointer methods are Store and Load, which are the ones discussed here (Go 1.17 added Swap and CompareAndSwap as well). Simple as it is, there are still some things worth noting.
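As a small illustration of Store and Load, the sketch below keeps a configuration value in an atomic variable. The settings type and the demo function are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// settings is a hypothetical configuration type used only for this sketch.
type settings struct {
	name  string
	limit int
}

// demo stores two versions in turn. It reports whether Load returned nil
// before the first Store, and which version Load returns at the end.
func demo() (emptyBefore bool, last settings) {
	var box atomic.Value // the zero value is ready to use

	emptyBefore = box.Load() == nil // nothing stored yet

	box.Store(settings{name: "v1", limit: 10})
	box.Store(settings{name: "v2", limit: 20}) // replaces the whole value atomically

	// Load returns interface{}, so a type assertion is needed.
	last = box.Load().(settings)
	return emptyBefore, last
}

func main() {
	empty, last := demo()
	fmt.Println(empty, last.name, last.limit) // true v2 20
}
```

Readers always see either the old settings or the new settings as a whole, never a mixture of the two.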

Firstly, once the value of an atomic.Value type (hereinafter referred to as an atomic value) is actually used, it should not be copied. What does “actually used” mean?

As long as we use it to store a value, it means that we have actually used it. The atomic.Value type is a struct type, and struct types are value types.

Therefore, copying a value of this type creates a completely separate new value, which is like a snapshot of the original. From then on, whatever is stored in the copy has no effect on the original, and vice versa.
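The snapshot behavior can be observed in a single goroutine, as in the sketch below. The copy is made here only to show what goes wrong; it is exactly what the rule tells us not to do. The function name copyDemo is made up for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// copyDemo (hypothetical helper) copies an atomic value after it has been
// used, then stores into the copy. It returns what each of the two holds.
func copyDemo() (original, duplicate string) {
	var box atomic.Value
	box.Store("first")

	dup := box // copy after first use: for illustration only, do not do this
	dup.Store("second")

	// The two values now evolve independently.
	return box.Load().(string), dup.Load().(string)
}

func main() {
	fmt.Println(copyDemo()) // first second
}
```

To share one atomic value between functions, pass a pointer to it (*atomic.Value) rather than the value itself.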

In addition, there are two mandatory rules for storing values with atomic values. The first rule is that nil cannot be stored in an atomic value.

In other words, we cannot pass nil as the argument value to the Store method of an atomic value, otherwise it will trigger a panic.

Here, it is important to note that if there is a variable of interface type, and its dynamic value is nil, but the dynamic type is not nil, then its value is not equal to nil. I mentioned this issue when discussing interfaces with you earlier. Because of this, the value of such a variable can be stored in an atomic value.
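The first rule and its typed-nil exception can be checked with recover, as in this sketch. The helpers storePanics and nilRuleDemo are hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// storePanics (hypothetical helper) reports whether box.Store(v) panics.
func storePanics(box *atomic.Value, v interface{}) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	box.Store(v)
	return false
}

// nilRuleDemo tries to store an untyped nil and a nil *int.
func nilRuleDemo() (untypedNilPanics, typedNilPanics bool) {
	var a atomic.Value
	untypedNilPanics = storePanics(&a, nil)

	var b atomic.Value
	var p *int // nil pointer; as an interface value its dynamic type is *int
	typedNilPanics = storePanics(&b, p)
	return untypedNilPanics, typedNilPanics
}

func main() {
	fmt.Println(nilRuleDemo()) // true false
}
```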

The second rule is that the first value stored in an atomic value determines the one and only type of value it can store from then on.

For example, if I store a value of type string for the first time in an atomic value, then I can only store strings in that atomic value in the future. If I want to store a struct in it again, it will trigger a panic when calling its Store method. This panic will tell me that the type of the value being stored this time is inconsistent with the previous one.

You may wonder: can I store a value of an interface type first, and then store a value of a certain implementation type of this interface?

Unfortunately, this is not allowed and will also trigger a panic, because the atomic value makes its check against the actual (dynamic) type of the stored value. Therefore, even if two different types implement the same interface, their values cannot be stored in the same atomic value one after another.
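The second rule can be checked the same way. In this sketch, cat and dog are hypothetical types that both implement fmt.Stringer, and storePanics and typeRuleDemo are helper names invented for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cat and dog are hypothetical types that both implement fmt.Stringer.
type cat struct{}
type dog struct{}

func (cat) String() string { return "cat" }
func (dog) String() string { return "dog" }

// storePanics (hypothetical helper) reports whether box.Store(v) panics.
func storePanics(box *atomic.Value, v interface{}) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	box.Store(v)
	return false
}

// typeRuleDemo stores a first value and then tries further stores.
func typeRuleDemo() (otherType, sameType, otherImpl bool) {
	var box atomic.Value
	box.Store("text")
	otherType = storePanics(&box, 42)    // int after string
	sameType = storePanics(&box, "more") // string after string

	var pets atomic.Value
	var s fmt.Stringer = cat{}
	pets.Store(s)                         // the dynamic type recorded is cat
	otherImpl = storePanics(&pets, dog{}) // same interface, different type
	return otherType, sameType, otherImpl
}

func main() {
	fmt.Println(typeRuleDemo()) // true false true
}
```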

Unfortunately, atomic.Value has no dedicated method that reports whether it has actually been used, and none that reports the type of value it accepts. The closest you can get is to call Load, which returns nil if nothing has been stored yet, and inspect the result. This makes atomic values easy to misuse, especially when the same atomic value is used in several places.

Next, I will give you some specific usage recommendations.

Do not expose atomic values used internally to the outside world. For example, declaring an exported package-level atomic variable is not a correct approach. At the very least, the variable should be private to its package.
If you have to allow code outside the package or module to use your atomic value, declare a package-level private atomic variable and let the outside world use it indirectly through one or more public functions. In this case, do not hand the atomic value to the outside world, either as the value itself or as a pointer to it.
If a function can store a value in the internal atomic value, that function should check that the type of the value to be stored is valid. If it is not, the function should return an error value directly, thereby avoiding a panic.
If possible, encapsulate the atomic value in a data type, such as a struct type. We can then store values more safely through the methods of this type, and the type can also carry information about which types of value may be stored.
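One way to follow the last two recommendations is sketched below. TypedBox and its methods are hypothetical names, and using reflect.Type to remember the accepted type is just one possible design. The atomic value stays unexported, and Store validates its argument and returns an error instead of panicking.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
	"sync/atomic"
)

// TypedBox is a hypothetical wrapper that hides an atomic.Value and
// remembers which type it accepts.
type TypedBox struct {
	v atomic.Value
	t reflect.Type
}

// NewTypedBox fixes the accepted type from the initial value.
func NewTypedBox(initial interface{}) (*TypedBox, error) {
	if initial == nil {
		return nil, errors.New("initial value must not be nil")
	}
	b := &TypedBox{t: reflect.TypeOf(initial)}
	b.v.Store(initial)
	return b, nil
}

// Store checks the value first and returns an error rather than panicking.
func (b *TypedBox) Store(v interface{}) error {
	if v == nil {
		return errors.New("cannot store nil")
	}
	if t := reflect.TypeOf(v); t != b.t {
		return fmt.Errorf("cannot store %v, box holds %v", t, b.t)
	}
	b.v.Store(v)
	return nil
}

// Load returns the current value.
func (b *TypedBox) Load() interface{} { return b.v.Load() }

// Type reports the type of value the box accepts.
func (b *TypedBox) Type() reflect.Type { return b.t }

func main() {
	box, err := NewTypedBox("hello")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(box.Store("world"))     // <nil>
	fmt.Println(box.Store(42))          // cannot store int, box holds string
	fmt.Println(box.Load(), box.Type()) // world string
}
```

Because NewTypedBox returns a pointer, callers share one box instead of copying the atomic value inside it.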

In addition to the above usage recommendations, I would like to emphasize one more point: try not to store values of reference types in atomic values, because doing so easily opens a hole in concurrency safety. Please take a look at the following code:

var box6 atomic.Value
v6 := []int{1, 2, 3}
box6.Store(v6)
v6[1] = 4 // Note that this operation is not concurrency-safe!

I stored a slice value v6 of type []int into the atomic value box6. Note that the slice type is a reference type. Therefore, when I modify this slice value from outside, I am in effect modifying the value stored in box6. This bypasses the atomic value and performs an operation that is not concurrency-safe. So, how do we close this hole? We can do it like this:

store := func(v []int) {
    replica := make([]int, len(v))
    copy(replica, v)
    box6.Store(replica)
}
store(v6)
v6[2] = 5 // This operation is safe.

I first create a complete copy of the slice value v6. The data in this copy is entirely independent of the original value. Then, I store this copy into box6. In this way, no matter how I modify v6, I will not break the concurrency protection provided by box6.
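The same reasoning applies on the read side: a slice returned by Load still shares its underlying array with the stored value. A sketch that copies in both directions could look like this. The sliceBox type is a hypothetical wrapper written for this example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// sliceBox is a hypothetical wrapper that copies on the way in and out.
type sliceBox struct {
	v atomic.Value
}

// Store keeps a private copy of s.
func (b *sliceBox) Store(s []int) {
	replica := make([]int, len(s))
	copy(replica, s)
	b.v.Store(replica)
}

// Load returns a copy, so callers cannot touch the stored slice.
func (b *sliceBox) Load() []int {
	s, _ := b.v.Load().([]int) // s is nil if nothing has been stored
	out := make([]int, len(s))
	copy(out, s)
	return out
}

func main() {
	var box sliceBox
	src := []int{1, 2, 3}
	box.Store(src)

	src[0] = 100 // does not reach the stored copy
	got := box.Load()
	got[1] = 200 // does not reach it either

	fmt.Println(box.Load()) // [1 2 3]
}
```

The copies cost some allocation, so this suits values that are small or read far more often than they are changed.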

The above is what I want to tell you about the precautions and usage recommendations for atomic.Value. You can see the corresponding examples in the file demo64.go.

Summary

Let’s summarize these two articles together. Compared to atomic operation functions, the advantages of atomic value types are obvious, but they also have more usage rules. First, after the first real use, the atomic value should not be copied again.

Second, the Store method of the atomic value has two mandatory constraints on its parameter value (the value to be stored). One constraint is that the parameter value cannot be nil. Another constraint is that the type of the parameter value cannot be different from the type of the first stored value. In other words, once an atomic value stores a certain type of value, it can only store values of that type in the future.

Based on these considerations, I have made several recommendations: do not expose atomic variables externally, do not pass atomic values or pointers to them around, and try not to store values of reference types in atomic values. I have also suggested some related solutions. I hope you find them useful.

Atomic operations are clearly more lightweight than mutex locks, but their limitations are just as clear, so choosing between the two is usually not difficult. The choice between atomic values and mutex locks, however, sometimes requires careful thought. If you remember what I covered today, it should help a great deal.

Thought question

There is only one thought question for today: If you have to choose between atomic values and mutex locks, what do you think are the three most important decision criteria?
