35 Concurrent Safe Dictionary sync #
Hello, I’m Haolin. Today, we will continue sharing about the concurrent safe dictionary sync.Map.
In the previous article, we talked about how the methods provided by sync.Map involve the types of keys and values being interface{}
, so when we call these methods, we often need to check the actual types of the keys and values.
There are roughly two possible solutions for this. In the previous article, we mentioned the first solution, which is to fully determine the types of keys and values during coding and let the Go compiler handle the checks for us.
This approach is convenient, isn’t it? However, although it is convenient, it also makes such dictionary types lack some flexibility.
If we also need a concurrent safe dictionary with a key type of uint32
, we will have to duplicate the code in the same way. Therefore, with diverse requirements, the workload becomes even larger, and it can even result in a lot of duplicate code.
Knowledge Expansion #
Question 1: How to ensure the correctness of the types of keys and values in a concurrently safe dictionary? (Solution 2) #
So, if we want to maintain the flexibility of sync.Map
type while constraining the types of keys and values, what should we do? This involves the second solution.
In the second solution, all the methods of the struct type we encapsulate should be identical to the ones in sync.Map
type (including method name and signature).
However, in these methods, we need to add some code for type checking. Additionally, in this case, the types of keys and values in the concurrently safe dictionary must be completely determined during initialization. And in this situation, we must ensure that the key type is comparable first.
Therefore, just including the sync.Map
type field is not enough when designing such a struct type.
For example:
type ConcurrentMap struct {
m sync.Map
keyType reflect.Type
valueType reflect.Type
}
Here, ConcurrentMap
represents a concurrently safe dictionary with customizable key and value types. This type also has a field m
, which is of type sync.Map
, representing the internally used concurrently safe dictionary.
In addition, its fields keyType
and valueType
are used to store the key type and value type, respectively. The types of these two fields are reflect.Type
, which can represent any data type in Go. And it is very easy to obtain a value of this type: simply call the reflect.TypeOf
function and pass in a sample value.
The result value of the expression reflect.TypeOf(int(123))
represents the reflect type value of the int
type.
Now let’s take a look at how the methods of ConcurrentMap
type should be written.
First, let’s talk about the Load
method. This method accepts a parameter key
of type interface{}
, representing the value of a certain key.
Therefore, when we search for key-value pairs in the m
field of ConcurrentMap
, we must ensure that the type of ConcurrentMap
is correct. Since reflect type values can be directly compared using the ==
or !=
operators, the type checking code here is very simple.
func (cMap *ConcurrentMap) Load(key interface{}) (value interface{}, ok bool) {
if reflect.TypeOf(key) != cMap.keyType {
return
}
return cMap.m.Load(key)
}
By passing an interface type value to the reflect.TypeOf
function, we can obtain the reflect type value corresponding to the actual type of this value.
Therefore, if the reflect type of the parameter value is not equal to the reflect type represented by the keyType
field, then we skip the subsequent operations and return directly.
In this case, the first result value
of the Load
method has a value of nil
, and the second result ok
has a value of false
. This is consistent with the original meaning of the Load
method.
Next, let’s talk about the Store
method. The Store
method accepts two parameters key
and value
, both of which are of type interface{}
. Therefore, our type checking should be done for them.
func (cMap *ConcurrentMap) Store(key, value interface{}) {
if reflect.TypeOf(key) != cMap.keyType {
panic(fmt.Errorf("wrong key type: %v", reflect.TypeOf(key)))
}
if reflect.TypeOf(value) != cMap.valueType {
panic(fmt.Errorf("wrong value type: %v", reflect.TypeOf(value)))
}
cMap.m.Store(key, value)
}
The type checking code here is very similar to the one in the Load
method, but the handling of the checking result is different. When the actual type of the key
or value
parameter does not meet the requirements, the Store
method will immediately panic.
This is mainly because the Store
method does not have a result declaration, so it cannot inform the caller in a more peaceful way when there is a problem with the parameter value. However, this is also in line with the original meaning of the Store
method.
If you don’t want to do it this way, that’s also possible. In that case, you need to add a result of type error
to the Store
method.
And when the incorrect type of the parameter value is found, let it directly return the corresponding error
value instead of panicking. Keep in mind that what’s shown here is just a reference implementation, and you can optimize and improve it based on the actual application scenario.
As for the other methods and functions related to the ConcurrentMap
type, I won’t show them here. They don’t have any special differences in type checking methods and processing flow. You can see this code in the file demo72.go
.
In summary, the first solution is suitable when we can completely determine the specific types of keys and values. In this case, we can use the Go compiler for type checking and use type assertion expressions as assistance, just like in IntStrMap
.
In the second solution, we don’t need to determine the types of keys and values before the program runs. We just need to dynamically assign them during the initialization of the concurrently safe dictionary. This mainly requires using functions and data types in the reflect
package, along with some simple equality comparison operations.
The first solution has an obvious flaw, which is the lack of flexibility to change the types of keys and values in the dictionary. Once the requirements become varied, the coding workload will increase.
The second solution nicely fills this gap. However, those reflection operations will more or less reduce the performance of the program. We often need to obtain and compare various metrics of the program through rigorous and consistent testing, which can serve as an important basis for selecting a solution.
Question 2: How does a concurrent-safe dictionary minimize the use of locks? #
sync.Map
type uses a lot of atomic operations internally to access and store keys and values, and it uses two native maps
as storage media.
One of the native maps
is stored in the read
field of sync.Map
, which is of type sync/atomic.Value
. This native map can be seen as a snapshot, which will always re-save all the key-value pairs contained in the sync.Map
value when conditions are met.
For convenience, we will refer to it as the read-only dictionary. However, although the read-only dictionary does not increase or decrease its keys, it allows changes to the values associated with those keys. So, it is not a snapshot in the traditional sense; its read-only nature applies only to the collection of keys within it.
From the type of the read
field, it can be seen that sync.Map
does not need a lock when replacing the read-only dictionary. Additionally, when storing key-value pairs in the read-only dictionary, it is wrapped in a layer above the value.
It first converts the value to a unsafe.Pointer
type value, then encapsulates the latter and stores it in the native map. In this way, atomic operations can be used when changing the value associated with a key.
The other native map in sync.Map
is represented by its dirty
field. It stores key-value pairs in the same way as the native map in the read
field. Its key type is also interface{}
, and it is also converted and encapsulated before being stored. Let’s call it the dirty dictionary for now.
Note that if both the dirty dictionary and the read-only dictionary store the same key-value pair, the two keys here definitely refer to the same base value, and the same is true for the two values.
As mentioned earlier, when storing keys and values, these two dictionaries only store a pointer to them, not the base values.
When sync.Map
looks up the value associated with a specified key, it always searches in the read-only dictionary first, without needing to lock a mutex. It only accesses the dirty dictionary under lock protection when it is determined that “the key may not exist in the read-only dictionary, but may be in the dirty dictionary.”
Similarly, when storing a key-value pair, if the read-only dictionary already contains the key and the key-value pair is not marked as “deleted,” it will store the new value in it and return directly, without needing a lock.
Otherwise, it will store the key-value pair in the dirty dictionary under lock protection. At this point, the “deleted” flag of the key-value pair will be cleared.
read and dirty in sync.Map
By the way, only when a key-value pair should be deleted but still exists in the read-only dictionary will it be logically deleted by marking it as “deleted,” instead of being directly physically deleted.
This situation occurs for a period of time after rebuilding the dirty dictionary. However, soon after, they will be truly deleted. Already logically deleted key-value pairs will always be ignored when looking up and traversing key-value pairs.
When deleting a key-value pair, sync.Map
first checks if the corresponding key exists in the read-only dictionary. If not, it may exist in the dirty dictionary, so it will try to delete the key-value pair from the dirty dictionary under lock protection.
Finally, sync.Map
sets the pointer to the value of that key-value pair to nil
, which is another way of logical deletion.
In addition, there is one more detail to note: the read-only dictionary and the dirty dictionary can be swapped with each other. When the dirty dictionary is accessed enough times, sync.Map
directly saves the dirty dictionary as the read-only dictionary in its read
field, and sets the value representing the dirty dictionary in the dirty
field to nil
.
After this, once new key-value pairs are stored, it will rebuild the dirty dictionary based on the read-only dictionary. At this time, it filters out the logically deleted key-value pairs from the read-only dictionary. Naturally, these conversions need to be done under lock protection.
- The interchange between read and dirty in
sync.Map
To sum up, the collections of key-value pairs in the read-only dictionary and the dirty dictionary of sync.Map
are not synchronized in real time; they may be different during certain periods of time.
Since the collection of keys in the read-only dictionary cannot be changed, the key-value pairs in it may sometimes be incomplete. On the other hand, the collection of key-value pairs in the dirty dictionary is always complete and does not include logically deleted key-value pairs.
Therefore, it can be seen that in situations where there are many read operations but few write operations, the performance of a concurrent-safe dictionary is often better. Among several write operations, adding key-value pairs has the greatest impact on the performance of the concurrent-safe dictionary, followed by deleting, and lastly modifying.
If the key-value pair being operated on already exists in the read-only dictionary of sync.Map
and has not been logically deleted, modifying it will not require a lock and will have little impact on performance.
Summary #
In these two articles, we discussed the sync.Map
type and how to ensure the correctness of the types of keys and values in a concurrent-safe dictionary.
To further clarify the actual types of keys and values in a concurrent-safe dictionary, there are roughly two options available.
-
One option is to fully determine the types of keys and values at compile time and rely on the Go language compiler to perform the necessary checks.
-
Another option is to accept dynamically typed settings and perform checks using reflection at runtime.
Both of these options have their pros and cons. The first option lacks extensibility, while the second option often affects program performance. In practical use, we generally need to make decisions based on objective testing.
Furthermore, in some cases, using sync.Map
can significantly reduce lock contention compared to using a native dictionary and mutex lock. sync.Map
itself does use locks, but it tries to avoid their usage as much as possible.
This brings us to the clever use of two native dictionaries in sync.Map
. One is referred to as the read-only dictionary, and the other is referred to as the dirty dictionary. By analyzing them, we learn about the use cases for concurrent-safe dictionaries and the performance impact of each operation.
Reflection Questions #
Today’s reflection question is: Can you think of any other solutions to ensure the correctness of the types of keys and values in a concurrent-safe dictionary?
Click here to view the accompanying detailed code for the Go language column article.