35 Concurrent Safe Dictionary Sync. Map Part 2

35 Concurrent Safe Dictionary sync #

Hello, I’m Haolin. Today, we will continue sharing about the concurrent safe dictionary sync.Map.

In the previous article, we talked about how the methods provided by sync.Map involve the types of keys and values being interface{}, so when we call these methods, we often need to check the actual types of the keys and values.

There are roughly two possible solutions for this. In the previous article, we mentioned the first solution, which is to fully determine the types of keys and values during coding and let the Go compiler handle the checks for us.

This approach is convenient, isn’t it? However, although it is convenient, it also makes such dictionary types lack some flexibility.

If we also need a concurrent safe dictionary with a key type of uint32, we will have to duplicate the code in the same way. Therefore, with diverse requirements, the workload becomes even larger, and it can even result in a lot of duplicate code.

Knowledge Expansion #

Question 1: How to ensure the correctness of the types of keys and values in a concurrently safe dictionary? (Solution 2) #

So, if we want to maintain the flexibility of sync.Map type while constraining the types of keys and values, what should we do? This involves the second solution.

In the second solution, all the methods of the struct type we encapsulate should be identical to the ones in sync.Map type (including method name and signature).

However, in these methods, we need to add some code for type checking. Additionally, in this case, the types of keys and values in the concurrently safe dictionary must be completely determined during initialization. And in this situation, we must ensure that the key type is comparable first.

Therefore, just including the sync.Map type field is not enough when designing such a struct type.

For example:

type ConcurrentMap struct {
 m         sync.Map
 keyType   reflect.Type
 valueType reflect.Type
}

Here, ConcurrentMap represents a concurrently safe dictionary with customizable key and value types. This type also has a field m, which is of type sync.Map, representing the internally used concurrently safe dictionary.

In addition, its fields keyType and valueType are used to store the key type and value type, respectively. The types of these two fields are reflect.Type, which can represent any data type in Go. And it is very easy to obtain a value of this type: simply call the reflect.TypeOf function and pass in a sample value.

The result value of the expression reflect.TypeOf(int(123)) represents the reflect type value of the int type.

Now let’s take a look at how the methods of ConcurrentMap type should be written.

First, let’s talk about the Load method. This method accepts a parameter key of type interface{}, representing the value of a certain key.

Therefore, when we search for key-value pairs in the m field of ConcurrentMap, we must ensure that the type of ConcurrentMap is correct. Since reflect type values can be directly compared using the == or != operators, the type checking code here is very simple.

func (cMap *ConcurrentMap) Load(key interface{}) (value interface{}, ok bool) {
 if reflect.TypeOf(key) != cMap.keyType {
  return
 }
 return cMap.m.Load(key)
}

By passing an interface type value to the reflect.TypeOf function, we can obtain the reflect type value corresponding to the actual type of this value.

Therefore, if the reflect type of the parameter value is not equal to the reflect type represented by the keyType field, then we skip the subsequent operations and return directly.

In this case, the first result value of the Load method has a value of nil, and the second result ok has a value of false. This is consistent with the original meaning of the Load method.

Next, let’s talk about the Store method. The Store method accepts two parameters key and value, both of which are of type interface{}. Therefore, our type checking should be done for them.

func (cMap *ConcurrentMap) Store(key, value interface{}) {
 if reflect.TypeOf(key) != cMap.keyType {
  panic(fmt.Errorf("wrong key type: %v", reflect.TypeOf(key)))
 }
 if reflect.TypeOf(value) != cMap.valueType {
  panic(fmt.Errorf("wrong value type: %v", reflect.TypeOf(value)))
 }
 cMap.m.Store(key, value)
}

The type checking code here is very similar to the one in the Load method, but the handling of the checking result is different. When the actual type of the key or value parameter does not meet the requirements, the Store method will immediately panic.

This is mainly because the Store method does not have a result declaration, so it cannot inform the caller in a more peaceful way when there is a problem with the parameter value. However, this is also in line with the original meaning of the Store method.

If you don’t want to do it this way, that’s also possible. In that case, you need to add a result of type error to the Store method.

And when the incorrect type of the parameter value is found, let it directly return the corresponding error value instead of panicking. Keep in mind that what’s shown here is just a reference implementation, and you can optimize and improve it based on the actual application scenario.

As for the other methods and functions related to the ConcurrentMap type, I won’t show them here. They don’t have any special differences in type checking methods and processing flow. You can see this code in the file demo72.go.

In summary, the first solution is suitable when we can completely determine the specific types of keys and values. In this case, we can use the Go compiler for type checking and use type assertion expressions as assistance, just like in IntStrMap.

In the second solution, we don’t need to determine the types of keys and values before the program runs. We just need to dynamically assign them during the initialization of the concurrently safe dictionary. This mainly requires using functions and data types in the reflect package, along with some simple equality comparison operations.

The first solution has an obvious flaw, which is the lack of flexibility to change the types of keys and values in the dictionary. Once the requirements become varied, the coding workload will increase.

The second solution nicely fills this gap. However, those reflection operations will more or less reduce the performance of the program. We often need to obtain and compare various metrics of the program through rigorous and consistent testing, which can serve as an important basis for selecting a solution.

Question 2: How does a concurrent-safe dictionary minimize the use of locks? #

sync.Map type uses a lot of atomic operations internally to access and store keys and values, and it uses two native maps as storage media.

One of the native maps is stored in the read field of sync.Map, which is of type sync/atomic.Value. This native map can be seen as a snapshot, which will always re-save all the key-value pairs contained in the sync.Map value when conditions are met.

For convenience, we will refer to it as the read-only dictionary. However, although the read-only dictionary does not increase or decrease its keys, it allows changes to the values associated with those keys. So, it is not a snapshot in the traditional sense; its read-only nature applies only to the collection of keys within it.

From the type of the read field, it can be seen that sync.Map does not need a lock when replacing the read-only dictionary. Additionally, when storing key-value pairs in the read-only dictionary, it is wrapped in a layer above the value.

It first converts the value to a unsafe.Pointer type value, then encapsulates the latter and stores it in the native map. In this way, atomic operations can be used when changing the value associated with a key.

The other native map in sync.Map is represented by its dirty field. It stores key-value pairs in the same way as the native map in the read field. Its key type is also interface{}, and it is also converted and encapsulated before being stored. Let’s call it the dirty dictionary for now.

Note that if both the dirty dictionary and the read-only dictionary store the same key-value pair, the two keys here definitely refer to the same base value, and the same is true for the two values.

As mentioned earlier, when storing keys and values, these two dictionaries only store a pointer to them, not the base values.

When sync.Map looks up the value associated with a specified key, it always searches in the read-only dictionary first, without needing to lock a mutex. It only accesses the dirty dictionary under lock protection when it is determined that “the key may not exist in the read-only dictionary, but may be in the dirty dictionary.”

Similarly, when storing a key-value pair, if the read-only dictionary already contains the key and the key-value pair is not marked as “deleted,” it will store the new value in it and return directly, without needing a lock.

Otherwise, it will store the key-value pair in the dirty dictionary under lock protection. At this point, the “deleted” flag of the key-value pair will be cleared.

read and dirty in sync.Map

By the way, only when a key-value pair should be deleted but still exists in the read-only dictionary will it be logically deleted by marking it as “deleted,” instead of being directly physically deleted.

This situation occurs for a period of time after rebuilding the dirty dictionary. However, soon after, they will be truly deleted. Already logically deleted key-value pairs will always be ignored when looking up and traversing key-value pairs.

When deleting a key-value pair, sync.Map first checks if the corresponding key exists in the read-only dictionary. If not, it may exist in the dirty dictionary, so it will try to delete the key-value pair from the dirty dictionary under lock protection.

Finally, sync.Map sets the pointer to the value of that key-value pair to nil, which is another way of logical deletion.

In addition, there is one more detail to note: the read-only dictionary and the dirty dictionary can be swapped with each other. When the dirty dictionary is accessed enough times, sync.Map directly saves the dirty dictionary as the read-only dictionary in its read field, and sets the value representing the dirty dictionary in the dirty field to nil.

After this, once new key-value pairs are stored, it will rebuild the dirty dictionary based on the read-only dictionary. At this time, it filters out the logically deleted key-value pairs from the read-only dictionary. Naturally, these conversions need to be done under lock protection.

- The interchange between read and dirty in sync.Map

To sum up, the collections of key-value pairs in the read-only dictionary and the dirty dictionary of sync.Map are not synchronized in real time; they may be different during certain periods of time.

Since the collection of keys in the read-only dictionary cannot be changed, the key-value pairs in it may sometimes be incomplete. On the other hand, the collection of key-value pairs in the dirty dictionary is always complete and does not include logically deleted key-value pairs.

Therefore, it can be seen that in situations where there are many read operations but few write operations, the performance of a concurrent-safe dictionary is often better. Among several write operations, adding key-value pairs has the greatest impact on the performance of the concurrent-safe dictionary, followed by deleting, and lastly modifying.

If the key-value pair being operated on already exists in the read-only dictionary of sync.Map and has not been logically deleted, modifying it will not require a lock and will have little impact on performance.

Summary #

In these two articles, we discussed the sync.Map type and how to ensure the correctness of the types of keys and values in a concurrent-safe dictionary.

To further clarify the actual types of keys and values in a concurrent-safe dictionary, there are roughly two options available.

  • One option is to fully determine the types of keys and values at compile time and rely on the Go language compiler to perform the necessary checks.

  • Another option is to accept dynamically typed settings and perform checks using reflection at runtime.

Both of these options have their pros and cons. The first option lacks extensibility, while the second option often affects program performance. In practical use, we generally need to make decisions based on objective testing.

Furthermore, in some cases, using sync.Map can significantly reduce lock contention compared to using a native dictionary and mutex lock. sync.Map itself does use locks, but it tries to avoid their usage as much as possible.

This brings us to the clever use of two native dictionaries in sync.Map. One is referred to as the read-only dictionary, and the other is referred to as the dirty dictionary. By analyzing them, we learn about the use cases for concurrent-safe dictionaries and the performance impact of each operation.

Reflection Questions #

Today’s reflection question is: Can you think of any other solutions to ensure the correctness of the types of keys and values in a concurrent-safe dictionary?

Click here to view the accompanying detailed code for the Go language column article.