30 What Are the Differences in Concurrent Hash Map Between Java7 and 8

30 What Are the Differences in ConcurrentHashMap Between Java7 and 8 #

In Java 8, there were significant upgrades made to the commonly used utility class ConcurrentHashMap compared to the Java 7 version. However, the design concept of the Segment in Java 7 still has value for reference and learning. Therefore, in many cases, interviewers will ask about the structures of ConcurrentHashMap in Java 7 and Java 8, as well as their similarities and differences. This lesson compares and introduces the characteristics and properties of ConcurrentHashMap in these two versions.

ConcurrentHashMap in Java 7 #

Let’s first take a look at the structure of ConcurrentHashMap in Java 7:

img

From the diagram, we can see that ConcurrentHashMap has been segmented internally. Each segment inherits the ReentrantLock, which can be understood as a lock. Each segment is independently locked and does not affect each other. This greatly improves concurrency efficiency compared to Hashtable, which locks the entire object for each operation. The locks of each segment are independent of each other, instead of having only one lock for the entire object.

The underlying data structure of each segment is similar to HashMap, which is a chain structure composed of arrays and linked lists. By default, there are 16 segments numbered from 0 to 15, which means up to 16 threads can operate concurrently (with operations distributed across different segments). This default value can be set to other values during initialization, but once it is initialized, it cannot be resized.

ConcurrentHashMap in Java 8 #

In Java 8, ConcurrentHashMap was almost completely rewritten. The code has increased from over 1,000 lines in Java 7 to over 6,000 lines, making it more challenging to read. To help us understand, let’s start with an overview of the structure and design ideas, and then delve into the details.

img

There are three types of nodes in the diagram.

  • The first type is the simplest, which represents an empty position that hasn’t been filled with an element yet.
  • The second type is similar to the chain structure in HashMap. The first node is inserted into each slot, and subsequent nodes with the same hash value are added as linked lists.
  • The third type of structure is a red-black tree structure, which was not present in the Java 7 ConcurrentHashMap. Before this, we might have had little contact with such a data structure.

When the length of the linked list in the second case exceeds a certain threshold (8 by default) and meets certain capacity requirements, ConcurrentHashMap converts the linked list into a red-black tree structure to further improve lookup performance. Therefore, one important change in Java 8 is the introduction of red-black tree design. Since red-black trees are not a common data structure, let’s briefly introduce their characteristics.

A red-black tree is a binary search tree in which each node has a color property, either red or black. The essence of a red-black tree is a balance strategy for binary search trees (BST). We can think of it as a balanced binary search tree with high lookup efficiency and automatic balancing to prevent extreme imbalances that affect lookup efficiency.

Due to its self-balancing nature, meaning that the heights of the left and right subtrees are almost the same, its lookup performance is similar to binary search, with a time complexity of O(log(n)). In contrast, the time complexity of a linked list is different. In the worst case, it may need to traverse the entire linked list to find the target element, resulting in a time complexity of O(n), which is much greater than the O(log(n)) complexity of a red-black tree. This advantage becomes more apparent as the number of nodes increases.

Some other characteristics of red-black trees:

  • Each node is either red or black, with the root node always being black.
  • Red nodes cannot be consecutive, meaning that neither their children nor parents can be red.
  • Every path from a node to each of its leaf nodes contains the same number of black nodes.

Thanks to these rules and requirements, red-black trees ensure high lookup performance. This explains why Java 8 introduced red-black trees to ConcurrentHashMap. The advantage is that it avoids the situation where a conflict-linked list becomes very long, resulting in slow query performance in extreme cases. Since red-black trees have the characteristic of self-balancing, even in extreme cases, query efficiency can still be guaranteed to be O(log(n)).

Analysis of Important Source Code of ConcurrentHashMap in Java 8 #

Earlier, we explained the main structures of ConcurrentHashMap in Java 7 and Java 8. Now, let’s proceed with an in-depth analysis of the source code. Since Java 7 is outdated, we will focus on the source code analysis of Java 8.

Node #

First, let’s take a look at the most basic internal storage structure, the Node. This represents an individual node, and the code snippet below shows a key-value pair within a Node:

static class Node<K,V> implements Map.Entry<K,V> {

    final int hash;

    final K key;

    volatile V val;

    volatile Node<K,V> next;

    // ...

}

As we can see, each Node contains a key-value pair, with the value marked as volatile to ensure visibility. Additionally, there is a next pointer that refers to the next node, creating a linked list structure.

Now, let’s examine two of the most important and fundamental methods.

put Method Source Code Analysis #

The core of the put method is the putVal method. For ease of reading, I’ll provide explanations in comments within the source code. Let’s analyze this crucial method step by step, as it is a bit lengthy.

final V putVal(K key, V value, boolean onlyIfAbsent) {

    if (key == null || value == null) {

        throw new NullPointerException();

    }

    // Calculate the hash value

    int hash = spread(key.hashCode());
int binCount = 0;

for (Node<K, V>[] tab = table; ; ) {

    Node<K, V> f;

    int n, i, fh;

    // If the array is empty, initialize it
    if (tab == null || (n = tab.length) == 0) {
        tab = initTable();
    }

    // Find the index in the array corresponding to the hash value
    else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
        // If the position is empty, use CAS to insert the new value
        if (casTabAt(tab, i, null,
                new Node<K, V>(hash, key, value, null))) {
            break;
        }
    }

    // If the hash value is MOVED, it means the table is being resized
    else if ((fh = f.hash) == MOVED) {
        tab = helpTransfer(tab, f);
    }

    // If the slot has a value
    else {
        V oldVal = null;

        // Synchronize on the current slot to ensure thread safety
        synchronized (f) {
            if (tabAt(tab, i) == f) {

                // If it is in the form of a linked list
                if (fh >= 0) {
                    binCount = 1;

                    // Traverse the linked list
                    for (Node<K, V> e = f; ; ++binCount) {
                        K ek;

                        // If the key already exists, check if it needs to be overwritten and return
                        if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                        (ek != null && key.equals(ek)))) {
                            oldVal = e.val;
                            if (!onlyIfAbsent) {
                                e.val = value;
                            }
                            break;
                        }

                        Node<K, V> pred = e;

                        // If the key is not found at the end of the linked list, it means it didn't exist before, so add the new value to the end of the list

ConcurrentHashMap in Java 7 and Java 8 #

Introduction #

ConcurrentHashMap is a thread-safe implementation of the Map interface in Java. It allows multiple threads to access and modify the map concurrently without the need for external synchronization. In this article, we will compare ConcurrentHashMap in Java 7 and Java 8 and discuss their similarities, differences, and advantages.

Implementation #

Java 7 #

In Java 7, ConcurrentHashMap is implemented using a segmented structure. It consists of multiple segments, where each segment is essentially a separate hash table, and each segment is guarded by a separate lock. This allows multiple threads to concurrently access different segments without contention. When a thread needs to access a particular segment, it acquires the lock for that segment. This approach ensures that only threads accessing the same segment are blocked, while other segments remain accessible.

Java 8 #

In Java 8, ConcurrentHashMap has a different internal structure. It uses an array of Node objects, where each Node contains a key-value pair. Each array index represents a bucket, which can contain multiple Node objects. If multiple threads access different buckets, they can do so concurrently without any contention. However, if multiple threads access the same bucket, synchronization is still required.

To optimize performance, Java 8’s ConcurrentHashMap introduced the concept of treeified bins. When a bucket’s size exceeds a certain threshold, it is converted into a balanced binary tree, specifically a red-black tree. This improves the time complexity of operations like searching and inserting from O(n) for linked lists to O(log(n)) for red-black trees. This optimization provides a significant performance improvement for large maps with frequent operations.

Key Differences #

  1. Data Structure: Java 7 ConcurrentHashMap uses a segmented structure, whereas Java 8 ConcurrentHashMap uses an array + linked list + red-black tree structure.

  2. Concurrency Level: Java 7 ConcurrentHashMap’s concurrency level is determined by the number of segments, while Java 8 ConcurrentHashMap’s concurrency level is determined by the number of buckets in the array.

  3. Concurrency Control: Java 7 ConcurrentHashMap uses segment-level locking, whereas Java 8 ConcurrentHashMap uses Node + CAS (Compare and Swap) + synchronized to ensure thread safety.

  4. Handling Hash Collisions: Java 7 ConcurrentHashMap handles hash collisions using linked lists (chained approach), while Java 8 ConcurrentHashMap uses linked lists initially and converts them into red-black trees when the list size exceeds a certain threshold.

  5. Query Performance: Java 7 ConcurrentHashMap has a query time complexity of O(n) for linked lists, while Java 8 ConcurrentHashMap has a query time complexity of O(log(n)) for red-black trees.

Conclusion #

Java 7 and Java 8 ConcurrentHashMap differ in their internal implementation, concurrency control mechanism, and handling of hash collisions. Java 8 ConcurrentHashMap provides better concurrency control and improved performance for large maps. However, it’s important to note that the choice between Java 7 and Java 8 ConcurrentHashMap depends on the specific requirements of the application and the performance characteristics of the workload.