33 What Are the Features of CopyOnWriteArrayList #
In this lesson, we will mainly discuss the characteristics of CopyOnWriteArrayList.
But before we talk about CopyOnWriteArrayList, let’s talk about the story before it was born. Actually, we already have ArrayList and LinkedList as the list implementations using arrays and linked lists, and we also have the thread-safe Vector and Collections.synchronizedList() that can be used. So let’s first take a look at the code for the size and get methods of the thread-safe Vector:
public synchronized int size() {
return elementCount;
}
public synchronized E get(int index) {
if (index >= elementCount)
throw new ArrayIndexOutOfBoundsException(index);
return elementData(index);
}
As you can see, Vector uses synchronized internally to ensure thread safety, and the locking granularity is relatively large, with locks at the method level. In high-concurrency situations, it is prone to contention and relatively low concurrency efficiency. In this regard, Vector is similar to Hashtable.
Moreover, the previous types of lists do not allow editing during iteration. If element additions or removals are performed during iteration, a ConcurrentModificationException will be thrown, which can be troublesome in many cases.
Therefore, starting from JDK 1.5, the Java concurrency package provides the CopyOnWriteArrayList as the primary concurrent List implementation using the CopyOnWrite mechanism. The CopyOnWrite concurrent collection also includes CopyOnWriteArraySet, which is implemented using CopyOnWriteArrayList. So today, we will use CopyOnWriteArrayList as a breakthrough to explore the characteristics of the CopyOnWrite container.
Applicable Scenarios #
- Read operations need to be as fast as possible, while slow writes are acceptable
In many application scenarios, there may be far more read operations than write operations. For example, some system-level information is often loaded or modified very few times, but it is frequently accessed by all modules within the system. In such scenarios, we hope that read operations can be as fast as possible, while slow writes are acceptable.
- More read operations and fewer write operations
The blacklist is the most typical scenario. Suppose we have a search website where users enter keywords in the search box to search for content, but some keywords are not allowed to be searched. These keywords that cannot be searched will be placed in a blacklist, which does not need to be updated in real-time and can be updated once a day. When a user searches, the current keyword will be checked against the blacklist. If it exists, a prompt will be displayed indicating that the keyword cannot be searched. This scenario with more read operations and fewer write operations is also suitable for using CopyOnWrite collections.
Read and Write Rules #
- Read-write lock rules
The idea of a read-write lock is that reads can be shared, while all other operations are mutually exclusive (write-write, read-write, write-read), because read operations do not modify the existing data, so concurrent reads have no safety issues. On the other hand, write operations are dangerous, so when a write operation occurs, no read operation should be allowed to join, and a second write thread should also not be allowed to join.
- Upgrade to the rules of read-write lock
The idea of CopyOnWriteArrayList goes even further than read-write locks. In order to maximize the performance of reads, reading in CopyOnWriteArrayList does not require any locks. Even more impressive, writing does not block reading operations, which means you can read while writing, and only synchronization is required between writes, meaning multiple writes are not allowed to occur at the same time, but reading is allowed to occur during writing. This greatly improves the performance of read operations.
Characteristics #
- Meaning of CopyOnWrite
From the name of CopyOnWriteArrayList, we can see that it is an ArrayList that satisfies the CopyOnWrite property. The meaning of CopyOnWrite is that when the container needs to be modified, it does not directly modify the current container, but instead makes a copy of the current container, then modifies the new container, and after the modification is complete, redirects the reference of the original container to the new container. This completes the entire modification process.
The benefit of doing this is that CopyOnWriteArrayList utilizes the principle of “immutability”. Since a new copy is made every time the container is modified, the old container is actually immutable and thread-safe, so no further synchronization operations are required. We can read the CopyOnWrite container concurrently without locking, because the current container does not add any elements or make modifications.
All modification operations (add, set, etc.) in CopyOnWriteArrayList are implemented by creating a new copy of the underlying array, so CopyOnWrite containers also reflect the idea of read-write separation, with different containers for reading and writing.
- Allow modification of the collection during iteration
We know that if we modify the content of an ArrayList during iteration, a ConcurrentModificationException will be thrown. Let’s analyze the reason why ArrayList throws this exception.
In the ArrayList source code, the ListItr’s next method has a checkForComodification method, which is as follows:
final void checkForComodification() {
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
}
First, it checks whether modCount
is equal to expectedModCount
. modCount
is a counter that keeps track of the number of modifications made to the collection. It increases every time we call methods like add
, remove
, or trimToSize
. expectedModCount
is a variable of the iterator that is initialized and recorded with the value of modCount
when the iterator is created. If during the iteration, modCount
and expectedModCount
are not the same, it means that someone has modified the contents of the collection, and an exception is thrown.
Unlike ArrayList
, the iterator of CopyOnWriteArrayList
does not throw a ConcurrentModificationException
if the array content is modified during iteration. This is because the iterator still uses the old array, but the iterated content may be outdated. The demo code below demonstrates this:
/**
* Describes how CopyOnWriteArrayList allows modification of the collection during iteration
*/
public class CopyOnWriteArrayListDemo {
public static void main(String[] args) {
CopyOnWriteArrayList<Integer> list = new CopyOnWriteArrayList<>(new Integer[]{1, 2, 3});
System.out.println(list); //[1, 2, 3]
//Get iterator 1
Iterator<Integer> itr1 = list.iterator();
//Add one element and verify list is updated
list.add(4);
System.out.println(list); //[1, 2, 3, 4]
//Get iterator 2
Iterator<Integer> itr2 = list.iterator();
System.out.println("====Verify Iterator 1 content====");
itr1.forEachRemaining(System.out::println); //1,2,3
System.out.println("====Verify Iterator 2 content====");
itr2.forEachRemaining(System.out::println); //1,2,3,4
}
}
This code first creates a CopyOnWriteArrayList
with an initial value of [1, 2, 3]
. The printed result at this point is obviously [1, 2, 3]
. Then we create an iterator called itr1
and add a new element after creating it using the list.add()
method. This adds the element 4 to the list, so the print result of the list is now [1, 2, 3, 4]
. We then create another iterator called itr2
and print the content generated by iterating through both iterators. The output of this code is:
[1, 2, 3]
[1, 2, 3, 4]
====Verify Iterator 1 content====
1
2
3
====Verify Iterator 2 content====
1
2
3
4
As you can see, the content printed by these two iterators is different. The first iterator prints [1, 2, 3]
, while the second iterator prints [1, 2, 3, 4]
. Although they are printed after the fourth element is added, their creation times are different. The first iterator is created when the list has only three elements, so it remains unaware of any modifications to the list that occur later.
This result illustrates that once the iterator of CopyOnWriteArrayList
is created, any additions to the original CopyOnWriteArrayList
object will not be reflected in the iterator, and no error will be thrown. This is a significant difference from ArrayList
.
Disadvantages #
These disadvantages apply not only to CopyOnWriteArrayList
, but also to other CopyOnWrite
containers:
- Memory Usage:
Because of the copy-on-write mechanism, writing operations in CopyOnWrite
containers require simultaneous memory allocation for two objects, resulting in additional memory usage.
- High cost of copying, especially when there are many or complex elements
The copying process not only consumes double memory, but also consumes CPU and other resources, which will reduce overall performance.
- Data consistency issues
Since modifications to CopyOnWrite containers first modify the copy, these modifications are not immediately visible to other threads. The changes are only reflected after the modification is complete. If you want the written data to be visible to other threads immediately, CopyOnWrite containers are not suitable.
Source Code Analysis #
- Data structure
/** Reentrant lock object */
final transient ReentrantLock lock = new ReentrantLock();
/** CopyOnWriteArrayList is backed by an array, modified with `volatile` to ensure array visibility */
private transient volatile Object[] array;
/**
* Get the array
*/
final Object[] getArray() {
return array;
}
/**
* Set the array
*/
final void setArray(Object[] a) {
array = a;
}
/**
* Initialize CopyOnWriteArrayList, which is equivalent to initializing the array
*/
public CopyOnWriteArrayList() {
setArray(new Object[0]);
}
In this class, there is a ReentrantLock lock used to ensure thread-safe modification operations. The Object[] array named as array
is modified with volatile
to ensure array visibility. This array is used to store elements, and it can be seen from the getArray()
, setArray()
, and the constructor of CopyOnWriteArrayList
that CopyOnWriteArrayList is implemented using an array, which is consistent with its name.
- add method
public boolean add(E e) {
// Lock
final ReentrantLock lock = this.lock;
lock.lock();
try {
// Get the length and elements of the original array
Object[] elements = getArray();
int len = elements.length;
// Create a new array by copying the original array
Object[] newElements = Arrays.copyOf(elements, len + 1);
// Add the new element to the new array
newElements[len] = e;
// Replace the reference of volatile Object[] array with the new array
setArray(newElements);
return true;
} finally {
lock.unlock();
}
}
The add
method is used to add elements to CopyOnWriteArrayList, which is a modification operation. First, a lock is obtained using the lock
method of ReentrantLock. After acquiring the lock, the length and elements of the original array are obtained using the getArray
method. Then, a new array is created by copying the original array using the Arrays.copyOf
method, which creates a new array with the same content as the original array. The new element is added to the new array. After completing the addition, the reference of the volatile Object[] array
is replaced with the new array using the setArray
operation. Finally, the lock is released in the finally
block.
Summary: When adding elements, first lock, then copy a new array, perform the addition operation on the new array, and then replace the reference of array
with the new array. Finally, unlock.
The above steps implement the CopyOnWrite idea: the write operation is performed on the copied container, and the list is not locked when reading data. It can be seen that if a new read thread enters during the process of copying the container, it will still read the old data because the object reference has not been changed at that time.
Next, let’s analyze the code of the read operation, which includes two overloaded methods of get
and the getArray
method. The code is as follows:
public E get(int index) {
return get(getArray(), index);
}
final Object[] getArray() {
return array;
}
private E get(Object[] a, int index) {
return (E) a[index];
}
It can be seen that the get-related operations are not locked, ensuring fast reading operations.
- Iterator COWIterator class
This iterator has two important properties: Object[] snapshot
and int cursor
. snapshot
represents the snapshot of the array at the time the iterator is created, and cursor
represents the iterator’s cursor. The constructor of the iterator is as follows:
private COWIterator(Object[] elements, int initialCursor) {
cursor = initialCursor;
snapshot = elements;
}
It can be seen that when the iterator is constructed, the elements
at that time is assigned to snapshot
. All subsequent operations of the iterator are based on the snapshot
array. For example:
public E next() {
if (!hasNext())
throw new NoSuchElementException();
return (E) snapshot[cursor++];
}
In the next
method, it can be seen that the returned content is from the snapshot
object. Therefore, even if the original array is modified later, the snapshot
will neither be aware of nor affected by those changes. The iterator does not require locking for iteration operations and will not throw exceptions due to concurrent modifications. The results returned by the iterator are consistent with the contents at the time of iterator creation.
Above, we have introduced CopyOnWriteArrayList. We have described the characteristics of Vector and Collections.synchronizedList() before its birth, the applicable scenarios and read/write rules of CopyOnWriteArrayList, and the two characteristics of write-on-copy and allowing modification during iteration. We also mentioned three drawbacks: memory consumption, high cost of copying in cases of many or complex elements, and data consistency issues. Finally, we analyzed its important source code.