28 What Optimizations Has the JVM Made for Locks #

In this lesson, we’ll discuss the optimizations performed by the JVM on locks.

Compared with JDK 1.5, the HotSpot virtual machine in JDK 1.6 introduced many optimizations to improve the performance of the intrinsic synchronized lock. These include adaptive spinning, lock elimination, lock coarsening, biased locking, and lightweight locking. Together, these measures greatly improved the performance of synchronized. Let's look at each of these optimizations in turn.

Adaptive Spin Lock #

First, let's take a look at adaptive spin locks, starting with a review of spinning and its drawbacks. "Spinning" means not yielding the CPU and continuously attempting to acquire the lock in a loop, as in the following method from sun.misc.Unsafe (JDK 8):

public final long getAndAddLong(Object var1, long var2, long var4) {
    long var6;
    do {
        var6 = this.getLongVolatile(var1, var2);
    } while(!this.compareAndSwapLong(var1, var2, var6, var6 + var4));
    return var6;
}

The code uses a do-while loop to continuously try to modify the value of a long. The drawback of spinning is that if the spin time is too long, it will incur significant performance overhead and waste CPU resources.

In JDK 1.6, adaptive spin locks were introduced to solve the problem of spinning for a long time. “Adaptive” means that the spin time is no longer fixed, but determined based on factors such as the recent success and failure rates of spinning attempts, as well as the state of the lock owner. The duration of spinning is variable, making spin locks “smarter.” For example, if the recent attempt to spin and acquire a lock was successful, the next attempt might continue using spinning and allow for a longer spin time. However, if the recent attempt to spin and acquire a lock failed, it might skip the spinning process to reduce useless spinning and improve efficiency.
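The JVM's adaptive spinning is internal to HotSpot and not directly visible from Java code, but the basic idea of spinning can be sketched with a simple CAS-based spin lock. The class SpinLockDemo and its methods below are illustrative, not a real JDK API:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// A minimal (non-adaptive) spin lock: a thread that fails the CAS keeps
// retrying in a loop instead of blocking, which is exactly the behavior
// whose duration adaptive spinning tunes.
public class SpinLockDemo {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // busy-wait (spin) until the CAS from false -> true succeeds
        while (!locked.compareAndSet(false, true)) {
            // spinning: the thread stays on the CPU instead of blocking
        }
    }

    public void unlock() {
        locked.set(false);
    }

    // Two threads increment a shared counter under the spin lock;
    // mutual exclusion makes the final count deterministic.
    public static int raceIncrements(int perThread) {
        SpinLockDemo lock = new SpinLockDemo();
        int[] count = {0};
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                lock.lock();
                try {
                    count[0]++;
                } finally {
                    lock.unlock();
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(raceIncrements(10_000)); // prints 20000
    }
}
```

If one thread holds the lock for a long time, the other burns CPU cycles in the while loop the entire time, which is the drawback adaptive spinning mitigates by limiting how long a thread spins.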

Lock Elimination #

The second optimization is lock elimination. First, let’s take a look at the following code:

public class Person {
    private String name;
    private int age;
  
    public Person(String personName, int personAge) {
        name = personName;
        age = personAge;
    }
  
    public Person(Person p) {
        this(p.getName(), p.getAge());
    }
  
    public String getName() {
        return name;
    }
  
    public int getAge() {
        return age;
    }
}
  
class Employee {
    private Person person;
  
    // makes a defensive copy to protect against modifications by caller
    public Person getPerson() {
        return new Person(person);
    }
  
    public void printEmployeeDetail(Employee emp) {
        Person person = emp.getPerson();
  
        // this caller does not modify the object, so defensive copy was unnecessary
        System.out.println("Employee's name: " + person.getName() + "; age: " + person.getAge());
    }
}

In this code, the getPerson() method in the Employee class returns a defensive copy of the internal person object: it creates a new Person with identical contents so that the caller cannot modify the original. In this example, however, the copy is unnecessary, because printEmployeeDetail() never modifies the object; it only prints its fields. In that case the original person object could be used directly, without creating a new one.

If the compiler can determine that the original person object will never be modified, it may optimize away the creation of the new object. Lock elimination follows the same idea: after escape analysis, if the JVM finds that an object can never be accessed by any other thread, the object can be treated like stack-allocated data. Such data is visible only to the current thread, so it is inherently thread-safe and needs no locking, and the lock on it can be eliminated automatically.

For example, let’s take a look at the append method of StringBuffer:

@Override
public synchronized StringBuffer append(Object obj) {
    toStringCache = null;
    super.append(String.valueOf(obj));
    return this;
}

From the code, we can see that this method is synchronized because it may be used by multiple threads.

However, in most cases a StringBuffer is only used within a single thread. If the compiler can determine that a particular StringBuffer object is used by only one thread, thread safety is guaranteed without the lock. In that case the JIT compiler eliminates the corresponding lock operations from the compiled code, avoiding the overhead of acquiring and releasing the lock and improving overall efficiency.
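To see what kind of code qualifies, here is a minimal sketch (the class and method names are my own, not from the JDK) in which a StringBuffer never escapes the method that creates it, so escape analysis can prove it is confined to one thread and the JIT may elide the locking inside append():

```java
public class LockElisionDemo {
    // sb never escapes this method: it is not returned, not stored in a
    // field, and not passed to code that keeps a reference (toString()
    // returns a new String, not sb itself). Escape analysis can prove
    // this, so the JIT may remove the locks taken by each append() call.
    public static String concat(String a, String b, String c) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        sb.append(b);
        sb.append(c);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat("lock ", "elision ", "demo"));
    }
}
```

The optimization happens in compiled code and does not change the program's behavior; the method returns the same result whether or not the locks are elided.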

Lock Coarsening #

Next, let’s introduce lock coarsening. If we release the lock and immediately acquire it again without doing anything in between, as shown in the following code:

public void lockCoarsening() {
    synchronized (this) {
        //do something
    }
    synchronized (this) {
        //do something
    }
    synchronized (this) {
        //do something
    }
}

There is no need to release the lock and then immediately acquire it again. If we expand the synchronized region, that is, acquire the lock once at the beginning and release it once at the end, we can eliminate the meaningless releases and acquisitions in between, effectively merging several synchronized blocks into one larger block. The benefit is that the thread executing this code no longer needs to repeatedly acquire and release the lock, which reduces performance overhead.
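Written out by hand, the transformation looks like the sketch below. The JIT performs coarsening on compiled code, not on the source; the coarsened() method is only an illustration of the resulting shape, and the class name is mine:

```java
public class CoarseningSketch {
    private int counter;

    // Original shape: three back-to-back synchronized blocks, each
    // paying its own acquire/release.
    public void fineGrained() {
        synchronized (this) { counter++; }
        synchronized (this) { counter++; }
        synchronized (this) { counter++; }
    }

    // Conceptually what the JIT produces after coarsening: one merged
    // block with a single acquire and a single release.
    public void coarsened() {
        synchronized (this) {
            counter++;
            counter++;
            counter++;
        }
    }

    public int count() {
        return counter;
    }

    public static void main(String[] args) {
        CoarseningSketch s = new CoarseningSketch();
        s.fineGrained();
        s.coarsened();
        System.out.println(s.count()); // prints 6
    }
}
```

Both methods have the same effect; only the number of lock acquisitions differs.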

However, there is a downside to this approach, which is that the synchronized region becomes larger. If we do the same thing within a loop, as shown in the code:

for (int i = 0; i < 1000; i++) {
    synchronized (this) {
        //do something
    }
}

That is, if we expand the synchronized region and hold the lock from the beginning of the first iteration until the end of the last iteration, it will cause other threads to be unable to acquire the lock for a long time. Therefore, lock coarsening is not suitable for scenarios with loops, only for non-loop scenarios.

Lock coarsening is enabled by default. You can disable it with the -XX:-EliminateLocks flag.

Biased Locking/Lightweight Locking/Heavyweight Locking #

Next, let’s introduce biased locking, lightweight locking, and heavyweight locking. These locks were also mentioned when we discussed types of locks. These three types of locks specifically refer to the state of the synchronized lock, which is indicated by the Mark Word in the object header.

  • Biased Locking

For biased locking, the idea is that if a lock is never contended, there is no need to actually lock it, only to mark it. After an object is created and before any thread has acquired its lock, the lock is biasable. When the first thread acquires it, that thread is recorded as the owner of the bias. As long as the thread acquiring the lock later is the same recorded owner, it can take the lock directly with minimal overhead.

  • Lightweight Locking

The developers of the JVM found that in many cases the code inside synchronized blocks is executed by multiple threads alternately, meaning there is no real contention, or only brief contention that can be resolved with CAS (compare-and-swap). In such cases a heavyweight lock is unnecessary. When a lock is biased and a different thread accesses it, indicating contention, the biased lock is upgraded to a lightweight lock, and threads try to acquire it by spinning rather than blocking.

  • Heavyweight Locking

This lock uses the synchronization mechanism provided by the operating system, so it has higher overhead. When multiple threads have actual competition for the lock, and the lock competition lasts for a long time, neither biased locking nor lightweight locking can meet the requirements, and the lock will be further upgraded to a heavyweight lock. The heavyweight lock will block other threads that cannot acquire the lock.
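As a rough illustration of the access patterns these lock states target, the sketch below drives the same synchronized counter first from a single thread (the uncontended case biased locking optimizes) and then from several competing threads (sustained contention, which ends in a heavyweight monitor). The state transitions themselves happen inside HotSpot and cannot be observed from plain Java; the class and method names here are illustrative:

```java
public class LockStateScenarios {
    private int count;

    public synchronized void increment() { count++; }
    public synchronized int get() { return count; }

    // One thread reacquiring the same monitor repeatedly: no contention,
    // the case biased locking makes nearly free.
    public static int singleThreaded(int n) {
        LockStateScenarios c = new LockStateScenarios();
        for (int i = 0; i < n; i++) {
            c.increment();
        }
        return c.get();
    }

    // Several threads competing at once: sustained contention, where the
    // lock is eventually inflated to a heavyweight (OS-backed) monitor
    // and losing threads are blocked rather than left spinning.
    public static int contended(int threads, int perThread) {
        LockStateScenarios c = new LockStateScenarios();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.increment();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(singleThreaded(1_000_000)); // prints 1000000
        System.out.println(contended(4, 100_000));     // prints 400000
    }
}
```

Whichever state the lock is in, synchronized guarantees the same result; the states differ only in how cheaply the lock is acquired.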

Lock Upgrade Path #

Finally, let’s take a look at the upgrade path of locks, as shown in the diagram below. From no lock to biased lock, then to lightweight lock, and finally to heavyweight lock. Combined with the knowledge we discussed earlier, biased locking has the best performance, avoiding CAS operations. Lightweight locking avoids thread blocking and waking up caused by heavyweight locks by using spin and CAS, with medium performance. Heavyweight locking will block threads that cannot acquire the lock, with the worst performance.

[Diagram: lock upgrade path, from no lock to biased lock to lightweight lock to heavyweight lock]

By default, the JVM prefers biased locking and gradually upgrades if necessary, significantly improving lock performance.