04 What's the Difference Between Strong, Soft, Weak and Phantom References

In Java, references fall into four types: strong, soft, weak, and phantom references. They differ in how they are used and in the scenarios they suit.

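  1. Strong Reference: the most common reference type. As long as a strong reference to an object exists, the garbage collector will not reclaim that object. Assigning an object created with the new keyword to an ordinary variable produces a strong reference. In general, the objects we use are all held through strong references.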

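  2. Soft Reference: refers to objects that are useful but not essential, that is, objects the garbage collector may reclaim when memory runs low. The JVM reclaims softly referenced objects before it throws an OutOfMemoryError. Soft references are created with the java.lang.ref.SoftReference class.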

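  3. Weak Reference: describes non-essential objects, which survive only until the next garbage collection. When the garbage collector scans such an object and finds that it is referenced only by weak references, it reclaims the object. Weak references are created with the java.lang.ref.WeakReference class.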

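  4. Phantom Reference: sometimes translated as virtual reference, this is the weakest kind of reference. It has no effect on the lifetime of the object and cannot be used to obtain the object; it serves only to track when the object is garbage collected. When the garbage collector reclaims the object, it inserts the reference into a reference queue. Phantom references are created with the java.lang.ref.PhantomReference class.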

Typical usage scenarios for these reference types are as follows:

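  • Strong reference: most objects are held through strong references; as long as the reference exists, the object is not reclaimed.
  • Soft reference: suited to memory-sensitive caches, where cached objects can be reclaimed when memory runs low instead of an OutOfMemoryError being thrown.
  • Weak reference: suited to non-essential objects that occupy a lot of memory; once such an object is referenced only by weak references, it is reclaimed.
  • Phantom reference: typically used together with a reference queue to track when an object is garbage collected.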

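Understanding the differences between these reference types and where each applies helps us manage object lifecycles better and optimize system performance and resource usage.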

Typical Answer

Different types of references mainly reflect the reachability status of objects and their impact on garbage collection.

A strong reference (“Strong” Reference) is the most common, ordinary kind of object reference. As long as a strong reference points to an object, the object is still “alive” and the garbage collector will not touch it. An ordinary object with no other reference relationships becomes eligible for garbage collection once its reference goes out of scope or the corresponding (strong) reference is explicitly set to null. The exact timing of collection, however, still depends on the garbage collection strategy.

A soft reference (SoftReference) is somewhat weaker than a strong reference. It exempts its referent from some garbage collections: the JVM attempts to collect an object that is only softly referenced when it determines that memory is insufficient. The JVM guarantees that it clears softly referenced objects before throwing an OutOfMemoryError. Soft references are often used to implement memory-sensitive caches. While free memory remains, the cache can be kept; when memory runs short, it can be cleaned up. This lets us use a cache without exhausting memory.
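The memory-sensitive cache idea can be sketched roughly as follows. This is only a minimal illustration: `SoftCache` and its `loader` function are hypothetical names, not part of the JDK, and a production cache would also need to evict entries whose references have been cleared.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper for illustration; not a JDK or library class.
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // get() returns null if the key was never cached or the JVM cleared the referent
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key); // rebuild the value on a miss
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

A caller would write something like `new SoftCache<String, byte[]>(path -> load(path))`, where `load` stands for whatever expensive computation is being cached. Note that the map itself still holds the keys and the cleared `SoftReference` objects strongly.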

A weak reference (WeakReference) does not exempt objects from garbage collection. It only provides a way to access objects in a weakly referenced state. This can be used to establish a relationship without specific constraints, such as maintaining a non-mandatory mapping relationship. If the object is still available when attempting to access it, it can be used; otherwise, it can be reinstantiated. Weak references are also a choice for many cache implementations.
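One common way to maintain such a non-mandatory mapping is `java.util.WeakHashMap`, which holds its keys through weak references. The sketch below assumes a hypothetical `Metadata` helper that attaches a label to arbitrary objects without keeping them alive.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical helper for illustration; not a JDK or library class.
final class Metadata {
    // Keys are held weakly: an entry does not prevent its key from being collected.
    private static final Map<Object, String> LABELS = new WeakHashMap<>();

    static synchronized void label(Object target, String text) {
        LABELS.put(target, text);
    }

    // Returns the label if the mapping still exists, otherwise a fallback value.
    static synchronized String labelOf(Object target) {
        return LABELS.getOrDefault(target, "<unlabeled>");
    }
}
```

Once the labeled object is no longer strongly reachable elsewhere, its entry becomes eligible for removal from the map; exactly when that happens is up to the garbage collector.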

As for phantom references, sometimes translated as virtual references, you cannot access objects through them. Phantom references only provide a mechanism to do something after an object has been finalized. For example, they are often used to implement post-mortem cleanup mechanisms, such as the Java platform’s own Cleaner mechanism that I introduced in a previous column. Some people also utilize phantom references to monitor object creation and destruction.
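As an illustration of post-mortem cleanup, here is a minimal sketch using the public `java.lang.ref.Cleaner` API. It assumes Java 9 or later; `TrackedResource` and `State` are hypothetical names chosen for this example.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical class for illustration; assumes Java 9+ (java.lang.ref.Cleaner).
final class TrackedResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must not reference the TrackedResource itself,
    // otherwise the resource would stay strongly reachable forever.
    private static final class State implements Runnable {
        final AtomicBoolean released = new AtomicBoolean(false);

        @Override
        public void run() {
            released.set(true); // release the underlying resource here
        }
    }

    private final State state = new State();
    private final Cleaner.Cleanable cleanable = CLEANER.register(this, state);

    boolean isReleased() {
        return state.released.get();
    }

    @Override
    public void close() {
        // Runs the cleanup action at most once, even if the Cleaner fires later as well
        cleanable.clean();
    }
}
```

Explicit `close()` (for example via try-with-resources) remains the primary path; the Cleaner only acts as a safety net if the object becomes phantom reachable without having been closed.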

Analysis of the Test Point

This interview question belongs to a category that is both niche and high-frequency. It is considered niche because in most application development scenarios, direct manipulation of different references is rarely needed, even though the libraries and frameworks we use may utilize these mechanisms. It is frequently asked because it is a comprehensive question that tests our understanding of fundamental concepts as well as our knowledge of the underlying object lifecycle and garbage collection mechanisms.

A deep understanding of these references can be very helpful in designing reliable frameworks such as caching systems or diagnosing issues like Out-of-Memory (OOM) errors in applications. For example, diagnosing memory leaks in the MySQL connector-j driver under specific conditions (useCompression=true) requires understanding how to detect the accumulation of phantom references.

Knowledge Expansion

  1. Object Reachability State Transition Analysis

First, please take a look at the flowchart below. I have summarized the object lifecycle, different reachability states, and the possible transitions between states. It may not be 100% accurate, but it can help illustrate the changes in reachability.

Let me explain the specific states in the diagram. These are the different reachability levels defined in Java:

  • Strongly Reachable: When an object can be accessed by one or more threads without any hindrance. For example, when we create a new object, the thread that creates it has a strong reference to it.
  • Softly Reachable: When we can only access the object through a soft reference.
  • Weakly Reachable: Similar to what was mentioned earlier, this is the state in which the object cannot be reached through a strong or soft reference and can only be accessed through a weak reference. This state is very close to finalization: once the weak reference is cleared, the object meets the conditions for finalization.
  • Phantom Reachable: As illustrated in the flowchart, this state occurs when there are no strong, soft, or weak references to the object and the object has been finalized, with only phantom references pointing to it.
  • Finally, there is a state called unreachable, which means that the object can be cleared.

Determining object reachability is an important consideration for JVM garbage collectors.

All reference types are subclasses of the abstract class java.lang.ref.Reference. You may notice that it provides a get() method.

Except for phantom references (whose get() always returns null), we can use the get() method to retrieve the original object as long as it has not been destroyed. This means that we can change the reachability state of an object artificially: by assigning the object obtained through a soft or weak reference to a strong reference again. That’s why I drew bidirectional arrows in some places in the diagram.
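The behavior of get() can be observed with a small sketch like the one below. `GetDemo` and `readAll` are hypothetical names; the object stays strongly reachable throughout because the caller keeps using `target` after the call.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class GetDemo {
    // Returns what get() yields for each reference kind.
    static Object[] readAll(Object target) {
        SoftReference<Object> soft = new SoftReference<>(target);
        WeakReference<Object> weak = new WeakReference<>(target);
        PhantomReference<Object> phantom =
                new PhantomReference<>(target, new ReferenceQueue<>());
        return new Object[] { soft.get(), weak.get(), phantom.get() };
    }

    public static void main(String[] args) {
        Object target = new Object();
        Object[] seen = readAll(target);
        System.out.println(seen[0] == target); // true: soft reference returns the object
        System.out.println(seen[1] == target); // true: weak reference returns the object
        System.out.println(seen[2] == null);   // true: phantom get() always returns null
    }
}
```

Assigning `seen[0]` or `seen[1]` to a field or variable is exactly the step that makes the object strongly reachable again.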

Therefore, for soft references, weak references, and the like, the garbage collector may need a second check to make sure that an object in a weakly referenced state has not become strongly reachable again.

However, can you think of any potential issues here?

Right: if we mistakenly keep a strong reference (for example, by assigning the object to a static variable), the object may never get the chance to return to a weak-reference-like reachability state, and a memory leak results. Checking whether a weakly referenced object has been garbage collected is therefore one way to diagnose certain memory leaks. If our framework uses weak references and we suspect a leak, we can investigate from this angle.
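A rough way to perform such a check is sketched below. `LeakCheck` is a hypothetical diagnostic helper; because System.gc() is only a hint, a result of false is a reason to investigate further rather than proof of a leak.

```java
import java.lang.ref.WeakReference;

// Hypothetical diagnostic helper for illustration only.
final class LeakCheck {
    // Requests GC a few times and reports whether the weak reference was cleared.
    static boolean clearedAfterGc(WeakReference<?> ref) {
        for (int attempt = 0; attempt < 5 && ref.get() != null; attempt++) {
            System.gc();
            try {
                Thread.sleep(50); // give the collector a moment to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }
}
```

If the object is still strongly held somewhere, for example by a forgotten static field, the reference is never cleared and the method keeps returning false.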

  2. Usage of ReferenceQueue

When programming with different references, we inevitably need to mention the reference queue. When creating various references and associating them with the corresponding objects, we can choose whether to associate them with a reference queue. The JVM will enqueue the references into the queue at specific times, and we can retrieve the references from the queue (the remove method here actually means retrieval) for subsequent logic. Especially for phantom references, the get method only returns null, and it is almost meaningless without specifying a reference queue. Take a look at the example code below. By using the reference queue, we can perform post-processing logic when the object is in the corresponding state (for phantom references, that is when it has been finalized and is in the phantom reachable state).

import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

Object counter = new Object();
ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
PhantomReference<Object> p = new PhantomReference<>(counter, refQueue);
counter = null;  // drop the only strong reference
System.gc();     // only a hint; collection is not guaranteed to happen here
try {
    // remove() is a blocking method: it can take a timeout or block indefinitely
    Reference<?> ref = refQueue.remove(1000L);
    if (ref != null) {
        // do something
    }
} catch (InterruptedException e) {
    // Handle it
}

  3. Explicitly Affecting Garbage Collection of Soft References

Previously, we briefly mentioned the impact of references on garbage collection, especially for soft references. How does the JVM internally handle them? It is not very clear. So, can we use any methods to affect the garbage collection of soft references?

The answer is yes. A soft reference usually survives for some time after it was last accessed, and the default length of that period is calculated from the remaining heap space (in megabytes). Starting from Java 1.3.1, the -XX:SoftRefLRUPolicyMSPerMB parameter lets us set this value in milliseconds. The following example sets it to 3 seconds (3000 milliseconds).

-XX:SoftRefLRUPolicyMSPerMB=3000

This remaining space is actually influenced by different JVM modes. For Client mode, such as the commonly used Windows 32-bit JDK, the remaining space is calculated based on the currently available space in the heap, making it more likely to trigger garbage collection. However, for server mode JVM, it is calculated based on the maximum value specified by -Xmx.
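As a rough model of the rule described above, the retention window can be estimated as the free heap in megabytes multiplied by the configured milliseconds per megabyte. The sketch below is a simplification under that assumption, not the JVM's actual code; `SoftRefPolicy` and `retentionMillis` are hypothetical names.

```java
// Hypothetical helper for illustration; a simplified model, not the JVM's implementation.
final class SoftRefPolicy {
    // Estimated time a soft reference survives after its last access:
    // free heap (MB) * SoftRefLRUPolicyMSPerMB (ms per MB).
    static long retentionMillis(long freeHeapMb, long msPerMb) {
        return freeHeapMb * msPerMb;
    }

    public static void main(String[] args) {
        // 100 MB of free heap with -XX:SoftRefLRUPolicyMSPerMB=3000
        System.out.println(retentionMillis(100, 3000)); // 300000 ms, i.e. 5 minutes
    }
}
```

Which value counts as "free heap" depends on the mode, as explained above: the currently available space in Client mode, or the space relative to -Xmx in Server mode.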

In essence, this behavior is still a black box, depending on the JVM implementation. Even the aforementioned parameter may not be effective in newer versions of JDK, and the Client mode JDKs have gradually disappeared from the stage of history. Therefore, when we use it in our application, we can refer to similar configurations but should not rely too much on them.

  4. Diagnosing JVM Reference Situation

If you suspect that your application has garbage collection issues caused by references (or finalization), there are many tools and options available to you. For example, the HotSpot JVM itself provides explicit options (PrintReferenceGC) to obtain relevant information. I specified the following options to run a sample application using JDK 8:

-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintReferenceGC

This is the garbage collection log collected with ParallelGC in JDK 8, in which the number of each kind of reference is clearly visible.

0.403: [GC (Allocation Failure) 0.871: [SoftReference, 0 refs, 0.0000393 secs]0.871: [WeakReference, 8 refs, 0.0000138 secs]0.871: [FinalReference, 4 refs, 0.0000094 secs]0.871: [PhantomReference, 0 refs, 0 refs, 0.0000085 secs]0.871: [JNI Weak Reference, 0.0000071 secs][PSYoungGen: 76272K->10720K(141824K)] 128286K->128422K(316928K), 0.4683919 secs] [Times: user=1.17 sys=0.03, real=0.47 secs]

Note: JDK 9 has extensively redesigned the JVM and garbage collection logs. Options like PrintGCTimeStamps and PrintReferenceGC no longer exist. I will explain this topic in a more systematic way in the garbage collection section of my column later.

  5. Reachability Fence

In addition to the basic reference types I introduced earlier, we can also achieve the effect of strong references through low-level APIs, which is called “reachability fence”.

Why do we need this mechanism? Consider the following scenario: according to the Java language specification, if an object has no strong references, it meets the criteria for garbage collection. However, sometimes the object itself does not have a strong reference, but some of its attributes may still be in use. This can lead to unexpected problems, so we need a way to notify the JVM that an object is being used even without a strong reference. It’s a bit complicated to explain, so let’s take a look at an example provided in Java 9.

class Resource {
 private static ExternalResource[] externalResourceArray = ...
 int myIndex;
 
 Resource(...) {
     myIndex = ...
     externalResourceArray[myIndex] = ...;
     ...
 }
 
 protected void finalize() {
     externalResourceArray[myIndex] = null;
     ...
 }
 
 public void action() {
     try {
         // Code that needs to be protected
         int i = myIndex;
         Resource.update(externalResourceArray[i]);
     } finally {
         // Keep this object strongly reachable until this point
         Reference.reachabilityFence(this);
     }
 }
 
 private static void update(ExternalResource ext) {
    ext.status = ...;
 }
} 

The execution of the action method depends on some attributes of the object, so it is protected in a specific way. Otherwise, if we call the code like this:

new Resource().action()

This may cause problems: no strong reference points to the Resource object we created, so it is perfectly legal for the JVM to finalize it while action() is still running.

Similar code structures seem to be common in asynchronous programming, as asynchronous programming often does not use the traditional “execute -> return -> use” structure.

Before Java 9, implementing similar functionality was relatively cumbersome, and sometimes required some obscure tricks. Fortunately, java.lang.ref.Reference provides us with a new method as part of JEP 193: Variable Handles, which exposes some capabilities of the Java platform’s underlying layers:

static void reachabilityFence(Object ref)

In the JDK source code, reachabilityFence is mostly used in Executors or similar new HTTP/2 client code, most of which are cases of asynchronous calls. In programming, you can use try-finally to surround the code segment that needs reachability guarantees, and explicitly declare the object as strongly reachable in the finally block.

Today, I have summarized several reference types provided by the Java language, the corresponding reachable states, and their significance for the JVM’s operation. I have also analyzed some practical situations of using reference queues, and finally explained how to use the API to ensure that objects are not unexpectedly reclaimed in new programming patterns. I hope this helps you.

Practice Exercise

Have you gained a good understanding of the topic we discussed today? Here’s a practice question for you: Can you find examples of various references used in your own product or third-party libraries? What problems are they trying to solve?