22 How to Optimize Garbage Collection Mechanism

Hello, I am Liu Chao.

In Java development, developers do not need to pay much attention to reclaiming and releasing objects, because the JVM's garbage collection mechanism takes over most of that work. However, relying entirely on the JVM to reclaim objects also makes reclamation performance less predictable. In some special business scenarios, an unsuitable garbage collection algorithm or strategy can degrade system performance.

Different business scenarios require different garbage collection optimization strategies. For example, in scenarios where memory requirements are strict, it is necessary to improve the efficiency of object recycling. In scenarios with high CPU usage, it is necessary to reduce the frequency of garbage collection during high concurrency. It can be said that garbage collection optimization is an essential skill.

In this lecture, we will break down the learning of this skill and explore the various garbage collection algorithms, the indicators that reflect the quality of garbage collection algorithms, and how to optimize garbage collection strategies according to your own business scenarios.

Garbage Collection Mechanism #

Before understanding the GC algorithm, we need to clarify three questions. First, where does garbage collection occur? Second, when can objects be collected? Third, how are these objects collected?

1. Where does garbage collection occur? #

Among the memory areas of the JVM, the program counter, the virtual machine stack, and the native method stack are thread-private, created and destroyed with each thread. The stack frames in the stack are pushed and popped as methods enter and exit. The amount of memory allocated in each stack frame is generally known when the class structure is determined. Therefore, the memory allocation and collection of these three areas are deterministic.

So, the focus of garbage collection is on the memory in the heap and method area. The collection in the heap mainly involves object collection, while the collection in the method area mainly involves the collection of unused constants and classes.

2. When can objects be collected? #

How does the JVM determine whether an object can be collected? Generally, when an object is no longer referenced, it means that the object can be collected. Currently, there are two algorithms to determine whether an object can be collected.

Reference Counting Algorithm: This algorithm uses a reference counter to determine whether an object is still referenced. Every time the object is referenced, the counter is incremented; every time a reference becomes invalid, the counter is decremented. When the counter reaches 0, the object is no longer referenced and can be collected. Although reference counting is simple to implement and cheap to check, it cannot handle circular references: two objects that reference each other keep each other's counter above 0 even when nothing else in the program can reach them.
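The circular reference problem can be illustrated with a toy model. The sketch below is not how HotSpot manages memory; the `Node` class and its manual `refCount` field are hypothetical and exist only to show why the counters of a cycle never drop to 0.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; the JVM does not expose or use such counters.
class RefCountDemo {
    // Hypothetical node that tracks how many references point at it.
    static final class Node {
        final String name;
        int refCount = 0;
        final List<Node> fields = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        // Store a reference to target in this node and increment target's counter.
        void pointTo(Node target) {
            fields.add(target);
            target.refCount++;
        }
    }

    // Simulates a local variable that referenced the node going out of scope.
    static void dropExternalReference(Node node) {
        node.refCount--;
    }

    // Builds a two-node cycle, drops both external references,
    // and returns the counters that remain.
    static int[] countsAfterDroppingCycle() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refCount++; // external reference held by the "program"
        b.refCount++; // external reference held by the "program"
        a.pointTo(b);
        b.pointTo(a);
        dropExternalReference(a);
        dropExternalReference(b);
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingCycle();
        // Both counters stay at 1, so a pure reference counter would never free the pair.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}
```

In this model both nodes end with a counter of 1 even though the program can no longer reach either of them.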

Reachability Analysis Algorithm: GC Roots are the basis of this algorithm. They are a set of references that the JVM treats as always live, such as references held in thread stacks and static fields, and they serve as the starting points for all other objects. During garbage collection, the JVM starts from the GC Roots and searches downwards along the reference chains. An object that is not connected to the GC Roots by any reference chain is unreachable and can be collected. The current HotSpot JVM adopts this algorithm.
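As a rough illustration of reachability analysis, the sketch below models the heap as a map from object names to the names they reference and marks everything reachable from the roots. The `reachable` and `sampleHeap` helpers and the object names are assumptions made for this example; a real collector works on actual object graphs, not strings.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Returns every object name reachable from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> references, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (marked.add(current)) { // visit each object only once, so cycles terminate
                pending.addAll(references.getOrDefault(current, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    // Hypothetical heap: root -> a -> b, plus an isolated cycle c <-> d.
    static Map<String, List<String>> sampleHeap() {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("root", Arrays.asList("a"));
        heap.put("a", Arrays.asList("b"));
        heap.put("b", Collections.<String>emptyList());
        heap.put("c", Arrays.asList("d"));
        heap.put("d", Arrays.asList("c"));
        return heap;
    }

    public static void main(String[] args) {
        Map<String, List<String>> heap = sampleHeap();
        Set<String> live = reachable(heap, Collections.singletonList("root"));
        Set<String> garbage = new HashSet<>(heap.keySet());
        garbage.removeAll(live);
        System.out.println("live=" + live + ", garbage=" + garbage);
    }
}
```

Unlike the counter-based approach, the cycle between `c` and `d` ends up in the garbage set because no chain from the root leads to it.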

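Both of the above algorithms use references to determine whether an object can be collected. Since Java 1.2, Java has distinguished four types of references, in decreasing order of strength: strong references, soft references, weak references, and phantom references.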
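The four reference types (strong, soft, weak, and phantom) map onto an ordinary variable plus the classes in `java.lang.ref`. The following is a minimal sketch of how each one is created; the class and method names are made up for this example, and it deliberately avoids triggering a collection because the timing of reclamation is not deterministic.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Creates one reference of each kind to the same object and reports what get() returns.
    static boolean[] observe() {
        Object strong = new Object(); // strong reference: the object stays alive while this is reachable
        SoftReference<Object> soft = new SoftReference<>(strong);   // may be cleared when memory runs low
        WeakReference<Object> weak = new WeakReference<>(strong);   // may be cleared at the next GC once no strong reference remains
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue); // only used for post-collection notification

        return new boolean[] {
            soft.get() == strong,   // true while the strong reference exists
            weak.get() == strong,   // true while the strong reference exists
            phantom.get() == null   // phantom references always return null from get()
        };
    }

    public static void main(String[] args) {
        boolean[] seen = observe();
        System.out.println(seen[0] + " " + seen[1] + " " + seen[2]);
    }
}
```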

3. How are these objects collected? #

Now that we know when an object in a Java program can be collected, how does the garbage collection thread actually collect it? JVM garbage collection has the following two characteristics.

Automatically: Java provides a system-level thread to track every allocated memory space. When the JVM is in an idle loop, the garbage collector thread automatically checks each allocated memory space and then automatically collects each idle memory block.

Unpredictable: Once an object is no longer referenced, is it collected immediately? The answer is that we cannot tell. An unreferenced object may be collected right away, or it may stay in memory until the program ends.

The garbage collection thread runs automatically inside the JVM, and a Java program cannot force it to run. The only thing we can do is "suggest" a collection by calling the System.gc method, but whether the collection actually runs, and when, is still unpredictable.

GC Algorithm #

JVM provides different garbage collection algorithms to implement its garbage collection mechanism. Typically, the garbage collector’s algorithms can be divided into the following types:

If we consider the collection algorithm as the methodology for memory reclamation, then the garbage collector is the concrete implementation of memory reclamation. After JDK 1.7 update 14, the HotSpot virtual machine's garbage collectors for server applications are organized as follows:

In fact, the JVM specification does not explicitly define how garbage collection works. Different vendors may implement garbage collectors using different methods. We can query the garbage collector type currently used by the JVM through JVM tools. First, use the ps command to find the process ID, then use jmap -heap ID to query the JVM’s configuration information, which includes the garbage collector’s setting type.
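Besides the command-line tools, the collectors in use can also be listed from inside a running program through the standard `java.lang.management` API. The sketch below is one way to do it; the class name `ListCollectors` is made up for this example, and the names that are printed depend on the JVM version and the selected collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ListCollectors {
    // Returns the names of the collectors registered in the current JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Typically one entry for the young generation collector and one for the old generation collector.
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```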

GC Performance Metrics #

Different garbage collectors perform differently in different scenarios, so how do we evaluate the performance of a garbage collector? We can use some metrics to help us.

Throughput: Throughput is the ratio of the time consumed by the application to the total running time of the system, where total running time of the system = time consumed by the application + time consumed by GC. In other words, throughput = time consumed by the application / total running time of the system. For example, if the system runs for 100 minutes and GC consumes 1 minute, the system throughput is 99%. As a rule of thumb, throughput should not be lower than 95%.
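The throughput calculation can be sketched in a few lines of Java. The `throughputPercent` helper reproduces the 100-minute example, and `currentJvmThroughputPercent` is a rough estimate for the running JVM based on the standard management beans. Both method names are assumptions for this example, and the collection time reported by the beans is only approximate, especially for concurrent collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcThroughput {
    // throughput = (total running time - GC time) / total running time, as a percentage.
    static double throughputPercent(double totalTime, double gcTime) {
        return 100.0 * (totalTime - gcTime) / totalTime;
    }

    // Rough estimate for the current JVM: accumulated GC time versus JVM uptime.
    static double currentJvmThroughputPercent() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long collectionTime = bean.getCollectionTime(); // -1 when the collector does not report it
            if (collectionTime > 0) {
                gcMillis += collectionTime;
            }
        }
        long uptimeMillis = Math.max(ManagementFactory.getRuntimeMXBean().getUptime(), 1L);
        return throughputPercent(uptimeMillis, gcMillis);
    }

    public static void main(String[] args) {
        System.out.println(throughputPercent(100.0, 1.0)); // the example above: 100 minutes total, 1 minute of GC
        System.out.println(currentJvmThroughputPercent());
    }
}
```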

Pause Time: Pause time is how long the application is paused while the garbage collector is running. With a serial collector, pauses may be longer. With a concurrent collector, pauses are shorter because the garbage collector does much of its work while the application keeps running. However, a concurrent collector may be less efficient than a stop-the-world collector, and system throughput may decrease as a result.

Garbage Collection Frequency: How often does garbage collection occur? Generally, the lower the frequency, the better. Increasing the heap memory effectively reduces the frequency of garbage collection, but it also lets more garbage accumulate, which eventually lengthens the pause of each collection. Therefore, we only need to increase the heap memory moderately, enough to keep the garbage collection frequency at a normal level.

Viewing & Analyzing GC Logs #

Now that we have performance metrics, we need to query GC logs using a tool to gather information about various metrics. First, we need to pre-set GC logs using JVM parameters. There are usually several JVM parameters to set, including:

    -XX:+PrintGC: Prints GC logs
    -XX:+PrintGCDetails: Prints detailed GC logs
    -XX:+PrintGCTimeStamps: Prints timestamps of GC events (as seconds elapsed since JVM startup)
    -XX:+PrintGCDateStamps: Prints timestamps of GC events (in the form of a date, such as 2013-05-04T21:53:59.234+0800)
    -XX:+PrintHeapAtGC: Prints heap information before and after GC
    -Xloggc:../logs/gc.log: Specifies the log file output path

Here are the parameters used to print the logs:

    -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:./gclogs

The resulting logs are as follows:

The above image shows a GC log for a short period of time. If the GC log is for a long period of time, it’s difficult to get an overview of the GC performance in text form. In such cases, we can use the GCViewer tool to open the log file and visualize the overall GC performance, as shown in the following image:

With this tool, we can see throughput, pause times, and GC frequency, providing a clear understanding of GC performance.

I also recommend another useful GC log analysis tool called GCeasy. It is an intuitive GC log analysis tool. We can compress the log file and upload it to the GCeasy website to get a very clear analysis result, as shown in the following images:

GC Tuning Strategies #

Once the problem is identified, you can start optimizing the garbage collection (GC). Here are several commonly used GC tuning strategies.

1. Reduce Minor GC Frequency #

In most cases, due to the small size of the young generation space, the Eden area is quickly filled up, leading to frequent minor GC. To reduce the frequency of minor GC, you can increase the size of the young generation space.

You might wonder: if a larger Eden area means fewer minor GCs, won't each minor GC take longer? If each minor GC takes longer, the optimization would hardly pay off.

We know that the time for each minor GC consists of two parts: T1 (scanning the young generation) and T2 (copying surviving objects). Suppose an object survives in the Eden area for 500ms, and the time interval for minor GC is 300ms. Normally, the time for a minor GC is T1 + T2.

When we increase the size of the young generation space, the interval between minor GCs may expand to 600ms. In this case, an object that lives for 500ms is already dead when the minor GC runs, so it is reclaimed in the Eden area and no surviving objects need to be copied. The minor GC then costs only the scanning time, which for the enlarged young generation amounts to two scans, or 2T1.

Therefore, after the expansion, the minor GC adds T1 but saves the time for T2. In most virtual machines, the cost of copying objects is much higher than the cost of scanning.
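To make the comparison concrete, here is a small worked example. The values T1 = 10ms and T2 = 50ms are purely illustrative assumptions, not measurements, and the `MinorGcCostModel` class is a hypothetical helper that only encodes the T1 + T2 versus 2T1 reasoning above.

```java
class MinorGcCostModel {
    // Cost of one minor GC before expansion: scan the young generation and copy the survivors.
    static long costBeforeExpansion(long scanMillis, long copyMillis) {
        return scanMillis + copyMillis; // T1 + T2
    }

    // Cost of one minor GC after expansion: scanning costs twice as much, nothing survives to be copied.
    static long costAfterExpansion(long scanMillis) {
        return 2 * scanMillis; // 2 * T1
    }

    public static void main(String[] args) {
        long t1 = 10; // assumed scanning cost in milliseconds
        long t2 = 50; // assumed copying cost in milliseconds
        System.out.println("before: " + costBeforeExpansion(t1, t2) + "ms"); // before: 60ms
        System.out.println("after: " + costAfterExpansion(t1) + "ms");       // after: 20ms
    }
}
```

With these assumed numbers the expanded young generation makes each minor GC cheaper, but only because no objects survive. If T2 stayed the same after expansion, the cost would rise instead.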

If there are many long-lived objects in the heap memory, increasing the young generation space would actually increase the time for minor GC. On the other hand, if there are many short-lived objects in the heap, expanding the young generation would not significantly increase the time for a single minor GC. Therefore, the time for a single minor GC depends more on the number of surviving objects after GC rather than the size of the Eden area.

2. Reduce Full GC Frequency #

Usually, due to insufficient heap memory or excessive old generation objects, a full GC is triggered. Frequent full GC can cause context switches and increase the system’s performance overhead. What methods can we use to reduce the frequency of full GC?

Reduce the creation of large objects: In typical business scenarios, we often retrieve a large object from the database to display on the web. For example, I encountered a business operation that retrieves 60 fields at once. If such a large object exceeds the maximum object threshold of the young generation, it will be directly created in the old generation. Even if it is created in the young generation, it will enter the old generation after a minor GC due to the limited memory space in the young generation. These large objects can easily cause a large number of full GC.

We can break down these large objects. Initially, only retrieve some important fields, and if other fields are needed for further viewing, retrieve the remaining fields through a second query.

Increase heap memory space: In situations where heap memory is insufficient, increasing the heap memory space and setting the initial heap memory to the maximum heap memory can also reduce the frequency of full GC.
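The initial and maximum heap sizes are set with the standard `-Xms` and `-Xmx` flags, for example `-Xms4g -Xmx4g` (the 4g value is only an illustration). As a hedged sketch, the effective heap settings can be inspected from inside the JVM with the memory management bean. The class name `HeapSizing` is made up for this example, and the reported values may differ slightly from the flag values depending on the collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapSizing {
    // Returns the heap usage snapshot reported by the JVM.
    static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        long mb = 1024L * 1024L;
        // getInit() and getMax() return -1 when the value is undefined.
        System.out.println("init=" + heap.getInit() / mb + "MB");
        System.out.println("committed=" + heap.getCommitted() / mb + "MB");
        System.out.println("max=" + heap.getMax() / mb + "MB");
    }
}
```

When the initial size equals the maximum size, the JVM does not need to grow the heap at run time, so the init and max values printed here should be close to each other.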

Choose the Right GC Collector #

Suppose we have a requirement that each operation must respond within 500ms. In such cases, we generally choose GC collectors that have faster response time. The CMS (Concurrent Mark Sweep) collector and the G1 collector are good choices.

When throughput is a requirement for the system, the Parallel Scavenge collector can be chosen to improve the throughput of the system.

Summary #

Today’s content is relatively extensive. Let me emphasize a few key points again.

There are many types of garbage collectors, which can be divided into two categories: those with fast response speed and those with high throughput. Generally, CMS and G1 collectors have fast response speed, while the Parallel Scavenge collector has high throughput.

In the JDK 1.8 environment, the default garbage collector on server-class machines is the Parallel Scavenge (young generation) + Parallel Old (old generation) combination. You can check the default JVM GC configuration by using the method mentioned in the article.

The JVM usually tunes garbage collection reasonably well by default, so it is best to avoid modifying GC performance parameters without a performance benchmark. If you must make changes, base them on a large number of test results or on specific performance data observed in the production environment.

Thought Question #

In the above, we discussed the CMS and G1 collectors. Do you know how G1 achieves better GC performance?

Answer:

1. Does a minor GC cause stop the world? 2. When does a major GC occur, and what is the difference between it and a full GC?

1. Every type of GC causes stop the world; the difference lies in the duration, which depends on the garbage collector. The Serial, ParNew, and Parallel Scavenge collectors, whether serial or parallel, all suspend user threads while they collect. CMS and G1 do not suspend user threads during concurrent marking, but they still suspend them in other phases, and their stop-the-world duration is much shorter.

2. Many references equate major GC with full GC, and many performance monitoring tools only distinguish minor GC and full GC. Generally, a full GC performs garbage collection on the young generation, old generation, metaspace, and off-heap memory. There are many reasons for triggering a full GC: a. When the size of objects promoted from the young generation to the old generation exceeds the remaining space in the old generation, a full GC is triggered; b. When the space usage of the old generation exceeds a certain threshold, a full GC is triggered; c. When there is insufficient metaspace (PermGen in JDK 1.7), a full GC is triggered; d. When System.gc() is called, a full GC is scheduled.