17 Gc Log Reading and Analysis Fundamental Configuration

17 GC Log Reading and Analysis Fundamental Configuration #

This chapter demonstrates how to generate and interpret GC logs and extract useful information from them through concrete examples.

Example Code for this Demonstration #

To demonstrate, let’s first write a simple Java code:

package demo.jvm0204;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/*
Demo of GC log generation and interpretation
*/
public class GCLogAnalysis {
    private static Random random = new Random();
    public static void main(String[] args) {
        // current timestamp in milliseconds
        long startMillis = System.currentTimeMillis();
        // duration in milliseconds; can be modified as desired
        long timeoutMillis = TimeUnit.SECONDS.toMillis(1);
        // end timestamp
        long endMillis = startMillis + timeoutMillis;
        LongAdder counter = new LongAdder();
        System.out.println("Executing...");
        // cache a portion of objects; go to the old generation
        int cacheSize = 2000;
        Object[] cachedGarbage = new Object[cacheSize];
        // loop continuously within this time range
        while (System.currentTimeMillis() < endMillis) {
            // generate garbage objects
            Object garbage = generateGarbage(100*1024);
            counter.increment();
            int randomIndex = random.nextInt(2 * cacheSize);
            if (randomIndex < cacheSize) {
                cachedGarbage[randomIndex] = garbage;
            }
        }
        System.out.println("Execution finished! Number of generated objects: " + counter.longValue());
    }

    // generate objects
    private static Object generateGarbage(int max) {
        int randomSize = random.nextInt(max);
        int type = randomSize % 4;
        Object result = null;
        switch (type) {
            case 0:
                result = new int[randomSize];
                break;
            case 1:
                result = new byte[randomSize];
                break;
            case 2:
                result = new double[randomSize];
                break;
            default:
                StringBuilder builder = new StringBuilder();
                String randomString = "randomString-Anything";
                while (builder.length() < randomSize) {
                    builder.append(randomString);
                    builder.append(max);
                    builder.append(randomSize);
                }
                result = builder.toString();
                break;
        }
        return result;
    }
}

This program is not complex. We specify a running time as the exit condition, and the loop automatically exits when the time is up. In the generateGarbage method, we use random numbers to generate various types of array objects and return them.

In the main method, we use an array to randomly store a portion of the generated objects, which simulates some objects being promoted to the old generation. The specific running time and number of cached objects can be adjusted as needed.

Generally speaking, large objects in Java are mainly arrays of various types, such as the most common string in development, which is actually stored using character arrays (char[]) internally.

As an aside, besides being used for GC log analysis, this example can also be slightly modified for other purposes:

  • For example, increase the number of cached objects and, under the constraint of limited heap memory, simulate “out of memory”.
  • Increase the runtime, such as to 30 minutes or longer, and use tools like VisualVM introduced earlier for real-time monitoring and observation.
  • Of course, we can also use global static variables for caching to simulate “memory leaks” and conduct experiments and analysis on heap memory dumps.
  • Increase the size of each generated array to simulate “large objects” (mainly the concept in G1, such as arrays larger than 1MB, which will be discussed in more detail later).

Commonly Used GC Parameters #

Let’s validate and apply the knowledge we’ve learned, from simple to complex, step by step.

Starting the Sample Program #

If you are using an integrated development environment such as IDEA or Eclipse, simply right-click in the file and select “Run…” to execute.

If you are using the JDK command line, you can compile it using the javac tool and execute it using the java command (remember? Starting from JDK 8, these two commands can be combined):

$ javac demo/jvm0204/*.java
$ java demo.jvm0204.GCLogAnalysis
Executing...
Execution finished! Number of generated objects: 1423

The program automatically ends after running for 1 second because no startup parameters were specified, so the output log is also simple.

Do you remember the GC parameters we introduced in the previous “[JVM Startup Parameter Explained]” section?

Let’s add these parameters one by one to see the effect.

Then add the startup parameter -XX:+PrintGCDetails to print the details of GC log, and run the example again.

Integrated development environments such as IDEA can specify startup parameters in “VM options”, please refer to the previous content. Be careful not to have any extra spaces.

java -XX:+PrintGCDetails demo.jvm0204.GCLogAnalysis

The excerpt of the execution result is as follows:

Executing...
[GC (Allocation Failure)
  [PSYoungGen: 65081K->10728K(76288K)]
  65081K->27102K(251392K), 0.0112478 secs]
  [Times: user=0.03 sys=0.02, real=0.01 secs]
...Multiple lines are omitted here...
[Full GC (Ergonomics)
  [PSYoungGen: 80376K->0K(872960K)]
  [ParOldGen: 360220K->278814K(481280K)]
  440597K->278814K(1354240K),
  [Metaspace: 3443K->3443K(1056768K)],
  0.0406179 secs]
  [Times: user=0.23 sys=0.01, real=0.04 secs]
Execution completed! Total object generation times: 746
Heap
 PSYoungGen total 872960K, used 32300K [0x000000076ab00000, 0x00000007b0180000, 0x00000007c0000000)
  eden space 792576K, 4% used [0x000000076ab00000,0x000000076ca8b370,0x000000079b100000)
  from space 80384K, 0% used [0x00000007a3800000,0x00000007a3800000,0x00000007a8680000)
  to space 138240K, 0% used [0x000000079b100000,0x000000079b100000,0x00000007a3800000)
 ParOldGen total 481280K, used 278814K [0x00000006c0000000, 0x00000006dd600000, 0x000000076ab00000)
  object space 481280K, 57% used [0x00000006c0000000,0x00000006d1047b10,0x00000006dd600000)
 Metaspace used 3449K, capacity 4494K, committed 4864K, reserved 1056768K
  class space used 366K, capacity 386K, committed 512K, reserved 1048576K

As you can see, by using the startup parameter -XX:+PrintGCDetails, relevant GC logs will be output when GC occurs.

The format of this parameter is: -XX:+, which is a boolean switch.

Before the JVM is shut down, it will also output the usage information of each memory pool, as can be seen from the output at the end.

Now let’s briefly interpret the heap information in the output.

Heap memory usage

PSYoungGen total 872960K, used 32300K [0x......)
 eden space 792576K, 4% used [0x......)
 from space 80384K, 0% used [0x......)
 to space 138240K, 0% used [0x......)

PSYoungGen, the total size of the young generation is 872960K, and the usage is 32300K. The memory address information is in the square brackets.

  • Among them, eden space occupies 792576K, of which 4% is used

  • Among them, from space occupies 80384K, of which 0% is used

  • Among them, to space occupies 138240K, of which 0% is used

    ParOldGen total 481280K, used 278814K [0x……) object space 481280K, 57% used [0x……)

ParOldGen, the total size of the old generation is 481280K, and the usage is 278814K.

  • Among them, object space occupies 481280K, of which 57% is used

    Metaspace used 3449K, capacity 4494K, committed 4864K, reserved 1056768K class space used 366K, capacity 386K, committed 512K, reserved 1048576K

Metaspace, the total usage of the metadata area is 3449K, with a capacity of 4494K. The JVM ensures that there is about 4864K available, with a reserved space of around 1GB.

  • Among them, the class space is used by 366K, with a capacity of 386K

Specify GC Log File #

On top of what we’ve done before, add the startup parameter -Xloggc:gc.demo.log, and run again.

# Please note that there is no line break when starting the command line, and this is a manual typesetting
java -Xloggc:gc.demo.log -XX:+PrintGCDetails demo.jvm0204.GCLogAnalysis

Note: Starting from JDK 8, %p, %t, and other placeholders are supported to specify the output file for GC, representing the process ID and startup timestamp, respectively.

For example: -Xloggc:gc.%p.log, -Xloggc:gc-%t.log.

In some cases, outputting the GC log file for each JVM execution to a different file can facilitate problem troubleshooting.

If the business access volume is high and causes the GC log file to be too large, you can enable GC log rotation to split it into multiple files. For more information, please refer to:

https://blog.gceasy.io/2016/11/15/rotating-gc-log-files

The execution result in the command line after running is as follows: Executing… Execution finished! Total object generation: 1327

Where is the GC log?

By checking the current working directory, you will find a new file called gc.demo.log. If you are in an IDE development environment, the gc.demo.log file may be located in the root directory of the project. Of course, we can also specify the absolute path for storing the GC log file, such as -Xloggc:/var/log/gc.demo.log.

The contents of the gc.demo.log file are as follows:

Java HotSpot(TM) 64-Bit Server VM (25.162-b12) ......
Memory: 4k page, physical 16777216k (1519448k free)

/proc/meminfo:

CommandLine flags:
  -XX:InitialHeapSize=268435456 -XX:MaxHeapSize=4294967296
  -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
  -XX:+UseCompressedClassPointers -XX:+UseCompressedOops
  -XX:+UseParallelGC
0.310: [GC (Allocation Failure)
  [PSYoungGen: 61807K->10732K(76288K)]
  61807K->22061K(251392K), 0.0094195 secs]
  [Times: user=0.02 sys=0.02, real=0.01 secs]
0.979: [Full GC (Ergonomics)
  [PSYoungGen: 89055K->0K(572928K)]
  [ParOldGen: 280799K->254491K(434176K)]
  369855K->254491K(1007104K),
  [Metaspace: 3445K->3445K(1056768K)],
  0.0362652 secs]
  [Times: user=0.20 sys=0.01, real=0.03 secs]
...... Omitted content
Heap
...... Please refer to the log format of the heap memory information

We can see that after adding the -Xloggc: parameter, the GC log information is output to the log file.

At the beginning of the file, there is JVM-related information, such as memory page size, physical memory size, and remaining memory.

Then comes the CommandLine flags section. When analyzing the GC log file, the command line parameters are also an important reference. Because you may receive the log file without knowing the online configuration, this information printed in the log file can effectively reduce the analysis and troubleshooting time.

By specifying the -Xloggc: parameter, the -XX:+PrintGCTimeStamps configuration is automatically added. By observing the GC log file, you can see that each line is preceded by a timestamp (e.g. 0.310:), indicating the time (in seconds) since the JVM started.

Observant students can also notice that the garbage collector parameters used by JDK 8 by default are: -XX:+UseParallelGC.

Printing the date and time when GC events occur #

Based on the previous example, we add the startup parameter -XX:+PrintGCDateStamps and execute it again.

java -Xloggc:gc.demo.log -XX:+PrintGCDetails
 -XX:+PrintGCDateStamps demo.jvm0204.GCLogAnalysis

After execution, the excerpt of the GC log file is as follows:

...... Omitted multiple lines
CommandLine flags:
  -XX:InitialHeapSize=268435456 -XX:MaxHeapSize=4294967296
  -XX:+PrintGC -XX:+PrintGCDateStamps
  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
  -XX:+UseCompressedClassPointers -XX:+UseCompressedOops
  -XX:+UseParallelGC
2019-12-15T15:09:59.235-0800: 0.296:
  [GC (Allocation Failure)
    [PSYoungGen: 63844K->10323K(76288K)]
    63844K->20481K(251392K),
    0.0087896 secs]
  [Times: user=0.02 sys=0.02, real=0.01 secs]
2019-12-15T15:09:59.889-0800: 0.951:
  [Full GC (Ergonomics)
    [PSYoungGen: 81402K->0K(577536K)]
    [ParOldGen: 270176K->261230K(445952K)]
    351579K->261230K(1023488K),
    [Metaspace: 3445K->3445K(1056768K)],
    0.0369622 secs]
  [Times: user=0.19 sys=0.00, real=0.04 secs]
Heap
  ...... Omitted content following the same format as before

It can be seen that after adding the -XX:+PrintGCDateStamps parameter, the exact time of GC occurrence is printed before each line in the GC log. For example, 2019-12-15T15:09:59.235-0800 represents “December 15, 2019, 15:09:59.235 in GMT+8”.

Specifying the size of the heap memory #

From the previous examples, we can see the CommandLine flags information output in the GC log file.

Even if we do not specify the heap memory, the JVM will automatically calculate a default value when starting up. For example, -XX:InitialHeapSize=268435456 -XX:MaxHeapSize=4294967296 is the default value on my machine, which is equivalent to the -Xms256m -Xmx4g configuration.

Now, let’s continue adding parameters. This time, we add the startup parameters -Xms512m -Xmx512m and execute again.

java -Xms512m -Xmx512m
 -Xloggc:gc.demo.log -XX:+PrintGCDetails
 -XX:+PrintGCDateStamps demo.jvm0204.GCLogAnalysis

At this point, the excerpt of the output GC log file is as follows:

......
CommandLine flags:
  -XX:InitialHeapSize=536870912 -XX:MaxHeapSize=536870912
  -XX:+PrintGC -XX:+PrintGCDateStamps
  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
  -XX:+UseCompressedClassPointers -XX:+UseCompressedOops
-XX:+UseParallelGC
2019-12-15T15:15:09.677-0800: 0.358:
[GC (Allocation Failure)
  [PSYoungGen: 129204K->21481K(153088K)]
  129204K->37020K(502784K), 0.0121865 secs]
[Times: user=0.03 sys=0.03, real=0.01 secs]

2019-12-15T15:15:10.058-0800: 0.739: [Full GC (Ergonomics) [PSYoungGen: 20742K->0K(116736K)] [ParOldGen: 304175K->247922K(349696K)] 324918K->247922K(466432K), [Metaspace: 3444K->3444K(1056768K)], 0.0319225 secs] [Times: user=0.18 sys=0.01, real=0.04 secs]

Now both the initial and maximum heap memory size are 512MB. The specific parameters can be configured according to actual needs. For demonstration purposes, we are using a smaller heap memory configuration.

Specifying Garbage Collectors #

Generally, when using JDK 8, we can use the following garbage collectors:

  • -XX:+UseSerialGC
  • -XX:+UseParallelGC
  • -XX:+UseParallelGC -XX:+UseParallelOldGC (equivalent to the previous option, as seen from the flags in the GC log file)
  • -XX:+UseConcMarkSweepGC
  • -XX:+UseConcMarkSweepGC -XX:+UseParNewGC (equivalent to the previous option)
  • -XX:+UseG1GC

What do they mean? Let’s briefly review:

  • Use the serial garbage collector: -XX:+UseSerialGC
  • Use the parallel garbage collector: -XX:+UseParallelGC and -XX:+UseParallelGC -XX:+UseParallelOldGC are equivalent, as indicated from the flags in the GC log file.
  • Use the CMS garbage collector: -XX:+UseConcMarkSweepGC and -XX:+UseParNewGC -XX:+UseConcMarkSweepGC are equivalent. However, if only -XX:+UseParNewGC is specified, the old generation GC will use SerialGC. When using CMS, the command line parameters will automatically calculate the initial and maximum values of the young generation and old generation, as well as the maximum promotion threshold (e.g. -XX:MaxNewSize=178958336 -XX:NewSize=178958336 -XX:OldSize=357912576).
  • Use the G1 garbage collector: -XX:+UseG1GC. In principle, the young generation size should not be specified for G1 garbage collector, as it would be redundant. G1 performs garbage collection in small batched regions, and a single GC can include both young and old generations. It is possible for a region to transition between young and old generations.

What happens if an unsupported GC combination is used? It will fail to start and report a fatal error. Interested readers can try it out.

In the next section, we will demonstrate the usage of various garbage collectors and analyze the logs they produce. The format of these logs is not significantly different. Once you understand how to analyze one type of GC log, you can apply the same knowledge to understand the meanings of other types of GC logs.

Other Parameters #

There are also other GC-related parameters in the JVM, such as:

  • -XX:+PrintGCApplicationStoppedTime to output the duration of each GC and the time the program was paused.
  • -XX:+PrintReferenceGC to indicate the number of reference types cleaned up by GC.

We won’t go into detail here, but you can review the previous chapters if you want to learn more about these configurations.

Note: In most cases, it is not better to configure more GC parameters. In principle, only a few important parameters should be configured, and the rest should be kept at their default values, unless you have a deep analysis and understanding of the system’s business characteristics, and you need to adjust certain subtle parameters. After all, as the saying goes, “premature optimization is the root of all evil.”

Introduction to GC Event Types #

In general, Garbage Collection events can be divided into three types:

  • Minor GC
  • Major GC
  • Full GC

Although the terms Minor GC, Major GC, and Full GC are used everywhere, there is no standard definition provided by the official documentation. These terms appear in various analysis tools and garbage collection logs, but they are not very consistent. There is often confusion between the official documentation and tools, and this confusion is even rooted in standard JVM tools.

Should we call Minor GC “small GC” or “minor GC”?

Differentiation: In most cases, Minor GC occurs more frequently in the young generation, so it is not reasonable to refer to the one with more occurrences as “minor GC”.

Here, we will translate Minor GC as “small GC” instead of “minor GC”; Major GC as “major GC” instead of “main GC”; and Full GC as complete GC; sometimes it is also referred to directly as Full GC.

This is because when translating proprietary terms from Chinese to English, there may be multiple English words corresponding to a Chinese word, and there may also be one English word corresponding to multiple Chinese words, depending on the specific context.

For example, a similar situation is seen with Major Version and Minor Version, which are generally translated as “major version” and “minor version”. This is perfectly fine and understandable. It can even be translated directly as “major version number” and “minor version number”.

This section briefly introduces the various event types, and now let’s take a look at the specific details of these event types.

Minor GC #

The GC event that collects the young generation memory is called Minor GC. Regarding Minor GC events, we need to understand some related content:

  1. When the JVM is unable to allocate memory space for new objects, a Minor GC is triggered (usually when the Eden region is full). If the allocation rate of objects is fast, the number of Minor GCs will also be high, and the frequency will be fast.
  2. Minor GC events do not process the old generation, so all references from the old generation to the young generation are treated as GC Roots. References from the young generation to the old generation are ignored during the marking phase.
  3. Contrary to our general understanding, every Minor GC causes a stop-the-world (STW) pause, suspending all application threads. For most applications, the pause time of Minor GC can be ignored because most objects in the Eden region are garbage and are not copied to the survivor space/old generation. However, if this situation does not apply, many newly created objects will not be cleaned up by the GC, and the pause time of Minor GC will increase, resulting in a noticeable impact on GC performance.

Brief definition: Minor GC cleans up the young generation, or in other words, Minor GC is “young GC” (abbreviated as YGC).

Major GC vs. Full GC #

It is worth mentioning that these terms do not have a formal definition, whether in the JVM specification or in GC papers.

As we know, in addition to Minor GC, there are two other types of GC events:

  • Major GC: Cleans up the old generation (Old Space) memory.
  • Full GC: Cleans up the entire heap memory space, including the young generation and the old generation.

In fact, Major GC and Full GC are sometimes not well distinguished. A more complex situation is that many Major GCs are triggered by Minor GCs, so in many cases, these two events are inseparable.

Additionally, garbage collection algorithms like G1 clean a small portion of the heap at a time, which may include both the young and old generations.

So let’s not get too hung up on whether to call it Major GC or Full GC. They generally cause longer STW pauses for a single occurrence. What we need to focus on is whether a particular GC event pauses all threads and thereby affects system performance or if it can run concurrently with other business threads and the pause time can be almost negligible.

This concludes this section. In the next section, we will analyze the logs generated by various GC algorithms through examples.