34 Plus Lesson 4 Analyzing Java Issues Must Use These Tools Part One

34 Plus Lesson 4 Analyzing Java Issues Must Use These Tools Part One #

Today, I want to share with you some commonly used tools for analyzing and locating Java problems.

By now, we have already reached Lesson 17 in our course, which means we are already more than halfway through. During the learning process, you may have noticed that when I introduce various pitfalls, I don’t directly provide the conclusions to the problems. Instead, I use tools to visually see the problems.

Why do I do this? It’s because I always believe that when encountering a problem, we should try not to guess. We must see it with our own eyes. Only by truly seeing the problem through logs, monitoring, or tools, and then comparing and confirming it against the code, can we consider it as finding the root cause.

At first, you may feel overwhelmed by using complex tools to troubleshoot problems, or you may feel unsure where to start after opening the tools. However, as you gain more and more practice and become more familiar with the operation of Java programs and various frameworks, you will find that using these tools becomes more and more handy.

In fact, tools are just a means for us to locate problems. To make good use of tools, it is mainly essential to have a general understanding of how the program itself operates. This requires long-term accumulation.

Therefore, through two additional lessons, I will share four examples with you. These examples will demonstrate how to use the built-in tools in JDK to troubleshoot JVM parameter configuration issues, use Wireshark to analyze network problems, use MAT to analyze memory problems, and use Arthas to analyze high CPU usage issues. These examples are just the tip of the iceberg, and you can further learn and explore through these examples on your own.

In today’s additional lesson, let’s first learn how to use the built-in tools in JDK and Wireshark to analyze and locate Java program problems.

Viewing JVM Information using JDK Tools #

JDK comes with many command-line and graphical tools to help us view various information about the JVM. For example, by running the ls command on my machine, I can see that JDK 8 provides many tools or programs:

Next, I will introduce some commonly used monitoring tools to you. You can also get a basic understanding of the various tools through the following diagram:

To test these tools, let’s first write some code: start 10 threads with infinite loops, each thread allocating a string of around 10MB, and then sleep for 10 seconds. It can be imagined that this program will put pressure on the GC.

// Start 10 threads

IntStream.rangeClosed(1, 10).mapToObj(i -> new Thread(() -> {

    while (true) {

        // Each thread is an infinite loop, sleep for 10 seconds, and print 10M data

        String payload = IntStream.rangeClosed(1, 10000000)

                .mapToObj(__ -> "a")

                .collect(Collectors.joining("")) + UUID.randomUUID().toString();

        try {

            TimeUnit.SECONDS.sleep(10);

        } catch (InterruptedException e) {

            e.printStackTrace();

        }

        System.out.println(payload.length());

    }

})).forEach(Thread::start);



TimeUnit.HOURS.sleep(1);

Modify the pom.xml file to configure the main class of the Java program packaged by the spring-boot-maven-plugin plugin:

<plugin>

    <groupId>org.springframework.boot</groupId>

    <artifactId>spring-boot-maven-plugin</artifactId>

    <configuration>

        <mainClass>org.geekbang.time.commonmistakes.troubleshootingtools.jdktool.CommonMistakesApplication

        </mainClass>

    </configuration>

</plugin>

Then use java -jar to start the process and set JVM parameters so that the heap size is both minimum and maximum of 1GB:

java -jar common-mistakes-0.0.1-SNAPSHOT.jar -Xms1g -Xmx1g

After completing these preparations, we can use the tools provided by JDK to observe and analyze this test program.

jps #

First, use jps to get a list of Java processes, which is more convenient than using ps:

➜  ~ jps

12707

22261 Launcher

23864 common-mistakes-0.0.1-SNAPSHOT.jar

15608 RemoteMavenServer36

23243 Main

23868 Jps

22893 KotlinCompileDaemon

jinfo #

Then, you can use jinfo to print various JVM parameters:

➜  ~ jinfo 23864

Java System Properties:

#Wed Jan 29 12:49:47 CST 2020

…

user.name=zhuye

path.separator=\:

os.version=10.15.2

java.runtime.name=Java(TM) SE Runtime Environment

file.encoding=UTF-8

java.vm.name=Java HotSpot(TM) 64-Bit Server VM

...



VM Flags:

-XX:CICompilerCount=4 -XX:ConcGCThreads=2 -XX:G1ConcRefinementThreads=8 -XX:G1HeapRegionSize=1048576 -XX:GCDrainStackTargetSize=64 -XX:InitialHeapSize=268435456 -XX:MarkStackSize=4194304 -XX:MaxHeapSize=4294967296 -XX:MaxNewSize=2576351232 -XX:MinHeapDeltaBytes=1048576 -XX:NonNMethodCodeHeapSize=5835340 -XX:NonProfiledCodeHeapSize=122911450 -XX:ProfiledCodeHeapSize=122911450 -XX:ReservedCodeCacheSize=251658240 -XX:+SegmentedCodeCache -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC



VM Arguments:

java_command: common-mistakes-0.0.1-SNAPSHOT.jar -Xms1g -Xmx1g

java_class_path (initial): common-mistakes-0.0.1-SNAPSHOT.jar

Launcher Type: SUN_STANDARD

By checking line 15 and line 19, we can see that our way of setting JVM parameters is incorrect. The parameters -Xms1g and -Xmx1g are treated as Java program startup parameters, and the maximum memory of the entire JVM is currently around 4GB, not 1GB.

Therefore, when we suspect that the JVM configuration is not normal, we should use tools to confirm the parameters in a timely manner. In addition to confirming JVM parameters using tools, you can also print out the VM parameters and program parameters:

System.out.println("VM options");

System.out.println(ManagementFactory.getRuntimeMXBean().getInputArguments().stream().collect(Collectors.joining(System.lineSeparator())));

System.out.println("Program arguments");

System.out.println(Arrays.stream(args).collect(Collectors.joining(System.lineSeparator())));

Put the JVM parameters before -jar and restart the program. You can see the following output, from which you can confirm that the configuration of JVM parameters this time is correct:

➜  target git:(master) ✗ java -Xms1g -Xmx1g -jar common-mistakes-0.0.1-SNAPSHOT.jar test

VM options

-Xms1g

-Xmx1g

Program arguments

test

jvisualvm #

Then, start another heavyweight tool, jvisualvm, to observe the program. You can confirm the JVM parameter settings on the Overview panel again:

By continuing to observe the Monitoring panel, you can see that the JVM’s GC activities occur roughly every 10 seconds, the heap fluctuates between 250MB and 900MB, and the number of active threads is 22. We can see the basic situation of the JVM on the Monitoring panel and also manually perform GC and heap dump operations here:

jconsole #

If you want to see the GC curve of each memory area, you can use jconsole for observation. jconsole is also a comprehensive graphical interface monitoring tool. One advantage over jvisualvm is that it can monitor various data, including attribute values in MBeans, in the form of curves:

jstat #

Similarly, if you don’t have the condition to use a graphical interface (after all, on a Linux server, we mainly use command-line tools) and want to see the GC trend, you can use the jstat tool.

The jstat tool allows you to output various monitoring indicators of the JVM at a fixed monitoring frequency. For example, using -gcutil to output GC and memory utilization summary information every 5 seconds, for a total of 100 times, you can see that Young GC occurs relatively frequently, while Full GC occurs roughly every 10 seconds:

➜  ~ jstat -gcutil 23940 5000 100

      S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT    CGC    CGCT     GCT

      0.00 100.00   0.36  87.63  94.30  81.06    539   14.021    33    3.972   837    0.976   18.968

      0.00 100.00   0.60  69.51  94.30  81.06    540   14.029    33    3.972   839    0.978   18.979

      0.00   0.00   0.50  99.81  94.27  81.03    548   14.143    34    4.002   840    0.981   19.126

      0.00 100.00   0.59  70.47  94.27  81.03    549   14.177    34    4.002   844    0.985   19.164

      0.00 100.00   0.57  99.85  94.32  81.09    550   14.204    34    4.002   845    0.990   19.196

      0.00 100.00   0.65  77.69  94.32  81.09    559   14.469    36    4.198   847    0.993   19.659

      0.00 100.00   0.65  77.69  94.32  81.09    559   14.469    36    4.198   847    0.993   19.659

      0.00 100.00   0.70  35.54  94.32  81.09    567   14.763    37    4.378   853    1.001   20.142

      0.00 100.00   0.70  41.22  94.32  81.09    567   14.763    37    4.378   853    1.001   20.142

      0.00 100.00   1.89  96.76  94.32  81.09    574   14.943    38    4.487   859    1.007   20.438

   0.00 100.00   1.39  39.20  94.32  81.09    575   14.946    38    4.487   861    1.010   20.442

其中，S0 表示 Survivor0 区占用百分比，S1 表示 Survivor1 区占用百分比，E 表示 Eden 区占用百分比，O 表示老年代占用百分比，M 表示元数据区占用百分比，YGC 表示年轻代回收次数，YGCT 表示年轻代回收耗时，FGC 表示老年代回收次数，FGCT 表示老年代回收耗时。

jstat 命令的参数众多，包含 -class、-compiler、-gc 等。Java 8、Linux/Unix 平台 jstat 工具的完整介绍，你可以查看这里。jstat 定时输出的特性，可以方便我们持续观察程序的各项指标。

继续来到线程面板可以看到，大量以 Thread 开头的线程基本都是有节奏的 10 秒运行一下，其他时间都在休眠，和我们的代码逻辑匹配：

点击面板的线程 Dump 按钮，可以查看线程瞬时的线程栈：

jstack #

通过命令行工具 jstack，也可以实现抓取线程栈的操作：

➜  ~ jstack 23940

2020-01-29 13:08:15

Full thread dump Java HotSpot(TM) 64-Bit Server VM (11.0.3+12-LTS mixed mode):

...

"main" #1 prio=5 os_prio=31 cpu=440.66ms elapsed=574.86s tid=0x00007ffdd9800000 nid=0x2803 waiting on condition  [0x0000700003849000]

   java.lang.Thread.State: TIMED_WAITING (sleeping)

  at java.lang.Thread.sleep([Native Method])

  at java.lang.Thread.sleep(Thread.java:339)

  at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:446)

  at org.geekbang.time.commonmistakes.troubleshootingtools.jdktool.CommonMistakesApplication.main(CommonMistakesApplication.java:41)

  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

  at java.lang.reflect.Method.invoke(Method.java:566)

  at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)

  at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)

  at org.springframework.boot.loader.Launcher.launch(Launcher.java:51)

  at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:52)

"Thread-1" #13 prio=5 os_prio=31 cpu=17851.77ms elapsed=574.41s tid=0x00007ffdda029000 nid=0x9803 waiting on condition  [0x000070000539d000]

   java.lang.Thread.State: TIMED_WAITING (sleeping)

  at java.lang.Thread.sleep(Native Method)

  at java.lang.Thread.sleep(Thread.java:339)

  at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:446)

  at org.geekbang.time.commonmistakes.troubleshootingtools.jdktool.CommonMistakesApplication.lambda$null$1(CommonMistakesApplication.java:33)

  at org.geekbang.time.commonmistakes.troubleshootingtools.jdktool.CommonMistakesApplication$$Lambda$41/0x00000008000a8c40.run(Unknown Source)

  at java.lang.Thread.run(Thread.java:834)

...

抓取后可以使用类似fastthread这样的在线分析工具来分析线程栈。

jcmd #

最后，我们来看一下 Java HotSpot 虚拟机的 NMT 功能。

通过 NMT，我们可以观察细粒度内存使用情况，设置 -XX:NativeMemoryTracking=summary/detail 可以开启 NMT 功能，开启后可以使用 jcmd 工具查看 NMT 数据。

我们重新启动一次程序，这次加上 JVM 参数以 detail 方式开启 NMT：

Xms1g -Xmx1g -XX:ThreadStackSize=256k -XX:NativeMemoryTracking=detail

在这里，我们还增加了 -XX:ThreadStackSize 参数，并将其值设置为 256K，也就是期望将线程栈设置为 256KB。我们通过 NMT 观察一下设置是否成功。

启动程序后执行如下 jcmd 命令，以概要形式输出 NMT 结果。可以看到，当前有 32 个线程，线程栈总共保留了差不多 4GB 左右的内存。我们明明配置线程栈最大 256KB 啊，为什么会出现 4GB 这么夸张的数字呢，到底哪里出了问题呢？

➜  ~ jcmd 24404 VM.native_memory summary

24404:



Native Memory Tracking:



Total: reserved=6635310KB, committed=5337110KB

Java Heap (reserved=1048576KB, committed=1048576KB)
- (mmap: reserved=1048576KB, committed=1048576KB)
Class (reserved=1066233KB, committed=15097KB)
- (classes #902)
- (malloc=9465KB #908)
- (mmap: reserved=1056768KB, committed=5632KB)
Thread (reserved=4209797KB, committed=4209797KB)
- (thread #32)
- (stack: reserved=4209664KB, committed=4209664KB)
- (malloc=96KB #165)
- (arena=37KB #59)
Code (reserved=249823KB, committed=2759KB)
- (malloc=223KB #730)
- (mmap: reserved=249600KB, committed=2536KB)
GC (reserved=48700KB, committed=48700KB)
- (malloc=10384KB #135)
- (mmap: reserved=38316KB, committed=38316KB)
Compiler (reserved=186KB, committed=186KB)
- (malloc=56KB #105)
- (arena=131KB #7)
Internal (reserved=9693KB, committed=9693KB)
- (malloc=9661KB #2585)
- (mmap: reserved=32KB, committed=32KB)
Symbol (reserved=2021KB, committed=2021KB)
- (malloc=1182KB #334)
- (arena=839KB #1)
Native Memory Tracking (reserved=85KB, committed=85KB)
- (malloc=5KB #53)
- (tracking overhead=80KB)
Arena Chunk (reserved=196KB, committed=196KB)
- (malloc=196KB)

Run jcmd with the VM.native_memory detail parameter again:

jcmd 24404 VM.native_memory detail

We can see that there are 16 suspicious threads, with each thread reserving 262144KB of memory, which is 256MB (as shown in the red box in the image below, searching for “262144KB for Thread Stack from” yields 16 results):

In fact, the unit for the ThreadStackSize parameter is KB, so if we want to set the thread stack to 256KB, we should set it to 256 instead of 256k. After correctly setting the parameter, let’s verify it again using jcmd:

In addition to viewing NMT, jcmd has many other functions. We can use the help command to see all its functions:

jcmd 24781 help

For each of these functions, we can further use the help command to view their descriptions. For example, using the GC.heap_info command can print some information about the Java heap:

jcmd 24781 help GC.heap_info

In addition to jps, jinfo, jcmd, jstack, jstat, jconsole, jvisualvm, there are also some other tools in the JDK, which you can find complete descriptions of in the official documentation.

Analyzing the Slow Performance of SQL Batch Insert Using Wireshark #

I previously encountered a case where a data import program needed to import a large amount of data. The developer used Spring JdbcTemplate’s batch operation feature for batch data import, but found that the performance was very poor, similar to executing individual SQL statements.

Let’s recreate this case. After starting the program, first create a testuser table with only one column name, and then use JdbcTemplate’s batchUpdate method to batch insert 10,000 records into the testuser table:

@SpringBootApplication
@Slf4j
public class BatchInsertApplication implements CommandLineRunner {

    @Autowired
    private JdbcTemplate jdbcTemplate;

    public static void main(String[] args) {
        SpringApplication.run(BatchInsertApplication.class, args);
    }

    @PostConstruct
    public void init() {
        // Initialize table
        jdbcTemplate.execute("drop table IF EXISTS `testuser`;");
        jdbcTemplate.execute("create TABLE `testuser` (\n" +
                "  `id` bigint(20) NOT NULL AUTO_INCREMENT,\n" +
                "  `name` varchar(255) NOT NULL,\n" +
                "  PRIMARY KEY (`id`)\n" +
                ") ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;");
    }

    @Override
    public void run(String... args) {
        long begin = System.currentTimeMillis();
        String sql = "INSERT INTO `testuser` (`name`) VALUES (?)";
        // Use JDBC batch update
        jdbcTemplate.batchUpdate(sql, new BatchPreparedStatementSetter() {
            @Override
            public void setValues(PreparedStatement preparedStatement, int i) throws SQLException {
                // Set the value for the name column (1-based index)
                preparedStatement.setString(1, "usera" + i);
            }

            @Override
            public int getBatchSize() {
                // Batch size is 10000
                return 10000;
            }
        });
        log.info("took : {} ms", System.currentTimeMillis() - begin);
    }
}

After executing the program, you can see that inserting 10,000 records took 26 seconds:

[14:44:19.094] [main] [INFO ] [o.g.t.c.t.network.BatchInsertApplication:52 ] - took : 26144 ms

In fact, for batch operations, we want the program to combine multiple insert SQL statements into one, or at least submit multiple statements to the database at once to reduce the number of interactions with MySQL and improve performance. But does our program work like that?

As I mentioned in extra lesson 3, “analyzing problems requires evidence, it cannot be guessed”. Now, let’s use the network analysis tool Wireshark to analyze this case and see it for ourselves.

First, we can download Wireshark here and start it. Choose a network interface to capture. Since we are connecting to MySQL locally, I’ll choose the loopback interface:

Wireshark will capture all network traffic on this interface. We can enter tcp.port == 6657 in the display filter bar on top to filter out all TCP requests on port 6657 (since we are connecting to MySQL through port 6657).

You can see that there is a lot of interaction between the program and MySQL during program execution. Wireshark directly decodes TCP packets into MySQL protocol, so the lower window can directly display the SQL queries for MySQL requests. We can see that for each insert operation on the testuser table, only one row is inserted:

If the Protocol column doesn’t display MySQL in the list, you can manually click on the Analyze menu, then Decode As submenu, and add a rule to set the 6657 port as MySQL protocol:

This indicates that our program is not performing batch inserts and is no different from regular single-row inserts in a loop. By debugging into the ClientPreparedStatement class, we can see that the executeBatchInternal method is responsible for batch operations. The source code for executeBatchInternal is as follows:

@Override
protected long[] executeBatchInternal() throws SQLException {

    synchronized (checkClosed().getConnectionMutex()) {

        if (this.connection.isReadOnly()) {

            throw new SQLException(Messages.getString("PreparedStatement.25") + Messages.getString("PreparedStatement.26"),
                MysqlErrorNumbers.SQL_STATE_ILLEGAL_ARGUMENT);

        }

        if (this.query.getBatchedArgs() == null || this.query.getBatchedArgs().size() == 0) {

            return new long[0];

        }

        // we timeout the entire batch, not individual statements

        int batchTimeout = getTimeoutInMillis();

        setTimeoutInMillis(0);

        resetCancelledState();

        try {

            statementBegins();

            clearWarnings();

            if (!this.batchHasPlainStatements && this.rewriteBatchedStatements.getValue()) {

                if (((PreparedQuery<?>) this.query).getParseInfo().canRewriteAsMultiValueInsertAtSqlLevel()) {

                    return executeBatchedInserts(batchTimeout);

                }

                if (!this.batchHasPlainStatements && this.query.getBatchedArgs() != null
                        && this.query.getBatchedArgs().size() > 3 /* cost of option setting rt-wise */) {

                    return executePreparedBatchAsMultiStatement(batchTimeout);

                }

            }

            return executeBatchSerially(batchTimeout);

        } finally {

            this.query.getStatementExecuting().set(false);

            clearBatch();

        }

    }

}

Notice the judgement on line 18, checking if the rewriteBatchedStatements parameter is set to true in order to optimize the batch. There are two ways to optimize the batch:

If conditions are met, the insert statement will be optimized into one statement, which is the executeBatchedInserts method;

If not, the insert statement will be optimized into multiple statements and submitted together, which is the executePreparedBatchAsMultiStatement method.

To achieve the optimization of batch submission, the key lies in the rewriteBatchedStatements parameter. We modify the connection string and set its value to true:

spring.datasource.url=jdbc:mysql://localhost:6657/common_mistakes?characterEncoding=UTF-8&useSSL=false&rewriteBatchedStatements=true

After following the previous steps to open Wireshark for verification, we can see:

This time the insert SQL statement is concatenated into one statement (as shown in the second red rectangle);

This TCP package is divided into 11 segments for transmission, the #699 request is the last segment, and its actual content is the last part of the insert statement (as shown in the first and third red rectangles).

To view all the packets for the entire TCP connection, you can right-click on the request and choose Follow->TCP Stream:

After opening, you can see the contents of all the packets from the MySQL authentication to the insert statement:

By checking the handshake packet at the beginning, we can find that the maximum segment size (MSS) of TCP is 16344 bytes, and our MySQL long insert data is a total of 138933 bytes, so it is divided into 11 segments for transmission, with the largest segment being 16332 bytes, which is lower than the required 16344 bytes of MSS.

Finally, we can see that inserting 10,000 rows of data only takes 253 milliseconds, which is a performance improvement of 100 times:

[20:19:30.185] [main] [INFO ] [o.g.t.c.t.network.BatchInsertApplication:52 ] - took : 253 ms

Although we have been using MySQL, we seldom consider how MySQL Connector Java interacts with MySQL and how the SQL statements sent to MySQL actually look like. Have you noticed that the MySQL protocol is not far away? We can use Wireshark to observe and analyze the entire process of interaction between the application and MySQL.

Key Review #

Today, I used the tools provided by JDK to view JVM information and analyze the slow performance issue of SQL batch insert. I also showed you some tools and their usage.

Firstly, JDK provides both command-line tools and graphical tools for monitoring and troubleshooting. The command-line tools are more suitable for use on servers, while the graphical tools are more intuitive for local data observation. To help you make good use of these tools, we used them to analyze two problems with setting JVM parameters for program errors and observed the working status of GC.

Then, we used Wireshark to analyze the slow performance issue of MySQL batch insert operation. We saw that analyzing network packets with Wireshark can make everything so transparent. Therefore, mastering Wireshark can greatly help us to troubleshoot bugs or performance issues in C/S network programs.

For example, when encountering network issues such as “Connection reset” or “Broken pipe,” you can use Wireshark to locate the problem and observe what exactly went wrong between the client and the server.

In addition, if you need to develop network programs, Wireshark is an essential tool for analyzing protocols and verifying program correctness.

The code used today is all available on GitHub. You can click on this link to view it.

Reflection and Discussion #

There is another tool in JDK called jmap, and we use the command jmap -dump to perform heap dumping. So, what is the difference between this command and jmap -dump:live? Can you design an experiment to demonstrate their differences?

Have you ever wondered how the client authenticates with MySQL? Can you use Wireshark to observe and analyze this process by referring to the MySQL documentation?

In your daily work, what other tools do you use to analyze and troubleshoot Java applications? I’m Zhu Ye, and I welcome you to leave a comment in the comment section to share your thoughts. Feel free to share today’s content with your friends or colleagues and join in the discussion.