23 Memory Analysis and Related Tools (Part 1): Memory Layout and Analysis Tools

In the previous lessons, we learned about the difference between “memory overflow” and “memory leak”.

Simply put, memory overflow in Java means there is not enough memory to satisfy an allocation. It usually surfaces as a heap OutOfMemoryError, but it can also be triggered by exhaustion of other memory spaces.

Now let’s discuss the memory-related knowledge of Java objects in detail.

Introduction to Java Object Memory Layout #

Let’s consider a question: if an object has 100 properties, compared to 100 objects, each with 1 property, which one occupies more memory space?

To answer this question, let’s see how JVM represents an object:

(Figure: JVM object memory layout, showing the object header, instance fields, alignment, and padding)

Explanation:

  • Alignment: For example, for a data type like long with 8 bytes, the starting address in memory must be a multiple of 8 bytes.
  • Padding: If a gap is left after a field inside an object, padding fills it, because the next field must start at an offset that is a multiple of 4 or 8 bytes (32-bit/64-bit).
  • In fact, these two are the same principle, aiming to align the positions inside and outside the object.
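The alignment rule itself is just rounding a size up to the next multiple of the alignment unit. The tiny helper below is hypothetical, only to make the arithmetic concrete:

```java
public class AlignDemo {
    // Round size up to the next multiple of alignment (alignment must be a power of two)
    static long align(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    public static void main(String[] args) {
        System.out.println(align(17, 8)); // 17 bytes of fields pad out to 24
        System.out.println(align(16, 8)); // already a multiple of 8: stays 16
    }
}
```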

How much memory does a Java object occupy? #

Referring to Mindprod, we can see that the situation is not simple:

  • The JVM implementation can store internal data in any form, and it can be big-endian or little-endian. It can also add any amount of padding or overhead, although the behavior of primitive types must comply with the specification.

For example, the JVM or the native compiler may decide to store a boolean[] in 64-bit memory blocks, similar to BitSet. JVM vendors need not disclose such details to you, as long as program results stay consistent.

  • The JVM can allocate some temporary objects in the stack space.
  • The compiler may use constants to replace certain variables or method calls.
  • The compiler may perform deep optimizations, such as generating multiple compilation versions for methods and loops, and invoking one of them for certain cases.

Of course, the hardware platform and operating system also have multiple levels of cache, such as the L1/L2/L3 caches built into the CPU, SRAM cache, DRAM cache, conventional memory, and virtual memory on disks.

User data may appear in multiple levels of cache. With so many complex situations, we can only estimate the memory usage roughly.

Methods for Measuring Object Memory Usage #

Generally, we can use the Instrumentation.getObjectSize() method to estimate the memory space occupied by an object.

If you want to view the actual memory layout, footprint, and references of an object, you can use the Java Object Layout (JOL) tool provided by OpenJDK.

Object Header and Object Reference #

In a 64-bit JVM, the object header occupies 12 bytes (96 bits: a 64-bit mark word plus a 32-bit compressed class pointer), but objects are aligned on 8-byte boundaries, so an instance of an empty class occupies at least 16 bytes.

In a 32-bit JVM, the object header occupies 8 bytes, and objects are aligned to multiples of 4 bytes (4 bytes = 32 bits).

So creating many simple objects, even new Object(), would consume a considerable amount of memory.

In general, in a 32-bit JVM, or in a 64-bit JVM with a heap smaller than -Xmx32G (compressed pointers are enabled by default in that case), a reference occupies 4 bytes.

Therefore, a 64-bit JVM generally requires an additional 30%~50% of heap memory.

Why? Please think about it.

Wrapper Types, Arrays, and Strings #

Wrapper types consume more memory than primitive data types. For more details, you can refer to JavaWorld:

  • Integer: Occupies 16 bytes (8-byte header + 4-byte int = 12, plus 4 bytes of padding), so using Integer consumes 300% more memory than the primitive type int.
  • Long: Generally occupies 16 bytes (8-byte header + 8-byte payload), though the actual size is determined by the memory alignment of the underlying platform, i.e. the JVM implementation on a specific CPU. A Long thus carries 8 extra bytes over the primitive long (100% more memory). By contrast, Integer needs 4 bytes of padding, most likely because the JVM forces 8-byte alignment.

Other container types also occupy a considerable amount of space.
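As a back-of-the-envelope illustration of wrapper overhead at scale, the arithmetic below assumes a 64-bit JVM with compressed oops (16-byte array headers, 4-byte references, 16-byte Long instances). These constants are assumptions for the estimate, not guarantees from the JVM:

```java
public class WrapperFootprint {
    // long[] : array header + 8 bytes per element
    static long primitiveArrayBytes(long n) {
        return 16 + 8 * n;
    }

    // Long[] : array header + one 4-byte reference per slot,
    //          plus one separately allocated 16-byte Long per element
    static long wrapperArrayBytes(long n) {
        return 16 + 4 * n + 16 * n;
    }

    public static void main(String[] args) {
        System.out.println(primitiveArrayBytes(1_000_000)); // 8000016
        System.out.println(wrapperArrayBytes(1_000_000));   // 20000016, ~2.5x more
    }
}
```

Under these assumptions, storing a million values as Long[] needs roughly 2.5 times the memory of a long[].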

Multi-dimensional Arrays: This is another surprise. When performing numerical or scientific calculations, developers often use the int[dim1][dim2] construct.

In a two-dimensional array int[dim1][dim2], each nested array int[dim2] is a separate object and occupies an additional 16 bytes of space. In some cases, this overhead is wasteful. When the array dimensions are larger, this overhead becomes particularly noticeable.

For example, an instance of int[128][2] occupies 3600 bytes, while an instance of int[256] occupies only 1040 bytes. The effective storage is the same, but the 3600 bytes carry 246% more overhead than the 1040 bytes. In the extreme case of byte[256][1], the overhead is 19 times the payload! In C/C++, the same syntax adds no extra storage overhead.
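The 3600-byte figure can be reproduced with simple arithmetic, assuming a 64-bit JVM with compressed oops (16-byte array header, 4-byte references, 8-byte alignment). This is an estimate under those assumptions, not a measurement:

```java
public class ArrayOverhead {
    static long align8(long n) { return (n + 7) & ~7L; }

    // A one-dimensional int[]: header + 4 bytes per element, padded to 8 bytes
    static long intArrayBytes(int len) {
        return align8(16 + 4L * len);
    }

    // int[dim1][dim2]: an outer array of references plus dim1 separate inner int[] objects
    static long int2DArrayBytes(int dim1, int dim2) {
        long outer = align8(16 + 4L * dim1);
        return outer + dim1 * intArrayBytes(dim2);
    }

    public static void main(String[] args) {
        System.out.println(int2DArrayBytes(128, 2)); // 3600, matching the text
        System.out.println(intArrayBytes(256));      // 1040
    }
}
```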

String: The space occupied by a String object increases as the internal character array grows. Of course, String objects have an additional overhead of 24 bytes.

For non-empty Strings with lengths up to 10 characters, the added overhead occupies 100% to 400% more memory compared to the payload (2 bytes per character + 4 bytes for length).

Alignment #

Let’s take a look at the following example object:

class X { // 8 bytes - reference to the class definition
   int a; // 4 bytes
   byte b; // 1 byte
   Integer c = new Integer(0); // 4-byte reference
}

We might expect an instance of class X to occupy 17 bytes of space. However, due to alignment requirements, JVM allocates memory in multiples of 8 bytes. So the space occupied is not 17 bytes, but 24 bytes.

After running the JOL example, it can be observed that JVM arranges the fields of the parent class first, then when it comes to the fields of the current class, it first aligns the 8-byte ones, and then the 4-byte fields, and so on. It also adds padding to minimize wasted space.

Java’s built-in serialization depends on this layout, so adding fields breaks compatibility; even adding only methods causes problems if the serialVersionUID is not pinned. This is why experienced developers avoid built-in serialization, for example when storing custom types in Redis.

JOL Usage Example #

JOL (Java Object Layout) is a small tool for analyzing memory layouts in the JVM. It decodes the actual object layout, occupancy, and references using Unsafe, JVMTI, and the Serviceability Agent (SA), making JOL more accurate than heap dump-based or specification-based tools.

The official website for JOL is:

http://openjdk.java.net/projects/code-tools/jol/

JOL supports command-line invocation through jol-cli. The download page can be found in the Maven central repository:

http://central.maven.org/maven2/org/openjdk/jol/jol-cli/

You can download the jol-cli-0.9-full.jar file from there.

JOL also supports programmatic usage. An example can be found at:

http://hg.openjdk.java.net/code-tools/jol/file/tip/jol-samples/src/main/java/org/openjdk/jol/samples/

The relevant dependencies can be found in the Maven central repository:

<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.9</version>
</dependency>

You can search for the specific jar on this search page:

https://mvnrepository.com/search?q=jol-core

Memory Leaks #

Memory Leak Example #

The following example provides more specific details.

In Java, when creating a new object, like Integer num = new Integer(5), manual memory allocation is not required. This is because JVM automatically wraps and handles memory allocation. During program execution, the JVM checks which objects are still in use in memory when necessary, and abandons those objects that are no longer used. The memory occupied by these discarded objects is reclaimed and reused. This process is called “garbage collection.” The module in JVM responsible for garbage collection is called “garbage collector (GC).”

Java’s automatic memory management relies on the GC, which repeatedly scans the memory regions to remove unused objects. Simply put, memory leaks in Java refer to objects that are logically no longer in use but have not been removed by the garbage collector. As a result, these garbage objects continue to occupy heap memory, gradually accumulate, and eventually lead to java.lang.OutOfMemoryError: Java heap space errors.

It is easy to write a bug program to simulate a memory leak:

import java.util.HashMap;
import java.util.Map;

public class KeylessEntry {

    static class Key {
        Integer id;

        Key(Integer id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    public static void main(String[] args) {
        Map<Key, String> m = new HashMap<>();
        while (true) {
            for (int i = 0; i < 10000; i++) {
                if (!m.containsKey(new Key(i))) {
                    m.put(new Key(i), "Number:" + i);
                }
            }
            System.out.println("m.size()=" + m.size());
        }
    }
}

At first glance, you might think there is nothing wrong, because there are at most 10,000 cached elements!

However, upon closer inspection, you will notice that the Key class only overrides the hashCode() method and does not override the equals() method, which means it will keep adding more Key objects to the HashMap.
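A bounded variant of the example (class and method names here are mine, for illustration) makes the growth observable without the infinite loop. Because equals() falls back to identity comparison, containsKey() never finds an existing entry, so every pass inserts a fresh batch of keys:

```java
import java.util.HashMap;
import java.util.Map;

public class HashCodeOnlyDemo {
    static class Key {
        final Integer id;
        Key(Integer id) { this.id = id; }
        @Override public int hashCode() { return id.hashCode(); }
        // equals() is NOT overridden, so lookups compare object identity
    }

    static int sizeAfterPasses(int passes, int keysPerPass) {
        Map<Key, String> m = new HashMap<>();
        for (int p = 0; p < passes; p++) {
            for (int i = 0; i < keysPerPass; i++) {
                if (!m.containsKey(new Key(i))) {     // always false: identity never matches
                    m.put(new Key(i), "Number:" + i); // so every pass adds new entries
                }
            }
        }
        return m.size();
    }

    public static void main(String[] args) {
        // With a correct equals() this would stay at 100; without it, the map keeps growing
        System.out.println(sizeAfterPasses(3, 100)); // 300
    }
}
```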

Please refer to: “The Contract and Principles for Overriding hashCode and equals Methods in Java”.

Over time, the “cached” objects will increase. When the leaked objects fill up all the heap memory and the GC cannot clean them up, a java.lang.OutOfMemoryError: Java heap space error will be thrown.

The solution is simple. Implement the equals() method correctly in the Key class:

@Override
public boolean equals(Object o) {
    boolean response = false;
    if (o instanceof Key) {
        response = (((Key)o).id).equals(this.id);
    }
    return response;
}

To be honest, a memory leak can exist for a long time while the functionality keeps working, until some threshold is crossed. This hidden nature can cost you a great deal of effort when hunting for the real cause of the leak.

A Real Scenario in Spring MVC #

We once encountered a scenario like this:

To easily migrate code from Struts2 to Spring MVC, the request was directly obtained in the Controller by using ThreadLocal.

So in the ControllerBase class, the request object held by the current thread was cached through ThreadLocal:

public abstract class ControllerBase {
    private static ThreadLocal<HttpServletRequest> requestThreadLocal = new ThreadLocal<HttpServletRequest>();

    public static HttpServletRequest getRequest(){
        return requestThreadLocal.get();
    }
    
    public static void setRequest(HttpServletRequest request){
        if (null == request){
            requestThreadLocal.remove();
            return;
        }
        requestThreadLocal.set(request);
    }
}

Then in the HandlerInterceptor implementation class in Spring MVC, during the preHandle method, the request object is saved to ThreadLocal:

/**
 * Login interceptor
 */
public class LoginCheckInterceptor implements HandlerInterceptor {
    private List<String> excludeList = new ArrayList<String>();
    
    public void setExcludeList(List<String> excludeList) {
        this.excludeList = excludeList;
    }

    private boolean validURI(HttpServletRequest request) {
        // If it is in the exclude list
        String uri = request.getRequestURI();
        Iterator<String> iterator = excludeList.iterator();
        while (iterator.hasNext()) {
            String exURI = iterator.next();
            if (null != exURI && uri.contains(exURI)){
                return true;
            }
        }
        
        // Can perform login and permission checks
        LoginUser user = ControllerBase.getLoginUser(request);
        if (null != user) {
            return true;
        }
        
        // Not logged in, not allowed
        return false;
    }

    private void initRequestThreadLocal(HttpServletRequest request) {
        ControllerBase.setRequest(request);
        request.setAttribute("basePath", ControllerBase.basePathLessSlash(request));
    }

    private void removeRequestThreadLocal() {
        ControllerBase.setRequest(null);
    }
    
    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler)
            throws Exception {
        initRequestThreadLocal(request);
        
        // If not allowed, return false
        if (false == validURI(request)) {
            // Throw an exception here to allow for unified exception handling
            throw new NeedLoginException();
        }
        return true;
    }

    @Override
    public void postHandle(HttpServletRequest request, HttpServletResponse response, Object handler,
            ModelAndView modelAndView) throws Exception {
        removeRequestThreadLocal();
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex)
            throws Exception {
        removeRequestThreadLocal();
    }
}

The code is quite long. Just note that we clean up the request object in the ThreadLocal in the postHandle and afterCompletion methods of Spring MVC.

However, in practice, the developers set a large object (such as a List taking up about 200MB of memory) as an attribute of the request and pass it to the JSP.

If there is an exception in the JSP code, the postHandle and afterCompletion methods of Spring MVC will not be executed.

Because Tomcat pools its worker threads, a thread that threw an exception goes back to the pool with its ThreadLocal still populated, so the ThreadLocal keeps a reference to the request object indefinitely.

As time goes by, the available memory fills up, leading to continuous Full GC, but because of the memory leak, GC cannot resolve the problem, causing the system to freeze.

Subsequent revisions: Clean up the ThreadLocal in the finally block using a Filter.

@WebFilter(value="/*", asyncSupported=true)
public class ClearRequestCacheFilter implements Filter{

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException,
            ServletException {
        clearControllerBaseThreadLocal();
        try {
            chain.doFilter(request, response);
        } finally {
            clearControllerBaseThreadLocal();
        }
    }

    private void clearControllerBaseThreadLocal() {
        ControllerBase.setRequest(null);
    }
    @Override
    public void init(FilterConfig filterConfig) throws ServletException {}
    @Override
    public void destroy() {}
}

The lesson from this case: ThreadLocal can certainly be used, but there must be a guaranteed release step, generally a try-finally block, so the value is cleaned up on every code path. (In fact, the GC already handles 99.99% of object management for us; otherwise we would run into many more problems like this. I formed this understanding doing C++ development ten years ago.)
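The try-finally release pattern can be sketched with nothing but the JDK. The class and method names below are illustrative, standing in for the request-holding ThreadLocal in the story above:

```java
public class ThreadLocalRelease {
    // Hypothetical per-thread cache, standing in for the request ThreadLocal
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    static String handle(String request, boolean fail) {
        CONTEXT.set(request);
        try {
            if (fail) {
                throw new IllegalStateException("view rendering failed");
            }
            return "handled:" + CONTEXT.get();
        } finally {
            CONTEXT.remove(); // always released, even when the handler throws
        }
    }

    static boolean leaked() {
        return CONTEXT.get() != null;
    }

    public static void main(String[] args) {
        System.out.println(handle("req-1", false)); // handled:req-1
        try {
            handle("req-2", true);
        } catch (IllegalStateException expected) {
            // the exception path still went through finally
        }
        System.out.println(leaked()); // false: nothing left behind on this thread
    }
}
```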

Note: In Spring MVC controllers, you can actually inject the request with @Autowired; the injected object is an HttpServletRequestWrapper proxy that resolves the current request through a ThreadLocal mechanism at call time.

Conventional way: Just accept the request parameter directly in the controller method. There is no need to wrap it separately.

This is why we always recommend using existing frameworks and technologies or others’ successful practices. Many times, others’ experiences, especially with mature frameworks and projects, have encountered many pitfalls. If we start from scratch, we will have to go through each pitfall one by one, which may not be worth it.

Memory Dump and Analysis #

Memory dump is divided into 2 types: active dump and passive dump.

  • Active dump tools include: jcmd, jmap, JVisualVM, etc. Please refer to the relevant tool documentation for specific usage.
  • Passive dump mainly refers to: hprof, and parameters such as -XX:+HeapDumpOnOutOfMemoryError.
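For example, a service can be started so that the JVM writes a heap dump automatically on the first OutOfMemoryError. The flags are standard HotSpot options; the dump path and application name below are placeholders:

```shell
# Dump the heap automatically when an OutOfMemoryError is first thrown;
# /tmp/dumps and app.jar are placeholders for your environment
java -Xmx512m \
     -XX:+HeapDumpOnOutOfMemoryError \
     -XX:HeapDumpPath=/tmp/dumps \
     -jar app.jar
```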

For more methods, please refer to:

https://www.baeldung.com/java-heap-dump-capture

Regarding hprof’s user manual and internal format, please refer to the documentation in the JDK source code:

http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/raw-file/beb15266ba1a/src/share/demo/jvmti/hprof/manual.html#mozTocId848088

In addition, commonly used analysis tools include:

  • jhat: supports analysis of dump files. It runs an HTTP/HTML server that renders a dump file as HTML pages, which can then be viewed through a browser.
  • MAT: MAT is a good graphical JVM dump file analysis tool.

Useful Analysis Tool: MAT #

1. Introduction to MAT

MAT stands for the Eclipse Memory Analyzer Tool.

Its advantage lies in being able to perform object reference analysis from GC roots, calculating how many objects are referenced by each root, and making it relatively easy to locate memory leaks. MAT is a standalone product, less than 100MB, and can be downloaded from the official website: Download Link.

2. MAT Example

Phenomenon description: After optimizing and adjusting slow SQL in the system and going live, no problems were found in the test environment, but after running for a period of time, it was discovered that the CPU was running at full capacity. Below, we will analyze the case.

First, check the local Java process:

jps -v

Assuming the PID displayed by jps is 3826.

Dump the memory:

jmap -dump:format=b,file=3826.hprof 3826

After the export is complete, the dump file is about 3GB. Therefore, you need to modify the MAT configuration parameters. It may not work if the value is too small, but it doesn’t have to be set very large.

In the MAT installation directory, modify the configuration file:

MemoryAnalyzer.ini

The default memory configuration is 1024MB. Analyzing a 3GB dump file may cause errors. Modify the following section:

-vmargs
-Xmx1024m

According to the size of the dump file, increase the maximum heap memory setting appropriately, requiring it to be a multiple of 4MB. For example, change it to:

-vmargs
-Xmx4g

Double-click to open MemoryAnalyzer.exe and open the MAT analysis tool. Select the menu File -> Open File... and choose the corresponding dump file.

Select Leak Suspects Report and confirm to generate a report on memory leaks.

(Screenshot: selecting the Leak Suspects report in MAT)

3. Memory Report

Wait for the analysis to complete. After the analysis is completed, the summary information is as follows:

(Screenshot: MAT analysis summary)

The analysis report shows the largest memory usage problem source 1:

(Screenshot: problem suspect 1)

The largest memory usage problem source 2:

(Screenshot: problem suspect 2)

The largest memory usage problem source 3:

(Screenshot: problem suspect 3)

It can be seen that the total memory usage is only around 2GB. Problem source 1 and source 2 each occupy 800MB, so the problem is likely to be related to them.

Of course, problem source 3 also has some reference value, indicating that there are many JDBC operations at that time.

Viewing problem source 1, its description information is as follows:

The thread org.apache.tomcat.util.threads.TaskThread
  @ 0x6c4276718 http-nio-8086-exec-8
keeps local variables with total size 826,745,896 (37.61%) bytes.

The memory is accumulated in one instance of
"org.apache.tomcat.util.threads.TaskThread"
loaded by "java.net.URLClassLoader @ 0x6c0015a40".
The stacktrace of this Thread is available. See stacktrace.

Keywords
java.net.URLClassLoader @ 0x6c0015a40
org.apache.tomcat.util.threads.TaskThread

4. Interpretation of the Analysis

Overall interpretation: this is a (running) thread; its class is org.apache.tomcat.util.threads.TaskThread, and it holds approximately 826MB of objects, accounting for 37.61% of the total.

All running threads (stacks) are GC-Root.

Click on the “See stacktrace” link to view the thread call stack at export.

An excerpt is as follows:

Thread Stack

http-nio-8086-exec-8
  …
  at org.mybatis.spring.SqlSessionTemplate.selectOne
  at com.sun.proxy.$Proxy195.countVOBy(Lcom//domain/vo/home/residents/ResidentsInfomationVO;)I (Unknown Source)
  at com..bi.home.service.residents.impl.ResidentsInfomationServiceImpl.countVOBy(….)Ljava/lang/Integer; (ResidentsInfomationServiceImpl.java:164)
  at com..bi.home.service.residents.impl.ResidentsInfomationServiceImpl.selectAllVOByPage(….)Ljava/util/Map; (ResidentsInfomationServiceImpl.java:267)
  at com..web.controller.personFocusGroups.DocPersonFocusGroupsController.loadPersonFocusGroups(….)Lcom/****/domain/vo/JSONMessage; (DocPersonFocusGroupsController.java:183)
  at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run()V (TaskThread.java:61)
  at java.lang.Thread.run()V (Thread.java:745)

One key piece of information is to find our own package, such as:

com.****…..ResidentsInfomationServiceImpl.selectAllVOByPage

And it also gives the line number corresponding to the Java source file.

Analyzing the second root cause, the result is basically the same as the first root cause.

Of course, you can also analyze the number of objects held by each class under this root cause.

Click on the “Details »” link below the explanation of root cause 1 to enter the details page.

Check the “Accumulated Objects in Dominator Tree”:

(Screenshot: Accumulated Objects in Dominator Tree)

You can see that the two ArrayList objects occupy the most memory.

Left-click on the first ArrayList object and select “Show objects by class -> by outgoing references” from the pop-up menu.

(Screenshot: Show objects by class -> by outgoing references)

Open the class_references tab:

(Screenshot: the class_references tab, expanded)

After expanding, it is found that there are 1.13 million PO class objects. The loading is indeed a bit excessive, directly occupying 170MB of memory (each object is approximately 150 bytes).

In fact, this is caused by putting batch processing tasks into real-time requests.

MAT provides other information that can be opened and viewed, and can also provide some basis for diagnosing problems.

JDK Built-in Troubleshooting Tool: jhat #

jhat is a Java heap analysis tool. Since JDK 6u7 it has been part of the JDK. Using this command requires some Java development experience, and the vendor provides no technical support or customer service for this tool.

  1. jhat usage:

    jhat [options] heap-dump-file

    Parameters:

    • options: Optional command-line parameters, please refer to the [Options] below.
    • heap-dump-file: The binary Java heap dump file to be viewed. If a dump file contains multiple heap dumps, you can specify which dump to parse by appending “#” to the file name, such as: myfile.hprof#3.
  2. jhat example

To dump heap memory using the jmap tool, use the following command:

jmap -dump:file=DumpFileName.txt,format=b <pid>

For example:

jmap -dump:file=D:/javaDump.hprof,format=b 3614
Dumping heap to D:\javaDump.hprof ...
Heap dump file created

In this example, 3614 is the ID of the Java process. Generally, jmap needs to be compatible with or the same version as the target JVM in order to successfully export the heap dump.

If you are not sure how to use jmap, simply enter the command “jmap” or “jmap -h” to see the instructions.

To analyze the dump file, you can use the jhat command as follows:

jhat -J-Xmx1024m D:/javaDump.hprof
...... Other information ...
Snapshot resolved.
Started HTTP server on port 7000
Server is ready.

The “-J-Xmx1024m” parameter is used because the default heap memory of the JVM may not be enough to load the entire dump file. You can adjust this value as needed. According to the prompt, the port number is 7000. You can then access the analysis results by using a browser to visit http://localhost:7000/.

3. Detailed Explanation

The jhat command supports pre-designed queries, such as displaying all instances of a certain class.

It also supports Object Query Language (OQL), which is similar to SQL and is specifically used for querying heap dumps.

The help information related to OQL can be found at the bottom of the server page provided by the jhat command.

If the default port is used, the OQL help information page is:

http://localhost:7000/oqlhelp/
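As a hedged example of what such a query looks like (the exact internal field names of java.lang.String vary across JDK versions, so `value` here is an assumption), an OQL query selecting strings with a large backing array could be written as:

```
select s from java.lang.String s where s.value.length >= 100
```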


4. Options

  • -stack, with a value of false or true, turns off or on the tracking of object allocation call stack. If allocation position information is not available in the heap dump, this flag must be set to false. The default value is true.
  • -refs, with a value of false or true, turns tracking of references to objects off or on. The default value is true. When enabled, jhat computes backlinks for all objects in the heap, i.e., for each object it records the referrers (incoming references) that point to it.
  • -port, followed by the port number, sets the port number of the jhat HTTP server. The default value is 7000.
  • -exclude, followed by the exclude-file, specifies a file listing the data members that need to be excluded when performing object queries. For example, if the file lists java.lang.String.value, then any reference path involving java.lang.String.value will be excluded when calculating the reachable object list from a specific object.
  • -baseline specifies a baseline heap dump. Objects with the same object ID in two heap dumps are marked as not new, while other objects are marked as new. This is useful when comparing two different heap dumps.
  • -debug, with an integer value, sets the debug level. 0 means no debug information is output, and the larger the value, the more detailed the debug information.
  • -version shows only the version information and then exits.
  • -h or -help shows the help information and exits.
  • -J <flag> allows passing JVM startup parameters when running the jhat command. Since the jhat command actually starts a JVM, multiple -Jxxxxxx flags can be used to specify multiple JVM startup parameters.

References #