25 Discuss the JVM Memory Area Allocation and Which Areas Might Trigger an OutOfMemoryError

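The JVM's memory is divided into the following main areas:

  1. Program Counter Register: holds the address of the bytecode instruction that the thread is currently executing. Each thread has its own counter, which records that thread's current position. Because the counter is a small, fixed-size slot, the JVM specification defines no OutOfMemoryError condition for it.

  2. Java Virtual Machine Stacks: each thread gets its own virtual machine stack when it is created, which stores local variables, method parameters, and method call information. The stack keeps track of execution state as methods are called and return. If a thread needs a deeper stack than is permitted, a StackOverflowError is thrown; if more stack space cannot be allocated, an OutOfMemoryError is thrown.

  3. Native Method Stack: similar to the Java Virtual Machine Stack, but it stores information for native method calls. Native methods are typically written in other languages such as C or C++. As with the virtual machine stack, a StackOverflowError is thrown if the native method stack is too shallow for the call depth, and an OutOfMemoryError is thrown if more stack space cannot be allocated.

  4. Java Heap: usually the largest memory area managed by the JVM, used to store object instances and arrays. The heap is created when the JVM starts and is shared by all threads. If memory cannot be allocated for a new object instance, an OutOfMemoryError is thrown.

  5. Method Area: stores class structure information and bytecode, along with constants, static variables, and JIT-compiled code; where each of these physically lives depends on the JVM implementation and version. The method area is shared by all threads. If so many classes or constants are loaded that no more space can be allocated, an OutOfMemoryError is thrown.

  6. Runtime Constant Pool: stores the literals and symbolic references generated at compile time. The runtime constant pool is part of the method area. If the constant pool cannot be allocated more memory, an OutOfMemoryError is thrown.

Besides these areas, there are other memory regions such as direct memory, which is not part of the JVM runtime data areas. Direct memory consists of buffers allocated directly from the operating system, outside the Java heap and its size limit. Using too much direct memory can also lead to an OutOfMemoryError.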


JVM memory can generally be divided into several areas, some of which are per-thread, while others are shared across the entire JVM process.

First, the Program Counter (PC). According to the JVM specification, each thread has its own program counter, and at any given time, a thread can only be executing one method, also known as the current method. The program counter stores the JVM instruction address of the Java method currently being executed by the thread, or an undefined value if the thread is executing a native method.

Second, the Java Virtual Machine Stack, formerly known as the Java stack. Each thread gets its own virtual machine stack when it is created. The stack holds stack frames, each corresponding to one Java method invocation.

As mentioned for the program counter, a thread has a current method. Similarly, at any given point in time there is only one active stack frame, commonly referred to as the current frame, and the class that defines the current method is the current class. If the current method calls another method, a new stack frame is created and becomes the current frame until that method returns a result or completes execution. The JVM performs only two direct operations on the Java stack: pushing and popping stack frames.

A stack frame contains the local variable table, the operand stack, dynamic linking information, and the information needed for normal or exceptional method completion.
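The push-and-pop behavior of frames can be observed from inside a program. The following is a minimal sketch, not part of the original lesson: the class and method names are hypothetical, and it assumes the JVM reports every frame in `Thread.getStackTrace()`, which the API permits an implementation to relax.

```java
final class StackFrameDemo {
    // Number of frames currently reported for this thread.
    // The count includes this helper and getStackTrace() itself.
    static int currentDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Measures the depth from one call level deeper than its caller.
    static int callee() {
        return currentDepth();
    }

    // Returns how many extra frames a single nested call adds.
    static int extraFramesForOneCall() {
        int here = currentDepth(); // measured directly in this frame
        int nested = callee();     // measured with callee()'s frame pushed on top
        return nested - here;
    }

    public static void main(String[] args) {
        System.out.println("extra frames for one nested call: " + extraFramesForOneCall());
    }
}
```

Once `callee()` returns, its frame is popped and the caller's frame is the current frame again, which is why the two measurements differ only by the frame of the nested call.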

Third, the Heap, which is the central area of Java memory management and is used to store Java object instances. Almost all Java object instances are allocated directly on the heap. The heap is shared by all threads, and its maximum size and related limits can be specified at JVM startup through parameters such as -Xmx.

Naturally, the heap is a focus of garbage collectors, and it is further subdivided by different garbage collectors. The most well-known subdivision is the distinction between the young generation and the old generation.
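To see which heap limits a running JVM actually ended up with, the standard `Runtime` API can be queried. This is a small illustrative sketch; the class name `HeapLimits` and the output format are my own choices, not something from the lesson.

```java
final class HeapLimits {
    // Formats a byte count as whole mebibytes for display.
    static String mib(long bytes) {
        return (bytes / (1024 * 1024)) + " MiB";
    }

    // Summarizes the heap figures the running JVM reports about itself.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();     // upper bound the heap may grow to (influenced by -Xmx)
        long total = rt.totalMemory(); // heap memory currently reserved by the JVM
        long free = rt.freeMemory();   // unused part of the currently reserved heap
        return "max=" + mib(max) + ", committed=" + mib(total) + ", used=" + mib(total - free);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running the same program with different -Xmx values should change the reported maximum, though the exact figure can differ slightly from the flag depending on the garbage collector in use.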

Fourth, the Method Area, a memory area shared by all threads, is used to store metadata such as class structure information, together with the corresponding runtime constant pool, fields, and method code.

Because of how the early HotSpot JVM implemented it, many people are accustomed to calling the method area the Permanent Generation (PermGen). In Oracle JDK 8, the Permanent Generation was removed, and Metaspace was introduced as its replacement for metadata storage.

Fifth, the Run-Time Constant Pool, which is part of the method area. If you carefully analyze the structure of a decompiled class file, you will see information such as version numbers, fields, methods, superclasses, interfaces, and the constant pool. Java's constant pool can store many kinds of constants, from literals generated at compile time to symbolic references that must be resolved at runtime. It is therefore richer than the symbol tables used in typical languages.

Sixth, the Native Method Stack, which is very similar to the Java Virtual Machine Stack, supports the invocation of native methods and is created for each thread. In the Oracle HotSpot JVM, the native method stack and the Java Virtual Machine Stack are located in the same area, which is purely an implementation decision and is not mandated by the specification.

Analysis of the Exam Focus

This is a basic question in the JVM field. The answer I provided is based on the definition of runtime data areas in the JVM Specification, which is consistent with the interpretation of most books and materials.

The concepts within the JVM are complex and may be somewhat obscure for beginners. My suggestion is to read classic books, such as “Understanding the Java Virtual Machine,” which I have recommended multiple times.

In today’s lecture, as an introduction to memory management in the Java virtual machine, I will focus on:

  • Analyzing the broad structure of JVM memory or Java process memory.
  • Exploring OutOfMemoryError (OOM), which is unavoidable in any discussion of Java memory layout: in which memory areas can OOM occur, and what error does each one report?

It should be noted that the specific memory structure of the JVM depends on its implementation. Different JVM vendors, or different versions from the same vendor, may differ in certain respects. In my analysis below, I will also introduce some design variations in the Oracle HotSpot JVM.

Knowledge Expansion

First of all, to give you a more intuitive and clear impression, I have drawn a simple memory structure diagram, which shows the heap, thread stacks, and other areas mentioned earlier. It distinguishes the thread-private areas, such as the program counter and the Java stack, from the areas shared across the Java process. It also marks out further regions such as direct memory.

This diagram reflects the actual memory usage of a Java process, which differs from the JVM runtime data areas defined in the specification. It can be considered a superset of the runtime data areas. After all, the theoretical perspective and the practical perspective are different. The specification covers only what is common to all implementations, while for application developers, anything the Java process occupies at runtime affects our engineering practice.

Here are two key differences I want to briefly introduce:

  • The direct memory area, which I mentioned in Chapter 12 of the series, is the memory directly allocated by the Direct Buffer. It is also an area prone to problems, even though JVM engineers do not consider it part of the JVM's internal memory and it is not reflected in the JVM's memory model.
  • The JVM itself is a native program and requires additional memory to complete various basic tasks. For example, the JIT compiler compiles hot methods at runtime and stores the compiled methods in the Code Cache. The GC and other functionalities need to run in native threads, all of which require memory space. These are the requirements for implementing JVM JIT and other functionalities, but they are not mentioned in the specification.
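Some of these non-heap areas can be inspected through the standard `java.lang.management` API. The sketch below lists whatever non-heap memory pools the running JVM chooses to expose. The class name `NonHeapPools` is hypothetical, and the pool names that appear (for example entries for Metaspace or the code cache on HotSpot) are implementation-specific, so treat the output as informational rather than something to depend on.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class NonHeapPools {
    // Returns one line per non-heap memory pool reported by this JVM.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue; // skip heap pools such as the young and old generation
            }
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            String used = (usage == null) ? "n/a" : (usage.getUsed() / 1024) + " KiB used";
            lines.add(pool.getName() + ": " + used);
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

Native memory that the JVM uses for its own threads and GC data structures does not show up here, which matches the point above that the process footprint is a superset of what the specification describes.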

If you delve into the implementation details of the JVM, you will find that some conclusions seem somewhat ambiguous, such as:

  • Are all Java objects created on the heap?

I have noticed some opinions suggesting that the JVM may use escape analysis to allocate objects that do not escape on the stack. This is theoretically feasible but depends on the choices made by JVM designers. As far as I know, the Oracle HotSpot JVM does not do this; its escape analysis is used for optimizations such as scalar replacement rather than for placing whole objects on the stack. This point has been clarified in the documentation related to escape analysis, so it can be concluded that all object instances are created on the heap.

  • Many books are still based on versions before JDK 7, but the JDK has undergone significant changes. For example, the cache for interned strings and static variables used to be allocated in the permanent generation, which has now been replaced by Metaspace. However, the interned string cache and static variables were not moved to Metaspace. Instead, they are allocated directly on the heap. So this point also confirms the previous conclusion: object instances are allocated on the heap.
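The interned string cache itself is easy to observe, even though where it lives cannot be seen from Java code. The following sketch (class and method names are hypothetical) only shows the identity behavior that the language guarantees: a string built at runtime is a distinct object from an equal literal until it is interned.

```java
final class InternDemo {
    // Builds the text at runtime so the compiler cannot fold it into the literal.
    static String buildAtRuntime() {
        return new StringBuilder("jvm-").append("memory").toString();
    }

    // A freshly built string is a separate object from the equal literal.
    static boolean sameBeforeIntern() {
        String literal = "jvm-memory"; // literals are interned automatically
        String built = buildAtRuntime();
        return built == literal;
    }

    // intern() returns the canonical instance, which is the literal's object.
    static boolean sameAfterIntern() {
        String literal = "jvm-memory";
        String built = buildAtRuntime();
        return built.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println("same object before intern: " + sameBeforeIntern());
        System.out.println("same object after intern:  " + sameAfterIntern());
    }
}
```

Because every interned string stays reachable through the cache, interning large numbers of distinct strings consumes memory in whichever area holds that cache: the permanent generation in old JDKs, the heap in current ones.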

Next, let’s take a look at what the OOM problem is and in which memory areas it can occur. First of all, if OOM is explained in plain terms, it means that the JVM is running out of memory. According to the explanation in the javadoc, there is no free memory available and the garbage collector cannot provide more memory.

Implicit in this is the understanding that, usually before throwing an OutOfMemoryError, the garbage collector will be triggered to try its best to clean up space. For example:

  • In my analysis of the reference mechanism in Lesson 4 of the column, I mentioned that the JVM will attempt to reclaim objects pointed to by soft references, etc.

  • In the method java.nio.Bits.reserveMemory(), we can clearly see that System.gc() is called to try to free space when direct memory runs short. This call is a last resort that may avert certain memory shortage issues, which is why it is generally recommended not to add the following parameter when using a large number of NIO Direct Buffers, since it disables explicit System.gc() calls:

    -XX:+DisableExplicitGC
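Direct buffer usage can be watched from inside the process with the standard `BufferPoolMXBean`. This is a hedged sketch: the class name `DirectMemoryDemo` is hypothetical, and it assumes the JVM exposes a buffer pool named "direct", as OpenJDK-based JVMs do. If no such pool is found, the helper returns -1 instead of guessing.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

final class DirectMemoryDemo {
    // Bytes reported in use by the "direct" buffer pool, or -1 if no such pool is exposed.
    static long directPoolBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    // Allocates a direct buffer and reports pool usage while the buffer is still reachable.
    static long usedAfterAllocating(int bytes) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes); // native memory, outside the Java heap
        long used = directPoolBytesUsed();
        buffer.put(0, (byte) 1); // touch the buffer so it stays reachable until after the measurement
        return used;
    }

    public static void main(String[] args) {
        System.out.println("direct pool bytes in use: " + usedAfterAllocating(1 << 20));
    }
}
```

The native memory behind a direct buffer is released only after the small on-heap buffer object is collected, which is why a garbage collection can free direct memory at all.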

Of course, the garbage collector is not triggered in all cases. For example, if we allocate a very large object, such as a large array that exceeds the maximum heap size, the JVM can determine that garbage collection cannot solve this problem and will directly throw an OutOfMemoryError.
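This case can be reproduced safely by requesting an array that needs more bytes than the maximum heap and catching the error. The sketch below is illustrative only: `OversizedArrayDemo` is a hypothetical name, and the attempt is skipped on JVMs whose heap is so large that no single `long[]` could exceed it.

```java
final class OversizedArrayDemo {
    // Tries to allocate a long[] that needs more bytes than the maximum heap.
    // Returns the resulting error text, or a note when the attempt is skipped.
    static String tryOversizedAllocation() {
        long maxHeap = Runtime.getRuntime().maxMemory();
        long elements = maxHeap / Long.BYTES + 1; // 8 bytes per element, so the array exceeds maxHeap
        if (elements > Integer.MAX_VALUE - 8) {
            return "skipped: heap is too large for a single array to exceed it";
        }
        try {
            long[] tooBig = new long[(int) elements];
            return "unexpectedly allocated " + tooBig.length + " elements";
        } catch (OutOfMemoryError e) {
            // Only one failed allocation happened, so the heap is still usable here.
            return "OutOfMemoryError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryOversizedAllocation());
    }
}
```

Catching `OutOfMemoryError` is acceptable in a demonstration like this; in production code it is rarely a sound recovery strategy.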

From the perspective of the data areas I analyzed earlier, except for the program counter, any of the other areas may experience an OutOfMemoryError due to potential space shortage. In summary:

  • Insufficient heap memory is one of the most common causes of OOM, and the error message thrown is “java.lang.OutOfMemoryError: Java heap space”. The reasons can be diverse, such as the presence of memory leaks, or the heap size is not set reasonably. For example, we need to process a considerable amount of data, but the JVM heap size is not explicitly specified or the specified value is too small. Another possibility is that the JVM does not handle references in a timely manner, resulting in accumulation and the inability to release memory.
  • As for the Java virtual machine stack and native method stack, things are a bit more complicated here. If we write a program that continuously performs recursive calls without an exit condition, this will cause constant stacking. In similar cases, the JVM will actually throw a StackOverflowError; of course, if the JVM fails to expand the stack space when attempting to do so, an OutOfMemoryError will be thrown.
  • For older versions of Oracle JDK, because the size of the permanent generation is limited and the JVM is not very aggressive in garbage collecting the permanent generation (such as constant pool reclamation and unloading of unnecessary types), it is common to encounter OutOfMemoryError when continuously adding new types, especially in scenarios with a large number of dynamically generated types. Similarly, excessive space consumption in the intern string cache can also lead to OOM. The corresponding exception message will indicate its association with the permanent generation: “java.lang.OutOfMemoryError: PermGen space”.
  • With the introduction of Metaspace, the memory pressure in the method area has been alleviated to some extent, so the way OOM occurs has changed. When it does occur, the exception message becomes: “java.lang.OutOfMemoryError: Metaspace”.
  • Insufficient direct memory can also lead to OOM. This has been discussed in Lesson 11 of the column.
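The stack case from the list above is the easiest one to reproduce. The following minimal sketch (hypothetical class name) recurses without an exit condition and catches the resulting `StackOverflowError`. The depth reached depends on the thread's stack size and on the size of each frame, so no particular number should be expected.

```java
final class StackOverflowDemo {
    private static int depth;

    // Recurses with no exit condition; each call pushes one more stack frame.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Runs the unbounded recursion and returns the depth reached when the stack ran out.
    static int depthAtOverflow() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack has unwound back to this frame, so execution can continue normally.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("recursion depth at StackOverflowError: " + depthAtOverflow());
    }
}
```

Note that this produces a StackOverflowError rather than an OutOfMemoryError: the thread hit its permitted stack depth, whereas the OOM variant arises only when memory for a stack cannot be obtained at all.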

Today is the first lesson on JVM memory, which serves as a warm-up introduction. I explained the main memory areas and how they have changed across versions of the HotSpot JVM, and analyzed whether each area can cause an OutOfMemoryError, along with typical situations in which one occurs.

Exercise

Have you grasped the topic we discussed today? Today’s thought question is: when I try to allocate a 100 MB byte array, I get an OutOfMemoryError, but the GC logs show that there is clearly more than 100 MB of space available on the heap. What do you think could be the cause of the problem? What additional information do we need to clarify this issue?

Please write your thoughts on this problem in the comments section. I will select the well-thought-out comments and send you a learning reward voucher. Feel free to discuss with me.
