56 Let's Talk About the Java Memory Model

In this lecture, we will primarily discuss what the Java Memory Model is.

With this lecture, we begin our study of the Java Memory Model. If you are interested in the underlying principles of Java concurrency, knowledge of the Java Memory Model is crucial. It marks the turning point between merely using concurrency tools and understanding why they work.

Confusing Concepts: JVM Memory Structure VS Java Memory Model #

Java has many concepts with similar-sounding names, such as the JVM memory structure and the Java Memory Model. These are two completely different concepts that are often confused: many articles online claim to discuss the Java Memory Model when they are in fact describing the JVM memory structure.

So let’s first summarize what each of these two concepts is about:

The JVM memory structure is related to the runtime areas of the Java Virtual Machine.
The Java memory model is related to concurrent programming in Java.

As we can see, these two concepts are quite distinct. Now let’s briefly introduce the JVM memory structure.

JVM Memory Structure #

We all know that Java code runs on a virtual machine. While a Java program executes, the memory managed by the JVM is divided into several regions, each with its own purpose. According to the “Java Virtual Machine Specification (Java SE 8)”, the JVM run-time data areas can be divided into the following 6 regions:

Heap Area: The heap is where class instances and arrays are stored, and it is usually the largest area in memory. Instances are easy to understand: for example, new Object() creates an instance. Arrays are also stored on the heap because in Java, arrays are objects too.

Java Virtual Machine Stacks: Each thread has its own stack. It stores frames, which hold local variables and partial results, and it plays a role in method invocation and return.

Method Area: It stores the structure of each class, such as runtime constant pool, field and method data, as well as the code for methods and constructors, including special methods used for class and interface initialization.

Native Method Stacks: Similar to the Java Virtual Machine Stacks, but it serves native methods instead of Java methods.

The PC Register: It is the smallest memory area. Each thread has its own pc register, which holds the address of the JVM instruction currently being executed when the thread is running a Java method.

Run-Time Constant Pool: It is part of the method area and contains various constants, ranging from numeric literals known at compile time to method and field references that must be resolved at runtime.

It is important to note that the above description is based on the Java Virtual Machine specification. Individual JVM implementations may differ in the details, but they generally adhere to the specification.

To summarize, the JVM memory structure is defined by the Java Virtual Machine specification and describes the different data areas managed by the JVM during the execution of a Java program, each with its specific functionality. For the official description, see section 2.5, “Run-Time Data Areas”, of the Java Virtual Machine Specification.
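As a rough illustration of the regions described above, the following sketch annotates where each piece of data would typically live. The class name MemoryAreasDemo and its members are hypothetical and exist only for this example; the comments describe the specification's view, and a real JVM may optimize differently.

```java
// Hypothetical class used only to illustrate where data typically lives at run time.
class MemoryAreasDemo {
    // The class structure (fields, method code) belongs to the method area.
    // The int[] that SHARED refers to is an object, so it lives on the heap.
    static final int[] SHARED = new int[4];

    // Every invocation gets its own frame on the calling thread's JVM stack.
    static int sumTo(int n) {
        int total = 0;                  // local variable: stored in the frame
        for (int i = 1; i <= n; i++) {  // i is a frame-local variable as well
            total += i;
        }
        return total;                   // the frame is discarded on return
    }

    public static void main(String[] args) {
        SHARED[0] = sumTo(10);          // writes into the heap array
        System.out.println(SHARED[0]);  // prints 55
    }
}
```

Because SHARED lives on the heap, every thread can reach it, whereas total and i are private to the thread that called sumTo.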

From Java Code to CPU Instructions #

After understanding the JVM memory structure, let’s go back to the Java Memory Model. We all know that the Java code we write needs to be translated into CPU instructions before it can be executed. To understand the role of the Java Memory Model, let’s review the general process from Java code to the final execution of CPU instructions:

Initially, the Java code we write is in the form of *.java files;
After compilation (including lexical analysis, semantic analysis, etc.), a Java bytecode file (*.class file) is generated from the original *.java file;
The JVM loads the generated bytecode file (*.class) and converts it, by interpretation or just-in-time compilation, into machine instructions that depend on the platform and other factors;
Machine instructions can then be directly executed on the CPU, which is the final step of program execution.
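The steps above can be observed with a small class. This is a minimal sketch: the class name Counter is hypothetical, and the commands in the comments assume the standard JDK tools javac and javap are on the path.

```java
// Hypothetical example class. Save as Counter.java, then run:
//   javac Counter.java     (produces Counter.class)
//   javap -c Counter       (prints the bytecode of each method)
class Counter {
    private int count;

    // One line of Java source, but several bytecode instructions:
    // javap typically shows a getfield, an iconst_1, an iadd and a putfield,
    // that is, a separate read, add and write.
    void increment() {
        count++;
    }

    int get() {
        return count;
    }
}
```

The exact machine instructions produced from this bytecode depend on the JVM and the hardware, which is the gap the Java Memory Model has to cover.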

Why do we need JMM (Java Memory Model) #

Early programming languages had no concept of a memory model.

As a result, the outcome of a program depended on the specific processor it ran on, and processors differ considerably in their rules. The same code might run correctly on processor A yet produce different results on processor B. Likewise, without the JMM, different JVM implementations could “translate” the same code differently.

Thus, Java needs a standard on which Java developers, compiler engineers, and JVM engineers can agree. With that agreement in place, we know exactly what runtime behavior a given piece of code will produce, which gives us predictable results in multithreaded execution. This standard is the JMM, and that is why we need it.

In this lesson, we go beyond the level of Java code and look at the principles and specifications that govern concurrency during the translation from Java code to CPU instructions. This is the main focus of the JMM. Without these specifications, the same Java code could produce completely different execution results, which is unacceptable and goes against the “write once, run anywhere” nature of Java.
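To make the problem concrete, here is a minimal sketch of one thread publishing a value to another. The class PublishDemo and its fields are hypothetical names chosen for this example. With ready declared volatile, the JMM guarantees that a reader which sees ready == true also sees data == 42; if volatile were removed, the JMM would allow the reader to spin forever or to observe a stale data value, and the actual behavior could vary between JVMs and processors.

```java
// Hypothetical example: publishing a value from one thread to another.
class PublishDemo {
    static int data;                // plain field, written before the flag
    static volatile boolean ready;  // volatile flag that publishes the write to data

    // Starts a reader thread, publishes 42, and returns what the reader saw.
    static int runOnce() {
        data = 0;
        ready = false;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {
                // busy-wait until the writer sets the volatile flag
            }
            seen[0] = data;         // guaranteed to read 42 because ready is volatile
        });
        reader.start();

        data = 42;                  // 1. write the data
        ready = true;               // 2. volatile write: makes step 1 visible to the reader

        try {
            reader.join();          // join() also makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```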

What is JMM #

With the above introduction, let’s now explain what JMM is.

JMM is a specification #

JMM is a set of specifications related to multithreading. Every JVM implementation must comply with the JMM, so that developers can rely on these rules when writing multithreaded programs. This ensures that the same program produces consistent results even when it runs on different virtual machines.

Without the JMM, the same code could behave differently after being “translated” by different JVMs, which would be a major problem.

The JMM is therefore tied to processors, caches, concurrency, and compilers. It addresses the unexpected results that can be caused by multi-level CPU caches, processor optimizations, and instruction reordering.

JMM is the principle behind utility classes and keywords #

Previously, we used various synchronization utilities and keywords, including volatile, synchronized, Lock, etc., and their principles are all related to JMM. It is the involvement and assistance of JMM that allows these synchronization utilities and keywords to work and help us develop concurrency-safe programs.

For example, when we use the synchronized keyword, the JVM generates suitable instructions according to the rules of the JMM. This includes restricting the order of instructions so that the necessary “visibility” is guaranteed even when reordering takes place. As a result, different JVMs produce predictable results for the same code. It is thanks to the JMM that we, as Java programmers, can write correct concurrent programs with these synchronization utilities and keywords.
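A minimal sketch of this guarantee, using a hypothetical class SynchronizedCounterDemo: two threads increment a shared counter through synchronized methods, and the final value is the same on every compliant JVM.

```java
// Hypothetical example: a counter protected by synchronized.
class SynchronizedCounterDemo {
    private int count;

    // Only one thread at a time can hold this object's monitor. Releasing the
    // monitor makes the update visible to the next thread that acquires it.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Runs two threads that each increment perThread times and returns the total.
    static int runTwoThreads(int perThread) {
        SynchronizedCounterDemo counter = new SynchronizedCounterDemo();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(10_000)); // prints 20000
    }
}
```

If synchronized were removed from increment(), updates could be lost and the printed total could fall below 20000.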

The three most important aspects of JMM are reordering, atomicity, and memory visibility. We will discuss these three aspects in detail later.
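As a small preview of the atomicity aspect, the sketch below uses java.util.concurrent.atomic.AtomicInteger, whose incrementAndGet() performs the read, add and write as a single atomic step. The class name AtomicCounterDemo and the method countWithAtomic are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: atomic increments without an explicit lock.
class AtomicCounterDemo {
    // Two threads each call incrementAndGet() perThread times.
    static int countWithAtomic(int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(10_000)); // prints 20000
    }
}
```

A plain int field incremented with ++ in the same way offers no such guarantee, because the increment is not a single indivisible operation.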

Summary #

That concludes this lesson. We first distinguished between two easily confused concepts, the JVM memory structure and the Java Memory Model, and briefly introduced the JVM memory structure. We then reviewed the path from Java code to CPU instructions, and finally explained why the JMM is needed and what the JMM is.