31 Common JVM Interview Questions Compilation: Victory in Strategies Often Lies Beyond a Thousand Miles #

The key points for interviews and written tests are actually similar. The focus is on fundamental knowledge and practical experience (of course, attitude and appearance are also important during interviews).

In a real interview, time is limited, so the interviewer cannot ask every question. Typically the interviewer picks a few topics from the resume and digs deeper into them to assess the candidate’s experience, attitude, and problem-solving approach. This reveals the candidate’s technical proficiency, depth and breadth of knowledge, and ability to reason through problems.

What are some common interview strategies?

  • What is XXX?
  • What is the implementation principle?
  • Why is it implemented this way?
  • What would you do if you were to implement it?
  • Analyze the advantages and disadvantages of your implementation.
  • What are some areas for improvement?

Below is a summary of some common interview questions for reference. You can assess yourself based on these questions.

  • 0 points: No knowledge of the topic.
  • 30 points: Some impression, familiar with some terms.
  • 60 points: Familiar with concepts and their meanings, understand the functionality and common uses.
  • 80 points: Able to supplement the provided answer.
  • 100 points: Able to find issues in the provided answer.

Now let’s take a look at some JVM-related interview questions.

1. What is JVM? #

JVM stands for Java Virtual Machine.

JVM is the underlying platform for running Java programs and, together with the Java API library, forms the execution environment for Java programs.

The term covers two things: the JVM specification and concrete JVM implementations. In simple terms, a Java Virtual Machine is a virtual computer that can execute standard Java bytecode.

1.1 What is the difference between JDK and JVM? #

In general, the JDK, JRE, and JVM are packaged together.

  • JDK = JRE + Development/Debugging/Profiling Tools
  • JRE = JVM + Java Standard Library

1.2 What JVM vendors are you familiar with? #

Some common JVM vendors include:

  • Oracle, which provides the HotSpot virtual machine and GraalVM, and ships two JDK distributions: OpenJDK and OracleJDK.
  • IBM, which uses the J9 virtual machine in its product suite.
  • Azul Systems, which provides high-performance Zing and open-source Zulu.
  • Alibaba, with Dragonwell as their custom version of OpenJDK.
  • Amazon, with Corretto OpenJDK.
  • Red Hat’s OpenJDK.
  • AdoptOpenJDK (now Eclipse Adoptium).
  • In addition, there are some open-source and experimental JVM implementations, such as Go.JVM.
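To check which JVM implementation and vendor a given environment actually runs, the standard system properties can be printed. The following is a minimal sketch; the class name JvmInfo and the method describe are made up for illustration, while the property keys are standard ones.

```java
// Prints which JVM implementation and vendor is running this program.
// JvmInfo and describe() are illustrative names, not part of any library.
class JvmInfo {
    // Builds a one-line summary from standard system properties.
    static String describe() {
        return String.format("%s (%s), Java %s, VM spec %s",
                System.getProperty("java.vm.name"),
                System.getProperty("java.vm.vendor"),
                System.getProperty("java.version"),
                System.getProperty("java.vm.specification.version"));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The same information is available from the command line with java -version.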

1.3 What is the difference between OracleJDK and OpenJDK? #

Both JDK builds comply with the Java Virtual Machine specification. The differences between them typically include:

  • Differences in the bundled tools, such as commercially licensed tools like jmc.
  • Differences in certain protocols or configurations, such as encryption algorithms subject to US export restrictions.
  • Other minor differences, such as certain private APIs in the JRE.

1.4 Which version of JDK do you use for development? What about in production? Why did you choose that? #

To be honest, the choice of version depends on the specific situation of the development team. Factors to consider include the machine’s operating system, the team members’ familiarity, and the need to support legacy projects, etc.

At the time of writing, the most popular long-term support (LTS) versions of Java are Java 8 and Java 11.

  • Java 8 is a classic LTS version, with excellent performance and stability, and good support for various CPU architectures and operating systems.
  • Java 11 is a newer LTS version that performs better, supports more new features, and has become stable after several years of maintenance.

Some companies use OracleJDK in the development environment and OpenJDK in the production environment, while others use the opposite approach. Some companies use the same packaged version. As long as testing has been performed during development and deployment, it should not be a problem. Generally, the JDK configuration in testing and pre-production environments should be consistent with the production environment.

2. What is Java bytecode? #

Java bytecode is the intermediate code produced by compiling Java source code. It is stored in .class files, commonly called bytecode files.

2.1 What information is included in bytecode files? #

Bytecode files generally contain the following parts:

  • Version number information
  • Static constant pool (symbolic constants)
  • Class-related information
  • Field-related information
  • Method-related information
  • Debugging-related information

It can be said that most of the information is represented through symbolic constants in the constant pool.
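The first few fields of a class file can be read directly, which makes the list above concrete. The sketch below reads the magic number, the version numbers, and the constant pool count of java.lang.String. The class name ClassFileHeader and the method read are illustrative names, and the code assumes the class file is reachable as a resource, which is normally the case for JDK classes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Reads the fixed-size header at the start of a class file.
// ClassFileHeader and read() are illustrative names.
class ClassFileHeader {
    // Returns {magic, minorVersion, majorVersion, constantPoolCount}.
    static int[] read(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();                        // u4, always 0xCAFEBABE
            int minor = data.readUnsignedShort();              // u2 minor_version
            int major = data.readUnsignedShort();              // u2 major_version (52 = Java 8, 55 = Java 11)
            int constantPoolCount = data.readUnsignedShort();  // u2 constant_pool_count
            return new int[] {magic, minor, major, constantPoolCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = read(String.class);
        System.out.printf("magic=0x%08X major=%d minor=%d constant_pool_count=%d%n",
                header[0], header[2], header[1], header[3]);
    }
}
```

For a full listing of the constant pool, fields, and methods, the JDK tool javap -v is the usual choice.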

2.2 What are constants? #

Constants are values that do not change. For example, the letter ‘K’ always encodes to the same bytes in UTF-8, and the number 1024 always has the same binary form. Similarly, a string literal in Java, such as “KK”, has a fixed binary representation.

Note that in Java, variables and fields marked with the final keyword are constant variables: they can be assigned only once and cannot be changed afterwards. This is enforced by the compiler and the execution engine.

2.3 How do you understand the constant pool? #

In Java, the constant pool has two meanings:

  • Static constant pool: a part of the class file that contains various symbolic constants related to the class.
  • Runtime constant pool: its content is mainly derived from the static constant pool, but can also be added by the program.
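String literals are the easiest way to observe this behaviour. The sketch below shows that identical literals share one instance and that String.intern() lets the program add a string built at runtime to the JVM's string pool. The string pool is closely related to, but not the same thing as, the runtime constant pool, so treat this as an illustration only. The class and method names are made up for the example.

```java
// Demonstrates literal sharing and String.intern().
// StringPoolDemo and its method names are illustrative.
class StringPoolDemo {
    // Two identical literals resolve to the same String instance.
    static boolean literalsShareOneInstance() {
        String a = "KK";
        String b = "KK";
        return a == b;
    }

    // new String(...) always creates a separate object on the heap.
    static boolean newStringIsDistinct() {
        String literal = "KK";
        String copy = new String("KK");
        return literal != copy;
    }

    // intern() returns the pooled instance for a string built at runtime.
    static boolean internReturnsPooledInstance() {
        String literal = "KK";
        String built = new StringBuilder("K").append("K").toString();
        return built.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneInstance());    // true
        System.out.println(newStringIsDistinct());         // true
        System.out.println(internReturnsPooledInstance()); // true
    }
}
```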

3. What are the runtime data areas of JVM? #

According to the JVM specification, the standard runtime data areas of JVM include the following parts:

  • Program Counter Register
  • Java Virtual Machine Stack
  • Heap
  • Method Area
  • Runtime Constant Pool
  • Native Method Stack

Specific JVM implementations can optimize or merge these areas according to actual conditions, as long as they meet the requirements of the specification.

3.1 What is the Heap? #

The Heap is a memory region allocated by program code, distinct from the Stack.

In Java, the Heap is mainly used to allocate storage space for objects, and once an object reference is obtained, the Heap memory can be accessed by all threads.

3.2 What parts are included in the Heap? #

Taking HotSpot as an example, the Heap is mainly allocated and managed by the Garbage Collection (GC) module, and can be divided into the following parts:

  • Eden Space
  • Survivor Space
  • Tenured Generation (Old Generation)

Among them, the Eden Space and the Survivor Space together are commonly referred to as the Young Generation.

3.3 What is the Non-Heap Memory? #

In addition to the Heap memory, the JVM memory pool also includes Non-Heap (NON_HEAP) memory, which corresponds to parts such as the Method Area and the Constant Pool specified in the JVM specification:

  • Metaspace
  • Code Cache
  • Compressed Class Space
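The heap and non-heap pools of a running JVM can be listed through the standard java.lang.management API. The following sketch groups the pool names by type. The class name MemoryPools and the method poolNames are illustrative, and the actual pool names depend on the JVM and the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Lists the memory pools of the current JVM, grouped by HEAP / NON_HEAP.
// MemoryPools and poolNames() are illustrative names.
class MemoryPools {
    // Returns the names of all pools of the given type.
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("HEAP:     " + poolNames(MemoryType.HEAP));
        System.out.println("NON_HEAP: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

On a typical HotSpot JVM the non-heap list contains entries such as Metaspace and Compressed Class Space, plus one or more code cache pools.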

4. What is Out of Memory (OOM)? #

Out of Memory (OOM) means that the available memory is insufficient.

When the memory a program needs exceeds the maximum available, letting it continue unchecked would affect other processes. Systems therefore fail fast and report an error as soon as the limit is exceeded; in Java this appears as java.lang.OutOfMemoryError.

It is like a cup with a capacity of only 500ml: pour in 600ml and the water overflows and causes damage.

4.1 What is Memory Leak? #

A memory leak is the situation where an object that is no longer needed continues to occupy memory because it is not released at the appropriate time.

In other words, memory that is no longer used but never released has leaked: objects that should be freed are not freed, and objects that should be collected are not collected.

A typical scenario is that when each request arrives or each operation is processed, memory is allocated, but a part of it cannot be recycled (or released). As more and more requests are processed, memory leaks become more and more serious.

In Java, memory leak generally refers to the situation where unused objects cannot be cleaned up by GC due to incorrect reference relationships.
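The per-request scenario described above can be sketched in a few lines. In this deliberately buggy example, a static list keeps a reference to every request buffer, so the GC can never reclaim them. The class LeakyRequestHandler and its methods are hypothetical and exist only to illustrate the pattern.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical request handler with a deliberate leak, for illustration only.
class LeakyRequestHandler {
    // Static collection: everything added here stays reachable for the life of the class.
    private static final List<byte[]> HISTORY = new ArrayList<>();

    // Handles one request; the buffer should become garbage when the method returns.
    static void handleRequest() {
        byte[] buffer = new byte[1024];
        HISTORY.add(buffer); // the leak: a forgotten reference keeps the buffer alive
    }

    // Bytes that GC cannot reclaim because HISTORY still references them.
    static long retainedBytes() {
        long total = 0;
        for (byte[] buffer : HISTORY) {
            total += buffer.length;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            handleRequest();
        }
        System.out.println("retained bytes: " + retainedBytes());
    }
}
```

The retained size grows linearly with the number of requests. With enough traffic, a pattern like this ends in java.lang.OutOfMemoryError.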

4.2 What is the relationship between the two? #

A serious memory leak will inevitably lead to an out-of-memory error over time.

Memory leaks usually stem from resource-management problems and program bugs, while running out of memory is the end result, caused either by genuinely insufficient memory or by leaks.

5. Given a specific class, analyze the memory usage of objects #

public class MyOrder{
  private long orderId;
  private long userId;
  private byte state;
  private long createMillis;
}

On a typical 64-bit JVM with pointer compression enabled (the default), each object of the MyOrder class occupies 40 bytes.

5.1 How is this calculated? #

The calculation is as follows:

  • The object header occupies 12 bytes.
  • Each long type field occupies 8 bytes, and 3 long fields occupy 24 bytes.
  • The byte field occupies 1 byte.
  • The total is 37 bytes. With an alignment of 8 bytes, the actual occupation is 40 bytes.
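The arithmetic above can be written down as a small helper. This sketch only sums the header and field sizes and rounds up to the 8-byte alignment. It is an estimate under the stated assumptions, not a measurement of a real object, and the class and method names are made up for illustration.

```java
// Estimates object size from header size and field counts, rounded up to 8 bytes.
// ObjectSizeEstimate, alignUp() and estimate() are illustrative names.
class ObjectSizeEstimate {
    static final int ALIGNMENT = 8; // assumed object alignment in bytes

    // Rounds size up to the next multiple of ALIGNMENT.
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // headerBytes: 12 with compressed pointers, 16 without.
    static long estimate(int headerBytes, int longFields, int byteFields) {
        long raw = headerBytes + 8L * longFields + byteFields;
        return alignUp(raw);
    }

    public static void main(String[] args) {
        // MyOrder: 12-byte header + 3 long fields + 1 byte field = 37, aligned to 40
        System.out.println(estimate(12, 3, 1)); // prints 40
        // Same class without pointer compression: 16 + 24 + 1 = 41, aligned to 48
        System.out.println(estimate(16, 3, 1)); // prints 48
    }
}
```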

5.2 What parts are included in the object header? #

The object header generally consists of two parts:

  • Mark Word, occupying a machine word, which is 8 bytes on a 64-bit JVM.
  • Type Pointer, occupying a machine word, which is also 8 bytes. If the heap memory is less than 32GB, the JVM enables pointer compression by default, and the Type Pointer then occupies only 4 bytes.

Therefore, in the previous calculation, the object header occupies 12 bytes. If it is an array, the object header will also have an additional part:

  • Array Length, an int value, occupying 4 bytes.

6. What are the commonly used JVM startup parameters? #

By now (March 2020), the configurable parameters of JVM have reached over 1000, among which there are over 600 JVM parameters related to GC and memory configuration. However, in the vast majority of business scenarios, there are only about 10 commonly used JVM configuration parameters. For example:

# JVM startup parameters (in real use, write them on a single line without line breaks)
# Set heap memory
-Xmx4g -Xms4g 
# Specify the GC algorithm
-XX:+UseG1GC -XX:MaxGCPauseMillis=50 
# Specify the number of parallel GC threads
-XX:ParallelGCThreads=4 
# Print GC logs (JDK 8 and earlier syntax)
-XX:+PrintGCDetails -XX:+PrintGCDateStamps 
# Specify the GC log file
-Xloggc:gc.log 
# Specify the maximum size of the Metaspace
-XX:MaxMetaspaceSize=2g 
# Set the size of a single thread stack
-Xss1m 
# Automatically perform a dump when there is a heap memory overflow
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/usr/local/

In addition, there are some commonly used attribute configurations:

# Specify the default connect and read timeouts (in milliseconds)
-Dsun.net.client.defaultConnectTimeout=2000
-Dsun.net.client.defaultReadTimeout=2000
# Specify the time zone
-Duser.timezone=GMT+08 
# Set the default file encoding to UTF-8
-Dfile.encoding=UTF-8 
# Specify the entropy source for random numbers
-Djava.security.egd=file:/dev/./urandom 
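To verify which startup parameters and properties actually reached a running JVM, they can be read back from inside the process. The sketch below uses the standard RuntimeMXBean and Runtime APIs; the class name ShowJvmArgs and its method names are illustrative.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Prints the JVM arguments and a few properties the process was started with.
// ShowJvmArgs and its method names are illustrative.
class ShowJvmArgs {
    // JVM options such as -Xmx4g or -XX:+UseG1GC (not the program arguments).
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // Maximum heap the JVM will attempt to use, roughly corresponding to -Xmx.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("JVM arguments: " + jvmArguments());
        System.out.println("max heap (MB): " + maxHeapBytes() / (1024 * 1024));
        System.out.println("file.encoding: " + System.getProperty("file.encoding"));
        System.out.println("user.timezone: " + System.getProperty("user.timezone"));
    }
}
```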

6.1 What factors should be considered when setting the maximum heap size (-Xmx)? #

It needs to be determined based on the system’s configuration, leaving a certain amount of space for the operating system and the JVM itself. It is recommended to configure 70-80% of the available memory in the system or container.

6.2 Assuming the physical memory is 8G, what is the appropriate size for the heap memory setting? #

For example, if the system has 8G of physical memory, the system itself may use some of it, leaving about 7.5G available. It is recommended to configure “-Xmx6g”.

Note: 7.5G * 0.8 = 6G. If there are specific places in the system that use off-heap memory, this value needs to be further reduced.
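The rule of thumb can be expressed as a one-line calculation. The sketch below uses integer arithmetic in megabytes to avoid rounding surprises. The class name HeapSizing, the method suggestedHeapMb, and the 80% figure passed in are illustrative choices taken from the example above, not a fixed recommendation.

```java
// Rule-of-thumb heap sizing: a percentage of the memory left for the JVM.
// HeapSizing and suggestedHeapMb() are illustrative names.
class HeapSizing {
    // availableMb: memory available to the JVM; percent: share to give to the heap.
    static long suggestedHeapMb(long availableMb, int percent) {
        return availableMb * percent / 100;
    }

    public static void main(String[] args) {
        long availableMb = 7680; // 7.5G expressed in MB
        long heapMb = suggestedHeapMb(availableMb, 80);
        System.out.println("-Xmx" + heapMb + "m"); // prints -Xmx6144m, i.e. 6g
    }
}
```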

6.3 What is the relationship between the value set by -Xmx and the memory consumed by the JVM process? #

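-Xmx only limits the heap, so the memory consumed by the JVM process is larger than the -Xmx value:

Total JVM memory = Stack + Heap + Non-heap + Off-heap + Native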

6.4 How to enable GC logs? #

Typically, in JDK 8 and earlier versions, the following parameters are used to enable GC logs:

-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log

If it is in JDK 9 and later versions, the format is slightly different:

-Xlog:gc*=info:file=gc.log:time:filecount=0

6.5 Write the command that starts the Hello program with the G1 garbage collector #

java -XX:+UseG1GC \
  -Xms4g \
  -Xmx4g \
  -Xloggc:gc.log \
  -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps \
  Hello

7. What is the default garbage collector used in Java 8? #

The Hotspot JVM in Java 8 uses the parallel garbage collector (Parallel GC) by default. Other JDK 8 versions provided by vendors also generally use the parallel garbage collector by default.

7.1 What is the default garbage collector in Java 11? #

Starting from Java 9, the official JDK defaults to the G1 garbage collector, so Java 11 also uses G1 by default.

7.2 What are the common garbage collectors? #

Common garbage collectors include:

  • Serial garbage collector: -XX:+UseSerialGC
  • Parallel garbage collector: -XX:+UseParallelGC
  • CMS garbage collector: -XX:+UseConcMarkSweepGC
  • G1 garbage collector: -XX:+UseG1GC
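To confirm which collector a running JVM actually selected, the garbage collector MXBeans can be queried. This is a minimal sketch using the standard java.lang.management API. The class name ActiveCollectors is illustrative, and the reported names vary by JVM (on HotSpot, typically "G1 Young Generation" and "G1 Old Generation" for G1, or "PS Scavenge" and "PS MarkSweep" for the parallel collector).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors active in the current JVM.
// ActiveCollectors and collectorNames() are illustrative names.
class ActiveCollectors {
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time in milliseconds so far
            System.out.printf("%s: count=%d, time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running it once with -XX:+UseSerialGC and once with -XX:+UseG1GC shows how the flag changes the reported collectors.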

7.3 What is serial garbage collection? #

Serial garbage collection means that only a single worker thread is used to perform GC work.

7.4 What is parallel garbage collection? #

Parallel garbage collection refers to the use of multiple GC worker threads to perform garbage collection in parallel, fully utilizing the multi-core capabilities of the CPU and reducing the pause time for garbage collection.

Apart from the serial collector, the other garbage collectors, such as PS (Parallel Scavenge), CMS, and G1, use multiple threads to perform GC work in parallel.

7.5 What is concurrent garbage collection? #

Concurrent garbage collection refers to GC tasks that run at the same time as the application threads, while the application keeps running normally. Examples are the various concurrent phases of CMS and G1.

7.6 What is incremental garbage collection? #

Incremental garbage collection means collecting only part of the heap in each GC rather than the whole heap. G1 is a typical example: its heap is no longer simply divided into a young generation and an old generation, but into many (usually about 2048) smaller heap regions that can store objects.

Each region may serve as Eden, Survivor, or Old at different times.

With this division, G1 does not need to collect the entire heap every time. Instead it collects incrementally: each GC handles only a subset of the regions, called the collection set.

The next GC selects another set of regions to collect, based on the results of the previous GC. The advantage of incremental garbage collection is that it greatly reduces the pause time of each GC.

7.7 What is the young generation? #

The young generation is a concept in garbage collection algorithms. In comparison to the old generation, the young generation generally includes:

  • Eden region, where new objects are allocated.
  • Survivor region, which stores the objects that survive a young generation GC. There are usually two survivor regions, which are used alternately.

7.8 What is GC pause? #

During the GC process, certain operations can only run after all application threads have reached a safepoint and stopped. This pause is called a GC pause.

7.9 What is the difference between GC pause and STW pause? #

These two terms are generally considered to mean the same thing.

8. What would you do if CPU usage suddenly spikes? #

Without much experience of this kind of problem, I would start by using different tools to collect information about it, such as:

  • Collecting different metrics (CPU, memory, disk I/O, network, etc.).
  • Analyzing application logs.
  • Analyzing GC logs.
  • Capturing thread dumps and analyzing them.
  • Capturing heap dumps and analyzing them.
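A thread dump can also be captured from inside the application, which is handy when jstack is not available on the machine. The sketch below uses the standard Thread.getAllStackTraces() method. The class name InProcessThreadDump and the output format are illustrative and only loosely resemble jstack output.

```java
import java.util.Map;

// Builds a simple thread dump of the current JVM as text.
// InProcessThreadDump and dump() are illustrative names.
class InProcessThreadDump {
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            // One header line per thread: name, daemon flag and current state
            out.append('"').append(thread.getName()).append('"')
               .append(thread.isDaemon() ? " daemon" : "")
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

When CPU usage is high, taking several dumps a few seconds apart and comparing which threads stay RUNNABLE in the same frames usually narrows down the hot code path.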

8.1 What would you do if the system response becomes slow? #

In general, I would use APM monitoring tools to investigate issues with the application system itself. Sometimes I may also use tools like the Chrome browser to investigate external factors, such as network issues.

8.2 How is system performance typically measured? #

There are three quantifiable performance indicators:

  • System capacity, such as hardware configuration and design capacity.
  • Throughput, with TPS being the most direct indicator.
  • Response time, which includes server-side latency and network latency.

These indicators can be further expanded to include metrics such as single-machine concurrency, overall concurrency, data volume, number of users, budget costs, etc.
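Throughput and response time are easy to compute from raw measurements. The sketch below derives requests per second and a nearest-rank percentile from a list of latencies. The class name PerfMetrics, its methods, and the sample numbers are made up for illustration.

```java
import java.util.Arrays;

// Computes throughput and latency percentiles from raw samples.
// PerfMetrics and its method names are illustrative.
class PerfMetrics {
    // Requests (or transactions) completed per second.
    static double throughputPerSecond(int requests, long elapsedMillis) {
        return requests * 1000.0 / elapsedMillis;
    }

    // Nearest-rank percentile: the smallest sample such that at least
    // `percent` percent of all samples are less than or equal to it.
    static long percentile(long[] latenciesMillis, int percent) {
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        int rank = (percent * sorted.length + 99) / 100; // ceil(percent * n / 100)
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] latencies = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
        System.out.println(throughputPerSecond(500, 2000)); // prints 250.0
        System.out.println(percentile(latencies, 50));      // prints 50
        System.out.println(percentile(latencies, 90));      // prints 90
    }
}
```

Percentiles such as P90 or P99 are usually more informative than the average, because a few slow requests can hide behind a good mean.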

The following questions are about troubleshooting tools. Answer them based on your actual experience, for example with Linux commands or tools provided by the JDK.

9.1 What command is used to view the JVM process ID? #

You can use commands like ps -ef and jps -v, etc.

9.2 How do you check the remaining memory? #

For example, you can use commands like free -m, free -h, top, etc.

9.3 What tool is used to view thread stacks? #

Generally, you can use the jps command first to find the process ID, followed by jstack -l with that process ID.

9.4 What tool is used to obtain heap dumps? #

Usually, the jmap tool is used to obtain a heap memory snapshot.

9.5 What precautions should be taken when performing a memory dump? #

Taking a memory snapshot may pause or block the system for a while; the larger the heap, the longer the pause.

Also note that specifying the live parameter with jmap triggers a Full GC.

9.6 What are the typical parameters for taking a heap dump with jmap? #

Example:

jmap -dump:format=b,file=3826.hprof 3826
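On HotSpot-based JVMs, a heap dump can also be triggered from inside the process through the com.sun.management.HotSpotDiagnosticMXBean API. The sketch below is an illustration under that assumption and will not work on JVMs that do not provide this bean. The class name HeapDumper and the default file name heap.hprof are made up for the example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

// Triggers a heap dump of the current JVM (HotSpot-specific API).
// HeapDumper and dump() are illustrative names.
class HeapDumper {
    // file: target path, which must not exist yet; newer JDKs also require a .hprof suffix.
    // live: if true, only reachable objects are dumped, comparable to jmap -dump:live.
    static void dump(String file, boolean live) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(file, live);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String file = args.length > 0 ? args[0] : "heap.hprof";
        dump(file, true);
        System.out.println("heap dump written to " + file);
    }
}
```

As with jmap, dumping a large heap pauses the application, so this is best done on an instance that has been taken out of traffic.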

9.7 Why does the dump file have a .hprof extension? #

The JVM historically shipped with a built-in profiler called HPROF, which originally defined the format of heap dump files.

9.8 What tool is used to analyze heap dumps after they are generated? #

Commonly used tools are Eclipse MAT and jhat (note that jhat was removed from the JDK in Java 9).

9.9 What do you typically do if you forget what parameter to use? #

Searching the internet is a common and easy approach.

In addition, various JDK tools support the -h option to view help information. If you are familiar with these tools, even if you forget, it is easy to perform operations based on the prompts.

Interviewers also like to ask about real JVM problems you have run into, for example GC issues, memory leaks, or other difficult problems. There may also be follow-up questions, such as:

  • What is the most memorable JVM problem you have encountered?
  • How did you analyze and solve the problem?
  • What experiences are worth sharing from this process?

This question is open-ended, so please answer based on your own experience. Feel free to share your thoughts in the WeChat group for this column, and we will analyze and provide feedback on each answer.