Android JVM TI Mechanics Detailed Explanation Including Benefits and Easter Eggs #

Hello, I am Sun Pengfei.

In the stutter optimization articles of the “Performance Optimization” column, the author mentioned that the JVM TI mechanism can be used to obtain a wealth of information about an application’s performance, including memory allocation, thread creation, class loading, and GC information.

What exactly is the JVM TI mechanism? Why is it so powerful? How can we apply it to our work? Today, let’s lift the veil on it together.

Introduction to JVM TI #

JVM TI, short for Java Virtual Machine Tool Interface, is a programming interface used for developing virtual machine monitoring tools. It allows for monitoring the execution of internal JVM events and controlling certain behaviors of the JVM. It can be used to implement debugging, monitoring, thread analysis, code coverage analysis tools, and more.

JVM TI is a part of the Java Platform Debugger Architecture (JPDA), and it can be considered a backend in the Debugger Architecture, interacting with JDWP and the front-end JDI. It’s worth noting that JDWP in Android is not developed based on JVM TI.

Since Java SE 5, JVM TI has replaced the earlier JVMPI and JVMDI in the Java platform debugging system. If you are not familiar with this background, I strongly recommend first reading the “In-depth Java Debugging System” articles listed under Relevant Resources.

Although Java has had JVM TI for many years, Android only integrated it in Android 8.0, implementing JVM TI v1.2. The integration was mainly done to support modifying Dex in memory and monitoring global events in the Runtime. With JVM TI support, we can implement functions that existing debugging tools lack, or build our own debug tools to obtain the data we care about.

Several tools already use JVM TI, such as Android Studio’s Profiler and LinkedIn’s dexmaker-mockito-inline. Android Studio uses the JVM TI mechanism to implement real-time memory monitoring, object allocation slicing, GC events, Memory Alloc Diff, and other powerful features. Dexmaker uses it to mock final methods and static methods.

1. Features Supported by JVM TI

Before diving into the implementation details of JVM TI, let’s take a look at the features provided by JVM TI and what we can do with them.

Thread-related Events - Monitoring thread creation and stack traces, lock information

ThreadStart: Generated by a new thread before its initial method executes.
ThreadEnd: Generated when a thread finishes executing.
MonitorWait: Generated when a thread is about to wait on an object.
MonitorWaited: Generated when a thread finishes waiting on an object.
MonitorContendedEnter: Generated when a thread attempts to acquire an object lock already held by another thread.
MonitorContendedEntered: Generated when a thread acquires an object lock and continues execution.

Class Loading and Preparation Events - Monitoring class loading

ClassFileLoadHook: Triggered before a class is loaded.
ClassLoad: Generated when a class is loaded for the first time.
ClassPrepare: Generated when the preparation phase of a class is completed.

Exception Events - Monitoring exception information

Exception: Generated when an exception is thrown.
ExceptionCatch: Generated when an exception is caught.

Debugging-related

SingleStep: Step event that allows for fine-grained bytecode execution sequence, enabling exploration of bytecode execution sequences in multi-threaded scenarios.
Breakpoint: Generated when a thread reaches a breakpoint. Breakpoints can be set using the JVMTI SetBreakpoint method.

Method Execution

FramePop: Generated when a method returns or exits through an exception, for a frame that was previously registered with the NotifyFramePop JVM TI function.
MethodEntry: Generated when a Java method starts executing.
MethodExit: Generated when a method completes execution, including when an exception is thrown.
FieldAccess: Generated when accessing a field with a watchpoint set. Watchpoints are set using the SetFieldAccessWatch function.
FieldModification: Generated when a field with a watchpoint set is modified. Watchpoints are set using the SetFieldModificationWatch function.

GC - Monitoring GC events and timing

GarbageCollectionStart: Generated when a garbage collection starts.
GarbageCollectionFinish: Generated when a garbage collection finishes.

Object Events - Monitoring memory allocation

ObjectFree: Generated when an object is freed by garbage collection.
VMObjectAlloc: Generated when a virtual machine allocates an object.

Others

NativeMethodBind: Generated when a native method is called for the first time or when JNI RegisterNatives is called. This event allows the binding to be redirected to a function chosen by the Agent.

Through the description of these events, we can roughly understand what features JVM TI supports. Detailed callback function parameters can be obtained from the JVM TI specification document. With these features, we can build customized performance monitoring tools, data collection tools, and behavior modifiers.

2. Implementation Principles of JVM TI

A JVM TI Agent needs support from the virtual machine in order to start, and it runs in the same process as the virtual machine. The virtual machine opens the Agent's shared library (.so) with dlopen and then calls its Agent_OnAttach function, which runs the initialization logic we define.

The principle of JVM TI is actually quite simple. Take the VMObjectAlloc event as an example: when we enable JVMTI_EVENT_VM_OBJECT_ALLOC by calling the SetEventNotificationMode function, the call eventually reaches art::Runtime::Current()->GetHeap()->SetAllocationListener(listener).

Here, listener is an implementation of art::gc::AllocationListener supplied by the virtual machine's JVM TI layer. The virtual machine invokes it whenever it allocates memory for an object; the source code can be found at heap-inl.h#194. Inside that listener, the callback we registered earlier is invoked in turn, which is how the event and its data reach our Agent.

Similar to atrace and StrictMode, each event in JVM TI requires instrumentation support in the source code. For those interested, you can select some events in the source code and further trace them.
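The listener hand-off described above can be modelled in a few lines of plain C++. This is only a simplified sketch, not ART's code: ModelHeap, ModelJvmti, and every method on them are hypothetical stand-ins chosen for illustration, and the real runtime tracks much more state (threads, event masks, multiple environments).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Simplified stand-in for the listener interface the heap talks to.
struct AllocationListener {
    virtual ~AllocationListener() = default;
    virtual void ObjectAllocated(std::size_t byte_count) = 0;
};

// Plays the role of the runtime heap: it only knows about one listener.
class ModelHeap {
public:
    void SetAllocationListener(AllocationListener* listener) { listener_ = listener; }
    void RemoveAllocationListener() { listener_ = nullptr; }

    // The instrumentation point: every allocation checks for a listener.
    void AllocObject(std::size_t byte_count) {
        if (listener_ != nullptr) {
            listener_->ObjectAllocated(byte_count);
        }
    }

private:
    AllocationListener* listener_ = nullptr;
};

// Plays the role of the JVM TI layer: it implements the listener and
// forwards each notification to the callback the agent registered.
class ModelJvmti : public AllocationListener {
public:
    using AgentCallback = std::function<void(std::size_t)>;

    explicit ModelJvmti(ModelHeap* heap) : heap_(heap) { assert(heap != nullptr); }

    // Analogous to SetEventCallbacks: remember the agent's function.
    void SetCallback(AgentCallback callback) { callback_ = std::move(callback); }

    // Analogous to SetEventNotificationMode: install or remove the listener.
    void SetEnabled(bool enabled) {
        if (enabled) {
            heap_->SetAllocationListener(this);
        } else {
            heap_->RemoveAllocationListener();
        }
    }

    void ObjectAllocated(std::size_t byte_count) override {
        if (callback_) {
            callback_(byte_count);
        }
    }

private:
    ModelHeap* heap_;
    AgentCallback callback_;
};
```

In this model the agent sees nothing until the event is enabled, and nothing again once it is disabled, which mirrors the instrumentation-point design: the heap pays only for a null check when no listener is installed.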

Development of JVM TI Agent #

A JVM TI Agent is written in C/C++, or in any other language that can expose C-callable functions, such as Rust.

The constants, functions, events, and data types involved in JVM TI are defined in the jvmti.h file. We need to download this file and use it in our project. You can download the header file from the Android project.

The output of the JVM TI Agent is an .so file. In Android, we can start a JVM TI Agent program using the system-provided Debug.attachJvmtiAgent method.

static fun attachJvmtiAgent(library: String, options: String?, classLoader: ClassLoader?): Unit

The library parameter is the absolute path of the .so file. Note that this method requires API level 28 or higher, and the application must have android:debuggable enabled. That said, JVM TI can still be started in a release build by forcing debug mode on.

After the virtual machine loads the JVM TI Agent on Android, it immediately calls the Agent_OnAttach function. This function can be regarded as the main function of the Agent program, so our program must implement it.

extern "C" JNIEXPORT jint JNICALL Agent_OnAttach(JavaVM *vm, char *options, void *reserved)

You can perform initialization operations in this method.

Use the JavaVM::GetEnv function to get the jvmtiEnv* environment pointer, which gives access to the functions provided by JVM TI.

jvmtiEnv *jvmti_env = nullptr;
jint result = vm->GetEnv((void **) &jvmti_env, JVMTI_VERSION_1_2);
if (result != JNI_OK) return JNI_ERR; // JVM TI 1.2 is not available in this runtime

Use the AddCapabilities function to enable the required capabilities. You can also enable all capabilities using the following method, but enabling all capabilities will affect the performance of the virtual machine.

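void SetAllCapabilities(jvmtiEnv *jvmti) {
    jvmtiCapabilities caps;
    // Ask which capabilities this environment could potentially hold.
    jvmtiError error = jvmti->GetPotentialCapabilities(&caps);
    if (error != JVMTI_ERROR_NONE) return;
    // Request all of them at once; check the result instead of discarding it.
    error = jvmti->AddCapabilities(&caps);
    if (error != JVMTI_ERROR_NONE) LOGI("AddCapabilities failed: %d", error);
}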

The GetPotentialCapabilities function can retrieve the set of capabilities supported by the current environment, which is returned through the jvmtiCapabilities structure. This structure indicates all the supported capabilities and can be viewed through jvmti.h, with the following approximate content.

typedef struct {
    unsigned int can_tag_objects : 1;
    unsigned int can_generate_field_modification_events : 1;
    unsigned int can_generate_field_access_events : 1;
    unsigned int can_get_bytecodes : 1;
    unsigned int can_get_synthetic_attribute : 1;
    unsigned int can_get_owned_monitor_info : 1;
    ......
} jvmtiCapabilities;

To enable only specific capabilities instead, zero the structure, set the fields you need, and pass it to AddCapabilities.

jvmtiCapabilities caps;
memset(&caps, 0, sizeof(caps));
caps.can_tag_objects = 1;
jvmtiError error = jvmti_env->AddCapabilities(&caps);

At this point, the initialization operation of JVM TI is completed.
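When requesting capabilities one by one, it can be useful to confirm first that each requested capability appears in the set reported by GetPotentialCapabilities. The helper below is a minimal sketch of that check under some assumptions: IsCapabilitySubset and DemoCapabilities are hypothetical names, DemoCapabilities only mirrors the first few fields of jvmtiCapabilities so that the example compiles without jvmti.h, and both structures are assumed to have been zeroed with memset before any field was set.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Returns true when every bit set in `wanted` is also set in `potential`.
// It compares raw bytes, so it works for any trivially copyable bitfield
// struct; in a real agent Caps would be jvmtiCapabilities.
template <typename Caps>
bool IsCapabilitySubset(const Caps& wanted, const Caps& potential) {
    static_assert(std::is_trivially_copyable<Caps>::value,
                  "capability struct must be trivially copyable");
    unsigned char w[sizeof(Caps)];
    unsigned char p[sizeof(Caps)];
    std::memcpy(w, &wanted, sizeof(Caps));
    std::memcpy(p, &potential, sizeof(Caps));
    for (std::size_t i = 0; i < sizeof(Caps); ++i) {
        // A bit requested but not potentially available fails the check.
        if ((w[i] & ~p[i]) != 0) {
            return false;
        }
    }
    return true;
}

// Hypothetical stand-in mirroring the first few fields of jvmtiCapabilities.
struct DemoCapabilities {
    unsigned int can_tag_objects : 1;
    unsigned int can_generate_field_modification_events : 1;
    unsigned int can_generate_field_access_events : 1;
    unsigned int can_get_bytecodes : 1;
};
```

In an agent, the same template could be applied to the structure filled in by GetPotentialCapabilities and the structure about to be passed to AddCapabilities, so that an unavailable capability is detected before the request is made.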

Explanations of all the functions and data structure types can be found in the JVM TI official documentation (see Relevant Resources). Next, I will introduce some commonly used features and functions.

1. JVM TI Event Monitoring

One of the main features of JVM TI is that it can receive various event notifications during virtual machine execution.

First, use the SetEventCallbacks method to set the callback function for the target events. If callbacks is passed as nullptr, all callback functions are cleared.

jvmtiEventCallbacks callbacks;
memset(&callbacks, 0, sizeof(callbacks));

callbacks.GarbageCollectionStart = &GCStartCallback;
callbacks.GarbageCollectionFinish = &GCFinishCallback;
jvmtiError error = jvmti_env->SetEventCallbacks(&callbacks, sizeof(callbacks));

After setting the callback functions, enable the target events with SetEventNotificationMode. Pay attention to the event_thread parameter: if it is nullptr, the event is enabled globally; otherwise it is enabled only for the specified thread. This is useful when, for example, only events on the main thread are of interest.

jvmtiError SetEventNotificationMode(jvmtiEventMode mode,
          jvmtiEvent event_type,
          jthread event_thread,
           ...);
typedef enum {
    JVMTI_ENABLE = 1, // Enable
    JVMTI_DISABLE = 0 // Disable
} jvmtiEventMode;

Taking the GC event as an example, the callback functions were set above; to actually receive the events, they must be enabled with SetEventNotificationMode. Note that SetEventNotificationMode and SetEventCallbacks may be called in either order.

jvmti_env->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_GARBAGE_COLLECTION_START, nullptr);
jvmti_env->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_GARBAGE_COLLECTION_FINISH, nullptr);

With the above steps, the callbacks are invoked whenever the virtual machine generates GC events. Note that the GC callbacks run while the virtual machine is stopped, so they must not call JNI functions, and they must not call JVM TI functions other than those that explicitly allow such use.

void GCStartCallback(jvmtiEnv *jvmti) {
    LOGI("==========GCStart Triggered=======");
}

void GCFinishCallback(jvmtiEnv *jvmti) {
    LOGI("==========GCFinish Triggered=======");
}

The sample output is as follows.

com.dodola.jvmti I/jvmti: ==========GCStart Triggered=======
com.dodola.jvmti I/jvmti: ==========GCFinish Triggered=======
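Because the GC callbacks must avoid JNI and most JVM TI functions, one possible pattern is to record only timestamps and counters inside them and to read the totals later from an ordinary thread. The following is a sketch of that idea rather than the Sample's code: ReadGcStats, GcStats, and the global counters are hypothetical names, and the forward declaration of jvmtiEnv is a stand-in so the example compiles without jvmti.h (a real agent would include the header instead).

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Stand-in forward declaration; a real agent would include <jvmti.h>.
struct _jvmtiEnv;
typedef _jvmtiEnv jvmtiEnv;

static std::atomic<std::int64_t> g_gc_start_ns{0};
static std::atomic<std::int64_t> g_gc_total_ns{0};
static std::atomic<int> g_gc_count{0};

// Monotonic clock reading in nanoseconds; no JNI or JVM TI calls involved.
static std::int64_t NowNanos() {
    using namespace std::chrono;
    return duration_cast<nanoseconds>(
        steady_clock::now().time_since_epoch()).count();
}

// Records when the collection started.
void GCStartCallback(jvmtiEnv* /*jvmti*/) {
    g_gc_start_ns.store(NowNanos(), std::memory_order_relaxed);
}

// Adds the elapsed time since the matching start event to the running total.
void GCFinishCallback(jvmtiEnv* /*jvmti*/) {
    const std::int64_t start = g_gc_start_ns.load(std::memory_order_relaxed);
    g_gc_total_ns.fetch_add(NowNanos() - start, std::memory_order_relaxed);
    g_gc_count.fetch_add(1, std::memory_order_relaxed);
}

struct GcStats {
    int count;              // number of completed start/finish pairs
    std::int64_t total_ns;  // accumulated time between start and finish
};

// Intended to be called from an ordinary thread, outside the GC callbacks.
GcStats ReadGcStats() {
    return GcStats{g_gc_count.load(std::memory_order_relaxed),
                   g_gc_total_ns.load(std::memory_order_relaxed)};
}
```

Logging or uploading the numbers can then happen wherever ReadGcStats is called, keeping the callbacks themselves as short as possible.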

2. JVM TI Bytecode Enhancement

JVM TI can modify bytecode while the virtual machine is running. There are three ways to modify bytecode:

- Static: Modify the bytecode of the Class file before the virtual machine loads it. This method is generally not used.
- Load-Time: When the virtual machine loads a class, it triggers the ClassFileLoadHook callback, which receives the bytecode of that class. Because a class is loaded only once, and an Agent is often attached after the virtual machine has been running for some time, classes that are already loaded, such as Object, cannot be modified this way. Whether this method is suitable therefore depends on when the target class is loaded.
- Dynamic: Modify the bytecode of an already loaded class through JVM TI. Calling the RetransformClasses function triggers the ClassFileLoadHook callback, in which the bytecode can be modified. This method is the most practical.

Traditional JVMs operate on Java bytecode, while on Android the bytecode being manipulated is [Dalvik Bytecode](https://source.android.com/devices/tech/dalvik/dalvik-bytecode). Dalvik bytecode is register-based, which makes it somewhat easier to manipulate than Java bytecode because there is no interaction between local variables and an operand stack to handle.

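To use this feature, set the following fields of the jvmtiCapabilities structure (caps here) before calling AddCapabilities.

caps.can_generate_all_class_hook_events = 1; // Receive ClassFileLoadHook events for every class
caps.can_retransform_classes = 1; // Required before RetransformClasses can be called
caps.can_retransform_any_class = 1; // Allow RetransformClasses on any class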

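Then register the ClassFileLoadHook event callback.

jvmtiEventCallbacks callbacks;
memset(&callbacks, 0, sizeof(callbacks)); // Clear the callback slots we do not use
callbacks.ClassFileLoadHook = &ClassTransform;
jvmti_env->SetEventCallbacks(&callbacks, sizeof(callbacks));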

Here is the function prototype of the ClassFileLoadHook callback. How to modify the existing bytecode is explained later.

static void ClassTransform(
              jvmtiEnv *jvmti_env, // JVM TI environment pointer
              JNIEnv *env, // JNI environment pointer
              jclass classBeingRedefined, // The class being redefined or retransformed; nullptr on a first-time load
              jobject loader, // The class loader loading this class; nullptr means the BootClassLoader
              const char *name, // Name of the target class in VM internal form, e.g. "android/app/Activity"
              jobject protectionDomain, // Protection domain of the loaded class
              jint classDataLen, // Length of the current class data
              const unsigned char *classData, // Current class data
              jint *newClassDataLen, // Out: length of the new class data
              unsigned char **newClassData); // Out: the new class data

Then enable the event. The complete initialization logic can be found in the Sample code.

jvmti_env->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_CLASS_FILE_LOAD_HOOK, nullptr);

The following uses the Sample code to show how to insert a log call into the onCreate method of the Activity class.

After the steps above, the ClassFileLoadHook callback is invoked when the virtual machine loads a class for the first time and when RetransformClasses or RedefineClasses is called. Our target class is Activity, which is loaded when the application starts. Because the Sample enables the event late, it misses the callback for the initial load of Activity, so we need to call RetransformClasses to trigger the callback. This function modifies classes that have already been loaded. Its parameters are the number of classes and an array of the Class objects to modify.

jvmtiError RetransformClasses(jint class_count, const jclass* classes)

After this call, the callback registered for ClassFileLoadHook, that is, the ClassTransform function above, is invoked. In this callback we use a bytecode manipulation tool to modify the bytecode of the original class.
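Since ClassFileLoadHook is delivered for every class, the callback usually begins by checking whether the class is the one to rewrite. A minimal sketch of such a filter follows; IsTargetClass and kTargetClass are hypothetical names, and the check relies on the name being reported in VM internal form (with '/' separators) and possibly being null.

```cpp
#include <cstring>

// Hypothetical target; ClassFileLoadHook reports names in VM internal form,
// using '/' rather than '.' as the package separator.
static const char kTargetClass[] = "android/app/Activity";

// Returns true only for the class we want to rewrite. The name passed to
// the callback may be null, so that case is rejected explicitly.
bool IsTargetClass(const char* name) {
    return name != nullptr && std::strcmp(name, kTargetClass) == 0;
}
```

A callback built this way would return early for every other class and leave its new-class-data out-parameters untouched, so unrelated classes load unchanged.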

After a class is retransformed, new invocations use the new method bytecode, and the old bytecode is no longer invoked. A method that has active stack frames continues to run its old bytecode in those frames. RetransformClasses does not cause class initialization: initializers are not run again, and the values of static and instance variables do not change. However, all breakpoints in the target class are cleared.

Retransformation has limits. We can change method bodies, the constant pool, and attributes, but we cannot add, remove, or rename fields or methods, change method signatures or modifiers, or change the inheritance relationship of the class. Any of these causes the modification to fail. The modified class is then verified. Additionally, if the same class exists more than once in the virtual machine, make sure the Class object you obtain is the one actually in effect; under the ClassLoader loading mechanism, the class that was loaded first takes precedence.

The effect implemented in Sample is to add a line of log output in the onCreate method of Activity.

Before modification:

protected void onCreate(@Nullable Bundle savedInstanceState) {
......
}

After modification:

protected void onCreate(@Nullable Bundle savedInstanceState) {
      com.dodola.jvmtilib.JVMTIHelper.printEnter(this,"....");
...
}

The Dalvik bytecode modification library I used is dexter, which is provided in the Android source code. It is flexible but relatively cumbersome to use; another framework, dexmaker, can achieve the same result. dexter is written in C++ and, much like the ASM framework, reads class data directly and operates on it. The following is the core code; see the Sample for the complete version.

// Look up the types needed for the injected call (voidT is defined elsewhere in the Sample)
ir::Type* stringT = b.GetType("Ljava/lang/String;");
ir::Type* jvmtiHelperT = b.GetType("Lcom/dodola/jvmtilib/JVMTIHelper;");
// The first instruction of the method, used as the insertion point
lir::Instruction *fi = *(c.instructions.begin());
VReg* v0 = c.Alloc<VReg>(0);
// Load the method descriptor string into register v0
addInstr(c, fi, OP_CONST_STRING,
         {v0, c.Alloc<String>(methodDesc, methodDesc->orig_index)});
// Call the static method JVMTIHelper.printEnter, passing register 0
addCall(b, c, fi, OP_INVOKE_STATIC, jvmtiHelperT, "printEnter", voidT, {stringT}, {0});
// Assemble the modified instructions back into bytecode
c.Assemble();

The buffer that holds the modified class data must be allocated with the JVM TI Allocate function. Point new_class_data at that buffer and set new_class_data_len to its length. If the class is left unmodified, do not set new_class_data. If several JVM TI Agents have enabled this event, the new_class_data set by one Agent becomes the class_data passed to the next.
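The hand-off of the modified data can be sketched as a small helper. This is an illustration under stated assumptions, not the Sample's code: PublishNewClassData is a hypothetical name, and the allocator is passed in as a callable so that the example compiles without jvmti.h. In a real agent that callable would wrap the JVM TI Allocate function, and the length would be a jint rather than an int.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies `modified` into a buffer obtained from `allocate` and publishes it
// through the two out-parameters of the ClassFileLoadHook callback.
// `allocate` is a hypothetical callable taking a byte count and returning
// unsigned char* (nullptr on failure).
template <typename AllocFn>
bool PublishNewClassData(AllocFn allocate,
                         const std::vector<unsigned char>& modified,
                         int* new_class_data_len,
                         unsigned char** new_class_data) {
    assert(new_class_data_len != nullptr && new_class_data != nullptr);
    if (modified.empty()) {
        return false;  // nothing to publish: leave the out-parameters untouched
    }
    unsigned char* buffer = allocate(modified.size());
    if (buffer == nullptr) {
        return false;  // allocation failed: the class stays unmodified
    }
    std::memcpy(buffer, modified.data(), modified.size());
    *new_class_data = buffer;
    *new_class_data_len = static_cast<int>(modified.size());
    return true;
}
```

Leaving the out-parameters untouched on every failure path matches the rule above that an unmodified class should not set new_class_data.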

At this point, the generated onCreate method already contains the added log call. When a new Activity is started, it runs the new class bytecode, and the com.dodola.jvmtilib.JVMTIHelper class that we injected is loaded through Activity's ClassLoader. Activity is loaded by the BootClassLoader, but our class is obviously not visible to the BootClassLoader, so a crash occurs.

java.lang.NoClassDefFoundError: Class not found using the boot class loader; no stack trace available

Therefore, we need a way to make the JVMTIHelper class visible to the BootClassLoader. The AddToBootstrapClassLoaderSearch function provided by JVM TI adds a Dex or APK to the class search path. In this Sample, we add the path returned by getPackageCodePath.

Summary #

Today I mainly explained the concepts and principles of JVM TI and what it can do. Through JVM TI, much data that would otherwise require some “black magic” to obtain becomes accessible, such as Thread Park Start/Finish events and the waiters of a lock.

In the Android community, there are not many people who are familiar with JVM TI, and research on it is not very in-depth yet. Currently, JVM TI’s functionalities are already very powerful, and future versions of Android will further enhance its support for more functionalities, enabling it to do even more. I believe that in the future, it will become a powerful tool for local automation testing, and even remote diagnostics in production environments.

This installment's Sample provides some simple examples, and you can build on it to implement the functionality you need.

Relevant Resources #

In-depth Java Debugging System: Part 1, JPDA System Overview
In-depth Java Debugging System: Part 2, JVMTI and Agent Implementations
In-depth Java Debugging System: Part 3, JDWP Protocol and Implementations
In-depth Java Debugging System: Part 4, Java Debug Interface (JDI)
JVM TI Official Documentation: https://docs.oracle.com/javase/7/docs/platform/jvmti/jvmti.html
Source code is the best resource: http://androidxref.com/9.0.0_r3/xref/art/openjdkjvmti/

Surprise Treat #

According to what was agreed upon in the column digest, Shiwen and I will select some students who have earnestly submitted their homework and completed the exercises, and send them a “study encouragement gift pack”. Since the update of the column, many students have left their thoughts and summaries. We have selected @Owen, @Zhiwei, @Xushengming, @Xiaojie, and @SunnyBird to receive a “Geek Time Weekly Calendar”. We hope that more students can join in the study and discussion to progress together with us.

- @Owen’s study summary: https://github.com/devzhan/Breakpad

@Xushengming, @Xiaojie, and @SunnyBird have submitted exercise homework via Pull Requests at https://github.com/AndroidAdvanceWithGeektime/Chapter04/pulls.

The Geek Time assistant will contact the award-winning users within 24 hours, so pay attention to your messages~