
27 Three Methods of Compile-Time Instrumentation: AspectJ, ASM, ReDex #

A quick review of the previous lessons shows that we have already used compile-time instrumentation several times: in startup-time analysis, network monitoring, and power-consumption monitoring. So what exactly is compile-time instrumentation? As the name suggests, it means modifying existing code or generating new code during the compilation phase.

Recall the compilation process of Java code shown in the figure above, and think about where in that process instrumentation can be performed. Beyond the scenarios we have already used, what other common applications does compile-time instrumentation have? How can we make better use of it in practice, and which instrumentation methods are commonly available today? Let's work through these questions together.

Basics of compile-time instrumentation #

You may not have noticed, but modifying and generating code during compilation is actually very common. Whether it is APT (Annotation Processing Tool) based frameworks such as Dagger and ButterKnife, or the Kotlin compiler itself, they all rely on compile-time instrumentation techniques.

Let's look at some of the scenarios where compile-time instrumentation is used.

1. Application scenarios of compile-time instrumentation

Compile-time instrumentation is both interesting and valuable. Once we master it, we can accomplish tasks that are difficult or impossible with other techniques, shaping the code at will to meet the requirements of different scenarios.

  • Code generation. In addition to commonly used annotation generation frameworks such as Dagger and ButterKnife, Protocol Buffers and database ORM frameworks also generate code during the compilation process. Code generation isolates the complexity of internal implementation, making development simpler and more efficient, and also reducing manual repetition and the possibility of errors.

  • Code monitoring. Beyond network and power-consumption monitoring, we can use compile-time instrumentation to implement all kinds of performance monitoring. Why not implement monitoring directly in the source code? First, we may not have the source of third-party SDKs; second, some call sites are scattered across the entire codebase. For example, monitoring every new Thread() call in an app is hard to achieve by editing the source directly.

  • Code modification. Here we have unlimited room for creativity. For example, if a third-party SDK ships without source code, we can wrap one of its crashing methods in a try-catch, or replace the image library it uses. We can also implement automatic event tracking through code modification, as NetEase's HubbleData and 51 Credit Card's tracking practice do.

  • Code analysis. In a previous article I mentioned continuous integration; custom code checks can be implemented with compile-time instrumentation. For example, checking for new Thread() calls or for the use of sensitive permissions. In fact, third-party code checkers such as FindBugs also rely on compile-time instrumentation techniques.

“One thousand readers have one thousand Hamlets.” With compile-time instrumentation, you can let your imagination run and build things that improve your team's quality and efficiency.

From an implementation perspective, at which stage of the compilation process does instrumentation intervene? We can divide the approaches into two categories:

  • Java files. Code-generation scenarios such as APT and AndroidAnnotations produce Java source files and intervene at the very beginning of compilation.

  • Bytecode. Code monitoring, code modification, and code analysis generally operate on bytecode, either Java bytecode (".class") or Dalvik bytecode (".dex"), depending on the instrumentation method used.

Compared with the Java-file approach, the bytecode approach is more powerful and has a broader range of applications, but it is also more complex to use. So today I will focus on how to implement compile-time instrumentation through bytecode manipulation.

2. Bytecode

On the Java platform, the Java Virtual Machine runs Class files, which contain Java bytecode. On mobile platforms such as Android, the runtime executes Dex files instead. Google designed Dalvik bytecode specifically for such devices: although individual instructions became longer, the total number of instructions dropped, which improves execution speed.

So what are the differences between these two bytecode formats? Let’s start with a very simple Java class.

public class Sample {
    public void test() {
        System.out.print("I am a test sample!");
    }
}

You can generate and view the Java bytecode and Dalvik bytecode of this Sample.java class by using the following commands:

javac Sample.java   // Generates Sample.class, which is the Java bytecode
javap -v Sample     // Views the Java bytecode of the Sample class

// Generates Dalvik bytecode from the Java bytecode
dx --dex --output=Sample.dex Sample.class   

dexdump -d Sample.dex   // Views the Dalvik bytecode of the Sample.dex
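If you run these commands against the Sample class above, the two disassemblies of test() look roughly like the following (constant-pool indices and register numbers will vary with your toolchain):

```
// javap -v Sample  (Java bytecode, stack-based)
public void test();
  Code:
     0: getstatic     #2   // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #3   // String I am a test sample!
     5: invokevirtual #4   // Method java/io/PrintStream.print:(Ljava/lang/String;)V
     8: return

// dexdump -d Sample.dex  (Dalvik bytecode, register-based)
sget-object v0, Ljava/lang/System;.out:Ljava/io/PrintStream;
const-string v1, "I am a test sample!"
invoke-virtual {v0, v1}, Ljava/io/PrintStream;.print:(Ljava/lang/String;)V
return-void
```

Notice how the Java version shuffles values through an implicit operand stack, while the Dalvik version names its registers (v0, v1) explicitly.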

You can visually see the differences between Java bytecode and Dalvik bytecode.

Differences between Java bytecode and Dalvik bytecode

They have distinct formats and instructions. For an introduction to Java bytecode, you can refer to the JVM documentation. As for Dalvik bytecode, you can refer to the official Android documentation. The main differences between them are:

  • Architecture: The Java Virtual Machine (JVM) is stack-based, while the Dalvik virtual machine is register-based. On ARM platforms, a register-based implementation performs better than a stack-based one.

Java stack-based architecture vs Android register-based architecture

  • Format structure: For Class files, each file has its own separate constant pool and some other common fields. For Dex files, all Classes within the Dex share the same constant pool and common fields, resulting in a more compact overall structure and significantly reduced file size.

  • Instruction optimization: Dalvik bytecode has specific optimizations and simplifications for a large number of instructions. As shown in the following image, the same code requires over 100 instructions in Java bytecode, while it only requires a few instructions in Dalvik bytecode.

Instruction optimization comparison: Java bytecode vs Dalvik bytecode

For more information on Java bytecode and Dalvik bytecode, you can refer to the following resources:

Three Methods of Compile-Time Instrumentation #

AspectJ and ASM frameworks take Class files as input and output, making them the most commonly used Java bytecode processing frameworks.

1. AspectJ

AspectJ is a popular AOP (aspect-oriented programming) extension framework for Java. Many online articles claim that AspectJ operates on Java source files, but that is not accurate: it actually injects code through bytecode manipulation.

Internally, AspectJ is built on the BCEL framework. BCEL has seen little development in recent years, however, and the official recommendation is to switch to ObjectWeb's ASM framework. For BCEL usage, you can refer to the article “Designing Bytecode with BCEL”.

In terms of usage, as an elder statesman of bytecode manipulation, AspectJ does have its advantages:

  • Maturity and stability. Bytecode manipulation is not straightforward, given the bytecode format and various instruction rules. If there are errors in manipulation, it can lead to issues during program compilation or execution. Having been developed since 2001, AspectJ is mature, and there is generally no need to worry about the correctness of inserted bytecode.

  • Ease of use. AspectJ is powerful and easy to use. Users do not need to understand any Java bytecode-related knowledge to use it proficiently. It allows insertion of custom code before and after method invocations (including constructors), inside method bodies (including constructors), at points where variables are read or written, within static code blocks, and at exception handling locations. It also allows direct replacement of original code with custom code.

In a previous article in this column, I mentioned the performance monitoring framework ArgusAPM by 360. It uses AspectJ for performance monitoring. The class TraceActivity is used to monitor the lifecycle of the Application and Activity.

// Called when Application.onCreate is executed
@Pointcut("execution(* android.app.Application.onCreate(android.content.Context)) && args(context)")
public void applicationOnCreate(Context context) {
}

// Runs after the applicationOnCreate pointcut fires
@After("applicationOnCreate(context)")
public void applicationOnCreateAdvice(Context context) {
    AH.applicationOnCreate(context);
}

As you can see, we can easily achieve compile-time instrumentation without needing to understand the underlying Java bytecode manipulation process. There are many articles about AspectJ online, but the most comprehensive ones are the official documentation. You can refer to “AspectJ Programming Guide” and “The AspectJ 5 Development Kit Developer’s Notebook”. I won’t go into further detail here.

However, from the usage instructions of AspectJ, we can see some of its disadvantages. Its functionality cannot meet the needs of certain scenarios:

  • Fixed join points. AspectJ can only operate on fixed join points. It cannot perform operations on bytecode sequences with specific rules.

  • Coarse pattern matching. AspectJ's matching rules behave like wildcard patterns or regular expressions. For example, if you match the Activity lifecycle onXXX methods by prefix, any custom method whose name also starts with “on” will be matched as well.

  • Low performance. During weaving, AspectJ wraps calls in helper classes of its own, which complicates the logic. This not only produces larger bytecode but also affects the performance of the original methods.
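As a hypothetical illustration of the matching pitfall (this sketch assumes the AspectJ runtime on the classpath; the advice name is made up):

```java
// Intended to match lifecycle callbacks such as onCreate and onResume,
// but the "on*" prefix also matches onKeyDown, onTouchEvent, and any
// custom onSomething() helper defined in an Activity subclass.
@Before("execution(* android.app.Activity+.on*(..))")
public void beforeLifecycle(JoinPoint joinPoint) {
    System.out.println("matched: " + joinPoint.getSignature());
}
```

To scope such a pointcut precisely, you would need to enumerate the lifecycle methods explicitly rather than rely on a name prefix.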

Take the startup-time Sample from Column 7 as an example, where we want to add a Trace call before and after every method invocation. With AspectJ, the implementation is genuinely simple.

@Before("execution(* **(..))")
public void before(JoinPoint joinPoint) {
    Trace.beginSection(joinPoint.getSignature().toString());
}

@After("execution(* **(..))")
public void after() {
    Trace.endSection();
}

But as you can see, AspectJ does not insert the Trace calls into the code directly; it routes them through a series of wrapper classes of its own. If you instrument every method this way, AspectJ introduces a noticeable performance cost.

In most cases, however, we only instrument a small fraction of methods, so AspectJ's overhead can be ignored. Using AspectJ directly in Android is still rather troublesome, so I recommend Hujiang's AspectJX framework, which is not only easier to use but also adds the ability to exclude specific classes and JAR packages. If you prefer annotation-based integration, I recommend Hugo, a project by Jake Wharton.

Although AspectJ is easy to use, it can still produce unexpected exceptions if you are not careful. For example, with Around advice you must handle the method's return value correctly: Hugo's approach is to return the value of joinPoint.proceed() directly. You also need to pay attention to advice precedence.
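A minimal sketch of an @Around advice that gets the return value right, in the same spirit as Hugo's handling (the pointcut pattern and class name are illustrative, and the AspectJ runtime is assumed to be on the classpath):

```java
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
public class TimingAspect {
    // Hypothetical pointcut; adjust the package pattern to your project
    @Around("execution(* com.example.app..*(..))")
    public Object measure(ProceedingJoinPoint joinPoint) throws Throwable {
        long start = System.nanoTime();
        try {
            // Forward the original return value; returning null here
            // would silently break every non-void method we advise
            return joinPoint.proceed();
        } finally {
            System.out.println(joinPoint.getSignature() + " took "
                    + (System.nanoTime() - start) + " ns");
        }
    }
}
```

The try/finally shape ensures the timing code runs on both normal return and exception, without swallowing either the return value or the exception.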

2. ASM

If AspectJ can only meet 50% of the bytecode processing scenarios, then ASM is a Java bytecode manipulation framework that can meet 100% of the scenarios. Its functionality is also very powerful. The main features of using ASM to manipulate bytecode are:

  • Flexibility. It is very flexible to operate, allowing customization of modifications, insertions, and deletions according to needs.

  • Difficult to get started. It is relatively difficult to get started and requires a deep understanding of Java bytecode.

In terms of usability, ASM's advantage over BCEL is its Visitor-based access interface (the Core API). Users do not need to understand the bytecode file format; they only need to handle the structures they want to modify at each Visitor callback. The downside of this pattern is that it is generally only suitable for relatively simple processing scenarios.

In fact, the internal implementation of the launch time sample of column 7 uses ASM’s Core API. You can refer to the implementation of the MethodTracer class. In terms of the final effect, the bytecode processing of ASM is as follows.

Compared to AspectJ, ASM is more direct and efficient. However, for more complex situations, we may need to use another Tree API to make more direct modifications to the Class file. Therefore, at this time, you need to master some necessary knowledge of Java bytecode.

In addition, we also need to have an understanding of the operating mechanism of the Java Virtual Machine (JVM). As I mentioned before, the JVM is stack-based. So what is the stack of the JVM? According to the description in the “Java Virtual Machine Specification”:

Each Java virtual machine thread has its own private Java virtual machine stack, which is created when the thread is created and is used to store stack frames.

As described in this sentence, each thread has its own stack, so in a multi-threaded application, there will be multiple stacks for multiple threads, and each stack has its own stack frame.

As shown in the figure below, we can simply consider that the stack frame contains three important contents: the local variable array, the operand stack, and the constant pool reference.

  • Local Variable Array: You can think of the local variable array as a place to store temporary data. It has one important extra role: passing parameters on method invocation. When a method is called, its parameters are placed into the local variable array starting at index 0. If the method is an instance method, index 0 instead holds the reference to the current instance, that is, this, and the declared parameters start at index 1.

  • Operand Stack: The operand stack can be thought of as a place to store data needed for instruction execution. Instructions take data from the operand stack and push the operation result back onto the stack.

Since the maximum number of local variables and the maximum operand stack depth are determined at compile time, after manipulating bytecode with ASM you need to call the visitMaxs method to set the maxStack and maxLocals values. For convenience, ASM can compute these automatically: when you construct the ClassWriter with the ClassWriter.COMPUTE_MAXS flag, ASM calculates the sizes of the local variable array and the operand stack for you.

ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);

Next, let's take the simple example of “1 + 2”. Operands are processed in last-in, first-out (LIFO) order.

ICONST_1 pushes the integer constant 1 onto the stack, ICONST_2 pushes 2, and IADD pops the top two integers, adds them, and pushes the result back. The maximum operand stack depth is also fixed at compile time, and code modified by ASM often increases it. ASM's automatic computation handles this too, at the cost of some build-time overhead.
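To build intuition for how the operand stack and the local variable array interact, here is a toy simulation in plain Java. This is an illustration only, not how a real JVM is implemented:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the JVM's operand stack and local variable array.
public class StackMachine {
    private final Deque<Integer> operandStack = new ArrayDeque<>();
    private final int[] locals = new int[4];  // size fixed "at compile time"

    public void iconst(int value) { operandStack.push(value); }            // ICONST_n
    public void iadd() {                                                   // IADD
        operandStack.push(operandStack.pop() + operandStack.pop());
    }
    public void istore(int index) { locals[index] = operandStack.pop(); }  // ISTORE n
    public void iload(int index)  { operandStack.push(locals[index]); }    // ILOAD n

    public int top() { return operandStack.peek(); }

    public static void main(String[] args) {
        StackMachine vm = new StackMachine();
        vm.iconst(1);   // push 1
        vm.iconst(2);   // push 2
        vm.iadd();      // pop 2 and 1, push 3
        System.out.println(vm.top());   // prints 3
    }
}
```

The istore/iload pair models the data exchange between the stack and the local variable array described below: a value is popped into a local slot for safekeeping, then pushed back when an instruction needs it.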

During concrete bytecode processing, pay particular attention to the data exchange between the local variable array and the operand stack, and to the handling of try-catch blocks.

  • Data Exchange: As shown in the figure below, after the IADD instruction runs, ISTORE 0 pops the top integer off the stack and stores it into local variable 0 for temporary keeping. In the final addition, the values in local variables 0 and 1 are loaded back onto the operand stack for IADD to consume. For the full set of instructions that move data between the local variable array and the operand stack, refer to the virtual machine specification, which defines variants for each data type.

  • Exception handling. When manipulating bytecode, pay special attention to how exception handling affects the operand stack. If an exception is thrown inside a try block and caught, the current operand stack is cleared, the exception object is pushed onto the now-empty stack, and execution continues at the catch handler. Fortunately, if we generate incorrect bytecode, the JVM's verifier will reject the class when it is loaded. ASM also provides a verification utility (CheckClassAdapter in the asm-util package) that can check modified bytecode for correctness.

If you want to run code after a method finishes, that is relatively simple in ASM: add your logic before every RETURN-family or ATHROW instruction that appears in the method's bytecode.
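A sketch of this technique, assuming the ASM library on the classpath and android/os/Trace as the tracing target (as in the Sample); the class name is illustrative:

```java
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

// Injects Trace.beginSection(...) at method entry and
// Trace.endSection() before every exit point.
public class TraceMethodVisitor extends MethodVisitor {
    private final String signature;

    public TraceMethodVisitor(MethodVisitor mv, String signature) {
        super(Opcodes.ASM7, mv);
        this.signature = signature;
    }

    @Override
    public void visitCode() {
        // Trace.beginSection(signature) at the start of the method body
        mv.visitLdcInsn(signature);
        mv.visitMethodInsn(Opcodes.INVOKESTATIC, "android/os/Trace",
                "beginSection", "(Ljava/lang/String;)V", false);
        super.visitCode();
    }

    @Override
    public void visitInsn(int opcode) {
        // Trace.endSection() before every RETURN-family opcode and ATHROW
        if ((opcode >= Opcodes.IRETURN && opcode <= Opcodes.RETURN)
                || opcode == Opcodes.ATHROW) {
            mv.visitMethodInsn(Opcodes.INVOKESTATIC, "android/os/Trace",
                    "endSection", "()V", false);
        }
        super.visitInsn(opcode);
    }
}
```

This visitor would be wired into a ClassVisitor's visitMethod so that every method of the class flows through it during a ClassReader/ClassWriter pass.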

3. ReDex

ReDex is not only a Dex optimization tool, but it also provides many small tools and some novel features that are not mentioned in the documentation. For example, ReDex provides a simple Method Tracing and Block Tracing tool, which can insert tracking code before all methods or specified methods.

The official documentation provides an example to demonstrate the use of this tool. Please refer to InstrumentTest for details. This example inserts the onMethodBegin method of InstrumentAnalysis at the beginning of all methods except those in the blacklist. The specific configuration is as follows:

"InstrumentPass" : {
    // Class that holds the instrumentation code
    "analysis_class_name": "Lcom/facebook/redextest/InstrumentAnalysis;",
    // Method to invoke
    "analysis_method_name": "onMethodBegin",
    // Insertion strategy: "simple_method_tracing" inserts before methods,
    // "basic_block_tracing" inserts before and after CFG basic blocks
    "instrumentation_strategy": "simple_method_tracing"
}

ReDex is not a complete AOP tool, but it does provide a set of instruction-generation and opcode-insertion APIs. We can use this feature as a starting point for building our own bytecode-injection tool. The code lives in Instrument.cpp.

This code handles the various special cases of Dex bytecode processing fairly thoroughly. We can construct a sequence of opcodes and insert it through the provided interface without worrying too much about edge cases. The feature is still coupled to ReDex's own business logic, however, so we need to extract the useful parts for our own use.

Dalvik bytecode has a much shorter history than Java bytecode, and the Dex format is compact and densely cross-referenced, so modifying it often has a domino effect: changing one part forces updates elsewhere. Dalvik bytecode is also more complex to process than Java bytecode, which is why few tools manipulate it directly.

Most direct modification of Dex files in practice is done for reverse engineering, where people often hand-write Smali code and compile it back. Here are a few libraries that can modify Dalvik bytecode:

  • ASMDEX: Developed by the creator of the ASM library, but it hasn’t been updated for a long time.

  • Dexter: An official Dex manipulation library developed by Google, which is frequently updated but complex to use.

  • Dexmaker: Used to generate Dalvik byte code.

  • Soot: The method of modifying Dex is quite unique. It first converts the Dalvik byte code into a Jimple three-address code, then inserts Jimple opcodes, and finally converts it back to Dalvik byte code. You can refer to this example for details.

Summary #

Today, I used several representative frameworks to walk through compile-time instrumentation and its main applications: code generation, code monitoring, code modification, and code analysis. Instrumentation is a versatile technique that rewards a fully unleashed imagination.

For common use cases, predecessors have put in a lot of effort to toolize and expose them through APIs, allowing us to easily use them without understanding the underlying bytecode principles. However, if you really want to reach the level of being able to do anything, even with the help of tools like ASM, you still need a deep understanding and knowledge of the underlying bytecode.

Of course, you can also become a “predecessor” and solidify these scenarios, providing them for future use. But sometimes “limited ability restricts imagination”, if your ability is not enough, even with a vivid imagination, there is nothing you can do.

Homework #

Which instrumentation tools have you used? What functionalities have you implemented using instrumentation? Feel free to leave a message and discuss it with me and other classmates.

Today's homework is to review the implementation of the Sample from Column 7's exercise and see how it uses ASM internally for TAG instrumentation. In today's Sample, I also provide a version implemented with AspectJ. Truly mastering instrumentation is not easy; it takes more than writing an efficient Gradle plugin.

In addition to the above two Samples, I also recommend you to carefully read the following references and projects.
