
16 Case Analysis: Common Java Code Optimization Rules #

Reviewing lessons 06 to 15, we have learned about various optimization methods such as caching, pooling objects, reusing large objects, parallel computing, lock optimization, and NIO. These methods often provide a significant performance improvement.

However, the language itself also affects performance. For example, many companies switch from Java to Golang because of language characteristics. Java, too, has a set of optimization rules worth following. The resulting performance differences are subtle, but after many invocations and iterations they add up to a significant impact.

In this lesson, we will focus on explaining some commonly used code optimization rules. By maintaining good coding habits, we can keep our code in an optimal state.

Code Optimization Rules #

1. Use local variables to avoid heap allocation #

Since heap resources are shared among threads and the heap is the main area where the garbage collector works, creating too many objects puts pressure on the GC. Primitive local variables and references, by contrast, live in the method's stack frame and are discarded when the method returns; short-lived objects reachable only from locals can often be eliminated by the JIT's escape analysis as well. Preferring locals therefore reduces GC pressure.
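As a minimal sketch (class and method names hypothetical): the primitive accumulator below lives entirely in the method's stack frame and vanishes when the method returns, whereas an instance field would keep that state on the heap for the GC to track.

```java
public class LocalVarDemo {
    private long heapTotal; // a field like this lives on the heap as part of the object

    // `total` is a local primitive: it occupies a stack-frame slot and is
    // discarded as soon as the method returns, creating no GC work
    static long sumLocal(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }
}
```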

2. Reduce the scope of variables #

Pay attention to the scope of variables and try to minimize object creation. For example, in the code below, variable a is created on every call even when str is empty. Its declaration can be moved inside the if block to narrow its scope.

public void test1(String str) {
    final int a = 100;
    if (!StringUtils.isEmpty(str)) {
        int b = a * a;
    }
}
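The reduced-scope version might look like the sketch below. It substitutes a plain null/empty check for StringUtils and returns the result so the effect is observable; both changes are for illustration only.

```java
public class ScopeDemo2 {
    public static int test2(String str) {
        if (str != null && !str.isEmpty()) {
            final int a = 100; // declared only where it is needed
            return a * a;
        }
        return 0;
    }
}
```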

3. Access static variables using class name directly #

Some developers habitually access static variables through an object reference. This involves an extra addressing step: the instance is loaded (and then discarded) before the variable is read. For example, in the code below:

public class StaticCall {
    public static final int A = 1;

    void test() {
        System.out.println(this.A);
        System.out.println(StaticCall.A);
    }
}

The corresponding bytecode is:

void test();
    descriptor: ()V
    flags:
    Code:
      stack=2, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: aload_0
         4: pop
         5: iconst_1
         6: invokevirtual #3                  // Method java/io/PrintStream.println:(I)V
         9: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        12: iconst_1
        13: invokevirtual #3                  // Method java/io/PrintStream.println:(I)V
        16: return
      LineNumberTable:
        line 5: 0
        line 6: 9
        line 7: 16

As the aload_0/pop pair at offsets 3–4 shows, accessing the constant through this loads the instance reference only to discard it, an extra step.

4. Use StringBuilder for string concatenation #

For string concatenation, especially inside loops, use StringBuilder or StringBuffer instead of the + operator. In the code below, strings are concatenated within a loop:

public String test() {
    String str = "-1";
    for (int i = 0; i < 10; i++) {
        str += i;
    }
    return str;
}

From the corresponding bytecode, it can be seen that a StringBuilder object is created in each iteration of the loop. Therefore, in our regular coding practice, we should explicitly create a StringBuilder object once.

 5: iload_2
 6: bipush        10
 8: if_icmpge     36
11: new           #3                  // class java/lang/StringBuilder
14: dup
15: invokespecial #4                  // Method java/lang/StringBuilder."<init>":()V
18: aload_1
19: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
22: iload_2
23: invokevirtual #6                  // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
26: invokevirtual #7                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
29: astore_1
30: iinc          2, 1
33: goto          5
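A version that hoists the builder out of the loop (the same logic as test() above) avoids the per-iteration allocation:

```java
public class ConcatDemo {
    static String test() {
        StringBuilder sb = new StringBuilder("-1"); // created once, reused every iteration
        for (int i = 0; i < 10; i++) {
            sb.append(i);
        }
        return sb.toString();
    }
}
```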

5. Override the hashCode method for objects, do not simply return a fixed value #

During code review, I found that some developers override the hashCode and equals methods and simply return a fixed value of 0 for the hashCode. This is inappropriate.

When these objects are stored in a HashMap, the performance will be very low because the HashMap uses the hashCode to locate the hash slot. When there are collisions, it will use linked lists or red-black trees to organize the nodes. Returning a fixed value of 0 essentially disables the hash addressing feature.
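A sketch of a proper override (the Point class is hypothetical): hashCode is derived from the same fields that equals compares, so equal objects land in the same slot while distinct objects spread across slots.

```java
import java.util.Objects;

public class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        // derived from the same fields equals() compares:
        // equal points collide, distinct points spread across slots
        return Objects.hash(x, y);
    }
}
```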

6. Specify initial capacity when initializing collections like HashMap #

This principle is similar to the one mentioned in “Lesson 10 | Case Study: Goals and Considerations for Reusing Large Objects.” Many objects, such as ArrayList and StringBuilder, can benefit from specifying an initial capacity to reduce the performance impact of resizing.
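A sketch of sizing a HashMap up front when the element count is known; the division by 0.75 reflects HashMap's default load factor, so filling to the expected size triggers no rehash:

```java
import java.util.HashMap;
import java.util.Map;

public class CapacityDemo {
    // Size the table so that `expected` entries fit without a resize:
    // HashMap rehashes once size exceeds capacity * loadFactor (0.75 by default)
    static Map<String, Integer> build(int expected) {
        Map<String, Integer> map = new HashMap<>((int) (expected / 0.75f) + 1);
        for (int i = 0; i < expected; i++) {
            map.put("k" + i, i);
        }
        return map;
    }
}
```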

7. When iterating over a Map, use the entrySet method #

Using entrySet, you obtain the key–value pairs directly and visit each entry once. Using keySet gives you only the keys and forces an additional get lookup per key, an extra step. Therefore, iterating a Map via entrySet is recommended.
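A minimal sketch of the recommended form (names hypothetical): each entry delivers its key and value together, with no per-key get() lookup.

```java
import java.util.Map;

public class MapIterationDemo {
    static long sumValues(Map<String, Integer> map) {
        long total = 0;
        // entrySet(): each entry carries key and value, no extra get() per key
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            total += e.getValue();
        }
        return total;
    }
}
```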

8. Do not use the same Random object in a multi-threaded environment #

The Random class shares a single seed, which threads race to update under concurrent access, reducing performance. It is recommended to use the ThreadLocalRandom class in a multi-threaded environment.

On Linux systems, by adding the JVM configuration -Djava.security.egd=file:/dev/./urandom, the urandom random generator can be used, resulting in faster random number generation.
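A minimal sketch of the recommended usage: current() returns the generator bound to the calling thread, so there is no shared seed to contend on.

```java
import java.util.concurrent.ThreadLocalRandom;

public class RandomDemo {
    // Each thread updates its own generator state; no cross-thread seed contention
    static int roll() {
        return ThreadLocalRandom.current().nextInt(1, 7); // a die roll in [1, 6]
    }
}
```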

9. Use LongAdder instead of atomic operations for self-incrementing #

Thread-safe increments can be implemented with synchronized (together with volatile for visibility), or with the atomic classes such as AtomicLong.

The latter is slightly faster than the former because AtomicLong uses compare-and-swap (CAS) to perform comparisons and replacements. However, in a heavily threaded scenario, it may lead to excessive ineffective spinning. To achieve further performance improvement, LongAdder can be used instead of AtomicLong.
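A sketch of LongAdder under contention (the parallel stream is just a convenient way to generate concurrent updates): each thread tends to update its own internal cell, and sum() folds the cells at read time.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;

public class AdderDemo {
    // LongAdder spreads contended updates across internal cells instead of
    // spinning on a single CAS target; sum() combines the cells when read
    static long countParallel(int times) {
        LongAdder counter = new LongAdder();
        IntStream.range(0, times).parallel().forEach(i -> counter.increment());
        return counter.sum();
    }
}
```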

10. Avoid using exceptions to control program flow #

Exceptions exist to report and handle abnormal situations in a program. Their implementation is relatively expensive, and they are less efficient than ordinary conditional statements.

This is because exceptions, at the bytecode level, require the generation of an exception table that involves additional checks and steps.

Exception table:
   from    to  target type
      7    17      20   any
     20    23      20   any

Therefore, it is recommended to avoid using exceptions to control program flow.

11. Do not use try-catch in loops #

The principle is similar to the previous point. Many articles suggest not placing exception handling within loops and instead placing it at the outermost layer. However, actual testing shows that the performance difference between these two approaches is not significant.

Since there is not much difference in performance, it is recommended to code according to the needs of the business. For example, if the loop should not be interrupted when an exception occurs, that is, the program should keep running, then the exception should be handled inside the for loop.
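A sketch of the "continue on failure" case (names hypothetical): the try-catch sits inside the loop so one malformed element does not abort the rest.

```java
public class LoopCatchDemo {
    // try-catch inside the loop: a bad element is skipped, the loop continues
    static int parseAll(String[] inputs) {
        int ok = 0;
        for (String s : inputs) {
            try {
                Integer.parseInt(s);
                ok++;
            } catch (NumberFormatException e) {
                // skip the malformed entry and keep going
            }
        }
        return ok;
    }
}
```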

12. Do not catch RuntimeException #

In Java, exceptions fall into two groups: unchecked exceptions such as RuntimeException, which can usually be avoided with a pre-check, and checked exceptions.

RuntimeException should not be caught using the catch statement; it should be avoided through coding practices instead.

In the code example below, list.get may throw an IndexOutOfBoundsException. Whether the index is out of bounds can be checked in advance in code, rather than caught when the exception occurs. Checking in advance is both more elegant and more efficient.

// BAD
public String test1(List<String> list, int index) {
    try {
        return list.get(index);
    } catch (IndexOutOfBoundsException ex) {
        return null;
    }
}

// GOOD
public String test2(List<String> list, int index) {
    if (index >= list.size() || index < 0) {
        return null;
    }
    return list.get(index);
}

13. Proper use of PreparedStatement #

PreparedStatement optimizes the execution of SQL statements through precompilation. Most databases strive to optimize these reusable queries through precompilation and caching of the compiled results.

This way, when the statements are used again, they can be executed quickly without the need for SQL parsing.

PreparedStatement also improves program security and can effectively prevent SQL injection.

However, if your program has varying SQL statements each time and requires manual concatenation of data, then PreparedStatement loses its effectiveness. In this case, using a regular Statement may be faster.

14. Considerations when logging #

In “06 | Case Study: How Buffering Speeds up Code,” we learned about logback’s asynchronous logging. There are other things to consider when logging.

Usually, we use debug to output some debugging information and then turn it off in the production environment. For example:

logger.debug("xjjdog:" + topic + " is awesome");

Every time the program reaches this line, it constructs the string, even if the log level has been raised to INFO or WARN and the message is never written. This wastes cycles.

To mitigate this, you can use the isDebugEnabled method to check the log level before printing each time:

if (logger.isDebugEnabled()) {
    logger.debug("xjjdog:" + topic + " is awesome");
}

Using placeholders achieves the same effect without manually adding the isDebugEnabled method. The code is more elegant:

logger.debug("xjjdog:{} is awesome", topic);

For a business system, logging has a significant impact on system performance. Unnecessary logs should be avoided to prevent IO resource consumption.

15. Reduce the scope of transactions #

If your program uses transactions, pay attention to the scope of the transactions and try to complete them as quickly as possible. This is because the isolation of transactions is implemented using locks, similar to the optimization of multithreaded locks in “13 | Case Study: Optimizing Multithreaded Locks.”

@Transactional
public void test(String id) {
    String value = rpc.getValue(id); // time-consuming, unstable RPC call
    testDao.update(sql, value);
}

In the above code, since the rpc service is time-consuming and unstable, it should be moved outside the transaction. The code can be modified as follows:

public void test(String id) {
    String value = rpc.getValue(id); // RPC call moved out of the transaction
    testDao(value);
}

@Transactional
public void testDao(String value) {
    testDao.update(value);
}

One thing to note: because Spring implements @Transactional through AOP proxies, the annotation only takes effect on public methods; on a private method it is silently ignored. This is a common interview question. Be aware, too, that calling testDao from test within the same class is a self-invocation that bypasses the proxy, so in practice the transactional method is usually moved to a separate bean.

16. Use bit shifting instead of multiplication and division #

Computers work in binary, so shifting by one bit is equivalent to multiplying or dividing by 2, and shifts can be cheap alternatives to multiplication and division by powers of two.

  • << (left shift) multiplies by 2 for each bit shifted.
  • >> (signed right shift) divides by 2.

  • >>> (unsigned right shift) also divides by 2, but ignores the sign bit and fills the vacated high bits with 0.

int a = 2;
int b = (a++) << (++a) + (++a);
System.out.println(b);

Note: shift operators have lower precedence than addition, so the expression evaluates as 2 << (4 + 5), i.e. 2 << 9, and the code above prints 1024.
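In everyday code the idiom is simpler, and a pair of parentheses sidesteps the precedence trap entirely (a sketch with hypothetical helper names):

```java
public class ShiftDemo {
    static int doubled(int x)   { return x << 1; }        // same as x * 2
    static int halved(int x)    { return x >> 1; }        // same as x / 2 for non-negative x
    static int timesNine(int x) { return (x << 3) + x; }  // x*8 + x; parentheses beat the low shift precedence
}
```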

17. Avoid printing large collections or using the toString method of large collections #

Some developers like to output collections as strings to log files, but this habit is not good.

Take ArrayList as an example: it must iterate over every element to build the string. When the collection holds many elements, this consumes a large amount of memory and executes very slowly. I have encountered a real case where this kind of batch printing caused a sharp decline in system performance.

The code below is the toString method that ArrayList inherits from AbstractCollection. It creates an iterator and concatenates all the elements into a string, which is very wasteful of space.

public String toString() {
    Iterator<E> it = iterator();
    if (! it.hasNext())
        return "[]";

    StringBuilder sb = new StringBuilder();
    sb.append('[');
    for (;;) {
        E e = it.next();
        sb.append(e == this ? "(this Collection)" : e);
        if (! it.hasNext())
            return sb.append(']').toString();
        sb.append(',').append(' ');
    }
}

18. Use reflection sparingly in programs #

Reflection is very powerful, but its dynamic method resolution and access checks make its performance less than ideal.

In reality, there are many optimization methods for reflection, such as caching the process of reflection execution (such as Method) to speed up the reflection speed.

Since Java 7, the new package java.lang.invoke and the new JVM bytecode instruction invokedynamic support locating and invoking target methods by name directly at the JVM level.

If you have very strict performance requirements, use MethodHandle in the invoke package to optimize the code, but its programming convenience is not as good as reflection. In normal coding, reflection is still the preferred choice.

The following is a code implementation class written using MethodHandle. It can achieve some features of dynamic languages, performing different invocations based on the method name and the passed object body, even if Bike and Man classes have no relationship.

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class MethodHandleDemo {
    static class Bike {
        String sound() {
            return "ding ding";
        }
    }

    static class Animal {
        String sound() {
            return "wow wow";
        }
    }

    static class Man extends Animal {
        @Override
        String sound() {
            return "hou hou";
        }
    }

    String sound(Object o) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType methodType = MethodType.methodType(String.class);
        MethodHandle methodHandle = lookup.findVirtual(o.getClass(), "sound", methodType);

        String obj = (String) methodHandle.invoke(o);
        return obj;
    }

    public static void main(String[] args) throws Throwable {
        String str = new MethodHandleDemo().sound(new Bike());
        System.out.println(str);
        str = new MethodHandleDemo().sound(new Animal());
        System.out.println(str);
        str = new MethodHandleDemo().sound(new Man());
        System.out.println(str);
    }
}

19. Pre-compile regular expressions to speed up execution #

Regular expressions in Java need to be compiled before use.

Typical code is as follows:

Pattern pattern = Pattern.compile({pattern});
Matcher matcher = pattern.matcher({content});

Pattern compilation is very time-consuming. Pattern itself is immutable and thread-safe, and each call to matcher() creates a new (non-thread-safe) Matcher object. Therefore, a Pattern generally only needs to be compiled once and can be held as a static member variable of the class.
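A sketch of the static-member pattern (class name and regex hypothetical): the Pattern is compiled once and shared, while each call creates its own short-lived Matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateMatcher {
    // Compiled once; Pattern is immutable and safe to share across threads
    private static final Pattern DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    static boolean isDate(String s) {
        // matcher() is cheap: a fresh (non-thread-safe) Matcher per call
        Matcher m = DATE.matcher(s);
        return m.matches();
    }
}
```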

Case Study #

Case 1: Regular expressions and state machines #

Regular expressions can be very slow to execute, especially when greedy matching triggers heavy backtracking.

Here, I introduce an optimization of regular expressions in my actual work, using state machines to perform string matching.

Consider the following SQL statement, which is similar to NamedParameterJdbcTemplate but enhanced. The SQL accepts two parameters: smallId and firstName. When firstName is empty, the statement enclosed in ##{} will be removed.

select * from USERS
where id>:smallId
##{
 and FIRST_NAME like concat('%',:firstName,'%') }

It can be seen that this function can be easily implemented using regular expressions.

#\{(.*?:([a-zA-Z0-9_]+).*?)\}

By defining such a regular expression, the matched string can be extracted using the group function of Pattern. We save the matched string, and finally use the replace function to replace it with an empty string.

In actual use, we found that regular expression parsing was particularly slow, especially when the SQL statement was very large. In this case, a state machine can be used instead. I chose ragel here, but similar tools such as javacc or antlr work too: from a grammar definition (which may itself contain simple regular expressions) they generate Java source code. The generated code is generally unreadable; we only need to care about the definition file.

As the definition file below shows, by declaring a set of patterns and action handlers, and caching intermediate results in a few simple data structures, a single scan of the SQL is enough to obtain the result.

pairStart = '#{';
pairEnd = '}';
namedQueryStringFull = ( ':'alnum+)
            >buffer
            %namedQueryStringFull
            ;
pairBlock =
        (pairStart
            any*
            namedQueryStringFull
            any*
            pairEnd)
        >pairBlockBegin %pairBlockEnd
        ;
main := any* pairBlock any*;

After defining the file, you can use the ragel command to generate the final Java syntax file.

ragel -G2 -J -o P.java P.rl

The complete code is a bit complicated, and I have put it into the repository. You can analyze it practically.

Let’s take a look at its performance. From the test results, we can see that the performance of Ragel mode is more than 3 times that of Regex mode, and the longer the SQL, the more obvious the effect.

Benchmark                     Mode  Cnt    Score     Error   Units
RegexVsRagelBenchmark.ragel  thrpt   10  691.224 ± 446.217  ops/ms
RegexVsRagelBenchmark.regex  thrpt   10  201.322 ±  47.056  ops/ms

Case 2: Bytecode Modification of HikariCP #

In “09 | Case Study: Application Scenarios of Object Pooling”, we mentioned the bytecode modification of HikariCP, which is managed by the JavassistProxyFactory class. Javassist is a bytecode library used by HikariCP to modify bytecode.

As shown in the following image, this is the main method of the factory class.

[Image: the main method of the JavassistProxyFactory class]

It generates proxy classes by calling generateProxyClass, mainly for core JDBC interfaces such as Connection, Statement, ResultSet, and DatabaseMetaData.

By running this class, you can see that the code generates a bunch of Class files.

Generating com.zaxxer.hikari.pool.HikariProxyConnection
Generating com.zaxxer.hikari.pool.HikariProxyStatement
Generating com.zaxxer.hikari.pool.HikariProxyResultSet
Generating com.zaxxer.hikari.pool.HikariProxyDatabaseMetaData
Generating com.zaxxer.hikari.pool.HikariProxyPreparedStatement
Generating com.zaxxer.hikari.pool.HikariProxyCallableStatement
Generating method bodies for com.zaxxer.hikari.proxy.ProxyFactory

For the organization of this code, the delegation pattern, a design pattern, is used. We can see from the HikariCP source code that the proxy classes, such as ProxyConnection, are abstract. The concrete instances are the class files generated by Javassist. Decompiling these generated class files, we can see that they actually process the delegated objects by calling methods from the parent class.

[Image: a decompiled proxy class delegating to methods of its parent class]

This has two advantages:

  1. Firstly, only the JDBC interface methods that need to be modified need to be implemented in the code, and other code generation is handled by the proxy classes, greatly reducing the amount of code.
  2. Secondly, when problems occur, errors can be uniformly handled through the checkException function.

In addition, we noticed that the methods in the ProxyFactory class are all static methods and are not implemented through a singleton. Why is this done? This involves two bytecode instructions at the JVM level: invokestatic and invokevirtual.

Below are the bytecode instructions for two different types of invocations:

  • invokevirtual
public final java.sql.PreparedStatement prepareStatement(java.lang.String, java.lang.String[]) throws java.sql.SQLException;
    flags: ACC_PRIVATE, ACC_FINAL
    Code:
      stack=5, locals=3, args_size=3
         0: getstatic     #59                 // Field PROXY_FACTORY:Lcom/zaxxer/hikari/proxy/ProxyFactory;
         3: aload_0
         4: aload_0
         5: getfield      #3                  // Field delegate:Ljava/sql/Connection;
         8: aload_1
         9: aload_2
        10: invokeinterface #74,  3           // InterfaceMethod java/sql/Connection.prepareStatement:(Ljava/lang/String;[Ljava/lang/String;)Ljava/sql/PreparedStatement;
        15: invokevirtual #69                 // Method com/zaxxer/hikari/proxy/ProxyFactory.getProxyPreparedStatement:(Lcom/zaxxer/hikari/proxy/ConnectionProxy;Ljava/sql/PreparedStatement;)Ljava/sql/PreparedStatement;
        18: return
  • invokestatic
private final java.sql.PreparedStatement prepareStatement(java.lang.String, java.lang.String[]) throws java.sql.SQLException;
    flags: ACC_PRIVATE, ACC_FINAL
    Code:
      stack=4, locals=3, args_size=3
         0: aload_0
         1: aload_0
         2: getfield      #3                  // Field delegate:Ljava/sql/Connection;
         5: aload_1
         6: aload_2
         7: invokeinterface #72,  3           // InterfaceMethod java/sql/Connection.prepareStatement:(Ljava/lang/String;[Ljava/lang/String;)Ljava/sql/PreparedStatement;
        12: invokestatic  #67                 // Method com/zaxxer/hikari/proxy/ProxyFactory.getProxyPreparedStatement:(Lcom/zaxxer/hikari/proxy/ConnectionProxy;Ljava/sql/PreparedStatement;)Ljava/sql/PreparedStatement;
        15: areturn

Most ordinary method invocations use the invokevirtual instruction, which belongs to virtual method invocation.

Many times, the JVM needs to determine the target method to be invoked based on the dynamic type of the caller, which is the process of dynamic binding. In contrast, the invokestatic instruction belongs to the static binding process, which can directly identify the target method and slightly improve efficiency.

Although these optimizations of HikariCP may seem nitpicky, we can see the coding techniques that HikariCP pursues high performance from them.

Summary #

In addition to the Java language specification itself, you can also read the Alibaba Java Development Manual (Songshan Edition), which contains many valuable suggestions.

In fact, performance optimization at the language level is a trade-off among resources such as development time, code complexity, and scalability. These rules are not rigid dogma; choose the appropriate tools and adjust flexibly according to the actual scenario you are coding for.

Next, we enter "Module 4: JVM Optimization". In the next lesson, "17 | Advanced Development: How Does the JVM Perform Garbage Collection?", we move on to more advanced topics.