34 Some Say Lambda Can Make Java Programs 30 Times Slower, What's Your View #

In the previous lesson, I introduced some basic ideas for analyzing Java performance issues. However, in practical work, we cannot simply wait for performance problems to arise before attempting to solve them. Instead, we need quantitative and comparable methods to evaluate the performance of Java applications and determine whether they can meet business support goals. In today’s lesson, I will introduce how to judge the performance of applications from a Java developer’s perspective at the code level, with a focus on understanding the most widely used benchmark tests.

The question I want to ask you today is, what is your opinion on the claim that “Lambda could make Java programs 30 times slower”?

To make the background clear, consider the code snippets below. In actual execution, the Lambda/Stream-based version (lambdaMaxInteger) is much slower than the traditional for-each version (forEachLoopMaxInteger).

// A large ArrayList with randomly generated integer data inside
volatile List<Integer> integers = ...

// Benchmark test 1
public int forEachLoopMaxInteger() {
    int max = Integer.MIN_VALUE;
    for (Integer n : integers) {
        max = Integer.max(max, n);
    }
    return max;
}

// Benchmark test 2
public int lambdaMaxInteger() {
    return integers.stream().reduce(Integer.MIN_VALUE, (a, b) -> Integer.max(a, b));
}

Typical Answer #

I think the debate about “Lambda makes Java programs 30 times slower” reflects several aspects:

Firstly, benchmark tests are a very effective and common means to intuitively and quantitatively evaluate the performance of a program under specific conditions.

Secondly, benchmark tests must have a clearly defined scope and goal, otherwise they are likely to produce misleading results. The code snippet above is itself flawed: the extra overhead comes mainly from auto-boxing/unboxing, not from Lambda and Stream themselves, so the conclusion as originally drawn is not convincing.
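To see that boxing, not Lambda itself, dominates the cost, compare the boxed `reduce` with a primitive `IntStream` variant. The sketch below is illustrative (class name and list size are my own); both variants compute the same result, but `mapToInt` avoids the per-element unbox/re-box cycle that the boxed `reduce` incurs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class MaxBenchmarkSketch {
    public static void main(String[] args) {
        // Build a list of boxed integers, as in the original snippet.
        List<Integer> integers = new ArrayList<>();
        Random random = new Random(42);
        for (int i = 0; i < 100_000; i++) {
            integers.add(random.nextInt());
        }

        // Boxed reduce: every Integer.max call unboxes both arguments
        // and re-boxes the result, which dominates the measured cost.
        int boxedMax = integers.stream()
                .reduce(Integer.MIN_VALUE, Integer::max);

        // Unboxed variant: mapToInt switches to an IntStream, so the
        // reduction runs on primitive ints with no boxing overhead.
        int primitiveMax = integers.stream()
                .mapToInt(Integer::intValue)
                .max()
                .orElse(Integer.MIN_VALUE);

        System.out.println(boxedMax == primitiveMax); // prints "true"
    }
}
```

A fair benchmark of "Lambda vs. loop" would compare the `IntStream` version against the for-each loop, so that boxing is not confounded with the Lambda/Stream machinery.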

Thirdly, although Lambda/Stream provides powerful functional programming capabilities for Java, we also need to acknowledge its limitations:

  • Generally speaking, Lambda/Stream performance is close to on par with traditional approaches. However, in performance-sensitive code we cannot completely ignore differences in specific scenarios, such as initialization overhead. Lambda is not just syntactic sugar but a new working mechanism: the first time a Lambda is invoked, the JVM needs to build a CallSite instance for it. This means a Java application that executes many Lambda expressions during startup will start more slowly. These implementation characteristics also mean that the JVM's optimization of Lambdas may differ from its optimization of traditional code.
  • It increases the complexity of program diagnosis, as stack traces become more complicated. The fluent style is not a debugger-friendly structure, and there are also limitations in handling checked exceptions, among other issues.

Analysis of the Exam Focus #

Today’s question is based on a controversial article, which was later corrected to “If Streams are Used Improperly, They Can Make Your Code 5 Times Slower.” In response to this issue, my answer does not dwell on the so-called “fast” or “slow” aspects, but instead points out the problems with benchmark testing itself and the limitations of Lambdas from an engineering perspective.

From a knowledge point of view, this question tests my understanding of the performance impact of the autoboxing/unboxing mechanism, which I discussed in Lesson 7 of the series. It also tests my knowledge of the Lambda feature introduced in Java 8. In addition to these points, the interviewer may delve further into the topic of how to use benchmark testing and similar methods to turn ambiguous views into verifiable conclusions.

For many features of the Java language, there are often many “secrets” that are not entirely true. It is necessary for us to separate fact from fiction and explore the truth in a quantitative and qualitative way, discussing practices that are easier to promote. The ability to find conclusions is more important than the conclusions themselves. Therefore, in today’s lesson, let’s explore:

  • The basic elements of benchmark testing and how to build a simple benchmark test using mainstream frameworks.
  • Further analysis on how to ensure the effectiveness of benchmark testing, how to avoid deviating from the testing purpose, and how to ensure the correctness of benchmark testing.

Knowledge Expansion #

First of all, let’s have an overall understanding of the main purpose and characteristics of benchmarking. I won’t repeat the formal definitions here.

Performance evaluation is often situational, and it can be misleading to simply say that a performance is “good” or “fast.” By introducing benchmarking, we can define clear conditions and specific metrics for performance comparison, ensuring that we obtain quantitative and reproducible comparative data, which is a practical need in engineering.

Different benchmarks differ in content and scope. If you are a professional performance engineer, you may be more familiar with system-level tests defined by industry standards such as SPEC. Most Java developers, however, are more familiar with microbenchmarking, which has a smaller scope and focuses on finer-grained details. The problem I posed at the beginning of the article is a typical microbenchmark, and it is the focus of today's discussion.

When do we need to develop microbenchmarks?

I believe microbenchmarks are worth considering when you need to evaluate the performance of a small part of a large piece of software. In other words, microbenchmarks are typically used for API-level validation or comparison, such as:

  • When you develop a shared library that provides some kind of service API to other modules.
  • When your API has strict requirements for performance, such as latency and throughput. For example, if you implement a customized HTTP client API, you need to determine its throughput capability when making a large number of GET requests to an HTTP server. Or you may need to compare it with other APIs to ensure that it meets the same or higher performance standards.

Therefore, microbenchmarks are more suitable for the needs of basic and low-level platform developers, and they are also loved by those cutting-edge engineers who pursue ultimate performance.

How to build your own microbenchmarks and which framework to choose?

One of the most widely used frameworks is JMH. OpenJDK itself extensively uses JMH for performance comparison. If you want to compare Java APIs, JMH is often the preferred choice.

JMH was developed by experts from the Hotspot JVM team. In addition to supporting the complete benchmarking process, including warmup, running, statistics, and reporting, it also supports Java and other JVM languages. More importantly, it provides various features for the Hotspot JVM to ensure the correctness of benchmarking and greatly improves overall accuracy compared to other frameworks. JMH also provides the ability to perform tasks like profiling in a quasi-white-box manner.

Using JMH is also very simple. You can directly add its dependency to your Maven project, as shown in the following image:

![](../images/0dd290f8842959cb02d6c3a434a58e68-20221127212308-eu9ogc7.png)
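Since the image may not render, the dependency declaration it shows is equivalent to the standard JMH artifacts below. The version number here is illustrative; check the current JMH release before copying:

```xml
<!-- JMH core library -->
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>1.37</version> <!-- illustrative version; use the latest release -->
</dependency>
<!-- Annotation processor that generates the benchmark harness code -->
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>1.37</version>
    <scope>provided</scope>
</dependency>
```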

Alternatively, you can use a command similar to the following to directly generate a Maven project:

$ mvn archetype:generate \
        -DinteractiveMode=false \
        -DarchetypeGroupId=org.openjdk.jmh \
        -DarchetypeArtifactId=jmh-java-benchmark-archetype \
        -DgroupId=org.sample \
        -DartifactId=test \
        -Dversion=1.0

In JMH, you define test methods and detailed benchmark configurations using annotations. For example, you add @Benchmark to indicate a benchmark method, and you can specify the benchmarking mode with @BenchmarkMode. The following example specifies throughput mode; you can also choose other modes, such as average time, as needed.

@Benchmark
@BenchmarkMode(Mode.Throughput)
public void testMethod() {
    // Put your benchmark code here.
}

After implementing the specific tests, you can use the following Maven command to build:

mvn clean install

Running the benchmark is not significantly different from running any other Java application:

java -jar target/benchmarks.jar

For more specific steps, please refer to the relevant guide. JMH has a strong engineering flavor throughout: rather than polished documentation, it provides excellent sample code, so you need to get used to learning directly from the samples.

How to ensure the correctness of microbenchmarks and what pitfalls to avoid?

First of all, building microbenchmarks requires a white-box understanding of the code, especially of its specific performance overhead, whether CPU or memory allocation. Two aspects deserve attention. First, ensure that the benchmark serves its intended purpose and truly exercises the functionality you want to cover, which the example at the start of this article fails to do. Second, a microbenchmark's code snippet should typically be small; if a single execution takes many milliseconds or even seconds, its validity becomes questionable and diagnosing problems becomes harder.

What's more important is that, since microbenchmarks usually test small-scale, API-level functionality, the biggest threat comes from an overly "smart" JVM! Brian Goetz pointed out the typical problems with microbenchmarks early on.

Since we are executing very limited code snippets, we must ensure that the JVM optimization process does not affect the original testing purpose. Here are several aspects that require special attention:

  • Ensure that the code has been sufficiently and appropriately warmed up. As I mentioned in the first article, in server mode the JIT compiler by default compiles a method to native code after 10,000 invocations, while in client mode the threshold is 1,500. We need to exclude the noise of the code's initial executions and ensure that the sampled statistics reflect its steady running state. The following parameter is usually recommended for observing when compilation happens and thus how long warm-up took:
-XX:+PrintCompilation

I suggest adding another parameter as well; otherwise the JVM enables background compilation by default, meaning compilation happens on other threads, which can make the output confusing.

-Xbatch

At the same time, make sure the code paths in the warm-up phase and the collection phase are identical, and watch the PrintCompilation output for sporadic compilation entries during the later, measured execution.
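To illustrate the idea behind warm-up (not as a substitute for JMH), a hand-rolled harness typically runs the code well past the JIT thresholds before sampling, using an identical code path in both phases. The class name, workload, and iteration counts below are my own illustrative choices:

```java
import java.util.concurrent.ThreadLocalRandom;

public class WarmupSketch {
    // The workload under test: summing an array stands in for any small method.
    static long workload(int[] data) {
        long sum = 0;
        for (int v : data) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        int[] data = ThreadLocalRandom.current().ints(10_000).toArray();

        // Warm-up phase: run well past the JIT compilation thresholds
        // mentioned above, so measurement sees compiled, not interpreted, code.
        long sink = 0;
        for (int i = 0; i < 20_000; i++) sink += workload(data);

        // Measurement phase: identical code path to the warm-up phase.
        long start = System.nanoTime();
        for (int i = 0; i < 20_000; i++) sink += workload(data);
        long elapsed = System.nanoTime() - start;

        // Print 'sink' so the loops cannot be eliminated as dead code.
        System.out.println("approx ns/op: " + (elapsed / 20_000) + " (sink=" + sink + ")");
    }
}
```

Running this with -XX:+PrintCompilation -Xbatch lets you confirm that the compilation messages appear during the warm-up loop rather than the measured loop.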

  • Prevent the JVM from performing dead code elimination. For example, in the code snippet below, since the calculated result mul is never used, the JVM may determine that it is dead code and skip the computation entirely.
public void testMethod() {
    int left = 10;
    int right = 100;
    int mul = left * right;
}

If the reported scores suddenly look implausibly good, be alert to the possibility of dead code elimination.

The solution is straightforward: ensure the method returns a value instead of being void, or use the Blackhole facility provided by JMH and consume the result inside the method, as shown below.

public void testMethod(Blackhole blackhole) {
    // ...
    blackhole.consume(mul);
}
  • Prevent constant folding. If the JVM finds that a computation depends on constants or is essentially constant, it may compute the result ahead of time, so the benchmark would not reflect the real cost of executing the code. JMH provides the State mechanism to solve this: move the local variables into a State object, as in the example below.
@State(Scope.Thread)
public static class MyState {
    public int left = 10;
    public int right = 100;
}

public void testMethod(MyState state, Blackhole blackhole) {
    int left = state.left;
    int right = state.right;
    int mul = left * right;
    blackhole.consume(mul);
}
  • In addition, JMH performs extra processing on objects marked with @State to minimize the impact of false sharing: it automatically pads them.

  • If you want to determine the impact of method inlining on performance, you can consider enabling the following option.

-XX:+PrintInlining

From the summary above, you can see that microbenchmarking is a technique that requires a deep understanding of Java and JVM internals. It is a great tool for understanding what really happens behind a program. However, we also need to approach microbenchmarks cautiously and not be misled by possible illusions.

The content I introduced today is relatively common and easy to grasp. For microbenchmarks, garbage collection and other underlying mechanisms also affect the statistics. As mentioned earlier, microbenchmarks usually try to keep execution time and allocation rate within a limited range; if garbage collection occurs during measurement, it is likely to bias the data, so the Serial GC is worth considering. In addition, JDK 11 introduced Epsilon GC, a "do-nothing" collector that can be used to eliminate GC-related interference as much as possible.
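On JDK 11+, Epsilon GC can be enabled with the flags below (shown against the benchmarks.jar built earlier; the heap size is illustrative and must be generous, since Epsilon never reclaims memory and the JVM exits when the heap fills up):

```
java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC \
     -Xmx4g -jar target/benchmarks.jar
```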

Today, I started with a controversial program and discussed how to use (micro)benchmark tests from a developer’s perspective rather than a performance engineer’s perspective to verify your performance judgments. I also introduced the basic construction methods and risk points that need to be avoided.

Practice Exercise #

Have you understood the question we discussed today? In real projects, we need to evaluate system capacity in order to plan for and guarantee business support. Can you share your thoughts on how to approach this? What are some commonly used methods?

Please write your thoughts on this question in the comments section. I will select a well-thought-out comment and send you a study reward voucher. Feel free to discuss with me.

Is your friend also preparing for an interview? You can “invite a friend to read” and share today’s question with them. Maybe you can help them.