11 Null Handling: The Unclear Null and the Frustrating NullPointerException #

Today's topic is null handling: working out what a null actually means, and dealing with the annoying NullPointerException.

One day, I received a text message that said, “Dear null, hello, XXX.” I laughed when I read it, as it was a joke that programmers can all understand. The program failed to retrieve my name and formatted the empty space as “null.” Clearly, null was not handled properly. Even replacing null with “guest” or “customer” would have avoided such a joke.

When a variable is null, it does not refer to any object; in other words, the pointer points to nothing. Dereferencing such a variable, for example by calling a method on it or unboxing it, throws a null pointer exception, which is NullPointerException in Java. So, in which situations are null pointer exceptions likely to occur, and how should they be fixed?

Although null pointer exceptions can be annoying, they are relatively easy to locate. The more challenging task is to understand the meaning of null. For example, if the client sends null data to the server, is the intention to provide an empty value or no value at all? Another example is the NULL value in a database field. Does it have a special meaning? What do we need to be aware of when writing SQL statements for NULL values in a database?

Today, let’s embark on a journey of pitfalls related to null, starting with these questions.

Fixing and Locating Annoying Null Pointer Issues #

NullPointerException is the most common exception in Java code, and I classify the most likely scenarios for its occurrence into the following 5 types:

  1. The parameter value is an Integer or another wrapper type, and a NullPointerException occurs due to auto-unboxing.
  2. A NullPointerException occurs when comparing strings.
  3. Containers like ConcurrentHashMap do not support null keys or values, so attempting to put null as the key or value will result in a NullPointerException.
  4. Object A contains object B, and after obtaining B through a field of A, a NullPointerException occurs when cascading method calls on B without checking if the field is null.
  5. A List returned by a method or a remote service is null, and a NullPointerException occurs when calling methods on the List without checking if it is null.

To demonstrate these 5 scenarios, I wrote a wrongMethod method and a wrong method that calls it. The test parameter of wrong is a string of length 4 made up of 0s and 1s. A 1 at a given position means that null is passed for the corresponding one of the 4 parameters of wrongMethod, which lets us simulate different null pointer scenarios:

private List<String> wrongMethod(FooService fooService, Integer i, String s, String t) {
    log.info("result {} {} {} {}", i + 1, s.equals("OK"), s.equals(t),
            new ConcurrentHashMap<String, String>().put(null, null));
    if (fooService.getBarService().bar().equals("OK"))
        log.info("OK");
    return null;
}

@GetMapping("wrong")
public int wrong(@RequestParam(value = "test", defaultValue = "1111") String test) {
    return wrongMethod(test.charAt(0) == '1' ? null : new FooService(),
            test.charAt(1) == '1' ? null : 1,
            test.charAt(2) == '1' ? null : "OK",
            test.charAt(3) == '1' ? null : "OK").size();
}

class FooService {
    @Getter
    private BarService barService;
}

class BarService {
    String bar() {
        return "OK";
    }
}

Clearly, this code throws a NullPointerException whenever one of these variables is null, because it reads the variable's value or accesses its members without any check. However, working out exactly which expression threw the exception is rather tricky.

In the wrongMethod test method, we simulate 4 null pointer scenarios with a single line of logging code:

  1. Performing the +1 operation on the input parameter Integer i
  2. Comparing the input parameter String s with “OK”
  3. Comparing the input parameters String s and String t
  4. Performing the put operation on a newly created ConcurrentHashMap with both the key and value set to null

The output exception information is as follows:

java.lang.NullPointerException: null
  at org.geekbang.time.commonmistakes.nullvalue.demo2.AvoidNullPointerExceptionController.wrongMethod(AvoidNullPointerExceptionController.java:37)
  at org.geekbang.time.commonmistakes.nullvalue.demo2.AvoidNullPointerExceptionController.wrong(AvoidNullPointerExceptionController.java:20)

This information indeed indicates that a NullPointerException occurred at this line of code, but it’s difficult to determine which part of the code caused the null pointer. It could be due to unboxing the input parameter Integer, either of the two input strings being null, or adding a null to the ConcurrentHashMap.

You may think that troubleshooting such an issue only takes a breakpoint and a look at the input parameters. In real-world scenarios, however, null pointer issues often occur only with specific inputs and code branches that are hard to reproduce locally. Breakpoints are not feasible there, so the usual approach is to split the code into smaller statements or add more logging, both of which are cumbersome.
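As a rough illustration of the "split the code" approach, the sketch below evaluates each of the four expressions from the logging line separately and records which ones throw. The class NpeLocator and its helper methods are hypothetical and are not part of the original demo project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical helper, not part of the original demo: it runs each
// sub-expression of the logging line on its own to see which one fails.
class NpeLocator {

    // Returns the labels of the expressions that threw NullPointerException.
    static List<String> failingExpressions(Integer i, String s, String t) {
        List<String> failed = new ArrayList<>();
        check("i + 1", () -> i + 1, failed);
        check("s.equals(\"OK\")", () -> s.equals("OK"), failed);
        check("s.equals(t)", () -> s.equals(t), failed);
        check("ConcurrentHashMap.put(null, null)",
                () -> new ConcurrentHashMap<String, String>().put(null, null), failed);
        return failed;
    }

    // Evaluates one expression and remembers its label if it throws an NPE.
    private static void check(String label, Supplier<Object> expression, List<String> failed) {
        try {
            expression.get();
        } catch (NullPointerException e) {
            failed.add(label);
        }
    }

    public static void main(String[] args) {
        // Mirrors test=1101: i is null, s is "OK", t is null.
        System.out.println(failingExpressions(null, "OK", null));
    }
}
```

This works for a demo, but in production code it means changing and redeploying the application just to diagnose one exception, which is why a tool that inspects the running process is more convenient.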

In this case, I recommend using Arthas, a powerful Java troubleshooting tool from Alibaba. Arthas is easy to use and can help locate the majority of Java production issues.

Next, let me show you how to determine the input parameters of the wrongMethod method within 30 seconds using Arthas, so that we can pinpoint which input parameter is causing the null pointer. In the screenshot below, there are three red boxes, and I’ll explain the second and third boxes to you:

  • The second box indicates that Arthas has attached itself to the JVM process upon startup.
  • The third box indicates that we’re monitoring the input parameters of the wrongMethod method using the watch command.

img

The parameters of the watch command include class name expression, method expression, and observation expression. Here, we set the observation class as AvoidNullPointerExceptionController, the observed method as wrongMethod, and the observation expression as params to observe the input parameters:

watch org.geekbang.time.commonmistakes.nullvalue.demo2.AvoidNullPointerExceptionController wrongMethod params

After enabling watch, execute the wrong method twice, setting the test parameter to 1111 and 1101 respectively. In other words, for the first call to wrongMethod, all 4 parameters are null, while for the second call, the first, second, and fourth parameters are null.

With the help of the first and fourth red boxes in the figure, we can see that during the second call, the third parameter is the string “OK” and the other parameters are null. Arthas correctly outputs all the input parameters of the method, making it easy for us to locate the null pointer problem.

At this point, if it is a simple business logic, you can locate the null pointer exception. However, if it is a complex business logic with branches, you need to use the stack command to view the call stack of the wrongMethod method, and use the watch command to view the input parameters of each method to conveniently locate the root cause of the null pointer.

The following figure demonstrates the observation of the call path of wrongMethod through the stack command:

img

For the detailed usage of the various Arthas commands, refer to the official Arthas documentation.

Next, let’s see how to fix the 5 kinds of null pointer exceptions that occurred above.

In fact, for any null pointer exception handling, the most straightforward way is to check for null before performing operations. However, this can only prevent the exception from occurring, and we still need to find out whether the null pointer in the program logic comes from the input parameters or a bug:

If it comes from the input parameters, further analysis is required to determine whether the input parameters are reasonable, etc.

If it comes from a bug, the null pointer may not be purely a program bug, but may also involve business attributes and interface call specifications, etc.

Since this is a demo, we only look at fixing the code with null checks. When a value has to be checked for null before it is processed, most people reach for an if-else block. However, this approach increases code complexity and reduces readability. We can try to use the Optional class in Java 8 to eliminate this if-else logic and perform the null check and the processing in one line of code.

The repair ideas are as follows:

For checking null of Integer, Optional.ofNullable can be used to construct an Optional, and then use orElse(0) to replace null with a default value before performing the +1 operation.

For comparing String and literals, you can put the literal in front, such as "OK".equals(s), so even if s is null, a null pointer exception will not occur. For comparing the equals of two string variables that may be null, you can use Objects.equals, which handles null check.

For ConcurrentHashMap, since neither its key nor its value may be null, the fix is to not store null in it. HashMap accepts a null key and null values, and ConcurrentHashMap looks like a thread-safe version of HashMap, yet it rejects null for both keys and values. This difference is easy to overlook.

For cascade calls like fooService.getBarService().bar().equals("OK"), there are many places that need null check, including fooService, the return value of getBarService() method, and the string returned by the bar method. If if-else is used to check null, it may need several lines of code. But using Optional will only require one line of code.

For the List returned by rightMethod, since we cannot confirm whether it is null, we can also use Optional.ofNullable to wrap the return value and use orElse(Collections.emptyList()) to obtain an empty List when the List is null, and then call the size method.

private List<String> rightMethod(FooService fooService, Integer i, String s, String t) {
    log.info("result {} {} {} {}", Optional.ofNullable(i).orElse(0) + 1, "OK".equals(s), Objects.equals(s, t), new HashMap<String, String>().put(null, null));
    Optional.ofNullable(fooService)
            .map(FooService::getBarService)
            .filter(barService -> "OK".equals(barService.bar()))
            .ifPresent(result -> log.info("OK"));
    return new ArrayList<>();
}

@GetMapping("right")
public int right(@RequestParam(value = "test", defaultValue = "1111") String test) {
    return Optional.ofNullable(rightMethod(test.charAt(0) == '1' ? null : new FooService(),
            test.charAt(1) == '1' ? null : 1,
            test.charAt(2) == '1' ? null : "OK",
            test.charAt(3) == '1' ? null : "OK"))
            .orElse(Collections.emptyList()).size();
}

After the repair, when calling the right method with the parameter 1111, which sets all 4 parameters of rightMethod to null, there will not be any null pointer exceptions in the log:

[21:43:40.619] [http-nio-45678-exec-2] [INFO ] [.AvoidNullPointerExceptionController:45  ] - result 1 false true null
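The four values in this log line can be reproduced outside Spring with a small standalone sketch. The class NullSafeExpressions and its method names are hypothetical; only the expressions themselves come from rightMethod.

```java
import java.util.HashMap;
import java.util.Objects;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Standalone sketch (hypothetical names) of the null-safe expressions
// used in the logging line of rightMethod.
class NullSafeExpressions {

    // Unboxing guard: a null Integer falls back to 0 before the addition.
    static int plusOne(Integer i) {
        return Optional.ofNullable(i).orElse(0) + 1;
    }

    // Literal first: "OK".equals(null) is simply false.
    static boolean isOk(String s) {
        return "OK".equals(s);
    }

    // Objects.equals treats two nulls as equal and never throws.
    static boolean same(String s, String t) {
        return Objects.equals(s, t);
    }

    // HashMap accepts a null key and value; put returns the previous value (null here).
    static String putNullIntoHashMap() {
        return new HashMap<String, String>().put(null, null);
    }

    // ConcurrentHashMap rejects a null value with a NullPointerException.
    static boolean concurrentHashMapRejectsNull() {
        try {
            new ConcurrentHashMap<String, String>().put("key", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Same shape as the log line above when every argument is null.
        System.out.println("result " + plusOne(null) + " " + isOk(null) + " "
                + same(null, null) + " " + putNullIntoHashMap());
        System.out.println("ConcurrentHashMap rejects null: " + concurrentHashMapRejectsNull());
    }
}
```

Note that Objects.equals(null, null) is true, which is why the third value in the log is true even though both strings are missing. Whether two missing values should count as equal is a business decision, not something the helper can decide.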

However, if we change the input parameter of the right method to 0000, so that none of the 4 arguments passed to rightMethod is null, the string “OK” does not appear in the log either. Didn’t the bar method of BarService return the string “OK”?

Let’s use Arthas to locate the problem again. Use the watch command to observe the input parameters of the rightMethod method, and set the -x parameter to 2 to indicate a depth of 2 for parameter printing:

img

As we can see, the barService field in FooService is null, which explains why this bug occurred.

This raises another question. Using null checks or Optional to avoid null pointer exceptions is not necessarily the best way to solve the problem, because suppressing the exception may hide a deeper bug. To fix a null pointer exception, we still need to analyze and locate the root cause case by case before adding a null check. Moreover, handling null is not just a matter of running the normal business flow when the value is non-null; we also need to decide whether a null should throw an exception, fall back to a default value, or be logged.
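To make the trade-off concrete, here is a minimal sketch that contrasts a silent Optional chain with one that reports the missing dependency. FooServiceSketch and BarServiceSketch are hypothetical stand-ins for the demo's services written without Lombok, and throwing IllegalStateException is only one possible policy.

```java
import java.util.Optional;

// Hypothetical demo class contrasting two ways of reacting to a null barService.
class NullPolicyDemo {

    // Silent variant: a missing BarService simply yields false, hiding the bug.
    static boolean isOkSilently(FooServiceSketch fooService) {
        return Optional.ofNullable(fooService)
                .map(FooServiceSketch::getBarService)
                .map(BarServiceSketch::bar)
                .filter("OK"::equals)
                .isPresent();
    }

    // Strict variant: a missing BarService is treated as a bug and reported.
    static boolean isOkOrFail(FooServiceSketch fooService) {
        BarServiceSketch barService = Optional.ofNullable(fooService)
                .map(FooServiceSketch::getBarService)
                .orElseThrow(() -> new IllegalStateException("barService is not initialized"));
        return "OK".equals(barService.bar());
    }

    public static void main(String[] args) {
        FooServiceSketch broken = new FooServiceSketch(null);
        System.out.println(isOkSilently(broken));
        try {
            isOkOrFail(broken);
        } catch (IllegalStateException e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}

// Stand-in for the demo's FooService; the field may be null.
class FooServiceSketch {
    private final BarServiceSketch barService;

    FooServiceSketch(BarServiceSketch barService) {
        this.barService = barService;
    }

    BarServiceSketch getBarService() {
        return barService;
    }
}

// Stand-in for the demo's BarService.
class BarServiceSketch {
    String bar() {
        return "OK";
    }
}
```

Which variant is appropriate depends on whether a missing value is a legitimate state or a defect. The point is only that the choice should be made deliberately.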

What Does null Represent in POJO Attributes? #

In my opinion, compared with avoiding null pointer exceptions by checking whether an object is null, a more common and error-prone issue is working out what a null means. For the program, null means that the pointer refers to nothing. In business logic, however, things are much more complex. We need to consider the following:

What does null in the fields of a DTO actually mean? Does it indicate that the client did not provide us with this information?

Since null pointer issues are troublesome, should the fields in the DTO have default values?

If there is a null value in a field of a database entity, will saving the data through a data access framework overwrite the existing data in the database?

If we cannot answer these questions clearly, the logical flow of the program we write is likely to be confusing. Let’s look at a practical example next.

There is a POJO named “User” which acts as both a DTO and a database entity, and it contains attributes such as user ID, name, nickname, age, and registration date:

@Data
@Entity
public class User {

    @Id
    @GeneratedValue(strategy = IDENTITY)
    private Long id;

    private String name;

    private String nickname;

    private Integer age;

    private Date createDate = new Date();

}

There is a POST endpoint for updating user data. The update logic is very simple. It automatically sets a nickname based on the user’s name, following the rule “user type + name”. It then saves the User object received from the client, which arrives as JSON in the RequestBody, to the database through JPA and returns the saved data.

@Autowired
private UserRepository userRepository;

@PostMapping("wrong")
public User wrong(@RequestBody User user) {
    user.setNickname(String.format("guest%s", user.getName()));
    return userRepository.save(user);
}

@Repository
public interface UserRepository extends JpaRepository<User, Long> {

}

First, initialize a user record in the database. The age is 36, the name is “zhuye”, the create_date is January 4, 2020, and the nickname is NULL.

Then, use cURL to call the POST endpoint that updates user information, passing a JSON string with id=1 and name=null. The expectation is to set the name of the user with ID 1 to empty:

curl -H "Content-Type:application/json" -X POST -d '{ "id":1, "name":null}' http://localhost:45678/pojonull/wrong

{"id":1,"name":null,"nickname":"guestnull","age":null,"createDate":"2020-01-05T02:01:03.784+0000"}

The result returned by the API is consistent with the record in the database.

We can see that there are three problems here:

The caller only wants to reset the username, but the age is also set to null.

The nickname should be the user type concatenated with the name. If the name is reset to null, the nickname of the guest user should be “guest” instead of “guestnull”, which reproduces the joke mentioned at the beginning.

The user’s create time was originally on January 4th, but after updating the user information, it became January 5th.

In summary, there are five points to watch out for:

Clearly define the meaning of null in the DTO. Null has ambiguity in the deserialization process from JSON to DTO. Whether a property is not transmitted by the client or it is transmitted as null, it will be assigned as null in the DTO. However, for the user information update operation, not transmitting means that the client does not need to update this property and the original value in the database should be maintained. Transmitting null means that the client wants to reset this property. Java’s null represents the absence of data, so it cannot distinguish between these two expressions. Therefore, the age property in this example is also set to null. Perhaps we can use Optional to solve this problem.

Be careful with default values on POJO fields. If the client does not provide a value, the field keeps its default value, which is why the create time in the database was overwritten as well.

Be aware that when formatting a string, null values may be formatted as the string “null”. For example, when setting the nickname, we only performed a simple string formatting, which resulted in “guestnull” being stored in the database. Obviously, this is unreasonable and the source of the joke mentioned at the beginning, so further checks are needed.

Avoid sharing one POJO between the DTO and the Entity. The user nickname is set by the program, so it should not be exposed in the DTO; otherwise a value chosen arbitrarily by the client can easily end up in the database. In addition, it is best to let the database set the create time to the current time instead of controlling it in the program. This can be achieved by setting columnDefinition on the field.

Allowing null in the database fields will further increase the possibility of errors and complexity. If the data actually supports NULL when it is stored, there may be three states: NULL, empty string, and the string “null”. I will explore this further in the next section. If all properties have default values, the problem would be simpler.
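The string-formatting point is easy to verify in isolation. The sketch below compares the naive formatting used in the wrong endpoint with a guarded version. The class NicknameFormatting and its methods are hypothetical, and falling back to an empty string is an assumption about the desired business rule.

```java
import java.util.Optional;

// Hypothetical helper illustrating the "guestnull" pitfall.
class NicknameFormatting {

    // Naive version: a null name is rendered as the literal text "null".
    static String naiveNickname(String name) {
        return String.format("guest%s", name);
    }

    // Guarded version: a null name is treated as an empty string first.
    static String safeNickname(String name) {
        return "guest" + Optional.ofNullable(name).orElse("");
    }

    public static void main(String[] args) {
        System.out.println(naiveNickname(null));   // guestnull
        System.out.println(safeNickname(null));    // guest
        System.out.println(safeNickname("zhuye")); // guestzhuye
    }
}
```

Plain string concatenation behaves like the naive version: "guest" + name also produces "guestnull" when name is null, so the guard has to be applied to the value itself.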

Based on this idea, we split the DTO and Entity and modify the code as follows:

In the UserDto, only id, name, and age attributes remain, and name and age are wrapped in Optional to distinguish whether the client did not provide data or intentionally provided null.

Use the @Column annotation on the fields of UserEntity to set the database fields name, nickname, age, and createDate as NOT NULL, and set createDate’s default value to CURRENT_TIMESTAMP, allowing the database to generate the create time.

Use Hibernate’s @DynamicUpdate annotation to generate dynamic update SQL, which only updates the modified fields and requires querying the entity once to allow Hibernate to “track” the current state of the entity properties to ensure their validity.

@Data
public class UserDto {

    private Long id;

    // null: property absent from the JSON; Optional.empty(): property sent as null
    private Optional<String> name;

    private Optional<Integer> age;

}

@Data
@Entity
@DynamicUpdate
public class UserEntity {

    @Id
    @GeneratedValue(strategy = IDENTITY)
    private Long id;

    @Column(nullable = false)
    private String name;

    @Column(nullable = false)
    private String nickname;

    @Column(nullable = false)
    private Integer age;

    @Column(nullable = false, columnDefinition = "TIMESTAMP DEFAULT CURRENT_TIMESTAMP")
    private Date createDate;

}

After refactoring the DTO and Entity, we define a new endpoint called `right` to handle update operations more precisely. First, let's do parameter validation:

Check whether the `UserDto` and ID properties passed in are null, and throw an `IllegalArgumentException` if they are.

Query the entity from the database based on the ID and check whether it is null. If it is null, throw an `IllegalArgumentException`.

Since the DTO cleverly uses `Optional` to distinguish between the client not passing a value and passing a null value, we can implement the logic according to the client's intention. If no value is passed, the `Optional` instance is null, so we can skip updating the entity field, which will exclude the column from the dynamically generated SQL. If a value is passed, we further check if it is null.

Next, let's update the name, age, and nickname based on business requirements:

For the name, we consider it as resetting the name to an empty string if the client passes a null value. We can use the `orElse` method of `Optional` to convert null to an empty string.

For the age, we consider that if the client wants to update the age, they must pass a valid age. There is no reset operation for age, so we can use the `orElseThrow` method of `Optional` to throw an `IllegalArgumentException` when the value is null.

For the nickname, since the name in the database cannot be null, we can safely set the nickname to "guest" concatenated with the name retrieved from the database.

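```java
@PostMapping("right")
public UserEntity right(@RequestBody UserDto user) {

    // Reject a missing body or ID ("用户Id不能为空": the user ID must not be empty).
    if (user == null || user.getId() == null)
        throw new IllegalArgumentException("用户Id不能为空");
    // Load the current entity so Hibernate can track changes ("用户不存在": the user does not exist).
    UserEntity userEntity = userEntityRepository.findById(user.getId())
            .orElseThrow(() -> new IllegalArgumentException("用户不存在"));
    // A null Optional means the property was absent; an empty Optional means it was sent as null.
    if (user.getName() != null) {
        userEntity.setName(user.getName().orElse(""));
    }

    // The name column is NOT NULL, so the nickname cannot become "guestnull".
    userEntity.setNickname("guest" + userEntity.getName());
    if (user.getAge() != null) {
        // "年龄不能为空": the age must not be empty.
        userEntity.setAge(user.getAge().orElseThrow(() -> new IllegalArgumentException("年龄不能为空")));
    }
    return userEntityRepository.save(userEntity);
}
```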

Assuming there is already a record in the database with id=1, age=36, create_date=January 4, 2020, name=zhuye, nickname=guestzhuye:

img

Using the same parameters to call the right endpoint, let’s see if all the issues have been resolved. Pass in a JSON string with id=1 and name=null, expecting to reset the name of the user with id=1 to an empty string:

curl -H "Content-Type:application/json" -X POST -d '{ "id":1, "name":null}' http://localhost:45678/pojonull/right

{"id":1,"name":"","nickname":"guest","age":36,"createDate":"2020-01-04T11:09:20.000+0000"}

The result is as follows:

img

We can see that the right endpoint perfectly implements the operation to only reset the name property, and the nickname no longer contains a null string. The age and create date fields are not modified.

From the logs, we can see that the SQL statement generated by Hibernate only updates the name and nickname fields:

Hibernate: update user_entity set name=?, nickname=? where id=?

Next, to test whether Optional can effectively differentiate between a property not being passed in the JSON and a property being passed as null, we set the age to null in the JSON. As expected, an error message saying “年龄不能为空” (age cannot be empty) is correctly returned:

curl -H "Content-Type:application/json" -X POST -d '{ "id":1, "age":null}' http://localhost:45678/pojonull/right

{"timestamp":"2020-01-05T03:14:40.324+0000","status":500,"error":"Internal Server Error","message":"年龄不能为空","path":"/pojonull/right"}
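The three states that an Optional field in the DTO can take are easier to see without Spring or JPA in the way. The following is a minimal plain-Java sketch of the merge rules applied by the right endpoint. The class OptionalPatchDemo, its method names, and the English error message are hypothetical, and it assumes, as the results above indicate, that an absent JSON property leaves the field null while an explicit null becomes an empty Optional.

```java
import java.util.Optional;

// Plain-Java sketch (hypothetical names) of the three states of an Optional DTO field.
class OptionalPatchDemo {

    // Name rule:
    //   null field         -> property absent, keep the stored value
    //   Optional.empty()   -> property sent as null, reset to ""
    //   Optional.of(value) -> property sent with a value, use it
    static String mergeName(String stored, Optional<String> incoming) {
        if (incoming == null) {
            return stored;
        }
        return incoming.orElse("");
    }

    // Age rule: absent keeps the stored value, an explicit null is rejected.
    static Integer mergeAge(Integer stored, Optional<Integer> incoming) {
        if (incoming == null) {
            return stored;
        }
        return incoming.orElseThrow(() -> new IllegalArgumentException("age must not be null"));
    }

    public static void main(String[] args) {
        System.out.println("[" + mergeName("zhuye", null) + "]");             // property absent
        System.out.println("[" + mergeName("zhuye", Optional.empty()) + "]"); // property sent as null
        System.out.println(mergeAge(36, null));                               // property absent
        System.out.println(mergeAge(36, Optional.of(37)));                    // property sent with a value
    }
}
```

A null Optional reference is normally discouraged in Java, so this pattern is best kept at the DTO boundary, where it encodes the client's intent, and converted to plain values before the data travels further into the business logic.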

Beware of Three Pitfalls Regarding NULL in MySQL #

As mentioned earlier, allowing NULL in database table fields can not only confuse us but also lead to pitfalls. Here, I will focus on explaining the pitfalls that can occur when using NULL fields with the sum function, count function, and NULL value conditions.

To facilitate the demonstration, let’s first define an entity with only two fields, id and score:

@Entity
@Data
public class User {

    @Id
    @GeneratedValue(strategy = IDENTITY)
    private Long id;

    private Long score;
}

When the program starts, let’s initialize an entity with a single data entry, where the id is automatically set to 1 (due to auto-increment) and the score is NULL:

@Autowired
private UserRepository userRepository;

@PostConstruct
public void init() {
    userRepository.save(new User());
}

Next, let’s test three scenarios to see the pitfalls that can arise when working with NULL values in the database:

  1. Using the sum function to calculate the total sum of a column with only NULL values, for example, SUM(score).
  2. Selecting the number of records using the count function with a field that allows NULL values, for example, COUNT(score).
  3. Querying records where the field value is NULL with an =NULL condition, for example, WHERE score=null.

@Repository
public interface UserRepository extends JpaRepository<User, Long> {

    @Query(nativeQuery=true,value = "SELECT SUM(score) FROM `user`")
    Long wrong1();

    @Query(nativeQuery = true, value = "SELECT COUNT(score) FROM `user`")
    Long wrong2();

    @Query(nativeQuery = true, value = "SELECT * FROM `user` WHERE score=null")
    List<User> wrong3();
}

The results obtained are null, 0, and an empty list, respectively:

[11:38:50.137] [http-nio-45678-exec-1] [INFO ] [t.c.nullvalue.demo3.DbNullController:26  ] - result: null 0 []

Clearly, the execution results of these three SQL statements are different from our expectations:

  • Although the scores of the records are all NULL, the result of the sum function should be 0 instead of null.
  • Although the score of this record is NULL, the total number of records should be 1.
  • Using =NULL did not retrieve the record with id=1, rendering the query condition ineffective.

The reasons are as follows:

  • In MySQL, the sum function ignores NULL values, and when there is no non-NULL value to add up it returns null instead of 0. You can use the IFNULL function to convert null to 0.
  • In MySQL, the count function does not count null values in a field. The correct way to count all records is to use COUNT(*).
  • In MySQL, using arithmetic comparison operators such as =, <, > to compare NULL always results in NULL, which makes such comparisons meaningless. You need to use IS NULL, IS NOT NULL, or ISNULL() function for comparison.
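To make these three rules concrete, here is a plain-Java emulation of the semantics. It is an illustration only, not how MySQL is implemented, and all class and method names are hypothetical. A nullable Boolean stands in for SQL's three-valued comparison result.

```java
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Plain-Java emulation of the three NULL rules; names are hypothetical.
class SqlNullSemantics {

    // SUM(col): NULLs are skipped; with no non-NULL input the result is NULL.
    static Long sum(List<Long> column) {
        Long total = null;
        for (Long value : column) {
            if (value != null) {
                total = (total == null ? 0L : total) + value;
            }
        }
        return total;
    }

    // COUNT(col): only non-NULL values are counted.
    static long countColumn(List<Long> column) {
        return column.stream().filter(Objects::nonNull).count();
    }

    // COUNT(*): every row is counted.
    static long countRows(List<Long> column) {
        return column.size();
    }

    // a = b: yields NULL (unknown) as soon as either side is NULL,
    // and a WHERE clause only keeps rows whose condition is TRUE.
    static Boolean sqlEquals(Long a, Long b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    public static void main(String[] args) {
        // One row whose score is NULL, as in the demo table.
        List<Long> scores = Collections.singletonList((Long) null);
        System.out.println(sum(scores));           // null
        System.out.println(countColumn(scores));   // 0
        System.out.println(countRows(scores));     // 1
        System.out.println(sqlEquals(null, null)); // null
    }
}
```

The null returned by sum also matters on the Java side: mapping such a query result to a primitive long would bring back the unboxing NullPointerException from the first section.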

Let’s modify the SQL statements accordingly:

@Query(nativeQuery = true, value = "SELECT IFNULL(SUM(score),0) FROM `user`")
Long right1();

@Query(nativeQuery = true, value = "SELECT COUNT(*) FROM `user`")
Long right2();

@Query(nativeQuery = true, value = "SELECT * FROM `user` WHERE score IS NULL")
List<User> right3();

You will obtain the correct results: 0, 1, and [User(id=1, score=null)] respectively:

[14:50:35.768] [http-nio-45678-exec-1] [INFO ] [t.c.nullvalue.demo3.DbNullController:31  ] - result: 0 1 [User(id=1, score=null)]

Key Review #

Today, we discussed several important points to consider when handling null values.

Firstly, I summarized the five most common ways to encounter NullPointerException in business code, along with their respective fixes. To avoid lengthy if-else null-checking logic, we can use Optional and its Stream-style chained calls such as map, filter, and orElse to perform the null check in one line of code. Additionally, to locate and fix NullPointerExceptions, apart from adding logs for troubleshooting, we can use Arthas in production to quickly view method call stacks and parameters.

In my opinion, the basic requirement for any business system is to handle null pointer exceptions properly, as they often represent a disruption in business logic. Therefore, I suggest checking production logs daily to identify any null pointer exceptions. If possible, it is also recommended to subscribe to alerts for null pointer exceptions, so that they can be detected and addressed promptly.

Regarding the meaning of null in POJO fields, it is often difficult from the server’s perspective to determine whether the client omitted a field or deliberately passed a null value. Therefore, we can try using the Optional class to tell the two cases apart. To avoid writing unintended null values to the database, we can use dynamic SQL to update only the necessary fields.

Lastly, I shared three potential pitfalls of using NULL in database fields (including sum function, count function, and condition with NULL value) and their solutions.

In summary, properly handling null values and avoiding null pointer exceptions is not as simple as just null-checking. It requires careful consideration from the front to the back based on the business attributes. We need to understand what the incoming null from the client represents, whether null is allowed to be replaced by default values, and whether null or empty values should be passed during database insertion. We must ensure the consistency of the entire logic processing to minimize bugs.

To handle null values well, as a client-side developer, it is necessary to align with the server-side on the meaning of null values in fields and the fallback logic. As a server-side developer, it is important to perform pre-checks on input parameters to block unacceptable null values on the server side and ensure comprehensive null value handling throughout the entire business logic process.

The code used today is available on GitHub.

Thinking and Discussion #

ConcurrentHashMap allows null for neither keys nor values, while HashMap allows both. Do you know the reason behind this design? What about TreeMap and Hashtable? Do they support null keys and values?

For the Hibernate framework, you can use the @DynamicUpdate annotation to achieve dynamic updates of fields. How can you achieve similar dynamic SQL functionality in the MyBatis framework, where the insert and update statements only include non-null fields from the POJO?

Regarding the issue of null and NullPointerException in programs and databases, have you encountered any pitfalls? My name is Zhu Ye, and I welcome you to leave a comment in the comment section and share your experiences. You are also welcome to share this article with your friends or colleagues for further discussion.