Q&A: Thought Questions from the Code Section (Part 2) #

Today, let's continue analyzing the thought questions from Lectures 7 through 12 of this course. These questions cover topics such as database indexes, equality checking, numerical calculations, collection classes, null value handling, and exception handling, 12 problems in total.

Let’s analyze each question in detail.

07 | Database Indexes: Indexes are Not Magic Bullets #

Question 1: When introducing the cost of secondary indexes, we mentioned two scenarios: index coverage and table lookup. Can you use optimizer trace to analyze the cost difference between these two scenarios?

Answer: As shown in the code below, after enabling optimizer_trace we execute the SQL statement, then query the information_schema.OPTIMIZER_TRACE table to view the optimizer's analysis, and finally disable the optimizer_trace feature:

SET optimizer_trace="enabled=on";

SELECT * FROM person WHERE NAME > 'name84059' AND create_time > '2020-01-24 05:00:00';

SELECT * FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace="enabled=off";

Assuming we have created a compound index on the NAME and SCORE columns of the person table, the second SQL query can use index coverage, while the first requires a table lookup:

explain select * from person where NAME='name1';

explain select NAME, SCORE from person where NAME='name1';

By observing the output of OPTIMIZER_TRACE, we can see that the cost of index coverage (index_only=true) is 1.21, while the cost of table lookup (index_only=false) is 2.21. In other words, index coverage saves a cost of 1 compared to table lookup.

Index coverage:

"analyzing_range_alternatives": {
  "range_scan_alternatives": [
    {
      "index": "name_score",
      "ranges": [
        "name1 <= name <= name1"
      ] /* ranges */,
      "index_dives_for_eq_ranges": true,
      "rowid_ordered": false,
      "using_mrr": false,
      "index_only": true,
      "rows": 1,
      "cost": 1.21,
      "chosen": true
    }
  ]
}

Table lookup:

"range_scan_alternatives": [
  {
    "index": "name_score",
    "ranges": [
      "name1 <= name <= name1"
    ] /* ranges */,
    "index_dives_for_eq_ranges": true,
    "rowid_ordered": false,
    "using_mrr": false,
    "index_only": false,
    "rows": 1,
    "cost": 2.21,
    "chosen": true
  }
]

Question 2: Apart from speeding up searches, indexes can also play a role in sorting. Can you prove this using EXPLAIN? Do you know in what situations indexes can be ineffective for sorting?

Answer: Whether an index is used for sorting shows up in the execution plan's “key” column. If sorting cannot use an index, you will see “Using filesort” in the “Extra” column, indicating that the sort is done in memory or on disk; which of the two depends on sort_buffer_size and the size of the data being sorted.

There are several situations where sorting cannot use indexes:

  • When sorting with a compound index, and the ASC and DESC orders are mixed.
  • When using a+b as a compound index, and a range query is performed on a, followed by sorting based on b.
  • When the sorting columns involve multiple fields that do not belong to the same compound index.
  • When the sorting column uses an expression.

In fact, all these reasons are related to the structure of indexes. You can review the section on clustered indexes and secondary indexes in Lesson 07 for more targeted revision.

08 | Equality: How does the program determine that you are you? #

Question 1: When implementing equals, I first determine the types of two objects using the getClass method. You might also think of using instanceof to determine the types. Can you explain the difference between these two implementation approaches?

Answer: In fact, both getClass and instanceof can be used to check object types in equals. The difference is that getClass requires the two objects to be of exactly the same class, while instanceof also accepts instances of subclasses.

Because of this difference, different people have different preferences for these two approaches, and there is much debate. In my opinion, you just need to choose according to your own requirements. It is worth mentioning that Lombok uses the instanceof approach.
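
To make the difference concrete, here is a minimal sketch (the Point and ColoredPoint classes are made up for illustration):

import java.util.Objects;

class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // getClass-based check: rejects subclass instances outright
        if (o == null || getClass() != o.getClass()) return false;
        // instanceof-based alternative (accepts subclass instances):
        // if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() { return Objects.hash(x, y); }
}

class ColoredPoint extends Point {
    final String color;
    ColoredPoint(int x, int y, String color) { super(x, y); this.color = color; }
}

public class EqualsDemo {
    public static void main(String[] args) {
        // Prints false with the getClass check; would print true with instanceof
        System.out.println(new Point(1, 2).equals(new ColoredPoint(1, 2, "red")));
    }
}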

Question 2: In the example in the section “hashCode and equals must be implemented together”, I demonstrated that the contains method of HashSet can be used to determine if an element is in the HashSet. Is there any difference between the contains method of TreeSet, which is also a Set, and the contains method of HashSet?

Answer: HashSet is based on HashMap, and its underlying data structure is a hash table. Therefore, the contains method of HashSet determines membership based on hashCode and equals.

TreeSet is based on TreeMap, and its underlying data structure is a red-black tree. Therefore, the contains method of TreeSet determines membership based on compareTo.
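
BigDecimal makes the difference easy to observe, because "1.0" and "1.00" are equal by compareTo but not by equals (equals also compares the scale). A minimal sketch:

import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class ContainsDemo {
    public static void main(String[] args) {
        Set<BigDecimal> hashSet = new HashSet<>();
        Set<BigDecimal> treeSet = new TreeSet<>();
        hashSet.add(new BigDecimal("1.0"));
        treeSet.add(new BigDecimal("1.0"));

        // HashSet uses hashCode/equals: 1.00 has a different scale, so not found
        System.out.println(hashSet.contains(new BigDecimal("1.00"))); // false
        // TreeSet uses compareTo: 1.0 and 1.00 compare as equal, so found
        System.out.println(treeSet.contains(new BigDecimal("1.00"))); // true
    }
}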

09 | Numeric Calculations: Pay Attention to Precision, Rounding, and Overflow Issues #

Question 1: BigDecimal provides 8 rounding modes. Can you explain their differences using examples?

Answer: @Darren's comment explains this clearly; I have lightly edited it into the answer below (a small demo follows the list).

  • The first rounding mode is ROUND_UP, which rounds away from zero: it always adds one to the digit preceding a non-zero discarded fraction. It is important to note that this rounding mode never decreases the magnitude of the value.

  • The second rounding mode is ROUND_DOWN, which rounds towards zero: it simply truncates, never adding one before discarding digits. It is important to note that this rounding mode never increases the magnitude of the value.

  • The third rounding mode is ROUND_CEILING, which rounds towards positive infinity. If the BigDecimal is positive, the rounding behavior is the same as ROUND_UP. If the BigDecimal is negative, the rounding behavior is the same as ROUND_DOWN. It is important to note that this rounding mode will never decrease the original value.

  • The fourth rounding mode is ROUND_FLOOR, which rounds towards negative infinity. If the BigDecimal is positive, the rounding behavior is the same as ROUND_DOWN. If the BigDecimal is negative, the rounding behavior is the same as ROUND_UP. It is important to note that this rounding mode will never increase the original value.

  • The fifth rounding mode is ROUND_HALF_UP, which rounds to the “nearest” number. If the discarded part is >= 0.5, the rounding behavior is the same as ROUND_UP. Otherwise, the rounding behavior is the same as ROUND_DOWN. It is important to note that this rounding mode is commonly known as “round half up”.

  • The sixth rounding mode is ROUND_HALF_DOWN, which rounds to the “nearest” number. If the discarded part is > 0.5, the rounding behavior is the same as ROUND_UP. Otherwise, the rounding behavior is the same as ROUND_DOWN. This rounding mode is commonly known as “round half down”.

  • The seventh rounding mode is ROUND_HALF_EVEN, which also rounds to the “nearest” number. This algorithm is known as “banker's rounding”. The rule is: if the digit to the left of the discarded fraction is odd, behave like ROUND_HALF_UP; if it is even, behave like ROUND_HALF_DOWN. In other words, ties round toward the even neighbor: 5.5 rounds to 6, while 2.5 rounds to 2.

  • The eighth rounding mode is ROUND_UNNECESSARY, which asserts that the requested operation has an exact result and no rounding is needed. If rounding is in fact necessary, an ArithmeticException is thrown.
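
As a quick demonstration, the sketch below rounds 2.5 and -2.5 under each mode (using the RoundingMode enum, the modern replacement for the legacy ROUND_* constants):

import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundingDemo {
    public static void main(String[] args) {
        // setScale(0, mode) keeps no decimal places under the given mode
        for (RoundingMode mode : new RoundingMode[]{
                RoundingMode.UP, RoundingMode.DOWN,
                RoundingMode.CEILING, RoundingMode.FLOOR,
                RoundingMode.HALF_UP, RoundingMode.HALF_DOWN,
                RoundingMode.HALF_EVEN}) {
            System.out.printf("%-10s  2.5 -> %s   -2.5 -> %s%n",
                    mode,
                    new BigDecimal("2.5").setScale(0, mode),
                    new BigDecimal("-2.5").setScale(0, mode));
        }
        // UNNECESSARY throws ArithmeticException when rounding is required:
        // new BigDecimal("2.5").setScale(0, RoundingMode.UNNECESSARY);
    }
}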

Question 2: Do you know how to define floating-point numbers and integers in databases such as MySQL? How can accurate calculations be performed with floating-point numbers?

Answer: In MySQL, integers come in several types depending on their range: TINYINT, SMALLINT, MEDIUMINT, INTEGER, and BIGINT. Floating-point types include single-precision FLOAT and double-precision DOUBLE, which have the same precision issues as float/double in Java.

There are two main ways to solve the precision problem:

First, use the DECIMAL type (similar to INT types, all of which are strict numerical data types), such as DECIMAL(13, 2) or DECIMAL(13, 4).

Second, use integers to store cents. However, this approach is prone to mistakes: forgetting to divide by 100 when reading, or to multiply by 100 when storing, can cause significant issues. Of course, it is also possible to store the integer and fractional parts in two separate integer fields.
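
Here is a minimal sketch of the “store cents” approach, assuming the amount lives in a BIGINT column; the rounding mode is an assumption, so pick whatever your business rules require:

import java.math.BigDecimal;
import java.math.RoundingMode;

public class MoneyDemo {
    // Convert a display amount to cents for storage in a BIGINT column.
    // HALF_EVEN is an assumption; use the mode your business requires.
    static long toCents(BigDecimal amount) {
        return amount.setScale(2, RoundingMode.HALF_EVEN)
                     .movePointRight(2)
                     .longValueExact();
    }

    // Convert stored cents back to a decimal amount, exactly.
    static BigDecimal fromCents(long cents) {
        return BigDecimal.valueOf(cents).movePointLeft(2);
    }

    public static void main(String[] args) {
        long stored = toCents(new BigDecimal("19.99"));
        System.out.println(stored + " -> " + fromCents(stored)); // 1999 -> 19.99
    }
}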

10 | Collection: Pitfalls of List Operations #

Question 1: When using the remove method of an ArrayList with elements of type Integer, is there any difference between passing an Integer object and passing an int primitive?

Answer: When using the remove method with an int primitive, it removes the element at the specified index and returns the removed value. When using the remove method with an Integer object, it removes the element with the specified value and returns whether the list contained the value before removal (whether the removal was successful).

To verify the difference between these two overloaded remove methods, let’s write a test code to compare them:

private static void removeByIndex(int index) {
    List<Integer> list = IntStream.rangeClosed(1, 10).boxed().collect(Collectors.toCollection(ArrayList::new));
    System.out.println(list.remove(index));
    System.out.println(list);
}

private static void removeByValue(Integer value) {
    List<Integer> list = IntStream.rangeClosed(1, 10).boxed().collect(Collectors.toCollection(ArrayList::new));
    System.out.println(list.remove(value));
    System.out.println(list);
}

Let’s test removeByIndex(4). From the output, we can see that the fifth item was removed and the method returned 5:

5
[1, 2, 3, 4, 6, 7, 8, 9, 10]

When calling removeByValue(Integer.valueOf(4)), the output shows that the value 4 was removed and the method returned true:

true
[1, 2, 3, 5, 6, 7, 8, 9, 10]

Question 2: When iterating over a List and using the remove method to delete elements, you often encounter ConcurrentModificationException. What is the reason for this and how can it be fixed?

Answer: The reason is that removing an element directly via List.remove changes modCount, while the iterator's expectedModCount is left unchanged; the next call to the iterator then triggers ConcurrentModificationException. Let's take a look at the relevant source code of the ArrayList internal iterator:

public E next() {
    checkForComodification();
    int i = cursor;
    if (i >= size)
        throw new NoSuchElementException();
    Object[] elementData = ArrayList.this.elementData;
    if (i >= elementData.length)
        throw new ConcurrentModificationException();
    cursor = i + 1;
    return (E) elementData[lastRet = i];
}

final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}

To fix this problem, there are two solutions:

The first solution is to use the remove method of the ArrayList iterator. After removing an element, the iterator's remove method resets its expectedModCount to the ArrayList's modCount, keeping the two consistent:

List<String> list =
    IntStream.rangeClosed(1, 10).mapToObj(String::valueOf).collect(Collectors.toCollection(ArrayList::new));

for (Iterator<String> iterator = list.iterator(); iterator.hasNext(); ) {
    String next = iterator.next();
    if ("2".equals(next)) {
        iterator.remove();
    }
}

System.out.println(list);

The second solution is to use the removeIf method directly; its default implementation in Collection is built on the iterator's remove method (ArrayList also ships its own optimized override):

List<String> list =
    IntStream.rangeClosed(1, 10).mapToObj(String::valueOf).collect(Collectors.toCollection(ArrayList::new));

list.removeIf(item -> item.equals("2"));

System.out.println(list);

11 | Handling Null Values: Distinguishing null and the Annoying NullPointerException #

Question 1: Both the key and value of ConcurrentHashMap cannot be null, while HashMap allows null. Do you know the reason behind this design? Can TreeMap, Hashtable, and other Map implementations allow null as key or value?

Answer: The reason, as mentioned by the author of ConcurrentHashMap, is:

“The main reason that nulls aren’t allowed in ConcurrentMaps (ConcurrentHashMaps, ConcurrentSkipListMaps) is that ambiguities that may be just barely tolerable in non-concurrent maps can’t be accommodated. The main one is that if map.get(key) returns null, you can’t detect whether the key explicitly maps to null vs the key isn’t mapped. In a non-concurrent map, you can check this via map.contains(key), but in a concurrent one, the map might have changed between calls.”

Allowing null values would introduce ambiguity: if map.get(key) returns null in a multi-threaded situation, we cannot tell whether the key explicitly maps to null or is not mapped at all. The same reasoning applies to keys. Additionally, I agree with the author that allowing null even in regular maps is debatable, since it increases the chances of making mistakes.

Hashtable is also thread-safe, so neither the key nor the value can be null.

TreeMap is not thread-safe, but since it requires sorting and uses the compareTo method of the keys, the key cannot be null, but the value can be null.
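
A minimal sketch that exercises all three maps:

import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

public class NullKeyValueDemo {
    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, null); // fine: HashMap allows one null key and null values
        System.out.println(hashMap.containsKey(null)); // true

        Map<String, String> treeMap = new TreeMap<>();
        treeMap.put("k", null); // fine: null values are allowed
        try {
            treeMap.put(null, "v"); // NPE: natural ordering calls key.compareTo
        } catch (NullPointerException e) {
            System.out.println("TreeMap rejects null keys");
        }

        Map<String, String> concurrentMap = new ConcurrentHashMap<>();
        try {
            concurrentMap.put("k", null); // NPE: null values are rejected
        } catch (NullPointerException e) {
            System.out.println("ConcurrentHashMap rejects null keys and values");
        }
    }
}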

Question 2: For the Hibernate framework, we can use the @DynamicUpdate annotation to achieve dynamic updating of fields. So, how can we achieve similar dynamic SQL functionality for the MyBatis framework, where the insert and update SQL only include non-null fields of the POJO?

Answer: MyBatis can achieve this through dynamic SQL:

<select id="findUser" resultType="User">
  SELECT * FROM USER
  WHERE 1=1
  <if test="name != null">
    AND name LIKE #{name}
  </if>
  <if test="email != null">
    AND email = #{email}
  </if>
</select>


If you're using MyBatis-Plus, it is even more convenient to implement similar dynamic SQL: add the @TableField annotation to the field and set its insertStrategy, updateStrategy, and whereStrategy attributes. For the usage of these three attributes, you can refer to the following source code or the official documentation:

/**
 * Field strategy for INSERT: how this field is concatenated into the INSERT statement
 * IGNORED: concatenated directly: insert into table_a(column) values (#{columnProperty})
 * NOT_NULL: insert into table_a(<if test="columnProperty != null">column</if>) values (<if test="columnProperty != null">#{columnProperty}</if>)
 * NOT_EMPTY: insert into table_a(<if test="columnProperty != null and columnProperty!=''">column</if>) values (<if test="columnProperty != null and columnProperty!=''">#{columnProperty}</if>)
 *
 * @since 3.1.2
 */
FieldStrategy insertStrategy() default FieldStrategy.DEFAULT;

/**
 * Field strategy for UPDATE: how this field is concatenated into the SET clause
 * IGNORED: concatenated directly: update table_a set column=#{columnProperty}; both null and empty-string properties are set
 * NOT_NULL: update table_a set <if test="columnProperty != null">column=#{columnProperty}</if>
 * NOT_EMPTY: update table_a set <if test="columnProperty != null and columnProperty!=''">column=#{columnProperty}</if>
 *
 * @since 3.1.2
 */
FieldStrategy updateStrategy() default FieldStrategy.DEFAULT;

/**
 * Field strategy for WHERE: how this field is concatenated into the WHERE condition
 * IGNORED: concatenated directly: column=#{columnProperty}
 * NOT_NULL: <if test="columnProperty != null">column=#{columnProperty}</if>
 * NOT_EMPTY: <if test="columnProperty != null and columnProperty!=''">column=#{columnProperty}</if>
 *
 * @since 3.1.2
 */
FieldStrategy whereStrategy() default FieldStrategy.DEFAULT;
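
For illustration, a hypothetical entity using these attributes might look like the sketch below (the User class and its columns are made up):

import com.baomidou.mybatisplus.annotation.FieldStrategy;
import com.baomidou.mybatisplus.annotation.TableField;
import com.baomidou.mybatisplus.annotation.TableName;

// Hypothetical entity for illustration
@TableName("user")
public class User {
    private Long id;

    // Concatenate this column into INSERT/UPDATE only when the property is
    // non-null, and into WHERE conditions only when it is non-null and non-empty
    @TableField(insertStrategy = FieldStrategy.NOT_NULL,
                updateStrategy = FieldStrategy.NOT_NULL,
                whereStrategy = FieldStrategy.NOT_EMPTY)
    private String name;

    // getters and setters omitted
}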


12 | Exception Handling: Don't Go in Blind When Something Goes Wrong #

Question 1: Regarding the pitfall of throwing exceptions in the finally block, do you think the program will take the return value from the try or catch block, or the return value from the finally block?

Answer: The return value from the finally block will be taken.

Semantically, the finally block is meant for resource cleanup at the end of a method, so we do not recommend putting a return statement there, as it muddles the logic. In the compiled bytecode, the finally block's code is duplicated before every return and throw in the try and catch blocks; therefore, if the finally block itself returns a value, it overrides the return value of the try block.
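
A minimal sketch that demonstrates the override:

public class FinallyReturnDemo {
    static int demo() {
        try {
            return 1;
        } finally {
            // Overrides the try block's return value (and would also
            // swallow any exception thrown in the try block)
            return 2;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}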

Question 2: For manually thrown exceptions, it is not recommended to directly use Exception or RuntimeException. It is usually recommended to reuse some standard exceptions from the JDK, such as IllegalArgumentException, IllegalStateException, UnsupportedOperationException. Can you explain their use cases and list more commonly reusable standard exceptions?

Answer: Let's first look at the use cases of IllegalArgumentException, IllegalStateException, and UnsupportedOperationException.

IllegalArgumentException: Invalid parameter exception, suitable for scenarios where the passed-in parameter does not meet the method requirements.

IllegalStateException: Invalid state exception, suitable for invalid state transitions in state machines, when the current logic execution state is not suitable for the corresponding operation, and other similar scenarios.

UnsupportedOperationException: Unsupported operation exception, suitable for scenarios where a certain operation is not supported in the implementation or environment.

Other reusable standard exceptions include IndexOutOfBoundsException, NullPointerException, and ConcurrentModificationException.
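
As a minimal sketch, the hypothetical methods below show a typical guard-clause usage for each of the three exceptions:

public class StandardExceptionsDemo {
    private boolean started;

    void setTimeout(long millis) {
        if (millis < 0) {
            // The argument itself is invalid
            throw new IllegalArgumentException("timeout must be >= 0, got " + millis);
        }
    }

    void start() {
        if (started) {
            // The argument is fine; the object's current state is not
            throw new IllegalStateException("already started");
        }
        started = true;
    }

    void removeEldest() {
        // This implementation simply does not support the operation
        throw new UnsupportedOperationException("removal is not supported");
    }
}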

That's it for the answers to the questions in Lessons 7-12 of our course.

If you have any areas that you don't understand about these questions or the underlying concepts, feel free to leave a comment in the comments section. You're also welcome to share today's content with your friends or colleagues and discuss it together.