Answer Compilation Code Compilation Thoughts Collection Three #

Today, let’s continue analyzing the post-lecture exercises from Lectures 13 to 20 of this course. These exercises, 16 questions in total, cover topics such as logging, file IO, serialization, Java 8 date-time classes, OOM, advanced Java features (reflection, annotations, and generics), and the Spring framework.

Now, let’s analyze each question in detail.

13 | Logging: Logging is not as simple as you think #

Question 1: In the case of “Why is my log duplicated?”, we stored INFO-level logs in _info.log and WARN and ERROR-level logs in _error.log. If we now want to store INFO and WARN-level logs in _info.log and ERROR-level logs in _error.log, how should we configure Logback?

Answer: There are two ways to achieve this configuration: using EvaluatorFilter directly or creating a custom Filter. Let’s look at each option.

The first option is to use the built-in EvaluatorFilter provided by Logback:

<filter class="ch.qos.logback.core.filter.EvaluatorFilter">
    <evaluator class="ch.qos.logback.classic.boolex.GEventEvaluator">
        <expression>
            e.level.toInt() == WARN.toInt() || e.level.toInt() == INFO.toInt()
        </expression>
    </evaluator>
    <OnMismatch>DENY</OnMismatch>
    <OnMatch>NEUTRAL</OnMatch>
</filter>

The second option is to create a custom Filter that parses multiple Levels separated by the “|” character in the configuration:

public class MultipleLevelsFilter extends Filter<ILoggingEvent> {

    @Getter
    @Setter
    private String levels;
    private List<Integer> levelList;

    @Override
    public FilterReply decide(ILoggingEvent event) {
        // Lazily parse the "|"-separated level names configured in logback.xml
        if (levelList == null && !StringUtils.isEmpty(levels)) {
            levelList = Arrays.stream(levels.split("\\|"))
                    .map(Level::valueOf)
                    .map(Level::toInt)
                    .collect(Collectors.toList());
        }
        // If no levels are configured, stay neutral instead of throwing an NPE
        if (levelList == null) {
            return FilterReply.NEUTRAL;
        }
        return levelList.contains(event.getLevel().toInt())
                ? FilterReply.ACCEPT
                : FilterReply.DENY;
    }
}

Then, we can use this MultipleLevelsFilter in the configuration file (complete configuration code can be found here):

<filter class="org.geekbang.time.commonmistakes.logging.duplicate.MultipleLevelsFilter">
    <levels>INFO|WARN</levels>
</filter>

Question 2: In a production-grade project, file logs need to be split and archived based on time and date to avoid having a single large file. Additionally, a certain number of historical logs need to be retained. Do you know how to configure this? You can find the answer in the official documentation.

Answer: The following configuration can be used as a reference. SizeAndTimeBasedRollingPolicy rolls the log file by both date and size, and limits how much history is retained:

<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
    <!-- Required: the pattern must contain both %d (the date) and %i (the index within the period);
         the path and file name here are just an example, adjust them to your own setup -->
    <fileNamePattern>logs/app.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
    <!-- Number of days of historical logs to keep -->
    <MaxHistory>30</MaxHistory>
    <!-- Maximum size of each log file -->
    <MaxFileSize>100MB</MaxFileSize>
    <!-- Maximum total size of all archived files.
    The optional totalSizeCap attribute controls the total size of all archives; when the limit is exceeded, the oldest archives are deleted asynchronously.
    totalSizeCap also requires maxHistory to be set, and the maxHistory limit is always applied first, followed by the totalSizeCap limit.
    -->
    <totalSizeCap>10GB</totalSizeCap>
</rollingPolicy>

14 | File IO: Implementing efficient and correct file read/write is not easy #

Question 1: When using the Files.lines method for stream processing, we need to use try-with-resources to release the resources. So, when using other methods in the Files class that return Stream wrapper objects, such as newDirectoryStream (returns a DirectoryStream), list, walk, and find (return Stream), do we also have resource release problems?

Answer: Yes. When using other methods of the Files class that return Stream wrapper objects (such as newDirectoryStream, list, walk, and find), resource release is still an issue: as mentioned in the lecture, if the stream is not closed explicitly, the underlying directory or file handle is released late, which may lead to resource leaks.
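For example, Files.list keeps a directory handle open until the returned Stream is closed, so the same try-with-resources pattern applies to it (and to walk, find, and lines). A minimal sketch with a temporary directory (class and file names are illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FilesStreamDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo");
        Files.createFile(dir.resolve("a.txt"));
        Files.createFile(dir.resolve("b.txt"));

        // Files.list returns a Stream backed by an open directory handle;
        // try-with-resources guarantees the handle is closed when we are done
        List<String> names;
        try (Stream<Path> stream = Files.list(dir)) {
            names = stream.map(p -> p.getFileName().toString())
                          .sorted()
                          .collect(Collectors.toList());
        }
        System.out.println(names); // [a.txt, b.txt]
    }
}
```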

Question 2: Are the file copying, renaming, and deleting operations provided by the File class and Files class in Java atomic?

Answer: No, the file copy, rename, and delete operations provided by Java’s File and Files classes are not atomic. These operations typically delegate to the operating system’s own file APIs, which generally do not provide database-style transaction semantics (and such semantics would be difficult to implement); even where partial guarantees exist, they tend to vary across platforms.

For example, the documentation of the File.renameTo method states:

“Many aspects of the behavior of this method are inherently platform-dependent: The rename operation might not be able to move a file from one filesystem to another, it might not be atomic, and it might not succeed if a file with the destination abstract pathname already exists. The return value should always be checked to make sure that the rename operation was successful.”

Similarly, the documentation of the Files.copy method states:

“Copying a file is not an atomic operation. If an IOException is thrown, then it is possible that the target file is incomplete or some of its file attributes have not been copied from the source file. When the REPLACE_EXISTING option is specified and the target file exists, then the target file is replaced. The check for the existence of the file and the creation of the new file may not be atomic with respect to other file system activities.”
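That said, a rename within a single file system can at least be requested atomically via StandardCopyOption.ATOMIC_MOVE; whether the request can be honored remains platform-dependent, and an AtomicMoveNotSupportedException is thrown when it cannot (e.g., when moving across file systems). A minimal sketch:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicMoveDemo {
    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".txt");
        Files.write(src, "hello".getBytes());

        // ATOMIC_MOVE asks the file system for an atomic rename; instead of
        // silently degrading, it throws AtomicMoveNotSupportedException if the
        // file system cannot guarantee atomicity
        Path dst = src.resolveSibling(src.getFileName() + ".moved");
        Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);

        System.out.println(Files.exists(src) + " " + Files.exists(dst));
        Files.delete(dst); // clean up the temporary file
    }
}
```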

15 | Serialization: Will you still be the same after going back and forth? #

Question 1: When discussing the serialization methods in Redis, we customized the RedisTemplate to use String serialization for the Key and JSON serialization for the Value, allowing Redis to directly convert the retrieved Value into the desired object type. So, can RedisTemplate be used to store and retrieve data with Long as the Value? Are there any pitfalls?

Answer: RedisTemplate can store a Long Value, but it does not guarantee that the retrieved Value is a Long: values within the Integer range come back as Integer, and only values beyond that range come back as Long. Here is a test code snippet:

@GetMapping("wrong2")
public void wrong2() {
    String key = "testCounter";

    // Set a value within the Integer range: it comes back as Integer, not Long
    countRedisTemplate.opsForValue().set(key, 1L);
    log.info("{} {}", countRedisTemplate.opsForValue().get(key), countRedisTemplate.opsForValue().get(key) instanceof Long);
    Long l1 = getLongFromRedis(key);

    // Set a value beyond the Integer range: it comes back as Long
    countRedisTemplate.opsForValue().set(key, Integer.MAX_VALUE + 1L);
    log.info("{} {}", countRedisTemplate.opsForValue().get(key), countRedisTemplate.opsForValue().get(key) instanceof Long);

    // Values converted through getLongFromRedis are always Long
    Long l2 = getLongFromRedis(key);
    log.info("{} {}", l1, l2);
}

private Long getLongFromRedis(String key) {
    Object o = countRedisTemplate.opsForValue().get(key);
    if (o instanceof Integer) {
        return ((Integer) o).longValue();
    }
    if (o instanceof Long) {
        return (Long) o;
    }
    return null;
}

The expected output is as follows:

1 false
2147483648 true
1 2147483648

You can see that when the value is set to 1, the retrieved type is not Long, but when it is set to 2147483648, it is. In other words, RedisTemplate alone does not guarantee that the retrieved Value is a Long.

That is why I wrote the getLongFromRedis helper: it checks the retrieved value and converts an Integer result to a Long, so callers always get a Long and avoid conversion errors.
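The boxing behavior at the root of this can be reproduced without Redis at all; generic deserializers often return an Integer for values that fit in an int and a Long only beyond that. A minimal, self-contained sketch of the same normalization (class and variable names are illustrative):

```java
public class LongValueDemo {
    // Same idea as getLongFromRedis, minus the Redis dependency:
    // normalize whatever numeric box comes back into a Long
    static Long toLong(Object o) {
        if (o instanceof Integer) {
            return ((Integer) o).longValue();
        }
        if (o instanceof Long) {
            return (Long) o;
        }
        return null;
    }

    public static void main(String[] args) {
        Object small = 1;                    // boxed as Integer
        Object big = Integer.MAX_VALUE + 1L; // boxed as Long
        System.out.println(small instanceof Long); // false
        System.out.println(big instanceof Long);   // true
        System.out.println(toLong(small) + " " + toLong(big)); // 1 2147483648
    }
}
```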

Question 2: Can you take a look at the implementation of the Jackson2ObjectMapperBuilder class source code (note the configure method) and analyze what else it does besides disabling FAIL_ON_UNKNOWN_PROPERTIES?

Answer: Besides disabling FAIL_ON_UNKNOWN_PROPERTIES, the Jackson2ObjectMapperBuilder class mainly does the following two things.

First, it sets some default values for Jackson, such as:

MapperFeature.DEFAULT_VIEW_INCLUSION is set to disabled;

DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES is set to disabled (this is where the behavior discussed in the lecture comes from).

Second, it automatically registers some Jackson modules that exist in the classpath, such as:

jackson-datatype-jdk8, which supports JDK 8 types such as Optional;

jackson-datatype-jsr310, which supports JDK 8 date and time types;

jackson-datatype-joda, which supports Joda-Time types;

jackson-module-kotlin, which supports Kotlin.

16 | Make good use of Java 8’s date and time classes and avoid some pitfalls of the “old three” #

Question 1: In this lecture, I have repeatedly emphasized that Date is a timestamp, a UTC time without the concept of time zone. So why does calling its toString method output a time zone like CST?

Answer: You can refer to the relevant toString source code, where you can see that it obtains the current default time zone for formatting (falling back to "GMT" if none can be obtained):

public String toString() {
    BaseCalendar.Date date = normalize();
    ...
    TimeZone zi = date.getZone();
    if (zi != null) {
        sb.append(zi.getDisplayName(date.isDaylightTime(), TimeZone.SHORT, Locale.US)); // zzz
    } else {
        sb.append("GMT");
    }
    sb.append(' ').append(date.getYear());  // yyyy
    return sb.toString();
}

private final BaseCalendar.Date normalize() {
    if (cdate == null) {
        BaseCalendar cal = getCalendarSystem(fastTime);
        cdate = (BaseCalendar.Date) cal.getCalendarDate(fastTime,
                                                        TimeZone.getDefaultRef());
        return cdate;
    }

    // Normalize cdate with the TimeZone in cdate first. This is
    // required for the compatible behavior.
    if (!cdate.isNormalized()) {
        cdate = normalize(cdate);
    }

    // If the default TimeZone has changed, then recalculate the
    // fields with the new TimeZone.
    TimeZone tz = TimeZone.getDefaultRef();
    if (tz != cdate.getZone()) {
        cdate.setZone(tz);
        CalendarSystem cal = getCalendarSystem(cdate);
        cal.getCalendarDate(fastTime, cdate);
    }
    return cdate;
}

Actually, to put it simply, the displayed time zone here is only used for presentation and does not mean that the Date class itself has time zone information.
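This is easy to verify: the same Date, i.e., the same epoch millisecond value, renders differently depending on the JVM’s default time zone, while the underlying timestamp never changes. A quick sketch:

```java
import java.util.Date;
import java.util.TimeZone;

public class DateToStringDemo {
    public static void main(String[] args) {
        Date d = new Date(0); // the epoch: no time zone is stored in the object

        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
        System.out.println(d.toString()); // rendered in UTC

        // toString re-normalizes against the (changed) default zone, so the
        // same Date now renders as 08:00 China Standard Time
        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"));
        System.out.println(d.toString());

        System.out.println(d.getTime()); // the underlying timestamp is still 0
    }
}
```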

Question 2: When date and time data needs to be stored in a database, MySQL has two data types, datetime and timestamp, that can be used to store date and time. Can you explain their differences and whether they contain time zone information?

Answer: The differences between datetime and timestamp mainly involve three aspects: space occupancy, representable time range, and time zone handling.

Space occupancy: datetime occupies 8 bytes; timestamp occupies 4 bytes.

Time range representation: datetime represents a range from “1000-01-01 00:00:00.000000” to “9999-12-31 23:59:59.999999”; timestamp represents a range from “1970-01-01 00:00:01.000000” to “2038-01-19 03:14:07.999999”.

Time zone: When saving timestamp, it is converted to UTC according to the current time zone, and when querying, it is converted back from UTC to the current time zone according to the current time zone; while datetime is just a fixed string time representation (only for MySQL itself).

Note that when we say datetime does not contain a time zone, we mean MySQL’s datetime type itself. When using timestamp, you must consider both the time zone of the Java process and the time zone of the MySQL connection; when using datetime, you only need to consider the time zone of the Java process (because MySQL datetime carries no time zone information, and JDBC converts a Java Timestamp to a MySQL datetime based on the connection’s serverTimezone).

If your project has internationalization requirements, I recommend using the timestamp and ensuring that your application server and database server have correctly matched local time zone configurations.

In fact, even if your project does not have internationalization requirements, it is necessary to have consistent time zone settings for the application server and database server at least.
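As an analogy in pure Java, the java.time API makes the same distinction explicit: an Instant is an absolute point in time (like timestamp), while a LocalDateTime is a zone-less wall-clock value (like datetime). A small sketch:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class ZoneDemo {
    public static void main(String[] args) {
        // timestamp-like: an absolute instant, effectively stored as UTC
        Instant instant = Instant.parse("2024-01-01T00:00:00Z");
        // datetime-like: a wall-clock value with no zone attached
        LocalDateTime wallClock = LocalDateTime.of(2024, 1, 1, 0, 0);

        // The same instant renders differently per zone...
        System.out.println(instant.atZone(ZoneId.of("UTC")).toLocalDateTime());
        System.out.println(instant.atZone(ZoneId.of("Asia/Shanghai")).toLocalDateTime());
        // ...while the zone-less wall-clock value is the same everywhere
        System.out.println(wallClock);
    }
}
```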

17 | Don’t assume that “automatic” means impossible to encounter OOM #

Question 1: Spring’s ConcurrentReferenceHashMap supports two ways, soft reference and weak reference, for Key and Value. Which way do you think is more suitable for caching?

Answer: The difference between soft reference and weak reference is that if an object is weakly reachable, it will be garbage collected regardless of whether the current memory is sufficient, while an object that is softly reachable will be garbage collected only when the memory is insufficient. Therefore, soft reference is stronger than weak reference.

So, using weak references for a cache would make cached entries live too short a time; soft references are more suitable for caching.
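A quick, non-deterministic illustration (System.gc() is only a hint to the JVM, so the post-GC lines may vary by JVM and heap state; that is why no output is asserted for them):

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class RefDemo {
    public static void main(String[] args) {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        SoftReference<Object> soft = new SoftReference<>(new Object());

        // Both referents are still reachable through the references before any GC
        System.out.println(weak.get() != null && soft.get() != null);

        // A weakly reachable object may be collected at the next GC regardless of
        // how much free memory there is...
        System.gc();
        System.out.println("weak cleared: " + (weak.get() == null));
        // ...while a softly reachable object is normally only collected under
        // memory pressure, so it usually survives an ordinary GC
        System.out.println("soft cleared: " + (soft.get() == null));
    }
}
```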

Question 2: When we need to execute some expressions dynamically, we can use the Groovy dynamic language to implement it: create an instance of the GroovyShell class and then call the evaluate method to execute the script dynamically. The problem with this approach is that it will repeatedly generate a large number of classes, increasing the GC burden of Metaspace and may cause OOM. Do you know how to avoid this problem?

Answer: Calling the evaluate method to execute scripts dynamically will generate a large number of classes. To avoid the potential OOM problem caused by this, we can wrap the script as a function, call the parse function to get a Script object, and then cache it. Later, we can directly use the invokeMethod method to call this function:
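The fix relies on parsing each distinct script only once and reusing the result. Stripped of Groovy specifics, the pattern is just a concurrent compute-if-absent cache; a self-contained sketch (the parse method here is a hypothetical stand-in for GroovyShell.parse):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ParseCacheDemo {
    // Counts how often "parsing" actually happens
    static final AtomicInteger PARSE_COUNT = new AtomicInteger();
    static final Map<String, String> SCRIPT_CACHE = new ConcurrentHashMap<>();

    // Hypothetical stand-in for the expensive GroovyShell.parse call,
    // which is what generates new classes in Metaspace
    static String parse(String script) {
        PARSE_COUNT.incrementAndGet();
        return "compiled:" + script;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            // computeIfAbsent compiles each distinct script once and reuses it,
            // so class generation (and Metaspace growth) stays bounded
            SCRIPT_CACHE.computeIfAbsent("return a+b", ParseCacheDemo::parse);
        }
        System.out.println(PARSE_COUNT.get()); // 1
    }
}
```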

private Object rightGroovy(String script, String method, Object... args) {
    // shell is the shared GroovyShell instance; SCRIPT_CACHE is a Map<String, Script>
    Script scriptObject;

    if (SCRIPT_CACHE.containsKey(script)) {
        // Hit: reuse the previously parsed Script object
        scriptObject = SCRIPT_CACHE.get(script);
    } else {
        // Miss: parse the script once and cache the resulting Script object
        scriptObject = shell.parse(script);
        SCRIPT_CACHE.put(script, scriptObject);
    }
    // Invoke the function defined in the script instead of re-evaluating it
    return scriptObject.invokeMethod(method, args);
}
Now let’s look at a circular dependency pitfall. Suppose TestC and TestD depend on each other through constructor injection, for example:

@Getter
private TestC testC;

@Autowired
public TestD(TestC testC) {
    this.testC = testC;
}

In this case, Spring throws a BeanCurrentlyInCreationException, indicating that the circular dependency cannot be resolved: instantiating TestC requires instantiating TestD first, but instantiating TestD in turn requires instantiating TestC, which forms an endless loop.

There are two main ways to solve this circular dependency issue.

The first is to change the constructor injection to setter (or field) injection:

@Getter
private TestD testD;

@Autowired
public void setTestD(TestD testD) {
    this.testD = testD;
}

This avoids the constructor circular dependency, because Spring can expose an early reference to a partially initialized bean before its setters run.

The second is to use the @Lazy annotation to delay the injection. For example:

@Component
public class TestC {

    @Getter
    private TestD testD;

    @Autowired
    public TestC(@Lazy TestD testD) {
        this.testD = testD;
    }
}

In this case, the injected value is not the actual bean but a proxy; the real instance is obtained through the proxy only when it is actually used. So, using @Lazy solves the problem of circular dependencies preventing instantiation.


20 | Spring Framework: The framework does a lot of work for us but also brings complexity #

Question 1: In addition to the four pointcut designators execution, within, @within, and @annotation mentioned in these two lectures, Spring AOP also supports this, target, args, @target, and @args. Can you explain what these five designators do?

Answer: For the functions of these designators, you can refer to the official documentation, which explains them very clearly.

In summary, based on the usage scenario, the following designators are recommended:

  • For method signature matching, use execution.
  • For type matching, use within (matches declaring types), this (matches the type of the proxy instance), target (matches the type of the target instance behind the proxy), and args (matches argument types).
  • For annotation matching, use @annotation (methods annotated with a given annotation), @target (target classes annotated with a given annotation), and @args (methods whose runtime argument types are annotated with a given annotation).

You may ask why @within is not mentioned.

In fact, for Spring’s default proxy-based AOP (JDK dynamic proxies or CGLIB), join points can only be method executions, so @within and @target behave identically. If you switch to AspectJ, however, the two behave differently: @within intercepts more join points (e.g., static initializers and field accesses). For Spring AOP, using @target is generally sufficient.

Question 2: The PropertySources attribute of Spring’s Environment can contain multiple PropertySources, and the ones in front take priority. Can we use this feature to automatically assign values to properties in the configuration file? For example, we can define the placeholders %%MYSQL.URL%%, %%MYSQL.USERNAME%%, and %%MYSQL.PASSWORD%% for the database connection string, username, and password. When configuring the data source, we only set these placeholders, and the framework replaces them with the real database information at startup based on the application name application.name. This way, the production database credentials never appear in the configuration file, which is more secure.

Answer: We can use the priority feature of PropertySource to automatically assign values to properties in the configuration file. The main logic is to iterate through the current property values, find the properties that match the placeholders, and replace the values of these properties with the actual database information. Then, create a new PropertiesPropertySource with the modified properties and add it as the first property source. In this way, the values in this PropertiesPropertySource will take effect.

The main source code is as follows:

public static void main(String[] args) {
    Utils.loadPropertySource(CommonMistakesApplication.class, "db.properties");
    new SpringApplicationBuilder()
            .sources(CommonMistakesApplication.class)
            .initializers(context -> initDbUrl(context.getEnvironment()))
            .run(args);
}

private static final String MYSQL_URL_PLACEHOLDER = "%%MYSQL.URL%%";
private static final String MYSQL_USERNAME_PLACEHOLDER = "%%MYSQL.USERNAME%%";
private static final String MYSQL_PASSWORD_PLACEHOLDER = "%%MYSQL.PASSWORD%%";

private static void initDbUrl(ConfigurableEnvironment env) {

    String dataSourceUrl = env.getProperty("spring.datasource.url");
    String username = env.getProperty("spring.datasource.username");
    String password = env.getProperty("spring.datasource.password");

    if (dataSourceUrl != null && !dataSourceUrl.contains(MYSQL_URL_PLACEHOLDER))
        throw new IllegalArgumentException("Please use the placeholder " + MYSQL_URL_PLACEHOLDER + " to replace the database URL configuration!");

    if (username != null && !username.contains(MYSQL_USERNAME_PLACEHOLDER))
        throw new IllegalArgumentException("Please use the placeholder " + MYSQL_USERNAME_PLACEHOLDER + " to replace the database username configuration!");

    if (password != null && !password.contains(MYSQL_PASSWORD_PLACEHOLDER))
        throw new IllegalArgumentException("Please use the placeholder " + MYSQL_PASSWORD_PLACEHOLDER + " to replace the database password configuration!");

    // Here I hard-coded the values, in actual applications, you can retrieve them from an external service

    Map<String, String> property = new HashMap<>();
    property.put(MYSQL_URL_PLACEHOLDER, "jdbc:mysql://localhost:6657/common_mistakes?characterEncoding=UTF-8&useSSL=false");
    property.put(MYSQL_USERNAME_PLACEHOLDER, "root");
    property.put(MYSQL_PASSWORD_PLACEHOLDER, "kIo9u7Oi0eg");

    // Save the modified configuration properties
    Properties modifiedProps = new Properties();

    // Iterate through the current property values, find the properties that match the placeholders,
    // and replace the values with the actual database information
    StreamSupport.stream(env.getPropertySources().spliterator(), false)
            .filter(ps -> ps instanceof EnumerablePropertySource)
            .map(ps -> ((EnumerablePropertySource) ps).getPropertyNames())
            .flatMap(Arrays::stream)
            .forEach(propKey -> {
                String propValue = env.getProperty(propKey);
                property.entrySet().forEach(item -> {
                    // If the originally configured property value contains our placeholder
                    if (propValue != null && propValue.contains(item.getKey())) {
                        // Use replace (literal) rather than replaceAll (regex):
                        // the placeholder contains regex metacharacters such as '.'
                        modifiedProps.put(propKey, propValue.replace(item.getKey(), item.getValue()));
                    }
                });
            });

    if (!modifiedProps.isEmpty()) {
        log.info("modifiedProps: {}", modifiedProps);
        env.getPropertySources().addFirst(new PropertiesPropertySource("mysql", modifiedProps));
    }
}
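Stripped of the Spring Environment plumbing, the core of the trick is a map-driven literal string substitution over the configured properties; a minimal sketch (property names and values are illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class PlaceholderDemo {
    public static void main(String[] args) {
        // Real values, e.g. looked up from a CMDB by application ID
        Map<String, String> real = new HashMap<>();
        real.put("%%MYSQL.USERNAME%%", "root");

        // What the developer actually writes in the configuration file
        Properties configured = new Properties();
        configured.setProperty("spring.datasource.username", "%%MYSQL.USERNAME%%");

        // Replace placeholders with real values into a new Properties object,
        // which would then be added as the highest-priority PropertySource
        Properties modified = new Properties();
        configured.forEach((k, v) -> real.forEach((placeholder, actual) -> {
            String value = (String) v;
            if (value.contains(placeholder)) {
                // String.replace does a literal replacement, avoiding regex pitfalls
                modified.put(k, value.replace(placeholder, actual));
            }
        }));

        System.out.println(modified.getProperty("spring.datasource.username")); // root
    }
}
```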

I have updated my implementation in the source code corresponding to lecture 20 on GitHub. You can click here to view it. Some students may ask what the significance of doing this is and why not directly use a configuration framework like Apollo.

In fact, our goal is to prevent developers from configuring database credentials manually: at startup, the placeholders are automatically replaced with the actual configuration (for example, retrieved from a CMDB by application ID). You may ask what happens if one application ID corresponds to multiple databases; in general, in a microservice system one application should correspond to one database. This way, no one other than the program itself has access to the production database credentials, which is more secure.

These are the answers to the questions from lectures 13 to 20 of this course.

If you have any further questions or unclear points about these questions and the underlying knowledge points, please feel free to leave a comment and discuss with me in the comment section. You are also welcome to share today’s content with your friends or colleagues and exchange ideas.