03 Date Types Timestamp Might Be a Pitfall

03 Date Types - TIMESTAMP Might Be a Pitfall #

In the previous two lessons, I introduced you to the common numeric and string types in the MySQL database. In addition to these types, date types are also common.

Almost every business table has a date column used to record the time when each record is created or modified. For example, a user table may have a date column to record the time when a user registers and the time of their last login. Similarly, in the e-commerce industry, the order table (the core business table) may have a column for the time when an order is created, and if the payment time exceeds the order creation time, the order may be automatically canceled by the system.

Although date types are common, they can also be prone to mistakes in table design. For example, many developers tend to use integers to store date types and often overlook the potential performance impact of different date types. So it is necessary for you to study this lesson carefully, apply it to your own business, and design the date types correctly.

Date Types #

The common date types in MySQL are YEAR, DATE, TIME, DATETIME, and TIMESTAMP. Since most business scenarios require dates to be accurate to the second, the commonly used date types in table design are DATETIME and TIMESTAMP. Now, let’s take a deep dive into these two types and their practical applications in design.

DATETIME #

The DATETIME type is displayed as: YYYY-MM-DD HH:MM:SS and occupies a fixed 8 bytes.

Starting from MySQL 5.6, the DATETIME type supports milliseconds. The N in DATETIME(N) represents the precision of milliseconds. For example, DATETIME(6) can store milliseconds with 6 digits. Some date functions also support precision up to milliseconds, such as the commonly used functions NOW and SYSDATE:

mysql> SELECT NOW(6);

+----------------------------+
| NOW(6)                     |
+----------------------------+
| 2020-09-14 17:50:28.707971 |
+----------------------------+
1 row in set (0.00 sec)

Users can set the initial value of DATETIME to the current time and set it to automatically update the current time. For example, on the previously designed user table User, I modified the definitions of register_date and last_modify_date:

CREATE TABLE User (
    id BIGINT NOT NULL AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    sex CHAR(1) NOT NULL,
    password VARCHAR(1024) NOT NULL,
    money INT NOT NULL DEFAULT 0,
    register_date DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
    last_modify_date DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
    CHECK (sex = 'M' OR sex = 'F'),
    PRIMARY KEY(id)
);

In the User table above, the column register_date represents the registration time, and DEFAULT CURRENT_TIMESTAMP represents that when a record is inserted and no time is specified, the default is the current time.

The column last_modify_date represents the last modified time of the current record, and DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6) means that it will be updated to the current time every time it is modified.

This design ensures that when a user’s money (money field) changes, the last_modify_date records the time of the last change. Let’s look at the example below:

mysql> SELECT name,money,last_modify_date FROM User WHERE name = 'David';

+-------+-------+----------------------------+
| name  | money | last_modify_date           |
+-------+-------+----------------------------+
| David |   100 | 2020-09-13 08:08:33.898593 |
+-------+-------+----------------------------+
1 row in set (0.00 sec)

mysql> UPDATE User SET money = money - 1 WHERE name = 'David';

Query OK, 1 row affected (0.06 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> SELECT name,money,last_modify_date FROM User WHERE name = 'David';

+-------+-------+----------------------------+
| name  | money | last_modify_date           |
+-------+-------+----------------------------+
| David |    99 | 2020-09-14 18:29:17.056327 |
+-------+-------+----------------------------+
1 row in set (0.00 sec)

As you can see, when the user’s money is modified, the corresponding last_modify_date field is also updated to the time of the change.

TIMESTAMP #

In addition to DATETIME, there is another type called TIMESTAMP, which is a timestamp type in date types. It actually stores the number of milliseconds from ‘1970-01-01 00:00:00’ until now. In MySQL, since the TIMESTAMP type occupies 4 bytes, its time storage limit can only reach ‘2038-01-19 03:14:07’.

Similar to DATETIME, starting from MySQL 5.6, the TIMESTAMP type also supports milliseconds. Unlike DATETIME, if it contains milliseconds, the TIMESTAMP type occupies 7 bytes, while DATETIME always occupies 8 bytes regardless of whether it stores millisecond information.

The biggest advantage of the TIMESTAMP type is that it can have a time zone property because it is essentially derived from milliseconds. If your business needs to correspond to different time zones of different countries, then the TIMESTAMP type is a good choice. For example, in news-related businesses, users usually want to know the time of a news article published in their own country, so TIMESTAMP is a suitable choice.

In addition, some countries observe daylight saving time. They artificially advance or retreat the time by 1 hour depending on the season. The TIMESTAMP type with time zone property can solve this problem.

The parameter time_zone specifies the current time zone in use, and the default is SYSTEM, which uses the operating system’s time zone. Users can specify the desired time zone through this parameter.

If you want to use the time zone feature of TIMESTAMP, you can modify the data type of the registration time field in the previous user table User from DATETIME(6) to TIMESTAMP(6) using the following statement:

ALTER TABLE User
CHANGE register_date
register_date TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6);

Now, by setting different time_zone values, you can observe the registration time in different time zones:

mysql> SELECT name, regis ter_date FROM User WHERE name = 'David';

+-------+----------------------------+
| name  | register_date              |
+-------+----------------------------+
| David | 2018-09-14 18:28:33.898593 |
+-------+----------------------------+
1 row in set (0.00 sec)

mysql> SET time_zone = '-08:00';

Query OK, 0 rows affected (0.00 sec)

mysql> SELECT name,register_date FROM User WHERE name = 'David';

+-------+----------------------------+
| name  | register_date              |
+-------+----------------------------+
| David | 2018-09-14 18:28:33.898593 |
+-------+----------------------------+
1 row in set (0.00 sec)
name register_date
David 2018-09-14 02:28:33.898593
1 row in set (0.00 sec)

From the example above, you can see that the time zone in China is +08:00 and the time zone in the United States is -08:00. Therefore, after changing to the US time zone, you can see that the user’s registration time is delayed by 16 hours compared to before. Of course, directly adding or subtracting time zones is not intuitive and requires a good understanding of the time zone table for each country. In MySQL, you can directly set the name of the time zone, for example:

mysql> SET time_zone = 'America/Los_Angeles';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT NOW();
+---------------------+
| NOW()               |
+---------------------+
| 2020-09-14 20:12:49 |
+---------------------+
1 row in set (0.00 sec)

mysql> SET time_zone = 'Asia/Shanghai';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT NOW();
+---------------------+
| NOW()               |
+---------------------+
| 2020-09-15 11:12:55 |
+---------------------+
1 row in set (0.00 sec)

Now that you understand the types of time fields, I will share how to use them effectively in real business design.

Practical Table Structure Design for Business #

DATETIME vs TIMESTAMP vs INT, How to Choose? #

When designing table structures, developers usually have 3 choices for storing date fields: DATETIME, TIMESTAMP, and INT.

INT type directly stores the number of milliseconds from ‘1970-01-01 00:00:00’ to the present, which is essentially the same as TIMESTAMP. Therefore, using INT is not as good as using TIMESTAMP directly.

Of course, some students may think that INT has better performance than TIMESTAMP. However, with each CPU able to execute billions of calculations per second, there is no need to worry about the performance of this conversion. More importantly, using INT to store dates will drive DBAs and data analysts crazy during later operations and data analysis because the operability of INT is very poor.

Some students are also enthusiastic about using the TIMESTAMP type to store dates because TIMESTAMP only occupies 4 bytes, which is half the storage space of DATETIME.

However, if you want to be accurate to the millisecond, TIMESTAMP will take 7 bytes, which is not much different from the 8 bytes of DATETIME. On the other hand, we are now very close to the maximum value of TIMESTAMP, ‘2038-01-19 03:14:07’, which is a problem that needs to be carefully considered by the development team.

In general, I recommend using the DATETIME type. For time zone issues, it can be converted by the front-end or service layer, and there is no need to solve it in the database.

Do Not Ignore the Performance Issue of TIMESTAMP #

As mentioned earlier, the upper limit value of TIMESTAMP, the year 2038, will arrive soon, and at that time, business will face a problem similar to the Y2K bug. In addition, TIMESTAMP also has potential performance issues.

Although the conversion from the number of milliseconds to the TIMESTAMP type itself does not require many CPU instructions, it does not directly cause performance problems. However, if the default operating system time zone is used, each time the time is calculated through the time zone, the underlying system function __tz_convert() needs to be called, and this function requires additional locking operations to ensure that the operating system time zone has not been modified at that time. Therefore, when there are large-scale concurrent accesses, there will be two problems due to hot resource competition.

  • Performance not as good as DATETIME: DATETIME does not have timezone conversion issues.
  • Performance fluctuations: When dealing with massive concurrency, there may be performance fluctuations.

To optimize the usage of TIMESTAMP, it is strongly recommended to use an explicit timezone instead of the system timezone. For example, set the timezone explicitly in the configuration file instead of using the system timezone:

[mysqld]

time_zone = "+08:00"

Lastly, test the performance of TIMESTAMP and DATETIME using the mysqlslap command:

# Compare the performance between time_zone set as System and Asia/Shanghai

mysqlslap -uroot --number-of-queries=1000000 --concurrency=100 --query='SELECT NOW()'

The final performance comparison is as follows:

3(1).png

From the table, it can be seen that explicitly specifying the timezone performs much better than using the system timezone directly. Therefore, it is recommended to use DATETIME for date fields, which does not involve timezone conversion. Even if using TIMESTAMP, it is necessary to explicitly configure the timezone in the database, instead of using the system timezone.

Table structure design specification: Each record should have a time field #

When designing the table structure, it is strongly recommended to add a DATETIME type last_modify_date field to each core business table, and set up an automatic update mechanism for modifications, marking the last modification time for each record.

For example, in the previous User table, the last_modify_date field is used to indicate the time of the last modification:

CREATE TABLE User (
    id BIGINT NOT NULL AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    sex CHAR(1) NOT NULL,
    password VARCHAR(1024) NOT NULL,
    money INT NOT NULL DEFAULT 0,
    register_date DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
    last_modify_date DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
    CHECK (sex = 'M' OR sex = 'F'),
    PRIMARY KEY(id)
);

By using the ON UPDATE CURRENT_TIMESTAMP(6) option on the last_modify_date field, the last_modify_date will be automatically updated to the current time whenever the record is updated.

The benefit of this design is that users can know the most recent update time of each user, facilitating further processing. For example, in an e-commerce order table, it is convenient to handle orders that have exceeded the payment timeout; in financial transactions, the last modification time of user funds can be used for fund reconciliation and more.

In later content, we will also discuss the design and implementation of MySQL database master-slave logical data reconciliation, which will utilize the last_modify_date field.

Summary #

The commonly used date types in MySQL are DATETIME and TIMESTAMP. However, due to performance issues, it is recommended to use DATETIME as much as possible. Here is a summary of today’s key points:

  • Starting from MySQL 5.6, DATETIME and TIMESTAMP support precision up to milliseconds.
  • DATETIME occupies 8 bytes, TIMESTAMP occupies 4 bytes. DATETIME(6) still occupies 8 bytes, while TIMESTAMP(6) occupies 7 bytes.
  • The upper limit for storing dates in TIMESTAMP is 2038-01-19 03:14:07, using TIMESTAMP in business has risks.
  • When using TIMESTAMP, it is necessary to explicitly set the timezone and not use the default system timezone, recommend setting the parameter time_zone = ‘+08:00’ in the configuration file.
  • Recommended usage of DATETIME for date types instead of TIMESTAMP and INT.
  • When designing table structures, it is recommended to include a last_modify_date field in each core business table to record the last modification time for each record.