43 To Use or Not to Use Partitioned Tables #

I am often asked this question: What are the problems with partitioned tables, and why does the company’s standard not allow their use? Today, let’s talk about the behavior of using partitioned tables, and then answer this question together.

What is a partitioned table? #

To illustrate the organizational form of partitioned tables, let me first create a table t:

CREATE TABLE `t` (
  `ftime` datetime NOT NULL,
  `c` int(11) DEFAULT NULL,
  KEY (`ftime`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY RANGE (YEAR(ftime))
(PARTITION p_2017 VALUES LESS THAN (2017) ENGINE = InnoDB,
 PARTITION p_2018 VALUES LESS THAN (2018) ENGINE = InnoDB,
 PARTITION p_2019 VALUES LESS THAN (2019) ENGINE = InnoDB,
PARTITION p_others VALUES LESS THAN MAXVALUE ENGINE = InnoDB);
 
insert into t values('2017-4-1',1),('2018-4-1',1);

Figure 1 Disk files of table t

I have initialized two rows of records in table t. According to the defined partitioning rule, these two rows of records are stored in the p_2018 and p_2019 partitions respectively.

As you can see, this table contains one .frm file and 4 .ibd files, with each partition corresponding to one .ibd file. That is:

For the storage engine, these are 4 tables;
For the server layer, this is 1 table.

You may think that both of these statements are nonsense. Actually, they are very important and can help us understand the execution logic of partitioned tables.

Behavior of the storage engine for partitioned tables #

Let’s start with an example of adding range locks to a partitioned table, in order to illustrate that for InnoDB, these are 4 tables.

Figure 2 Example of range locks in partitioned tables

Here, let me review the locking rules I introduced to you in [Article 21].

When we initialized the table t, we only inserted two rows of data, with the values of ftime being ‘2017-4-1’ and ‘2018-4-1’. The SELECT statement in session A locks the gap between these two records on the ftime index. If it were an ordinary table, at point T1, the gap and the lock state on the ftime index of table t should look like Figure 3.

Figure 3 Locking range of ordinary tables

This means that the gap between ‘2017-4-1’ and ‘2018-4-1’ is locked. So, both of the INSERT statements in session B should enter the lock-wait state.

However, as you can see from the experimental results above, the first INSERT statement in session B can be executed successfully. This is because, for the storage engine, p_2018 and p_2019 are two different tables. In other words, the next record after 2017-4-1 is not 2018-4-1, but the supremum of partition p_2018. Therefore, at point T1, the range and locking state on the ftime index of table t is actually like Figure 4:

Figure 4 Locking range of partitioned table t

Due to the partitioning rules of the partitioned table, the SELECT statement in session A actually only operates on partition p_2018, so the locking range is the deep green part in Figure 4.

Therefore, when session B wants to write a row with ftime as 2018-2-1, it can be successfully executed, but to write the record 2017-12-1, it needs to wait for the range lock of session A.

Figure 5 is a partial result of show engine innodb status at this time.

Figure 5 Information about session B being locked

After looking at the example of the InnoDB engine, let’s take a look at an example of a partitioned MyISAM table.

First, I use “alter table t engine=myisam” to change the table t to a MyISAM table; then, I use the following example to illustrate that for the MyISAM engine, these are 4 tables.

Figure 6 Validation of MyISAM table locks

In session A, I set the execution time of this statement to 100 seconds with sleep(100). Since the MyISAM engine only supports table locks, this UPDATE statement will lock the read on the entire table t.

But what we see as the result is that the first SELECT statement in session B can be executed normally, and the second statement enters the lock-wait state.

This is because the table lock of MyISAM is implemented at the engine level. The table lock added by session A is actually locked on partition p_2018. Therefore, it only blocks queries executed on this partition, and queries on other partitions are not affected.

At this point, you may say that partitioned tables seem good, why not use them? One important reason we use partitioned tables is when a single table becomes too large. In that case, if we don’t use partitioned tables, we would have to use manual sharding.

Next, let’s see the difference between manual sharding and partitioned tables.

For example, if we partition by year, we would create separate ordinary tables t_2017, t_2018, t_2019, and so on. The logic of manual sharding is to find all the partitions that need to be updated, and then execute the updates one by one. In terms of performance, there is no substantial difference compared to partitioned tables.

Partitioned tables and manual sharding, one is determined by the server layer to use which partition, and the other is determined by the application layer code to use which partition. Therefore, from the perspective of the storage engine, these two methods are also the same.

In fact, the main difference between these two solutions lies in the server layer. When viewed from the server layer, we have to mention a widely criticized issue with partitioned tables: opening tables.

Partitioning strategy #

Whenever the first visit of a partitioned table is made, MySQL needs to access all the partitions. A typical error is as follows: if a partitioned table has many partitions, for example, more than 1000 partitions, and MySQL is started with the default value of the open_files_limit parameter, which is 1024, then when accessing this table, it will try to open all the files, causing the number of open table files to exceed the limit and report an error.

The figure below shows a situation where an error occurs after executing an insert statement on a table t_myisam that contains many partitions.

Figure 7 Error with the insert statement

As you can see, this insert statement clearly only needs to access one partition, but the statement cannot be executed.

At this point, you must have guessed from the table name that I am using the MyISAM engine. Yes, because if the InnoDB engine is used, this problem will not occur.

The partition strategy used by MyISAM partitioned tables is called generic partitioning. Each partition is controlled by the server layer every time it is accessed. The generic partitioning strategy existed in MySQL’s code from the beginning of supporting partitioned tables. The implementation of file management and table management in the generic partitioning strategy is rough, resulting in serious performance issues.

Starting from MySQL 5.7.9, the InnoDB engine introduced the native partitioning strategy. This strategy manages the behavior of opening partitions internally in InnoDB itself.

Starting from MySQL 5.7.17, MyISAM partitioned tables have been marked as deprecated, meaning “it is not recommended to use this feature from this version onwards, please use alternative solutions. This feature will be deprecated in future versions.”

Starting from MySQL 8.0, it is not allowed to create MyISAM partitioned tables anymore. Only engines that have implemented the native partitioning strategy are allowed to be created. Currently, only InnoDB and NDB engines support the native partitioning strategy.

Next, let’s take a look at the behavior of partitioned tables in the server layer.

Server layer behavior of partitioned tables #

From the server layer’s perspective, a partitioned table is just a table.

What does this mean? I will explain this to you using the example below. Figures 8 and 9 show the operation sequence and the execution result for this example, respectively.

Figure 8 MDL lock of the partitioned table

Figure 9 Result of show processlist

As you can see, even though session B only needs to operate on the p_2107 partition, because session A holds the MDL lock of the entire table t, session B’s alter statement is blocked.

This is also what DBA colleagues often say: partitioned tables have a greater impact when performing DDL operations. If you are using ordinary sharding, when you truncate a shard, it will definitely not conflict with query statements on another shard due to the MDL lock.

In summary:

MySQL needs to access all partitions when opening a partitioned table for the first time.
From the server layer’s perspective, it treats a partitioned table as a single table, so all partitions share the same MDL lock.
From the engine layer’s perspective, it treats each partition as a different table, so the execution process after the MDL lock will only access the necessary partitions based on the partition table rules. And regarding the judgment of “necessary partitioning”, it is implemented based on the WHERE condition in the SQL statement, combined with the partitioning rules. For example, in the example above, if the WHERE condition is where ftime='2018-4-1', according to the partitioning rule, the value calculated by the year function is 2018, so it will fall into the partition p_2019.

However, if the WHERE condition is changed to where ftime>='2018-4-1', although the query result is the same, based on the WHERE condition, the partitions p_2019 and p_others need to be accessed.

If the WHERE condition of the query statement does not contain the partition key, then all partitions will need to be accessed. Of course, this is not a problem with partitioned tables. Even when using business sharding, if the partition key is not used in the WHERE condition, all partitions must be accessed.

Now that we have understood the concept of partitioned tables, what scenarios are suitable for using partitioned tables?

Application Scenarios of Partitioned Tables #

One obvious advantage of partitioned tables is their transparency to the business. Compared to user sharding, using partitioned tables results in more concise business code. Additionally, partitioned tables allow for convenient cleaning of historical data.

If a business runs for a long enough period of time, there is often a need to delete historical data based on time. In this case, a partitioned table based on time partitioning can directly drop a partition with the SQL syntax alter table t drop partition ..., thereby deleting expired historical data.

This alter table t drop partition ... operation directly deletes the partition files, similar to dropping a regular table. Compared to deleting data with the delete statement, the advantages are faster speed and smaller impact on the system.

Summary #

In this article, I mainly introduced how the server layer and the engine layer handle partitioned tables. I hope that through these introductions, you will have a clearer idea of whether to choose to use partitioned tables.

It should be noted that I used range partitioning as an example to introduce. In reality, MySQL also supports hash partitioning, list partitioning, and other partitioning methods. You can refer to the manual when you need to use them.

In actual use, partitioned tables have two unavoidable issues compared to user sharding: the need to access all partitions on the first access, and the shared MDL lock.

Therefore, if you want to use partitioned tables, do not create too many partitions. I have seen a user who adopted a daily partitioning strategy and pre-created partitions for 10 years. In this case, the performance of accessing the partitioned table naturally becomes poor. There are two points to note here:

Finer partitions are not necessarily better. In fact, a table or a single partition with ten million rows is already a small table for the current hardware capabilities, as long as there are no particularly large indexes.
Don’t preallocate too many partitions. Only create them before using them. For example, if partitioned by month, create the 12 new partitions for the next year at the end of each year. For historical partitions without data, drop them in a timely manner.

As for other issues with partitioned tables, such as the slow query performance when accessing data across multiple partitions, it is generally not a problem with the partitioned table itself, but rather a problem with the amount of data or the usage method.

Of course, if your team is already maintaining a mature middleware for database sharding, using business sharding is better in terms of no additional complexity for business development colleagues and more intuitive to DBAs.

Finally, I want to leave you with a question to ponder:

In the example we used, there is no use of an auto-increment primary key. Suppose we now want to create an auto-increment field id. MySQL requires the primary key in a partitioned table to include the partitioning field. How would you define the primary key of this table based on table t? Why would you define it this way?

You can write your conclusions and analysis in the comments, and I will discuss this question with you at the end of the next article. Thank you for listening, and feel free to share this article with more friends to read together.

Previous Question Time #

There weren’t enough comments after the previous article, maybe many students haven’t remembered to read it yet. We will supplement this section of “Previous Question Time” when there are more comments in the future.

@夹心面包 mentioned that wildcards are supported in the GRANT statement: “_” represents any single character, and “%” represents any string. This technique is quite convenient when there are multiple databases on the same shard in a sharding scheme. However, I personally believe that when assigning permissions, the precision of control should be given priority.