39 Why Auto-Increment Primary Key is Not Continuous #

In [Article 4], we mentioned that an auto-increment primary key can keep the primary key index in an ascending order, avoiding page splits and making the index more compact.

I have seen business designs that rely on the continuity of auto-increment primary keys, that is, designs that assume the generated ids are always consecutive. In reality this assumption is wrong: auto-increment primary keys are not guaranteed to be consecutive.

In today’s article, we will discuss this issue and explore when auto-increment primary keys may have “gaps”.

To illustrate this, let’s create a table called t, with the id as the auto-increment primary key field and c as the unique index.

CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `c` (`c`)
) ENGINE=InnoDB;

Where is the auto-increment value stored? #

Insert a row of data into the empty table t by executing the insert into t values(null, 1, 1); statement, and then execute the show create table command. You will see the following result:

Figure 1: The auto-generated AUTO_INCREMENT value

As you can see, the table definition contains AUTO_INCREMENT=2, which means that the next time data is inserted and an auto-increment value needs to be generated, it will be id=2.
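
Since Figure 1 is a screenshot and is not reproduced here, the commands and the relevant part of the output look roughly like this (a sketch; the exact table options in your output may differ):

insert into t values(null, 1, 1);
show create table t\G
-- the table options line of the output ends with something like:
--   ) ENGINE=InnoDB AUTO_INCREMENT=2 ...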

In fact, this output can easily lead to the misunderstanding that the auto-increment value is stored in the table structure definition. In reality, the table structure definition is stored in the .frm file, and the .frm file does not store the auto-increment value.

Different storage engines have different strategies for storing auto-increment values.

  • For the MyISAM engine, auto-increment values are stored in the data file.
  • For the InnoDB engine, auto-increment values are actually stored in memory. Starting from MySQL 8.0, InnoDB has the ability to “persist” auto-increment values, meaning that if a restart occurs, the auto-increment value of the table can be restored to its value before the restart. The specific situations are as follows:
    • In MySQL 5.7 and earlier versions, auto-increment values are stored only in memory and are not persisted. After a restart, the first time the table is opened, InnoDB finds the maximum auto-increment value max(id) and uses max(id) + 1 as the table's current auto-increment value. For example, if the largest id currently in the table is 10, then AUTO_INCREMENT is 11. If we delete the row with id=10, AUTO_INCREMENT stays 11; but if we immediately restart the instance, the table's AUTO_INCREMENT becomes 10 again. In other words, restarting MySQL may change the value of AUTO_INCREMENT for a table (a short sequence illustrating this is sketched after this list).
    • In MySQL 8.0, the change in auto-increment values is recorded in the redo log, and by relying on the redo log, the value can be restored after a restart.
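
For example, the pre-8.0 behavior can be reproduced with a sequence like the following (a sketch, assuming the largest id currently in table t is 10, so its AUTO_INCREMENT is 11):

delete from t where id = 10;
-- AUTO_INCREMENT is still 11 at this point
-- ... restart the MySQL instance ...
show create table t\G
-- 5.7 and earlier: the output now shows AUTO_INCREMENT=10; 8.0 and later: it still shows 11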

Now that we understand the storage strategies for auto-increment values in MySQL, let’s take a look at the mechanism for modifying the auto-increment values.

Mechanism for modifying auto-increment values #

In MySQL, if a field id is defined as AUTO_INCREMENT, the behavior of the auto-increment value when inserting a row of data is as follows:

  1. If the id field is specified as 0, null, or not specified when inserting data, then the current AUTO_INCREMENT value of this table is filled into the auto-increment field.
  2. If the id field is specified with a specific value when inserting data, the value specified in the statement is used directly.

Depending on the relationship between the value to be inserted and the current auto-increment value, the modification result of the auto-increment value will also vary. Assuming the value to be inserted is X and the current auto-increment value is Y:

  1. If X < Y, the auto-increment value for this table remains unchanged.
  2. If X >= Y, the current auto-increment value needs to be modified to a new value.

The algorithm for generating new auto-increment values is as follows: starting from auto_increment_offset and incrementing continuously by auto_increment_increment, until finding the first value greater than X, which is then used as the new auto-increment value.

Here, auto_increment_offset and auto_increment_increment are two system parameters used to indicate the initial value and step size of the auto-increment, respectively. Both default values are 1.
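
With these default settings, the rules above can be illustrated by a short sequence (a minimal sketch, assuming an empty table t):

insert into t values(null, 10, 10);  -- id not specified: the row gets id=1, AUTO_INCREMENT becomes 2
insert into t values(100, 20, 20);   -- explicit id: 100 is used directly; since 100 >= 2, AUTO_INCREMENT becomes 101
insert into t values(50, 30, 30);    -- explicit id 50 < 101: the row gets id=50, AUTO_INCREMENT stays 101
insert into t values(null, 40, 40);  -- id not specified: the row gets id=101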

Note: not all deployments use the default values. For example, in a dual-master (double-M) setup where both nodes accept writes, we may set auto_increment_increment=2 so that the auto-increment ids generated on one node are always odd and those on the other are always even, avoiding conflicts between primary keys generated by the two nodes.
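
As a sketch, such a setup is configured per node (shown here with SET GLOBAL for brevity; in practice these values are usually put in each node's configuration file, and only new sessions pick up changed global values):

-- on node 1: generate 1, 3, 5, ...
set global auto_increment_increment = 2;
set global auto_increment_offset = 1;

-- on node 2: generate 2, 4, 6, ...
set global auto_increment_increment = 2;
set global auto_increment_offset = 2;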

When both auto_increment_offset and auto_increment_increment are set to 1, the logic for generating new auto-increment values is simple:

  1. If the value to be inserted >= the current auto-increment value, the new auto-increment value is “the value to be inserted +1”.
  2. Otherwise, the auto-increment value remains unchanged.

This brings us back to the question raised at the beginning of the article: when both parameters are set to 1, why can't the auto-increment primary key id be guaranteed to be consecutive?

Timing of Auto-increment Value Modification #

To answer this question, we need to look at the timing of auto-increment value modification.

Suppose the table t already has a record (1, 1, 1). Now, if I execute an insert command:

insert into t values(null, 1, 1);

The execution process of this statement is as follows:

  1. The executor calls the InnoDB engine interface to write a row, and the value of this row is (0, 1, 1).
  2. InnoDB finds that the user has not specified a value for the auto-increment id and retrieves the current auto-increment value of table t, which is 2.
  3. The value of the incoming row is changed to (2, 1, 1).
  4. The auto-increment value of the table is changed to 3.
  5. The insert operation continues to execute, but because a record with c=1 already exists, a Duplicate key error occurs and the statement returns an error.

The corresponding execution process diagram is shown below:

Figure 2: Unique key conflict for insert(null, 1, 1)

As can be seen, the auto-increment value of this table is changed to 3 before the actual execution of the insert operation. When this statement is actually executed, because there is a unique key conflict, the row with id=2 is not successfully inserted, but the auto-increment value is not changed back either.

Therefore, after this, when inserting a new data row, the obtained auto-increment id will be 3. That is to say, a situation where the auto-increment primary key is not consecutive occurs.

As shown in Figure 3, the complete demonstration result is as follows:

Figure 3: Steps to reproduce a non-consecutive auto-increment primary key id
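
Since Figure 3 is a screenshot, here is a rough textual reproduction of the same steps (a sketch, assuming we start from an empty table t):

insert into t values(null, 1, 1);  -- succeeds: the row (1,1,1) is inserted, AUTO_INCREMENT becomes 2
insert into t values(null, 1, 1);  -- fails with a duplicate-key error on the unique index c; AUTO_INCREMENT is now 3
insert into t values(null, 2, 2);  -- succeeds: the row inserted is (3,2,2)
select * from t;                   -- returns (1,1,1) and (3,2,2); there is no row with id=2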

As can be seen, this sequence of operations reproduces a scene where the auto-increment primary key id is not consecutive (there is no row with id=2). Therefore, unique key conflicts are the first reason for the non-consecutive auto-increment primary key id.

Similarly, rollback of a transaction can also cause a similar phenomenon, which is the second reason.

The following sequence of statements can be used to construct non-consecutive auto-increment ids, and you can verify it yourself.

insert into t values(null, 1, 1);
begin;
insert into t values(null, 2, 2);
rollback;
insert into t values(null, 2, 2);
-- The inserted row is (3, 2, 2)

You might wonder why MySQL doesn’t change the auto-increment value of table t back when there is a unique key conflict or a rollback. If the current auto-increment value of table t is changed back from 3 to 2, then a row of data with id=2 can be generated when inserting new data.

In fact, MySQL is designed like this to improve performance. Next, I will analyze this design idea with you to see why the auto-increment value cannot be rolled back.

Assume there are two transactions executing concurrently. When each of them requests an auto-increment value, a lock must be taken so that the two transactions cannot be handed the same auto-increment id, and the ids are therefore allocated in order.

  1. Suppose transaction A applies for id=2 and transaction B applies for id=3. At this time, the auto-increment value of table t is 4, and then the execution continues.
  2. Transaction B is correctly committed, but transaction A encounters a unique key conflict.
  3. If we allow transaction A to roll back the auto-increment id, that is, change the current auto-increment value of table t back to 2, then the following situation will occur: there is already a row with id=3 in the table, but the current auto-increment id value is 2.
  4. Next, other transactions that continue to insert data will be allocated id=2 and then id=3. The insert that gets id=3 will then fail with a duplicate-key error.

To solve this key conflict, there are two methods:

  1. Before allocating an id, check whether the id already exists in the table, and skip it if it does. However, this is very costly: allocating an id is supposed to be a fast operation, and this approach would add a lookup on the primary key index to every allocation.
  2. Expand the scope of the auto-increment id lock so that the next auto-increment id can only be allocated after the previous transaction has finished and committed. The problem with this method is that the lock granularity becomes far too coarse, which greatly reduces the system's concurrency.

As you can see, both methods lead to performance problems. The culprit is the premise we assumed: that the auto-increment id is allowed to be rolled back.

Therefore, InnoDB abandoned this design, and the auto-increment id is not rolled back even if the statement execution fails. It is precisely because of this that it only guarantees that the auto-increment id is increasing, but not necessarily consecutive.

Optimization of Auto-Increment Locks #

We can see that the auto-increment id lock is not a transaction-scoped lock; it is released immediately after each allocation so that other transactions can request ids. In fact, this was not the case before MySQL 5.1.

Next, I will first introduce the history of the design of auto-increment locks, which will help you analyze a problem that follows.

In MySQL 5.0, the scope of the auto-increment lock was at the statement level. That is, if a statement requests an auto-increment lock for a table, the lock is released only after the statement execution finishes. Obviously, this design affects concurrency.

MySQL 5.1.22 introduced a new strategy along with a new parameter, innodb_autoinc_lock_mode, whose default value is 1.

  1. When the value of this parameter is set to 0, it means the previous strategy used in MySQL version 5.0 is adopted, where the lock is released only after the statement execution finishes.
  2. When the value of this parameter is set to 1:
    • For normal INSERT statements, the auto-increment lock is released immediately after the value is allocated.
    • For batch insert statements such as INSERT ... SELECT, the auto-increment lock is still released only after the statement execution finishes.
  3. When the value of this parameter is set to 2, all requests for auto-increment keys release the lock immediately after being requested.
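
To check which mode a given server is running, you can query the parameter directly (a quick sketch):

show variables like 'innodb_autoinc_lock_mode';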

You may have two questions: Why does INSERT ... SELECT use statement-level locks by default? Why isn’t the default value of this parameter set to 2?

The answer is that this design is still for the sake of data consistency.

Let’s take a look at this scenario together:

Figure 4: Auto-increment lock when batch inserting data

In this example, I inserted 4 rows of data into table t, then created a table t2 with the same structure, and then had two sessions insert data into table t2 at the same time.
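
Since Figure 4 is a screenshot, the two sessions roughly execute the following statements concurrently (a sketch consistent with the description below):

-- session A: an ordinary insert
insert into t2 values(null, 5, 5);

-- session B: a batch insert running at the same time as session A
insert into t2(c, d) select c, d from t;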

You can imagine that if session B requests the auto-increment value and immediately releases the auto-increment lock, the following situation may occur:

  • Session B first inserts two rows of records: (1,1,1) and (2,2,2).
  • Then, session A requests the auto-increment id and obtains id=3, and inserts (3,5,5).
  • Afterwards, session B continues to execute and inserts two rows of records: (4,3,3) and (5,4,4).

You may say that this doesn't matter: from the point of view of the data itself, nothing in session B's statement requires the ids it generates in table t2 to be consecutive.

Yes, it is correct from a logical point of view. However, if we now set binlog_format to statement, you can imagine how the binlog will record the updates.

Since the two sessions are executing the insert data command simultaneously, there are only two possibilities for the binlog entry for table t2: either session A’s entry is logged first or session B’s entry is logged first.

But in either case, when this binlog is replayed on a replica or used to restore a temporary instance, session B's statement will generate consecutive ids there. At that point, data inconsistency occurs between this instance and the source.

You can analyze the reasons for this problem.

In fact, this is because session B's insert statement generated non-consecutive ids on the source database, and replaying the statement from a statement-format binlog cannot reproduce that non-consecutive result.

And there are two ways to solve this problem:

  1. One approach is to ensure that a batch insert statement always generates consecutive id values on the source. Holding the auto-increment lock until the statement execution finishes is exactly what achieves this.
  2. Another approach is to faithfully record the insert data operations in the binlog and not rely on auto-increment primary keys for execution on the replica. In this case, innodb_autoinc_lock_mode is set to 2 and binlog_format is set to row.

Therefore, for performance in scenarios with concurrent inserts, especially when batch inserting data with INSERT ... SELECT, I recommend setting innodb_autoinc_lock_mode to 2 and binlog_format to row. This improves concurrency without sacrificing data consistency.
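
As a sketch, the corresponding settings could look like this in the server configuration file (innodb_autoinc_lock_mode is not a dynamic variable and must be set at server startup):

[mysqld]
innodb_autoinc_lock_mode = 2
binlog_format = ROW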

Please note that when I mention “batch inserting data”, it includes statement types such as INSERT ... SELECT, REPLACE ... SELECT, and LOAD DATA statements.

However, even when innodb_autoinc_lock_mode is set to 1, an ordinary INSERT statement that contains multiple VALUES lists does not hold the lock until the statement finishes. Because this type of statement can calculate exactly how many ids it needs, it requests them all at once and releases the lock as soon as the allocation is done.
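
For example, in the following ordinary multi-row insert the number of rows is known up front, so the three ids can be reserved in a single request and the lock released right away (a sketch, assuming the values for c do not conflict with existing rows):

insert into t values(null, 5, 5), (null, 6, 6), (null, 7, 7);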

In other words, the reason batch insert statements are treated differently is that we don't know in advance how many ids they will need. Since the number of auto-increment IDs to reserve is unknown, one straightforward idea is to request one id whenever a row needs one. But if an insert ... select statement inserts 100,000 rows, that would mean requesting an ID 100,000 times. This strategy is not only slow, it also hurts the performance of concurrent inserts during batch loading.

Therefore, for statements that insert data in batches, MySQL has a strategy for batch application of auto-increment IDs:

  1. During the execution of the statement, the first application for an auto-increment ID will allocate 1.
  2. After one is used up, the second application for an auto-increment ID in this statement will allocate 2.
  3. After two are used up, the third application for an auto-increment ID in this statement will allocate 4.
  4. And so on: within the same statement, each new allocation requests twice as many auto-increment IDs as the previous one.

Let’s take a look at the following sequence of statements as an example:

insert into t values(null, 1,1);
insert into t values(null, 2,2);
insert into t values(null, 3,3);
insert into t values(null, 4,4);
create table t2 like t;
insert into t2(c,d) select c,d from t;
insert into t2 values(null, 5,5);

The insert...select statement actually inserts 4 rows of data into table t2. However, these four rows of data are allocated auto-increment IDs in three batches. The first time, ID=1 is allocated, the second time, IDs 2 and 3 are allocated, and the third time, IDs 4 to 7 are allocated.

Since only 4 IDs are used in this statement, IDs 5 to 7 are wasted. After that, when executing insert into t2 values(null, 5,5), the actual data inserted is (8,5,5).

This is the third reason why the primary key auto-increment IDs are not continuous.

Summary #

Today, we started with the question of “Why do auto-increment primary keys have non-continuous values?” First, we discussed the storage of auto-increment values.

In the MyISAM engine, auto-increment values are stored in the data file. In InnoDB, they are kept in memory; only in MySQL 8.0 did auto-increment values become persistent, so that a table's auto-increment value stays the same across a restart.

Then, I shared with you the timing of when auto-increment values change during the execution of a statement, and analyzed why MySQL cannot recycle auto-increment IDs during transaction rollback.

The innodb_autoinc_lock_mode parameter introduced in MySQL 5.1.22 controls the locking scope for auto-increment value allocation. From the perspective of concurrency performance, I suggest setting it to 2, and also setting binlog_format to row. I’ve mentioned in previous articles multiple times that setting binlog_format to row is necessary. Today’s example gives another reason for this conclusion.

Finally, I’ll leave you with a question to think about.

When executing the statement insert into t2(c,d) select c,d from t;, if the isolation level is repeatable read and binlog_format=statement, this statement will lock all records and gaps in table t.

Why do you think this is necessary?

You can write your thoughts and analysis in the comments section, and I will discuss this question with you in the next article. Thank you for listening, and feel free to share this article with more friends to read together.

Answer to the Previous Question #

The previous question was: if the MySQL system you maintain contains memory tables, how do you prevent their data from being suddenly lost and, as a result, master-slave replication from stopping?

Assuming that you cannot temporarily modify the engine on the master, you can first change the engine of the memory tables on the slave to InnoDB. For each memory table, execute:

set sql_log_bin=off;
alter table tbl_name engine=innodb;

This will prevent data loss when the slave restarts.

If the master restarts, it writes a "delete from tbl_name" event into the binlog for each memory table; when that event reaches the slave, the corresponding table on the slave is emptied as well.

Therefore, the synchronization between the master and the slave will not be stopped.

If the master abnormally restarts and triggers HA, the previous slave that had its engine modified becomes the new master, and the original master becomes the new slave. On the new slave, change all memory tables (which will be empty at this time) to InnoDB tables.

Therefore, if you cannot directly change the engine on the master, you can set up an automated inspection tool that detects memory tables on the slave and converts their engine to InnoDB as soon as they appear.
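
As a sketch, such a tool could periodically run a query like the following on the slave to find memory tables (the schema filter is only an illustration):

select table_schema, table_name
from information_schema.tables
where engine = 'MEMORY'
  and table_schema not in ('information_schema', 'performance_schema', 'mysql', 'sys');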

At the same time, coordinate with the business development team to agree on table creation rules to avoid creating new memory tables.