06 Global Locks and Table Locks: Why Adding a Field to a Table Causes Many Blocks #

Today I want to talk to you about locks in MySQL. The purpose of database locks is to handle concurrency issues. As a shared resource among multiple users, a database needs to control the access rules for resources when concurrent accesses occur. And locks are important data structures used to implement these access rules.

Based on the scope of locking, locks in MySQL can be roughly divided into three categories: global locks, table-level locks, and row locks. In this article, I will share with you about global locks and table-level locks. The details about row locks will be discussed in the next article.

It should be noted that lock design is complex, and these two articles will not cover the specific implementation details of locks. The main focus will be on the phenomena encountered with locks and the underlying principles.

Global Lock #

As the name suggests, a global lock locks the entire database instance. MySQL provides a method to add a global read lock, which is the Flush tables with read lock (FTWRL) command. When you need to put the entire database into a read-only state, you can use this command, and then the following statements from other threads will be blocked: data manipulation statements (inserting, deleting, updating data), data definition statements (including creating tables, modifying table structures, etc.), and commit statements of updating transactions.
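A minimal sketch of taking and releasing the global read lock in a MySQL client session (the backup step in the middle is just a placeholder):

```sql
-- Take the global read lock. From this point, other threads' DML
-- (insert/update/delete), DDL (creating tables, altering structures),
-- and commits of updating transactions are blocked.
FLUSH TABLES WITH READ LOCK;

-- ... run the backup here ...

-- Release the lock. It is also released automatically if this
-- client disconnects.
UNLOCK TABLES;
```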

The typical use case for a global lock is to perform a logical backup of the entire database. In other words, select every table in the entire database and save the result as text.

In the past, one approach was to use FTWRL to ensure that no other thread would make updates to the database, and then perform a backup of the entire database. It is important to note that the entire database is in a completely read-only state during the backup process.

But making the entire database read-only sounds quite dangerous:

  • If you perform the backup on the master database, no updates can be executed during the backup period, and the business operations will basically come to a halt.
  • If you perform the backup on the replica database, the replica cannot execute the binlog received from the master during the backup period, which will result in replication lag.

It seems that adding a global lock is not ideal. But let’s think a bit more about why we need to lock during the backup process. Let’s see what problems would arise if we don’t add a lock.

Suppose you are maintaining the purchasing system for “Geek Time,” and the tables you care about are the user account balance table and the user course table.

Now let’s initiate a logical backup. During the backup process, let’s say a user makes a purchase, in which case the business logic would deduct their balance and add a course to the list of purchased courses.

If the time sequence is: first back up the account balance table (u_account), then the user makes the purchase, and finally back up the user course table (u_course), what would happen? You can take a look at this diagram:

img

Figure 1: Business and backup status diagram

As you can see from this backup result, the data state for user A is “the account balance is not deducted, but an additional course appears in the user course table.” If this backup is later used to restore the data, user A would find that they have made a profit. As a user you might be happy with that, but the company certainly would not. And what if the backup order were reversed, backing up the user course table first and the account balance table afterwards? Then user A’s balance would be deducted while the newly purchased course is missing from the backup.

In other words, without locking, the backup system does not obtain a logical time point, and the view is logically inconsistent.

Speaking of views, you must remember that when we talked about transaction isolation earlier, there is actually a method to obtain a consistent view, right?

Yes, it’s by starting a transaction under the repeatable-read isolation level.

Note: If you are not very clear about the concept of transaction isolation levels, you can review the relevant content in the third article, [“Transaction Isolation: Why Can’t I See Your Changes?”].

The official built-in logical backup tool is mysqldump. When mysqldump is used with the --single-transaction parameter, it starts a transaction before dumping the data to ensure a consistent view. And thanks to MVCC support, data can be updated normally during this process.
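For concreteness, a typical invocation might look like the following; the host, user, and database names are placeholders, not from the original text:

```shell
# --single-transaction starts a consistent-snapshot transaction
# (under the repeatable-read isolation level) before dumping, so
# InnoDB tables can keep accepting writes while the backup runs.
mysqldump --single-transaction -h backup_host -u backup_user -p \
  example_db > example_db.sql
```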

You may wonder why we still need FTWRL if we have this feature. Consistent reads are good, but the premise is that the engine supports this isolation level. For example, for engines that do not support transactions, such as MyISAM, if there are updates during the backup process, only the latest data can be accessed, which breaks the consistency of the backup. In this case, we need to use the FTWRL command.

Therefore, the single-transaction method is only applicable to databases where all tables use transactional engines. If some tables use engines that do not support transactions, the backup can only be done with the FTWRL method. This is often one of the reasons why DBAs ask business developers to use InnoDB instead of MyISAM.

You might ask, since we want the entire database to be read-only, why not use the set global readonly=true approach? Indeed, the readonly approach can also make the entire database read-only, but I still recommend using FTWRL, for two main reasons:

  • First, in some systems the value of readonly is used for other purposes, such as determining whether an instance is a master or a standby. Modifying this global variable therefore has a wider impact, and I do not recommend it.
  • Second, the exception-handling mechanisms differ. If the client disconnects abnormally after executing FTWRL, MySQL automatically releases the global lock, and the entire database returns to a normally updatable state. If the database was set to readonly, however, it stays readonly after a client exception, leaving the entire database unwritable for a long time, which is a higher risk.

Business updates are not limited to data manipulation language (DML) operations such as inserting, deleting, and updating data; they also include operations that modify the table structure, such as adding a field (DDL operations). With either method, once a database is locked globally, any attempt to add a field to any table inside it will be blocked.

However, even if it is not locked globally, adding a field may not always be smooth sailing because you will also encounter table-level locks that we are going to introduce next.

Table-level Locks #

MySQL has two types of table-level locks: table locks and metadata locks (MDL).

The syntax for table locks is lock tables … read/write. Similar to FTWRL, you can use unlock tables to release the lock actively, or it will be automatically released when the client is disconnected. It is worth noting that the lock tables syntax not only restricts the read and write of other threads but also limits the subsequent operations of the current thread.

For example, if thread A executes the statement lock tables t1 read, t2 write;, then statements from other threads that write to t1, or that read or write t2, will be blocked. At the same time, until thread A executes unlock tables, it may only read t1 and read or write t2; it is naturally not allowed to write to t1, or even to access any other table.
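The sequence described above, written out as a sketch:

```sql
-- Thread A:
LOCK TABLES t1 READ, t2 WRITE;
-- Other threads: writes to t1 block; reads and writes to t2 block.
-- Thread A itself: may read t1, and read or write t2; writing to t1
-- or touching any other table is rejected until it runs:
UNLOCK TABLES;
```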

Table locks were the most commonly used way to handle concurrency before more granular locks were available. For InnoDB, which supports row-level locks, the lock tables command is generally not used to control concurrency because locking the entire table has too big an impact.

Another type of table-level lock is MDL (metadata lock). MDL does not need to be explicitly used; it is automatically added when accessing a table. The purpose of MDL is to ensure the correctness of reads and writes. You can imagine that if a query is in the process of traversing the data in a table and another thread makes changes to the table structure during the execution, such as deleting a column, the result obtained by the query thread will not match the table structure, which is certainly not allowed. Therefore, MDL was introduced in MySQL 5.5 to add MDL read locks when performing CRUD operations on a table and MDL write locks when performing structural changes on the table.

  • Read locks are not mutually exclusive, so multiple threads can simultaneously perform CRUD operations on a table.
  • Read locks and write locks are mutually exclusive, as are write locks with each other, which guarantees the safety of structural changes. Therefore, if two threads want to add a field to the same table at the same time, one must wait for the other to finish before starting.

Although MDL locks are automatically added by the system, it is still a mechanism that cannot be ignored. For example, I often see people fall into this pitfall: adding a field to a small table causing the entire database to crash.

You must know that adding a field to a table, modifying a field, or adding an index requires scanning the entire table. When operating on a large table, you would definitely be extra careful to avoid affecting the online service. However, even for small tables, careless operations can still cause problems. Let’s take a look at the following operation sequence, assuming that table t is a small table.

Note: The experimental environment here is MySQL 5.6.

img

Figure 2: MDL lock blocking example

We can see that session A starts first and acquires an MDL read lock on table t. Since session B also needs an MDL read lock, it can be executed normally.

After that, session C is blocked because session A has not released the MDL read lock yet, and session C needs an MDL write lock, so it can only be blocked.

If only session C were blocked, it would not be a big deal, but all subsequent requests for MDL read locks on table t will also be blocked behind session C. As mentioned earlier, every CRUD operation on the table must first acquire an MDL read lock, which means the table is now completely unreadable and unwritable.

If there are frequent queries on a table and the client has a retry mechanism, meaning it will start a new session and make another request after a timeout, the threads in this database will soon be overwhelmed.

Now you should understand that MDL locks in a transaction are applied at the beginning of the statement but are not released immediately after the statement is finished. Instead, they are released after the entire transaction is committed.
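Putting the timeline above into a concrete transcript (the added column name f is just an illustration):

```sql
-- Session A: starts a transaction and queries t, acquiring an MDL
-- read lock on t that is held until the transaction commits.
BEGIN;
SELECT * FROM t LIMIT 1;

-- Session B: another MDL read lock; read locks are compatible, so
-- this returns normally.
SELECT * FROM t LIMIT 1;

-- Session C: needs an MDL write lock, so it blocks behind session A.
ALTER TABLE t ADD COLUMN f INT;

-- Session D: even a plain query now blocks, because its MDL
-- read-lock request queues behind C's pending write-lock request.
SELECT * FROM t LIMIT 1;
```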

Based on the above analysis, let’s discuss one problem: How to safely add a field to a small table?

First, we need to deal with long transactions. If a transaction is not committed, it will continue to hold the MDL lock. In the innodb_trx table of the information_schema database in MySQL, you can see the currently executing transactions. If there is a long transaction running on the table you want to modify with a DDL statement, you should consider pausing the DDL or killing the long transaction.
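To find candidate long transactions, a query along these lines works (innodb_trx also exposes more columns, such as the statement currently running):

```sql
-- Currently executing InnoDB transactions, oldest first; the
-- trx_mysql_thread_id value can be passed to KILL if needed.
SELECT trx_id, trx_started, trx_mysql_thread_id
FROM information_schema.innodb_trx
ORDER BY trx_started;
```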

But consider this scenario. If the table you want to modify is a hotspot table, although it has a small amount of data, there are frequent requests on it, and you have to add a field. What should you do in this case?

In this case, killing may not work, because new requests will come in immediately. The ideal mechanism is to set a waiting time in the alter table statement: if the MDL write lock can be acquired within that time, proceed; if not, give up rather than block subsequent business statements. The developer or DBA can then repeat the process with a retry command.

MariaDB has already merged this feature from AliSQL, so both of these open-source branches currently support the DDL NOWAIT/WAIT n syntax.

ALTER TABLE tbl_name NOWAIT add column ...
ALTER TABLE tbl_name WAIT N add column ...

Summary #

Today, I introduced you to MySQL’s global lock and table-level lock.

The global lock is mainly used in the logical backup process. For databases that are entirely InnoDB-based, I recommend using the --single-transaction parameter, which will be more friendly to the application.

Table-level locks are generally used when the database engine does not support row-level locks. If you find LOCK TABLES statements in your application, you need to investigate. The most likely situations are:

  • Either your system is still using engines like MyISAM that do not support transactions, in which case you need to plan an upgrade and change the engine.
  • Or your engine has been upgraded, but the code has not. I have seen cases like this, and the solution was to change LOCK TABLES and UNLOCK TABLES to BEGIN and COMMIT.

MDL (Metadata Lock) is released only after the transaction is committed. When making changes to table structure, you must be careful not to block online queries and updates.

Finally, I’ll leave you with a question. Backups are usually performed on backup servers. While using the --single-transaction method for logical backups, if a DDL operation is performed on a small table on the master server, such as adding a column to a table, what will be the phenomenon observed on the backup server?

You can write your thoughts and opinions in the comments section, and I will discuss this question with you at the end of the next article. Thank you for listening, and feel free to share this article with more friends to read together.

Note: This article does not cover physical backups, which will be discussed in a separate article.

Previous Question #

The previous question was about understanding composite primary key indexes and InnoDB clustered indexes.

I will directly paste @Dear Comrade Lao Yang’s reply, slightly modified, as follows:

Table records (primary key is (a, b)):

  a | b | c | d
  1 | 2 | 3 | d
  1 | 3 | 2 | d
  1 | 4 | 3 | d
  2 | 1 | 3 | d
  2 | 2 | 2 | d
  2 | 3 | 4 | d

The clustered index organized on primary key (a, b) is equivalent to order by a, b: rows are sorted by a first, then by b, and c is unordered.

The index ca is organized by sorting first on c, then on a, while recording only the remaining primary key column b (note: not (a, b), only b):

  c | a | b (partial primary key)
  2 | 1 | 3
  2 | 2 | 2
  3 | 1 | 2
  3 | 1 | 4
  3 | 2 | 1
  4 | 2 | 3

This is exactly the same as the data stored in index c.

The index cb is organized by sorting first on c, then on b, while recording only the remaining primary key column a (same as above):

  c | b | a (partial primary key)
  2 | 2 | 2
  2 | 3 | 1
  3 | 1 | 2
  3 | 2 | 1
  3 | 4 | 1
  4 | 3 | 2

So, the conclusion is that ca can be removed, and cb needs to be kept.