39 Q& a Session My SQL Inno Db Knowledge Points Lecture

39 Q&A session MySQL InnoDB knowledge points lecture #

Hello, I am Liu Chao.

Module 6, which covers database optimization, officially ended this week. Today, let’s go over the knowledge points of InnoDB in MySQL together. InnoDB is one of the most commonly used storage engines, and being familiar with its implementation and operation principles will help us create and maintain database tables more effectively.

InnoDB Architecture #

InnoDB primarily consists of the memory pool, background threads, and storage files. The memory pool is composed of multiple memory blocks, including cache for disk data and redo log buffer. The background threads include the Master Thread, IO Thread, and Purge Thread. The storage structure files implemented by the InnoDB storage engine generally include table structure files (.frm), shared tablespace files (ibdata1), exclusive tablespace files (ibd), and log files (redo files, etc.).

1. Memory Pool #

As we know, it can cause performance bottlenecks if the client directly reads data from the disk. The purpose of the buffer pool is to improve the overall read and write performance of the database.

When the client reads data, if it exists in the buffer pool, the client will directly read the data from the buffer pool; otherwise, it will read from the disk. For modified data in the database, it first modifies the data in the buffer pool and then flushes it to disk through the Master Thread.

In theory, the larger the buffer pool memory, the better. We have discussed in detail the configuration and optimization of the buffer pool size in Lecture 38.

The buffer pool not only caches index pages and data pages but also includes undo pages, insertion cache, adaptive hash index, and InnoDB lock information, among others.

InnoDB allows multiple buffer pool instances to reduce the competition for database internal resources and enhance the concurrent processing capability of the database. Lecture 38 also discusses the configuration and optimization of buffer pool instances.

The InnoDB storage engine first puts the redo log information into the buffer area and then flushes it to the redo log file.

2. Background Threads #

The Master Thread is mainly responsible for asynchronously flushing the data in the buffer pool to the disk, as well as insertion cache and undo page recycling. The IO Thread is responsible for reading and writing IO. The Purge Thread is mainly used to recycle undo logs of committed transactions. The Pager Cleaner Thread is a newly introduced thread that assists the Master Thread in flushing dirty pages to disk. It can reduce the workload of the Master Thread and minimize blocking.

3. Storage Files #

For each table created in MySQL, a .frm file is generated to store the metadata information of each table, mainly including table structure definitions.

In InnoDB, data is stored in tablespaces. By default, it uses shared tablespace for storage, which is the shared tablespace file (ibdata1). If the innodb_file_per_table parameter is set to 1, data and index information will be stored separately in an exclusive tablespace, creating an exclusive tablespace file (ibd). If you do not have a thorough understanding of shared and exclusive tablespaces, I will explain it in more detail later.

The log files mainly include redo log files, which record the redo log generated by transactions to ensure transaction consistency.

InnoDB Logical Storage Structure #

The InnoDB logical storage structure consists of tablespace, segment, extent, page, and row.

1. Tablespace #

InnoDB provides two ways to store data in a tablespace: shared tablespace and dedicated tablespace. By default, InnoDB stores all its table data in a shared tablespace called “ibdata1”.

You can enable the dedicated tablespace mode by setting the innodb_file_per_table parameter to 1. In this mode, each table has its own dedicated tablespace physical file, and all the data and indexes are stored in that file, making it easier to backup and restore data.

2. Segment #

A tablespace consists of several segments, including data segments, index segments, and rollback segments. As we know, InnoDB uses B+ trees for data storage.

The index segments refer to the non-leaf nodes of the B+ tree, while the data segments refer to the leaf nodes of the B+ tree. The rollback segments refer to the rollback data, which we have discussed when talking about transaction isolation and how MVCC utilizes rollback segments to implement multi-version data querying.

3. Extent / Page #

An extent is the unit structure of a tablespace, and each extent has a size of 1MB. A page is the smallest unit that makes up an extent. It is also the smallest unit of disk management in the InnoDB storage engine, with a default page size of 16KB. To ensure page contiguity, the InnoDB storage engine requests 4-5 extents from the disk each time.

4. Row #

The InnoDB storage engine is row-oriented, which means that data is stored row by row. Each page has a fixed definition of row records, allowing a maximum of 16KB/2-200 rows, which is 7992 row records.

Working Principle of InnoDB Redo Log for Transactions #

InnoDB is a transactional storage engine, and its transaction implementation is based on transaction logs, namely redo log and undo log. Redo log is responsible for write operations to achieve transaction durability, while undo log ensures transaction consistency through rollback operations.

Redo log consists of a log buffer in memory (redo log buffer) and redo log files stored on disk. The log buffer is volatile and prone to loss, while the redo log files are persisted on disk to prevent loss.

InnoDB uses the Write Ahead Log (WAL) strategy for update operations, which means that logs are written before data is written to disk. When a record is updated, InnoDB first writes the record to the redo log buffer and updates the in-memory data. The parameter “innodb_flush_log_at_trx_commit” can be used to customize how the logs in the redo log buffer are flushed to the redo log files upon commit.

It is worth noting that the size of InnoDB’s redo log is fixed. Multiple log files form a circular chain, and when writing reaches the end, it loops back to the beginning to write logs. The parameters “innodb_log_files_in_group” and “innodb_log_file_size” can be used to configure the number of log files and the size of each log file.

Data in the Buffer Pool that has been updated but not yet flushed to disk is called dirty pages. Eventually, the data in dirty pages will be flushed to disk, overwriting the existing data. This process may or may not be related to the redo log.

Dirty pages are only flushed to disk when the redo log is full. Additionally, there are other situations that can trigger the flushing of dirty pages:

When system memory is insufficient, a portion of the data pages needs to be evicted. If the evicted pages are dirty, they must be synchronized to disk first.
During idle time recognized by MySQL. This situation does not have any performance issues.
Before a normal shutdown of MySQL, all dirty pages will be flushed to disk. This situation does not have any performance issues either.

In a production environment, if we enable slow SQL monitoring, you may occasionally encounter SQL statements with slightly longer execution time. This is because flushing dirty pages to disk may incur performance overhead and cause database operations to fluctuate.

LRU Cache Policy #

In the previous section, we learned about the specific implementation principles of the update and insert operations in InnoDB. Now let’s take a look at the implementation and optimization of read operations.

InnoDB storage engine is based on collection index to store data, which means that besides the indexed columns and primary keys stored in B+ trees, other column data is also stored in the leaf nodes of B+ trees. Both index pages and data pages are cached in the buffer pool. When querying data, if the data is already present in the buffer pool, InnoDB does not need to read the page from disk every time, thus improving the query performance of the database.

Although the buffer pool is a large memory area, it cannot store all the data due to the variety of data types and the large amount of data being stored. Therefore, the buffer pool needs to cache the most recently and frequently accessed data using the LRU (Least Recently Used) algorithm, while evicting the less frequently accessed data.

InnoDB has made some optimizations to the LRU algorithm. The traditional LRU algorithm usually places the most recently accessed data at the front of the LRU list, but InnoDB places the data in a midpoint position, often at 5/8 of the list length.

This strategy is mainly to avoid the sudden eviction of hot data by less frequently accessed operations, so that when the hot data is accessed again, it needs to be retrieved from disk, thereby affecting the query performance of the database.

If we have a lot of hot data, we can adjust the midpoint value to increase the amount of stored hot data, thus reducing the eviction rate of hot data.

Summary #

With the above introduction on the implementation and operation principles of InnoDB, we have come to the end. Looking back at Module 6, in the first three lectures, I mainly discussed the performance optimization of database operations, including the optimization of SQL statements, transactions, and indexes. Then, I talked about the optimization of database tables, including table design and the implementation of table partitioning and database sharding. Lastly, I introduced some tuning of database parameters.

In summary, as software engineers, we should grasp several major knowledge points regarding databases and then delve into the details of their internal implementation. Only in this way can we avoid frequently writing SQL statements that have performance issues and develop the ability to optimize database performance.