07 Database Indexes: An Index Is Not a Panacea #

Today, I want to share with you why a database index is not a panacea.

Almost all business projects involve data storage. Although various NoSQL and file systems are popular nowadays, relational databases like MySQL are still commonly used for storing important data due to their compliance with ACID, high reliability, and developer-friendliness. In relational databases, indexes are an important means to optimize query performance.

Therefore, when they run into query performance issues, I often see students blindly asking sysadmins or DBAs to create a large number of indexes on the relevant table fields. Obviously, this approach is wrong. Today, let’s take MySQL as an example to deepen our understanding of how indexes work and the misconceptions around them.

How does InnoDB store data? #

MySQL abstracts data storage and query operations into storage engines, each with different methods for storing and retrieving data. MySQL supports multiple storage engines, and the storage engine can be set at the table level. Because of its support for transactions, we most commonly use InnoDB. To facilitate understanding of the following content, let me briefly explain how InnoDB stores data.

Although data is stored on disk, it is processed in memory. To reduce the number of random disk reads, InnoDB stores data in units of pages rather than individual rows: the data is divided into pages, which are stored on disk, and each page is generally 16KB.
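You can confirm the page size on your own instance by checking the innodb_page_size variable (16384 bytes, i.e. 16KB, by default):

SHOW VARIABLES LIKE 'innodb_page_size';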

The data pages are organized into a doubly linked list, and the records within each data page are organized as a singly linked list in order of the primary key. Each data page contains a page directory, which facilitates searching for records based on the primary key. The structure of a data page is as follows:

[Figure: structure of an InnoDB data page]

The page directory divides the records into different groups using slots. Each group contains several records. In the diagram above, the numbers in the small squares at the beginning of each record represent the number of records in the current group. The smallest and largest slots point to two special pseudo records. With the help of slots, we can use binary search to quickly search for records in a page without having to traverse the entire linked list of records from the smallest record.

Let’s take an example of searching for a record with the primary key (PK) = 15:

First, by binary search, we find the middle slot: (0+6)/2=3. The record slot #3 points to has key 12 < 15, so we continue searching in the slots after slot #3.

Next, we binary-search between slot #3 and slot #6: (3+6)/2=4.5, which rounds down to 4. The record slot #4 points to has key 16 > 15, so the target record must belong to slot #4’s group.

Finally, starting from the record with key 12 pointed to by slot #3, we follow the singly linked list of records three steps forward and locate the record with key 15.

Once we understand how InnoDB stores data, we can continue to learn about the principles and pitfalls of MySQL indexing.

Clustered Index and Secondary Index #

When it comes to indexing, the page directory is the simplest form of index: by grouping the records in a page into slots, it reduces the time complexity of searching within that page. However, this reduction only works inside a single page. When the table data spans a huge number of data pages, we need to consider how to build appropriate indexes to quickly locate the page where a record resides.

To solve this problem, InnoDB introduced the B+ tree. As shown in the figure below, the B+ tree is an upside-down tree:

[Figure: B+ tree structure]

The features of the B+ tree include:

  • The lowest-level nodes, called leaf nodes, store the data.
  • The higher-level nodes, called non-leaf nodes, store only directory entries that serve as an index.
  • Non-leaf nodes are divided into levels to reduce the amount of searching needed at each level.
  • Nodes at the same level are sorted by index key and linked into a doubly linked list, which accelerates range queries.

Therefore, an InnoDB B+ tree can both store the actual data and accelerate searches for it; such a tree is called a clustered index. If we take the ellipsis in the squares below the leaf nodes in the figure to represent actual row data, then the figure depicts a clustered index. Since the data is physically stored only once, there can be only one clustered index containing the actual data.

InnoDB automatically uses the primary key (one field, or a combination of fields, that uniquely identifies a record) as the index key of the clustered index (if there is no primary key, it picks the first unique index whose columns are all NOT NULL). The numbers in the squares in the figure represent the index key values, which for a clustered index are generally the primary key.

Let’s take a look at how the B+ tree achieves fast primary key lookup. Say we want to find the record with PK=4. From the index entries in the root node, we determine that the data is in page #2, which is pointed to by the first record of the root node. Then, through the index entries in page #2, we find that the data is in page #5, an actual data page. Finally, a binary search over page #5’s page directory immediately locates the pointer to the record.

To achieve fast searches for non-primary key fields, secondary indexes, also known as non-clustered indexes or auxiliary indexes, are introduced. The structure of secondary indexes also utilizes the B+ tree, as shown in the figure below:

[Figure: B+ tree structure of a secondary index]

This time, the leaf nodes of the secondary index store the primary key instead of the actual data. After obtaining the primary key value, the data row is retrieved from the clustered index. This process is called a lookup.

For example, if there is an index created for the username field, the letters in the square on the index record represent the usernames and form a linked list in sequential order. If we want to search for data with the username “b”, after two rounds of positioning, we can determine that it is in data page #5 and find out that the primary keys are 7 and 6. With these two primary keys, we can continue to use the clustered index for two more lookups to obtain the complete data.

Considerations for the Cost of Creating Secondary Indexes #

The cost of creating secondary indexes primarily manifests itself in three aspects: maintenance cost, space cost, and lookup cost. Next, I will analyze them in detail with you.

Firstly, let’s talk about the maintenance cost. Creating N secondary indexes means creating N B+ trees. When new data is added, not only does the clustered index need to be modified, but all N secondary indexes must be updated as well.

Let’s test the cost of creating indexes. Suppose there is a “person” table with a primary key ID and three fields: name, score, and create_time:

CREATE TABLE `person` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `score` int(11) NOT NULL,
  `create_time` timestamp NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Executing the following stored procedure, which inserts 100,000 test records in a loop, took 140 seconds on my machine (all examples in this article were run on MySQL 5.7.26):

DELIMITER ;;
CREATE DEFINER=`root`@`%` PROCEDURE `insert_person`()
begin
    declare c_id integer default 1;
    while c_id<=100000 do
    -- each row gets a create_time one second earlier than the previous one
    insert into person values(c_id, concat('name',c_id), c_id+100, date_sub(NOW(), interval c_id second));
    set c_id=c_id+1;
    end while;
end;;
DELIMITER ;

CALL insert_person();

If we create two more secondary indexes, a composite index on name and score and a single-column index on create_time, the time to insert the same 100,000 records increases to 154 seconds:

KEY `name_score` (`name`,`score`) USING BTREE,
KEY `create_time` (`create_time`) USING BTREE
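If the table already exists, one way to add these two indexes (a sketch equivalent to declaring the KEY clauses above in the table definition) is:

ALTER TABLE `person`
  ADD KEY `name_score` (`name`,`score`) USING BTREE,
  ADD KEY `create_time` (`create_time`) USING BTREE;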

Here, I want to mention that the records in a page are stored in the order of the index values from small to large. When adding new records, data needs to be inserted into the page. When the current page is full, a new page needs to be created, and part of the data in the existing page needs to be moved to the new page. This is called page splitting. If a large number of records are deleted, making the page relatively empty, page merging may also be needed. Both page splitting and merging incur IO costs and may result in deadlocks during the operation.

You can refer to this document to further understand how to set a reasonable merge threshold to balance the page occupancy rate and the cost incurred by page splitting.
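For reference, MySQL lets you set the merge threshold per table or per index through a COMMENT clause (the default is 50, i.e. merging is considered when a page drops below 50% full). A minimal sketch, where name_score2 is a hypothetical index name used only to show the index-level syntax:

-- table-level: applies to all indexes of the table
ALTER TABLE `person` COMMENT='MERGE_THRESHOLD=40';
-- index-level: applies to one index only
CREATE INDEX `name_score2` ON person (`name`,`score`) COMMENT 'MERGE_THRESHOLD=40';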

Secondly, let’s talk about the space cost. Although secondary indexes do not store the full rows, they do store the values of the indexed columns (plus the primary key), so they take up extra space. For example, after creating the two indexes on the “person” table, use the following SQL to view the disk usage of the data and the indexes:

SELECT DATA_LENGTH, INDEX_LENGTH FROM information_schema.TABLES WHERE TABLE_NAME='person'

The result shows that the data itself occupies only 4.7MB, while the indexes occupy 8.4MB.
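DATA_LENGTH and INDEX_LENGTH are reported in bytes; if you prefer, a variant of the query above converts them to MB directly:

SELECT ROUND(DATA_LENGTH/1024/1024, 1) AS data_mb,
       ROUND(INDEX_LENGTH/1024/1024, 1) AS index_mb
FROM information_schema.TABLES
WHERE TABLE_NAME='person';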

Lastly, let’s talk about the lookup cost. Secondary indexes do not store the original data. After finding the primary key through the index, it is necessary to query the clustered index to obtain the data we need. For example, use the SELECT * statement to query users based on the “name” field and use the EXPLAIN statement to view the execution plan:

EXPLAIN SELECT * FROM person WHERE NAME='name1'

The execution plan is as follows:

+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | person | NULL       | ref  | name_score    | name | 302     | const| 1    | 100.00   | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+

The “key” column shows which index is actually used; its value here is “name_score”, meaning the “name_score” index was used.

The “type” field represents the way the table is accessed. Its value “ref” indicates an equality match on the secondary index, which meets our query condition.

If we modify the SQL by replacing * with NAME and SCORE, which are the two columns included in the “name_score” composite index:

EXPLAIN SELECT NAME,SCORE FROM person WHERE NAME='name1'

Let’s take a look at the execution plan:

+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+--------------------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | person | NULL       | ref  | name_score    | name | 302     | const| 1    | 100.00   | Using index              |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+--------------------------+

We can see that the “Extra” column now shows “Using index”, indicating that this query was satisfied directly from the secondary index, avoiding the overhead of a lookup.

The reason is simple. The composite index actually stores the values of multiple indexed columns. The records in the page are first sorted by the first field, and if they are the same, then sorted by the second field, as shown in the figure:

[Figure: structure of the composite index on name and score]

In the figure, the first and second blocks of each leaf-node record hold the data of the indexed columns, and the third block holds the record’s primary key. If all the columns a query needs are contained in the index itself, the index already “covers” the required data and no lookup is needed; this situation is called a covering index. In the last section, I will show how to view the cost of different queries and the cost difference between a covering index and an index lookup.

Finally, let me summarize some best practices regarding the cost of indexes.

First, indexes do not need to be created from the beginning. You can wait until the business scenario is clear or until the data volume exceeds 10,000 and queries become slow, then create indexes for the fields that need to be queried, sorted, or grouped. After creating indexes, you can use the EXPLAIN command to confirm whether the query can use indexes. I will explain this in more detail in the next subsection.

Second, try to index lightweight fields. For example, if an int field can be indexed, there is no need to index a varchar field instead. A long field can also be indexed partially by specifying a prefix length when creating the index, as in the sketch below. For searching long text, consider specialized search engines such as Elasticsearch.
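For instance, a prefix index on the first 8 characters of name (name_prefix is a hypothetical index name; the prefix length should be chosen based on the selectivity of your actual data):

ALTER TABLE `person` ADD KEY `name_prefix` (`name`(8));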

Third, try not to use SELECT * in SQL statements. Instead, select only the necessary fields, or even consider using composite indexes to include the fields we want to search. This can achieve index acceleration and avoid the overhead of a lookup operation.

Not all queries against indexed columns can use the index #

In the previous example, I created a composite index on “name+score”, which can utilize this composite index even when searching only by “name”. This brings up two questions:

Does creating an index always ensure its usage?

How do you choose between creating a composite index or multiple independent indexes?

Let’s first analyze some scenarios where an index fails to apply.

First, an index can only match a column prefix. For example, the LIKE statement below searches for users whose name ends with “name123”; it cannot use the index, and type=ALL in the execution plan indicates a full table scan:

EXPLAIN SELECT * FROM person WHERE NAME LIKE '%name123' LIMIT 100

[Figure: EXPLAIN output with type=ALL, indicating a full table scan]

If we instead place the percent sign at the end to do a prefix match, type=range indicates an index range scan was used, and the “key” value “name_score” confirms that the “name_score” index was used in practice:

EXPLAIN SELECT * FROM person WHERE NAME LIKE 'name123%' LIMIT 100

[Figure: EXPLAIN output with type=range and key=name_score]

The reason is simple: the rows in a B+ tree index are sorted by the index value, so comparisons can only start from the prefix. If you need suffix searches to use an index, and you will only ever search by suffix, a workaround is to store the data in reversed order and reverse it back when reading, as in the sketch below.
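A minimal sketch of this trick using a MySQL 5.7 generated column (name_rev is a hypothetical column; you could equally maintain the reversed value in application code):

ALTER TABLE `person`
  ADD COLUMN `name_rev` VARCHAR(255) AS (REVERSE(`name`)),
  ADD KEY `name_rev` (`name_rev`);

-- a suffix search on name becomes a prefix search on name_rev:
-- '321eman' is REVERSE('name123')
EXPLAIN SELECT * FROM person WHERE name_rev LIKE '321eman%';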

Secondly, conditions that apply a function to the indexed column cannot use the index. For example, a search condition using the LENGTH function cannot use the index either:

EXPLAIN SELECT * FROM person WHERE LENGTH(NAME)=7

[Figure: EXPLAIN output showing a full table scan]

The reason is the same: the index stores the original values of the indexed columns, not the values after a function has been applied. If you need index support for a function-based condition, you can only save a copy of the transformed values in a separate column and then index that computed column, as in the sketch below.
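In MySQL 5.7, one way to do this is with a generated column (name_len is a hypothetical column; note that the query must reference the column itself rather than the LENGTH() expression for the index to apply):

ALTER TABLE `person`
  ADD COLUMN `name_len` INT AS (LENGTH(`name`)),
  ADD KEY `name_len` (`name_len`);

EXPLAIN SELECT * FROM person WHERE name_len = 7;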

Thirdly, a composite index can only match a leftmost prefix of its columns. In other words, although we created a composite index on “name” and “score”, searching by “score” alone cannot use the index either:

EXPLAIN SELECT * FROM person WHERE SCORE>45678

[Figure: EXPLAIN output showing a full table scan]

The reason here is also straightforward. In the case of a composite index, the data is sorted based on the first column of the index, and only when the values of the first column are the same, will the sorting be based on the second column. In other words, if we want to use as many columns from the composite index as possible, the various columns in the search conditions must be consecutive columns from the leftmost side of the composite index. If we only search based on the second column, it cannot utilize the index. By adding the “name” column to the search condition, we can see that the “name_score” index was used:

EXPLAIN SELECT * FROM person WHERE SCORE>45678 AND NAME LIKE 'NAME45%'

[Figure: EXPLAIN output with key=name_score]

Note that because the query optimizer rearranges the conditions in the WHERE clause, the position of the “name” condition within the WHERE clause does not matter.

Now let’s go back to the two initial questions.

Does creating an index always ensure its usage? No. An index can be used only when the query matches how the indexed data is actually organized; I have shown three examples above where it cannot be used. In fact, even when an index could be used, MySQL may still choose not to use it. I will elaborate on this in the next section.

How do you choose between creating a composite index or multiple independent indexes? If your search conditions often involve multiple fields together, consider creating a composite index for those fields; a composite index over more fields also increases the chance of covering a query. If you usually query each field on its own, independent indexes may be the better choice, since every extra field in a composite index carries a cost even when it is never used. A small illustration follows.
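To make this concrete, a sketch against the person table (the comments indicate which WHERE clauses each index can serve under the leftmost-prefix rule):

-- name_score serves: WHERE name=...  and  WHERE name=... AND score=...
-- name_score does NOT serve: WHERE score=... on its own
ALTER TABLE `person` ADD KEY `score` (`score`);  -- worth adding only if score is queried alone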

Database determines whether to use an index based on cost #

From the previous examples, we can see that querying data can be done through a full table scan on the clustered index or by scanning a secondary index and then returning to the clustered index. At this point, you may wonder, how does MySQL determine which approach to take?

In fact, before querying data, MySQL first generates execution plans for possible approaches, and then determines which execution plan to use based on cost.

This cost includes IO cost and CPU cost:

  • IO cost: the cost of loading data from disk into memory. By default, the cost constant for reading one data page is 1.
  • CPU cost: the cost of operations such as checking whether records satisfy the conditions and sorting. By default, the cost of evaluating one record is 0.2.
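These defaults live in the cost-model tables introduced in MySQL 5.7, where they can be inspected (a NULL cost_value means the built-in default is in effect):

SELECT cost_name, cost_value FROM mysql.server_cost;  -- row_evaluate_cost defaults to 0.2
SELECT cost_name, cost_value FROM mysql.engine_cost;  -- io_block_read_cost defaults to 1.0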

Based on this, let’s analyze the cost of a full table scan.

A full table scan involves comparing records in the clustered index one by one with the given search conditions, and adding the records that meet the search conditions to the result set. To calculate the cost of a full table scan, we need two pieces of information:

  • the number of pages occupied by the clustered index, used to calculate the IO cost of reading the data;
  • the number of records in the table, used to calculate the CPU cost of checking them.

So, does MySQL calculate these statistics in real time? Actually, it doesn’t. MySQL maintains statistical information about tables, which can be viewed using the following command:

SHOW TABLE STATUS LIKE 'person'

The output is as follows:

[Figure: SHOW TABLE STATUS output for the person table]

As you can see:

The total number of rows is 100,086 (as we saw earlier, EXPLAIN also reported rows = 100,086 for the full table scans). You may ask why there are 86 more rows than the 100,000 we inserted into the person table. In fact, MySQL’s statistics are estimates, and the calculation method is quite involved, so I won’t go into detail. Nevertheless, the value is good enough to estimate the CPU cost, which is around 100,086 * 0.2 = 20,017.

The data length is 4,734,976 bytes. For InnoDB, this corresponds to the space occupied by the clustered index, which is equal to the number of pages in the clustered index multiplied by the size of each page. InnoDB has a page size of 16KB, so we can estimate the number of pages to be around 289, resulting in an IO cost of about 289.

Therefore, the total cost of a full table scan is around 20,306.
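As a quick sanity check of that number (a back-of-the-envelope calculation; MySQL’s actual formula adds small constant adjustments, so the traced cost can differ by a fraction):

SELECT 4734976 / 16384                 AS io_cost,    -- 289 pages * 1
       100086 * 0.2                    AS cpu_cost,   -- 20017.2
       4734976 / 16384 + 100086 * 0.2  AS total_cost; -- ≈ 20306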

Next, let’s analyze how MySQL chooses an execution plan based on cost, still using the “person” table. This time the query combines two conditions, name > 'name84059' AND create_time > '2020-01-24 05:00:00':

EXPLAIN SELECT * FROM person WHERE NAME > 'name84059' AND create_time > '2020-01-24 05:00:00'

The execution plan is a full table scan:

[Figure: EXPLAIN output showing a full table scan]

Just by changing the hour in the create_time condition from 5 to 6, it becomes an index scan, specifically on the create_time index instead of the name_score composite index:

[Figure: EXPLAIN output with key=create_time]

From this, we can draw two conclusions:

  • MySQL does not choose indexes based on the order of columns in the WHERE condition.
  • Even if a column has an index, and even if there are multiple possible index plans, MySQL may not use an index.

The reason is that MySQL does not make decisions on using indexes based on guesswork; it relies on cost evaluation. Although the table statistics are not completely accurate, they are sufficient for determining the strategy.

However, sometimes, due to inaccurate statistics or estimation errors, the actual cost may differ significantly from MySQL’s estimate, leading MySQL to choose the wrong index or fall back to a full table scan. In such cases, manual intervention with FORCE INDEX is required. For example, to force the use of the name_score index:

EXPLAIN SELECT * FROM person FORCE INDEX(name_score) WHERE NAME > 'name84059' AND create_time > '2020-01-24 05:00:00'

We have discussed that MySQL selects an execution plan based on cost, and through EXPLAIN we can see what execution plan the optimizer ultimately chooses. However, how MySQL determines the execution plan remains a black box. Is there any way to understand the cost of various execution plans and the basis for MySQL’s choices?

In MySQL 5.6 and later versions, we can use the optimizer trace feature to view the entire process of the optimizer generating the execution plan. With this feature, we can not only understand the optimizer’s selection process, but also the cost of each execution step, allowing us to further optimize the query based on this information.

As shown in the code below, after enabling optimizer_trace, we execute the SQL statement, query the information_schema.OPTIMIZER_TRACE table to view the trace, and finally disable the optimizer_trace feature:

SET optimizer_trace="enabled=on";
SELECT * FROM person WHERE NAME >'name84059' AND create_time>'2020-01-24 05:00:00';
SELECT * FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace="enabled=off";

For the SQL statement that performs a full table scan with the condition create_time > '2020-01-24 05:00:00', I extracted several important segments from the execution result of OPTIMIZER_TRACE for detailed analysis:

Using an index scan on name_score for the condition name84059 < name requires scanning 25362 rows at a cost of 30435, so this plan was not chosen. The 30435 here is the sum of the IO and CPU costs of scanning the secondary index plus the IO and CPU costs of looking up the rows in the clustered index; I won't break it down further:

{
  "index": "name_score",
  "ranges": [
    "name84059 < name"
  ],
  "rows": 25362,
  "cost": 30435,
  "chosen": false,
  "cause": "cost"
},


Using an index scan on create_time requires scanning 23758 rows at a cost of 28511; again, this plan was not chosen because of its cost:

{
  "index": "create_time",
  "ranges": [
    "0x5e2a79d0 < create_time"
  ],
  "rows": 23758,
  "cost": 28511,
  "chosen": false,
  "cause": "cost"
},


Finally, a full table scan was chosen as the execution plan. It can be seen that the cost of scanning 100086 rows with a full table scan is 20306, which is consistent with our previous calculation and obviously smaller than the costs of the other two plans, 28511 and 30435:

{
  "considered_execution_plans": [{
    "table": "person",
    "best_access_path": {
      "considered_access_paths": [{
        "rows_to_scan": 100086,
        "access_type": "scan",
        "resulting_rows": 100086,
        "cost": 20306,
        "chosen": true
      }]
    },
    "rows_for_plan": 100086,
    "cost_for_plan": 20306,
    "chosen": true
  }]
},


By changing the create_time condition in the SQL statement from 05:00 to 06:00, reanalyzing the OPTIMIZER_TRACE reveals that this time the chosen execution plan is to use the create_time index. Since we are querying data with a later time, the number of rows to scan with the create_time index decreased from 23758 to 16588. The cost of using this index, 19907, is smaller than the cost of a full table scan, 20306, and even smaller than the cost of using the name_score index, 30435:

{
  "index": "create_time",
  "ranges": [
    "0x5e2a87e0 < create_time"
  ],
  "rows": 16588,
  "cost": 19907,
  "chosen": true
}


For more information about the optimizer trace, you can refer to the MySQL documentation.

Key Review #

Today, I first analyzed the structure of MySQL InnoDB storage engine pages, clustered indexes, and secondary indexes, and then analyzed two misconceptions about indexes.

The first misconception is believing that more indexes are always better. Considering the costs of index maintenance, space usage, and table lookups during queries, indexes must be created on demand and kept as lightweight as possible. Once a multi-field composite index is in place, we should try to complete queries from the index itself (a covering index) to reduce the cost of table lookups.

The second misconception is assuming that an existing index will always be used. Indexes cannot serve suffix-matching queries, queries that skip the leftmost column of a composite index, or conditions involving function calculations. Moreover, even when a SQL statement is eligible to use an index, MySQL evaluates the cost of the candidate plans to decide whether to use an index at all and, if so, which one.

Therefore, when trying to optimize SQL performance with indexes, it is necessary to confirm, through execution plans or actual measurements, that the index really improves performance. Otherwise, adding indexes will not solve the performance problem and will only add overhead to the database's CRUD operations. If you have doubts about the execution plan provided by EXPLAIN, you can also use optimizer_trace to view the detailed plan costs for further analysis.

The code used today is stored on GitHub, and you can click on this link to view it.

Reflection and Discussion #

When introducing the cost of secondary indexes, we used EXPLAIN to observe two cases: a covering index and a lookup back to the clustered index for the row data. Can you use optimizer trace to analyze the cost difference between these two cases?

In addition to speeding up searches, indexes can also help with sorting. Can you prove this using EXPLAIN? And do you know in which cases an index cannot be used for sorting?

Do you have any other insights regarding database indexes? I am Zhuye, feel free to leave a comment in the comment section to share with me. You are also welcome to share this article with your friends or colleagues for further discussion.