16 How Does order by Work #

When developing an application, you will often encounter the need to display results sorted by a specified field. Taking the example of the citizen table we used earlier, suppose you want to query the names of all people in the city of “Hangzhou” and sort them by name, returning the names and ages of the top 1000 people.

Assuming the table is defined as follows:

CREATE TABLE `t` (
  `id` int(11) NOT NULL,
  `city` varchar(16) NOT NULL,
  `name` varchar(16) NOT NULL,
  `age` int(11) NOT NULL,
  `addr` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `city` (`city`)
) ENGINE=InnoDB;

Your SQL statement can be written as follows:

select city,name,age from t where city='杭州' order by name limit 1000;

This statement looks logical, but do you understand its execution process? Today, I will talk to you about how this statement is executed and what parameters can affect its behavior.

Full-Field Sorting #

We have introduced indexes earlier, so now you understand that to avoid full table scans, we need to index the city field.

After creating an index on the city field, let’s use the explain command to see the execution details of this statement.
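
For reference, the output in Figure 1 comes from simply prefixing the query with the explain keyword (the exact columns shown depend on your MySQL version):

explain select city,name,age from t where city='杭州' order by name limit 1000;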

Figure 1: Using the explain command to view the execution details of the statement

The “Using filesort” in the Extra field indicates that sorting is required. MySQL allocates a memory area called sort_buffer to each thread for sorting.

To illustrate the execution process of this SQL query statement, let’s first take a look at the index diagram for the city field.

Figure 2: Index diagram for the city field

From the diagram, you can see that the rows that satisfy the condition city=‘Hangzhou’ are the records from ID_X to ID_(X+N).

In general, the execution process of this statement is as follows:

  1. Initialize the sort_buffer and determine that the fields name, city, and age will be included.
  2. Find the first primary key id that satisfies the condition city=‘Hangzhou’ from the city index, which is ID_X in the diagram.
  3. Retrieve the entire row from the primary key id index, get the values of the name, city, and age fields, and store them in the sort_buffer.
  4. Get the next record’s primary key id from the city index.
  5. Repeat steps 3 and 4 until the value of city no longer satisfies the query condition, and the corresponding primary key id is ID_Y in the diagram.
  6. Perform a quick sort on the data in the sort_buffer based on the name field.
  7. Return the top 1000 rows according to the sorting result to the client.

Let’s temporarily call this sorting process “full-field sorting”. The schematic of the execution process is shown below; we will come back to this sorting method in the next article.

Figure 3: Full-field sorting

The “Sorting by name” action in the diagram may be done in memory or may require external sorting, depending on the amount of data to be sorted and the sort_buffer_size parameter.

sort_buffer_size is the size of the memory (sort_buffer) allocated by MySQL for sorting. If the amount of data to be sorted is less than sort_buffer_size, sorting will be performed in memory. However, if the amount of data for sorting is too large to fit in memory, temporary disk files will have to be used for sorting.
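
If you want to check how much sort memory your connection currently gets, or to enlarge it for an experiment, you can do something like the following (the 4MB value is purely illustrative):

/* Check the per-connection sort buffer size, in bytes */
SHOW VARIABLES LIKE 'sort_buffer_size';

/* Enlarge it for the current session only; 4194304 bytes = 4MB, an illustrative value */
SET SESSION sort_buffer_size = 4194304;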

You can use the following method to determine whether a sorting statement is using temporary files:

/* Enable the optimizer_trace, only valid for the current thread */
SET optimizer_trace='enabled=on';

/* Save the initial value of Innodb_rows_read in @a */
select VARIABLE_VALUE into @a from performance_schema.session_status where variable_name = 'Innodb_rows_read';

/* Execute the statement */
select city, name, age from t where city='杭州' order by name limit 1000;

/* View the OPTIMIZER_TRACE output */
SELECT * FROM `information_schema`.`OPTIMIZER_TRACE`\G

/* Save the current value of Innodb_rows_read in @b */
select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
/* Calculate the difference of Innodb_rows_read */
SELECT @b-@a;

This method works by checking the result in OPTIMIZER_TRACE: the `number_of_tmp_files` field tells you whether temporary files were used.

![img](../images/89baf99cdeefe90a22370e1d6f5e6495.png)

Figure 4 OPTIMIZER_TRACE partial result for full-field sorting

`number_of_tmp_files` represents the number of temporary files used during the sorting process. Why are 12 files needed here? When memory is not enough, external sorting is needed, which generally uses the merge sort algorithm. You can simply understand it as follows: **MySQL divides the data to be sorted into 12 parts, each part is sorted separately and stored in one of these temporary files, and then the 12 sorted files are merged into one large sorted file.**

If `sort_buffer_size` exceeds the size of the data to be sorted, `number_of_tmp_files` is 0, indicating that the sorting can be completed directly in memory.

Otherwise, it needs to be sorted in temporary files. The smaller the `sort_buffer_size`, the more parts the data needs to be divided into, and the larger the value of `number_of_tmp_files`.

Next, let me explain the meaning of the other two values in Figure 4.

There are 4000 records in our sample table that meet the condition `city='Hangzhou'`, so you can see `examined_rows=4000`, which means that the number of rows involved in the sorting process is 4000.

The meaning of `packed_additional_fields` in `sort_mode` is that the sorting process has done "compact" processing on strings. Even if the `name` field is defined as `varchar(16)`, the space allocation is still based on the actual length during the sorting process.

At the same time, the result of the last query `SELECT @b-@a` is 4000, indicating that only 4000 rows were scanned during the entire execution process.

It should be noted that in order to avoid interference with the conclusion, I set `internal_tmp_disk_storage_engine` to `MyISAM`. Otherwise, the result of `SELECT @b-@a` would be displayed as 4001.

This is because when querying the table `OPTIMIZER_TRACE`, temporary tables are needed, and the default value of `internal_tmp_disk_storage_engine` is InnoDB. If the InnoDB engine is used, when the data is taken out from the temporary table, the value of `Innodb_rows_read` will be incremented by 1.
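
If you want to reproduce this setup, you can check and change the variable as sketched below; note that it is a global setting in MySQL 5.7 and was removed in later 8.0 releases, so treat it as version-dependent:

/* Check which engine backs disk-based internal temporary tables */
SHOW VARIABLES LIKE 'internal_tmp_disk_storage_engine';

/* Switch it to MyISAM (global scope, requires sufficient privileges) */
SET GLOBAL internal_tmp_disk_storage_engine = 'MyISAM';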

Rowid Sorting #

In the algorithm process mentioned above, only the original table data is read once, and the remaining operations are performed in the `sort_buffer` and temporary files. But this algorithm has a problem: if the query needs to return many fields, then the number of fields that need to be stored in the `sort_buffer` will be too large, and the number of rows that can be simultaneously stored in memory will be very few. As a result, many temporary files need to be created, which leads to poor sorting performance.

So, **what will MySQL do if it thinks that the sorting row length is too large?**

Next, I will modify a parameter to make MySQL adopt another algorithm.
    
    SET max_length_for_sort_data = 16;

`max_length_for_sort_data` is a parameter in MySQL that specifically controls the length of row data used for sorting. It means that if the length of a single row exceeds this value, MySQL considers the row to be too large and switches to another algorithm.

The total definition length of the fields `city`, `name`, and `age` is 36 (16 + 16 + 4). I set `max_length_for_sort_data` to 16. Let's see what changes this makes to the execution process.

In the new algorithm, the only fields put into the `sort_buffer` are the columns to be sorted (i.e., the `name` field) and the primary key `id`.

However, at this time, because the result of the sorting no longer includes the values of the `city` and `age` fields, it cannot be directly returned. Therefore, the entire execution process becomes as follows:

  1. Initialize `sort_buffer` and determine to include two fields, namely `name` and `id`.
  2. Find the first primary key `id` that satisfies the condition `city='Hangzhou'` from the `city` index, which is the `ID_X` in the figure.
  3. Retrieve the entire row from the primary key `id` index, extract the `name` and `id` fields, and store them in the `sort_buffer`.
  4. Fetch the primary key `id` of the next record from the `city` index.
  5. Repeat steps 3 and 4 until the condition `city='Hangzhou'` is not satisfied, which is the `ID_Y` in the figure.
  6. Sort the data in the `sort_buffer` based on the `name` field.
  7. Traverse the sorted result, retrieve the first 1000 rows, and return the values of the `city`, `name`, and `age` fields back to the client based on the `id` value.

The schematic diagram of this execution process is shown below, and I call it rowid sorting.

![img](../images/dc92b67721171206a302eb679c83e86d.jpg)

Figure 5 Rowid Sorting

Compared with the flowchart of full-field sorting in Figure 3, you will find that rowid sorting makes one extra access to the primary key index of table `t`, which is step 7.

It needs to be clarified that the "result set" in the end is a logical concept. In reality, the MySQL server retrieves the `id` from the sorted `sort_buffer` one by one and retrieves the results of the `city`, `name`, and `age` fields from the original table. The server does not need to store the results in memory again, and they are returned directly to the client.

Based on this explanation process and the diagram, can you think about what the result will be when executing `SELECT @b-@a` at this point?

Now, let's see how the results differ.

First, the value of `examined_rows` in the figure is still 4000, indicating that the data used for sorting is 4000 rows. However, the value of `SELECT @b-@a` becomes 5000.

This is because, in addition to the sorting process itself, after the sort completes the values still need to be fetched from the original table by `id`. Since the statement is `LIMIT 1000`, an extra 1000 rows are read.

![img](../images/27f164804d1a4689718291be5d10f89b.png)

Figure 6 OPTIMIZER_TRACE section of rowid sorting output

From the result of OPTIMIZER_TRACE, you can also see that two other pieces of information have changed:

  * `sort_mode` becomes `<sort_key, rowid>`, indicating that only the `name` and `id` fields are involved in the sorting.
  * `number_of_tmp_files` becomes 10. Although the number of rows involved in sorting is still 4000, each row becomes smaller, so the total amount of data to be sorted becomes smaller, and the number of temporary files needed becomes correspondingly smaller.

Full-Field Sorting vs. Rowid Sorting #

Let’s analyze what we can learn from these two execution processes.

If MySQL is worried that the sort memory is too small and will hurt sorting efficiency, it uses the rowid sorting algorithm. This algorithm lets more rows be sorted at once, but it has to go back to the original table to fetch the data afterwards.

If MySQL decides that the memory is large enough, it will prefer full-field sorting, putting all the required columns into the sort_buffer so that the sorted result can be returned directly from memory, without going back to the original table to fetch data.

This reflects MySQL’s design philosophy: If the memory is sufficient, make the most use of it and try to reduce disk access as much as possible.

For InnoDB tables, rowid sorting requires going back to the table, which causes more disk reads, so it will not be the preferred option.

This conclusion may seem redundant, but you should remember it because we will use it in the next article.

Now that you understand that sorting in MySQL is a costly operation, you may wonder if all order by statements require sorting. If we can get the correct results without sorting, it will reduce the system’s consumption and the execution time of the query.

Actually, not all order by statements require sorting. From the analysis of the execution process above, we can see that the reason why MySQL needs to generate a temporary table and perform sorting operations on it is that the original data is unordered.

You can imagine that if the rows retrieved from the index “city” are naturally sorted in ascending order according to “name”, then there is no need for sorting, right?

Indeed, that’s the case.

So, we can create a composite index on “city” and “name” on this citizens table. The SQL statement for that is:

alter table t add index city_user(city, name);

As a comparison to the “city” index, let’s take a look at the diagram of this new index.

Figure 7: Diagram of the composite index “city_user” on “city” and “name”

In this index, we can still locate the first record that satisfies the condition “city=‘Hangzhou’” using a tree search method. Moreover, during the process of traversing the “next record” in order, as long as the value of “city” is “Hangzhou”, the value of “name” will always be in order.

The entire query process becomes:

  1. Find the first row that satisfies the condition “city=‘Hangzhou’” in the index (city, name).
  2. Retrieve the entire row from the primary key index using the retrieved primary key id. Take the values of name, city, and age as part of the result set and directly return them.
  3. Retrieve the primary key id of the next record from the index (city, name).
  4. Repeat steps 2 and 3 until the 1000th record is found or the loop ends when the condition city=‘Hangzhou’ is not satisfied.

Figure 8: Execution process of the query after introducing the composite index (city, name)

As you can see, this query process does not require a temporary table or sorting. Moreover, since the composite index (city, name) itself is ordered, it doesn’t need to read all 4000 rows. Only the first 1000 records that satisfy the condition need to be examined. In other words, in our example, it only needs to scan 1000 times.
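
You can confirm this yourself by re-running explain after creating the city_user index: the Extra column should no longer contain “Using filesort”, meaning no sort step is needed.

/* With the (city,name) index in place, Extra should no longer show "Using filesort" */
explain select city,name,age from t where city='杭州' order by name limit 1000;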

Since we’ve come this far, let’s further discuss if we can simplify the execution process of this query. Do you remember in the 5th article [“In-depth Understanding of Indexes (Part 2)”], I introduced the concept of a covering index?

Here, let’s review it briefly. A covering index means that the index contains enough information to satisfy the query and doesn’t require retrieving data from the primary key index.

According to the concept of a covering index, we can optimize the execution process of this query.

For this query, we can create a composite index on “city”, “name”, and “age”. The SQL statement for that is:

alter table t add index city_user_age(city, name, age);

Now, for rows with the same value in the “city” field, they are still sorted in ascending order by the “name” field. Thus, the query no longer requires sorting. The execution process of the entire query becomes:

  1. Find the first record that satisfies the condition “city=‘Hangzhou’” in the index (city, name, age), and retrieve the values of city, name, and age as part of the result set and directly return them.
  2. Retrieve the next record from the index (city, name, age), and retrieve the values of city, name, and age as part of the result set and directly return them.
  3. Repeat step 2 until the 1000th record is found or the loop ends when the condition city=‘Hangzhou’ is not satisfied.

Figure 10: Execution process of the query after introducing the composite index (city, name, age)

Let’s take a look at the explain results.

Figure 11: Execution plan of the query after introducing the composite index (city, name, age)

It can be seen that the “Extra” field now includes “Using index”, indicating the use of a covering index, which significantly improves performance.

Of course, this does not mean that you need to create composite indexes for every query that can benefit from covering indexes, as indexes still have maintenance costs. This is a decision that needs to be balanced.
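
For instance, if you decided to keep the wider (city, name, age) index for this query, the narrower city_user(city, name) index created earlier would become redundant, since (city, name) is a left prefix of (city, name, age); dropping it is one way to cut the maintenance cost:

/* Illustrative only: (city,name,age) can serve any query that used city_user(city,name) */
alter table t drop index city_user;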

Summary #

In today’s article, I introduced several algorithms for the order by statement in MySQL.

When developing a system, you will inevitably use the order by statement. You need to understand how the sorting logic is implemented in each statement and analyze the resource consumption of each statement in the worst-case scenario in order to avoid making low-level mistakes.

Finally, I have a question for you.

Assuming you already have a composite index city_name(city, name) in your table, and you want to retrieve the names of all citizens in Hangzhou and Suzhou cities, sorted by name, and display the first 100 records. If the SQL query statement is written as follows:

mysql> select * from t where city in ('Hangzhou', 'Suzhou') order by name limit 100;

Will this statement perform a sorting process? Why?

If you were to develop the business-side code and needed to implement a solution that does not require sorting on the database side, how would you do it?

Furthermore, if there is a paging requirement and you need to display the 101st page, in other words, the statement needs to be changed to “limit 10000,100”, what would be your implementation method?

You can write your thoughts and opinions in the comments section, and I will discuss this question with you at the end of the next article. Thank you for listening, and feel free to share this article with more friends to read together.

Time for Previous Question #

The previous question was: When MySQL updates a row, but the value to be modified is the same as the original value, does MySQL really execute the modification or does it return directly when it sees that the values are the same?

This was the first time that all three options were selected by students for our after-class question, so I need to provide a detailed explanation.

The first option is that MySQL reads the data, sees that the value is the same as the original, and does not update it, returning directly and ending the execution. This can be confirmed by performing a lock experiment.

Assuming the current value in table t is (1, 2).

Figure 12 Lock validation method
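
Since the screenshot is not reproduced here, the sketch below is a reconstruction of this kind of lock experiment, assuming a simple two-column table t(id int primary key, a int), not the citizen table above, which currently holds the row (1,2):

/* session A: update the row to the value it already has, and keep the transaction open */
begin;
update t set a=2 where id=1;

/* session B, in another connection: this statement blocks until session A commits,
   which shows that session A's "same value" update still took the row lock inside InnoDB */
update t set a=2 where id=1;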

Session B's update statement is blocked. Taking a row lock is something only the InnoDB engine can do, so session A's statement must have gone into the engine rather than returning directly at the server layer; option 1 can therefore be ruled out.

The second option is that MySQL calls the interface provided by the InnoDB engine, but the engine realizes that the value is the same as the original, so it does not update it and returns directly. Is this possible? This can be confirmed by performing a visibility experiment.

Assuming the current value in the table is (1,2).

Figure 13 Visibility validation method
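
Again assuming the simple t(id, a) table holding (1,2), with session B running under autocommit and binlog_format=statement (as noted further down), a reconstruction of this visibility experiment might look like this:

/* session A */
begin;
select * from t where id=1;   /* returns (1,2) and creates the consistent-read view */

/* session B, in another connection, commits immediately under autocommit */
update t set a=3 where id=1;

/* session A */
update t set a=3 where id=1;  /* the "same" value session B has already written */
select * from t where id=1;   /* returns (1,3): session A sees its own new row version */
commit;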

The second select statement in session A is a consistent read (snapshot read), which cannot see the update made by session B.

Yet it returns (1,3), which means it has seen a new version of the row, and that version can only have been generated by session A's own update statement. (If you have doubts about this logic, you can review the relevant content in the 8th article [“Are transactions isolated or not?”])

Therefore, the answer to our previous question should be option 3: InnoDB faithfully executes the operation “modify this value to (1,2)”, locking if necessary and updating if necessary.

Then you might ask, why doesn’t MySQL check whether the value is the same before updating? If it did, it wouldn’t waste InnoDB’s operations and perform an unnecessary update.

In fact, MySQL does perform this kind of check. It is just that in this statement, the only value it reads is the one used to locate the row (id=1), while the value to be written is (a=3); from these two pieces of information alone, it cannot tell that “no modification is needed”.

As a verification, you can take a look at the following example.

Figure 14 Visibility validation method - Comparison
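
One way to construct such a comparison (a hypothetical reconstruction; the original figure may use different statements) is to make the WHERE clause of session A's update also reference column a, which forces MySQL to read its current value and lets it detect that no modification is needed:

/* session A */
begin;
select * from t where id=1;            /* returns (1,2) */

/* session B, in another connection */
update t set a=3 where id=1;

/* session A: because the WHERE clause reads column a, MySQL can see the value is
   already 3, so it skips the engine update and no new row version is created */
update t set a=3 where id=1 and a=3;
select * from t where id=1;            /* returns (1,2): still the old snapshot */
commit;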

Additional Explanation:

The verification results above were performed under the binlog_format=statement format.

@dideren added a case: if binlog_format=row and binlog_row_image=FULL, MySQL needs to record all fields in the binlog, so it reads out all the columns of the row when executing the update.

Following the rule mentioned earlier, “since the data has been read, the check will be performed”, in this case the result of select * from t where id=1 would be “returning (1,2)”.

Similarly, if binlog_row_image=NOBLOB, it will read all fields except blobs. In our example, the result would still be “returning (1,2)”.

The corresponding code is shown in Figure 15. This behavior was introduced in MySQL 5.6; I had not come across it before, so I want to point it out specifically.

Figure 15 Reading fields logic with binlog_row_image=FULL

Similarly, @mahonebags mentioned the issue with the timestamp field. The conclusion is that if there is a timestamp field in the table and it is set to automatically update, when updating “other fields”, MySQL will read all the relevant fields. This way, it will realize that there is no need for modification.