07 Pooling Techniques - How to Reduce the Performance Loss of Frequently Creating Database Connections #

In the previous lessons, I provided you with a macro perspective on the basics of designing high-concurrency systems. You already know that the purpose of system design is to achieve better performance, higher availability, and stronger scalability.

Starting from this lesson, we formally enter the evolution section. One by one, I will explain the methods used to achieve these goals, each addressing a specific problem encountered in the design of high-concurrency systems. For example, in Lesson 15 I will cover the Bloom filter, a component used to deal with heavy cache penetration.

Of course, simply explaining theories and solutions can be boring, so I will use a fictitious system as a through-line for the entire course, demonstrating the problems it encounters at each stage of growth, the solutions required to address them, and the technical points involved. Through this style of narration, I hope to introduce problems via case studies so that you understand the thought process behind facing different problems. I also hope you will think along during this process and apply what you learn to real projects.

Now, let us proceed with the course.

Imagine this scenario: one day, the CEO of a company calls you into a meeting room and tells you that the company has identified a new business opportunity. They hope you can lead a team and rapidly develop an e-commerce system targeting a specific vertical field.

Due to the shortage of manpower and time constraints, in order to complete the task, you immediately choose the simplest architecture: one web server for front-end business code and one database server for storing business data.

[Figure: the simplest architecture, one web server plus one database server]

This architecture is the most familiar and simplest prototype for every one of us. Many systems start off like this, but as the complexity of the business increases, the architecture becomes more layered and appears more complex.

Returning to our vertical e-commerce system, after the system is launched, although the user count is small, it runs smoothly and you feel a sense of accomplishment. However, the CEO feels that the user count is too low, so they urgently mobilize the operations team to promote the system across the entire network.

This promotion quickly brings in a large influx of traffic, but at this point, the system’s access speed starts to slow down.

After analyzing the program’s logs, you discover that the slowness lies in the system’s interaction with the database. The calling pattern is: obtain a database connection, query data over that connection, then close the connection to release its resources. Under this pattern, every SQL execution requires establishing a brand-new connection, so you suspect that the slow access is caused by the time spent repeatedly creating database connections.

But why does frequent connection creation result in slow response times? Let’s look at an actual test.

I used the `tcpdump -i bond0 -nn -tttt port 4490` command to capture the network packets of a live MySQL connection for analysis. From the captured packets, the MySQL connection process can be divided into two parts:

The first part consists of the first three packets: a “SYN” packet sent by the client to the server, a combined “SYN+ACK” packet sent by the server back to the client, and an “ACK” packet sent by the client to the server. Those familiar with the TCP protocol will recognize this as the three-way handshake.

The second part is the MySQL server verifying the client’s password. In this process, the first packet is a server message requesting authentication from the client, the second and third packets carry the encrypted password sent by the client to the server, and the last two packets are server messages back to the client indicating successful authentication. From the capture, you can see that the entire connection process took approximately 4ms (969012 − 964904, in microseconds from the packet timestamps).

[Figure: tcpdump capture of the MySQL connection process, showing the TCP handshake and authentication packets]

So how long does a single SQL execution take? We measured SQL execution times over a period and found the average to be approximately 1ms. Compared to SQL execution, then, the MySQL connection process is relatively expensive. This has little impact when the request volume is low, since both connection setup and SQL execution are on the millisecond scale. But when the request volume increases, if each query establishes a fresh connection to execute a single SQL statement, one connection can serve only one query per 5ms (4ms setup plus 1ms execution), i.e. about 200 database queries per second, with connection establishment accounting for 4/5 of that time.

So, what should you do in this situation?

After a Google search, you find that the solution is quite simple: pre-establish database connections using a connection pool so that frequent connection creation is not necessary when using them. After making this adjustment, you find that you can execute 1,000 database queries in 1 second, greatly improving the querying performance.

Using a Connection Pool to Pre-establish Database Connections #

Although you have solved the problem for now, you still want to thoroughly understand the principles behind the solution. So you decide to do some studying.

In fact, during the development process, we often use various connection pools, such as database connection pools, HTTP connection pools, Redis connection pools, and so on. The management of connection pools is the core of connection pool design. Let’s take the database connection pool as an example to explain the key points of connection pool management.

There are two most important configurations for the database connection pool: minimum connection count and maximum connection count, which control the process of obtaining connections from the connection pool:

  • If the current connection count is less than the minimum connection count, then create a new connection to handle database requests.
  • If there are idle connections in the connection pool, reuse the idle connections.
  • If there are no idle connections in the pool and the current connection count is less than the maximum connection count, create a new connection to handle the request.
  • If the current connection count is already greater than or equal to the maximum connection count, wait up to a configured time (checkoutTimeout in the C3P0 connection pool) for an old connection to become available.
  • If the wait exceeds this configured time, throw an error to the user.
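The acquisition flow above can be sketched in code. The following is a simplified, single-file illustration of that logic (my own sketch, not C3P0's or any real pool's implementation; the generic type `T` stands in for a database connection):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Simplified sketch of the acquisition flow above -- not a real pool implementation.
public class SimplePool<T> {
    private final int minSize;
    private final int maxSize;
    private final Queue<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;   // creates a new "connection"
    private int total = 0;               // connections created so far

    public SimplePool(int minSize, int maxSize, Supplier<T> factory) {
        this.minSize = minSize;
        this.maxSize = maxSize;
        this.factory = factory;
    }

    public synchronized T acquire(long timeoutMillis)
            throws InterruptedException, TimeoutException {
        // 1. Below the minimum connection count: create a new connection.
        if (total < minSize) { total++; return factory.get(); }
        // 2. Idle connections exist: reuse one.
        if (!idle.isEmpty()) return idle.poll();
        // 3. Below the maximum connection count: create a new connection.
        if (total < maxSize) { total++; return factory.get(); }
        // 4. At the maximum: wait up to the checkout timeout for a release...
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (idle.isEmpty()) {
            long remaining = deadline - System.currentTimeMillis();
            // 5. ...and report an error to the caller if the wait times out.
            if (remaining <= 0) throw new TimeoutException("checkout timed out");
            wait(remaining);
        }
        return idle.poll();
    }

    public synchronized void release(T conn) {
        idle.offer(conn);
        notifyAll();   // wake a waiter blocked in acquire()
    }
}
```

A real pool also handles connection validation, shrinking back to the minimum, and so on; this sketch only covers the five acquisition steps.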

You don’t need to memorize this process, as it is very simple. You can take a moment to think about how you would design it if you were the connection pool designer. This design mindset is often used in our future architecture designs.

To help you understand and remember this process, let me give you an example.

Imagine you operate a small massage chair store in an airport. Your store has a total of 10 massage chairs (analogous to the maximum connection count). To save costs (electricity for the massage chairs), you usually keep 4 massage chairs running (minimum connection count), and the other 6 are turned off.

When a customer comes, if there are empty massage chairs among the 4 chairs that are usually kept running, you directly ask them to go to the empty one. But if all 4 massage chairs are occupied when a customer comes, you will start a new one until all 10 chairs are fully utilized.

What if all 10 massage chairs are fully utilized? You will tell the customer to wait for a while, promising that a massage chair will definitely become available within 5 minutes (wait time), and then the 11th customer starts waiting. In this case, there will be two results: if a massage chair becomes available within 5 minutes, the customer can go to the available massage chair, but if the customer waits for 5 minutes and no chair becomes available, you have to apologize and ask the customer to try another store.

For a database connection pool, based on my experience, I suggest keeping the minimum connection count around 10 and the maximum connection count around 20-30 for online systems.

Here, you need to pay attention to maintaining the connections in the pool, similar to what I mentioned with the massage chairs. Although some massage chairs are running, sometimes they may have faults. Generally, the reasons for “massage chair faults” may include:

  1. The IP corresponding to the database’s domain name has changed, but the connections in the pool are still using the old IP. When the database service on the old IP is closed, an error will occur when using this connection to query.
  2. MySQL has a parameter called “wait_timeout” that controls how long the server lets a connection sit idle before actively closing it. This close is invisible to the client, so when we later use the dead connection, an error will occur.

So, as the owner of the massage chair store, how do you ensure that the running massage chairs are always available?

  1. Start a thread to periodically check if the connections in the connection pool are available. For example, send the command “select 1” to the database with the connection to see if an exception is thrown. If an exception is thrown, remove this connection from the connection pool and attempt to close it. Currently, C3P0 connection pool supports this method of detecting connection availability, which is also the method I recommend.
  2. After obtaining a connection, verify its availability before executing SQL statements. For example, the testOnBorrow configuration item in the DBCP connection pool controls whether this validation is enabled. This method introduces extra overhead when obtaining connections, so it is recommended not to enable it in production systems, but it can be used in testing services.
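The first strategy can be sketched as follows. `PooledConn` is a hypothetical interface introduced here for illustration; in a real pool, `isValid()` would issue a lightweight probe such as “select 1” over JDBC:

```java
import java.util.Iterator;
import java.util.List;

// Sketch of strategy 1: a periodic check that evicts dead connections.
// PooledConn is a hypothetical stand-in for a pooled JDBC connection.
public class IdleConnectionChecker {
    public interface PooledConn {
        boolean isValid();   // e.g. execute "select 1"; false if it throws
        void close();
    }

    // Remove and close every connection that fails validation.
    // Returns the number of evicted connections.
    public static int evictDead(List<PooledConn> pool) {
        int evicted = 0;
        for (Iterator<PooledConn> it = pool.iterator(); it.hasNext(); ) {
            PooledConn conn = it.next();
            if (!conn.isValid()) {
                try { conn.close(); } catch (Exception ignored) { }
                it.remove();
                evicted++;
            }
        }
        return evicted;
    }
}
```

In a real pool this method would run on a background thread, for example via `ScheduledExecutorService.scheduleAtFixedRate`, and the pool's connection list would need proper synchronization.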

Now, you have completely understood how a connection pool works. However, just when you were about to take a breather, the CEO came up with a new requirement. After analyzing this requirement, you found that in a very important interface, you need to access the database 3 times. Based on your experience, you believe that this will definitely become a bottleneck in the system in the future.

Furthermore, you think that creating multiple threads to interact with the database in parallel could speed things up. However, having learned your lesson from the database issue, you realize that frequently creating threads under high concurrency would also incur significant overhead. So you continue along the same line of thought and reach for a thread pool.

Pre-creating Threads with a Thread Pool #

As expected, the ThreadPoolExecutor introduced in JDK 1.5 is a thread pool implementation. It has two important parameters, which I will call coreThreadCount and maxThreadCount (in the actual API they are named corePoolSize and maximumPoolSize), and these control the execution of the thread pool. Its execution principle is similar to the massage chair store model above. Let me describe it to further deepen your memory:

  • If the number of threads in the thread pool is less than coreThreadCount, new threads will be created when handling new tasks;
  • If the number of threads is greater than or equal to coreThreadCount, new tasks will be put into a queue and executed later by idle threads;
  • When the queue of tasks becomes full, threads will continue to be created until maxThreadCount is reached;
  • When the number of threads has reached maxThreadCount and new tasks are still being submitted, they are handed to the rejection policy; the JDK's default policy refuses them by throwing a RejectedExecutionException.

[Figure: the JDK thread pool's task processing flow]
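This flow can be observed directly with the JDK pool. The sketch below submits long-running tasks and shows the pool preferring its queue over creating threads beyond the core size (parameter values are illustrative):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Demonstrates the flow above with the JDK's ThreadPoolExecutor.
public class PoolFlowDemo {
    // Returns {thread count after 4 submits, queued tasks, thread count after a 5th submit}.
    static int[] runDemo() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                               // corePoolSize ("coreThreadCount")
                4,                               // maximumPoolSize ("maxThreadCount")
                60, TimeUnit.SECONDS,            // idle timeout for non-core threads
                new ArrayBlockingQueue<>(2));    // bounded task queue, capacity 2
        CountDownLatch gate = new CountDownLatch(1);
        Runnable blocker = () -> {
            try { gate.await(); } catch (InterruptedException ignored) { }
        };

        // Tasks 1-2 each get a new core thread; tasks 3-4 go into the queue.
        for (int i = 0; i < 4; i++) pool.execute(blocker);
        int threadsBefore = pool.getPoolSize();
        int queued = pool.getQueue().size();

        // Task 5 finds the queue full, so a thread beyond the core count is created.
        pool.execute(blocker);
        int threadsAfter = pool.getPoolSize();

        gate.countDown();
        pool.shutdown();
        return new int[] { threadsBefore, queued, threadsAfter };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = runDemo();
        System.out.println("threads=" + r[0] + " queued=" + r[1] + " after overflow=" + r[2]);
    }
}
```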

This task processing flow seems simple, but in reality, there are many pitfalls that you need to be aware of when using it.

First, the JDK's thread pool prefers to buffer tasks in the queue rather than create more threads, which makes it better suited to CPU-intensive tasks, i.e. tasks requiring heavy CPU computation. Why? When executing CPU-intensive tasks the CPU is already busy, so we only need roughly as many threads as there are CPU cores; creating more threads just causes extra context switching and reduces execution efficiency. Therefore, once the number of threads exceeds coreThreadCount, the thread pool does not grow further but puts new tasks into the queue to wait for a core thread to become idle.

However, the web systems we usually develop involve a large number of IO operations, such as database and cache queries. While a task is waiting on IO, the CPU sits idle, so if we create more threads to run tasks instead of queueing them, we can execute more tasks per unit time and greatly improve throughput. That is why Tomcat uses a modified thread pool rather than the JDK's native behavior: once the number of threads exceeds coreThreadCount, it prioritizes creating new threads up to maxThreadCount before queueing. This is better suited to IO-heavy web systems, and you can borrow the idea in your own applications.
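One way to get this “threads before queue” behavior, and roughly the trick behind Tomcat's customized pool, is a task queue that reports itself as full while the pool can still grow, forcing ThreadPoolExecutor to create a thread; once the pool is at maximum size, a rejection handler falls back to real queueing. The sketch below is a simplified illustration of that idea, not Tomcat's actual implementation:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Simplified "create threads before queueing" pool -- not Tomcat's actual classes.
public class EagerThreadPool {
    static class EagerQueue extends LinkedBlockingQueue<Runnable> {
        final AtomicReference<ThreadPoolExecutor> pool = new AtomicReference<>();

        @Override
        public boolean offer(Runnable task) {
            ThreadPoolExecutor p = pool.get();
            // Pretend to be full while more threads can still be created.
            if (p != null && p.getPoolSize() < p.getMaximumPoolSize()) return false;
            return super.offer(task);
        }

        boolean forceOffer(Runnable task) {
            return super.offer(task);   // bypass the trick and really enqueue
        }
    }

    public static ThreadPoolExecutor create(int core, int max) {
        EagerQueue queue = new EagerQueue();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS, queue,
                (task, executor) -> {
                    // Pool is saturated: queue the task instead of aborting.
                    if (!queue.forceOffer(task)) throw new IllegalStateException("queue full");
                });
        queue.pool.set(pool);
        return pool;
    }
}
```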

Second, the amount of tasks accumulated in the queue used by the thread pool is also an important metric that needs to be monitored, especially for tasks that have high real-time requirements.

In one of my past projects, I encountered a strange problem where tasks that were handed over to the thread pool were not executed for a long time. At first, I thought it was a bug in the code, but after investigation, I found that it was because the coreThreadCount and maxThreadCount of the thread pool were set relatively small, resulting in a large accumulation of tasks in the thread pool. After increasing these two parameters, the problem was resolved. After overcoming this hurdle, I included the important metric of task accumulation in the thread pool as a key monitoring indicator on the system monitoring dashboard.

Finally, if you use a thread pool, always remember not to use an unbounded queue (i.e., a queue with no fixed size). You might think that using an unbounded queue means that tasks will never be discarded and as long as the tasks do not require high real-time performance, they will be consumed sooner or later. However, a large accumulation of tasks will consume a large amount of memory. Once the memory is full, frequent Full GC will be triggered, causing the service to become unavailable. I have previously investigated an incident where a system crashed due to a Full GC triggered by a thread pool using an unbounded queue.
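Concretely, pairing a bounded queue with an explicit rejection policy avoids this failure mode. The pool sizes and queue capacity below are illustrative, not recommendations; CallerRunsPolicy is one JDK-provided option that applies back-pressure instead of silently piling tasks up:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// A bounded queue plus an explicit rejection policy keeps memory usage capped.
// CallerRunsPolicy makes the submitting thread run the task itself when the
// pool is saturated, slowing producers down instead of risking a Full GC.
public class BoundedPoolFactory {
    public static ThreadPoolExecutor newBoundedPool() {
        return new ThreadPoolExecutor(
                4, 8, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1000),               // bounded: at most 1000 waiting tasks
                new ThreadPoolExecutor.CallerRunsPolicy());   // back-pressure when saturated
    }
}
```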

Having understood the key points of the thread pool, you have added this feature to the system. At this point, the system is stable, and you have successfully completed the development task assigned by the company.

Now, as you look back at these two technologies, you will find that they have a common characteristic: the objects they manage, whether connections or threads, have a relatively time-consuming and resource-consuming creation process. Therefore, we put them in a pool to manage them collectively in order to improve performance and resource reuse.

This is a common software design concept called pooling technology. Its core idea is trading space for time: by using pre-created objects, it reduces the performance overhead of frequent object creation, and it also allows objects to be managed uniformly, lowering the cost of using them.

However, pooling technology also has some shortcomings. For example, storing objects in a pool will definitely consume extra memory. If the objects are not frequently used, it will result in memory waste. Another example is that objects in the pool need to be pre-created when the system starts up, which increases the system startup time to some extent.

But compared to the advantages of pooling technology, these shortcomings are relatively trivial. As long as we confirm that the objects we want to use do indeed consume time or resources during creation and these objects will indeed be created and destroyed frequently, we can use pooling technology to optimize them.

Summary of the Course #

In this lesson, I simulated the most primitive scenario of developing a vertical e-commerce system. When encountering a problem of declining database query performance, we solved the performance issue caused by frequent connection creation by using a database connection pool. Later, we used a thread pool to improve the performance of parallel database queries.

In fact, you are no stranger to connection pools and thread pools, but you may still hold confusion or misconceptions about their principles and usage. In interviews, I have found that many candidates are unfamiliar with even the basic usage of thread pools. Taking advantage of this lesson, let me emphasize the key points once again:

  • The setting of the maximum and minimum values of the pool is very important. Initially, you can set them based on experience, but later adjustments need to be made according to the actual running situation.
  • Objects in the pool need to be pre-initialized before use, which is called pool warm-up. For example, when using a thread pool, all core threads need to be pre-initialized. If the pool is not warmed up, it may result in a large number of slow requests after the system restarts.
  • The core of pooling technology is a practice of space-time trade-off optimization. Therefore, it is necessary to pay attention to space usage to avoid excessive use of space, leading to memory leaks or frequent garbage collection issues.
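For the thread pool case, the warm-up mentioned in the second point can be done with the JDK API itself: ThreadPoolExecutor can pre-start all of its core threads at startup. A minimal sketch (the sizes and queue capacity are illustrative):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Pool warm-up: prestartAllCoreThreads() creates all core threads immediately,
// so the first requests after a restart do not pay the thread-creation cost.
public class WarmPoolFactory {
    public static ThreadPoolExecutor newWarmPool(int core, int max) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(1000));
        pool.prestartAllCoreThreads();   // warm up: start every core thread now
        return pool;
    }
}
```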