
08 Relationship Between Concurrent Users, Online Users, and TPS #

Hello, I’m Gao Lou.

In the field of performance, we often use “concurrent users” to assess whether a system meets performance requirements. For example, we describe the performance requirement as “the system can support 1000 users”. But what exactly does concurrency mean? What is the relationship between concurrency and TPS? And how does it relate to concurrent user count and online user count?

These questions have long puzzled performance engineers. Whether it’s from online articles or discussions in various groups, we hear different opinions. Therefore, even though I run the risk of causing a debate, I still want to write about this issue.

Typical Debates #

One day, a young man told me that he and his colleagues had a fierce argument in the company meeting room after reading an article from my previous column titled “Performance Testing in Practice: 30 Lectures”. One of his colleagues even brought up calculus. I was delighted to hear about such a debate. Just like the Jixia Academy in the Warring States Period, there would not have been such a brilliant peak of civilization without debates.

Their argument revolved around the capacity evaluation of a project: a performance verification of a system built on Kubernetes with a microservice architecture on top. The crux of their dispute was how to evaluate capacity.

They divided into two camps. The first camp aimed to derive the number of concurrent users (the eventual number of threads in the tool) from the Daily Active Users (DAU) and the user business model. The second camp considered that method unreasonable and instead advocated working from the server’s perspective, inferring the transactions per second (TPS) the server should sustain.

The young man said that during the argument between the two camps, they couldn’t agree on certain concepts, including users, the number of threads in the tool, TPS, and response time. Later, he told me that the first camp’s evaluation method focused on users but did not consider their actions. The second camp approached it from the perspective of user operations and calculated the “number of user operations per time period” based on the operation frequency. They then used this value as the required TPS, added their own tolerance for response time (RT), and derived the number of concurrent users.

During the debate, neither side managed to convince the other. The first camp believed that the second camp’s estimate could deviate significantly, while the second camp claimed that failing to convert business indicators into technical indicators would not be rigorous. One colleague even produced a formula:

\(Number\ of\ Concurrent\ Users = TPS \times RT\)

Therefore, the result of this debate was: no conclusion.

Now let’s consider who was right in the debate mentioned above. Which formula is more reasonable?

Let me give you an example to explain (to simplify the problem, the following diagram does not consider changes in response time):

As you can see, there are 5 pressure threads in this diagram, and each thread can complete 5 requests within 1 second. According to the formula mentioned above, the conclusion we should arrive at is:

\(Concurrent\ Users = TPS \times RT = 25\ (transactions\ per\ second) \times 0.2\ (seconds\ of\ response\ time) = 5\)

This 5 obviously represents the number of concurrent threads. But is this concurrency from the user’s perspective? Obviously not, because in the diagram each transaction represents a real user.
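To make the arithmetic concrete, here is a minimal sketch of that formula in Python; the inputs are the 25 TPS and 0.2-second response time from the example above:

```python
# The formula under debate: concurrency (threads in flight) = TPS x RT.
tps = 25   # transactions completed per second (5 threads x 5 requests each)
rt = 0.2   # average response time in seconds

threads_in_flight = tps * rt
print(threads_in_flight)  # 5.0 -- the 5 pressure threads, not 5 "users"
```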

This raises a crucial question: What exactly is concurrency?

According to Baidu Baike (a Chinese-language encyclopedia), concurrency in terms of an operating system is defined as follows:

In an operating system, concurrency refers to the situation where several programs are started and running on the same processing unit during a certain period of time, and at any given point in time, only one program is running on the processor.

According to Wikipedia:

In computer science, concurrency is the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the final outcome. This allows for parallel execution of the concurrent units, which can significantly improve the overall speed of execution in multi-processor and multi-core systems. In more technical terms, concurrency refers to the decomposability of a program, algorithm, or problem into order-independent or partially ordered components or units of computation.

These two descriptions may seem a bit different, so we need to understand where they agree and where they diverge. Going by the English description, you could state the concurrency without even running a performance scenario, simply by counting the number of processor cores. Going by the Chinese description, you must consider how many transactions were completed within a “period of time”.

Having said that, you should also note that these descriptions are all based on the level of the processor.

So, from the perspective of the user, what do you think is a more reasonable description? In my opinion, a more reasonable description is: “Concurrency is the number of transactions (T) completed within a unit of time. If the transaction (T) represents a user operation, then it can be described as concurrent users.”

Now let’s go back to the previous example. If the value in the formula is 100 TPS, it means there are 100 concurrent transactions, not 10, because 10 concurrent threads do not represent a “period of time”.

Misconceptions in the industry #

In the third lecture of “Performance Testing in Practice: 30 Lectures”, I also described the relationship between concurrent users and online users. There, I discussed the two calculation formulas most commonly seen online, often referred to as “industry standards” or “classic formulas”. They come from an article by Eric Man Wong titled “Method for Estimating the Number of Concurrent Users”. To save you the trouble of looking them up, I list both formulas here.

Calculation of average concurrent users (Formula 1): \(C = nL \div T\)

  • where: C is the average concurrent user count; n is the number of Login Sessions; L is the average length of a Login Session; T is the length of the observed time period.

Calculation of peak concurrent users (Formula 2): \(C' \approx C + 3 \times \sqrt{C}\)

  • where: C’ is the peak concurrent user count, and C is the average concurrent user count obtained from the previous formula. This formula assumes that the generation of Login Sessions by users follows a Poisson distribution.

Obviously, the term “online user count” is not mentioned in the above formulas. In the original article, the “online user count” is an assumed input, and the derivation follows the same logic as Little’s Law. I have already described the problems with these two formulas in detail in a previous column, so I won’t repeat all of it here; if you are interested, see the third lecture of “Performance Testing in Practice: 30 Lectures”. However, in that earlier article I did not explain the formulas themselves in detail, so the treatment was not comprehensive enough.

Now let me explain why these two formulas cannot be called “industry standards” or “classic formulas”.

Firstly, in the original article, the author used this chart to illustrate the state of user concurrency.

And it was assumed that the user arrival rate follows a Poisson distribution, so the Poisson distribution formula was applied. The author made this assumption because the Poisson distribution is the most tractable and commonly used tool for modeling the arrival rate of random, independent events, and it can be found in most statistics books.

However, this leap directly filters out many situations, because the arrival rate of your system may not follow a Poisson distribution but some other distribution, such as the following:

To determine which distribution your system follows, you need to analyze real user data. Although the Poisson distribution is commonly used, it is not suitable for every system; take a subway system as an example. In 2018, a research paper analyzing passenger flow in the Beijing subway showed the following distribution of passenger flow over time:

After analyzing the passenger flow mentioned above, the author found that the passenger flow data followed a gamma distribution. Based on different peak periods, the author then obtained the following distribution fitting results:

From this, you can see that in the article “Method for Estimating the Number of Concurrent Users”, the assumption that the user arrival rate follows a Poisson distribution only describes one possible result. Therefore, the two calculation formulas mentioned earlier cannot naturally become “industry standards”.

Calculation of Average Concurrent Users (Formula 1): \(C = \frac{nL}{T}\)

Where:

  • \(C\) is the average number of concurrent users
  • \(n\) is the number of Login Sessions
  • \(L\) is the average length of a Login Session
  • \(T\) is the length of the observation period

Calculation of Peak Concurrent Users (Formula 2): \(C' \approx C + 3 \times \sqrt{C}\)

Where:

  • \(C'\) represents the peak number of concurrent users
  • \(C\) is the average number of concurrent users obtained from the previous formula. This formula assumes that the Login Sessions of users follow a Poisson distribution and estimates the results accordingly.

In this article, the author also approximates the Poisson distribution to the normal distribution. The second formula is derived by finding the corresponding result in the standard normal distribution table, where the mean is 0 and the standard deviation is 1.
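For reference, here is a minimal sketch of the two formulas in Python. The inputs are hypothetical, and the code inherits every assumption described above (Poisson arrivals, a single business flow):

```python
import math

def avg_concurrent_users(n: int, L: float, T: float) -> float:
    """Formula 1: C = n * L / T."""
    return n * L / T

def peak_concurrent_users(c: float) -> float:
    """Formula 2: C' ~ C + 3 * sqrt(C); only meaningful if Login Sessions
    really do arrive according to a Poisson distribution."""
    return c + 3 * math.sqrt(c)

# Hypothetical inputs: 400 Login Sessions averaging 180 s in a 3600 s window.
c = avg_concurrent_users(n=400, L=180.0, T=3600.0)
print(c, peak_concurrent_users(c))  # 20.0 33.41...
```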

However, there are several issues:

  1. Is our system really an interaction without wait time, as shown in the figure? This is just a simplified assumption. If your scenario is more complex, Formula 1 may not be applicable.

  2. Formula 2 assumes that \(C\) follows a Poisson distribution. This means that in order to use this formula, you first need to determine if the arrival rate of users in your system follows a Poisson distribution. However, in our analysis of systems, you will see that many systems do not meet this condition.

  3. It is mentioned in the original text that \(C\) refers to the average value. Because it is an average value, there may be significant errors between \(C\) and the peak number of concurrent users.

  4. Formula 2 is derived by approximating the Poisson distribution to a standard normal distribution with a mean of 0 and a standard deviation of 1. Now think about it, does the average number of users in your system satisfy this condition?

  5. These two formulas are actually derived based on the assumption of a simple business scenario. However, our system may support multiple business operations. Do we need to calculate each one separately?

  6. Technically, whether it is online users or concurrent users, it needs to be reflected at the request level. However, these two formulas obviously do not reach that level. In the original text, the author used requests as an example, but it is only used to calculate request rates and bandwidth.

  7. Another major issue is that this method of estimating concurrent users is based on a specific business function. If a system has multiple business functions, this calculation method cannot be applied.

In summary, you can see that the so-called “industry-recognized” calculation formulas actually have many limitations. It is also difficult for us to apply this logic to our own systems in real-world scenarios.

In 2011, a Chinese paper titled “The Estimation Method of Common Testing Parameters in Software Stress Testing” used Chebyshev’s inequality for these calculations. If you are interested, you can take a look at it.

I don’t mean to dismiss these efforts; I just hope performance professionals can see where the issues lie. If you have done the statistical analysis and found that the assumptions in the original article hold, then the formulas above can be used. If not, we clearly cannot apply them blindly.

Practical Experience Brings True Understanding #

We place great importance on the relationship between concurrent users, online users, and TPS in this industry, yet there is no unified, practical reference, and individual efforts have not been convincingly validated. Does that mean there is no solution?

Of course not. Next, I want to show you the deduction logic of this key point through a specific practice, and then you can consider how to implement it in your own system.

Here, I will use placing an order in an e-commerce system as the example operation. Please don’t pay too much attention to what type of system it is; I hope you focus on the logic.

Let me explain first: because the operations I am about to perform are from the user’s perspective, I have built a system with a user interface to demonstrate this example, mainly to illustrate the relationship between online users, concurrent users, and TPS.

There are a total of 7 steps in this example of front-end operations, as shown below:

(Note: The last image shown in the above figure is the interface after logging out, with no operation, so there are a total of 7 steps.)

Now we want to know what requests were generated during the entire process of these operations. The specific request logs are as follows:

{"client_ip":"59.109.155.203","local_time":"27/Mar/2021:23:16:50 +0800","request":"GET / HTTP/1.1","status":"200","body_bytes_sent":"23293","http_x_forwarded_for":"-","upstream_addr":"127.0.0.1:8180","request_time":"0.001","upstream_response_time":"0.000"}
(skipping 98 lines)
{"client_ip":"59.109.155.203","local_time":"27/Mar/2021:23:21:00 +0800","request":"GET /resources/common/fonts/iconfont.ttf?t=1499667026218 HTTP/1.1","status":"200","body_bytes_sent":"159540","http_x_forwarded_for":"-","upstream_addr":"127.0.0.1:8180","request_time":"0.259","upstream_response_time":"0.005"}

These are all the operation logs of a user throughout the entire process, from opening the homepage, logging in, selecting goods, placing an order, to logging out, totaling 100 records. Let’s not discuss whether they are static resources or API calls for now. We will mainly talk about how these requests are converted into TPS, and what the relationship is between TPS, online users, and concurrent users.

Relationship between Online Users and TPS #

We must look at the relationship between online users and TPS at the level of actual operations; otherwise it is mere speculation, which convinces no one.

Earlier, we captured the corresponding logs (including static resources) generated by one user’s actions. This user is obviously an online user.

  • Calculation of TPS for a single online user

From the time window above, the user’s entire operation process is from 23:16:50 to 23:21:00, and the total time window is 250 seconds (a lucky number). The total number of requests is 100. However, we usually set up transactions, right? Now we need to discuss how transactions are defined.

  1. If you set each request as a transaction T, obviously you don’t need to calculate much: one user corresponds to 0.4 TPS. The calculation is: - \(1\ (user) \times 100\ (requests) \div 250\ (seconds) = 0.4\ (TPS)\)

  2. If you define transactions at the level of each business operation, as mentioned earlier, there are 7 business operations completed within the 250 seconds. The corresponding TPS is: - \(1\ (user) \times 7\ (business\ operations) \div 250\ (seconds) = 0.028\ (TPS)\)

This means that if you define transactions at the level of business operations, in this example, a user would require 0.028 TPS. Please note that the number of requests in each transaction is not consistent.

  3. If you define transactions at the entire user level (business departments usually require this, because completing all these steps means a complete piece of business is done), obviously only 1 transaction was completed within the 250 seconds. The corresponding TPS is: - \(1\ (user) \times 1\ (complete\ user\ flow) \div 250\ (seconds) = 0.004\ (TPS)\)

As you can see, the results differ depending on the level at which the transaction is defined. Therefore, if a project simply states that the performance requirement is “a certain TPS”, different people will interpret that TPS differently. So, if someone asks you to deliver 1000 TPS, you should first ask: at what level is T defined?

Please note that in this logic, I did not incorporate the business model into the discussion, because adding the business model will actually make the problem more complex.
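Before moving on to multiple users, here is the single-user arithmetic gathered in one place; a minimal Python sketch using the numbers derived from the log above:

```python
# TPS of one online user under three transaction definitions.
window_s = 250              # seconds between first and last logged request
requests, operations, flows = 100, 7, 1

print(requests / window_s)    # 0.4   TPS when T = one request
print(operations / window_s)  # 0.028 TPS when T = one business operation
print(flows / window_s)       # 0.004 TPS when T = one complete user flow
```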

  • Calculation of TPS for multiple online users

The above calculation was based on the actions of one user. But what about a different user? It is unlikely to take exactly 250 seconds. And with thousands or even tens of thousands of users, they certainly will not all take exactly 250 seconds. So this premise becomes a problem.

To address this, let’s assume (please note that this assumption is only made for the convenience of subsequent calculations and does not mean that this assumption is true) that there are 100,000 users in the system who complete business operations on average within 250 seconds and complete them within an hour (this data is already highly concentrated). Now you can calculate the required TPS.

  1. TPS at the request level: - \((100,000(users) \times 100(requests)) \div 3600(seconds) \approx 2,777.78(TPS)\)

  2. TPS at the single business operation level: - \((100,000(users) \times 7(business operations)) \div 3600(seconds) \approx 194.44(TPS)\)

  3. TPS at the user level: - \((100,000(users) \times 1(user-level)) \div 3600(seconds) \approx 27.78(TPS)\)

With these calculations, we can determine how many TPS are needed to correspond to the number of online users.

  • Calculation of peak TPS for online users

Clearly, the above calculation is based on the assumption that all users are evenly distributed within an hour. But what if there is a peak? This algorithm would not be applicable, right? This is why I mentioned the need for historical business peak data. For the specific statistical process, please refer to Chapter 6.

The shorter the statistical time period for the peak of online business, the more accurate it is. Let’s assume that we have statistically determined that in the production environment, 100,000 users completed their transactions within an hour. Among them, 10,000 users completed their transactions within 1 minute of that hour. This data has already reached the level of flash sales in large-scale e-commerce platforms. With the calculation method mentioned above, we can obtain the following:

  1. TPS at the request level: - \((10,000(users) \times 100(requests)) \div 60(seconds) \approx 16,666.67(TPS)\)

  2. TPS at the single business operation level: - \((10,000(users) \times 7(business operations)) \div 60(seconds) \approx 1,166.67(TPS)\)

  3. TPS at the user level: - \((10,000(users) \times 1(user-level)) \div 60(seconds) \approx 166.67(TPS)\)

To obtain an accurate peak TPS, it is obvious that the prerequisite is to have a sufficiently accurate statistical time period.
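The same arithmetic scales directly to the population. Here is a minimal sketch using the assumed figures above (100,000 users spread over an hour, 10,000 of them concentrated in one peak minute):

```python
def required_tps(users: int, units_per_user: int, window_s: int) -> float:
    """Required TPS = (users x transactions per user) / statistical window."""
    return users * units_per_user / window_s

# Average hour, then peak minute, at request/operation/user level.
for users, window in ((100_000, 3600), (10_000, 60)):
    print([round(required_tps(users, u, window), 2) for u in (100, 7, 1)])
# [2777.78, 194.44, 27.78]
# [16666.67, 1166.67, 166.67]
```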

Through the calculation process described above, we can determine the corresponding TPS at different levels based on the number of online users, including static resources. For the calculation process excluding static resources, you can calculate it based on the same logic.

Relationship between Concurrent Users and TPS #

From the example of calculating online users above, you may have noticed that there is a time interval between two operations in the logs. So, what would be the TPS if a user has no interval between operations?

Using JMeter’s browser-recording capability, we first convert the same operation steps into a JMeter script, then replay it, capture the logs, and see how long a complete user flow takes when there are no pauses. The logs are as follows:

{"client_ip":"59.109.155.203","local_time":"28/Mar/2021:01:08:56 +0800","request":"GET / HTTP/1.1","status":"200","body_bytes_sent":"23293","http_x_forwarded_for":"-","upstream_addr":"127.0.0.1:8180","request_time":"0.109","upstream_response_time":"0.109"}
(middle section omitted)
{"client_ip":"59.109.155.203","local_time":"28/Mar/2021:01:09:02 +0800","request":"GET /resources/common/fonts/iconfont.ttf?t=1499667026218 HTTP/1.1","status":"200","body_bytes_sent":"159540","http_x_forwarded_for":"-","upstream_addr":"127.0.0.1:8180","request_time":"0.005","upstream_response_time":"0.005"}

From the timestamps, you can see that there are 100 requests in total from the first to the last, taking 6 seconds altogether. (Note: to keep things clear, I captured only one user’s complete flow; in a real stress scenario, you should use the average time of all such requests.)

Similarly, let’s calculate the corresponding TPS:

  1. TPS at the request level: \(1(user) \times 100(requests) \div 6(seconds) \approx 16.67(TPS)\)
  2. TPS at the single business operation level: \(1(user) \times 7(business operations) \div 6(seconds) \approx 1.17(TPS)\)
  3. TPS at the user level: \(1(user) \times 1(user level) \div 6(seconds) \approx 0.17(TPS)\)

We can also calculate how many users with intervals (online users) are equivalent to one user without intervals (a concurrent user). In this conversion, we temporarily ignore differences between the requests. The ratio is obviously:

\(16.67 \div 0.4 = 1.17 \div 0.028 = 0.17 \div 0.004 \approx 41.67\ (times)\)

It doesn’t matter which level of TPS you use; apart from rounding error in the intermediate values, each ratio is exactly \(250 \div 6 \approx 41.67\).

In this way, we have a clear understanding of concurrency, which is:

\(1\ (concurrent\ user) \div 41.67\ (online\ users) \approx 2.4\%\ (i.e.,\ 6 \div 250)\)

So, if you have recorded a script and have not set a pause time (also known as Think Time or wait time), and you want to support 100,000 online users to complete all business operations in one hour, then the corresponding number of concurrent users you need to support is:

\(100,000\ (online\ users) \times 2.4\% = 2,400\ (concurrent\ users)\)

And the TPS at the request level that we measured from one thread is 16.67. To simulate 100,000 online users, the number of threads required to generate load is:

\(2,777.78(TPS at 100,000 online users) \div 16.67(TPS per thread) \approx 167(threads)\)

At this point, let’s summarize the formulas mentioned earlier.

Relationship between Online Users and Number of Threads:

  • Calculated using TPS at the request level:

\(Number\ of\ Threads = \frac{Online\ Users \times Number\ of\ Requests\ per\ User}{Peak\ Sampling\ Time} \div TPS\ per\ Thread\ at\ the\ request\ level\)

  • Calculated using TPS at the single business operation level:

\(Number\ of\ Threads = \frac{Online\ Users \times Number\ of\ Business\ Operations\ per\ User}{Peak\ Sampling\ Time} \div TPS\ per\ Thread\ at\ the\ business\ operation\ level\)

  • Calculated using TPS at the user level:

\(Number\ of\ Threads = \frac{Online\ Users \times Number\ of\ Complete\ Business\ Flows\ per\ User}{Peak\ Sampling\ Time} \div TPS\ per\ Thread\ at\ the\ user\ level\)

Calculation of Concurrent Users:

  • \(Concurrent\ Users = Online\ Users \times \frac{TPS\ per\ Thread\ with\ Pause}{TPS\ per\ Thread\ without\ Pause}\)

Concurrency:

  • \(Concurrency = \frac{Concurrent\ Users}{Online\ Users} \times 100\%\) (values should be taken from the same time period)
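To tie the summary formulas together, here is a minimal runnable sketch; the sample inputs are this lesson’s request-level figures, and the function and variable names are my own:

```python
import math

def thread_count(online_users: int, units_per_user: int,
                 window_s: int, tps_per_thread: float) -> int:
    """Threads = target TPS / TPS one pause-free thread can deliver."""
    target_tps = online_users * units_per_user / window_s
    return math.ceil(target_tps / tps_per_thread)

def concurrent_users(online_users: int, tps_with_pause: float,
                     tps_without_pause: float) -> float:
    """Concurrent users = online users x concurrency ratio."""
    return online_users * tps_with_pause / tps_without_pause

# Request-level inputs from this lesson: 100 requests per user in 3600 s,
# 16.67 TPS per pause-free thread, 0.4 TPS per online user with pauses.
print(thread_count(100_000, 100, 3600, 16.67))  # 167
print(concurrent_users(100_000, 0.4, 16.67))    # ~2399.5, i.e. ~2,400
```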

From the above calculation logic, we can see that there are several key data points:

  1. Online users. This value can be obtained from the logs.
  2. Time period for counting online users. This value can also be obtained from the logs.
  3. Time taken for a complete user-level operation flow (remember to sample more data and calculate the average time). This value is also obtained from the logs.
  4. Time taken for a complete business flow without pauses. This value can be obtained from the load testing tool.
  5. Number of requests in a complete user-level operation flow. This value can be obtained from the logs.

How to use “Think Time”? #

In the performance industry, when converting online users and concurrent users, there is a concept that we absolutely cannot overlook, and that is “Think Time.” Because many people want to use Think Time to describe the pause during actual online user operations, let’s talk about this important concept.

Ever since Mercury (the original vendor of LoadRunner) entered the Chinese market and introduced the concept of Business Technology Optimization (BTO), Think Time has gradually gained popularity along with the widespread use of LoadRunner.

However, using Think Time is not as easy as it seems.

In the previous example, we saw that a user’s complete business flow operation took 250 seconds, including Think Time. For the user, there were 7 operations, but what does it mean for the system? Let’s take a look at the distribution of these operations over time.

(Note: The extra requests in the above figure are some automatic triggers, which can be ignored.)

As you can see, there are actually intervals between each operation. And this time interval is often referred to as Think Time in performance scripts. If you want to set Think Time, you need to get the time interval between every two operations.

Furthermore, pay attention: you can’t just use the data from a single user. You need to gather the time intervals between operations for a large number of real users, then calculate the mean and standard deviation before configuring them in the load testing tool, as sketched below.
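The aggregation step itself is simple once you have the data. Here is a minimal sketch, assuming the access logs have already been parsed into per-user lists of gaps between consecutive operations (the sample values are invented):

```python
import statistics

# Assumed input: per real user, the gaps in seconds between consecutive
# operations, extracted from production access-log timestamps.
gaps_by_user = {
    "user-a": [12.0, 35.5, 8.2, 40.1],   # invented sample values
    "user-b": [20.3, 18.7, 55.0],
}

all_gaps = [gap for gaps in gaps_by_user.values() for gap in gaps]
mean = statistics.mean(all_gaps)
stdev = statistics.stdev(all_gaps)
print(f"think time: mean={mean:.1f}s, stdev={stdev:.1f}s")
# Feed these into the tool's randomized timer (e.g., JMeter's Gaussian
# Random Timer) rather than copying one user's pauses verbatim.
```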

In my work experience, hardly any company has been able to achieve this. Whenever I see such a situation, I advise them not to use Think Time because even if they do, it doesn’t necessarily indicate that they are simulating the behavior of real users.

Why can’t we use the timeout of user sessions to calculate concurrent users? #

It is difficult to accurately track the amount of time a user spends online and the intervals between their actions. As a result, some people have suggested using the timeout period of user sessions to calculate concurrent users. Let me first draw a diagram for you and then explain it to you.

(Note: In the diagram above, an arrow represents a complete user-level business flow operation.)

Those who use this approach to calculate concurrent users would often say:

“You see, when a user enters the system and performs some actions, the concurrency is 1. But before the first user finishes their actions, a second user enters, so the concurrency becomes 2. And it’s also possible for more users to enter subsequently, so the concurrency might become 3…”

Doesn’t this logic seem very reasonable? This line of thinking was used in the “Method for Estimating the Number of Concurrent Users” mentioned earlier. But what’s the issue with this approach? There are two problems:

  1. Problem 1: Can you draw the red lines shown in the diagram? Obviously not, because they represent points in time! When you collect statistics in a system, how can you obtain data at an exact point in time? No matter how fine-grained your logs are, even down to the nanosecond, they still represent time periods.

  2. Problem 2: Can a user’s behavior in a system be represented by a straight line like in the diagram? Clearly not. As we can see from the user operation interval graph we captured earlier, a user considers themselves to be online during their actions, but at the request level, there are pauses in between. Therefore, even if a user is constantly performing actions in the system, their requests will not flow continuously like water.

We know that a session is a string of characters stored on the user’s end and in the system. It can be used to identify requests exchanged between the user and the system. However, the existence of a session does not mean that there is continuous interaction between the user and the system without any gaps.

Therefore, calculating the number of sessions can inform us of how many users are online, but it does not necessarily mean that these users are actively interacting with the system.

In terms of session configuration, if the session’s expiration time is set to 30 minutes, then any actions performed within those 30 minutes will be recognized. However, during that time, it doesn’t mean that the user will continuously make requests, and even the TCP connection might not be maintained. For systems with short connections, the TCP connection will be immediately terminated after a request is completed. For systems with long connections, we need to send heartbeats to keep the connection alive, but that also only maintains the connection and does not necessarily involve data exchange.

Therefore, a session merely consumes memory to store the string and is used for identification between the user and the system. It cannot be used to calculate the number of concurrent users in terms of performance.

Is there really any controversy between RPS and TPS? #

I remember reading an article online that suggested measuring a system’s performance using RPS (Requests Per Second) rather than TPS (Transactions Per Second). The article even referred to TPS as a “shocking misconception that has plagued the industry for years.” There have been similar discussions among my students as well.

When it comes to RPS and TPS, you can see that many people have different opinions and are at odds with each other. The result is that no one can convince anyone else, turning this issue into a philosophical debate.

After looking through several articles, if I understand correctly, the main points of disagreement are as follows:

  1. TPS is measured from the perspective of the load testing tool, but because TPS can be influenced by response time, it is not recommended to use TPS for performance measurement.

  2. In the case of serial request interfaces, due to various exceptional issues, TPS may not reflect the number of requests per second from the backend services.

  3. TPS reflects the perspective of the business, while RPS reflects the perspective of the server.

These arguments seem reasonable, but are there any errors? Let’s go through each point and understand them further.

  1. In synchronous request-response logic, TPS will inevitably be inversely proportional to response time. So it is reasonable for TPS to be affected by response time. And what we want to analyze is the performance issue when response time increases. Does using RPS mean we no longer care about response time?

In asynchronous logic, if we only focus on how many requests are sent out, it obviously cannot reflect the system’s complete processing capacity. Therefore, the first point of debate does not actually exist.

  2. Even if the interfaces are serial and the backend processes are lengthy, which leads to multiple requests being generated at different nodes, the number of requests on the backend will definitely be greater than TPS in the load testing tool. In a fixed scenario, TPS in the load testing tool and the number of backend requests should be in a linear relationship, shouldn’t they?

If exceptions occur, such as errors causing certain backend services to receive fewer requests, isn’t this precisely the performance issue we want to analyze? Therefore, the second point should not be subject to any debate.

  3. This point is even more peculiar. No one is actually confusing TPS in the load testing tool with RPS on the server side. These two are different statistical techniques taken from inherently different perspectives, so why would they be debated as opposed to each other? So the third point should not be subject to debate either.

Let me use a diagram to illustrate the relationship between requests and TPS:

As shown in the diagram above, if a thread in the load testing tool (represented by the position of the person in the diagram) sends a request (i.e., at position 0 in the diagram), a total of 4 requests will be generated in the system (represented by positions 0, 1, 2, and 3 in the diagram). Whether these requests are synchronous or asynchronous, they are all real requests. If another thread comes and sends the same request, the system will inevitably generate a total of 8 requests. This logic is clear.

If we consider the requests generated by threads in the load testing tool as T (transactions in the load testing tool), the corresponding backend requests should be 4R (total number of backend requests). Please note that the load testing tool cannot directly measure the 4 backend requests, and it is also unnecessary to measure them. We should leave this statistical work to business monitoring and log monitoring systems, without burdening the load testing tool.

Clearly, there is a linear relationship between requests and TPS, unless you are sending a different request or changing the parameters.

If you want to focus on backend RPS, go ahead; if you prefer to focus on the load testing tool’s TPS, that’s fine too. However, in the specific implementation of a project, whether it’s RPS or TPS, it must be clearly stated and understood by everyone.

Since TPS and RPS are linearly related, there is no need to treat the two perspectives as opposites. Doing so only makes performance harder to understand and has no practical value. In other words, it is not a point of controversy at all.

Summary #

In this lesson, I made a thorough analysis of the relationships between the number of online users, concurrent users, concurrency, and TPS. When communicating with people in different roles, pay attention to the level at which they understand concurrency, online users, and TPS; if we are not on the same page, we cannot reach a consensus.

When working on performance projects, if you can obtain the key data, you can calculate it according to the corresponding formulas we discussed earlier. And this calculation logic is not only applicable to HTTP, but also to any protocol.

In this lesson, I also explained in detail the common misunderstandings about online users, concurrent users, concurrency, and TPS, and analyzed several industry misconceptions in depth. From this you can see that leaning exclusively toward either the business perspective or the TPS perspective is wrong; only by linking the two can we build a sound chain of reasoning from business indicators to technical ones.

I hope you can truly understand the relationships between them.

Homework #

That’s all for today’s content. Finally, please think about the following:

  1. How do you obtain the TPS of active online users (regardless of the level)?
  2. What are the hazards in performance scenarios that do not include static resources?

Remember to discuss and exchange your thoughts with me in the comments section. Every thought you have will help you make further progress.

If this lesson has benefitted you, feel free to share it with your friends and learn and progress together. See you in the next lesson!