05 Performance Plan: Is Your Plan Still Stuck on Formality? #

Hello, I am Gao Lou.

The performance plan is a very important document in a performance project. It guides the entire execution process, constrains the boundaries of the project, and defines the roles and responsibilities of the people involved. However, it is disheartening to see that it has become “insignificant” today.

In many common performance projects, the performance plan is just a document, and a static one at that. No one really cares what is written inside or whether the project will follow it; it has become a mere formality that is only taken out for review. In some third-party testing projects, I have seen client parties who do not even read the content of the plan; they only ask whether it exists, and if it does, it passes. You see, an essential deliverable is being ignored.

In my performance engineering philosophy, the performance plan is a heavyweight document. In a typical performance project it is called a “performance testing plan”; in my approach, I want to remove the word “testing”. Why? It follows from the performance engineering philosophy I described in the previous lessons: I want the plan to describe the entire project process.

So, what is the difference between the performance plan I am talking about and the common performance plans? Let’s take a look at what the latter typically looks like.

You are probably familiar with these sections, as we often see performance testing plans with such a table of contents.

I won’t list them one by one here, but you can see that similar sections appear in other plans as well. From my perspective, this kind of table of contents can generally be divided into several parts:

  • General project information: such as testing background, testing scope, testing criteria, testing environment, implementation preparation, organizational structure, project risks, milestones.
  • Performance implementation information: such as testing models, testing strategies, monitoring strategies.
  • Project outputs: such as testing scripts, test cases/scenarios, monitoring data collection, test reports, tuning reports.

From the perspective of a performance testing plan, these contents seem sufficient. However, if we abandon the “testing” perspective and look at it from the perspective of a complete performance project, these contents are still insufficient.

In the past, people often asked me for a performance plan template. I didn’t quite understand why they needed one, since it is just a table of contents; couldn’t they write it themselves? But I gradually came to understand that what they really wanted was not the outline but the complete content of a performance plan.

However, as we know, performance implementation plans can hardly be distributed directly; even if anonymized, some of the content can still be traced back to particular companies. So, out of professional courtesy, these documents have to sit on our hard drives until they become outdated and irrelevant. This is also why we cannot find very complete performance plans online.

Yet even though the plans available online are incomplete, in the performance market we still see too many performance plans that are copied and pasted, all with roughly the same structure. This has led to a situation where many plans in performance projects carry only formal significance.

Because this course requires us to write based on a complete project, I have written the overall plan of this project here. You will see what I consider a truly complete and meaningful performance plan, and I hope it will give you some inspiration.

Because the performance plan is detailed and its content relatively fragmented, I have prepared a table of contents to help you study the specifics.

Performance Project Implementation Plan #

Background #

Project Background #

As we mentioned earlier, this course requires building a relatively complete performance project. However, due to the restrictions imposed by commercial software in various organizations, we can only choose an open-source project. Ideally, this project should cover common technology stacks so that we can provide you with more content for reference.

For these reasons, we have set up an e-commerce project. I want to emphasize two points: first, the project is relatively comprehensive; second, e-commerce systems are typical today, and this project is completely open source, which makes it easy for us to modify.

However, because this is an open-source project, we do not know what kind of issues we may encounter in terms of functionality and performance. We can only explore these issues step by step during the performance implementation process. Therefore, this is a project that aligns well with our current goal.

Performance Objectives #

  1. Test the current system’s maximum capacity for a single interface based on the classic e-commerce ordering process.
  2. Design capacity scenarios based on business proportions, fully utilize the current resources, identify performance bottlenecks in the current system, optimize them, and achieve the best operating state of the system.
  3. Evaluate the maximum cumulative capacity that the current system can support based on stability scenarios.
  4. Determine the impact of exceptions in the current system on performance based on exception scenarios.

In every performance project, the performance objectives shape the entire project process. Grasping the objectives therefore determines the direction of the project.

I remember a previous project in which the client requested support for 10 million concurrent online users. The project was not small, with a development team of around 300 people. However, when I arrived, I found there were only two performance testers, and one of them was a recent graduate still in the learning phase. So I went to the head of the technology department and told him I could not take on this project: given that objective and that allocation of resources, I knew very well that I was not capable of filling this hole, so I had to admit it quickly.

Later, the head of the technology department asked what resources would be needed to proceed with the project. I listed several essential conditions, and only once those conditions were met did I dare to accept the project.

I mention this example to help you understand that performance objectives look very different to different levels of management. I took this approach to align the understanding of the performance objectives across those levels. This point is crucial.

Test Scope #

Features to be tested #

The main process of e-commerce is as follows:

Features not to be tested #

Batch operations.

Guidelines #

Start-up Guidelines #

  1. Ensure that the system’s logical architecture and deployment architecture are consistent with production.
  2. Ensure that the basic data is consistent with production or scaled according to the model.
  3. Ensure that the business model can simulate real production business.
  4. Environment is prepared, including:
    • 4.1. Function verification passed.
    • 4.2. Basic parameters of each component are sorted out and configured correctly.
    • 4.3. Load machine is in place and deployed.
    • 4.4. Network configuration is correct, connection is smooth, and it can meet the requirements of stress testing.
  5. Test plan and proposal review is completed.
  6. Architecture team, operations team, development team, testing team, and relevant experts are in place.

Exit Criteria #

  1. Achieve the performance requirements specified by the project.
  2. Critical performance bottlenecks have been resolved.
  3. Complete performance testing report and performance optimization report.

Suspension/Resumption Criteria #

1. Suspension Criteria

  • System environment changes: for example, hardware damage to the system host, excessive network transmission time, damage to the load generator, or the system host being taken down for an upgrade or for other reasons.
  • The testing environment is disturbed, for example the server is temporarily requisitioned or used for other purposes that would interfere with the test results.
  • Need to adjust testing environment resources, such as operating system, database parameters, etc.
  • The test model cannot meet the requirements of the planned target.
  • Problems listed in the testing risks occur.

2. Resumption Criteria

  • Problems identified during testing have been resolved.
  • Testing environment is back to normal.
  • Problems listed in the testing risks have been resolved.
  • Environment adjustment is completed.

Business Model and Performance Metrics #

Business Model / Test Model #

Please note that this model is not filled in arbitrarily; the business proportions are taken directly from the production environment. There are many ways to extract such proportions from production. It is not difficult and can be done with log statistics, for example as sketched below.
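
Here is a hypothetical, minimal sketch of deriving business proportions from an access log. The log path and the URL-to-transaction mapping are assumptions for illustration only; the real mapping must come from your own system.

```python
# Hypothetical sketch: count transaction hits in an Nginx-style access log
# and turn them into proportions. Paths and URL prefixes are assumptions.
from collections import Counter

TRANSACTION_PREFIXES = {
    "open_homepage": "/index",
    "search_product": "/product/search",
    "view_product": "/product/detail",
    "add_to_cart": "/cart/add",
    "create_order": "/order/create",
    "pay_order": "/pay",
}

def business_proportions(log_path):
    counts = Counter()
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            for name, prefix in TRANSACTION_PREFIXES.items():
                if prefix in line:
                    counts[name] += 1
                    break
    total = sum(counts.values()) or 1
    return {name: round(100 * counts[name] / total, 2) for name in TRANSACTION_PREFIXES}

if __name__ == "__main__":
    print(business_proportions("access.log"))  # proportions in percent
```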

However, in some companies, production data is held by the operations team, and the performance team cannot access it because they lack the permissions, not even for the data needed to build the business model. In that case, the performance project might as well be terminated, because there is not much point in continuing; at most you will catch the architects who talk big and the developers who write careless code, and point out a few of their mistakes.

Business Metrics / Performance Metrics #

Since the project has no explicitly defined target TPS (transactions per second), we assume a target of 1000 TPS for now. Why 1000? Because, based on experience, 1000 is not a high target on this hardware environment unless the software architecture is unreasonable.

System Architecture Diagram #

System Technology Stack #

The system technology stack informs us of which technology components are used in the entire architecture. By examining the technology stack, we can obtain information about common performance bottlenecks and performance parameters. In the subsequent work, we will also need to organize the corresponding key performance parameter configurations.

The table below is the technology stack we will use in the case analysis in the subsequent courses. When building this system, I aimed to cover the mainstream technology components in the current technology market.

System Logical Architecture Diagram #

Creating a logical architecture diagram for the system allows us to have a business path in mind for subsequent performance analysis. When conducting performance analysis, we need to break down response time, and only by understanding the logical architecture diagram can we know where to start and where to end.

System Deployment Architecture Diagram #

Creating a deployment architecture diagram allows us to understand the number of nodes and machines. When executing capacity scenarios, you should have a concept in mind of the maximum capacity limit that such a deployment architecture can support.

In addition, after reviewing the deployment architecture, you can actually reject unreasonable performance requirements. For example, some time ago, someone told me they had a CRM system and wanted to achieve 10,000 concurrent users in terms of performance. However, in reality, even if that system goes live, the total number of users may not even reach 10,000.

Performance Implementation Prerequisites #

Hardware Environment #

By organizing the overall hardware resources, we can use experience to estimate approximately how much business load the capacity can support, without setting unrealistic goals. For example, when we see a hardware configuration like the one in the following table, I don’t think anyone would set the target as 10000 TPS. Even for the most basic interface layer, this hardware cannot support such a large TPS.

We can see that the total resources available to the application across these servers are 64 CPU cores and 128 GB of memory. The NFS server is not used by the application, so it is not counted. Because the individual machines have relatively abundant hardware resources, in the subsequent work we can convert these physical machines into virtual machines to make the applications easier to manage.

In terms of cost, the total cost of all physical machines is approximately 80,000 RMB, including various miscellaneous expenses such as switches, cabinets, and network cables.

The reason I am explaining the cost of the hardware is mainly because in the current performance industry, there are very few performance engineers who calculate the cost. We say that the goal of a performance project is to make the online system run better, and at the same time, we also need to know how much cost is involved in running the business system.

In the current performance industry, a large number of online hosts sit in a state of high cost and low utilization, which is a huge waste of resources and money. In performance projects I often see a 256-core, 512 GB server running nothing but a Tomcat with a 4 GB JVM; in such projects, the value of performance engineering goes completely unrealized.

Therefore, I often lament the stagnation of the performance industry:

  • Companies are not aware of the value of performance. They believe that by using a high-configured server, the performance of the business system will improve. But this is not the case at all.

  • Performance professionals in the market fail to demonstrate the value of performance transparently, and many of them have insufficient technical capabilities, which makes it impossible for companies to see the value that performance work should deliver.

Given this, as performance professionals, we must understand the relationship between hardware configuration and overall business capacity.

Tool Preparation #

Testing Tools #

In the testing process, we will use JMeter’s backend listener to send data directly to InfluxDB and then display it with Grafana. We do not use JMeter’s distributed execution or local data collection, because writing results locally would consume local I/O.

However, there are still many performance professionals who routinely use the low-performance features of their tools while constantly complaining about poor performance. Regarding this, I hope you understand one thing: we need to use tools sensibly, and not assume that a performance testing tool can simply be picked up and used as-is.
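
If you want to confirm that the backend listener is actually writing data into InfluxDB before starting a long run, a quick query is enough. This is a minimal sketch assuming InfluxDB 1.x and the listener’s default database and measurement name (“jmeter”); adjust the host and names to your own configuration.

```python
# Sanity check that JMeter samples are arriving in InfluxDB.
# Assumes InfluxDB 1.x and the backend listener's default names; adjust as needed.
from influxdb import InfluxDBClient  # pip install influxdb

client = InfluxDBClient(host="influxdb-host", port=8086, database="jmeter")
result = client.query('SELECT count("count") FROM "jmeter" WHERE time > now() - 5m')
print(list(result.get_points()))  # a non-empty result means samples are arriving
```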

Monitoring Tools #

According to the global-plus-targeted monitoring approach of RESAR performance engineering, when choosing first-level monitoring tools we need to collect all global counters, covering every level shown in the earlier architecture diagram.

However, please note that for global monitoring we should avoid targeted monitoring methods, such as method-level monitoring in Java applications or SQL monitoring in databases. At the beginning of a project, we cannot yet tell at which level a problem will occur, so a targeted monitoring approach is not appropriate.

So how should we implement global monitoring most reasonably? Here we can refer to the monitoring used by online operations. Note that during performance monitoring we should not dream up and pile on monitoring tools arbitrarily.

Sometimes, in order to monitor more, we add many monitoring methods to the test environment that are never used in production operations. The monitoring then consumes more resources than it would in production, and we cannot obtain normal, valid results.

Previously, we mentioned that when choosing the first-level monitoring tool, we need to collect full global counters. After collecting the global counters, we also need to analyze and discover performance issues, and then find the root cause of performance bottlenecks through the evidence chain.

Data Preparation #

Basic Data #

In performance engineering, we have always emphasized that basic data must satisfy two characteristics:

  1. Meets the real data distribution of the production environment: To achieve this, the most reasonable approach is to sanitize the production data. If you want to generate your own data, you must first analyze the business logic. In our system, I generated 2.43 million user records and 2.5 million address records.

  2. Parameterized data must draw on the basic data to cover real users: Many people keep using a small amount of parameter data under heavy load in their performance scripts, and this logic is completely wrong. In performance scripts, you must parameterize from the basic data, and the amount of data depends on the design of the performance scenario (see the sketch after this list).
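
As an illustration of preparing parameterized data from the basic data, here is a hedged sketch that exports distinct users into a CSV parameter file for the load script. The connection details, table, and column names (“ums_member”, “username”, “password”) are assumptions, not the actual schema of our system.

```python
# Export distinct users from the basic data into a CSV parameter file.
# Connection details, table and column names are illustrative assumptions.
import csv
import pymysql  # pip install pymysql

def export_user_parameters(out_path, limit):
    conn = pymysql.connect(host="db-host", user="perf", password="***", database="mall")
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT username, password FROM ums_member LIMIT %s", (limit,))
            rows = cur.fetchall()
        with open(out_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["username", "password"])
            writer.writerows(rows)
    finally:
        conn.close()

# How many distinct users to export should follow the scenario design,
# not an arbitrary small number reused under heavy load.
export_user_parameters("users.csv", limit=100_000)
```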

Performance Design #

Scenario Execution Strategy #

Incremental Scenario Strategy #

For performance scenarios, I have always emphasized that performance scenarios must meet two conditions:

  • Continuous
  • Incremental

Therefore, in this execution process, I will also apply these two points to the following business scenarios.

You might ask, what problems would occur if it is not continuous and incremental? For example, the graph below:

The area marked by the red box in the graph represents the performance problems exposed by the increment. While the load increases, the system under test has to allocate resources dynamically, and from the graph we can clearly see whether the system jitters at that point. This kind of scenario is what really happens online.

If the scenario is not continuous and incremental, the part inside the red box simply never appears, and we cannot simulate the real online behavior.

Pay attention here! Take notes! To simulate the production scenario, we must ensure that continuous increment is achieved, without hesitation.

However, the way to configure a continuous increment differs from tool to tool.

LoadRunner is designed as follows:

JMeter is designed as follows:

In summary, please remember that in scenario design, we must achieve the continuous increment as shown above.
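
Whatever the tool, the idea is the same: threads are added step by step and each step is held for a while rather than jumping straight to the target load. The sketch below merely prints such a schedule so you can sanity-check a design before building it in your tool; the numbers are placeholders, not recommendations.

```python
# Illustrative only: a "continuous, incremental" load profile adds threads in
# steps and holds each step. Real values depend on your scenario design.
def ramp_schedule(threads_per_step, steps, step_duration_s):
    """Yield (elapsed_seconds, active_threads) checkpoints for the ramp."""
    for step in range(1, steps + 1):
        yield step * step_duration_s, step * threads_per_step

for elapsed, threads in ramp_schedule(threads_per_step=5, steps=10, step_duration_s=60):
    print(f"after {elapsed:>4}s -> {threads} active threads")
```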

Business Scenarios #

In RESAR performance engineering, only four types of performance scenarios are needed:

The order of execution is: baseline scenario, capacity scenario, stability scenario, and exception scenario.

Please note that, apart from these four types of performance scenarios, there are no other types of scenarios. In each scenario category, we can design multiple specific scenarios to cover the entire business.

Now, let me explain them one by one.

1. Baseline Scenario

I often see people say that running a script with three or five threads for a certain number of iterations counts as a baseline scenario. Think about what such a run actually tells you: it can only verify that the script and the scenario are correct. Therefore, I do not consider such a step a baseline scenario.

In my RESAR performance engineering concept, the baseline scenario must be a prelude to the capacity scenario. Concretely, the baseline scenario should also use a continuous, incremental load to reach the maximum TPS. That is, in the baseline scenario we stress a single interface or a single business to its maximum TPS and then analyze where its bottleneck lies.

You may ask, is it necessary to do optimization in the baseline scenario?

Based on my experience, we should first determine whether the maximum TPS of the current single interface or single business has exceeded the target TPS. If it has exceeded, and the response time is within an acceptable range for the business, there is no need for optimization. If it has not exceeded, optimization is necessary.

In addition, according to RESAR performance engineering theory, the goal of the first stage of performance execution is to utilize all the resources, and the goal of the second stage is to optimize the system until it meets the business capacity. Keep in mind that optimizing any system has no end; our goal is to ensure the system operates normally.

Because there is only one system in our example project, we will start at the interface level, then assemble the interfaces into complete business flows and apply the business model before executing the capacity scenario. Here, we will run the baseline scenarios for the interfaces within the test scope.

2. Capacity Scenario

After obtaining the result of the baseline scenario, we enter the capacity scenario stage. In the capacity scenario, we still adhere to the execution mindset of “continuous, incremental” and, most importantly, faithfully simulate the business scenarios of the production environment with the business model we mentioned earlier.

In many performance projects nowadays, performance requirements are not specific, so the model used in the performance scenarios does not match the production scenarios. This is a serious problem.

Another serious problem is that even when the business model matches production, the results of the performance scenario become meaningless because the performance engineer fails to strictly maintain the model’s proportions during execution. Keep in mind that response time rises as the load increases, so controlling the proportions purely by thread counts is unreasonable: the business proportions will drift out of balance while the scenario runs.

So how should we control this proportion relationship? If you are using JMeter, you can use the Throughput Controller to control the business proportion, as shown below:

Of course, you can also achieve this in other ways. In short, after the scenario has finished, we need to statistically analyze the business proportions and compare them with the expected proportions, for example with a small script like the one below. The scenario is considered reasonable only when the proportions are consistent.
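
Here is one way such a post-run check could look, assuming a CSV-format JMeter results file with the default “label” column. The expected proportions below are placeholders standing in for the business model, not real figures from this project.

```python
# Compare the executed transaction proportions in a JMeter .jtl (CSV) file
# against the business model. EXPECTED values are placeholders.
import csv
from collections import Counter

EXPECTED = {
    "open_homepage": 30.0,
    "search_product": 25.0,
    "view_product": 20.0,
    "add_to_cart": 10.0,
    "create_order": 10.0,
    "pay_order": 5.0,
}

def check_proportions(jtl_path, tolerance=1.0):
    counts = Counter()
    with open(jtl_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row["label"]] += 1
    total = sum(counts.values()) or 1
    for label, expected in EXPECTED.items():
        actual = 100 * counts[label] / total
        status = "OK" if abs(actual - expected) <= tolerance else "DEVIATED"
        print(f"{label:15s} expected {expected:5.1f}%  actual {actual:5.1f}%  {status}")

check_proportions("capacity_scenario.jtl")
```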

In the capacity scenario, there is one more thing we need to determine, which is the maximum TPS.

I would like you to take a look at this graph. What do you think is the maximum TPS?

Do you want to say that the maximum TPS is 700?

Whatever your answer, I want to emphasize one point of my performance philosophy: the maximum TPS of a capacity scenario means the maximum stable TPS. Look at the graph again: by that point the curve is already jittering and unstable, so what is the point of calling that the maximum TPS? Would you dare run a production system in such a shaky state? Therefore, for a TPS curve like the one above, I would define the maximum stable TPS as the third step, around 600, rather than 700.
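
To make “maximum stable TPS” a little less hand-wavy, here is a rough sketch of one way to read it out of a results file: bucket requests per second, group the seconds into fixed windows, and keep the highest window whose jitter (coefficient of variation) stays under a threshold. This is my own illustration, not a prescribed method; the window size and threshold are assumptions you would tune.

```python
# Estimate the maximum stable TPS from a JMeter .jtl (CSV) file.
# Window size and jitter threshold are illustrative assumptions, not rules.
import csv
from collections import defaultdict
from statistics import mean, pstdev

def max_stable_tps(jtl_path, window_s=60, max_cv=0.1):
    per_second = defaultdict(int)
    with open(jtl_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            per_second[int(row["timeStamp"]) // 1000] += 1

    seconds = sorted(per_second)
    best = 0.0
    for start in range(seconds[0], seconds[-1] - window_s + 2, window_s):
        window = [per_second.get(s, 0) for s in range(start, start + window_s)]
        avg = mean(window)
        if avg > 0 and pstdev(window) / avg <= max_cv:  # low jitter -> "stable"
            best = max(best, avg)
    return best

print(f"maximum stable TPS ~ {max_stable_tps('capacity_scenario.jtl'):.0f}")
```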

In addition, please note that in performance scenarios, especially in capacity scenarios, people often mention the term “performance inflection point” and consider it as the key knowledge for determining performance bottlenecks. I won’t judge this for now, let’s see what an inflection point is and how it is defined in mathematics:

An inflection point is a point on a curve at which the curve changes from concave to convex, or vice versa. Intuitively, it is a point where the tangent line crosses the curve (i.e., the boundary between the concave arc and the convex arc of a continuous curve).

So can you really find such points in the TPS curve? Well, I can’t. Taking the above graph as an example, where is the inflection point in the graph? Some might say that there is no inflection point in this curve. Well, then there’s nothing to discuss…

Clearly, the concept of a performance inflection point is very misleading in actual execution. Please try to avoid using the term “performance inflection point” to describe performance curves in the future, unless you actually see an inflection point.

3. Stability Scenario

After completing the capacity scenario, we will move on to the stability scenario. So far, in the performance field, no one has been able to give an exact conclusion on how long a stability scenario should run. We know that stability scenarios can be designed differently depending on the nature of the business, but this seems a bit vague. Therefore, I will provide some guidance on how a stability scenario should run.

In a stability scenario, we only have two key points:

The first key point: Duration of the stability scenario.

When it comes to the duration of a stability scenario, I often see people online saying it should generally run for two hours, or for 7×24 hours, and so on. But what does “generally” mean, and what counts as the exception? In more than ten years in this field, I have never followed such logic, nor have I ever seen a concrete source for those durations; I have only seen a lot of misinformation being passed around.

There are many examples like this in the performance field, and by now I am no longer surprised. After all, what matters most is to stay true to ourselves and do the right things. Let me explain what a reasonable duration for a stability scenario is.

We know that after a system goes live, the operations team will definitely perform regular maintenance checks and take action if any issues are found. Some systems have fixed maintenance cycles, such as using scheduled jobs for archiving, while others take action based on the results of the checks.

The purpose of a stability scenario is to ensure that the system can function properly within the maintenance cycle. Therefore, in the stability scenario of performance testing, we need to fully cover the business capacity. For example, consider the following graph:

Within the maintenance cycle, there is a capacity of 100 million transactions. Based on the test results from the previous capacity scenario, let’s assume the maximum stable TPS is 500. Then the duration of the stability scenario would be:

\[\text{Stability duration} = \frac{100{,}000{,}000}{500 \times 3600 \times 24} \approx 2.3\ \text{days}\]

With this calculation, we can determine how long the stability scenario should run, and this is the only reasonable way to do it.
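
If you prefer to keep the arithmetic in a script, here is the same calculation as a tiny helper:

```python
# How long the stability scenario must run to cover the accumulated business
# volume of one maintenance cycle at the maximum stable TPS.
def stability_duration_days(total_transactions, stable_tps):
    return total_transactions / (stable_tps * 3600 * 24)

# 100 million transactions in the cycle at a maximum stable TPS of 500:
print(f"{stability_duration_days(100_000_000, 500):.1f} days")  # ~2.3 days
```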

The second key point: TPS rate used in the stability scenario.

I have seen people mentioning online that the stability scenario should be run with 80% of the maximum TPS. This makes me wonder why. Why can’t we use the maximum TPS?

I remember having several discussions like this during training sessions. Some people said that the reason for using 80% of the maximum TPS is because we shouldn’t put too much pressure on the system during the stability scenario, as it could easily lead to system issues.

This statement is strange. If we can determine the maximum TPS from the capacity scenario, why can’t we use it for the stability scenario? If using the maximum TPS in the stability scenario causes issues, aren’t these performance issues the ones we want to test and identify? Why use a lower TPS to avoid performance issues? Therefore, it is a wrong approach to use 80% of the maximum TPS for stability scenarios.

In my performance philosophy, when executing stability scenarios, it is perfectly fine to run at the maximum stable TPS as long as it covers the business capacity within the operations and maintenance cycle. If you don’t run at the maximum stable TPS and instead use a lower TPS, you still need to cover the business capacity within the operations and maintenance cycle.

With that said, I believe the above content is enough to guide you in running stability scenarios correctly and reasonably.

4. Exceptional Scenarios

Regarding exceptional scenarios, some companies classify them as non-functional scenarios, but I personally think it doesn’t matter where they are classified as long as someone executes them. The reason I classify exceptional scenarios under the performance section is that these scenarios need to be executed under pressure.

For regular exceptional scenarios, we often do the following:

  • Crash the host machine;

  • Disable the network card;

  • Crash the application.

In addition to these scenarios, with the prevalence of microservices nowadays, we have new methods like crashing the container.

Of course, you can also use some tools, such as “chaos engineering” tools, to randomly delete containers, cause network packet loss, simulate high CPU usage, and so on. However, that is a big topic. In the following courses, I will design several commonly used exceptional performance scenarios to show you their effects.

Monitoring Design #

Global Monitoring #

In fact, after the earlier section on monitoring tools, the monitoring design should already have taken shape in the plan writer’s mind. For the system used in this course, the global monitoring is as follows:

From the diagram above, we can see that first-level monitoring with a global view can be achieved with Prometheus/Grafana/Spring Boot Admin/SkyWalking/Weave Scope/ELK/EFK. For the first-level counters that these tools do not cover, we can only run commands during scenario execution to supplement them.
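
As an illustration, global counters can also be pulled programmatically during a run through Prometheus’s HTTP API, which is handy for archiving alongside the scenario results. The server address here is a placeholder, and the query assumes the standard node_exporter CPU metric.

```python
# Pull a global counter (average CPU usage per host) from Prometheus's HTTP API.
# The Prometheus address is a placeholder; the metric assumes node_exporter.
import requests  # pip install requests

PROMETHEUS = "http://prometheus-host:9090"

def instant_query(expr):
    resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": expr}, timeout=5)
    resp.raise_for_status()
    return resp.json()["data"]["result"]

cpu_expr = '100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100'
for series in instant_query(cpu_expr):
    print(series["metric"].get("instance"), series["value"][1])
```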

Targeted Monitoring #

What do we do with targeted monitoring? Here, I will roughly list some commonly used tools. However, please note that these tools are only used when there are problems.

In performance analysis, beyond the three tools listed in the table, there are many others that can help establish the evidence chain for a performance bottleneck. I cannot list them all here, so I have only listed the commonly used ones for the technology components in this system. In the hands-on work of later lessons, don’t be surprised if we use tools that are not in the table.

Project Organizational Structure #

In the performance plan, it is crucial to create an organizational structure chart for the project. I urge you to clearly specify the scope of work and responsibilities for each organizational member in this section to avoid any conflicts. Here’s a rough sketch of a commonly seen organizational structure:

This structure is based on the nature of the work, not on specific job titles, and I believe it is a reasonable one. In this chart, the responsibilities of performance script engineers cover what most performance practitioners are actually doing today. Performance analysis engineers, by contrast, are almost non-existent in many performance projects, and there is rarely a fixed position for them; nonetheless, having them is essential.

In addition, architects, developers, and operations engineers all need to be involved in supporting performance analysis. Please note that when I say “support,” I don’t mean just standing by and watching. It means being able to provide specific support when problems occur, instead of evading responsibility.

Business stakeholders are the source of performance requirements, which is crucial. If business stakeholders cannot articulate reasonable performance requirements, the project will likely end up being chaotic.

As for the manager role: in performance projects I often see managers who have no understanding of performance and simply demand support for some number of concurrent or online users. Communication with such managers can actually be straightforward: just give them results. But when resources fall short during the execution of the performance project, make sure to inform the manager and lower their expectations; otherwise, subsequent communication will be very difficult.

Output of Results #

Process Output #

  1. Script
  2. Results of scenario execution
  3. Monitoring results
  4. Problem records

For a performance project, these process outputs are just right, no more, no less. I often see performance projects that produce no process output other than a performance testing report, and I really don’t understand how such companies accumulate performance experience. So my advice is: in a performance project, do more archiving and organizing, so that future projects have material to refer back to and you build up your own technical accumulation.

Result Output #

Usually, in the performance projects I have worked on, there are two final deliverables: a report recording the results of performance scenario execution (which is what we commonly call a performance testing report) and a performance tuning report.

Performance Project Testing Report #

You have probably seen plenty of performance testing reports, so here I will only emphasize a few points:

  1. The performance result report must draw conclusions, instead of just summarizing descriptions like “the resource utilization rate is such-and-such,” “the TPS is such-and-such,” “the response time is such-and-such.” Think about it: the performance results are already in the report, who can’t see them? Do we still need to restate them? We should give conclusions like “the current system can support XXX concurrent users and XXX online users.”

  2. Never use ambiguous words like “may,” “perhaps,” or “should be.” Otherwise, it would be blatantly cheating.

  3. The performance result report should provide suggestions for operations and maintenance work, such as configuration recommendations for key performance parameters like thread pool, queue, and timeout.

  4. The performance result report should provide suggestions for future performance work.

Performance Tuning Report #

Why do I stress the writing of a separate tuning report? Because the tuning report is the essence of the entire performance project, and it must record the symptoms, analysis process, solution, and effectiveness of each performance problem. It can be said that the tuning report is a complete reflection of the team’s technical capabilities.

Project Risk Analysis #

For performance projects, I have listed the more common risks here:

  1. Unclear performance requirements at the business level
  2. Environmental issues
  3. Data issues
  4. Inaccurate business model
  5. Difficulties in coordination and communication between teams
  6. Inadequate bottleneck analysis affecting progress
  7. ……

In the project we are working on for this course, the major risks are:

  1. Limited hardware resources.
  2. The project schedule is hard to control, because when problems arise there is no outside support and we have to handle everything ourselves.

However, please rest assured that I will make efforts to overcome these difficulties and document the execution process of this project.

With this, the entire performance project implementation plan is concluded. If you have carefully read up to here, congratulations, you have surpassed many people. I give you a thumbs up!

Summary #

In this lesson, I have organized the content of a performance plan and discussed to what extent it should be written. I hope it can provide you with some reference.

A performance plan is an important output of a performance project. If your project runs in rapid iterations, you may not need such a complex, heavyweight document, because many of the tasks it describes have already been completed earlier; you may only need to compare and iterate from version to version.

However, for a complete project, a performance plan becomes extremely important as it guides the entire process of the project. In the performance plan, we emphasize several key points: business model, performance metrics, system architecture diagrams, scenario design, monitoring design, etc. They all play a crucial role in the quality of the entire project.

Finally, I hope you can try to write a complete performance plan in your future projects.

Homework #

After learning this lesson, please consider two questions:

  1. How to accurately simulate the business model?
  2. Why do we emphasize the importance of system architecture diagrams?

Feel free to discuss and exchange ideas with me in the comments section. Of course, you can also share this lesson with your friends around you, as their thoughts may bring you even greater insights. See you in the next lesson!
