00 Preface - Why You Should Learn High-Concurrency Distributed System Design #

Hello, I’m Tang Yang. I am currently a technical expert at Meitu, responsible for the development, optimization, and operations of the Meitu Xiuxiu community. I have worked in the industry for ten years, mainly on the development, architecture design, and optimization of community systems. During this time, I have helped build three large-scale high-concurrency systems, each with tens of millions of daily active users (DAU). In these three projects, I worked on developing and refactoring business systems, and on building middleware such as RPC frameworks, distributed messaging systems, and service registries. Along the way, I have gained a great deal of experience in many aspects of designing high-concurrency systems.

I have witnessed the entire process of building a system from the initial stage to handling high-concurrency and high-traffic loads, and have accumulated a wealth of experience in system evolution. Although each company may be in a different industry and have different business scenarios, I believe that the principles of design and optimization remain constant.

These experiences are like a set of “tricks”. They are interconnected and form a body of knowledge that guides us in designing high-concurrency systems. It covers the underlying theory, the scenarios in which problems arise, the process of analyzing those problems, and the approach to solving them. Once you master these “tricks”, you will clearly understand the problems a system may face at each stage of its growth, and you will be able to find solutions in time and evolve the architecture to improve system performance.

Starting today, I will share these “tricks” on “Geek Time”, analyzing the causes of problems and discussing solutions with you, so that you can learn them and apply them effectively!

Why should we learn high-concurrency system design? #

Before answering why we should learn high-concurrency system design, I would like you to think about a few questions:

On Weibo, where celebrities often have tens of millions or even hundreds of millions of followers, how can you ensure that followers see a celebrity’s posts in real time?
During the Double 11 Shopping Festival on Taobao, when you and tens of thousands of other people are rushing to buy the same bargain piece of clothing, how can you ensure that it is not oversold?
During the Spring Festival travel rush, we all go to 12306 to book train tickets. In the past, we often encountered situations where the page couldn’t be opened while rushing to buy tickets. So, if you were to design the 12306 system, how would you ensure that it can support normal ticket purchases while millions of people are accessing it at the same time?

These are the pain points you often encounter when designing and implementing high-concurrency systems. They all come down to how to achieve high performance and high availability under high concurrency. Once you master these topics, the products you build will give users a better experience, and your own technical skills will take a real step forward.

Knowledge of high-concurrency system design is essential for landing offers from top companies #

There is no denying that the current economic situation is not good. Many companies (such as Alibaba, Tencent, and ByteDance) are hiring fewer people while expecting each person they do hire to bring more value to the company. For these companies, programmers who only know CRUD are less attractive than programmers who have experience in high-concurrency system design.

So when you go for an interview, interviewers will expect you to have experience in designing high-concurrency systems. Some may ask where the bottlenecks would be, and how you would optimize them, if your system received millions of concurrent requests, to test whether you truly understand these concepts.

If you cannot get into a top company, or your current company has no high-concurrency scenarios, where can you gain experience in designing such systems? This is a chicken-and-egg problem. What I can be sure of is that once you have worked through this course and mastered the necessary skills, offers from top companies will no longer be out of reach.

Do not restrict your abilities to your current company’s business scenarios #

You might say, “I work in a small company, where the system has low concurrency and low traffic. It seems unnecessary to learn high-concurrency system design.” But what I want to say is that a company’s business may encounter high-concurrency requirements, even if its traffic is generally stable.

Take the design of the order placement process in an e-commerce system as an example. In a system that has only one API call per second, you only need to focus on the business logic itself: check if the inventory is sufficient, generate an order in the database if it is, lock the inventory after success, and then proceed to the payment process.

This process is very clear, and the implementation is simple. But if you want to launch a flash sale event with some marketing promotions, you may find that the number of order placement API calls can reach 10,000 per second!

Will the inventory system be overwhelmed by 10,000 concurrent requests? #

If 10,000 requests query the inventory at the same moment, the inventory system may be overwhelmed unless it has been designed with enough capacity for that load.

Can the database handle the generation of 10,000 orders simultaneously? #

If all of those requests succeed, 10,000 orders must be written at almost the same time. Whether the database can absorb this depends on its capacity and performance. If it cannot, an alternative design is needed.

What should we do if the system cannot handle the high volume of requests? #

If the system cannot handle the high volume of requests, there are several possible solutions to consider:

Implement load balancing: Spread the requests evenly across multiple servers so that no single server is overwhelmed.
Optimize database performance: Improve query performance and the database schema so that the database can handle more concurrent requests.
Implement caching: Cache frequently accessed data to reduce the load on the database and improve response times.
Implement queuing: Introduce a message queue to decouple receiving a request from processing it, so that work can be processed asynchronously and bursts of requests can be absorbed.
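As a small illustration of the first item in the list above, here is a sketch of round-robin load balancing, one of the simplest distribution strategies. The `RoundRobinBalancer` class and the server names `app-1` to `app-3` are hypothetical, and the sketch is single-threaded. Real load balancers also account for server health and capacity.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation so each gets an equal share."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        # Each call returns the next server in the rotation.
        return next(self._cycle)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])

# Route 9,000 simulated requests and count how many each server receives.
hits = Counter(balancer.pick() for _ in range(9000))
```

With three servers, each one ends up with exactly a third of the simulated requests.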

Problems like these can make the simple design described above unworkable, and in that case a new design is needed.

In addition, caching is easy to understand and use when concurrency is low, but it becomes much more complex under high concurrency. You have to consider the cache hit rate, how to handle cache penetration, how to avoid a cache avalanche, and how to keep the cache consistent with the underlying data. This increases the complexity of the design and demands more expertise from the designer. To avoid being caught off guard when problems arise, you need a solid grounding in high-concurrency design so that you are prepared when such requirements appear.
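Two of these caching concerns can be illustrated with a small sketch. It assumes two commonly used mitigations, which are not necessarily the ones this course recommends: caching a short-lived "missing" marker so that repeated lookups of nonexistent keys stop reaching the database (cache penetration), and adding random jitter to expiry times so that many entries do not expire together (cache avalanche). The `Cache` class, its parameters, and the `loader` callback are hypothetical, and an in-process dictionary stands in for a real cache server.

```python
import random
import time

_MISSING = object()  # marker cached for keys that do not exist in the database


class Cache:
    """Tiny in-process cache with per-entry expiry."""

    def __init__(self, base_ttl=60.0, jitter=10.0, missing_ttl=5.0):
        self._data = {}  # key -> (value, expires_at)
        self.base_ttl = base_ttl
        self.jitter = jitter
        self.missing_ttl = missing_ttl

    def get_or_load(self, key, loader):
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            value = entry[0]
            return None if value is _MISSING else value

        value = loader(key)  # cache miss: fall back to the database
        if value is None:
            # Penetration guard: remember briefly that the key does not exist.
            self._data[key] = (_MISSING, now + self.missing_ttl)
            return None

        # Avalanche guard: spread expiry times with random jitter.
        ttl = self.base_ttl + random.uniform(0, self.jitter)
        self._data[key] = (value, now + ttl)
        return value
```

The short `missing_ttl` is a trade-off: it limits how long a newly created record can stay invisible behind a cached "missing" marker.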

I have seen many friends who worked hard in small companies and achieved a great deal. They all went through a low period before carving out a place for themselves. The reason for their success is that they did not focus solely on their existing business scenarios. They stayed curious about new technologies, kept studying how the industry implements them, and kept thinking about how to use technology to solve business problems.

Although their personalities differ, they share a common trait: they are not satisfied with the status quo and are committed to pushing their own limits. I believe you are the same. Completing business requirements and solving product problems should therefore not be your ultimate goal. Improving your technical capabilities and broadening your technical vision should be your constant pursuit.

Despite the vast array of knowledge in the field of computer science, many core concepts are interconnected.

For example, message queues are a common component in high-concurrency systems that decouple message producers and consumers, reducing the impact of sudden traffic surges on the system. But does that mean you will never use message queues if your system doesn’t have such high traffic? Of course not.

System modules should strive for high cohesion and low coupling, regardless of whether they operate at high concurrency or not. Message queues, as a primary means of system decoupling, should be an indispensable tool in your technical arsenal.
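The decoupling described above can be sketched with Python's standard `queue.Queue` standing in for a real message broker. This is an in-process illustration only: the `orders` queue, the `consumer` function, and the burst size of 500 are assumptions made for the example.

```python
import queue
import threading

orders = queue.Queue(maxsize=1000)  # bounded buffer between producer and consumer
processed = []


def consumer():
    """Drain the queue at the consumer's own pace until the sentinel arrives."""
    while True:
        item = orders.get()
        if item is None:  # sentinel: the producer has finished
            orders.task_done()
            break
        processed.append(item)  # stand-in for slow work such as a database write
        orders.task_done()


worker = threading.Thread(target=consumer)
worker.start()

# The producer emits a burst of messages without waiting for each one
# to be processed; it only blocks if the buffer is completely full.
for order_id in range(500):
    orders.put(order_id)
orders.put(None)
worker.join()
```

The producer knows only about the queue, not about the consumer. That is the low coupling the text refers to, and the buffer lets the consumer work through a burst at its own speed.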

Similarly, caching embodies the idea of trading space for time, while compression reflects the idea of trading time for space. Distributed thinking was originally reflected in CPU design and implementation… All of these ideas are part of high-concurrency system design. I hope this course helps you grasp these core concepts and apply them in different scenarios.
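Both trade-offs can be seen in a few lines of standard-library Python. This is a toy sketch under simple assumptions: `slow_square` is a hypothetical stand-in for an expensive computation, and the repeated byte string is an arbitrary payload chosen because it compresses well.

```python
import zlib
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Stand-in for an expensive computation whose results are kept in memory."""
    global call_count
    call_count += 1
    return n * n


# Space for time: the second call is answered from memory, not recomputed.
slow_square(12)
slow_square(12)

# Time for space: spend CPU cycles compressing to store or send fewer bytes.
payload = b"high-concurrency " * 1000
compressed = zlib.compress(payload)
restored = zlib.decompress(compressed)
```

After this runs, `call_count` is 1 because the cached result served the second call, and `compressed` is smaller than `payload` while `restored` is identical to it.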

Therefore, whether it is for engineers who are entering the workplace and need to understand basic system design concepts or for experienced professionals who want to enhance their skills and prepare for future system problems, learning about high-concurrency system design is very helpful.

You may be worried about lacking a systematic knowledge framework, or concerned that the course will cover only theory without practical scenarios, or that it will offer superficial introductions without substance. Rest assured! After careful consideration, I decided to build the course around a virtual system, explaining step by step how to optimize it as traffic and concurrency grow and introducing the key knowledge points along the way. This approach, which combines scenarios, principles, and practice, will help you understand and internalize the knowledge more quickly and deeply.

In summary, after completing this course, you will gain three benefits:

Master the “tricks” of high-concurrency system design.
Understand the basic principles of system design, allowing you to apply new knowledge to different scenarios and think critically.
Break through technical bottlenecks and platform limitations, and acquire the qualifications of an excellent architect.

Course Design #

I have divided the course into three modules: Basics, Evolution, and Practical Application.

The Basics module mainly covers the fundamental concepts of high concurrency architecture design. You can consider it as an overview of the entire course, establishing a preliminary understanding of high concurrency systems.

The Evolution module is the core of the course, focusing on methods for supporting high concurrency. I will use a virtual system to analyze how the system changes as frontend concurrency increases, as well as the pain points you may encounter along the way, such as performance bottlenecks in data queries and high-availability issues in caching. I will then discuss solutions from five perspectives: database, caching, message queue, distributed services, and maintenance. This will let you experience the path of system evolution firsthand.

The Practical Application module will provide two real-world cases to apply the knowledge learned in dealing with high concurrency and high traffic.

One case is how to design a system that can handle hundreds of thousands of unread-count requests per second. I chose it because in most systems the unread count is the service with the highest request volume and concurrency. In a system like Weibo, it can reach 500,000 QPS. In addition, its business logic is relatively simple, so you do not need a deep understanding of the business before designing the solution. The other case is the design of an information feed system, which is the core system of community and social products. It has complex business logic and high request volume, and it touches almost every aspect of high-concurrency system design.

Below is the table of contents for the course, which will give you a quick overview of the knowledge framework.

Conclusion #

From principles to practice, this course covers the full body of knowledge for designing high-concurrency systems, with case studies as its main thread. As long as you keep learning step by step, reflect after each lesson, and practice what you learn, I believe your system design abilities will improve significantly and your career path will broaden.

Lastly, I welcome you to share your situation in the comments section, such as the aspects of high-concurrency you want to understand or any confusion you have in this area. This will not only help me focus on relevant topics in future explanations, but also allow you to reflect on your own growth and improvement after completing this course. This is what I am most eager to see.

I look forward to your comments and appreciate your trust. Over the next three months, let’s communicate, discuss, and make progress together.