00 Preface The Technical Power of High Concurrency Systems #

Hello, I’m Xu Changlong, and welcome to my practical high-concurrency course.

Currently, I am a technical architect at Geek Time. Before that, I worked in architecture for over a decade, with previous positions at Qyer.com, Weibo, and TAL Education Group. My main expertise lies in the high-concurrency migration and transformation of legacy systems, and I also have extensive experience in RPC construction, service-oriented architecture (SOA), frameworks, distributed tracing and monitoring, and Kubernetes management platforms.

I have a strong interest in computer technology and have always been actively learning various technologies. In my early years, I was active in the Swoole community and PHP developer conferences.

As a frontline technical veteran, when I look back at the key milestones in my career over the years, I find that they have always been closely related to “high-concurrency system transformation”.

Why do big companies attach so much importance to high concurrency? #

High-concurrency systems may feel both familiar and unfamiliar to you.

Familiar, because many of the services we use every day are high-concurrency systems, such as Taobao, Weibo, Meituan, Ele.me, 12306, Didi, and so on.

Unfamiliar, because only a small number of developers actually get to work on such systems, and many more only encounter them during interviews at big companies. For example, have you ever been asked questions like these:

Why can’t a system with a million concurrent requests use the MySQL service directly?
Why does Redis need more space in memory than the same data needs on disk?
How do you keep cached results of conditional queries consistent with the data?
Why can’t high-level languages serve directly as business caching services?

So what do big companies really focus on? How should we view high concurrency?

No matter how fancy the questions may be, in the end, it can all be summed up in one sentence: big companies value your problem-solving mindset and methods, supported by a deeper understanding of system design principles and concepts.

For example, let’s go back to the question about why a million-concurrency system can’t directly use the MySQL service. If you haven’t accumulated enough knowledge, your answer might be that high-concurrency queries will slow down MySQL, and then you might briefly explain how to use caching to handle the traffic.

However, if you are interviewing for a higher-level position, what the interviewer actually wants is for you to explain why a MySQL database cannot provide such a high level of concurrency, while also delving into discussions on distributed database indexes, storage, data sharding, and separation of storage and computation.

We know that the core value of internet services lies in traffic. The larger the traffic, the greater the platform's potential and room to grow. This is why big companies tend to prioritize developers with experience in high concurrency. Since 2014, the internet has entered the era of high concurrency, and the technical barriers between big companies and startups have kept rising. High-concurrency expertise has gone from being a trend a few years ago to being a standard requirement at big companies today.

In recent years, the infrastructure of cloud service providers has become increasingly mature, and it now supports distributed services directly and seamlessly. This has further reduced our opportunities for hands-on practice, to the point where the work of many architects is limited to choosing providers and services and figuring out how to integrate them quickly and save costs.

Therefore, we must face the reality that high concurrency has indeed created a barrier between big and small companies. To overcome it, it is necessary to systematically learn the underlying knowledge and practice high concurrency scenarios.

To advance in high concurrency, project-level practical experience matters most #

So how do you make the leap? You can refer to my experience.

When I graduated in 2007, the technology environment in China was not yet focused on high concurrency. My work was limited to small-scale scenarios and mostly concerned code reusability and the completeness of business logic, and the market had no shortage of developers at my level. Stuck implementing business logic, I started paying attention to various technologies, but my understanding of open source and system internals was still shallow, and I didn’t know how to deepen it.

It wasn’t until I joined Qyer.com and took charge of the high-concurrency transformation of the old system that I encountered a bottleneck in RPC performance, which made me truly recognize the gap.

Some of the techniques I had previously learned turned out not to apply to systems with higher requirements. Issues that are harmless in small-scale scenarios can be amplified enormously when the system scales up, dealing a “fatal blow” to fragile systems. In high-concurrency scenarios, you will find that what many open-source projects claim about themselves online is far from what practical validation shows.

This experience greatly changed my thinking and my perspective on problem-solving. To make up for my deficiencies, I read a large amount of literature on computer systems and filled in my knowledge gaps. I had intense discussions with like-minded people in the relevant technology communities, tested many open-source projects extensively in my own projects, and sent those projects improvement suggestions and issue reports.

In short, learning, practice, and communication all drive your progress, and the combination is very effective. Soon after, I joined the Weibo Advertising Department and started working on infrastructure-related projects.

My time at Weibo was a period of significant growth for me. I ran into many “interesting but challenging requirements”, for example being asked to build services for the entire Weibo platform on only two servers while also guaranteeing that those services never went down. During this period, I also participated in the development of many practical and interesting services, which helped me stand out in an advertising department of over 300 people and gave me valuable opportunities for career advancement. It was also this experience that made me truly shift my focus to developing basic services, and through it I gained more experience in data services and high-concurrency services.

Later on, many companies and friends invited me to provide guidance on service transformation and optimization for various systems. Some migrations and transformations were like ants moving house, carried out piece by piece over more than two years of intermittent effort; some systems had crashed and cost the company tens of millions of yuan, and I was called in to save the day; and there were also systems that no one seemed able to optimize, because no one could clearly explain how to proceed…

So, do you now have a clear picture of the path to advancement? Learning, practice, and communication are the most practical methods, and they will ultimately help you develop a systematic way of thinking.

You can start with the projects at hand, such as a high-concurrency transformation of the existing systems in your company. Be careful not to just read theory: analyze, practice, and validate your changes with load testing. To keep the risk controllable, I recommend starting with small, non-critical systems.

How to practice high concurrency? #

So how exactly should we carry out a transformation? The following four steps are crucial: identifying the system type, improving the monitoring system, organizing the transformation points, and verifying improvements incrementally.

Take the first step as an example. We can classify systems by their data characteristics into four types: read-heavy write-light, strong consistency, write-heavy read-light, and read-heavy write-heavy. Once the system type is determined, the specific optimization direction is essentially determined as well.
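As a rough illustration of this first step, the classification can be sketched as a small function. This is only a sketch under my own assumptions: the names `SystemType` and `classify`, the per-second inputs, and the `heavy_threshold` value are all hypothetical, and a real classification would come from your monitoring data and business requirements rather than a single number.

```python
from enum import Enum


class SystemType(Enum):
    READ_HEAVY_WRITE_LIGHT = "read-heavy write-light"
    STRONG_CONSISTENCY = "strong consistency"
    WRITE_HEAVY_READ_LIGHT = "write-heavy read-light"
    READ_HEAVY_WRITE_HEAVY = "read-heavy write-heavy"


def classify(reads_per_sec, writes_per_sec,
             needs_strong_consistency=False, heavy_threshold=1000):
    """Pick one of the four system types from observed traffic.

    heavy_threshold is an illustrative assumption, not a recommended value.
    """
    # Assumption: a strong-consistency requirement dominates traffic shape.
    if needs_strong_consistency:
        return SystemType.STRONG_CONSISTENCY
    heavy_reads = reads_per_sec >= heavy_threshold
    heavy_writes = writes_per_sec >= heavy_threshold
    if heavy_reads and heavy_writes:
        return SystemType.READ_HEAVY_WRITE_HEAVY
    if heavy_writes:
        return SystemType.WRITE_HEAVY_READ_LIGHT
    # Assumption: everything else is treated as the most common type.
    return SystemType.READ_HEAVY_WRITE_LIGHT
```

For example, `classify(50_000, 200)` returns `SystemType.READ_HEAVY_WRITE_LIGHT`, which points toward the caching work described for the first type.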

This column will focus on these four optimization directions and guide you through the key transformation points. Whether you need to build a highly concurrent system, face business traffic growth, or system transformation and upgrade, you can find references here.

Here, I have summarized the knowledge structure diagram of the course. Let me explain the design ideas of the course in combination with the diagrams below:

Read-heavy write-light system #

I will start with the most common “read-heavy write-light” system and guide you through the optimization and transformation of a user center project. The optimization work for such systems focuses on relieving database query pressure through caching. Therefore, our learning focus is on doing caching well, including but not limited to organizing the data, caching it, and keeping it consistent once it is cached.

In addition, to help you “break free” from thinking only in terms of simple business implementation, we will also cover master-slave synchronization delay and multi-data center synchronization. This will lay a solid foundation for the later material on distributed systems and strong consistency.
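To make the idea of caching in front of the database concrete, here is a minimal cache-aside sketch. It is an illustration only, not the course's implementation: the class name `UserCenter`, the plain dictionaries standing in for the database and the cache, and the invalidate-on-write policy are all assumptions chosen to keep the example self-contained.

```python
class UserCenter:
    """Cache-aside reads with invalidate-on-write (illustrative sketch)."""

    def __init__(self, db):
        self.db = db        # stand-in for the database: user_id -> profile dict
        self.cache = {}     # stand-in for a cache service
        self.db_reads = 0   # counts how often a read reaches the database

    def get_user(self, user_id):
        # 1. Try the cache first.
        if user_id in self.cache:
            return self.cache[user_id]
        # 2. On a miss, read from the database and fill the cache.
        self.db_reads += 1
        profile = self.db.get(user_id)
        if profile is not None:
            self.cache[user_id] = dict(profile)  # copy to avoid aliasing
        return profile

    def update_user(self, user_id, profile):
        # Write to the database, then drop the cached copy so that the
        # next read reloads fresh data instead of serving a stale entry.
        self.db[user_id] = dict(profile)
        self.cache.pop(user_id, None)
```

Repeated calls to `get_user` for the same user reach the database only once until `update_user` invalidates the entry. In a real deployment the gap between the database write and the cache invalidation, as well as replication delay, can still expose stale data, which is why consistency after caching needs separate attention.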

Strong consistency in e-commerce systems #

In this chapter, we will use the most typical e-commerce system as an example to explore more demanding strong consistency systems.

The main challenges of such systems are handling high-concurrency traffic while maintaining system isolation and transaction consistency, and preventing inventory from being oversold under concurrent orders. I will discuss the key points of system splitting in detail, deepening your understanding of system isolation, synchronous degradation, and inventory locks. You will also learn how distributed transaction components work. Understanding these will make it easier for you to see the original design intent behind some basic architectural components.
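The overselling problem mentioned above comes down to making “check the stock, then decrement it” a single atomic step. The following is a minimal in-process sketch of that idea, assuming a hypothetical `Inventory` class guarded by a lock; a production system would more typically enforce the same condition in the data store itself, for instance with a conditional update that only succeeds while enough stock remains.

```python
import threading


class Inventory:
    """Stock counter whose check-and-decrement is atomic (illustrative)."""

    def __init__(self, stock):
        self._stock = stock
        self._lock = threading.Lock()

    def reserve(self, quantity=1):
        # The check and the decrement happen under one lock, so two
        # buyers can never both claim the last item.
        with self._lock:
            if self._stock >= quantity:
                self._stock -= quantity
                return True
            return False

    @property
    def stock(self):
        with self._lock:
            return self._stock


def simulate_flash_sale(stock, buyers):
    """Run `buyers` concurrent purchase attempts; return (sold, remaining)."""
    inventory = Inventory(stock)
    results = []

    def buy():
        results.append(inventory.reserve())

    threads = [threading.Thread(target=buy) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results), inventory.stock
```

With 10 items and 50 concurrent buyers, exactly 10 reservations succeed and the stock never goes negative, regardless of how the threads are scheduled.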

How to do distributed tracing in write-heavy read-light systems #

Next, we will focus on write-heavy systems, which raise questions such as how to persist, transmit, store, and compress large amounts of data, how to switch and back up hot and cold data, and how to index it for queries. I will analyze these aspects one by one. I will also walk you through a complete case: a distributed tracing system that handles full log data. This will help you become familiar with every aspect of implementing concurrent write scenarios.

In addition, high-concurrency write services in the industry usually rely on open-source implementations. I will introduce the principles and application directions of some of them, enriching your “arsenal”.
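One common building block for write-heavy paths is to buffer records and ship them in compressed batches instead of one at a time. The sketch below shows that idea with Python's standard `json` and `zlib` modules. The `BatchWriter` and `unpack` names, the batch size, and the `sink` callback standing in for a network or disk writer are assumptions for illustration, not the design used in the course's tracing case.

```python
import json
import zlib


class BatchWriter:
    """Buffers records and passes them to `sink` as compressed batches."""

    def __init__(self, sink, batch_size=100):
        self.sink = sink              # callable that receives bytes
        self.batch_size = batch_size  # illustrative default
        self.buffer = []

    def write(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Send whatever is buffered; call this on shutdown or on a timer
        # so that a partially filled batch is not left behind.
        if not self.buffer:
            return
        batch, self.buffer = self.buffer, []
        payload = json.dumps(batch, separators=(",", ":")).encode("utf-8")
        self.sink(zlib.compress(payload))


def unpack(blob):
    """Reverse of the packing done in BatchWriter.flush."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Writing seven records with `batch_size=3` produces two full batches immediately and a third, smaller batch on the final `flush()`. Batching trades a little latency and some risk of losing the unflushed buffer for far fewer, larger writes.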

Read-heavy write-heavy live streaming systems #

Read-heavy write-heavy systems are the most complex type, typified by the most popular games and live streaming services. Many technologies in this category represent the state of the art in the industry. After all, even a slight problem in production can greatly affect the user experience.

In these systems, data is primarily served directly from memory and services are divided into small units. Data is written to disk or to a database periodically rather than being updated in the database in real time. Therefore, our learning focus is on how to serve business requests from in-memory data, achieve hot updates without restarting the system, integrate script engines, and exchange data between scripts and services. We will also look at high-concurrency optimization for live streaming scenarios, network optimization such as CDN and DNS, business traffic scheduling, and client-side local caching.
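The pattern of serving from memory and persisting periodically can be sketched with a like counter for live rooms. This is a simplified, single-threaded illustration under my own assumptions: `LikeCounter`, the room identifiers, and the dictionary standing in for the database are hypothetical, and a real service would also need locking or sharding plus a timer that calls `flush()`.

```python
from collections import Counter


class LikeCounter:
    """Serves counts from memory and persists accumulated deltas in batches."""

    def __init__(self, store):
        self.store = store            # stand-in for the database: room_id -> total
        self.totals = Counter(store)  # in-memory view used to answer reads
        self.pending = Counter()      # increments not yet written to the store

    def like(self, room_id, n=1):
        # Hot path: touch memory only, never the database.
        self.totals[room_id] += n
        self.pending[room_id] += n

    def get(self, room_id):
        return self.totals[room_id]

    def flush(self):
        """Write pending deltas to the store; return how many rooms changed."""
        pending, self.pending = self.pending, Counter()
        for room_id, delta in pending.items():
            self.store[room_id] = self.store.get(room_id, 0) + delta
        return len(pending)
```

Readers always see the latest count from memory, while the store only changes when `flush()` runs. The cost of this design is that increments made since the last flush are lost if the process dies, so the flush interval is a trade-off between database load and durability.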

Chapter 5: Case Study of Intranet Construction #

In the final chapter, I have added a set of specially selected cases. Here, you will find impressive project proposals as well as many interesting and practical designs. The main purpose is to broaden your horizons and enable you to design some basic services independently in the future.

For businesses that are just starting to see traffic growth, this chapter has great reference value. It will help your system withstand the impact of increasing business traffic and solve problems quickly. I also believe you will come away with a deeper understanding of the leading open-source solutions.

By the time we reach the destination together, I hope you will have a broader view and will understand high concurrency through practice on multiple projects. When facing related problems, you will be able to carry out transformations and optimizations that better match the business needs and technical conditions of each type of system.

High concurrency is not a yardstick for separating engineers at large companies from those at small ones, but it is a test of technical capability. The learning scenarios in this course give you a good starting point for improving your abilities and widening your opportunities. I look forward to seeing your growth and breakthroughs in the future!

Leave a message and let’s chat about your pain points in learning high concurrency. Perhaps the difficulties you have run into are already answered in the course, or I can add targeted material to address them. Let’s exchange ideas and learn together.