00 Preface: Do You Really Use Caching Properly?

Hello, I am Chen Bo, your instructor for this caching course; you may be more familiar with my pen name, Fishermen.

I am a seasoned coder who has lived through the technical evolution of large-scale internet systems such as Sina Weibo, from its inception to its current hundreds of millions of monthly active users, and I am currently a technology expert at Sina Weibo. I joined Sina in 2008, initially working on backend development for Sina IM. Starting in 2009, I moved to the development and architecture of the Weibo Feed platform, where I was deeply involved in building and improving almost all of the initial versions of the business. Since 2013, I have focused on Weibo platform infrastructure, cache middleware, distributed storage, and their optimization.

So why do we need to learn about caching? Is it really necessary?

As the internet has moved from the portal/search era to the mobile social era, internet products have evolved from satisfying users' need for one-way browsing to meeting their demands for personalized information retrieval and social interaction. Products must therefore be built around users and their relationships, and perform real-time analysis and computation over massive amounts of data. For every user request, the backend service needs to query the user's personal information, their social graph, and a large amount of related information reachable through that graph; all of this information must be aggregated, filtered, and sorted before a response is finally returned to the user. If it were all loaded from the database, the wait would be unbearably long.

Using a cache is the only practical way to improve system performance and enhance the user experience.
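To make this concrete, here is a minimal cache-aside read sketch in Python. It assumes a local Redis instance accessed through the redis-py client, and `load_user_from_db` is a hypothetical stand-in for the slow database query described above; treat it as an illustrative sketch of the pattern, not the actual Weibo implementation.

```python
import json

import redis  # assumes the redis-py client and a Redis instance on localhost

r = redis.Redis(host="localhost", port=6379)

def load_user_from_db(user_id):
    # Hypothetical stand-in for a slow relational-database query.
    return {"id": user_id, "name": "example user"}

def get_user(user_id, ttl=300):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)         # cache hit: no database round trip
    user = load_user_from_db(user_id)     # cache miss: query the database
    r.set(key, json.dumps(user), ex=ttl)  # repopulate the cache with a TTL
    return user
```

Serving repeated reads from memory this way typically turns a database round trip of several milliseconds or more into a cache lookup that is usually well under a millisecond on a local network, which is what makes the kind of aggregation described above feasible at scale.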

Take Sina Weibo as an example. As a pioneer and heavyweight among social sharing platforms in the mobile internet era, Weibo has grown rapidly since its launch in 2009, starting from zero users and zero posts. By 2019, daily active users exceeded 200 million, with 100-200 million new feed posts generated every day, tens of billions of accesses per day, and accumulated historical data in the trillions of records. At the same time, in day-to-day operation the core APIs must reach 99.99% availability with response times within 10-60 ms, and a single core business can see data access volumes of millions of QPS.

All of this is supported by a well-designed architecture and a continuously improved caching system.

In fact, at any internet company, every business that directly serves users needs caching to guarantee the system's access performance and availability. Caching is therefore also a very important topic in backend engineering interviews: interviewers often judge a candidate's development experience and learning ability by the depth of their understanding of caching. To some extent, mastery of caching determines the professional level of a backend developer.

So, what knowledge do we need to master in order to learn caching well?

Let’s take a look at this “Cache Knowledge Panorama”.

(Figure: Cache Knowledge Panorama)

  • First, we need to be familiar with the basics of caching: its common classifications and read/write patterns, the seven classic caching problems and their solutions, and the access protocols and clients of common cache components such as Memcached, Redis, and Pika.
  • Second, we should strive to deeply understand the implementation and design principles of these cache components, along with their features, strengths, and weaknesses, so that when cached data does not behave as expected we can quickly locate and fix the problem.
  • Third, we should study how large and medium-sized systems design their cache architecture. Online systems have diverse and changing business functions, complex multi-region deployments, frequent hotspots, and varying user habits, so the cache system must be well designed from the start, with plans for hashing and data distribution (see the minimal sketch after this list), data consistency, and scalability. The cache system also needs to keep evolving with the business, which requires continuous monitoring, exception reporting, and failure drills, so that faults can be handled promptly by manual or automated operations and the system can be continuously improved based on its online behavior.
  • Lastly, we should understand the best practices of caching in various scenarios and the trade-offs behind them. Only by understanding the reasoning can we apply this knowledge and experience effectively in our own work.
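As a small illustration of the "hashing and data distribution" planning mentioned in the third point, here is a toy consistent-hash ring in Python (the node names are hypothetical). It maps keys to cache nodes so that adding or removing a node only remaps a small fraction of keys; real caching layers add weights, failure detection, and replica-aware routing on top of this idea.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Toy consistent-hash ring for spreading keys across cache nodes."""

    def __init__(self, nodes, vnodes=100):
        # Place several virtual points per node on the ring to smooth the load.
        self._hashes = []
        self._node_at = {}
        for node in nodes:
            for i in range(vnodes):
                h = self._hash(f"{node}#{i}")
                self._hashes.append(h)
                self._node_at[h] = node
        self._hashes.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key):
        # Walk clockwise to the first virtual point at or after the key's hash.
        h = self._hash(key)
        idx = bisect.bisect_right(self._hashes, h)
        if idx == len(self._hashes):
            idx = 0  # wrap around the ring
        return self._node_at[self._hashes[idx]]

ring = ConsistentHashRing(["cache-1:11211", "cache-2:11211", "cache-3:11211"])
print(ring.get_node("user:12345"))  # routes this key to one specific node
```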

How to Efficiently Learn about Caching? What Can You Learn?

There is a lot of online learning material about caching, but it is often scattered and repetitive. To learn about caching systematically, it is necessary to read books, papers, and source code related to caching, or take online courses that provide practical knowledge. However, all of these methods require a significant amount of time. In order to provide students with a systematic and fast way to acquire the necessary knowledge, Lagou Education has launched a series of technical courses called “Learn in 300 Minutes.” I will be teaching the course on caching.

In these 300 minutes, spread over 10 class sessions, I will share my experience with caching architecture on the Weibo platform:

  • How to identify the key points of caching design at the very start of system design, so that caching is introduced and used well.
  • How to avoid and solve the seven classic problems in caching design.
  • In-depth analysis of the popular open-source caching components widely used by internet companies, such as Memcached and Redis, from multiple perspectives: protocols, usage techniques, network models, core data structures, storage architecture, data processing models, optimizations, and improvement plans.
  • How to use these components to build a distributed caching service system.
  • Finally, how to build a highly available, high-performance, and scalable caching architecture for classic business scenarios such as flash sales, massive counting, and Weibo feed aggregation.

Through this course, you will be able to:

  • Systematically learn the key knowledge points of caching architecture design.
  • Learn how to better use caching components such as Memcached and Redis.
  • Gain a deeper understanding of the internal architecture and design principles of these caching components, and truly understand the "why" behind them.
  • Learn how to modify caching components according to business needs.
  • Understand how to build a large-scale distributed caching service system.
  • Understand the best practices of caching services in various popular scenarios.
  • Learn by doing: build a better caching architecture for large and medium-sized internet systems, one that greatly improves throughput and response performance, achieves high availability and scalability, and is well prepared for massive concurrent requests and extreme hotspot events.
