00 Preface - What Can Reading Redis Source Code Bring to You #

I am an associate researcher at the Institute of Computing Technology, Chinese Academy of Sciences. In 2015, my team and I set out to design and implement a high-performance key-value database. To that end, we surveyed the key-value databases commonly used in industry and chose Redis as our main object of study. While learning Redis, I read its source code closely, especially the data structures, master-slave replication, RDB/AOF persistence, and other key features.

Reading the Redis source code gave me a more direct and deeper understanding of Redis’s key design principles and mechanisms. More importantly, the design and implementation of the code taught me a great deal about computer system design in general.

In 2020, I launched a course on GeekTime covering Redis core technologies and practice, to help students master how Redis works and how to apply it. While that course was running, many students told me they wanted to study the Redis source code but did not know where to start. So, a year later, I am back with a course dedicated to the source code.

This course approaches Redis through its source code. It will walk you through how Redis’s key technologies are implemented, so that you understand them thoroughly. More importantly, I want to share the design ideas common to single-node and distributed systems that I picked up while reading the code, so that you can apply them in your own projects.

Alright, now let’s talk about what reading Redis source code can bring us, that is, why we should learn Redis source code.

Isn’t it enough to just use Redis? Why do we need to read the source code? #

Usually, when we develop applications on top of Redis, we simply use it as a cache or a database to store and retrieve data, without ever touching the source code. For example, a social application might cache user data and follow relationships in Redis, and storage system software might keep its metadata there.

However, many of the development and operations teams I have met run into problems such as degraded performance or instance failures when using or managing Redis, and these problems affect their business applications. Moreover, anyone who has interviewed at a large company knows that many internet companies ask Redis-related questions when hiring for senior technical positions.

In other words, if you do not understand how Redis is implemented at the source code level, you will struggle both when tracking down problems and failure points in real systems and when analyzing problems on the spot in a technical interview.

Let me give you a simple example. As the amount of stored data grows, Redis rehashes its hash tables, and rehashing can affect performance. To decide whether a current performance problem is caused by rehashing, we need to know exactly what conditions trigger rehashing and during which operations those conditions are checked.

If we only know the general principle, all we know is that a hash table is rehashed when its load factor exceeds a preset threshold. For Redis specifically, we need to know more:

How is the load factor of the hash table calculated? Knowing this, we can estimate the workload of Redis.
Besides the load factor, are there any other triggering conditions? Knowing this can help us infer whether rehashing is currently happening based on the running status of Redis.
In which functions are the rehashing conditions checked? Knowing this tells us during which operations the conditions are checked and rehashing is carried out.
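To make the first two questions concrete, here is a minimal sketch of what such a trigger check can look like. It is not the Redis code: the `toy_` names are hypothetical, and both the threshold of 5 and the `resize_allowed` flag are assumptions loosely modeled on the hash table logic in `dict.c`, so check the source of the version you run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified table header: only the two counters needed here. */
typedef struct {
    size_t size;  /* number of buckets */
    size_t used;  /* number of stored entries */
} toy_table;

/* Assumed ratio above which growth is forced even if resizing is discouraged. */
#define TOY_FORCE_RATIO 5

/* Load factor as integer division: entries per bucket, rounded down. */
size_t toy_load_factor(const toy_table *t) {
    assert(t != NULL);
    return t->size == 0 ? 0 : t->used / t->size;
}

/* Returns true if an insert should grow the table (and so start rehashing). */
bool toy_needs_expand(const toy_table *t, bool resize_allowed) {
    assert(t != NULL);
    if (t->size == 0) return true;         /* empty table: allocate first */
    if (t->used < t->size) return false;   /* fewer entries than buckets */
    /* At least one entry per bucket: grow if allowed, or if badly overloaded. */
    return resize_allowed || toy_load_factor(t) > TOY_FORCE_RATIO;
}
```

In this sketch, a table with 4 buckets and 4 entries grows when resizing is allowed, but when resizing is discouraged it grows only once the integer ratio exceeds 5, that is, at 24 entries or more. The third question, where the check is called from, is exactly the kind of detail that only the real source can answer.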

You see, although rehashing is theoretically just one operation, when it comes to troubleshooting performance issues in practice, we face many specific questions.

The best way to answer these questions is to read the Redis source code. Studying the source gives us the implementation details, and the most obvious benefit is that we learn the conditions Redis checks and handles while it runs. These details map directly onto the lines of inquiry we follow when investigating performance problems and failures, and they help us solve problems systematically and efficiently.

Moreover, from my experience, studying the source code not only helps us understand the design details of Redis but also brings the following three benefits.

First, moving from principles to source code: learning how to read source code, building the habit of reading it, and taking the initiative in your own learning.

Reading source code is laborious, especially for a system like Redis. However, once you have a method and have built the habit, you can pick up Redis’s implementation details directly from the source and form a comprehensive understanding of it. That is how you become a Redis expert.

In addition, once we have the habit of reading source code, we instinctively look to the source for answers when we run into problems. Redis’s code is also continuously updated, and the working principles behind the updated code sometimes change before any supporting material explains the change. If we are already used to understanding Redis at the code level, we can follow its latest developments right away and apply them in our work.

For example, Redis released version 6.0 in May 2020, which introduced multi-threaded I/O. With the habit of reading the Redis source code, we could study how the I/O threads are implemented in Redis 6.0 early on and evaluate whether they suit our needs.

Second, learning good programming practices and techniques, and writing high-quality code.

The second benefit of studying the Redis source code is that it is a classic example of a software system written in C, from which we can learn good C coding standards and techniques.

The stable Redis release lines, 2.x through 5.x and the 6.0 release of 2020, are all deployed in real business scenarios, so the stability and robustness of the code have been tested in practice. The Redis source code is therefore an excellent resource for learning C programming. Whether you are new to C or an experienced C developer, studying it can help you master coding standards and techniques.

For example, we can learn from the Redis source code how to write unit tests for a functional module. The following code shows the outline of the unit test for the Redis SDS data type. By defining a test function and a macro switch, the various operations on the SDS type can be tested.

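/* Test routine for the SDS type (body abridged). */
int sdsTest(void) {
    ...
}

/* Macro switch: main() is compiled only when SDS_TEST_MAIN is defined. */
#ifdef SDS_TEST_MAIN
int main(void) {
    return sdsTest();
}
#endif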
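As a rough, self-contained illustration of the same pattern, a test function plus a macro switch, here is a sketch built around a hypothetical `toy_str` type. None of these names come from Redis, and the real SDS test is far more thorough. The point is only that the test code lives in the same file as the module and costs nothing in a normal build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical module under test: a length-prefixed string, not Redis's SDS. */
typedef struct {
    size_t len;
    char buf[32];
} toy_str;

/* Copy at most sizeof(buf)-1 bytes of init and remember the length. */
toy_str toy_str_new(const char *init) {
    assert(init != NULL);
    toy_str s = {0, {0}};
    size_t n = strlen(init);
    if (n > sizeof(s.buf) - 1) n = sizeof(s.buf) - 1;
    memcpy(s.buf, init, n);
    s.len = n;
    return s;
}

/* Test function: returns the number of failed checks (0 means success). */
int toy_str_test(void) {
    int failures = 0;
    toy_str s = toy_str_new("foo");
    if (s.len != 3) failures++;
    if (strcmp(s.buf, "foo") != 0) failures++;
    return failures;
}

/* Macro switch: main() exists only when TOY_STR_TEST_MAIN is defined. */
#ifdef TOY_STR_TEST_MAIN
int main(void) {
    return toy_str_test();
}
#endif
```

Assuming the file is saved as `toy_str.c`, compiling it with `cc -DTOY_STR_TEST_MAIN toy_str.c` produces a standalone test program whose exit status is the number of failures. Without the flag, the file builds as an ordinary module with no `main`.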

Third, learning computer system design principles by analogy, and advancing your professional skills.

Finally, studying Redis source code has another major benefit, which is to learn key design principles of computer systems along with Redis. Redis is a classic in-memory database, and its design and implementation involve key technologies in two categories of computer systems.

Firstly, key technologies for single-node key-value databases, including data structures that support high-performance access, data structures that use space efficiently, high-concurrency communication in network servers, efficient thread execution models, memory management, and logging mechanisms. These are all issues that must be addressed when designing and implementing a single-node key-value database.

Secondly, key technologies for distributed systems, including master-slave replication mechanisms in distributed systems, scalable cluster data sharding and placement technologies, and scalable cluster communication mechanisms.

Over the course of its development, Redis has arrived at sound designs and optimizations for all of these problems. By reading the Redis source code, you can therefore learn these system design principles and apply them in your own projects, which strengthens your professional competitiveness.

I have drawn the following diagram to show the computer system design principles that can be learned from the Redis source code.

Redis Source Code Diagram

To summarize, reading and studying the Redis source code is highly worthwhile, whether your goal is to understand Redis deeply and become an expert, to build the habit of reading source code and keep up with the latest developments in Redis, or to learn coding standards and design principles from it.

How to Properly Study Redis Source Code? #

However, when you try to read the Redis source code, do you ever feel lost or confused? For example:

There are many functional modules in the Redis source code, and you are not sure about the logical relationships between them. Or, there is a lot of content in a module, making it difficult to determine a clear calling path.
You spend a lot of time reading the code, but you always struggle to grasp the main points. Or, when reading a function code, you easily get caught up in the details and cannot quickly identify the key parts of the code.

You feel “lost” because you lack a panorama of the code structure, and you feel “confused” because you lack guidance on how to read and a grounding in the basic principles. Put simply, you have not yet mastered a systematic and efficient method for reading code.

Based on my experience of reading the source code of large systems like Redis, I will provide you with three tips.

The first key to efficiently reading code is to grasp the overall structure of the source code.

This is because if you start by focusing on a single code file, you easily get caught up in the details and fail to understand the composition of the Redis source code from a global perspective, making it difficult to differentiate between main and secondary parts.

Therefore, when reading the Redis source code, we need to first form a panorama of the Redis source code, as shown below.

Redis Source Code Structure

With this diagram, we can look up the code file we want to study according to our learning needs. Then, based on the different features of Redis, we can study the key technologies and design principles involved in each feature.

The second key to efficiently reading code is to have clear goals and a solid understanding of the principles.

Redis has many functional modules, and each module’s implementation is relatively complex. Before reading the code, we must clarify our goals, whether it is to understand a specific data structure or the process of master-slave replication.

After determining our goals, we also need to have a basic understanding of the corresponding principles before starting to read the source code. This is because the source code reflects the principles. If we don’t understand the basic principles of Redis features, it will be difficult to understand the code logic, making code reading more challenging.

The third key to efficiently reading code is to start with the main logic before diving into the branch details.

Although the source code reflects the principles, compared to principles, the source code usually considers various situations and details during system operation. I have seen some developers who start reading the code by examining each branch and then delve into each function within each branch. However, functions in different branches often involve other processing details. This makes it difficult to understand the main logic of the code and can be discouraging.

In fact, when reading code, we should first outline the main logic of the functional module. Specifically, we trace the execution path and mark the branches as we go, rather than reading line by line from the top. Once the main logic is clear, we can go back and examine how each branch is handled.

For example, when reading the code of the Redis event-driven processing framework, we need to outline the main steps of the event handling process in the code, including event creation, event listening, and event processing loop. Then, we can explore the various details of event creation, listening, and processing. This way, code reading can be more efficient.
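The three steps of that main logic can be sketched as follows. This is a toy loop built on the POSIX `poll()` call, not the Redis event framework: every `toy_` name is hypothetical, only read events are handled, and error handling is kept to a minimum so that the outline stays visible.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define TOY_MAX_EVENTS 16

/* Hypothetical callback type: invoked when fd becomes readable. */
typedef void toy_handler(int fd, void *ctx);

typedef struct {
    struct pollfd fds[TOY_MAX_EVENTS];     /* what the loop listens on */
    toy_handler *handlers[TOY_MAX_EVENTS]; /* one callback per registered fd */
    void *ctxs[TOY_MAX_EVENTS];            /* opaque argument for each callback */
    nfds_t count;
    int stop;                              /* set to 1 to leave toy_loop_run */
} toy_loop;

/* Step 1, event creation: register interest in readability of fd. */
int toy_loop_add(toy_loop *l, int fd, toy_handler *h, void *ctx) {
    assert(l != NULL && h != NULL);
    if (l->count == TOY_MAX_EVENTS) return -1;
    l->fds[l->count].fd = fd;
    l->fds[l->count].events = POLLIN;
    l->fds[l->count].revents = 0;
    l->handlers[l->count] = h;
    l->ctxs[l->count] = ctx;
    l->count++;
    return 0;
}

/* Steps 2 and 3 for one iteration: listen, then dispatch.
 * Returns the number of handlers run, 0 on timeout, -1 on poll() error. */
int toy_loop_run_once(toy_loop *l, int timeout_ms) {
    int fired = 0;
    int ready = poll(l->fds, l->count, timeout_ms);
    if (ready <= 0) return ready;
    for (nfds_t i = 0; i < l->count; i++) {
        if (l->fds[i].revents & POLLIN) {
            l->handlers[i](l->fds[i].fd, l->ctxs[i]);
            fired++;
        }
    }
    return fired;
}

/* The event processing loop: iterate until stop is set or poll() fails. */
void toy_loop_run(toy_loop *l) {
    while (!l->stop) {
        if (toy_loop_run_once(l, -1) < 0) break;
    }
}

/* Example handler: consume one byte and count the invocation in *ctx. */
void toy_count_readable(int fd, void *ctx) {
    char byte;
    if (read(fd, &byte, 1) == 1) (*(int *)ctx)++;
}
```

With this outline in mind, the details a real framework has to deal with (write events, timers, removing events, interrupted system calls) become branches to explore one at a time rather than obstacles to understanding the main path.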

Now that you understand how to study the code, you can dig into specific Redis modules and learn how its different features are designed and implemented.

How is this course designed? #

In terms of features, Redis provides rich data types such as String, List, Hash, Set, and Sorted Set. It also offers high access performance and can be deployed as a master-slave cluster or a sharded cluster to improve reliability and scalability.

Therefore, based on the above features of Redis, I have divided this course into five modules, as follows:

Data Structures: You will learn about the design principles and implementation of the main data structures in Redis, including the implementation of strings, the design of memory-efficient structures, performance optimization of hash tables, as well as the design and implementation of ziplist, quicklist, listpack, and skip list.
Network Communication and Execution Model: You will master the startup process of the Redis server, the design and implementation of high-performance network communication, the design and implementation of event-driven frameworks, and the design and optimization of Redis thread types.
Caching: You will understand how common cache replacement algorithms go from theory to code.
Reliability Guarantee: You will master the specific implementations of RDB and AOF, the design and implementation of the Raft consensus protocol in distributed systems, and the key code implementation of failover.
Sharded Cluster: You will learn about the design and implementation of key mechanisms in Redis sharded clusters, including the Gossip communication protocol, request redirection, and data migration.

Furthermore, while studying the key source code of these five modules, I will also introduce the corresponding design principles of computer systems to help you apply these design principles to your own system development. Finally, I will introduce some programming techniques used in the Redis source code, so that after learning and mastering them, you can apply them to your own program development.

In Conclusion #

The beginning is always the hardest, especially when it comes to reading source code. Redis has hundreds of source files, many of them running to more than a thousand lines. Mastering the Redis source code completely does take considerable time and effort.

However, a good method is the key to doing anything well. So, as you work through the Redis source code, I hope you will keep in mind the three keys I have described:

Obtain an overview of the code.
Set specific learning goals and learn the underlying principles before reading the code.
When reading the code, outline the main logic first and then study the details of each branch.

Finally, I would like to formally get to know you. You can introduce yourself in the comments section and share with me the difficulties you encounter in using or reading the Redis source code, as well as any unique thoughts and experiences you may have. Let’s communicate and discuss together.

Alright, let’s work hard and start our journey into Redis code.