Bonus Lesson 02: How User Kaito Learned Redis

While reading the course comments, I noticed that Kaito often posted excellent summaries, so I asked the editor to contact Kaito and invite him to share his experience of learning Redis.

Next, I will share Kaito’s learning experience with you.


Hello, I’m Kaito.

I am honored to be invited by the Geek Time editor to share my methods of learning Redis with you, hoping to help you learn Redis more efficiently.

Let me introduce myself first.

I have been working for 7 years since graduation and am currently a senior development engineer at a mobile internet company in Beijing. I have previously led the design of a vertical crawler collection platform, and later developed a backend service system for users. Now I am engaged in the development of infrastructure and database middleware, specifically focusing on the development of cross-datacenter storage layer disaster recovery and active-active domains. My main technical stack is Golang.

The Redis cluster solution used by our company is Codis, so I am mainly responsible for the customized development of Codis within the company. For more than a year, much of my work has focused on Redis. During this period, I have run into many Redis-related problems, such as increased access latency and unreasonable deployment and operation parameter configurations. I have also studied Redis in depth: reading books, reading the source code, hitting bugs, and working through pitfalls. Along the way, I have gradually developed an efficient learning path, which I divide into three major modules:

  1. Mastering the basic usage of data structures and caching.
  2. Mastering the technologies that support Redis to achieve high reliability and high performance.
  3. Mastering the underlying implementation principles of Redis.

In today’s session, I’d like to talk to you about “how to learn Redis efficiently”. Later, I will share some of my learning experiences and summaries with you.

Mastering the Basics of Data Structures and Caching #

To be able to use a system, we first need to learn some basic operations. In our daily development of business systems, we often use Redis as a database or cache to some extent. Redis also provides a rich set of data structures, which greatly facilitates our development.

Therefore, to quickly get started with Redis, I suggest starting with three steps:

  1. Learn the usage of basic data types.
  2. Master the usage of advanced data types.
  3. Accumulate some methods for using Redis as a cache and solutions to typical problems.

When first starting with Redis, the first step is to study its basic data structures, namely, String, List, Hash, Set, and Sorted Set. After all, the popularity of Redis is closely related to its rich data types. Its data is stored in memory, which enables extremely fast access speeds and is very suitable for our common business scenarios. Let me give you a few examples:

  • If you only need to store simple key-value pairs or perform increment and decrement operations on numbers, you can use String.
  • If you need a simple distributed queue service, List can meet your needs.
  • If you need to store key-value data and also want to operate on a specific field, Hash is very convenient.
  • If you want to obtain a unique collection, you can use Set, and it can also perform union, difference, and intersection operations.
  • If you want to implement a comment list ordered by weight or a leaderboard, Sorted Set can meet your needs.

When we become proficient in using these basic data types, it means we have gotten started with Redis. At this point, if your business scale is not large, you will not encounter significant problems during usage. However, we are now in the big data era, and we will inevitably encounter business scenarios with huge data and request volumes. In such cases, the basic data types are no longer sufficient.

Here is the simplest example: when the data volume is small, if we want to calculate the number of unique users (UV) in an app on a certain day, we only need to use a Set to store the visiting users on that day, and then use SCARD to calculate the result. However, if the number of users visiting in a day reaches billions, we cannot store them in this way because it would consume a significant amount of memory space. Moreover, such a large key creates the risk of blocking when it expires. At this time, we need to learn the advanced usage of Redis data structures.

Redis provides three extended data types, which we covered earlier in the course: HyperLogLog, Bitmap, and GEO.

HyperLogLog is very suitable for storing business data such as UV, and it occupies very little memory. Similarly, when you need to record the check-in status of a large number of users, you will find that String, Set, or Sorted Set consumes a significant amount of memory. This is where the bit operations provided by Redis come into play. If you encounter cache penetration problems, you can use a Bloom filter built on bit operations, which solves the problem while occupying very little memory.
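The Bloom filter idea can be sketched in a few lines. This is a minimal, in-memory illustration rather than a Redis-backed implementation: the `BloomFilter` class, its default sizes, and the salted SHA-256 hashing are all assumptions made for the example. Against Redis, the bit writes and reads would correspond to `SETBIT` and `GETBIT` on a single key.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter backed by a bytearray used as a bit array."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)  # analogous to SETBIT key pos 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True only means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Register the IDs that really exist, then screen lookups before touching the database.
existing_ids = BloomFilter()
for user_id in ("user:1", "user:2", "user:3"):
    existing_ids.add(user_id)
```

With these assumed defaults the filter occupies 8 KB no matter how many IDs are added; the price is a false-positive rate that grows as more bits are set. A request for an ID the filter rejects can be answered without querying the database at all, which is how it helps with cache penetration.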

Based on this idea, you will discover many clever ways to use Redis. At this stage, based on the data types provided by Redis, you can explore their usage methods as much as possible and implement your business models.

In addition to implementing business models using data types, when using Redis, we often use it as a cache.

Because Redis is extremely fast, it is well suited to caching data from a database, which improves the access speed of our applications. However, since Redis stores data in memory, and the memory of a single machine is limited, it cannot store an unlimited amount of data. Therefore, we also need to consider “how Redis does caching”.

You may have also heard about typical problems when Redis is used as a cache, such as data consistency issues between databases and Redis caches, cache penetration problems, and cache avalanche problems. These problems involve cache strategies, how to set cache expiration times, and how the application and cache collaborate, among others. So, during the early learning phase, we need to know some coping strategies.

Once we have learned these, we can easily operate Redis. Next, we can start learning some advanced usage.

Mastering the Key Points for Redis to Achieve High Performance and High Reliability #

If you have read articles related to software architecture design, you should know that excellent software must meet three conditions: high reliability, high performance, and easy scalability. As an outstanding piece of database software, Redis also meets these conditions. However, easy scalability mainly concerns people who are deeply involved in developing Redis itself, which most of us are not for now, so we can set it aside temporarily. We need to focus on the other two: high reliability and high performance.

The reason why Redis can achieve high reliability and high performance is closely related to its persistence mechanism, master-slave replication mechanism, sentinel, automatic failover, and sharded clustering. Therefore, we also need to master this series of mechanisms. This way, when problems arise, we can quickly locate and solve them. Moreover, we can also learn design concepts from Redis, an excellent software, which will greatly help us in learning other databases.

Let me start with the simplest version of Redis and share my understanding with you.

Suppose we only deploy a Redis instance and store all the business data in this instance. Redis only stores data in memory. If this Redis instance fails and crashes, it means that all of our business data will be lost, which is obviously unacceptable. So, how do we handle this?

Redis needs to have the ability to persist data. Specifically, it can persist the data in memory to disk, so that when the instance crashes, we can recover the data from the disk. Therefore, Redis provides two persistence options: RDB and AOF, corresponding to data snapshots and real-time command persistence, respectively. They complement each other to achieve Redis’ persistence functionality.
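The difference between the two options can be illustrated with a toy store. This is only a conceptual model, not how Redis encodes either file: `ToyStore` and its methods are hypothetical, the "snapshot" is a JSON string, and the "AOF" is a Python list of commands.

```python
import json


class ToyStore:
    """In-memory key-value store with a toy snapshot and a toy command log."""

    def __init__(self):
        self.data = {}
        self.aof = []  # append-only list of write commands

    def set(self, key, value):
        self.data[key] = value
        self.aof.append(("SET", key, value))  # AOF-style: log every write

    def delete(self, key):
        self.data.pop(key, None)
        self.aof.append(("DEL", key))

    def snapshot(self):
        # RDB-style: serialize the whole dataset as it is at this moment.
        return json.dumps(self.data)

    @classmethod
    def from_snapshot(cls, blob):
        store = cls()
        store.data = json.loads(blob)
        return store

    @classmethod
    def from_aof(cls, commands):
        # Recovery replays the logged commands in order.
        store = cls()
        for command in commands:
            if command[0] == "SET":
                store.data[command[1]] = command[2]
            elif command[0] == "DEL":
                store.data.pop(command[1], None)
        return store


store = ToyStore()
store.set("a", "1")
blob = store.snapshot()  # snapshot taken here
store.set("b", "2")      # written after the snapshot
store.delete("a")

from_rdb = ToyStore.from_snapshot(blob)  # sees only the state at snapshot time
from_aof = ToyStore.from_aof(store.aof)  # sees every logged write
```

The snapshot restores quickly but loses whatever was written after it was taken, while the command log captures every write at the cost of a longer replay. That trade-off is why the two are described as complementary.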

With data persistence, can we rest easy?

Not really. After the instance crashes and we need to recover the data from the disk, we will still face a problem: recovery also takes time, and the larger the instance, the longer the recovery time, which will have a greater impact on the business.

To address this problem, the solution is to use multiple replicas. We need Redis to keep multiple replicas in sync in real-time, which is what we call master-slave replication. In this way, when an instance crashes, we still have other complete replicas that can be used. At this point, we only need to promote one replica to become the new master and continue to provide services, avoiding some of the impacts during data recovery.

However, if we think further, when the master node crashes and we promote a slave node, the process is manual. Manual triggering means that when a failure occurs, we must add human reaction time and operation time, and the longer the switch is delayed, the longer the business is affected. What should we do? A natural idea is to let a program switch between master and slave automatically when a failure occurs.

To achieve automatic failover, we need a high-availability component: Sentinel. Sentinel monitors the health of the master node in real time. When the master node fails, Sentinel promotes a slave node to become the new master, achieving automatic failover. The entire process is completed automatically without human intervention, greatly reducing the impact of failures.
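Two of the decisions involved can be sketched as follows: agreeing that the master is really down, and choosing which replica to promote. This is a heavily simplified model with hypothetical names (`Replica`, `objectively_down`, `pick_new_master`). It omits the leader election that real Sentinels run among themselves before a failover, and it uses the replica name as a stand-in for the final tie-breaker.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    priority: int     # lower is preferred; 0 means "never promote"
    repl_offset: int  # higher means more data received from the master


def objectively_down(votes, quorum):
    # votes holds one boolean per Sentinel: True if it cannot reach the master.
    # The master counts as down only when at least `quorum` Sentinels agree.
    return sum(votes) >= quorum


def pick_new_master(replicas):
    candidates = [r for r in replicas if r.priority != 0]
    # Prefer lower priority, then the most up-to-date replica, then the name.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.name))


replicas = [
    Replica("replica-1", priority=100, repl_offset=500),
    Replica("replica-2", priority=100, repl_offset=900),
    Replica("replica-3", priority=0, repl_offset=1000),
]
if objectively_down([True, True, False], quorum=2):
    new_master = pick_new_master(replicas)
```

Requiring agreement from several Sentinels guards against a single Sentinel with a bad network link triggering an unnecessary failover.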

So you see, through the analysis just now, we know that in order to ensure reliability, a database software must inevitably achieve data persistence, master-slave replication, and automatic failover. Other database software also follows these principles, which you can observe and pay attention to.

Up to this point, what we have discussed is all about the functionality of a single Redis instance. If the read and write requests of our business are not large, using a single instance is not a problem. However, when the business write volume is large, a single Redis instance cannot handle such a large write volume.

At this time, we need to introduce sharded clustering, which means organizing multiple Redis instances into a cluster to provide services externally. At the same time, this cluster needs to be able to scale horizontally, so that when the business volume grows, we can deploy new instances by adding machines to handle a larger volume of requests. This way, the performance of our cluster can also become very high. This is why cluster solutions such as Redis Cluster, Twemproxy, and Codis exist. Redis Cluster is the official cluster solution, while Twemproxy and Codis were developed when Redis Cluster was not yet mature.

Since it involves storing data on multiple nodes and being able to add new nodes for cluster expansion when necessary, this corresponds to the core problems of sharded clustering: data routing and data migration.

Data routing is used to solve the problem of which node to write data to, while data migration is used to solve the problem of redistributing cluster data when nodes change.
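As one concrete instance of data routing, Redis Cluster maps every key to one of 16384 hash slots using CRC16 of the key modulo 16384, and each node owns a range of slots. The sketch below reproduces that mapping, including the hash-tag rule, using Python's `binascii.crc_hqx`. The node names and slot ranges are made up for illustration, and Twemproxy and Codis use their own routing schemes.

```python
import binascii

NUM_SLOTS = 16384


def hash_slot(key: str) -> int:
    data = key.encode()
    # Hash tag: if the key contains a non-empty {...}, only that part is hashed,
    # so related keys can be forced into the same slot.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC-16/XMODEM variant.
    return binascii.crc_hqx(data, 0) % NUM_SLOTS


def node_for(slot, slot_ranges):
    # Routing: find the node whose slot range contains this slot.
    for node, (low, high) in slot_ranges.items():
        if low <= slot <= high:
            return node
    raise KeyError(slot)


# Hypothetical layouts before and after adding node-c, which takes over
# slots 5461-8191 from node-a.
before = {"node-a": (0, 8191), "node-b": (8192, 16383)}
after = {"node-a": (0, 5460), "node-c": (5461, 8191), "node-b": (8192, 16383)}

keys = ["foo", "hello", "{user1000}.following", "{user1000}.followers"]
# Migration: only keys whose slot changed owner need to move.
to_migrate = [
    k for k in keys
    if node_for(hash_slot(k), before) != node_for(hash_slot(k), after)
]
```

Because keys are assigned to slots rather than directly to nodes, adding a node means moving whole slots, and only the keys in those slots are migrated.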

When we enter the sharded clustering field from single-node Redis, we open the door to another world.

Have you ever thought about this question: When our system needs to handle a larger volume of requests, where are the areas that are prone to performance problems from the application layer to the data layer?

In fact, it all comes down to the database layer. Because our application layer is stateless, if its performance reaches a bottleneck, we can easily scale it horizontally by deploying more instances. However, even after the application layer is scaled out, the database remains monolithic: a large number of requests are still served by a database on a single machine, which inevitably becomes a performance bottleneck. Therefore, the best solution is to make the database layer distributed as well, so that data is spread across different machines and the database can scale horizontally. Both the business layer and the database layer can then scale elastically with the volume of the business, which is very flexible.

While sharded clustering is more reliable and performs better, it involves deploying multiple machines and therefore introduces new problems. How do we organize multiple nodes? How do we keep the states of multiple nodes consistent? How do we detect failures across machines? Can the cluster still work properly when the network is slow? These issues all involve knowledge of distributed systems.

Everything above relates to reliability. Now let’s take a look at high performance.

Redis stores its data in memory and uses IO multiplexing, which gives it very high performance. Used together with sharded clustering, performance can be improved further. However, this also means that any significant increase in operation latency falls short of what we expect from Redis. Therefore, how to use and operate Redis well is also something we need to pay close attention to. Only in this way can Redis keep running stably and deliver its high performance.

Performance issues run through every aspect we just mentioned. Improper use at the business level, and improper operation of the high-reliability mechanisms and sharded clusters, can all lead to performance problems.

For example, at the business level, using commands with excessive complexity, using O(N) commands with a large N, letting a large number of keys expire at the same time, or reaching an instance’s memory limit will all increase operation latency. At the operation level, an improper choice of persistence strategy, unreasonable master-slave replication parameters, inadequate deployment and monitoring, saturated machine resources, and so on will also cause performance problems.

Redis performance involves various aspects such as CPU, memory, network, and even disk. Once there is a problem in any link, it will affect performance. Therefore, in the second phase, we need to master a series of mechanisms related to high reliability and high performance.

At this point, our Redis skills surpass those of many people, but we have not yet reached the level of mastery. To become a Redis expert, we must also be able to solve tricky problems whenever they arise. For that, we need to study the underlying principles of Redis.

Mastering the Underlying Implementation Principles of Redis #

To understand the underlying principles of the various data types, we can refer to the source code, for example t_string.c, t_list.c, t_hash.c, t_set.c, and t_zset.c.

By reading the source code, we can learn about the specific implementation of each data structure. For example, List is implemented as a linked list, so searching for elements in a List can be slow. On the other hand, Hash and Set are both implemented as hash tables, which makes locating elements very fast. Sorted Set combines hash tables and skip lists, which results in fast element lookup and traversal. Without understanding the implementation details of these data structures, it is impossible to choose the best approach.
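To make the Sorted Set point concrete, here is a toy model of why pairing two structures helps. It is not the Redis implementation: a Python dictionary plays the role of the hash table, and a sorted list maintained with `bisect` stands in for the skip list (a skip list inserts in O(log N), whereas this list insert is O(N)). `ToySortedSet` and its method names are hypothetical and merely echo the related commands.

```python
import bisect


class ToySortedSet:
    def __init__(self):
        self.scores = {}   # member -> score: O(1) lookup, like the hash table
        self.ordered = []  # sorted (score, member) pairs: stand-in for the skip list

    def zadd(self, member, score):
        old = self.scores.get(member)
        if old is not None:
            # Remove the old entry so each member appears once in the ordering.
            index = bisect.bisect_left(self.ordered, (old, member))
            del self.ordered[index]
        self.scores[member] = score
        bisect.insort(self.ordered, (score, member))

    def zscore(self, member):
        return self.scores.get(member)

    def zrevrange(self, start, stop):
        # Highest score first, inclusive indexes (non-negative only in this sketch).
        members = [member for _, member in reversed(self.ordered)]
        return members[start:stop + 1]


board = ToySortedSet()
board.zadd("alice", 120)
board.zadd("bob", 95)
board.zadd("carol", 150)
board.zadd("alice", 160)      # updating a score moves the member in the ordering
top_two = board.zrevrange(0, 1)
```

The dictionary answers "what is this member's score?" directly, while the ordered structure answers "who are the top N?" without sorting on every query. Either structure alone would make one of those two operations slow.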

If you pay close attention, you will also notice that each data type has more than one underlying implementation. For example, List, Hash, and Sorted Set use compressed lists (ziplist) to store data when the data size is relatively small, in order to save memory. String and Set also prefer integer encoding where possible, to further reduce memory usage. These are all optimizations Redis has made to its data structures. Only by understanding these underlying principles can we fully leverage the advantages of Redis when using it.

In addition, we also need to understand a series of principles related to high performance and high reliability, primarily persistence, master-slave synchronization, failover, and sharded clusters. For example:

  • Both RDB snapshots and AOF rewriting rely on the “fork” mechanism provided by the operating system, which involves knowledge at the operating system level.
  • Failover is implemented with a Sentinel cluster, which involves the election and consensus problems in distributed systems.
  • Managing a sharded cluster involves operating multiple nodes on different machines, which raises many issues related to distributed systems, such as CAP theorem, distributed transactions, and architectural design.

By mastering these principles, we can adapt to any situation. No matter what problems we encounter, we can easily analyze and locate them. At this stage, our ability to utilize Redis surpasses that of many others.

Well, this is the learning path I have summarized for Redis, arranged in order of increasing difficulty. Along the way, you can read books and take related courses, such as our Redis series, which will help you quickly improve your practical skills.

Lastly, I would also like to hear from you about how you learned Redis. I hope you can share your learning methods in the comments so that we can exchange ideas together.