40 Redis’s Next Step Practices Based on NVM Memory #

Today’s lesson is the final lesson of our course. Let’s talk about the next step for Redis.

In recent years, the development of new Non-Volatile Memory (NVM) devices has been very fast. NVM devices have the characteristics of large capacity, fast performance, and persistent data storage, which happen to be the goals that Redis pursues. At the same time, NVM devices, like DRAM, allow software to access data at the byte level. Therefore, in practical applications, NVM can be used as memory, which we call NVM memory.

You must be thinking that if Redis, as an in-memory key-value database, could be used in combination with NVM memory, it could fully enjoy these characteristics. In my opinion, the next step in Redis’ development could be the implementation of large-capacity instances based on NVM memory or the realization of fast data persistence and recovery. In this lesson, I will introduce you to this new trend.

Next, let’s learn about the characteristics of NVM memory and the two modes of software using NVM memory. In different usage modes, the NVM features that software can use are different. Therefore, mastering this knowledge can help us better choose the appropriate mode according to business needs.

Characteristics and Usage Patterns of NVM Memory #

Redis is a key-value database based on DRAM memory. Compared to traditional DRAM memory, NVM has three significant characteristics.

Firstly, the biggest advantage of NVM memory is that it can directly persist data. This means that even if there is a power failure or system crash, the data stored in NVM memory will still be retained. On the other hand, if the data is stored in DRAM, it will be lost after a power failure.

Secondly, the access speed of NVM memory is comparable to that of DRAM. I have personally tested the access speed of NVM memory, and the results show that the read latency is approximately 200-300 ns, while the write latency is approximately 100 ns. In terms of read-write bandwidth, a single NVM memory module has a write bandwidth of about 1-2 GB/s and a read bandwidth of about 5-6 GB/s. When software systems store data in NVM memory, they can still access the data quickly.

Lastly, NVM memory has a large capacity. This is because NVM devices have a high density, allowing a single NVM storage unit to hold more data. For example, a single NVM memory module can have a capacity of up to 128 GB, and even up to 512 GB, whereas a single DRAM memory module typically has a capacity of 16 GB or 32 GB. Therefore, it is relatively easy to build memory systems at the terabyte level using NVM memory.

In summary, the characteristics of NVM memory can be summarized in three points:

  • Data persistence
  • Read-write speed comparable to DRAM
  • Large capacity

Nowadays, there are actual NVM memory products available in the industry, such as the Optane AEP memory module (referred to below as AEP memory) launched by Intel in April 2019. When using AEP memory, we need to be aware of its two usage modes, which correspond to the two ways of using NVM: as large-capacity memory and as persistent storage. Let’s learn about these two modes.

The first mode is Memory mode.

This mode uses NVM as large-capacity volatile memory: it takes advantage of NVM’s capacity and performance, but does not enable data persistence.

For example, we can install 6 NVM memory modules, each with 512 GB, on a single server, thereby obtaining a memory capacity of 3 TB.

In Memory mode, the server still needs to be equipped with DRAM. However, the CPU uses the DRAM as a cache for the AEP memory, and the DRAM space is not visible to application software. In other words, the memory space that software can access equals the capacity of the AEP memory modules.

The second mode is App Direct mode.

This mode enables the functionality of persisting data in NVM. In this mode, when the application software writes data to AEP memory, the data is directly persisted. Therefore, AEP memory used in App Direct mode is also called Persistent Memory (PM).

Now that we know about the two usage modes of AEP memory, how does Redis utilize it? Let me explain it to you in detail.

Redis Practice Based on NVM Memory #

When AEP memory is used in Memory mode, application software can utilize its large capacity to store a large amount of data. Redis can also provide high-capacity instances to upper-level business applications. Moreover, in Memory mode, Redis can run directly on AEP memory without modifying the code, just as it runs on DRAM memory.

However, there is something to note: in Memory mode, the access latency of AEP memory is somewhat higher than that of DRAM. As mentioned earlier, the read latency of NVM is about 200-300 ns, and the write latency about 100 ns. Therefore, running Redis instances in Memory mode lowers the instance’s read performance, and we need to strike a balance between storing a large amount of data and accepting slower reads.

So, when we use AEP memory as PM in App Direct mode, how should Redis make use of PM’s fast data persistence? This depends on Redis’s data-reliability requirements and its existing mechanisms. Let’s analyze it in detail.

To ensure data reliability, Redis has designed two mechanisms, RDB and AOF, to persistently save data to disk.

However, both RDB and AOF require writing data or command operations to files on disk. For RDB, although a Redis instance generates the RDB file in a child process, the fork operation itself still blocks the instance’s main thread. Moreover, generating an RDB file goes through the file system, which adds its own overhead.
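The fork-based snapshot approach described above can be sketched in miniature. This is a toy illustration (POSIX-only; the dataset, the `dump.rdb` file name, and the use of JSON as the serialization format are all stand-ins, not Redis internals): the parent forks, the child serializes its copy-on-write snapshot of the data through the file system, and the parent keeps serving writes in the meantime.

```python
import json
import os

store = {"k1": "v1", "k2": "v2"}  # in-memory dataset (stand-in for Redis data)

pid = os.fork()  # the fork call itself is the step that blocks the parent
if pid == 0:
    # Child: copy-on-write gives it a snapshot of the parent's memory at
    # fork time; it persists that snapshot through the file system,
    # which is exactly the extra overhead PM would let us avoid.
    with open("dump.rdb", "w") as f:
        json.dump(store, f)
    os._exit(0)
else:
    # Parent: continues serving writes while the child snapshots;
    # this write is NOT visible in the child's snapshot.
    store["k3"] = "v3"
    os.waitpid(pid, 0)
```

Because of copy-on-write semantics, the snapshot on disk contains only the state at fork time, while the parent's in-memory dataset has moved on.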

For the AOF log, Redis provides three appendfsync options: always, everysec, and no. The always option fsyncs each write to disk, which ensures data reliability but risks a loss in performance. The everysec option avoids an immediate disk write for each operation and instead flushes once per second in the background. This improves Redis’s write performance, but risks losing up to about a second of data.

In addition, when we use RDB files or AOF files to recover Redis, we need to load the RDB file into memory or replay the log operations in AOF. The efficiency of this recovery process is affected by the size of the RDB file and the number of log operations in the AOF file.

That is why, in earlier lessons, I often reminded you not to let a single Redis instance grow too large; otherwise the RDB file becomes too large as well. In master-slave deployments, an oversized RDB file makes synchronization between the master and slaves inefficient.

Let’s briefly summarize the issues when Redis is involved in persistence operations:

  • The fork operation during RDB file creation blocks the main thread.
  • With AOF logging, a balance must be struck between data reliability and write performance.
  • When using RDB or AOF to recover data, the recovery efficiency is restricted by the size of RDB and AOF.

However, if we use persistent memory, we can fully leverage the fast persistence feature of PM to avoid the operations of RDB and AOF. Since PM supports memory access and Redis operations are all memory-based, Redis can run directly on PM. At the same time, the data can be persistently saved on PM itself, so we no longer need additional RDB or AOF log mechanisms to ensure data reliability.

So, when using PM to support Redis’s persistence operations, how can we implement it specifically?

I’ll first introduce the usage of PM.

When PM is deployed in the server, we can see a PM device in the /dev directory of the operating system, as shown below:

/dev/pmem0

Then, we format the device with an ext4 file system (ext4 supports DAX, i.e. direct access to persistent memory):

mkfs.ext4 /dev/pmem0

After that, we mount this newly formatted device to a directory on the server:

mount -o dax /dev/pmem0 /mnt/pmem0

Now we can create files in this directory. After creating the files, we can map them to the process space of Redis using memory mapping (mmap). In this way, Redis can directly save the received data to the mapped memory space, which is provided by PM. Therefore, when data is written to this memory space, it can be directly persisted.
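The mapping step above can be sketched with Python’s mmap module. This is a minimal sketch, not Redis code: the file path under /mnt/pmem0 is hypothetical, and for portability the sketch falls back to an ordinary file, in which case writes go through the page cache rather than directly to PM. On a real ext4-dax mount, the same memory writes would land in persistent memory, and the flush would make them durable.

```python
import mmap
import os

# Hypothetical data file on the DAX-mounted PM device; fall back to a
# regular file so the sketch runs anywhere.
path = "/mnt/pmem0/redis.data" if os.path.isdir("/mnt/pmem0") else "redis.data"

SIZE = 4096  # map a single page for the demo

# Create and size the backing file, then map it into the process space.
with open(path, "wb") as f:
    f.truncate(SIZE)

fd = os.open(path, os.O_RDWR)
mm = mmap.mmap(fd, SIZE)

# A store is just a memory write into the mapped region; on ext4-dax
# this lands directly in persistent memory.
mm[0:5] = b"hello"

# On PM, a flush (msync) ensures the write is durable.
mm.flush()

# Byte-addressable updates: modify data in place, no file rewrite needed.
mm[0:1] = b"H"

mm.close()
os.close(fd)

# After a "crash", recovery is simply re-opening (or re-mapping) the file
# and reading the data back.
with open(path, "rb") as f:
    data = f.read(5)
print(data)  # b'Hello'
```

Note how the in-place one-byte update relies on byte-level access, which block-oriented storage cannot offer without a read-modify-write cycle.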

Moreover, if we need to modify or delete data, PM itself supports data access at the byte level, so Redis can directly modify or delete data on PM.

If an instance failure occurs and Redis crashes, because the data itself has been persistently saved on PM, we can directly use the data on PM for instance recovery, without the need to load RDB files or replay AOF log operations as in the current Redis, which achieves fast fault recovery.

Of course, because PM reads and writes more slowly than DRAM, if we run Redis on PM, we need to evaluate whether the access latency and bandwidth PM provides can meet the requirements of the business layer.

Let me give you an example to show how to evaluate PM’s bandwidth to support Redis’s business.

Suppose the business layer needs to support a throughput of 1 million QPS, and the average size of each request is 2KB. Then, the machine needs to support a bandwidth of 2GB/s (1 million requests per second * 2KB per request = 2GB/s). If these requests are all write operations, the write bandwidth of a single PM may not be sufficient.
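The sizing arithmetic above can be checked in a few lines. The throughput and request-size figures come from the text; the per-module write bandwidth of 1 GB/s is the conservative end of the 1-2 GB/s range mentioned earlier, and the module count is simply the ceiling of required over available bandwidth.

```python
import math

qps = 1_000_000          # target throughput, requests per second
request_size = 2 * 1024  # 2 KB average request size, in bytes

# Bandwidth required if every request is a write.
required_gb_s = qps * request_size / (1024 ** 3)
print(round(required_gb_s, 2))  # ~1.91, i.e. roughly 2 GB/s

# With ~1-2 GB/s write bandwidth per PM module (figure from the text),
# estimate how many modules a server needs in the worst case.
per_module_gb_s = 1.0  # conservative end of the quoted range
modules_needed = math.ceil(required_gb_s / per_module_gb_s)
print(modules_needed)  # 2
```

The same back-of-the-envelope calculation also tells us when to shard instead: if the required bandwidth exceeds what one server’s PM modules can deliver, the load must be spread across a sharded cluster.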

In this case, we can use multiple PM memory modules in one server to support high bandwidth requirements. Of course, we can also use a sharded cluster to distribute data to multiple instances to alleviate the access pressure.

Alright, by now, we have learned how to directly persist Redis data on PM. Now, we can use a single instance with a large-capacity PM to store more business data, and directly use the data saved on PM for fault recovery after an instance failure.

Summary #

In this lesson, I introduced you to the three main features of NVM (Non-Volatile Memory): high performance, large capacity, and persistent data storage. Software systems can access NVM memory just like traditional DRAM memory. Currently, Intel has launched the NVM memory product, Optane AEP.

This NVM memory product provides two usage modes for software: Memory mode and App Direct mode. In Memory mode, Redis can take advantage of the large capacity of NVM to achieve high-capacity instances and store more data. In App Direct mode, Redis can directly read and write data on persistent memory. In this case, Redis no longer needs to use RDB or AOF files, and data will not be lost after a power failure. Moreover, instances can directly recover using the data on persistent memory, resulting in fast recovery speed.

NVM memory is a significant change in the storage device field in recent years. It can persistently store data and quickly access it like memory, which undoubtedly brings new opportunities for optimizing current software systems based on DRAM and hard drives. Many internet giants have already started using NVM memory, and I hope you can pay attention to this important trend and be prepared for future developments.

One Question per Lesson #

As usual, I have a small question for you: with persistent memory available, do you think we still need Redis master-slave clusters?

Please feel free to write your thoughts and answer in the comment section. Let’s discuss and exchange ideas together. If you find today’s content helpful, you are also welcome to share it with your friends or colleagues.