04 How to Design Memory-Friendly Data Structures in Detail #

Today, let’s talk about how Redis improves memory utilization through optimized data structure designs.

As we know, Redis is an in-memory database, so efficient memory usage is crucial to its implementation. In fact, Redis improves memory utilization mainly through two techniques: optimizing how data structures are designed and used, and evicting data from memory according to certain rules.

Evicting data according to rules is handled by Redis’s memory eviction policies, which remove less frequently used data so that the limited memory space can serve frequently accessed data. The design and implementation of this part mainly depend on those eviction policies, which I will cover in detail in the caching module later.

Therefore, in this lesson, I mainly want to teach you how Redis optimizes its data structures for efficient memory usage. This covers two design approaches: memory-friendly data structure design and memory-efficient data access.

Both of these approaches and their implementation techniques are broadly applicable. When you design system software that must manage memory carefully to reduce overhead, they are well worth learning and mastering.

Alright, next, let’s start by learning about memory-friendly data structure design.

Memory-friendly Data Structures #

First of all, we need to know that in Redis, there are three data structures that have been designed and optimized for memory usage efficiency. They are Simple Dynamic Strings (SDS), ziplists, and intsets. Now, let’s learn about them one by one.

Memory-friendly Design of SDS #

In fact, I already introduced the structure design of SDS in Lecture 2, so let’s do a quick review here. SDS defines several types of structure headers, including sdshdr8, sdshdr16, sdshdr32, and sdshdr64. These different header types fit strings of different sizes, which avoids wasting memory.

However, besides carefully designed structure headers, SDS also uses an embedded-string design when saving short strings. This design avoids a separate allocation for the string and stores the string directly inside Redis’s basic data object structure.

This means that to understand how embedded strings are designed and implemented, we first need to understand redisObject, the basic data object structure used in Redis.

The redisObject Structure and Bit Field Definition Method #

The redisObject structure is defined in the server.h file and is mainly used to store the values in key-value pairs. It contains four metadata fields and one pointer:

  • type: The data type of redisObject, which is the data type saved by the application in Redis, including String, List, Hash, etc.
  • encoding: The encoding type of redisObject, which is the data structure used by Redis to implement various data types.
  • lru: The LRU time of redisObject.
  • refcount: The reference count of redisObject.
  • ptr: A pointer to the value.

The following code shows the definition of the redisObject structure:

typedef struct redisObject {
    unsigned type:4; // 4 bits for the data type of redisObject
    unsigned encoding:4; // 4 bits for the encoding type of redisObject
    unsigned lru:LRU_BITS; // LRU_BITS (24 bits) for the LRU time of redisObject
    int refcount; // 4 bytes for the reference count of redisObject
    void *ptr; // 8 bytes for the pointer to the value
} robj;

From the code, we can see that the type, encoding, and lru variables are each followed by a colon and a number, which indicates how many bits that field occupies. type and encoding each occupy 4 bits, while the number of bits used by lru is determined by the macro LRU_BITS in server.h, whose default value is 24, as shown below:

#define LRU_BITS 24

Here, what I want you to learn and master is this way of defining a variable with a colon followed by a bit count. This is the bit field definition method in the C language, and it can effectively save memory.

This method is well suited to cases where a single variable does not need all the bits of its underlying type. The bits of one data type can be divided into several bit fields, each occupying a given number of bits, so that multiple variables share the bits of a single data type, which effectively reduces memory overhead.

In addition, you may notice that type, encoding, and lru are all declared as unsigned. An unsigned int typically occupies 4 bytes, yet these three fields only need 4 bits, 4 bits, and 24 bits respectively, which together fit exactly into a single 4-byte unsigned value. So, compared with giving each variable its own 4-byte unsigned int (12 bytes in total), the bit field definition packs all three into 4 bytes, saving 8 bytes.

So, when designing and developing memory-sensitive software, you can use this bit field definition method.
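To make the saving concrete, here is a small, self-contained C sketch (an illustration, not Redis code) that compares a plain struct with a bit-field struct. The exact sizes depend on your compiler and platform, but on a typical 64-bit system you will see 12 bytes versus 4 bytes:

#include <stdio.h>

/* Three metadata fields, each stored in its own unsigned int: typically 12 bytes. */
struct meta_plain {
    unsigned type;
    unsigned encoding;
    unsigned lru;
};

/* The same three fields packed as bit fields into one unsigned int:
 * 4 + 4 + 24 = 32 bits, i.e. typically 4 bytes. */
struct meta_packed {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
};

int main(void) {
    printf("plain:  %zu bytes\n", sizeof(struct meta_plain));
    printf("packed: %zu bytes\n", sizeof(struct meta_packed));
    return 0;
}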

Now that we have understood the redisObject structure and the bit field definition method it uses, let’s take a look at how embedded strings are implemented.

Embedded Strings #

As I mentioned earlier, when saving smaller strings, SDS uses the design method of embedded strings to save the strings directly in the redisObject structure. In the redisObject structure, there is a pointer “ptr” that usually points to the data structure of the value.

Let’s take creating a String type value as an example. Redis calls the createStringObject function to create the corresponding redisObject, and the pointer “ptr” in this redisObject will point to the SDS data structure, as shown in the following diagram.

In the Redis source code, the createStringObject function determines which function to call to complete the creation based on the length of the string to be created.

The createStringObject function takes two parameters: the string pointer ptr and the string length len. When len is greater than the macro OBJ_ENCODING_EMBSTR_SIZE_LIMIT, createStringObject calls createRawStringObject; otherwise, it calls createEmbeddedStringObject. In the Redis 5.0.8 source code we are analyzing, OBJ_ENCODING_EMBSTR_SIZE_LIMIT is defined as 44 bytes by default.

The relevant code is shown below:

#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44
robj *createStringObject(const char *ptr, size_t len) {
    // create embedded string when the length of the string is less than or equal to 44 bytes
    if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT)
        return createEmbeddedStringObject(ptr,len);
    // create regular string when the length of the string is greater than 44 bytes
    else
        return createRawStringObject(ptr,len);
}

Now, let’s analyze the source code of these two functions to understand how ordinary strings larger than 44 bytes and embedded strings of 44 bytes or fewer are created.

First, for the createRawStringObject function, when creating a value of String type, it calls the createObject function.

Note: The createObject function is mainly used to create Redis data objects. Since Redis data objects come in many types, such as String, List, and Hash, one of the two parameters of createObject indicates the type of data object to create, and the other is a pointer to the object’s underlying data.

Then, when the createRawStringObject function calls the createObject function, it passes the OBJ_STRING type to indicate the creation of a String type object, and passes a pointer to the SDS structure, as shown in the code below. It should be noted that the pointer to the SDS structure is returned by the sdsnewlen function, which is used to create SDS structures.

robj *createRawStringObject(const char *ptr, size_t len) {
    return createObject(OBJ_STRING, sdsnewlen(ptr,len));
}

Finally, let’s take a closer look at the createObject function. This function directly assigns the pointer to the SDS structure passed as a parameter to the ptr field of the redisObject, as shown in the following code:

robj *createObject(int type, void *ptr) {
    //Allocate memory for redisObject structure
    robj *o = zmalloc(sizeof(*o));
    //Set the type of redisObject
    o->type = type;
    //Set the encoding type of redisObject, which is OBJ_ENCODING_RAW indicating the regular SDS
    o->encoding = OBJ_ENCODING_RAW;
    //Assign the pointer passed in to the pointer in redisObject directly.
    o->ptr = ptr;
    o->refcount = 1;
    ...
    return o;
}

To help you understand how ordinary strings are created, I drew a diagram that you can refer to.

Diagram

This means that when creating an ordinary string, Redis allocates memory twice, once for the redisObject and once for the SDS, which adds allocation overhead and can cause memory fragmentation. Therefore, when the string is 44 bytes or shorter, Redis uses the embedded-string creation method to reduce allocations and fragmentation.

This creation method is exactly what the createEmbeddedStringObject function mentioned earlier implements: it uses one contiguous block of memory to hold both the redisObject and the SDS structure, so only a single memory allocation is needed and memory fragmentation is avoided.

The prototype of the createEmbeddedStringObject function is shown below; its parameters are the string pointer ptr and the string length len passed down from createStringObject.

robj *createEmbeddedStringObject(const char *ptr, size_t len)

Now, let’s take a closer look at how the createEmbeddedStringObject function puts redisObject and SDS together.

First, the createEmbeddedStringObject function allocates a contiguous block of memory. Its size equals the sum of the redisObject structure size, the SDS header size (sdshdr8), the string length, and one extra byte. That final byte holds the null terminator “\0” that SDS appends to the end of the string.

The allocation of this contiguous memory block is shown in the code below:

robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr8)+len+1);

You can also refer to the diagram below, which shows the layout of this contiguous memory block.

Diagram

Alright, after allocating the memory block, the createEmbeddedStringObject function creates a pointer sh to the SDS structure and points it at the location of the SDS header inside this contiguous memory space. The following code shows this step. Here, o is the pointer to the redisObject, and o+1 advances the address past it by exactly the size of one redisObject structure.

struct sdshdr8 *sh = (void*)(o+1);

After this step, the position pointed to by sh is shown in the diagram below:

Diagram

Next, the createEmbeddedStringObject function sets the ptr field of the redisObject structure to point to the character array in the SDS structure.

As shown in the code below, sh is the sdshdr8 pointer created in the previous step, and sh+1 advances the address past the header by exactly the size of one sdshdr8 structure.

o->ptr = sh+1;

After this step, the ptr field of the redisObject points to the end of the SDS structure header, which is also the starting position of the character array.

Finally, the createEmbeddedStringObject function will copy the string pointed to by the ptr parameter into the character array in the SDS structure and append a null character at the end of the array. The code for this part is shown below:

memcpy(sh->buf,ptr,len);
sh->buf[len] = '\0';

The following diagram also shows the entire process of creating an embedded string with the createEmbeddedStringObject function; you can look it over as a whole.

In short, remember that Redis deliberately lays out the redisObject structure and the SDS structure next to each other in one contiguous memory area. For strings of at most 44 bytes, this avoids memory fragmentation and the overhead of a second memory allocation.
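Piecing the snippets above together, the core of the createEmbeddedStringObject function looks roughly like the sketch below. Note that this is a simplified outline that relies on Redis’s own headers: the actual Redis 5.0.8 function also initializes the lru field and handles special values of ptr, which are omitted here.

robj *createEmbeddedStringObject(const char *ptr, size_t len) {
    // One allocation covers redisObject + sdshdr8 header + string + '\0'
    robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr8)+len+1);
    // The SDS header starts right after the redisObject
    struct sdshdr8 *sh = (void*)(o+1);

    o->type = OBJ_STRING;
    o->encoding = OBJ_ENCODING_EMBSTR;  // embedded string encoding
    o->ptr = sh+1;                      // ptr points at the character array
    o->refcount = 1;

    // Fill in the SDS header, then copy the string and terminate it
    sh->len = len;
    sh->alloc = len;
    sh->flags = SDS_TYPE_8;
    memcpy(sh->buf,ptr,len);
    sh->buf[len] = '\0';
    return o;
}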

In addition to embedded strings, Redis also designs two compact, memory-efficient data structures: compressed lists (ziplist) and integer sets (intset). Let’s look at their design ideas next.

Design of Compressed Lists and Integer Sets #

First of all, you need to know that the List, Hash, and Sorted Set data types can all use compressed lists (ziplist) to store data. The function definitions and implementation code for compressed lists are in the ziplist.h and ziplist.c files, respectively.

However, in the ziplist.h file you will not find a structure definition for the compressed list. That is because the ziplist itself is just one contiguous block of memory, in which data is saved using different encodings.

To understand the design and implementation of the compressed list, let’s first take a look at its creation function ziplistNew, as shown below:

unsigned char *ziplistNew(void) {
    // Initial allocation size
    unsigned int bytes = ZIPLIST_HEADER_SIZE+ZIPLIST_END_SIZE;
    unsigned char *zl = zmalloc(bytes);
    ...
    // Set the end of the list to ZIP_END
    zl[bytes-1] = ZIP_END;
    return zl;
}

In fact, the logic of the ziplistNew function is simple: it allocates a contiguous block of memory whose size is the sum of ZIPLIST_HEADER_SIZE and ZIPLIST_END_SIZE, and then sets the last byte of that space to ZIP_END to mark the end of the list.

Another thing to note: the three macros ZIPLIST_HEADER_SIZE, ZIPLIST_END_SIZE, and ZIP_END used in the code above are defined in ziplist.c. They represent, respectively, the size of the list header, the size of the list end marker, and the byte value of the end marker, as shown below:

// Size of the ziplist header: two 32-bit integers and one 16-bit integer, which record
// the total number of bytes in the compressed list, the offset of the last element
// from the list head, and the number of elements in the list.
#define ZIPLIST_HEADER_SIZE     (sizeof(uint32_t)*2+sizeof(uint16_t))
// Size of the list end in the ziplist, including one 8-bit integer representing the end of the list.
#define ZIPLIST_END_SIZE        (sizeof(uint8_t))
// Byte content of the list end in the ziplist.
#define ZIP_END 255

After creating a new ziplist, the memory layout of the list is shown in the following diagram. Note that at this point, the list does not have any actual data.
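Since the macros above fix the size of each field, an empty ziplist occupies 11 bytes in total (4 + 4 + 2 + 1). Using the field names from the comments in ziplist.c, the layout can be sketched like this:

+---------+---------+---------+---------+
| zlbytes | zltail  | zllen   | zlend   |
| 4 bytes | 4 bytes | 2 bytes | 1 byte  |
+---------+---------+---------+---------+
  total     offset of  number    0xFF end
  bytes     last item  of items  marker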

Then, when we insert data into the ziplist, it encodes the data differently depending on whether it is a string or an integer, and on its size. This design, using different encodings for data of different sizes, is exactly how the ziplist saves memory.

So, how does the ziplist perform encoding? To understand the encoding implementation, we need to first understand the structure of a list item in the ziplist.

A ziplist item consists of three parts: the length of the previous item (prevlen), the encoding result of the length information of the current item (encoding), and the actual data of the current item (data). The following diagram shows the structure of a list item (the content outside the list items in the diagram represents the beginning and end of the ziplist memory space).
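In text form, a single list item occupies memory roughly like this (a schematic of the three parts just described, not source code):

+------------------------+------------------------------+--------------------+
| prevlen                | encoding                     | data               |
| length of the previous | type and length info for the | the item's actual  |
| list item              | current item                 | integer or string  |
+------------------------+------------------------------+--------------------+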

In fact, this so-called encoding technique simply means representing the saved information with different numbers of bytes. In the ziplist, it is mainly applied to the prevlen and encoding metadata of each list item, while the actual data of the current item is stored as a normal integer or string.

So here, let’s first look at the encoding design of prevlen. A ziplist contains multiple list items stored adjacent to one another, as shown in the following diagram:

To facilitate lookups, each list item records the length of the previous item. Since list items vary in length, always using the same (large) number of bytes to record prevlen would waste memory.

Let me give you an example. Suppose we used 4 bytes to record prevlen. If the previous list item is just the string “redis”, which is 5 bytes long, then a single byte (8 bits) would be enough, because one byte can represent any length up to 255 (2 to the power of 8 minus 1). Recording this prevlen with 4 bytes would therefore waste 3 bytes.

Now, let’s go back and look at how the ziplist encodes prevlen. When encoding prevlen, the ziplist will first call the function zipStorePrevEntryLength to check if the length of the previous item is less than 254 bytes. If it is, then prevlen will be represented using 1 byte. Otherwise, the zipStorePrevEntryLength function will call the zipStorePrevEntryLengthLarge function to further encode prevlen. The code for this part is shown below:

// Check whether the length of the previous item is less than ZIP_BIG_PREVLEN (254)
if (len < ZIP_BIG_PREVLEN) {
  // If so, store the length in a single byte and return a prevlen size of 1
  p[0] = len;
  return 1;
} else {
  // Otherwise, call zipStorePrevEntryLengthLarge to encode prevlen with 5 bytes
  return zipStorePrevEntryLengthLarge(p,len);
}

The zipStorePrevEntryLengthLarge function, in turn, first sets the first byte of prevlen to 254 as a marker, then uses memcpy to copy the 4-byte length of the previous list item into bytes 2 through 5 of prevlen. Finally, it returns the total size of prevlen, which is 5 bytes.

if (p != NULL) {
  // Set the first byte of prevlen to ZIP_BIG_PREVLEN, which is 254
  p[0] = ZIP_BIG_PREVLEN;
  // Copy the length value of the previous list item to bytes 2 to 5 of prevlen, where the value of sizeof(len) is 4
  memcpy(p+1,&len,sizeof(len));
  ...
}
// Return the size of prevlen, which is 5 bytes
return 1+sizeof(len);
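Putting the two branches together: a previous item shorter than 254 bytes costs 1 byte of prevlen, and anything longer costs 5 bytes. The standalone snippet below is only an illustration of that rule (it is not Redis source code):

#include <stdio.h>
#include <stdint.h>

#define ZIP_BIG_PREVLEN 254

/* Mirror of the rule above: lengths below 254 fit into 1 byte of prevlen,
 * anything larger needs a 254 marker byte plus a 4-byte length. */
static unsigned int prevlen_bytes(uint32_t prev_entry_len) {
    return (prev_entry_len < ZIP_BIG_PREVLEN) ? 1 : 1 + (unsigned int)sizeof(uint32_t);
}

int main(void) {
    printf("previous item of 5 bytes   -> prevlen uses %u byte(s)\n", prevlen_bytes(5));
    printf("previous item of 300 bytes -> prevlen uses %u byte(s)\n", prevlen_bytes(300));
    return 0;
}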

Now that we have seen the two encodings used for prevlen, 1 byte and 5 bytes, let’s look at how the encoding field itself is encoded.

We know that the actual data of a list item can be an integer or a string. Integers can be 16, 32, or 64 bits wide, and strings can have arbitrary lengths.

In the zipStoreEntryEncoding function, different byte lengths are used for encoding integers and strings. The following code shows a portion of the zipStoreEntryEncoding function, where you can see that the length len of the encoding result is different for different lengths of string or integer data.

// The default length of the encoding result is 1 byte
unsigned char len = 1;
// If the data is a string
if (ZIP_IS_STR(encoding)) {
  // If the string length is less than or equal to 63 bytes (0x3f in hexadecimal)
  if (rawlen <= 0x3f) {
    // The default length of the encoding result is 1 byte
    ...
  }
  // If the string length is less than or equal to 16383 bytes (0x3fff in hexadecimal)
  else if (rawlen <= 0x3fff) {
    // The length of the encoding result is 2 bytes
    len += 1;
    ...
  }
  // If the string length is greater than 16383 bytes
  else {
    // The length of the encoding result is 5 bytes
    len += 4;
    ...
  }
} else {
  /* If the data is an integer, the length of the encoding result is 1 byte */
  if (!p) return len;
  ...
}
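To summarize these branches for strings, the small helper below (again just an illustration, not Redis source code) maps a string’s raw length to the number of bytes its encoding field needs:

#include <stdio.h>

/* Mirror of the string branches above: how many bytes the encoding field
 * takes for a string of rawlen bytes. */
static unsigned int string_encoding_bytes(unsigned int rawlen) {
    if (rawlen <= 0x3f)   return 1;  /* length fits into 6 bits            */
    if (rawlen <= 0x3fff) return 2;  /* length fits into 14 bits           */
    return 5;                        /* 1 type byte plus a 4-byte length   */
}

int main(void) {
    printf("5-byte string     -> %u encoding byte(s)\n", string_encoding_bytes(5));
    printf("1000-byte string  -> %u encoding byte(s)\n", string_encoding_bytes(1000));
    printf("20000-byte string -> %u encoding byte(s)\n", string_encoding_bytes(20000));
    return 0;
}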

In summary, using metadata (prevlen and encoding) of different sizes for data of different lengths effectively saves memory. In addition to the ziplist, Redis also designs a memory-friendly data structure called the intset, which serves as an underlying structure for the Set data type when the set holds only integers.

Like SDS embedded strings and the ziplist, the intset is also a contiguous block of memory, as its definition shows. The intset is defined in intset.h and implemented in intset.c.

The following code shows the structure definition of intset. As we can see, the part that records the data in the intset structure is an integer array contents of type int8_t. From a memory usage perspective, the integer array is a contiguous block of memory, so this avoids memory fragmentation and improves memory usage efficiency.

typedef struct intset {
  uint32_t encoding;
  uint32_t length;
  int8_t contents[];
} intset;
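To see why the flexible array member contents[] matters, here is a small, self-contained C sketch (an illustration, not Redis code) of a miniature set whose header and elements live in a single allocation:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/* A miniature "intset": the header and the elements share one allocation,
 * because contents[] is a flexible array member rather than a pointer. */
typedef struct miniset {
    uint32_t length;
    int16_t contents[];
} miniset;

int main(void) {
    uint32_t n = 4;
    /* One malloc covers both the header and all n elements. */
    miniset *s = malloc(sizeof(miniset) + n * sizeof(int16_t));
    if (s == NULL) return 1;
    s->length = n;
    for (uint32_t i = 0; i < n; i++) s->contents[i] = (int16_t)(i * 10);
    printf("length=%u, last element=%d\n", (unsigned)s->length, (int)s->contents[s->length - 1]);
    free(s);
    return 0;
}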

So far, we have learned about the data structure optimizations Redis has made to reduce memory usage, which are SDS embedded strings, ziplist, and intset.

In addition to optimizing data structures, Redis also strives to save memory usage in data access. Let’s learn about that next.

Memory-efficient Data Access #

As we know, while a Redis instance is running, some data is accessed very frequently, such as small integers and common reply messages in the Redis protocol, including the reply to successful operations (the “OK” string), the reply to failed operations (“ERR”), and common error messages.

To avoid repeatedly creating these frequently accessed data in memory, Redis adopts the design concept of shared objects. This design concept is simple: create these commonly used data as shared objects, and the upper-layer application can directly read them when needed.

Let’s make an assumption: there are 1,000 clients, and they all need to save the integer 3. If Redis created a separate redisObject holding 3 for each client, memory would be full of redundant copies. With the shared object method, Redis only needs to keep one redisObject holding 3 in memory, which effectively saves memory space.

The following code shows the function createSharedObjects in the server.c file, which creates shared objects. You can take a look.

void createSharedObjects(void) {
   ...
   // Common reply messages
   shared.ok = createObject(OBJ_STRING, sdsnew("+OK\r\n"));
   shared.err = createObject(OBJ_STRING, sdsnew("-ERR\r\n"));
   ...
   // Common error messages
   shared.nokeyerr = createObject(OBJ_STRING, sdsnew("-ERR no such key\r\n"));
   shared.syntaxerr = createObject(OBJ_STRING, sdsnew("-ERR syntax error\r\n"));
   // Integers from 0 to 9999
   for (j = 0; j < OBJ_SHARED_INTEGERS; j++) {
        shared.integers[j] = makeObjectShared(createObject(OBJ_STRING, (void*)(long)j));
        ...
    }
   ...
}
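As a usage note: when Redis later encodes a string object and finds that it actually holds a small integer, the tryObjectEncoding function in object.c can simply reuse the matching shared object instead of creating a new one. Roughly simplified (the real code also checks the maxmemory policy, because the LRU/LFU policies need per-object idle information and therefore avoid shared objects), the relevant branch looks like this:

// Inside tryObjectEncoding(), after the string has been parsed into the integer "value"
if (value >= 0 && value < OBJ_SHARED_INTEGERS) {
    decrRefCount(o);                       // drop the object we were about to encode
    incrRefCount(shared.integers[value]);  // reuse the shared integer object instead
    return shared.integers[value];
}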

Summary #

For in-memory databases like Redis, reducing memory overhead is very important. In today’s lesson, we learned two methods Redis uses to optimize memory usage: memory-friendly data structure design and memory-saving shared data access.

When it comes to designing data structures that save memory, Redis gives us two excellent design ideas. The first is using contiguous memory space to avoid the overhead of memory fragmentation. The second is using metadata of different sizes for data of different lengths, instead of wasting space on uniform-sized metadata.

In terms of data access, you should also know that using shared objects can avoid creating redundant data, thereby effectively saving memory space. However, shared objects are mainly suitable for read-only scenarios. If a string is repeatedly modified, it cannot be accessed by multiple requests in a shared manner. So, you also need to pay attention to this point when applying it in your application.

One Question per Lesson #

SDS uses a condition of 44 bytes to determine whether to use embedded strings. Do you know why it is 44 bytes?

Feel free to share your thought process in the comment section, and let’s exchange and discuss together. If you find it helpful, you’re also welcome to share today’s content with more friends.