10 List Usage and Internal Implementation Principles

10 List Usage and Internal Implementation Principles #

The List type is an ordered structure stored using a linked list. Its elements are inserted into the linked list in the order they are added, so the time complexity for element operations (insertion/deletion) is O(1), making it relatively fast. However, the time complexity for querying is O(n), which means querying can be slower.

1 Basic Usage #

Using the List type is relatively simple. Operations on it are similar to operating on a collection of values without any key values, as shown in the diagram below: List Usage - List Structure Diagram.png

1) Add one or more elements to the list #

Syntax: lpush key value [value …] Example:

127.0.0.1:6379> lpush list 1 2 3
(integer) 3

2) Add one or more elements to the end of the list #

Syntax: rpush key value [value …] Example:

127.0.0.1:6379> rpush list2 1 2 3
(integer) 3

3) Return the elements within a specified range in the list #

Syntax: lrange key start stop Example:

127.0.0.1:6379> lrange list 0 -1
"3"
"2"
"1"
127.0.0.1:6379> lrange list2 0 -1
"1"
"2"
"3"

Here, -1 represents the last element in the list.

4) Get and remove the first element of the list #

Syntax: lpop key Example:

127.0.0.1:6379> lrange list 0 -1
1) "d"
2) "c"
3) "b"
4) "a"
127.0.0.1:6379> lpop list
"d"
127.0.0.1:6379> lrange list 0 -1
1) "c"
2) "b"
3) "a"

5) Get and remove the last element of the list #

Syntax: rpop key Example:

127.0.0.1:6379> lrange list 0 -1
1) "c"
2) "b"
3) "a"
127.0.0.1:6379> rpop list
"a"
127.0.0.1:6379> lrange list 0 -1
1) "c"
2) "b"

6) Get the element at a specified index #

Syntax: lindex key index Example:

127.0.0.1:6379> rpush list3 a b c
(integer) 3
127.0.0.1:6379> lindex list3 0
"a"

For more commands, please see the appendix.

2 Code Practical #

Let’s look at the usage of the List type in Java. First, add the Jedis framework and use the following code:

public class ListExample {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        // Declare Redis key
        final String REDISKEY = "list";
        // Add one or more elements to the front
        Long lpushResult = jedis.lpush(REDISKEY, "Java", "Sql");
        System.out.println(lpushResult); // Output: 2
        // Get the value of the element at index 0
        String idValue = jedis.lindex(REDISKEY, 0);
        System.out.println(idValue); // Output: Sql
        // Query elements within a specified range
        List<String> list = jedis.lrange(REDISKEY, 0, -1);
        System.out.println(list); // Output: [Sql, Java]
        // Add element MySQL before Java
        jedis.linsert(REDISKEY, ListPosition.BEFORE, "Java", "MySQL");
        System.out.println(jedis.lrange(REDISKEY, 0, -1)); // Output: [Sql, MySQL, Java]
        jedis.close();
    }
}

The program output is as follows:

2 Sql [Sql, Java] [Sql, MySQL, Java]

3 Internal Implementation #

First, use debug encoding key to check the internal storage type of the list. Here is an example:

127.0.0.1:6379> object encoding list
"quicklist"

From the result, we can see that the underlying data type of the list is a quicklist.

Quicklist is a data type introduced in Redis 3.2. The earlier list type was composed of a ziplist (compressed list) and a doubly linked list. Redis 3.2 replaced it with a quicklist to store list elements.

Let’s take a look at the implementation source code for quicklist:

typedef struct quicklist { // src/quicklist.h
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;        /* number of ziplists */
    unsigned long len;          /* number of quicklist nodes */
    unsigned int compress : 16; /* LZF compression level */
    //...
} quicklist;
typedef struct quicklistNode {
    struct quicklistNode *prev;
    struct quicklistNode *next;
    unsigned char *zl;           /* corresponding ziplist */
    unsigned int sz;             /* ziplist size in bytes */
    unsigned int count : 16;     /* element count */
    unsigned int encoding : 2;   /* RAW==1 or LZF==2 */
    unsigned int container : 2;  /* NONE==1 or ZIPLIST==2 */
    unsigned int recompress : 1; /* if this node was compressed before */
    unsigned int attempted_compress : 1; /* if node is too small to compress */
    //...
} quicklistNode;
typedef struct quicklistLZF {
    unsigned int sz; 
    char compressed[];
} quicklistLZF;

From the above source code, we can see that quicklist is a doubly linked list, and each node in the list is actually a ziplist. Their structures are shown in the following diagram: List Usage - quicklist Structure Diagram.png

ziplist, as the actual storage structure of quicklist, is essentially a byte array. The data structure of ziplist is shown in the following diagram:

List Usage - Ziplist Structure Diagram.png

The meanings of the fields are as follows:

zlbytes: Length of the ziplist in bytes, 4 bytes in length;
zltail: The offset of the tail element relative to the start element address, occupying 4 bytes.
zllen: The number of elements in the compressed list.
entryX: All elements stored in the compressed list, which can be byte arrays or integers.
zlend: The end of the compressed list, occupying 1 byte.

4 Source Code Analysis #

Let’s take a look at the source code implementation of the list type.

1) Analysis of Add Functionality Source Code #

The function corresponding to the add operation in quicklist is quicklistPush, the source code is as follows:

void quicklistPush(quicklist *quicklist, void *value, const size_t sz,
                   int where) {
    if (where == QUICKLIST_HEAD) {
        // Add element at the head of the list
        quicklistPushHead(quicklist, value, sz);
    } else if (where == QUICKLIST_TAIL) {
        // Add element at the tail of the list
        quicklistPushTail(quicklist, value, sz);
    }
}

Taking quicklistPushHead as an example, the source code is as follows:

int quicklistPushHead(quicklist *quicklist, void *value, size_t sz) {
    quicklistNode *orig_head = quicklist->head;
    if (likely(
            _quicklistNodeAllowInsert(quicklist->head, quicklist->fill, sz))) {
        // Insert element at the head node
        quicklist->head->zl =
            ziplistPush(quicklist->head->zl, value, sz, ZIPLIST_HEAD);
        quicklistNodeUpdateSz(quicklist->head);
    } else {
        // The head node cannot be inserted, so a new quicklistNode and ziplist are needed for insertion
        quicklistNode *node = quicklistCreateNode();
        node->zl = ziplistPush(ziplistNew(), value, sz, ZIPLIST_HEAD);
        quicklistNodeUpdateSz(node);
        // Insert the newly created quicklistNode into the quicklist structure
        _quicklistInsertNodeBefore(quicklist, quicklist->head, node);
    }
    quicklist->count++;
    quicklist->head->count++;
    return (orig_head != quicklist->head);
}

The execution flow of the quicklistPushHead function is as follows: first, it checks if the head node of the quicklist can insert data. If it can, it uses the ziplist interface to insert the data. Otherwise, it creates a new quicklistNode node for insertion.

The parameters of the function are the quicklist to be inserted, the value to be inserted, and its size.

The return value of the function is an int, with 0 indicating that no new head has been created and 1 indicating that a new head has been created. The execution flow of quicklistPushHead is shown in the following diagram:

列表类型使用-插入流程图.png

2) Analysis of Delete Functionality Source Code #

There are two cases for deleting elements in quicklist: deleting a single element and deleting a range of elements, both of which are located in the src/quicklist.c file.

① Deleting a Single Element #

The function for deleting a single element is quicklistDelEntry, the source code is as follows:

void quicklistDelEntry(quicklistIter *iter, quicklistEntry *entry) {
    quicklistNode *prev = entry->node->prev;
    quicklistNode *next = entry->node->next;
    // Delete the element at the specified position
    int deleted_node = quicklistDelIndex((quicklist *)entry->quicklist,
                                         entry->node, &entry->zi);
    //...
}

It can be seen that the bottom layer of the quicklistDelEntry function relies on the quicklistDelIndex function to delete the element.

② Deleting a Range of Elements #

The function for deleting a range of elements is quicklistDelRange, the source code is as follows:

// start represents the starting index for deletion, count represents the number of elements to delete
void quicklistDelRange(quicklist *quicklist, quicklistEntry *start,
                       quicklistEntry *end) {
    //...
}

int quicklistDelRange(quicklist *quicklist, const long start,
                      const long count) {
    if (count <= 0)
        return 0;
    unsigned long extent = count;
    if (start >= 0 && extent > (quicklist->count - start)) {
        // If the number of elements to be deleted is greater than the number of elements present
        extent = quicklist->count - start;
    } else if (start < 0 && extent > (unsigned long)(-start)) {
        // Delete the specified number of elements
        extent = -start; /* c.f. LREM -29 29; just delete until end. */
    }
    // ...
    // extent is the number of remaining elements to be deleted
    while (extent) {
        // Save the next quicklistNode because the current one may be deleted
        quicklistNode *next = node->next;
        unsigned long del;
        int delete_entire_node = 0;
        if (entry.offset == 0 && extent >= node->count) {
            // Delete the entire quicklistNode
            delete_entire_node = 1;
            del = node->count;
        } else if (entry.offset >= 0 && extent >= node->count) {
            // Delete all elements in this node
            del = node->count - entry.offset;
        } else if (entry.offset < 0) {
            // entry.offset < 0 indicates delete from the end, otherwise it indicates the number of remaining elements from the beginning
            del = -entry.offset;
            if (del > extent)
                del = extent;
        } else {
            // Delete a portion of the current node's elements
            del = extent;
        }
        D("[%ld]: asking to del: %ld because offset: %d; (ENTIRE NODE: %d), "
          "node count: %u",
          extent, del, entry.offset, delete_entire_node, node->count);
        if (delete_entire_node) {
            __quicklistDelNode(quicklist, node);
        } else {
            quicklistDecompressNodeForUse(node);
            node->zl = ziplistDeleteRange(node->zl, entry.offset, del);
            quicklistNodeUpdateSz(node);
            node->count -= del;
            quicklist->count -= del;
            quicklistDeleteIfEmpty(quicklist, node);
            if (node)
                quicklistRecompressOnly(quicklist, node);
        }
        // Update the number of remaining elements to be deleted
        extent -= del;
        // Move to the next quicklistNode
        node = next;
        // Start deleting from the beginning of the next quicklistNode
        entry.offset = 0;
    }
    return 1;
}


From the above code, it can be seen that when deleting a range in Quicklist, it first finds the quicklistNode where start is located, and calculates whether the number of deleted elements is less than count. If the number of elements to be deleted is not met, it moves on to the next quicklistNode to continue deleting, looping until the deletion is completed.

The return value of quicklistDelRange function is of type int. When 1 is returned, it indicates that the specified range of elements has been successfully deleted. When 0 is returned, it means that no elements have been deleted.

#### 3）More Source Code

In addition to the several commonly used functions mentioned above, there are more functions, such as:

 * quicklistCreate: Create quicklist;
 * quicklistInsertAfter: Add data after a certain element;
 * quicklistInsertBefore: Add data before a certain element;
 * quicklistPop: Retrieve and delete the first or last element of the list;
 * quicklistReplaceAtIndex: Replace a certain element.

### 5 Use Cases

There are two typical use cases for lists:

 * Message queues: The list type can use rpush to implement the first-in-first-out function, and lpop can easily pop (query and delete) the first element, so the list type can be used to implement message queues;
 * Article lists: For blog sites, as the number of users and articles increases, in order to speed up the response speed of the program, we can store the user's own articles in the List, because the List is an ordered structure, so this can perfectly implement the paging function, thereby speeding up the response speed of the program.

### 6 Summary

Through this article, we can know that the list type is not a simple doubly-linked list, but uses the quicklist data structure to store and retrieve data. The quicklist is a new data type added in Redis 3.2. It uses a storage structure of compressed lists plus doubly-linked lists. To store more data, the quicklist compresses each quicklistNode, allowing for efficient storage of more message queue or article data.