Extra Meal 04 How Redis Clients Exchange Commands and Data With Server

In the previous lessons, we mainly studied the mechanism and key technologies of the Redis server, with little emphasis on the client-side issues. However, Redis adopts the typical client-server architecture, where the client sends requests to the server and the server responds to the client.

If we want to develop the Redis client further, such as adding new commands, we need to understand how the commands and data involved in requests and responses are encoded and transmitted between the client and the server. Otherwise, the newly added commands on the client-side will not be recognized and processed by the server.

Redis uses RESP (Redis Serialization Protocol) to define the encoding format of the commands and data exchanged between the client and the server. In Redis 2.0, RESP officially became the standard communication protocol between the client and the server. The version of the protocol used from Redis 2.0 to Redis 5.0 is referred to as RESP 2. Redis 6.0 added support for RESP 3. However, Redis 6.0 only reached general availability in 2020, so the most widely used protocol is still RESP 2.

In this lesson, I will focus on introducing the specification requirements of RESP 2 protocol and the improvements of RESP 3 compared to RESP 2.

First, let’s take a look at what content is involved in the interaction between the client and the server, as the encoding formats vary depending on the content.

What are the contents of the interaction between the client and the server? #

To help you better understand how the RESP 2 protocol encodes commands and data, we can categorize the interaction into two types: client requests and server responses.

In client requests, the client sends commands, keys, and values to Redis. In server responses, the Redis instance returns the values that were read, OK indicators, the number of successfully written elements, error messages, and redirection replies (e.g., the MOVED reply in Redis Cluster).

In fact, these interaction contents can be further divided into seven categories. Let’s explore them:

Commands: These are operation commands for different data types. For example, SET and GET for String types, HSET and HGET for Hash types. These commands are represented as strings that denote the semantic operation.
Single values: These correspond to String type data, where the data itself can be strings, numbers (integers or floating-point numbers), booleans (True or False), etc.
Set values: These correspond to List, Hash, Set, and Sorted Set type data. They not only contain multiple values but each value can also be a string, number, boolean, etc.
OK response: This corresponds to the successful result of a command operation, which is represented as a string “OK”.
Integer response: There are two situations here. One is when the result of a command operation is an integer, for example, when the LLEN command returns the length of a list. The other situation is when a set command operation is successful, and it returns the actual number of elements operated on, for example, the SADD command returns the number of successfully added elements.
Error message: This is the result returned when a command operation encounters an error. It includes the “error” flag and the specific error message.
Bulk response: This corresponds to binary data such as files, images, etc. It is a binary-safe string in the Redis protocol.

Now that you know what these seven categories of contents are, let’s take a look at three specific examples to help you further understand these interaction contents.

Let’s start with the first example, which involves the following command:

# Successfully write data of type String and return OK
127.0.0.1:6379> SET testkey testvalue
OK

The interaction contents here include the command (SET command), key (single value String type key testkey), and value (String type value testvalue). The server directly returns an OK response.

The second example is executing the HSET command:

# Successfully write data of type Hash and return the actual number of written set elements
127.0.0.1:6379> HSET testhash a 1 b 2 c 3
(integer) 3

The interaction contents here include a Hash set value of three key-value pairs (a 1 b 2 c 3), and the server returns an integer response (3), indicating the number of elements successfully written.

The last example is executing the PUT command, as shown below:

# The command is incorrect, returning an error message
127.0.0.1:6379> PUT testkey2 testvalue
(error) ERR unknown command `PUT`, with args beginning with: `testkey2`, `testvalue`

Here, the interaction contents include an error message. Since the Redis instance does not support the PUT command, the server reports an error and returns the specific error message, stating that the command “PUT” is unknown.

Alright, now you understand the contents of the interaction between the Redis client and server. Next, let’s take a look at the format specifications used by RESP 2 to encode these contents.

Encoding Format Specification for RESP 2 #

The design goal of the RESP 2 protocol is to make it simple and convenient for Redis developers to implement clients, thereby reducing the occurrence of bugs during client development. Furthermore, when there are issues during the interaction between the client and the server, it is desired that developers can quickly locate the problem and facilitate debugging by examining the protocol interaction process. To achieve this goal, the RESP 2 protocol uses a text-based encoding format with good readability, that is, it uses a series of strings to represent various commands and data.

However, there are multiple types of interactions, and the actual transmitted commands or data can also be numerous. In response to these two situations, the RESP 2 protocol has designed two basic specifications for encoding.

To encode different types of interaction content, the RESP 2 protocol implements 5 encoding format types. To differentiate these 5 encoding types, RESP 2 uses a dedicated character as the starting character for each encoding type. This way, when the client or server parses the encoded data, they can directly determine the current parsing encoding type by examining the starting character.

RESP 2 encodes content at the granularity of a single command or a single piece of data, and appends the two-character terminator “\r\n” (carriage return plus line feed, also written as CRLF) to each encoded result to mark where it ends.

Next, let me introduce these 5 encoding types separately.

Simple String (RESP Simple Strings)

This type is encoded using a single string. For example, the “OK” reply used as an indication of a successful operation executed by the server in response to a request is encoded using this type.

When the server successfully executes an operation and returns the OK reply, it can be encoded as follows:

+OK\r\n

Bulk String (RESP Bulk Strings)

This type is encoded using a binary-safe string. The term “binary-safe” here is actually in relation to how strings are handled in the C programming language. Let me explain this in detail.

When Redis parses a string, it does not treat “\0” as the end of a string, as the C programming language does. Redis will interpret “\0” as a normal 0 character and use additional property values to represent the actual length of the string.

For example, for the string “Redis\0Cluster\0”, the C language would parse it as “Redis” and stop at the first “\0”. Redis keeps all 14 bytes, including both “\0” characters, and uses the len property to record that the actual length of the string is 14 bytes.

[Figure: Redis Bulk String]

This way, even if the string contains the “\0” character, Redis does not treat it as the end of the string and stop parsing, which would silently truncate the data. Simple strings, by contrast, are not binary-safe: they cannot contain the “\r” or “\n” characters, because those mark the end of the encoding.
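The difference between the two ways of reading the same bytes can be illustrated with a small Python sketch. The helper names `c_style_view` and `length_prefixed_view` are hypothetical and only mimic the two behaviours; they are not Redis code.

```python
# The example string from the text: 14 bytes, two of them NUL (0x00).
data = b"Redis\0Cluster\0"

def c_style_view(buf: bytes) -> bytes:
    """Mimic C string handling: stop at the first NUL byte."""
    return buf.split(b"\0", 1)[0]

def length_prefixed_view(buf: bytes, length: int) -> bytes:
    """Mimic a length-based read: take exactly `length` bytes."""
    return buf[:length]

print(len(data))                              # 14
print(c_style_view(data))                     # b'Redis'
print(length_prefixed_view(data, len(data)))  # b'Redis\x00Cluster\x00'
```

The length-based read returns every byte unchanged, which is what binary-safe means here.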

Bulk strings can have a maximum length of 512MB, so they can be used to encode large amounts of data, which is suitable for fulfilling the data volume requirements of key-value pairs. Therefore, RESP 2 uses this type to encode keys or values in the interaction content, and uses the “$” character as the starting character, followed immediately by a number, which represents the actual length of the string.

For example, when we use the GET command to retrieve a value (assuming the key is “testkey”), the String value returned by the server is encoded as follows. The number 9 after the “$” character is the data length in bytes; it is followed by CRLF, then the data itself, then a final CRLF:

$9\r\ntestvalue\r\n
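As a minimal sketch of this layout, a bulk string is the “$” character, the byte length, CRLF, the raw data, and a final CRLF. The function name `encode_bulk_string` is an illustrative assumption, not part of any Redis client library.

```python
def encode_bulk_string(data: bytes) -> bytes:
    # Layout: "$" + <length in bytes> + CRLF + <raw data> + CRLF
    header = b"$" + str(len(data)).encode("ascii") + b"\r\n"
    return header + data + b"\r\n"

print(encode_bulk_string(b"testvalue"))
# b'$9\r\ntestvalue\r\n'

# The length prefix counts bytes, so embedded NUL bytes are preserved.
print(encode_bulk_string(b"Redis\0Cluster\0"))
# b'$14\r\nRedis\x00Cluster\x00\r\n'
```

Because the receiver reads exactly the announced number of bytes, the payload may contain any byte value, including “\0”, “\r”, and “\n”.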

Integer (RESP Integer)

This type is also a string but represents a signed 64-bit integer. To distinguish it from simple strings that contain numbers, the integer type uses the “:” character as the starting character and can be used to encode integer replies returned by the server.

For example, in the example we previously introduced, when we use the HSET command to set three elements of “testhash”, the actual encoded result returned by the server is as follows:

:3\r\n

Error (RESP Errors)

It is a string that includes an error type and specific error information. This type is used to encode error responses from the Redis server. RESP 2 uses the “-” character as its starting character.

For example, in the previous example, when we execute the PUT testkey2 testvalue command in redis-cli and encounter an error, the actual encoded error response sent by the server to the client is as follows:

-ERR unknown command `PUT`, with args beginning with: `testkey2`, `testvalue`\r\n

Here, “ERR” is the error type, indicating a general error, and the text after “ERR” is the specific error message.
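A client can split an error reply into its type and message by cutting at the first space. The sketch below assumes the whole reply line is already in memory; `parse_error` is a hypothetical helper name.

```python
from typing import Tuple

def parse_error(line: bytes) -> Tuple[str, str]:
    """Split a RESP 2 error reply into (error type, error message)."""
    if not line.startswith(b"-") or not line.endswith(b"\r\n"):
        raise ValueError("not a RESP 2 error reply")
    text = line[1:-2].decode("utf-8")       # drop the leading "-" and the CRLF
    kind, _, message = text.partition(" ")  # the first word is the error type
    return kind, message

print(parse_error(b"-ERR unknown command\r\n"))
# ('ERR', 'unknown command')
print(parse_error(b"-WRONGTYPE Operation against a key holding the wrong kind of value\r\n"))
# ('WRONGTYPE', 'Operation against a key holding the wrong kind of value')
```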

Array (RESP Arrays)

This is an array that contains multiple elements. Each element can be any of the 4 encoding types mentioned above, or another array, so arrays can be nested.

Array encoding types can be used when clients send requests and servers return results. When clients perform request operations, they generally include both the command and the data to be operated on. Since the array type contains multiple elements, it is suitable for encoding the sent command and data. To distinguish it from other types, RESP 2 uses the “*” character as the starting character.

For example, when we execute the command GET testkey, the encoding result of the command sent by the client to the server is encoded using the array type, as shown below:

*2\r\n$3\r\nGET\r\n$7\r\ntestkey\r\n

Here, the leading “*” character indicates that the encoding result is an array type, and the number 2 indicates that the array has 2 elements, corresponding to the command GET and the key testkey. Both the command GET and the key testkey are encoded using the bulk string type, which is represented by the “$” character followed by the string length.
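The request encoding described here can be sketched in a few lines of Python. This is an illustration under the assumption that all arguments are text encoded as UTF-8; `encode_command` is a hypothetical name, not an API of any particular client.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command and its arguments as a RESP 2 array of bulk strings."""
    parts = [b"*" + str(len(args)).encode("ascii") + b"\r\n"]
    for arg in args:
        data = arg.encode("utf-8")
        # Each element: "$" + <byte length> + CRLF + <data> + CRLF
        parts.append(b"$" + str(len(data)).encode("ascii") + b"\r\n" + data + b"\r\n")
    return b"".join(parts)

print(encode_command("GET", "testkey"))
# b'*2\r\n$3\r\nGET\r\n$7\r\ntestkey\r\n'
```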

Similarly, when the server returns collection-type data that contains multiple elements, it will also use the “*” character and the number of elements as an identifier, and encode the returned set elements using bulk string type.
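On the receiving side, a client decodes a reply by reading the starting character and then handling the rest according to the type. The following is a minimal recursive sketch that assumes the complete reply is already in a byte buffer; a real client also has to deal with partial reads from the socket. `parse_reply` and `RespError` are hypothetical names.

```python
class RespError(Exception):
    """Hypothetical wrapper for an error reply sent by the server."""

def parse_reply(buf: bytes, pos: int = 0):
    """Parse one RESP 2 reply starting at buf[pos]; return (value, next_pos)."""
    end = buf.index(b"\r\n", pos)
    kind = buf[pos:pos + 1]
    line = buf[pos + 1:end].decode("utf-8")
    pos = end + 2                            # skip past the CRLF
    if kind == b"+":                         # simple string
        return line, pos
    if kind == b"-":                         # error
        return RespError(line), pos
    if kind == b":":                         # integer
        return int(line), pos
    if kind == b"$":                         # bulk string
        length = int(line)
        if length == -1:                     # "$-1" is the RESP 2 null bulk string
            return None, pos
        return buf[pos:pos + length], pos + length + 2
    if kind == b"*":                         # array, possibly nested
        count = int(line)
        if count == -1:                      # "*-1" is the RESP 2 null array
            return None, pos
        items = []
        for _ in range(count):
            item, pos = parse_reply(buf, pos)
            items.append(item)
        return items, pos
    raise ValueError(f"unknown type byte: {kind!r}")

value, _ = parse_reply(b"*2\r\n$1\r\na\r\n$1\r\n1\r\n")
print(value)  # [b'a', b'1']
```

Bulk strings are returned as raw bytes so that binary data passes through unchanged.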

Alright, now you understand the 5 encoding types of the RESP 2 protocol and their corresponding starting characters. I have summarized them in the table below for your reference.

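Encoding types and their starting characters:

Simple String: starts with “+”, for example the OK reply.
Bulk String: starts with “$”, used for binary-safe keys and values.
Integer: starts with “:”, used for integer replies.
Error: starts with “-”, used for error replies.
Array: starts with “*”, used for commands with their arguments and for collection replies.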
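Since the type is determined entirely by the first character, a parser can start with a simple lookup. The sketch below is only an illustration of that idea; `RESP2_TYPES` and `classify` are hypothetical names.

```python
# Map the first byte of a RESP 2 message to the name of its encoding type.
RESP2_TYPES = {
    b"+": "simple string",
    b"$": "bulk string",
    b":": "integer",
    b"-": "error",
    b"*": "array",
}

def classify(message: bytes) -> str:
    """Return the RESP 2 type name indicated by the first byte."""
    if not message.endswith(b"\r\n"):
        raise ValueError("a RESP 2 message must end with CRLF")
    try:
        return RESP2_TYPES[message[:1]]
    except KeyError:
        raise ValueError(f"unknown type byte: {message[:1]!r}") from None

print(classify(b"+OK\r\n"))                                  # simple string
print(classify(b":3\r\n"))                                   # integer
print(classify(b"*2\r\n$3\r\nGET\r\n$7\r\ntestkey\r\n"))     # array
```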

Redis 6.0 introduced support for the RESP 3 protocol, which improves on RESP 2. Let’s explore what specific improvements have been made.

Limitations of RESP 2 and Improvements in RESP 3 #

Although we just mentioned that the RESP 2 protocol provides 5 encoding types, it is actually not enough. After all, there are many other basic data types, such as floating-point numbers and boolean values. The limited encoding types cause two problems.

On one hand, in terms of the basic data types of values, RESP 2 can only differentiate between strings and integers. For other data types, clients using RESP 2 need to perform additional conversion operations. For example, a floating-point number is transmitted as a string, so the client has to check whether the string’s content is a valid number and then convert the string into an actual floating-point value.

On the other hand, RESP 2 uses the array encoding type to represent all collection types. However, Redis’s collection types include List, Hash, Set, and Sorted Set. When the client receives an array-encoded result, it has to work out which collection type the returned array represents from the command it sent.

Let’s take an example. Suppose there is a key of Hash type called “testhash” with collection elements of a:1, b:2, and c:3. And there is a key of Sorted Set type called “testzset” with collection elements of a, b, and c, and their scores are 1, 2, and 3 respectively. When we read their results in the redis-cli client, they are returned in the form of an array, as shown below:

127.0.0.1:6379> HGETALL testhash
1) "a"
2) "1"
3) "b"
4) "2"
5) "c"
6) "3"

127.0.0.1:6379> ZRANGE testzset 0 3 withscores
1) "a"
2) "1"
3) "b"
4) "2"
5) "c"
6) "3"

In order to handle the returned data as Hash and Sorted Set types in the client’s code, the client also needs to convert these array-encoded results into the corresponding Hash set and Sorted Set based on the sent command operations HGETALL and ZRANGE, which adds additional overhead to the client.
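The conversion a RESP 2 client has to perform can be sketched as follows. Both replies arrive as the same flat array, and only the command that was sent tells the client which structure to build. The helper names are hypothetical.

```python
def flat_to_hash(flat):
    """Pair up [field, value, field, value, ...] into a dict (for HGETALL)."""
    return dict(zip(flat[0::2], flat[1::2]))

def flat_to_scored_members(flat):
    """Pair up [member, score, ...] into (member, float score) tuples (for ZRANGE ... WITHSCORES)."""
    return [(member, float(score)) for member, score in zip(flat[0::2], flat[1::2])]

# The same flat array is returned for both commands in the example above.
reply = ["a", "1", "b", "2", "c", "3"]

print(flat_to_hash(reply))
# {'a': '1', 'b': '2', 'c': '3'}
print(flat_to_scored_members(reply))
# [('a', 1.0), ('b', 2.0), ('c', 3.0)]
```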

Starting from Redis 6.0, the RESP 3 protocol adds support for more data types, including null values, floating-point numbers, boolean values, maps (dictionaries), unordered sets, and more. RESP 3 also distinguishes data types by their starting characters. For example, when the first character is “,”, the encoded result that follows is a floating-point number. With this improvement, the client no longer needs to inspect string contents to decide how to convert the data, which improves client efficiency.
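To make the idea concrete, here is a minimal sketch that decodes a few RESP 3 scalar types by their starting character: “,” for floating-point numbers, “#” for booleans (encoded as t or f), “_” for null, and “:” for integers. It covers only these four cases and assumes the full line is available; `parse_resp3_scalar` is a hypothetical name.

```python
def parse_resp3_scalar(line: bytes):
    """Decode a single-line RESP 3 scalar based on its first byte."""
    if not line.endswith(b"\r\n"):
        raise ValueError("a RESP 3 line must end with CRLF")
    kind, body = line[:1], line[1:-2].decode("ascii")
    if kind == b",":                 # double
        return float(body)
    if kind == b"#":                 # boolean: "t" or "f"
        if body == "t":
            return True
        if body == "f":
            return False
        raise ValueError(f"invalid boolean body: {body!r}")
    if kind == b"_":                 # null
        return None
    if kind == b":":                 # integer
        return int(body)
    raise ValueError(f"type byte not handled in this sketch: {kind!r}")

print(parse_resp3_scalar(b",3.3\r\n"))  # 3.3
print(parse_resp3_scalar(b"#t\r\n"))    # True
print(parse_resp3_scalar(b"_\r\n"))     # None
```

The type comes from the first byte, so no guessing about the string content is needed.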

Summary #

In this lesson, we learned about the RESP 2 protocol. This protocol defines the encoding format for command and data interactions between Redis clients and servers. RESP 2 provides 5 types of encoding formats, including Simple String, Bulk String, Integer, Error, and Array types. To distinguish between these 5 types, RESP 2 protocol uses 5 different characters as the first character of the encoded result for each type, namely +, $, :, -, and *.

RESP 2 protocol is a text-based protocol that is simple to implement, reduces the likelihood of bugs in client development, and has strong readability, making it easy for development and debugging. When you need to develop a customized Redis client, it is necessary to understand and master the RESP 2 protocol.

One drawback of the RESP 2 protocol is that it supports a limited number of types. Therefore, Redis 6.0 introduced the RESP 3 protocol. Compared to RESP 2, RESP 3 adds support for more data types, such as floating-point numbers, booleans, maps, and unordered sets. Note that Redis 6.0 still supports RESP 2: a connection starts in RESP 2 by default, and a client switches to RESP 3 by sending the HELLO 3 command. An existing RESP 2 client can therefore keep working with Redis 6.0, and you only need a client that supports RESP 3 if you want to use the new types.

Finally, I will provide you with a handy tool. If you want to view the RESP 2 encoded result of the data returned by the server, you can use the telnet command to connect to the Redis instance and execute the following command:

telnet instanceIP instancePort

Then, you can send commands to the instance and see the returned results encoded using the RESP 2 protocol. Of course, you can also use telnet to send commands written in RESP 2 protocol to the Redis instance, and the instance will handle them as well. You can try it out later as an exercise.

One question per class #

As usual, I have a small question for you. Assume there is List type data in the Redis instance with the key “mylist”, whose value consists of 5 elements pushed into the list using the LPUSH command: 1, 2, 3.3, 4, hello. What will be the encoded result returned to the client when executing the LRANGE mylist 0 4 command?

Feel free to leave your thoughts and answers in the comments below. Let’s discuss and exchange ideas together. If you found today’s content helpful, feel free to share it with your friends or colleagues. See you in the next class.