16 Two Tools That Kill Most Development Issues Instantly: Documentation and Test Cases #

Hello, I’m Wen Ming. After learning about the principles and several important concepts of OpenResty, we are finally going to start learning about APIs.

In my personal experience, learning OpenResty’s APIs is relatively easy, so it hasn’t taken up too much space in this column. You may wonder: Aren’t APIs the most commonly used and important part? Why isn’t there much emphasis on them?

Actually, this is mainly due to two reasons.

First, OpenResty provides very detailed documentation. Compared to many other development languages or platforms, OpenResty not only provides parameter and return value definitions for APIs, but also provides complete and runnable code examples, clearly showing how APIs handle various edge cases.

Putting example code and notes directly below the API definition is the consistent style of the OpenResty documentation. After reading an API description, you can immediately run the example in your own environment and change the parameters, checking the results against the documentation to deepen your understanding and memory.

Second, in addition to the documentation, OpenResty also provides a comprehensive set of test cases. Just now I mentioned that the OpenResty documentation provides code examples for APIs, but due to limited space, it does not cover how multiple APIs work together or how to handle various exceptional situations.

However, there is no need to worry, as you can find most of this content in the test case suite.

For OpenResty developers, the best learning resources for APIs are the official documentation and test cases, which are professional and user-friendly enough. Therefore, it wouldn’t make much sense if I simply translated the documentation into Chinese and presented it in this column.

It is better to teach people to fish than to give them fish, so I hope to pass on general methods and experience instead. Let’s walk through a real example from OpenResty development to see how to get the most out of the documentation and the test suite.

shdict get API #

The shared dict is a Lua dictionary object backed by an NGINX shared memory zone. It can be accessed by multiple workers and is commonly used to store data for rate limiting, throttling, and caching. There are more than 20 APIs related to the shared dict, making it one of the most commonly used and important API sets in OpenResty.

Let’s take the simplest operation, get, as an example. You can refer to the official documentation as we go. The following minimal example is adapted from it:

http {
    lua_shared_dict dogs 10m;
    server {
        location /demo {
            content_by_lua_block {
                local dogs = ngx.shared.dogs
                dogs:set("Jim", 8)
                local v = dogs:get("Jim")
                ngx.say(v)
            }
        }
    }
}

Let me explain a bit. Before using the shared dict in Lua code, we need to allocate a block of memory with the lua_shared_dict directive in nginx.conf. In this example, the shared dict is named “dogs” and has a size of 10 MB. After modifying nginx.conf, you need to restart the process before you can see the result in a browser or with curl.

Doesn’t this step seem a bit cumbersome? Let’s do the same thing more directly with the resty CLI, where the --shdict option declares the shared dict. The result is the same as embedding the code in nginx.conf.

$ resty --shdict 'dogs 10m' -e 'local dogs = ngx.shared.dogs
 dogs:set("Jim", 8)
 local v = dogs:get("Jim")
 ngx.say(v)
 '

Now you know how nginx.conf and Lua code work together, and you have successfully executed the set and get methods of the shared dict. Generally speaking, most developers stop at this point and don’t delve deeper.

In fact, there are still a few things worth noting here, such as:

  1. In which phases can the shared memory APIs not be used?
  2. In our example code, we receive only one return value from get. When does it return more than one?
  3. What type must the argument to get be? Is there a length limit?

Don’t underestimate these questions. They can give us a better understanding of OpenResty. Next, I will explain each of them one by one.

Let’s first look at the first question. The answer is straightforward. In the documentation, there is a context section that specifically lists the environments in which you can use the API:

context: set_by_lua*, rewrite_by_lua*, access_by_lua*, content_by_lua*, header_filter_by_lua*, body_filter_by_lua*, log_by_lua*, ngx.timer.*, balancer_by_lua*, ssl_certificate_by_lua*, ssl_session_fetch_by_lua*, ssl_session_store_by_lua*

We can see that the init and init_worker phases are not included. According to the documentation, then, the get API for shared memory cannot be used in these two phases. Note that not all shared memory APIs are available in the same phases. For example, the set API can be used in the init phase.
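The difference between the two context lists can be seen in a small configuration. This is a minimal sketch, assuming the same dogs dictionary and /demo location as the earlier example: it preloads a value with set in init_by_lua_block, which the set documentation lists as an allowed context, and reads it back with get in the content phase.

```nginx
http {
    lua_shared_dict dogs 10m;

    # Runs once while nginx is loading its configuration.
    init_by_lua_block {
        local dogs = ngx.shared.dogs
        -- set is documented as usable in init_by_lua*
        local ok, err = dogs:set("Jim", 8)
        if not ok then
            ngx.log(ngx.ERR, "failed to preload shdict: ", err)
        end
    }

    server {
        location /demo {
            content_by_lua_block {
                -- get is documented as usable in content_by_lua*
                local v = ngx.shared.dogs:get("Jim")
                ngx.say(v)
            }
        }
    }
}
```

Because the dictionary lives in shared memory, a value written during init should be visible to every worker that later serves /demo.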

So, don’t assume anything. As I always say, consult the documentation. However, keep in mind that relying solely on the documentation is not as good as practical testing. Sometimes errors or omissions can occur in the OpenResty documentation, and in such cases, you need to verify using actual tests.

Next, let’s modify the test suite to determine whether the init phase can run the get API for shared dictionaries.

To find the test suite for shared memory, look in the t/ directory of OpenResty. The test files are named in a regular pattern: an incrementing number followed by the name of the feature, such as 043-shdict.t. This is the test suite for shared memory, and it contains nearly 100 test cases covering various normal and exceptional scenarios.

Let’s try modifying the first test case.

You can change the phase from content to init and simplify the irrelevant code to see if the get API can run. Here, I need to remind you that at this stage, you don’t need to figure out how the test cases are written, organized, and executed. You just need to know that it is testing the get API:

=== TEST 1: string key, int value
--- http_config
    lua_shared_dict dogs 1m;
--- config
    location = /test {
        init_by_lua '
            local dogs = ngx.shared.dogs
            local val = dogs:get("foo")
            ngx.say(val)
        ';
    }
--- request
GET /test
--- response_body
32
--- no_error_log
[error]
--- ONLY

You may have noticed that I added the --- ONLY marker at the end of the test case. It tells the test framework to ignore all other test cases and run only this one, which speeds up the run. I will explain the various markers in the testing section later. For now, just remember this one.

After making the modifications, you can use the prove command to run this test case:

$ prove t/043-shdict.t

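Then nginx refuses to start. Strictly speaking, the message comes from nginx’s configuration parser: init_by_lua is only valid at the http level, so it is rejected inside a location block before any Lua code runs. To really check the phase limitation described in the documentation, the init_by_lua code would have to move into the http_config section. The error reported for the test as written is: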

nginx: [emerg] "init_by_lua" directive is not allowed here
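Since init_by_lua is an http-level directive, an experiment that actually reaches the get call places the Lua code next to lua_shared_dict (the http_config section of a test case). The following is a minimal sketch under that assumption. It records whatever happens in the error log instead of asserting an outcome, so the log is what confirms or contradicts the documentation.

```nginx
http {
    lua_shared_dict dogs 1m;

    init_by_lua_block {
        local dogs = ngx.shared.dogs
        -- pcall catches a Lua error in case the API is disabled in this phase.
        local ok, val, err = pcall(dogs.get, dogs, "foo")
        -- ngx.say is a response API, so the outcome is logged instead.
        -- ERR is used only so the line shows up at the default log level.
        ngx.log(ngx.ERR, "get in init: ok=", tostring(ok),
                " val=", tostring(val), " err=", tostring(err))
    }
}
```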

When does the get function have multiple return values? #

Let’s take a look at the second question, which can be summarized from the official documentation. The syntax description of this interface is at the beginning of the document:

value, flags = ngx.shared.DICT:get(key)

Under normal circumstances,

  • The first return value, value, is the value stored under the key in the dictionary; when the key does not exist or has expired, value is nil.
  • The second return value, flags, is a little more complex. It is returned only if a non-zero flags value was passed to the set interface; otherwise it is not returned.

If the API call fails, value is nil and flags carries the specific error message.

From the information summarized in the documentation, we can see that local v = dogs:get("Jim"), which receives only one return value, is incomplete: it covers the normal case only, without receiving the second return value or handling errors. We can modify it as follows:

local data, err = dogs:get("Jim")
if data == nil and err then
    ngx.say("get not ok: ", err)
    return
end
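The three outcomes described in the documentation (hit, miss, error) can also be separated explicitly. The following is a minimal sketch, not part of OpenResty: classify_get is a hypothetical helper name, and it only assumes that dict exposes a get method with the documented value, flags return convention.

```lua
-- Hypothetical helper (not an OpenResty API): splits the overloaded second
-- return value of dict:get(key) into a status plus flags or error message.
local function classify_get(dict, key)
    local value, flags_or_err = dict:get(key)

    if value ~= nil then
        -- Hit: the second value, if present, is the user flags number.
        return "hit", value, flags_or_err
    end

    if flags_or_err ~= nil then
        -- API error: value is nil and the second value is the message.
        return "error", nil, flags_or_err
    end

    -- Miss: the key is absent or has expired; nothing else is returned.
    return "miss"
end
```

Under resty or inside a *_by_lua_block, it could be called as local status, value, extra = classify_get(ngx.shared.dogs, "Jim"). Comparing against nil instead of using "not value" matters here, because a stored boolean false is still a hit.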

Like the first question, we can search in the test case suite to verify our understanding of the documentation:

=== TEST 65: get nil key
 --- http_config
     lua_shared_dict dogs 1m;
 --- config
     location = /test {
         content_by_lua '
             local dogs = ngx.shared.dogs
             local ok, err = dogs:get(nil)
             if not ok then
                 ngx.say("not ok: ", err)
                 return
             end
             ngx.say("ok")
         ';
     }
 --- request
 GET /test
 --- response_body
 not ok: nil key
 --- no_error_log
 [error]

In this test case, the argument passed to get is nil, and the returned error message is nil key. On one hand this verifies our reading of the documentation, and on the other it gives a partial answer to the third question: at least, the key passed to get cannot be nil.

What type is the input parameter of the get function? #

As for the third question, what types can the input parameter of get be? Let’s start by looking at the documentation, but unfortunately, you will find that the valid types for the key are not specified in the documentation. What should we do in this case?

Don’t worry, at least we know that the key can be a string and cannot be nil. Do you remember the data types in Lua? Besides strings and nil, there are also numbers, tables, booleans, and functions. Obviously, there is no need to use booleans or functions as keys, so we only need to verify the first two. Let’s search the test file to see if there are any cases where numbers are used as keys:

=== TEST 4: number keys, string values

Through this test case, you can see clearly that numbers can also be used as keys, and that they are converted to strings internally. What about tables? Unfortunately, the test cases do not cover this, so we need to try it ourselves:

$ resty --shdict 'dogs 10m' -e 'local dogs = ngx.shared.dogs
 dogs:get({})
 '

As expected, it reports an error:

ERROR: (command line -e):2: bad argument #1 to 'get' (string expected, got table)

In conclusion, we can deduce that the get API accepts keys of type string and number.
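Since a table key raises a Lua error instead of returning nil and a message, application code may prefer to check the key type up front. The following is a minimal sketch of such a check: check_key_type is a hypothetical helper, not an OpenResty API, and its error strings (other than nil key, taken from the test case) are made up for illustration.

```lua
-- Hypothetical pre-check (not an OpenResty API): accept only the key types
-- observed in the experiments, that is, strings and numbers.
local function check_key_type(key)
    local t = type(key)
    if t == "string" or t == "number" then
        return true
    end
    if t == "nil" then
        -- Same message the get API returned in the test case.
        return nil, "nil key"
    end
    -- Illustrative message only; get itself raises a Lua error here.
    return nil, "unsupported key type: " .. t
end
```

Returning nil plus a message follows the error convention of the shared dict API, so the caller can handle a bad key the same way it handles any other failure.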

Is there a limit on the length of the input parameter key? There is actually a corresponding test case for this, so let’s take a look together:

=== TEST 67: get a too-long key
 --- http_config
     lua_shared_dict dogs 1m;
 --- config
     location = /test {
         content_by_lua '
             local dogs = ngx.shared.dogs
             local ok, err = dogs:get(string.rep("a", 65536))
             if not ok then
                 ngx.say("not ok: ", err)
                 return
             end
             ngx.say("ok")
         ';
     }
 --- request
 GET /test
 --- response_body
 not ok: key too long
 --- no_error_log
 [error]

As you can see, a 65536-byte string is rejected as a too-long key. You can try changing the length to 65535; although it is only one byte shorter, the error goes away. This indicates that the maximum key length is 65535 bytes.
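The boundary can be captured in a small guard. This is a minimal sketch, assuming the key is already a string: check_key_length and MAX_KEY_LEN are hypothetical names, and the limit is the one observed in the test case, not a constant exported by OpenResty.

```lua
-- Limit observed in the test case: 65535 bytes pass, 65536 bytes fail.
local MAX_KEY_LEN = 65535

-- Hypothetical pre-check (not an OpenResty API); expects a string key.
local function check_key_length(key)
    if #key > MAX_KEY_LEN then
        -- Same message the get API returned in the test case.
        return nil, "key too long"
    end
    return true
end
```

For example, check_key_length(string.rep("a", 65535)) returns true, while one more byte yields nil and "key too long".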

Conclusion #

Currently, the official OpenResty documentation is available only in English. Chinese engineers reading it may struggle to grasp the key points, or even misunderstand the content, because of the language barrier. However, the harder something is, the fewer shortcuts there are. You really should read the documentation carefully from start to finish and, when questions come up, look for the answers in the test suite and in your own experiments. This is the right way to learn OpenResty.

Finally, I want to remind you that whenever an OpenResty API returns an error message, you must receive it in a variable and handle it. Otherwise, pitfalls are waiting for you: for example, putting a connection back into the connection pool after an error has occurred, or carrying on with the subsequent logic after an API call has failed. In short, it will be a source of frustration.
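The same habit applies to writes. The following is a minimal sketch of a wrapper that never lets a failed set pass silently. store and its log callback are hypothetical names; the only thing taken from the documentation is that set returns success, err, and forcible.

```lua
-- Hypothetical wrapper (not an OpenResty API). `log` is an assumed callback,
-- for example function(...) ngx.log(ngx.ERR, ...) end.
local function store(dict, key, value, log)
    local ok, err, forcible = dict:set(key, value)
    if not ok then
        log("shdict set failed for key ", key, ": ", err)
        -- Stop here so the caller cannot carry on as if the write worked.
        return nil, err
    end
    if forcible then
        -- Third return value of set: other valid items were evicted
        -- to make room for this one.
        log("shdict is full; older items were evicted to store ", key)
    end
    return true
end
```

A caller would then write local ok, err = store(ngx.shared.dogs, "Jim", 8, log) and return early when ok is nil.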

So, when you encounter problems while writing OpenResty code, how do you usually solve them? Do you rely on documentation, mailing lists, QQ groups, or other channels?

Feel free to leave a comment for discussion, and also feel free to share this article with your colleagues and friends. Let’s exchange ideas and progress together.