45 The Unmissable Capability: Common Third-party Libraries Often Used with OpenResty #

Hello, I’m Wen Ming.

For programming languages and platforms alike, much of the learning effort goes into the standard and third-party libraries rather than the syntax itself. The same is true of OpenResty: once you have studied its API and performance optimization techniques, you need the various lua-resty libraries to extend OpenResty's capabilities and apply it to more scenarios.

Where to find lua-resty libraries? #

Compared to PHP, Python, and JavaScript, the standard and third-party libraries for OpenResty are relatively scarce. Finding suitable lua-resty libraries is not an easy task. However, there are still two recommended channels that can help you find them faster.

First, I recommend the awesome-resty repository maintained by Aapo. This repository organizes OpenResty-related libraries into different categories, including Nginx C modules, lua-resty libraries, web frameworks, routing libraries, templates, testing frameworks, and more. It is the first choice for finding OpenResty resources.

Of course, if you cannot find a suitable library in Aapo’s repository, you can also try your luck with luarocks, opm, and GitHub. Some libraries that are relatively new or not widely known may be hidden among them.

In previous lessons, we have already come across many useful libraries, such as lua-resty-mlcache, lua-resty-limit-traffic, and lua-resty-shell. Today, in this final lesson of the performance optimization section, we will introduce three more distinctive peripheral libraries, all contributed by developers in the community.

Performance Improvement of ngx.var #

First, let’s take a look at a C module called lua-var-nginx-module. I mentioned earlier that ngx.var is an operation with significant performance overhead, so in actual usage, we need to use ngx.ctx as a cache layer.
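As a quick refresher, that caching pattern can be sketched as follows (the `get_var` helper name is my own, not part of any library):

```lua
-- Sketch: cache ngx.var reads in ngx.ctx, so each variable goes through
-- the slow ngx.var path at most once per request.
local function get_var(name)
    local ctx = ngx.ctx
    local cached = ctx[name]
    if cached == nil then
        cached = ngx.var[name]  -- the expensive lookup happens only here
        ctx[name] = cached      -- note: a nil variable is not cached by
    end                         -- this simple version
    return cached
end

-- usage inside any request-phase handler:
-- local addr = get_var("remote_addr")
```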

Is there any way to completely solve the performance issue of ngx.var?

This C module makes an attempt in that direction, and the results are significant: its performance is about five times that of ngx.var. It uses the FFI approach, so you need to add the following option when compiling OpenResty:

./configure --prefix=/opt/openresty \
         --add-module=/path/to/lua-var-nginx-module

Then, use luarocks to install the Lua library:

luarocks install lua-resty-ngxvar

Invoking it is also very simple: a single call to the fetch function. The effect is exactly equivalent to reading the original ngx.var.remote_addr to get the client’s IP address:

content_by_lua_block {
    local var = require("resty.ngxvar")
    ngx.say(var.fetch("remote_addr"))
}

Now that you know the basic usage, you may be curious how this module achieves such a significant performance improvement. As the saying goes, there are no secrets in front of the source code, so let’s see how it fetches the remote_addr variable:

ngx_int_t 
ngx_http_lua_var_ffi_remote_addr(ngx_http_request_t *r, ngx_str_t *remote_addr) 
{ 
    remote_addr->len = r->connection->addr_text.len; 
    remote_addr->data = r->connection->addr_text.data; 

    return NGX_OK; 
}

After reading this code, you will find that this Lua FFI approach is the same one used in lua-resty-core. Its advantage is obvious: using FFI to fetch the variable directly bypasses the original lookup logic of ngx.var. Its drawback is equally obvious: you have to add a corresponding C function and FFI call for every variable you want to retrieve, which is a laborious task.
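To make the Lua side of this concrete, here is a simplified sketch of how such a C function can be bound via FFI. The cdef is abbreviated, the struct is renamed here purely to avoid clashing with the ngx_str_t definition that resty.core already loads, and the real module differs in detail:

```lua
local ffi = require "ffi"
local C = ffi.C
local base = require "resty.core.base"
local get_request = base.get_request  -- returns the current request pointer

-- Abbreviated declaration; the field layout mirrors ngx_str_t,
-- but the name is changed to avoid an FFI redefinition error.
ffi.cdef[[
typedef struct { size_t len; unsigned char *data; } var_str_t;
intptr_t ngx_http_lua_var_ffi_remote_addr(void *r, var_str_t *addr);
]]

local str_buf = ffi.new("var_str_t[1]")

local function remote_addr()
    local r = get_request()
    local rc = C.ngx_http_lua_var_ffi_remote_addr(r, str_buf)
    if rc ~= 0 then  -- NGX_OK is 0
        return nil
    end
    -- copy the bytes out of the Nginx-owned buffer into a Lua string
    return ffi.string(str_buf[0].data, str_buf[0].len)
end
```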

Some may ask why I call this laborious, since the C code above looks quite short. Let’s look at where these few lines come from, namely the Nginx source in src/http/ngx_http_variables.c:

static ngx_int_t
ngx_http_variable_remote_addr(ngx_http_request_t *r,
    ngx_http_variable_value_t *v, uintptr_t data)
{
    v->len = r->connection->addr_text.len;
    v->valid = 1;
    v->no_cacheable = 0;
    v->not_found = 0;
    v->data = r->connection->addr_text.data;

    return NGX_OK;
}

After seeing the source code, the mystery is solved: lua-var-nginx-module essentially ports Nginx’s own variable-handling code and wraps it with an FFI layer to achieve the performance optimization. This is a good idea and a good direction for optimization.

Here I want to add a few more words. When learning a library or tool, we should not stop at how to use it; we should also ask why and read the source code. Only by understanding the underlying principles can we pick up more design ideas and problem-solving approaches. And of course, I strongly encourage you to contribute code to support more Nginx variables.

JSON Schema #

Next, let me introduce a lua-resty library: lua-rapidjson. It is a wrapper around rapidjson, a JSON library open-sourced by Tencent and known for its high performance. Here, we focus on what sets lua-rapidjson apart from cjson: its support for JSON Schema.

JSON Schema is a universal standard that allows us to accurately describe the format of parameters in an interface and how to validate them. Here is a simple example:

"stringArray": {
    "type": "array",
    "items": { "type": "string" },
    "minItems": 1,
    "uniqueItems": true
}

This schema precisely describes the stringArray parameter: it must be an array of strings, the array must not be empty, and its elements must be unique.

With lua-rapidjson, we can use JSON Schema in OpenResty, which greatly facilitates parameter validation for interfaces. For example, for the rate limit interface mentioned earlier, we can use the following schema to describe it:

local schema = {
    type = "object",
    properties = {
        count = {type = "integer", minimum = 0},
        time_window = {type = "integer",  minimum = 0},
        key = {type = "string", enum = {"remote_addr", "server_addr"}},
        rejected_code = {type = "integer", minimum = 200, maximum = 600},
    },
    additionalProperties = false,
    required = {"count", "time_window", "key", "rejected_code"},
}

You will find that this brings two obvious benefits:

  • For the front-end, the schema can be reused in page development for parameter validation, with no need to care about the back-end implementation.
  • For the back-end, it can directly call lua-rapidjson’s schema validation (the SchemaValidator function) to decide whether a request is valid, without writing extra validation code.
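As an illustration of the second point, here is a hedged sketch of server-side validation using lua-rapidjson’s documented schema API (SchemaDocument, SchemaValidator, Document); the `validate_body` helper is my own:

```lua
local rapidjson = require "rapidjson"

-- Reuse the rate-limit schema from above (shortened here for brevity).
local schema = {
    type = "object",
    properties = {
        count = { type = "integer", minimum = 0 },
        time_window = { type = "integer", minimum = 0 },
    },
    required = { "count", "time_window" },
}

-- Compile the schema once at load time and reuse the validator.
local sd = rapidjson.SchemaDocument(schema)
local validator = rapidjson.SchemaValidator(sd)

local function validate_body(body_str)
    local doc = rapidjson.Document(body_str)
    local ok, message = validator:validate(doc)
    return ok, message  -- message describes the violation on failure
end

-- validate_body('{"count": 10, "time_window": 60}') should pass;
-- validate_body('{"count": -1}') should fail validation.
```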

Worker Intercommunication #

Finally, I want to talk about the lua-resty-worker-events library that enables worker intercommunication in OpenResty. In OpenResty, there is no direct communication mechanism between workers, which can cause several issues. Let’s consider the following scenario:

An OpenResty service has 24 worker processes. The administrator updates a configuration of the system through a REST HTTP interface. However, only one worker receives the update operation and writes the result to the database, updates the shared dictionary, and its own worker-level LRU cache. Then, how can the other 23 workers be notified to update this configuration?

Clearly, multiple workers need a notification mechanism to accomplish the task above. In the absence of native support in OpenResty, we can resort to using the shared dictionary, which can be accessed across workers.

That’s exactly where lua-resty-worker-events comes in. It maintains a version number in the shared dictionary. When there is a new message to be published, the version number is incremented and the message content is stored in the dictionary using the version number as the key:

event_id, err = _dict:incr(KEY_LAST_ID, 1)
success, err = _dict:add(KEY_DATA .. tostring(event_id), json)

In the background, a polling loop is created using ngx.timer with a default interval of 1 second. It continuously checks if the version number has changed:

local event_id, err = get_event_id()
if event_id == _last_event then
    return "done"
end

Once a new event notification is detected, the message content is retrieved from the shared dictionary based on the version number:

while _last_event < event_id do
    count = count + 1
    _last_event = _last_event + 1
    data, err = _dict:get(KEY_DATA..tostring(_last_event))
end

In summary, although lua-resty-worker-events introduces a delay of up to one second, it does implement an event notification mechanism between workers. Despite this minor flaw, it is still commendable.
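From the user’s point of view, the library hides these internals behind a small API. A hedged usage sketch, based on the library’s documented configure/register/post functions (the shared dict name "process_events" is my own and must be declared in nginx.conf):

```lua
-- typically run in init_worker_by_lua_block
local ev = require "resty.worker.events"

local ok, err = ev.configure({
    shm = "process_events",  -- shared dict holding the version number and payloads
    interval = 1,            -- polling interval in seconds (the default)
})
if not ok then
    ngx.log(ngx.ERR, "failed to configure worker events: ", err)
end

-- every worker registers a handler; it runs whenever any worker posts
-- a matching event
ev.register(function(data, event, source, from_pid)
    ngx.log(ngx.NOTICE, "config updated by worker ", from_pid)
    -- refresh this worker's LRU cache from the shared dict here
end, "my_app", "config_change")

-- in the worker that handled the REST call:
-- ev.post("my_app", "config_change", { version = 2 })
```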

However, in scenarios where real-time requirements are high, such as message pushing, the lack of direct communication between worker processes in OpenResty may cause some difficulties. Currently, there is no better solution for this issue. If you have any good ideas, feel free to discuss them on GitHub or the OpenResty mailing list. Many of OpenResty’s features are driven by the community of users, which helps to create a healthy ecosystem.

Conclusion #

The three libraries we introduced today each have their own strengths and bring more possibilities to OpenResty applications. To finish, here is a topic for discussion: have you discovered any interesting OpenResty-related libraries? And do you have any findings or doubts about the libraries mentioned today? Feel free to leave a comment and share with me. You are also welcome to share this article with the OpenResty users around you, to exchange and improve together.