43 Agile Implementation of Dynamic Rate Limiting Is Not That Difficult #

Hello, I’m Wen Ming.

In the previous lessons, I introduced the leaky bucket and token bucket algorithms, both of which are commonly used to handle burst traffic, and we learned how to configure rate limiting and throttling in the Nginx configuration file. However, that approach only scratches the surface, and it is still a long way from being truly powerful.

The first problem is that the rate limiting key is restricted to Nginx variables and cannot be set flexibly. For example, setting different rate limiting thresholds for different provinces and client channels is a common requirement, but it cannot be implemented with Nginx alone.

An even bigger problem is the inability to adjust the rate dynamically. As we mentioned in the last lesson, every modification requires reloading Nginx. As a result, requirements such as applying different limits during different time periods can only be implemented awkwardly with external scripts.

It should be noted that technology serves the business, and the business in turn drives technological progress. At the time of Nginx's inception, there was no need to adjust configurations dynamically; requirements such as reverse proxying, load balancing, and low memory usage were what drove Nginx's growth. In terms of architecture and implementation, no one could have anticipated the massive demand for dynamic and fine-grained control brought by scenarios such as mobile internet, IoT, and microservices.

OpenResty, with its ability to use Lua scripts, happens to fill the gap left by Nginx in this regard, forming an effective complement. This is also the fundamental reason why OpenResty is widely used to replace Nginx. In the upcoming lessons, I will continue to introduce you to more dynamic scenarios and examples in OpenResty. Today, let’s first look at how to use OpenResty to implement dynamic rate limiting and throttling.

In OpenResty, we recommend using lua-resty-limit-traffic for traffic limiting. It includes three different forms of limitation: limit-req (limiting request rate), limit-count (limiting number of requests), and limit-conn (limiting concurrent connections). It also provides limit.traffic, which allows you to aggregate these three forms.

Rate Limiting #

Let’s first take a look at limit-req, which uses the leaky bucket algorithm to limit the rate of requests.

In the previous section, we briefly introduced the key implementation code of the leaky bucket algorithm in the resty library. Now let’s learn how to use this library. Let’s take a look at the following example code:

resty --shdict='my_limit_req_store 100m' -e 'local limit_req = require "resty.limit.req"
-- rate of 200 requests per second, with a burst of 100 extra requests allowed to queue
local lim, err = limit_req.new("my_limit_req_store", 200, 100)
-- check the request against the limiter; true means the result is committed to the shared dict
local delay, err = lim:incoming("key", true)
if not delay then
    if err == "rejected" then
        return ngx.exit(503)
    end
    return ngx.exit(500)
end

if delay >= 0.001 then
    -- the request exceeds the rate but is within the burst, so slow it down
    ngx.sleep(delay)
end'

As we know, lua-resty-limit-traffic uses shared dictionaries to store and count keys, so before using limit-req we need to declare the my_limit_req_store 100m space (here via the --shdict option of the resty CLI). The same applies to limit-conn and limit-count: each requires its own shared dictionary space.
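For reference, in a regular nginx.conf, declaring a separate dictionary for each limiter might look like the snippet below; the names and sizes are only examples:

lua_shared_dict my_limit_req_store   100m;
lua_shared_dict my_limit_count_store 100m;
lua_shared_dict my_limit_conn_store  100m;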

limit_req.new("my_limit_req_store", 200, 100)

The line above is the most crucial one. It means we use a shared dictionary named my_limit_req_store to store statistics, and set the rate to 200 requests per second with a burst of 100. So if the rate exceeds 200 but stays below 300 (that is, 200 + 100), the excess requests are queued and delayed; beyond 300, requests are rejected outright.

After the setup is complete, we use lim:incoming("key", true) to handle each client request. The incoming function has two parameters that deserve a detailed explanation.

The first parameter is the user-specified rate limiting key. In the example above it is a string constant, which means all clients are limited under the same key. If you want to limit by province and channel, that is also easy: just combine both pieces of information into the key. Here is pseudocode for this requirement:

-- get_province is a hypothetical helper that maps an IP address to a province
local province = get_province(ngx.var.binary_remote_addr)
local channel = ngx.req.get_headers()["channel"] or "default"
-- use a separator so the combined key is unambiguous
local key = province .. ":" .. channel
lim:incoming(key, true)

Of course, you can also be more creative by defining the meaning of the key and the conditions for calling incoming, so that you can achieve very flexible rate limiting effects.
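For example, coming back to the time-period requirement mentioned at the beginning: since this is all ordinary Lua code, one straightforward approach is simply to pick a different limiter depending on the current time. Below is a minimal sketch of that idea; the thresholds, the day/night split, and the key format are all made-up illustrative choices:

local limit_req = require "resty.limit.req"

-- two limiters with different thresholds; the numbers are illustrative only
local lim_day, err = limit_req.new("my_limit_req_store", 200, 100)
local lim_night, err = limit_req.new("my_limit_req_store", 50, 20)

-- pick a limiter based on the current hour (server local time)
local hour = tonumber(ngx.localtime():sub(12, 13))
local period = (hour >= 8 and hour < 22) and "day" or "night"
local lim = (period == "day") and lim_day or lim_night

-- include the period in the key so the two limiters do not share state
local key = period .. ":" .. ngx.var.binary_remote_addr

local delay, err = lim:incoming(key, true)
if not delay then
    if err == "rejected" then
        return ngx.exit(503)
    end
    return ngx.exit(500)
end
if delay >= 0.001 then
    ngx.sleep(delay)
end

Because no reload is involved, the thresholds, the periods, or even the source of the key can be changed at runtime, which is exactly the dynamism that plain Nginx configuration lacks.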

Now let's look at the second parameter of incoming. It is a boolean that defaults to false, meaning the request is only checked against the limiter without being recorded in the shared dictionary; in other words, it is just a "dry run". If you set it to true, the count actually takes effect. Therefore, in most cases you need to set it to true explicitly.

You may wonder why this parameter exists at all. Consider a scenario where you create two different limit-req instances, one keyed by hostname and the other by the client's IP address. When a client request is processed, the incoming methods of these two instances are called in the order they were created, as shown in the following pseudocode:

-- limiter_one is keyed by hostname, limiter_two by client IP address
local limiter_one, err = limit_req.new("my_limit_req_store", 200, 100)
local limiter_two, err = limit_req.new("my_limit_req_store", 20, 10)

limiter_one:incoming(ngx.var.host, true)
limiter_two:incoming(ngx.var.binary_remote_addr, true)

If a request passes limiter_one's threshold check but is rejected by limiter_two, then the earlier call to limiter_one:incoming should have been a "dry run" and should not be counted.

In this case, the logic of the code above is not rigorous enough. Instead, we should let the limiters do a dry run first; if any limiter's threshold is triggered, we can reject the client request and return directly:

for i = 1, n do
    local lim = limiters[i]
    -- only the last limiter commits its result; all the others are dry runs
    local delay, err = lim:incoming(keys[i], i == n)
    if not delay then
        -- one of the limiters rejected the request (or an error occurred)
        return nil, err
    end
end

This is exactly what the second parameter of incoming is for. In fact, this piece of code is the core of the limit.traffic module, which is used to combine multiple limiters.

Limiting the Number of Requests #

Now let's look at limit.count, the library for limiting the number of requests. Its effect is similar to the GitHub API's rate limiting: it restricts how many requests a user can make within a fixed time window. As usual, let's start with some example code:

local limit_count = require "resty.limit.count"

-- allow at most 5000 requests per 3600-second (one hour) window
local lim, err = limit_count.new("my_limit_count_store", 5000, 3600)

-- use the Authorization header as the rate limiting key
local key = ngx.req.get_headers()["Authorization"]
local delay, remaining = lim:incoming(key, true)

As you can see, the usage of limit.count is similar to limit.req. First, we define a shared dictionary in nginx.conf:

lua_shared_dict my_limit_count_store 100m;

Then we create a limiter object using new, and finally use the incoming function to check and handle the request.

However, unlike limit-req, the second return value of incoming in limit.count is the number of remaining calls (when the request is accepted). We can use this value to add fields to the response headers and give the client better feedback:

ngx.header["X-RateLimit-Limit"] = "5000"
ngx.header["X-RateLimit-Remaining"] = remaining

Limiting Concurrent Connections #

The third library, limit.conn, is used to limit concurrent connections. It differs from the previous two in that it has a special leaving API, which I will briefly introduce here.

Limiting the request rate and the number of requests can be implemented entirely in the access phase. Limiting concurrent connections is different: besides checking whether the threshold has been exceeded in the access phase, we also need to call the leaving interface in the log phase:

log_by_lua_block {
    local ctx = ngx.ctx
    -- the limiter object, key and delay are assumed to have been saved
    -- into ngx.ctx during the access phase
    local lim = ctx.limit_conn
    if lim then
        local latency = tonumber(ngx.var.request_time) - ctx.limit_conn_delay
        local key = ctx.limit_conn_key
        local conn, err = lim:leaving(key, latency)
    end
}

However, the core code of this interface is actually quite simple: it boils down to the single line below, which reduces the connection count by one. If you skip this cleanup in the log phase, the connection count will only ever increase and will quickly hit the concurrency threshold.

local conn, err = dict:incr(key, -1)
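For completeness, here is a minimal sketch of the access-phase side that the log-phase snippet above assumes; the shared dictionary name and the thresholds are just examples:

access_by_lua_block {
    local limit_conn = require "resty.limit.conn"
    -- allow at most 200 concurrent connections plus a burst of 100;
    -- 0.5 is the default per-connection latency estimate
    local lim, err = limit_conn.new("my_limit_conn_store", 200, 100, 0.5)
    if not lim then
        return ngx.exit(500)
    end

    local key = ngx.var.binary_remote_addr
    local delay, err = lim:incoming(key, true)
    if not delay then
        if err == "rejected" then
            return ngx.exit(503)
        end
        return ngx.exit(500)
    end

    -- save everything the log phase needs to call leaving later
    local ctx = ngx.ctx
    ctx.limit_conn = lim
    ctx.limit_conn_key = key
    ctx.limit_conn_delay = delay

    if delay >= 0.001 then
        ngx.sleep(delay)
    end
}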

Combination of Limiters #

Now that we have covered these three limiters separately, let's look at how to combine limit.req, limit.conn, and limit.count and use them together. This requires the combine function in limit.traffic:

local limit_req     = require "resty.limit.req"
local limit_conn    = require "resty.limit.conn"
local limit_traffic = require "resty.limit.traffic"

local lim1, err = limit_req.new("my_req_store", 300, 200)
local lim2, err = limit_req.new("my_req_store", 200, 100)
local lim3, err = limit_conn.new("my_conn_store", 1000, 1000, 0.5)

-- lim1 is keyed by hostname; lim2 and lim3 are keyed by client IP address
local limiters = {lim1, lim2, lim3}
local host = ngx.var.host
local client = ngx.var.binary_remote_addr
local keys = {host, client, client}

-- states collects each limiter's extra state (e.g. the remaining count)
local states = {}
local delay, err = limit_traffic.combine(limiters, keys, states)

With the knowledge we have just gained, you should have no problem understanding this code. The core of the combine function is the loop we saw earlier when analyzing limit.req: it mainly relies on the dry-run feature and the uncommit function to glue the limiters together. With this combination, you can set different thresholds and keys for multiple limiters, and thereby implement much more complex business requirements.
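As a rough sketch, handling the return values of combine follows the same pattern as the very first limit.req example; the status codes are conventional choices:

if not delay then
    if err == "rejected" then
        -- one of the limiters rejected the request
        return ngx.exit(503)
    end
    return ngx.exit(500)
end

if delay >= 0.001 then
    -- the request was accepted but needs to be slowed down
    ngx.sleep(delay)
end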

Conclusion #

limit.traffic is not limited to the three rate limiters discussed today. In fact, as long as a limiter provides the incoming and uncommit interfaces, it can be managed by limit.traffic's combine function.

Finally, a homework question for you: can you write an example that brings the token bucket rate limiter we introduced earlier into such a combination? Feel free to write your answer in the comments section for discussion, and to share this article with your colleagues and friends so we can learn and communicate together.