10 Why to Avoid Using NYI in JIT Compiler #

Hello, I am Wen Ming.

In the previous section, we learned about FFI in LuaJIT. If your project only uses the APIs provided by OpenResty and does not have the need to call C functions on its own, then FFI may not be as important to you. You just need to make sure that lua-resty-core is enabled.

Today’s topic, NYI in LuaJIT, is different: it is an issue that no engineer using OpenResty can avoid, and it has a significant impact on performance.

You can quickly write code with correct logic using OpenResty, but without understanding NYI, you cannot write efficient code and cannot fully unleash the power of OpenResty. The performance difference between the two is at least an order of magnitude.

What is NYI? #

So what exactly is NYI? Let’s review a point we covered earlier:

In the runtime environment of LuaJIT, in addition to an assembly-based Lua interpreter, there is also a JIT compiler that can directly generate machine code.

The JIT compiler in LuaJIT is not yet complete. Some primitives are difficult to implement and cannot be compiled, and the author of LuaJIT is currently semi-retired. These primitives include common functions such as pairs() and unpack(), as well as Lua C modules implemented as Lua CFunctions. When the JIT compiler encounters an unsupported operation on the current code path, it falls back to the interpreter.

The primitives that the JIT compiler does not support are what we are talking about today: NYI, which stands for Not Yet Implemented. The official LuaJIT website has a complete list of NYIs, and I recommend you browse through it carefully. The goal is not to memorize the list, but to keep it in mind when you write code.

Below, I have extracted a few functions from the NYI list under the string library:

Among them, the compile status for string.byte is yes, indicating that it can be JIT compiled, so you can confidently use it in your code.

The compile status for string.char is 2.1, meaning it is supported starting from LuaJIT 2.1. The LuaJIT used in OpenResty is based on LuaJIT 2.1, so you can also use it without worry.

The compile status for string.dump is never, meaning it will not be JIT compiled and will fall back to the interpreter mode. Currently, there are no plans to support this primitive in the future.

The compile status for string.find is 2.1 partial, which means it is partially supported starting from LuaJIT 2.1. The remark in the description says that it only supports searching for fixed strings and does not support pattern matching. So for looking up fixed strings, you can use string.find and it will be JIT compiled.

Naturally, we should avoid using NYIs as much as possible to enable more code to be JIT compiled, ensuring better performance. But in practical environments, sometimes we unavoidably need to use the functionality of some NYI functions. So what should we do in such cases?

Alternatives to NYI #

Don’t worry, most of the NYI functions can be avoided by using alternative methods to achieve their functionality. In the following section, I will explain several typical examples of NYI and introduce different alternative solutions. By understanding these examples, you will be able to apply the same principles to other NYI functions.

1. string.gsub() Function #

First, let’s look at the string.gsub() function. It is a built-in string manipulation function in Lua, used for doing global string replacements. For example:

$ resty -e 'local new = string.gsub("banana", "a", "A"); print(new)'
bAnAnA

This function is an NYI primitive and cannot be JIT compiled.

We can try to find alternative functions within OpenResty’s API, but for most people, it is not practical to remember all the APIs and their usage. So, in my daily development, I usually refer to the GitHub documentation page of lua-nginx-module.

For example, for the previous example, we can use “gsub” as the keyword and search in the documentation page, and we will find “ngx.re.gsub”.

Some of you might wonder why we don’t use the recommended “restydoc” tool to search for OpenResty APIs. You can try using it to search for “gsub”:

$ restydoc -s gsub

As you can see, it does not return “ngx.re.gsub” as expected, but displays the Lua built-in functions. In fact, at this stage, “restydoc” only returns the exact matches, so it is more suitable to use when you already know the API name. As for fuzzy searches, you still need to manually search in the documentation.

Going back to the previous search result, we can see that the ngx.re.gsub function is defined as follows:

newstr, n, err = ngx.re.gsub(subject, regex, replace, options?)

Here, the function parameters and return values are named with specific meanings. In OpenResty, I don’t actually recommend writing a lot of comments. Most of the time, good naming is better than several lines of comments.

If you are not familiar with OpenResty’s regular expression system, the optional “options” parameter might confuse you. Its explanation is not under this function, but in the documentation of the ngx.re.match function.

By checking the documentation for the “options” parameter, you will discover that “j” enables PCRE’s JIT and “o” compiles the regex only once and caches it. With the option set to “jo”, the code using ngx.re.gsub can be JIT compiled by LuaJIT, and the regex itself by PCRE JIT.
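As a minimal sketch, the string.gsub call from the earlier example can be swapped for ngx.re.gsub like this. It assumes the code runs inside OpenResty (for example through the resty CLI), because ngx.re does not exist in a plain Lua interpreter:

```lua
-- Assumes an OpenResty environment; ngx.re is not available in plain Lua.
-- "j" turns on PCRE JIT, "o" compiles the regex once and caches it.
local newstr, n, err = ngx.re.gsub("banana", "a", "A", "jo")
if not newstr then
    ngx.log(ngx.ERR, "ngx.re.gsub failed: ", err)
    return
end

-- newstr holds the replaced string, n the number of substitutions made.
print(newstr)
```

Here newstr is "bAnAnA" and n is 3, the same result as the string.gsub version.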

I won’t go into the specific documentation content, but I want to emphasize that when looking at the documentation, we must have a spirit of thoroughness. In fact, the documentation for OpenResty is very comprehensive, and by carefully reading the documentation, you can solve most of your problems.

2. string.find() Function #

Unlike string.gsub, string.find can be JIT compiled in plain mode (i.e., when searching for fixed strings). However, when it searches with a pattern, string.find cannot be JIT compiled. If you need pattern or regular expression matching, switch to OpenResty’s own API, ngx.re.find.

So, when doing string searching in OpenResty, you must first distinguish whether you are searching for a fixed string or a regular expression. If it is the former, you should use string.find and remember to set the last “plain” parameter to true:

string.find("foo bar", "foo", 1, true)

If it is the latter, you should use OpenResty’s API and enable the PCRE JIT option:

ngx.re.find("foo bar", "^foo", "jo")

Ideally, you would wrap this choice in a single function that enables the optimization options by default, so that end users do not need to know these details and only see one unified string search function. As you can see, sometimes having too many choices and being too flexible is not a good thing.
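To see why the “plain” flag matters for correctness as well as for JIT compilation, here is a small sketch in plain Lua. The sample string "a.b" is only an illustration:

```lua
-- With plain = true, "." is searched for literally.
local s, e = string.find("a.b", ".", 1, true)
print(s, e)    -- 2  2

-- Without the flag, "." is a Lua pattern that matches any character,
-- so the first character already matches.
local s2, e2 = string.find("a.b", ".")
print(s2, e2)  -- 1  1
```

Forgetting the flag therefore changes both the compile status and, for strings containing pattern characters, the result.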

3. unpack() Function #

Now let’s look at the unpack() function. unpack() should also be avoided, especially inside loops. You can access the array using array indices instead. For example, in the following code:

$ resty -e '
 local a = {100, 200, 300, 400}
 for i = 1, 2 do
    print(unpack(a))
 end'

You can change it to:

$ resty -e 'local a = {100, 200, 300, 400}
 for i = 1, 2 do
    print(a[1], a[2], a[3], a[4])
 end'

Let’s dig deeper into unpack. This time, we can use restydoc to search:

$ restydoc -s unpack

From the unpack documentation, you can see that unpack (list [, i [, j]]) is equivalent to return list[i], list[i+1], ..., list[j]. You can think of unpack as syntactic sugar, so you can always access the array by index instead and avoid interrupting LuaJIT’s JIT compilation.
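The equivalence can be checked with a short sketch. The fallback to table.unpack is an addition for newer Lua versions; in LuaJIT, the global unpack is the one being used:

```lua
-- LuaJIT and Lua 5.1 expose the global unpack; newer Lua versions use table.unpack.
local unpack = unpack or table.unpack

local list = {100, 200, 300, 400}

-- unpack(list, 2, 3) returns list[2], list[3]
local a, b = unpack(list, 2, 3)

-- The same values, read by index without calling unpack.
local c, d = list[2], list[3]

print(a, b)  -- 200  300
print(c, d)  -- 200  300
```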

4. pairs() Function #

Lastly, let’s talk about the pairs() function for traversing hash tables, which also cannot be JIT compiled.

Unfortunately, there is no equivalent alternative for this function. You can only try to avoid it, or switch to an array accessed by numeric indices, especially when you need to traverse a hash table on a hot code path. A hot code path is code that is executed many times, such as the body of a large loop.
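One way to switch to numeric indices is to keep the keys in a separate array that is built once, outside the hot path. The following is only a sketch: the headers table and the total_length function are hypothetical, and it assumes the set of keys does not change after setup:

```lua
-- Hypothetical data: a hash table that the hot path needs to traverse.
local headers = { host = "example.com", accept = "*/*" }

-- One-time setup, off the hot path: copy the keys into an array.
-- pairs() is acceptable here because this runs only once.
local keys, n = {}, 0
for k in pairs(headers) do
    n = n + 1
    keys[n] = k
end
table.sort(keys)  -- fixed order, since pairs() does not guarantee one

-- Hot path: a numeric for loop over the key array, with no pairs() call.
local function total_length()
    local sum = 0
    for i = 1, n do
        local k = keys[i]
        sum = sum + #k + #headers[k]
    end
    return sum
end

print(total_length())  -- 24
```

The trade-off is extra bookkeeping: if keys are added or removed later, the key array has to be updated as well.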

In summary, to avoid using NYI primitives, you need to pay attention to the following two points:

  • Prefer to use APIs provided by OpenResty instead of Lua’s standard library functions. Remember that Lua is an embedded language, and we are actually programming in OpenResty, not Lua.
  • If you must use NYI primitives, make sure they are not on the hot code path.

How to Detect NYI? #

So far, we have covered how to avoid NYI by hand. However, if we stopped here, it would not quite align with a philosophy practiced by OpenResty:

If something can be automated, don’t do it manually.

Humans are not machines and there will always be room for oversight. The ability to automatically detect NYI used in code is an important manifestation of an engineer’s value.

Here, I recommend the jit.dump and jit.v modules that ship with LuaJIT. Both print out what the JIT compiler is doing. The former provides very detailed information and can be used to debug LuaJIT itself; you can read its source code for a deeper understanding. The latter produces simpler output, with each line corresponding to a trace, and is usually used to check whether code can be JIT-compiled.

So, how should we proceed?

First, we can add the following two lines of code in init_by_lua:

local v = require "jit.v"
v.on("/tmp/jit.log")

Then, run your own stress testing tool or run a few hundred unit test suites to make LuaJIT hot enough to trigger compilation. Once these are done, check the results in /tmp/jit.log.

Of course, this method is relatively cumbersome. If you just want a simple verification, using resty is sufficient. The resty CLI of OpenResty comes with relevant options:

$ resty -j v -e 'for i=1, 1000 do
      local newstr, n, err = ngx.re.gsub("hello, world", "([a-z])[a-z]+", "[$0,$1]", "i")
 end'
 [TRACE   1 (command line -e):1 stitch C:107bc91fd]
 [TRACE   2 (1/stitch) (command line -e):2 -> 1]

Here, -j is an option related to LuaJIT. The values dump and v that follow it turn on the jit.dump and jit.v modes, respectively.

In the output of the jit.v module, each line represents a successfully compiled trace object. The previous example is one that can be JIT-compiled. However, if an NYI primitive is encountered, the output will indicate NYI. For example, consider the pairs example below:

$ resty -j v -e 'local t = {}
 for i=1,100 do
     t[i] = i
 end

 for i=1, 1000 do
     for j=1,1000 do
         for k,v in pairs(t) do
             --
         end
     end
 end'

This loop cannot be JIT-compiled, and the output clearly states that there is an NYI primitive on line 8:

 [TRACE   1 (command line -e):2 loop]
 [TRACE --- (command line -e):7 -- NYI: bytecode 72 at (command line -e):8]
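In the spirit of automation, the check itself can be scripted. The following is a minimal sketch of a hypothetical helper, find_nyi, that picks out the NYI lines from jit.v output. It assumes the log format shown above and is meant to run offline, not on a hot code path:

```lua
-- Hypothetical helper: return every line of jit.v output that mentions NYI.
-- This is an offline tool, so using gmatch here is not a concern.
local function find_nyi(log_text)
    local hits = {}
    for line in log_text:gmatch("[^\n]+") do
        -- Plain search: the fourth argument disables pattern matching.
        if line:find("NYI", 1, true) then
            hits[#hits + 1] = line
        end
    end
    return hits
end

-- Sample input copied from the output above.
local sample = table.concat({
    "[TRACE   1 (command line -e):2 loop]",
    "[TRACE --- (command line -e):7 -- NYI: bytecode 72 at (command line -e):8]",
}, "\n")

for _, line in ipairs(find_nyi(sample)) do
    print(line)
end
```

For a real run, you would read the contents of /tmp/jit.log and pass them to the helper instead of the sample string.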

Final Thoughts #

This is the first time we have spent a significant amount of space discussing performance issues with OpenResty. After reading about the optimization of NYI, what are your thoughts? Feel free to leave a comment and share your views.

Lastly, here’s a question for you to ponder. When discussing alternatives to the string.find() function, I mentioned that it would be more appropriate to add a layer of encapsulation and enable optimization options by default. Now, I hand this task over to you to give it a try.

Write down your answer in the comment section, and feel free to share this article with your colleagues and friends so that we can discuss and improve together.