08 Differences between LuaJIT Branch and Standard Lua #

Hello, I’m Wen Ming.

In this lesson, we will learn about LuaJIT, another cornerstone of OpenResty. Today, I will focus on important and lesser-known aspects of Lua and LuaJIT. For more basic knowledge of the Lua language, you can learn through search engines or Lua books. I recommend the book “Programming in Lua” written by the author of Lua.

Of course, in OpenResty, the barrier to writing correct LuaJIT code is not high, but writing efficient LuaJIT code is not easy. I will discuss the key points in detail in the later sections on performance optimization in OpenResty.

First, let’s look at where LuaJIT sits in the overall architecture of OpenResty.

As mentioned before, the worker processes in OpenResty are all forked from the master process. Actually, the LuaJIT virtual machine in the master process will also be forked. All coroutines within the same worker process share this LuaJIT virtual machine, and the execution of Lua code is also performed within this virtual machine.

This can be considered the basic principle of OpenResty, which we will discuss in detail in later lessons. Today, let’s first clarify the relationship between Lua and LuaJIT.

Relationship between Standard Lua and LuaJIT #

Let’s start with the important point:

Standard Lua and LuaJIT are two different things. LuaJIT is fully compatible only with the syntax of Lua 5.1.

At the time of writing, the latest version of standard Lua is 5.3, while the latest version of LuaJIT is 2.1.0-beta3. In older versions of OpenResty, you could choose either the standard Lua VM or the LuaJIT VM as the execution environment at compile time. Support for standard Lua has since been removed, and now only LuaJIT is supported.

LuaJIT is syntax-compatible with Lua 5.1 and selectively supports some features of Lua 5.2 and 5.3. Therefore, we should first learn the syntax of Lua 5.1 and then build on it to learn the features of LuaJIT. In the previous lesson, I introduced the basic syntax of Lua, so today I will only point out some of its special features.

It is worth noting that OpenResty does not directly use the official LuaJIT 2.1.0-beta3 release. Instead, it maintains its own fork based on it, openresty-luajit2.

OpenResty maintains its own branch of LuaJIT and has added many unique APIs.

These unique APIs were added during the development of OpenResty for performance considerations. Therefore, when we mention LuaJIT later, we specifically refer to the LuaJIT branch maintained by OpenResty.

Why choose LuaJIT? #

After hearing so much about the relationship between LuaJIT and Lua, you might wonder why OpenResty uses a self-maintained LuaJIT instead of using Lua directly. The main reason is the performance advantage of LuaJIT.

In fact, for performance reasons, standard Lua also comes with a built-in virtual machine. Therefore, Lua code is not directly interpreted, but first compiled into bytecode by the Lua compiler and then executed by the Lua virtual machine.

In addition to an assembly-based Lua interpreter, LuaJIT’s runtime environment includes a JIT compiler that can directly generate machine code. At the beginning, just like standard Lua, Lua code is compiled into bytecode and then interpreted by the LuaJIT interpreter.

However, unlike standard Lua, LuaJIT’s interpreter records some runtime statistics while executing the bytecode, such as the actual number of calls to each Lua function entry point and the actual number of executions of each Lua loop. When these counts exceed a threshold (tunable through LuaJIT’s hotloop option, 56 by default), the corresponding Lua function entry point or Lua loop is considered sufficiently hot, and this triggers the JIT compiler.

The JIT compiler starts from the entry point of the hot function or some position of the hot loop, and attempts to compile the corresponding Lua code path. The compilation process involves transforming the LuaJIT bytecode into LuaJIT’s own defined intermediate representation (IR), and then generating machine code for the target architecture.

Therefore, the so-called performance optimization of LuaJIT is essentially about allowing as much Lua code as possible to be compiled into machine code by the JIT compiler, rather than falling back to the interpretation mode of the Lua interpreter. Only by understanding this principle can you grasp the essence of the performance optimization of OpenResty that you will learn later.
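To make the idea of a "hot" loop concrete, here is a minimal sketch. The function name sum_to is made up for illustration; the only LuaJIT-specific parts are the jit.version field and the jit.status() function, which are guarded so the snippet also runs on standard Lua.

```lua
-- A loop body that executes many times is a typical candidate for JIT compilation.
-- sum_to is a hypothetical helper used only for this illustration.
local function sum_to(n)
    local total = 0
    for i = 1, n do
        total = total + i
    end
    return total
end

local result = sum_to(1000)
print(result)               --> 500500

-- The global `jit` table exists only under LuaJIT, so guard its use.
if jit then
    print(jit.version)      -- version string of the running LuaJIT
    print(jit.status())     -- whether the JIT compiler is on, plus enabled flags
end
```

Whether a particular loop is actually compiled depends on the LuaJIT build and on the operations inside the loop; the snippet only shows the kind of code the interpreter counts.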

Special Features of Lua #

As we learned in the previous lesson, Lua is a relatively simple language. For engineers with a background in other programming languages, it is easy to understand the logic of the code once you notice some unique aspects of Lua. Let’s take a look at several features that make Lua different from other languages.

1. Lua indexes start from 1 #

Lua is the only programming language I know of that uses 1-based indexing. Although this makes it easier for people without a programming background to understand, it can easily cause bugs in programs.

Here is an example:

$ resty -e 't={100}; ngx.say(t[0])'

You would naturally expect this to print 100 or to raise an error saying that index 0 does not exist. But the result is unexpected: nothing is printed and no error is thrown. Let’s wrap the lookup in the type function to see what the value actually is:

$ resty -e 't={100};ngx.say(type(t[0]))'
nil

It turns out to be nil. In fact, handling nil values in OpenResty can also be confusing, but we will discuss this in more detail when we talk about OpenResty.
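As a small sketch of the same point in a form you can run with either resty or a plain Lua interpreter, the snippet below uses print instead of ngx.say and walks the array from index 1 to its length.

```lua
local t = {100, 200, 300}

print(t[1])     --> 100  (the first element lives at index 1)
print(t[0])     --> nil  (index 0 is simply an absent key, not an error)
print(#t)       --> 3

-- A numeric for loop over an array therefore runs from 1 to #t.
local total = 0
for i = 1, #t do
    total = total + t[i]
end
print(total)    --> 600
```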

2. Concatenating strings with .. #

As I mentioned in the previous lesson, Lua uses two dots (..) to concatenate strings, unlike most languages that use the plus sign (+):

$ resty -e "ngx.say('hello' .. ', world')"
hello, world

In actual project development, we usually use multiple programming languages, and Lua’s unconventional design always makes developers pause when concatenating strings. It can be quite frustrating.
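The following sketch shows why the habit of writing + is risky in Lua: the plus sign is always arithmetic, so it silently coerces numeric strings and raises an error for anything else. pcall is used here only to capture that error instead of aborting the script.

```lua
local greeting = 'hello' .. ', ' .. 'world'
print(greeting)             --> hello, world

-- Numbers are converted to strings by the .. operator.
print('count: ' .. 42)      --> count: 42

-- + is arithmetic: strings that look like numbers are coerced to numbers.
print('1' + '2')            --> 3

-- Non-numeric strings cannot be coerced, so + raises an error.
local ok, err = pcall(function()
    return 'hello' + 'world'
end)
print(ok)                   --> false
```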

3. Only table as a data structure #

Unlike languages like Python that have a rich set of built-in data structures, Lua only has one data structure called a table, which can include both arrays and dictionaries:

local color = {first = "red", "blue", third = "green", "yellow"}
print(color["first"])       --> output: red
print(color[1])             --> output: blue
print(color["third"])       --> output: green
print(color[2])             --> output: yellow
print(color[3])             --> output: nil

If no explicit key-value pair assignment is used, the table will default to using numbers as the indexes, starting from 1. So color[1] is blue.
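To see the two parts of such a table separately, here is a short sketch that reuses the color table from above. ipairs walks only the consecutive integer keys starting at 1, while pairs visits every key (in no guaranteed order).

```lua
local color = {first = "red", "blue", third = "green", "yellow"}

-- ipairs visits 1, 2, ... and stops at the first missing index.
local array_part = {}
for i, v in ipairs(color) do
    array_part[#array_part + 1] = i .. '=' .. v
end
print(table.concat(array_part, ' '))    --> 1=blue 2=yellow

-- pairs visits every key, both the integer ones and the named ones.
local n = 0
for _ in pairs(color) do
    n = n + 1
end
print(n)                                --> 4
```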

Furthermore, getting the correct length of a table is not easy. Let’s look at the following examples:

local t1 = { 1, 2, 3 }
print("Test1 " .. table.getn(t1))

local t2 = { 1, a = 2, 3 }
print("Test2 " .. table.getn(t2))

local t3 = { 1, nil }
print("Test3 " .. table.getn(t3))

local t4 = { 1, nil, 2 }
print("Test4 " .. table.getn(t4))

Running these examples with resty gives the following results:

Test1 3
Test2 2
Test3 1
Test4 1

As you can see, except for the first test case, which returns a length of 3, the tests produce unexpected results. In Lua, the length of a table is only guaranteed to be correct when the table is a sequence.

What is a sequence? First, a sequence is a subset of arrays. In an array, all elements can be accessed using positive integer indices, and there are no key-value pairs. Referring to the code above, except for t2, all the other tables are arrays.

Additionally, a sequence does not contain any holes, meaning there are no nil values in between. Taking these two points into consideration, in the table examples above, t1 is a sequence, while t3 and t4 are arrays but not sequences.

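At this point, you might have a question: why is the length of t4 1? This is because the logic that computes the length stops and returns as soon as it encounters a nil value, instead of scanning further. More precisely, for a table with holes the Lua reference manual only guarantees that the length is some index that directly precedes a nil value, so the result should not be relied on.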
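If you need a number that does not depend on how holes are handled, one portable option in plain Lua is to walk the table with pairs. The sketch below is only an illustration under that assumption: count_entries and max_index are hypothetical helper names, not part of Lua, LuaJIT, or OpenResty, and both cost a full traversal of the table.

```lua
-- Hypothetical helper: number of non-nil entries, whatever their keys are.
local function count_entries(t)
    local n = 0
    for _ in pairs(t) do
        n = n + 1
    end
    return n
end

-- Hypothetical helper: largest positive integer key, or 0 if there is none.
local function max_index(t)
    local max = 0
    for k in pairs(t) do
        if type(k) == 'number' and k % 1 == 0 and k > max then
            max = k
        end
    end
    return max
end

local t2 = { 1, a = 2, 3 }
local t4 = { 1, nil, 2 }

print(count_entries(t2), max_index(t2))   --> 3   2
print(count_entries(t4), max_index(t4))   --> 2   3
```

For t4 the two helpers give 2 and 3 respectively, which makes the hole at index 2 visible instead of hiding it behind a single length value.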

Did you completely understand all of that? This part can be quite complex. So is there any way to obtain the desired table length? The answer is yes, OpenResty provides extensions for this, which I will talk about in the dedicated table section later. For now, let’s keep it a mystery.

4. Variables are global by default #

I want to emphasize one thing: unless you are quite sure you need a global, always use local when declaring variables in Lua:

local s = 'hello'

This is because in Lua, variables are global by default and are placed in a table called _G. Accessing variables without using local involves costly operations as they need to be searched in the global table. Additionally, misspelling variable names can lead to difficult-to-locate bugs.

Therefore, in OpenResty programming, I strongly recommend that you always use local to declare variables, even when requiring a module:

-- Recommended 
local xxx = require('xxx')

-- Avoid
require('xxx')
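The sketch below illustrates both problems described above: an assignment without local leaks into _G, and a misspelled name silently creates a new global instead of raising an error. All names (configure, timeout, retries, total, totl) are made up for the example.

```lua
local function configure()
    timeout = 30            -- no `local`: this becomes the global _G.timeout
    local retries = 3       -- visible only inside configure
    return retries
end

configure()
print(_G.timeout)           --> 30   (leaked out of the function)
print(_G.retries)           --> nil  (the local never reached _G)

-- A typo in a variable name does not fail; it creates another global.
local total = 10
totl = total + 1            -- meant `total`, silently defines _G.totl
print(total)                --> 10
print(_G.totl)              --> 11
```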

LuaJIT #

Now that we understand these four special aspects of Lua, let’s continue discussing LuaJIT. In addition to being compatible with the syntax of Lua 5.1 and supporting JIT, LuaJIT also integrates closely with FFI (Foreign Function Interface), which allows you to directly call external C functions and use C data structures in Lua code.

Here is a simple example:

local ffi = require("ffi")
ffi.cdef[[
int printf(const char *fmt, ...);
]]
ffi.C.printf("Hello %s!", "world")

With just a few lines of code, you can directly call the C printf function in Lua and print out Hello world!. You can use the resty command to run it and see if it succeeds.
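Besides calling C functions, FFI also lets you declare and use C data structures. The following is a minimal sketch of that second use: the struct name demo_point_t is invented for the example, and the pcall guard is only there so the snippet does nothing (rather than failing) on an interpreter without the ffi module.

```lua
-- The ffi module ships with LuaJIT; guard the require so the sketch is harmless elsewhere.
local ok, ffi = pcall(require, "ffi")
local sum

if ok then
    -- demo_point_t is a hypothetical struct declared just for this example.
    ffi.cdef[[
    typedef struct { int x; int y; } demo_point_t;
    ]]

    -- Allocate one struct and initialize its fields in declaration order.
    local p = ffi.new("demo_point_t", 3, 4)
    p.x = p.x + 10              -- fields read and write like table fields

    sum = p.x + p.y
    print(p.x, p.y)             --> 13   4
    print(sum)                  --> 17
end
```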

Similarly, we can use FFI to call C functions from NGINX and OpenSSL to accomplish more tasks. In fact, the FFI approach performs better than the traditional Lua/C API approach, which is why the lua-resty-core project exists. In the next section, we will specifically talk about FFI and lua-resty-core.

In addition, for performance reasons, LuaJIT extends the table library with two functions: table.new and table.clear. Both are very important for performance optimization and are used frequently in OpenResty’s lua-resty libraries. However, because the relevant documentation is hard to find and there is no sample code, not many developers are familiar with them. We will cover them in detail in the section on performance optimization.
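As a first impression ahead of that section, here is a minimal sketch of how the two functions are typically loaded and called. It assumes LuaJIT 2.1, where they are available through require("table.new") and require("table.clear"); the pure-Lua fallbacks are an assumption added so the snippet still runs on other interpreters, and they do not provide the same performance benefit.

```lua
-- table.new(narr, nrec) preallocates space for narr array slots and nrec hash slots.
local ok_new, new_tab = pcall(require, "table.new")
if not ok_new then
    -- Fallback (assumption for portability): ignore the size hints.
    new_tab = function(narr, nrec) return {} end
end

-- table.clear(t) removes all entries but keeps the allocated space for reuse.
local ok_clear, clear_tab = pcall(require, "table.clear")
if not ok_clear then
    -- Fallback (assumption for portability): erase the keys one by one.
    clear_tab = function(t)
        for k in pairs(t) do
            t[k] = nil
        end
    end
end

local buf = new_tab(3, 0)       -- room for 3 array elements, no hash part
for i = 1, 3 do
    buf[i] = i * 10
end
print(#buf)                     --> 3

clear_tab(buf)                  -- buf is empty again and can be refilled
print(#buf)                     --> 0
```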

Conclusion #

Let’s review today’s content.

For performance reasons, OpenResty chose LuaJIT instead of standard Lua and maintained its own LuaJIT branch. LuaJIT is based on Lua 5.1 syntax and selectively supports some syntax from Lua 5.2 and Lua 5.3, forming its own system. When it comes to Lua syntax that you need to master, it has its distinctive features in indexing, string concatenation, data structures, and variables, so you should pay special attention when writing code.

While learning Lua and LuaJIT, have you encountered any traps and pitfalls? Feel free to leave a comment and let’s chat about it. I have also written a separate article to share the pitfalls I have encountered. You’re welcome to share this article with your colleagues and friends to learn and progress together.