32 String Operations Hated and Loved by Many

Hello, I’m Wen Ming.

In the previous lesson, I introduced you to the common blocking functions in OpenResty, which are often the places where beginners make mistakes. Starting from today, we will enter the core part of performance optimization, which will involve many optimization techniques that can help you quickly improve the performance of your OpenResty code. So, please take it seriously.

In this process, you need to write more test code to experience how these optimization techniques are used and validate their effectiveness. You should have a good understanding of them and be able to use them readily.

Behind the Performance Optimization Techniques #

Optimization techniques are all part of the “art” of optimization. Before discussing the techniques, let’s first talk about the “philosophy” of optimization.

Performance optimization techniques change as LuaJIT and OpenResty evolve. Some may be absorbed into the underlying technology, so that we no longer need to apply them by hand, while new techniques may emerge. Therefore, understanding the unchanging principles behind these techniques matters most.

Next, let’s take a look at several important concepts regarding performance in OpenResty programming.

Concept 1: Request processing should be short, simple, and fast #

OpenResty is a web server, so it often handles thousands, tens of thousands, or even hundreds of thousands of requests simultaneously. In order to achieve the highest overall performance, we must ensure that individual requests are processed quickly and resources are reclaimed.

  • The “short” mentioned here means that the lifespan of the request should be short, and resources should not be occupied for a long time without being released. Even for long connections, set a time or request count threshold to periodically release resources.
  • The second key point is “simple.” In each API, only one thing should be done. Complex business logic should be split into multiple APIs, keeping the code concise.
  • Lastly, “fast” means not blocking the main thread and avoiding heavy CPU operations. Even if such logic cannot be avoided, don’t forget the methods introduced in the previous section and coordinate with other services to complete them.

In fact, these architectural considerations are not only suitable for OpenResty, but also apply to other programming languages and platforms. I hope you can understand and reflect on them seriously.

Concept 2: Avoid generating intermediate data #

Avoiding unnecessary intermediate data can be said to be the most important optimization concept in OpenResty programming. Here, let me give you a small example to explain what is meant by “unnecessary intermediate data.” Let’s take a look at the following code:

$ resty -e 'local s= "hello"
s = s .. " world"
s = s .. "!"
print(s)
'

In this code, we concatenate the s variable multiple times to obtain the result hello world!. However, it is obvious that only the final state of s, which is hello world!, is useful. The initial value and intermediate assignments of s all belong to intermediate data and should be generated as little as possible.

The reason is that this temporary data incurs performance overhead for initialization and garbage collection. Do not underestimate these costs. If they occur in hot code such as loops, they cause a significant performance drop. Later, I will use an example with strings to explain this point.
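To make the cost of intermediate data visible, here is a minimal sketch that measures how much memory each approach allocates while building the same string. It assumes plain Lua functions only (collectgarbage), so it runs under resty, luajit, or lua. The helper name allocated_kb and the size N are made up for illustration, and the exact figures will vary by runtime.

```lua
-- Compare how much memory two approaches allocate while building the same string.
-- The collector is paused during each run so that temporary strings stay counted.
local N = 2000  -- illustrative size, small enough to run instantly

-- Hypothetical helper: returns the KB allocated by build() plus its result.
local function allocated_kb(build)
    collectgarbage("collect")            -- start from a clean heap
    collectgarbage("stop")               -- pause GC so garbage is not reclaimed mid-run
    local before = collectgarbage("count")
    local result = build()
    local after = collectgarbage("count")
    collectgarbage("restart")
    return after - before, result
end

local concat_kb, s1 = allocated_kb(function()
    local s = ""
    for i = 1, N do
        s = s .. "a"                     -- each pass creates a new, longer string
    end
    return s
end)

local table_kb, s2 = allocated_kb(function()
    local t = {}
    for i = 1, N do
        t[i] = "a"                       -- "a" is a single shared constant
    end
    return table.concat(t)
end)

print(string.format("concat: %.0f KB, table: %.0f KB", concat_kb, table_kb))
```

Both functions return the same string, but the first leaves behind one discarded string per iteration, which is exactly the intermediate data this concept asks you to avoid.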

Strings Are Immutable! #

Now, let’s get back to the topic of this section - strings. Here, I want to emphasize that in Lua, strings are immutable.

Of course, this doesn’t mean that strings cannot be concatenated or modified. It means that when you modify a string, you are actually creating a new string object and changing the reference to the string. Naturally, if the original string has no other references, Lua’s garbage collector will reclaim it.
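A small sketch of what this means in practice, using only standard Lua string operations (the variable names are arbitrary):

```lua
local a = "hello"
local b = a                 -- b refers to the same string as a

a = a .. " world"           -- builds a new string and rebinds a; "hello" is untouched

print(a)                    -- hello world
print(b)                    -- hello

-- Library functions behave the same way: they return a new string.
local upper = string.upper(b)
print(upper)                -- HELLO
print(b)                    -- hello
```

The variable b still sees the original content, because the concatenation never changed the string itself; it only made a point to a different one.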

The advantage of immutable strings is obvious: they save memory. Because strings are interned, there is only one copy of a string with the same content in memory, and different variables holding that content point to the same memory address.

The drawback of this design shows up when strings are created or garbage collected. Every time a string is produced, LuaJIT has to call lj_str_new to check whether that string already exists; if it does not, a new string has to be created. If such operations are frequent, they naturally have a significant impact on performance.

Let’s look at a specific example, similar to the string concatenation operation in many OpenResty open source projects:

$ resty -e 'ngx.update_time()
local begin = ngx.now()
local s = ""
-- for loop, use .. to concatenate strings
for i = 1, 100000 do
    s = s .. "a"
end
ngx.update_time()
print(ngx.now() - begin)
'

The purpose of this sample code is to concatenate the s variable with “a” one hundred thousand times and print the running time. Although the example is somewhat extreme, it effectively demonstrates the difference in performance before and after optimization. When not optimized, this code ran for 0.4 seconds on my laptop, which is quite slow. So how can we optimize it?

In the previous lessons, I actually gave the answer already: use a table as an intermediate container, so that all temporary strings are eliminated and only the original data and the final result remain. Let’s take a look at the specific code implementation:

$ resty -e 'ngx.update_time()
local begin = ngx.now()
local t = {}
-- for loop, use an array to store the strings, calculate the array length each time
for i = 1, 100000 do
    t[#t + 1] = "a"
end
-- concatenate the strings using the concat method of the array
local s = table.concat(t, "")
ngx.update_time()
print(ngx.now() - begin)
'

As you can see, I used a table to store the strings one by one. The index is #t + 1, that is, the current length of the table plus 1. At the end, table.concat joins all elements of the array and produces the final result directly. This skips all of the temporary strings and avoids roughly 100,000 lj_str_new calls and the garbage collection work that follows them.

That was our analysis of the code. So how effective is the optimization? Clearly, the optimized code only took 0.007 seconds, which means that the performance has improved by more than fifty times. In fact, in practical projects, the performance improvement may be even more significant, because in this example, we only added one character a each time.

What if the newly added string is 10 characters long? This is a homework question for you. Feel free to share your results in the comments.

Returning to our optimization work, is the code that took 0.007 seconds already good enough? Actually, no, there is still room for further optimization. Let’s modify one line of code and see the effect:

$ resty -e 'ngx.update_time()
local begin = ngx.now()
local t = {}
-- for loop, use an array to store the strings, maintain the length of the array
for i = 1, 100000 do
    t[i] = "a"
end
local s = table.concat(t, "")
ngx.update_time()
print(ngx.now() - begin)
'

This time, I changed t[#t + 1] = "a" to t[i] = "a". Modifying just this one line avoids calling the function that gets the array length one hundred thousand times. Do you remember the operation to get the length of an array that we mentioned in the table chapter? It is not a constant-time operation (LuaJIT locates the end of the array with a binary search, roughly O(log n)), so calling it on every iteration of a hot loop is relatively expensive. So here, we simply maintain our own array index and bypass the length operation. As the saying goes, if you can’t beat them, avoid them.

Of course, this is a simplified version that reuses the loop variable as the index. The code below shows more clearly how to maintain the array index yourself, and you can refer to it for a better understanding:

$ resty -e 'ngx.update_time()
local begin = ngx.now()
local t = {}
local index = 1
for i = 1, 100000 do
    t[index] = "a"
    index = index + 1
end
local s = table.concat(t, "")
ngx.update_time()
print(ngx.now() - begin)
'
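If the same pattern is needed in several places, it can be wrapped in a small helper. The following is only a sketch under that assumption: new_buffer, append, and tostring are hypothetical names, not part of OpenResty or LuaJIT.

```lua
-- Hypothetical helper (not part of OpenResty): a tiny string buffer that
-- tracks its own length instead of asking for #t on every append.
local function new_buffer()
    local parts, n = {}, 0
    local buf = {}

    function buf.append(str)
        n = n + 1
        parts[n] = str                         -- store the piece, no concatenation yet
        return buf                             -- allow chained calls
    end

    function buf.tostring()
        return table.concat(parts, "", 1, n)   -- join the pieces in a single call
    end

    return buf
end

local buf = new_buffer()
for i = 1, 5 do
    buf.append("a")
end
buf.append("!")
print(buf.tostring())                          -- aaaaa!
```

The counter n plays the same role as index in the example above; the closures add a little overhead of their own, so in the hottest loops the inline version remains the leaner choice.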

Reducing other temporary strings #

As we just discussed, it is obvious that string concatenation creates temporary strings. With the reminders from the previous examples, I believe you will not make similar mistakes again. However, OpenResty code also generates some subtler temporary strings that are not easy to spot. Would you have guessed that the string processing function I am about to talk about also generates temporary strings?

We know that the string.sub function is used to extract a specific part of a string. As mentioned earlier, strings in Lua are immutable, so the newly extracted string will involve lj_str_new and subsequent garbage collection (GC) operations.

$ resty -e 'print(string.sub("abcd", 1, 1))'

The purpose of the above code is to get the first character of the string and print it. Naturally, it unavoidably generates a temporary string. Is there a better way to achieve the same result?

$ resty -e 'print(string.char(string.byte("abcd")))'

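Of course. In the second code snippet, we first use string.byte to obtain the numeric code of the first character, and then use string.char to convert the number back into the corresponding character. The important step is string.byte: it returns a number, so no temporary string is created when the character is read. The string.char call is only there to turn the number back into text for printing; in real scanning code you would usually compare the byte values directly. Therefore, using string.byte for string scanning and analysis is the most efficient approach.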
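As an illustration of scanning with byte values, here is a minimal sketch. The helpers count_char and starts_with_slash are hypothetical names chosen for this example; only string.byte is a real library function.

```lua
local byte = string.byte

-- Hypothetical helper: count occurrences of a single-character separator
-- by comparing byte values, without extracting any substring.
local function count_char(s, ch)
    local target = byte(ch)            -- numeric code of the character we look for
    local count = 0
    for i = 1, #s do
        if byte(s, i) == target then   -- compares numbers; no new string is created
            count = count + 1
        end
    end
    return count
end

-- Hypothetical helper: check the first character without string.sub.
local function starts_with_slash(s)
    return byte(s, 1) == 47            -- 47 is the byte value of "/"
end

print(count_char("a,b,,c", ","))       -- 3
print(starts_with_slash("/index"))     -- true
print(starts_with_slash("index"))      -- false
```

A version based on string.sub(s, i, i) == "," would give the same answers, but it would ask for a one-character string on every iteration.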

Using APIs that accept tables directly #

After learning the methods to reduce temporary strings, are you excited to give it a try? We can output the result of the example code above as the content of the response body to the client. At this point, you can take a break and try to write this code yourself.

$ resty -e 'local t = {}
local index = 1
for i = 1, 100000 do
    t[index] = "a"
    index = index + 1
end
local response = table.concat(t, "")
ngx.say(response)
'

By writing this code, you have already surpassed the majority of OpenResty developers. However, don’t become complacent; there is still room for improvement. OpenResty’s Lua API already anticipates the use of tables for string concatenation. ngx.log does not take an array table, but it accepts a variable number of arguments and joins them in C. APIs that can receive a large amount of string data, such as ngx.say, ngx.print, and cosocket:send, accept not only strings but also array tables as arguments:

$ resty -e 'local t = {}
local index = 1
for i = 1, 100000 do
    t[index] = "a"
    index = index + 1
end
ngx.say(t)
'

In the last code snippet, we omitted local response = table.concat(t, "") and directly passed the table to ngx.say. This way, the task of string concatenation is shifted from the Lua layer to the C layer, avoiding one more string lookup, generation, and garbage collection. For relatively long strings, this is another significant performance improvement.
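The same idea applies when a response is assembled from many different pieces. The sketch below assumes a made-up helper, render_list, that collects fragments into a flat array; the fallback branch exists only so that the snippet also runs outside OpenResty.

```lua
-- Hypothetical helper: build the response as a flat array of fragments
-- instead of one concatenated string.
local function render_list(items)
    local out, n = {}, 0
    for i = 1, #items do
        out[n + 1] = "<li>"
        out[n + 2] = items[i]
        out[n + 3] = "</li>\n"
        n = n + 3                            -- maintain the index ourselves
    end
    return out
end

local fragments = render_list({ "apple", "banana" })

if ngx then
    ngx.print(fragments)                     -- OpenResty joins the fragments in C
else
    io.write(table.concat(fragments))        -- fallback so the sketch runs in plain Lua
end
```

Inside OpenResty, no Lua-level string containing the whole response is ever built; the array of fragments is handed over as it is.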

Final Thoughts #

After completing this lesson, you should have noticed that many of the performance optimizations in OpenResty involve tweaking various details. Therefore, you need to be familiar with LuaJIT and OpenResty’s Lua API in order to achieve optimal performance. This also serves as a reminder that if you have forgotten any of the earlier content, be sure to review and consolidate it in a timely manner.

Lastly, I will leave you with a homework question. I request that you write the strings “hello”, “world”, and an exclamation mark into the error log. Can you come up with an example code that does not involve string concatenation?

Additionally, do not forget the other homework question, which concerns the code below. What would the performance difference be if the string added in each iteration were 10 "a" characters long?

$ resty -e 'ngx.update_time()
local begin = ngx.now()
local t = {}
for i = 1, 100000 do
    t[#t + 1] = "a"
end
local s = table.concat(t, "")
ngx.update_time()
print(ngx.now() - begin)
'

I hope you will think and act actively, and share your answers and thoughts in the comments section. You are also welcome to share this article with your friends for learning and exchange.