37 How ‘systemtap-toolkit’ and ‘stapxx’ Can Pin Down Those Pesky Issues with Data #

Hello, I’m Wen Ming.

As mentioned in the previous lesson, as server-side development engineers we usually do not need to dig very deep into the dynamic-tracing toolchain itself. Most of the time we stay at the level of using the tools, writing at most a few simple stap scripts. Going deeper, into CPU caches, processor architecture, compilers, and so on, is the territory of performance engineers.

OpenResty has two companion open source projects, openresty-systemtap-toolkit and stapxx. Both are toolkits built on top of SystemTap for real-time analysis and diagnosis of Nginx and OpenResty, and they cover common debugging scenarios such as on-CPU and off-CPU profiling, shared dictionaries, garbage collection, request latency, memory pools, connection pools, and file access.

In today’s lesson, I will walk you through these tools and how to use them, so that when you run into a hard problem with Nginx or OpenResty you can quickly find the right tool to locate it. In the OpenResty world, learning to use these tools is a necessary step toward advancing, and it is also an effective way to communicate with other developers: after all, the data produced by a tool is more precise and detailed than a textual description.

However, note that the latest OpenResty release, 1.15.8, enables LuaJIT’s GC64 mode by default, while openresty-systemtap-toolkit and stapxx have not yet been updated for it, so the tools they contain do not work correctly on that version. It is therefore best to run them against the older OpenResty 1.13 releases.

The contributors to open source projects are mostly part-time, and they have no obligation to ensure that these tools can always be used properly. This is something you need to be aware of when using open source projects.

Using shared dictionaries as an example #

As usual, let’s start with an example, using the tool that is the most familiar and the easiest to get started with: ngx-lua-shdict.

ngx-lua-shdict is a tool that analyzes Nginx’s shared memory dictionaries and traces dictionary operations. You can use the -f option, together with --dict and --key, to fetch the data stored under a given key in a shared dictionary, and add --raw to dump the raw value of that key.

Here is an example command-line to retrieve data from a shared memory dictionary:

# Assuming the Nginx worker PID is 5050
$ ./ngx-lua-shdict -p 5050 -f --dict dogs --key Jim --luajit20
Tracing 5050 (/opt/nginx/sbin/nginx)...

type: LUA_TBOOLEAN
value: true
expires: 1372719243270
flags: 0xa

Similarly, you can use the -w option to trace dictionary write operations for a specific key:

$ ./ngx-lua-shdict -p 5050 -w --key Jim --luajit20
Tracing 5050 (/opt/nginx/sbin/nginx)...

Hit Ctrl-C to end

set Jim exptime=4626322717216342016
replace Jim exptime=4626322717216342016
^C
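For reference, here is a minimal Lua sketch of the kind of shared-dictionary writes that produce the trace output above. The dictionary name dogs, the key Jim, and the flags value 0xa come from the example; the lua_shared_dict size and the expiration time are illustrative assumptions:

-- assumes nginx.conf declares the dictionary, e.g.  lua_shared_dict dogs 10m;
-- and that this snippet runs in some *_by_lua* context
local dogs = ngx.shared.dogs

-- both writes below would show up as "set Jim ..." and "replace Jim ..."
-- in the -w trace above
local ok, err = dogs:set("Jim", true, 3600, 0xa)        -- boolean value, 1-hour TTL, flags 0xa
if not ok then
    ngx.log(ngx.ERR, "shdict set failed: ", err)
end

ok, err = dogs:replace("Jim", true, 3600, 0xa)
if not ok then
    ngx.log(ngx.ERR, "shdict replace failed: ", err)
end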

Now let’s look at how this tool is implemented. ngx-lua-shdict is a Perl script, but the implementation has little to do with Perl itself; Perl is only used to generate a SystemTap script and run it:

open my $in, "|stap $stap_args -x $pid -" or die "Cannot run stap: $!\n";

So you could just as well write the wrapper in Python, PHP, Go, or any other language you prefer. The key line in the SystemTap script is this one:

probe process("$nginx_path").function("ngx_http_lua_shdict_set_helper")

This is the kind of probe we mentioned in the previous lesson; it hooks the function ngx_http_lua_shdict_set_helper. The calls to this function all live in lua-nginx-module/src/ngx_http_lua_shdict.c of the lua-nginx-module module:

static int
ngx_http_lua_shdict_add(lua_State *L)
{
    return ngx_http_lua_shdict_set_helper(L, NGX_HTTP_LUA_SHDICT_ADD);
}

static int
ngx_http_lua_shdict_safe_add(lua_State *L)
{
    return ngx_http_lua_shdict_set_helper(L, NGX_HTTP_LUA_SHDICT_ADD
                                          |NGX_HTTP_LUA_SHDICT_SAFE_STORE);
}

static int
ngx_http_lua_shdict_replace(lua_State *L)
{
    return ngx_http_lua_shdict_set_helper(L, NGX_HTTP_LUA_SHDICT_REPLACE);
}

So, by probing this single function, we can trace all the write operations on the shared dictionary.
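To connect the C code back to the Lua API, here is a rough sketch of how the shared-dictionary write methods map onto the wrappers above (return values omitted for brevity; the key names are purely illustrative):

local dogs = ngx.shared.dogs

dogs:add("Jim", true)          -- ngx_http_lua_shdict_add
dogs:safe_add("Jim", true)     -- ngx_http_lua_shdict_safe_add
dogs:replace("Jim", false)     -- ngx_http_lua_shdict_replace

-- all three calls end up in ngx_http_lua_shdict_set_helper,
-- which is exactly the function the probe above hooks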

on CPU and off CPU #

When using OpenResty, the problems you run into most often are probably performance problems. Poor performance, meaning low QPS (queries per second), generally comes in one of two flavors: CPU usage that is too high, or CPU usage that is too low. The former usually means the code is not using the performance optimizations we introduced earlier; the latter usually means blocking functions are being called. Correspondingly, on-CPU and off-CPU flame graphs help us pinpoint the root cause in each case.
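To make the distinction concrete, here is a rough Lua sketch of the two patterns; the handlers are made up for illustration and are not taken from the toolkits:

-- inside some *_by_lua* handler

-- off-CPU pattern: the worker blocks outside the event loop,
-- so CPU usage stays low while QPS drops
os.execute("sleep 1")            -- a blocking system call in the request path

-- on-CPU pattern: the worker burns CPU in a hot loop,
-- so CPU usage is pegged while QPS drops
local sum = 0
for i = 1, 1e8 do
    sum = sum + i
end
ngx.say(sum)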

To generate a C-level on-CPU flame graph, you use sample-bt from openresty-systemtap-toolkit; for Lua-level on-CPU flame graphs, you use lj-lua-stacks from stapxx.

Let’s use sample-bt as an example to demonstrate how to use it. This script allows you to sample the call stack of any user process you specify (not limited to Nginx and OpenResty processes).

For example, you can use the following command to sample a running Nginx worker process with PID 8736 for 5 seconds:

$ ./sample-bt -p 8736 -t 5 -u > a.bt
WARNING: Tracing 8736 (/opt/nginx/sbin/nginx) in user-space only...
WARNING: Missing unwind data for module, rerun with 'stap -d stap_df60590ce8827444bfebaf5ea938b5a_11577'
WARNING: Time's up. Quitting now...(it may take a while)
WARNING: Number of errors: 0, skipped probes: 24

The resulting file a.bt can be used with the FlameGraph toolkit to generate a flame graph:

stackcollapse-stap.pl a.bt > a.cbt
flamegraph.pl a.cbt > a.svg

Here, a.svg is the generated flame graph, which you can open and view in a browser. Note, however, that you need to keep a certain level of request pressure on the server while sampling; otherwise, with zero samples, no flame graph can be generated.

Now let’s look at off-CPU sampling. The script to use here is sample-bt-off-cpu from openresty-systemtap-toolkit. Its usage is similar to sample-bt, as shown below:

$ ./sample-bt-off-cpu -p 10901 -t 5 > a.bt
WARNING: Tracing 10901 (/opt/nginx/sbin/nginx)...
WARNING: _stp_read_address failed to access memory location
WARNING: Time's up. Quitting now...(it may take a while)
WARNING: Number of errors: 0, skipped probes: 23

In stapxx, the tool for analyzing delays is epoll-loop-blocking-distr. It samples the specified user process and outputs the distribution of delays between consecutive epoll_wait system calls:

$ ./samples/epoll-loop-blocking-distr.sxx -x 19647 --arg time=60
Start tracing 19647...
Please wait for 60 seconds.
Distribution of epoll loop blocking latencies (in milliseconds)
max/avg/min: 1097/0/0
value |-------------------------------------------------- count
    0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  18471
    1 |@@@@@@@@                                            3273
    2 |@                                                    473
    4 |                                                     119
    8 |                                                      67
   16 |                                                      51
   32 |                                                      35
   64 |                                                      20
  128 |                                                      23
  256 |                                                       9
  512 |                                                       2
 1024 |                                                       2
 2048 |                                                       0
 4096 |                                                       0

In the output above, you can see that most latencies are below 1 millisecond, but a handful exceed 200 milliseconds; those are the ones to focus on.

Upstream and Phase Tracking #

Besides performance problems in the OpenResty code itself, when OpenResty talks to upstream services, whether through cosockets or through proxy_pass, any significant latency in the upstream service can have a big impact on overall performance.

In such cases, you can use tools such as ngx-lua-tcp-recv-time, ngx-lua-udp-recv-time, and ngx-single-req-latency to analyze the upstream side. Below, I will use ngx-single-req-latency as the example.
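For context, the first two tools roughly measure how long OpenResty spends waiting in cosocket receive operations. Below is a minimal cosocket sketch, with a made-up backend address and request line, showing the connect and receive calls whose latency these tools would surface:

-- runs in a *_by_lua* context; the backend address and request line are placeholders
local sock = ngx.socket.tcp()
sock:settimeout(1000)                               -- 1 second, in milliseconds

local ok, err = sock:connect("127.0.0.1", 8080)     -- upstream connect time
if not ok then
    ngx.log(ngx.ERR, "connect failed: ", err)
    return
end

sock:send("GET / HTTP/1.0\r\nHost: backend\r\n\r\n")

local line, rerr = sock:receive("*l")               -- waiting here is the receive latency
if not line then
    ngx.log(ngx.ERR, "receive failed: ", rerr)
end

sock:setkeepalive(10000, 32)                        -- return the connection to the pool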

ngx-single-req-latency, however, is different from most of the tools in these toolkits: the others gather a large number of samples and report statistical distributions, whereas ngx-single-req-latency follows an individual request and reports the time it spends in each OpenResty phase, such as rewrite, access, and content, as well as in the upstream.

Let’s look at a concrete example:

# making the ./stap++ tool visible in PATH:
$ export PATH=$PWD:$PATH

# assuming an nginx worker process's pid is 27327
$ ./samples/ngx-single-req-latency.sxx -x 27327
Start tracing process 27327 (/opt/nginx/sbin/nginx)...

POST /api_json
    total: 143596us, accept() ~ header-read: 43048us, rewrite: 8us, pre-access: 7us, access: 6us, content: 100507us
    upstream: connect=29us, time-to-first-byte=99157us, read=103us

$ ./samples/ngx-single-req-latency.sxx -x 27327
Start tracing process 27327 (/opt/nginx/sbin/nginx)...

GET /robots.txt
    total: 61198us, accept() ~ header-read: 33410us, rewrite: 7us, pre-access: 7us, access: 5us, content: 27750us
    upstream: connect=30us, time-to-first-byte=18955us, read=96us

This tool traces the first request it encounters after it is started. Its output is similar to OpenTracing data; you could even think of openresty-systemtap-toolkit and stapxx as a non-intrusive APM (Application Performance Monitoring) solution for OpenResty.

Closing Remarks #

Beyond the commonly used tools I covered today, OpenResty provides many more for you to explore and learn on your own.

In fact, when it comes to tracing technology, OpenResty takes a path quite different from other development languages and platforms, and I hope you will gradually come to appreciate it:

Keep the code base clean and stable, avoid adding probes to it, and instead sample it from the outside with dynamic tracing techniques.

Finally, I’ll leave you with a question: which tools have you used to trace and analyze problems when working with OpenResty? Feel free to leave a comment and discuss it with me, and to share this article so that we can all learn and improve together.