25 More Testing Techniques

In the previous article, we covered the fundamentals of testing Go programs: the basic rules and main workflow, the common methods of the testing.T and testing.B types, the basic usage of the go test command, and how to read ordinary test results.

In this article, I will continue with more advanced testing techniques. This involves more of the API in the testing package, the richer test results that the go test command produces with various flags, test coverage analysis, and so on.

Introduction: -cpu flag #

Let’s pick up where we left off. Earlier I mentioned the -cpu flag of the go test command, which sets the maximum number of P (processors) used while the tests run.

Let’s review. When discussing the go statement, I mentioned that P stands for processor. Each P is an intermediary that can hold a number of G (goroutines) and connect them with an M (machine) at the right moment so that they actually run.

It is precisely because P exists that G and M can have a many-to-many relationship and can be paired and separated promptly and flexibly.

Here, G stands for goroutine, which can be understood as a user-level thread implemented by the Go language. M stands for machine, which represents a system-level (kernel-level) thread of the operating system.

P is the key to the Go concurrent programming model being able to support tens of thousands of goroutines. The number of P determines how many queues the runtime system of a Go program has for holding runnable G.

Each queue is like a pipeline that continuously delivers runnable G to idle M and pairs them up.

Once a G has been paired with an M, it runs on that kernel-level thread of the operating system. Although there is some communication between the pipelines, they largely operate independently.

Therefore, the maximum number of P represents the ability of the Go language runtime system to run goroutines simultaneously, and it can also be regarded as the maximum number of logical CPUs. The -cpu flag of the go test command is used to set this maximum number.

Perhaps you already know that by default, the maximum number of P is equal to the actual number of CPU cores of the current computer.

Of course, the former can be set greater or smaller than the latter, which to some extent simulates computers with different numbers of CPU cores.

So, it can also be said that using the -cpu flag can simulate the behavior of the tested program on computers with different computational capabilities.
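As a rough illustration of what changing the maximum number of P looks like from inside a program, the sketch below uses runtime.NumCPU and runtime.GOMAXPROCS from the standard library. The helper name withMaxProcs is made up for this example, and the sketch does not claim to show what go test does internally.

```go
package main

import (
	"fmt"
	"runtime"
)

// withMaxProcs is a hypothetical helper: it sets the maximum number of P
// to n, reports the value now in effect, and restores the previous setting.
func withMaxProcs(n int) int {
	prev := runtime.GOMAXPROCS(n) // returns the previous setting
	defer runtime.GOMAXPROCS(prev)
	return runtime.GOMAXPROCS(0) // an argument of 0 only queries the value
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("current max P:", runtime.GOMAXPROCS(0))
	for _, n := range []int{1, 2, 4} {
		fmt.Printf("requested %d, in effect %d\n", n, withMaxProcs(n))
	}
}
```

Note that the requested value is accepted even when it is larger than the number of logical CPUs, which is what makes the "simulation" described above possible.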

Now that you know the purpose of the -cpu flag and the meaning behind it, do you also understand how to use it and how it affects the go test command?

Our question today is: how do you set the value of the -cpu flag, and what impact does it have on the test process?

The typical answer here is:

The value of the -cpu flag should be a list of positive integers, written as integer literals separated by commas, such as 1,2,4.

For each positive integer in this value, the go test command will first set the maximum number of P to that number and then execute the test function.

If there are multiple test functions, the go test command will execute them one by one in the same way.

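Taking 1,2,4 as an example, the go test command will execute the first test function with a maximum of 1, then 2, and then 4 P, then execute the second test function in the same way, and so on. This per-function ordering is what you will see in benchmark output. For functional tests, current versions of the testing package apply each value to the whole set of test functions in the package before moving on to the next value, so the order of execution differs while the number of executions stays the same.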

Problem Analysis #

In fact, whether or not we append the -cpu flag, the go test command follows the same process when executing test functions; only the specific steps differ slightly.

During its preparation, the go test command reads the value of the -cpu flag and converts it into a slice with int as the element type, which we can call the logical CPU slice.

If the command finds that we did not append this flag, the logical CPU slice will contain only one element value: the default maximum number of P, that is, the number of CPU cores of the current computer.
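The conversion described above can be pictured with a small sketch. The function parseCPUList is an illustrative stand-in written for this article, not the code in the testing package, and its error message is made up.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"strings"
)

// parseCPUList is an illustrative stand-in for the flag handling described
// above; the real logic lives inside the testing package.
func parseCPUList(value string) ([]int, error) {
	if strings.TrimSpace(value) == "" {
		// Flag not given: fall back to the current maximum number of P.
		return []int{runtime.GOMAXPROCS(0)}, nil
	}
	var list []int
	for _, field := range strings.Split(value, ",") {
		n, err := strconv.Atoi(strings.TrimSpace(field))
		if err != nil || n <= 0 {
			return nil, fmt.Errorf("invalid value %q for -cpu", field)
		}
		list = append(list, n)
	}
	return list, nil
}

func main() {
	for _, v := range []string{"1,2,4", "", "2,x"} {
		list, err := parseCPUList(v)
		fmt.Printf("%q -> %v, err=%v\n", v, list, err)
	}
}
```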

When executing test functions, whether functional or performance test functions, the go test command iterates through the logical CPU slice. In each iteration, it first sets the maximum number of P to the current element value and then executes the test function.

Note that a performance test function may be executed more than once in each iteration. Do you remember the upper limit on the execution time of a test function, and b.N, the number of times the program under test is executed?

If you have forgotten, you can review the second extended question in the previous article. In short, each time the go test command runs a performance test function, it goes through an exploratory process: it tries to find the maximum number of times the program under test can be executed without exceeding the execution time limit of the test function.

In this process, the performance test function may be executed multiple times. For convenience, we will call such a process an exploratory execution of the performance test function. It includes multiple executions of the function and, of course, even more executions of the program under test.
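To make the idea of an exploratory execution concrete, here is a deterministic simulation. It is only loosely modeled on how b.N grows; the function name explore, the growth constants, and the fixed cost per operation are assumptions of this sketch, and the real rule in the testing package differs in detail and has changed between Go versions.

```go
package main

import "fmt"

// explore simulates one exploratory execution. costNs is the assumed cost
// of one call to the program under test and budgetNs is the time limit,
// both in nanoseconds. It returns the final b.N and the number of times
// the benchmark function itself was run.
func explore(costNs, budgetNs int64) (n int64, rounds int) {
	if costNs <= 0 {
		costNs = 1 // avoid dividing by zero in the prediction below
	}
	n = 1
	for {
		rounds++
		elapsed := n * costNs // pretend we ran the loop n times
		if elapsed >= budgetNs || n >= 1e9 {
			return n, rounds
		}
		next := budgetNs * n / elapsed // predict the n that fills the budget
		next += next / 5               // overshoot a little
		if next > 100*n {
			next = 100 * n // never grow too fast in one step
		}
		if next < n+1 {
			next = n + 1 // always make progress
		}
		if next > 1e9 {
			next = 1e9
		}
		n = next
	}
}

func main() {
	n, rounds := explore(1_000_000, 1_000_000_000) // 1ms per call, 1s budget
	fmt.Printf("b.N=%d after %d runs of the benchmark function\n", n, rounds)
}
```

With a cost of 1ms per call and a budget of 1s, the simulated b.N goes 1, 100, 1200, so the benchmark function is run three times within a single exploratory execution.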

Speaking of multiple executions of the test function, we have to mention another flag, -count. The -count flag is specifically used to repeat the execution of the test function. Its value must be greater than or equal to 0, and the default value is 1.

If we append -count 5 when running the go test command, the command will execute each test function five times under each preset condition (such as each maximum number of P).

If we combine the -cpu flag, -count flag, and exploratory execution, we can use a formula to describe the number of times a single performance test function is executed during a single run of the go test command as follows:

Number of executions of the performance test function = number of positive integers in the value of the `-cpu` flag x value of the `-count` flag x actual number of executions of the test function in the exploratory execution

For functional test functions, this formula will be simpler, that is:

Number of executions of the functional test function = number of positive integers in the value of the `-cpu` flag x value of the `-count` flag

(Actual number of executions of the test function)
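The two formulas are simple products, as the following sketch spells out. The function names are made up for this example, and runsPerExploration is a simplifying assumption: in practice the number of runs can differ from one exploratory execution to the next.

```go
package main

import "fmt"

// functionalRuns applies the formula for functional test functions.
func functionalRuns(cpuValues []int, count int) int {
	return len(cpuValues) * count
}

// benchmarkRuns applies the formula for performance test functions,
// assuming every exploratory execution takes the same number of runs.
func benchmarkRuns(cpuValues []int, count, runsPerExploration int) int {
	return len(cpuValues) * count * runsPerExploration
}

func main() {
	cpus := []int{1, 2, 4} // as in -cpu=1,2,4
	fmt.Println(functionalRuns(cpus, 5))   // 3 x 5 = 15
	fmt.Println(benchmarkRuns(cpus, 5, 3)) // 3 x 5 x 3 = 45
}
```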

After reading these two formulas, you may recognize a situation you have encountered: during some kind of automated testing of a Go program, the test logs are particularly long and much of their content is repetitive. In that case, we should first think about the flags and processes that cause test functions to be executed multiple times. We often need to check whether these flags are used reasonably, whether the logging is necessary, and so on, in order to streamline the test logs.

For example, it is usually unnecessary to execute functional test functions repeatedly, even under different maximum numbers of P. Note that repeated execution here means multiple executions with the same input (such as the same argument values for the function under test).

Sometimes, when the input is exactly the same, the program under test may exhibit different behaviors due to differences in other external environments. In such cases, what we need to consider is usually whether the program is reasonable in design, rather than detecting risks through repeated test execution.

Sometimes, our program unavoidably depends on certain external environments, such as databases or other services. In this case, we should not use repeated test execution as a means of detection, but should circumvent their uncertainty by simulating (mocking) the external environment in the test.
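One common way to simulate an external dependency in Go is to put it behind an interface and substitute an in-memory implementation in tests. Everything in the sketch below (UserStore, Greeting, fakeStore) is hypothetical and only illustrates the idea.

```go
package main

import (
	"errors"
	"fmt"
)

// UserStore abstracts the external dependency (for example, a database).
type UserStore interface {
	Name(id int) (string, error)
}

// Greeting is the unit under test; it only knows the interface.
func Greeting(store UserStore, id int) string {
	name, err := store.Name(id)
	if err != nil {
		return "Hello, guest"
	}
	return "Hello, " + name
}

// fakeStore is an in-memory substitute used in place of the real store.
type fakeStore map[int]string

func (f fakeStore) Name(id int) (string, error) {
	if name, ok := f[id]; ok {
		return name, nil
	}
	return "", errors.New("user not found")
}

func main() {
	store := fakeStore{1: "Ann"}
	fmt.Println(Greeting(store, 1)) // Hello, Ann
	fmt.Println(Greeting(store, 2)) // Hello, guest
}
```

Because the fake always answers the same way, a test built on it gives the same result on every run, so there is nothing left for repeated execution to detect.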

In fact, the point of unit testing is to test a single functional module with clearly defined boundaries, without testing the external environment. This is also the main idea conveyed by the word “unit.”

By contrast, we often do need to execute performance test functions repeatedly, in order to smooth out the slight differences in resource scheduling that occur from one run of the program to the next. With the -cpu flag, we can also simulate how the program under test performs on computers with different computational capabilities.

However, it is worth noting that the maximum number of P set here should not exceed the actual number of CPU cores of the current computer, because once it exceeds the computer's actual parallel processing capability, the Go program gains no significant further performance improvement.

This is like a funnel: no matter how fast we pour water in, the rate at which it flows out is limited. Moreover, managing too many P also costs the Go runtime system additional computational resources.

Obviously, the performance obtained from such simulations will certainly be inaccurate. However, this can more or less serve as a reference, because the simulated performance is generally lower than the actual performance of the program in the computing environment.

Okay, that’s all for now about the -cpu flag, the -count flag, and the multiple executions of test functions that they lead to. To reinforce what you have learned so far, here is a set of test results:

pkg: puzzlers/article21/q1
BenchmarkGetPrimesWith100-2        10000000        218 ns/op
BenchmarkGetPrimesWith100-2        10000000        215 ns/op
BenchmarkGetPrimesWith100-4        10000000        215 ns/op
BenchmarkGetPrimesWith100-4        10000000        216 ns/op
BenchmarkGetPrimesWith10000-2         50000      31523 ns/op
BenchmarkGetPrimesWith10000-2         50000      32372 ns/op
BenchmarkGetPrimesWith10000-4         50000      32065 ns/op
BenchmarkGetPrimesWith10000-4         50000      31936 ns/op
BenchmarkGetPrimesWith1000000-2         300    4085799 ns/op
BenchmarkGetPrimesWith1000000-2         300    4121975 ns/op
BenchmarkGetPrimesWith1000000-4         300    4112283 ns/op
BenchmarkGetPrimesWith1000000-4         300    4086174 ns/op

Now, I hope you can reverse engineer the values of the -cpu flag and the -count flag that I appended when running the go test command. After deducing, you can verify them experimentally.

Knowledge Expansion #

Question 1: What is the purpose of the -parallel flag? #

When running the go test command, we can append the -parallel flag, which is used to set the maximum number of concurrent executions of functional test functions in the same package under test. The default value of this flag is the maximum number of P (which can be obtained by calling the expression runtime.GOMAXPROCS(0)) during the test runtime.

As mentioned in the previous article, for functional tests, in order to speed up the test process, the command usually tests multiple packages concurrently.

However, by default, the command executes multiple functional test functions in the same package serially, unless we explicitly call the t.Parallel method in some of them.

In that case, the functional test functions that call t.Parallel will be executed concurrently by the go test command, and the maximum number of concurrent executions is determined by the value of the -parallel flag. Note, however, that multiple executions of the same functional test function are always serial.

You can run the command go test -v puzzlers/article21/q2 or go test -count=2 -v puzzlers/article21/q2, look at the test results, and observe this behavior for yourself.
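In a real test file, a functional test function opts in simply by calling t.Parallel() near its start. The effect of the upper limit can be pictured with an ordinary bounded-concurrency sketch. The code below is an analogy built on a buffered channel used as a semaphore, not the implementation in the testing package; the name maxConcurrency and the 5ms of simulated work are assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// maxConcurrency starts the given number of tasks but lets at most limit of
// them run at once, and returns the highest concurrency it observed.
func maxConcurrency(limit, tasks int) int {
	sem := make(chan struct{}, limit) // plays the role of the -parallel value
	var wg sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0

	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // wait for a free slot
			defer func() { <-sem }() // release the slot when done

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			time.Sleep(5 * time.Millisecond) // simulated test body

			mu.Lock()
			running--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("peak concurrency:", maxConcurrency(3, 10))
}
```

However many tasks are started, the observed peak never exceeds the limit, which is the guarantee the -parallel flag gives for test functions that call t.Parallel.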

Finally, it is worth emphasizing that the -parallel flag is not effective for performance testing. Of course, performance testing can also be performed concurrently, but the mechanism is different.

In short, this involves the combined use of the b.RunParallel method, the b.SetParallelism method, and the -cpu flag. If you want to learn more, you can refer to the documentation of the testing package.
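For orientation, here is a minimal sketch of a concurrent benchmark. It runs the benchmark through testing.Benchmark so that it works as an ordinary program; in a test file the same body would sit in a BenchmarkXxx function. The helper name benchParallelAdd and the atomic counter workload are made up for this example.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"testing"
)

// benchParallelAdd runs a small benchmark with b.RunParallel and returns
// the number of iterations (b.N) of the final run.
func benchParallelAdd() int {
	testing.Init() // registers the testing flags; harmless if already done
	var counter int64
	result := testing.Benchmark(func(b *testing.B) {
		// RunParallel will use 2 * GOMAXPROCS goroutines.
		b.SetParallelism(2)
		b.RunParallel(func(pb *testing.PB) {
			// The b.N iterations are shared out among the goroutines.
			for pb.Next() {
				atomic.AddInt64(&counter, 1)
			}
		})
	})
	return result.N
}

func main() {
	fmt.Println("iterations:", benchParallelAdd())
}
```

Since the number of goroutines is a multiple of the maximum number of P, changing -cpu changes the degree of concurrency of such a benchmark.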

Question 2: What is the purpose of the timer in performance test functions? #

If you have read the documentation of the testing package, you may have noticed several pointer methods of the testing.B type: StartTimer, StopTimer, and ResetTimer. These methods are used to manipulate the timer dedicated to the current performance test function.

The so-called timer is a logical concept, which is actually a collective term for some fields in the testing.B type. These fields are used to record: the time spent by the current test function in the current execution, the number of bytes of heap memory allocated, and the number of allocations.
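The three quantities the timer records can be read back from a testing.BenchmarkResult. The sketch below runs a tiny benchmark via testing.Benchmark; the helper name measureAlloc, the sink variable, and the 1024-byte allocation are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so that the slice allocated below escapes to the heap.
var sink []byte

// measureAlloc benchmarks a 1 KiB allocation and returns the per-operation
// time, allocated bytes, and number of allocations recorded by the timer.
func measureAlloc() (nsPerOp, bytesPerOp, allocsPerOp int64) {
	testing.Init() // registers the testing flags; harmless if already done
	r := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = make([]byte, 1024)
		}
	})
	return r.NsPerOp(), r.AllocedBytesPerOp(), r.AllocsPerOp()
}

func main() {
	ns, bytes, allocs := measureAlloc()
	fmt.Printf("%d ns/op, %d B/op, %d allocs/op\n", ns, bytes, allocs)
}
```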

As a matter of fact, the go test command itself uses such a timer. When preparing to execute a performance test function, the command will reset and start the dedicated timer for that function. Once this function is executed, the command will immediately stop the timer.

In this way, the command can accurately record the execution time (which we have mentioned many times before). Then, the command will compare this time with the execution time limit and decide whether to increase the value of b.N and execute the test function again.

Remember? This is what I mentioned earlier, the exploratory execution of performance test functions. Obviously, if we manipulate this timer ourselves in the test function, it will definitely affect the results of this exploratory execution. In other words, this will make the command find a different maximum number of executions of the tested program.

Take the performance test function in the demo57_test.go file as an example, as shown below:

func BenchmarkGetPrimes(b *testing.B) {
	b.StopTimer()
	time.Sleep(time.Millisecond * 500) // Simulate an additional time-consuming operation unrelated to the program under test.
	max := 10000
	b.StartTimer()

	for i := 0; i < b.N; i++ {
		GetPrimes(max)
	}
}

Please pay attention to the first four lines of code in this function. I first stopped the timer for the current test function, and then simulated a time-consuming additional operation by calling the time.Sleep function. After assigning the variable max, I started the timer again.

You can imagine that we need extra time to determine the value of the max variable. Although it will be passed to the GetPrimes function later, the performance test of the GetPrimes function itself should not include the process of determining the parameter value.

Therefore, we need to exclude the time spent on this process from the execution time of the current test function. In this way, we can avoid the adverse effects of this process on the test results.

After each execution of this test function, the go test command should only include the time spent calling the GetPrimes function in the execution time. Only based on this time can we make accurate subsequent judgments and find the maximum number of executions of the tested program.

In performance test functions, we can exclude the execution time of any code segment by combining the b.StartTimer and b.StopTimer methods.

By comparison, the b.ResetTimer method is somewhat less flexible: it can only be used to exclude the execution time of the code that runs before it is called. However, it works regardless of whether the timer is running when it is called.
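A typical use of b.ResetTimer is to discard the cost of setup code at the top of a benchmark. The sketch below is a hypothetical example, run through testing.Benchmark so that it works as a standalone program; the data size and the use of sort.SearchInts are arbitrary choices.

```go
package main

import (
	"fmt"
	"sort"
	"testing"
)

// benchSortedSearch is a hypothetical benchmark: building the input is
// setup work, and searching it is the code being measured.
func benchSortedSearch() testing.BenchmarkResult {
	testing.Init() // registers the testing flags; harmless if already done
	return testing.Benchmark(func(b *testing.B) {
		data := make([]int, 1_000_000) // setup, not part of the measurement
		for i := range data {
			data[i] = i * 2
		}
		b.ResetTimer() // zero the time and allocation counters recorded so far

		for i := 0; i < b.N; i++ {
			sort.SearchInts(data, i%len(data))
		}
	})
}

func main() {
	r := benchSortedSearch()
	fmt.Println("iterations:", r.N, "ns/op:", r.NsPerOp())
}
```

Unlike the StopTimer and StartTimer pair, a single ResetTimer call cannot exclude code that runs in the middle of or after the measured loop.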

Summary #

In this article, I assume that you have already understood the content covered in the previous article. Therefore, I further elaborate on the testing process of functional testing and performance testing under different conditions, focusing on several important flags that can be accepted by the go test command.

Among them, the significance of the maximum P count, the role of the -cpu flag and its impact on the testing process, the significance of exploratory execution for performance testing functions, the calculation method of the execution time of the test function, and the purpose and applicable scenarios of the -count flag are particularly important.

Of course, it is also necessary to learn how to execute multiple functional testing functions concurrently. This requires the combined use of the -parallel flag and the t.Parallel method in the functional testing function.

In addition, you also need to understand what the dedicated timer of a performance test function is, and what the StartTimer, StopTimer, and ResetTimer methods do to it. By operating on the timer, we can measure the execution time of performance test functions accurately, thereby helping the go test command find the true maximum execution count of the program under test.

With this, our discussion on testing Go programs is coming to an end. It is important to understand the basic testing process performed by the go test command, and the means by which we can make changes to the testing process to meet our testing requirements and provide more comprehensive testing results.

I hope you have learned something from this and can apply it in practice.

Thought Exercise #

What are the functions of the -benchmem flag and the -benchtime flag? How can you enable test coverage analysis during testing? If enabled, what side effects might there be?

To answer these questions, you can refer to the testing flags section of the official Go command documentation.
