16 Why are your tests not good enough? #

Hello! My name is Zheng Ye. Today is Chinese New Year’s Eve, and I am here to wish everyone a prosperous year ahead!

Regarding testing, we have discussed a lot so far, such as: developers should write tests; code should be written with testability in mind; in order to do TDD well, we need to decompose tasks properly. I have also shown you a practical example of task decomposition.

But there is one important topic about testing that we haven’t talked about yet, which is how tests should be written. Today, I will talk about how to write good tests.

You might say, isn’t it simple? Haven’t we already discussed this earlier? Isn’t it just writing code using a testing framework? In theory, it should indeed be that simple, but the reality is often different. I have seen many teams facing various issues with testing, such as:

Tests are unstable: they sometimes pass and sometimes fail;
Sometimes a test itself is simple, but there are many dependencies to set up, which takes a long time;
A test can only be run after another test is finished;
…

If you have encountered similar problems in your work, then your understanding of writing tests may be different from mine. So what is the problem?

Why are your tests not good enough?

The main reason is that these tests are not simple enough. Only by breaking down complex tests into simple tests can we achieve good testing.

A Simple Test #

Why should tests be simple? There’s an interesting logic to it. Have you ever thought about what tests are used for? Obviously, they are used to ensure the correctness of the code. This leads to a question: who ensures the correctness of the tests?

Many people might be initially stumped by this question, but soon an answer would come to mind: tests. However, have you ever seen someone writing tests for tests? Definitely not. Because if you do that, the same question arises: who ensures the correctness of those tests? You can’t keep recursively writing tests for tests.

Since we cannot ensure the correctness of the tests by programming, we have only one option: write tests in a simple manner, so simple that it is self-evident and doesn’t require proof of correctness. Therefore, if you see a test that is complicated, it is definitely not a good test.

Since tests should be simple, let’s take a look at what a simple test might look like. Here is a simple example.

@Test
void should_extract_HTTP_method_from_HTTP_request() {
  // preparation: mock the request so that it reports GET
  HttpRequest request = mock(HttpRequest.class);
  when(request.getMethod()).thenReturn(HttpMethod.GET);
  HttpMethodExtractor extractor = new HttpMethodExtractor();
  
  // execution
  HttpMethod method = extractor.extract(request);
  
  // assertion
  assertThat(method, is(HttpMethod.GET));
  
  // cleanup (nothing to release in this test)
}

This test is from my open-source project Moco, and I made a few adjustments to make it easier to understand. The test is very simple: it extracts the HTTP method from an HTTP request.

I divided this code into four sections: preparation, execution, assertion, and cleanup, which are the four parts that a general test should have.

Execution is the core of the four parts: it is the target of the test. In practice, however, it is often the shortest part, usually just a single line of code, and the other parts revolve around it. In this test, it calls the HTTP method extractor to extract the HTTP method.
Preparation sets up whatever the execution part depends on, such as the components a class depends on or the arguments a method call requires. In this test, we prepare an HTTP request whose method is GET. We use the mock framework mentioned earlier because building a complete HTTP request is cumbersome and irrelevant to this test.
Assertion states what we expect the code to have done. Here, we check that the extracted method is GET. As a side note, assertion is not limited to assert calls: if you use a mock framework, using verify to check how a mock object was called is also a form of assertion.
Cleanup is optional. If your test uses any resources, release them here. If you make good use of existing testing infrastructure (e.g., JUnit’s Rule) and follow good testing practices, this part can often be omitted.

Seen this way, the test is very simple, isn’t it? It is self-evident and needs no proof of correctness, just as I described earlier.

Bad Smells in Testing #

Now that we have an understanding of the structure of testing, let’s talk about common “bad smells” in testing.

First, let’s look at the execution part. You may have noticed that when I described it earlier, I said it is “usually just a single line of code.” The first “bad smell” comes from exactly here.

Many people want to do a lot of things in one test, such as invoking several different methods. So let me ask you: what is your test actually testing?

Once an error occurs in this test, you have to check all the relevant methods, which undoubtedly increases the complexity of the work.

You may ask, what should I do if I have several methods to test? It’s simple, just write several tests.
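As a rough sketch of that split, the example below tests a small `Counter` class. Both `Counter` and the test names are hypothetical, invented for this illustration rather than taken from Moco. The sketch uses plain Java checks instead of a testing framework so that it runs on its own; each method stands in for one test and makes exactly one call to the code under test in its execution step.

```java
// Hypothetical class under test, invented for this illustration.
class Counter {
    private int value;

    void increment() { value++; }

    void reset() { value = 0; }

    int value() { return value; }
}

// Each method stands in for one test and exercises exactly one behavior.
class CounterTests {
    static void should_increase_value_by_one_on_increment() {
        // preparation
        Counter counter = new Counter();

        // execution
        counter.increment();

        // assertion
        check(counter.value() == 1, "increment should add one");
    }

    static void should_return_to_zero_on_reset() {
        // preparation: bring the counter to a non-zero state first
        Counter counter = new Counter();
        counter.increment();

        // execution
        counter.reset();

        // assertion
        check(counter.value() == 0, "reset should return to zero");
    }

    // Minimal stand-in for a framework assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        should_increase_value_by_one_on_increment();
        should_return_to_zero_on_reset();
        System.out.println("all tests passed");
    }
}
```

If `reset` breaks, only the second test fails, so the failing test's name already points at the method to inspect.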

Another typical “bad smell” hotspot is assertions. Please remember, tests must have assertions. Tests without assertions are meaningless, just like claiming to be the world champion but never participating in a competition!

I have seen many people write a lot of tests, but the tests hardly ever fail. Out of curiosity, I looked at the code and found no assertions.

Of course, without assertions, the tests won’t fail. The colleague who wrote the tests complained that it is difficult to write tests and that he has already verified that the code is correct. As I mentioned earlier, if it is difficult to write tests, it is often a design problem that needs to be adjusted, rather than compromising in the testing phase.
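To make the point concrete, here is a minimal sketch using a hypothetical `Discount` class (invented for this illustration) that contains a deliberate bug. It uses plain Java checks rather than a testing framework so that it runs on its own. The first "test" has no assertion and cannot fail; the second makes the same call but checks the result.

```java
// Hypothetical class under test, invented for this illustration.
class Discount {
    // Deliberately buggy: it adds the discount instead of subtracting it.
    static int apply(int price, int percent) {
        return price + price * percent / 100;
    }
}

class DiscountTests {
    // Bad smell: no assertion, so this "passes" whatever apply() returns.
    static void should_apply_discount_without_assertion() {
        Discount.apply(200, 10);
    }

    // With an assertion, the same call can fail and expose the bug.
    static void should_apply_discount() {
        int discounted = Discount.apply(200, 10);
        if (discounted != 180) {
            throw new AssertionError("expected 180 but was " + discounted);
        }
    }

    public static void main(String[] args) {
        should_apply_discount_without_assertion();
        System.out.println("test without assertion: passed");
        try {
            should_apply_discount();
            System.out.println("test with assertion: passed");
        } catch (AssertionError e) {
            System.out.println("test with assertion: failed, " + e.getMessage());
        }
    }
}
```

Running `main` reports the first test as passed even though the code is wrong; only the second test notices that the result is 220 rather than 180.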

There is another common “bad smell”: complexity. The most typical sign is conditional statements and loops in the test code. In most cases, such a test has problems.

For example, someone tests a function by wrapping the assertions in a bunch of if statements and explains that the expected result depends on the conditions. Again, how can you be sure the test itself is written correctly? Without stepping through it in a debugger, you cannot tell which of your conditional branches actually ran.

You may ask: what if I have a lot of different data to test and cannot use loops or conditionals? What you should really do is write several tests that cover the different scenarios.
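The following sketch contrasts the two styles. The `Shipping` class and its fee rule are hypothetical, made up for this illustration, and plain Java checks stand in for a testing framework so the code runs on its own. The first test carries its own loop and branches; the later tests are straight-line code with one fixed input and one fixed expectation each.

```java
// Hypothetical function under test, invented for this illustration.
class Shipping {
    // Orders of 100 or more ship for free; smaller orders pay a flat fee of 5.
    static int fee(int orderTotal) {
        return orderTotal >= 100 ? 0 : 5;
    }
}

class ShippingTests {
    // Bad smell: the test has its own branching logic, which could itself be wrong.
    static void should_calculate_fee_with_branching() {
        int[] totals = {50, 100, 150};
        for (int total : totals) {
            if (total >= 100) {
                check(Shipping.fee(total) == 0, "free shipping expected for " + total);
            } else {
                check(Shipping.fee(total) == 5, "flat fee expected for " + total);
            }
        }
    }

    // Better: one straight-line test per scenario.
    static void should_charge_flat_fee_below_threshold() {
        check(Shipping.fee(99) == 5, "flat fee expected below threshold");
    }

    static void should_ship_free_at_threshold() {
        check(Shipping.fee(100) == 0, "free shipping expected at threshold");
    }

    static void should_ship_free_above_threshold() {
        check(Shipping.fee(150) == 0, "free shipping expected above threshold");
    }

    // Minimal stand-in for a framework assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        should_charge_flat_fee_below_threshold();
        should_ship_free_at_threshold();
        should_ship_free_above_threshold();
        System.out.println("all tests passed");
    }
}
```

Note that the branching test repeats the `>= 100` rule from the production code, so a mistake in the rule can be copied into the test unnoticed. The straight-line tests state the expected values directly.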

A-TRIP: A Journey #

What constitutes a good test? One well-known summary is A-TRIP, from the book Pragmatic Unit Testing by Andy Hunt and Dave Thomas. It is an acronym for five words:

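Automatic: tests run and judge their own results without human involvement
Thorough: tests cover as many scenarios as possible
Repeatable: running the tests again yields the same results
Independent: tests do not depend on one another
Professional: test code is maintained to the same standard as any other code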

Now, let’s take a look at what each of these words represents.

Automatic: With the groundwork laid out earlier about automated testing, this is perhaps the easiest to understand—it means delegating testing to machines as much as possible, with minimal human involvement.

This is also why we emphasized the need for assertions in testing, since a test can only be automatically judged as successful when there are assertions in place.

Thorough: To be comprehensive, tests should cover various scenarios as much as possible. There are two aspects to understanding this. One is considering various scenarios before writing code, including normal, abnormal, and various boundary conditions. The other aspect is checking if the tests cover all the code and branches after writing the code, which is where various test coverage tools come into play.
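As a small illustration of thinking through normal, abnormal, and boundary cases up front, the sketch below lists the tests for a hypothetical `Percentage.parse` function (invented here, not part of the original article). Plain Java checks stand in for a testing framework, and `checkRejected` is a hand-written stand-in for a framework's exception assertion.

```java
// Hypothetical function under test, invented for this illustration.
class Percentage {
    // Parses a whole-number percentage in the range 0 to 100.
    static int parse(String text) {
        // Integer.parseInt throws NumberFormatException, a subclass of
        // IllegalArgumentException, when the text is not a number.
        int value = Integer.parseInt(text.trim());
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("out of range: " + value);
        }
        return value;
    }
}

class PercentageTests {
    // Normal case.
    static void should_parse_value_inside_range() {
        check(Percentage.parse("42") == 42, "42 should parse");
    }

    // Boundary cases: both ends of the valid range.
    static void should_accept_lower_bound() {
        check(Percentage.parse("0") == 0, "0 should parse");
    }

    static void should_accept_upper_bound() {
        check(Percentage.parse("100") == 100, "100 should parse");
    }

    // Abnormal cases: just outside the range, and not a number at all.
    static void should_reject_value_below_range() {
        checkRejected("-1");
    }

    static void should_reject_value_above_range() {
        checkRejected("101");
    }

    static void should_reject_non_numeric_text() {
        checkRejected("abc");
    }

    // Minimal stand-in for a framework assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Stand-in for a framework's exception assertion.
    static void checkRejected(String text) {
        try {
            Percentage.parse(text);
        } catch (IllegalArgumentException expected) {
            return;
        }
        throw new AssertionError("expected rejection of: " + text);
    }

    public static void main(String[] args) {
        should_parse_value_inside_range();
        should_accept_lower_bound();
        should_accept_upper_bound();
        should_reject_value_below_range();
        should_reject_value_above_range();
        should_reject_non_numeric_text();
        System.out.println("all tests passed");
    }
}
```

A coverage tool would then confirm the second aspect: whether these tests actually reach both the range check and the return statement.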

Of course, achieving thoroughness is not easy. If your team is catching up on testing, one way is to gradually increase test coverage.

Repeatable: There are two aspects to this. First, running a test repeatedly should yield the same results. This means that each test should not rely on any environment beyond our control. If it does, you should find a solution.

For example, if there are external dependencies, we can replace them with simulated services. Moco, for instance, was created to solve the problem of external dependencies: it can simulate external HTTP services, making tests controllable.

Some tests depend on databases. In such cases, the database should be restored to its previous state after the tests run. Frameworks such as Spring provide test support that can roll back the database changes made by a test. If your tests do not produce the same results when run repeatedly, there is a problem either with the code or with the tests themselves.
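The system clock is one more piece of environment a test cannot control. It is not discussed in the original text and is used here only as an illustration. A minimal sketch, assuming a hypothetical `TrialPeriod` class that receives a `java.time.Clock` instead of reading the current time itself, so the test can pin "today" to a fixed date:

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical class under test: it takes a Clock instead of reading the system time.
class TrialPeriod {
    private final Clock clock;
    private final LocalDate endDate;

    TrialPeriod(Clock clock, LocalDate endDate) {
        this.clock = clock;
        this.endDate = endDate;
    }

    // The trial is expired once today is strictly after the end date.
    boolean isExpired() {
        return LocalDate.now(clock).isAfter(endDate);
    }
}

class TrialPeriodTests {
    // A fixed clock makes "today" the same date (2019-02-04 in UTC) on every run.
    static final Clock FIXED_CLOCK =
            Clock.fixed(Instant.parse("2019-02-04T00:00:00Z"), ZoneOffset.UTC);

    static void should_not_be_expired_on_end_date() {
        TrialPeriod trial = new TrialPeriod(FIXED_CLOCK, LocalDate.of(2019, 2, 4));
        check(!trial.isExpired(), "trial should still be valid on its end date");
    }

    static void should_be_expired_after_end_date() {
        TrialPeriod trial = new TrialPeriod(FIXED_CLOCK, LocalDate.of(2019, 2, 3));
        check(trial.isExpired(), "trial should be expired the day after its end date");
    }

    // Minimal stand-in for a framework assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        should_not_be_expired_on_end_date();
        should_be_expired_after_end_date();
        System.out.println("all tests passed");
    }
}
```

Production code would pass `Clock.systemDefaultZone()` instead. Had `TrialPeriod` called `LocalDate.now()` directly, the same tests would start failing once the real date moved past the hard-coded end dates.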

Repeatability has a second aspect: running a batch of tests repeatedly should yield the same results. This requires that there be no dependency between tests, which brings us to the next characteristic.

Independent: There should be no dependency between tests. What does it mean to have a dependency? For example, if a test relies on an external database or a third-party service, and Test A writes some values to the database during its runtime, and Test B needs these values from the database, Test B must run after Test A. That’s what having a dependency means.

We cannot assume that tests are run in the order they are written. For example, sometimes, to speed up test execution, we might run tests in parallel. In such cases, the order is completely unpredictable. If tests have dependencies, various issues can arise.

Reduce external dependencies by using mocks. If dependencies are necessary, each test should take responsibility for its pre-test preparation and post-test cleanup. What if multiple tests share the same preparation and cleanup? That’s where setup and teardown come into play. The testing infrastructure has long prepared for this.
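A minimal sketch of shared preparation and cleanup follows. `UserStore` is a hypothetical in-memory stand-in for a database, and `setUp`, `tearDown`, and `run` are hand-written here to imitate what a testing framework's setup and teardown hooks do, so that the example runs without any framework.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shared resource standing in for a database.
class UserStore {
    static final Map<String, String> USERS = new HashMap<>();
}

class UserStoreTests {
    // Shared preparation, run before every test.
    static void setUp() {
        UserStore.USERS.clear();
        UserStore.USERS.put("alice", "alice@example.com");
    }

    // Shared cleanup, run after every test.
    static void tearDown() {
        UserStore.USERS.clear();
    }

    static void should_find_existing_user() {
        String email = UserStore.USERS.get("alice");
        check("alice@example.com".equals(email), "alice should be found");
    }

    static void should_remove_user() {
        UserStore.USERS.remove("alice");
        check(!UserStore.USERS.containsKey("alice"), "alice should be removed");
    }

    // Runs one test wrapped in setup and teardown, as a framework would.
    static void run(Runnable test) {
        setUp();
        try {
            test.run();
        } finally {
            tearDown();
        }
    }

    // Minimal stand-in for a framework assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // Either order works because each test starts from the same prepared state.
        run(UserStoreTests::should_remove_user);
        run(UserStoreTests::should_find_existing_user);
        System.out.println("all tests passed");
    }
}
```

Without `setUp`, `should_find_existing_user` would pass only if some earlier test happened to leave "alice" in the store, which is exactly the kind of hidden ordering dependency described above.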

Professional: This is a point many people overlook. Test code is still code, and it should be maintained to the same standards as any other code. That means your test code should be clearly written: good names, small functions, regular refactoring, and even test libraries abstracted out of it. The PageObject pattern commonly used in web testing is an extension of this idea.

After reading all this, you might think that what I’m saying makes sense, but your code is so complex, with so many possible test paths. How can you meet these requirements for your tests?

I must emphasize a point that was mentioned earlier in the context of Test-Driven Development: write testable code. Many people struggle with writing tests or find them difficult because they always approach it from the perspective of writing code, rather than writing tests. If you do not value testing and do not leave space for it, how can testing be done well?

Summary #

Testing seems simple but is actually difficult to do well. Many people run into all kinds of problems with testing in their daily work, and the main reason is that their tests are too complex. Once tests become complex, it is hard to ensure their own correctness, let alone use them to ensure the correctness of the code.

I have explained the basic structure of a test: preparation, execution, assertion, and cleanup. I also introduced some common “bad smells” in tests: tests that do too much, tests without assertions, and tests with conditionals or loops, which look wrong at first glance.

How do we measure whether a test is well done? There is a standard called A-TRIP, which is an abbreviation of five words: Automatic, Thorough, Repeatable, Independent, and Professional.

If there is only one thing you can remember from today’s content, please remember this: To write good tests, you need to write simple tests.

Finally, I would like to ask you: after these recent sessions on testing, how has your understanding of testing changed? Please share your thoughts in the comments section.

Thank you for reading. If you found this article helpful, feel free to share it with your friends.