30 Is It Really Necessary to Write Unit Tests

30 Is it Really Necessary to Write Unit Tests #

Hello, I’m Jingxiao.

When it comes to unit tests, most people probably have two reactions: either they think it’s quite simple and it doesn’t matter whether they do it or not, or they think the project is too rushed and they can do the unit tests later.

Obviously, both types of people fail to recognize the value of unit tests and haven’t mastered the correct methods for unit testing. Do you think that as long as you understand the various features of Python and can write programs that meet the functional requirements, that’s enough?

Actually, that’s not the case. Completing the functional requirements of a product is just a basic part. The key to our work is how to ensure that the code we write is stable, efficient, and error-free. Learning to use unit tests properly is an important way to help you achieve this goal.

We often talk about Test-Driven Development (TDD). Today, I will use Python as an example to teach you how to design and write unit test code in Python, guiding you to become familiar with and master this important skill.

What is Unit Testing? #

Unit testing, in simple terms, is the process of writing tests to verify the correctness of a certain module’s functionality. Typically, specific inputs are specified, and the outputs are validated to see if they match the expected results.

In actual production environments, we test all possible input values for each module. Although this may appear tedious and add extra workload, it greatly improves the code quality, reduces the likelihood of bugs occurring, and also makes system maintenance more convenient.

When it comes to unit testing, we cannot ignore the Python unittest library, which provides most of the tools we need. Let’s take a look at the following simple test to understand how to use it:

import unittest

# The sorting function to be tested
def sort(arr):
    l = len(arr)
    for i in range(0, l):
        for j in range(i + 1, l):
            if arr[i] >= arr[j]:
                tmp = arr[i]
                arr[i] = arr[j]
                arr[j] = tmp


# Write a subclass that inherits from unittest.TestCase
class TestSort(unittest.TestCase):

    # Functions starting with 'test' will be tested
    def test_sort(self):
        arr = [3, 4, 1, 5, 6]
        sort(arr)
        # Assert the result is as expected
        self.assertEqual(arr, [1, 3, 4, 5, 6])

if __name__ == '__main__':
    ## If in Jupyter, run unit tests using the following code
    unittest.main(argv=['first-arg-is-ignored'], exit=False)
    
    ## If running from the command line:
    ## unittest.main()

## Output:
..
----------------------------------------------------------------------
Ran 2 tests in 0.002s

OK

Here, we created a unit test for a sorting function to validate its functionality. The code includes detailed comments, and I believe you can understand it to some extent. Let me explain some additional details.

First, we need to create a class called TestSort that inherits from the class unittest.TestCase. Then, within this class, we define the corresponding test function test_sort() to perform the test. Note that test functions should start with test, and within these functions, we usually use assert statements such as assertEqual(), assertTrue(), assertFalse(), and assertRaises() to validate the results.

Finally, when running the code, if you are in an IPython or Jupyter environment, please use the following line of code:

unittest.main(argv=['first-arg-is-ignored'], exit=False)

If you are using the command line, simply use unittest.main(). You can see that the output is OK, which indicates that our test has passed.

Of course, the function being tested in this example is relatively simple, so writing the corresponding unit test is very natural and does not require many unit testing techniques. However, functions in real-world scenarios are often more complex. When encountering complex problems, the biggest difference between an expert and a novice lies in the use of unit testing techniques.

Several Tips for Unit Testing #

Next, I will introduce several tips for unit testing in Python, namely mock, side_effect, and patch. These three have different usage, but they share a core idea: replace some dependencies of the function being tested with fake implementations so that we can focus more on the functionality that needs to be tested.

Mock #

Mock is the most important part in unit testing. Mock means using a fake object to replace the objects required by the function or module being tested.

For example, if you want to test the functionality of a back-end API, you may need to mock some objects such as the database, file system, or network, as back-end APIs usually rely on them. This allows you to easily test the core back-end logic unit.

In Python, the mock or MagicMock objects are mainly used for mocking. Here is an example:

import unittest
from unittest.mock import MagicMock

class A(unittest.TestCase):
    def m1(self):
        val = self.m2()
        self.m3(val)

    def m2(self):
        pass

    def m3(self, val):
        pass

    def test_m1(self):
        a = A()
        a.m2 = MagicMock(return_value="custom_val")
        a.m3 = MagicMock()
        a.m1()
        self.assertTrue(a.m2.called) # Verify if m2 was called
        a.m3.assert_called_with("custom_val") # Verify if m3 was called with the specified arguments

if __name__ == '__main__':
    unittest.main()

In this code, we define three methods m1(), m2(), and m3() in a class. We need to test m1(), but it depends on m2() and m3(). If m2() and m3() are complex internally, you cannot simply call the m1() function to test it. You may need to solve many dependency issues.

This might seem overwhelming, right? However, with mock, it becomes much simpler. We can replace m2() with a value that returns a specific number and replace m3() with another mock (an empty function). With this, testing m1() becomes easy. We can test that m1() calls m2() and that m2()’s return value is used to call m3().

You might wonder if testing m1() in this way is almost meaningless. It seems like we are only symbolically testing the logic, right?

Actually, it’s not true. In real-world code, there are usually multiple layers of modules that call each other in a tree-like structure. When conducting unit testing, it is important to test the logical functionality of a certain node by mocking the relevant dependencies. This is why it’s called unit testing, rather than other types such as integration testing or end-to-end testing.

Mock Side Effect #

The second concept we will discuss is Mock Side Effect. This concept is easy to understand: the functions or attributes of the mock can return different values based on different inputs, instead of just a single return_value.

For example, consider the following example. It’s very simple; it tests whether the input parameter is negative. If the input is less than 0, it returns 1; otherwise, it returns 2. The code is short, and you should be able to understand it. This is how Mock Side Effect is used:

from unittest.mock import MagicMock

def side_effect(arg):
    if arg < 0:
        return 1
    else:
        return 2

mock = MagicMock()
mock.side_effect = side_effect

mock(-1)  # Output: 1

mock(1)   # Output: 2

Patch #

As for patch, it provides developers with a very convenient method for mocking. It can apply Python’s decoration mode or context manager concept to quickly and naturally mock the required functions. Its usage is not difficult. Let’s take a look at the code:

from unittest.mock import patch

@patch('sort')
def test_sort(self, mock_sort):
    ...
    ...

In this test, mock_sort replaces the existence of the sort function itself, so we can set return_value and side_effect just like with mock objects.

Another common usage of patch is to mock a member function of a class. We often use this technique in our work, for example, when a class’s constructor is very complex, and testing one of its member functions does not depend on all the initialized objects. Its usage is as follows:

with patch.object(A, '__init__', lambda x: None):
      ...

The code should be relatively easy to understand. In the with statement, we use patch to mock the constructor of class A as a function that does nothing, allowing us to easily avoid complex initialization.

In fact, considering these points we discussed earlier, you should realize that the core of unit testing is still mocking—mocking the dependencies to test the accuracy of the corresponding logic or algorithm. In my opinion, although the Python unittest library has many other methods, as long as you can master MagicMock and patch, writing unit tests for the majority of work scenarios should not be a problem.

Key Elements of High-Quality Unit Testing #

In this final lesson, I would like to discuss high-quality unit testing. I understand that unit testing is something that even those who are currently using it “hate”. Many people often do it half-heartedly. I also find it troublesome, but I never dare to slack off because in large companies, if you write an important module/function, it cannot pass code review without unit tests.

Low-quality unit testing can really be just a decoration, incapable of verifying the correctness of our code and wasting time. So, since we need to do unit testing, instead of wasting time fooling ourselves, we should strive for high-quality unit testing to effectively improve the code quality.

How do we achieve this? Based on work experience, I believe that a high-quality unit test should pay special attention to the following two points.

Test Coverage #

First of all, we need to focus on Test Coverage, which measures the percentage of statements covered in the code. It can be said that improving the Test Coverage of a code module is basically equivalent to improving the correctness of the code.

Why is that?

You need to know that most code modules in company code repositories are very complex. Although they follow the concept of modular design, there are still complex business logics involved, which make the modules more and more complex. Therefore, writing high-quality unit tests requires us to cover every statement in the module and improve the Test Coverage.

We can use the coverage tool in Python to measure Test Coverage and display the uncovered statements for each module. If you want to learn more about its detailed usage, you can click the following link: https://coverage.readthedocs.io/en/v4.5.x/.

Modularity #

High-quality unit testing not only requires us to improve Test Coverage and try to cover every statement in the code module, but also requires us to look at the codebase from a testing perspective and consider how to modularize the code in order to write high-quality unit tests.

Just talking about this may be a bit abstract, so let’s consider the following scenario. For example, I wrote the following function to process an array and return a new array:

def work(arr):
    # pre-process
    ...
    ...
    # sort
    l = len(arr)
    for i in range(0, l):
        for j in range(i + 1, j):
            if arr[i] >= arr[j]:
                tmp = arr[i]
                arr[i] = arr[j]
                arr[j] = tmp
    # post-process
    ...
    ...
    return arr

The general idea of this code is to first perform preprocessing, then sorting, and finally post-processing before returning the result. If you are asked to write unit tests for this function, would you feel at a loss?

After all, this function is a bit complex to the extent that you don’t even know what the input should be or what the expected output should be. It’s very painful to write unit tests for such code, not to mention the requirement to cover every statement.

Therefore, the correct testing approach should be to modularize the code and write it in the following form:

def preprocess(arr):
    ...
    ...
    return arr

def sort(arr):
    ...
    ...
    return arr

def postprocess(arr):
    ...
    return arr

def work(self):
    arr = preprocess(arr)
    arr = sort(arr)
    arr = postprocess(arr)
    return arr

Then, carry out the corresponding tests, testing the functionality of the three sub-functions. Afterwards, use the mock function to call the work() function to verify that the three sub-functions have been called.

from unittest.mock import patch

def test_preprocess(self):
    ...

def test_sort(self):
    ...

def test_postprocess(self):
    ...

@patch('%s.preprocess')
@patch('%s.sort')
@patch('%s.postprocess')
def test_work(self,mock_post_process, mock_sort, mock_preprocess):
    work()
    self.assertTrue(mock_post_process.called)
    self.assertTrue(mock_sort.called)
    self.assertTrue(mock_preprocess.called)

As you can see, by refactoring the code, we can make the unit tests more comprehensive and precise, and it also makes the overall architecture and function design much more elegant.

Summary #

Looking back at this class, overall, the concept of unit testing is to first modularize code design, and then write separate tests for each functional unit to validate their accuracy. Better modular design and more test coverage are the core aspects of improving code quality. The essence of unit testing is to use mocks to eliminate dependencies that do not affect the tests and focus on the core logic of the code that needs to be tested.

After discussing so much, I still want to tell you that unit testing is a very, very important skill. It is an essential part of ensuring code quality and accuracy in actual work. Furthermore, the design skills of unit testing are not only applicable to Python but also to any language. Therefore, unit testing is indispensable.

Reflection Questions #

So, have you ever written unit tests in your day-to-day studies or work? When writing unit tests, what techniques have you used or what problems have you encountered? Feel free to leave a message and discuss with me. You are also welcome to share this article.