09 Indispensable Custom Functions

Hello, I’m Jingxiao.

In my actual work and daily life, I have seen many Python programs written by beginners. In these programs, which can be hundreds of lines long, there is not a single function. All the code is piled up in sequence, which not only makes it time-consuming and difficult to read, but also prone to errors.

A well-structured Python program worth emulating should consist of multiple functions unless the code is very small (e.g., fewer than 10 or 20 lines). With multiple functions, the code becomes more modular and standardized.

Functions are an indispensable part of a Python program. In fact, in our previous learning, we have already used many of Python’s built-in functions, such as sorted(), which returns a sorted list built from an iterable, and len(), which returns the number of items in a collection. In this lesson, we will mainly learn about custom functions in Python.

Function Basics #

So, what exactly is a function and how do you define a function in a Python program?

Simply put, a function is a code block that is designed to do a specific task, and once you write it, you can reuse it multiple times. Let’s start with a simple example:

def my_func(message):
    print('Got a message: {}'.format(message))

# Call the function my_func()
my_func('Hello World')
# Output
Got a message: Hello World

In the example above:

  • def is the keyword that declares the function.
  • my_func is the name of the function.
  • The message inside the parentheses is the parameter of the function.
  • The line with print is the body of the function, which holds the statements that run when the function is called.
  • At the end of the function, you can return the result (using return or yield), or you can choose not to return anything.

To summarize, the general form of a function would look like this:

def name(param1, param2, ..., paramN):
    statements
    return/yield value # optional
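
The three ways a function can end (return, yield, or no explicit return) can be sketched as follows. The function names add, count_up_to, and log are made up for this illustration and do not appear elsewhere in the lesson:

```python
def add(a, b):
    # return hands a value back to the caller
    return a + b

def count_up_to(n):
    # yield turns the function into a generator that produces values one by one
    for i in range(1, n + 1):
        yield i

def log(message):
    # no return statement: the call evaluates to None
    print(message)

print(add(1, 2))             # 3
print(list(count_up_to(3)))  # [1, 2, 3]
print(log('hi'))             # prints hi, then None
```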

Unlike in compiled languages such as C, def in Python is an executable statement, which means that the function doesn’t exist until the def statement itself has run. When the interpreter executes def, it creates a new function object and binds it to the function’s name. The body is not executed at that point; it runs only when the function is called.
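
Because def is an ordinary statement, it can appear inside an if block, and the resulting function object can be bound to another name like any other value. The following is a minimal sketch; the USE_SHOUTING flag and the greet function are hypothetical names used only for illustration:

```python
USE_SHOUTING = True  # hypothetical flag, for illustration only

# Only the def statement in the branch that actually runs creates a function object.
if USE_SHOUTING:
    def greet(name):
        return 'HELLO, {}!'.format(name.upper())
else:
    def greet(name):
        return 'Hello, {}.'.format(name)

alias = greet            # a function is an object; a second name can refer to it

print(greet('world'))    # HELLO, WORLD!
print(alias is greet)    # True
```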

Let’s look at a few more examples to deepen your understanding of functions:

def my_sum(a, b):
    return a + b

result = my_sum(3, 5)
print(result)

In this example, we define a function called my_sum(), which takes two parameters a and b and adds them together. We then call my_sum() with the values 3 and 5, assign the result to the variable result, and finally print the result, which is 8.

Here’s another example:

def find_largest_element(l):
    if not isinstance(l, list):
        print('input is not type of list')
        return
    if len(l) == 0:
        print('empty input')
        return
    largest_element = l[0]
    for item in l:
        if item > largest_element:
            largest_element = item
    print('largest element is: {}'.format(largest_element))

find_largest_element([8, 1, -3, 2, 0])

In this example, we define a function called find_largest_element that iterates over the input list, finds the largest element, and prints it. So when we call this function and pass the list [8, 1, -3, 2, 0] as a parameter, the program will output largest element is: 8.

It’s important to note that when the main program calls a function, the function must have been defined previously, otherwise an error will occur. For example:

my_func('hello world')
def my_func(message):
    print('Got a message: {}'.format(message))
    
# Output
NameError: name 'my_func' is not defined

However, if one function calls another from inside its body, the order in which the two are defined doesn’t matter. def is an executable statement that only creates the function object; the names used in the body are looked up when the function is actually called, not when it is defined. We just need to make sure that every function involved has been defined by the time the call happens:

def my_func(message):
    my_sub_func(message) # Calling my_sub_func() before its declaration doesn't affect the program execution
    
def my_sub_func(message):
    print('Got a message: {}'.format(message))

my_func('hello world')

# Output
Got a message: hello world

In addition, Python function parameters can have default values, such as in the following format:

def func(param = 0):
    ...

In this case, when calling the function func(), if the parameter param is not provided, the parameter defaults to 0. However, if a value is passed to the parameter param, it will override the default value.
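
As a concrete sketch of this behavior, the hypothetical function power below gives its second parameter a default value of 2:

```python
def power(base, exponent=2):
    # exponent falls back to 2 when the caller does not supply it
    return base ** exponent

print(power(3))              # 9, the default exponent is used
print(power(3, 3))           # 27, the positional argument overrides the default
print(power(3, exponent=4))  # 81, the keyword argument overrides the default
```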

As mentioned before, a major characteristic of Python compared to other languages is that Python is dynamically typed and can accept any data type (integers, floats, strings, etc.). This also applies to function parameters. For example, referring back to the my_sum function, we can pass a list as a parameter to concatenate two lists:

print(my_sum([1, 2], [3, 4]))

# Output
[1, 2, 3, 4]

Similarly, we can pass strings as parameters to concatenate them:

print(my_sum('hello ', 'world'))

# Output
hello world

Of course, if the data types of the two parameters are different, such as one being a list and the other being a string, they cannot be added together and an error will be raised:

print(my_sum([1, 2], 'hello'))

# Output
TypeError: can only concatenate list (not "str") to list

As we can see, Python does not check the data types of the arguments when a function is called; whether an operation is valid is decided only when the code in the body actually runs. The same function (such as the addition function my_sum()) can therefore be applied to integers, lists, strings, and so on.

In programming languages, we refer to this behavior as polymorphism. In Python it comes from dynamic typing (often called duck typing) rather than from declared types, which is a significant difference from statically typed languages such as Java and C. However, this convenience can also cause problems in practical use, because a wrong argument type is only discovered at run time. Therefore, when necessary, you should add type checks at the beginning of the function.
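
One way such a type check might look is sketched below. The name my_sum_checked is a hypothetical variant of my_sum, and restricting it to numbers is an assumption made for this example, not a rule from the lesson:

```python
def my_sum_checked(a, b):
    # Reject anything that is not a number before doing the work.
    if not (isinstance(a, (int, float)) and isinstance(b, (int, float))):
        raise TypeError('a and b must both be numbers')
    return a + b

print(my_sum_checked(3, 5))  # 8

try:
    my_sum_checked([1, 2], 'hello')
except TypeError as err:
    print('rejected:', err)  # rejected: a and b must both be numbers
```

The check fails early with a message that names the problem, instead of failing somewhere deeper in the function body.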

Another major feature of Python functions is that they support function nesting. Function nesting refers to the situation where a function is defined inside another function, as shown below:

def f1():
    print('hello')
    def f2():
        print('world')
    f2()
f1()

# Output
hello
world

Here, the function f1() contains the definition of the function f2() inside it. When calling the function f1(), it will first print the string 'hello', and then f1() will call f2() internally to print the string 'world'. You might wonder why we need function nesting and what are its advantages.

Function nesting primarily serves two purposes:

First, function nesting ensures the privacy of internal functions. Internal functions can only be called and accessed by the outer function, and they are not exposed in the global scope. Therefore, if your function contains private data (such as database usernames and passwords) that you do not want to expose externally, you can use function nesting to encapsulate them in internal functions and only access them through the outer function. For example:

def connect_DB():
    def get_DB_configuration():
        ...
        return host, username, password
    conn = connector.connect(get_DB_configuration())
    return conn

The function get_DB_configuration here is the internal function, and it cannot be called separately outside the connect_DB() function. This means that the following direct external call is incorrect:

get_DB_configuration()

# Output
NameError: name 'get_DB_configuration' is not defined

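We can only reach it by calling the outer function connect_DB(). This keeps the helper out of the global namespace and makes the intended entry point clear, although it is a form of encapsulation rather than a hard security boundary.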

Second, the proper use of function nesting can improve program efficiency. Let’s consider the following example:

def factorial(input):
    # validation check
    if not isinstance(input, int):
        raise Exception('input must be an integer.')
    if input < 0:
        raise Exception('input must be greater or equal to 0' )
    ...

    def inner_factorial(input):
        if input <= 1:
            return 1
        return input * inner_factorial(input-1)
    return inner_factorial(input)


print(factorial(5))

In this example, we use recursion to calculate the factorial of a number. Because we need to validate the input before calculation, I wrote it in the form of function nesting so that the validity of the input is checked only once. If we do not use function nesting, then for each recursive call, it would require a separate validation check, which is unnecessary and would reduce program efficiency.

In practical work, if you encounter similar situations where input validation is slow or consumes significant resources, using function nesting in this way is well worth it.

Function Variable Scope #

The scope of variables in Python functions is similar to other languages. If a variable is defined inside a function, it is called a local variable and is only valid within the function. Once the function returns, the local variable goes out of scope and can no longer be accessed. For example:

def read_text_from_file(file_path):
    with open(file_path) as file:
        ...

In this example, we define the variable file inside the function read_text_from_file, which is only valid within the function and cannot be accessed outside of it.

On the other hand, global variables are defined at the file level and can be accessed anywhere within the file. For example:

MIN_VALUE = 1
MAX_VALUE = 10
def validation_check(value):
    if value < MIN_VALUE or value > MAX_VALUE:
        raise Exception('validation check fails')

In this case, MIN_VALUE and MAX_VALUE are global variables that can be accessed anywhere within the file, including inside functions. However, we cannot freely change the value of a global variable inside a function. For example, the following code is incorrect:

MIN_VALUE = 1
MAX_VALUE = 10
def validation_check(value):
    ...
    MIN_VALUE += 1
    ...
validation_check(5)

If you run this code, the program will raise an error:

UnboundLocalError: local variable 'MIN_VALUE' referenced before assignment

This is because an assignment to a name inside a function makes Python treat that name as a local variable throughout the function. The statement MIN_VALUE += 1 has to read the local MIN_VALUE before assigning to it, but no local value exists yet, so the operation fails. Therefore, if we want to change the value of a global variable inside a function, we must add the global declaration:

MIN_VALUE = 1
MAX_VALUE = 10
def validation_check(value):
    global MIN_VALUE
    ...
    MIN_VALUE += 1
    ...
validation_check(5)

The global keyword here does not mean that a new global variable named MIN_VALUE is created. It tells the Python interpreter that the variable MIN_VALUE inside the function is the same as the previously defined global variable, not a new global or local variable. This way, the program can access and modify the value of the global variable inside the function.

Also, in the case of nested functions, an inner function can read the variables defined in the outer function, but it cannot rebind them by assignment. To do that, the nonlocal keyword must be used:

def outer():
    x = "local"
    def inner():
        nonlocal x  # the nonlocal keyword indicates that 'x' is the variable defined in the outer function
        x = 'nonlocal'
        print("inner:", x)
    inner()
    print("outer:", x)
outer()
# Output:
# inner: nonlocal
# outer: nonlocal

If we don’t use the nonlocal keyword and a variable assigned in the inner function has the same name as a variable in the outer function, the assignment creates a separate local variable that shadows the outer one inside the inner function. The outer function’s variable is left unchanged:

def outer():
    x = "local"
    def inner():
        x = 'nonlocal'  # 'x' here is the local variable of the inner function
        print("inner:", x)
    inner()
    print("outer:", x)
outer()
# Output:
# inner: nonlocal
# outer: local

Closures #

For the third focus of this lesson, I would like to introduce closures. Closures are actually similar to nested functions, but the difference is that here the outer function returns another function instead of a specific value. The returned function is usually assigned to a variable and can be called later.

Let’s take an example to help you understand better. For instance, if we want to calculate the nth power of a number, we can write it using closures as follows:

def nth_power(exponent):
    def exponent_of(base):
        return base ** exponent
    return exponent_of  # The return value is the function exponent_of

square = nth_power(2)  # Calculate the square of a number
cube = nth_power(3)  # Calculate the cube of a number
square
# Output
<function __main__.nth_power.<locals>.exponent_of(base)>

cube
# Output
<function __main__.nth_power.<locals>.exponent_of(base)>

print(square(2))  # Calculate the square of 2
print(cube(2))  # Calculate the cube of 2
# Output
4  # 2^2
8  # 2^3

In this example, the outer function nth_power() returns the function exponent_of() instead of a specific value. It is important to note that, after executing square = nth_power(2) and cube = nth_power(3), the parameter exponent of the outer function nth_power() is still remembered by the inner function exponent_of(). This allows us to successfully output results when we call square(2) or cube(2) without causing an error due to the parameter exponent not being defined.
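
To see where the remembered value lives, we can inspect the returned function. The sketch below repeats the nth_power definition so that it runs on its own, then looks at the standard __code__.co_freevars and __closure__ attributes of the function object:

```python
def nth_power(exponent):
    def exponent_of(base):
        return base ** exponent
    return exponent_of

square = nth_power(2)
cube = nth_power(3)

# Names that the inner function captures from the enclosing scope
print(square.__code__.co_freevars)          # ('exponent',)

# Each closure keeps its own cell holding the captured value
print(square.__closure__[0].cell_contents)  # 2
print(cube.__closure__[0].cell_contents)    # 3
```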

Seeing this, you may wonder, why use closures? In the above program, I could write it in the following form as well!

def nth_power_rewrite(base, exponent):
    return base ** exponent

Certainly, that is possible. However, there is a reason to use closures: they can make the program more concise and readable. Just imagine, if you need to calculate the squares of many numbers, which form do you think is better?

# Without using closures
res1 = nth_power_rewrite(base1, 2)
res2 = nth_power_rewrite(base2, 2)
res3 = nth_power_rewrite(base3, 2)
...

# Using closures
square = nth_power(2)
res1 = square(base1)
res2 = square(base2)
res3 = square(base3)
...

Clearly, the second form is better, right? Firstly, intuitively speaking, the second form allows you to input one less parameter every time you call the function, making it more concise.

Secondly, similar to the advantages mentioned earlier about nested functions, when the function needs to perform some additional work at the beginning, and you need to call that function multiple times, you can put the code for that additional work in the outer function. This can reduce unnecessary overhead caused by multiple calls and improve the program’s runtime efficiency.
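
As a sketch of this second point, the hypothetical make_matcher function below does its setup work (compiling a regular expression) once in the outer function, and the returned inner function reuses the result on every call. The function names and the five-digit pattern are assumptions chosen only for illustration:

```python
import re

def make_matcher(pattern):
    compiled = re.compile(pattern)  # setup work, done once per matcher

    def matches(text):
        # Reuses the compiled pattern captured from the outer function
        return compiled.fullmatch(text) is not None

    return matches

is_five_digits = make_matcher(r'\d{5}')
print(is_five_digits('12345'))  # True
print(is_five_digits('1234a'))  # False
```

The saving grows with the cost of the setup step; for very cheap setup the benefit is mainly readability.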

Additionally, as we will discuss later, closures are often used together with decorators.

Summary #

In this lesson, we learned about the concept and applications of Python functions. Here are a few points to note:

  1. Python functions can accept any data type as parameters. When using them, it is important to be cautious and add data type checks if necessary.

  2. Python function parameters can have default values, which are used when the caller omits the corresponding argument.

  3. The use of nested functions can ensure data privacy and improve program efficiency.

  4. Using closures effectively can reduce the complexity of a program and improve readability.

Thought-provoking Question #

Finally, here is a thought-provoking question for you. In your actual learning and work experience, have you encountered any examples using nested functions or closures? Please feel free to leave a comment below and discuss with me. Also, feel free to share this article with your colleagues and friends.