03 List and Tuple, Which One to Use

Hello, I’m Jingxiao.

In the previous lessons, we discussed how to approach learning Python and introduced Jupyter, an essential tool for Python. Starting from this lesson, we will dive into the language itself.

For every programming language, data structures are the foundation. Understanding and mastering the basic data structures in Python is crucial for learning the language well. Today, we will learn about the two most common data structures in Python: lists and tuples.

Lists and Tuples Basics #

First of all, let’s clarify the basic concepts: what are lists and tuples exactly?

In fact, both lists and tuples are ordered collections that can hold any data type.

In many programming languages, especially statically typed ones, the elements of a collection must all have the same data type. In Python, however, this requirement does not apply to lists and tuples:

l = [1, 2, 'hello', 'world'] # a list contains elements of both int and string
l
[1, 2, 'hello', 'world']

tup = ('jason', 22) # a tuple contains elements of both int and string
tup
('jason', 22)

Secondly, we need to understand their differences.

  • Lists are mutable, which means their length can change and we can add, remove or modify elements freely.

  • Tuples are immutable, which means their length is fixed and we cannot add, remove or modify elements.

In the following examples, we create a list and a tuple. For the list, we can easily change the last element from 4 to 40. However, if we try the same operation on the tuple, Python raises a TypeError because tuples are immutable.

l = [1, 2, 3, 4]
l[3] = 40 # indexing in Python starts from 0, l[3] represents the fourth element in the list
l
[1, 2, 3, 40]

tup = (1, 2, 3, 4)
tup[3] = 40
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

But what if you want to “change” an existing tuple? You can only create a new tuple by reallocating memory.

For example, if we want to add an element 5 to the tuple, we essentially create a new tuple and fill it with the values from the original tuple.

For lists, on the other hand, we can simply append the element to the end. The operation modifies the original list in place instead of creating a new one.

tup = (1, 2, 3, 4)
new_tup = tup + (5, ) # create a new tuple new_tup and fill it with the values from the original tuple
new_tup
(1, 2, 3, 4, 5)

l = [1, 2, 3, 4]
l.append(5) # append element 5 to the end of the original list
l
[1, 2, 3, 4, 5]
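One way to see this difference for yourself is to compare object identities. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
# Concatenating onto a tuple builds a new object; the original is untouched.
tup = (1, 2, 3, 4)
new_tup = tup + (5,)
tuple_is_new_object = new_tup is not tup
original_unchanged = tup == (1, 2, 3, 4)

# Appending to a list modifies the very same object in place.
l = [1, 2, 3, 4]
id_before = id(l)
l.append(5)
list_is_same_object = id(l) == id_before

print(tuple_is_new_object, original_unchanged, list_is_same_object)
```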

With the examples above, I believe you understand the basic concepts of lists and tuples. Next, let’s look at some basic operations and considerations for lists and tuples.

Firstly, unlike some other programming languages, both lists and tuples in Python support negative indexing. -1 represents the last element, -2 represents the second last element, and so on.

l = [1, 2, 3, 4]
l[-1]
4

tup = (1, 2, 3, 4)
tup[-1]
4

Apart from basic initialization and indexing, both lists and tuples support slicing:

l = [1, 2, 3, 4]
l[1:3] # return a sublist of the list with indices from 1 to 2
[2, 3]

tup = (1, 2, 3, 4)
tup[1:3] # return a subtuple of the tuple with indices from 1 to 2
(2, 3)
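Negative indices and slices can also be combined. The short sketch below shows a few common variations; it is meant only as an illustration of the indexing rules described above.

```python
l = [1, 2, 3, 4]
tup = (1, 2, 3, 4)

# A negative index -k refers to the same element as index len(seq) - k.
same_element = l[-2] == l[len(l) - 2]

# Omitting a bound means "from the start" or "to the end".
head = l[:2]          # first two elements of the list
tail = tup[-2:]       # last two elements of the tuple
every_other = l[::2]  # a third value sets the step

# Slicing preserves the type: a list slice is a list, a tuple slice is a tuple.
print(type(head).__name__, type(tail).__name__)

# Slicing a list always produces a new list, even when it covers every element.
shallow_copy = l[:]
print(shallow_copy == l, shallow_copy is l)
```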

Furthermore, both lists and tuples can be nested freely:

l = [[1, 2, 3], [4, 5]] # each element in the list is also a list

tup = ((1, 2, 3), (4, 5, 6)) # each element in the tuple is also a tuple

Of course, you can convert lists and tuples to each other using the list() and tuple() functions:

list((1, 2, 3))
[1, 2, 3]

tuple([1, 2, 3])
(1, 2, 3)

Finally, let’s look at some commonly used methods and built-in functions for lists and tuples:

l = [3, 2, 3, 7, 8, 1]
l.count(3)
2
l.index(7)
3
l.reverse()
l
[1, 8, 7, 3, 2, 3]
l.sort()
l
[1, 2, 3, 3, 7, 8]

tup = (3, 2, 3, 7, 8, 1)
tup.count(3)
2
tup.index(7)
3
list(reversed(tup))
[1, 8, 7, 3, 2, 3]
sorted(tup)
[1, 2, 3, 3, 7, 8]

Let me briefly explain the meanings of these functions:

  • count(item) returns the number of occurrences of item in the list/tuple.

  • index(item) returns the index of the first occurrence of item in the list/tuple.

  • list.reverse() and list.sort() reverse and sort the list in place (note that tuples do not have these methods).

  • reversed() and sorted() also reverse and sort a list or tuple, but they leave the original untouched: reversed() returns a reverse iterator (as shown in the previous example, we can convert it to a list with the list() function), and sorted() returns a new sorted list.
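The difference between the in-place methods and the built-in functions is easy to overlook, so here is a small sketch of it. The variable names are only illustrative.

```python
tup = (3, 2, 3, 7, 8, 1)

# sorted() always returns a new list, even when its input is a tuple.
sorted_result = sorted(tup)

# reversed() returns an iterator; wrap it in tuple() or list() to materialize it.
reversed_tuple = tuple(reversed(tup))

# The in-place list methods modify the list and return None.
l = [3, 2, 3, 7, 8, 1]
return_value = l.sort()

print(sorted_result, reversed_tuple, return_value, l)
```

A common mistake that follows from this is writing l = l.sort(), which replaces the list with None.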

Differences in Storage Methods Between Lists and Tuples #

As mentioned before, the most important difference between lists and tuples is that lists are dynamic and mutable, while tuples are static and immutable. This difference inevitably affects the storage methods of the two. Let’s take a look at the example below:

l = [1, 2, 3]
l.__sizeof__() # 64
tup = (1, 2, 3)
tup.__sizeof__() # 48

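As you can see, we have placed the same elements in a list and a tuple, but the tuple occupies 16 bytes less space than the list. (These figures come from a 64-bit CPython build and can vary between Python versions.) Why is that?

In fact, both lists and tuples store 8-byte pointers to their elements. The difference is that a tuple keeps those pointers inline in the tuple object itself, whose size never changes, while a list, being dynamic, keeps them in a separately allocated array that can be resized. The list object therefore has to store a pointer to that array (8 bytes). In addition, because lists are mutable, they need to store the allocated length (8 bytes) in order to track how much of the list's space is in use and to allocate additional space when necessary. Together these account for the 16 extra bytes.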
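If you want to repeat the comparison on your own machine, a sketch along the following lines should work. It assumes CPython; the exact byte counts depend on the interpreter version and platform, so none are hard-coded here.

```python
import sys

elements = (1, 2, 3)
tup = tuple(elements)
l = list(elements)

# __sizeof__() reports the size of the container object itself,
# not the size of the elements it refers to.
tuple_size = tup.__sizeof__()
list_size = l.__sizeof__()
print(f"tuple: {tuple_size} bytes, list: {list_size} bytes")

# sys.getsizeof() additionally includes the garbage collector's
# per-object overhead, so it reports a value at least as large.
print(f"getsizeof(tuple): {sys.getsizeof(tup)}, getsizeof(list): {sys.getsizeof(l)}")
```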

l = []
l.__sizeof__() # The storage space for an empty list is 40 bytes
40
l.append(1)
l.__sizeof__()
72 # After adding element 1, the list allocates space for storing 4 elements ((72 - 40)/8 = 4)
l.append(2)
l.__sizeof__()
72 # Since space was allocated previously, adding element 2 does not change the list's space
l.append(3)
l.__sizeof__()
72 # Same as above
l.append(4)
l.__sizeof__()
72 # Same as above
l.append(5)
l.__sizeof__()
104 # After adding element 5, the list's space is insufficient, so additional space for storing 4 elements is allocated

The example above roughly describes the process of list space allocation. To reduce the overhead of reallocating memory on every append, Python over-allocates: each time the list has to grow, it reserves some extra space. This mechanism keeps adding and removing elements at the end of a list efficient, with an amortized time complexity of O(1). Inserting or deleting at an arbitrary position is still O(n), because the elements after that position have to be shifted.
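The growth pattern can also be observed with a short loop instead of calling __sizeof__() by hand. This is a sketch that assumes CPython; the exact sizes and growth steps are an implementation detail and may differ between versions.

```python
l = []
previous_size = l.__sizeof__()
growth_points = []  # (length after the append, new size in bytes)

for i in range(32):
    l.append(i)
    size = l.__sizeof__()
    if size != previous_size:
        # The list had to be resized to make room for this element.
        growth_points.append((len(l), size))
        previous_size = size

for length, size in growth_points:
    print(f"length {length:2d} -> {size} bytes")
```

The list is resized only a handful of times across the 32 appends, which is why the average cost of an append stays constant.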

However, things are different for tuples. Since the length of tuples is fixed and the elements are immutable, the storage space is also fixed.

After reading the analysis above, you may think that such differences can be ignored. But imagine if the number of elements stored in a list or tuple is in the range of hundreds of millions, billions, or even larger. Can you still ignore such differences then?

Performance of Lists and Tuples #

By studying the differences in storage between lists and tuples, we can conclude that tuples are somewhat more lightweight than lists, so overall they tend to be slightly faster.

In addition, Python caches some resources for static data behind the scenes. Generally, thanks to garbage collection, when objects are no longer in use, Python reclaims the memory they occupy so that it can be reused by other objects or returned to the operating system for other applications.

However, for some static objects such as tuples, if they are no longer in use and occupy little space, Python temporarily caches that memory instead of releasing it. The next time we create a tuple of the same size, Python can reuse the cached memory without asking the operating system for more, which can noticeably speed up the program.

The following example measures the time required to initialize a list and a tuple with the same elements. In this measurement, tuple initialization is about 5 times faster than list initialization.

python3 -m timeit 'x=(1,2,3,4,5,6)'
20000000 loops, best of 5: 9.97 nsec per loop
python3 -m timeit 'x=[1,2,3,4,5,6]'
5000000 loops, best of 5: 50.1 nsec per loop
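One caveat when reading this benchmark: in CPython, a tuple literal made up entirely of constants is built once at compile time and stored as a constant, so the timed statement only loads an existing object, whereas the list has to be built anew on every run. The sketch below checks this by inspecting the compiled code object; it relies on a CPython implementation detail rather than a language guarantee.

```python
# Compile the same statement that was timed above.
code = compile("x = (1, 2, 3, 4, 5, 6)", "<example>", "exec")

# If the tuple was built at compile time, it appears among the code
# object's constants and is simply loaded when the statement runs.
tuple_is_constant = (1, 2, 3, 4, 5, 6) in code.co_consts
print(tuple_is_constant)
```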

However, if it is an indexing operation, the difference in speed between the two is very small and can be virtually ignored.

python3 -m timeit -s 'x=[1,2,3,4,5,6]' 'y=x[3]'
10000000 loops, best of 5: 22.2 nsec per loop
python3 -m timeit -s 'x=(1,2,3,4,5,6)' 'y=x[3]'
10000000 loops, best of 5: 21.9 nsec per loop
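The same measurements can be run from inside Python with the timeit module, which is convenient in a Jupyter notebook. This is a minimal sketch; best_time is a helper defined here for illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit

def best_time(stmt, setup="pass", number=200_000, repeat=3):
    """Return the best per-loop time in nanoseconds over several repeats."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e9

results = {
    "tuple init": best_time("x = (1, 2, 3, 4, 5, 6)"),
    "list init": best_time("x = [1, 2, 3, 4, 5, 6]"),
    "tuple index": best_time("y = x[3]", setup="x = (1, 2, 3, 4, 5, 6)"),
    "list index": best_time("y = x[3]", setup="x = [1, 2, 3, 4, 5, 6]"),
}

for name, nanoseconds in results.items():
    print(f"{name:12s} {nanoseconds:6.1f} nsec per loop")
```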

Of course, if you want to add, delete, or change elements, lists are obviously more advantageous. As you probably know now, that’s because for tuples, you have to create a new tuple to make any changes.

Usage Scenarios of Lists and Tuples #

So, which one should we use, lists or tuples? Based on the characteristics mentioned above, we need to analyze specific situations.

1. If the stored data and quantity remain constant, for example, if you have a function that needs to return the longitude and latitude of a location and pass it directly to the front-end for rendering, then a tuple is definitely more appropriate.

def get_location():
    # .....
    return (longitude, latitude)
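A runnable version of this pattern might look like the sketch below. The coordinate values are placeholders invented for illustration; a real implementation would look them up.

```python
def get_location():
    # Hypothetical fixed coordinates, used only for illustration.
    longitude, latitude = 116.40, 39.90
    return (longitude, latitude)

# The caller can unpack the tuple directly into two names.
longitude, latitude = get_location()
print(longitude, latitude)
```

Because the result always has exactly two values and the caller has no reason to modify it, a tuple expresses that intent more clearly than a list.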

2. If the stored data or quantity is variable, for example, a logging feature on a social platform that tracks which users’ posts a user has viewed within a week, then a list is more suitable.

viewer_owner_id_list = [] # Each element in the list records the ID of the owner whose posts the viewer has viewed within a week
records = queryDB(viewer_id) # Query the database to get the logs of a specific viewer within a week
for record in records:
    viewer_owner_id_list.append(record.id)
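To try this out without a database, the query can be replaced with a stub. In the sketch below, Record and query_db are hypothetical stand-ins with made-up data, not part of any real library.

```python
from collections import namedtuple

# Hypothetical record type: each log entry carries the ID of a post owner.
Record = namedtuple("Record", ["id"])

def query_db(viewer_id):
    """Hypothetical stand-in for a real database query."""
    return [Record(id=101), Record(id=202), Record(id=303)]

# The number of records is not known in advance, so a list that
# grows as records arrive is the natural choice.
viewer_owner_id_list = []
for record in query_db(viewer_id=1):
    viewer_owner_id_list.append(record.id)

print(viewer_owner_id_list)
```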

Summary #

In summary, we have discussed a lot about lists and tuples today. Let’s summarize the key points that you must understand.

In general, both lists and tuples are ordered collections that can store any data type, but they differ in the following ways:

  • Lists are dynamic: their length is not fixed, and elements can be added, removed, or modified freely. Lists take slightly more storage space than tuples and are slightly slower.

  • Tuples are static: their length is fixed, and elements cannot be added, removed, or modified. Tuples are more lightweight than lists and perform slightly better.

Thought Questions #

1. To create an empty list, we can use options A and B as shown below. Is there any difference in efficiency between them? Which one should we prefer to use? Can you explain your reasons?

# Creating an empty list
# option A
empty_list = list()

# option B
empty_list = []

2. In your daily study or work, in what scenarios do you use lists or tuples? Feel free to leave a comment and share with me.