08 Containers in the Package Container

08 Containers in the package container #

In our previous discussion, we talked about arrays and slices, and when we mention arrays, we often think of linked lists. So what does a linked list look like in Go?

The implementation of a linked list in Go can be found in the container/list package in the standard library. This package contains two public entities - List and Element. The List struct implements a doubly linked list, while the Element struct represents an element in the list.

So, my question today is: Can we pass our own created Element values to the linked list?

In this case, we will be using four methods of List.

The MoveBefore method and MoveAfter method are used to move the given element before or after another element, respectively.

The MoveToFront method and MoveToBack method are used to move the given element to the front or the back of the list, respectively.

In these methods, the “given element” is of type *Element, where *Element is a pointer to the Element struct and its value is the pointer to the element.

func (l *List) MoveBefore(e, mark *Element)
func (l *List) MoveAfter(e, mark *Element)

func (l *List) MoveToFront(e *Element)
func (l *List) MoveToBack(e *Element)

The specific question is, what happens if we pass our own created value of Element to these methods of the linked list? Will the linked list accept it?

Here, we provide a typical answer: No, it will not accept it. These methods will not make any changes to the linked list because the Element value we created ourselves is not in the list. Therefore, there is no concept of “moving an element in the list” for our own created Element value. Moreover, the linked list does not allow us to insert our own created Element value into it.

Problem Analysis #

Among the methods in the List, the methods used to insert new elements only accept values of type interface{}. These methods internally use the Element value to wrap the received new elements.

This is done precisely to avoid directly using the elements we generate ourselves, the primary reason being to avoid the internal association of the linked list being destroyed by the outside world, which is beneficial for both the linked list itself and us as users.

The methods of List are as follows:

The Front and Back methods are used to get the elements at the front and back of the linked list, respectively. The InsertBefore and InsertAfter methods are used to insert new elements before or after the specified element, respectively. The PushFront and PushBack methods are used to insert new elements at the front and back of the linked list, respectively.

    func (l *List) Front() *Element
    func (l *List) Back() *Element

    func (l *List) InsertBefore(v interface{}, mark *Element) *Element
    func (l *List) InsertAfter(v interface{}, mark *Element) *Element

    func (l *List) PushFront(v interface{}) *Element
    func (l *List) PushBack(v interface{}) *Element

These methods all return a pointer to an Element value as a result. They are the safe “interface” left to us by the linked list. With the pointers to these internal elements, we can call the methods mentioned earlier for moving elements.

Knowledge Expansion

1. Question: Why can a linked list be used out of the box?

Both List and Element are struct types. Struct types have a characteristic, which is that their zero value will be a value with a specific structure but no customized content, equivalent to an empty shell. The fields in the value will also be assigned the zero value of their respective types.

Broadly speaking, the so-called zero value is the default value given to a variable that has only been declared but not initialized. The zero value of each type is set according to the characteristics of that type.

For example, the value of the variable a declared by the statement var a [2]int will be an integer array containing two 0s. Another example, the value of the variable s declared by the statement var s []int will be a slice of type []int with a value of nil.

So what will the value of a variable l declared by the statement var l list.List be? [1] This zero value will be a linked list with a length of 0. The root element held by this linked list will also be an empty shell, containing only default content. Can we use such a linked list directly?

The answer is yes. This is called “use out of the box”. Many program entities of struct types in the Go standard library achieve this. This is also one of the best practices we recommend when writing code packages (or program libraries) for others to use. So how does the declared linked list l with the statement var l list.List achieve this?

The key lies in its “lazy initialization” mechanism.

The so-called lazy initialization can be understood as postponing the initialization operation until it is actually needed. The advantage of lazy initialization lies in “postponement”; it can distribute the computational and storage space consumption caused by initialization to the time when they are actually used.

For example, if we need to declare a large number of high-capacity slices all at once, the CPU and memory usage at that time will certainly increase significantly, and the memory usage will only decrease if we manage to recycle the slices and their underlying arrays.

If arrays can be lazily initialized, then the pressure of computational and storage space can be distributed to the time when they are actually used. The more time is dispersed when these arrays are actually used, the more obvious the advantages brought by lazy initialization will be.

In fact, Go’s slices play the role of lazily initializing their underlying arrays. You can think about the reason why this is the case.

The disadvantage of lazy initialization lies precisely in “postponement”. You can imagine that if I need to check whether the linked list has been initialized before calling each method of the linked list, this will also be a waste of computational resources. In the case where these methods are called very frequently, the impact of this wasted computational resources will become apparent, and the program’s performance will be reduced.

In this linked list implementation, some methods do not need to check whether the linked list has been initialized. For example, the Front method and the Back method simply return nil if they find that the length of the linked list is 0. For example, in methods used to delete elements, move elements, and insert elements, all we need to do is to check whether the pointer that points to the linked list in the incoming element is equal to the pointer of the current linked list.

If they are not equal, it means that the incoming element is not in this linked list, so we don’t need to proceed with further operations. On the contrary, if they are equal, it means that this linked list has been initialized.

The reason is that the PushFront method, PushBack method, PushBackList method, and PushFrontList method of the linked list always check the status of the linked list first and initialize it if necessary. This is called lazy initialization.

Moreover, when we add a new element to an empty linked list, we will definitely call one of these four methods. At this time, the pointer in the new element that points to the linked list will be set to the pointer of the current linked list. Therefore, the equality of the pointers is both necessary and sufficient condition for the linked list to be initialized.

Do you understand? List makes use of its own structure as well as the structure of Element to balance the advantages and disadvantages of lazy initialization, making the linked list ready to use and achieving optimal performance.

Question 2: What is the difference between Ring and List?

The Ring type in the container/ring package implements a circular linked list, commonly known as a ring. In fact, List itself is a circular linked list. Its root element never holds any actual element value, and its existence is to connect the beginning and end of the circular linked list.

So it can also be said that the zero value of List is an empty linked list that only contains the root element but does not contain any actual element values. Then, since Ring and List are essentially both circular linked lists, what are the differences between them?

The main differences are as follows.

  1. The Ring data structure can be represented by itself alone, while the List type needs to be represented by both itself and the Element type. This is a difference in representation and complexity of the structure.
  2. Strictly speaking, a Ring value represents only one element in its associated circular linked list, while a List value represents a complete linked list. This is a difference in representation dimension.
  3. When creating and initializing a Ring value, we can specify the number of elements it contains, but for a List value, we cannot do so (nor is it necessary). Once a circular linked list is created, its length is immutable. This is the first difference in functionality between the New functions in the two packages and the initialization of the two types.
  4. A var r ring.Ring statement will create a circular linked list with a length of 1, while the zero value of the List type is a linked list with a length of 0. Don’t forget that the root element in List does not hold any actual element values, so it is not included when calculating the length. This is the second difference in initializing the two types.
  5. The algorithm complexity of the Len method of a Ring value is O(N), while the algorithm complexity of the Len method of a List value is O(1). This is the most obvious difference in performance between the two.

The other differences are basically in methods. For example, circular linked lists also have methods for inserting, moving, or deleting elements, but they are more abstract to use, and so on.

Summary

Today we mainly discussed the implementation of linked lists in the container/list package. We explained in detail some key usage techniques and implementation characteristics of linked lists. Since this linked list implementation is actually a circular linked list internally, we also made a comparison with the circular linked list implementation in the container/ring package, including structure, initialization, and performance.

Questions for Thought

  1. What are the applicable scenarios for circular linked lists in the container/ring package?
  2. Have you used the heap in the container/heap package? What are the applicable scenarios for it?

Here, we don’t need to strive to master their implementation right away, using them correctly and effectively is the first step before advancing. Well, thank you for listening, and see you next time.


[1]: The List structure type has two fields, one is a field of type Element called root, and the other is an int field called len. As the names suggest, the former represents the root element, and the latter stores the length of the linked list. Note that both of them are package-private, which means that users cannot view or modify them.

For a declaration like l mentioned before, both its fields root and len will be assigned their respective zero values. The zero value of len is 0, which precisely indicates that the linked list does not yet contain any elements. Since root is of type Element, its zero value is an empty shell of that type, represented by the literal Element{}.

The Element type contains several package-private fields, which are used to store the previous element, the next element, and the pointer value to the associated linked list. There is also a public field named Value, which holds the actual value of the element and is of type interface{}. In the zero value of the Element type, the values of these fields are all nil.

References #

Comparison between Slices and Arrays #

Slices themselves have the characteristics of occupying less memory and being easy to create, but they are essentially arrays. One major advantage of slices is that they allow us to quickly locate and retrieve or modify elements in the underlying array using a window.

However, when we want to delete elements from a slice, it is not as simple. Element copying is generally unavoidable, even if we are only deleting one element, sometimes it can cause a large number of element movements. At this time, we also need to pay attention to “clearing” the empty element slots, otherwise it may cause memory leaks.

On the other hand, when a slice is frequently “resized”, new underlying arrays will be constantly generated. The amount of memory allocation and the number of element copies may be considerable, which will definitely have a negative impact on the performance of the program.

Especially when we do not have a reasonable and effective “shrink” strategy, the old underlying array cannot be reclaimed, and the new underlying array will also have a large amount of unused element slots. Excessive memory waste will not only reduce the performance of the program, but also may cause memory overflow and program crash.

From this, it can be seen how important it is to use slices correctly. However, a more important fact is that no data structure is a silver bullet. Isn’t that right? The characteristics and applicable scenarios of arrays are very distinct, and the same goes for slices. They are both native data structures in Go language and are easy to use. However, your collection toolbox should not only contain them. That’s why we use linked lists.

However, in comparison, a linked list generally occupies much more memory than an array containing the same elements. This is because the elements of the linked list are not stored contiguously, so neighboring elements need to save each other’s pointers. Not only that, each element also needs to store a pointer to the linked list it belongs to.

With these associations, the structure of the linked list becomes even simpler. It only needs to hold the head element (or root element). Of course, in order to prevent unnecessary traversal and calculation, it is also necessary to record the length of the linked list.

Click here to view the detailed code accompanying the Go language column article.