07 Arrays and Slices

07 Arrays and Slices #

Starting from this article, we officially enter Module 2 of the course. Before this, we have discussed a lot about the fundamentals of Go language and programming, and I believe you already have a good understanding of Go language development environment configuration, writing common source code files, and various concepts and programming techniques related to entities (especially variables) such as type inference, variable redeclaration, redeclared variables, type assertion, type conversion, alias types, and underlying types.

These are all important parts of what I consider to be the fundamentals of Go language programming, and they also serve as the cornerstone for the following articles. If you feel a bit overwhelmed in the learning process ahead, it may be because your fundamentals are not solid enough, so you can go back and review them again.

In this article, we will mainly discuss the array and slice types in Go language. Arrays and slices can sometimes confuse beginners.

The common point between them is that they both belong to collection types and their values can be used to store values (or elements) of a certain type.

However, the most important difference between them is: the length of an array (referred to as an array for simplicity) is fixed, while the length of a slice (referred to as a slice for simplicity) is variable.

The length of an array must be specified when it is declared and cannot be changed afterwards. It can be said that the length of an array is part of its type. For example, [1]string and [2]string are two different array types.

On the other hand, the type literal of a slice only contains the element type, without the length. The length of a slice can automatically increase with the increase in the number of elements, but it will not decrease with the decrease in the number of elements.

Array and Slice Literals

( Array and Slice Literals)

In fact, we can think of a slice as a simple encapsulation of an array, because in the underlying data structure of each slice, it always contains an array. The array can be called the underlying array of the slice, and the slice can be seen as a reference to a certain continuous segment of the array.

For this reason, the slice type in Go language belongs to reference types, and other reference types include map types, channel types, function types, etc. On the other hand, the array type in Go language belongs to value types, and other value types include basic data types and struct types.

Note that in Go language, there is no confusing “pass-by-value or pass-by-reference” problem like in Java and other programming languages. In Go language, we can determine whether it is “pass-by-value” or “pass-by-reference” just by looking at the type of the value being passed.

If the value being passed is of reference type, then it is “pass-by-reference”. If the value being passed is of value type, then it is “pass-by-value”. In terms of the cost of passing values, the cost of reference types is often much lower than that of value types.

We can use index expressions on arrays and slices to get specific elements. We can also use slice expressions on them to get a new slice.

We can get the length of an array or a slice by calling the built-in function len. We can get the capacity of an array or a slice by calling the built-in function cap.

However, it is important to note that the capacity of an array is always equal to its length and is immutable. The capacity of a slice is not like this, and its change follows a certain pattern.

Now let’s take a look at a question to understand this. Today’s question is: how to correctly estimate the length and capacity of a slice?

To help answer this question, I have written a simple command source file called demo15.go.

package main

import "fmt"

func main() {
    // Example 1.
    s1 := make([]int, 5)
    fmt.Printf("The length of s1: %d\n", len(s1))
    fmt.Printf("The capacity of s1: %d\n", cap(s1))
    fmt.Printf("The value of s1: %d\n", s1)
    s2 := make([]int, 5, 8)
    fmt.Printf("The length of s2: %d\n", len(s2))
    fmt.Printf("The capacity of s2: %d\n", cap(s2))
    fmt.Printf("The value of s2: %d\n", s2)
}

Let me describe what it does.

First, I use the built-in function make to declare a variable s1 of type []int, which is a slice. I pass 5 as the second argument to the make function, which specifies the length of the slice. I declare another slice s2 in a similar way, but this time I pass an additional argument 8 to specify the capacity of the slice.

Now, the specific question is: what are the capacities of the slices s1 and s2?

The typical answer to this question is: the capacity of slice s1 is 5, and the capacity of slice s2 is 8.

Problem Analysis #

Let’s analyze this problem. Why is the capacity of s1 5? It is because when I declared s1, I set its length to 5. When we use the make function to initialize a slice without specifying its capacity, its capacity will be set to the same as its length. If the capacity is specified during initialization, then the actual capacity of the slice will be the specified value. This is also the reason why the capacity of s2 is 8.

Let’s also clarify the length, capacity, and their relationship using s2. When I initialized the slice represented by s2, I specified both its length and capacity.

As I mentioned earlier, a slice can be seen as a simple layer of encapsulation over an array, because every underlying data structure of a slice will contain an array. The array can be referred to as the underlying array of the slice, and the slice can be seen as a reference to a contiguous segment of the array.

In this case, the capacity of the slice actually represents the length of its underlying array, which is 8 in this case. (Note that the underlying array of a slice is equivalent to what we referred to as an array earlier, and its length is immutable.)

Now, imagine this: You have a window through which you can see an array, but you may not be able to see all the elements of the array, sometimes only a continuous part of them.

Now, the array is the underlying array of slice s2, and the window is the slice s2 itself. The length of s2 actually indicates the width of this window, determining which continuous elements of its underlying array you can see through s2.

Since the length of s2 is 5, you can see the first element to the fifth element of its underlying array, corresponding to the index range of the underlying array of [0, 4].

The window represented by the slice will also be divided into small grids, just like the windows in our house. Each small grid corresponds to an element in the underlying array.

Let’s continue with s2 as an example. The leftmost small grid of this window corresponds exactly to the first element of its underlying array, which is the element at index 0. Therefore, it can be said that the elements pointed to by the indices from 0 to 4 in s2 correspond exactly to the 5 elements represented by the indices from 0 to 4 in its underlying array.

Please remember that when we use the make function or a slice value literal (e.g. []int{1, 2, 3}) to initialize a slice, the leftmost small grid of this window always corresponds to the first element of its underlying array.

However, when we generate a new slice based on an array or slice using a slice expression, the situation becomes more complex.

Let’s look at another example:

s3 := []int{1, 2, 3, 4, 5, 6, 7, 8}
s4 := s3[3:6]
fmt.Printf("The length of s4: %d\n", len(s4))
fmt.Printf("The capacity of s4: %d\n", cap(s4))
fmt.Printf("The value of s4: %d\n", s4)

There are 8 elements in the slice s3, which are integers from 1 to 8. The length and capacity of s3 are both 8. Then, I initialized the slice s4 using the slice expression s3[3:6]. The question is, what are the length and capacity of s4?

This is not difficult, subtraction will do the trick. First, you need to know what the two integers in the square brackets of the slice expression represent. Let me put it differently, [3, 6).

This is mathematical notation for an interval and is commonly used to represent a range of values. I have actually used it several times in this column. From this, we can see that [3:6] represents the index range of the elements in s3 that can be seen through the new window from 3 to 5 (note that 6 is not included).

The 3 here can be called the starting index, and 6 can be called the ending index. Then the length of s4 is 6 minus 3, which is 3. Therefore, it can be said that the elements pointed to by indices from 0 to 2 in s4 correspond to the 3 elements represented by indices from 3 to 5 in s3 and its underlying array.

Slice and array relationship

(Looking at the relationship between slice and array)

Now let’s look at the capacity. I mentioned earlier that the capacity of a slice represents the length of its underlying array, but this only applies when initializing a slice using the make function or a slice value literal.

The more general rule is: the capacity of a slice can be seen as the maximum number of elements in its underlying array that can be seen through this window.

Since s4 is obtained by applying a slice operation on s3, the underlying array of s3 is also the underlying array of s4.

And because, with the underlying array remaining unchanged, the window represented by the slice can be expanded to the right, until the end of its underlying array.

Therefore, the capacity of s4 is the length of its underlying array, 8, minus the starting index 3 in the slice expression mentioned above, which is 5.

Note that the window represented by the slice cannot be expanded to the left. In other words, we can never see the leftmost 3 elements in s3 through s4.

Finally, let me mentioned how to expand the window of a slice to the maximum. For s4, the slice expression s4[0:cap(s4)] can achieve this. I think you should understand. The result value of this expression (i.e. a new slice) will be []int{4, 5, 6, 7, 8}, with a length and capacity of 5.

Knowledge Expansion #

Question 1: How to estimate the growth of slice capacity?

Once a slice is unable to accommodate more elements, Go will find a way to expand its capacity. However, it does not modify the original slice, but generates a larger capacity slice and then copies the original and new elements to the new slice. In general, you can simply assume that the capacity of the new slice (referred to as new capacity) will be twice the capacity of the original slice (referred to as original capacity).

However, when the length of the original slice is greater than or equal to 1024, Go will use 1.25 times the original capacity as the baseline for the new capacity (referred to as new capacity baseline). The new capacity baseline will be adjusted (continuously multiplied by 1.25) until the result is not less than the sum of the original length and the number of elements to be appended (referred to as new length). In the end, the new capacity is often larger than the new length, but equal is also possible.

In addition, if we append too many elements at once, resulting in a new length greater than twice the original capacity, then the new capacity will be based on the new length. Note that, similar to the previous case, the final new capacity is often larger than the new capacity baseline. For more details, refer to the specific implementation of the growslice function in the slice.go file in the runtime package.

I have put some examples demonstrating the expansion strategy mentioned above in the demo16.go file. You can try running them.

Question 2: When will the underlying array of a slice be replaced?

Strictly speaking, the underlying array of a slice will never be replaced. Why? Although a new underlying array is definitely generated when expanding the slice, a new slice is also generated at the same time.

It just reassigns the new slice as a window of the new underlying array, without modifying the original slice or its underlying array.

Please remember that when there is no need for expansion, the append function returns a slice pointing to the original underlying array, while when expansion is needed, the append function returns a slice pointing to the new underlying array. So strictly speaking, the word “expansion” is not appropriate here, but since it has been widely used, there is no need to find a new term.

By the way, as long as the new length does not exceed the original capacity of the slice, appending elements to it using the append function will not cause expansion. It will only replace the elements immediately to the right of the slice window in the underlying array with the new elements. You can run the demo17.go file to enhance your understanding of these concepts.

Summary

In summary, we have discussed arrays and slices and their relationship today. Slices are based on arrays, are variable length, and very lightweight. The capacity of a slice is always fixed, and a slice is always bound to a certain underlying array.

In addition, the capacity of a slice will always be a value between the length of the slice and the length of the underlying array, and it is also related to the position of the element corresponding to the leftmost end of the slice window in the underlying array. You must remember the two methods of subtracting to calculate the length and capacity of a slice.

Moreover, if the new length is larger than the capacity of the original slice, then the underlying array will definitely be new, and the append function will also return a new slice. Also, you don’t have to pay too much attention to the details of the “expansion” strategy of a slice, as long as you can understand its basic rules and estimate it approximately.

Thinking Questions

Here we are still focusing on slice-related questions.

If there are multiple slices pointing to the same underlying array, what should you pay attention to?
How can you use the idea of “expansion” to “shrink” a slice? Please write the code.

Both of these questions are open-ended, so you need to think carefully. It’s best to use your brain and hands at the same time.

Click here to view the detailed code accompanying the Go language column article.