06 Things about the Executable Programs - Part 3 #

In the previous article, we discussed variables that share a name, specifically variables with the same name declared in different code blocks.

Do you remember? At the end, I emphasized that if same-named variables have different types, we need to pay special attention, because one may “shadow” the other.

When necessary, we need to strictly check their types. But how do we do that? Let’s talk about it now.

My question today is: How do we determine the type of a variable?

We will continue with the demo11.go file that was shown in the previous article.

package main

import "fmt"

var container = []string{"zero", "one", "two"}

func main() {
    container := map[int]string{0: "zero", 1: "one", 2: "two"}
    fmt.Printf("The element is %q.\n", container[1])
}

So, how do we correctly determine the type of the container variable before printing its element?

Typical Answer #

The answer is to use a “type assertion” expression. How exactly do we write it?

value, ok := interface{}(container).([]string)

This is an assignment statement. On the right side of the assignment operator is a type assertion expression.

The expression has two parts. The first, interface{}(container), converts the value of the container variable to an empty interface value.

The second, .([]string), checks whether the type of that value is the slice type []string.

The result of this expression can be assigned to two variables, represented by value and ok here. Variable ok is of type bool, representing the result of the type check, true or false.

If it is true, the value being checked is converted to a value of type []string and assigned to the variable value. Otherwise value is set to the zero value of the asserted type, which for []string is nil (meaning “empty”).

By the way, ok can also be omitted here. In other words, the result of the type assertion expression can be assigned to only one variable, which is value here.

But in this case, it will cause a panic when the check fails.

This kind of panic is called a runtime panic in Go, because it is raised while a Go program is executing.

Unless this kind of panic is explicitly “recovered”, it will cause the Go program to crash and stop. So, in general, it is advisable to use the form with the ok variable.
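To make the difference between the two forms concrete, here is a minimal sketch. The helper names asStringSlice and mustStringSlice are hypothetical and exist only for this illustration. The first uses the two-value form. The second uses the single-value form and recovers from the resulting runtime panic.

```go
package main

import "fmt"

// asStringSlice (hypothetical helper) uses the two-value form of the
// type assertion, which never panics.
func asStringSlice(v interface{}) ([]string, bool) {
	value, ok := v.([]string)
	return value, ok
}

// mustStringSlice (hypothetical helper) uses the single-value form, which
// panics when the check fails. The deferred function recovers from that
// runtime panic and reports it as an error instead.
func mustStringSlice(v interface{}) (value []string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	value = v.([]string)
	return value, nil
}

func main() {
	container := map[int]string{0: "zero", 1: "one", 2: "two"}

	value, ok := asStringSlice(container)
	fmt.Println(value == nil, ok) // true false

	_, err := mustStringSlice(container)
	fmt.Println(err != nil) // true
}
```

Without the deferred recover in mustStringSlice, the failed assertion would terminate the program.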

Problem Analysis #

To formally explain, the syntax for a type assertion expression is x.(T), where x is the value to be checked for its type. The current type of this value must be an interface type, although the specific interface type doesn’t matter.

So, if the variable container is of some interface type, the type assertion expression can be written as container.([]string). Is it clear now?

In Go, interface{} denotes the empty interface, and every type implements it. In the next module, I will talk about interfaces and their implementing types. For now, you just need to know that a value of any type can easily be converted to a value of the empty interface.

The specific syntax for this is interface{}(x), as shown earlier with interface{}(container).

You might be confused about the {} here - why do we have to add something to the right of the interface keyword?

Please remember that curly braces without anything inside can represent an empty code block, but they can also be used to represent a data structure (or data type) that contains nothing.

For example, you will definitely encounter struct{} in the future, which represents an empty struct type with no fields or methods.

The empty interface interface{} represents an empty interface type with no method definitions.

Of course, for some collection data types, {} can also represent a value with no elements, such as an empty slice value []string{} or an empty map value map[int]string{}.
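The different roles of {} can be seen side by side in a small sketch. The describe function is a hypothetical helper added for illustration. It accepts any value through the empty interface and reports its dynamic type.

```go
package main

import "fmt"

// describe (hypothetical helper) accepts any value via the empty
// interface and returns the name of its dynamic type.
func describe(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	var empty struct{}           // struct{} is a type with no fields
	emptySlice := []string{}     // {} here builds a slice value with no elements
	emptyMap := map[int]string{} // {} here builds a map value with no elements

	fmt.Println(describe(empty))
	fmt.Println(describe(emptySlice), len(emptySlice)) // []string 0
	fmt.Println(describe(emptyMap), len(emptyMap))     // map[int]string 0
}
```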

(Type assertion expression)

Now let’s look to the far right of the answer. The []string in parentheses is a type literal. A type literal is a sequence of characters used to represent a data type itself.

For example, string is a type literal representing the string type, and uint8 is a type literal representing the unsigned 8-bit integer type.

A bit more complex example is []string, which represents the slice type with elements of type string, and map[int]string, which represents the map type with keys of type int and values of type string.

There are also more complex type literals, such as struct type literals and interface type literals, which take up more space to explain. I will discuss them later.

For the current problem, I have written demo12.go, which is a modified version of demo11.go. In demo12.go, I have used two different methods to perform the type assertion. One method is the one I mentioned above, and the other method uses a switch statement, which we haven’t discussed yet. You can refer to it for now.
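The following is not the actual content of demo12.go, only a rough sketch of what the switch-based approach might look like. The function name kindOf and its return strings are assumptions made for this illustration.

```go
package main

import "fmt"

// kindOf (hypothetical helper) uses a type switch to branch on the
// dynamic type of v. Inside each case, t has the type named by that case.
func kindOf(v interface{}) string {
	switch t := v.(type) {
	case []string:
		return fmt.Sprintf("slice with %d elements", len(t))
	case map[int]string:
		return fmt.Sprintf("map with %d entries", len(t))
	default:
		return "unsupported type"
	}
}

func main() {
	container := map[int]string{0: "zero", 1: "one", 2: "two"}
	fmt.Println(kindOf(container)) // map with 3 entries
	fmt.Printf("The element is %q.\n", container[1])
}
```

A type switch is convenient when several candidate types need to be checked, since it avoids writing one type assertion per candidate.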

You can see that the answer to the current problem can be written in just one line of code. You might be thinking that this one line of code seems too complicated to explain.

Don’t worry about it. Most of this complexity comes from basic syntax and concepts that you simply need to remember. But this is also what I want to tell you: a small piece of code can hide a lot of detail. An interviewer can take the discussion in several directions from this one line, much like a drop of ink spreading across paper.

Further Knowledge #

Question 1. What are some important points to consider in the type conversion rules?

I have already shown the basic syntax of a type conversion expression earlier. The syntax form is T(x).

The x can be a variable, a literal representing a value (like 1.23 and struct{}{}), or an expression.

Note that if it is an expression, the result of the expression can only be a single value, not multiple values. In this context, x can be called the source value, its type is the source type, and the type represented by T is the target type.

If the conversion from the source type to the target type is not valid, it will result in a compilation error. What qualifies as valid? The specific rules can be found in the Conversions section of the Go Language Specification.

What we are concerned with here are not the issues that the Go language compiler can detect. On the contrary, we should focus on the things that are difficult to detect at the programming language level.

Many of the so-called pitfalls or traps that beginners talk about come from things they should know but do not. Therefore, from these rules I want to pick out three points that I think are commonly used and worth attention, and flag some “traps” in advance.

First of all, for type conversions between integer type values and integer constants, as a general rule, as long as the source value is within the representable range of the target type, it is valid.

For example, the reason why uint8(255) can convert the untyped constant 255 to a value of type uint8 is because 255 is within the range [0, 255].

However, it is important to note that the representable range of the source integer type may be larger than that of the target type, for example, converting the type of a value from int16 to int8. Consider the following code:

var srcInt = int16(-255)
dstInt := int8(srcInt)

The value of the variable srcInt is -255 of type int16, and the value of the variable dstInt is obtained by converting the former and its type is int8. The representable range of the int16 type is much larger than that of the int8 type. The question is, what is the value of dstInt?

First, you need to know that Go, like computers in general, stores integers in two’s complement form, mainly because this simplifies integer arithmetic. The two’s complement of a negative number is obtained by taking the binary form of its absolute value, inverting every bit, and adding one.

For example, the two’s complement of the int16 value -255 is 1111111100000001. When this value is converted to type int8, Go simply discards the 8 high-order (leftmost) bits, leaving 00000001.

The leftmost remaining bit is 0, so the result is a non-negative integer, whose two’s complement is the same as its plain binary form. The value of dstInt is therefore 1.

Remember: when a conversion narrows the representable range of an integer value, Go only needs to drop the required number of high-order bits of its two’s complement form.

Another similar rule is: when converting a value of a floating-point type to a value of an integer type, the decimal part of the former will be completely truncated.
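Both truncation rules can be checked with a short sketch. The helper names narrow and truncate are hypothetical and serve only this illustration.

```go
package main

import "fmt"

// narrow (hypothetical helper) converts an int16 to an int8.
// Only the low 8 bits of the two's complement form survive.
func narrow(src int16) int8 {
	return int8(src)
}

// truncate (hypothetical helper) converts a float64 to an int,
// discarding the fractional part.
func truncate(f float64) int {
	return int(f)
}

func main() {
	srcInt := int16(-255)
	fmt.Printf("%016b\n", uint16(srcInt))        // 1111111100000001
	fmt.Println(narrow(srcInt))                  // 1
	fmt.Println(truncate(1.99), truncate(-1.99)) // 1 -1
}
```

Note that srcInt is a variable here. Converting a constant that does not fit into the target type is rejected by the compiler instead of being truncated.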

Second, although an integer value can be converted directly to a value of type string, the integer should represent a valid Unicode code point. Otherwise the result of the conversion is "�" (a string consisting only of the replacement character, usually rendered as a highlighted question mark).

The Unicode code point of the character '�' is U+FFFD. It is the Replacement Character defined in the Unicode standard, used to replace unknown, unrecognized, and unrepresentable characters.
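As a small sketch of this rule, the hypothetical helper codePointToString below converts an integer to a string by way of rune. Going through rune keeps the intent explicit, and recent versions of go vet warn about converting other integer types to string directly.

```go
package main

import "fmt"

// codePointToString (hypothetical helper) returns the string that holds
// the single code point n, or the replacement character U+FFFD when n is
// not a valid Unicode code point.
func codePointToString(n int) string {
	return string(rune(n))
}

func main() {
	fmt.Println(codePointToString(65))               // A
	fmt.Println(codePointToString(0x4F60))           // 你
	fmt.Printf("%+q\n", codePointToString(0x110000)) // "\ufffd"
}
```

The value 0x110000 lies just past the largest valid code point, U+10FFFF, so the conversion falls back to the replacement character.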

I certainly won’t ask “which integer value will be converted into which string”. That would be crazy! But I will write:

string(-1)

and ask what the result will be. That is a completely different question. Since -1 cannot represent a valid Unicode code point, the result is always "�". In real work you may run into � while troubleshooting, and you need to know what could have produced it.

The third point concerns conversions between the string type and various slice types.

First, you need to understand that converting a value from the string type to the []byte type splits the UTF-8 encoded string into its individual bytes.

Apart from the characters that are compatible with ASCII, a single byte of UTF-8 encoding cannot represent a character on its own.

string([]byte{'\xe4', '\xbd', '\xa0', '\xe5', '\xa5', '\xbd'}) // 你好

For example, the three bytes \xe4, \xbd, and \xa0 in UTF-8 encoding combined represent the character '你', and \xe5, \xa5, and \xbd combined represent the character '好'.

Second, when a value is converted from the string type to the []rune type, it means that the string will be split into individual Unicode characters.

string([]rune{'\u4F60', '\u597D'}) // 你好
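The two snippets above convert slices into a string. The sketch below goes in the other direction, and the helper byteAndRuneCounts is a hypothetical name used only for this illustration. It shows that the same string yields six bytes but only two runes.

```go
package main

import "fmt"

// byteAndRuneCounts (hypothetical helper) returns the number of bytes
// and the number of Unicode code points in s.
func byteAndRuneCounts(s string) (int, int) {
	return len([]byte(s)), len([]rune(s))
}

func main() {
	s := "你好"
	fmt.Printf("% x\n", []byte(s))    // e4 bd a0 e5 a5 bd
	fmt.Printf("%U\n", []rune(s))     // [U+4F60 U+597D]
	fmt.Println(byteAndRuneCounts(s)) // 6 2
}
```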

When you truly understand the Unicode standard, its character sets, and encoding schemes, these concepts will become easy to grasp. What is the Unicode standard? I recommend you visit its official website to find out.

Question 2. What are alias types? What are underlying types?

We can declare various custom types using the type keyword. Of course, these types must fall within the category of Go’s basic types and advanced types.

Among them, there is a type called “alias type”. We can declare it as follows:

type MyString = string

This declaration statement means that MyString is an alias type for the string type. As the name suggests, the only difference between an alias type and its source type is the name; they are completely identical.

“Source type” and “alias type” are relative terms that only make sense as a pair. Alias types exist mainly to support code refactoring. For more detailed information, refer to the official Go document Proposal: Type Aliases.

There are two alias types among the built-in basic types of the Go language. byte is an alias type for uint8, and rune is an alias type for int32.
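A minimal sketch shows what “completely identical” means in practice. The function shout is a hypothetical helper. A MyString value can be passed or assigned wherever a string is expected, with no conversion.

```go
package main

import "fmt"

// MyString is an alias: just another name for the string type.
type MyString = string

// shout (hypothetical helper) takes a plain string parameter.
func shout(s string) string {
	return s + "!"
}

func main() {
	var a MyString = "hello"
	var s string = a // no conversion needed: the types are identical
	fmt.Println(shout(a), s)

	// %T prints the name of the source type, since an alias is not a new type.
	fmt.Printf("%T %T\n", a, byte(1)) // string uint8
}
```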

It is important to note that if I declare it as follows:

type MyString2 string // Note that there is no equal sign here.

MyString2 and string become two different types. MyString2 is a new type, different from any other type.

This is also called redefining a type (the Go Language Specification calls it a type definition). Here we have defined a new type, MyString2, based on the string type.

(Alias types, type redefinition, and underlying types)

For this type redefinition, string can be considered the underlying type of MyString2. The underlying type refers to the fundamental type of a certain type.

Values of different types with the same underlying type can be converted. Therefore, values of type MyString2 and type string can be converted using a type conversion expression.

However, this does not extend to collection types such as []MyString2 and []string. Their underlying types differ: the underlying type of []MyString2 is []MyString2 itself, and that of []string is []string.

Additionally, even when two different types share the same underlying type, their values cannot be compared with each other, and their variables cannot be assigned to each other, without an explicit conversion.
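These rules can be sketched as follows. The helper toStrings is hypothetical. It converts element by element because the slice types cannot be converted as a whole. The statements the compiler would reject are left as comments.

```go
package main

import "fmt"

// MyString2 is a new type whose underlying type is string.
type MyString2 string

// toStrings (hypothetical helper) converts each element explicitly,
// since []MyString2 cannot be converted to []string directly.
func toStrings(in []MyString2) []string {
	out := make([]string, 0, len(in))
	for _, v := range in {
		out = append(out, string(v))
	}
	return out
}

func main() {
	var m MyString2 = "go"
	s := string(m) // allowed: both types have the underlying type string

	// var t string = m             // rejected: MyString2 is not assignable to string
	// _ = m == s                   // rejected: mismatched types in comparison
	// _ = []string([]MyString2{m}) // rejected: different underlying types

	fmt.Println(s == string(m), toStrings([]MyString2{m, "lang"})) // true [go lang]
}
```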

Summary #

In this article, we focus on types. Every variable in Go has a type, and we can use type assertion expressions to determine the type of a variable.

Using this expression correctly takes a little care, such as always assigning the result to two variables. We also need to make sure that the value being asserted is of an interface type, which may require a type conversion expression.

When using a type conversion expression to convert the type of a variable, we are subject to a set of strict rules.

We must pay attention to the details in this set of rules, especially those that the Go compiler will not check for you. Otherwise you may fall into the so-called “traps”.

Furthermore, you should also understand the differences between alias type declarations and type redefinitions, and the differences in type conversion, equality testing, comparison, and assignment operations that they bring.

Thinking Questions #

There are two thinking questions in this article.

Besides those mentioned above, what other aspects of the type conversion rules do you think are noteworthy?
Could you please elaborate on the role of alias types in the code refactoring process?

The answers to these questions can all be found in the official documentation mentioned in the article.
