08 Concurrency Fundamentals - Declaration and Use of Goroutines and Channels #

Before starting this lesson, let’s first recap the previous lesson’s thought question: Can there be multiple defer statements, and if so, what is the execution order?

To answer this question, we can directly test it by writing some code, as shown below:

package main

import "fmt"

func moreDefer() {
   defer fmt.Println("First defer")
   defer fmt.Println("Second defer")
   defer fmt.Println("Third defer")
   fmt.Println("Function's own code")
}

func main() {
   moreDefer()
}

In this code, I defined a function called moreDefer that contains three defer statements, and then called it from the main function. Running this program produces the following output:

Function's own code
Third defer
Second defer
First defer

From the example above, we can conclude that:

  1. In a method or function, there can be multiple defer statements.
  2. The execution order of multiple defer statements follows the LIFO (Last In, First Out) principle.

Deferred calls are kept on a stack that belongs to the enclosing function: each defer statement pushes its call onto the stack, so earlier statements sit closer to the bottom and later ones closer to the top. When the function returns, Go pops the calls off the top one at a time and runs them, which explains the order of execution in our example.
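
A loop makes the stack behaviour easier to see. The sketch below is only an illustration (the function name deferOrder is hypothetical, not part of the lesson's code): each iteration defers one call, and the deferred calls run in reverse order when the function returns. It also shows that the arguments of a deferred call are evaluated when the defer statement executes, not when the call finally runs.

```go
package main

import "fmt"

// deferOrder is a hypothetical helper that records the order in which
// deferred calls run. The named result lets the deferred closures
// append to the slice that is eventually returned.
func deferOrder() (order []int) {
	for i := 0; i < 3; i++ {
		// i is passed as an argument, so its value is captured
		// at the moment the defer statement executes.
		defer func(n int) {
			order = append(order, n)
		}(i)
	}
	return order
}

func main() {
	fmt.Println(deferOrder()) // prints [2 1 0]
}
```

The call deferred last (with n equal to 2) runs first, which matches the LIFO rule stated above.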

Now let’s begin this lesson. This lesson focuses on goroutines and channels, which are the foundation of concurrency in Go. I will start with these two basic concepts and guide you through a deep dive into Go’s concurrency.

What is Concurrency #

In the previous lessons, the code I wrote was executed in sequence, meaning that the next statement would only be executed after the previous one finished. This logical flow is simple and aligns with our reading habits.

However, this is not enough. Computers are powerful, and it would be a waste to make them finish one task before starting the next. For example, a music application lets you listen to music while it downloads songs. These two tasks run at the same time, and this is where concurrency comes into play. Concurrency allows the programs you write to perform multiple tasks simultaneously.

Processes and Threads #

When talking about concurrency, it is impossible to ignore threads. However, before introducing threads, let’s first discuss processes.

Processes #

In an operating system, processes are a very important concept. When you start a software application (e.g., a web browser), the operating system creates a process for it. This process serves as the working space for the software, containing all the resources needed for its operation, such as memory space, file handles, and the threads I will discuss next. The following image shows the processes running on my computer:

(Figure: processes running on a computer)

Now, what are threads?

Threads #

Threads are the units of execution within a process. A process can have multiple threads, which are scheduled and executed by the operating system. A thread might, for example, download a file or send a message. When multiple threads within a process are scheduled and executed at the same time, it is referred to as multithreaded concurrency.

When a program starts, a corresponding process is created, and the process also starts a thread called the main thread. If the main thread ends, the entire program terminates. Once the main thread is established, you can start many other threads from within the main thread, resulting in multithreaded concurrency.

Goroutines #

Go does not expose operating system threads to the programmer directly; instead, it provides goroutines. Goroutines are often described as coroutines, although the Go runtime schedules them onto operating system threads for you. Compared to threads, goroutines are much lighter, and a program can start thousands or even millions of goroutines.

Goroutines are scheduled by the Go runtime rather than by the operating system, which differentiates them from threads. In other words, Go schedules its own concurrency: it decides how many goroutines to execute simultaneously and when to execute them. This process is completely transparent to us as developers. We only need to tell Go how many goroutines to start; we don’t need to worry about how they are scheduled and executed.

Starting a goroutine is very simple because Go provides the go keyword, which simplifies the process compared to other programming languages. The following code illustrates this:

ch08/main.go

func main() {
   go fmt.Println("FeiXueWuQing")
   fmt.Println("I am the main goroutine")
   time.Sleep(time.Second)
}

This code starts a goroutine that calls the fmt.Println function to print “FeiXueWuQing”. So there are two goroutines in this code: the main goroutine started by the main function, and the goroutine I started myself using the go keyword.

From the example, we can summarize the syntax of the go keyword as follows:

go function()

The go keyword is followed by a method or function call, which starts a goroutine and runs the method or function in this newly created goroutine. When we run the above example, we can see the following output:

I am the main goroutine
FeiXueWuQing

As the output shows, the program is concurrent. The goroutine started by the go keyword does not block the execution of the main goroutine, which is why the main goroutine’s message is typically printed first.

Tip: time.Sleep(time.Second) pauses the main goroutine for one second. Without it, the main goroutine would finish and the program would exit before the newly started goroutine had a chance to print its result.
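
Sleeping for a fixed time is enough for a demonstration, but it is fragile: the pause may be too short or needlessly long. As an aside, the standard library type sync.WaitGroup, which this lesson does not otherwise cover, lets the main goroutine wait exactly until the goroutines it started have finished. The sketch below is only an illustration under that assumption; squareAll is a hypothetical name. It also gives a feel for how cheap goroutines are by starting ten thousand of them.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll is a hypothetical helper: it starts one goroutine per
// element and waits for all of them with sync.WaitGroup.
func squareAll(n int) []int {
	results := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(idx int) {
			defer wg.Done()
			// Each goroutine writes only to its own slot,
			// so no two goroutines touch the same element.
			results[idx] = idx * idx
		}(i)
	}
	wg.Wait() // block until every goroutine has called Done
	return results
}

func main() {
	r := squareAll(10000)
	fmt.Println(len(r), r[3], r[9999]) // prints 10000 9 99980001
}
```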

Channel #

Once several goroutines are running, how do they communicate with each other? This is the problem that channels in Go are designed to solve.

Declaring a channel #

In Go language, declaring a channel is very simple. Just use the built-in make function, as shown below:

ch := make(chan string)

Here, chan is a keyword that denotes a channel type, and the string after it is the type of the values the channel carries. The declaration also shows that a channel, like a slice or a map, is created with make.

Once the channel is defined, it can be used. A channel has only two operations: send and receive.

  1. Receive: Get a value from the channel. The <- operator is written before the channel, as in <-ch.
  2. Send: Put a value into the channel. The <- operator is written after the channel, as in ch <- value.

Tip: Send and receive both use the <- operator; only its position differs. For a receive, <- is on the left of the channel; for a send, <- is on the right of the channel.

Now I’ll modify the previous example to use a channel instead of the time.Sleep function for waiting, as shown in the code below:

ch08/main.go

package main

import "fmt"

func main() {
   ch := make(chan string)
   go func() {
      fmt.Println("FeiXueWuQing")
      ch <- "goroutine completed"
   }()
   fmt.Println("I am the main goroutine")
   v := <-ch
   fmt.Println("Received value from chan:", v)
}

After running this example, we can see that the program does not exit before the new goroutine has printed “FeiXueWuQing”. The channel achieves the same effect as the time.Sleep function, as shown below:

I am the main goroutine
FeiXueWuQing
Received value from chan: goroutine completed

We can understand it like this: the newly started goroutine sends a value to ch, and the main goroutine receives a value from ch. If no value is available, the receive blocks until one arrives. This is why the program does not exit before the new goroutine completes: the channel created by make starts out empty, so the main goroutine waits at v := <-ch until the other goroutine sends a value.

A channel is like a pipe set up between two goroutines. One goroutine can send data into this pipe, and the other can retrieve data from it. It is similar to a queue.

Unbuffered channel #

In the above example, the chan created by make is an unbuffered channel with a capacity of 0. It cannot store any data, so it only hands values over: a send blocks until another goroutine is ready to receive, and a receive blocks until another goroutine is ready to send. Because the two operations complete together, unbuffered channels are also called synchronous channels.
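
The synchronous hand-off can be sketched as follows. This is a minimal illustration, not code from the lesson; sumOverUnbuffered is a hypothetical name. One goroutine sends the numbers 1 to n, and the caller receives and adds them. Neither side can run ahead of the other, because every send has to meet a matching receive.

```go
package main

import "fmt"

// sumOverUnbuffered is a hypothetical helper: one goroutine sends
// 1..n on an unbuffered channel while the caller receives and sums.
func sumOverUnbuffered(n int) int {
	ch := make(chan int) // capacity 0: every send waits for a receive
	go func() {
		for i := 1; i <= n; i++ {
			ch <- i // blocks until the caller is ready to receive
		}
	}()
	sum := 0
	for i := 0; i < n; i++ {
		sum += <-ch // blocks until the sender has a value ready
	}
	return sum
}

func main() {
	fmt.Println(sumOverUnbuffered(4)) // prints 10
}
```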

Buffered channel #

A buffered channel is similar to a blocking queue where elements are consumed in a first-in-first-out (FIFO) order. The capacity of a buffered channel, which determines how many elements can be stored, can be specified as the second argument to the make function, as shown in the code below:

cacheCh := make(chan int, 5)

I created a channel with a capacity of 5, with elements of type int. This means that this channel can store up to 5 elements of type int, as shown in the following diagram:

(Figure: a buffered channel)

A buffered channel has the following characteristics:

  1. It has an internal buffer queue.
  2. Sending an element to a buffered channel inserts it at the end of the queue. If the queue is full, it blocks until another goroutine performs a receive operation, which frees up space in the queue.
  3. Receiving an element from a buffered channel retrieves it from the head of the queue and removes it. If the queue is empty, it blocks until another goroutine performs a send operation, inserting a new element.
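
The characteristics above can be sketched in a few lines. This is an illustrative example only; fillAndDrain is a hypothetical name. A single goroutine sends three values without any receiver being present, which works only because the buffer has room for them, and then reads them back in FIFO order.

```go
package main

import "fmt"

// fillAndDrain is a hypothetical helper. It sends three values from
// a single goroutine, which only works because the buffer has room
// for all of them, then receives them back in FIFO order.
func fillAndDrain() []int {
	ch := make(chan int, 3)
	ch <- 10 // does not block: buffer holds 1 of 3
	ch <- 20 // does not block: buffer holds 2 of 3
	ch <- 30 // does not block: buffer is now full
	// A fourth send here would block forever, because no other
	// goroutine exists to receive and free a slot.
	out := make([]int, 0, 3)
	for i := 0; i < 3; i++ {
		out = append(out, <-ch) // oldest element comes out first
	}
	return out
}

func main() {
	fmt.Println(fillAndDrain()) // prints [10 20 30]
}
```

With an unbuffered channel, the very first send in this function would block forever, since the same goroutine is responsible for the receive.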

Since a buffered channel is like a queue, we can retrieve its capacity and the number of elements it contains. This is shown in the code below:

ch08/main.go

cacheCh := make(chan int, 5)
cacheCh <- 2
cacheCh <- 3
fmt.Println("Capacity of cacheCh:", cap(cacheCh), ", number of elements:", len(cacheCh))

You can use the built-in functions cap and len to get the capacity and the number of elements in a channel, respectively.

Tip: An unbuffered channel is actually a channel with a capacity of 0. For example, make(chan int, 0).

Closing a channel #

Channels can also be closed using the built-in function close. The code below shows an example:

close(cacheCh)

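Once a channel is closed, you can no longer send data into it; attempting to do so causes a panic. You can still receive from a closed channel: any values remaining in the buffer are delivered first, and once the channel is empty, a receive returns immediately with the zero value of the element type. The two-value form v, ok := <-ch sets ok to false in that case, which distinguishes a closed channel from a genuine zero value.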
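
The behaviour of a closed channel can be sketched as follows. The example is illustrative only; drain is a hypothetical name. It relies on two language features: a for range loop over a channel, which receives until the channel is closed and empty, and the two-value receive v, ok := <-ch, whose second result is false once the channel is closed and drained.

```go
package main

import "fmt"

// drain is a hypothetical helper that closes a buffered channel and
// then reads everything that is still queued in it.
func drain() (values []int, zero int, ok bool) {
	ch := make(chan int, 2)
	ch <- 1
	ch <- 2
	close(ch) // no more sends allowed; queued values remain readable

	// range receives until the channel is closed and empty.
	for v := range ch {
		values = append(values, v)
	}

	// A further receive returns immediately with the zero value;
	// the second result is false because the channel is closed.
	zero, ok = <-ch
	return values, zero, ok
}

func main() {
	fmt.Println(drain()) // prints [1 2] 0 false
}
```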

Unidirectional channels #

Sometimes, we have special requirements where we want to limit a channel to only send or receive operations. These types of channels are called unidirectional channels.

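Declaring unidirectional channels is simple. Just include the <- operator in the channel type, as shown in the code below:

onlySend := make(chan<- int) // send-only: values can be sent, never received
onlyReceive := make(<-chan int) // receive-only: values can be received, never sent

Note that the <- operator sits in the same position in the type as in the corresponding operation: to the right of chan for sending (chan<-), and to the left of chan for receiving (<-chan). A channel created this way can never be used in the other direction, so in practice you usually create an ordinary bidirectional channel and pass it where a unidirectional type is expected.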

Unidirectional channels are often used as function or method parameters to prevent certain operations from affecting the channel.

In the example below, the counter function has a parameter out which is a channel that can only be used for sending. Inside the body of the counter function, the out parameter can only be used for sending. If a receive operation is attempted, the program will not compile.

func counter(out chan<- int) {
  // Can only send on the `out` channel inside this function
}
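
A fuller sketch of this pattern is shown below. It is an illustration under stated assumptions: the extra parameter n on counter and the companion function total are hypothetical additions, not part of the lesson's code. The channel is created as an ordinary bidirectional channel, and Go converts it implicitly to the restricted type at each call.

```go
package main

import "fmt"

// counter can only send on out. Closing a send-only channel is allowed.
// The parameter n is an assumption added for this sketch.
func counter(out chan<- int, n int) {
	for i := 1; i <= n; i++ {
		out <- i
	}
	close(out)
}

// total is a hypothetical companion function. It can only receive
// from in; a statement such as in <- 1 would not compile here.
func total(in <-chan int) int {
	sum := 0
	for v := range in {
		sum += v
	}
	return sum
}

func main() {
	ch := make(chan int) // bidirectional at the point of creation
	go counter(ch, 5)    // converted implicitly to chan<- int
	fmt.Println(total(ch)) // converted implicitly to <-chan int; prints 15
}
```

Restricting the direction in the signature documents which side owns the channel: here counter is the only sender, so it is also the one that closes it.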

Example of select+channel #

Let’s say we want to download a file from the internet. We start 3 goroutines to perform the downloads and send the results to 3 different channels. Whichever channel receives a value first will be used.

In this case, if we attempt to retrieve the result from the first channel, the program will block and we won’t be able to get the results from the other two channels or determine which one finished first. This is where multiplexing comes in. In Go, we can use the select statement for multiplexing. The format of the select statement is as follows:

select {
case i1 = <-c1:
   // todo
case c2 <- i2:
   // todo
default:
   // default todo
}

The structure of the select statement is similar to a switch statement, with cases representing different operations on channels.

Tip: Multiplexing can be thought of as listening to N channels at once. As soon as any of the channels is ready, the select statement executes the corresponding branch, where the received data can be processed.

With the select statement, we can now implement the download example. The code below shows how to do it:

ch08/main.go

package main

import (
   "fmt"
   "time"
)

func main() {

   // Declare three channels to store the results
   firstCh := make(chan string)
   secondCh := make(chan string)
   threeCh := make(chan string)

   // Start 3 goroutines for downloading
   go func() {
      firstCh <- downloadFile("firstCh")
   }()

   go func() {
      secondCh <- downloadFile("secondCh")
   }()

   go func() {
      threeCh <- downloadFile("threeCh")
   }()

   // Use the select statement for multiplexing. Whichever channel
   // receives a value first belongs to the download that finished first
   select {
      case filePath := <-firstCh:
         fmt.Println(filePath)
      case filePath := <-secondCh:
         fmt.Println(filePath)
      case filePath := <-threeCh:
         fmt.Println(filePath)
   }
}

func downloadFile(chanName string) string {

   // Simulate file downloading, you can try different time.Sleep durations
   time.Sleep(time.Second)
   return chanName + ":filePath"
}

If a case in the select statement is ready, it is executed. If several cases are ready at the same time, one of them is chosen at random, which gives each case an equal chance of being selected. If no case is ready, select runs the default branch when there is one and otherwise blocks until a case becomes ready. A select statement with no cases at all blocks forever.
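
Two common uses of these rules are sketched below. The example is illustrative only, and both function names are hypothetical. tryReceive uses a default branch so that the receive never blocks. receiveWithTimeout uses time.After from the standard library, which returns a channel that delivers a value once the given duration has elapsed, so the select gives up if nothing arrives in time.

```go
package main

import (
	"fmt"
	"time"
)

// tryReceive is a hypothetical helper: default makes the select
// non-blocking, so it returns at once when ch has nothing to offer.
func tryReceive(ch <-chan string) (string, bool) {
	select {
	case v := <-ch:
		return v, true
	default:
		return "", false
	}
}

// receiveWithTimeout is a hypothetical helper: it waits for a value
// but gives up after d. time.After returns a channel that delivers
// one value once d has elapsed.
func receiveWithTimeout(ch <-chan string, d time.Duration) (string, bool) {
	select {
	case v := <-ch:
		return v, true
	case <-time.After(d):
		return "", false
	}
}

func main() {
	ch := make(chan string, 1)
	_, ok := tryReceive(ch)
	fmt.Println(ok) // channel is empty: prints false

	ch <- "done"
	v, ok := tryReceive(ch)
	fmt.Println(v, ok) // prints done true

	_, ok = receiveWithTimeout(ch, 50*time.Millisecond)
	fmt.Println(ok) // nothing arrives within 50ms: prints false
}
```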

Summary #

In this lesson, I introduced how to create a goroutine using the go keyword, and how to use channels for communication between goroutines. These are the basics of concurrency in Go, and understanding them will help you master concurrency.

In Go, the recommended approach for sharing data in concurrent programs is summed up by the proverb “Do not communicate by sharing memory; instead, share memory by communicating.” In other words, passing values over channels is preferred to having several goroutines modify the same variables. In scenarios involving data flow and communication between goroutines, channels are therefore the preferred means of data transfer. Channels are safe for concurrent use and perform well.
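
As a closing illustration of this principle, the sketch below has several goroutines compute partial results and send them over a channel, while only the calling goroutine updates the running total. It is a minimal example under stated assumptions, not a prescribed pattern; sumSquares is a hypothetical name.

```go
package main

import "fmt"

// sumSquares is a hypothetical helper. Each worker sends its result
// on a channel; only the calling goroutine touches the running total,
// so no variable is shared between goroutines and no lock is needed.
func sumSquares(n int) int {
	results := make(chan int)
	for i := 1; i <= n; i++ {
		go func(v int) {
			results <- v * v
		}(i)
	}
	total := 0
	for i := 0; i < n; i++ {
		total += <-results // results may arrive in any order
	}
	return total
}

func main() {
	fmt.Println(sumSquares(10)) // prints 385
}
```

Had the workers added to a shared total variable directly, the program would contain a data race; routing the values through the channel avoids that by design.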