12 What Are the Six Types of Common Thread Pools? What Is Java 8’s ForkJoinPool? #

In this lesson, we will mainly learn about the 6 common types of thread pools and provide a detailed explanation of the ForkJoinPool thread pool introduced in Java 8. The 6 common types of thread pools are as follows:

  • FixedThreadPool
  • CachedThreadPool
  • ScheduledThreadPool
  • SingleThreadExecutor
  • SingleThreadScheduledExecutor
  • ForkJoinPool

FixedThreadPool #

The first type of thread pool is FixedThreadPool. Its core thread count equals its maximum thread count, so it can be regarded as a fixed-size thread pool. Its special feature is that, apart from the initial phase where the thread count ramps up from 0, the number of threads stays fixed once the pool is running. Even if the number of tasks exceeds the number of threads, the pool will not create more threads to handle them; instead, tasks beyond the threads' capacity wait in the task queue. And even when the task queue fills up, the pool cannot grow any further, because its maximum thread count is the same as its core thread count.


As shown in the picture, the thread pool has 10 threads t0~t9, which will continuously execute tasks. If a thread finishes executing a task, it will fetch a new task from the task queue and continue executing. During this process, the number of threads will neither increase nor decrease, always remaining at 10.
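The behavior described above can be sketched with a short, self-contained example; the pool size of 10 matches the figure, and the counter task is an arbitrary stand-in for real work:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedPoolDemo {
    public static void main(String[] args) throws InterruptedException {
        // Core and maximum thread counts are both 10, as in the figure.
        ExecutorService service = Executors.newFixedThreadPool(10);
        AtomicInteger completed = new AtomicInteger();

        // Submit more tasks than threads; the extra tasks wait in the queue.
        for (int i = 0; i < 100; i++) {
            service.execute(completed::incrementAndGet);
        }

        service.shutdown();
        service.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(completed.get()); // prints 100: every queued task ran
    }
}
```

Even though 100 tasks are submitted to 10 threads, none are rejected: the surplus simply waits in the queue until a thread frees up.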

CachedThreadPool #

The second type of thread pool is called CachedThreadPool, which can be referred to as a cache thread pool. The special feature of this thread pool is that the number of threads can almost increase indefinitely (in actuality, the maximum can reach Integer.MAX_VALUE, which is 2^31-1, a very large number that is almost impossible to reach). When the threads are idle, they can also be recycled. It means that the number of threads in this thread pool is not fixed and can dynamically increase or decrease. Of course, it also has a queue for storing submitted tasks, but this queue is a SynchronousQueue with a capacity of 0, which means it does not actually store any tasks. It only serves to transfer and pass on the tasks, so it is more efficient.

When we submit a task, the thread pool will check if there are any idle threads among the threads already created. If there are idle threads, it will assign the task directly to one of them. If there are no idle threads, it will create a new thread to execute the task, thus dynamically increasing the number of threads. Let’s take an example, as shown in the code below:

ExecutorService service = Executors.newCachedThreadPool();

for (int i = 0; i < 1000; i++) {
    service.execute(new Task());
}

When we use a for loop to submit 1000 tasks to the CachedThreadPool, what happens if these tasks take a long time to process? The submission operations in the for loop are very fast, while executing the tasks is slow, so all 1000 tasks may be submitted before the first one has finished. In this case, the CachedThreadPool dynamically scales: as tasks keep arriving, it will create up to 1000 threads to execute them. However, once the tasks are finished and no new ones arrive, this large number of idle threads would waste memory. At that point, any thread that stays idle for 60 seconds without receiving a task is destroyed, and eventually the thread count drops back to 0.

ScheduledThreadPool #

The third thread pool is called ScheduledThreadPool, which supports the execution of tasks at a scheduled or periodic interval. There are mainly three methods to achieve this, as shown in the code below:

ScheduledExecutorService service = Executors.newScheduledThreadPool(10);

service.schedule(new Task(), 10, TimeUnit.SECONDS);

service.scheduleAtFixedRate(new Task(), 10, 10, TimeUnit.SECONDS);

service.scheduleWithFixedDelay(new Task(), 10, 10, TimeUnit.SECONDS);

So what is the difference between these three methods?

  • The first method, schedule, is relatively simple: it runs a task once after a specified delay. With the parameter set to 10 seconds as in the code, the task executes once after a 10-second delay and then ends.
  • The second method, scheduleAtFixedRate, executes the task at a fixed rate. Its second parameter, initialDelay, is the delay before the first execution, and its third parameter, period, is the interval between the start of one execution and the start of the next.
  • The third method, scheduleWithFixedDelay, also executes the task periodically; the difference lies in how the period is measured. scheduleAtFixedRate counts from the start of each execution and launches the next one when the period elapses, regardless of how long the task takes. scheduleWithFixedDelay, by contrast, counts from the end of the previous execution and starts the next one only after the specified delay has passed.

Example #

Let me give you an example. Suppose a student is staying up late to write code and needs to drink coffee to stay awake. Let’s say it takes 10 minutes to drink each cup of coffee. If the student uses the second method, scheduleAtFixedRate, with a time interval set to 1 hour, they will drink a cup of coffee every hour on the dot. Here is the schedule:

  • 00:00: Start drinking coffee
  • 00:10: Finished drinking
  • 01:00: Start drinking coffee
  • 01:10: Finished drinking
  • 02:00: Start drinking coffee
  • 02:10: Finished drinking

However, if the student uses the third method, scheduleWithFixedDelay, with the time interval also set to 1 hour, the second coffee drinking time will be at 1:10 instead of 1:00 because each coffee takes 10 minutes, and scheduleWithFixedDelay starts counting from the completion time of each task. Here is the schedule:

  • 00:00: Start drinking coffee
  • 00:10: Finished drinking
  • 01:10: Start drinking coffee
  • 01:20: Finished drinking
  • 02:20: Start drinking coffee
  • 02:30: Finished drinking
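The two coffee schedules can be reproduced in miniature. This sketch scales everything down to milliseconds: the 200 ms "drinking time" and 500 ms interval are arbitrary stand-ins for 10 minutes and 1 hour:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ScheduleDemo {
    // Runs one scheduling mode and records when each "cup of coffee" starts.
    static List<Long> run(boolean fixedRate) throws InterruptedException {
        ScheduledExecutorService service = Executors.newSingleThreadScheduledExecutor();
        List<Long> starts = new CopyOnWriteArrayList<>();
        long t0 = System.currentTimeMillis();
        Runnable coffee = () -> {
            starts.add(System.currentTimeMillis() - t0);                        // when this cup starts
            try { Thread.sleep(200); } catch (InterruptedException ignored) { } // 200 ms to drink it
        };
        if (fixedRate) {
            service.scheduleAtFixedRate(coffee, 0, 500, TimeUnit.MILLISECONDS);
        } else {
            service.scheduleWithFixedDelay(coffee, 0, 500, TimeUnit.MILLISECONDS);
        }
        Thread.sleep(1200);
        service.shutdownNow();
        return starts;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Long> rate = run(true);    // starts near 0, 500, 1000: period runs start to start
        List<Long> delay = run(false);  // starts near 0, 700: 200 ms task + 500 ms delay
        System.out.println(rate.get(1) - rate.get(0) < 650);   // fixed-rate gap is ~500 ms
        System.out.println(delay.get(1) - delay.get(0) > 650); // fixed-delay gap is ~700 ms
    }
}
```

Within normal scheduler jitter, both comparisons print true: the fixed-rate gap stays at the period, while the fixed-delay gap stretches to task time plus delay, exactly like the 01:00 versus 01:10 coffee times above.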

SingleThreadExecutor #

The fourth thread pool is SingleThreadExecutor. It uses a single thread to execute tasks, similar to FixedThreadPool. The difference is that there is only one thread in this pool. If the thread encounters an exception during task execution, the thread pool will create a new thread to execute subsequent tasks. This thread pool is suitable for scenarios where all tasks need to be executed in the order they were submitted. The previous thread pools may not guarantee that the execution order of tasks is the same as the submission order because they execute tasks in parallel with multiple threads.
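A minimal sketch of the ordering guarantee described above; the task bodies here are arbitrary:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SingleThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService service = Executors.newSingleThreadExecutor();
        List<Integer> order = new CopyOnWriteArrayList<>();

        // With a single thread, tasks run strictly in submission order.
        for (int i = 0; i < 5; i++) {
            final int id = i;
            service.execute(() -> order.add(id));
        }

        service.shutdown();
        service.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(order); // prints [0, 1, 2, 3, 4]
    }
}
```

With a multi-threaded pool such as FixedThreadPool, the list could come out in any interleaving; the single thread is what makes the order deterministic.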

SingleThreadScheduledExecutor #

The fifth thread pool is SingleThreadScheduledExecutor. It is very similar to ScheduledThreadPool, but it is a special case of ScheduledThreadPool with only one thread. It is created as shown in the source code:

new ScheduledThreadPoolExecutor(1)

It sets the core thread count of ScheduledThreadPool to 1.
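A minimal sketch using the factory method that wraps the constructor call above; the delay and message are arbitrary:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SingleScheduledDemo {
    public static void main(String[] args) throws InterruptedException {
        // Equivalent to new ScheduledThreadPoolExecutor(1).
        ScheduledExecutorService service = Executors.newSingleThreadScheduledExecutor();
        service.schedule(() -> System.out.println("ran after delay"), 100, TimeUnit.MILLISECONDS);
        Thread.sleep(300); // give the delayed task time to fire
        service.shutdown();
    }
}
```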

| Thread pool | Core threads | Maximum threads | Keep-alive time |
| --- | --- | --- | --- |
| FixedThreadPool | constructor argument | same as core | 0 |
| CachedThreadPool | 0 | Integer.MAX_VALUE | 60 seconds |
| ScheduledThreadPool | constructor argument | Integer.MAX_VALUE | 0 |
| SingleThreadExecutor | 1 | 1 | 0 |
| SingleThreadScheduledExecutor | 1 | Integer.MAX_VALUE | 0 |

To summarize the five thread pools mentioned above, let’s compare them based on three dimensions: core thread count, maximum thread count, and thread keep-alive time, as shown in the table.

The first thread pool, FixedThreadPool, sets the core thread count and maximum thread count directly through the constructor, and their values are equal. Therefore, the maximum thread count will not exceed the core thread count, and there is no need to consider thread recycling. If there are no tasks to execute, the thread will stay alive and wait in the thread pool.

The core thread count of CachedThreadPool is 0, and its maximum thread count is Integer’s maximum value. The actual number of threads is generally not as high as that, so if there are many tasks and they are time-consuming, CachedThreadPool will create a large number of threads to handle them.

Using the same method, you can analyze the parameters of the other three thread pools after class to deepen your understanding.

ForkJoinPool #

Finally, let’s take a look at the sixth thread pool, ForkJoinPool. It was introduced in JDK 7, and its name, Fork/Join, describes its execution mechanism. Its basic usage is similar to that of the previous thread pools: tasks are handed to the pool for execution, and the pool keeps a task queue. However, ForkJoinPool differs from the previous thread pools in two important ways. The first is that it is very well suited to tasks that generate subtasks.

As shown in the figure, we have a Task that can generate three subtasks. After the three subtasks are executed in parallel and completed, they are aggregated and given to the Result. For example, if the main task needs to perform very heavy computation, we can divide the computation into three parts, which are independent and do not affect each other. This way, we can take advantage of multi-core CPUs and perform parallel computation, and then aggregate the results. There are two main steps involved: the first step is splitting (Fork), and the second step is joining (Join). By now, you should understand the origin of the name ForkJoinPool.

Let’s take the Fibonacci sequence as an example, which is often asked in interviews. The Fibonacci sequence has the property that the next number is the sum of the previous two numbers. The 0th number is 0, and the first number is 1. So the second number is 0 + 1 = 1, and so on. When writing code, we should prefer more efficient iterative forms or more advanced forms such as exponentiation or matrix formulas. However, assuming we wrote the initial version in recursive form, the pseudo code is as follows:

if (n <= 1) {
    return n;
} else {
    Fib f1 = new Fib(n - 1);
    Fib f2 = new Fib(n - 2);
    f1.solve();
    f2.solve();
    number = f1.number + f2.number;
    return number;
}

You can see that if n <= 1, n is returned directly. If n > 1, we first solve f1 for the term n - 1 and f2 for the term n - 2, and finally add the two values to obtain the result. In other words, the summation generates two subtasks. The process of calculating f(4) is shown in the following diagram:

img

When calculating f(4), we need to first calculate f(2) and f(3). Similarly, when calculating f(3), we need to calculate f(1) and f(2), and so on.

This is a typical recursive problem. In our ForkJoin pattern, the subtasks will also generate sub-subtasks, and then gradually accumulate to obtain the final result.

The ForkJoinPool thread pool has multiple methods to implement the splitting and summarization of tasks. One usage example is shown in the following code:

class Fibonacci extends RecursiveTask<Integer> {

    final int n;

    public Fibonacci(int n) {
        this.n = n;
    }

    @Override
    public Integer compute() {
        if (n <= 1) {
            return n;
        }
        Fibonacci f1 = new Fibonacci(n - 1);
        f1.fork();
        Fibonacci f2 = new Fibonacci(n - 2);
        f2.fork();
        return f1.join() + f2.join();
    }
}

We can see that the class extends RecursiveTask, a simple wrapper around ForkJoinTask. We override the compute() method, which returns n directly when n <= 1 and creates recursive subtasks f1 and f2 when n > 1. The fork() method splits off each subtask for asynchronous execution, and in the return statement the join() method gathers their results, achieving task splitting and summarization.

public static void main(String[] args) throws ExecutionException, InterruptedException {
    ForkJoinPool forkJoinPool = new ForkJoinPool();
    for (int i = 0; i < 10; i++) {
        ForkJoinTask<Integer> task = forkJoinPool.submit(new Fibonacci(i));
        System.out.println(task.get());
    }
}

The above code will print the values of Fibonacci sequence from the 0th to the 9th terms:

0
1
1
2
3
5
8
13
21
34

This is the first difference between the ForkJoinPool thread pool and other thread pools.
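As an aside, a common refinement of the recursive version shown earlier is to fork only one subtask and compute the other directly on the current worker thread, saving one fork per level. The following is a sketch of that variant, not the lesson's original code:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Variant that forks only one of the two subtasks; the other is
// computed inline on the current worker thread.
class Fibonacci2 extends RecursiveTask<Integer> {

    final int n;

    Fibonacci2(int n) {
        this.n = n;
    }

    @Override
    protected Integer compute() {
        if (n <= 1) {
            return n;
        }
        Fibonacci2 f1 = new Fibonacci2(n - 1);
        f1.fork();                        // run f1 asynchronously
        Fibonacci2 f2 = new Fibonacci2(n - 2);
        return f2.compute() + f1.join();  // compute f2 here, then wait for f1
    }

    public static void main(String[] args) {
        System.out.println(new ForkJoinPool().invoke(new Fibonacci2(9))); // prints 34
    }
}
```

The result is the same as before (the 9th Fibonacci number, 34), but half as many tasks pass through fork(), which reduces scheduling overhead.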

Now let’s look at the second difference: the internal structure. In the previous thread pools, all threads share a single common queue, whereas in ForkJoinPool each thread also has its own independent task queue, as shown in the diagram.


In addition to the shared task queue, ForkJoinPool maintains a private double-ended queue (deque) for each thread. When a thread's task is split, the resulting subtasks are placed into that thread's own deque rather than into the shared queue. If, say, three subtasks land in thread t1's deque, the cost of acquiring them is low for t1: it takes tasks directly from its own queue without competing for, or being blocked on, the shared queue (except in the "stealing" situation described next). This greatly reduces contention and context switching between threads, making the pool highly efficient.

Let’s consider another scenario: when there are multiple threads and the task for thread t1 is particularly heavy, resulting in dozens of subtasks being split, while thread t0 has no tasks to execute at this time and its own deque is empty. In order to improve efficiency, thread t0 will help execute tasks for t1. This is the meaning of “work-stealing”.

In the deque, thread t1 obtains tasks using the Last In First Out (LIFO) logic, while thread t0 “steals” tasks from thread t1’s deque using the First In First Out (FIFO) logic, as shown in the diagram. You can see that using the “work-stealing” algorithm and double-ended queues, the workloads of each thread are well balanced.
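Work-stealing itself happens inside the pool, but its effect can be observed indirectly by recording which worker threads end up executing subtasks. In this sketch (pool size and recursion depth are arbitrary choices), a single task tree of roughly 2000 subtasks is typically spread across several workers:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Each subtask records the name of the worker thread that ran it.
// With many more subtasks than workers, idle workers steal from busy
// ones, so several names usually appear (the exact spread varies).
public class StealDemo extends RecursiveAction {

    static final Set<String> workers = ConcurrentHashMap.newKeySet();
    final int depth;

    StealDemo(int depth) {
        this.depth = depth;
    }

    @Override
    protected void compute() {
        workers.add(Thread.currentThread().getName());
        if (depth > 0) {
            // Split into two subtasks, which land in this worker's deque.
            invokeAll(new StealDemo(depth - 1), new StealDemo(depth - 1));
        }
    }

    public static void main(String[] args) {
        new ForkJoinPool(4).invoke(new StealDemo(10));
        System.out.println("workers used: " + workers.size());
    }
}
```

On most runs all four workers participate even though only one root task was submitted, which is the work-stealing balancing described above in action.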


Finally, let’s use a panoramic view to describe the internal structure of the ForkJoinPool thread pool. You can see that the ForkJoinPool thread pool is very similar to other thread pools in many aspects, but the key difference is that each thread has its own double-ended queue to store the split subtasks. ForkJoinPool is very suitable for recursive scenarios such as traversing trees and searching for the optimal path.