17 What Happens When a Thread Calls the Start Method Twice

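Calling start() twice on the same thread throws IllegalThreadStateException. A thread can be started only once in its lifecycle: it moves from the New state to Runnable and then to Running, and it never returns to New. Calling start() again is therefore illegal in any state other than New, whether the thread is still running or has already terminated.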

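In the classic model, the thread lifecycle consists of the following states:

  1. New: a Thread instance has been created but not yet started.
  2. Runnable (ready): after start() is called, the thread becomes ready. It is eligible to run but has not yet been given CPU time.
  3. Running: the scheduler has assigned the ready thread to a CPU and it is executing.
  4. Blocked: the thread is suspended during execution for some reason, such as waiting for user input or for a resource.
  5. Terminated: the thread has finished executing or has exited because of an exception.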

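The transitions between these states are:

  • New to Runnable: by calling start().
  • Runnable to Running: through CPU scheduling.
  • Running to Blocked: by calling methods such as sleep() or wait().
  • Blocked to Runnable: once the blocking condition ends, the thread waits to be scheduled again.
  • Running or Runnable to Terminated: the thread finishes executing or an exception occurs.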

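Understanding the thread lifecycle and its state transitions is essential for writing multithreaded programs, because it helps us control the order in which threads execute and the states they are in.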

Typical Answer

A Java thread cannot be started twice. The second call to start() throws IllegalThreadStateException, which is an unchecked (runtime) exception; starting a thread more than once is considered a programming error.

As for the states of the thread lifecycle, since Java 5 they have been defined explicitly in the public nested enum java.lang.Thread.State:

  • NEW: represents the state when the thread is created but not yet started. It can be considered as an internal state of Java.
  • RUNNABLE: the thread is executing in the JVM. Because it needs computing resources, it may either be running on a CPU or sitting in the ready queue, waiting for the system to allocate CPU time to it. Some analyses distinguish an additional RUNNING state, but it cannot be expressed through the Java API.
  • BLOCKED: this state is closely related to the synchronization discussed in the previous lessons. Blocked means the thread is waiting for a monitor lock. For example, when a thread tries to acquire a lock through synchronized, but another thread already has exclusive access to it, the current thread will be in the blocked state.
  • WAITING: the thread is waiting for another thread to perform a particular action. A common scenario is the producer-consumer pattern: if the task condition is not met, the consumer thread waits, while a producer thread prepares the task data and then wakes the consumer with an action such as notify. Thread.join() also puts the calling thread into the waiting state.
  • TIMED_WAITING: entered under the same conditions as WAITING, except that the call specifies a timeout, as in the timed versions of wait() or join(). For example:
public final native void wait(long timeout) throws InterruptedException;
  • TERMINATED: whether it is an unexpected exit or normal completion, the thread has fulfilled its mission and terminated. Some people also refer to this state as “death”.
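
The states listed above can be observed directly through Thread.getState(). The following is a minimal sketch rather than a definitive test: the class name ThreadStateDemo and the helper observeStates() are made up for illustration, and a CountDownLatch is used only to hold the thread in a waiting state long enough to sample it.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;

class ThreadStateDemo {
    /** Samples the state of one thread before start, while parked, and after join. */
    static Thread.State[] observeStates() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                release.await(); // parks the thread until the latch is released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread.State beforeStart = t.getState(); // created but not started

        t.start();
        Thread.State whileParked;
        // Poll until the thread has actually parked inside await().
        while ((whileParked = t.getState()) != Thread.State.WAITING) {
            Thread.sleep(1);
        }

        release.countDown(); // let the thread finish
        t.join();
        Thread.State afterJoin = t.getState();

        return new Thread.State[] {beforeStart, whileParked, afterJoin};
    }

    public static void main(String[] args) throws InterruptedException {
        // prints [NEW, WAITING, TERMINATED]
        System.out.println(Arrays.toString(observeStates()));
    }
}
```

RUNNABLE and BLOCKED are harder to sample reliably because they depend on scheduling and lock contention, so this sketch leaves them out.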

When calling the start() method for the second time, the thread may be in the terminated state or some other (non-NEW) state, but regardless, it cannot be started again.
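
Both cases can be checked with a short program. This is an illustrative sketch: DoubleStartDemo and its helper methods are hypothetical names, and the latch exists only to keep the first thread alive while start() is called a second time.

```java
import java.util.concurrent.CountDownLatch;

class DoubleStartDemo {
    /** Returns true if start() on the given thread throws IllegalThreadStateException. */
    static boolean startThrows(Thread t) {
        try {
            t.start();
            return false;
        } catch (IllegalThreadStateException e) {
            return true;
        }
    }

    /** Second start() while the thread is still alive (no longer NEW, not yet TERMINATED). */
    static boolean restartWhileAlive() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                release.await(); // keep the thread alive until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        boolean threw = startThrows(t);
        release.countDown();
        t.join();
        return threw;
    }

    /** Second start() after the thread has already terminated. */
    static boolean restartAfterTermination() throws InterruptedException {
        Thread t = new Thread(() -> { });
        t.start();
        t.join(); // the thread is TERMINATED once join() returns
        return startThrows(t);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("while alive:       " + restartWhileAlive());
        System.out.println("after termination: " + restartAfterTermination());
    }
}
```

To run the same logic again, create a new Thread object (or submit the task to an executor) instead of reusing the old one.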

Analysis of examination points

Today’s question can be considered as a common warm-up question in interviews. The typical answer provided earlier is an introduction to the basic states and simple flow of threads. If you feel that it is not intuitive enough, I will analyze and introduce it with a comparative state diagram below. Overall, understanding threads is an essential foundation for our daily development or diagnostic analysis.

The interviewer may take this opportunity to test your mastery of threads from various perspectives:

  • Interviewers who focus on relatively theoretical aspects may ask you what threads are and how they are implemented at the Java underlying level.
  • Thread state transitions and interactions with concurrent tools such as locks.
  • Common pitfalls and recommendations in thread programming.

As you can see, just a thread alone requires a lot of knowledge to master. We will focus on the main points and begin our detailed analysis.

Knowledge Expansion

First of all, let’s take a closer look at what a thread is.

From the perspective of an operating system, a thread can be thought of as the smallest unit of scheduling. A process can contain multiple threads, which act as the actual executors of tasks and have their own stack, registers, thread local storage, etc. However, they share file descriptors, virtual address spaces, and other resources with other threads within the same process.

In specific implementations, threads can be divided into kernel threads and user threads. How Java threads are implemented depends on the virtual machine. In the well-known Sun/Oracle JDK, the thread implementation has evolved: after Java 1.2, the JDK abandoned so-called green threads, which were scheduled in user space. The current model maps each Java thread one-to-one to an operating system kernel thread.

If we look at the source code of the Thread class, we can see that most of its basic operations are called as native code through JNI.

private native void start0();
private native void setPriority0(int newPriority);
private native void interrupt0();

This implementation has pros and cons. Overall, the Java language benefits from fine-grained threads and the related concurrency operations, and its ability to build highly scalable large applications is beyond doubt. However, that complexity also raises the barrier to entry for concurrent programming. In recent years, languages such as Go have provided coroutines, greatly improving the efficiency of building concurrent applications. Meanwhile, Java has been developing lightweight user-mode threads in Project Loom (originally called Fibers), which were delivered as virtual threads in JDK 21.

Next, let’s analyze the basic operations of threads. You must be very familiar with how to create a thread. Please take a look at the following example:

Runnable task = () -> {System.out.println("Hello World!");};
Thread myThread = new Thread(task);
myThread.start();
myThread.join();

We can directly extend the Thread class and then instantiate it. But in this example, I chose another way, which is to implement a Runnable, put the code logic in the Runnable, then build a Thread object and start it, and finally wait for it to finish using join().

The advantage of using Runnable is that it is not limited by Java’s lack of support for multiple inheritance, allowing for code reuse. It is particularly useful when we need to repeat the execution of certain logic. It can also be better combined with modern Java concurrency libraries such as Executors, for example, we can completely rewrite the above start and join logic as follows:

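ExecutorService executor = Executors.newFixedThreadPool(1);
Future<?> future = executor.submit(task); // replaces new Thread(task).start()
future.get();                             // replaces join(): blocks until the task completes
executor.shutdown();                      // let the worker thread exit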

This way, we don’t have to worry about creating and managing threads, and we can also use mechanisms like Future to handle execution results more effectively. The lifecycle of a thread typically has no inherent connection with business logic. Confusing implementation requirements with business requirements will lower development efficiency.
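
To make the point about handling results concrete, here is a small self-contained sketch that submits a Callable and reads its value through a Future. The class name ExecutorDemo and the computation are placeholders chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorDemo {
    /** Runs a computation on a pool thread and returns its result. */
    static int computeOnPool() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<Integer> work = () -> 6 * 7;     // the "business logic"
            Future<Integer> future = pool.submit(work);
            return future.get();                      // blocks until the result is ready
        } finally {
            pool.shutdown();                          // always release the worker thread
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(computeOnPool()); // prints 42
    }
}
```

The caller never touches a Thread object here; the executor owns thread creation and reuse, and shutdown() in the finally block keeps the worker from holding the JVM open.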

Starting from the states of the thread lifecycle, what factors may affect the state of a thread in Java programming? The main factors are:

  • Thread-specific methods. Besides start(), there are several join() overloads that wait for a thread to finish, and yield() hints to the scheduler that the current thread is willing to give up the CPU. There are also deprecated methods such as resume(), stop(), and suspend(); destroy() and stop(Throwable) were removed outright in JDK 11.
  • The Object superclass provides the basic methods wait(), notify(), and notifyAll(). If we hold an object’s monitor lock, calling wait() releases the monitor and puts the current thread into a waiting state until another thread calls notify() or notifyAll() on the same object; the thread then has to reacquire the monitor before it continues. Essentially, this provides the ability to acquire and release the monitor, which is a basic means of inter-thread communication.
  • Utilities in concurrent libraries, such as CountDownLatch.await(), will put the current thread into a waiting state until the latch count reaches zero. This can be seen as a signaling mechanism for inter-thread communication.
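
As a sketch of the last point, the program below lets the calling thread wait on CountDownLatch.await() until a number of worker threads have each called countDown(). The class name LatchDemo, the method runWorkers(), and the worker count are assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDemo {
    /** Starts the given number of workers and waits until all have signalled completion. */
    static int runWorkers(int workers) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(workers);
        AtomicInteger finished = new AtomicInteger();

        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                finished.incrementAndGet(); // stand-in for real work
                done.countDown();           // signal that this worker is finished
            }).start();
        }

        done.await(); // the caller waits here until the count reaches zero
        return finished.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWorkers(4)); // prints 4
    }
}
```

While it is parked in await(), the calling thread reports the WAITING state, just as it would inside Object.wait().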

Here is a diagram showing the correspondence between states and methods:

Although the methods of Thread and Object may sound simple, they have been proven to be very obscure and error-prone in practical applications. That’s why Java later introduced the concurrency package. In general, with the concurrency package, in most cases, we no longer need to call methods like wait() or notify().

So far, we have discussed a lot of theory. Now let’s talk about the use of thread APIs, focusing on some aspects that are easily overlooked in daily work and study.

Let’s start with daemon threads. Sometimes, in an application, there is a need for a long-lived service program, but we don’t want it to prevent the application from exiting. In such cases, we can set the thread as a daemon thread. If the JVM finds that only daemon threads exist, it will terminate the process. You can refer to the code snippet below for specific examples. Note that it must be set before the thread starts.

Thread daemonThread = new Thread();
daemonThread.setDaemon(true);
daemonThread.start();
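
The ordering requirement is enforced by the API: calling setDaemon() on a thread that is already alive throws IllegalThreadStateException. A minimal sketch, with DaemonDemo as a made-up class name and a latch used only to keep the thread alive during the check:

```java
import java.util.concurrent.CountDownLatch;

class DaemonDemo {
    /** Returns true if setDaemon() on an already started thread is rejected. */
    static boolean setDaemonAfterStartThrows() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                release.await(); // keep the thread alive while we test setDaemon()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        t.setDaemon(true); // legal: the thread has not been started yet
        t.start();

        boolean threw;
        try {
            t.setDaemon(false); // too late: the thread is alive
            threw = false;
        } catch (IllegalThreadStateException e) {
            threw = true;
        }

        release.countDown();
        t.join();
        return threw;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("setDaemon after start threw: " + setDaemonAfterStartThrows());
    }
}
```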

Now let’s talk about Spurious wakeup. Especially in systems with multi-core CPUs, there is a possibility that a thread may be awakened without any thread broadcasting or signaling. If not handled properly, it can lead to strange concurrency issues. Therefore, when waiting for conditions, it is recommended to use the following pattern:

// Recommended: re-check the condition after every wakeup
while (isCondition()) {
    waitForACondition(...);
}

// Not recommended: a spurious wakeup falls through without re-checking
if (isCondition()) {
    waitForACondition(...);
}
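
A concrete version of the recommended pattern, written with Object.wait() and notifyAll(), might look like the sketch below. The class GuardedWaitDemo, the ready flag, and the method names are illustrative assumptions; here the loop condition is "not ready yet", so the thread keeps waiting until the flag is really set.

```java
class GuardedWaitDemo {
    private final Object lock = new Object();
    private boolean ready = false; // guarded by lock

    /** Blocks until markReady() has been called, tolerating spurious wakeups. */
    void awaitReady() throws InterruptedException {
        synchronized (lock) {
            while (!ready) {   // re-checked after every wakeup
                lock.wait();   // releases the monitor while waiting
            }
        }
    }

    /** Sets the flag and wakes up all waiting threads. */
    void markReady() {
        synchronized (lock) {
            ready = true;
            lock.notifyAll();
        }
    }

    /** Returns true if a waiting thread completes once the flag is set. */
    static boolean demo() throws InterruptedException {
        GuardedWaitDemo guard = new GuardedWaitDemo();
        Thread waiter = new Thread(() -> {
            try {
                guard.awaitReady();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        guard.markReady();
        waiter.join();
        return waiter.getState() == Thread.State.TERMINATED;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("waiter finished: " + demo());
    }
}
```

Because the flag is checked under the lock before waiting, the waiter also behaves correctly when markReady() happens to run first, a case the if version would handle only by luck.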

Thread.onSpinWait() is a feature introduced in Java 9. In Column 16, I mentioned the concept of “spin-wait”, which can be considered as a performance optimization technique for short-term waiting. It is not considered a lock, but rather a hint to the JVM. The JVM may utilize the CPU’s pause instruction to further improve performance. Applications with high performance sensitivity can pay attention to this feature.
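
As a rough sketch of how the hint is used (requires Java 9 or later), the loop below busy-waits on a flag that another thread sets, calling Thread.onSpinWait() on each iteration. SpinWaitDemo and spinUntilSet() are names invented for this example, and such a loop only makes sense when the wait is expected to be very short.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SpinWaitDemo {
    /** Spins until another thread sets the flag, then returns the flag's value. */
    static boolean spinUntilSet() throws InterruptedException {
        AtomicBoolean ready = new AtomicBoolean(false);
        Thread setter = new Thread(() -> ready.set(true));
        setter.start();

        while (!ready.get()) {
            Thread.onSpinWait(); // hint only: the program stays correct without it
        }

        setter.join();
        return ready.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("flag observed: " + spinUntilSet());
    }
}
```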

Another point to be cautious about is the use of ThreadLocal. It is a mechanism provided by Java to store thread-private information. It is effective throughout the lifespan of a thread, making it convenient to pass information between different business modules associated with a thread, such as transaction IDs, cookies, and other context-related information.

The implementation of ThreadLocal can be read in the source code. The data is stored in a per-thread ThreadLocalMap. Each entry holds its ThreadLocal key through a weak reference, while the value is referenced strongly, as shown in the code snippet below:

static class ThreadLocalMap {
  static class Entry extends WeakReference<ThreadLocal<?>> {
      // The value associated with this ThreadLocal.
      Object value;
      
      Entry(ThreadLocal<?> k, Object v) {
          super(k);
          value = v;
      }
  }
  // ...
}

Once the ThreadLocal key has been garbage-collected, the entry’s key reads as null and the entry becomes a “stale entry”. Its value is released only when the stale entry is cleaned up, which happens at a few key points: set, remove, and rehash.

The following is an example of set, which I have simplified and commented:

private void set(ThreadLocal<?> key, Object value) {
  Entry[] tab = table;
  int len = tab.length;
  int i = key.threadLocalHashCode & (len-1);

  for (Entry e = tab[i];; ...) {
      // ...
      if (k == null) {
          // Replace the stale entry
          replaceStaleEntry(key, value, i);
          return;
      }
   }

  tab[i] = new Entry(key, value);
  int sz = ++size;
  // Scans and cleans up stale entries, and checks if the capacity is exceeded
  if (!cleanSomeSlots(i, sz) && sz >= threshold)
      rehash(); // Clean up stale entries, and if the capacity is still exceeded, resize (double)
}

The specific cleaning logic is implemented in cleanSomeSlots and expungeStaleEntry. If you are interested, you can read about it yourself.

Combined with the reference types introduced in Column 4, you will find something special. Usually, weak references are used with reference queues for cleanup purposes, but ThreadLocal is an exception and does not do so.

This means stale entries are cleaned up only when one of those operations is explicitly triggered; otherwise they linger until the thread ends and its ThreadLocalMap is reclaimed. This is one source of many OutOfMemoryErrors. It is therefore usually recommended that the application call remove() itself once a value is no longer needed, and that ThreadLocal be avoided with thread pools, or used there with great care, because worker threads often never exit.
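
One common way to follow that advice is to pair every set() with remove() in a finally block. The sketch below assumes a hypothetical REQUEST_ID thread-local and a single-thread pool, so that both tasks are guaranteed to run on the same worker; the second task checks that nothing was left behind.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalCleanupDemo {
    // Hypothetical per-thread context, e.g. a request or transaction ID.
    private static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    /** Sets the context, does the work, and always clears the context afterwards. */
    static String handle(String id) {
        REQUEST_ID.set(id);
        try {
            return "handled " + REQUEST_ID.get();
        } finally {
            REQUEST_ID.remove(); // explicit cleanup: the worker thread may live forever
        }
    }

    /** Returns whatever value a later task sees on the same worker thread. */
    static String leftoverAfterTask() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> first = () -> handle("req-1");
            Callable<String> second = () -> REQUEST_ID.get();
            pool.submit(first).get();
            return pool.submit(second).get(); // same worker thread as the first task
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handle("req-0"));     // prints handled req-0
        System.out.println(leftoverAfterTask()); // prints null
    }
}
```

Without the remove() call, the second task would see "req-1", and the value would stay reachable for as long as the worker thread lives.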

Today, I introduced the basics of threads, analyzed the states in the thread lifecycle and the corresponding methods. This also helps us better understand the impact of synchronized and locks. I also mentioned some operations that need to be noted. I hope this helps you.

Practice Exercise

Do you have a clear understanding of the topic we discussed today? I have prepared an interesting question: write the simplest possible program that prints “Hello World”. When this application runs, what is the minimum number of threads Java will create? Then think about how to verify your conclusion accurately. The real situation may surprise you.

Please share your thoughts on this question in the comments. I will choose well-thought-out comments and reward you with a learning coupon. I welcome you to discuss with me.
