28 Performance How to Understand Parallelism and Concurrency in JavaScript Part 2 #

Hello, I’m Ishikawa.

In the previous lesson, we introduced the concepts of concurrency and parallelism and compared how different programming languages support multithreaded development. We also learned how message passing through postMessage lets the main thread and worker threads interact. However, we also found that JavaScript's multithreading is limited compared to other languages: message passing alone does not truly share data between threads, it only exchanges copies of it.

So today, let’s take a closer look at how we can truly share and modify data between multiple threads based on message passing. However, more importantly, we need to consider whether such modifications are really necessary.

SAB+Atomics Mode #

Previously, we mentioned that objects cannot be shared between threads: to pass information with postMessage, the data has to be deep copied. Is there any way to let different threads access the same data source at the same time? The answer is yes. To share data, which really means sharing memory, we need SAB (SharedArrayBuffer) and Atomics. Let’s start by understanding SAB.

Shared ArrayBuffer #

SAB is an ArrayBuffer whose memory block can be shared. Before discussing SAB, let’s first understand what ArrayBuffer is. This starts with memory. We can think of memory as a shelf in a storeroom. So that stored items can be found again, the slots have addresses from 1 to 9. What is stored in them is measured in bytes. A byte is usually the smallest addressable unit of memory, and on practically all current hardware it holds 8 bits; larger values, such as 32-bit or 64-bit numbers, occupy several consecutive bytes. The words bit and byte look and sound similar in English, so be careful not to confuse them.

Another point to note is that data storage in computers’ memory is binary. For example, the binary representation of the number 2 is 00000010, using 8 bits to represent it, as shown in the following diagram. If it is a letter, it can be converted to a number through a method like UTF-8, and then converted to binary. For example, the letter “H” is converted to the number 72, and then to the binary 01001000.
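As a quick check of the two conversions above, here is a minimal sketch. The helper name toBits is made up for this illustration; TextEncoder is the standard API that produces UTF-8 bytes.

```javascript
// Hypothetical helper: render one byte (0-255) as an 8-character bit string.
function toBits(byte) {
  return byte.toString(2).padStart(8, '0');
}

// The number 2 stored in a single byte.
const bitsOfTwo = toBits(2); // "00000010"

// The letter "H" encoded as UTF-8 gives a single byte with the value 72.
const [codeOfH] = new TextEncoder().encode('H');
const bitsOfH = toBits(codeOfH); // "01001000"

console.log(bitsOfTwo, codeOfH, bitsOfH);
```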

Image

In JavaScript, memory management is automatic. This means that when we write a line of code that creates data, the virtual machine automatically finds free space in memory and stores the data there. It also tracks whether that data can still be reached from our program. If it finds that the data can no longer be reached, it cleans it up. This process is called garbage collection.

If you write in C and compile it into WebAssembly, memory management is manual and there is no garbage collector: you call the memory allocation function (malloc) to find a place on a free list to store your data, and when you are done with it, you release the memory yourself with the release function (free).

Returning to the JavaScript scenario, why did we introduce automatic and manual memory management earlier?

This brings us back to the question we left open at the end of the previous lesson: why is using ArrayBuffer more efficient? If we use higher-level data types in development and hand all of the data handling over to a JavaScript virtual machine such as V8, we gain convenience, but we also give up room for fine-grained performance tuning. For example, when we create a variable, the virtual machine may spend 2-8 times the memory on working out its type and its representation in memory. Some ways of creating and using objects can also make garbage collection harder.

However, if we use a more primitive data type like ArrayBuffer, or write programs in C and compile them into WebAssembly, developers get more control: memory allocation can be managed precisely for the scenario at hand, which can improve program performance.

So what is the difference between an ArrayBuffer and the arrays we often use? Let’s take a look at the code below. In a regular array, we can have different types of data such as numbers, strings, objects, etc., but in ArrayBuffer, the only thing available is bytes.

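// Array: each element can hold a value of any type
var list = [5, {prop: "value"}, "a string"];

list[0]; // 5
list[1]; // {prop: "value"}
list[2]; // "a string"

// ArrayBuffer: nothing but raw bytes, shown here as bits
// 01001011101000000111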

Although the bytes in an ArrayBuffer can be written out as a series of digits, there is a problem: how does the machine know where one value ends and the next begins? As I mentioned earlier, the series of digits has no meaning by itself. It only acquires meaning once it is divided into units of 8, 32, or 64 bits. For that, we need a view to segment it.

Image

Therefore, the data in an ArrayBuffer cannot be directly manipulated. It needs to be accessed through a TypedArray or DataView.
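To make the role of a view concrete, the following sketch reads the same four bytes through different units. The variable names are chosen for this illustration only. DataView is used for the multi-byte reads because it lets us state the byte order explicitly, so the results do not depend on the machine.

```javascript
// Four raw bytes; without a view they are not accessible at all.
const rawBuffer = new ArrayBuffer(4);

// A Uint8Array view treats every byte as one unsigned 8-bit number.
const asBytes = new Uint8Array(rawBuffer);
asBytes.set([1, 2, 3, 4]);

// A DataView can read the same bytes in larger units.
const dataView = new DataView(rawBuffer);
const firstTwoBytesLE = dataView.getUint16(0, true);  // 0x0201     = 513
const allBytesLE = dataView.getUint32(0, true);       // 0x04030201 = 67305985
const allBytesBE = dataView.getUint32(0, false);      // 0x01020304 = 16909060

console.log(Array.from(asBytes), firstTwoBytesLE, allBytesLE, allBytesBE);
```

The bytes never change; only the unit and byte order chosen by the view decide which numbers we see.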

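// main.js
// Note: in browsers, SharedArrayBuffer is only available on cross-origin isolated pages
var worker = new Worker('worker.js');

// Create a 1 KB SharedArrayBuffer
var buffer = new SharedArrayBuffer(1024);

// Create a TypedArray view over the buffer
var view = new Uint8Array(buffer);

// Send the buffer to the worker; its memory is shared, not copied
worker.postMessage(buffer);

setTimeout(() => {
  // The first byte in the buffer, written by the worker
  console.log('later', view[0]); // later 5
  // A property set on the worker's buffer object is not shared
  console.log('prop', buffer.foo); // prop undefined
}, 1000);

// worker.js
self.onmessage = ({data: buffer}) => {
  // Only affects the worker's own SharedArrayBuffer object
  buffer.foo = 32;
  var view = new Uint8Array(buffer);
  // Writes into the shared memory block
  view[0] = 5;
};

In fact, a SharedArrayBuffer still travels through postMessage and the structured clone algorithm, just like any other message. The difference lies in what the receiver gets. With a regular message, the receiver gets an independent copy. With a SharedArrayBuffer, the receiving thread gets a new SharedArrayBuffer object that refers to the same block of memory, so bytes modified on one side after the message was sent are visible on the other side. A regular ArrayBuffer sits in between: it is copied by default, or it can be transferred, in which case the sender can no longer use it. Let’s compare the differences between regular postMessage, ArrayBuffer, and SharedArrayBuffer.

Image

Therefore, in the SharedArrayBuffer example above, the main thread can read the byte modified in worker.js simply by waiting with setTimeout, without any onmessage reply. The foo property, in contrast, does not arrive: it was set on the worker's own SharedArrayBuffer object, and only the bytes are shared. It is also worth noting that the number of bytes in a SharedArrayBuffer created this way is fixed and cannot be changed afterwards.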
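The three behaviours (copy, transfer, share) can be observed without starting a worker. The sketch below is a single-threaded stand-in, not a real multithreaded setup: structuredClone() applies the same structured clone algorithm that postMessage uses, and two views over one SharedArrayBuffer play the roles of the two threads. All variable names are assumptions made for this illustration.

```javascript
// 1. Copy: the clone is independent of the original.
const source = new Uint8Array([1, 2, 3]).buffer;
const copied = structuredClone(source);
new Uint8Array(copied)[0] = 99;
const sourceFirstByte = new Uint8Array(source)[0]; // still 1

// 2. Transfer: the bytes move to the receiver and the sender's buffer is detached.
const moved = structuredClone(source, { transfer: [source] });
const sourceLengthAfterTransfer = source.byteLength; // 0
const movedLength = moved.byteLength;                // 3

// 3. Share: both views look at the same memory block.
const sharedBlock = new SharedArrayBuffer(4);
const mainSideView = new Uint8Array(sharedBlock);
const workerSideView = new Uint8Array(sharedBlock);
workerSideView[0] = 5;
const seenOnMainSide = mainSideView[0]; // 5

console.log(sourceFirstByte, sourceLengthAfterTransfer, movedLength, seenOnMainSide);
```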

Atomics and Atomicity #

After discussing SharedArrayBuffer, let’s look at atomicity. Since data needs to be shared, we need to consider the issue of atomicity.

If you have experience in database development, you may have heard of the ACID principles, which stands for Atomicity, Consistency, Isolation, and Durability. Atomicity refers to the property of a transaction in which all operations either complete or none of them are performed. Any error that occurs during the execution of a transaction will cause all changes to be rolled back to the initial state. As a result, a transaction is indivisible and irreducible.

Why is atomicity so important in database development?

Consider this: looked at in isolation, a single client request may be atomic, but once several requests are involved, atomicity is no longer guaranteed automatically. Suppose a purchase consists of several requests that all belong to the same transaction. If the user pays successfully but the payment result never reaches the e-commerce interface, the transaction is incomplete. This not only risks financial loss but also creates a bad user experience. From this perspective, all the requests together must form one atomic transaction.

Similarly, in distributed system design, the interactions between different nodes in a network should also adhere to the principle of atomicity. Returning to threads, we can say that different threads in a computer should also maintain atomic operations on shared data.

Now you may wonder, as we mentioned earlier, our programs are prone to race conditions in concurrency. If we need to maintain atomicity in concurrent design, how do we deal with concurrency in JavaScript?

Don’t worry, this problem can be solved with the atomic operations that JavaScript provides through Atomics. Atomics offers the tools needed for atomic operations as well as thread-safe waiting mechanisms. It is a global object in JavaScript and comes with a set of built-in methods.

Image

When working with SAB memory, the methods above solve three types of problems: race conditions in a single operation, race conditions across multiple operations, and race conditions caused by instruction ordering. Let’s take a look at each of them.

Race conditions in a single operation #

You might wonder how a single operation can have a race condition. Let’s take an example: two worker threads each increment the same number by 1. You might think that no matter who goes first, the result is the same: the number grows by 2. However, the problem is not that simple. In the actual computation, the value is fetched from memory, stored in a register, and then operated on by the arithmetic logic unit (ALU). If worker thread 1 and worker thread 2 both fetch the number 6 at the same time, the result may be 7 instead of 8. Both threads read 6 before either has written its result back, so each computes 6 + 1 = 7 and writes 7: one of the two increments is overwritten and lost.
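The unlucky interleaving can be replayed step by step. The sketch below is a deterministic single-threaded simulation, not real concurrency: the two "workers" are just two pairs of statements, ordered by hand in the way that loses an update.

```javascript
// One shared 32-bit slot holding the number 6.
const sharedNumber = new Int32Array(new SharedArrayBuffer(4));
sharedNumber[0] = 6;

// Step 1: both workers fetch the value before either has written back.
const fetchedByWorker1 = sharedNumber[0]; // 6
const fetchedByWorker2 = sharedNumber[0]; // 6

// Step 2: each worker adds 1 to what it fetched and writes the result back.
sharedNumber[0] = fetchedByWorker1 + 1; // writes 7
sharedNumber[0] = fetchedByWorker2 + 1; // writes 7 again, overwriting the first write

const lostUpdateResult = sharedNumber[0];
console.log(lostUpdateResult); // 7, although two increments were performed
```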

Image

The methods mentioned above, such as Atomics.add(), Atomics.sub(), Atomics.and(), Atomics.or(), Atomics.xor(), and Atomics.exchange(), avoid this type of problem, because each of them performs its read, modification, and write as one indivisible step. If multiplication or division is needed, the corresponding functions can be built on top of Atomics.compareExchange().
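Here is a minimal sketch of both ideas. Atomics.add() and Atomics.compareExchange() are the real built-in methods; atomicMultiply is a hypothetical helper written for this illustration, using the usual retry loop around compareExchange(). The example runs in one thread, so the loop never actually has to retry here.

```javascript
const counter = new Int32Array(new SharedArrayBuffer(4));
counter[0] = 6;

// Atomics.add() reads, adds, and writes in one step and returns the old value.
const oldValue1 = Atomics.add(counter, 0, 1); // 6
const oldValue2 = Atomics.add(counter, 0, 1); // 7
const afterAdds = Atomics.load(counter, 0);   // 8

// Hypothetical helper: atomic multiplication built on compareExchange().
function atomicMultiply(typedArray, index, factor) {
  while (true) {
    const expected = Atomics.load(typedArray, index);
    const replacement = expected * factor;
    // The write only happens if the slot still holds `expected`;
    // the value actually found in the slot is returned either way.
    const found = Atomics.compareExchange(typedArray, index, expected, replacement);
    if (found === expected) {
      return replacement;
    }
    // Another thread changed the value in between: read again and retry.
  }
}

const afterMultiply = atomicMultiply(counter, 0, 3); // 24
console.log(oldValue1, oldValue2, afterAdds, afterMultiply);
```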

Race conditions in multiple operations #

After discussing race conditions in a single operation, let’s take a look at race conditions across multiple operations. Here we need the effect of a mutex lock, and JavaScript provides the building blocks for one through a mechanism modeled on the futex (fast userspace mutex) of the Linux kernel. It consists of two methods: Atomics.wait() and Atomics.notify(), the latter originally named Atomics.wake(). As the names suggest, one waits and the other wakes up waiting threads.
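A very small lock can be sketched on top of these two methods, using Atomics.notify(), the current name of the wake operation. The function names lock and unlock and the two constants are assumptions made for this example, and the sketch leaves out things a production lock needs, such as fairness and timeouts. It runs on the Node.js main thread or inside a worker; the browser main thread does not permit Atomics.wait().

```javascript
const UNLOCKED = 0;
const LOCKED = 1;

// Hypothetical helper: block until the lock at `index` has been acquired.
function lock(state, index) {
  // Try to flip UNLOCKED -> LOCKED in one atomic step.
  while (Atomics.compareExchange(state, index, UNLOCKED, LOCKED) !== UNLOCKED) {
    // Someone else holds the lock: sleep while the slot still says LOCKED.
    Atomics.wait(state, index, LOCKED);
  }
}

// Hypothetical helper: release the lock and wake one waiting thread, if any.
function unlock(state, index) {
  Atomics.store(state, index, UNLOCKED);
  Atomics.notify(state, index, 1);
}

// Atomics.wait() needs an Int32Array (or BigInt64Array) over shared memory.
const lockState = new Int32Array(new SharedArrayBuffer(4));

lock(lockState, 0);
const whileHeld = Atomics.load(lockState, 0); // 1 (LOCKED)
// ... the critical section, several operations on shared data, would go here ...
unlock(lockState, 0);
const afterRelease = Atomics.load(lockState, 0); // 0 (UNLOCKED)

console.log(whileHeld, afterRelease);
```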

When using locks, we need to be aware that in the browser the main thread is not allowed to block on a lock: calling Atomics.wait() there throws an error. In Node.js on the back end, the main thread is allowed to do so. The browser forbids it because blocking JavaScript’s main thread would freeze the page and badly affect the user experience. On the back end, there is no such direct impact.

If you need to wait on the front-end main thread, there is still a solution: Atomics.waitAsync(). Whereas wait() pauses the calling thread and returns a string describing the outcome, waitAsync() does not block and instead provides a promise that resolves with that outcome. A promise carries more overhead than pausing a thread, so waitAsync() performs slightly worse than wait(). For hot paths, which are the frequently executed parts of a program, it may therefore not be that useful. It is still useful for signaling, for example when changing a lock value is a more convenient way to notify another thread than sending a message.
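The shape of what waitAsync() returns can be seen in the following sketch. It assumes a runtime that already ships Atomics.waitAsync(), which not every engine does, and it plays both sides (waiter and notifier) in a single thread purely for illustration. According to the specification, the promise is fulfilled with either "ok" or "timed-out".

```javascript
const flag = new Int32Array(new SharedArrayBuffer(4)); // flag[0] starts at 0

// Case 1: the slot does not hold the expected value (1), so nothing is awaited
// and the answer comes back synchronously.
const mismatch = Atomics.waitAsync(flag, 0, 1);
// mismatch.async is false, mismatch.value is "not-equal"

// Case 2: the slot holds the expected value (0), so we get a promise instead
// of a blocked thread. The 100 ms timeout is an arbitrary choice.
const pending = Atomics.waitAsync(flag, 0, 0, 100);
pending.value.then((status) => {
  console.log('wait finished with status:', status);
});

// Normally another thread would do this part: change the value, then notify.
Atomics.store(flag, 0, 1);
Atomics.notify(flag, 0, 1);
```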

Race conditions caused by instruction ordering #

Lastly, let’s examine race conditions caused by instruction ordering. If you are familiar with how computers work at the chip level, you know that our code may be reordered in the instruction execution pipeline. In a single-threaded scenario this does not pose a problem, because the reordering is never observable from within that thread: any code that looks at the result runs only after the earlier code has completed. In a multi-threaded scenario, however, another thread may see one change before an earlier change has become visible, and then act on a half-updated state. How can we solve this problem?

This is where Atomics.store() and Atomics.load() come into play. All variable updates before Atomics.store() are guaranteed to be completed before writing the variable value back to memory, and all variable loads after Atomics.load() are guaranteed to be completed after fetching the variable value. This avoids race conditions caused by instruction ordering.
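A common way to use this guarantee is a "ready flag": the writer stores its data first and sets the flag last with Atomics.store(), and the reader checks the flag with Atomics.load() before touching the data. The sketch below shows the pattern in one thread; the slot layout and the names publish and tryRead are assumptions made for this example.

```javascript
const DATA_SLOT = 0;  // where the payload lives
const READY_SLOT = 1; // 0 = not published yet, 1 = published

// Writer side: plain write of the payload, then an atomic store of the flag.
// Writes made before Atomics.store() are completed before the flag is set.
function publish(shared, value) {
  shared[DATA_SLOT] = value;
  Atomics.store(shared, READY_SLOT, 1);
}

// Reader side: atomic load of the flag, then a plain read of the payload.
// Reads made after Atomics.load() happen after the flag has been fetched.
function tryRead(shared) {
  if (Atomics.load(shared, READY_SLOT) !== 1) {
    return null; // nothing published yet
  }
  return shared[DATA_SLOT];
}

const channel = new Int32Array(new SharedArrayBuffer(8));
const beforePublish = tryRead(channel); // null
publish(channel, 42);
const afterPublish = tryRead(channel);  // 42

console.log(beforePublish, afterPublish);
```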

Serialization of Data Transfer #

When using SAB, there is one more thing to note: the serialization of data. A SAB holds nothing but bytes, so when we use it to pass strings, booleans, or objects, we need an encoding step on the way in and a decoding step on the way out. This applies to objects in particular: they cannot be stored directly, so they first have to be serialized, for example into a string via JSON. For this kind of data it is therefore more suitable to use postMessage and onmessage than SAB.
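To give an idea of the extra work involved, here is a sketch of one possible encoding: a 4-byte length prefix followed by the UTF-8 bytes of the JSON string. The layout and the function names writeObject and readObject are assumptions made for this illustration, and the sketch omits the synchronization that would be needed if two threads really used the buffer at the same time.

```javascript
// Hypothetical helper: object -> JSON string -> UTF-8 bytes -> shared memory.
function writeObject(sab, obj) {
  const bytes = new TextEncoder().encode(JSON.stringify(obj));
  if (bytes.length + 4 > sab.byteLength) {
    throw new RangeError('object does not fit into the buffer');
  }
  // First 4 bytes: payload length (little-endian), then the payload itself.
  new DataView(sab).setUint32(0, bytes.length, true);
  new Uint8Array(sab, 4).set(bytes);
}

// Hypothetical helper: shared memory -> UTF-8 bytes -> JSON string -> object.
function readObject(sab) {
  const length = new DataView(sab).getUint32(0, true);
  // slice() copies the bytes out of shared memory into a regular buffer
  // before decoding.
  const copy = new Uint8Array(sab, 4, length).slice();
  return JSON.parse(new TextDecoder().decode(copy));
}

const exchange = new SharedArrayBuffer(256);
writeObject(exchange, { prop: 'value', count: 5, done: true });
const decoded = readObject(exchange);

console.log(decoded);
```

Compared with a single postMessage call, which does this kind of cloning for us, the bookkeeping here is entirely our own responsibility.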

Actor Model Pattern #

From the example above, we can see that using SAB+Atomics directly is quite complex. If we are not careful, it may cause performance problems that far outweigh the optimization effects. Therefore, unless it is a research and development project, for pure application projects, it is best to use mature libraries or WebAssembly to abstract complex memory management into simple interfaces, which will be more suitable. Additionally, we can also consider an alternative solution to SAB+Atomics, which is the Actor Model pattern.

In the Actor Model pattern, since actors are distributed in different processes, as mentioned earlier, the memory between processes is not shared. Each actor may not be on the same thread, and they manage their own data without being accessed by other threads, so there are no mutex locks and thread safety issues. Actors can only pass and receive information to each other.

Image

This pattern aligns more with the design of JavaScript, because JavaScript itself has little support for manual memory management. In the Actor Model pattern, threads are used only for passing information, not for sharing it. This does not mean that the work between the main thread and a worker thread ends with passing information. For example, the main thread can change the DOM based on the data it receives from the worker thread. It just has to do the conversion from message to DOM update itself.
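The core of the pattern can be sketched without real threads. In the example below, an "actor" is simply a closure that owns private state and can only be reached through messages; structuredClone() stands in for the copying that postMessage would perform between real workers. The name createCounterActor and the message shapes are assumptions made for this illustration.

```javascript
// Hypothetical actor: owns `total` and reacts only to messages.
function createCounterActor(reply) {
  let total = 0; // private state, never handed out by reference

  return {
    postMessage(message) {
      // Work on a copy, as a real worker would after postMessage.
      const { type, amount } = structuredClone(message);
      if (type === 'add') {
        total += amount;
      }
      // Answer with a fresh message rather than exposing internal state.
      reply(structuredClone({ type: 'total', total }));
    },
  };
}

// The "main thread" side only sends messages and reacts to replies.
const replies = [];
const counterActor = createCounterActor((message) => replies.push(message));

counterActor.postMessage({ type: 'add', amount: 2 });
counterActor.postMessage({ type: 'add', amount: 3 });

console.log(replies); // two replies, with totals 2 and 5
```

Because nothing is shared, there is nothing to lock; the price is that every interaction is a message.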

When it comes to transmitting large amounts of data, we should pay attention to some optimization strategies:

  • We can split the task into smaller chunks and transmit them one by one.
  • Each time, we can choose to transmit only the delta, which is the part that has changed, instead of transmitting the entire data.
  • If the transmission frequency is too high, we can bundle messages for transmission.
  • Lastly, using ArrayBuffer can improve performance, because an ArrayBuffer can be transferred to the other thread instead of being copied.
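Two of the strategies above, sending only the delta and bundling messages, can be sketched as small helpers. The names computeDelta and createBatcher, the shallow comparison, and the size-based flush rule are assumptions made for this example; a real application would pass something like worker.postMessage as the send function.

```javascript
// Hypothetical helper: keep only the top-level keys whose values changed.
function computeDelta(previous, next) {
  const delta = {};
  for (const key of Object.keys(next)) {
    if (previous[key] !== next[key]) {
      delta[key] = next[key];
    }
  }
  return delta;
}

// Hypothetical helper: collect messages and send them as one bundle.
function createBatcher(send, maxSize) {
  let queue = [];

  function flush() {
    if (queue.length > 0) {
      send(queue);
      queue = [];
    }
  }

  function push(message) {
    queue.push(message);
    if (queue.length >= maxSize) {
      flush();
    }
  }

  return { push, flush };
}

// Usage: an array stands in for the receiving thread.
const sentBundles = [];
const batcher = createBatcher((bundle) => sentBundles.push(bundle), 2);

const oldState = { x: 1, y: 2, label: 'a' };
const newState = { x: 1, y: 3, label: 'a' };
const stateDelta = computeDelta(oldState, newState); // { y: 3 }

batcher.push(stateDelta);
batcher.push({ y: 4 }); // second message reaches maxSize and triggers a send
batcher.push({ y: 5 });
batcher.flush();        // send whatever is left

console.log(stateDelta, sentBundles.length); // { y: 3 } 2
```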

Summary #

Through the study of these two lessons, we can see that the development of multithreading in front-end still has a long way to go.

We have also seen the SAB+Atomics pattern, which can be implemented in JavaScript to some extent. However, in reality, the Actor Model is easier to use in JavaScript, especially in front-end scenarios. Obviously, we don’t want to cause race conditions due to parallel modifications of the same set of objects by multiple threads. We also don’t want to add overly complex logic to support data encoding, decoding, and conversion for the sake of data sharing in memory. Nor do we want to add additional logic for manual memory management.

Although multithreading is still at an experimental stage in the front-end, I believe it has great potential. When the front-end has tasks that require a large amount of computation and memory, they can be handed over to worker threads, so that the JavaScript main thread can focus on UI rendering. With the Actor Model in particular, the performance of the main program can be greatly improved while avoiding side effects.

Thought-provoking question #

We have mentioned that objects cannot be shared between threads. Do you think it is possible to achieve object sharing through SharedArrayBuffer?

Feel free to share your answer, exchange learning experiences, or ask questions in the comments section. If you found it valuable, you are also welcome to share today’s content with more friends. See you in the next class!