11 Understand Closure Principles Through the JS Engine’s Heap and Stack #

Hello, I’m Shikawa.

In the previous discussion on programming patterns, we mentioned closures.

If we say that the “birthplace” of a function is the scope, and its entire “lifetime” from birth to disposal is its lifecycle, then closures have the ability to break through these spatial and temporal limitations. How do they achieve this breakthrough?

In this lesson, we will explore the principle of closures by delving into the JavaScript compilation process and its data structures such as the stack and heap.

Static and Dynamic Scopes #

Let’s start with the concept of scope. Scope can be divided into static scope and dynamic scope.

Static scope depends on where variables and functions are declared, which can be thought of as their “birthplace,” and it is determined before their execution. Therefore, static scope is also known as lexical scope because the function’s “birthplace” is registered during lexical analysis.

Dynamic scope works differently. Under dynamic scope, the scope of a function is determined when the function is invoked, so it depends on where the function is called from, which can be thought of as its “residence,” and it can change at runtime.

The JavaScript code we write is usually compiled and executed by the web browser. This process involves compilation first and then execution. Hence, the scope of JavaScript code is determined during the compilation process by analyzing where it is declared, which belongs to static (lexical) scope.
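
As a minimal sketch of the difference, consider a function that reads a variable named scope. The function names readScope and callFromElsewhere are hypothetical and exist only for this illustration. Under JavaScript's lexical scoping, the lookup follows where readScope was declared, not where it was called:

```javascript
var scope = "global";

function readScope() {
    // Resolved against the place where readScope is declared (the outer scope).
    return scope;
}

function callFromElsewhere() {
    var scope = "local"; // this variable is invisible to readScope
    return readScope();
}

const lexicalResult = callFromElsewhere(); // "global"
```

Under dynamic scope, the same call would have looked at the caller's variables first and produced "local" instead.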

Now let’s take a look at how the scope of a function is defined during the code compilation phase and its lifecycle during the execution phase.

Scope: Code Compilation #

Let’s start with the scope. Taking V8 as an example, let’s have an overview of the process from compilation to execution when we open a page and run a piece of code. You can refer to the illustration below:

Image

In Lesson 10, we introduced the data types in JavaScript. Here, we focus on the stack space and heap space shown in the red dashed box in the figure above, where data is stored and processed.

Image

The stack is a linear, contiguous storage space. It mainly stores JavaScript primitive values and the addresses (references) of complex data types such as objects, along with the execution state of functions and the this value. The heap is a tree-like, non-contiguous storage space that holds the complex data themselves, such as objects, arrays, and functions, as well as the built-in window and document objects.
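
A small sketch can make the difference between copying a value and copying an address visible. The variable names are hypothetical, and the stack and heap placement described in the comments is the simplified model used in this lesson; a real engine is free to optimize where values actually live.

```javascript
// Primitive: the value itself is copied, so the two names are independent.
let count = 1;
let countCopy = count;
countCopy = 2;            // count is still 1

// Object: only the address (reference) is copied; both names lead to one object.
const user = { visits: 1 };
const sameUser = user;
sameUser.visits = 2;      // user.visits is now 2 as well
```

Changing countCopy leaves count untouched, while changing the object through sameUser is visible through user, because both hold the same address.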

After discussing storage, let’s look at the process from lexical analysis to syntax analysis by using a piece of code below.

var base = 0;
var scope = "global";
function addOne () {
    var base = 1;
    return base + 1;
}

function displayVal () {
    var base = 2;
    var scope = "local";
    var increment = addOne(); // declared with var so it belongs to displayVal's scope
    return base + increment;
}

displayVal(); // 4

When we input the above code, it is first split into meaningful pieces called tokens. This process is called tokenizing or lexing. For example, var base = 0 is split into the keyword var, the identifier base, the assignment operator =, and the numeric literal 0. Lexical scope refers to the scope in which a piece of code is located at the time it is split into lexical tokens. It is shown in the red dashed box below:

Image
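
To get a feel for what tokenizing produces, here is a toy tokenizer. It is only a sketch and not how V8's scanner is implemented; the function name tokenize and the token type names are assumptions made for this illustration, and it recognizes just enough syntax for the sample code above.

```javascript
// Toy tokenizer: an illustration of lexing, not V8's actual scanner.
function tokenize(source) {
    const keywords = new Set(["var", "function", "return"]);
    // Words, integers, double-quoted strings, and a few single-character punctuators.
    const pattern = /[A-Za-z_$][\w$]*|\d+|"[^"]*"|[=+;(){}]/g;

    return Array.from(source.matchAll(pattern), ([text]) => {
        if (keywords.has(text)) return { type: "Keyword", value: text };
        if (/^\d/.test(text)) return { type: "Numeric", value: text };
        if (text.startsWith('"')) return { type: "String", value: text };
        if (/^[A-Za-z_$]/.test(text)) return { type: "Identifier", value: text };
        return { type: "Punctuator", value: text };
    });
}

const tokens = tokenize("var base = 0;");
// Token types in order: Keyword, Identifier, Punctuator, Numeric, Punctuator
```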

After lexical splitting, the next step is parsing, in which the tokens above are transformed into an abstract syntax tree (AST). This is the process of syntax analysis. At the top of this syntax tree, we can see a parent node, which is the variable declaration var. Below this parent node, there are two child nodes: one is the identifier base, and the other is the assignment expression with an equal sign. Below the assignment expression node, there is another child node representing the numeric literal 0. It is shown in the red dashed box below:

Image
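
For comparison, the tree for var base = 0 can be written out by hand as a plain object. The sketch below follows the ESTree convention used by many JavaScript tooling parsers, which is an assumption on our part: V8's internal AST is a different structure, and ESTree groups the identifier and its initial value under a VariableDeclarator node rather than an assignment expression node.

```javascript
// Hand-written, ESTree-style tree for: var base = 0
// (illustrative only; V8's internal AST uses its own node types)
const ast = {
    type: "VariableDeclaration",
    kind: "var",
    declarations: [
        {
            type: "VariableDeclarator",
            id: { type: "Identifier", name: "base" },
            init: { type: "Literal", value: 0, raw: "0" }
        }
    ]
};
```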

According to the red dashed box in the flowchart, after lexical analysis, while performing syntax analysis, the JavaScript engine updates the global scope and creates local scopes. In this code example, variables base and scope are added to the global scope; variables base, scope, and increment are added to the displayVal scope; and variable base is added to the addOne scope.

Image

After scope creation, the code above becomes intermediate code. V8 uses a “dual-wheel drive” design that combines an interpreter and a compiler for just-in-time (JIT) compilation. One wheel, the interpreter, executes the intermediate code directly; the other, the compiler, optimizes hot code into machine code before it is executed. This is done to balance and improve performance. We don’t need to delve deep into this in this lesson. We only need to know that after this, compilation is complete and our code enters the execution phase.

Feeling a bit dizzy? No worries, let’s summarize. From a spatial perspective, we understand that functions are stored in the heap and accessed through addresses in the stack. By going through the compilation process, we learn that scopes are established in the parsing phase, before the code is executed.

Lifecycle: Code Execution #

If scope is understood from the perspective of “space”, then lifecycle is understood from the perspective of “time”. Next, we will take a look at the process of a function, from invocation to completion, during the code execution stage, which is its lifecycle.

Image

Function Lifecycle #

Earlier, we mentioned the concepts of heap and stack. When JavaScript executes, execution contexts, starting with the global execution context, are placed in a stack-like data structure and are pushed and popped following the chain of function calls, which is why it is called the call stack. Below, we will look step by step at the stack generated by the code above.

At the beginning, base, scope, addOne, and displayVal will be stored in the variable environment. The executable code includes the assignments of base and scope, as well as the invocation of the displayVal() function. The displayVal function will be executed once the assignments are completed.

Image

When displayVal is executed, its execution context is pushed onto the stack. Since base and scope are declared at function level, their declarations are hoisted to the top of the function, above increment. Then, as executable code, the assignments of base and scope are completed. Next, the addOne function is invoked.

Image

After that, the execution context of addOne is pushed onto the stack in turn. Its own base variable is assigned, and then base + 1 is evaluated. After this step, the addOne context is popped off the stack and the result is returned as a value to the displayVal function.

Image

In the final step, increment within displayVal is assigned the value 2, and the function returns 2 + 2, which is 4. After this, the execution context of displayVal is popped off as well, leaving only the global execution context on the stack. The lifecycles of the addOne and displayVal calls end when their execution completes, and what they used is reclaimed in a subsequent garbage collection.

Image
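
The pushing and popping described above can be imitated with an ordinary array. This is only a simulation under our own assumptions: callStack, trace, enter, and leave are hypothetical helpers, and the real call stack is managed by the engine and is not accessible like this.

```javascript
// A hand-rolled stand-in for the engine's call stack, used only for tracing.
const callStack = ["global"];
const trace = [];

function enter(name) {
    callStack.push(name);
    trace.push(callStack.join(" > "));
}

function leave() {
    callStack.pop();
    trace.push(callStack.join(" > "));
}

function addOne() {
    enter("addOne");
    var base = 1;
    var result = base + 1;
    leave();
    return result;
}

function displayVal() {
    enter("displayVal");
    var base = 2;
    var increment = addOne();
    var result = base + increment;
    leave();
    return result;
}

const total = displayVal(); // 4
// trace is now:
// "global > displayVal"
// "global > displayVal > addOne"
// "global > displayVal"
// "global"
```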

Variable Lookup during Execution #

Earlier, we mentioned that the scope in JavaScript is determined during the compilation process through lexical analysis. Now let’s take a look at how the engine interacts with the scope during the process from compilation to execution.

Taking var base = 0 as an example once again, in the left part of the following diagram, we can see that when the compiler encounters var base, it asks the scope whether base already exists. If it does, the declaration is ignored. If base does not exist, the scope creates a new variable base, and then the engine handles the assignment base = 0.

At this point, the engine asks the scope whether a variable named base exists in the current execution scope. If it does, the assignment is executed; otherwise the search continues until the variable is found or an error is produced.

Image

There is another question worth pondering here. In the last step of the example above, we said that if the engine cannot find a variable in the current execution scope, it keeps searching or produces an error. This continued search proceeds from the inner scope outward. We can look at the following classic example of nested scopes.

In this example, we can see that the first layer is the global scope, which only contains a function declaration named outer. The middle layer is the function scope of outer, which contains a, a parameter of the outer function; b, a variable declaration; and inner, a nested function. Finally, the innermost layer is the function scope of inner, in which a, b, and c are used. When we execute outer(1), the engine starts searching in the innermost layer, the scope of inner, and if it doesn’t find the variable there, it searches in the scope of outer. In this case, a can be found in outer’s scope.

Image
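
Since the figure is not reproduced here, the nesting can be sketched in code. The names outer, inner, a, b, and c come from the description above; treating c as a parameter of inner and the arithmetic used are our assumptions for illustration.

```javascript
function outer(a) {
    var b = a * 2;

    function inner(c) {
        // c is found in inner's own scope;
        // a and b are not, so the lookup moves outward to outer's scope.
        return [a, b, c];
    }

    return inner(b * 3);
}

const found = outer(1); // [1, 2, 6]
```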

IIFE: Encapsulation using Scope #

From the example, we can see that scopes nest in a hierarchy of block-level, function-level, and global scopes. Both block-level and function-level scopes can help us encapsulate code and control code visibility. A commonly used function declaration can help us achieve this goal, but it has two problems:

  • The first problem is that if we use a function declaration for encapsulation, the declaration itself creates the name foo in the enclosing scope and pollutes the global scope.
  • The second problem is that we need to call it explicitly with foo().

The solution to both problems is to use an immediately invoked function expression (IIFE).

In the example below, we can see that when using this expression, we can use the first set of parentheses to encapsulate the function and call it immediately with the last set of parentheses.

var a = 2;
(function foo(){
    var a = 3;
    console.log( a ); // 3
})();
console.log( a ); // 2
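
As a quick check that the IIFE avoids both problems, we can test whether the name foo is visible outside. This is a sketch; innerValue and fooLeaked are hypothetical names introduced for the check.

```javascript
var a = 2;

var innerValue = (function foo() {
    var a = 3;   // shadows the outer a only inside the IIFE
    return a;
})();            // invoked immediately, no separate foo() call needed

// The name of a function expression is bound only inside the function itself,
// so it does not appear in the enclosing scope.
var fooLeaked = typeof foo !== "undefined"; // false
```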

Closures: Breaking Scope Limitations #

Earlier, we systematically learned about the scope and lifespan of functions, covering how variables and functions are called and executed in the stack, as well as their destruction and garbage collection. I referred to this as “keeping to the norm”. Now, let’s see how to break through these limitations, which can be considered “the unexpected”.

In the following code, we utilize three characteristics of functions:

  • We create a local variable ‘i’ inside the function.
  • We embed two function methods, ‘increment’ and ‘getValue’.
  • We use ‘return’ to return the functions as a result.

function createCounter(){
    let i = 0;
    function increment(){
        i++;
    }  
    
    function getValue(){
        return i; 
    }
    return {increment, getValue};
}

const counter = createCounter();

With the code above, we can see that after ‘createCounter’ has finished executing, the local variable ‘i’ should in theory have been destroyed and garbage collected. However, we can still access the internal variable ‘i’ by calling the ‘increment’ and ‘getValue’ methods. Each time we increase the value of ‘i’, we observe a result that keeps growing.

counter.increment(); 
counter.getValue(); // Returns 1
counter.increment();
counter.getValue(); // Returns 2

What is the principle behind this? We need to go back to the earlier step of parsing or syntax analysis.

During this process, when the JavaScript engine parses a function, it uses lazy parsing instead of eager parsing. This is done to reduce parsing time and memory usage. Therefore, during syntax analysis, only the ‘createCounter’ function is parsed, and the nested ‘increment’ and ‘getValue’ functions are not parsed further. However, the engine also has a preparser, which recognizes that ‘increment’ and ‘getValue’ reference the outer variable ‘i’. Thus, this variable is stored on the heap instead of the stack, so that its value is kept in memory after ‘createCounter’ returns. In this way, closures break through the limitations of scope and lifespan, while still adhering to the norm.
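
One way to observe that each call keeps its own captured ‘i’ is to create two counters and advance them separately. The sketch below reuses createCounter from above; the names first, second, values, and exposed are hypothetical.

```javascript
function createCounter() {
    let i = 0;
    function increment() {
        i++;
    }
    function getValue() {
        return i;
    }
    return { increment, getValue };
}

const first = createCounter();
const second = createCounter();

first.increment();
first.increment();
second.increment();

const values = [first.getValue(), second.getValue()]; // [2, 1]
const exposed = "i" in first;                         // false
```

Each call to createCounter keeps a separate ‘i’ alive, and ‘i’ can only be reached through the two returned functions, not as a property of the returned object.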

However, it is important to note that due to performance, memory, and execution speed considerations, when using closures, we should prefer local variables over global variables as much as possible.

Extension: Problems and Improvements #

Variable and Function Hoisting #

Let’s further explore variable and function hoisting. First, let’s look at an example where we separate the code var base = 2 into two lines. In the first line, base = 2 is an assignment expression, and in the second line, var base is a variable declaration. Common sense would tell us that this might result in base returning undefined because it is assigned a value before it is declared. However, when you execute console.log(base), you will see that the result is 2.

base = 2;
var base;
console.log(base); // 2

Why is this happening? This is because during the compilation and execution process of the code, the declaration var base is hoisted first, and then the assignment base = 2 is executed.

Image

In the following example, let’s see what happens if we first try to get the value of base through console.log, and then declare and assign a variable var base = 3. Shouldn’t it return 3 according to the principle of variable hoisting? The answer is undefined!

console.log(base); // undefined
var base = 3;

In this compilation and execution process, var base = 3 is split into two parts: the declaration var base and the assignment base = 3. Only the declaration of the variable base is hoisted; the assignment is not hoisted and still runs where it was written, after console.log. The following shows its lexical splitting and hoisting order.

Image
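
The effect of hoisting can be sketched by writing the same logic twice: once as typed, and once in the order the engine effectively treats it. Both function names, and the variable seen, are hypothetical; wrapping the code in functions only keeps the two versions from clashing.

```javascript
// As written: read first, declare and assign afterwards.
function asWritten() {
    var seen = base;   // base exists already, but holds undefined
    var base = 3;
    return [seen, base];
}

// As effectively processed: declarations first, assignments stay in place.
function asHoisted() {
    var seen;
    var base;
    seen = base;
    base = 3;
    return [seen, base];
}

const writtenResult = asWritten(); // [undefined, 3]
const hoistedResult = asHoisted(); // [undefined, 3]
```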

Like variables, function declarations are also hoisted to the top. Moreover, if a function and a variable with the same name are hoisted together, the function takes precedence over the variable. Another point worth noting is that, as mentioned in the previous lecture, only function declarations are hoisted; function expressions, like variable assignments, are not hoisted.
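
These three rules can be checked in one small sketch. All names here (hoistingDemo, declared, expressed, both) are hypothetical and chosen only for this illustration.

```javascript
function hoistingDemo() {
    // A function declaration can be called before the line where it appears.
    const early = declared();

    // A function expression cannot: only "var expressed" is hoisted,
    // so at this point expressed is undefined and calling it throws.
    let errorName = null;
    try {
        expressed();
    } catch (error) {
        errorName = error.name;
    }

    // Same name declared as both a variable and a function: the function wins.
    const typeOfBoth = typeof both;

    function declared() { return "declared"; }
    var expressed = function () { return "expressed"; };
    var both;
    function both() {}

    return { early, errorName, typeOfBoth };
}

const hoistingResult = hoistingDemo();
// { early: "declared", errorName: "TypeError", typeOfBoth: "function" }
```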

ES6 Block Scope #

However, variable and function hoisting still has a problem: it can lead to variables being overridden and scopes being polluted. Starting from ES6, in addition to global scope and function scope, block scope has been introduced to JavaScript. Therefore, besides var, the declarations let and const have been added. Variables and constants declared this way are block-scoped, and unlike var they cannot be accessed before their declaration: doing so throws an error instead of yielding undefined (this region is known as the temporal dead zone).

We can try this out. When we input console.log(base) and then declare base using let, we will see an error.

{
    console.log(base); // ReferenceError!
    let base = 0;
}
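
A slightly extended sketch shows what this error implies. Here an outer base exists as well, yet reading base inside the block still fails, because the block already knows about its own base from the start and simply refuses access until the let line has run. The function name tdzDemo and the outer variable are assumptions added for the illustration.

```javascript
function tdzDemo() {
    let base = "outer";
    let errorName = null;
    {
        try {
            // Refers to the inner base, which has not been initialized yet,
            // so the outer "outer" value is NOT used as a fallback.
            const early = base;
        } catch (error) {
            errorName = error.name;
        }
        let base = 0;
    }
    return errorName;
}

const tdzResult = tdzDemo(); // "ReferenceError"
```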

Similarly, in the example below, we can see that the count declared inside the curly braces of the if statement does not pollute the global scope.

var base = 1;
if (base) {
    let count = base * 2;
    console.log(count);
}
console.log(count); // ReferenceError
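
For contrast, the same block can hold one var and one let declaration. This is a sketch with hypothetical names (blockDemo, leaked, contained); typeof is used so that the missing variable can be checked without throwing.

```javascript
function blockDemo() {
    if (true) {
        var leaked = 1;      // function-scoped: visible after the block
        let contained = 2;   // block-scoped: gone once the block ends
    }
    return [typeof leaked, typeof contained];
}

const blockResult = blockDemo(); // ["number", "undefined"]
```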

Summary #

In this lesson, we spent a lot of space on “keeping to the norm”, that is, the scope and lifetime of a function in general. We spent less space on “the unexpected”, that is, how closures break through those limitations. Only when we understand how a function is actually compiled and executed, and walk through its life journey from the function’s own perspective, from its creation through lexical and syntax analysis, compilation and optimization, and on to execution, invocation, destruction, and garbage collection, can we clearly understand how to use the rules, or even break through their limitations.

At the same time, we also see that variable and function hoisting in JavaScript is somewhat counter-intuitive. Although we cannot call it a bug, some developers see it as a defect. Therefore, starting from ES6, block-level scope was introduced. I hope that by understanding the principles, you will have a clearer idea of how to use them.

Reflection questions #

When discussing functional programming, we mentioned that closures can serve as a data storage structure, much like objects. When discussing the object-oriented programming pattern, we also mentioned that closures can be used to create private attributes for objects. Besides these examples, can you give other examples that illustrate further uses of closures?

Feel free to share your experience in the comments section and let’s discuss together. Also, feel free to share today’s content with more friends. See you next time!