
11 Memory Management: What Does a Value Go Through From Creation to Destruction? #

Hello, I’m Chen Tian.

Since we started exploring Rust, we have been learning about ownership and lifetimes. By now, you should have a sufficient understanding of the core ideas behind Rust’s memory management.

Through its unique ownership model, Rust solves the problem that flexible heap memory is hard to release both easily and efficiently: it avoids the heavy mental burden and potential mistakes of manual memory release, and it also avoids the efficiency cost of globally introducing extra machinery such as a tracing GC or ARC.

However, the ownership model also introduces a lot of new concepts, from Move/Copy/Borrow semantics to lifetime management, making it somewhat challenging to learn.

But, have you noticed? Most of the concepts introduced, including Copy semantics and the lifetime of values, exist implicitly in other languages. Rust just defines them more clearly and delineates their scope of use more explicitly.

Today, following our previous path, let’s first organize and summarize the basic content of Rust memory management. Then, we’ll start with the fantastic journey of a value to see what experiences it has in memory from creation to destruction, integrating our previous topics.

At this point, you may be a little impatient, asking why today we are talking about memory again. This is because memory management is the core of any programming language, as important as the “inner strength” in martial arts. Only when we understand how data is created, placed, and destroyed in memory can we feel confident and at ease when reading code and analyzing problems.

Memory Management #

We mentioned heap and stack in our [first lecture], which are the main scenarios for using memory in code.

Stack memory “allocation” and “release” are highly efficient because they are determined at compile time. As a consequence, the stack cannot safely carry values whose sizes are dynamic or whose lifetimes extend beyond the current frame. For those we need heap memory, which can be freely manipulated at runtime, to compensate for the stack’s limitations.

Heap memory is flexible enough. However, managing the lifecycle of data on the heap has become a major concern for various languages.

C takes a hands-off approach, leaving manual control entirely to the developer; C++ improves on C by introducing smart pointers, offering semi-manual, semi-automatic control. Java and .NET use a GC to take full control of heap memory, ushering in the era of the managed heap. “Managed” means the code runs under a runtime that guarantees safe heap memory access.

The entire evolution of heap memory lifecycle management is shown in the following diagram: Heap Memory Lifecycle Evolution

The creators of Rust took a fresh look at the lifecycle of heap memory and observed that most heap allocations exist to hold dynamically sized data, and only a small portion needs a longer lifetime. So Rust by default ties the lifecycle of heap memory to the stack memory that uses it, while leaving a small escape hatch, the leak mechanism, that lets heap memory outlive the frame when necessary.

Let’s look at the comparative summary in the following diagram: - Heap and Stack Memory Comparison

With this basic understanding, let’s look at how Rust manages memory during the creation, use, and destruction of values.

I hope that after today’s content, when you see a Rust data structure, you can roughly imagine its layout in memory: which fields are on the stack, which are on the heap, and its approximate size.

Value Creation #

When we create a value for a data structure and assign it to a variable, it may be created on the stack or on the heap, depending on the properties of the value.

A quick recap: as we said in the [first] and [second lectures], values whose sizes can be determined at compile time are, in principle, placed on the stack, including Rust’s primitive types such as characters, arrays, and tuples, as well as user-defined fixed-size structs and enums.

If the size of the data structure is indeterminate or if its size is determinate but requires a longer lifespan during use, it is better placed on the heap.

Next, let’s look at the memory layout of several important data structures like structs, enums, vecs, and Strings when they are created.

Struct #

When Rust lays out a struct in memory, it may reorder the fields according to each field’s alignment so that the struct’s size and access efficiency are optimal. For example, a struct with three fields A, B, and C might be arranged in memory as A, C, B: Struct Memory Layout

Why would the Rust compiler do that?

Let’s look at the problems C encounters when representing a structure in memory. Consider a snippet with two structures, S1 and S2, each with three fields a, b, and c. Fields a and c are u8 and occupy one byte each, while b is u16 and occupies two bytes. S1 declares them in the order a, b, c, while S2 uses a, c, b:

Can you guess the sizes of S1 and S2?

#include <stdio.h>
#include <stdint.h>

struct S1 {
    uint8_t a;
    uint16_t b;
    uint8_t c;
};

struct S2 {
    uint8_t a;
    uint8_t c;
    uint16_t b;
};

int main() {
    // sizeof yields size_t, so %zu is the correct format specifier
    printf("size of S1: %zu, S2: %zu\n", sizeof(struct S1), sizeof(struct S2));
    return 0;
}

The correct answer is: 6 and 4.

Why does S1 have a size of 6 when its fields only need 4 bytes? This is because a CPU’s performance drops dramatically when it loads unaligned memory, so to avoid the performance impact of poorly ordered user-defined structures, the C compiler lays out a structure according to the following rules:

  1. First, determine the length and alignment length of each field. The alignment length of a primitive type is consistent with its type length.
  2. The starting position of each field must align with its alignment length. If it cannot align, add padding until it aligns.
  3. The alignment size of the structure is the same as the largest field’s alignment size, while the length of the structure is rounded to a multiple of its alignment.

On the surface, these three rules might seem like a tongue twister, but don’t worry. Combining them with the code we just discussed, it’s easy to understand.

For S1, field a is of type u8, so its length and alignment are both 1, while b is of type u16, so its length and alignment are both 2. Since a occupies only one byte, b’s natural offset would be 1, which violates the second rule: its starting position would not be aligned to its alignment. So the compiler inserts one byte of padding, moving b’s offset to 2 and aligning it. Struct Padding

Then c, with a length and alignment of 1, needs no padding, bringing S1 to 5 bytes. By the third rule, S1’s alignment is 2, and the smallest multiple of 2 that fits 5 bytes is 6, so S1’s final length is 6. This last rule ensures that when S1 is placed in an array, every element is properly aligned.

So, if the definition of a structure does not take these factors into account, a lot of space may be wasted for alignment. We see that S1 and S2, which store the same data, differ in size by 50%.

When using C, the best practice for defining a structure is to fully consider the alignment of each field and arrange them reasonably so that memory usage is most efficient. This task would be very difficult for developers, especially for nested structures, requiring careful calculation to find the optimal solution.

In Rust, the compiler optimizes this for us automatically: it rearranges the fields of the structs you define to make the layout as efficient as possible. Here’s the same code in Rust, where S1 and S2 are both 4 bytes (code):

use std::mem::{align_of, size_of};

struct S1 {
    a: u8,
    b: u16,
    c: u8,
}

struct S2 {
    a: u8,
    c: u8,
    b: u16,
}

fn main() {
    println!("sizeof S1: {}, S2: {}", size_of::<S1>(), size_of::<S2>());
    println!("alignof S1: {}, S2: {}", align_of::<S1>(), align_of::<S2>());
}

You can also look at this diagram to visually compare the behavior of C and Rust: C vs Rust Struct Behavior

Although Rust’s compiler optimizes struct layout by default, you can use the #[repr(C)] attribute to tell it not to reorder fields, making the layout match C’s. This allows Rust code to interoperate seamlessly with C code.

Now that we understand the layout of structs in Rust (tuples are similar), let’s look at enums.

Enum #

As we discussed earlier, enums in Rust are tagged unions. Their size is the size of the tag plus the length of the largest variant.

In [Lecture 3], when we defined enum data structures, we briefly touched on two classic designs: Option and Result. Option is the simplest enum, representing either a value or the absence of one; Result represents either success with data or an error, which we will cover in more detail later. For now, we only need to understand their memory layout.

According to the three alignment rules above, the memory after the tag is aligned to the payload’s alignment. Therefore, for an Option wrapping a one-byte value, the length is 1 + 1 = 2 bytes, and for a Result whose variants hold 8-byte values, the length is 8 + 8 = 16 bytes. Generally speaking, on a 64-bit CPU the maximum length of an enum is the length of its largest variant plus 8, since the maximum alignment on a 64-bit CPU is 64 bits, or 8 bytes.

The following diagram shows the layouts of enums, Option, and Result: Enum, Option, and Result Memory Layout

It’s worth noting that the Rust compiler applies additional optimizations to enums to make the memory layout of certain commonly used structures more compact. Let’s write some code to help you get a feel for the sizes of different data structures (code):

use std::collections::HashMap;
use std::mem::size_of;

enum E {
    A(f64),
    B(HashMap<String, String>),
    C(Result<Vec<u8>, String>),
}

// This is a declarative macro that prints the sizes of various data structures, their sizes inside an Option, and their sizes inside a Result
macro_rules! show_size {
    (header) => {
        println!(
            "{:<24} {:>4}    {}    {}",
            "Type", "T", "Option<T>", "Result<T, io::Error>"
        );
        println!("{}", "-".repeat(64));
    };
    ($t:ty) => {
        println!(
            "{:<24} {:4} {:8} {:12}",
            stringify!($t),
            size_of::<$t>(),
            size_of::<Option<$t>>(),
            size_of::<Result<$t, std::io::Error>>(),
        )
    };
}

fn main() {
    show_size!(header);
    show_size!(u8);
    show_size!(f64);
    show_size!(&u8);
    show_size!(Box<u8>);
    show_size!(&[u8]);

    show_size!(String);
    show_size!(Vec<u8>);
    show_size!(HashMap<String, String>);
    show_size!(E);
}

This code uses a declarative macro, show_size; don’t worry about how it works for now. When you run the code, you’ll find that wrapping pointer-based data structures such as &u8, Box, Vec, and HashMap in an Option takes no additional space, which is quite interesting.

Type                        T    Option<T>    Result<T, io::Error>
----------------------------------------------------------------
u8                          1        2           24
f64                         8       16           24
&u8                         8        8           24
Box<u8>                     8        8           24
&[u8]                      16       16           24
String                     24       24           32
Vec<u8>                    24       24           32
HashMap<String, String>    48       48           56
E                          56       56           64

For the Option structure, its tag has only two states, 0 or 1: 0 for None and 1 for Some.

Normally, when Option wraps a reference, the tag only needs a single bit of information, but on a 64-bit CPU the reference’s alignment is 8, so the tag plus padding would occupy 8 bytes, making the whole Option 16 bytes in total, quite a waste of memory. What can we do?

Rust exploits the fact that the first field of a reference type is a pointer, and a valid pointer can never be 0. So the pointer itself can be reused as the tag: 0 means None, and any other value means Some. This eliminates the extra memory entirely, a clever optimization we can learn from.

vec and String #

From the output above, we also see that String and Vec<u8> occupy the same size, 24 bytes. Indeed, if you look at the source code of the String structure, you can see that it is internally just a Vec<u8>.

The Vec<T> structure is a fat pointer of three words: a pointer to the heap memory (pointer), the allocated capacity of that memory (capacity), and the length of the data currently in it (length), as shown in the following diagram: Vec and String Memory Layout

Many dynamically sized data structures are laid out similarly in memory when created: fat pointers on the stack that point to data allocated on the heap. The Rc we introduced earlier is the same.

That’s all for value creation in terms of memory layout. If you’re interested in the memory layout of other data structures, you can visit cheats.rs, a cheat sheet for Rust, which is great for quick reference. For example, the memory layout of reference types: Reference Type Memory Layout

Now, the value has been successfully created, and we have a good understanding of its memory layout. What changes will occur in its memory during use? Let’s take a look.

Value Usage #

When talking about ownership, we learned that for Rust, if a value does not implement Copy, it will be Moved when assigned, passed as an argument, or returned from a function.

In fact, both Copy and Move are implemented internally as a shallow, bitwise memory copy; the only difference is that after a Copy you can still access the previous variable, while after a Move you cannot. Here’s a diagram: Copy and Move Semantics

In our minds, memory copying is a heavy operation with low efficiency. Indeed, if every call along your critical path involves copying several hundred kilobytes of data, such as a large array, it is very inefficient.

However, if what you are copying is only a native type (Copy) or a fat pointer on the stack (Move), without involving a deep copy (that is, copying heap memory), then the efficiency is very high, and we do not need to worry about the performance loss brought by each assignment or parameter passing.

So, whether it’s Copy or Move, its efficiency is very high.

However, there is one exception worth mentioning: passing large arrays on the stack can affect efficiency due to copying the entire array. Therefore, we generally suggest not to place large arrays on the stack. If necessary, when passing these arrays, it is best to pass references instead of values.

In addition to Move during value usage, dynamic growth needs to be considered. In Rust, collection types of data structures will automatically expand during usage.

Taking Vec<T> as an example: when you keep adding content after the currently allocated heap capacity is used up, it triggers automatic growth. Sometimes data flows in and out of a collection, causing its capacity to keep growing while only a small portion is actually used, which is memory-inefficient. In that case, consider calling methods like shrink_to_fit to reclaim memory.

Value Destruction #

Alright, the journey of the value is more than halfway through; we’ve covered creation and usage, so let’s talk about the destruction of values.

We have generally mentioned that when an owner leaves the scope, the values they own will be discarded. But from a code perspective, how exactly does Rust discard them?

This is where the Drop trait comes in. The Drop trait is similar to a destructor in object-oriented programming. When a value is to be released, its Drop trait will be called. For instance, in the following code, the variable greeting is a String, and when it leaves the scope, its drop() function is automatically called, releasing the memory on the heap that contains “hello world” and then the stack memory: Value Destruction with Drop Trait

If the value to be released is a complex data structure, such as a struct, then during its drop() that struct will call the drop() function of each of its fields; if a field is itself a complex structure or a collection type, this recurses until every field is released.

Here is an example: Drop Trait Recursion

In the code, the student variable is a struct with fields: name, age, scores, where name is a String and scores is a HashMap, which themselves