
Extra Lesson: The Pinnacle of Ignorance – Your Common Rust Learning Questions Summarized #

Hello, I am Chen Tian.

So far, we have learned a lot about Rust: basic syntax, memory management, ownership, and lifetimes. We have also walked through three representative example projects to give you an idea of what Rust code looks like in a setting close to a real-world application.

Although we’ve covered a lot, do you still get the feeling that everything makes sense while you read it and falls apart as soon as you write code? Don’t worry, take it one step at a time. No new body of knowledge is mastered overnight, so let the bullets fly for a while. You can also encourage yourself: you have already completed so many check-ins, so keep going.

In today’s extra lesson, let’s take a short break, adjust the learning pace, and talk about common questions in Rust development, in the hope of clearing up some of your confusion.

Ownership Issues #

Q: If I want to create a doubly linked list, how do I handle it?

Rust’s standard library has LinkedList, which is an implementation of a doubly linked list. But before you reach for a linked list, consider whether a Vec or a VecDeque (a ring buffer) would meet the same need: linked lists are not cache-friendly, so they usually perform considerably worse.

If you’re simply curious about how to implement a doubly linked list, you can build one with Rc/RefCell ([Lecture 9]). For the next pointer of a node, you could use Rc; for the prev pointer, you could use Weak.

Weak is a weakened version of Rc: it does not contribute to the strong reference count, so it does not keep the data alive, but it can be upgraded to an Rc when the data still exists. If you’ve used reference-counted data structures in other languages, Weak won’t be unfamiliar to you, as it helps us break circular references. Interested students can try to implement the list on their own and then compare it with this reference implementation.

You might be curious why Rust’s standard LinkedList doesn’t use Rc/Weak. That’s because the standard library directly uses NonNull pointers and unsafe code.
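As a rough illustration of the Rc/Weak layout described above, here is a minimal sketch of two linked nodes. The `Node` type and the helpers `new_node`, `link`, and `prev_value` are hypothetical names made up for this example; they are not the reference implementation mentioned above, and not how the standard library does it.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type, for illustration only.
struct Node {
    value: i32,
    next: Option<Rc<RefCell<Node>>>,   // strong: keeps the following node alive
    prev: Option<Weak<RefCell<Node>>>, // weak: does not keep the previous node alive
}

fn new_node(value: i32) -> Rc<RefCell<Node>> {
    Rc::new(RefCell::new(Node { value, next: None, prev: None }))
}

// Link `a -> b` forwards with a strong pointer and `b -> a` backwards with a weak one.
fn link(a: &Rc<RefCell<Node>>, b: &Rc<RefCell<Node>>) {
    a.borrow_mut().next = Some(Rc::clone(b));
    b.borrow_mut().prev = Some(Rc::downgrade(a));
}

// Read the previous node's value by upgrading the Weak pointer to an Rc.
fn prev_value(node: &Rc<RefCell<Node>>) -> Option<i32> {
    let prev = node.borrow().prev.as_ref()?.upgrade()?;
    let v = prev.borrow().value;
    Some(v)
}

fn main() {
    let a = new_node(1);
    let b = new_node(2);
    link(&a, &b);

    println!("a has a next node: {}", a.borrow().next.is_some());
    println!("prev of b: {:?}", prev_value(&b));
    // `b` only holds a Weak to `a`, so `a` still has a single strong owner.
    println!("strong(a) = {}, weak(a) = {}", Rc::strong_count(&a), Rc::weak_count(&a));
}
```

Because the backward pointer is weak, dropping `a` and `b` frees both nodes; with an Rc in both directions the two nodes would keep each other alive.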

Q: The compiler keeps reporting a “use of moved value” error. How do I fix it?

This is a common error we often encounter when learning Rust, which means you are trying to access a variable whose ownership has already been moved.

For such errors, first decide whether the variable really needs to be moved into another scope. If not, can you borrow it instead? (Lecture 8) If it does need to be moved into another scope:

If multiple owners need to share the same data, use Rc/Arc, combined with Cell/RefCell/Mutex/RwLock when the shared data also has to be modified. (Lecture 9)
If the data does not need to be shared between multiple owners, consider implementing Clone, or even Copy, so that each owner gets its own copy. (Lecture 7)
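The three options above (borrow, clone, share through Rc) might look like this in practice. This is a minimal sketch; `total_len` and `consume` are hypothetical helpers invented for the example.

```rust
use std::rc::Rc;

// Borrowing: the function only reads the data, so it takes a reference.
fn total_len(items: &[String]) -> usize {
    items.iter().map(|s| s.len()).sum()
}

// Taking ownership: the caller cannot use the value after passing it in.
fn consume(items: Vec<String>) -> usize {
    items.len()
}

fn main() {
    let names = vec!["Tyr".to_string(), "Rust".to_string()];

    // 1. Borrow instead of move: `names` stays usable afterwards.
    let n = total_len(&names);
    println!("{} bytes in {:?}", n, names);

    // 2. Clone when the callee needs its own independent copy.
    let count = consume(names.clone());
    println!("{} items, original still here: {:?}", count, names);

    // 3. Share one value between several owners with Rc.
    let shared = Rc::new(names); // `names` is moved into the Rc here
    let other = Rc::clone(&shared);
    println!("owners: {}, first: {}", Rc::strong_count(&shared), other[0]);
}
```

Passing `names` directly to `consume` and then printing it would trigger the “use of moved value” error this question is about.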

Lifetime Issues #

Q: Why does the compiler always pick a fight with me when my function returns a reference?

When a function returns a reference, that reference must be tied to one of the reference-typed input parameters, unless it is a 'static reference. The input may be &self, &mut self, or &T/&mut T. We need to establish the correct relationship between the inputs and the return value. That relationship has nothing to do with the implementation inside the function; it is determined by the function’s signature alone.

For example, the HashMap’s get() method:

pub fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V>
    where
        K: Borrow<Q>,
        Q: Hash + Eq

We don’t need to implement it or know how it’s implemented to determine who the return value Option<&V> is related to. There are only two choices here: &self or k: &Q. Obviously, it is &self, because HashMap holds the data, and k is just a key to query in the HashMap.

Why don’t we need to write lifetime parameters here? Because of the rule we discussed earlier: when &self/&mut self appears, the return value’s lifetime is tied to it. (Lecture 10) This is a great rule, because most methods that return a reference are returning a reference to some data inside self.
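To make the rule concrete, here is a minimal sketch with a hypothetical `Registry` type (not from the course code). The two methods have the same signature; the second simply writes out the lifetimes that the compiler fills in for the first.

```rust
// Hypothetical type, for illustration only.
struct Registry {
    entries: Vec<(String, String)>,
}

impl Registry {
    // Elided form: the returned reference borrows from `&self`, not from `key`.
    fn lookup(&self, key: &str) -> Option<&str> {
        self.entries
            .iter()
            .find(|(k, _)| k.as_str() == key)
            .map(|(_, v)| v.as_str())
    }

    // The same signature with the lifetimes written out explicitly.
    fn lookup_explicit<'a, 'b>(&'a self, key: &'b str) -> Option<&'a str> {
        self.lookup(key)
    }
}

fn main() {
    let reg = Registry {
        entries: vec![("lang".to_string(), "Rust".to_string())],
    };

    let found = {
        let key = String::from("lang");
        let result = reg.lookup(&key);
        result // still valid after `key` is dropped: it only borrows from `reg`
    };
    println!("{:?}", found);
    println!("{:?}", reg.lookup_explicit("missing"));
}
```

The inner block compiles precisely because the return value is tied to `reg` rather than to `key`.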

If you can understand this layer of relationship, then it is relatively easy to handle lifetime errors that occur when the function returns a reference.

When you have to return data that is created or obtained while the function runs and is unrelated to the arguments, you can only return owned data, regardless of whether what you hold inside the function is an owned value or a reference. If you hold a reference, this means calling clone() or to_owned() to get an owned copy from it.
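A small sketch of this last case, using two hypothetical functions: one returns a freshly created value, the other holds only a reference into a local and copies the data out before returning.

```rust
// The String is created inside the function, so it is returned by value.
// Returning `&str` here would not compile: the local would be dropped on return.
fn make_greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

// Here we only hold a reference into a local Vec, so we clone the element
// to obtain an owned value that can outlive the function call.
fn default_name() -> String {
    let candidates = vec!["Tyr".to_string(), "Odin".to_string()];
    let first: &String = &candidates[0]; // reference into a local
    first.clone() // copy the data out before `candidates` is dropped
}

fn main() {
    println!("{}", make_greeting("Tyr"));
    println!("{}", default_name());
}
```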

Data Structure Issues #

Q: Why is Rust’s string so messy, with so many different expressions like String, &String, &str?

I have to say that the question itself is misleading: it lumps several unrelated things together, which can easily lead people astray.

First, any data structure T can have a reference to it, &T, so the difference between String and &String, as well as the difference between String and &str, are fundamentally two different questions.

A better question is: why do we need &str when we have String? Or, more generally: why do containers like String and Vec, which hold contiguous data, also need the concept of slices?

Once the question hits the spot, the answer speaks for itself, because a slice is a very general data structure.

Those who have used Python know:

s = "hello world"
slice1 = s[:5]  # You can slice strings
slice2 = slice1[1:3]  # You can slice the slices again
print(slice1, slice2)  # Prints: hello el

This is very similar to Rust’s String slices:

let s = "hello world".to_string();
let slice1 = &s[..5];  // You can slice strings
let slice2 = &slice1[1..3];  // You can slice the slices again
println!("{} {}", slice1, slice2);  // Print hello el

So &str can be a slice of a String, and it can also be a slice of another &str. It’s nothing special, just a fat pointer (an address plus a length) pointing to a contiguous piece of memory.

You can think of it this way: a slice is to the data in a Vec/String what a view is to a table in a database. We will go into detail on this topic when we talk about Rust data structures later.
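The same idea applies to Vec and arrays. As a minimal sketch (the `sum` helper is a hypothetical example function), a function that takes a slice works on a whole Vec, on part of one, or on an array, without copying any data:

```rust
// Accepts any contiguous run of i32: a whole Vec, part of one, or an array.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let v = vec![1, 2, 3, 4, 5];
    let middle = &v[1..4];    // a view over 2, 3, 4; nothing is copied
    let inner = &middle[1..]; // a slice of a slice: 3, 4
    let arr = [10, 20, 30];

    println!("{} {} {}", sum(&v), sum(middle), sum(inner));
    println!("{}", sum(&arr));

    // A slice reference is a fat pointer: an address plus a length.
    println!(
        "{} bytes per &[i32], {} bytes per usize",
        std::mem::size_of::<&[i32]>(),
        std::mem::size_of::<usize>()
    );
}
```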

Q: In the course’s example code, unwrap() is often used, is that okay?

When we need to get data out of an Option or Result, we can use unwrap(), this is why unwrap() appears in the example code.

If we are only writing code for learning purposes, unwrap() is acceptable. In a production environment, however, unless you can guarantee that unwrap() will not trigger a panic, you should handle the value with pattern matching or use the ? operator for error handling. We will have a lecture dedicated to Rust’s error handling later.

In what cases can we be sure that unwrap() won’t panic? If the surrounding code has already established that the Option or Result holds a usable value (Some(T) or Ok(T)), you can call unwrap(). For instance, in code like this:

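// Assuming v is a Vec<T>; wrapped in a function (hypothetical name) so the early return compiles
fn take_last<T>(v: &mut Vec<T>) -> Option<T> {
    if v.is_empty() {
        return None;
    }

    // We are now certain that there is at least one element, so unwrap cannot panic
    let last = v.pop().unwrap();
    Some(last)
}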
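For comparison, the two alternatives to unwrap() mentioned above, the ? operator and pattern matching, could look roughly like this. Both function names are hypothetical and chosen only for the example.

```rust
use std::num::ParseIntError;

// `?` returns early with the error instead of panicking.
fn parse_and_double(input: &str) -> Result<i32, ParseIntError> {
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}

// Pattern matching handles the empty case explicitly.
fn describe_last(v: &[i32]) -> String {
    match v.last() {
        Some(n) => format!("last element is {}", n),
        None => "empty".to_string(),
    }
}

fn main() {
    println!("{:?}", parse_and_double(" 21 "));
    println!("{}", parse_and_double("abc").is_err());
    println!("{}", describe_last(&[1, 2, 3]));
    println!("{}", describe_last(&[]));
}
```

In both cases the failure path becomes part of the function’s return value, so the caller decides what to do with it.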

Q: Why do standard library data structures like Rc/Vec use so much unsafe, but people always tell me, unsafe is bad?

Good question. Developers of C also consider asm bad, but many libraries in C also heavily use asm.

The standard library’s responsibility is to implement the required functionality as efficiently as possible while ensuring safety, even at some cost to readability. At the same time, it provides its users with elegant high-level abstractions, so that in most circumstances they can write clean code without having to deal with the ugly parts.

In Rust, unsafe code hands responsibility for the correctness and safety of the program over to the developer, and the developers of the standard library have put a great deal of effort and testing into guaranteeing both. When we write our own unsafe code, unless it is reviewed by an experienced developer, it is easy to overlook cases such as concurrent access and end up with buggy code.

So unless it’s necessary, it’s advised not to write unsafe code. After all, most of the problems we deal with can be solved with good design, appropriate data structures, and algorithms.
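To illustrate the pattern of hiding unsafe behind a safe interface, here is a deliberately tiny sketch. The `middle` function is hypothetical, and in real code the safe `items.get(idx)` would do the same job; the point is only to show where the safety argument lives.

```rust
// Hypothetical helper: returns the middle element of a slice, if any.
// The signature is entirely safe; callers never see the unsafe block.
fn middle<T>(items: &[T]) -> Option<&T> {
    if items.is_empty() {
        return None;
    }
    let idx = items.len() / 2;
    // SAFETY: `items` is non-empty, so `len / 2 < len` and `idx` is in bounds.
    Some(unsafe { items.get_unchecked(idx) })
}

fn main() {
    println!("{:?}", middle(&[1, 2, 3]));
    println!("{:?}", middle::<i32>(&[]));
}
```

The author of the function must uphold the invariant stated in the SAFETY comment; if the emptiness check were removed, the function would still compile but could read out of bounds.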

Q: How do I declare global variables in Rust?

In [Lecture 3], we talked about const and static, which can be used to declare global variables. Note, however, that a mutable static can only be read or written inside unsafe code, because it could be modified from multiple threads at once, which is not safe:

static mut COUNTER: u64 = 0; 

fn main() {
    COUNTER += 1; // Compile error: accessing a `static mut` requires an unsafe block
}

If you really do need a writable global variable, you can wrap it in a Mutex, but initializing it is cumbersome. In that case, you can use the lazy_static crate. For example:

use lazy_static::lazy_static;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

lazy_static! {
    static ref HASHMAP: Arc<Mutex<HashMap<u32, &'static str>>> = {
        let mut m = HashMap::new();
        m.insert(0, "foo");
        m.insert(1, "bar");
        m.insert(2, "baz");
        Arc::new(Mutex::new(m))
    };
}

fn main() {
    let mut map = HASHMAP.lock().unwrap();
    map.insert(3, "waz");

    println!("map: {:?}", map);
}
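As an alternative that uses only the standard library, the following sketch assumes a reasonably recent toolchain (`std::sync::OnceLock` has been stable since Rust 1.70). The `hashmap()` accessor is a hypothetical helper name; the map contents mirror the lazy_static example.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Mutex, OnceLock};

// A plain counter needs no lock: an atomic can be updated through a shared reference.
static COUNTER: AtomicU64 = AtomicU64::new(0);

// `HashMap::new` is not a const fn, so the map is built lazily on first access.
static HASHMAP: OnceLock<Mutex<HashMap<u32, &'static str>>> = OnceLock::new();

// Hypothetical accessor that initializes the map exactly once.
fn hashmap() -> &'static Mutex<HashMap<u32, &'static str>> {
    HASHMAP.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert(0, "foo");
        m.insert(1, "bar");
        m.insert(2, "baz");
        Mutex::new(m)
    })
}

fn main() {
    COUNTER.fetch_add(1, Ordering::Relaxed);
    hashmap().lock().unwrap().insert(3, "waz");

    println!("counter: {}", COUNTER.load(Ordering::Relaxed));
    println!("map size: {}", hashmap().lock().unwrap().len());
}
```

Neither global needs unsafe: the atomic and the Mutex both provide interior mutability that is safe to use from multiple threads.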

Debugging Tools #

Q: Under Rust, how do you generally debug applications?

I usually use tracing to log messages, and for simple example code I use println!/dbg! to check the state of a data structure at a particular moment. In my day-to-day development, I hardly ever use a debugger to set breakpoints and step through the code.

That is because time is better spent on design than on debugging: clear logs and appropriate unit tests are what ensure that the code’s logic is correct. If you find yourself constantly needing a debugger to step through the program just to understand its state, it suggests that the code is not well designed and is too complex.
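A minimal sketch of this workflow, using a hypothetical `word_lengths` function: dbg! exposes an intermediate value while exploring, and two small unit tests pin down the expected behavior so that no stepping is needed later. The tests run with `cargo test`.

```rust
// Hypothetical function under test.
fn word_lengths(text: &str) -> Vec<usize> {
    let words: Vec<&str> = text.split_whitespace().collect();
    // dbg! writes the file, line, expression and value to stderr, then returns the value.
    let words = dbg!(words);
    words.iter().map(|w| w.len()).collect()
}

#[test]
fn counts_each_word() {
    assert_eq!(word_lengths("hello Rust"), vec![5, 4]);
}

#[test]
fn blank_input_gives_no_lengths() {
    assert!(word_lengths("   ").is_empty());
}

fn main() {
    println!("{:?}", word_lengths("let the bullets fly"));
}
```

Once the tests pass, the dbg! line can simply be deleted.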

When I was learning Rust, I often used debugging tools to check memory information. We will see later in the course that these tools are used to analyze some data structures.

Under Rust, we can use rust-gdb or rust-lldb, which provide Rust-friendly pretty-printing. They are installed along with Rust. I am personally used to gdb, but rust-gdb works best under Linux and has some problems under OS X, so I usually switch to an Ubuntu virtual machine to use rust-gdb.

Thinking Question #

Let’s finish with an easy thinking question that ties together what we learned before. The code below contains lifetime problems. Can you work out the reasons?

use std::str::Chars;

// Wrong, why?
fn lifetime1() -> &str {
    let name = "Tyr".to_string();
    &name[1..]
}

// Wrong, why?
fn lifetime2(name: String) -> &str {
    &name[1..]
}

// Correct, why?
fn lifetime3(name: &str) -> Chars {
    name.chars()
}

Feel free to answer in the comments section, and you are very welcome to share how your learning has been going so that we can all improve together. In the next lesson we return to the main content and talk about Rust’s type system. See you then!