03 Start Your Journey With Your First Rust Program

Getting a Glimpse of the Doorway: Starting with Your First Rust Program! #

Hi, I’m Chen Tian. After building up the prerequisite knowledge, today we’ll officially start learning the Rust language itself.

The best shortcut to learning a language is to immerse yourself in its environment. As programmers we believe in getting our hands dirty - starting directly from code gives the most intuitive experience. So starting with this lesson, you’ll need to set up the Rust environment on your computer.

Today we’ll cover a lot of Rust basics. I’ve carefully constructed code examples to help you understand them. I highly recommend that you type in these snippets line by line, think about why each is written the way it is, and observe the output when you run it. If you run into issues, you can also click the code link next to each example to run it on the Rust playground.

Installing Rust is very easy. You can follow the methods on rustup.rs to install based on your OS. For example on UNIX systems you can directly run:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

This will install the Rust toolchain on your system. Afterwards you can use cargo new locally to create a new Rust project and try out Rust features. Go ahead and write your first hello world program in Rust!

fn main() {
  println!("Hello world!");
}

You can use any editor to write Rust code. I personally prefer VS Code since it’s free, full-featured, and fast. Here are the Rust-related plugins I’ve installed in VS Code, in order, for your reference:

  1. rust-analyzer: Real time compilation and analysis of your Rust code, pointing out errors and annotating types. You can also use the official Rust plugin instead.
  2. rust syntax: Syntax highlighting for code.
  3. crates: Helps analyze whether dependencies for the current project are the latest versions.
  4. better toml: Rust uses toml for project configuration management. better toml provides syntax highlighting and shows errors in toml files.
  5. rust test lens: Quickly runs Rust tests.
  6. Tabnine: AI-powered autocomplete to help you write code faster.

The First Useful Rust Program #

Now you have the tools and environment. Although we haven’t introduced any Rust syntax yet, that doesn’t stop us from writing a slightly useful Rust program. Running through it will give you a basic feel for Rust’s features, key syntax, and ecosystem. Then we can analyze in detail afterwards.

Be sure to get hands-on and type along line by line. If you encounter concepts you don’t quite understand, don’t worry - today you just need to get the code running first. We’ll learn the specifics in a structured way later.

I also recommend you solve the same problem in a language you normally use and compare with Rust - which feels more concise and readable?

The requirements for this program are simple: make an HTTP request to the Rust home page, convert the returned HTML to Markdown, and save it. In JavaScript or Python, with well-chosen dependencies, I believe this would take a little over 10 lines. Let’s see how to do it in Rust.

First, we generate a new project with cargo new scrape_url. By default this command generates an executable project scrape_url with main entry point src/main.rs. In Cargo.toml we add the following dependencies:

[dependencies]
reqwest = { version = "0.11", features = ["blocking"] }  
html2md = "0.2"

Cargo.toml is the configuration file for Rust projects, using toml syntax. We’ve added two dependencies to this project: reqwest and html2md. reqwest is an HTTP client whose usage is similar to Python’s requests. html2md, as the name implies, converts HTML text to Markdown.

Next, in src/main.rs, we replace the generated main() function with the following code:

use std::fs;

fn main() {
  let url = "https://www.rust-lang.org/";
  let output = "rust.md";

  println!("Fetching url: {}", url);
  let body = reqwest::blocking::get(url).unwrap().text().unwrap();

  println!("Converting html to markdown...");
  let md = html2md::parse_html(&body);

  fs::write(output, md.as_bytes()).unwrap();
  println!("Converted markdown has been saved in {}.", output);
}

After saving, enter the project directory on the command line and run cargo run. After a fairly long first compilation, the program starts running, and you’ll see this output:

Fetching url: https://www.rust-lang.org/
Converting html to markdown... 
Converted markdown has been saved in rust.md.

And a rust.md file is created in the current directory. Opening it, the content is the Rust home page content.

Bingo! Our first Rust program runs successfully!

From this fairly short piece of code, we can get a feel for some basic characteristics of Rust:

First, Rust uses a tool called cargo to manage projects, similar to npm for Node.js or the go command for Golang. It handles dependency management as well as development tasks such as compiling, running, testing, and formatting code.

Second, Rust’s overall syntax leans toward C/C++ style. Function bodies are wrapped in curly braces {}, expressions are separated by semicolons ;, accessing a struct’s member functions or variables uses the dot . operator, while accessing items in a namespace or a type’s static (associated) functions uses the double colon :: operator. If you want to simplify referencing functions or data types within a namespace, you can use the use keyword, like use std::fs. Additionally, the entry point of an executable is the main() function.

You can also easily see that although Rust is a strongly typed language, its compiler supports type inference, making the intuitive feel of writing code similar to scripting languages.

Many developers not used to type inference feel this reduces code readability, since the variable type may only be known from context. But don’t worry - if you use the rust-analyzer plugin in your editor, variable types are auto-suggested:

image

Finally, Rust supports macro programming, with many basic functions like println!() wrapped as macros, allowing developers to write concise code.

Rust also has other features that this example doesn’t demonstrate:

  • Rust variables are immutable by default. The mut keyword must be explicitly used if modifying the variable’s value.
  • Aside from a few statements like let/static/const/fn, the vast majority of Rust code consists of expressions. So if/while/for/loop all evaluate to a value, and the last expression in a function is its return value, consistent with functional languages.
  • Rust supports interface-oriented and generic programming.
  • Rust has a very rich set of data types and powerful standard library.
  • Rust has very rich control flow including pattern matching.
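A minimal sketch of the first two points above, using only the standard library. The function names describe and first_power_of_two_above are made up for illustration. Note that while and for are expressions too, but they always evaluate to the unit value (), so the sketch uses if and loop, which can carry a meaningful value.

```rust
// Hypothetical helper: `if` is an expression, and because it is the last
// expression in the function body, its value becomes the return value.
fn describe(n: i32) -> &'static str {
    if n % 2 == 0 { "even" } else { "odd" }
}

// Hypothetical helper: `loop` can hand back a value through `break`.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut value = 1; // `mut` is required because `value` is reassigned below
    loop {
        if value > limit {
            break value; // the whole `loop` expression evaluates to `value`
        }
        value *= 2;
    }
}

fn main() {
    let label = describe(7); // immutable by default; reassigning would not compile
    let power = first_power_of_two_above(100);
    println!("{} {}", label, power); // prints "odd 128"
}
```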

The first useful Rust program runs successfully! You may be hesitating now, thinking: “I don’t really understand all this yet. Do I need to learn it all before continuing?” Don’t hesitate. Keep going, and we’ll cover it all later.

Next, to quickly get started with Rust, we’ll go over Rust development basics together.

This knowledge has just minor differences across programming languages and may feel a little dry, but this lesson forms the foundation for subsequent learning. I recommend you type out each code sample, run it, and compare with languages you’re familiar with to reinforce your understanding.

image

Basic Syntax and Primitive Data Types #

First let’s look at how to define variables, functions, and data structures in Rust.

Variables and Functions #

As mentioned earlier, Rust supports type inference. Where the compiler can deduce the type, a variable’s type annotation can generally be omitted, but constants (const) and static variables (static) must declare their types.

When defining a variable, you can add the mut keyword to make it mutable. Variables are immutable by default. This is a very important property: it follows the principle of least privilege and helps us write robust, correct code. If you declare a variable mut but never modify it, the compiler will kindly warn you to remove the unnecessary mut.
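As a small illustration of these rules, here is a sketch with made-up names (MAX_RETRIES, APP_NAME, total_attempts): the constant and the static spell out their types, while the local variables rely on inference.

```rust
// Constants and statics must declare their types explicitly.
const MAX_RETRIES: u32 = 3;
static APP_NAME: &str = "demo";

// Hypothetical helper showing inferred and mutable local bindings.
fn total_attempts(extra: u32) -> u32 {
    let base = MAX_RETRIES; // type inferred as u32, immutable
    let mut total = base;   // `mut` allows the reassignment below
    total += extra;
    total
}

fn main() {
    let name_len = APP_NAME.len(); // inferred as usize
    println!("{} {} {}", APP_NAME, name_len, total_attempts(2)); // prints "demo 4 5"
}
```

Removing the mut from total makes the total += extra line fail to compile, which is a quick way to see the immutable-by-default rule in action.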

In Rust, functions are first-class citizens that can be passed as parameters or returned as values. Let’s look at an example of a function parameter (code):

fn apply(value: i32, f: fn(i32) -> i32) -> i32 {
  f(value)
}

fn square(value: i32) -> i32 {
  value * value
}

fn cube(value: i32) -> i32 {
  value * value * value 
}

fn main() {
  println!("apply square: {}", apply(2, square)); 
  println!("apply cube: {}", apply(2, cube));
}

Here fn(i32) -> i32 is the type of apply’s second parameter, indicating that it accepts a function. The function passed in must take one parameter of type i32 and return an i32.
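The example above passes a function as a parameter. The other direction, returning a function as a value, can be sketched in the same way. The names double, negate, and pick below are hypothetical and exist only for this illustration.

```rust
fn double(value: i32) -> i32 {
    value * 2
}

fn negate(value: i32) -> i32 {
    -value
}

// Hypothetical selector: the return type fn(i32) -> i32 is a function pointer,
// the same type used for the parameter of `apply` above.
fn pick(name: &str) -> fn(i32) -> i32 {
    if name == "double" { double } else { negate }
}

fn main() {
    let f = pick("double");
    println!("{}", f(21));              // prints 42
    println!("{}", pick("negate")(21)); // prints -21
}
```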

Rust function parameters and return types must be declared explicitly. If a function returns nothing, its return type is unit, written (), and can be omitted. To return early from a function, use the return keyword; otherwise, the last expression in the function body is the return value. If a semicolon ; follows that last expression, the function returns unit instead. You can see this in the following example (code):

fn pi() -> f64 {
  3.1415926
}

fn not_pi() {
  3.1415926; 
}

fn main() {
  let is_pi = pi();
  let is_unit1 = not_pi(); 
  let is_unit2 = {
    pi();
  };

  println!("is_pi: {:?}, is_unit1: {:?}, is_unit2: {:?}", is_pi, is_unit1, is_unit2);
}
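The example above does not show an early return, so here is a minimal sketch of one. The function first_negative is a made-up helper: it leaves the function with return as soon as it finds a match, and otherwise falls through to the final expression.

```rust
// Hypothetical helper: returns the first negative number, if any.
fn first_negative(values: &[i32]) -> Option<i32> {
    for &v in values {
        if v < 0 {
            return Some(v); // early return requires the `return` keyword
        }
    }
    None // last expression without a semicolon: this is the return value
}

fn main() {
    println!("{:?}", first_negative(&[3, -1, -7])); // prints Some(-1)
    println!("{:?}", first_negative(&[1, 2]));      // prints None
}
```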

Data Structures #

After understanding how functions are defined, let’s look at how to define data structures in Rust.

Data structures are a core component of programs. When modeling complex problems, we need to define custom data structures. Rust is very powerful, allowing structs to be defined with struct, tagged unions with enum, and tuple types like Python can be defined on the fly.

For example, we can define data structures for a chat service like (code):

#[derive(Debug)] 
enum Gender {
  Unspecified = 0,
  Female = 1,
  Male = 2,
}

#[derive(Debug, Copy, Clone)]
struct UserId(u64);

#[derive(Debug, Copy, Clone)]
struct TopicId(u64);

#[derive(Debug)]
struct User {
  id: UserId,
  name: String,
  gender: Gender, 
}

#[derive(Debug)]  
struct Topic {
  id: TopicId,
  name: String,
  owner: UserId,
}

// Define possible events in the chat room
#[derive(Debug)]
enum Event {
  Join((UserId, TopicId)),
  Leave((UserId, TopicId)),
  Message((UserId, TopicId, String)),
}

fn main() {
  //...
}

Brief explanation:

  1. Gender: enum type. In Rust enum can be used like C enums.
  2. UserId/TopicId: Special struct form called tuple structs. Fields are anonymous, accessible by index, good for simple structs.
  3. User/Topic: Standard struct, can combine any types.
  4. Event: Standard tagged union, defines three events: Join, Leave, Message, each with its own data structure.
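To make points 1 to 3 concrete, the sketch below repeats a trimmed copy of the types above and then builds and reads a value. The helpers gender_code and raw_id are hypothetical and only serve to show the enum-to-integer cast and the index access on a tuple struct.

```rust
#[allow(dead_code)] // not every variant is used in this short sketch
#[derive(Debug)]
enum Gender {
    Unspecified = 0,
    Female = 1,
    Male = 2,
}

#[derive(Debug, Copy, Clone)]
struct UserId(u64);

#[derive(Debug)]
struct User {
    id: UserId,
    name: String,
    gender: Gender,
}

// Hypothetical helper: a C-like enum variant can be cast to its integer value.
fn gender_code(gender: Gender) -> u8 {
    gender as u8
}

// Hypothetical helper: tuple struct fields are anonymous and accessed by index.
fn raw_id(id: UserId) -> u64 {
    id.0
}

fn main() {
    let alice = User {
        id: UserId(1),
        name: "Alice".to_string(),
        gender: Gender::Female,
    };
    println!("{:?}", alice);
    println!("{} has raw id {}", alice.name, raw_id(alice.id));
    println!("gender code: {}", gender_code(alice.gender)); // prints "gender code: 1"
}
```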

When defining data structures, we generally add annotations to introduce additional behaviors. In Rust, data behaviors are defined through traits. We’ll cover traits in detail later, for now you can think of traits as interfaces a data structure can implement, similar to Java interfaces.

Generally we use the impl keyword to implement traits for data structures, but Rust helpfully provides derive macros that can greatly simplify defining some standard interfaces. For example #[derive(Debug)] implements the Debug trait for the data structure, providing debug capabilities, so we can print with {:?} and println!.

When defining UserId/TopicId we also used the Copy/Clone derive macros. Clone lets the data structure be cloned, while Copy causes the data structure to be automatically copied byte-wise when passed as parameters. In the next lesson on ownership I’ll specifically cover when Copy is needed.
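Ahead of the ownership lesson, here is a rough sketch of the visible difference between the two derives. UserId is the type from above. Name, consume_id, and consume_name are hypothetical additions for this illustration. Name derives Clone but not Copy, because it wraps a String.

```rust
#[derive(Debug, Copy, Clone)]
struct UserId(u64);

// Hypothetical type: it wraps a String, so it can be Clone but not Copy.
#[derive(Debug, Clone)]
struct Name(String);

// Both helpers take their argument by value.
fn consume_id(id: UserId) -> u64 {
    id.0
}

fn consume_name(name: Name) -> usize {
    name.0.len()
}

fn main() {
    let id = UserId(42);
    let raw = consume_id(id);      // `id` is copied automatically
    println!("{:?} {}", id, raw);  // so it is still usable here

    let name = Name("alice".to_string());
    let len = consume_name(name.clone()); // an explicit clone keeps `name` usable
    println!("{:?} {}", name, len);
}
```

Passing name itself instead of name.clone() would move it into consume_name, and the final println! would no longer compile.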

Let’s briefly summarize defining variables, functions, and data structures in Rust:

image

Control Flow #

A program’s basic control flow includes the following, which we should be very familiar with. Focus on how to achieve them in Rust.

Sequential execution means running code line by line. When execution reaches a function call, it jumps into the called function’s context and runs there until that function returns.

Rust’s loops are consistent with most languages, supporting endless loops with loop, conditional loops with while, and looping over iterators with for. Loops can exit early using break or skip to the next iteration with continue.

Jumps occur when satisfying some condition. Rust supports branch jumps, pattern matching jumps, error jumps, and async jumps.

  • Branch jumps are the if/else we’re used to.
  • Rust’s pattern matching can jump branches based on matching part of an expression or value.
  • In error jumps, when a called function returns an error, Rust terminates execution of the current function early and returns the error upward.
  • In Rust async jumps, when an async function executes await, the current context may block and execution will jump to another async task until await no longer blocks.
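The error jump in the list above is the least familiar of these for most newcomers, so here is a minimal sketch of it, together with a pattern matching jump on the result. The function parse_and_add is a made-up example. The ? operator after each parse call returns the error to the caller immediately if parsing fails.

```rust
use std::num::ParseIntError;

// Hypothetical helper: if either parse fails, `?` ends this function early
// and hands the error back to the caller.
fn parse_and_add(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    for (a, b) in [("2", "40"), ("2", "forty")] {
        // Pattern matching jump: pick a branch based on the shape of the value.
        match parse_and_add(a, b) {
            Ok(sum) => println!("{} + {} = {}", a, b, sum),
            Err(e) => println!("could not add {} and {}: {}", a, b, e),
        }
    }
}
```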

We’ll use the Fibonacci sequence with if and loop/while/for loops to implement basic control flow (code):

fn fib_loop(n: u8) {
  let mut a = 1;
  let mut b = 1;
  let mut i = 2u8;

  loop {
    let c = a + b;
    a = b;
    b = c;
    i += 1;

    println!("next val is {}", b);

    if i >= n {
      break; 
    }
  }
}

fn fib_while(n: u8) {
  let (mut a, mut b, mut i) = (1, 1, 2);

  while i < n {
    let c = a + b;
    a = b; 
    b = c;
    i += 1;

    println!("next val is {}", b);
  }
}

fn fib_for(n: u8) {
  let (mut a, mut b) = (1, 1);

  for _i in 2..n {
    let c = a + b;
    a = b;
    b = c;
    println!("next val is {}", b); 
  }
}

fn main() {
  let n = 10;
  fib_loop(n);
  fib_while(n);
  fib_for(n); 
}

Note here Rust for loops work on any data structure implementing the IntoIterator trait.

During execution, IntoIterator produces an iterator, and the for loop keeps taking values from it until it returns None. So a for loop is really just syntactic sugar: the compiler expands it into a loop that pulls values from the iterator until None is returned.
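The expansion can be approximated by hand. The sketch below, with the made-up functions sum_with_for and sum_with_loop, shows a for loop next to a roughly equivalent loop. It is a simplified picture of what the compiler does, not its exact output.

```rust
// The familiar form.
fn sum_with_for(values: Vec<i32>) -> i32 {
    let mut total = 0;
    for v in values {
        total += v;
    }
    total
}

// Roughly what the `for` loop above stands for.
fn sum_with_loop(values: Vec<i32>) -> i32 {
    let mut total = 0;
    let mut iter = values.into_iter(); // IntoIterator produces the iterator
    loop {
        match iter.next() {
            Some(v) => total += v, // keep going while values arrive
            None => break,         // stop once the iterator is exhausted
        }
    }
    total
}

fn main() {
    println!("{}", sum_with_for(vec![1, 2, 3]));  // prints 6
    println!("{}", sum_with_loop(vec![1, 2, 3])); // prints 6
}
```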

In the fib_for function, we also see syntax like 2..n. Python developers will immediately recognize this as a range, with 2..n containing all values x where 2 <= x < n. As in Python, you can omit the lower or upper bound of a range in Rust, e.g.:

let arr = [1, 2, 3];
assert_eq!(arr[..], [1, 2, 3]);  
assert_eq!(arr[0..=1], [1, 2]);

Unlike Python, Rust does not support negative indices in slice ranges, so you cannot write arr[1..-1]. This is because the lower and upper bounds of a range used to index a slice are of type usize, which cannot be negative.
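One way to get the effect of Python’s arr[1:-1] is to compute the upper bound from the length. The sketch below uses a made-up helper, middle, and also shows that a range over a signed integer type may itself contain negative values. The restriction only applies when a range is used to index a slice.

```rust
// Hypothetical helper: everything except the first and last element.
fn middle(values: &[i32]) -> &[i32] {
    if values.len() < 2 {
        return &[]; // avoid subtracting past zero on very short slices
    }
    &values[1..values.len() - 1]
}

fn main() {
    let arr = [1, 2, 3, 4];
    println!("{:?}", &arr[..2]);    // prints [1, 2]
    println!("{:?}", &arr[2..]);    // prints [3, 4]
    println!("{:?}", middle(&arr)); // prints [2, 3]

    // A range of i32 values can include negatives.
    let signed: Vec<i32> = (-2..2).collect();
    println!("{:?}", signed); // prints [-2, -1, 0, 1]
}
```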

Below is a summary of Rust’s main control flow constructs:

image

Pattern Matching #

Rust’s pattern matching incorporates strengths of functional languages - powerful, elegant, and efficient. It can match on partial or full contents of structs/enums. For example with the Event data structure we designed earlier, we can match like (code):

fn process_event(event: &Event) {
  match event {
    Event::Join((uid, _tid)) => println!("user {:?} joined", uid),
    Event::Leave((uid, tid)) => println!("user {:?} left {:?}", uid, tid), 
    Event::Message((_, _, msg)) => println!("broadcast: {}", msg),
  }
} 

From the code we can see it directly matches and assigns from the inner enum data, saving several lines compared to languages like JavaScript and Python that only support simple pattern matching.

Aside from match, we can also use if let/while let for simple matches. If we only care about Event::Message in the above code, we can write (code):

fn process_message(event: &Event) {
  if let Event::Message((_, _, msg)) = event {
    println!("broadcast: {}", msg); 
  }
}
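The while let form works the same way, except that it keeps looping for as long as the pattern matches. A minimal sketch, using a made-up function drain_sum and a plain Vec as a stack:

```rust
// Hypothetical helper: pops values until the stack is empty and sums them.
fn drain_sum(stack: &mut Vec<i32>) -> i32 {
    let mut total = 0;
    // Vec::pop returns Some(value) until the vector is empty, then None.
    while let Some(top) = stack.pop() {
        total += top;
    }
    total
}

fn main() {
    let mut stack = vec![1, 2, 3];
    println!("{}", drain_sum(&mut stack)); // prints 6
    println!("{}", stack.is_empty());      // prints true
}
```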

Pattern matching is a very important Rust language feature, widely used in state machine handling, message processing, and error handling. If you’ve used C/Java/Python/JavaScript which lack strong pattern matching support, be sure to practice this well.

Error Handling #

Rust does not use the exception handling adopted by predecessors like C++/Java. Instead it borrows from Haskell, encapsulating errors in the Result<T, E> type and provides the ? operator for error propagation, convenient for development. Result<T, E> is a generic data structure where T represents the result type on successful execution, and E represents the error type.
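Before returning to scrape_url, here is a small standalone sketch of Result<T, E> with T = f64 and E = String. The functions divide and report are made up for illustration. Using String as the error type is a simplification, since real code usually defines or reuses a dedicated error type.

```rust
// Hypothetical function: success carries an f64, failure carries a message.
fn divide(a: f64, b: f64) -> Result<f64, String> {
    if b == 0.0 {
        Err("division by zero".to_string())
    } else {
        Ok(a / b)
    }
}

// The caller must handle both cases; pattern matching makes that explicit.
fn report(result: Result<f64, String>) -> String {
    match result {
        Ok(value) => format!("ok: {}", value),
        Err(message) => format!("error: {}", message),
    }
}

fn main() {
    println!("{}", report(divide(6.0, 3.0))); // prints "ok: 2"
    println!("{}", report(divide(1.0, 0.0))); // prints "error: division by zero"
}
```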

The scrape_url project we started with today already uses Result<T, E> in several places. Here is the code again. Note that it calls unwrap(), which only cares about the successful result: if an error occurs, the whole program terminates.

use std::fs;

fn main() {
  let url = "https://www.rust-lang.org/";
  let output = "rust.md";

  println!("Fetching url: {}", url);
  let body = reqwest::blocking::get(url).unwrap().text().unwrap();

  println!("Converting html to markdown...");
  let md = html2md::parse_html(&body);

  fs::write(output, md.as_bytes()).unwrap();
  println!("Converted markdown has been saved in {}.", output);
}

To propagate errors, we can replace all unwrap() with the ? operator and have main() return a Result, like:

use std::fs;

// main now returns a Result 
fn main() -> Result<(), Box<dyn std::error::Error>> {
  let url = "https://www.rust-lang.org/";
  let output = "rust.md";

  println!("Fetching url: {}", url);
  let body = reqwest::blocking::get(url)?.text()?;

  println!("Converting html to markdown...");
  let md = html2md::parse_html(&body);

  fs::write(output, md.as_bytes())?;
  println!("Converted markdown has been saved in {}.", output);

  Ok(())
}

That’s it for error handling for now. We’ll dedicate a full lesson later to studying Rust error handling in depth and comparing it with other languages.

Organizing Rust Projects #

As Rust code grows in scale, we can no longer fit everything in one file and need multiple files and directories working together. Here we can use mod to organize code.

The approach is: In the entry lib.rs/main.rs file, use mod to declare other code files to load. If the module content is larger, it can be placed in a directory with a mod.rs file that imports the other files in that module. This file serves a similar role to Python’s __init__.py. Afterwards the module can be imported using mod + directory name as shown:

image
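To try the module syntax without creating extra files, modules can also be written inline in a single file. The sketch below is only an approximation of the multi-file layout described above: the module names geometry and units and their functions are made up, and in a real project each mod block would typically live in its own file or directory.

```rust
// In a larger project this could be src/geometry.rs or src/geometry/mod.rs,
// declared in main.rs with `mod geometry;`.
mod geometry {
    // Items are private to their module unless marked `pub`.
    pub fn area(width: u32, height: u32) -> u32 {
        width * height
    }

    // Modules can be nested.
    pub mod units {
        pub fn to_cm(meters: u32) -> u32 {
            meters * 100
        }
    }
}

// `use` shortens the path to an item inside a module.
use geometry::units::to_cm;

fn main() {
    println!("{}", geometry::area(3, 4)); // prints 12
    println!("{}", to_cm(2));             // prints 200
}
```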

In Rust, a project is also called a crate. A crate can be an executable project or a library. We can use cargo new <name> --lib to create a library. When code in a crate changes, the crate must be recompiled.

Besides the project source code, a crate also contains its unit tests and integration tests.

Rust unit tests are generally placed in the same file as the code under test, using the compile condition #[cfg(test)] to ensure test code only compiles under test environments. Here is an example unit test:

#[cfg(test)] 
mod tests {
  #[test]
  fn it_works() {
    assert_eq!(2 + 2, 4);
  }
}
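Because unit tests sit in the same file, they can also exercise private functions. A slightly fuller sketch, assuming a made-up private function clamp_percent. The test module is named clamp_tests here only to keep the sketch distinct from the example above. By convention it is usually called tests.

```rust
// Hypothetical private function under test.
fn clamp_percent(value: i32) -> u8 {
    if value < 0 {
        0
    } else if value > 100 {
        100
    } else {
        value as u8
    }
}

#[cfg(test)]
mod clamp_tests {
    use super::*; // bring the parent module's items, private ones included, into scope

    #[test]
    fn clamps_out_of_range_values() {
        assert_eq!(clamp_percent(-5), 0);
        assert_eq!(clamp_percent(150), 100);
        assert_eq!(clamp_percent(42), 42);
    }
}

fn main() {
    println!("{}", clamp_percent(150)); // prints 100
}
```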

Integration tests are generally placed under the tests directory, parallel to src. Different from unit tests, integration tests can only test the public interfaces of the crate, compiling into a separate executable during testing.

In a crate, cargo test can be used to run test cases.

As the code continues to grow, putting everything in one crate is no longer ideal: any code change forces that whole crate to be recompiled, which is inefficient. In that case we can use a workspace.

A workspace can contain one or more crates. When code changes, only the affected crates need recompiling. To build a workspace, we first create a Cargo.toml in a directory, as in the image below, listing all the crates in the workspace, and then run cargo new to generate each crate:

image

Crates and workspaces have some more advanced uses we’ll cover when encountered. If interested, you can also first read Chapter 14 of the Rust book to learn more.

Summary #

We briefly went over Rust basics - defining variables with let/let mut, functions with fn, complex data structures with structs/enums. We also learned basic Rust control flow, understood how pattern matching works, and know how to handle errors.

Finally, considering code scale, we introduced how to organize Rust code with mod, crates, and workspaces. I’ve summarized it in the diagram below:

image

Today’s goal was to give you a very basic understanding of Rust, able to start trying some simple Rust projects.

You may be surprised that using Rust for scrape_url-like functionality has an experience almost identical to scripting languages like Python - it’s so simple!

Next lesson we’ll continue writing more code to truly experience Rust’s charm through useful small tool development.

Review Questions #

  1. In the Fibonacci code above, you may have noticed the code to compute the next sequence number is repeatedly duplicated across the three functions. This violates the DRY (Don’t Repeat Yourself) principle. Can you write a function to extract it out?

  2. In the scrape_url example, we hardcoded the URL to fetch and output filename in the code, not very flexible. Could you improve the code to get the URL and filename provided by the user from command line arguments? Like:

cargo run -- https://www.rust-lang.org rust.md

Hint - print out std::env::args() and see what happens:

for arg in std::env::args() {
  println!("{}", arg);
}

Please feel free to share your thoughts in the comments! Congrats on completing the third lesson in our Rust journey. See you next time!

Reference Materials #

  1. TOML
  2. static keyword
  3. lazy_static
  4. unit type
  5. How to write tests
  6. More about cargo and crates.io
  7. Rust supports declarative macros and procedural macros. Procedural macros come in three kinds: function-like macros, derive macros, and attribute macros. println! is invoked like a function, but it has to be a macro: Rust is strongly typed, so a function’s signature must be fixed at compile time, whereas println! accepts any number of arguments.