05 Get Hands Dirty: How Difficult Is It to Make an Image Server? #

Hello, I’m Chen Tian.

In the last talk, we wrote a small tool, HTTPie, in just over a hundred lines of code. Hungry for more? Today, let’s write another practical example to see what else we can do with Rust.

Before we begin, I want to clarify that it’s okay if you don’t fully understand the code right now. Don’t be too harsh on yourself trying to grasp everything; just follow my pace and write line by line. First, get your code running, feel the difference between Rust and your usual go-to language, and look at the coding style—that will be enough.

Today’s example is a common requirement we encounter at work: building a Web Server to provide certain services. Similar to HTTPie in the last talk, let’s continue re-writing an existing open-source tool in Rust, but challenge a slightly bigger project today: building an image server similar to Thumbor.

Thumbor #

Thumbor is a well-known image server written in Python, widely used wherever images need to be resized dynamically.

It can achieve dynamic image cropping and resizing through a simple HTTP interface. It also supports file storage, different processing engines, and other auxiliary functions. I have used it in my previous startup project; it was very practical with decent performance.

Let’s look at an example:

http://<thumbor-server>/300x200/smart/thumbor.readthedocs.io/en/latest/_images/logo-thumbor.png

In this example, Thumbor fetches the image at the URL that ends the path, applies a smart crop, and resizes it to 300x200 pixels. Anyone requesting this URL receives a thumbnail of that size.

Today, let’s implement its core functionality of dynamic image transformation. Think about it: if you were to implement this service with your most familiar language, how would you design it, what libraries would you use, and approximately how many lines of code would it take? If you used Rust, what would the line count be like?

With your own thoughts in mind, start building this tool with Rust! Our goal remains the same: implement our requirements in about 200 lines of code.

Design Analysis #

Since this is about image transformation, the basic requirement is to support a range of operations such as resizing, cropping, adding watermarks, and even applying filters. However, the tricky part of an image transformation service actually lies in the interface design: how to design a user-friendly, concise interface that can easily be extended later.

Why do I say this? Imagine one day, your product manager suddenly wants the service, originally only used for thumbnails, to support vintage photo filters; how would you handle that?

Thumbor’s solution is to format and order the methods used for processing directly in the URL path, leaving out methods not being used:

/hmac/trim/AxB:CxD/(adaptative-)(full-)fit-in/-Ex-F/HALIGN/VALIGN/smart/filters:FILTERNAME(ARGUMENT):FILTERNAME(ARGUMENT)/*IMAGE-URI*

However, this approach is hard to extend and awkward to parse, and it struggles to support multiple ordered operations on an image. For instance, I may want to apply a filter and then a watermark to one image, but a watermark and then a filter to another.

Furthermore, if more parameters need to be added in the future, it might easily conflict with existing ones or cause API breaking changes. As developers, we should never underestimate the restless heart of product managers brimming with bizarre ideas.

So, while conceptualizing this project, we need to find a more concise and extensible way to describe a series of orderly operations on images. For example: resize first, then add a watermark on the resized result, and finally apply a filter.

Such orderly operations can be represented with a list in the code, where each operation could be an enum, like so:

// The parsed image processing parameters
struct ImageSpec {
    specs: Vec<Spec>
}

// Each parameter is one of our supported processing methods
enum Spec {
    Resize(Resize),
    Crop(Crop),
    // ...
}

// Processing image resize
struct Resize {
    width: u32,
    height: u32
}
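
To see why an ordered list matters, here is a minimal, self-contained sketch. It is not the server’s real code: the Image type, which only tracks dimensions, and the apply function are hypothetical stand-ins, and Crop’s fields follow the protobuf definition used later in this talk.

```rust
// Hypothetical, simplified stand-ins for the article's types: an "image" here
// is just its dimensions, so the effect of each step is easy to follow.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Image {
    width: u32,
    height: u32,
}

#[derive(Debug, Clone, Copy)]
struct Resize {
    width: u32,
    height: u32,
}

#[derive(Debug, Clone, Copy)]
struct Crop {
    x1: u32,
    y1: u32,
    x2: u32,
    y2: u32,
}

#[derive(Debug, Clone, Copy)]
enum Spec {
    Resize(Resize),
    Crop(Crop),
}

struct ImageSpec {
    specs: Vec<Spec>,
}

// Apply every spec in list order; each step consumes the previous result.
fn apply(image: Image, image_spec: &ImageSpec) -> Image {
    image_spec.specs.iter().fold(image, |img, spec| match spec {
        Spec::Resize(r) => Image { width: r.width, height: r.height },
        Spec::Crop(c) => Image {
            // Clamp the crop box to the current image bounds.
            width: c.x2.min(img.width).saturating_sub(c.x1),
            height: c.y2.min(img.height).saturating_sub(c.y1),
        },
    })
}

fn main() {
    let original = Image { width: 1200, height: 900 };
    let resize = Spec::Resize(Resize { width: 600, height: 800 });
    let crop = Spec::Crop(Crop { x1: 0, y1: 0, x2: 800, y2: 400 });

    // The same two operations in a different order give different results.
    let a = apply(original, &ImageSpec { specs: vec![resize, crop] });
    let b = apply(original, &ImageSpec { specs: vec![crop, resize] });
    println!("resize then crop: {:?}", a);
    println!("crop then resize: {:?}", b);
}
```

Because the order lives in the Vec itself, supporting a new operation only means adding an enum variant and one more match arm.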

We now have the data structures we need, and we know why Thumbor’s approach is hard to extend. So how do we design an interface that any client can use, expressed in a URL that can be parsed back into these data structures?

Use a query string? That is feasible, but it gets messy as the processing steps grow more complex: apply seven or eight transformations to one image and the query string becomes very long.

My thought is to use protobuf. Protobuf can describe data structures and is supported by almost all languages. Once a protobuf describing an image spec is generated, we can serialize it into a byte stream. But byte streams cannot be placed in a URL—what now? We can encode it with base64!

Following this idea, let’s try writing the protobuf message definition for the image spec:

message ImageSpec { repeated Spec specs = 1; }

message Spec {
  oneof data {
    Resize resize = 1;
    Crop crop = 2;
    // ...
  }
}

...

This way, we can embed a base64 encoded string generated by protobuf in the URL to provide extensible image processing parameters. The processed URL would look like this:

http://localhost:3000/image/CgoKCAjYBBCgBiADCgY6BAgUEBQKBDICCAM/<encoded origin url>

CgoKCAjYBBCgBiADCgY6BAgUEBQKBDICCAM describes the image processing flow we mentioned above: resize first, then add a watermark to the resized result, and finally apply a filter. It can be generated with the following code:

fn print_test_url(url: &str) {
    use std::borrow::Borrow;
    let spec1 = Spec::new_resize(600, 800, resize::SampleFilter::CatmullRom);
    let spec2 = Spec::new_watermark(20, 20);
    let spec3 = Spec::new_filter(filter::Filter::Marine);
    let image_spec = ImageSpec::new(vec![spec1, spec2, spec3]);
    let s: String = image_spec.borrow().into();
    let test_image = percent_encode(url.as_bytes(), NON_ALPHANUMERIC).to_string();
    println!("test url: http://localhost:3000/image/{}/{}", s, test_image);
}

The benefit of using protobuf is that the serialized result is compact, and any language supporting protobuf can generate or parse this interface.
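
To see how compact the encoding is, the sketch below builds that exact string by hand using only the standard library. The helper names (put_varint, put_uint, put_message, put_spec, base64_url) are made up for this illustration; in the real project prost and the base64 crate do this work. Field numbers are taken from the abi.proto shown later in this talk.

```rust
// Append `value` as a protobuf varint: 7 bits per byte, least significant
// group first, with the high bit set on every byte except the last.
fn put_varint(buf: &mut Vec<u8>, mut value: u64) {
    while value >= 0x80 {
        buf.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    buf.push(value as u8);
}

// Append a varint field (wire type 0). proto3 omits fields holding the default 0.
fn put_uint(buf: &mut Vec<u8>, field: u64, value: u64) {
    if value != 0 {
        put_varint(buf, field << 3);
        put_varint(buf, value);
    }
}

// Append a nested message (wire type 2): tag, byte length, then the bytes.
fn put_message(buf: &mut Vec<u8>, field: u64, body: &[u8]) {
    put_varint(buf, (field << 3) | 2);
    put_varint(buf, body.len() as u64);
    buf.extend_from_slice(body);
}

// Wrap one operation in a Spec, then append that Spec to ImageSpec.specs (field 1).
fn put_spec(image_spec: &mut Vec<u8>, op_field: u64, op_body: &[u8]) {
    let mut spec = Vec::new();
    put_message(&mut spec, op_field, op_body);
    put_message(image_spec, 1, &spec);
}

// URL-safe base64 without padding.
fn base64_url(data: &[u8]) -> String {
    const TABLE: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    let mut out = String::new();
    for chunk in data.chunks(3) {
        let b = [chunk[0], *chunk.get(1).unwrap_or(&0), *chunk.get(2).unwrap_or(&0)];
        let n = (b[0] as u32) << 16 | (b[1] as u32) << 8 | b[2] as u32;
        // A chunk of 1, 2, or 3 bytes yields 2, 3, or 4 characters.
        for i in 0..=chunk.len() {
            out.push(TABLE[((n >> (18 - 6 * i)) & 0x3f) as usize] as char);
        }
    }
    out
}

fn encode_example() -> String {
    let mut resize = Vec::new();
    put_uint(&mut resize, 1, 600); // width
    put_uint(&mut resize, 2, 800); // height
    put_uint(&mut resize, 3, 0); // rtype = NORMAL, omitted on the wire
    put_uint(&mut resize, 4, 3); // filter = CATMULL_ROM

    let mut watermark = Vec::new();
    put_uint(&mut watermark, 1, 20); // x
    put_uint(&mut watermark, 2, 20); // y

    let mut filter = Vec::new();
    put_uint(&mut filter, 1, 3); // filter = MARINE

    let mut image_spec = Vec::new();
    put_spec(&mut image_spec, 1, &resize); // Spec.resize = 1
    put_spec(&mut image_spec, 7, &watermark); // Spec.watermark = 7
    put_spec(&mut image_spec, 6, &filter); // Spec.filter = 6
    base64_url(&image_spec)
}

fn main() {
    println!("{}", encode_example());
}
```

The whole three-step pipeline fits in 26 bytes, which is why the base64 segment in the URL stays short.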

Okay, we have settled on the interface. Next is to make an HTTP server to provide it. In the handling process for the /image route on the HTTP server, we need to fetch the original image from the URL, process it according to the image spec in sequence, and finally return the processed byte stream to the user.

In this flow, an obvious optimization that comes to mind is to provide an LRU (Least Recently Used) cache for the fetching process of the original image, as accessing external networks is the slowest and most uncontrollable segment of the entire path.
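
As a rough illustration of what an LRU cache does, here is a minimal sketch built on the standard library. It is a toy under stated assumptions (string keys, O(n) recency updates) and says nothing about how the lru crate we depend on later is implemented.

```rust
use std::collections::{HashMap, VecDeque};

// A deliberately simple LRU cache: a map for lookups plus a queue that
// records recency (front = least recently used, back = most recently used).
struct LruCache {
    capacity: usize,
    map: HashMap<String, Vec<u8>>,
    order: VecDeque<String>,
}

impl LruCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    // Move `key` to the back of the queue, marking it most recently used.
    // This scan is O(n); fine for a sketch, too slow for a real cache.
    fn touch(&mut self, key: &str) {
        self.order.retain(|k| k != key);
        self.order.push_back(key.to_string());
    }

    fn get(&mut self, key: &str) -> Option<&Vec<u8>> {
        if self.map.contains_key(key) {
            self.touch(key);
        }
        self.map.get(key)
    }

    fn put(&mut self, key: &str, value: Vec<u8>) {
        // Evict the least recently used entry only when inserting a new key
        // into a full cache.
        if !self.map.contains_key(key) && self.map.len() == self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        self.map.insert(key.to_string(), value);
        self.touch(key);
    }
}

fn main() {
    let mut cache = LruCache::new(2);
    cache.put("https://example.com/a.png", vec![1]);
    cache.put("https://example.com/b.png", vec![2]);
    cache.get("https://example.com/a.png"); // a is now the most recently used
    cache.put("https://example.com/c.png", vec![3]); // evicts b
    println!("b cached: {}", cache.get("https://example.com/b.png").is_some());
}
```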


After this analysis, doesn’t Thumbor seem less complicated than it first looked? Still, you must be wondering: can all of this really be done in only 200 lines of code? Let’s start writing and tally the lines when we finish.

Protobuf Definition and Compilation #

This project requires many dependencies, and I won’t introduce them one by one here. As you progress through your learning and work in the future, you’ll gradually encounter and use most of them.

As usual, let’s start by running cargo new thumbor to generate a new project, and then add the following dependencies to the project’s Cargo.toml:

[dependencies]
axum = "0.2" # Web server
anyhow = "1" # Error handling
base64 = "0.13" # Base64 encoding/decoding
bytes = "1" # Byte stream handling
image = "0.23" # Image processing
lazy_static = "1" # Initializing static variables easily with macros
lru = "0.6" # LRU cache
percent-encoding = "2" # URL encoding/decoding
photon-rs = "0.3" # Image effects
prost = "0.8" # Protobuf handling
reqwest = "0.11" # HTTP client
serde = { version = "1", features = ["derive"] } # Serializing/deserializing data
tokio = { version = "1", features = ["full"] } # Async handling
tower = { version = "0.4", features = ["util", "timeout", "load-shed", "limit"] } # Service handling and middleware
tower-http = { version = "0.1", features = ["add-extension", "compression-full", "trace" ] } # HTTP middleware
tracing = "0.1" # Logging and tracing
tracing-subscriber = "0.2" # Logging and tracing

[build-dependencies]
prost-build = "0.8" # Compiling protobuf

In the project’s root directory, create an abi.proto file with the following data structures used by our image processing service:

syntax = "proto3";

package abi; // This name will be used in the compilation result, prost will generate: abi.rs

// An ImageSpec is an ordered array, which the server processes following the spec order
message ImageSpec { repeated Spec specs = 1; }

// Processing image resizing
message Resize {
  uint32 width = 1;
  uint32 height = 2;

  enum ResizeType {
    NORMAL = 0;
    SEAM_CARVE = 1;
  }

  ResizeType rtype = 3;

  enum SampleFilter {
    UNDEFINED = 0;
    NEAREST = 1;
    TRIANGLE = 2;
    CATMULL_ROM = 3;
    GAUSSIAN = 4;
    LANCZOS3 = 5;
  }

  SampleFilter filter = 4;
}

// Processing image cropping
message Crop {
  uint32 x1 = 1;
  uint32 y1 = 2;
  uint32 x2 = 3;
  uint32 y2 = 4;
}

// Processing horizontal flipping
message Fliph {}
// Processing vertical flipping
message Flipv {}
// Processing contrast
message Contrast { float contrast = 1; }
// Processing filters
message Filter {
  enum Filter {
    UNSPECIFIED = 0;
    OCEANIC = 1;
    ISLANDS = 2;
    MARINE = 3;
    // more: https://docs.rs/photon-rs/0.3.1/photon_rs/filters/fn.filter.html
  }
  Filter filter = 1;
}

// Processing watermarking
message Watermark {
  uint32 x = 1;
  uint32 y = 2;
}

// A spec can contain one of the processing methods above
message Spec {
  oneof data {
    Resize resize = 1;
    Crop crop = 2;
    Flipv flipv = 3;
    Fliph fliph = 4;
    Contrast contrast = 5;
    Filter filter = 6;
    Watermark watermark = 7;
  }
}

This includes the image processing services we support, and more operations can be easily added in the future.
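
Part of why adding operations later is low-risk is the way protobuf readers treat fields they do not recognize: they skip them. The sketch below hand-decodes a Resize message the way an older reader that only knows width and height might. The function names are hypothetical, and real projects leave this to prost.

```rust
// Read one varint starting at `*pos`, advancing `*pos` past it.
fn read_varint(data: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0;
    while shift < 64 {
        let byte = *data.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if (byte & 0x80) == 0 {
            return Some(value);
        }
        shift += 7;
    }
    None // more than 10 bytes: not a valid varint
}

#[derive(Debug, Default, PartialEq)]
struct Resize {
    width: u32,
    height: u32,
}

// Decode only the fields this (old) reader knows about and skip the rest.
fn decode_resize(data: &[u8]) -> Option<Resize> {
    let mut resize = Resize::default();
    let mut pos = 0;
    while pos < data.len() {
        let tag = read_varint(data, &mut pos)?;
        let (field, wire_type) = (tag >> 3, tag & 7);
        match (field, wire_type) {
            (1, 0) => resize.width = read_varint(data, &mut pos)? as u32,
            (2, 0) => resize.height = read_varint(data, &mut pos)? as u32,
            // Unknown varint field: read and discard its value.
            (_, 0) => {
                read_varint(data, &mut pos)?;
            }
            // Unknown length-delimited field: skip `len` bytes.
            (_, 2) => {
                let len = read_varint(data, &mut pos)? as usize;
                pos = pos.checked_add(len).filter(|&end| end <= data.len())?;
            }
            // Other wire types are not needed for this sketch.
            _ => return None,
        }
    }
    Some(resize)
}

fn main() {
    // Written by a newer client that also sets field 4 (filter = 3).
    let newer: [u8; 8] = [0x08, 0xd8, 0x04, 0x10, 0xa0, 0x06, 0x20, 0x03];
    println!("{:?}", decode_resize(&newer));
}
```

A field that is absent simply keeps its default value, so the same reader also accepts messages from clients that predate a field.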

Protobuf is designed for backward compatibility, so the server can keep adding features while remaining compatible with older clients. In Rust, we can use prost to compile and work with protobuf. Create a build.rs in the project’s root directory with the following code:

fn main() {
    prost_build::Config::new()
        .out_dir("src/pb")
        .compile_protos(&["abi.proto"], &["."])
        .unwrap();
}

build.rs can perform additional compilation processes when compiling a cargo project. Here we use prost_build to compile abi.proto into the src/pb directory.

This directory doesn’t exist yet, so you need to mkdir src/pb to create it. Running cargo build, you’ll find an abi.rs file generated under src/pb, containing Rust data structures converted from protobuf messages. For now, let’s ignore various tags added by prost and treat them as common data structures.

Next, create src/pb/mod.rs to include abi.rs and write some helper functions. These functions are mainly for converting ImageSpec into a string and recovering it from a string.

We also wrote a test to ensure the functionality’s correctness. You can test it with cargo test. Remember to add mod pb; to main.rs to bring in the module.

use base64::{decode_config, encode_config, URL_SAFE_NO_PAD};
use photon_rs::transform::SamplingFilter;
use prost::Message;
use std::convert::TryFrom;

mod abi; // Declare abi.rs
pub use abi::*;

impl ImageSpec {
    pub fn new(specs: Vec<Spec>) -> Self {
        Self { specs }
    }
}

// Allow ImageSpec to generate a string
impl From<&ImageSpec> for String {
    fn from(image_spec: &ImageSpec) -> Self {
        let data = image_spec.encode_to_vec();
        encode_config(data, URL_SAFE_NO_PAD)
    }
}

// Allow ImageSpec to be created from a string, for example: s.try_into().unwrap()
impl TryFrom<&str> for ImageSpec {
    type Error = anyhow::Error;

    fn try_from(value: &str) -> Result<Self, Self::Error> {
        let data = decode_config(value, URL_SAFE_NO_PAD)?;
        Ok(ImageSpec::decode(&data[..])?)
    }
}

// Helper functions to support the image spec
// These functions are mainly for convenient conversion between ImageSpec and strings
// ... (Your code with helper functions and tests here)
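
If From and TryFrom are new to you, the following standalone sketch shows the same pattern on a smaller, hypothetical type: a Size parsed from a string such as 300x200. Size is not part of our project; it only illustrates how the two conversions pair up.

```rust
use std::convert::TryFrom;
use std::convert::TryInto;

#[derive(Debug, PartialEq)]
struct Size {
    width: u32,
    height: u32,
}

// Infallible direction: a Size can always be rendered as a string.
impl From<&Size> for String {
    fn from(size: &Size) -> Self {
        format!("{}x{}", size.width, size.height)
    }
}

// Fallible direction: not every string is a valid Size.
impl TryFrom<&str> for Size {
    type Error = String;

    fn try_from(value: &str) -> Result<Self, Self::Error> {
        let (w, h) = value
            .split_once('x')
            .ok_or_else(|| format!("missing 'x' in {:?}", value))?;
        let width = w.parse::<u32>().map_err(|e| format!("bad width: {}", e))?;
        let height = h.parse::<u32>().map_err(|e| format!("bad height: {}", e))?;
        Ok(Size { width, height })
    }
}

fn main() {
    // Implementing TryFrom<&str> gives us try_into() on &str for free.
    let size: Size = "300x200".try_into().unwrap();
    let text: String = (&size).into();
    println!("{:?} -> {}", size, text);
    // `"300x200".parse::<Size>()` would not compile here: `parse` needs `FromStr`.
}
```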

Introducing the HTTP Server #

Now that everything related to protobuf is in place, let’s bring in the HTTP service. The Rust community has many high-performance web frameworks, such as actix-web, rocket, warp, and the more recent axum. We’ll use axum for this server.

Based on axum’s documentation, we can construct the following code:

use axum::{extract::Path, handler::get, http::StatusCode, Router};
use percent_encoding::percent_decode_str;
use serde::Deserialize;
use std::convert::TryInto;

// Import the protobuf-generated code, which we don't need to focus on too much right now
mod pb;

use pb::*;

// Parameters use serde for Deserialize: axum will recognize and parse automatically
#[derive(Deserialize)]
struct Params {
    spec: String,
    url: String,
}

#[tokio::main]
async fn main() {
    // Initialize tracing
    tracing_subscriber::fmt::init();

    // Build routes
    let app = Router::new()
        // `GET /image/:spec/:url` runs the generate handler with spec and url passed in
        .route("/image/:spec/:url", get(generate));

    // Run the web server
    let addr = "127.0.0.1:3000".parse().unwrap();
    tracing::debug!("listening on {}", addr);
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}

// Currently, we're just parsing out the parameters
async fn generate(Path(Params { spec, url }): Path<Params>) -> Result<String, StatusCode> {
    let url = percent_decode_str(&url).decode_utf8_lossy();
    let spec: ImageSpec = spec
        .as_str()
        .try_into()
        .map_err(|_| StatusCode::BAD_REQUEST)?;
    Ok(format!("url: {}\n spec: {:#?}", url, spec))
}

Add them to main.rs and use cargo run to start the server. Then we can test with HTTPie from the previous talk:

httpie get "http://localhost:3000/image/CgoKCAjYBBCgBiADCgY6BAgUEBQKBDICCAM/https%3A%2F%2Fimages%2Epexels%2Ecom%2Fphotos%2F2470905%2Fpexels%2Dphoto%2D2470905%2Ejpeg%3Fauto%3Dcompress%26cs%3Dtinysrgb%26dpr%3D2%26h%3D750%26w%3D1260"
HTTP/1.1 200 OK

content-type: "text/plain"
content-length: "901"
date: "Wed, 25 Aug 2021 18:03:50 GMT"

url: https://images.pexels.com/photos/2470905/pexels-photo-2470905.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260
 spec: ImageSpec {
    specs: [
        Spec {
            data: Some(
                Resize(
                    Resize {
                        width: 600,
                        height: 800,
                        rtype: Normal,
                        filter: CatmullRom,
                    },
                ),
            ),
        },
        Spec {
            data: Some(
                Watermark(
                    Watermark {
                        x: 20,
                        y: 20,
                    },
                ),
            ),
        },
        Spec {
            data: Some(
                Filter(
                    Filter {
                        filter: Marine,
                    },
                ),
            ),
        },
    ],
}

Wow, our web server’s interface section already works correctly.
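
The long last segment of the request URL is simply the original image URL with every non-alphanumeric byte percent-encoded. Here is a rough, standard-library-only sketch of what the percent-encoding crate does for us with NON_ALPHANUMERIC and percent_decode_str. The function names below are hypothetical.

```rust
// Percent-encode every byte that is not an ASCII letter or digit.
fn percent_encode_all(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for &byte in input.as_bytes() {
        if byte.is_ascii_alphanumeric() {
            out.push(byte as char);
        } else {
            out.push_str(&format!("%{:02X}", byte));
        }
    }
    out
}

// Reverse the encoding; malformed escapes are passed through unchanged.
fn percent_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let escaped = if bytes[i] == b'%' {
            bytes
                .get(i + 1..i + 3)
                .filter(|hex| hex.iter().all(|b| b.is_ascii_hexdigit()))
                .and_then(|hex| std::str::from_utf8(hex).ok())
                .and_then(|hex| u8::from_str_radix(hex, 16).ok())
        } else {
            None
        };
        match escaped {
            Some(byte) => {
                out.push(byte);
                i += 3;
            }
            None => {
                out.push(bytes[i]);
                i += 1;
            }
        }
    }
    // Like `decode_utf8_lossy`, replace invalid UTF-8 instead of failing.
    String::from_utf8_lossy(&out).into_owned()
}

fn main() {
    let encoded = percent_encode_all("https://example.com/a-b.png?w=1");
    println!("{}", encoded);
    println!("{}", percent_decode(&encoded));
}
```

Encoding the slashes matters here: without it, the router would split the image URL into several path segments instead of passing it to the handler as one parameter.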

If the syntax so far seems perplexing, don’t worry. We haven’t yet covered concepts such as ownership, the type system, and generics, so many details will not make sense yet. For today’s example, it is enough to follow the overall flow.

Fetching Original Images and Caching #

Now that the interface can work, let’s handle the logic for fetching original images.

Based on our previous design, an LRU cache is needed to cache original images. Usually, web frameworks have middleware to manage global states, and axum is no exception. An AddExtensionLayer can be used to add a global state, which is the LRU cache caching the original images fetched from network requests.
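
Conceptually, global state of this kind is one value shared by every request handler. The following is a minimal sketch of that idea using threads instead of axum, assuming the cache is wrapped in Arc<Mutex<...>>. The Cache alias, fetch, and get_image are hypothetical names, and a HashMap stands in for the LRU cache.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for the shared cache; a plain HashMap keeps the
// sketch short where the design above calls for an LRU cache.
type Cache = Arc<Mutex<HashMap<String, Vec<u8>>>>;

// Pretend to download an image; counts how often the "network" is hit.
fn fetch(url: &str, downloads: &Mutex<u32>) -> Vec<u8> {
    *downloads.lock().unwrap() += 1;
    url.as_bytes().to_vec()
}

// Return the cached bytes if present, otherwise fetch and remember them.
// Holding the lock during the fetch keeps the sketch simple; a real server
// would avoid holding a lock across slow I/O.
fn get_image(cache: &Cache, url: &str, downloads: &Mutex<u32>) -> Vec<u8> {
    let mut guard = cache.lock().unwrap();
    if let Some(data) = guard.get(url) {
        return data.clone();
    }
    let data = fetch(url, downloads);
    guard.insert(url.to_string(), data.clone());
    data
}

// Simulate `requests` concurrent handlers asking for the same image and
// report how many downloads actually happened.
fn count_downloads(requests: usize) -> u32 {
    let cache: Cache = Arc::new(Mutex::new(HashMap::new()));
    let downloads = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..requests)
        .map(|_| {
            // Each "request handler" gets its own handle to the same state.
            let (cache, downloads) = (Arc::clone(&cache), Arc::clone(&downloads));
            thread::spawn(move || get_image(&cache, "https://example.com/a.png", &downloads))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *downloads.lock().unwrap();
    total
}

fn main() {
    println!("downloads for 8 requests: {}", count_downloads(8));
}
```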

Let’s change the code in main.rs to the following:

use anyhow::Result;
use axum::{
    extract::{Extension, Path},
    handler::get,
    http::{HeaderMap, HeaderValue, StatusCode