TL;DR

Crates and modules keep our program organized.

Crates

Rust programs are made of crates. Each crate is a complete, cohesive unit: all the source code for a single library or executable, plus any associated tests, examples, tools, configuration, and other junk.

The easiest way to see what crates are and how they work together is to use cargo build with the --verbose flag to build an existing project that has some dependencies.

The Cargo.toml file for each package is called its manifest. It is written in the TOML format. It contains metadata that is needed to compile the package.

  • Check out the cargo locate-project section for more detail on how Cargo finds the manifest file.

An example [dependencies] section from a Cargo.toml file:

[dependencies]
num = "0.4"
image = "0.13"
crossbeam = "0.8"

We found these crates on crates.io, the Rust community’s site for open source crates. Each crate’s page on crates.io shows its README.md file and links to documentation and source, as well as a line of configuration like image = "0.13" that you can copy and add to your Cargo.toml.

When we run cargo build, Cargo starts by downloading source code for the specified versions of these crates from crates.io. Then, it reads those crates’ Cargo.toml files, downloads their dependencies, and so on recursively.

Dependencies that are pulled in indirectly are called transitive dependencies. The collection of all these dependency relationships, which tells Cargo everything it needs to know about what crates to build and in what order, is known as the dependency graph of the crate.

Once it has the source code, Cargo compiles all the crates. It runs rustc, the Rust compiler, once for each crate in the project’s dependency graph. When compiling libraries, Cargo uses the --crate-type lib option. This tells rustc not to look for a main() function but instead to produce an .rlib file containing compiled code that can be used to create binaries and other .rlib files.

When compiling a program, Cargo uses --crate-type bin, and the result is a binary executable for the target platform.

With each rustc command, Cargo passes --extern options, giving the filename of each library the crate will use. That way, when rustc sees a line of code like use image::png::PNGEncoder, it can figure out that image is the name of another crate, and thanks to Cargo, it knows where to find that compiled crate on disk.

The Rust compiler needs access to these .rlib files because they contain the compiled code of the library. Rust will statically link that code into the final executable. The .rlib also contains type information so Rust can check that the library features we’re using in our code actually exist in the crate and that we’re using them correctly. It also contains a copy of the crate’s public inline functions, generics, and macros, features that can’t be fully compiled to machine code until Rust sees how we use them.
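As a rough, hand-written sketch of the kind of command lines this produces (real cargo build --verbose output includes many more flags; the file paths and crate names below are placeholders, not actual Cargo output):

# compile a dependency into an .rlib (paths are placeholders)
rustc --edition 2021 --crate-name image --crate-type lib \
      -o target/debug/deps/libimage.rlib path/to/image/src/lib.rs

# compile the program, telling rustc where to find the compiled dependency
rustc --edition 2021 --crate-name fern_sim --crate-type bin \
      --extern image=target/debug/deps/libimage.rlib \
      -o target/debug/fern_sim src/main.rs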

cargo build supports all sorts of options.

  • cargo build --release produces an optimized build. Release builds run faster, but they take longer to compile, they don’t check for integer overflow, they skip debug_assert!() assertions, and the stack traces they generate on panic are generally less reliable.
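A small sketch, not from the original text, that makes this difference visible; run it once with cargo run and once with cargo run --release:

fn main() {
    let x: u8 = 200;
    // debug_assert! is compiled out of release builds: this line panics under
    // `cargo run` but is skipped entirely under `cargo run --release`.
    debug_assert!(x < 100, "only checked in debug builds");
    println!("x = {}", x);
}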

Editions

Rust has extremely strong compatibility guarantees. Any code that compiled on Rust 1.0 must compile just as well on Rust 1.50 or, if it’s ever released, Rust 1.900.

But sometimes the community comes across compelling proposals for extensions to the language that would cause older code to no longer compile. For example, after much discussion, Rust settled on a syntax for asynchronous programming support that repurposes the identifiers async and await as keywords. But this language change would break any existing code that uses async or await as the name of a variable.

To evolve without breaking existing code, Rust uses editions. The 2015 edition of Rust is compatible with Rust 1.0. The 2018 edition changed async and await into keywords and streamlined the module system, while the 2021 edition improved array ergonomics and made some widely-used library definitions available everywhere by default. These were all important improvements to the language, but would have broken existing code. To avoid this, each crate indicates which edition of Rust it is written in with a line like this in the [package] section atop its Cargo.toml file:

edition = "2018"

If that key is absent, the 2015 edition is assumed, so old crates don’t have to change at all.

Rust promises that the compiler will always accept all extant editions of the language, and programs can freely mix crates written in different editions. It’s even fine for a 2015 edition crate to depend on a 2018 edition crate. In other words, a crate’s edition only affects how its source code is construed; edition distinctions are gone by the time the code has been compiled. This means there’s no pressure to update old crates just to continue to participate in the modern Rust ecosystem. Similarly, there’s no pressure to keep your crate on an older edition to avoid inconveniencing its users. You only need to change editions when you want to use new language features in your own code.

Editions don’t come out every year, only when the Rust project decides one is necessary. For example, there’s no 2020 edition. Setting edition to "2020" causes an error. The Rust Edition Guide covers the changes introduced in each edition and provides good background on the edition system.

It’s almost always a good idea to use the latest edition, especially for new code. cargo new creates new projects on the latest edition by default.

If you have a crate written in an older edition of Rust, the cargo fix command may be able to help you automatically upgrade your code to the newer edition. The Rust Edition Guide explains the cargo fix command in detail.
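A sketch of that migration workflow, assuming the cargo fix --edition flag as documented in the Edition Guide:

# rewrite code that the next edition would reject, where a machine fix exists
cargo fix --edition

# then bump `edition = "..."` in Cargo.toml and rebuild to catch the rest
cargo build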

Build Profiles

There are several configuration settings you can put in your Cargo.toml file that affect the rustc command lines that cargo generates.

Command line              Cargo.toml section used
cargo build               [profile.dev]
cargo build --release     [profile.release]
cargo test                [profile.test]

The defaults are usually fine, but one exception is when you want to use a profiler—a tool that measures where your program is spending its CPU time. To get the best data from a profiler, you need both optimizations (usually enabled only in release builds) and debug symbols (usually enabled only in debug builds). To enable both, add this to your Cargo.toml:

[profile.release]
debug = true # enable debug symbols in release builds

The debug setting controls the -g option to rustc. With this configuration, when you type cargo build --release, you’ll get a binary with debug symbols. The optimization settings are unaffected.

The Cargo documentation lists many other settings you can adjust in Cargo.toml.
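For example, a hedged sketch of a few commonly adjusted profile keys (the values are illustrative, not recommendations):

[profile.release]
opt-level = 3     # optimization level: 0-3, or "s"/"z" to optimize for size
lto = true        # link-time optimization: slower builds, sometimes faster binaries
debug = true      # keep debug symbols, as in the profiler example above

[profile.dev]
opt-level = 1     # a little optimization even in debug builds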

Modules

Whereas crates are about code sharing between projects, modules are about code organization within a project. They act as Rust’s namespaces, containers for the functions, types, constants, and so on that make up your Rust program or library. A module looks like this:

mod spores {
    use cells::{Cell, Gene};

    pub struct Spore {
        // ...
    }

    pub fn produce_spore(factory: &mut Sporangium) -> Spore {
        // ...
    }

    pub(crate) fn genes(spore: &Spore) -> Vec<Gene> {
        // ...
    }

    fn recombine(parent: &mut Cell) {
        // ...
    }
}

A module is a collection of items, named features like the Spore struct and the two functions in this example. The pub keyword makes an item public, so it can be accessed from outside the module. Marking an item as pub is often known as “exporting” that item.

pub(crate) means that the item is available anywhere inside this crate but isn’t exposed as part of the external interface. It can’t be used by other crates, and it won’t show up in this crate’s documentation.

Anything that isn’t marked pub is private and can only be used in the same module in which it is defined, or any child modules.

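A minimal sketch of these rules, with invented names: a private item is usable in its own module and in its child modules, but not outside:

mod spores_demo {
    // Not marked pub: private to `spores_demo` and its descendants.
    fn recombine_rate() -> f64 { 0.25 }

    pub mod stats {
        pub fn report() -> f64 {
            // ok: a child module may use its parent's private items
            super::recombine_rate()
        }
    }
}

fn main() {
    println!("{}", spores_demo::stats::report());
    // println!("{}", spores_demo::recombine_rate()); // error: function is private
}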

Nested Modules

Modules can nest, and it’s fairly common to see a module that’s just a collection of submodules:

mod plant_structures {
    pub mod roots {
        // ...
    }
    pub mod stems {
        // ...
    }
    pub mod leaves {
        // ...
    }
}

If you want an item in a nested module to be visible to other crates, be sure to mark it and all enclosing modules as public.

It’s also possible to specify pub(super), making an item visible to the parent module only, and pub(in <path>), which makes it visible in a specific parent module and its descendants. This is especially useful with deeply nested modules:

mod plant_structures {
    pub mod roots {
        pub mod products {
            pub(in crate::plant_structures::roots) struct Cytokinin {
                // ...
            }
        }   

        use products::Cytokinin;    // ok: in `roots` module
    }

    use roots::products::Cytokinin; // error: `Cytokinin` is private
}

// error: `Cytokinin` is private
use plant_structures::roots::products::Cytokinin;

In this way, we could write out a whole program, with a huge amount of code and a whole hierarchy of modules, related in whatever ways we wanted, all in a single source file.

Modules in Separate Files

A module can also be written like this:

mod spores;

Earlier, we included the body of the spores module, wrapped in curly braces. Here, we’re instead telling the Rust compiler that the spores module lives in a separate file, called spores.rs:

// spores.rs
pub struct Spore {
    // ...
}

spores.rs contains only the items that make up the module. It doesn’t need any kind of boilerplate to declare that it’s a module. Rust never compiles modules separately, even if they’re in separate files: when you build a Rust crate, you’re recompiling all of its modules.

A module can have its own directory. When Rust sees mod spores;, it checks for both spores.rs and spores/mod.rs; if neither file exists, or both exist, that’s an error.

For this example, we used spores.rs, because the spores module did not have any submodules. But consider the plant_structures module. If we decide to split that module and its three submodules into their own files, the resulting project would look like this:

// fern_sim/
// ├── Cargo.toml
// └── src/
//     ├── main.rs
//     ├── spores.rs
//     └── plant_structures/
//         ├── mod.rs
//         ├── leaves.rs
//         ├── roots.rs
//         └── stems.rs

In main.rs, we declare the plant_structures module:

pub mod plant_structures;

This causes Rust to load plant_structures/mod.rs, which declares the three submodules:

// in plant_structures/mod.rs
pub mod roots;
pub mod stems;
pub mod leaves;

The content of those three modules is stored in separate files named leaves.rs, roots.rs, and stems.rs, located alongside mod.rs in the plant_structures directory.

It’s also possible to use a file and directory with the same name to make up a module. For instance, if stems needed to include modules called xylem and phloem, we could choose to keep stems in plant_structures/stems.rs and add a stems directory:

// 3. modules in their own file with a supplementary directory containing submodules

// fern_sim/
// ├── Cargo.toml
// └── src/
//     ├── main.rs
//     ├── spores.rs
//     └── plant_structures/
//         ├── mod.rs
//         ├── leaves.rs
//         ├── roots.rs
//         ├── stems/
//         │   ├── phloem.rs
//         │   └── xylem.rs
//         └── stems.rs

Then, in stems.rs, we declare the two new submodules:

// in plant_structures/stems.rs
pub mod xylem;
pub mod phloem;

These three options—modules in their own file, modules in their own directory with a mod.rs, and modules in their own file with a supplementary directory containing submodules—give the module system enough flexibility to support almost any project structure you might desire.

Paths and Imports

The :: operator is used to access features of a module.

Code anywhere in your project can refer to any standard library feature by writing out its path:

if s1 > s2 {
    std::mem::swap(&mut s1, &mut s2);
}

std is the name of the standard library. The path std refers to the top-level module of the standard library. std::mem is a submodule within the standard library, and std::mem::swap is a public function in that module.

  • Writing out full paths like this everywhere would be tedious to type and hard to read.

The alternative is to import features into the modules where they’re used:

use std::mem;

if s1 > s2 {
    mem::swap(&mut s1, &mut s2);
}

The use declaration causes the name mem to be a local alias for std::mem throughout the enclosing block or module.

We could write use std::mem::swap; to import the swap function itself instead of the mem module. However, it is generally considered best style to import only types, traits, and modules, and then use relative paths to access the functions, constants, and other members within them.
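A short sketch of that style: import the type or the module, then reach other members through it:

use std::collections::HashMap; // import the type itself
use std::mem;                  // import the module rather than `std::mem::swap`

fn main() {
    let mut counts: HashMap<&str, u32> = HashMap::new();
    counts.insert("ferns", 3);

    let (mut a, mut b) = (1, 2);
    mem::swap(&mut a, &mut b); // the module name makes the call's origin obvious
    println!("{:?} {} {}", counts, a, b);
}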

Several names can be imported at once:

use std::collections::{HashMap, HashSet};  // import both

use std::fs::{self, File}; // import both `std::fs` and `std::fs::File`.

use std::io::prelude::*;  // import everything

This is just shorthand for writing out all the individual imports:

use std::collections::HashMap;
use std::collections::HashSet;

use std::fs;
use std::fs::File;

// all the public items in std::io::prelude:
use std::io::prelude::Read;
use std::io::prelude::Write;
use std::io::prelude::BufRead;
use std::io::prelude::Seek;

You can use as to import an item but give it a different name locally:

use std::io::Result as IOResult;

// This return type is just another way to write `std::io::Result<()>`:
fn save_spore(spore: &Spore) -> IOResult<()>

Modules do not automatically inherit names from their parent modules. For example, suppose we have this in our proteins/mod.rs:

// proteins/mod.rs
pub enum AminoAcid { ... }
pub mod synthesis;

Then the code in synthesis.rs does not automatically see the type AminoAcid:

// proteins/synthesis.rs
pub fn synthesize(seq: &[AminoAcid]) // error: can't find type `AminoAcid`

Instead, each module starts with a blank slate and must import the names it uses.

// proteins/synthesis.rs
use super::AminoAcid; // explicitly import from parent

pub fn synthesize(seq: &[AminoAcid]) // ok

By default, paths are relative to the current module:

// in proteins/mod.rs

// import from a submodule
use synthesis::synthesize;

self is also a synonym for the current module, so we could write either:

// in proteins/mod.rs

// import names from an enum,
// so we can write `Lys` for lysine, rather than `AminoAcid::Lys`
use self::AminoAcid::*;

or simply:

// in proteins/mod.rs

use AminoAcid::*;

The keywords super and crate have a special meaning in paths: super refers to the parent module, and crate refers to the crate containing the current module.

Using paths relative to the crate root rather than the current module makes it easier to move code around the project, since all the imports won’t break if the path of the current module changes. For example, we could write synthesis.rs using crate:

// proteins/synthesis.rs
use crate::proteins::AminoAcid; // explicitly import relative to crate root

pub fn synthesize(seq: &[AminoAcid]) // ok
// ...

Submodules can access private items in their parent modules with use super::*.

If you have a module with the same name as a crate that you are using, then referring to their contents takes some care. For example, if your program lists the image crate as a dependency in its Cargo.toml file, but also has a module named image, then paths starting with image are ambiguous:

mod image {
    pub struct Sampler {
        // ...
    }
}

// error: Does this refer to our `image` module, or the `image` crate?
use image::Pixels;

Even though the image module has no Pixels type, the ambiguity is still considered an error: it would be confusing if adding such a definition later could silently change what paths elsewhere in the program refer to.

To resolve the ambiguity, Rust has a special kind of path called an absolute path, starting with ::, which always refers to an external crate. To refer to the Pixels type in the image crate, you can write:

use ::image::Pixels;        // the `image` crate's `Pixels`

To refer to your own module’s Sampler type, you can write:

use self::image::Sampler;   // the `image` module's `Sampler`

Modules aren’t the same thing as files, but there is a natural analogy between modules and the files and directories of a Unix filesystem. The use keyword creates aliases, just as the ln command creates links. Paths, like filenames, come in absolute and relative forms. self and super are like the . and .. special directories.

The Standard Prelude

Each module starts with a “blank slate,” as far as imported names are concerned. But the slate is not completely blank.

For one thing, the standard library std is automatically linked with every project. This means you can always write use std::whatever or refer to std items by their full paths, like std::mem::swap(), inline in your code. Furthermore, a few particularly handy names, like Vec and Result, are included in the standard prelude and automatically imported. Rust behaves as though every module, including the root module, started with the following import:

use std::prelude::v1::*;

The standard prelude contains a few dozen commonly used traits and types.

Libraries sometimes provide modules named prelude. But std::prelude::v1 is the only prelude that is ever imported automatically. Naming a module prelude is just a convention that tells users it’s meant to be imported using *.
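A tiny sketch of what the prelude buys you: Vec, String, Result, Ok, and Err all work here with no use declarations at all:

fn main() {
    // All of these names come from the standard prelude.
    let mut names: Vec<String> = Vec::new();
    names.push(String::from("fern"));

    let parsed: Result<i32, _> = "42".parse();
    match parsed {
        Ok(n) => println!("{} name(s), parsed {}", names.len(), n),
        Err(e) => println!("parse failed: {}", e),
    }
}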

Making use Declarations pub

Even though use declarations are just aliases, they can be public:

// in plant_structures/mod.rs
// ...
pub use self::leaves::Leaf;
pub use self::roots::Root;

This means that Leaf and Root are public items of the plant_structures module. They are still simple aliases for plant_structures::leaves::Leaf and plant_structures::roots::Root.

The standard prelude is written as just such a series of pub imports.
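A self-contained sketch of how a re-export shortens users’ paths (the module contents are invented for illustration):

mod plant_structures {
    pub mod leaves {
        pub struct Leaf;
    }
    pub mod roots {
        pub struct Root;
    }

    // Re-export the two types at the top of the module.
    pub use self::leaves::Leaf;
    pub use self::roots::Root;
}

// Users can now import the short paths...
use plant_structures::{Leaf, Root};

fn main() {
    let (_leaf, _root) = (Leaf, Root);
    // ...although the long path still works too:
    let _also_a_leaf = plant_structures::leaves::Leaf;
}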

Making Struct Fields pub

A module can include user-defined struct types, introduced using the struct keyword. A simple struct looks like this:

pub struct Fern {
    pub roots: RootSet,
    pub stems: StemSet
}

A struct’s fields, even private fields, are accessible throughout the module where the struct is declared, and its submodules. Outside the module, only public fields are accessible.

It turns out that enforcing access control by module, rather than by class as in Java or C++, is surprisingly helpful for software design. It cuts down on boilerplate “getter” and “setter” methods, and it largely eliminates the need for anything like C++ friend declarations. A single module can define several types that work closely together, such as perhaps frond::LeafMap and frond::LeafMapIter, accessing each other’s private fields as needed, while still hiding those implementation details from the rest of your program.
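A sketch of module-based field access with invented names: a private field is usable throughout the defining module and its submodules, but not outside it:

mod ferns {
    pub struct Fern {
        pub size: f64,
        age_days: u32, // private field
    }

    impl Fern {
        pub fn sprout() -> Fern {
            Fern { size: 0.01, age_days: 0 } // ok: same module
        }
    }

    pub mod census {
        pub fn age(fern: &super::Fern) -> u32 {
            fern.age_days // ok: submodule of the defining module
        }
    }
}

fn main() {
    let fern = ferns::Fern::sprout();
    println!("size {}, age {}", fern.size, ferns::census::age(&fern));
    // println!("{}", fern.age_days); // error: field `age_days` is private
}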

Statics and Constants

In addition to functions, types, and nested modules, modules can also define constants and statics.

The const keyword introduces a constant. The syntax is just like let except that it may be marked pub, and the type is required. Also, UPPERCASE_NAMES are conventional for constants:

pub const ROOM_TEMPERATURE: f64 = 20.0;     // degrees Celsius

The static keyword introduces a static item, which is nearly the same thing:

pub static ROOM_TEMPERATURE: f64 = 68.0;    // degrees Fahrenheit

A constant is a bit like a C++ #define: the value is compiled into your code every place it’s used. A static is a variable that’s set up before your program starts running and lasts until it exits.

Use constants for magic numbers and strings in your code. Use statics for larger amounts of data, or any time you need to borrow a reference to the constant value.
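A small sketch contrasting the two (the values and strings are invented):

// Inlined at every use, like a type-checked C++ #define.
pub const ROOM_TEMPERATURE: f64 = 20.0; // degrees Celsius

// A single value with a fixed address, set up before main() runs.
pub static GREETING: &str = "welcome to the terrarium";

fn main() {
    println!("{}: {} degrees Celsius", GREETING, ROOM_TEMPERATURE);

    // Statics are the natural choice when you need a long-lived reference.
    let banner: &'static str = GREETING;
    println!("{}", banner);
}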

There are no mut constants. Statics can be marked mut, but Rust has no way to enforce its rules about exclusive access on mut statics. They are, therefore, inherently non-thread-safe, and safe code can’t use them at all:

static mut PACKETS_SERVED: usize = 0;

println!("{} served", PACKETS_SERVED); // error: use of mutable static

Rust discourages global mutable state.
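When you really do need a global counter, one common safe alternative, not covered in the text above, is an atomic type from the standard library:

use std::sync::atomic::{AtomicUsize, Ordering};

// A global counter that safe code can update, even from multiple threads.
static PACKETS_SERVED: AtomicUsize = AtomicUsize::new(0);

fn serve() {
    PACKETS_SERVED.fetch_add(1, Ordering::Relaxed);
}

fn main() {
    serve();
    serve();
    println!("{} served", PACKETS_SERVED.load(Ordering::Relaxed));
}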

Turning a Program into a Library

Suppose you’ve got one command-line program. The first step to turn a program into a library is to factor your existing project into two parts: a library crate, which contains all the shared code, and an executable, which contains the code that’s only needed for your existing command-line program.

Consider the simplified example program:

// src/main.rs
struct Fern {
    size: f64,
    growth_rate: f64
}

impl Fern {
    /// Simulate a fern growing for one day.
    fn grow(&mut self) {
        self.size *= 1.0 + self.growth_rate;
    }
}

/// Run a fern simulation for some number of days.
fn run_simulation(fern: &mut Fern, days: usize) {
    for _ in 0 .. days {
        fern.grow();
    }
}

fn main() {
    let mut fern = Fern {
        size: 1.0,
        growth_rate: 0.001
    };
    run_simulation(&mut fern, 1000);
    println!("final fern size: {}", fern.size);
}

Assume that this program has a trivial Cargo.toml file:

[package]
name = "fern_sim"
version = "0.1.0"
authors = ["You <you@example.com>"]
edition = "2021"

Steps to turn this program into a library:

  1. Rename the file src/main.rs to src/lib.rs.
  2. Add the pub keyword to items in src/lib.rs that will be public features of our library.
  3. Move the main function to a temporary file somewhere.

We didn’t need to change anything in Cargo.toml. This is because our minimal Cargo.toml file leaves Cargo to its default behavior. By default, cargo build looks at the files in our source directory and figures out what to build. When it sees the file src/lib.rs, it knows to build a library.

The code in src/lib.rs forms the root module of the library. Other crates that use our library can only access the public items of this root module.

The src/bin Directory

Cargo has some built-in support for small programs that live in the same crate as a library.

In fact, Cargo itself is written this way. The bulk of the code is in a Rust library. The cargo command-line program is a thin wrapper program that calls out to the library for all the heavy lifting. Both the library and the command-line program live in the same source repository.

We can keep our program and our library in the same crate, too. Put this code into a file named src/bin/efern.rs:

// src/bin/efern.rs
use fern_sim::{Fern, run_simulation};

fn main() {
    let mut fern = Fern {
        size: 1.0,
        growth_rate: 0.001
    };
    run_simulation(&mut fern, 1000);
    println!("final fern size: {}", fern.size);
}


// src/lib.rs
pub struct Fern {
    pub size: f64,
    pub growth_rate: f64
}

impl Fern {
    /// Simulate a fern growing for one day.
    pub fn grow(&mut self) {
        self.size *= 1.0 + self.growth_rate;
    }
}

/// Run a fern simulation for some number of days.
pub fn run_simulation(fern: &mut Fern, days: usize) {
    for _ in 0 .. days {
        fern.grow();
    }
}

We’ve added a use declaration for some items from the fern_sim crate, Fern and run_simulation. In other words, we’re using that crate as a library.

Because we’ve put this file into src/bin, Cargo will compile both the fern_sim library and this program the next time we run cargo build. We can run the efern program using cargo run --bin efern.

We still didn’t have to make any changes to Cargo.toml, because, again, Cargo’s default is to look at your source files and figure things out. It automatically treats .rs files in src/bin as extra programs to build.

We can also build larger programs in the src/bin directory using subdirectories. Suppose we want to provide a second program that draws a fern on the screen, but the drawing code is large and modular, so it belongs in its own file. We can give the second program its own subdirectory:

fern_sim/
├── Cargo.toml
└── src/
    └── bin/
        ├── efern.rs
        └── draw_fern/
            ├── main.rs
            └── draw.rs

This has the advantage of letting larger binaries have their own submodules without cluttering up either the library code or the src/bin directory.

Now that fern_sim is a library, we could have put this executable in its own isolated project, in a completely separate directory, with its own Cargo.toml listing fern_sim as a dependency:

[dependencies]
fern_sim = { path = "../fern_sim" }

The src/bin directory is just right for simple programs like efern and draw_fern.

Attributes

Any item in a Rust program can be decorated with attributes. Attributes are Rust’s catchall syntax for writing miscellaneous instructions and advice to the compiler.

Some examples of attributes:

#[allow(non_camel_case_types)]
pub struct git_revspec {
    // ...
}

// Only include this module in the project if we're building for Android.
#[cfg(target_os = "android")]
mod mobile;

Occasionally, we need to micromanage the inline expansion of functions, an optimization that we’re usually happy to leave to the compiler. We can use the #[inline] attribute for that:

#[inline]
fn do_osmosis(c1: &mut Cell, c2: &mut Cell) {
    // ...
}

There’s one situation where inlining won’t happen without #[inline]. When a function or method defined in one crate is called in another crate, Rust won’t inline it unless it’s generic (it has type parameters) or it’s explicitly marked #[inline].

Otherwise, the compiler treats #[inline] as a suggestion. Rust also supports the more insistent #[inline(always)], to request that a function be expanded inline at every call site, and #[inline(never)], to ask that a function never be inlined.

  • Inline expansion is a compiler optimization technique that reduces the overhead of a function call by simply not doing the call: instead, the compiler effectively rewrites the program to appear as though the definition of the called function was inserted at each call site. It eliminates the time overhead of the call and is typically applied to small functions that execute frequently.

Some attributes, like #[cfg] and #[allow], can be attached to a whole module and apply to everything in it. Others, like #[test] and #[inline], must be attached to individual items. Each attribute is custom-made and has its own set of supported arguments.
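A brief sketch of that distinction, with invented item names, in a small binary crate:

// Attached to a module: applies to everything inside it.
#[allow(dead_code)]
mod prototypes {
    pub fn not_called_yet() {} // no dead_code warning, thanks to the allow above
}

// Attached to individual items.
#[inline]
fn celsius_to_fahrenheit(c: f64) -> f64 {
    c * 9.0 / 5.0 + 32.0
}

#[test]
fn conversion_works() {
    assert_eq!(celsius_to_fahrenheit(0.0), 32.0);
}

fn main() {
    println!("{}", celsius_to_fahrenheit(20.0));
}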

To attach an attribute to a whole crate, add it at the top of the main.rs or lib.rs file, before any items, and write #! instead of #, like this:

// libgit2_sys/lib.rs
#![allow(non_camel_case_types)]

pub struct git_revspec {
    // ...
}

pub struct git_error {
    // ...
}

The #! tells Rust to attach an attribute to the enclosing item rather than whatever comes next: in this case, the #![allow] attribute attaches to the whole libgit2_sys crate, not just struct git_revspec.

#! can also be used inside functions, structs, and so on, but it’s only typically used at the beginning of a file, to attach an attribute to the whole module or crate.

Some attributes always use the #! syntax because they can only be applied to a whole crate. For example, the #![feature] attribute is used to turn on unstable features of the Rust language and libraries.

Tests and Documentation

A simple unit testing framework is built into Rust. Tests are ordinary functions marked with the #[test] attribute.

cargo test runs all the tests in your project. This works the same whether your crate is an executable or a library. You can run specific tests by passing arguments to Cargo: cargo test math runs all tests that contain math somewhere in their name.

Tests commonly use the assert! and assert_eq! macros from the Rust standard library. assert!(expr) succeeds if expr is true. Otherwise, it panics, which causes the test to fail. assert_eq!(v1, v2) is just like assert!(v1 == v2) except that if the assertion fails, the error message shows both values.

You can use these macros in ordinary code, to check invariants, but note that assert! and assert_eq! are included even in release builds. Use debug_assert! and debug_assert_eq! instead to write assertions that are checked only in debug builds.
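A minimal example of an ordinary unit test using these macros, written as if it lived in a library’s lib.rs (the function is invented for illustration):

// Ordinary library code...
pub fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// ...and a test for it, right alongside.
#[test]
fn first_word_works() {
    assert!(first_word("fern spores").starts_with('f'));
    assert_eq!(first_word("fern spores"), "fern");
    assert_eq!(first_word(""), "");
}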

To test error cases, add the #[should_panic] attribute to your test:

/// This test passes only if division by zero causes a panic
#[test]
#[allow(unconditional_panic, unused_must_use)]
#[should_panic(expected="divide by zero")]
fn test_divide_by_zero_error() {
    1 / 0; // should panic!
}

We also need an allow attribute to tell the compiler to let us write an expression that it can statically prove will panic, and to perform a division whose result we just throw away; normally, the compiler tries to stop that kind of silliness.

You can also return a Result<(), E> from your tests. As long as the error type implements the Debug trait, which is usually the case, you can simply use ? to propagate errors, discarding the Ok values along the way, and return Ok(()) at the end:

use std::num::ParseIntError;

#[test]
fn explicit_radix() -> Result<(), ParseIntError> {
    i32::from_str_radix("1024", 10)?;
    Ok(())
}

Functions marked with #[test] are compiled conditionally. A plain cargo build or cargo build --release skips the testing code. But when you run cargo test, Cargo builds your program twice: once in the ordinary way and once with your tests and the test harness enabled. This means your unit tests can live right alongside the code they test, accessing internal implementation details if they need to, and yet there’s no run-time cost. However, it can result in some warnings. For example:

// support code for tests
fn roughly_equal(a: f64, b: f64) -> bool {
    (a - b).abs() < 1e-6
}

#[test]
fn trig_works() {
    use std::f64::consts::PI;
    assert!(roughly_equal(PI.sin(), 0.0));
}
// cargo build
// warning: function `roughly_equal` is never used
// |
// |     fn roughly_equal(a: f64, b: f64) -> bool {
// |        ^^^^^^^^^^^^^
// |
// = note: `#[warn(dead_code)]` on by default

In builds that omit the test code, roughly_equal appears unused, and Rust complains, as the warning above shows.

So the convention, when your tests get substantial enough to require support code, is to put them in a tests module and declare the whole module to be testing-only using the #[cfg] attribute:

#[cfg(test)] // include this module only when testing
mod tests {
    // helper function for tests
    fn roughly_equal(a: f64, b: f64) -> bool {
        (a - b).abs() < 1e-6
    }

    #[test]
    fn trig_works() {
        use std::f64::consts::PI;
        // test the sine of pi is zero
        assert!(roughly_equal(PI.sin(), 0.0));
    }
}

Rust’s test harness uses multiple threads to run several tests at a time, a nice side benefit of your Rust code being thread-safe by default. To disable this, either run a single test, cargo test testname, or run cargo test -- --test-threads 1. (The first -- ensures that cargo test passes the --test-threads option through to the test executable.)

Normally, the test harness only shows the output of tests that failed. To show the output from tests that pass too, run cargo test -- --nocapture.

Integration Tests

Your fern simulator continues to grow. You’ve decided to put all the major functionality into a library that can be used by multiple executables. It would be nice to have some tests that link with the library the way an end user would, using fern_sim.rlib as an external crate. Also, you have some tests that start by loading a saved simulation from a binary file, and it is awkward having those large test files in your src directory. Integration tests help with these two problems.

Integration tests are .rs files that live in a tests directory alongside your project’s src directory. When you run cargo test, Cargo compiles each integration test as a separate, standalone crate, linked with your library and the Rust test harness.

Here is an example:

// tests/unfurl.rs - Fiddleheads unfurl in sunlight

use fern_sim::Terrarium;
use std::time::Duration;

#[test]
fn test_fiddlehead_unfurling() {
    let mut world = Terrarium::load("tests/unfurl_files/fiddlehead.tm");
    assert!(world.fern(0).is_furled());
    let one_hour = Duration::from_secs(60 * 60);
    world.apply_sunlight(one_hour);
    assert!(world.fern(0).is_fully_unfurled());
}

Integration tests are valuable in part because they see your crate from the outside, just as a user would. They test the crate’s public API.

cargo test runs both unit tests and integration tests. To run only the integration tests in a particular file—for example, tests/unfurl.rs—use the command cargo test --test unfurl.

Documentation

cargo doc creates HTML documentation for your library:

cargo doc --no-deps --open

The --no-deps option tells Cargo to generate documentation only for the library itself, and not for all the crates it depends on.

The --open option tells Cargo to open the documentation in your browser afterward.

Cargo saves the new documentation files in target/doc.

The documentation is generated from the pub features of your library, plus any doc comments you’ve attached to them.

When Rust sees comments that start with three slashes, it treats them as a #[doc] attribute instead:

/// Simulate the production of a spore by meiosis.
pub fn produce_spore(factory: &mut Sporangium) -> Spore {}

// equivalent to:
#[doc = "Simulate the production of a spore by meiosis."]
pub fn produce_spore(factory: &mut Sporangium) -> Spore {}

When you compile a library or binary, these attributes don’t change anything, but when you generate documentation, doc comments on public features are included in the output.

Comments starting with //! are treated as #![doc] attributes and are attached to the enclosing feature, typically a module or crate.

The content of a doc comment is treated as Markdown. You can also include HTML tags, which are copied verbatim into the formatted documentation.

One special feature of doc comments in Rust is that Markdown links can use Rust item paths, like leaves::Leaf, instead of relative URLs, to indicate what they refer to. Cargo will look up what the path refers to and substitute a link to the right place in the right documentation page. For example, the documentation generated from this code links to the documentation pages for VascularPath, Leaf, and Root:

/// Create and return a [`VascularPath`] which represents the path of
/// nutrients from the given [`Root`][r] to the given [`Leaf`](leaves::Leaf).
///
/// [r]: roots::Root
pub fn trace_path(leaf: &leaves::Leaf, root: &roots::Root) -> VascularPath {
    // ...
}

You can also add search aliases to make it easier to find things using the built-in search feature. Searching for either “path” or “route” in this crate’s documentation will lead to VascularPath:

#[doc(alias = "route")]
pub struct VascularPath {
    // ...
}

For longer blocks of documentation, or to streamline your workflow, you can include external files in your documentation. For example, if your repository’s README.md file holds the same text you’d like to use as your crate’s top-level documentation, you could put this at the top of lib.rs or main.rs:

#![doc = include_str!("../README.md")]

You can use `backticks` to set off bits of code in the middle of running text. In the output, these snippets will be formatted in a fixed-width font. Larger code samples can be added by indenting four spaces:

/// A block of code in a doc comment:
///
///     if samples::everything().works() {
///         println!("ok");
///     }

You can also use Markdown-fenced code blocks. This has exactly the same effect:

/// Another snippet, the same code, but written differently:
///
/// ```
/// if samples::everything().works() {
///     println!("ok");
/// }
/// ```

Whichever format you use, an interesting thing happens when you include a block of code in a doc comment. Rust automatically turns it into a test.

Doc-Tests

When you run tests in a Rust library crate, Rust checks that all the code that appears in your documentation actually runs and works. It does this by taking each block of code that appears in a doc comment, compiling it as a separate executable crate, linking it with your library, and running it.

Here is a standalone example of a doc-test. Create a new project by running cargo new --lib ranges (the --lib flag tells Cargo we’re creating a library crate, not an executable crate) and put the following code in ranges/src/lib.rs:

use std::ops::Range;

/// Return true if two ranges overlap.
///
///     assert_eq!(ranges::overlap(0..7, 3..10), true);
///     assert_eq!(ranges::overlap(1..5, 101..105), false);
///
/// If either range is empty, they don't count as overlapping.
///
///     assert_eq!(ranges::overlap(0..0, 0..10), false);
///
pub fn overlap(r1: Range<usize>, r2: Range<usize>) -> bool {
    r1.start < r1.end && r2.start < r2.end &&
        r1.start < r2.end && r2.start < r1.end
}

The two small blocks of code in the doc comment appear in the documentation generated by cargo doc.

They also become two separate tests:

cargo test
   Compiling ranges v0.1.0 (file:///.../ranges)
...
   Doc-tests ranges

running 2 tests
test overlap_0 ... ok
test overlap_1 ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

If you pass the --verbose flag to Cargo, you’ll see that it’s using rustdoc --test to run tests. rustdoc stores each code sample in a separate file, adding a few lines of boilerplate code, to produce programs:

// the first program
use ranges;
fn main() {
    assert_eq!(ranges::overlap(0..7, 3..10), true);
    assert_eq!(ranges::overlap(1..5, 101..105), false);
}

// the second program
use ranges;
fn main() {
    assert_eq!(ranges::overlap(0..0, 0..10), false);
}

The idea behind doc-tests is not to put all your tests into comments. Rather, you write the best possible documentation, and Rust makes sure the code samples in your documentation actually compile and run.

Very often a minimal working example includes some details, such as imports or setup code, that are necessary to make the code compile, but just aren’t important enough to show in the documentation. To hide a line of a code sample, put a # followed by a space at the beginning of that line:

/// Let the sun shine in and run the simulation for a given
/// amount of time.
///
///     # use fern_sim::Terrarium;
///     # use std::time::Duration;
///     # let mut tm = Terrarium::new();
///     tm.apply_sunlight(Duration::from_secs(60));
///
pub fn apply_sunlight(&mut self, time: Duration) {
    // ...
}

Sometimes it’s helpful to show a complete sample program in documentation, including a main function. Obviously, if those pieces of code appear in your code sample, you do not also want rustdoc to add them automatically. The result wouldn’t compile. rustdoc therefore treats any code block containing the exact string fn main as a complete program and doesn’t add anything to it.

Testing can be disabled for specific blocks of code. To tell Rust to compile your example, but stop short of actually running it, use a fenced code block with the no_run annotation:

/// Upload all local terrariums to the online gallery.
///
/// ```no_run
/// let mut session = fern_sim::connect();
/// session.upload_all();
/// ```
pub fn upload_all(&mut self) {
    // ...
}

If the code isn’t even expected to compile, use ignore instead of no_run. Blocks marked with ignore don’t show up in the output of cargo run, but no_run tests show up as having passed if they compile.

If the code block isn’t Rust code at all, use the name of the language, like c++ or sh, or text for plain text. rustdoc doesn’t know the names of hundreds of programming languages; rather, it treats any annotation it doesn’t recognize as indicating that the code block isn’t Rust. This disables code highlighting as well as doc-testing.

Specifying Dependencies

One way of telling Cargo where to get source code for crates is by version number.

image = "0.6.1"

You may want to use dependencies that aren’t published on crates.io at all. One way to do this is by specifying a Git repository URL and revision:

image = { git = "https://github.com/Piston/image.git", rev = "528f19c" }

You can specify the particular rev, tag, or branch to use. (These are all ways of telling Git which revision of the source code to check out.)

Another alternative is to specify a directory that contains the crate’s source code:

image = { path = "vendor/image" }

This is convenient when your team has a single version control repository that contains source code for several crates, or perhaps the entire dependency graph. Each crate can specify its dependencies using relative paths.

Versions

When you write something like image = "0.13.0" in your Cargo.toml file, Cargo interprets this rather loosely. It uses the most recent version of image that is considered compatible with version 0.13.0.

The compatibility rules are adapted from Semantic Versioning.

  • A version number that starts with 0.0 is so raw that Cargo never assumes it’s compatible with any other version.
  • A version number that starts with 0.x, where x is nonzero, is considered compatible with other point releases in the 0.x series. We specified image version 0.6.1, but Cargo would use 0.6.3 if available.
    • This is not what the Semantic Versioning standard says about 0.x version numbers, but the rule proved too useful to leave out.
  • Once a project reaches 1.0, only new major versions break compatibility. So if you ask for version 2.0.1, Cargo might use 2.17.99 instead, but not 3.0.

Version numbers are flexible by default because otherwise the problem of which version to use would quickly become overconstrained. Suppose one library, libA, used num = "0.1.31" while another, libB, used num = "0.1.29". If version numbers required exact matches, no project would be able to use those two libraries together. Allowing Cargo to use any compatible version is a much more practical default.

You can specify an exact version or range of versions by using operators.

Cargo.toml line            Meaning
image = "=0.10.0"          Use only the exact version 0.10.0
image = ">=1.0.5"          Use 1.0.5 or any higher version (even 2.9, if it's available)
image = ">1.0.5 <1.1.9"    Use a version that's higher than 1.0.5, but lower than 1.1.9
image = "<=2.7.10"         Use any version up to 2.7.10

The wildcard * tells Cargo that any version will do. Unless some other Cargo.toml file contains a more specific constraint, Cargo will use the latest available version. The Cargo documentation covers version specifications in even more detail.

The compatibility rules mean that version numbers can’t be chosen purely for marketing reasons. They actually mean something. They’re a contract between a crate’s maintainers and its users.

Cargo.lock

The version numbers in Cargo.toml are deliberately flexible, yet we don’t want Cargo to upgrade us to the latest library versions every time we build.

The first time you build a project, Cargo outputs a Cargo.lock file that records the exact version of every crate it used. Later builds will consult this file and continue to use the same versions. Cargo upgrades to newer versions only when you tell it to, either by manually bumping up the version number in your Cargo.toml file or by running cargo update.

cargo update only upgrades to the latest versions that are compatible with what you’ve specified in Cargo.toml. If you’ve specified image = "0.6.1", and you want to upgrade to version 0.10.0, you’ll have to change that in Cargo.toml. The next time you build, Cargo will update to the new version of the image library and store the new version number in Cargo.lock.
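The corresponding commands, sketched from Cargo’s standard flags:

# upgrade everything to the newest versions compatible with Cargo.toml
cargo update

# upgrade only the `image` crate (and whatever it pulls in)
cargo update -p image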

Something very similar happens for dependencies that are stored in Git.

Cargo.lock is automatically generated for you, and you normally won’t edit it by hand. Nonetheless, if your project is an executable, you should commit Cargo.lock to version control. That way, everyone who builds your project will consistently get the same versions.

If your project is an ordinary Rust library, don’t bother committing Cargo.lock. Your library’s downstream users will have Cargo.lock files that contain version information for their entire dependency graph; they will ignore your library’s Cargo.lock file. In the rare case that your project is a shared library (i.e., the output is a .dll, .dylib, or .so file), there is no such downstream cargo user, and you should therefore commit Cargo.lock.

Cargo.toml’s flexible version specifiers make it easy to use Rust libraries in your project and maximize compatibility among libraries. Cargo.lock’s bookkeeping supports consistent, reproducible builds across machines. Together, they go a long way toward helping you avoid dependency hell.

Publishing Crates to crates.io

First, make sure Cargo can pack the crate for you. The cargo package command creates a file (for example, target/package/fern_sim-0.1.0.crate) containing all your library’s source files, including Cargo.toml. This is the file that you’ll upload to crates.io to share with the world. (You can use cargo package --list to see which files are included.) Cargo then double-checks its work by building your library from the .crate file, just as your eventual users will.

Cargo warns that the [package] section of Cargo.toml is missing some information that will be important to downstream users, such as the license under which you’re distributing the code.

Another problem that sometimes arises at this stage is that your Cargo.toml file might be specifying the location of other crates by path. Cargo ignores the path key in automatically downloaded libraries, and this can cause build errors. If your library is going to be published on crates.io, its dependencies should be on crates.io too. Specify a version number instead of a path. If you prefer, you can specify both a path, which takes precedence for your own local builds, and a version for all other users:

image = { path = "vendor/image", version = "0.13.0" }

Lastly, before publishing a crate, you’ll need to log in to crates.io and get an API key. Cargo saves the key in a configuration file, and the API key should be kept secret, like a password.

cargo login 5j0dV54BjlXBpUUbfIj7G9DvNl1vsWW1
cargo publish

Workspaces

As your project continues to grow, you end up writing many crates. They live side by side in a single source repository:

fernsoft/
├── .git/...
├── fern_sim/
│   ├── Cargo.toml
│   ├── Cargo.lock
│   ├── src/...
│   └── target/...
├── fern_img/
│   ├── Cargo.toml
│   ├── Cargo.lock
│   ├── src/...
│   └── target/...
└── fern_video/
    ├── Cargo.toml
    ├── Cargo.lock
    ├── src/...
    └── target/...

The way Cargo works, each crate has its own build directory, target, which contains a separate build of all that crate’s dependencies. These build directories are completely independent. Even if two crates have a common dependency, they can’t share any compiled code. This is wasteful.

You can save compilation time and disk space by using a Cargo workspace, a collection of crates that share a common build directory and Cargo.lock file. All you need to do is create a Cargo.toml file in your repository’s root directory and put these lines in it:

# Cargo.toml
[workspace]
members = ["fern_sim", "fern_img", "fern_video"]

Here fern_sim etc. are the names of the subdirectories containing your crates. Delete any leftover Cargo.lock files and target directories that exist in those subdirectories. Once you’ve done this, cargo build in any crate will automatically create and use a shared build directory under the root directory (in this case, fernsoft/target). The command cargo build --workspace builds all crates in the current workspace. cargo test and cargo doc accept the --workspace option as well.
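A few workspace-aware invocations for reference (flag names as found in Cargo’s help output):

cargo build --workspace    # build every member crate
cargo test --workspace     # run every member's tests
cargo build -p fern_img    # build a single member from anywhere in the workspace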

More Nice Things

When you publish an open source crate on crates.io, your documentation is automatically rendered and hosted on docs.rs.

If your project is on GitHub, Travis CI can build and test your code on every push. It’s surprisingly easy to set up; see travis-ci.org for details. If you’re already familiar with Travis, this .travis.yml file will get you started:

language: rust
rust:
  - stable

You can generate a README.md file from your crate’s top-level doc-comment. This feature is offered as a third-party Cargo plug-in by Livio Ribeiro. Run cargo install cargo-readme to install the plug-in, then cargo readme --help to learn how to use it.

