Traits and Generics in Rust

TL;DR

When you implement a trait, either the trait or the type must be new in the current crate. This is called the orphan rule. It helps Rust ensure that trait implementations are unique. Your code can’t impl Write for u8, because both Write and u8 are defined in the standard library. If Rust let crates do that, there could be multiple implementations of Write for u8, in different crates, and Rust would have no reasonable way to decide which implementation to use for a given method call.

trait IsEmoji {
    fn is_emoji(&self) -> bool;
}

/// Implement IsEmoji for the built-in character type.
impl IsEmoji for char {
    fn is_emoji(&self) -> bool {
        // ...
    }
}

assert_eq!('$'.is_emoji(), false);

The trait itself must be in scope in order to use its methods. Rust has this rule because you can use traits to add new methods to any type—even standard library types like u32 and str. Third-party crates can do the same thing. Clearly, this could lead to naming conflicts. But since Rust makes you import the traits you plan to use, crates are free to take advantage of this superpower. To get a conflict, you’d have to import two traits that add a method with the same name to the same type. This is rare in practice. (If you do run into a conflict, you can spell out what you want using fully qualified method syntax.)

Two different crates might each extend the same type, say u32, with an extension trait that defines a method of the same name. Because a trait’s methods are visible only while that trait is in scope, this is harmless as long as you import just one of them. If you import both traits at the same time, a call to the shared method name is ambiguous and you get a conflict.
The reason Clone and Iterator methods work without any special imports is that they’re always in scope by default: they’re part of the standard prelude, names that Rust automatically imports into every module. In fact, the prelude is mostly a carefully chosen selection of traits.

A trait can use the keyword Self as a type. It’s an alias for the concrete type that implements the trait.

Every impl block, generic or not, defines the special type parameter Self (note the CamelCase name) to be whatever type we’re adding methods to.

Rust passes a method the value it’s being called on as its first argument, which must have the special name self. Since self’s type is obviously the one named at the top of the impl block, or a reference to that, Rust lets you omit the type, and write self, &self, or &mut self as shorthand for self: Queue, self: &Queue, or self: &mut Queue.
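
As a small sketch of the shorthand described above: the Queue type below is hypothetical (its field and method names are assumptions made for illustration). One method uses the shorthand receiver and the others spell the receiver type out in full; both forms mean the same thing.

```rust
/// A hypothetical queue type, used only to illustrate receiver syntax.
pub struct Queue {
    items: Vec<i32>,
}

impl Queue {
    pub fn new() -> Self {
        Queue { items: Vec::new() }
    }

    /// Shorthand receiver: `&mut self` means `self: &mut Queue`.
    pub fn push(&mut self, item: i32) {
        self.items.push(item);
    }

    /// Longhand receiver, with the type written out.
    pub fn len(self: &Queue) -> usize {
        self.items.len()
    }

    /// `Self` is an alias for `Queue` inside this impl block.
    pub fn into_items(self: Self) -> Vec<i32> {
        self.items
    }
}
```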

There are two ways of using traits to write polymorphic code in Rust: trait objects and generics.

Rust doesn’t permit variables of type dyn Write. A variable’s size has to be known at compile time, and types that implement Write can be any size.

use std::io::Write;

let mut buf: Vec<u8> = vec![];
let writer: dyn Write = buf;    // error: `Write` does not have a constant size
let writer: &mut dyn Write = &mut buf; // ok, automatically converts an ordinary reference to a trait object

A reference to a trait type is called a trait object.

let mut local_file = File::create("hello.txt")?;

// out is a trait object
fn say_hello_to(out: &mut dyn Write) -> std::io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}
say_hello_to(&mut local_file)?; // automatically converts an ordinary reference to a trait object

The type of out is &mut dyn Write, meaning “a mutable reference to any value that implements the Write trait.”
The type of &mut local_file is &mut File, and the type of the argument to say_hello_to is &mut dyn Write. Since a File is a kind of writer, Rust allows this, automatically converting the plain reference to a trait object.
In memory, a trait object is a fat pointer consisting of a pointer to the value, plus a pointer to a table representing that value’s type. Each trait object therefore takes up two machine words.
- A vtable describes one type’s implementation of one trait (including that trait’s supertraits), so a type that implements several traits has a separate vtable for each. Whichever vtable is used, the pointer to it takes up one machine word.
Rust automatically converts ordinary references into trait objects when needed. At the point where the conversion happens, Rust knows the referent’s true type (in this case, File), so it just adds the address of the appropriate vtable, turning the regular pointer into a fat pointer. This kind of conversion is the only way to create a trait object.

Trait objects are runtime polymorphism. Rust never knows what type of value a trait object points to until run time. With trait objects, you lose the type information Rust needs to type-check your program.

Generics are compile-time polymorphism. No dynamic dispatch is involved. Rust generates machine code for each type T that you actually use.

let mut local_file = File::create("hello.txt")?;

// generic function
fn say_hello_gf<W: Write>(out: &mut W) -> std::io::Result<()> { 
    out.write_all(b"hello world\n")?;
    out.flush()
}
say_hello_gf(&mut local_file)?;    // calls say_hello_gf::<File>

<W: Write> is a type parameter. W stands for some type that implements the Write trait. <W: Write> in the function signature means that say_hello_gf can be used with arguments of any type W that implements the Write trait. A requirement like this is called a bound, because it sets limits on which types W could possibly be.

A trait method call like the following one is fast, as fast as any other method call:

use std::io::Write;

let mut buf: Vec<u8> = vec![];
buf.write_all(b"hello")?; // ok

Simply put, there’s no polymorphism here. It’s obvious that buf is a vector, not a file or a network connection. The compiler can emit a simple call to Vec<u8>::write(). It can even inline the method.
Only calls through &mut dyn Write incur the overhead of a dynamic dispatch, also known as a virtual method call, which is indicated by the dyn keyword in the type.

Traits can describe relationships between types. These are ways of avoiding virtual method overhead and downcasts, since they allow Rust to know more concrete types at compile time.

The line impl<W: Write> WriteHtml for W means “for every type W that implements Write, here’s an implementation of WriteHtml for W.”

Brief

One of the great discoveries in programming is that it’s possible to write code that operates on values of many different types, even types that haven’t been invented yet. Here are two examples:

Vec<T> is generic: you can create a vector of any type of value, including types defined in your program that the authors of Vec never anticipated.
Many things have .write() methods, including Files and TcpStreams. Your code can take a writer by reference, any writer, and send data to it. Your code doesn’t have to care what type of writer it is. Later, if someone adds a new type of writer, your code will already support it.

This capability is called polymorphism. Rust supports polymorphism with two related features: traits and generics. These concepts will be familiar to many programmers, but Rust takes a fresh approach inspired by Haskell’s typeclasses.

Traits are Rust’s take on interfaces or abstract base classes. At first, they look just like interfaces in Java or C#. The trait for writing bytes is called std::io::Write, and its definition in the standard library starts out like this:

trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;
    fn write_all(&mut self, buf: &[u8]) -> Result<()> { ... }
    // ...
}

The standard types File and TcpStream both implement std::io::Write. So does Vec<u8>. All three types provide methods named .write(), .flush(), and so on. Code that uses a writer without caring about its type looks like this:

use std::io::Write;

fn say_hello(out: &mut dyn Write) -> std::io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

The type of out is &mut dyn Write, meaning “a mutable reference to any value that implements the Write trait.” We can pass say_hello a mutable reference to any such value:

use std::fs::File;
let mut local_file = File::create("hello.txt")?;
say_hello(&mut local_file)?; // works

let mut bytes = vec![];
say_hello(&mut bytes)?; // also works
assert_eq!(bytes, b"hello world\n");

We can use traits to add extension methods to existing types, even built-in types like str and bool.

Built-in traits are the hook into the language that Rust provides for operator overloading and other features.

Generics are the other flavor of polymorphism in Rust. Like a C++ template, a generic function or type can be used with values of many different types:

/// Given two values, pick whichever one is less.
fn min<T: Ord>(value1: T, value2: T) -> T {
    if value1 <= value2 {
        value1
    } else {
        value2
    }
}

The <T: Ord> in this function means that min can be used with arguments of any type T that implements the Ord trait—that is, any ordered type. A requirement like this is called a bound, because it sets limits on which types T could possibly be. The compiler generates custom machine code for each type T that you actually use.

Generics and traits are closely related: generic functions use traits in bounds to spell out what types of arguments they can be applied to.

Using Traits

A trait is a feature that any given type may or may not support. Most often, a trait represents a capability: something a type can do.

A value that implements std::io::Write can write out bytes.
- std::fs::File implements the Write trait; it writes bytes to a local file. std::net::TcpStream writes to a network connection. Vec<u8> also implements Write. Each .write() call on a vector of bytes appends some data to the end.
A value that implements std::iter::Iterator can produce a sequence of values.
- Range<i32> (the type of 0..10) implements the Iterator trait, as do some iterator types associated with slices, hash tables, and so on.
A value that implements std::clone::Clone can make clones of itself in memory.
- Most standard library types implement Clone. The exceptions are mainly types like TcpStream that represent more than just data in memory.
A value that implements std::fmt::Debug can be printed using println!() with the {:?} format specifier.

There is one unusual rule about trait methods: the trait itself must be in scope. Otherwise, all its methods are hidden:

let mut buf: Vec<u8> = vec![];
buf.write_all(b"hello")?; // error: no method named `write_all`


use std::io::Write;

let mut buf: Vec<u8> = vec![];
buf.write_all(b"hello")?; // ok

Rust has this rule because you can use traits to add new methods to any type—even standard library types like u32 and str. Third-party crates can do the same thing. Clearly, this could lead to naming conflicts. But since Rust makes you import the traits you plan to use, crates are free to take advantage of this superpower. To get a conflict, you’d have to import two traits that add a method with the same name to the same type. This is rare in practice. (If you do run into a conflict, you can spell out what you want using fully qualified method syntax.)
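
To make the conflict case concrete, here is a minimal sketch. The traits Metric and Imperial and the function describe_both are hypothetical names invented for this example. Both traits add a describe method to u32, and fully qualified syntax picks one explicitly.

```rust
trait Metric {
    fn describe(&self) -> String;
}

trait Imperial {
    fn describe(&self) -> String;
}

impl Metric for u32 {
    fn describe(&self) -> String {
        format!("{} m", self)
    }
}

impl Imperial for u32 {
    fn describe(&self) -> String {
        format!("{} ft", self)
    }
}

fn describe_both(n: u32) -> (String, String) {
    // `n.describe()` would not compile here: both traits are in scope,
    // so the method name alone is ambiguous.
    (Metric::describe(&n), <u32 as Imperial>::describe(&n))
}
```

Either spelling works: Trait::method(&value) when the type can be inferred, or <Type as Trait>::method(&value) when it has to be stated.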

The reason Clone and Iterator methods work without any special imports is that they’re always in scope by default: they’re part of the standard prelude, names that Rust automatically imports into every module. In fact, the prelude is mostly a carefully chosen selection of traits.

C++ and C# programmers will already have noticed that trait methods are like virtual methods. Still, calls like the one shown above are fast, as fast as any other method call. Simply put, there’s no polymorphism here. It’s obvious that buf is a vector, not a file or a network connection. The compiler can emit a simple call to Vec<u8>::write(). It can even inline the method. (C++ and C# will often do the same, although the possibility of subclassing sometimes precludes this.) Only calls through &mut dyn Write incur the overhead of a dynamic dispatch, also known as a virtual method call, which is indicated by the dyn keyword in the type.

A reference such as &mut dyn Write is known as a trait object.

Trait Objects

A trait object points to both an instance of a type implementing our specified trait and a table used to look up trait methods on that type at runtime.

We create a trait object by specifying some sort of pointer, such as a & reference or a Box<T> smart pointer, then the dyn keyword, and then specifying the relevant trait.

We can use trait objects in place of a generic or concrete type. Wherever we use a trait object, Rust’s type system will ensure at compile time that any value used in that context will implement the trait object’s trait. Consequently, we don’t need to know all the possible types at compile time.

In Rust, we refrain from calling structs and enums “objects” to distinguish them from other languages’ objects. In a struct or enum, the data in the struct fields and the behavior in impl blocks are separated, whereas in other languages, the data and behavior combined into one concept is often labeled an object. However, trait objects are more like objects in other languages in the sense that they combine data and behavior. But trait objects differ from traditional objects in that we can’t add data to a trait object. Trait objects aren’t as generally useful as objects in other languages: their specific purpose is to allow abstraction across common behavior.

Rust doesn’t permit variables of type dyn Write:

use std::io::Write;

let mut buf: Vec<u8> = vec![];
let writer: dyn Write = buf;    // error: `Write` does not have a constant size

A variable’s size has to be known at compile time, and types that implement Write can be any size.

In Java, a variable of type OutputStream (the Java standard interface analogous to std::io::Write) is a reference to any object that implements OutputStream. The fact that it’s a reference goes without saying. It’s the same with interfaces in C# and most other languages.

What we want in Rust is the same thing, but in Rust, references are explicit:

let mut buf: Vec<u8> = vec![];
let writer: &mut dyn Write = &mut buf; // ok

A reference to a trait type is called a trait object. Like any other reference, a trait object points to some value, it has a lifetime, and it can be either mut or shared.

writer is a trait object: a reference to some value whose type implements Write.

What makes a trait object different is that Rust usually doesn’t know the type of the referent at compile time. So a trait object includes extra information about the referent’s type. This is strictly for Rust’s own use behind the scenes: when you call writer.write(data), Rust needs the type information to dynamically call the right write method depending on the type of *writer. You can’t query the type information directly, and Rust does not support downcasting from the trait object &mut dyn Write back to a concrete type like Vec<u8>.

Note: downcasting means casting a value from a more general type to a more specific type.

Trait object layout

In memory, a trait object is a fat pointer consisting of a pointer to the value, plus a pointer to a table representing that value’s type. Each trait object therefore takes up two machine words.

A vtable describes one type’s implementation of one trait (including that trait’s supertraits), so a type that implements several traits has a separate vtable for each. Whichever vtable is used, the pointer to it takes up one machine word.

C++ has this kind of run-time type information as well. It’s called a virtual table, or vtable. In Rust, as in C++, the vtable is generated once, at compile time, and shared by all objects of the same type. The vtable, like the rest of the trait object’s internal layout, is a private implementation detail of Rust. Again, these aren’t fields and data structures that you can access directly. Instead, the language automatically uses the vtable when you call a method of a trait object, to determine which implementation to call.

In C++, the vtable pointer, or vptr, is stored as part of the struct. Rust uses fat pointers instead. The struct itself (the value that a trait object’s data pointer refers to) contains nothing but its fields. This way, a struct can implement dozens of traits without containing dozens of vptrs. Even types like i32, which aren’t big enough to accommodate a vptr, can implement traits.
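
The two-word layout can be observed with std::mem::size_of. This is a sketch only; the helper name pointer_sizes is an assumption, and the exact layout is an implementation detail that happens to hold on current compilers.

```rust
use std::io::Write;
use std::mem::size_of;

/// Returns, in bytes, the sizes of an ordinary reference, a trait-object
/// reference, and a boxed trait object.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&Vec<u8>>(),       // thin pointer: one machine word
        size_of::<&dyn Write>(),     // fat pointer: data pointer + vtable pointer
        size_of::<Box<dyn Write>>(), // also a fat pointer
    )
}
```

On a 64-bit target this would typically report 8, 16, and 16 bytes, while the values being pointed at carry no vptr of their own.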

Rust automatically converts ordinary references into trait objects when needed. This is why we’re able to pass &mut local_file to say_hello in this example:

let mut local_file = File::create("hello.txt")?;
say_hello(&mut local_file)?;

The type of &mut local_file is &mut File, and the type of the argument to say_hello is &mut dyn Write. Since a File is a kind of writer, Rust allows this, automatically converting the plain reference to a trait object.

Likewise, Rust will happily convert a Box<File> to a Box<dyn Write>, a value that owns a writer in the heap:

let w: Box<dyn Write> = Box::new(local_file);

Box<dyn Write>, like &mut dyn Write, is a fat pointer: it contains the address of the writer itself and the address of the vtable. The same goes for other pointer types, like Rc<dyn Write>.
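
As an illustration of owning a writer through a fat pointer, the sketch below chooses a writer at run time. The function names make_writer and greet are hypothetical; Vec<u8> and std::io::sink() stand in for whatever writers a real program would use.

```rust
use std::io::{self, Write};

/// Pick a writer at run time. Each arm produces a different concrete Box,
/// and both are converted to `Box<dyn Write>` to match the return type.
fn make_writer(discard: bool) -> Box<dyn Write> {
    if discard {
        Box::new(io::sink())
    } else {
        Box::new(Vec::<u8>::new())
    }
}

/// Works with any boxed writer; the calls go through the vtable.
fn greet(mut out: Box<dyn Write>) -> io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}
```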

This kind of conversion is the only way to create a trait object. What the compiler is actually doing here is very simple. At the point where the conversion happens, Rust knows the referent’s true type (in this case, File), so it just adds the address of the appropriate vtable, turning the regular pointer into a fat pointer.

Generic Functions and Type Parameters

Rewrite say_hello() as a generic function:

fn say_hello<W: Write>(out: &mut W) -> std::io::Result<()> { // generic function
    out.write_all(b"hello world\n")?;
    out.flush()
}

fn say_hello(out: &mut dyn Write)     // plain function

The phrase <W: Write> is what makes the function generic. This is a type parameter. It means that throughout the body of this function, W stands for some type that implements the Write trait. Type parameters are usually single uppercase letters, by convention.

Which type W stands for depends on how the generic function is used:

say_hello(&mut local_file)?;    // calls say_hello::<File>
say_hello(&mut bytes)?;         // calls say_hello::<Vec<u8>>

When you pass &mut local_file to the generic say_hello() function, you’re calling say_hello::<File>(). Rust generates machine code for this function that calls File::write_all() and File::flush(). When you pass &mut bytes, you’re calling say_hello::<Vec<u8>>(). Rust generates separate machine code for this version of the function, calling the corresponding Vec<u8> methods. In both cases, Rust infers the type W from the type of the argument. This process is known as monomorphization, and the compiler handles it all automatically.

You can always spell out the type parameters:

say_hello::<File>(&mut local_file)?;

This is seldom necessary, because Rust can usually deduce the type parameters by looking at the arguments. Here, the say_hello generic function expects a &mut W argument, and we’re passing it a &mut File, so Rust infers that W = File.
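
One way to see which W Rust picked is to report it from inside the function. The variant of say_hello below is an assumption made for illustration: it returns the name of W via std::any::type_name, whose exact output format is not guaranteed.

```rust
use std::any::type_name;
use std::io::Write;

/// Like the generic say_hello, but also reports which type `W` was
/// chosen for this particular instantiation.
fn say_hello<W: Write>(out: &mut W) -> std::io::Result<&'static str> {
    out.write_all(b"hello world\n")?;
    out.flush()?;
    Ok(type_name::<W>())
}
```

Calling it with a &mut Vec<u8> reports a name containing Vec, and calling say_hello::<std::io::Sink> reports a name containing Sink, one per monomorphized copy.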

If the generic function you’re calling doesn’t have any arguments that provide useful clues, you may have to spell it out:

// calling a generic method collect<C>() that takes no arguments
let v1 = (0 .. 1000).collect();             // error: can't infer type
let v2 = (0 .. 1000).collect::<Vec<i32>>(); // ok
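
The reason collect needs help is that many collection types could be the target. A short sketch, with the hypothetical function collect_two_ways, shows the two usual ways of supplying the missing type:

```rust
use std::collections::HashSet;

fn collect_two_ways() -> (Vec<i32>, HashSet<i32>) {
    // Spell out the type parameter on the method call.
    let v = (0..5).collect::<Vec<i32>>();
    // Or annotate the binding and let `collect` infer its target from it.
    let s: HashSet<i32> = (0..5).map(|n| n % 2).collect();
    (v, s)
}
```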

Sometimes we need multiple abilities from a type parameter. The syntax for this uses the + sign:

use std::hash::Hash;
use std::fmt::Debug;

fn top_ten<T: Debug + Hash + Eq>(values: &Vec<T>) { ... }

It’s also possible for a type parameter to have no bounds at all, but you can’t do much with a value if you haven’t specified any bounds for it. You can move it. You can put it into a box or vector. That’s about it.
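
To show why each bound in a Debug + Hash + Eq list earns its place, here is a sketch of a simpler cousin of top_ten. The function most_common is hypothetical: Hash and Eq let values serve as HashMap keys, and Debug lets the result be formatted.

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::hash::Hash;

/// Return a description of the most frequent value, or None for empty input.
/// Ties are broken arbitrarily.
fn most_common<T: Debug + Hash + Eq>(values: &[T]) -> Option<String> {
    let mut counts: HashMap<&T, usize> = HashMap::new();
    for v in values {
        *counts.entry(v).or_insert(0) += 1; // needs Hash + Eq
    }
    counts
        .into_iter()
        .max_by_key(|&(_, n)| n)
        .map(|(v, n)| format!("{:?} x{}", v, n)) // needs Debug
}
```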

Generic functions can have multiple type parameters:

/// Run a query on a large, partitioned data set.
/// See <http://research.google.com/archive/mapreduce.html>.
fn run_query<M: Mapper + Serialize, R: Reducer + Serialize>(data: &DataSet, map: M, reduce: R) -> Results {}

The bounds can get to be so long that they are hard on the eyes. Rust provides an alternative syntax using the keyword where:

fn run_query<M, R>(data: &DataSet, map: M, reduce: R) -> Results
    where M: Mapper + Serialize,
          R: Reducer + Serialize
{}

The type parameters M and R are still declared up front, but the bounds are moved to separate lines. This kind of where clause is also allowed on generic structs, enums, type aliases, and methods—anywhere bounds are permitted.

A generic function can have both lifetime parameters and type parameters. Lifetime parameters come first:

/// Return a reference to the point in `candidates` that's
/// closest to the `target` point.
fn nearest<'t, 'c, P>(target: &'t P, candidates: &'c [P]) -> &'c P where P: MeasureDistance {}

Lifetimes never have any impact on machine code. Two calls to nearest() using the same type P, but different lifetimes, will call the same compiled function. Only differing types cause Rust to compile multiple copies of a generic function.
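
A runnable sketch of nearest follows, under stated assumptions: the MeasureDistance trait is given a single hypothetical method distance_to, Point1D is an invented test type, and the function returns an Option so that an empty candidate list is handled (the signature above returns a bare reference).

```rust
trait MeasureDistance {
    /// Hypothetical method: distance between two values of the same type.
    fn distance_to(&self, other: &Self) -> f64;
}

#[derive(Debug, PartialEq)]
struct Point1D(f64);

impl MeasureDistance for Point1D {
    fn distance_to(&self, other: &Self) -> f64 {
        (self.0 - other.0).abs()
    }
}

/// The result borrows from `candidates` ('c), not from `target` ('t).
fn nearest<'t, 'c, P>(target: &'t P, candidates: &'c [P]) -> Option<&'c P>
where
    P: MeasureDistance,
{
    candidates.iter().min_by(|a, b| {
        target
            .distance_to(a)
            .total_cmp(&target.distance_to(b))
    })
}
```

Because the return type mentions only 'c, the target can be dropped while the returned reference is still in use.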

In addition to types and lifetimes, generic functions can take constant parameters as well:

fn dot_product<const N: usize>(a: [f64; N], b: [f64; N]) -> f64 {
    let mut sum = 0.;
    for i in 0..N {
        sum += a[i] * b[i];
    }
    sum
}

The phrase <const N: usize> indicates that the function dot_product expects a generic parameter N, which must be a usize. What distinguishes N from an ordinary usize argument is that you can use it in the types in dot_product’s signature or body.

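N is a compile-time constant, so it can be used here as an array length; an ordinary usize argument is a run-time value and cannot be used as an array length.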

As with type parameters, you can either provide constant parameters explicitly, or let Rust infer them:

// Explicitly provide `3` as the value for `N`.
dot_product::<3>([0.2, 0.4, 0.6], [0., 0., 1.])

// Let Rust infer that `N` must be `2`.
dot_product([3., 4.], [-5., 1.])
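
Because N is part of the type, one const-generic function can forward it to another. The sketch below repeats dot_product and adds a hypothetical helper, norm_squared, that passes its own N along.

```rust
fn dot_product<const N: usize>(a: [f64; N], b: [f64; N]) -> f64 {
    let mut sum = 0.;
    for i in 0..N {
        sum += a[i] * b[i];
    }
    sum
}

/// Hypothetical helper: `N` appears in the parameter type and is
/// forwarded explicitly to `dot_product`.
fn norm_squared<const N: usize>(a: [f64; N]) -> f64 {
    dot_product::<N>(a, a)
}
```

Passing arrays of different lengths to dot_product is rejected at compile time, since both parameters must share the same N.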

Functions are not the only kind of generic code in Rust:

There are generic structs and generic enums.

An individual method can be generic, even if the type it’s defined on is not generic:

impl PancakeStack {
    fn push<T: Topping>(&mut self, goop: T) -> PancakeResult<()> {
        goop.pour(&self);
        self.absorb_topping(goop)
    }
}

Type aliases can be generic:

type PancakeResult<T> = Result<T, PancakeError>;

There are generic traits.

All the features introduced—bounds, where clauses, lifetime parameters, and so forth—can be used on all generic items, not just functions.
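
Since generic structs, enums, and traits are mentioned here without code, a compact sketch may help. Every name below (Pair, Either, ConvertTo, parse_or_keep) is hypothetical and chosen only for illustration.

```rust
/// A generic struct: both fields share one type parameter.
struct Pair<T> {
    first: T,
    second: T,
}

impl<T: PartialOrd + Copy> Pair<T> {
    fn larger(&self) -> T {
        if self.first >= self.second { self.first } else { self.second }
    }
}

/// A generic enum with two type parameters.
#[derive(Debug, PartialEq)]
enum Either<L, R> {
    Left(L),
    Right(R),
}

/// A generic trait: one type may implement `ConvertTo<T>` for several `T`.
trait ConvertTo<T> {
    fn convert(&self) -> T;
}

impl ConvertTo<f64> for i32 {
    fn convert(&self) -> f64 {
        f64::from(*self)
    }
}

impl ConvertTo<String> for i32 {
    fn convert(&self) -> String {
        self.to_string()
    }
}

fn parse_or_keep(text: &str) -> Either<i32, String> {
    match text.parse::<i32>() {
        Ok(n) => Either::Left(n),
        Err(_) => Either::Right(text.to_string()),
    }
}
```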

Which to Use

The choice of whether to use trait objects or generic code is subtle. Since both features are based on traits, they have a lot in common.

Trait objects are the right choice whenever you need a collection of values of mixed types, all together. It is technically possible to make generic salad:

trait Vegetable {
    // ...
}

struct Salad<V: Vegetable> {
    veggies: Vec<V>
}

However, this is a rather severe design. Each such salad consists entirely of a single type of vegetable.

Since Vegetable values can be all different sizes, we can’t ask Rust for a Vec<dyn Vegetable>:

struct Salad {
    veggies: Vec<dyn Vegetable> // error: `dyn Vegetable` does
                                // not have a constant size
}

Trait objects are the solution:

struct Salad {
    veggies: Vec<Box<dyn Vegetable>>
}

Each Box<dyn Vegetable> can own any type of vegetable, but the box itself has a constant size—two pointers—suitable for storing in a vector.
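
A mixed salad can be sketched end to end as follows. The name method, the Carrot and Lettuce types, and the helper functions are assumptions added for illustration; only Vegetable and Salad come from the text.

```rust
trait Vegetable {
    /// Hypothetical method so the example has something to call.
    fn name(&self) -> String;
}

struct Carrot;

struct Lettuce {
    leaves: u32,
}

impl Vegetable for Carrot {
    fn name(&self) -> String {
        "carrot".to_string()
    }
}

impl Vegetable for Lettuce {
    fn name(&self) -> String {
        format!("lettuce ({} leaves)", self.leaves)
    }
}

struct Salad {
    veggies: Vec<Box<dyn Vegetable>>,
}

impl Salad {
    /// Each call to `name` is dispatched through the element's vtable.
    fn ingredients(&self) -> Vec<String> {
        self.veggies.iter().map(|v| v.name()).collect()
    }
}

fn mixed_salad() -> Salad {
    Salad {
        // The field type says which element type is expected, so each
        // concrete Box is converted to Box<dyn Vegetable>.
        veggies: vec![Box::new(Carrot), Box::new(Lettuce { leaves: 4 })],
    }
}
```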

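Rust performs this conversion implicitly wherever the expected type is known, such as when passing an argument or assigning to an annotated variable. It does not convert a whole collection after the fact: an existing Vec<Box<T>> is not a Vec<Box<dyn Vegetable>>, so each element has to be converted to the required type as it goes into the vector.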

Another possible reason to use trait objects is to reduce the total amount of compiled code. Rust may have to compile a generic function many times, once for each type it’s used with. This could make the binary large, a phenomenon called code bloat in C++ circles. These days, memory is plentiful, and most of us have the luxury of ignoring code size; but constrained environments do exist.

Outside of situations involving salad or low-resource environments, generics have three important advantages over trait objects, with the result that in Rust, generics are the more common choice.

The first advantage is speed.
- Note the absence of the dyn keyword in generic function signatures. Because you specify the types at compile time, either explicitly or through type inference, the compiler knows exactly which write method to call. The dyn keyword isn’t used because there are no trait objects—and thus no dynamic dispatch— involved.
  /// Given two values, pick whichever one is less.
  fn min<T: Ord>(value1: T, value2: T) -> T {
      if value1 <= value2 {
          value1
      } else {
          value2
      }
  }
  - The generic min() function is just as fast as if we had written the separate functions min_u8, min_i64, min_string, and so on. The compiler can inline it, like any other function, so in a release build, a call to min::<i32> is likely just two or three instructions. A call with constant arguments, like min(5, 3), will be even faster: Rust can evaluate it at compile time, so that there’s no run-time cost at all.
- In the following generic function call, std::io::sink() returns a writer of type Sink that quietly discards all bytes written to it.
  let mut sink = std::io::sink();
  say_hello(&mut sink)?;
  - When Rust generates machine code for this, it could emit code that calls Sink::write_all, checks for errors, and then calls Sink::flush. That’s what the body of the generic function says to do.
  - Or, Rust could look at those methods and realize that Sink::write_all() does nothing; Sink::flush() does nothing; neither method ever returns an error. Rust has all the information it needs to optimize away this function call entirely.
  - Compare that to the behavior with trait objects. Rust never knows what type of value a trait object points to until run time. So even if you pass a Sink, the overhead of calling virtual methods and checking for errors still applies.
The second advantage is that not every trait can support trait objects.
- Traits support several features, such as associated functions, that work only with generics: they rule out trait objects entirely.
The third advantage is that it’s easy to bound a generic type parameter with several traits at once.
- Types like &mut (dyn Debug + Hash + Eq) aren’t supported in Rust. You can work around this with subtraits.
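
The subtrait workaround can be sketched as follows. Hash and Eq cannot be used in trait objects at all, so this example combines Debug and Display instead; the trait Printable and the function show are hypothetical names.

```rust
use std::fmt::{Debug, Display};

/// A subtrait that bundles two traits under one name.
trait Printable: Debug + Display {}

/// Blanket impl: anything that is Debug + Display is Printable.
impl<T: Debug + Display> Printable for T {}

/// `&dyn Printable` gives access to both supertraits through one trait object.
fn show(value: &dyn Printable) -> String {
    format!("{} / {:?}", value, value)
}
```

The generic equivalent, fn show<T: Debug + Display>(value: &T), needs no helper trait, which is the convenience the third advantage refers to.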

Defining and Implementing Traits

Defining a trait is simple. Give it a name and list the type signatures of the trait methods. To implement a trait, use the syntax impl TraitName for Type:

/// A trait for characters, items, and scenery -
/// anything in the game world that's visible on screen.
trait Visible {
    /// Render this object on the given canvas.
    fn draw(&self, canvas: &mut Canvas);

    /// Return true if clicking at (x, y) should
    /// select this object.
    fn hit_test(&self, x: i32, y: i32) -> bool;
}

impl Visible for Broom {
    fn draw(&self, canvas: &mut Canvas) {
        for y in self.y - self.height - 1 .. self.y {
            canvas.write_at(self.x, y, '|');
        }
        canvas.write_at(self.x, self.y, 'M');
    }

    fn hit_test(&self, x: i32, y: i32) -> bool {
        self.x == x
        && self.y - self.height - 1 <= y
        && y <= self.y
    }
}

This impl contains an implementation for each method of the Visible trait, and nothing else. Everything defined in a trait impl must actually be a feature of the trait; if we wanted to add a helper method in support of Broom::draw(), we would have to define it in a separate impl block. These helper functions can be used within the trait impl blocks:

impl Broom {
    /// Helper function used by Broom::draw() below.
    fn broomstick_range(&self) -> Range<i32> {
        self.y - self.height - 1 .. self.y
    }
}

impl Visible for Broom {
    fn draw(&self, canvas: &mut Canvas) {
        for y in self.broomstick_range() {
            // ...
        }
        // ...
    }

    // ...
}
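
For a version of this example that compiles on its own, the sketch below fills in the missing pieces. The Canvas type here is a stand-in invented for illustration (it just records what was drawn), and the Broom fields are inferred from how the code above uses them.

```rust
use std::ops::Range;

/// Stand-in canvas: records each character written and its position.
struct Canvas {
    cells: Vec<(i32, i32, char)>,
}

impl Canvas {
    fn write_at(&mut self, x: i32, y: i32, c: char) {
        self.cells.push((x, y, c));
    }
}

struct Broom {
    x: i32,
    y: i32,
    height: i32,
}

trait Visible {
    fn draw(&self, canvas: &mut Canvas);
    fn hit_test(&self, x: i32, y: i32) -> bool;
}

impl Broom {
    /// Helper defined in an inherent impl, outside the trait impl.
    fn broomstick_range(&self) -> Range<i32> {
        self.y - self.height - 1..self.y
    }
}

impl Visible for Broom {
    fn draw(&self, canvas: &mut Canvas) {
        for y in self.broomstick_range() {
            canvas.write_at(self.x, y, '|');
        }
        canvas.write_at(self.x, self.y, 'M');
    }

    fn hit_test(&self, x: i32, y: i32) -> bool {
        self.x == x && self.y - self.height - 1 <= y && y <= self.y
    }
}
```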

Default Methods

The Sink writer type can be implemented in a few lines of code:

/// A Writer that ignores whatever data you write to it.
pub struct Sink;

use std::io::{Write, Result};

impl Write for Sink {
    fn write(&mut self, buf: &[u8]) -> Result<usize> {
        // Claim to have successfully written the whole buffer.
        Ok(buf.len())
    }

    fn flush(&mut self) -> Result<()> {
        Ok(())
    }
}

Sink is an empty struct, since we don’t need to store any data in it.

The Write trait has a write_all method:

let mut out = Sink;
out.write_all(b"hello world\n")?;

The reason Rust let us impl Write for Sink without defining this method is that the standard library’s definition of the Write trait contains a default implementation for write_all:

trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;

    fn write_all(&mut self, buf: &[u8]) -> Result<()> {
        let mut bytes_written = 0;
        while bytes_written < buf.len() {
            bytes_written += self.write(&buf[bytes_written..])?;
        }
        Ok(())
    }

    // ...
}

The write and flush methods are the basic methods that every writer must implement. A writer may also implement write_all, but if not, the default implementation will be used. Your own traits can include default implementations using the same syntax.

The most dramatic use of default methods in the standard library is the Iterator trait, which has one required method (.next()) and dozens of default methods.
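
The same pattern in a trait of your own might look like the sketch below. Greeter, Plain, and Pirate are hypothetical names: one implementor relies on the default method and the other overrides it.

```rust
trait Greeter {
    /// Required method: every implementor must supply it.
    fn name(&self) -> String;

    /// Default method, written in terms of the required one.
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct Plain;
struct Pirate;

impl Greeter for Plain {
    fn name(&self) -> String {
        "world".to_string()
    }
    // `greet` is inherited from the trait's default.
}

impl Greeter for Pirate {
    fn name(&self) -> String {
        "matey".to_string()
    }

    // Overrides the default implementation.
    fn greet(&self) -> String {
        format!("Ahoy, {}!", self.name())
    }
}
```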

Traits and Other People’s Types

Rust lets you implement any trait on any type, as long as either the trait or the type is introduced (new) in the current crate.

This means that any time you want to add a method to any type, you can use a trait to do it:

trait IsEmoji {
    fn is_emoji(&self) -> bool;
}

/// Implement IsEmoji for the built-in character type.
impl IsEmoji for char {
    fn is_emoji(&self) -> bool {
        // ...
    }
}

assert_eq!('$'.is_emoji(), false);

Like any other trait method, this new is_emoji method is only visible when IsEmoji is in scope.

The sole purpose of this particular trait is to add a method to an existing type, char. This is called an extension trait.
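
The body of is_emoji is elided above. One way to make the example runnable is the crude sketch below, which assumes, purely for illustration, that only characters in the Unicode Emoticons block (U+1F600 to U+1F64F) count as emoji. Real emoji detection covers far more than this.

```rust
trait IsEmoji {
    fn is_emoji(&self) -> bool;
}

impl IsEmoji for char {
    fn is_emoji(&self) -> bool {
        // Illustrative assumption: treat only the Emoticons block as emoji.
        ('\u{1F600}'..='\u{1F64F}').contains(self)
    }
}
```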

You can even use a generic impl block to add an extension trait to a whole family of types at once.

// The `self` in the braces brings the `io` module itself into scope,
// which is what lets us write `io::Result` below.
use std::io::{self, Write};

/// Trait for values to which you can send HTML.
trait WriteHtml {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()>;
}

/// You can write HTML to any std::io writer.
impl<W: Write> WriteHtml for W {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()> {
        // ...
    }
}

Implementing the trait for all writers makes it an extension trait, adding a method to all Rust writers. The line impl<W: Write> WriteHtml for W means “for every type W that implements Write, here’s an implementation of WriteHtml for W.”
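
A runnable sketch of this blanket impl follows. HtmlDocument is not defined in the text, so the single-field struct below is an assumption, as is the markup the method emits.

```rust
use std::io::{self, Write};

/// Hypothetical document type with just a body string.
struct HtmlDocument {
    body: String,
}

trait WriteHtml {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()>;
}

/// Every writer gets `write_html`, including writers defined in other crates.
impl<W: Write> WriteHtml for W {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()> {
        write!(self, "<html><body>{}</body></html>", html.body)
    }
}
```

With this in scope, a Vec<u8>, a File, or a TcpStream can all call .write_html(&doc) without any further impls.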

The serde library offers a nice example of how useful it can be to implement user-defined traits on standard types. serde is a serialization library. That is, you can use it to write Rust data structures to disk and reload them later. The library defines a trait, Serialize, that’s implemented for every data type the library supports. So in the serde source code, there is code implementing Serialize for bool, i8, i16, i32, array and tuple types, and so on, through all the standard data structures like Vec and HashMap.

The upshot of all this is that serde adds a .serialize() method to all these types. It can be used like this:

use std::collections::HashMap;
use std::fs::File;

use serde::Serialize;
use serde_json;

pub fn save_configuration(config: &HashMap<String, String>)
    -> std::io::Result<()>
{
    // Create a JSON serializer to write the data to a file.
    let writer = File::create(config_filename())?;
    let mut serializer = serde_json::Serializer::new(writer);

    // The serde `.serialize()` method does the rest.
    config.serialize(&mut serializer)?;

    Ok(())
}

When you implement a trait, either the trait or the type must be new in the current crate. This is called the orphan rule. It helps Rust ensure that trait implementations are unique. Your code can’t impl Write for u8, because both Write and u8 are defined in the standard library. If Rust let crates do that, there could be multiple implementations of Write for u8, in different crates, and Rust would have no reasonable way to decide which implementation to use for a given method call.
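
A small sketch of what the orphan rule permits and rejects. The `Meters` type and the `Describe` trait are hypothetical names introduced only for this illustration.

```rust
use std::fmt;

/// A type defined in this crate (hypothetical).
struct Meters(f64);

/// Allowed: foreign trait (`Display`), local type (`Meters`).
impl fmt::Display for Meters {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} m", self.0)
    }
}

/// A trait defined in this crate (hypothetical).
trait Describe {
    fn describe(&self) -> String;
}

/// Allowed: local trait (`Describe`), foreign type (`u8`).
impl Describe for u8 {
    fn describe(&self) -> String {
        format!("the byte {}", self)
    }
}

// Rejected by the orphan rule: both the trait and the type are foreign.
// impl std::io::Write for u8 { /* ... */ }
```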

C++ has a similar uniqueness restriction: the One Definition Rule. In typical C++ fashion, it isn’t enforced by the compiler, except in the simplest cases, and you get undefined behavior if you break it.

Self in Traits

A trait can use the keyword Self as a type.

The standard Clone looks like this (slightly simplified):

pub trait Clone {
    fn clone(&self) -> Self;
    // ...
}

Using Self as the return type here means that the type of x.clone() is the same as the type of x, whatever that might be. If x is a String, then the type of x.clone() is String—not dyn Clone or any other cloneable type.

Self is a concrete type.

The following code defines a trait with two implementations:

pub trait Spliceable {
    fn splice(&self, other: &Self) -> Self;
}

impl Spliceable for CherryTree {
    fn splice(&self, other: &Self) -> Self {
        // ...
    }
}
impl Spliceable for Mammoth {
    fn splice(&self, other: &Self) -> Self {
        // ...
    }
}

Inside the first impl Spliceable, Self is simply an alias for CherryTree, and in the second, it’s an alias for Mammoth. This means that we can splice together two cherry trees or two mammoths, not that we can create a mammoth-cherry hybrid. The type of self and the type of other must match.
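
A runnable version of the idea might look like the following. The `genes` field and the concatenation rule are assumptions; the original leaves the bodies elided.

```rust
pub trait Spliceable {
    fn splice(&self, other: &Self) -> Self;
}

/// Hypothetical representation: a tree is just its gene string.
#[derive(Debug, PartialEq)]
struct CherryTree {
    genes: String,
}

/// Hypothetical representation, same shape as `CherryTree`.
#[derive(Debug, PartialEq)]
struct Mammoth {
    genes: String,
}

impl Spliceable for CherryTree {
    fn splice(&self, other: &Self) -> Self {
        // Here `Self` is `CherryTree`. Assumed rule: concatenate the genes.
        CherryTree { genes: format!("{}{}", self.genes, other.genes) }
    }
}

impl Spliceable for Mammoth {
    fn splice(&self, other: &Self) -> Self {
        // Here `Self` is `Mammoth`.
        Mammoth { genes: format!("{}{}", self.genes, other.genes) }
    }
}

// A call such as `cherry.splice(&mammoth)` would not compile:
// `other` must have the same type as `self`.
```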

A trait that uses the Self type is incompatible with trait objects:

pub trait Spliceable {
    // the *type* of `self` and the type of `other` must match.
    fn splice(&self, other: &Self) -> Self;
}

// error: the trait `Spliceable` cannot be made into an object
fn splice_anything(left: &dyn Spliceable, right: &dyn Spliceable) {
    let combo = left.splice(right);
    // ...
}

Rust rejects this code because it has no way to type-check the call left.splice(right). The whole point of trait objects is that the type isn’t known until run time. Rust has no way to know at compile time if left and right will be the same type, as required (in the function signature).

The concrete types must be the same, and whether they are can only be determined at run time.

Trait objects are really intended for the simplest kinds of traits, the kinds that could be implemented using interfaces in Java or abstract base classes in C++. The more advanced features of traits are useful, but they can’t coexist with trait objects because with trait objects, you lose the type information Rust needs to type-check your program.

The following trait is compatible with trait objects:

pub trait MegaSpliceable {
    fn splice(&self, other: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable>;
}

There’s no problem type-checking calls to this .splice() method because the type of the argument other is not required to match the type of self, as long as both types are MegaSpliceable.

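At compile time, Rust can verify that both types implement the MegaSpliceable trait; it no longer needs to check whether the concrete types of self and other are the same.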
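
As a rough sketch, the trait-object-friendly design can be exercised like this. The `genes` accessor, the `Hybrid` type, and the `combine` helper are hypothetical additions so that the result can be inspected; they are not part of the trait shown above.

```rust
pub trait MegaSpliceable {
    /// Hypothetical accessor, added only so results can be inspected.
    fn genes(&self) -> String;

    fn splice(&self, other: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable>;
}

/// Assumed result of splicing any two organisms.
struct Hybrid {
    genes: String,
}

/// Shared helper: combine the genes of any two `MegaSpliceable` values.
fn combine(a: &dyn MegaSpliceable, b: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable> {
    Box::new(Hybrid { genes: format!("{}+{}", a.genes(), b.genes()) })
}

struct CherryTree;
struct Mammoth;

impl MegaSpliceable for CherryTree {
    fn genes(&self) -> String { "cherry".to_string() }
    fn splice(&self, other: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable> {
        combine(self, other)
    }
}

impl MegaSpliceable for Mammoth {
    fn genes(&self) -> String { "mammoth".to_string() }
    fn splice(&self, other: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable> {
        combine(self, other)
    }
}

impl MegaSpliceable for Hybrid {
    fn genes(&self) -> String { self.genes.clone() }
    fn splice(&self, other: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable> {
        combine(self, other)
    }
}

/// Type-checks even though the concrete types behind the references differ.
fn splice_anything(left: &dyn MegaSpliceable, right: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable> {
    left.splice(right)
}
```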

Subtraits

We can declare that a trait is an extension of another trait:

trait Creature: Visible {
    fn position(&self) -> (i32, i32);
    fn facing(&self) -> Direction;
    // ...
}

The phrase trait Creature: Visible means that all creatures are visible. Every type that implements Creature must also implement the Visible trait:

impl Visible for Broom {
    // ...
}

impl Creature for Broom {
    // ...
}

We can implement the two traits in either order, but it’s an error to implement Creature for a type without also implementing Visible. We say that Creature is a subtrait of Visible, and that Visible is Creature’s supertrait.

Subtraits resemble subinterfaces in Java or C#, in that users can assume that any value that implements a subtrait implements its supertrait as well. But in Rust, a subtrait does not inherit the associated items of its supertrait; each trait still needs to be in scope if you want to call its methods.

In fact, Rust’s subtraits are really just a shorthand for a bound on Self. A definition of Creature like this is exactly equivalent to the one shown earlier:

trait Creature where Self: Visible {
    // ...
}
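
A practical consequence is that generic code bounded only by the subtrait can call supertrait methods. The sketch below uses simplified, hypothetical method signatures (`draw` returning a `String`, and fields `x` and `y` on `Broom`) to keep it self-contained.

```rust
/// Simplified stand-in for the `Visible` trait (signature is an assumption).
trait Visible {
    fn draw(&self) -> String;
}

trait Creature: Visible {
    fn position(&self) -> (i32, i32);
}

/// Hypothetical fields for illustration.
struct Broom {
    x: i32,
    y: i32,
}

impl Visible for Broom {
    fn draw(&self) -> String {
        "broom".to_string()
    }
}

impl Creature for Broom {
    fn position(&self) -> (i32, i32) {
        (self.x, self.y)
    }
}

/// Only `Creature` appears in the bound, yet `draw` from `Visible` is callable,
/// because every `Creature` is guaranteed to be `Visible`.
fn describe<C: Creature>(creature: &C) -> String {
    let (x, y) = creature.position();
    format!("{} at ({}, {})", creature.draw(), x, y)
}
```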

Type-Associated Functions

All functions defined within an impl block are called associated functions because they’re associated with the type named after the impl. We can define associated functions that don’t have self as their first parameter (and thus are not methods) because they don’t need an instance of the type to work with.

Associated functions that aren’t methods are often used for constructors that will return a new instance of the struct.

In most object-oriented languages, interfaces can’t include static methods or constructors, but traits can include type-associated functions, Rust’s analog to static methods:

trait StringSet {
    /// Return a new empty set.
    fn new() -> Self;
    /// Return a set that contains all the strings in `strings`.
    fn from_slice(strings: &[&str]) -> Self;

    /// Find out if this set contains a particular `value`.
    fn contains(&self, string: &str) -> bool;
    /// Add a string to this set.
    fn add(&mut self, string: &str);
}

Every type that implements the StringSet trait must implement these four associated functions. The first two, new() and from_slice(), don’t take a self argument. They serve as constructors.

In nongeneric code, these functions can be called using :: syntax, just like any other type-associated function:

// Create sets of two hypothetical types that impl StringSet:
let set1 = SortedStringSet::new();
let set2 = HashedStringSet::new();

In generic code, it’s the same, except the type is often a type variable:

fn unknown_words<S: StringSet>(document: &[String], wordlist: &S) -> S {
    let mut unknowns = S::new();
    for word in document {
        if !wordlist.contains(word) {
            unknowns.add(word);
        }
    }
    unknowns
}

Like Java and C# interfaces, trait objects don’t support type-associated functions. If you want to use &dyn StringSet trait objects, you must change the trait, adding the bound where Self: Sized to each associated function that doesn’t take a self argument by reference:

trait StringSet {
    fn new() -> Self
        where Self: Sized;
    fn from_slice(strings: &[&str]) -> Self
        where Self: Sized;

    fn contains(&self, string: &str) -> bool;
    fn add(&mut self, string: &str);
}

This bound tells Rust that trait objects are excused from supporting this particular associated function. With these additions, StringSet trait objects are allowed; they still don’t support new or from_slice, but you can create them and use them to call .contains() and .add(). The same trick works for any other method that is incompatible with trait objects.
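
Here is a sketch of the revised trait used both ways. `HashedStringSet` is given a hypothetical `HashSet`-backed definition, and `add_if_missing` is an illustrative helper that is not in the original.

```rust
use std::collections::HashSet;

trait StringSet {
    fn new() -> Self
        where Self: Sized;
    fn from_slice(strings: &[&str]) -> Self
        where Self: Sized;

    fn contains(&self, string: &str) -> bool;
    fn add(&mut self, string: &str);
}

/// Hypothetical implementor backed by a standard `HashSet`.
struct HashedStringSet {
    items: HashSet<String>,
}

impl StringSet for HashedStringSet {
    fn new() -> Self {
        HashedStringSet { items: HashSet::new() }
    }
    fn from_slice(strings: &[&str]) -> Self {
        HashedStringSet { items: strings.iter().map(|s| s.to_string()).collect() }
    }
    fn contains(&self, string: &str) -> bool {
        self.items.contains(string)
    }
    fn add(&mut self, string: &str) {
        self.items.insert(string.to_string());
    }
}

/// Works through a trait object: only the methods that take `self` are used.
/// Returns true if the word was newly added.
fn add_if_missing(set: &mut dyn StringSet, word: &str) -> bool {
    if set.contains(word) {
        false
    } else {
        set.add(word);
        true
    }
}
```

The constructors remain available on the concrete type (`HashedStringSet::new()`); they are simply not callable through `dyn StringSet`.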

Fully Qualified Method Calls

All the ways for calling trait methods we’ve seen so far rely on Rust filling in some missing pieces.

Consider the following code:

// #1
"hello".to_string()

to_string refers to the to_string method of the ToString trait, of which we’re calling the str type’s implementation. So there are four players in this game: the trait, the method of that trait, the implementation of that method, and the value to which that implementation is being applied.

In some cases we need a way to say exactly which of these we mean. Fully qualified method calls provide it.

First of all, it helps to know that a method is just a special kind of function. These two calls are equivalent:

"hello".to_string()

str::to_string("hello")

The second form looks exactly like an associated function call. This works even though the to_string method takes a self argument. Simply pass self as the function’s first argument.

Since to_string is a method of the standard ToString trait, there are two more forms you can use:

ToString::to_string("hello")
<str as ToString>::to_string("hello")

All four of these method calls do exactly the same thing. Most often, you’ll just write value.method(). The other forms are qualified method calls. They specify the type or trait that a method is associated with. The last form, with the angle brackets, specifies both: a fully qualified method call.

When you write "hello".to_string(), using the . operator, you don’t say exactly which to_string method you’re calling. Rust has a method lookup algorithm that figures this out, depending on the types, deref coercions, and so on. With fully qualified calls, you can say exactly which method you mean, and that can help in a few odd cases:

When two methods have the same name from two different traits. The classic hokey example is the Outlaw with two .draw() methods from two different traits, one for drawing it on the screen and one for interacting with the law:

outlaw.draw(); // error: draw on screen or draw pistol?
Visible::draw(&outlaw); // ok: draw on screen
HasPistol::draw(&outlaw); // ok: corral

Usually you’re better off renaming one of the methods, but sometimes you can’t.

When the type of the self argument can’t be inferred:

let zero = 0;   // type unspecified; could be `i8`, `u8`, ...
zero.abs();     // error: can't call method `abs`
// on ambiguous numeric type
i64::abs(zero); // ok

When using the function itself as a function value:

let words: Vec<String> =
    line.split_whitespace()  // iterator produces &str values
        .map(ToString::to_string)  // ok
        .collect();

When calling trait methods in macros.
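
The name-conflict case from the list above can be sketched as a complete program fragment. The trait bodies and return strings are assumptions chosen so the two calls are distinguishable.

```rust
/// Simplified stand-ins for the two traits (signatures are assumptions).
trait Visible {
    fn draw(&self) -> String;
}

trait HasPistol {
    fn draw(&self) -> String;
}

struct Outlaw;

impl Visible for Outlaw {
    fn draw(&self) -> String {
        "drawn on screen".to_string()
    }
}

impl HasPistol for Outlaw {
    fn draw(&self) -> String {
        "pistol drawn".to_string()
    }
}

fn both_draws(outlaw: &Outlaw) -> (String, String) {
    // outlaw.draw();  // error: multiple applicable items in scope
    let on_screen = Visible::draw(outlaw);            // qualified by trait
    let pistol = <Outlaw as HasPistol>::draw(outlaw); // fully qualified
    (on_screen, pistol)
}
```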

Fully qualified syntax also works for associated functions.

S::new()
StringSet::new()
<S as StringSet>::new()

Traits That Define Relationships Between Types

So far, every trait we’ve looked at stands alone: a trait is a set of methods that types can implement. Traits can also be used in situations where there are multiple types that have to work together. They can describe relationships between types.

The std::iter::Iterator trait relates each iterator type with the type of value it produces.
The std::ops::Mul trait relates types that can be multiplied. In the expression a * b, the values a and b can be either the same type or different types.
The rand crate includes both a trait for random number generators (rand::Rng) and a trait for types that can be randomly generated (rand::Distribution). The traits themselves define exactly how these types work together.

The ways traits describe relationships between types can also be seen as ways of avoiding virtual method overhead and downcasts, since they allow Rust to know more concrete types at compile time.

Associated Types (or How Iterators Work)

Rust has a standard Iterator trait:

pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;
    // ...
}

The first feature of this trait, type Item;, is an associated type. Each type that implements Iterator must specify what type of item it produces.

The second feature, the next() method, uses the associated type in its return value: next() returns an Option<Self::Item>: either Some(item), the next value in the sequence, or None when there are no more values to visit. The type is written as Self::Item, not just plain Item, because Item is a feature of each type of iterator, not a standalone type. As always, self and the Self type show up explicitly in the code everywhere their fields, methods, and so on are used.

Here’s what it looks like to implement Iterator for a type:

// (code from the std::env standard library module)
impl Iterator for Args {
    type Item = String;
    
    fn next(&mut self) -> Option<String> {
        // ...
    }
    // ...
}
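
Since the body of the `Args` implementation is elided, a complete example may help. `Countdown` is a hypothetical type invented for this sketch; it is not part of the standard library.

```rust
/// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // Each implementation names the type of item it produces.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None // sequence exhausted
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}
```

Implementing `next` alone is enough to get the provided methods such as `collect`, `map`, and `enumerate`.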

Generic code can use associated types:

/// Loop over an iterator, storing the values in a new vector.
fn collect_into_vector<I: Iterator>(iter: I) -> Vec<I::Item> {
    let mut results = Vec::new();
    for value in iter {
        results.push(value);
    }
    results
}

Inside the body of this function, Rust infers the type of value for us; but we must spell out the return type of collect_into_vector, and the Item associated type is the only way to do that. Vec<I> would be simply wrong: we would be claiming to return a vector of iterators.

The following is another example of generic code that uses associated types:

/// Print out all the values produced by an iterator
fn dump<I>(iter: I)
    where I: Iterator
{
    for (index, value) in iter.enumerate() {
        println!("{}: {:?}", index, value); // error
    }
}
// error[E0277]: `<I as Iterator>::Item` doesn't implement `Debug`
//  --> src/main.rs:8:37
//   |
// 8 |         println!("{}: {:?}", index, value); // error
//   |                                     ^^^^^ `<I as Iterator>::Item` cannot be formatted using `{:?}` because it doesn't implement `Debug`
//   |
//   = help: the trait `Debug` is not implemented for `<I as Iterator>::Item`
//   = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
// help: consider further restricting the associated type
//   |
// 5 |     I: Iterator, <I as Iterator>::Item: Debug
//   |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

value might not be a printable type. We must ensure that I::Item implements the Debug trait, the trait for formatting values with {:?}.

<I as Iterator>::Item is an explicit but verbose way of saying I::Item.

We can do this by placing a bound on I::Item:

use std::fmt::Debug;

fn dump<I>(iter: I)
    where I: Iterator, I::Item: Debug
{
    // ...
}
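
With the body filled in, the bounded version compiles. The variant below, `dump_lines`, is a hypothetical adaptation that returns the formatted lines instead of printing them, so that its behavior is easy to check.

```rust
use std::fmt::Debug;

/// Format every value produced by an iterator as "index: value".
/// The `I::Item: Debug` bound is what makes `{:?}` legal here.
fn dump_lines<I>(iter: I) -> Vec<String>
    where I: Iterator, I::Item: Debug
{
    iter.enumerate()
        .map(|(index, value)| format!("{}: {:?}", index, value))
        .collect()
}
```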

Or, we could write, “I must be an iterator over String values”:

fn dump<I>(iter: I)
    where I: Iterator<Item=String>
{
    // ...
}

Iterator<Item=String> is itself a trait. If you think of Iterator as the set of all iterator types, then Iterator<Item=String> is a subset of Iterator: the set of iterator types that produce Strings. This syntax can be used anywhere the name of a trait can be used, including trait object types:

fn dump(iter: &mut dyn Iterator<Item=String>) {
    for (index, s) in iter.enumerate() {
        println!("{}: {:?}", index, s);
    }
}

Traits with associated types, like Iterator, are compatible with trait objects, but only if all the associated types are spelled out, as shown here (Item=String). Otherwise, the type of s could be anything, and again, Rust would have no way to type-check this code.

Associated types are generally useful whenever a trait needs to cover more than just methods:

In a thread pool library, a Task trait, representing a unit of work, could have an associated Output type.

A Pattern trait, representing a way of searching a string, could have an associated Match type, representing all the information gathered by matching the pattern to the string:

trait Pattern {
    type Match;

    fn search(&self, text: &str) -> Option<Self::Match>;
}

/// You can search a string for a particular character.
impl Pattern for char {
    /// A "match" is just the location where the
    /// character was found.
    type Match = usize;

    fn search(&self, string: &str) -> Option<usize> {
        // ...
    }
}

impl Pattern for RegExp would have a more elaborate Match type, probably a struct that would include the start and length of the match, the locations where parenthesized groups matched, and so on.

A library for working with relational databases might have a DatabaseConnection trait with associated types representing transactions, cursors, prepared statements, and so on.

Associated types are perfect for cases where each implementation has one specific related type: each type of Task produces a particular type of Output; each type of Pattern looks for a particular type of Match.
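
A runnable sketch of the `Pattern` idea follows. The body of `search` (using `str::find`) and the generic helper `search_all` are assumptions added for illustration.

```rust
trait Pattern {
    type Match;

    fn search(&self, text: &str) -> Option<Self::Match>;
}

/// You can search a string for a particular character.
impl Pattern for char {
    /// A "match" is just the byte offset where the character was found.
    type Match = usize;

    fn search(&self, text: &str) -> Option<usize> {
        // `str::find` returns the byte offset of the first occurrence, if any.
        text.find(*self)
    }
}

/// Generic helper (hypothetical): the element type of the result is
/// whatever `Match` type the pattern defines.
fn search_all<P: Pattern>(pattern: &P, texts: &[&str]) -> Vec<Option<P::Match>> {
    texts.iter().map(|text| pattern.search(text)).collect()
}
```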

Generic Traits (or How Operator Overloading Works)

Multiplication in Rust uses this trait:

/// std::ops::Mul, the trait for types that support `*`.
pub trait Mul<RHS=Self> {
    /// The resulting type after applying the `*` operator
    type Output;

    /// The method for the `*` operator
    fn mul(self, rhs: RHS) -> Self::Output;
}

Mul is a generic trait. The type parameter, RHS, is short for righthand side. Its instances Mul<f64>, Mul<String>, Mul<Size>, etc., are all different traits.

A single type—say, WindowSize—can implement both Mul<f64> and Mul<i32>, and many more. You would then be able to multiply a WindowSize by many other types. Each implementation would have its own associated Output type.

Generic traits get a special dispensation when it comes to the orphan rule: you can implement a foreign trait for a foreign type, so long as one of the trait’s type parameters is a type defined in the current crate. If you’ve defined WindowSize yourself, you can implement Mul<WindowSize> for f64, even though you didn’t define either Mul or f64. These implementations can even be generic, such as impl<T> Mul<WindowSize> for Vec<T>. This works because there’s no way any other crate could define Mul<WindowSize> on anything, and thus no way a conflict among implementations could arise. This is how crates like nalgebra define arithmetic operations on vectors.

The syntax RHS=Self means that RHS defaults to Self. If I write impl Mul for Complex, without specifying Mul’s type parameter, it means impl Mul<Complex> for Complex. In a bound, if I write where T: Mul, it means where T: Mul<T>.

In Rust, the expression lhs * rhs is shorthand for Mul::mul(lhs, rhs). So overloading the * operator in Rust is as simple as implementing the Mul trait.
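
As a sketch of both points (multiple `Mul` implementations and the orphan-rule dispensation), consider a hypothetical `WindowSize` with `width` and `height` fields; the scaling semantics are an assumption.

```rust
use std::ops::Mul;

/// Hypothetical local type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct WindowSize {
    width: f64,
    height: f64,
}

/// `WindowSize * f64` scales both dimensions.
impl Mul<f64> for WindowSize {
    type Output = WindowSize;

    fn mul(self, rhs: f64) -> WindowSize {
        WindowSize { width: self.width * rhs, height: self.height * rhs }
    }
}

/// `f64 * WindowSize`: a foreign trait on a foreign type, permitted because
/// the trait's type parameter (`WindowSize`) is defined in this crate.
impl Mul<WindowSize> for f64 {
    type Output = WindowSize;

    fn mul(self, rhs: WindowSize) -> WindowSize {
        rhs * self
    }
}
```

With these two impls in place, `size * 2.0`, `2.0_f64 * size`, and the explicit `Mul::mul(size, 2.0)` all produce the same scaled value.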

impl Trait

Combinations of many generic types can get messy:

use std::iter;
use std::vec::IntoIter;
// connect two iterators into a cycle
fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) ->
    iter::Cycle<iter::Chain<IntoIter<u8>, IntoIter<u8>>> {
        v.into_iter().chain(u.into_iter()).cycle()
}

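// For reference, the standard library types named in that return type
// (declarations as they appear in the std documentation, not part of this program):
// An iterator that moves out of a vector. This struct is created by the into_iter method on Vec (provided by the IntoIterator trait).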
// https://doc.rust-lang.org/std/vec/struct.IntoIter.html
pub struct IntoIter<T, A = Global>
where
    A: Allocator,
{ /* private fields */ }

// An iterator that links two iterators together, in a chain. This struct is created by Iterator::chain. 
// https://doc.rust-lang.org/std/iter/struct.Chain.html
pub struct Chain<A, B> { /* private fields */ }

// An iterator that repeats endlessly. This struct is created by the `cycle` method on `Iterator`.
// https://doc.rust-lang.org/std/iter/struct.Cycle.html
pub struct Cycle<I> { /* private fields */ }


fn main() {
    let vec = vec![1, 2, 3];
    let mut into_iter = vec.into_iter(); // move out of `vec`

    while let Some(value) = into_iter.next() {
        println!("{} {:?}", value, into_iter);
    }
}
// 1 IntoIter([2, 3])
// 2 IntoIter([3])
// 3 IntoIter([])

We could replace this hairy return type with a trait object:

fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> Box<dyn Iterator<Item=u8>> {
    Box::new(v.into_iter().chain(u.into_iter()).cycle())
}

However, taking the overhead of dynamic dispatch and an unavoidable heap allocation every time this function is called just to avoid an ugly type signature doesn’t seem like a good trade, in most cases.

Rust has a feature called impl Trait designed for precisely this situation. impl Trait allows us to “erase” the type of a return value, specifying only the trait or traits it implements, without dynamic dispatch or a heap allocation:

fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> impl Iterator<Item=u8> {
    v.into_iter().chain(u.into_iter()).cycle()
}

Now, rather than specifying a particular nested type of iterator combinator structs, cyclical_zip’s signature just states that it returns some kind of iterator over u8. The return type expresses the intent of the function, rather than its implementation details.

impl Trait is more than just a convenient shorthand. Using impl Trait means that you can change the actual type being returned in the future as long as it still implements Iterator<Item=u8>, and any code calling the function will continue to compile without an issue. This provides a lot of flexibility for library authors, because only the relevant functionality is encoded in the type signature.
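
From the caller's side, the opaque return type behaves like any other iterator. In this sketch, `first_n` is a hypothetical helper that relies only on the `Iterator<Item = u8>` promise.

```rust
fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> impl Iterator<Item = u8> {
    v.into_iter().chain(u.into_iter()).cycle()
}

/// Take the first `n` items of the endless cycle over [1, 2] followed by [3].
/// Nothing here names `Cycle`, `Chain`, or `IntoIter`.
fn first_n(n: usize) -> Vec<u8> {
    cyclical_zip(vec![1, 2], vec![3]).take(n).collect()
}
```

If `cyclical_zip` were later rewritten with a different combinator chain, `first_n` would keep compiling unchanged.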

It might be tempting to use impl Trait to approximate a statically dispatched version of the factory pattern that’s commonly used in object-oriented languages. Consider the following trait:

trait Shape {
    fn new() -> Self;
    fn area(&self) -> f64;
}

After implementing it for a few types, you might want to use different Shapes depending on a run-time value, like a string that a user enters. This doesn’t work with impl Shape as the return type:

fn make_shape(shape: &str) -> impl Shape {
    match shape {
        "circle" => Circle::new(),
        "triangle" => Triangle::new(), // error: incompatible types
        "shape" => Rectangle::new(),
    }
}

From the perspective of the caller, a function like this doesn’t make much sense. impl Trait is a form of static dispatch, so the compiler has to know the type being returned from the function at compile time in order to allocate the right amount of space on the stack and correctly access fields and methods on that type. Here, it could be Circle, Triangle, or Rectangle, which could all take up different amounts of space and all have different implementations of area().
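
When the choice really does depend on a run-time value, a trait object is one way to express it. The sketch below is an assumption-laden variant, not the original design: `Shape` is reduced to `area()` so that it works with `dyn`, the struct fields and fixed dimensions are hypothetical, and unknown names yield `None`.

```rust
/// Reduced version of the trait: no `new() -> Self`, so it works with `dyn`.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Rectangle {
    width: f64,
    height: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

/// Every arm now has the same type, `Option<Box<dyn Shape>>`,
/// at the cost of a heap allocation and dynamic dispatch.
fn make_shape(shape: &str) -> Option<Box<dyn Shape>> {
    match shape {
        "circle" => Some(Box::new(Circle { radius: 1.0 })),
        "rectangle" => Some(Box::new(Rectangle { width: 2.0, height: 3.0 })),
        _ => None,
    }
}
```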

When this material was written, Rust didn’t allow trait methods to use impl Trait return values; only free functions and functions associated with specific types could. Since Rust 1.75, trait methods can return impl Trait as well.

impl Trait can also be used in functions that take generic arguments. Consider this simple generic function:

use std::fmt::Display;

fn print<T: Display>(val: T) {
    println!("{}", val);
}

It is identical to this version using impl Trait:

fn print(val: impl Display) {
    println!("{}", val);
}

There is one important exception. Using generics allows callers of the function to specify the type of the generic arguments, like print::<i32>(42), while using impl Trait does not.

Each impl Trait argument is assigned its own anonymous type parameter, so impl Trait for arguments is limited to only the simplest generic functions, with no relationships between the types of arguments.
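
The differences can be seen side by side in a short sketch. The function names `show`, `show_impl`, and `larger` are hypothetical.

```rust
use std::fmt::Display;

/// Named type parameter: callers may write `show::<i32>(42)`.
fn show<T: Display>(val: T) -> String {
    format!("{}", val)
}

/// Anonymous type parameter: `show_impl::<i32>(42)` is rejected,
/// but ordinary calls behave the same as `show`.
fn show_impl(val: impl Display) -> String {
    format!("{}", val)
}

/// A relationship between argument types needs a named parameter:
/// both arguments must be the same type `T`. Two `impl PartialOrd`
/// arguments would get two unrelated anonymous types instead.
fn larger<T: PartialOrd>(a: T, b: T) -> T {
    if a >= b { a } else { b }
}
```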

Associated Consts

Like structs and enums, traits can have associated constants. You can declare a trait with an associated constant using the same syntax as for a struct or enum:

trait Greet {
    const GREETING: &'static str = "Hello";
    fn greet(&self) -> String;
}
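
An implementor can either rely on the default value or override it. In this sketch, the `English` and `French` types are hypothetical.

```rust
trait Greet {
    const GREETING: &'static str = "Hello";
    fn greet(&self) -> String;
}

struct English;
struct French;

impl Greet for English {
    // No `GREETING` here, so the trait's default ("Hello") applies.
    fn greet(&self) -> String {
        format!("{}!", Self::GREETING)
    }
}

impl Greet for French {
    // Override the default constant.
    const GREETING: &'static str = "Bonjour";

    fn greet(&self) -> String {
        format!("{}!", Self::GREETING)
    }
}
```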

Like associated types and functions, you can declare them but not give them a value. Then, implementors of the trait can define these values. This allows you to write generic code that uses these values:

use std::ops::Add;

trait Float {
    const ZERO: Self;
    const ONE: Self;
}

impl Float for f32 {
    const ZERO: f32 = 0.0;
    const ONE: f32 = 1.0;
}

impl Float for f64 {
    const ZERO: f64 = 0.0;
    const ONE: f64 = 1.0;
}

fn add_one<T: Float + Add<Output=T>>(value: T) -> T {
    value + T::ONE
}

Associated constants can’t be used with trait objects, since the compiler relies on type information about the implementation in order to pick the right value at compile time.

Even a simple trait with no behavior at all, like Float, can give enough information about a type, in combination with a few operators, to implement common mathematical functions like Fibonacci:

fn fib<T: Float + Add<Output=T>>(n: usize) -> T {
    match n {
        0 => T::ZERO,
        1 => T::ONE,
        n => fib::<T>(n - 1) + fib::<T>(n - 2)
    }
}

In the last two sections, we’ve shown different ways traits can describe relationships between types. All of these can also be seen as ways of avoiding virtual method overhead and downcasts, since they allow Rust to know more concrete types at compile time.

Reverse-Engineering Bounds

Writing generic code can be a real slog when there’s no single trait that does everything you need. Suppose we have written this nongeneric function to do some computation:

fn dot(v1: &[i64], v2: &[i64]) -> i64 {
    let mut total = 0;
    for i in 0 .. v1.len() {
        total = total + v1[i] * v2[i];
    }
    total
}

Now we want to use the same code with floating-point values. We might try something like this:

fn dot<N>(v1: &[N], v2: &[N]) -> N {
    let mut total: N = 0;
    for i in 0 .. v1.len() {
        total = total + v1[i] * v2[i];
    }
    total
}

Rust complains about the use of * and the type of 0. We can require N to be a type that supports + and * using the Add and Mul traits. Our use of 0 needs to change, though, because 0 is always an integer in Rust; the corresponding floating-point value is 0.0. Fortunately, there is a standard Default trait for types that have default values. For numeric types, the default is always 0:

use std::ops::{Add, Mul};

fn dot<N: Add + Mul + Default>(v1: &[N], v2: &[N]) -> N {
    let mut total = N::default();
    for i in 0 .. v1.len() {
        total = total + v1[i] * v2[i];
    }
    total
}
// error[E0308]: mismatched types
// | fn dot<N: Add + Mul + Default>(v1: &[N], v2: &[N]) -> N {
// |        - this type parameter
// |     let mut total = N::default();
// |                     ------------ expected due to this value
// |     for i in 0..v1.len() {
// |         total = total + v1[i] * v2[i];
// |                 ^^^^^^^^^^^^^^^^^^^^^ expected type parameter `N`, found associated type
// |
//   = note: expected type parameter `N`
//             found associated type `<N as Add>::Output`
// help: consider further restricting this bound
// |
// | fn dot<N: Add + Mul + Default + Add<Output = N>>(v1: &[N], v2: &[N]) -> N {
// |                               +++++++++++++++++

Our new code assumes that multiplying two values of type N produces another value of type N. This isn’t necessarily the case. You can overload the multiplication operator to return whatever type you want. We need to somehow tell Rust that this generic function only works with types that have the normal flavor of multiplication, where multiplying N * N returns an N. The suggestion in the error message is almost right: we can do this by replacing Mul with Mul<Output=N>, and the same for Add:

fn dot<N: Add<Output=N> + Mul<Output=N> + Default>(v1: &[N], v2: &[N]) -> N
{
    // ...
}

At this point, the bounds are starting to pile up, making the code hard to read. Let’s move the bounds into a where clause:

// Equivalent to the previous signature, with the bounds in a `where` clause:
fn dot<N>(v1: &[N], v2: &[N]) -> N
    where N: Add<Output=N> + Mul<Output=N> + Default
{
    // ...
}
// With the loop body restored, Rust still complains:
// error[E0508]: cannot move out of type `[N]`, a non-copy slice
//  --> src/main.rs:9:25
// |
// |         total = total + v1[i] * v2[i];
// |                         ^^^^^
// |                         |
// |                         cannot move out of here
// |                         move occurs because `v1[_]` has type `N`, which does not implement the `Copy` trait

Since we haven’t required N to be a copyable type, Rust interprets v1[i] as an attempt to move a value out of the slice, which is forbidden. But we don’t want to modify the slice at all; we just want to copy the values out to operate on them. Fortunately, all of Rust’s built-in numeric types implement Copy, so we can simply add that to our constraints on N:

use std::ops::{Add, Mul};

fn dot<N>(v1: &[N], v2: &[N]) -> N
    where N: Add<Output=N> + Mul<Output=N> + Default + Copy
{
    let mut total = N::default();
    for i in 0 .. v1.len() {
        total = total + v1[i] * v2[i];
    }
    total
}

#[test]
fn test_dot() {
    assert_eq!(dot(&[1, 2, 3, 4], &[1, 1, 1, 1]), 10);
    assert_eq!(dot(&[53.0, 7.0], &[1.0, 5.0]), 88.0);
}

This occasionally happens in Rust: there is a period of intense arguing with the compiler, at the end of which the code looks rather nice, as if it had been a breeze to write, and runs beautifully.

What we’ve been doing here is reverse-engineering the bounds on N, using the compiler to guide and check our work. The reason it was a bit of a pain is that there wasn’t a single Number trait in the standard library that included all the operators and methods we wanted to use. As it happens, there’s a popular open source crate called num that defines such a trait! Had we known, we could have added num to our Cargo.toml and written:

use num::Num;

fn dot<N: Num + Copy>(v1: &[N], v2: &[N]) -> N {
    let mut total = N::zero();
    for i in 0 .. v1.len() {
        total = total + v1[i] * v2[i];
    }
    total
}

Just as the right interface makes everything nice in object-oriented programming, the right trait makes everything nice in generic programming.

Rust’s designers chose not to make generics work like C++ templates, where the constraints are left implicit in the code, à la “duck typing”. Requiring explicit bounds has several advantages.

One advantage of Rust’s approach is forward compatibility of generic code. You can change the implementation of a public generic function or method, and if you didn’t change the signature, you haven’t broken any of its users.

Another advantage of bounds is that when you do get a compiler error, at least the compiler can tell you where the trouble is. C++ compiler error messages involving templates can be much longer than Rust’s, pointing at many different lines of code, because the compiler has no way to tell who’s to blame for a problem: the template, or its caller, which might also be a template, or that template’s caller…

Perhaps the most important advantage of writing out the bounds explicitly is simply that they are there, in the code and in the documentation. You can look at the signature of a generic function in Rust and see exactly what kind of arguments it accepts. The same can’t be said for templates. The work that goes into fully documenting argument types in C++ libraries like Boost is even more arduous than what we went through here. The Boost developers don’t have a compiler that checks their work.

Traits as a Foundation

Traits are one of the main organizing features in Rust, and with good reason. There’s nothing better to design a program or library around than a good interface.

References

Programming Rust, 2nd Edition (Covers the Rust 2021 Edition)
https://doc.rust-lang.org/book/ch05-03-method-syntax.html#associated-functions
https://doc.rust-lang.org/book/ch17-02-trait-objects.html
