Programmable Build System API
In this section, we will program the core API of the programmatic incremental build system. Although we are primarily concerned with programmability in this chapter, we must design the API to support incrementality!
The unit of computation in a programmatic build system is a task. A task is kind of like a closure: a value that can be executed to produce its output, but incremental. To provide incrementality, we also need to keep track of the dynamic dependencies that tasks make while they are executing. Therefore, tasks are executed under an incremental build context, enabling them to create these dynamic dependencies.
Tasks require files through the build context, creating a dynamic file dependency, ensuring the task gets re-executed when that file changes. Tasks also require other tasks through the build context, asking the build context to provide the consistent (most up-to-date) output of that task, and creating a dynamic task dependency to it.
It is then up to the build context to check if it actually needs to execute that required task. If the required task is already consistent, the build context can just return the cached output of that task. Otherwise, the build context executes the required task, caches its output, and returns the output to the requiring task. A non-incremental context can naively execute tasks without checking.
Because tasks require other tasks through the context, and the context selectively executes tasks, the definitions of task and context are mutually recursive.
In this tutorial, we will be using the words context, build context, incremental build context, and build system interchangeably, typically using just context as it is concise.
Let’s make tasks and contexts more concrete by defining them in code.
API Implementation
Since we want users of the build system to implement their own tasks, we will define Task as a trait.
Likewise, we will also be implementing multiple contexts in this tutorial, so we will also define Context as a trait.
Add the following code to your pie/src/lib.rs file:
use std::fmt::Debug;
use std::hash::Hash;

/// A unit of computation in a programmatic incremental build system.
pub trait Task: Clone + Eq + Hash + Debug {
    /// Type of output this task returns when executed.
    type Output: Clone + Eq + Debug;
    /// Execute the task, using `context` to specify dynamic dependencies, returning `Self::Output`.
    fn execute<C: Context<Self>>(&self, context: &mut C) -> Self::Output;
}

/// Programmatic incremental build context, enabling tasks to create dynamic dependencies that context implementations
/// use for incremental execution.
pub trait Context<T: Task> {
    /// Requires given `task`, recording a dependency and selectively executing it. Returns its up-to-date output.
    fn require_task(&mut self, task: &T) -> T::Output;
}
If this seems overwhelming to you, don’t worry. We will go through the API and explain things. But more importantly, the API should become more clear once we implement it in the next section and subsequent chapters.
Furthermore, if you’re new to Rust and/or need help understanding certain concepts, I will try to explain them in Rust Help blocks. They are collapsed by default to reduce distraction; clicking the header opens them. See the first Rust Help block at the end of this section.
The Task trait has several supertraits that we will need later in the tutorial to implement incrementality:
- Eq and Hash: to check whether a task is equal to another one, and to create a hash of it, so we can use a HashMap to get the output of a task if it is up-to-date.
- Clone: to create a clone of the task so that we can store it in the HashMap without having ownership of it.
- Debug: to format the task for debugging purposes.
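To make the role of these supertraits a bit more tangible, here is a minimal sketch that derives them for a task-like type and uses a HashMap as an output cache. The ToUpper type and the cached_output function are hypothetical names invented for this illustration; they are not part of the API we are building.

```rust
use std::collections::HashMap;

/// Hypothetical task-like type, used only for this illustration.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct ToUpper(String);

/// Returns the cached output of `task`, computing and storing it when it is missing.
fn cached_output(cache: &mut HashMap<ToUpper, String>, task: &ToUpper) -> String {
    if let Some(output) = cache.get(task) {
        // `Eq` and `Hash` let the map find an equal task that was stored before.
        return output.clone();
    }
    let output = task.0.to_uppercase();
    // `Clone` lets the cache keep its own copy of the borrowed task as the key.
    cache.insert(task.clone(), output.clone());
    output
}

fn main() {
    let mut cache = HashMap::new();
    let task = ToUpper("hello".to_string());
    let first = cached_output(&mut cache, &task);
    let second = cached_output(&mut cache, &task);
    assert_eq!(first, second);
    // `Debug` lets us print the task with `{:?}`.
    println!("{:?} -> {}", task, first);
}
```

Note that we only hold a reference to the task inside cached_output, which is why the clone is needed before the task can be stored as a key.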
A Task has a single method execute, which takes a reference to itself (&self), and a mutable reference to a context (context: &mut C), and produces a value of type Self::Output.
Because Context is a trait, we use generics (<C: Context<Self>>) to have execute work for any Context implementation (ignoring the Self part for now).
The execute method takes self by reference such that a task can access its data, but not mutate it, as that could throw off incrementality by changing the hash/equality of the task.
Finally, the type of output of a task is defined by the Output associated type, and this type must implement Clone, Eq, and Debug for the same reason as Task.
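As a rough preview of what implementing Task could look like, the sketch below defines a task and executes it. The Greet task and the AlwaysExecute context are hypothetical, made up for this example; the two trait definitions are repeated only so that the block compiles on its own.

```rust
use std::fmt::Debug;
use std::hash::Hash;

// Trait definitions repeated from this section so the example is self-contained.
pub trait Task: Clone + Eq + Hash + Debug {
    type Output: Clone + Eq + Debug;
    fn execute<C: Context<Self>>(&self, context: &mut C) -> Self::Output;
}

pub trait Context<T: Task> {
    fn require_task(&mut self, task: &T) -> T::Output;
}

/// Hypothetical task that builds a greeting from its own data.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct Greet {
    name: String,
}

impl Task for Greet {
    type Output = String;

    fn execute<C: Context<Self>>(&self, _context: &mut C) -> Self::Output {
        // Reads `self.name` through the shared reference; mutating it is not possible here.
        format!("Hello, {}!", self.name)
    }
}

/// Hypothetical context that always executes the required task, without any caching.
struct AlwaysExecute;

impl<T: Task> Context<T> for AlwaysExecute {
    fn require_task(&mut self, task: &T) -> T::Output {
        task.execute(self)
    }
}

fn main() {
    let task = Greet { name: "World".to_string() };
    let mut context = AlwaysExecute;
    let output = context.require_task(&task);
    assert_eq!(output, "Hello, World!");
    println!("{output}");
}
```

Greet does not use its context at all, which is fine for a task without dependencies; the parameter is still required by the trait.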
The Context trait is generic over Task, allowing it to work with any task implementation.
It has a single method require_task for creating a dependency to a task and returning its consistent (up-to-date) result.
It takes a mutable reference to itself, enabling dynamic dependency tracking and caching, which require mutation.
Because of this, the context reference passed to Task::execute is also mutable.
This Task and Context API mirrors the mutually recursive definition of task and context we discussed earlier, and forms the basis for the entire build system.
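The mutual recursion can be sketched with a task that requires other tasks and a context that caches outputs. Everything besides the two traits is hypothetical: Calc and CachingContext are names invented for this example, and the context makes the simplifying assumption that a task is consistent as soon as it has been executed once. A real incremental context also has to check the task's dependencies.

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::hash::Hash;

// Trait definitions repeated from this section so the example is self-contained.
pub trait Task: Clone + Eq + Hash + Debug {
    type Output: Clone + Eq + Debug;
    fn execute<C: Context<Self>>(&self, context: &mut C) -> Self::Output;
}

pub trait Context<T: Task> {
    fn require_task(&mut self, task: &T) -> T::Output;
}

/// Hypothetical task: either a constant, or the sum of two other tasks.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
enum Calc {
    Constant(i64),
    Sum(Box<Calc>, Box<Calc>),
}

impl Task for Calc {
    type Output = i64;

    fn execute<C: Context<Self>>(&self, context: &mut C) -> Self::Output {
        match self {
            Calc::Constant(value) => *value,
            // The task calls back into the context to require its sub-tasks.
            Calc::Sum(left, right) => context.require_task(left) + context.require_task(right),
        }
    }
}

/// Hypothetical context. Assumption: "executed before" counts as "consistent".
struct CachingContext<T: Task> {
    cache: HashMap<T, T::Output>,
    executions: usize,
}

impl<T: Task> Context<T> for CachingContext<T> {
    fn require_task(&mut self, task: &T) -> T::Output {
        if let Some(output) = self.cache.get(task) {
            return output.clone(); // Cached: no need to execute the task.
        }
        self.executions += 1;
        // The context calls back into the task, passing itself along.
        let output = task.execute(self);
        self.cache.insert(task.clone(), output.clone());
        output
    }
}

fn main() {
    let shared = Calc::Sum(Box::new(Calc::Constant(1)), Box::new(Calc::Constant(2)));
    let top = Calc::Sum(Box::new(shared.clone()), Box::new(shared));
    let mut context: CachingContext<Calc> = CachingContext { cache: HashMap::new(), executions: 0 };
    assert_eq!(context.require_task(&top), 6);
    // `top`, `shared`, and the two constants each execute once; the second `shared` is cached.
    assert_eq!(context.executions, 4);
    assert_eq!(context.require_task(&top), 6);
    assert_eq!(context.executions, 4);
}
```

Here require_task calls execute, and execute calls require_task again for every sub-task, which is the recursion described above.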
Build the project by running cargo build.
The output should look something like:
Compiling pie v0.1.0 (/pie)
Finished dev [unoptimized + debuginfo] target(s) in 0.03s
In the next section, we will implement a non-incremental Context and test it against Task implementations.
Rust Help: Modules, Imports, Ownership, Traits, Methods, Supertraits, Associated Types, Visibility
The Rust Programming Language is an introductory book about Rust. I will try to provide links to the book where possible.
Rust has a module system for project organization. The lib.rs file is the “main file” of a library. Later on, we will be creating more modules in different files.
Things are imported into the current scope with use statements. We import the Debug and Hash traits from the standard library with two use statements. Use statements use paths to refer to nested things. We use :: for nesting, similar to namespaces in C++.
Rust models the concept of ownership to enable memory safety without a garbage collector.
The execute method accepts a reference to the current type, indicated with &: &self. This reference is immutable, meaning that we can read data from it, but not mutate it. In Rust, things are immutable by default.
On the other hand, execute accepts a mutable reference to the context, indicated with &mut: context: &mut C, which does allow mutation.
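A small sketch of the difference between the two kinds of references, using a hypothetical Counter type that has nothing to do with the build system:

```rust
/// Hypothetical type, used only to illustrate references.
struct Counter {
    count: u32,
}

/// Takes an immutable reference: the counter can be read, but not modified.
fn read(counter: &Counter) -> u32 {
    counter.count
}

/// Takes a mutable reference: the counter can be modified.
fn increment(counter: &mut Counter) {
    counter.count += 1;
}

fn main() {
    // The binding itself must be declared `mut` before we can borrow it mutably.
    let mut counter = Counter { count: 0 };
    increment(&mut counter);
    increment(&mut counter);
    assert_eq!(read(&counter), 2);
}
```

Writing counter.count += 1 inside read would be rejected by the compiler, because counter is only borrowed immutably there.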
Traits are the main mechanism for open extensibility in Rust. They are comparable to interfaces in class-oriented languages. We will implement a context and tasks in the next section.
Supertraits are a kind of inheritance. The : Clone + Eq + Hash + Debug part of the Task trait means that every Task implementation must also implement the Clone, Eq, Hash, and Debug traits. These traits are part of the standard library:
- Clone for duplicating values.
- Eq for equality comparisons, along with PartialEq.
- Hash for turning a value into a hash.
- Debug for formatting values in a programmer-facing debugging context.
Clone and Eq are so common that they are part of the Rust Prelude, so we don’t have to import those with use statements.
Methods are functions that take a form of self as the first argument. This enables convenient object-like calling syntax, such as context.require_task(&task).
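The following sketch shows that method call syntax is a shorthand for an ordinary function call with self as the first argument. The Greeter type is hypothetical and only serves this example.

```rust
/// Hypothetical type, used only to illustrate method call syntax.
struct Greeter {
    greeting: String,
}

impl Greeter {
    /// A method: a function that takes `&self` as its first argument.
    fn greet(&self, name: &str) -> String {
        format!("{}, {}!", self.greeting, name)
    }
}

fn main() {
    let greeter = Greeter { greeting: "Hello".to_string() };
    // Object-like method call syntax.
    let with_method_syntax = greeter.greet("World");
    // The same call written as a plain function call, passing `self` explicitly.
    let with_function_syntax = Greeter::greet(&greeter, "World");
    assert_eq!(with_method_syntax, with_function_syntax);
}
```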
Associated types are a kind of placeholder type in a trait such that methods of traits can use that type. In Task this allows us to talk about the Output type of a task. In Context this allows us to refer to both the Task type T and its output type T::Output. The :: syntax here is used to access associated types of traits.
The Self type in a trait is a built-in placeholder for the type that is implementing the trait.
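To see associated types and Self at work outside of the build system, consider this small sketch. The Produce trait and the Length and Shout types are hypothetical names chosen for the example.

```rust
/// Hypothetical trait with an associated type, mirroring the shape of `Task::Output`.
trait Produce {
    /// Each implementation chooses its own output type.
    type Output;
    /// `Self::Output` refers to the output type of whichever type implements the trait.
    fn produce(&self) -> Self::Output;
}

struct Length(String);

impl Produce for Length {
    type Output = usize;
    fn produce(&self) -> Self::Output {
        self.0.len()
    }
}

struct Shout(String);

impl Produce for Shout {
    type Output = String;
    fn produce(&self) -> Self::Output {
        self.0.to_uppercase()
    }
}

/// Generic code names the associated type through the type parameter: `P::Output`.
fn produce_twice<P: Produce>(producer: &P) -> (P::Output, P::Output) {
    (producer.produce(), producer.produce())
}

fn main() {
    assert_eq!(produce_twice(&Length("abc".to_string())), (3, 3));
    assert_eq!(Shout("abc".to_string()).produce(), "ABC");
}
```

The same produce_twice function returns a pair of usize for Length and a pair of String for Shout, because P::Output is resolved per implementation.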
The Task trait is defined with pub (public) visibility, such that users of the library can implement it. Because Task uses Context in its public API, Context must also be public, even though we don’t intend for users to implement their own Context.