April 5th, 2021 · #rust · #error_handling · #beginner
The example projects in The Rust Programming Language are great for introducing new would-be Rustaceans to different aspects and features of Rust. In this post, we'll look at some different ways of implementing more robust error handling by extending the minigrep project from The Rust Programming Language.
The minigrep project is introduced in chapter 12 and walks the reader through building a simple version of the grep command line tool, a utility for searching through text. You pass in a query (the text you're searching for) along with the name of the file to search, and get back all of the lines that contain the query text.
The goal of this post is to extend the book's minigrep implementation with more robust error handling patterns so that you'll have a better idea of different ways to handle errors in a Rust project.
For reference, you can find the final code for the book's version of minigrep here.
A common pattern when it comes to structuring Rust projects is to have a "library" portion where the primary data structures, functions, and logic live and an "application" portion that ties the library functions together.
You can see this in the file structure of the original minigrep code: the application logic lives inside of the src/bin/main.rs file, and it's merely a thin wrapper around data structures and functions that are defined in the src/lib.rs file; all the main function does is call minigrep::run.
This is important to point out because whether we're building an application or a library changes how we approach error handling.
When it comes to an application, the end user most likely doesn't want to know about the nitty gritty details of what caused an error. Indeed, the end user of an application probably only ought to be notified of an error in the event that the error is unrecoverable. In this case, it's also useful to provide details on why the unrecoverable error occurred, especially if it has to do with user input. If some sort of recoverable error happened in the background, the consumer of an application probably doesn't need to know about it.
Conversely, when it comes to a library, the end users are other developers who are using the library and building something on top of it. In this case, we'd like to give as many relevant details about any errors that occurred in our library as possible. The consumer of the library will then decide how they want to handle those errors.
So how do these two approaches play together when we have both a library portion and an application portion in our project? The main function executes the minigrep::run function and outputs any errors that crop up as a result. So most of our error handling efforts will be focused on the library portion.
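To make the application side of that split concrete, here is a minimal sketch of a thin wrapper that turns a library result into a user-facing message and an exit code. The `run` function below is a hypothetical stand-in for a library entry point such as minigrep::run, and `report` is a name invented for this illustration; neither is part of the book's code.

```rust
use std::fmt::Display;

// Hypothetical stand-in for a library entry point such as `minigrep::run`.
fn run(should_fail: bool) -> Result<(), String> {
    if should_fail {
        Err(String::from("could not read input file"))
    } else {
        Ok(())
    }
}

// Hypothetical helper: maps a library result to an exit code and an
// optional message for the end user. It works with any displayable error.
fn report<E: Display>(result: Result<(), E>) -> (i32, Option<String>) {
    match result {
        Ok(()) => (0, None),
        Err(e) => (1, Some(format!("Application error: {}", e))),
    }
}

fn main() {
    let (code, message) = report(run(true));
    if let Some(message) = message {
        eprintln!("{}", message);
    }
    println!("would exit with code {}", code);
}
```

The wrapper only needs the error to implement Display, which is why the quality of the library's error type largely determines what the end user sees.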
In src/lib.rs, we have two functions, Config::new and run, which might return errors:
impl Config {
    pub fn new(mut args: env::Args) -> Result<Config, &'static str> {
        args.next();

        let query = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a query string"),
        };

        let filename = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a file name"),
        };

        let case_sensitive = env::var("CASE_INSENSITIVE").is_err();

        Ok(Config {
            query,
            filename,
            case_sensitive,
        })
    }
}

pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.filename)?;

    let results = if config.case_sensitive {
        search(&config.query, &contents)
    } else {
        search_case_insensitive(&config.query, &contents)
    };

    for line in results {
        println!("{}", line);
    }

    Ok(())
}
There are exactly three spots where errors are being returned: two errors occur in the Config::new function, which returns a Result<Config, &'static str>. In this case, the error variant of the Result is a static string slice.
Here we return an error when a query is not provided by the user.
let query = match args.next() {
    Some(arg) => arg,
    None => return Err("Didn't get a query string"),
};
Here we return an error when a filename is not provided by the user.
let filename = match args.next() {
    Some(arg) => arg,
    None => return Err("Didn't get a file name"),
};
The main problem with structuring our errors in this way as static strings is that the error messages are not located in a central spot where we can easily refactor them should we need to. It also makes it more difficult to keep our error messages consistent between the same types of errors.
The third error occurs at the top of the run function, which returns a Result<(), Box<dyn Error>>. The error variant in this case is a trait object that implements the Error trait. In other words, the error variant for this function is any instance of a type that implements the Error trait.
Here we bubble up any errors that might have occurred as a result of calling fs::read_to_string.
let contents = fs::read_to_string(config.filename)?;
This works for fs::read_to_string because the io::Error it returns implements the Error trait, and Box<dyn Error> can hold any such type. An io::Error can itself represent many different kinds of failure (a missing file, insufficient permissions, invalid UTF-8, and so on), and run could later grow other fallible calls with their own error types. We need a way to represent all of those generically; what they have in common is that they all implement the Error trait!
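Box<dyn Error> earns its keep once a function can fail in more than one way. The sketch below uses a hypothetical helper, `read_number` (not part of minigrep), whose body can produce either an io::Error or a ParseIntError; the ? operator boxes whichever one occurs.

```rust
use std::error::Error;
use std::fs;

// Hypothetical helper: reads a file and parses its contents as a number.
// Two different error types can bubble up through the same return type.
fn read_number(path: &str) -> Result<i32, Box<dyn Error>> {
    let contents = fs::read_to_string(path)?; // may fail with io::Error
    let number = contents.trim().parse::<i32>()?; // may fail with ParseIntError
    Ok(number)
}

fn main() {
    // The file name is an assumption: it is expected not to exist.
    match read_number("definitely-missing-file-for-this-example.txt") {
        Ok(n) => println!("parsed {}", n),
        Err(e) => println!("error: {}", e),
    }
}
```

The trade-off is that the caller no longer knows statically which errors are possible, which is part of the motivation for a dedicated error enum.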
Ultimately, what we want to do is define all of these different types of errors in a central location and have them all be variants of a single type.
We'll create a new src/error.rs file and define an enum AppError, deriving the Debug trait in the process so that we can get a debug representation should we need it. We'll name each of the variants of this enum in such a way that they appropriately represent each of the three types of errors:
#[derive(Debug)]
pub enum AppError {
    MissingQuery,
    MissingFilename,
    ConfigLoad,
}
The third variant, ConfigLoad, maps to the error that might crop up when calling fs::read_to_string in the run function. This might seem a bit out of place at first, since if an error occurs with that function, it would be some sort of I/O problem reading the provided file. So why didn't we name it IOError or something like that?
In this case, since we're surfacing an error from a standard library function, it's more relevant to our application to describe how the surfaced error affects it, instead of simply reiterating it. When an error occurs with fs::read_to_string, that prevents our Config from loading, so that's why we named it ConfigLoad.
Now that we have this type, we need to update all of the spots in our code where we return errors to utilize this AppError enum.
At the top of our src/lib.rs file, we need to declare our error module and bring error::AppError into scope:
mod error;
use error::AppError;
In our Config::new function, we need to update the spots where we were returning static string slices as errors, as well as the return type of the function itself:
- pub fn new(mut args: env::Args) -> Result<Config, &'static str>
+ pub fn new(mut args: env::Args) -> Result<Config, AppError>
      // --snip--
      let query = match args.next() {
          Some(arg) => arg,
-         None => return Err("Didn't get a query string"),
+         None => return Err(AppError::MissingQuery),
      };
      let filename = match args.next() {
          Some(arg) => arg,
-         None => return Err("Didn't get a file name"),
+         None => return Err(AppError::MissingFilename),
      };
      // --snip--
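The third error in the run function only requires us to update its return type, since the ? operator is already taking care of bubbling the error up and returning it should it occur. Note that this only compiles once AppError can be created from an io::Error, because ? relies on a From<io::Error> conversion; we'll take care of that shortly.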
- pub fn run(config: Config) -> Result<(), Box<dyn Error>>
+ pub fn run(config: Config) -> Result<(), AppError>
Ok, so we're now making use of our error variants, which, should they occur, are being surfaced to our main function and printed out. But we no longer have the actual error messages that we had before defined anywhere!
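A quick way to see what has been lost: with only Debug derived, the best we can print is the variant name. This standalone sketch repeats the enum as it stands at this point in the post (the enum is not `pub` here only because the sketch is a single file).

```rust
#[derive(Debug)]
enum AppError {
    MissingQuery,
    MissingFilename,
    ConfigLoad,
}

fn main() {
    // Only the derived Debug output is available; there is no Display
    // implementation yet, so `{}` would not compile for AppError.
    println!("{:?}", AppError::MissingQuery);
    println!("{:?}", AppError::MissingFilename);
    println!("{:?}", AppError::ConfigLoad);
}
```

A variant name like MissingQuery is useful to a developer but is not the friendly message the original string errors gave the end user.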
The thiserror crate is one that is commonly used to provide an ergonomic way to format error messages in a Rust library. It allows us to annotate each of the variants in our AppError enum with the actual error message that we want displayed to the end user.
Let's add it as a dependency in our Cargo.toml:
[dependencies]
thiserror = "1"
In src/error.rs we'll bring the thiserror::Error derive macro into scope and have our AppError type derive it. Deriving it is what lets us annotate each enum variant with an #[error] attribute. Now we specify the error message that we want displayed for each particular variant:
+ use std::io;
+ use thiserror::Error;

- #[derive(Debug)]
+ #[derive(Debug, Error)]
  pub enum AppError {
+     #[error("Didn't get a query string")]
      MissingQuery,
+     #[error("Didn't get a file name")]
      MissingFilename,
+     #[error("Could not load config")]
-     ConfigLoad,
+     ConfigLoad {
+         #[from]
+         source: io::Error,
+     },
  }
What's all the extra stuff that was added to the ConfigLoad variant? Since a ConfigLoad error only occurs when there's an underlying error with the call to fs::read_to_string, what the ConfigLoad variant is actually doing is providing extra context around the underlying I/O error.
thiserror allows us to wrap a lower-level error in additional context by annotating it with a #[from] in order to convert the source into our homebrew error type. In this way, when an I/O error occurs (like when we specify a file to search through that doesn't actually exist), we get an error like this:
Could not load config: Os { code: 2, kind: NotFound, message: "No such file or directory" }
Without it, the resulting error message looks like this:
Os { code: 2, kind: NotFound, message: "No such file or directory" }
To a consumer of our library, it's harder to figure out the source of this error; the additional context helps a lot.
You can find the version of minigrep that uses thiserror here.
Now we'll switch gears and look at how we might achieve the same results that thiserror provides us, but without bringing it in as a dependency.
Under the hood, thiserror performs some magic with procedural macros, which can have a noticeable effect on compilation speeds. In the case of minigrep, we have very few error variants and the project is so small that a dependency on thiserror really won't introduce much of an increase in compilation time, but it could be a consideration in a much larger and more complex project.
So on that note, we'll wrap up this post by ripping it out and replacing it with our own hand-rolled implementation. The nice thing about going down this route is that we'll only need to make changes to the src/error.rs file to implement all of the necessary changes (besides, of course, removing thiserror from our Cargo.toml).
[dependencies]
- thiserror = "1"
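Let's remove all of the annotations that thiserror was providing us. We'll also replace the thiserror::Error import with the std::error::Error trait. Unlike the thiserror macro, the standard library trait can't be derived, so only Debug remains in the derive list: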
- use thiserror::Error;
+ use std::error::Error;

- #[derive(Debug, Error)]
+ #[derive(Debug)]
  pub enum AppError {
-     #[error("Didn't get a query string")]
      MissingQuery,
-     #[error("Didn't get a file name")]
      MissingFilename,
-     #[error("Could not load config")]
      ConfigLoad {
-         #[from]
          source: io::Error,
      },
  }
In order to get back all of the functionality we just wiped, we'll need to do three things:
- Implement the Display trait for AppError so that our error variants can be displayed to the user.
- Implement the Error trait for AppError. This trait represents the basic expectations of an error type, namely that they implement Display and Debug, plus the capability to fetch the underlying source or cause of the error.
- Implement From<io::Error> for AppError. This is required so that we can convert an I/O error returned from fs::read_to_string into an instance of AppError.
Here's our implementation of the Display trait for our AppError. It maps each error variant to a string and writes it to the Display formatter.
use std::fmt;

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MissingQuery => f.write_str("Didn't get a query string"),
            Self::MissingFilename => f.write_str("Didn't get a file name"),
            Self::ConfigLoad { source } => write!(f, "Could not load config: {}", source),
        }
    }
}
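As a quick check of what this buys us, the standalone sketch below repeats the enum and the Display implementation and prints two variants. The io::Error is constructed by hand with an arbitrary message ("no such file") purely for illustration; a real one would come from fs::read_to_string.

```rust
use std::fmt;
use std::io;

#[derive(Debug)]
enum AppError {
    MissingQuery,
    MissingFilename,
    ConfigLoad { source: io::Error },
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MissingQuery => f.write_str("Didn't get a query string"),
            Self::MissingFilename => f.write_str("Didn't get a file name"),
            Self::ConfigLoad { source } => write!(f, "Could not load config: {}", source),
        }
    }
}

fn main() {
    // Implementing Display also gives us `to_string()` for free.
    println!("{}", AppError::MissingQuery);
    println!("{}", AppError::MissingFilename);

    // Hand-built I/O error standing in for a failed file read.
    let io_err = io::Error::new(io::ErrorKind::NotFound, "no such file");
    println!("{}", AppError::ConfigLoad { source: io_err });
}
```

All of the user-facing messages now live in one match expression, which is the centralization we were after.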
And here's our implementation of the Error trait. The main method to be implemented is the Error::source method, which is meant to provide information about the source of an error. For our AppError type, only ConfigLoad exposes any underlying source information, namely the I/O error that might happen as a result of calling fs::read_to_string. There's no underlying source information to expose in the case of the other error variants.
use std::error;

impl error::Error for AppError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::ConfigLoad { source } => Some(source),
            _ => None,
        }
    }
}
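One way a consumer of the library might use Error::source is to walk the chain of causes and log each one. The sketch below is self-contained and assumes a hypothetical helper, `error_chain`, which is not part of minigrep; the enum is trimmed to two variants to keep it short.

```rust
use std::error::Error;
use std::fmt;
use std::io;

#[derive(Debug)]
enum AppError {
    MissingQuery,
    ConfigLoad { source: io::Error },
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MissingQuery => f.write_str("Didn't get a query string"),
            Self::ConfigLoad { source } => write!(f, "Could not load config: {}", source),
        }
    }
}

impl Error for AppError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::ConfigLoad { source } => Some(source),
            _ => None,
        }
    }
}

// Hypothetical helper: collects an error's message followed by the
// messages of every underlying source, outermost first.
fn error_chain(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        messages.push(cause.to_string());
        current = cause.source();
    }
    messages
}

fn main() {
    let errors = [
        AppError::MissingQuery,
        AppError::ConfigLoad {
            source: io::Error::new(io::ErrorKind::NotFound, "no such file"),
        },
    ];
    for err in &errors {
        for (depth, message) in error_chain(err).iter().enumerate() {
            println!("{}: {}", depth, message);
        }
    }
}
```

Because our Display implementation already embeds the source's message, the chain repeats it; some projects keep the top-level message short and rely on the chain for the details instead.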
The &(dyn Error + 'static) part of the return type is similar to the Box<dyn Error> trait object that we saw earlier. The main difference here is that the trait object is behind an immutable reference instead of a Box pointer. The 'static lifetime here means the underlying error type contains no borrowed data with a shorter lifetime, i.e., it either owns its values or holds only 'static references. This is necessary in order to assure the compiler that there's no chance of a dangling pointer here.
Lastly, we need a way to convert an io::Error into an AppError. We'll do this by implementing From<io::Error> for AppError.
impl From<io::Error> for AppError {
    fn from(source: io::Error) -> Self {
        Self::ConfigLoad { source }
    }
}
There's not much to this one. If we get an io::Error, all we do to convert it to an AppError is wrap it in a ConfigLoad variant.
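This conversion is what the ? operator relies on: when the error type of the expression differs from the function's error type, ? calls From::from before returning. The sketch below shows that in isolation; `load` is a hypothetical helper mirroring the first line of run, the enum is trimmed to the one relevant variant, and the file name is an assumption chosen so that the read fails.

```rust
use std::fs;
use std::io;

#[derive(Debug)]
enum AppError {
    ConfigLoad { source: io::Error },
}

impl From<io::Error> for AppError {
    fn from(source: io::Error) -> Self {
        Self::ConfigLoad { source }
    }
}

// Hypothetical helper mirroring the first line of `run`.
fn load(path: &str) -> Result<String, AppError> {
    // `?` converts the io::Error into an AppError via the From impl above.
    let contents = fs::read_to_string(path)?;
    Ok(contents)
}

fn main() {
    // The file name is an assumption: it is expected not to exist.
    match load("definitely-missing-file-for-this-example.txt") {
        Ok(text) => println!("read {} bytes", text.len()),
        Err(err) => println!("{:?}", err),
    }
}
```

Without the From implementation, the ? in `load` would fail to compile, because the compiler would have no way to turn an io::Error into an AppError.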
And that's all folks! You can find this version of our minigrep implementation here.
In closing, we discussed how the original minigrep implementation presented in The Rust Programming Language book is a bit lacking in the error handling department, as well as how to think about different error handling use cases.
From there, we showcased how to use the thiserror crate to centralize all of the possible error variants into a single type.
Finally, we peeled back the veneer that thiserror provides and showed how to replicate the same functionality manually.
I hope you all learned something from this post! 🙂