How to Errors Good
This post is going to build upon a previous post I wrote about errors:
A Simple Rule for Better Errors
I've been giving some more thought to errors recently, especially since starting to work in Rust. Hopefully by writing these down I can organize the thoughts a bit, and share them with others who might find them useful.
Encapsulation
My view of errors is built upon my view of encapsulation in general. A program is built from components, each having clearly defined concerns and boundaries, with each depending on other components. Each component defines an interface of methods which can be used by "upstream" components to interact with it. These interfaces are opaque, meaning they expose the bare-minimum information which is required to be useful to the caller. If the caller doesn't need to know something, the interface doesn't provide it, either in the types it uses, the documentation it provides, or the actual functionality it exposes. Whether an interface takes the form of a network API or a language-specific abstract interface is irrelevant to this discussion.
Programs which don't conform to this general pattern are difficult to test and difficult for newcomers to work on. Lacking encapsulation, it's not always clear how changes will affect the overall system, and so subtle bugs sneak in.
Errors form a concrete part of an interface. Many new programmers tend to treat errors as something which we can just assume won't happen, like worrying about an asteroid hitting the earth. Many languages encourage this kind of thinking, by calling errors "exceptions" and hiding them by default.
But given enough users, all errors will eventually occur. If someone out there is going to be using software that I've written, I want to be able to honestly say I've done my best to craft the _full_ experience of that software for them, both the "happy path" and the failure cases they will stumble on. Ignoring the error and dumping a stack trace onto the user is not doing them a service.
In that context, errors are not something which can be hidden, or in fact even something which can be treated as some kind of special case. Every interface needs to enumerate exactly which errors can occur, and at what points, and what behavior can be expected in those cases, just as it does for the happy path cases. When an interface method which returns an error is called, the caller simply _must_ be prepared to handle the error, even if handling that error means passing it upstream to some higher component which will deal with it.
This can sound overwhelming, since the number of errors which can occur is virtually unbounded. To help we can divide the errors into two categories: expected or unexpected.
(Un)Expected
An expected error is one which the caller of the method is able to take a specific action in response to. When authenticating using a username and password, the caller can do something with a "password not found" error: prompt the user to try a different password. This error is "expected". The caller cannot do anything with a "database is unreachable" error. Perhaps the error _could_ be shown to the user, but what's the user going to do with it? This error is "unexpected".
Unexpected errors are generally ones which end up in logs, or which prompt a message to the user which says "please contact the system administrator for help", or offer to submit a bug report. A program which experiences an unexpected error can't be expected to continue functioning in a meaningful way. This is _ok_, life happens, but it's important to recognize these sorts of errors for what they are, and to give the user some inkling of what steps they can take without dumping a load of programming jargon on them.
Additionally, errors which are unexpected for some interfaces can be expected for others; it all depends on what kind of experience is being provided by an interface, and what the needs of the caller are. It may be determined later that an error which was previously considered unexpected could actually be handled in some way, and that error would then become expected. Designing a good interface is as much an art as a science, and art requires practice and mistakes.
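As a rough illustration of the same failure being unexpected for one interface and expected for another, here is a small Rust sketch. Everything in it (`AuthError`, `PingError`, `Health`, and the two functions) is made up for this example; the health checker in particular is a hypothetical second caller which isn't part of the authentication example above.

```rust
// Error type for the interface used by a login form. An unreachable database
// is not something the form can act on, so it would arrive as Unexpected.
#[derive(Debug, PartialEq)]
enum AuthError {
    PasswordNotFound,
    Unexpected(String),
}

// Error type for the interface used by a hypothetical health checker. Here an
// unreachable database is exactly what the caller is looking for, so it gets
// its own variant.
#[derive(Debug, PartialEq)]
enum PingError {
    Unreachable,
    Unexpected(String),
}

#[derive(Debug, PartialEq)]
enum Health {
    Up,
    Down,
}

// The health checker takes a specific action on Unreachable: it reports Down.
fn check(ping: Result<(), PingError>) -> Result<Health, PingError> {
    match ping {
        Ok(()) => Ok(Health::Up),
        Err(PingError::Unreachable) => Ok(Health::Down),
        Err(e @ PingError::Unexpected(_)) => Err(e),
    }
}

// The login form can only act on PasswordNotFound. Anything else, including
// an unreachable database, is passed upstream untouched.
fn login_message(auth: Result<(), AuthError>) -> Result<String, AuthError> {
    match auth {
        Ok(()) => Ok("Welcome back.".to_string()),
        Err(AuthError::PasswordNotFound) => {
            Ok("Please try a different password.".to_string())
        }
        Err(e @ AuthError::Unexpected(_)) => Err(e),
    }
}
```

The underlying failure is identical in both cases; only the needs of the caller differ, and that is what decides whether it deserves a variant of its own.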
Every method for every interface needs to explicitly declare which expected errors it can return, and in what cases they are returned. If it didn't do this, then the interface would be under-specified; there would be behavior it exhibits which is not documented and therefore not handled by the caller. The declaration of errors can be done either via the code itself, if supported by the language, or in the documentation for the method. For example, in Go I end up writing a lot of interfaces which look like this:
var (
	ErrNotFound = errors.New("not found")
	ErrLocked   = errors.New("locked")
)
type UserStore interface {
	// Update overwrites the User record for the given user ID.
	//
	// Expected errors:
	// - ErrNotFound - No user found for the given ID.
	// - ErrLocked - User record is locked and cannot be updated.
	//
	// Other unexpected errors may also occur.
	Update(id string, u User) error
}
When calling `Update`, the caller should explicitly check for every expected error:
err := userStore.Update(id, u)
if errors.Is(err, ErrNotFound) {
	// our application doesn't care about this error in this case, the error
	// gets ignored
} else if errors.Is(err, ErrLocked) {
	// notify the user that the record is locked
} else if err != nil {
	// something unexpected happened, there's nothing we can do about it,
	// pass the error upstream so that the caller of _this_ component knows
	// that things have gone wrong. Eventually the error will likely end up
	// being logged.
	return err
}
Note that `Update` doesn't expose any database or IO errors. These could certainly happen, as we suppose that `UserStore` is most likely implemented using some kind of database or filesystem, but they are not relevant to the caller; if they occur the caller will simply pass them upstream as the generic "unexpected" case.
Enumeration
While it's great to have all potential errors documented in a useful way, it would be even better to have language-level support for dealing with expected errors. Then when a new error gets added to a method it can be a computer, rather than a mushy fallible human, which goes through the rest of the application and modifies every call-site of that method.
In a language like Rust, which can use exhaustive enumeration checks to _force_ developers to check their errors, this is actually possible. The above example could instead be defined something like this:
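use std::error::Error;
use std::fmt;
#[derive(Debug)]
enum UpdateError {
    NotFound,
    Locked,
    Unexpected(String),
}
// Display is required in order to implement Error; this reuses the Debug output.
impl fmt::Display for UpdateError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self)
    }
}
impl Error for UpdateError {}
trait UserStore {
    fn update(&self, id: &str, u: &User) -> Result<(), UpdateError>;
}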
And then at the call-site:
match user_store.update(&id, &u) {
    Ok(()) => (),
    Err(UpdateError::NotFound) => {
        // ignore
    }
    Err(UpdateError::Locked) => {
        // notify user
    }
    Err(UpdateError::Unexpected(e)) => {
        // pass the error up
    }
}
It's a bit more wordy (because it's Rust), but it gets the job done, and if we ever add a new variant to the `UpdateError` enumeration we'll get compiler errors in every place where that error is not being checked. That holds only as long as none of the existing call-sites use an `Err(e)` catch-all pattern; wherever one does, you're back to manually checking anyway, and all that wordiness will have gone for nothing. For that reason, I treat `Err(e)` as an anti-pattern.
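To see the pieces working together, here is a minimal self-contained sketch. The `InMemoryStore` type, the fields on `User`, and the `save` function are all assumptions invented for illustration; a real store would sit on top of some actual backend.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::error::Error;
use std::fmt;

#[derive(Debug, PartialEq)]
enum UpdateError {
    NotFound,
    Locked,
    Unexpected(String),
}

impl fmt::Display for UpdateError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            UpdateError::NotFound => write!(f, "not found"),
            UpdateError::Locked => write!(f, "locked"),
            UpdateError::Unexpected(msg) => write!(f, "unexpected error: {}", msg),
        }
    }
}

impl Error for UpdateError {}

// Hypothetical user record; the fields are only here for illustration.
#[derive(Debug, Clone, PartialEq)]
struct User {
    name: String,
    locked: bool,
}

trait UserStore {
    fn update(&self, id: &str, u: &User) -> Result<(), UpdateError>;
}

// Hypothetical in-memory implementation, standing in for a real backend.
struct InMemoryStore {
    users: RefCell<HashMap<String, User>>,
}

impl UserStore for InMemoryStore {
    fn update(&self, id: &str, u: &User) -> Result<(), UpdateError> {
        let mut users = self.users.borrow_mut();
        match users.get_mut(id) {
            None => Err(UpdateError::NotFound),
            Some(existing) if existing.locked => Err(UpdateError::Locked),
            Some(existing) => {
                *existing = u.clone();
                Ok(())
            }
        }
    }
}

// Call-site: every variant is matched by name and there is no `Err(e)`
// catch-all, so adding a variant to UpdateError makes this stop compiling.
// Returns a message to show the user, if there is one.
fn save(store: &dyn UserStore, id: &str, u: &User) -> Result<Option<String>, UpdateError> {
    match store.update(id, u) {
        Ok(()) => Ok(None),
        // This particular caller doesn't care whether the user exists.
        Err(UpdateError::NotFound) => Ok(None),
        Err(UpdateError::Locked) => Ok(Some("This record is locked.".to_string())),
        // Nothing can be done about it here, so it goes upstream.
        Err(e @ UpdateError::Unexpected(_)) => Err(e),
    }
}
```

If a fourth variant were added to `UpdateError`, the `match` in `save` would be rejected as non-exhaustive until someone decided what that call-site should do about it.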
-----
On the subject of anti-patterns, I see a lot of Rust crates provide a single Error type which gets used across all methods in the crate which return an error. I'll use `reqwest` as an example, though I've seen this in a lot of places.
reqwest had the audacity to even hide the underlying error type, so it's not even possible to exhaustively `match` against it. But even if you could, the error could potentially be anything, and unless each method which returns this error documents which possible cases could be returned (they don't), it just about amounts to `Err(e)`.
-----
Back to my example: in Rust world, my `Unexpected(String)` variant is somewhat out of place. I've found that Rust folks often instead include a variant for every possible sub-error, not just those which the caller is expected to deal with. For example, if `UserStore` is implemented using a SQL database, `Unexpected(String)` might instead be `Database(sql::Error)`.
This has the benefit of exposing precisely what went wrong to the caller, but at the cost of clarity and encapsulation. An interface should only expose the bare-minimum necessary in order for it to be useful in its function. Exposing the entirety of a database error is not the bare-minimum, since only a subset of all possible database errors can be usefully dealt with, or are even possible in the first place. Given such an error, it's unclear to the caller what that subset actually is, and so the actual behavior of the interface becomes imprecisely defined.
Further, passing up errors like `Database(sql::Error)` would be considered an abstraction leak. The caller doesn't need to know _how_ `UserStore` is implemented, only what its capabilities are, and yet it's clear from the error how it's implemented (though not clear what to do about it). This will impede any efforts to wrap, mock, or replace the underlying implementation.
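One way to avoid the leak is to translate the backend's errors at the boundary of the implementation. The following is only a sketch under assumptions: `DbError`, `FakeDb`, and `SqlUserStore` are hypothetical stand-ins for whatever database client is really in use, and the query string is purely illustrative.

```rust
use std::fmt;

// The interface's error type. Nothing in it mentions a database.
#[derive(Debug, PartialEq)]
enum UpdateError {
    NotFound,
    Locked,
    Unexpected(String),
}

struct User {
    name: String,
}

trait UserStore {
    fn update(&self, id: &str, u: &User) -> Result<(), UpdateError>;
}

// Hypothetical backend error, standing in for a real client's error type.
#[derive(Debug, Clone, Copy)]
enum DbError {
    NoRows,
    ConnectionLost,
}

impl fmt::Display for DbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DbError::NoRows => write!(f, "no rows affected"),
            DbError::ConnectionLost => write!(f, "connection lost"),
        }
    }
}

// Hypothetical database handle which fails with whatever it was configured with.
struct FakeDb {
    fail_with: Option<DbError>,
}

impl FakeDb {
    fn exec(&self, _query: &str, _params: &[&str]) -> Result<(), DbError> {
        match self.fail_with {
            Some(e) => Err(e),
            None => Ok(()),
        }
    }
}

struct SqlUserStore {
    db: FakeDb,
}

impl UserStore for SqlUserStore {
    fn update(&self, id: &str, u: &User) -> Result<(), UpdateError> {
        self.db
            .exec(
                "UPDATE users SET name = ? WHERE id = ?",
                &[u.name.as_str(), id],
            )
            .map_err(|e| match e {
                // The one backend error the caller can act on maps to a
                // variant of the interface's own error type.
                DbError::NoRows => UpdateError::NotFound,
                // Everything else is flattened into a message meant for logs.
                other => UpdateError::Unexpected(other.to_string()),
            })
    }
}
```

The catch-all inside `map_err` is over the backend's error type, not the interface's, so callers of `UserStore` still get an exhaustively matchable `UpdateError`, and swapping `FakeDb` for a different backend changes nothing about the signature they see.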
Fin
The rules and patterns I've put to paper here aren't coming from nowhere; they're all based on the idea of designing around opaque interface definitions which encapsulate behavior, a standard practice in programming. Really my main goal is to point out that error cases are part of an interface, that it's the responsibility of both the caller and the implementation of the interface to act accordingly, and to show how the design and usage of interfaces is affected.
-----
Published 2023-07-31