Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error type redesign #1131

Open
seanmonstar opened this issue Apr 11, 2017 · 29 comments
Open

Error type redesign #1131

seanmonstar opened this issue Apr 11, 2017 · 29 comments
Labels
A-error Area: error handling B-rfc Blocked: More comments would be useful in determine next steps.

Comments

@seanmonstar
Copy link
Member

The Error type in hyper is currently an enum, wrapping up the various cases of things that could go wrong when parsing incoming HTTP data. Some of the problems with the current design:

  • Being an enum means that adding variants is a breaking change. There is a __Nonexhaustive variant to try to warn that exhaustive matches can cause breaking changes, but that it's better when internals are hidden.
  • Users sometimes try to create hyper::Errors, even though they really shouldn't. It is meant only for saying "this thing went wrong while hyper was parsing HTTP", and nothing else. Part of this can be fixed also by changing the actual error types used, such as in outgoing Streams.

Instead, I'd like to propose making the Error an opaque struct, with some methods to inspect the type and details of an error, somewhat along the lines of std::io::Error.

(Still thinking through the specific code, will add a design section later).

@seanmonstar seanmonstar mentioned this issue Apr 11, 2017
4 tasks
@mp4096
Copy link
Contributor

mp4096 commented May 2, 2017

Have you considered using error-chain? TBH I didn't use it before, but it sounds a lot like what you want to do.

Anyway, maybe @brson himself could say if error-chain might be helpful in your case?

@KodrAus
Copy link

KodrAus commented May 2, 2017

Can you declare the ErrorKind enum in error-chain to be private? error-chain does have the nice benefit of collecting a backtrace from the error's source.

@seanmonstar
Copy link
Member Author

A backtrace is neat (but expensive?). I'm not sure I'd want to expose any enum at all with regards to the Error or ErrorKind, because of the inability to grow it. I'd reconsider when and if Rust ever grows some form of private variants (or however is used to prevent exhaustive matches).

It's worth thinking about the things some would do with an Error from hyper.

  • Just log it. So it needs fmt::Debug and fmt::Display.
  • Try to send a proper Response depending on the error that occurred? Like, return 431 when there was a Error::TooManyHeaders, perhaps a 414 if there were Error::TooLarge, and a 400 with other parse errors.
  • Maybe check for certain io::Errors, to allow an application to adapt, such as if the error involves too many file descriptors open, or something.

Anything else?

I'm thinking of having methods on the type to allow inspecting the type of error it was, since it's always possible to add more methods. Like, error.is_uri_too_long() or whatever.

@sanmai-NL
Copy link
Contributor

sanmai-NL commented May 20, 2017

@seanmonstar: I do not think it is a problem if programmers do exhaustive matching on the enum that implements Error. Yes, their code will need some small fixes, but only if they upgrade their hyper dep. If a new Error variant can be returned and exhaustive matching is done, they probably have to alter the logic in their own code anyway, no?

I use Error enums extensively (in closed source code), I did not encounter ergonomic or API stability issues.

Update:
Also, note that I do not use error-chain. Producing non-technical but complete ‘backtraces’ required very little boilerplate code and could be abstracted away easily. Hopefully, you can consider my comment as a data point that keeping things simple and primitive (keeping the status quo wrt. the issue reported) may be a wiser, be it conservative choice.

Update:
I was unclear in my last sentence. I mean keeping the status quo or better yet, improving error handling in a completely different direction, not splitting hyper::Error into a client and server-related Error.

@KodrAus
Copy link

KodrAus commented May 20, 2017

@sanmai-NL That's interesting, are you able to share a bit about what you do with specific hyper::Error variants? In the code I've written using hyper I don't think I've had a case where I could do something useful with any specific variant of Error, just shrug and pass it along. But that could definitely just be me so I'm interested to know how other people hold it.

@sanmai-NL
Copy link
Contributor

sanmai-NL commented May 20, 2017

No sorry, I do nothing with hyper::Error variants, did you understand my comment as if I do? I do retain such Error values as the cause field (of some struct variants from some custom Error enum) for my own backtraces. What I am trying to point out is that having one Error type to cover lots of code (e.g. the current hyper::Error, as well as the proposed split into server and client errors) seems less productive to me than having specific Error types. The latter can easily remain enums, since they are specific to some function scope and only break exhaustive matches on values of that specific Error type (i.e., return values of that particular function).

@dtolnay
Copy link

dtolnay commented May 20, 2017

That's interesting, are you able to share a bit about what you do with specific hyper::Error variants?

Maybe translate error messages into another language.

@sanmai-NL
Copy link
Contributor

Another general comment: in error handling the viewpoint of regular people needs to be taken into account. I have found this requires tight developer control over error reporting. Backtraces etc. are not very user friendly (may even instill fear), and leak information about the source code, which from a business perspective is intellectual property. For hyper, the user faces the web content and I think hyper has an adequate API to return HTTP status codes and nice error pages.

@KodrAus
Copy link

KodrAus commented May 21, 2017

Maybe translate error messages into another language.

@dtolnay Oh that's interesting... I've never thought of translating errors before. I guess you could do that on an arbitrary text stream though, rather than individual error variants?

What I am trying to point out is that having one Error type to cover lots of code ... seems less productive to me than having specific Error types.

@sanmai-NL I have the opposite opinion on this with respect to libraries like hyper that are consumed by other libraries. I think having multiple error types that span a single use-case adds boilerplate when trying to chain errors, and cognitive load when consuming them. Now instead of one error type to deal with you've got potentially many. Since the server and client use-cases are distinct splitting them seems reasonable to me.

For application code though I agree having an individual error type per operation is nicer to manage.

Backtraces etc. are not very user friendly (may even instill fear), and leak information about the source code ...

I think that's a good point. Returning backtraces in responses isn't very good. But having backtraces in logging is very good, and makes issues much easier to diagnose. This is especially true in Rust where you could be handling an error downstream from where it was produced, so the backtrace you get out-of-the-box might be really unhelpful. Maybe for hyper having a backtrace somewhere in its internals isn't any more useful than a backtrace where you receive it? But that could be made a concern for application logging rather than on error types directly.

As far as mapping a hyper response to a http response in a server goes I thought that would be done by hyper directly, rather than by callers? If that's the case then the ergonomics of inspecting the error don't matter so much. If not then maybe we should look at how that code would change with this new design and see if it becomes clunky.

@seanmonstar
Copy link
Member Author

I do not think it is a problem if programmers do exhaustive matching on the enum that implements Error.

It is a bit of a problem. With exhaustive matching, that means that adding a new error case is a breaking change. For instance, if at some point it was desirable to separate the Error::TooLarge into a UriTooLong and HeaderValueTooLong, that would break other people's code from compiling. If hyper were at 1.0, that would be mean needing to go to 2.0 just because we changed an internal detail of tracking errors.

Maybe translate error messages into another language.

A possibility is Error::code() -> u64 (maybe a smaller number type, probably not going to have trillions of error codes). As a number, you'd have to treat the case of future codes specifically, so adding new ones should be fine.

Backtraces etc. are not very user friendly (may even instill fear), and leak information about the source code

This definitely true if someone were to print the a hyper::Error directly back on a webpage. A worthy thing to consider regarding the use of backtraces in hyper.

What I am trying to point out is that having one Error type to cover lots of code seems less productive to me than having specific Error types.

That's actually part of the point of this issue. There are 2 rather distinct cases of failure in when using a Service (Server or Client), and using hyper::Error in both cases is odd.

@seanmonstar seanmonstar modified the milestone: 0.12 Jun 21, 2017
@seanmonstar
Copy link
Member Author

Relevant to the design of type, from this comment:

One thing I'd like is to differentiate IO errors based on where they came from. I care about connect errors in particular [...]

In terms of internal variants, things I see that would be useful to group are:

  • Connect
  • Parse
  • Io

@nicoburns
Copy link
Contributor

It is a bit of a problem. With exhaustive matching, that means that adding a new error case is a breaking change. For instance, if at some point it was desirable to separate the Error::TooLarge into a UriTooLong and HeaderValueTooLong, that would break other people's code from compiling. If hyper were at 1.0, that would be mean needing to go to 2.0 just because we changed an internal detail of tracking errors.

Wouldn't this be a breaking change regardless of whether hyper's error type is an enum or not? If it was a struct that could be inspected somehow, it would still potentially break peoples code, just at runtime rather than at compile time, no?

@seanmonstar
Copy link
Member Author

If it was a struct that could be inspected somehow, it would still potentially break peoples code, just at runtime rather than at compile time, no?

@nicoburns not necessarily. If it were a struct you could inspect, that'd probably be some method, like err.is_too_large(). That method can still return the correct value if internally the variant is split into more specific kinds, and new err.is_uri_too_large() and err.is_header_too_large() methods were added.

The current enum design makes it impossible to add any detail to a variant, since that requires changing the types of values that can matched publicly, so this extra detail could never be provided without a breaking change.

@algermissen
Copy link

+1 for the code based solution - then there can be a static lookup mechanism (array) to go from code to message. And that would be compatible.

@seanmonstar
Copy link
Member Author

Something that I've wanted that is easily done in class-based languages is to be able to split up an error variant into sub-variants in the future, and not breaking existing code.

With Classes (not Rust)

For example, say there was a Connect variant, and later on, we want to split that into smaller pieces, Resolve and Handshake. With classes, this is easily done. Consider the following Python:

try:
    print(client.get("https://hyper.rs"))
except error.Connect:
    # maybe retry

Splitting the errors:

class Resolve(error.Connect):
    pass
class Handshake(error.Connect):
    pass

And the original exception code will continue to work, and anyone upgrading can update to handle the two cases individually, or continue to just treat them all as Connect errors.

In Rust

In Rust, we don't have classes that can extend like this, so we must use a different strategy to try to achieve this forward compatibility.

With Methods

If all this information is in private fields, and a user can only ask questions with methods, it is possible.

if err.is_connect() {
    // retry?
}

Adding is_resolve() and is_handshake() allows futures versions to identify sub-errors, and the is_connect() method still works too.

The downside here is that it doesn't work very nicely with pattern matching.

With Enums

We could do this and support pattern matching, but it could get verbose. With a Kind enum that you could ask the error about, every variant would need to include a sub-kind (and any sub-kind would additionally need to hold a potentially empty but opaque value to allow for future splitting too. For example:

if let Err(err) = await client.get(url) {
    match err.kind() {
        ErrorKind::Connect(_) => {
            // retry?
        },
        _ => {},
    }
}

If we split the Connect error into sub-errors:

#[non_exhaustive]
enum ErrorKind {
    Connect(Connect),
}

#[non_exhaustive]
enum Connect {
    Resolve(Resolve),
    Handshake(Handshake),
}

struct Resolve(());
struct Handshake(());

Then, the original code still works, because the top type is still ErrorKind::Connect, and new users can check the Connect if they like. It does mean the names can get long, though...

if let Err(err) = await client.get(url) {
    match err.kind() {
        ErrorKind::Connect(Connect::Resolve(_)) => {
            // don't retry?
        },
        ErrorKind::Connect(Connect::Handshake(_)) => {
            // retry?
        },
        _ => {},
    }
}

@dekellum
Copy link

dekellum commented Feb 1, 2018

Regarding just a naming aspect in this design. Chrono has a similar use of the same names for both a variant and nested sub-type. There at least, it can be challenging to get the use imports right without naming ambiguity. If there are natural names that avoid same-names, that can be friendlier, e.g:

enum Connect {
    OnResolve(Resolve),
    OnHandshake(Handshake),
}

@seanmonstar
Copy link
Member Author

@dekellum You mean if trying to import the variants as well? use Connect::*;?

@dekellum
Copy link

dekellum commented Feb 2, 2018 via email

@sanmai-NL
Copy link
Contributor

sanmai-NL commented Feb 7, 2018

I hope we all avoid glob imports, but every importable item should have a unique name irrespective of that.

@seanmonstar seanmonstar removed this from the 0.12 milestone Apr 13, 2018
@seanmonstar seanmonstar added B-rfc Blocked: More comments would be useful in determine next steps. A-error Area: error handling labels May 10, 2018
@tobz1000
Copy link

tobz1000 commented Jul 10, 2018

Has the (unstable) #[non_exhaustive] attribute been considered to address the problem of breaking changes?

IMO this is a preferable solution, since as a user I would like to be confident that I'm handling every possible error type. This attribute allows for a lint which would let you know when a new error type has been added, and should be handled, without causing a compiler error.

It doesn't address the issue of users creating their own errors, but that seems like a lesser concern to me.

@sfackler
Copy link
Contributor

Unless something's changed since the RFC, #[non_exhaustive] generates a hard error, not a warning if you try to exhaustively match against of an enum outside of the crate it's defined in.

@tobz1000
Copy link

I was referring to the possibility of a Clippy lint, or compiler warning in future, which would appear after a minor update in which new variants had been added (see the "Unresolved questions" section).

@dekellum
Copy link

That would definitely make the feature more pragmatic, less dogmatic, and better for users.

@dekellum
Copy link

I've started doing this e.g. a public, documented, unused _FutureProof variant because I think the alternative approaches are dysfunctional at best. I guess only time will tell if I can avoid criticism for breaking changes (e.g. adding error variants) with this approach.

@rolftimmermans
Copy link

rolftimmermans commented Sep 14, 2018

I came here via #1652. I think that methods is_* on an Error struct are very unergonomic and a mistake in general.

  • Code that needs to handle many scenarios will be a long if/else if chain.
  • For contributors, it is very easy to forget to add is_* methods when the internal enum is extended, leading to holes in the API.
  • There is no help for an API user to implement error handling, because the is_* methods convey no information about which classes of errors there are (also see previous point). The is_* methods might be implemented for a small subset, they might overlap, there is no way to know! Contrast this with enums where you know everything is mutually exclusive, and you know (at least for the current version in use) that the set is exhaustive.
  • It is difficult to estimate what level of granularity a user of the library might need to recover from certain errors. The is_* methods might be too granular or not granular enough for their use cases.

If the number of internal errors is large and there is a concern for frequent breaking changes, I am greatly in favour of a two-stage classification. Use one enum for the general error category, and other enums for more granular, specific errors. At the very least the general category should be made public so an API consumer can match on it.

@BeatButton
Copy link

I've recently been having a hard time trying to find a way to get to the actual cause of an error that Hyper gives me. For example, when I receive an error that looks like this:

Error(
    Hyper(
        Error(
            Io,
            Os {
                code: 10054,
                kind: ConnectionReset,
                message: "An existing connection was forcibly closed by the remote host.",
            },
        ),
    ),
    "{url goes here}",
)

I would like to be able to, somehow, extract the io::Error to match against io::ErrorKind::ConnectionReset, since that's a specific case I'd like to deal with. I don't think this is possible with the current Error type, or at least, I haven't been able to figure out how to do it.

@keepsimple1
Copy link

Is it still the case that users of this crate should not create hyper::Error?

@Blub
Copy link

Blub commented Oct 20, 2020

It would be nice if instead of the current into_cause(self) -> Option<...> we could get something with a signature of (self) -> Result<Box<dyn ...>, hyper::Error> for when we really only want to catch eg. an io::Error and pass the rest through, similar to downcast methods, where the other kinds are simply returned unchanged as the Err. Currently you're forced to check all the methods (with no warnings or errors if new methods are added in the future) before then finally doing an into_cause.
(Or just get the enum exposed... because methods make for really ugly error handling code)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-error Area: error handling B-rfc Blocked: More comments would be useful in determine next steps.
Projects
None yet
Development

No branches or pull requests