Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow a custom panic handler #1328

Merged
merged 1 commit into from
Dec 17, 2015
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
183 changes: 183 additions & 0 deletions text/0000-global-panic-handler.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,183 @@
- Feature Name: panic_handler
- Start Date: 2015-10-08
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

When a thread panics in Rust, the unwinding runtime currently prints a message
to standard error containing the panic argument as well as the filename and
line number corresponding to the location from which the panic originated.
This RFC proposes a mechanism to allow user code to replace this logic with
custom handlers that will run before unwinding begins.

# Motivation

The default behavior is not always ideal for all programs:

* Programs with command line interfaces do not want their output polluted by
random panic messages.
* Programs using a logging framework may want panic messages to be routed into
that system so that they can be processed like other events.
* Programs with graphical user interfaces may not have standard error attached
at all and want to be notified of thread panics to potentially display an
internal error dialog to the user.

The standard library [previously
supported](https://doc.rust-lang.org/1.3.0/std/rt/unwind/fn.register.html) (in
unstable code) the registration of a set of panic handlers. This API had
several issues:

* The system supported a fixed but unspecified number of handlers, and a
handler could never be unregistered once added.
* The callbacks were raw function pointers rather than closures.
* Handlers would be invoked on nested panics, which would result in a stack
overflow if a handler itself panicked.
* The callbacks were specified to take the panic message, file name and line
number directly. This would prevent us from adding more functionality in
the future, such as access to backtrace information. In addition, the
presence of file names and line numbers for all panics causes some amount of
binary bloat and we may want to add some avenue to allow for the omission of
those values in the future.

# Detailed design

A new module, `std::panic`, will be created with a panic handling API:

```rust
/// Unregisters the current panic handler, returning it.
///
/// If no custom handler is registered, the default handler will be returned.
///
/// # Panics
///
/// Panics if called from a panicking thread. Note that this will be a nested
/// panic and therefore abort the process.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a panic handler is invoked, perhaps it could be considered as "being taken"? That would mean that this function would never need to panic as if called while panicking it'd just continue to return the default handler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could work, but the semantics would be a little weird - the handler can't actually have been taken since it may need to be running on other threads as well, so you'd run into a situation where take_handler doesn't return the thing that's actually registered, and the handler would still be registered

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait right, this is a global resource, so there can be concurrent panics. Nevermind!

pub fn take_handler() -> Box<Fn(&PanicInfo) + 'static + Sync + Send> { ... }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you expect std::panic to be part of core and if so how can there be a Box here (which is in the alloc crate)? Or if not; how will this work in a no_std scenario?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The panicking infrastructure already allocates and is already in libstd.

no_std executables need to define some lang items to implement the functionality for panicking.


/// Registers a custom panic handler, replacing any that was previously
/// registered.
///
/// # Panics
///
/// Panics if called from a panicking thread. Note that this will be a nested
/// panic and therefore abort the process.
pub fn set_handler<F>(handler: F) where F: Fn(&PanicInfo) + 'static + Sync + Send { ... }

/// A struct providing information about a panic.
pub struct PanicInfo { ... }

impl PanicInfo {
/// Returns the payload associated with the panic.
///
/// This will commonly, but not always, be a `&'static str` or `String`.
pub fn payload(&self) -> &Any + Send { ... }

/// Returns information about the location from which the panic originated,
/// if available.
pub fn location(&self) -> Option<Location> { ... }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah one thing I meant to ask, is there a reason this returns Location instead of &Location? In theory a Location could be at least Clone (storing a &'static str internally for the file perhaps) which would allow storing locations elsewhere if necessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember why I chose a Location<'a> vs a &Location off the top of my head, but &Location seems fine as well.

}

/// A struct containing information about the location of a panic.
pub struct Location<'a> { ... }

impl<'a> Location<'a> {
/// Returns the name of the source file from which the panic originated.
pub fn file(&self) -> &str { ... }

/// Returns the line number from which the panic originated.
pub fn line(&self) -> u32 { ... }
}
```

When a panic occurs, but before unwinding begins, the runtime will call the
registered panic handler. After the handler returns, the runtime will then
unwind the thread. If a thread panics while panicking (a "double panic"), the
panic handler will *not* be invoked and the process will abort. Note that the
thread is considered to be panicking while the panic handler is running, so a
panic originating from the panic handler will result in a double panic.

The `take_handler` method exists to allow for handlers to "chain" by closing
over the previous handler and calling into it:

```rust
let old_handler = panic::take_handler();
panic::set_handler(move |info| {
println!("uh oh!");
old_handler(info);
});
```

This is obviously a racy operation, but as a single global resource, the global
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried about this sort of restriction, it seems unfortunate today that a library basically can't use this API (as it'll inevitably come up as a use case). I wonder if perhaps set_handler could return the previous one, and perhaps also have take_handler exist for easier chaining? That way a robust library could use set_handler plus its own global state to store the returned handler (still racy, but perhaps "less so"), but I suppose that trading one race for another isn't that great

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're envisioning a library that wants to intercept and maybe filter panics? I think if we wanted to support those use cases in a fully race-free way, we'd want a stack of panic handlers that have control over propagation to the next handler. However, I'm not sure that those use cases are common, and that they'd be significantly impacted by the installation race, particularly since no other "equivalent" API in another language behaves like that.

There's really no reason for set_handler to not return the old one, so we might as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm not envisioning any particular use case in mind, this just the experience I've had from the standard library itself, specifically with regards to signals. A signal handler is some global state and we as a library want to deal with it sometimes (e.g. provide users nicer abstractions), so we may not be able to assume "only applications will reasonably use this API" is basically all I'm getting at.

Thinking more on this I'm not sure that trading one race for another is worth it, so without going for a stack (which I'd also like to avoid) I think the current API is fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other option to actually solve the race would be to wrap the handler in an RwLock, and adjust the API here so there's an update_handler function that returns a thing holding the write lock and having the take and set functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole thing reminds me of Windows Vectored Exception Handlers. Perhaps we could borrow design from there?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like some of the issues that a global panic handler involves (like races) would be trivial to avoid with a per-thread panic handler. The "alternatives" section mentions that it would be a possible extension, but it does not explore the consequences. The first ones that come up in my mind are:

  • these races would not be possible
  • there would be to make some design choice about what handler should be given to new threads
  • the panic handler of each thread is called at most once (except if it is propagated to other threads), hence its signature might be more restrictive

AFAICT these ideas were not explored in #1100 because its main focus was about avoiding the output from the on_panic handler, but some more details might be useful in this new RFC in order to better explain and compare the tradeoffs between global & per-thread handlers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadimcn

Hm that's a good point! It looks like the two primary things there (if I'm reading it right) are:

  • Handlers are stacked, not overwritten. This means that everything registered is called in sequence.
  • Handlers all have a token, and the token can be used to unregister the handler.

I think that scheme doesn't have the race problems I'm thinking about here, and it also provides the nice ability to "shut down" something and have handlers in theory scoped for smaller components. Additionally, you don't have to "by default try to be robust" because you don't have to worry about callling someone else's handler.

@sfackler, those semantics sound relatively appealing to me, what do you think?


@ranma42

The major downside of a thread-local handler is that you can't set a handler for yet-to-be-spawned threads. I believe that any sort of global handler scheme can be used to implement a thread-local handler scheme (e.g. with more restrictive interfaces).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The major downside of a thread-local handler is that you can't set a handler for yet-to-be-spawned threads.

That’s not true. We do execute arbitrary code after clone-ing and could easily copy handlers over if option is set or it is generally wanted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nagisa perhaps, yeah, but that means that threads not spawned in Rust won't have inheritance I believe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton The suggestion by @nagisa is exactly one of the possible choices for "what handler should be given to new threads". I understand that such approach would lose inheritance for threads spawned by non-Rust code, but currently the focus of the RFC seems to be even more restrictive: "the global panic handler should only be adjusted by applications rather than libraries". Anyway, my point is that it would be desirable to explain in the RFC itself (instead of just here in the discussion) some of these tradeoffs.

panic handler should only be adjusted by applications rather than libraries,
most likely early in the startup process.

The implementation of `set_handler` and `take_handler` will have to be
carefully synchronized to ensure that a handler is not replaced while executing
in another thread. This can be accomplished in a manner similar to [that used
by the `log`
crate](https://github.com/rust-lang-nursery/log/blob/aa8618c840dd88b27c487c9fc9571d89751583f3/src/lib.rs).
`take_handler` and `set_handler` will wait until no other threads are currently
running the panic handler, at which point they will atomically swap the handler
out as appropriate.

Note that `location` will always return `Some` in the current implementation.
It returns an `Option` to hedge against possible future changes to the panic
system that would allow a crate to be compiled with location metadata removed
to minimize binary size.

## Prior Art

C++ has a
[`std::set_terminate`](http://www.cplusplus.com/reference/exception/set_terminate/)
function which registers a handler for uncaught exceptions, returning the old
one. The handler takes no arguments.

Python passes uncaught exceptions to the global handler
[`sys.excepthook`](https://docs.python.org/2/library/sys.html#sys.excepthook)
which can be set by user code.

In Java, uncaught exceptions [can be
handled](http://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setUncaughtExceptionHandler(java.lang.Thread.UncaughtExceptionHandler))
by handlers registered on an individual `Thread`, by the `Thread`'s,
`ThreadGroup`, and by a handler registered globally. The handlers are provided
with the `Throwable` that triggered the handler.

# Drawbacks

The more infrastructure we add to interact with panics, the more attractive it
becomes to use them as a more normal part of control flow.

# Alternatives

Panic handlers could be run after a panicking thread has unwound rather than
before. This is perhaps a more intuitive arrangement, and allows `catch_panic`
to prevent panic handlers from running. However, running handlers before
unwinding allows them access to more context, for example, the ability to take
a stack trace.

`PanicInfo::location` could be split into `PanicInfo::file` and
`PanicInfo::line` to cut down on the API size, though that would require
handlers to deal with weird cases like a line number but no file being
available.

[RFC 1100](https://github.com/rust-lang/rfcs/pull/1100) proposed an API based
around thread-local handlers. While there are reasonable use cases for the
registration of custom handlers on a per-thread basis, most of the common uses
for custom handlers want to have a single set of behavior cover all threads in
the process. Being forced to remember to register a handler in every thread
spawned in a program is tedious and error prone, and not even possible in many
cases for threads spawned in libraries the author has no control over.

While out of scope for this RFC, a future extension could add thread-local
handlers on top of the global one proposed here in a straightforward manner.

The implementation could be simplified by altering the API to store, and
`take_logger` to return, an `Arc<Fn(&PanicInfo) + 'static + Sync + Send>` or
a bare function pointer. This seems like a somewhat weirder API, however, and
the implementation proposed above should not end up complex enough to justify
the change.

# Unresolved questions

None at the moment.