Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft RFC to add a must_not_await lint #16

Merged
merged 11 commits into from
Oct 9, 2020
85 changes: 85 additions & 0 deletions rfc-drafts/yield_safe_lint.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
# RFC: yield safe lint

# Summary

Introduce a `#[yield_unsafe]` lint in the compiler that will warn the user when they unsafely hold an unsafe yield struct across a yield boundary.
Copy link
Member

@yoshuawuyts yoshuawuyts Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[...] the compiler that will warn the user when they unsafely hold an unsafe yield struct across a yield boundary

It seems the term "unsafely" doesn't seem to refer to memory safety here? The terms "safe" and "unsafe" have very specific meanings in Rust. I would strongly caution against repurposing the terminology of safe and unsafe for anything unrelated to its current use.

Teaching people about unsafe Rust is pretty hard already, and it's often misunderstood. It would be bad for the language if the meaning of "unsafe" within Rust became dependent on context.

Instead I would suggest using terminology such as:

  • #[no_yield] / "can be yielded" / "cannot be yielded"
  • #[disallow_yield] / "allowed to be yielded" / "not allowed to be yielded"
  • #[deny_yield] / "allowed to be yielded" / "not allowed to be yielded"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(actually, this seems mentioned in "Unresolved questions" section)

There is a case where "safe" is used as a safe API in places where it doesn't have to do with memory safety (std::panic::UnwindSafe), but it seems "unsafe" doesn't.

  • #[deny_yield] / "allowed to be yielded" / "not allowed to be yielded"

It seems odd that the lint name warned by default is "deny_*".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a case where "safe" is used as a safe API in places where it doesn't have to do with memory safety (std::panic::UnwindSafe), but it seems "unsafe" doesn't.

I don't believe that's quite true. From the explainer in the docs of what "unwind safety" is:

It is possible, however, for logical invariants to be broken in Rust, which can end up causing behavioral bugs. Another key aspect of unwind safety in Rust is that, in the absence of unsafe code, a panic cannot lead to memory unsafety.

Which means that if a type is !UnwindSafe, it's not memory safe to unwind and the code won't compile. This seems quite different than what the #[yield_unsafe] lint introduces?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I wanted to say is that the term "safe" is used in APIs that you can never violate memory safety without other unsafe code, even if implemented incorrectly.

https://github.com/rust-lang/rfcs/blob/master/text/1236-stabilize-catch-panic.md#background-what-is-exception-safety-in-rust

Despite these methods to subvert the mitigations placed by default in Rust, a key part of exception safety in Rust is that safe code can never lead to memory unsafety, regardless of whether it panics or not. Memory unsafety triggered as part of a panic can always be traced back to an unsafe block.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another point on naming: "yield" isn't a documented term in Rust today. While I would very much like to use terminology that's future-proof for when generators stabilize, it seems like it would be quite confusing in the meantime.

Maybe we could use "async" terminology for now, and simply add "yield" terminology as a synonym in the future. The RFC should discuss future options and potential drawbacks, if any.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I am not sold on unsafe it was just the best way I could think of explaining it to people that were familiar with async concepts. I would love to hear more proposals I am very open to naming.

@tmandry yeah, that is a really good point. We should probably use await then? But that also limits us when we do get generators out (which idk when that will happen) so not a huge deal we can duplicate the macro.

  • #[deny_hold_await]
  • #[disallow_across_await] this seems promising and similar to what @yoshuawuyts suggested.


# Motivation

Enable users to fearlessly write concurrent async code without the need to understand the internals of runtimes and how their code will be affected. The goal is to provide a best effort warning that will let the user know of a possible side effect that is not visible by reading the code right away. Some examples of these side effects are holding a `MutexGuard` across a yield bound in a single threaded runtime. In this case the resulting generated future will resolve to `!Send` but could still hold the lock when the future yields back to the executor. This opens up for the possibility of causing a deadlock since the future holding onto the lock did not relinquish it back before it yielded control.

# Guide-level explanation

Provide a lint that can be attached to structs to let the compiler know that this struct is unsafe to be held across a yield boundary.

```rust
#[yield_unsafe]
struct MyUnsafeYieldStruct {}
```

This struct if held across a yield boundary would cause a warning:

```rust
async fn foo() {
let my_struct = MyUnsafeYieldStruct {};
my_async_op.await;
println!("{:?}", my_struct);
}
```

The compiler might output something along the lines of:


warning: Holding `MyUnsafeYieldStruct` across the await bound on line 3 might cause side effects.

Examples use cases for this lint:

- `MutexGuard` holding this across a yield boundary in a single threaded executor could cause deadlocks. In a multi-threaded runtime the resulting future would become `!Send` which will stop the user from spawning this future and causing issues. But in a single threaded runtime which accepts `!Send` futures deadlocks could happen.
- The same applies to other such synchronization primitives such as locks from ParkingLot.
- `tracing::Span` has the ability to enter the span via the `tracing::span::Entered` guard. While entering a span is totally normal, during an async fn the span only needs to be entered once before the `.await` call ,which might potentially yield the execution.

This lint will enable the compiler to warn the user that the generated MIR could produce unforeseen side effects. An unsafe yield struct is some struct that may cause side effects when it is stored internally to the generator. Some examples of this are:

- [`std::sync::MutexGuard`](https://doc.rust-lang.org/std/sync/struct.MutexGuard.html)
- [`tracing::span::Entered`](https://docs.rs/tracing/0.1.15/tracing/span/struct.Entered.html)

This will be a best effort lint to signal to the user about unsafety of using certain types across yield points.

(somewhere go into that async fn/generators generate new “code” that is different than what the users sees, this causes the problem)

Signal incorrect usage of `!Send` types across yield points.

# Reference-level explanation

**TODO**

- [ ] Draw ideas from `#[must_use]` implementation
- [ ] Go through generator implementation to understand how values are captured in state machine.
LucioFranco marked this conversation as resolved.
Show resolved Hide resolved

# Drawbacks
- There is a possibility it can produce a false positive warning and it could get noisy. We likely want to allow overriding via some sort of module level `allow` attribute.

# Rationale and alternatives
Copy link
Member

@yoshuawuyts yoshuawuyts Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why it was chosen to use an attribute instead of a trait, following std::panic::UnwindSafe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @nikomatsakis wanted us to also take a look at what is going on with MustUse. I think we still want this to be some sort of attribute similar to must_use but we could also support a trait style too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would want to parallel #[must_use], I think? But I've shifted my opinion and am now open to using a trait for both I think.



# Prior art
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A clippy lint was added: rust-lang/rust-clippy#5439


* [Clippy lint for holding locks across await points](https://github.com/rust-lang/rust-clippy/pull/5439)

# Unresolved questions

- What should we call a struct that should not be held across yield boundaries? `unsafe` is quite an overloaded term but is somewhat correct in this sense. That said, rust does not consider deadlocks as unsafe so technically the term is incorrect from rust’s definition.
- Better ways to disable the lint rather than using a `#[allow(yield_unsafe)]` at the top of the module? This would disable the lint for the entire module, possibly masking other issues.
- What happens in this situation?:
struct MyType<T> {
item: usize,
lock: MutexGuard<T>, // known to be unsafe to use across yield points
}
- Is it the user’s responsibility to add `#[allow(yield_unsafe)]` to the struct?
- Is it possible to “auto implement” the rule for structs containing any fields marked `[allow(yield_unsafe)]`?
From a usability perspective, the second option is much more desirable, as the user automatically gets this warning for “free”.

# Future possibilities