Should we have a language concept of erroneous behavior? #512
Comments
I definitely think we should have this tool in our toolbox (though for most of the items you listed I am far from convinced that we want to use this tool there).
I think the space of things that are well-defined but not endorsed is a bit bigger than what we're intending to use this term for. So I'd like to see a better exploration of what would motivate us to call something erroneous behavior.
A historic example of a similar thing appearing in the surface language is integer overflow, from RFC 560.
https://rust-lang.github.io/rfcs/0560-integer-overflow.html
I vaguely remember older versions of this text using the term "program error" to describe this sort of behavior.
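To make the RFC 560 example concrete, here is a minimal sketch. The `Sum` enum and `classify_add` helper are hypothetical names introduced only for this illustration; the integer methods are from the standard library. A plain `a + b` that overflows is never UB: it panics when overflow checks are enabled (the default in debug builds) and otherwise wraps, which is the "defined, but still a bug" shape under discussion here.

```rust
/// Hypothetical helper type: what a `u8` addition actually did.
#[derive(Debug, PartialEq)]
enum Sum {
    /// The mathematical result fit in a `u8`.
    Exact(u8),
    /// The result did not fit; `wrapped` is the two's-complement value.
    Overflowed { wrapped: u8 },
}

/// Classify `a + b` without ever triggering the overflow panic.
fn classify_add(a: u8, b: u8) -> Sum {
    match a.overflowing_add(b) {
        (value, false) => Sum::Exact(value),
        (value, true) => Sum::Overflowed { wrapped: value },
    }
}

fn main() {
    let (a, b) = (250u8, 10u8);
    // Explicit spellings say which behavior the programmer intends,
    // so none of these count as a program error.
    println!("{:?}", classify_add(a, b));
    println!("{:?}", a.checked_add(b));
    println!("{}", a.wrapping_add(b));
}
```

The explicit methods (`checked_add`, `wrapping_add`, `overflowing_add`) are the endorsed way to ask for a particular overflow result; only the unannotated operator is treated as a mistake when it overflows.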
C++ is also doing work here, see
Seeing EB get used for C++ is what initially got me thinking about its potential use for Rust, in fact, although I'm far from fully aware of the extent of its implications for C++. The talk around wanting sanitizers to be able to catch mutation of
If this is a tool we want to have available, it's a good idea to understand what the tool does and when we do want to use it. So I'll properly ask: is the following an accurate description of what EB would mean in the context of Rust?
If that description is accurate, I'd consider it reasonable to close this issue with an answer of yes, or to keep it open and transition it into a kind of tracking issue, depending. If that description isn't accurate, we should come up with one that is, so that we can use EB if and when it becomes appropriate.
That reflects what I had in mind as well. Another potential use case for this is ABI mismatches: some of the ones that are currently well-defined are things that the CFI folks would like to reject as exploit mitigation. We could use EB as a way to carve out which calls they may reject without having to make those calls UB. But if we close this issue I feel like we should document this somewhere.
Indeed, similar wording has been at e.g. https://doc.rust-lang.org/stable/reference/behavior-not-considered-unsafe.html since 2017 (even if not normative), so I considered Rust to already have EB in practice. In fact, seeing the advantages of it in Rust is why I started supporting introducing EB and splitting it from UB for C and C++ in WG14 a few years ago.
Define "erroneous behavior" as an operation that has a defined result (does not cause UB) but is still considered incorrect for a program to perform. This endorses sanitizing environments (such as Miri, Valgrind, or CHERI) diagnosing the presence of erroneous behavior and halting program execution.
I would expect the cases of erroneous behavior to be fairly limited, but it seems like it could be beneficial for those cases where an operation is defined not because we're endorsing doing it but because making it undefined instead would be worse. Potential cases include:
- let bound value without mut and lacking any internal/shared mutability. [1]
- UnsafeCell. [2]
- Arc reference count exceeding its maximum and aborting. [7]

(Please don't discuss whether these examples should be allowed or not here; use the issue for each case for that. This issue should focus on whether this is a class of behavior we want to officially recognize as defined but erroneous.)
If UB is an Abstract Machine error, you could think of erroneous behavior as an AM warning. Sanitizers could always choose to diagnose even without any "permission" from the spec, but this would still be a "false positive," and people tend to write code that relies on doing a thing when you tell them that doing the thing is allowed, even if it's discouraged to do so.
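The error/warning framing can be written down as a small toy model. Everything here (`Behavior`, `Environment`, `Response`, `response`) is a hypothetical name made up for this sketch and not part of any real tool or spec. `None` stands for "the spec guarantees nothing", and the sanitizer is assumed to actually detect the condition.

```rust
/// How the Abstract Machine classifies an operation (toy model).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Behavior {
    Defined,
    Erroneous,
    Undefined,
}

/// Where the program is running (toy model).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Environment {
    Plain,
    Sanitizing,
}

/// What the environment does in response.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Response {
    Continue,
    Halt,
}

/// Returns `None` when no outcome is guaranteed at all.
fn response(env: Environment, behavior: Behavior) -> Option<Response> {
    match (env, behavior) {
        // Endorsed operations run everywhere.
        (_, Behavior::Defined) => Some(Response::Continue),
        // EB still has a defined result in an ordinary execution...
        (Environment::Plain, Behavior::Erroneous) => Some(Response::Continue),
        // ...but a sanitizer is explicitly permitted to diagnose and stop.
        (Environment::Sanitizing, Behavior::Erroneous) => Some(Response::Halt),
        // Assumes the sanitizer detects this particular UB.
        (Environment::Sanitizing, Behavior::Undefined) => Some(Response::Halt),
        // UB outside a sanitizer: anything may happen.
        (Environment::Plain, Behavior::Undefined) => None,
    }
}

fn main() {
    for env in [Environment::Plain, Environment::Sanitizing] {
        for behavior in [Behavior::Defined, Behavior::Erroneous, Behavior::Undefined] {
            println!("{:?} + {:?} -> {:?}", env, behavior, response(env, behavior));
        }
    }
}
```

The only row that changes when a behavior moves from "defined" to "erroneous" is the sanitizing one, which is why a diagnosis there stops being a false positive.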
Footnotes
1. Miri could in theory diagnose this similarly to how the immutability of static places is enforced. However, the optimization potential is questionable, and delayed initialization of let bindings makes it less straightforward, since the place is actually just mutable until it isn't.
2. Stacked Borrows prohibits this for structs/tuples/arrays, but Tree Borrows tags the full reference range uniformly based on any use of UnsafeCell.
3. Miri already warns when this occurs. IIUC CHERI deterministically segfaults the process when trying to read/write through a spoofed pointer. This could also apply to other fun code crimes like abusing non-pointer-layout carriers of provenance in a way that breaks on CHERI.
4. Making reference retagging depend on the contents of the retagged memory appears to have no optimization benefits and would still require the concept of shallow validity to exist. However, it could be beneficial to assign blame to the creator of an unsafe reference rather than only diagnosing the symptom once a read occurs.
5. I don't know many details here, but IIRC CFI checks want to catch pointer type mismatch and Rust would prefer pointer ABI to only care about the unsized tail kind.
6. This would imply that cfg(ub_checks) is a kind of lightly sanitizing environment.
7. Okay, this one is definitely a stretch and is just the runtime behavior of the code as written, but it provides a framework for interpreting the choice between causing an abort or saturating and leaking like the Linux kernel would prefer.
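The abort-versus-saturate choice mentioned in the last footnote can be sketched with a toy counter. This is not how Arc or the Linux kernel is actually implemented: `Policy` and `Count` are hypothetical, the limit is artificially small so overflow is reachable, and the memory orderings are simplified to `Relaxed`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical overflow policies for a reference count.
#[derive(Clone, Copy)]
enum Policy {
    /// Stop the process when the count would exceed `max`.
    Abort,
    /// Pin the count at `max` forever and leak the value.
    Saturate,
}

/// Toy reference count; starts at 1 like a freshly created owner.
struct Count {
    value: AtomicUsize,
    max: usize,
    policy: Policy,
}

impl Count {
    fn new(max: usize, policy: Policy) -> Self {
        Count { value: AtomicUsize::new(1), max, policy }
    }

    /// Adds one reference and returns the count afterwards.
    fn increment(&self) -> usize {
        let result = self.value.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |c| {
            if c < self.max { Some(c + 1) } else { None }
        });
        match (result, self.policy) {
            (Ok(previous), _) => previous + 1,
            (Err(_), Policy::Abort) => std::process::abort(),
            (Err(stuck), Policy::Saturate) => stuck,
        }
    }

    /// Drops one reference; returns true when the count reached zero.
    fn decrement(&self) -> bool {
        let pinned = matches!(self.policy, Policy::Saturate);
        let result = self.value.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |c| {
            // A saturated count is never decremented, so the value leaks.
            if c == 0 || (pinned && c == self.max) { None } else { Some(c - 1) }
        });
        result == Ok(1)
    }
}

fn main() {
    let count = Count::new(3, Policy::Saturate);
    count.increment();
    count.increment();
    println!("after overflow attempt: {}", count.increment());
    println!("freed: {}", count.decrement());
}
```

Under either policy the overflow has a fully defined outcome; the EB framing would only add that reaching it is a program error that a sanitizing environment may report.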