Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it Ok for the compiler to crash if proc_macro contains unsound code? Documentation is unclear. #106160

Open
VorfeedCanal opened this issue Dec 26, 2022 · 5 comments
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools

Comments

@VorfeedCanal
Copy link

I'm not really sure whether that's a documentation issue or really a compiler bug.

When we were discussing bug in the indoc crate H2CO3 made shocking (to me) assertion: even if there is a soundness issue in a crate, this should not cause the compiler to crash.

Note that crate in question is a proc_macro crate which, apparently, can create broken String (which doesn't contain valid UTF-8) and thus I, naïvely, assumed that if it breaks The Soundness Pledge then compiler may only burn and crash in flames… but I have just checked and, indeed, reference doesn't even mention soundness.

While the issue with indoc is a minor one to me, the question of whether compiler should/could expect that procmacro contains valid code is a bit different one.

P.S. Also it sounds as if issue with indoc itself is problem with a compiler, too (indoc is tiny crate with no unsafe code thus it, probably, should be able to be unsound), but that's separate.

@bjorn3
Copy link
Member

bjorn3 commented Dec 27, 2022

If a proc macro itself invokes UB when running, rather than produce code that is could invoke UB, I don't think it is unexpected that the compiler may crash. It might just as well overwrite the stack with 0xffffffff. The only way to prevent that would be to sandbox the proc macro such that it can't mess with the memory of rustc and the api boundary checks validity of all values.

@Noratrieb
Copy link
Member

The indoc issue does appear to be a rustc bug, I'll take a look.

@Noratrieb
Copy link
Member

I opened an issue for the referenced indoc example which is not a bug or unsoundness in indoc but an unrelated compiler bug: #106191

@saethlin
Copy link
Member

saethlin commented Dec 27, 2022

I think the point of that informal(?) rule is to make it clear that no matter how convoluted your program is, a rustc crash/ICE is a bug that we should fix. Implementors aren't allowed to ignore input programs because they would execute UB.

Proc macros make this really complicated because not only are they just not sandboxed at all, they are both compiled (and in this case, the buggy code compiles fine!) and executed, at which point technically all bets are off. So we can't have both an airtight no crashes guarantee and have UB actually mean undefined, in the face of proc macros.

From a practical perspective, like @bjorn3 pointed out, the UB clearly wins but with good sandboxing we could make this a non-issue.

@workingjubilee workingjubilee added C-bug Category: This is a bug. A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools and removed C-bug Category: This is a bug. labels Mar 11, 2023
@workingjubilee
Copy link
Member

During compilation, it is allowed for a compilation to evaluate to a UB result. The Rust compiler's contract, if it ever included that sort of "soundness pledge", was weakened by the acceptance of the "const UB" RFC. This RFC explains that the damage of UB during compilation, or rather specifically during CTFE, should be contained to that evaluation, but may persist into the final binary.

So is not permitted for "const UB" to delete random files on your machine. However, it is permitted to const-eval an array, if evaluation incurs UB, in a way that replaces that array with the entire script of the "Bee Movie". It is more important, for various reasons, that all attempts to view that array provide a consistent result (i.e. the "Bee Movie" script) than that it be a valid array of whatever type it was before.

Whether the same applies to proc macros was not answered by that RFC, but I imagine such extensional reasoning would be a sensible interpretation by a conforming implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools
Projects
None yet
Development

No branches or pull requests

5 participants