-
Notifications
You must be signed in to change notification settings - Fork 498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
distinct 'static' items never overlap #1657
base: master
Are you sure you want to change the base?
Conversation
I don't think you understood my example? I was positing that this seems to be a legal interpretation of the reference's current text: static ZEE: u8 = 0;
static ZED: u8 = 0;
assert_eq!(&raw const ZEE, &raw const ZEE);
assert_eq!(&raw const ZED, &raw const ZED);
assert_eq!(&raw const ZEE, &raw const ZED); |
Because this
cannot be a response to what I actually said if it is said with an understanding of what I was trying to say. :/ |
I think I understood the example? I don't understand how that can be a valid interpretation of the text. A static has a location, |
Oh, maybe I didn't quite understand the example. But statics are certainly intended to be unique and disjoint. That's their point -- they describe a place, distinct from all other places. More specifically, statics form their own allocated objects that don't overlap with any other allocated object. So in fact ZST statics are not quite unique -- but statics of type |
@rustbot label +T-opsem |
@rfcbot merge |
Team member @RalfJung has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. |
Lgtm, with the note that it's important this language remains restricted to static items, and not other language constructs that produce statics - const promotion, vtables, functions, etc. Will check my box when I'm off mobile, as apparently the GH app doesn't let you edit things anymore (or someone can do it for me :) ) |
You probably don't have edit rights here. I don't. You can use @
I would argue those aren't statics, they are other kinds of global allocations -- exactly because of this fundamental difference. |
@rfcbot reviewed |
hmm. how must the addressing work for this, then? static BLAH: &str = "blah";
static ALSO_BLAH: &str = "blah"; Are these potentially two different pointers to the same string literal? |
Yes. |
@rfcbot reviewed |
🔔 This is now entering its final comment period, as per the review above. 🔔 psst @RalfJung, I wasn't able to add the |
@rfcbot reviewed |
src/items/static-items.md
Outdated
program that is initialized with the initializer expression. This allocated object is disjoint from | ||
all other allocated objects. All references and raw pointers to the static refer to the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to guarantee that they are disjoint from other allocated objects? Or just other statics. IE. if I have:
static FOO: i32 = 0;
const BAR: i32 = 0;
fn foo(){
assert!(!core::ptr::eq(&FOO, &BAR));
}
Should we guarantee the assertion will always pass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have concern permissions, but I think this should be addresses by T-opsem before the end of the FCP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think we should guarantee that, and it's what the text already says, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That prevents emitting unnamed_addr
on const
items.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnamed_addr
specifically does coalescing of non-significant-address-items into each other, and potentially into a significant-address item.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LangRef says:
Note that a constant with significant address can be merged with a unnamed_addr constant, the result being a constant whose address is significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that is unfortunate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think we should guarantee that, and it's what the text already says, isn't it?
That's what the proposed text says, my question is whether that's what we want it to say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I... honestly don't think that's surprising at all?
That's exactly what I would expect in a case of
const BIG_CONST: BigFrozen = BigFrozen::big_init();
static BIG_STATIC: BigFrozen = BIG_CONST;
That's the precise situation where the const
will be unified "into" the static
, and where it would be a clearly beneficial optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I... honestly don't think that's surprising at all?
I guess we have different intuitions then.
But reality clearly works one way here, so I have updated the text to match reality. @chorman0773 please have a look.
@rfcbot reviewed |
I have a concern: this proposed definition prevents emitting optimization annotations on non- |
@rfcbot concern clarify-optimization-of-consts I think the proposed text and the previous text have the same meaning, which is unfortunate because I think that we'd already specified that consts are not merged into statics. But given the participation on this PR, and its title which seems to be about overlap of statics, and the fact that I think rustc currently does not implement what is documented here, I would like the implications for consts to be spelled out clearly. |
I gave that a shot, please have a look. |
@rustbot resolve clarify-optimization-of-consts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the concern is now resolved, I see no further issues with this.
I think that the definition of overlap of sets is clearly that they have empty intersection, but I think your definition makes more sense in this context. The difference of course is whether we want to say that I recall a related version of this issue: We currently allow ZST's to be magicked up anywhere, such that |
As mentioned above, "maigcking up" a ZST like that does not create a ZST allocation. It just creates a pointer/reference without provenance. Those are not the same thing. I wonder if we are constrained by LLVM here... @nikic do you know if there will be trouble with LLVM when we have a zero-sized static located "inside" another static (or inside a stack/heap allocation)? When I do |
I went surveying various embedded software implementations using Rust today. Many of them do in fact use patterns that would prefer to be able to place One or two do in fact do everything in the most tedious way (entirely |
Note that using the addresses of those statics as markers can be fine, but under no circumstances is it fine to access the same underlying global memory with pointers derived from different |
Do the embedded use-cases just need ZST markers that are directly adjacent to other statics (i.e. they share a start/end), or are they nested strictly inside them? If the latter, do you have any example to share? |
After some experimentation, I got this somewhat concerning result: https://llvm.godbolt.org/z/6qTvx3qxd I guess test1 answers your question in terms of what LLVM currently assumes at least, while test0 looks like an outright miscompile to me. (Alive seems to be struggling with zero-sized globals and thinks that everything is UB: https://alive2.llvm.org/ce/z/qFySyd Filed as AliveToolkit/alive2#1109.) |
I can go into some more detail after I get some sleep but a simplified example would look like this, so yes, strict inclusion: https://godbolt.org/z/eE6Wdjvnb Most of the ones I looked at were from fairly diligent and clued-in folks, so even the cases that wrote code that is probably still UB tbh went to fairly extreme lengths to write contorted code that they clearly are hoping will avoid the notice of an optimizer. And each was contorted in its own unique way. I am offering this simpler example not because it's an exact replica of what their code is like, but an example of what I think they could be doing if we can support this. (...also, because it doesn't involve everyone learning to read linker script so they can discuss it.) ...but uh, that's fairly alarming and we should probably bring this up to Alive. |
Could you add some comments that aid in interpreting the results? :) |
test0: zero-size global == start of alloca -> false test0 is a miscompile, and test1 depends on whether a zero-size global can be in the middle of an alloca or not. LLVM currently assumes it can't, but given that it also assumes it can't be at the start of one, it's hard to distinguish whether that's a bug or a feature :) I think if there is a consensus that we really want to allow zero-size statics that arbitrarily overlap with other allocations, we probably could get through a LangRef change to that effect and adjust InstSimplify accordingly. Though I think in this context, it's also important to distinguish between a) what addresses an extern static may be placed at (e.g. via linker script) and b) what addresses Rust itself can place a static. Even if we allow extern statics to overlap other statics, if you define a static directly in Rust, is Rust allowed to place it at an overlapping address? |
That helps, thanks! I don't have a strong opinion on whether should allow zero-size statics that arbitrarily overlap with other allocations. If it can be done in LLVM without impacting relevant optimizations, I would generally err on the side of having less UB. |
My impression is that from the perspective of embedded developers, the artificial distinction we draw between And as I've mentioned, what's more important is having a blessed pattern that we can recommend to this community, rather than telling them what not to do, which mostly leads to them producing obfuscated code:
|
Yes: |
I have opened rust-lang/unsafe-code-guidelines#546 to try to capture some of the discussion from here about the ZST issue and will try to expand on the embedded use-cases in that issue or the also-relevant rust-lang/unsafe-code-guidelines#545 |
@scottmcm okay, so that would place ZST constant allocations at Do zero-sized local variables get an |
That comment, written during the meeting in the course of discussion, was very much meant to be an early indicator of "this raised some eyebrows in a meeting around what seemed like an unaddressed corner case; here were some various thoughts that came up". It was not meant to convey any kind of definitive conclusion on What Behavior The Team Wants, just to start a discussion. That said, I could absolutely have spelled that out more explicitly. Sorry if it came across as a Decision Being Made rather than a Discussion Being Had. |
For what it's worth, my answer above was given without having read too much of the backscroll, and now that I have I'm leaning more toward allowing ZST allocations (or "allocations") to be anywhere, including strictly inside another allocation, consistently for all purposes. But for the purpose of this RFC, I'm also fine with just punting on the question for now. |
Today they do, though I in fact have an open exploratory draft PR considering changing that: |
Jubilee, can you clarify what you are saying about embedded uses-- are you saying that the projects are relying on a "bytes in common" definition (i.e., they have ZSTs whose pointer value is a byte that is a member of another static)?
|
The LLVM issue is fixed by llvm/llvm-project#115728. |
Team member @traviscross has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. |
Checking off the boxes for members of T-opsem as these were checked off in the earlier FCP. |
I think the clarification proposed by @joshtriplett (or something more explicit) should be included, but otherwise I'm happy to go ahead with this. @rfcbot reviewed |
Co-authored-by: Josh Triplett <[email protected]>
It seems like so far we did not actually guarantee this.
While we are at it, also clarify that static initializers can read even mutable statics, and what happens in that case.