Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why can consts not refer to statics? #11

Open
RalfJung opened this issue Oct 8, 2018 · 51 comments
Open

Why can consts not refer to statics? #11

RalfJung opened this issue Oct 8, 2018 · 51 comments

Comments

@RalfJung
Copy link
Member

RalfJung commented Oct 8, 2018

Is there some deep fundamental reason why that can never work, or is it a limitation of the current (or a former) implementation?

@oli-obk
Copy link
Contributor

oli-obk commented Oct 8, 2018

I think the root issue is

static FOO: AtomicUsize = AtomicUsize::new(42);

// hides the static behind a regular reference
const BAR: &AtomicUsize = &FOO;
const fn bar() -> usize {
    BAR.swap(99, Ordering::Relaxed)
}

Multiple calls to bar at runtime will yield different results and compile-time calls will error out.

@oli-obk
Copy link
Contributor

oli-obk commented Oct 8, 2018

Additionally

const WAT: usize = BAR.load(Ordering::Relaxed);

seems very weird, because WAT always has the compile-time value of the static, even though the static can change.

@RalfJung
Copy link
Member Author

RalfJung commented Oct 8, 2018

Okay, so this is actually again the same problem as with promoteds: It's about statics that could have been mutated. Once again a rule for promoteds is actually a rule for statics.

Maybe we should make the CTFE engine error out on such accesses, just to make sure the static checks didn't miss anything? Not sure how to do that, though.

@RalfJung
Copy link
Member Author

RalfJung commented Oct 8, 2018

Oh, and this does not explain why the following is not allowed:

static FOO: usize = 13;
const BAR: &usize = &FOO;

Is that just a limitation of the analysis?

@Ixrec
Copy link

Ixrec commented Oct 8, 2018

Maybe a desirable limitation. If that was lifted, we'd get a backwards-compatibility hazard where the lack of privates with interior mutability becomes a non-obvious part of your public API, right? (though I guess it's fine for primitives like usize)

@RalfJung
Copy link
Member Author

RalfJung commented Oct 8, 2018

@Ixrec I am talking about taking the address of a static without ever using it. That seems fine no matter the type.

Of course the question is how we can be sure that it won't be used. Between ruling out &STATIC and *r, it seems better to have the latter and not the former.

@rpjohnst
Copy link

rpjohnst commented Apr 9, 2020

I just hit a use case where some way for a const to refer to a static would be extremely valuable:

My goal was to convert a string interner from handing out indices, to handing out direct pointers. This would have been a performance optimization for programs that frequently do more than simply compare string identities- e.g. printing them out, checking metadata.

Pointers from consts to static come into this because I already had a large collection of consts for "well-known" interned strings, which I used in patterns. With indices, those consts were just (wrapped) integers, the interner constructor inserted the strings in the same order, and everything worked fine. With pointers, all the tests started failing, because separate uses of a const would point to separate copies of the string!

Currently, the only way in Rust to enforce that all parts of a program share a common address in memory is to use a static. But statics don't work in patterns (nor would that really make sense, AIUI), and consts can't point to them, so I'm faced with a frustrating choice: either give up the consts (and the pattern matching), or give up the optimization.

@RalfJung
Copy link
Member Author

RalfJung commented Apr 9, 2020

@rpjohnst to be clear, computing the value of your constants would not actually read from the static, they would just reference it?

I think that should be soundly possible. We even already should have the const-eval engine checks in place which ensure that you do not read through those references. The main problem however is that I do not see a good way to do such a more precise check (allow references but not reads from statics) pre-monomorphization -- so this would result in more post-monomorphization errors, which is not great.

@rpjohnst
Copy link

rpjohnst commented Apr 9, 2020

Yes, I could write it such that computing the value of a constant does not read from a static and only takes its address (or the address of one of its fields, the way I wrote it, which should be just as sound).

(Not related to this issue, but that makes it slightly uglier unless we also get unsized statics or limited type inference for statics, because their type would be StaticEntry<[u8; N]> with a different N for each string. Before I hit this issue I worked around that with an unsizing conversion to &StaticEntry<[u8]>, which makes computing a const read that reference from the static.)

To confirm my understanding of why this would need to be a post-monomorphization check, is this the sort of thing you meant by a choice between "ruling out &STATIC and *r"? https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=4bdb8ceb56797048ba9a406609453f4a

I wonder if we couldn't lift the restriction only for non-generic consts, at least in the short term. Lifting it for generic consts would require some sort of trait-like tracking, for which I can imagine two possibilities at the moment:

  • Forbid reading any statics in consts, which would require trait-like logic to apply to values, I think? A post-monomorphization check is probably much simpler than this.
  • Piggyback it on Freeze or something like that, so consts would be able to read from statics that can't change. (IIUC this would also be okay?) Leaving this option open would mean being careful how much we allow for non-generic consts in the meantime.

@RalfJung
Copy link
Member Author

RalfJung commented Apr 10, 2020

(Not related to this issue, but that makes it slightly uglier unless we also get unsized statics or limited type inference for statics, because their type would be StaticEntry<[u8; N]> with a different N for each string. Before I hit this issue I worked around that with an unsizing conversion to &StaticEntry<[u8]>, which makes computing a const read that reference from the static.)

We could even allow that, reads from immutable statics (no mut, no interior mutability) are fine, actually. Just so far, to be conservative, we have rejected reads from all statics.

To confirm my understanding of why this would need to be a post-monomorphization check, is this the sort of thing you meant by a choice between "ruling out &STATIC and *r"?

Yes, that is a great example.

Another way to say this is that we are looking for the property "does this read from a static" in the const-initializer, and that const-initializer is written in a turing-complete language. Thus whether or not the property holds is undecidable (in the "halting problem" sense). So, even without associated consts, basically the only thing we can do is to just run the initializer and see if it does the bad thing. That is in contrast to now where we do an overapporixmative analysis and reject referring to statics, which then also rules out any attempts to read from them.

I wonder if we couldn't lift the restriction only for non-generic consts, at least in the short term.

We could, but that would be a very odd "split" to have, I think. There is no precedent for a split like that as far as I know, and I am undecided if this is better or worse than a post-monomorphization check.

@rpjohnst
Copy link

rpjohnst commented Apr 10, 2020

Wait, if we can allow reads from truly-immutable statics then couldn't we just always allow that, even around the generic cases?

The results of "does this reference a !Freeze static" should be equally the same across monomorphizations as today's "does this reference any static." (Though really it would have to be any static that is or can itself refer to anything !Freeze.)

Does that sort of rule feel any better than a post-monomorphization check or the odd "non-generic" split? It should remove the need to actually run the initializer, though it does add a bit of extra information that needs to be propagated across statics (though not consts).

Edit: this would also rely on not having associated/generic statics, I think? We don't currently, but if we added them this sort of analysis would have to become post-monomorphization again, or else we'd get the weird "non-generic" split again.

@RalfJung
Copy link
Member Author

Wait, if we can allow reads from truly-immutable statics then couldn't we just always allow that, even around the generic cases?

I think this is mostly legacy. And you raise a good point, but I need to explain some background to translate this proposal into something more concrete.

Some background

We have two layers of defense against misbehaving consts currently: we have some static checks (in the sense of static analysis, not the static keyword) that take the source-code, analyze it, and try to figure out if running that code could do anything bad. If it could, we throw an error. This is necessarily conservative, i.e., we will reject some code that would not do anything bad -- this is basically the halting problem. Historically, these were the only checks we had.

Over time, we added a second line of defense. When the code is "run" in the interpreter that evaluates const/static initializes, we have some dynamic checks that look at what the code actually does, and when it does something bad, we stop the code. This analysis is precise, meaning it should catch all errors and essentially make the static analysis "unnecessary" (modulo what is tracked in #17 and any other concerns that we missed...), but it runs late, after monomorphization, because it needs all the data to actually compute the initializer.

Current status

The dynamic check currently rejects all reads from statics. This is slightly conservative, we could permit reads from immutable statics, but @oli-obk wanted to start conservative here and I didn't push back. I think we can relax this.

The static check rejects even mentioning any static, because that is the only way we know to be sure that a static won't be read from later. Permitting the code to create a reference to a static, but then ruling out that the reference is actually read from, is not feasible.

Your proposal

I think you are proposing to

  • Change the dynamic analysis to permit reads from immutable statics.
  • Then, change the static analysis to permit taking references to statics that are guaranteed immutable (so if we ever have "associated/generic statics", which I think we won't, but if we do this is still fine as consts can just conservatively not reference them). With the relaxed dynamic checks, I think our static checks are then still sound wrt. our dynamic checks.

I think that should be fine, and it should not introduce any new post-monomorphization errors. @rust-lang/wg-const-eval what do you think?

rpjohnst added a commit to rpjohnst/dejavu that referenced this issue Apr 12, 2020
This relies on the extensions to Rust's `const`s described in issue
rust-lang/const-eval#11, and will not compile today.
rpjohnst added a commit to rpjohnst/dejavu that referenced this issue Apr 12, 2020
This relies on the extensions to Rust's `const`s described in issue
rust-lang/const-eval#11, and will not compile today.
@oli-obk
Copy link
Contributor

oli-obk commented Apr 14, 2020

The problem is that any static (even a truly immutable one) can refer to other statics (which may then have interior mutability). So we'd need to enforce transitive immutability. Of course we can do that, but the last time I tried this became a non-fun excercise. Basically we run into query cycles for cyclic statics whenever we try to do such an analysis based on the value of a static.

We could do a conservative type based analysis though. Basically any static containing only Freeze types and no raw pointers or unions should be fine. This does mean though that static FOO: Option<AtomicUsize> = None; is rejected from being used in constants.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 14, 2020

Last attempt: rust-lang/rust#66302

@rpjohnst
Copy link

I would love to try implementing this, even just the conservative type-based version.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 14, 2020

Cool, so I guess the first step is to give https://github.com/rust-lang/rust/blob/57e1da59cd0761330b4ea8d47b16340a78eeafa9/src/librustc_mir/transform/check_consts/ops.rs#L341 a feature gate that turns it on. Then you need to make it more fine grained by giving https://github.com/rust-lang/rust/blob/57e1da59cd0761330b4ea8d47b16340a78eeafa9/src/librustc_mir/transform/check_consts/ops.rs#L340 a field with the type of the static, so the NonConstOp can use different feature gates for differently typed things.

I was initially considering doing the type check via an is_freeze check for the type and if that succeeds, then look for raw pointers, but that is very complex and I fear we'll introduce bugs.

Instead we can copy what is_freeze does and create a new trait in https://github.com/rust-lang/rust/blob/f6fe99c798cb65280a9a56f442b371adcb7b8aa2/src/libcore/marker.rs#L680 that gives us exactly what we need.

#[lang = "conservative_transitive_freeze"]
pub(crate) unsafe auto trait ConservativeTransitiveFreeze {}

impl<T: ?Sized> !Freeze for UnsafeCell<T> {}
impl<T: ?Sized> !Freeze for *const T {}
impl<T: ?Sized> !Freeze for *mut T {}

// `PhantomData` is guaranteed immutable.
unsafe impl<T: ?Sized> Freeze for PhantomData<T> {}

Then we repeat all the boilerplate that exists for freeze, but just name it conservative_transitive_freeze instead.

@RalfJung
Copy link
Member Author

@rpjohnst so nowhere in your use-case is there interior mutability, even in the types, that could become a problem here?

@oli-obk ah good point, I had not considered transitive accesses. With unsafe code, type-based analyses can probably be circumvented, but I presume having post-monomorphization errors when people write "unconst" code is acceptable?

@rpjohnst
Copy link

@oli-obk Thanks! I should be able to try that out soon.

@RalfJung Right, the string interning case doesn't use interior mutability or raw pointers anywhere in the types of statics. It does involve raw pointers to those statics, in consts, but that should be fine?

@oli-obk
Copy link
Contributor

oli-obk commented Apr 15, 2020

With unsafe code, type-based analyses can probably be circumvented, but I presume having post-monomorphization errors when people write "unconst" code is acceptable?

Yes, I don't think we can ever protect us from unconst errors without post monomorphization errors. We can't even protect us from post monomorphization integer math overflows.

@RalfJung
Copy link
Member Author

@rpjohnst turns out we are running right into some deep soundness concerns that I am trying to understand and document in #42. I think until we got those in order, we should probably not make any changes here -- sorry.

@rpjohnst
Copy link

rpjohnst commented Apr 28, 2020

Sounds good to me, I was holding off on updating rust-lang/rust#71332 until I could better understand those anyway.

@RalfJung
Copy link
Member Author

rust-lang/rust#71655 landed and I am out of "panic mode" here -- discovering a new soundness constraint like this so late was something of a shock for me.^^ (I also somehow missed that using these consts in patterns is a key feature for you.)

So here's, I think, roughly what would need to happen to allow consts-pointing-to-immutable-statics (and even reading from them) in our dynamic checks: currently we have a global invariant that pointers to (potentially mutable) statics can only exist in statics. Once we allow consts to read from statics, that could be used to leak pointers to mutable statics into a const evaluation. We could try to control that leakage, but honestly that seems like playing with fire to me.

Instead I think we should ditch the global invariant, and directly check the property that we need: references in consts must point to read-only allocations. This is easy to add to reference validation, but the annoying part is that we currently stop const validation when encountering a static defined in another crate. That could miss references to mutable memory. So we have to keep going. (If this is a foreign static, we should probably just error, as we cannot check their body.)

If this is too expensive for performance, we could have an analysis that remembers for each static "is mutable memory reachable from this". Then we could easily allow consts to point to statics where that is not the case. The trouble is, that is rather coarse-grained, so just going on recursively would be better.

@oli-obk what do you think?

@rpjohnst
Copy link

If I understand this correctly, the "new" constraint here is that using a const in a pattern recursively dereferences all references it contains, const-ly. Or in other words, this is a bigger version of the same problem we discussed above- if a const holds a reference to mutable memory, some other unsuspecting const context may dereference it. There, we assumed references in a const could only be dereferenced in two ways:

  1. In other const contexts, where the dynamic check would catch it, or
  2. At runtime, where it's fine.

But there is a third way:

  1. In a pattern, which effectively const-evals the whole allocation graph reachable from the const.

So... do I understand this (how patterns work) correctly? That is, shouldn't the dynamic check catch any attempts to read mutable state while const-eval-ing consts in patterns? And you're instead talking about validation, and using it to ensure all consts can be used in patterns?

Assuming I'm not horribly misunderstanding any of that, this new validation check sounds like what @oli-obk described in #11 (comment), where a value-based analysis must somehow deal with cycles.

Given that an end user should only ever see evaluation-time or validation errors if they are using unsafe to circumvent the conservative static check, this is mostly just a question of where we feel safest checking this, right? Or is there some reason validation needs to catch any problems before they reach the dynamic check via patterns?

@oli-obk
Copy link
Contributor

oli-obk commented May 1, 2020

Given that an end user should only ever see evaluation-time or validation errors if they are using unsafe to circumvent the conservative static check, this is mostly just a question of where we feel safest checking this, right? Or is there some reason validation needs to catch any problems before they reach the dynamic check via patterns?

I believe it would be fine to go with pattern expansion time checks right now. In the future, we may allow associated constants as patterns, at that point we need to revisit this problem. Though I worry that by allowing it now, we can't fix it in the future for associated constants without monomorphization time errors.

@oli-obk
Copy link
Contributor

oli-obk commented May 1, 2020

If this is too expensive for performance, we could have an analysis that remembers for each static "is mutable memory reachable from this". Then we could easily allow consts to point to statics where that is not the case. The trouble is, that is rather coarse-grained, so just going on recursively would be better.

We need to do this anyway, as the reference may not point at the static directly, but at a field of the static. So if we point to .1 of

static FOO: (AtomicUsize, usize) = ...;

then we encounter a mutable allocation, but not something that will ever be mutable. So we basically just continue with the type based analysis, and if the user did some unsound stuff in order to get a &T to an &UnsafeCell<U>, then that's their fault, we don't need to figure this out and complain.

@RalfJung
Copy link
Member Author

RalfJung commented May 1, 2020

If I understand this correctly, the "new" constraint here is that using a const in a pattern recursively dereferences all references it contains

That is a way to put it, yes.

That is, shouldn't the dynamic check catch any attempts to read mutable state while const-eval-ing consts in patterns?

The evaluation doesn't happen in const-eval though. The evaluation happens at run-time using PartialEq::eq.

Assuming I'm not horribly misunderstanding any of that, this new validation check sounds like what @oli-obk described in #11 (comment), where a value-based analysis must somehow deal with cycles.

No, now you are confusing the static checks and the dynamic checks. I was only talking about dynamic checks. "Value-based" vs "type-based" refers to the static checks. See this post for some more background on those two independent checks that we have.

I was solely talking about how to relax the dynamic checks to allow your use-case. Once we agree on what the dynamic checks should permit or not, we can try to design static checks that approximate the dynamic checks as good as we can. But currently relaxing the static checks would be pointless as the program would still be rejected by the dynamic checks.

@oli-obk

I believe it would be fine to go with pattern expansion time checks right now.

Are you referring to static or dynamic checks here?

then we encounter a mutable allocation, but not something that will ever be mutable

I would not permit that case. It is basically impossible to have a dynamic check that allows this. Or maybe it is possible if we adopt Stacked Borrows into CTFE, but... (a) do we really want that, (b) maybe let's do the simpler thing first.

@oli-obk
Copy link
Contributor

oli-obk commented May 1, 2020

Are you referring to static or dynamic checks here?

neither. What I mean is to relax the dynamic checks for constants referring to statics and only do the pattern checks in the pattern match logic, in order to allow as much as possible in regular constants while preventing the use of some constants in patterns.

@RalfJung
Copy link
Member Author

RalfJung commented May 1, 2020

Hm... yes I guess that would be an alternative. However, unless there are also plans to somehow do this in the static checks (so that consts pointing to interior mutable statics could actually become stable one day), I am not sure if the discrepancy is worth it? This approach poses the risk that patterns (or const generics) could use an "unsuited" const and forget to run the stricter check on it.

then we encounter a mutable allocation, but not something that will ever be mutable

Btw, this strongly reminds me of rust-lang/rust#67534.

@oli-obk
Copy link
Contributor

oli-obk commented May 2, 2020

This approach poses the risk that patterns (or const generics) could use an "unsuited" const and forget to run the stricter check on it.

For const generics, we must run a stricter check anyway, as it can't refer to statics at all afaik. Unless, maybe it would be ok to print relocations into statics by the ID of the static. That won't work for pointers into the interior though, but we don't have that right now anyway.

Patterns could indeed use a constant wrongly, because they fall back to == comparison on the full constant in some cases.

Btw, this strongly reminds me of rust-lang/rust#67534.

ugh... that one.

Ok back to the topic at hand.

Let's teach validation (and maybe interning needs some changes, too?) that constants can refer to statics by removing the check for that and adding checks that constants only refer to immutable allocations. This may require interning to overwrite the mutability of some allocations to immutable during interning, but that's fine.

@RalfJung
Copy link
Member Author

RalfJung commented May 2, 2020

For const generics, we must run a stricter check anyway, as it can't refer to statics at all afaik.

Why would const generics not be able to point to immutable statics? So far I assumed const generics and patterns (with exhaustiveness checking) have basically the same soundness concern. In particular, both require a form of "identity"/"equality" on constants that most behave properly for things to be sound.

Let's teach validation (and maybe interning needs some changes, too?) that constants can refer to statics by removing the check for that and adding checks that constants only refer to immutable allocations.

Okay we agree then. :)

This may require interning to overwrite the mutability of some allocations to immutable during interning, but that's fine.

I think it already does that -- I'd be surprised if we needed any more overrides.

@oli-obk
Copy link
Contributor

oli-obk commented May 2, 2020

Why would const generics not be able to point to immutable statics? So far I assumed const generics and patterns (with exhaustiveness checking) have basically the same soundness concern. In particular, both require a form of "identity"/"equality" on constants that most behave properly for things to be sound.

I need to start with some background:

In const generics, a constant of type &i32 is exactly equal to all constants of type &i32 where the i32 has the same value. So, no matter where the constant comes from, it needs to be equal if it is equal per PartialEq and StructuralPartialEq. Anything else would be unsound, you could end up choosing a different impl depending on how the constant was created.

In order for rustc to be able to handle this, that means that if you have

const FOO: &i32 = &42;
const fn bar() -> &'static i32 { &42 }
const BAR: &i32 = bar();

we need FOO and BAR to not just be equal via PartialEq, but we need their internal representation in rustc to be equal. So their &'tcx ty::Const<'tcx> must have the same address. Such deduplication can only happen if interning has canonical AllocIds for specific Allocations.

If we allow constants to refer to statics, that means we can end up with a constant

static MEH: i32 = 42;
const MOP: &i32 = &MEH;

which most definitely does not have the same value as FOO or BAR, because it must refer to MEH, otherwise users will be very confused about the fact that using &MEH directly differs from doing so indirectly over MOP.

Now we could teach const generics about statics and specifically state that we want to be able to choose a different impl depending on which static we pass as a reference. That seems like a totally reasonable feature to me, but the const generics RFC does not mention statics at all, not even in the future work section, so this would need a second RFC.

What we can do is to simply reject any constant that (directly or indirectly) refers to a static in const generics. This is similar to my suggestion to reject constants that refer to statics during match pattern checking.

This may require interning to overwrite the mutability of some allocations to immutable during interning, but that's fine.

I think it already does that -- I'd be surprised if we needed any more overrides.

As you yourself found out in rust-lang/rust#71316 (comment), we don't do this everywhere/correctly.

@RalfJung
Copy link
Member Author

RalfJung commented May 2, 2020

As you yourself found out in rust-lang/rust#71316 (comment), we don't do this everywhere/correctly.

That's for raw pointers, which doesn't matter here (I hope neither pattern matching nor const generics dereference raw pointers :D ). Besides, we actually do reset mutability there -- and on top of that we delay_span_bug for good measure. I think that hole is closed pretty well.

@RalfJung
Copy link
Member Author

RalfJung commented May 2, 2020

In const generics, a constant of type &i32 is exactly equal to all constants of type &i32 where the i32 has the same value. So, no matter where the constant comes from, it needs to be equal if it is equal per PartialEq and StructuralPartialEq. Anything else would be unsound, you could end up choosing a different impl depending on how the constant was created.

Makes sense. The same is true for pattern matching.

we need FOO and BAR to not just be equal via PartialEq, but we need their internal representation in rustc to be equal. So their &'tcx ty::Const<'tcx> must have the same address.

Oh I see, that's how you want to do this.
@eddyb has an open issue somewhere about problems related to this... rust-lang/rust#70889.

My first thought on this is: I am not sure if it's worth trying to shoehorn ty::Const into this; that poor thing is already doing too many jobs. @eddyb mentioned a "value tree" semantics in rust-lang/rust#70889. So maybe it makes more sense to have a separate representation -- say, ty::StructuralConst -- that can be created from ty::Const, that forgets everything irrelevant, that only supports types that actually can be used for const generics, and that has the property that two elements are equal iff they are PartialEq::eq-equal.

Maybe that same representation can even be used by pattern matching so we don't need all these special cases in ty::Const any more that exist only for pattern matching?

This is similar to my suggestion to reject constants that refer to statics during match pattern checking.

So we seem to agree that it makes sense to treat patterns and const generics consistently?
@rpjohnst's use case however requires using consts that refer to (immutable, and recursively so) statics in patterns.

@RalfJung
Copy link
Member Author

RalfJung commented May 2, 2020

Oh but this reminds me of something...
@rpjohnst you wrote

My goal was to convert a string interner from handing out indices, to handing out direct pointers. This would have been a performance optimization for programs that frequently do more than simply compare string identities- e.g. printing them out, checking metadata.

Pointers from consts to static come into this because I already had a large collection of consts for "well-known" interned strings, which I used in patterns. With indices, those consts were just (wrapped) integers, the interner constructor inserted the strings in the same order, and everything worked fine. With pointers, all the tests started failing, because separate uses of a const would point to separate copies of the string!

Note that when you use references in patterns, from what I understood here, they will be compared with PartialEq::eq! IOW, you are not getting a cheap pointer comparison, the values behind the references are actually compared.

So, I am not sure if this pattern matching on references is really what you are looking for.

@rpjohnst
Copy link

rpjohnst commented May 2, 2020

Well, yes, but I'm not using references in patterns. My Symbol type contains a raw pointer instead (though I realize that only works in patterns somewhat accidentally). 😅

Anything I've said about references in patterns was not specific to the symbol-interning use case that led me here, just trying to understand what rustc has to consider in general.

@RalfJung
Copy link
Member Author

RalfJung commented May 2, 2020

Oh I see... so we allow pattern matching on raw pointer consts? But we don't allow them in const generics? That's... messy. :/

@rpjohnst I think what would help me is a small self-contained example of the kind of thing you should be allowed. Just so that we have a testcase that demonstrates "fait accompli".

@rpjohnst
Copy link

rpjohnst commented May 2, 2020

Here's a reduced playground version: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=048389010d2a718bfbf2771c8fa64b02

@comex
Copy link

comex commented May 2, 2020

For reference, C++ approaches this by requiring pointer/reference-typed const generic parameters to point to (its equivalent of) statics, and then respecting pointer identity. So Foo<&X> and Foo<&Y> are different types if X and Y are different variables, but Foo<&42> is not allowed.

Details for pedants Actually, &42 is syntactically invalid to start with; C++ doesn't let you directly create a pointer from a literal. But this is just one of C++'s pointless limitations, since you can create a reference from a literal, and references are just dressed-up pointers (they have pointer identity). However, a reference created from a literal can't be used as a template argument.

I wouldn't suggest copying C++ in this regard; its requirement means, among other things, you can't even use a string literal as a generic parameter.

On the other hand, Rust's behavior of erasing pointer identity IMO has some surprising consequences even without involving statics. If I compile this as a dylib:

#![feature(const_generics)]
pub struct Wrapper<const REF: &'static u32>;

impl<const REF: &'static u32> Wrapper<REF> {
    pub const PTR: *const u32 = REF;
}

pub fn foo() {
    dbg!(Wrapper::<{&42}>::PTR);
}

then link another crate to it which also does dbg!(Wrapper::<{&42}>::PTR);, it will print a different pointer, even though Wrapper::<{&42}> is considered the same type and can be passed between the two crates.

If I understand correctly, in reality, Wrapper::<{&42}>::PTR is effectively just inlining &42 as *const u32, and it would be allowed to produce different values when evaluated multiple times even within the same crate. But I do consider that surprising. I would intuitively expect pointer-typed generic parameters to have identity just like pointer-typed runtime parameters.

@oli-obk
Copy link
Contributor

oli-obk commented May 2, 2020

If I understand correctly, in reality, Wrapper::<{&42}>::PTR is effectively just inlining &42 as *const u32, and it would be allowed to produce different values when evaluated multiple times even within the same crate. But I do consider that surprising. I would intuitively expect pointer-typed generic parameters to have identity just like pointer-typed runtime parameters.

Don't we have this "issue" already for regular generics? I think we should do whatever regular generics do for dylibs.

@oli-obk
Copy link
Contributor

oli-obk commented May 2, 2020

Oh I see... so we allow pattern matching on raw pointer consts? But we don't allow them in const generics? That's... messy. :/

I think noone is very happy with this status quo. Pattern matching pointers is super weird. It's not even deterministic across builds for function pointers, because it totally depends on codegen units, optimization levels and arbitrary decisions in the compiler around inlining and deduplication.

@oli-obk
Copy link
Contributor

oli-obk commented May 2, 2020

My first thought on this is: I am not sure if it's worth trying to shoehorn ty::Const into this; that poor thing is already doing too many jobs. @eddyb mentioned a "value tree" semantics in rust-lang/rust#70889. So maybe it makes more sense to have a separate representation -- say, ty::StructuralConst -- that can be created from ty::Const, that forgets everything irrelevant, that only supports types that actually can be used for const generics, and that has the property that two elements are equal iff they are PartialEq::eq-equal.

I like this, but I think we need to turn it around. ty::Const needs to be structural and we have a mir::Const that doesn't have to be and is basically what we use now. So const_eval returns mir::Const and the only way to go to ty::Const is to normalize for the type system/generics/pattern matching

@rpjohnst
Copy link

rpjohnst commented May 2, 2020

So const_eval returns mir::Const and the only way to go to ty::Const is to normalize for the type system/generics/pattern matching

This might even suggest a solution to the weirdness of pattern matching on pointers. The behavior of function pointers is similar to the behavior of pointers in consts- it's not even stable across a single program, let alone across builds. But the behavior of pointers to statics is, since that's the entire point of statics.

So a normalization step like that could offer a convenient point to warn on patterns with non-static pointers, on top of filtering them out for types/generics.

@RalfJung
Copy link
Member Author

RalfJung commented May 3, 2020

@rpjohnst

Here's a reduced playground version

Thanks!
So you are comparing pointers here... and by-address (raw pointers), not by-value. That makes things even more complicated. ;)

@oli-obk

I like this, but I think we need to turn it around. ty::Const needs to be structural and we have a mir::Const that doesn't have to be and is basically what we use now. So const_eval returns mir::Const and the only way to go to ty::Const is to normalize for the type system/generics/pattern matching

I am totally open to bikeshedding the names. ;)
In this scheme, not every const could be turned into a ty::Const, right? Just those whose types have "structural equality" (or whatever it is called)?

I think noone is very happy with this status quo. Pattern matching pointers is super weird. It's not even deterministic across builds for function pointers, because it totally depends on codegen units, optimization levels and arbitrary decisions in the compiler around inlining and deduplication.

I would have proposed to start linting against pattern matching raw pointers and function pointers... but it seems like that would close the door for @rpjohnst's use case. :/

Raw pointers that are integers (0x04 as *const _) or point to statics could actually be fine though. Integers have reasonable equality and statics have unique addresses... okay maybe let's make that non-ZST statics to be sure.^^ We could incorporate those into @eddyb's "value tree". But then we'd have to look at the value of a const, and not just its type, to determine if it has "structural equality".

Maybe we need a CompileTimeEqSafeRawPtr type or so?^^

@oli-obk
Copy link
Contributor

oli-obk commented May 3, 2020

I am totally open to bikeshedding the names. ;)

It's a bit more than just bikeshedding. I was informed that if we had two representations for the same usize (e.g. ConstValue::Scalar and ConstValue::ByRef), then two array types with the same length could compare inequal. So the only reason we haven't run into troubles so far is that we only have usize constants in types 😱

In this scheme, not every const could be turned into a ty::Const, right? Just those whose types have "structural equality" (or whatever it is called)?

Yes.

But then we'd have to look at the value of a const, and not just its type, to determine if it has "structural equality".

I think we already do that, because you can have a Result<StructurallyEqType, NotStructurallyEqType> and if it's Ok, then everything works out.

Raw pointers that are integers (0x04 as *const _) or point to statics could actually be fine though. Integers have reasonable equality and statics have unique addresses... okay maybe let's make that non-ZST statics to be sure.^^ We could incorporate those into @eddyb's "value tree". But then we'd have to look at the value of a const, and not just its type, to determine if it has "structural equality".

It's not unsound to use them, just super suprising just like if you wrote a big if chain comparing the pointers at runtime. We don't lint users comparing them via if, so they are "fine" for matching. Basically any non-static non-integral pointers must be treated as opaque by the exhaustiveness checker.

But this is getting off topic imo. Pointers are a different issue.

I'm worried about

static S: &str = "foo";
const X: &&str = &S;
const Y: &&str = &"foo";
match some_str {
    X => {},
    Y => {},
    _ => {}
}

Because I have no idea what we're supposed to do here. X and Y are equal by value but different in their representation.

@RalfJung
Copy link
Member Author

RalfJung commented May 3, 2020

It's not unsound to use them, just super suprising

Yeah, that seems good enough for a lint. ;)

It's not super off-topic because it is also about pointer equality and pattern matching, but sure, we can let this sub-thread lie for now.

Because I have no idea what we're supposed to do here. X and Y are equal by value but different in their representation.

Basically we "just" have to make a decision, right? Either choice works, but we have to be consistent.

@ChayimFriedman2
Copy link

ChayimFriedman2 commented Jan 25, 2023

I just had another use-case.

Some data structure, which has a const new. The data structure is allocating, but the new() function puts a pointer to a static sentinel instead. I could use a fake address, but then I would have to introduce a branch in the hot path.

However, there are some places (not in the hot path) where I do need to differentiate the sentinel. The only way I can think of to do that is by comparing the pointer to its known address (if I have had a way to differentiate based on its contents, I could use static promotion without guaranteed address equality).

That means const fn new() needs to reference the static, which is currently disallowed. I don't need interior mutability (I even don't need to read from it in the const fn). The static doesn't even need to reference other statics, so even the most conservative approach will be fine for me.

@RalfJung
Copy link
Member Author

RalfJung commented Nov 25, 2023

Rust rust-for-linux team also has a usecase for promoteds that point to statics. They don't actually need to read from the static's value inside the const, just be able to point to it.

This would be fairly easy to do, we just need to remove the checks that block mentioning a static. The existing checks we already have in place will ensure any attempt to read the static fails. However for some programs that currently fail during type/const-checking time, this will mean we get post-monomorphization errors instead, so someone will have to convince the lang team that this is worth it.

@oli-obk
Copy link
Contributor

oli-obk commented Nov 25, 2023

Ralf, are you arguing for expanding the scope of promoteds? I need to check with a physicist, but pigs are still not flying afaict 😃 (helicopters don't count)

For real though: seems an odd thing to do to get immutable generic statics, but I guess it doesn't have the issue that duplication must be prevented.

Maybe we could do some shenanigans and restrict other things when you mention a static item?

@RalfJung
Copy link
Member Author

RalfJung commented Nov 25, 2023

Fair question.^^ No, I only want to expand the scope of consts. ;) Inline const blocks should do for what the kernel folks need I think.

They are basically using this to do user-defined vtables, if I understand correctly.

Maybe we could do some shenanigans and restrict other things when you mention a static item?

Which things would it make sense to restrict?

@RalfJung
Copy link
Member Author

Okay turns out what the kernel people are actually running into is rust-lang/rust#118447.

Still, seems worth to at least have a nightly feature for "const refer to static", so that one can start experimenting with it.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 5, 2024

rust-lang/rust#119614 proposes that we can unstably allow consts to refer to statics.
Tracking issue: rust-lang/rust#119618

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants