Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typesystem soundness hole involving cyclically-defined ATIT, and normalization in trait bounds #135011

Open
steffahn opened this issue Jan 2, 2025 · 18 comments
Labels
A-associated-items Area: Associated items (types, constants & functions) A-trait-system Area: Trait system A-type-system Area: Type system C-bug Category: This is a bug. I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness P-medium Medium priority T-types Relevant to the types team, which will review and decide on the PR/issue.

Comments

@steffahn
Copy link
Member

steffahn commented Jan 2, 2025

This isn't quite trivial, but well… it works.

a “transmute” in safe Rust:

fn main() {
    let s: String = transmute(vec![65_u8, 66, 67]);
    println!("{s}"); // ABC
}
fn transmute<L, R>(x: L) -> R {
    transmute_inner_1::<L, R, RewriteReflRight<FakeRefl<L, R>>>(x)
}

fn transmute_inner_1<L, R, P: TyEq<LHS = L, RHS = R>>(x: L) -> R {
    transmute_inner_2::<<P::ProofIsProofValue as TyEq>::ProofDef, L, R, P::ProofValue, P>(x)
}

fn transmute_inner_2<Rw, L, R, Actual, P: TyEq<LHS = L, RHS = R, ProofValue = Actual>>(x: L) -> R
where
    <Rw as TypeIs<P::ProofValue>>::Is: TyEq<LHS = L, RHS = R>,
{
    transmute_inner_3::<L, R, Actual>(x)
}

fn transmute_inner_3<L, R, P: TyEq<LHS = L, RHS = R>>(x: L) -> R {
    transmute_inner_4::<R, P::ProofDef>(x)
}

fn transmute_inner_4<R, Rw>(x: <Rw as TypeIs<R>>::Is) -> R {
    x
}
trait TypeIs<T> {
    type Is;
}
impl<_Self, T> TypeIs<T> for _Self {
    type Is = T;
}

trait TyEq: TyEqImpl {
    type ProofDef: TypeIs<Self::RHS, Is = Self::LHS>;
}

trait TyEqImpl {
    type LHS;
    type RHS;

    type Proof<_Dummy>: TyEq<LHS = Self::LHS, RHS = Self::RHS>;

    type ProofValue;
    type ProofIsProofValue: TyEq<LHS = Self::Proof<()>, RHS = Self::ProofValue>;
}

trait DeriveEqFromImpl {}
impl<P: TyEqImpl + DeriveEqFromImpl> TyEq for P {
    type ProofDef = <P::Proof<()> as TyEq>::ProofDef;
}

struct Refl<T>(T);
impl<T> TyEqImpl for Refl<T> {
    type LHS = T;
    type RHS = T;

    type Proof<_Dummy> = Refl<T>;

    type ProofValue = Refl<T>;
    type ProofIsProofValue = Refl<Refl<T>>;
}
impl<T> TyEq for Refl<T> {
    type ProofDef = ();
}

struct FakeRefl<L, R>(L, R);
impl<L, R> DeriveEqFromImpl for FakeRefl<L, R> {}
impl<L, R> TyEqImpl for FakeRefl<L, R> {
    type LHS = L;
    type RHS = R;

    type Proof<_Dummy> = FakeRefl<L, R>;

    type ProofValue = FakeRefl<L, R>;
    type ProofIsProofValue = Refl<FakeRefl<L, R>>;
}

struct RewriteReflRight<P>(P);
impl<P> DeriveEqFromImpl for RewriteReflRight<P> {}
impl<P, L, R> TyEqImpl for RewriteReflRight<P>
where
    P: TyEq<LHS = L, RHS = R>,
{
    type LHS = L;
    type RHS = R;

    type Proof<_Dummy: RewriteReflRightHelper> = _Dummy::RewriteRight<Refl<L>, R, P::ProofDef>;

    type ProofValue = Refl<L>;
    type ProofIsProofValue = Refl<Refl<L>>;
}

trait RewriteReflRightHelper {
    type RewriteRight<P, R, Rw>: TyEq<LHS = P::LHS, RHS = R>
    where
        P: TyEq<RHS = <Rw as TypeIs<R>>::Is>;
}

impl<_Dummy> RewriteReflRightHelper for _Dummy {
    type RewriteRight<P, R, Rw>
        = P
    where
        P: TyEq<RHS = <Rw as TypeIs<R>>::Is>;
}

(playground)

Besides this unsoundness, there are also many ICEs. This isn't surprising, since we're not really transmuting, but rather sort-of … convincing the type system that the 2 types were never different in the first place.

fn main() {
    let s: u8 = transmute(());
    println!("{s:?}");
}

Or interesting LLVM-errors

fn main() {
    let s: &() = transmute("hi");
    println!("{s:?}");
}
   Compiling playground v0.0.1 (/playground)
Invalid bitcast
  %7 = bitcast { ptr, i64 } %6 to ptr, !dbg !425
in function _ZN10playground17transmute_inner_317h363424402e6a7153E
rustc-LLVM ERROR: Broken function found, compilation aborted!
error: could not compile `playground` (bin "playground")

Oh… that one on nightly has… the compiler itself execute a SIGILL? (Is that actual UB in the compiler!? Or UB in LLVM? Or can SIGILL be part of a deliberate crash condition?)


From a background of “this works a bit like Coq”, the relevant parts that make this work is how:

  • FakeRefl<L, R> is a “proof” that L and R are the same types, which doesn't terminate (i.e. it is cyclically defined in terms of itself)
    • unlike true proof assistants, Rust's trait system does allow some stuff to be defined in a somewhat cyclical manner, especially pre-monomorphization
  • overflow during monomorphization is avoided anyways by:
    • introducing the Rewrite… step which rewrites Refl<L> (a proof of TyEq<LHS = L, RHS = R>) using this FakeRefl proof
    • at the same time also tracking that the proof term itself doesn't actually change with the rewrite
    • using this tracked meta-information in order to eliminate the usage of FakeRefl before its (indirectly and cyclically defined) ProofDef type ever ends up being instantiated

Of course, this is just my intuitive description from a user perspective :-)


I have no idea which part of this code to blame exactly. Perhaps someone else with better knowledge of the theory behind Rust's typesystem and trait system could have a better clue here.

I'm not quite sure if this can perhaps be "simplified" to remove the need for GATs

Update: It can be "simplified" to remove the need for GATs. (playground)

Update2: It's even possible to rework this into working since Rust 1.1

(this section has become somewhat outdated due to this update, click to expand anyway)

I'm not quite sure if this can perhaps be "simplified" to remove the need for GATs; however, I wasn't able to the Rewrite… stuff to work without this trick

type Proof<_Dummy: RewriteReflRightHelper> = _Dummy::RewriteRight<Refl<L>, R, P::ProofDef>;

where the (vacuous) _Dummy: RewriteReflRightHelper restriction prevents the compiler from already knowing the concrete RewriteReflRightHelper impl. (If you worry about the syntactical mismatch of

type Proof<_Dummy: RewriteReflRightHelper>

here vs

type Proof<_Dummy>: TyEq<LHS = Self::LHS, RHS = Self::RHS>;

without bounds on _Dummy in the trait definition: that's actually avoidable (<- see this playground on how the bound can be repeated everywhere and the impl RewriteReflRightHelper for … can even be for a concrete type only).

So for now, I would conjecture that this soundness hole does require GATs.

Edit: I completely forgot about the (generous) usage of GATs inside of RewriteReflRightHelper ~ so, yes, GATs seem fairly integral to this.

Edit2: Nevermind, those can all be eliminated leaving only a single GAT.

(end of somewhat outdated section)


This is not a regression. It also isn't influenced by -Znext-solver.


This is a relatively clean typesystem hole, on stable Rust, without many of the other known-problematic features such as HRTBs, lifetimes, variance, trait objects, or implicitly defined trait impls for functions/closures/etc…

@rustbot label I-unsound, I-ICE, I-crash, T-types, A-type-system, A-trait-system, A-associated-items

@steffahn steffahn added the C-bug Category: This is a bug. label Jan 2, 2025
@rustbot rustbot added needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. A-associated-items Area: Associated items (types, constants & functions) A-GATs Area: Generic associated types (GATs) A-trait-system Area: Trait system A-type-system Area: Type system I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-types Relevant to the types team, which will review and decide on the PR/issue. I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jan 2, 2025
@saethlin
Copy link
Member

saethlin commented Jan 2, 2025

Oh… that one on nightly has… the compiler itself execute a SIGILL? (Is that actual UB in the compiler!? Or UB in LLVM? Or can SIGILL be part of a deliberate crash condition?)

With LLVM assertions enabled, I get this instead of just a SIGILL:

rustc: /checkout/src/llvm-project/llvm/lib/IR/Instructions.cpp:2974: static CastInst *llvm::CastInst::Create(Instruction::CastOps, Value *, Type *, const Twine &, InsertPosition): Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.

and if I pass -Zverify-llvm-ir, I get the same nice error:

Invalid bitcast
  %5 = bitcast { ptr, i64 } %4 to ptr
rustc-LLVM ERROR: Broken module found, compilation aborted!

I believe the difference you're observing on nightly is #133499 cc @nikic

Oh oops, we document that IR is not verified by default, so it's definitely intentional that on nightly we crash somewhat incoherently with invalid IR.

@jieyouxu jieyouxu removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jan 2, 2025
@apiraino
Copy link
Contributor

apiraino commented Jan 2, 2025

@steffahn Can you (or anyone else having context here) please provide a digestible summary of this issue? What is happening that shouldn't?

This is not a more recent regression (i.e. it "works" ever since GATs were introduced). It also isn't influenced by -Znext-solver.

By reading this sentence I infer that it's not a simple regression. Would you like the appropriate team to discuss this?

Thank you from the triage.

@workingjubilee
Copy link
Member

It's not really a regression at all in the conventional sense.

@steffahn
Copy link
Member Author

steffahn commented Jan 2, 2025

@apiraino I meant to write “this is not a regression” but in the most technical sense of “introduction of GATs make type system unsound” it's perhaps technically a regression; sorry for the confusing wording.

[The interpretation of “introduction of GATs make type system unsound” is assuming I don't find a way to rewrite the reproduction without relying on GATs.]

@steffahn
Copy link
Member Author

steffahn commented Jan 2, 2025

[The interpretation of “introduction of GATs make type system unsound” is assuming I don't find a way to rewrite the reproduction without relying on GATs.]

Nevermind, I did it, GATs aren't necessary. This playground code works since Rust 1.49 now.

@rustbot label -A-GATs

@rustbot rustbot removed the A-GATs Area: Generic associated types (GATs) label Jan 2, 2025
@steffahn steffahn changed the title Typesystem soundness hole involving GATs Typesystem soundness hole Jan 2, 2025
@rustbot rustbot added the A-GATs Area: Generic associated types (GATs) label Jan 2, 2025
@clubby789 clubby789 removed the A-GATs Area: Generic associated types (GATs) label Jan 2, 2025
@steffahn
Copy link
Member Author

steffahn commented Jan 2, 2025

To help with understanding of the first step involved in this soundness issue, it's a cyclically defined associated type; the pattern is mirroring the following reduced example

// take a trait `Impossible` that no type should be allowed/able to implement
// not even implemented by trait objects
trait Impossible: Sized {}

trait MetaImpossible {
    type Assoc: Impossible;
}

trait MetaMetaImpossible {
    type Assoc: MetaImpossible;
}

impl<T: MetaMetaImpossible> MetaImpossible for T {
    type Assoc = <T::Assoc as MetaImpossible>::Assoc;
}

impl MetaMetaImpossible for () {
    type Assoc = ();
}

// now `()` implements `MetaImpossible`,
// and thus technically `<() as MetaImpossible>::Assoc` implements `Impossible`

// Making this into a real issue is “just” a question of whether we can make use
// of this illegal implementation, without “mentioning” `<() as MetaImpossible>::Assoc`
// too concretely, in a way that would motivate the compiler to actually try
// to reduce/simplify the type `<() as MetaImpossible>::Assoc` according to the `impls`
// of `MetaMetaImpossible for ()` and `<T: MetaMetaImpossible> MetaImpossible for T`.

If the above was already somehow (systematically) forbidden, that could theoretically be way to close this soundness issue.

Possible caveats:

  • the full exploitation involved some “extra steps”, so denying the above might be overly restrictive
  • some approaches to check for & deny the above might have negative performance impact on compilation times
  • attempts to design a mechanism for denying such an impl – without becoming overly restrictive towards trait implementations where every associated type ends up terminating anyway – might run into the halting problem quickly
  • even though for this soundness issue, I have found a GAT-free exploitation, GATs might still further complicate the question of what, when, and how to restrict

@RustyYato
Copy link
Contributor

RustyYato commented Jan 2, 2025

edit: I started writing this before steffahn's update right above mine, and didn't notice it before I sent this out. Looks like my point 1 is the same one they brought up

There are two possible locations where this is going wrong:

  1. I think that the problem is that this impl shouldn't apply to FakeRefl because (from the latest playground example)
impl<P: TyEqImpl + DeriveEqFromImpl> TyEq for P {
    type ProofDef = <P::Proof as TyEq>::ProofDef;
}

The problem with this impl is that it doesn't actually provide a concrete type for TyEq::ProofDef, so it shouldn't be trusted.
Then there should be some cycle detection for co-inductive impls (like TyEq <-> TyEqIImpl) which checks are associated types instantiated with concrete types along the way.

As a bit more evidence, specializing the impl for FakeRefl like so leads to overflow evaluating the requirement `<FakeRefl<L, R> as TyEq>::ProofDef == _` , which I think is telling.

impl<L, R> TyEq for FakeRefl<L, R> {
    type ProofDef = <<FakeRefl<L, R> as TyEqImplPrf<()>>::Proof as TyEq>::ProofDef;
}
  1. There is something strange going on in transmute_inner_3/transmute_inner_4, it looks like Rust is adding an invisible bound Rw: TypeIs<R>, and in transmute_inner_4 assuming this is the global bound (which should be fine) but in transmute_inner_3 using the bound from P::ProofDef, which is very strange. I didn't expect that changing the type of Rw would make a difference, since transmute_inner_4 doesn't have any explicit bounds on Rw, it should always use the global bounds.

I don't see anything else that could be going wrong right now, the rest looks fine.

@steffahn
Copy link
Member Author

steffahn commented Jan 2, 2025

2. There is something strange going on in transmute_inner_3/transmute_inner_4, it looks like Rust is adding an invisible bound Rw: TypeIs<R>, and in transmute_inner_4 assuming this is the global bound (which should be fine) but in transmute_inner_3 using the bound from P::ProofDef, which is very strange.

This mechanism is used twice actually. The other place is between RewriteReflRight and RewriteReflRightHelper.

Inside RewriteReflRightHelper, there's a parameter Rw without any explicit bounds; and the global impl is used to reduce P: TyEq<RHS = <Rw as TypeIs<R>>::Is> to P: TyEq<RHS = R>, which is why P can be used to define the associated type to fulfill TyEq<LHS = P::LHS, RHS = R>.

Outside, where _Dummy::RewriteRight<Refl<L>, R, P::ProofDef> is used within RewriteReflRight, the type for P, which is Refl<L>, actually implements TyEq<LHS = L, RHS = L>. However, P::ProofDef implements TypeIs<R, Is = L>, so that Refl<L> implements TyEq<RHS = <P::ProofDef as TypeIs<R>>::Is> from <P::ProofDef as TypeIs<R>>::Is reducing to L

@steffahn
Copy link
Member Author

steffahn commented Jan 2, 2025

Another place of interest is here

struct RewriteReflRight<P>(P);
impl<P> DeriveEqFromImpl for RewriteReflRight<P> {}
impl<P, L, R> TyEqImpl for RewriteReflRight<P>
where
    P: TyEq<LHS = L, RHS = R>,
{
    type LHS = L;
    type RHS = R;

    type Proof<_Dummy: RewriteReflRightHelper> = _Dummy::RewriteRight<Refl<L>, R, P::ProofDef>;

    type ProofValue = Refl<L>;
    type ProofIsProofValue = Refl<Refl<L>>; // <- HERE! #######################################
}
[I collapsed the rest of this response, as it has become a bit long&rambly with multiple follow-up comments in edits. This doesn't add too much in describing the issue. But please feel free to read for rough ideas on a possible direction of a fix.]

This section has become a bit unstructured / stream-of-consciousness, sorry for that. With the edits below, I explore aspects of the meaning/semantics of trait bounds on associated items, and how this issue might be fixable with the compiler collecting extra obligations for checks based on those semantics, which would then need to the be executed during monomorphization.

In this code section (marked <- HERE! above), the compiler is able to check that the trait bound

ProofIsProofValue: TyEq<LHS = Self::Proof<()>, RHS = Self::ProofValue>

is fulfilled. Filling in the definitions, that's

Refl<Refl<L>>: TyEq<LHS = <()>::RewriteRight<Refl<L>, R, P::ProofDef>, RHS = Refl<L>>

It does this by being to do this with the TyEq<LHS = T, RHS = T> implementation of Refl<Refl<L>>.

So the compiler was able to reduce the left sides of the required TyEq bound to Refl<L>, too.

So that is: reducing

<()>::RewriteRight<Refl<L>, R, P::ProofDef>

to

Refl<L>

which involves looking into / applying the type definitions of the generic RewriteReflRightHelper impl.

Notably, it does this reduction happens in a context where P is still not a concrete monomorphized type. The problem is that this turns out to be the only place where <FakeRefl<Vec<u8>, String> as TyEq>::ProofDef is ever explicitly “accessed”, in the form of the P::ProofDef

Then, for concrete types, we can evaluate ProofIsProofValue

e.g. you could even write in main

let r: <RewriteReflRight<FakeRefl<Vec<u8>, String>> as TyEqImpl>::ProofIsProofValue = Refl(Refl(vec![]));

and that works fine.

Now, if the intended contract for associated types in traits (in the context of the possibility of cyclical, non-terminating definitions) is something like “if the associated type can be evaluated/normalized (post-monomorphization), then it has the promised trait bound” then we would have a problem at this exact place.

Perhaps more accurately even, the contract could be “if the associated type can be evaluated/normalized (post-monomorphization), and all types appearing in the trait bound can be evaluated/normalized; then it has the promised trait bound”. Or something along those lines…

Roughly speaking: With such a contract established, the compiler would need to pay attention during normalization/projection in generic contexts, and notice how in particular the type P::ProofDef was involved in determining the trait bound here. It should then remember this connection and attach it to the impl TyEqImpl for RewriteReflRight<P> as some sort of implicit obligation, so that – either any usage of RewriteReflRight<P>::ProofIsProofValue – or at least usage that uses its TyEq<LHS = Self::Proof<()>, RHS = Self::ProofValue> trait bound – is required to also make sure that evaluating the type P::ProofDef terminates.


Edit: Actually, to be fair, the trait bound here also involves RewriteReflRight<P>::Proof<()> which also fails to normalize for P = FakeRefl<Vec<u8>, String>> because it is defined in terms of P::ProofDef.

So another possible place to blame might actually be transmute_inner_1; which calls transmute_inner_2:

  • from the perspective of transmute_inner_1, the bound transmute_inner_2 requires, i.e.
    • <Rw as TypeIs<P::ProofValue>>::Is: TyEq<LHS = L, RHS = R> with P::ProofValue for Rw, is simplified to:
    • P::Proof<()>: TyEq<LHS = L, RHS = R>

It even seems to be able to call a function with that bound explicitly, e.g.

fn transmute_inner_1<L, R, P: TyEq<LHS = L, RHS = R>>(x: L) -> R {
    transmute_inner_2__::<<P::ProofIsProofValue as TyEq>::ProofDef, L, R, P::ProofValue, P>();
    // call::<P::Proof<()>>(); // fails
    transmute_inner_2::<<P::ProofIsProofValue as TyEq>::ProofDef, L, R, P::ProofValue, P>(x)
}

fn call<T>() {}

fn transmute_inner_2__<Rw, L, R, Actual, P: TyEq<LHS = L, RHS = R, ProofValue = Actual>>()
where
    P::Proof<()>: TyEq<LHS = L, RHS = R>,
{

}

however, only once the type P::Proof<()> is even more directly “used” in transmute_inner_1 (e.g. by uncommenting the call::<P::Proof<()>> line), the overflow issue comes back.

Perhaps transmute_inner_1, by calling transmute_inner_2 and fulfilling the P::Proof<()>: TyEq<LHS = L, RHS = R> bound there, should always be remembering to make sure to mark “used” the type P::Proof<()> in a way that fails during monomorphization if evaluating P::Proof<()> doesn't terminate.


Edit2: This is interesting. I guess it would make sense if P::Proof<()>: TyEq<LHS = L, RHS = R> does not mean “P::Proof<()> is a type that implements TyEq<…>” but rather means “If evaluating P::Proof<()> doesn't fail during monomorphization, then it's a type that implements TyEq<…>”.

Hence, the usual case of calling a function with a P::Proof<()>: TyEq<LHS = L, RHS = R> bound, like transmute_inner_1 calling transmute_inner_2__ (the version with the __ added in the last code block above), does not have to remember to “make sure evaluating P::Proof<()> terminates”, because the callee – transmute_inner_2__ – can’t do anything with the trait bound anyway, unless it itself evaluates P::Proof<()> during monomorphization (or it passes on the burden further). But with transmute_inner_1 -> transmute_inner_2, the fact that from within transmute_inner_2 the bound is rewritten to P::ProofValue: TyEq<LHS = L, RHS = R>, it now gains the (false) knowledge that “If evaluating P::ProofValue doesn't fail during monomorphization, then it's a type that implements TyEq<…>”, and nobody gets to fulfill the original obligation of “evaluating P::Proof<()> during monomorphization”.


Edit3: Coming back to the situation around ProofIsProofValue, and the contract idea of

“if the associated type can be evaluated/normalized (post-monomorphization), and all types appearing in the trait bound can be evaluated/normalized; then it has the promised trait bound”

then it seems that associated types are quite distinct in a relevant manner. The trait bound on

type ProofIsProofValue: TyEq<LHS = Self::Proof<()>, RHS = Self::ProofValue>;

actually only mentions Self::Proof<()> on an associated type. But really, we have ProofIsProofValue: TyEq, so to make “use” of this bound, we don't have to mention Self::Proof<()> explicitly.

The user in transmute_inner_1 was able to make use of this <P::ProofIsProofValue as TyEq> without actually mentioning/“using” the Self::Proof<()> from the LHS =.

More precisely thus, a concept of “all types appearing in the trait bound can be evaluated/normalized” would probably need to only talk about “types appearing in the trait bound in parameters (not associated types)”. Although to be fair, it's hard to say – perhaps multiple different contracts could actually work, as long as the compiler made sure that all places that remove any of the implicit “evaluation must terminate during monomorphization”-style requirement from a trait bound do instead add the same requirement to the definition/“body” of the associated type and/or function that did so.

@steffahn steffahn changed the title Typesystem soundness hole Typesystem soundness hole involving cyclically-defined ATIT, and normalization of trait bounds Jan 3, 2025
@steffahn steffahn changed the title Typesystem soundness hole involving cyclically-defined ATIT, and normalization of trait bounds Typesystem soundness hole involving cyclically-defined ATIT, and normalization in trait bounds Jan 3, 2025
@steffahn
Copy link
Member Author

steffahn commented Jan 3, 2025

During the (unfinished) process of trying to get to a more minimal reproducing example eventually, I did come across a version of this that works ever since Rust 1.1. (<- this is not an easier to understand version, but it might be of interest as a test case.) [In Rust 1.1 on compiler-explorer]

@apiraino apiraino removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jan 6, 2025
@lcnr lcnr added the P-medium Medium priority label Jan 6, 2025
@lcnr lcnr moved this to unknown in T-types unsound issues Jan 6, 2025
@lcnr
Copy link
Contributor

lcnr commented Jan 6, 2025

i've been playing with this to make sure I have a full understanding of how exactly this exploit works (and have a bunch of thoughts on how to fix it), a WIP minimization from first principles:

https://rust.godbolt.org/z/WxTfWo3jb

// An always applicable impl.
trait ProjectTo {
    type Usize;
    type This;
    type Myself;
}
impl<T> ProjectTo for T {
    type Usize = usize;
    type This = T;
    type Myself = T;
}

// A diverging associated type whose item-bound must not apply
// and differs from an always applicable impl.
trait Diverges {
    type Diverge<T: Diverges>: ProjectTo<Usize = &'static str>;
}
impl Diverges for () {
    type Diverge<T: Diverges> = T::Diverge<()>;
}

// We need to transfer the item bound to an existing type while only
// using the diverging associated type in a generic context.
//
// The idea is to use an indirection which uses the diverging associated type
// in a where-clause. This then uses its item-bound in the caller and normalizes
// via the applicable impl internally.
//
// `<T as Apply<D::Diverge<()>>::Assoc` for `D: Diverge` has
// an invalid item-bound and can be normalized to a concrete type while
// `D` is still generic.
trait Apply<D> {
    type Assoc: ProjectTo<Myself = usize>;
}
impl<T: ProjectTo<Myself = <D as ProjectTo>::Usize>, D> Apply<D> for T {
    type Assoc = T;
}

// It's still difficult to actually using this item bound as we must
// avoid `D::Diverge<()>` being mentioned in the MIR. We otherwise
// try to normalize it during codegen and error with overflow instead.
fn transmute<D: Diverges>(x: &usize) -> &&'static str {
    // We need to prove the where-bounds here while `D` is still generic
    // as they mention `D::Diverge<()>`.
    //
    // We normalize `<&'static str as Apply<D::Diverge<()>>>::Assoc` to `&'static str`
    // here. We therefore need to prove `&'static str: ProjectTo<This = &'static str>`
    // and `&'static str: Apply<D::Diverge<()>>`. Proving the second where-clause relies
    // on the item bound of `D::Diverge<()>`.
    provide_where_bound::<D, _>(x)
}

// `<T as ProjectTo>::This` normalizes to the rigid associated type
// `<T as Apply<D::Diverge<()>>>::Assoc` inside of this function.
fn provide_where_bound<D: Diverges, T>(x: &usize) -> &T
where
    T: ProjectTo<This = <T as Apply<D::Diverge<()>>>::Assoc>,
    T: Apply<D::Diverge<()>>,
{
    // We therefore prove `<T as Apply<D::Diverge<()>>>::Assoc: ProjectTo<Myself = usize>`
    // here. This holds thanks to the item-bound.
    apply_where_bound::<T>(x)
}

// `<T as ProjectTo>::This` normalizes to `T` inside of this function.
// THis means we've now get the incorret where-bound `T: ProjectTo<Myself = usize>`
// inside of this function. Exploiting this is trivial.
fn apply_where_bound<T>(x: &usize) -> &T
where
    <T as ProjectTo>::This: ProjectTo<Myself = usize>,
{
    via_impl::<T>(x)
}

fn via_impl<T>(x: &<T as ProjectTo>::Myself) -> &T {
    x
}

pub fn main() {
    println!("{}", transmute::<()>(&0));
}

I recently thought about diverging associated types themselves and expected them to be theoretical unsound, cc rust-lang/trait-system-refactor-initiative#133. I have not been able to craft an actual example back then as I mostly considered their interactions with the occurs check

Good job @steffahn. This is a wonderful unsoundness and I am incredibly excited about it :3

@steffahn
Copy link
Member Author

steffahn commented Jan 7, 2025

Is this more “minimal”? I don’t know… but it’s certainly more concise… more dense… anyway 😅

trait Tr: Sized {
    type Eq<T>;
    type EqLPrf<R>: Tr<Eq<Self> = Self::Prf<Self, R>>;
    type Prf<L, R>: Tr<Eq<L> = R>;
    type Rw<L: Tr<Eq<L> = <<R as Tr>::Prf<R, L> as Tr>::Eq<R>>, R>: Tr<Eq<L> = R>;
    fn transmute<R>(self) -> R;
}
impl<L_> Tr for L_ {
    type Eq<T> = T;
    type EqLPrf<R: Tr> = L_;
    type Prf<L: Tr<Eq<L> = L>, R: Tr> = L::Rw<L, R>;
    type Rw<L: Tr<Eq<L> = R>, R> = L;
    fn transmute<R: Tr>(self) -> <<R::EqLPrf<L_> as Tr>::Eq<R> as Tr>::Eq<R> {
        self
    }
}
fn main() {
    let s: String = vec![65_u8, 66, 67].transmute();
    println!("{}", s); // ABC
}

Edit: Eliminating/combining some more types…

trait Tr: Sized {
    type Eq<T>;
    type Prf<L, R>: Tr<Eq<L> = R>;
    fn transmute<L: Tr, R>(l: L) -> R;
}
impl<__> Tr for __ {
    type Eq<T> = T;
    type Prf<L: Tr<Eq<<R::Prf<R, L> as Tr>::Eq<R>> = R>, R: Tr> = L;
    fn transmute<L, R: Tr>(l: L) -> <R::Prf<R, L> as Tr>::Eq<R> {
        l
    }
}
fn main() {
    let s: String = <()>::transmute(vec![65_u8, 66, 67]);
    println!("{}", s); // ABC
}

@steffahn
Copy link
Member Author

steffahn commented Jan 7, 2025

I'll make this one into a follow-up because it's getting a bit wild; only one (generic) associated type is enough

trait Tr: Sized {
    type Prf<L, R>: Tr<Prf<L, ()> = R>;
    fn transmute<R>(self) -> R;
}
impl<L_> Tr for L_ {
    type Prf<L: Tr<Prf<<R::Prf<R, L> as Tr>::Prf<R, ()>, ()> = R>, R: Tr> = L;
    fn transmute<R: Tr>(self) -> <R::Prf<R, L_> as Tr>::Prf<R, ()> { self }
}
fn main() {
    let s: String = vec![65_u8, 66, 67].transmute();
    println!("{}", s); // ABC
}

or, with a bit less , ()… things:

trait Tr: Sized { type Prf<R>: Tr<Prf<R> = Self>; }
impl<L> Tr for L {
    type Prf<R: Tr<Prf<R> = <L::Prf<R> as Tr>::Prf<R>>> = R where L: Tr;
}
fn transmute<L: Tr, R>(r: L) -> <L::Prf<R> as Tr>::Prf<R> { r }
fn main() {
    let s: String = transmute::<_, String>(vec![65_u8, 66, 67]);
    println!("{}", s); // ABC
}

the R can be a trait parameter, too

trait Tr<R>: Sized { type Prf: Tr<R, Prf = Self>; }
impl<L, R> Tr<R> for L {
    type Prf = R where L: Tr<R>, R: Tr<R, Prf = <L::Prf as Tr<R>>::Prf>;
}
fn transmute<L: Tr<R>, R>(r: L) -> <L::Prf as Tr<R>>::Prf { r }
fn main() {
    let s: String = transmute::<_, String>(vec![65_u8, 66, 67]);
    println!("{}", s); // ABC
}

@lcnr
Copy link
Contributor

lcnr commented Jan 7, 2025

trait Tr<R>: Sized { type Prf: Tr<R, Prf = Self>; }
impl<L, R> Tr<R> for L {
    type Prf = R where L: Tr<R>, R: Tr<R, Prf = <L::Prf as Tr<R>>::Prf>;
}
fn transmute<L: Tr<R>, R>(r: L) -> <L::Prf as Tr<R>>::Prf { r }
fn main() {
    let s: String = transmute::<_, String>(vec![65_u8, 66, 67]);
    println!("{}", s); // ABC
}

that minization is actually a separate unsoundness (or at least what I intend to do to fix the original is insufficient there 💀). That sure is something

@lcnr
Copy link
Contributor

lcnr commented Jan 7, 2025

// We only check that GAT where-clauses of the *trait* while normalizing;
// normalizing `<T as Trait<U>>::Proof` to `U` trivially succeeds.
trait Trait<R>: Sized {
    type Proof: Trait<R, Proof = Self>;
}
impl<L, R> Trait<R> for L {
    // We prove that the impl item is compatible with the trait in the
    // env of the trait, which is pretty much empty.
    //
    // `L: Trait<R>` is trivial
    // `R: Trait<R, Proof = <L::Proof as Trait<R>>::Proof>` normalizes to
    // `R: Trait<R, Proof = <R as Trait<R>>::Proof>` normalizes to
    // `R: Trait<R, Proof = R>` is trivial
    //
    // Proving the item-bound holds assumes the *impl where-bounds*.
    // We normalize the where-bound `R: Trait<R, Proof = <L::Proof as Trait<R>>::Proof>`
    // by using the item-bound of `L::Proof`: `R: Trait<R, Proof = L>` 💀¹. Proving the
    // item-bound of `<L as Trait<R>>::Proof` is now trivial.
    type Proof
        = R
    where
        L: Trait<R>,
        R: Trait<R, Proof = <L::Proof as Trait<R>>::Proof>;
}

What's happening at ¹ is that proving that the item-bounds of an associated type is able
to assume the item-bounds of exactly that associated type. This is non-productive cyclic reasoning.

You've found a new way to exploit rust-lang/trait-system-refactor-initiative#62, answering the question posed in rust-lang/trait-system-refactor-initiative#116 😊

@lcnr
Copy link
Contributor

lcnr commented Jan 7, 2025

My perspective of the original unsoundness is that we're able to rely on the item-bounds of an associated type while never actually normalizing the associated type, which is unsound in case it diverges.

While your new minimization does not actually have any associated type which diverges. We're able to normalize <L::Proof as Trait<R>>::Proof to R without issues. The problem is the cyclic reasoning when proving that the item-bounds are actually valid.

@steffahn
Copy link
Member Author

steffahn commented Jan 7, 2025

that minization is actually a separate unsoundness (or at least what I intend to do to fix the original is insufficient there 💀). That sure is something

Oh… that’s interesting! I didn’t even really pay attention to the fact that it’s no longer containing containing any “cyclically defined ATIT”, at least for certain notions of “cyclically defined”. I got there (to the minimization) in a relatively incremental/step-by-step manner; but that certainly explains why I was having a hard time [i.e. not succeeding in] trying to translate the last version back into one without ATIT-where-bounds 🙃

@lcnr
Copy link
Contributor

lcnr commented Jan 8, 2025

Alright, have a minimization I am happy with now and feel like I properly understand this bug:
https://rust.godbolt.org/z/Y6K3aKGGq

// Always applicable impls. Using two separate traits
// for clarity/simplicity.
trait IdA {
    type This;
}
impl<T> IdA for T {
    type This = T;
}
trait IdB {
    type This;
}
impl<T> IdB for T {
    type This = T;
}
trait Trivial {
    type Usize;
}
impl<T> Trivial for T {
    type Usize = usize;
}

// A diverging associated type whose item-bound must not apply
// and differs from an always applicable impl.
trait Diverges<T> {
    type Diverge<Recur: Diverges<T>>: Trivial<Usize = T>;
}
impl<T> Diverges<T> for () {
    type Diverge<Recur: Diverges<T>> = Recur::Diverge<()>;
}

// An impl with a where-clause which is normalized using the blanket impl when
// checking that its item-bounds hold. We then prove that where-clause via the
// incorrect item-bound from above.
//
// This allows us to transfer the item-bound to a concrete type while only
// using the diverging associated type in a generic context.
//
// Concretely, this is impl is well-formed as `<T as IdA>::This` normalizes to
// `<Recur as Trivial>::Usize` which is then normalized to `usize` by using
// the blanket impl. This means `<T as IdA>::This: IdB<This: Copy>` trivially
// holds.
//
// This means it should only be possible to instantiate it with `usize` itself.
// Instantiating `Recur` with `<D as Diverges<T>>::Diverge<()>` causes it to apply
// for all `T` instead.
trait Apply<D> {
    type Assoc: IdB<This: Copy>;
}
impl<T: IdA<This = <Recur as Trivial>::Usize>, Recur> Apply<Recur> for T {
    type Assoc = <T as IdA>::This;
}

fn copy_me<T, D: Diverges<T>>(x: T) {
    // Need to prove the where-bounds of `provide_where_bound` in a generic context
    // as normalizing `D::Diverge<()>` would otherwise overflow.
    //
    // Using the impl of `Apply` requires `T: IdA<This = <D::Diverge<()> as Trivial>::Usize>`.
    // We normalize `<D::Diverge<()> as Trivial>::Usize` to `T` using the item-bound of
    // `D::Diverge<()>` at which point the impl trivially applies.
    provide_where_bound::<D, T>(x)
}

fn provide_where_bound<D: Diverges<T>, T>(x: T)
where
    T: IdA<This = <T as Apply<D::Diverge<()>>>::Assoc>,
    T: Apply<D::Diverge<()>>,
{
    // Inside of this function `<T as Apply<D::Diverge<()>>>::Assoc` is rigid,
    // allowing us to use its incorrect item-bound.
    //
    // <<T as IdA>::This as IdB>::This: Copy
    // ------------------------------------- normalize `<T as IdA>::This` via where-bound
    // <<T as Apply<D::Diverge<()>>>::Assoc as IdB>::This: Copy
    // -------------------------------------------------------- prove via item-bound
    apply_where_bound::<T>(x)
}

fn apply_where_bound<T>(x: T)
where
    <<T as IdA>::This as IdB>::This: Copy,
{
    // We simply normalize `<<T as IdA>::This as IdB>::This` to `T` inside of this
    // function.
    dup(x, x)
}

fn dup<T>(_x: T, _y: T) {}

pub fn main() {
    copy_me::<String, ()>(String::from("hello there"));
}

As mentioned by @nikomatsakis in rust-lang/trait-system-refactor-initiative#133 (comment) we've got a few ways forward here. I expect that the most likely approach will be to make sure we recheck where-clauses of functions during monomorphization to trigger the overflow at this point. Eagerly detecting potentially diverging associated types feels impossible to me without breaking too much existing code. @compiler-errors has implemented this in #135216 for a crater run.

Landing this may be blocked on the new solver as it will likely cause new overflow errors due to hitting the recursion limit. These are fatal with the old trait solver.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-associated-items Area: Associated items (types, constants & functions) A-trait-system Area: Trait system A-type-system Area: Type system C-bug Category: This is a bug. I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness P-medium Medium priority T-types Relevant to the types team, which will review and decide on the PR/issue.
Projects
Status: unknown
Development

No branches or pull requests

9 participants