Can CFI be made compatible with type erasure schemes? #128728
Comments
Wouldn't this be a breaking change? And it isn't a soundness or security fix, so I don't see how it could be allowed by that rule. |
Yeah we'd have to evaluate whether making the ecosystem CFI-compatible is possible without undue breakage. OTOH if we don't do this CFI will always be a second-class citizen as a sanitizer, since it will keep rejecting completely valid code. |
The other option is to make CFI Rust compatible. Why is that not being considered as a solution? I see no reason why Rust and C/C++ should use exactly the same variant of CFI, since they are different languages with different UB/EB. For the same reason I would expect yet other languages to need other variations of CFI as well (though I don't know them in enough detail to come up with any specific examples). |
It is being considered. I am talking to various stakeholders and the CFI folks are telling me that accepting arbitrary pointee type mismatches makes CFI "essentially worthless". I'll let them explain why that is the case. |
It is possible to fix the example without fn transmutes (although I don't know if this will be good enough for CFI) by introducing a shim:
use std::marker::PhantomData;

/// Invariant: there exists some type `T` such that `data` is actually a `&'a T` and `op`
/// is actually a `fn(&T)`.
struct ErasedTypeAndOp<'a> {
    data: *const (),
    op: fn(*const ()),
    _phantom: PhantomData<&'a ()>,
}

impl<'a> ErasedTypeAndOp<'a> {
    pub fn new<F: Fn(&'a T), T>(data: &'a T, _op: F) -> Self {
        const { assert!(std::mem::size_of::<F>() == 0) };
        fn op_shim<'a, T: 'a, F: Fn(&'a T)>(data: *const ()) {
            unsafe {
                let op = std::mem::zeroed::<F>();
                op(&*data.cast::<T>())
            }
        }
        Self {
            data: data as *const T as *const (),
            op: op_shim::<T, F>,
            _phantom: PhantomData,
        }
    }

    pub fn call_op(&self) {
        (self.op)(self.data)
    }
}
There are other variants of this, such as introducing a trait. This isn't as flexible as transmuting the fn pointer, but I think it fits most of the cases. So if we can break compatibility (especially considering that the docs about ABI compatibility weren't stable for a long time, only since 1.76.0) we can tell people to use that approach. It seems a somewhat-safe wrapper (providing |
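The trait-based variant mentioned above could look roughly like the following. This is only a sketch: the `Op` trait, the `Record` type, the `LAST_SEEN` static, and `demo` are hypothetical names introduced for illustration, and whether this shape satisfies any particular CFI implementation is an assumption, not something verified here.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical trait: the operation is chosen at the type level, so no
/// function pointer is ever cast to a different type.
trait Op<T> {
    fn run(data: &T);
}

/// Invariant: `data` points to a live `T`, and `op` is `shim::<T, O>` for that same `T`.
struct ErasedTypeAndOp<'a> {
    data: *const (),
    op: fn(*const ()),
    _phantom: PhantomData<&'a ()>,
}

impl<'a> ErasedTypeAndOp<'a> {
    fn new<T, O: Op<T>>(data: &'a T) -> Self {
        // The shim's real signature is `fn(*const ())`, so the call site and
        // the callee agree on the argument type.
        fn shim<T, O: Op<T>>(data: *const ()) {
            // SAFETY: per the struct invariant, `data` was created from a `&'a T` in `new`.
            O::run(unsafe { &*data.cast::<T>() })
        }
        Self {
            data: (data as *const T).cast::<()>(),
            op: shim::<T, O>,
            _phantom: PhantomData,
        }
    }

    fn call_op(&self) {
        (self.op)(self.data)
    }
}

// Example operation, for illustration only: remember the last value seen.
static LAST_SEEN: AtomicU32 = AtomicU32::new(0);

struct Record;

impl Op<u32> for Record {
    fn run(data: &u32) {
        LAST_SEEN.store(*data, Ordering::SeqCst);
    }
}

fn demo() -> u32 {
    let value = 7u32;
    ErasedTypeAndOp::new::<u32, Record>(&value).call_op();
    LAST_SEEN.load(Ordering::SeqCst)
}
```

Compared with the closure-based shim, this avoids `mem::zeroed` and the zero-size assertion, at the cost of requiring a named type per operation.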
Ah right, if you take the function item as input you can do this without extra allocations. You could try this approach with
We didn't document anything before, but this has been de-facto relied upon since Rust 1.0 by Also, CFI is not a super widely used sanitizer technique (to my knowledge), so I am hesitant to force everyone to adopt such quirky patterns... it would be better if the sanitizer could be improved to incur less collateral damage on legit code. |
Personally I have always used a shim until this was documented. std can use things that are not guaranteed sound. But I can understand others were relying on this. What about declaring it is not UB, but recommending developers to switch to alternatives (perhaps adding something like Also, is there any way we can automatically detect crates using such techniques and lint against them when trying to apply CFI? |
Dumping some context onto this discussion - I'll write a separate post with design/opinion statements, this is intended just to provide a big infodump.
While my document there has a bunch of CFI background, I've heard from some folks that they glossed over anything at the implementation level, and would really like a description of what policies could be implemented by each system, so I'm going to write a super-reduced summary here:
What is CFI?
CFI is a general class of sanitizers that is used to harden the program against attacker control flow hijacks by making sure that computed transfers (calls to function pointers, closures, trait objects, returns) are going to a site that would have been statically possible in the source program. We narrow further to forward-edge CFI, which is everything other than "returns" - returns are generally protected by other mechanisms that we aren't discussing here.
CFI Implementations
While there are a bunch of CFI implementations, there are two we have unstably implemented in Rust, and they're the most commonly used in the outside world. Both of these are "type-based" CFI in their C/C++ variant, meaning that they address the aliasing problem by asserting that function pointers are only compatible if their type is the same.
LLVM CFI
This is commonly just called "CFI", and if someone says they've enabled "CFI", this is probably what they mean.
Requirements
Policy Power
Operation in C/C++
KCFI
This is a variant scheme for CFI that uses only one tag value, intended for embedded applications. It is actively used in the Linux kernel and firmware development. For example, every Android phone shipped today has KCFI enabled. KCFI is also supported by GCC, not just clang.
Requirements
Policy Power
Operation in C (C++ not supported)
Rust Today
We do have CFI implemented in Rust today, and it mostly works.
Operation
Cross-language support
Flaws
|
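To make the "type-based" idea in the summary above concrete, here is a toy model of a forward-edge check. It is a sketch under loud assumptions: `TaggedFn`, `checked_call`, and the use of `TypeId` as the tag are all invented for illustration. Real LLVM CFI and KCFI derive tags from an encoding of the function type and trap on mismatch rather than returning an error.

```rust
use std::any::TypeId;
use std::mem::transmute;

/// Toy stand-in for "function address plus the type tag recorded for it".
struct TaggedFn {
    /// Tag derived from the callee's *declared* type.
    tag: TypeId,
    /// The address, stored at an erased type as in the type erasure scheme.
    erased: fn(*const ()) -> u32,
}

fn read_u32(p: *const u32) -> u32 {
    // SAFETY (for this demo): callers pass a pointer to a live `u32`.
    unsafe { *p }
}

fn tagged_read_u32() -> TaggedFn {
    TaggedFn {
        tag: TypeId::of::<fn(*const u32) -> u32>(),
        // Thin raw pointers are ABI-compatible, so the call itself is fine
        // under Rust's documented rules; only the toy tag check can object.
        erased: unsafe {
            transmute::<fn(*const u32) -> u32, fn(*const ()) -> u32>(read_u32)
        },
    }
}

/// Models an indirect call site whose static type is `Sig`.
fn checked_call<Sig: 'static>(
    target: &TaggedFn,
    arg: *const (),
) -> Result<u32, &'static str> {
    if target.tag != TypeId::of::<Sig>() {
        return Err("type mismatch between call site and callee");
    }
    Ok((target.erased)(arg))
}
```

In this model, a call site typed `fn(*const u32) -> u32` is accepted, while a call site typed `fn(*const ()) -> u32` is rejected even though the underlying call would behave identically. That mismatch is the situation the type erasure pattern runs into.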
OK, now for my opinions.
In general, I think that ideally CFI should follow the language's rules. That is what it's supposed to do - provide an extra set of guard-rails at runtime that prevents the running program from engaging in some behaviors that are outside of the program's expected behavior. Do note that before CFI ever becomes helpful, UB has to have occurred somewhere. The goal is to make that UB less practically devastating when it happens by making it harder for an attacker to weaponize. Effectively, this means that an idealized/abstract implementation of CFI is a transformation that removes some of the worst behaviors from the set of possibilities when UB is hit.
That said, even in the case of existing CFI, some compromises needed to be made. LLVM CFI loses function pointer equality between dynamically loadable units. C technically supports casts between "compatible" functions (much narrower set of types than what Rust ABI-matching casts allow), but you're not allowed to do this when either LLVM CFI or KCFI are enabled. I think there's a case for solutions where some legal code is disallowed with CFI enabled, especially if those patterns are rare and statically detectable.
Possible Resolutions
I haven't run numbers on this yet, but ABI compatibility equivalence sets are extremely broad, especially for low-arity functions. If exact quantification would be helpful, we could try running the experiment to see how expanded the alias sets would be if we could decide on a representative example.
CFI/KCFI Specific Codepaths
This is what the original PR linked to this issue #115954 was attempting to do - essentially, when CFI or KCFI are enabled, adjust the code so that it's CFI-legal rather than just Rust-legal.
Advantages
Disadvantages
Enforce only Rust aliasing restrictions
Rust has much weaker aliasing restrictions. These look similar to a
Advantages
Disadvantages
|
To be clear, I was suggesting this (in the form of a
I don't know what people actually use these "function calls with ABI mismatches" for. There's manual type erasure schemes like the one in |
Not that we have to follow C, but as an interesting side-note, I just came across UndefinedBehaviorSanitizer’s unexpected behavior which seems to indicate that C doesn't allow this kind of function pointer mismatch at all. |
What would it even mean to do CFI cross language? Types aren't named the same. You might consider a mapping table, but that isn't straightforward. For example: Then there are the types that don't have exact equivalents: As soon as you leave basic primitives and start passing user defined types it gets even more complex. What is CFI even comparing here?
To me that seems to rule out both "simple" approaches, and what is left are fuzzy "are these similar enough" comparisons that would be way too expensive to do at runtime for every call. Newtypes (which are extremely common in Rust and I have started to see more of in modern C++ as well) make this even more complex. And then there is the whole niche-optimised Option/Result as you pointed out as well. I'm fairly certain that I could (if I wasn't on my phone right now) sit down and come up with a set of FFI signature pairs that should work to any reasonable programmer but where any one way of implementing CFI would break at least one of them on some architectures. |
Clang has an option to normalize CFI types based on their size and signedness, so rust i32, C int and C int32_t all map to the same CFI tag if this option is enabled. And C char would map to i8 or u8 depending on if char is signed or not.
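As a rough illustration of what "normalizing on size and signedness" means, the sketch below maps integer types to a `(bits, signed)` pair. The `Int` trait, `NormalizedInt`, and `normalize` are hypothetical names made up for this example. It models the idea only and says nothing about how Clang or rustc actually encode the tag.

```rust
use std::mem::size_of;

/// Hypothetical helper trait recording the signedness of an integer type.
trait Int {
    const SIGNED: bool;
}

macro_rules! impl_int {
    ($signed:expr => $($t:ty),*) => {
        $(impl Int for $t { const SIGNED: bool = $signed; })*
    };
}

impl_int!(true => i8, i16, i32, i64);
impl_int!(false => u8, u16, u32, u64);

/// What remains of an integer type once its name is ignored.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NormalizedInt {
    bits: usize,
    signed: bool,
}

fn normalize<T: Int>() -> NormalizedInt {
    NormalizedInt {
        bits: size_of::<T>() * 8,
        signed: T::SIGNED,
    }
}
```

Because `std::ffi::c_int` and `std::ffi::c_char` are type aliases for the matching fixed-width integers on each target, `normalize::<std::ffi::c_int>()` equals `normalize::<i32>()` on the common targets where C `int` is 32 bits, and `normalize::<std::ffi::c_char>()` matches either `i8` or `u8` depending on the platform's `char` signedness.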
The struct name is exactly what is currently used as tag in Rust, just like it is in C. The expectation is that the raw bindings use the same names as C and you optionally write safe(r) wrapper around it with a name following the Rust naming conventions.
You don't need to pass newtypes across the C boundary, so that is irrelevant. And I'm not sure C++ CFI is supported by Rust at all currently. Only C CFI do I know for certain is supported. |
How does that work with crate / module names? Or are those automatically stripped from the name of the type? Also, what about reserved identifiers? While Rust has raw identifiers, C does not. So for the There are also things like replaces in bindgen. Looking through the documentation of cbindgen (which I haven't used myself) it also seems to support renaming types. For any future C++ cross language CFI, renaming is even going to be required. |
It's just using the name of the struct. It's not using the mangled name or anything like that. Yes, there will be different types of the same name that collide. It's fine. As for renames, Rust has an attribute to change the name used by a type for cfi purposes. Bindgen could generate that for renames in the future. But Rust CFI support isn't perfect yet. For now, bindgen can give you failures when C enums are involved, as bindgen may just generate an integer type alias on the Rust side. I'm sure we'll fix it in the future. |
As mentioned above, C (the standard) does allow function ABI punning between certain compatible types, but C-with-sanitizers forbids this in many/most cases.
By the current C++ standard, a reinterpret_cast between "newtype" wrappers will usually result in UB because of the "TBAA" rules. (It's possible to do soundly if an lvalue with the wrong type is never manifested, but this is difficult to ensure and forbids the use of methods.) It will usually work anyway, which is why people do it, but formally it is not permitted.
And to explicitly state why: CFI is ultimately a low-cost sanitizer to turn some UB into a somewhat controlled program termination, and is more concerned with running more correct code than it is catching more incorrect code. |
As CAD said, that's likely an overeager sanitizer, not UB in the code. Though the list here does not seem to say |
The curl article says this:
which seems to agree that |
Either way I think it is a terrible idea to make this UB. There's no justification for unleashing the Kraken on programs like that. Worst-case we should declare it erroneous behavior. But currently we actually document stably that this is allowed. |
There is a kind of common type erasure scheme used in Rust that goes something like this:
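A minimal sketch of such a scheme, assuming the same `ErasedTypeAndOp` shape used in the shim example elsewhere in this thread. The `record` function, the `SEEN` static, and `demo` are hypothetical and exist only to make the sketch observable.

```rust
use std::marker::PhantomData;
use std::mem::transmute;
use std::sync::atomic::{AtomicU32, Ordering};

/// Invariant: there is some `T` such that `data` is really a `&'a T`
/// and `op` is really a `fn(&T)`.
struct ErasedTypeAndOp<'a> {
    data: *const (),
    op: fn(*const ()),
    _phantom: PhantomData<&'a ()>,
}

impl<'a> ErasedTypeAndOp<'a> {
    fn new<T>(data: &'a T, op: fn(&T)) -> Self {
        Self {
            data: data as *const T as *const (),
            // The function pointer is reinterpreted at a different type.
            // `&T` and `*const ()` are both thin pointers here, so this is
            // the kind of ABI-compatible mismatch under discussion.
            op: unsafe { transmute::<fn(&T), fn(*const ())>(op) },
            _phantom: PhantomData,
        }
    }

    fn call_op(&self) {
        // The callee was declared as `fn(&T)` but is invoked as `fn(*const ())`.
        (self.op)(self.data)
    }
}

// Illustration only: an operation with an observable effect.
static SEEN: AtomicU32 = AtomicU32::new(0);

fn record(x: &u32) {
    SEEN.store(*x, Ordering::SeqCst);
}

fn demo() -> u32 {
    let value = 41u32;
    ErasedTypeAndOp::new(&value, record).call_op();
    SEEN.load(Ordering::SeqCst)
}
```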
This is used, in particular, in the standard library for type-erased fmt arguments.
Unfortunately, CFI is not happy with this since the function pointer op was transmuted, and is not invoked at its original type. Our documented ABI compatibility rules are fine with this (and Miri also won't complain), but CFI is actually more restrictive than those rules and (at least in some configurations) rejects this call since caller and callee do not agree on the type of the argument.
This is not good: ideally we could just tell people they can turn on CFI in any Rust program and expect it to work. In other words, ideally CFI would only reject programs that we consider buggy (in an official Rust lang/opsem sense), since they have either UB or erroneous behavior.
I am not sure what is the best way to fix this. I also know very little about what CFI can and cannot do (and I understand there's actually a bunch of CFI implementations that differ in their capabilities). My completely naive first idea would be to say that we have a new magic primitive type Erased and then declare ErasedTypeAndOp as follows:
Then we say that if caller and callee use different pointer types for their arguments, that is erroneous behavior unless one of them uses the special Erased pointee type, in which case the call is permitted. (Or maybe only the call site is allowed to use Erased like that?)
Is that something CFI can do -- basically have pointee type checking "turned off" if the call site uses *const Erased? If yes, would this be an acceptable trade-off between still rejecting accidental type mismatches (and catching as many attacks as possible) while accepting legitimate type erasure patterns?
Cc @rust-lang/opsem @maurer @rcvalle