Aarch64 Miscompilation #1535
Comments
Are you using the version at https://github.com/rust-lang/rustc_codegen_cranelift/releases/tag/dev (or one you compiled yourself) or are you using the rustc-codegen-cranelift-preview component? If the latter you are mixing LLVM and Cranelift compiled code as the standard library would be compiled using LLVM. I'm currently working on fixing a couple of ABI bugs in Cranelift that affect mixed LLVM/Cranelift binaries. |
I am using the I did some more investigating and am currently trying to come up with a minimal reproduction. I will link to it when I do. |
This issue is likely related to #1507. It turns out that same exact
My investigation isn't complete, but at the moment I suspect the value comes from the code generated by the bevy
If there are any suggestions or pointers on where to look next I would much appreciate it! |
Hey all,
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
// Initialize the shared RwLock-protected Option<String> with Some("Hello")
let shared_option: Arc<RwLock<Option<String>>> =
Arc::new(RwLock::new(Some(String::from("Hello"))));
// Vector to hold thread handles
let mut handles = vec![];
// Spawn multiple threads to read and write to the shared Option
for i in 0..4 {
let shared_clone = Arc::clone(&shared_option);
let handle = thread::spawn(move || {
if i % 2 == 0 {
// Even-indexed threads perform read operations
let opt = shared_clone.read().unwrap();
if let Some(ref s) = *opt {
println!("Thread {}: Read value -> {}", i, s);
} else {
println!("Thread {}: Read value -> None", i);
}
} else {
// Odd-indexed threads perform write operations
let mut opt = shared_clone.write().unwrap();
*opt = None;
println!("Thread {}: Wrote value -> None", i);
}
});
handles.push(handle);
}
// Wait for all threads to finish
for handle in handles {
handle.join().unwrap();
}
// Final state check
let final_opt = shared_option.read().unwrap();
if let Some(ref s) = *final_opt {
println!("Final state: {}", s);
} else {
println!("Final state: None");
}
}
It seems the code generated by cranelift does not uphold the guarantees made by |
Thanks for looking into this! I will take a look once I'm done with the aforementioned ABI fixes. |
I tried to reproduce this issue on aarch64-linux (Graviton dev desktop) and initially couldn't, until it was pointed out to me that the RwLock, which is the suspicious part here, is of course implemented completely differently on macOS and Linux. |
Removed the arm label given that it reproduces on x86_64 too. |
(gdb) bt
#0 core::sync::atomic::atomic_load<*mut std::sys::sync::rwlock::queue::Node> () at /home/nora/other/rustc_codegen_cranelift/build/stdlib/library/core/src/sync/atomic.rs:3274
#1 0x0000565456a9f33f in std::sys::sync::rwlock::queue::{impl#0}::get () at /home/nora/other/rustc_codegen_cranelift/build/stdlib/library/core/src/sync/atomic.rs:1424
#2 0x0000565456a9f8f2 in std::sys::sync::rwlock::queue::add_backlinks_and_find_tail () at std/src/sys/sync/rwlock/queue.rs:254
#3 0x0000565456aa0330 in std::sys::sync::rwlock::queue::{impl#3}::unlock_queue () at std/src/sys/sync/rwlock/queue.rs:483
#4 0x0000565456aa024d in std::sys::sync::rwlock::queue::{impl#3}::unlock_contended () at std/src/sys/sync/rwlock/queue.rs:464
#5 0x0000565456a3e466 in <std::sync::rwlock::RwLockWriteGuard<T> as core::ops::drop::Drop>::drop ()
#6 0x0000565456a3b1fa in core::ptr::drop_in_place<std::sync::rwlock::RwLockWriteGuard<core::option::Option<alloc::string::String>>> ()
#7 0x0000565456a3ee9a in repro::main::{{closure}} ()
it looks like the corrupted value here is the tail pointer of a node in the wait queue, accessed while trying to unlock the writer lock. |
I copied the queue implementation into my own code and ran it: https://gist.github.com/Noratrieb/fc2733ff612e0dda6ce71553669fcb91 (and also changed it to 2 threads).
|
I think the ptr_mask intrinsic requires bitwise-and with the inverse of the mask: rustc_codegen_cranelift/src/intrinsics/mod.rs Lines 601 to 606 in 753271c
|
In particular currently the |
oh yeah. the ptr_mask doctest segfaults, so I guess that's a minimization^^:
let mut v = 17_u32;
let ptr: *mut u32 = &mut v;
// `u32` is 4 bytes aligned,
// which means that lower 2 bits are always 0.
let tag_mask = 0b11;
let ptr_mask = !tag_mask;
// We can store something in these lower bits
let tagged_ptr = ptr.map_addr(|a| a | 0b10);
// Get the "tag" back
let tag = tagged_ptr.addr() & tag_mask;
assert_eq!(tag, 0b10);
// Note that `tagged_ptr` is unaligned, it's UB to read from/write to it.
// To get original pointer `mask` can be used:
let masked_ptr = tagged_ptr.mask(ptr_mask);
assert_eq!(unsafe { *masked_ptr }, 17);
unsafe { *masked_ptr = 0 };
assert_eq!(v, 0); |
more minimal example:
let mut v = 17_u32;
let ptr: *mut u32 = &mut v;
let masked_ptr = ptr.mask(!0b11);
unsafe { masked_ptr.read_volatile() }; |
0000000000015220 <_ZN4core3ptr7mut_ptr31_$LT$impl$u20$$BP$mut$u20$T$GT$4mask17h602a34ac5952876eE>:
15220: 55 push rbp
15221: 48 89 e5 mov rbp,rsp
15224: 48 31 c0 xor rax,rax
15227: 48 89 ec mov rsp,rbp
1522a: 5d pop rbp
1522b: c3 ret
that's a quite creative way to implement pointer masking. |
No, the current implementation is actually fine. The issue is a missing write of the return value like rustc_codegen_cranelift/src/intrinsics/mod.rs Line 598 in 753271c
|
Yeah, I've fixed that locally and it works now. I'll put up a PR. |
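For reference, here is a minimal sketch of what a correctly compiled mask call is expected to return, written for stable Rust. `mask_ptr` is a hypothetical helper, not part of std or of rustc_codegen_cranelift; it assumes the unstable `<*mut T>::mask` behaves like a bitwise AND on the address, as the doctest above describes. A build that drops the return value, as in the disassembly above, would hand back a null pointer instead and fail these assertions or fault on the read.

```rust
/// Hypothetical stand-in for the unstable `<*mut T>::mask`: keeps only the
/// address bits that are set in `mask`, without going through an
/// integer-to-pointer cast.
fn mask_ptr<T>(ptr: *mut T, mask: usize) -> *mut T {
    let addr = ptr as usize;
    // Number of bytes to step back so that the address becomes `addr & mask`.
    let cleared = addr - (addr & mask);
    ptr.cast::<u8>().wrapping_sub(cleared).cast::<T>()
}

fn main() {
    let mut v = 17_u32;
    let ptr: *mut u32 = &mut v;

    // `u32` is 4-byte aligned, so the low two bits of `ptr` are zero and can
    // carry a tag. Here the tag is 0b10.
    let tagged = ptr.cast::<u8>().wrapping_add(0b10).cast::<u32>();

    // Masking off the tag bits must give back the original pointer.
    let masked = mask_ptr(tagged, !0b11);
    assert_eq!(masked, ptr);
    assert!(!masked.is_null());
    assert_eq!(unsafe { masked.read_volatile() }, 17);
}
```

The helper uses `wrapping_sub` on a byte pointer rather than casting an integer back to a pointer, so the result is derived from the original pointer.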
Hey all, hope all is well.
I am trying to build my bevy application on my M3 Pro MacBook. After fighting with feature flags for a few dependencies, I was able to remove all of the problematic neon intrinsics.
When running my application, I get a segfault within my netcode. When investigating with lldb, I discovered the following:
The std..sys_common..net..LookupHost$u20$as$u20$core..convert..TryFrom implementation here returns x0 = 0x0000000000000010. This appears to be a niche optimization of the Ok variant. Once this function returns, the generated assembly tries to dereference that address, causing a segmentation fault.
If it helps, I exclusively used cranelift for this build, so ABI should not be of concern here. It seems more likely that the generated CLIF is incorrectly representing this trait method and producing unsound code.
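As background on the niche optimization mentioned above, here is a small sketch using `Option`, where the layout guarantee is documented for pointer-like and non-zero types. `has_niche` is a hypothetical helper introduced only for this illustration; the sketch says nothing about the actual layout of the `LookupHost` result.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical helper: true when `Option<T>` needs no extra space for its
/// discriminant, i.e. `None` is encoded in an otherwise invalid value of `T`.
fn has_niche<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // Pointers that can never be null leave the null value free for `None`.
    assert!(has_niche::<Box<u32>>());
    assert!(has_niche::<&u32>());
    // Zero is not a valid `NonZeroU32`, so it can represent `None`.
    assert!(has_niche::<NonZeroU32>());
    // Every bit pattern of `u32` is valid, so a separate discriminant is needed.
    assert!(!has_niche::<u32>());
}
```

When a niche is used, a small integer in a pointer-sized slot can be a variant marker rather than an address, so code that treats it as a pointer will fault.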
If there are any suggestions on how to approach this problem, I can continue this investigation and submit a PR if cranelift is determined to be the cause.
Thanks again for all your help!