Bevy segfaults on some Windows systems after Rust 1.79 update #126442
Comments
https://github.com/rust-lang/cargo-bisect-rustc would help locate the nightly (and validate that it's nightly-2024-04-18), and in particular the PR where this issue appeared.
I'm a bit sleepy, but I'll try:
$env:WGPU_BACKEND='dx12'; cargo-bisect-rustc --start=2024-04-17 --end=2024-04-18 --prompt --with-src -- run --example alien_cake_addict
Update 2: Can confirm the reported range (fails on nightly-2024-04-18, and works on nightly-2024-04-17) is correct.
Per another user running the tools, the offending commit is 38104f3
cc PR author @Mark-Simulacrum on that bisection
My bisection is done and I can confirm the result. This is the output:
searched nightlies: from nightly-2024-04-17 to nightly-2024-04-18
bisected with cargo-bisect-rustc v0.6.8
Host triple: x86_64-pc-windows-msvc
cargo bisect-rustc --end=2024-04-18 --prompt --with-src -- run --example alien_cake_addict
note by me: also requires
The offending PR: #123936
If that is the offending commit, then it seems very unlikely that this has not unveiled a bug inside wgpu or whatever else is busy talking to DX12.
When looking at WinDbg the issue seems to arise here. wgpu-hal was converting a normal Rust str into a c_char ptr and passing that across FFI to a function expecting a null-terminated string. Unluckily for us, this string was an empty string "". Before #123936 that meant we'd get a pointer to somewhere we can read from. After the PR we get a 0x1 pointer that will cause a segfault when read from.
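For anyone reading along, here is a minimal sketch of the hazard described above. It is not the wgpu-hal code: `to_c_string` is a hypothetical helper name chosen for illustration. It only shows that the pointer of an empty `&str` is non-null but has no readable bytes behind it, whereas a `CString` always owns a terminator byte.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical helper (not from wgpu-hal): builds an owned, null-terminated
/// copy of `s` that can be handed to a C function expecting `const char*`.
fn to_c_string(s: &str) -> CString {
    // CString::new fails only if `s` contains an interior NUL byte.
    CString::new(s).expect("string must not contain interior NUL bytes")
}

fn main() {
    let empty: &str = "";

    // Unsound pattern: this pointer refers to a zero-length slice. It is
    // non-null, but no byte behind it may be read, so a C callee scanning
    // for a terminator reads out of bounds. We only inspect it here and
    // never dereference it.
    let not_a_c_string: *const c_char = empty.as_ptr().cast();
    assert!(!not_a_c_string.is_null());
    assert_eq!(empty.len(), 0);

    // Sound pattern: the CString owns at least one byte, the terminator.
    let owned = to_c_string(empty);
    let ptr: *const c_char = owned.as_ptr();
    // SAFETY: `ptr` comes from a live CString, so it is null-terminated.
    let round_trip = unsafe { CStr::from_ptr(ptr) };
    assert_eq!(round_trip.to_bytes_with_nul(), b"\0");
    println!("empty C string length: {}", round_trip.to_bytes().len());
}
```

The `CString` has to stay alive for as long as the callee uses the pointer, which is why the sketch binds it to `owned` before calling `as_ptr`.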
Agh, that is the classic error that I predicted in my remarks here.
This passes, so changing to
#[test]
fn cstrs_dont_dangle() {
    let c_str = c"";
    assert_ne!(c_str.as_ptr(), std::ptr::NonNull::dangling().as_ptr());
}
However... @Brezak Are you sure that's the issue? It seems that this string was created from a Does that
The debugger said that source_name had the address 0x1.
Oh, I was squinting at the wrong line it seems.
I'm not at my computer right now, but in the code calling said borked function there was an unwrap_or_default creating a that would get passed into the function.
yep.
For reference. The source of the
It seems that the reason this isn't hit on the other path is that
So IIUC this line is unsound; if the callee expects a null-terminated string then you can't just pass a
So seems like previously you were lucky that the pointer was pointing to a zero byte and so the c string was considered empty -- but that was still UB since you were outside the bounds of that zero-sized allocation.
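As a rough sketch of how the `unwrap_or_default` path mentioned earlier could be made sound: the helper below is hypothetical and is not the fix that landed in wgpu. It assumes the caller has an optional source name and needs a pointer that is valid up to and including the terminator.

```rust
use std::ffi::CString;

/// Hypothetical helper, not the actual wgpu fix: converts an optional source
/// name into an owned, null-terminated C string.
fn source_name_cstring(name: Option<&str>) -> CString {
    // `unwrap_or_default` turns `None` into "", whose pointer must never be
    // passed to C directly because a zero-length str has no terminator byte.
    let name: &str = name.unwrap_or_default();
    // Assumption for this sketch: a name with an interior NUL falls back to
    // the empty C string instead of panicking.
    CString::new(name).unwrap_or_default()
}

fn main() {
    let missing = source_name_cstring(None);
    let named = source_name_cstring(Some("shader.hlsl"));
    // Both values own their terminator, so `as_ptr()` on either is valid for
    // reads up to and including the NUL while the CString stays alive.
    assert_eq!(missing.as_bytes_with_nul(), b"\0");
    assert_eq!(named.as_bytes_with_nul(), b"shader.hlsl\0");
    println!("{:?} {:?}", missing, named);
}
```

The design choice here is to pay for one small allocation so that even the "no name" case points at a real zero byte inside an allocation, rather than relying on whatever happens to sit behind a zero-sized one.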
FWIW, using
@RalfJung Note that
hassle_rs is providing a nice Rusty API and doing so correctly. I don't think they have to take any blame here. wgpu tried and failed to provide the same nice Rusty API, and used an (For anyone reading along, also see the discussion here.)
It is a Rusty API, it's just slightly perplexing to me because the caller is holding on to a CString anyways, it seems.
This particular caller is, maybe others are not?
Hopefully! Actually what's more confusing, now that I look closer, is that the underlying compiler API for DXC takes a wide string (thus it does necessitate allocation, it seems...)... and this is the newer one? It seems like that should be the other way around, knowing Microsoft API version history, and the new one should accept a UTF-8 string, and the older one should ask for a wide string... ...oh wait, both APIs are deprecated, actually, and Microsoft is on to
Thank you for the help and the investigation! It has now been fixed in wgpu, and is waiting for a backport and a release.
The issue was first noticed in CI: https://github.com/bevyengine/bevy/actions/runs/9505180010/job/26199467296
The same job works on Rust 1.78: https://github.com/mockersf/bevy/actions/runs/9506123706/job/26202601792
A few users reported the same error, so it's not just on virtualised hardware. It also doesn't happen on all Windows machines.
It seems to be triggered when rendering. Running an example from the Bevy repository that renders something reliably causes a segfault on some systems when using DirectX 12. Running examples from wgpu directly doesn't seem to cause the crash.
A user reports that it fails on nightly-2024-04-18 and works on nightly-2024-04-17.
Sorry that's not much to go on, but I don't have an affected system available.