Segfault in futex_wait() on riscv64gc-unknown-linux-gnu with rustc 1.64.0 #102866

Closed
tommythorn opened this issue Oct 10, 2022 · 6 comments

@tommythorn

tommythorn commented Oct 10, 2022

Note, this does not appear to be related to issue #102155 but I can't be sure.

I'm many levels deep in the original problem, but traced it down to autocfg not building (filed here: cuviper/autocfg#51); the repro is trivial.

The backtrace, however, makes me suspect this is a deeper issue:

Thread 2 "cargo" received signal SIGUSR1, User defined signal 1.
[Switching to Thread 0x3ff7e0ffc0 (LWP 1094)]
syscall (syscall_number=98, arg1=<optimized out>, arg2=137, arg3=0, arg4=0, arg5=0, arg6=-1, arg7=274609471544) at ../sysdeps/unix/sysv/linux/riscv/syscall.c:27
27	../sysdeps/unix/sysv/linux/riscv/syscall.c: No such file or directory.
(gdb) bt
#0  syscall (syscall_number=98, arg1=<optimized out>, arg2=137, arg3=0, arg4=0, arg5=0, arg6=-1, arg7=274609471544) at ../sysdeps/unix/sysv/linux/riscv/syscall.c:27
#1  0x0000002aab379664 in std::sys::unix::futex::futex_wait () at library/std/src/sys/unix/futex.rs:62
#2  0x0000002aab37c50e in std::sys::unix::locks::futex_condvar::Condvar::wait_optional_timeout () at library/std/src/sys/unix/locks/futex_condvar.rs:51
#3  std::sys::unix::locks::futex_condvar::Condvar::wait () at library/std/src/sys/unix/locks/futex_condvar.rs:35
#4  0x0000002aab3504d4 in <jobserver::HelperState>::for_each_request::<jobserver::imp::spawn_helper::{closure#1}::{closure#0}> ()
#5  0x0000002aab350a60 in std::sys_common::backtrace::__rust_begin_short_backtrace::<jobserver::imp::spawn_helper::{closure#1}, ()> ()
#6  0x0000002aab350cb2 in _RINvNvNtCseOBki07ryB6_3std9panicking3try7do_callINtNtNtCsidPuqEqzKzv_4core5panic11unwind_safe16AssertUnwindSafeNCNCINvMNtB6_6threadNtB1T_7Builder16spawn_unchecked_NCNvNtCsGjmX1GWYch_9jobserver3imp12spawn_helpers_0uEs_00EuEB2H_.llvm.3138756864971081497 ()
#7  0x0000002aab350d4e in __rust_try.llvm.3138756864971081497 ()
#8  0x0000002aab351954 in <<std::thread::Builder>::spawn_unchecked_<jobserver::imp::spawn_helper::{closure#1}, ()>::{closure#1} as core::ops::function::FnOnce<()>>::call_once::{shim:vtable#0} ()
#9  0x0000002aab37bdc0 in alloc::boxed::{impl#44}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:1935
#10 alloc::boxed::{impl#44}::call_once<(), alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:1935
#11 std::sys::unix::thread::{impl#2}::new::thread_start () at library/std/src/sys/unix/thread.rs:108
#12 0x0000003ff7e7d450 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#13 0x0000003ff7ecaef2 in __thread_start () at ../sysdeps/unix/sysv/linux/riscv/clone.S:85

Unfortunately I don't have the expertise to debug this.

@tommythorn
Author

The dmesg info may be useful:

[ 8935.594852] rustc[1472]: unhandled signal 11 code 0x1 at 0x0000000000000008 in librustc_driver-ac972a4e10c98556.so[3fab1a3000+6fb9000]
[ 8935.594959] CPU: 1 PID: 1472 Comm: rustc Not tainted 5.17.0-1006-starfive #7-Ubuntu
[ 8935.594974] Hardware name: StarFive VisionFive V1 (DT)
[ 8935.594982] epc : 0000003faee8d1aa ra : 0000003faf093baa sp : 0000003faae67fc0
[ 8935.594991]  gp : 0000002adc331800 tp : 0000003faae87480 t0 : 0000000000002000
[ 8935.594999]  t1 : 0000000000000002 t2 : 0000003faae6b2b8 s0 : 0000003faae69a38
[ 8935.595008]  s1 : 0000000000000000 a0 : ffffffffffffe000 a1 : 0000003faae69a90
[ 8935.595016]  a2 : 00000000000012d0 a3 : 0000000000000000 a4 : 0000000000000000
[ 8935.595024]  a5 : 0000000000000000 a6 : 0000003faae6c0cb a7 : 0000000000000001
[ 8935.595032]  s2 : 0000003faae69a90 s3 : 0000000000001a70 s4 : 0000003faae6aefb
[ 8935.595041]  s5 : ffffffffffffe590 s6 : 0000000000000000 s7 : 0000003faae6aca3
[ 8935.595049]  s8 : 0000003faae6ac8b s9 : 0000003faae6ac73 s10: 0000003faae6ac5b
[ 8935.595057]  s11: 0000000000001000 t3 : 0000003faae6c0ab t4 : 0000003faae6b2f0
[ 8935.595065]  t5 : 0000003faae6b380 t6 : 0000003faae6b440
[ 8935.595072] status: 0000000200004020 badaddr: 0000000000000008 cause: 000000000000000d
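
For anyone reading the register dump above, here is a small sketch that decodes the `cause` field. It assumes the value is a standard RISC-V synchronous exception code as listed in the privileged specification; the function name `riscv_cause_name` is made up for this illustration and is not part of any kernel or rustc API. Under that assumption, `cause: 0xd` is a load page fault, and `badaddr: 0x8` would be consistent with a read through a null pointer at offset 8, though the dump alone does not prove that.

```rust
/// Maps a RISC-V synchronous exception code (the `cause` value printed in the
/// dmesg register dump) to a readable name. Interrupt causes, which have the
/// top bit set, are not handled here.
fn riscv_cause_name(cause: u64) -> &'static str {
    match cause {
        0 => "instruction address misaligned",
        1 => "instruction access fault",
        2 => "illegal instruction",
        3 => "breakpoint",
        4 => "load address misaligned",
        5 => "load access fault",
        6 => "store/AMO address misaligned",
        7 => "store/AMO access fault",
        8 => "environment call from U-mode",
        9 => "environment call from S-mode",
        11 => "environment call from M-mode",
        12 => "instruction page fault",
        13 => "load page fault",
        15 => "store/AMO page fault",
        _ => "reserved or unknown",
    }
}

fn main() {
    // Values copied from the last line of the dmesg output above.
    let cause = u64::from_str_radix("000000000000000d", 16).unwrap();
    let badaddr = u64::from_str_radix("0000000000000008", 16).unwrap();
    println!(
        "cause {} ({}), faulting address {:#x}",
        cause,
        riscv_cause_name(cause),
        badaddr
    );
}
```

The same decoding applies to the first dmesg line: signal 11 is SIGSEGV, and code 0x1 is `SEGV_MAPERR` on Linux, meaning the address was not mapped.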

@tommythorn
Author

Aha, it's a regression. Doesn't happen with 1.58.0. I'll track it down to the exact version.

@saethlin
Member

You might want to use https://github.com/rust-lang/cargo-bisect-rustc

@saethlin
Member

Oh, that's not the right backtrace. I think gdb just halts on any signal, and that's a SIGUSR1, not a SIGSEGV. Everything is fine at the point where the program is stopped there. Perhaps this is helpful: https://peeterjoot.wordpress.com/2010/07/07/avoiding-gdb-signal-noise/

(I usually debug from core dumps, which is one way around this)
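
To illustrate why the first backtrace looks alarming but is benign, here is a minimal sketch of a helper thread that parks on a condition variable while it has nothing to do. It is not the jobserver crate's actual code; `run_helper` and its state layout are hypothetical names invented for this example. On Linux, as frames #1 to #3 of that backtrace show, `Condvar::wait` in std bottoms out in `futex_wait`, so a stack like that is simply what an idle helper looks like whenever gdb stops the process.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Spawns a helper thread that sleeps on a condition variable until requests
/// arrive, and returns how many requests it handled before shutdown.
fn run_helper(requests: usize) -> usize {
    // Shared state: (pending request count, shutdown flag).
    let state = Arc::new((Mutex::new((0usize, false)), Condvar::new()));
    let helper_state = Arc::clone(&state);

    let helper = thread::spawn(move || {
        let (lock, cvar) = &*helper_state;
        let mut handled = 0;
        let mut guard = lock.lock().unwrap();
        loop {
            // While idle, the thread blocks here. A debugger attached at this
            // moment shows Condvar::wait on top of the thread's stack.
            while guard.0 == 0 && !guard.1 {
                guard = cvar.wait(guard).unwrap();
            }
            // Drain whatever is pending, then exit if shutdown was requested.
            handled += guard.0;
            guard.0 = 0;
            if guard.1 {
                return handled;
            }
        }
    });

    let (lock, cvar) = &*state;
    for _ in 0..requests {
        lock.lock().unwrap().0 += 1;
        cvar.notify_one();
    }
    // Ask the helper to finish and wake it one last time.
    lock.lock().unwrap().1 = true;
    cvar.notify_one();
    helper.join().unwrap()
}

fn main() {
    println!("helper handled {} requests", run_helper(3));
}
```

The practical consequence is the one described above: a thread stopped in a wait like this is not evidence of a crash, so the faulting thread has to be found some other way, for example from a core dump.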

@tommythorn
Author

Thanks Saethlin, yes, I should have known better than that. This looks more likely:

Thread 2 "rustc" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x3ff05efd20 (LWP 2816)]
0x0000003ff45f61aa in <rustc_middle::arena::Arena>::alloc_from_iter::<rustc_middle::dep_graph::dep_node::DepKindStruct, rustc_arena::IsNotCopy, [rustc_middle::dep_graph::dep_node::DepKindStruct; 282]> () from /home/tommy/.rustup.riscv64-linux/toolchains/1.64.0-riscv64gc-unknown-linux-gnu/bin/../lib/librustc_driver-ac972a4e10c98556.so
(gdb) bt
#0  0x0000003ff45f61aa in <rustc_middle::arena::Arena>::alloc_from_iter::<rustc_middle::dep_graph::dep_node::DepKindStruct, rustc_arena::IsNotCopy, [rustc_middle::dep_graph::dep_node::DepKindStruct; 282]> ()
   from /home/tommy/.rustup.riscv64-linux/toolchains/1.64.0-riscv64gc-unknown-linux-gnu/bin/../lib/librustc_driver-ac972a4e10c98556.so
#1  0x0000003ff47fcbaa in rustc_query_impl::query_callbacks () from /home/tommy/.rustup.riscv64-linux/toolchains/1.64.0-riscv64gc-unknown-linux-gnu/bin/../lib/librustc_driver-ac972a4e10c98556.so
#2  0x0000003ff125300a in <core::cell::once::OnceCell<_>>::get_or_try_init::outlined_call::<<core::cell::once::OnceCell<rustc_middle::ty::context::GlobalCtxt>>::get_or_init<rustc_interface::passes::create_global_ctxt::{closure#1}::{closure#0}>::{closure#0}, rustc_middle::ty::context::GlobalCtxt, !> ()
   from /home/tommy/.rustup.riscv64-linux/toolchains/1.64.0-riscv64gc-unknown-linux-gnu/bin/../lib/librustc_driver-ac972a4e10c98556.so
#3  0x0000003ff1663d14 in <core::cell::once::OnceCell<rustc_middle::ty::context::GlobalCtxt>>::get_or_init::<rustc_interface::passes::create_global_ctxt::{closure#1}::{closure#0}> ()
...

Note that the fault was introduced after 1.63.0 and does not reproduce on Rust version 1.66.0-nightly (81f3919 2022-10-09), so we can probably close this (I wanted to capture the bug right away in case I didn't get time to dig deeper). I'll leave it open in case somebody wants me to run another experiment.

@tommythorn
Author

Ok, clearly a dup of #102155
