Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault when watcher is moved. #33

Closed
TyOverby opened this issue Oct 19, 2015 · 22 comments
Closed

Segfault when watcher is moved. #33

TyOverby opened this issue Oct 19, 2015 · 22 comments

Comments

@TyOverby
Copy link
Contributor

I have a program written using rsnotify 2.3.3 that segfaults when I touch a file that I'm watching.

Here's the backtrace from lldb

Process 60876 stopped
* thread #2: tid = 0x2b2995, 0x00000001000e36f8 stormtrooper`notify::sync::mpsc::oneshot::Packet<T>::sent(self=0x0000000000000010) + 8 at oneshot.rs:133, stop reason = EXC_BAD_ACCESS (code=1, address=0x60)
    frame #0: 0x00000001000e36f8 stormtrooper`notify::sync::mpsc::oneshot::Packet<T>::sent(self=0x0000000000000010) + 8 at oneshot.rs:133
(lldb) bt
error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0
* thread #2: tid = 0x2b2995, 0x00000001000e36f8 stormtrooper`notify::sync::mpsc::oneshot::Packet<T>::sent(self=0x0000000000000010) + 8 at oneshot.rs:133, stop reason = EXC_BAD_ACCESS (code=1, address=0x60)
  * frame #0: 0x00000001000e36f8 stormtrooper`notify::sync::mpsc::oneshot::Packet<T>::sent(self=0x0000000000000010) + 8 at oneshot.rs:133
    frame #1: 0x00000001000e2b95 stormtrooper`notify::sync::mpsc::Sender<T>::send(self=0x00007fff5fbff8e0, t=Event at 0x0000000101cfa618) + 517 at mod.rs:535
    frame #2: 0x00000001000e1c41 stormtrooper`notify::fsevent::callback(stream_ref=0x0000000101a002b0, info=0x00007fff5fbff8e0, num_events=1, event_paths=0x0000000101a00780, event_flags=0x00000001005f1000, event_ids=0x00000001005f0000) + 1649 at fsevent.rs:197
    frame #3: 0x00000001000dbf6d stormtrooper`fsevent::callback::hf1094fd0c5c09dbeeia + 61
    frame #4: 0x00007fff92498ba1 FSEvents`implementation_callback_rpc + 2123
    frame #5: 0x00007fff92496672 FSEvents`_Xcallback_rpc + 254
    frame #6: 0x00007fff92496746 FSEvents`FSEventsD2F_server + 54
    frame #7: 0x00007fff9249bf75 FSEvents`FSEventsClientProcessMessageCallback + 44
    frame #8: 0x00007fff88de8d87 CoreFoundation`__CFMachPortPerform + 247
    frame #9: 0x00007fff88de8c79 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 41
    frame #10: 0x00007fff88de8beb CoreFoundation`__CFRunLoopDoSource1 + 475
    frame #11: 0x00007fff88dda767 CoreFoundation`__CFRunLoopRun + 2375
    frame #12: 0x00007fff88dd9bd8 CoreFoundation`CFRunLoopRunSpecific + 296
    frame #13: 0x00007fff88e91671 CoreFoundation`CFRunLoopRun + 97
    frame #14: 0x00000001000de91a stormtrooper`fnfn + 410 at fsevent.rs:158
    frame #15: 0x00000001000de749 stormtrooper`notify::boxed::F.FnBox<A>::call_box(self=0x0000000101437030, args=<unavailable>) + 89 at boxed.rs:516
    frame #16: 0x00000001000de33c stormtrooper`notify::boxed::Box<FnBox<A, Output = R>+ Send + 'a>.FnOnce<A>::call_once(self=Box<FnBox<()>> at 0x0000000101d00bc0, args=<unavailable>) + 60 at boxed.rs:532
    frame #17: 0x00000001000ddf0e stormtrooper`fnfn + 78 at mod.rs:281
    frame #18: 0x00000001000ddeba stormtrooper`notify::sys_common::unwind::try::try_fn<closure>(opt_closure=0x0000000101d00cf0) + 58 at mod.rs:156
    frame #19: 0x00000001001a7e89 stormtrooper`__rust_try + 9
    frame #20: 0x00000001001a59cf stormtrooper`sys_common::unwind::try::inner_try::h9c69b1952218bc71tRr + 111
    frame #21: 0x00000001000dde24 stormtrooper`notify::sys_common::unwind::try<closure>(f=closure at 0x0000000101d00d08) + 100 at mod.rs:126
    frame #22: 0x00000001000ddc9c stormtrooper`fnfn + 412 at mod.rs:281
    frame #23: 0x00000001000de56d stormtrooper`notify::boxed::F.FnBox<A>::call_box(self=0x000000010143b000, args=<unavailable>) + 77 at boxed.rs:516
    frame #24: 0x00000001001a955e stormtrooper`sys::thread::_$LT$impl$GT$::new::thread_start::h08e832c9db3841f0Qcw + 142
    frame #25: 0x00007fff8aba805a libsystem_pthread.dylib`_pthread_body + 131
    frame #26: 0x00007fff8aba7fd7 libsystem_pthread.dylib`_pthread_start + 176
    frame #27: 0x00007fff8aba53ed libsystem_pthread.dylib`thread_start + 13
@TyOverby
Copy link
Contributor Author

Here's how I'm using rsnotify:

fn watch(file: &str) {
    use notify::{RecommendedWatcher, Error, Watcher};
    use std::sync::mpsc::channel;
    use std::thread;

    // Create a channel to receive the events.
    let (tx, rx) = channel();

    let mut watcher = notify::new(tx).unwrap();
    watcher.watch(file).unwrap();

    thread::spawn(move || {
        match rx.recv() {
            _ => println!("changes!")
        }
    });
}

fn main() {
    watch("./test.ares");
}

@passcod passcod added A-bug Z-needs implementation Needs an implementation, will accept PRs os-mac labels Oct 19, 2015
@passcod
Copy link
Member

passcod commented Oct 19, 2015

Pinging @octplane on this.

@octplane
Copy link
Contributor

Can't reproduce using this very code:

extern crate notify;

fn watch(file: &str) {
    use notify::Watcher;
    use std::sync::mpsc::channel;
    use std::thread;

    // Create a channel to receive the events.
    let (tx, rx) = channel();

    let mut watcher = notify::new(tx).unwrap();
    watcher.watch(file).unwrap();

    let _ = thread::spawn(move || {
        loop {
        match rx.recv() {
            _ => println!("changes!")
        }
        }
    }).join();

}

fn main() {
    watch("./test.ares");
}

I'm using rustc 1.3.0 (9a92aaf19 2015-09-15), with a blank project only using rsnotify-2.3.3.
@TyOverby can you try narrowing down the issue on your side ?

@TyOverby
Copy link
Contributor Author

@octplane Ah, sorry about that, it looks like the issue was based around how threading and dropping the watcher interact.

Here's my repro using rsnotify-2.3.3 and rustc 1.5.0-nightly (81b3b27cf 2015-10-11).

extern crate notify;

fn watch(file: &str) -> notify::FsEventWatcher {
    use notify::Watcher;
    use std::sync::mpsc::channel;
    use std::thread;

    // Create a channel to receive the events.
    let (tx, rx) = channel();

    let mut watcher = notify::new(tx).unwrap();
    watcher.watch(file).unwrap();

    // Don't join this thread
    thread::spawn(move || {
        loop {
        match rx.recv() {
            _ => println!("changes!")
        }
        }
    });

    watcher
}

fn main() {
    // Don't let the watcher drop
    let _watcher = watch("./test.ares");
    // in my real application, this sleep-loop
    // is actually a webserver that is running
    loop {
        ::std::thread::sleep_ms(20);
    }
}

LLDB stacktrace

(lldb) target create "target/debug/rsnotify-repro"
Current executable set to 'target/debug/rsnotify-repro' (x86_64).
(lldb) run
Process 78741 launched: '/Users/tyoverby/workspace/rust/rsnotify-repro/target/debug/rsnotify-repro' (x86_64)
Process 78741 stopped
* thread #2: tid = 0x2cdd9c, 0x0000000100034e68 rsnotify-repro`notify::sync::mpsc::oneshot::Packet<T>::sent(self=0x0000000000001010) + 8 at oneshot.rs:133, stop reason = EXC_BAD_ACCESS (code=1, address=0x1060)
    frame #0: 0x0000000100034e68 rsnotify-repro`notify::sync::mpsc::oneshot::Packet<T>::sent(self=0x0000000000001010) + 8 at oneshot.rs:133
(lldb) bt
error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0
* thread #2: tid = 0x2cdd9c, 0x0000000100034e68 rsnotify-repro`notify::sync::mpsc::oneshot::Packet<T>::sent(self=0x0000000000001010) + 8 at oneshot.rs:133, stop reason = EXC_BAD_ACCESS (code=1, address=0x1060)
  * frame #0: 0x0000000100034e68 rsnotify-repro`notify::sync::mpsc::oneshot::Packet<T>::sent(self=0x0000000000001010) + 8 at oneshot.rs:133
    frame #1: 0x0000000100034305 rsnotify-repro`notify::sync::mpsc::Sender<T>::send(self=0x00007fff5fbff8d0, t=Event at 0x00000001039fa618) + 517 at mod.rs:535
    frame #2: 0x00000001000333b1 rsnotify-repro`notify::fsevent::callback(stream_ref=0x0000000100500c50, info=0x00007fff5fbff8d0, num_events=1, event_paths=0x0000000100700140, event_flags=0x0000000100620000, event_ids=0x00000001001ff000) + 1649 at fsevent.rs:197
    frame #3: 0x000000010002d6dd rsnotify-repro`fsevent::callback::hf1094fd0c5c09dbeeia + 61
    frame #4: 0x00007fff92498ba1 FSEvents`implementation_callback_rpc + 2123
    frame #5: 0x00007fff92496672 FSEvents`_Xcallback_rpc + 254
    frame #6: 0x00007fff92496746 FSEvents`FSEventsD2F_server + 54
    frame #7: 0x00007fff9249bf75 FSEvents`FSEventsClientProcessMessageCallback + 44
    frame #8: 0x00007fff88de8d87 CoreFoundation`__CFMachPortPerform + 247
    frame #9: 0x00007fff88de8c79 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 41
    frame #10: 0x00007fff88de8beb CoreFoundation`__CFRunLoopDoSource1 + 475
    frame #11: 0x00007fff88dda767 CoreFoundation`__CFRunLoopRun + 2375
    frame #12: 0x00007fff88dd9bd8 CoreFoundation`CFRunLoopRunSpecific + 296
    frame #13: 0x00007fff88e91671 CoreFoundation`CFRunLoopRun + 97
    frame #14: 0x000000010003008a rsnotify-repro`fnfn + 410 at fsevent.rs:158
    frame #15: 0x000000010002feb9 rsnotify-repro`notify::boxed::F.FnBox<A>::call_box(self=0x0000000101437030, args=<unavailable>) + 89 at boxed.rs:516
    frame #16: 0x000000010002faac rsnotify-repro`notify::boxed::Box<FnBox<A, Output = R>+ Send + 'a>.FnOnce<A>::call_once(self=Box<FnBox<()>> at 0x0000000103a00bc0, args=<unavailable>) + 60 at boxed.rs:532
    frame #17: 0x000000010002f67e rsnotify-repro`fnfn + 78 at mod.rs:281
    frame #18: 0x000000010002f62a rsnotify-repro`notify::sys_common::unwind::try::try_fn<closure>(opt_closure=0x0000000103a00cf0) + 58 at mod.rs:156
    frame #19: 0x000000010003efc9 rsnotify-repro`__rust_try + 9
    frame #20: 0x000000010003d0ef rsnotify-repro`sys_common::unwind::try::inner_try::h9c69b1952218bc71tRr + 111
    frame #21: 0x000000010002f594 rsnotify-repro`notify::sys_common::unwind::try<closure>(f=closure at 0x0000000103a00d08) + 100 at mod.rs:126
    frame #22: 0x000000010002f40c rsnotify-repro`fnfn + 412 at mod.rs:281
    frame #23: 0x000000010002fcdd rsnotify-repro`notify::boxed::F.FnBox<A>::call_box(self=0x000000010143b000, args=<unavailable>) + 77 at boxed.rs:516
    frame #24: 0x000000010004040e rsnotify-repro`sys::thread::_$LT$impl$GT$::new::thread_start::h08e832c9db3841f0Qcw + 142
    frame #25: 0x00007fff8aba805a libsystem_pthread.dylib`_pthread_body + 131
    frame #26: 0x00007fff8aba7fd7 libsystem_pthread.dylib`_pthread_start + 176
    frame #27: 0x00007fff8aba53ed libsystem_pthread.dylib`thread_start + 13

@octplane
Copy link
Contributor

Looks like the rsnotify wrapping around FSEvent has a bug, cf rust-lang/rust#29201 (comment)

@passcod , I'm not much familiar with the overall architecture of this piece of code. Are you? This is likely to change some of the rsnotify internals a bit..

@TyOverby TyOverby changed the title Segfault on OSX when file is changed Segfault when watcher is moved. Oct 21, 2015
@TyOverby
Copy link
Contributor Author

I've updated the title.

@passcod
Copy link
Member

passcod commented Oct 21, 2015

For ref: https://github.com/passcod/rsnotify/blob/master/src/fsevent.rs#L132-L164

I didn't write this code, actually, and I'm not quite sure what it does, or why such a large unsafe block is needed. If we remove the unsafe and try to compile, can we shrink down the unsafe zone to the bare minimum? That won't make the bug disappear, but it should help coral down a bit.

What kind of internal reshuffle are you envisioning?

@andelf
Copy link
Contributor

andelf commented Oct 22, 2015

:( it was writed by me. @passcod

andelf added a commit to andelf/rsnotify that referenced this issue Oct 22, 2015
FsEventWatch.context:: Option<StreamContextInfo> is used for passing channels to watch thread.
When FsEventWatch moves, its address changes, so c callback will fail.
Fix: add Box<> wrapper to StreamContextInfo
@andelf
Copy link
Contributor

andelf commented Oct 22, 2015

My bad. PRed, waiting for review.

Diagnose:

frame #2: 0x00000001000333b1 rsnotify-repronotify::fsevent::callback(stream_ref=0x0000000100500c50, info=0x00007fff5fbff8d0, num_events=1, event_paths=0x0000000100700140, event_flags=0x0000000100620000, event_ids=0x00000001001ff000) + 1649 at fsevent.rs:197`

Here, the info parameter is a stack address(in fn watch()). :(

@andelf
Copy link
Contributor

andelf commented Oct 22, 2015

@passcod The large unsafe block does following:

  • call FSEvent APIs (all unsafe)
  • pass runloop info, stream info into CFRunLoopRun() thread

The unsafe should be splited to two(outside thread and inside thread).

Maybe the info structs should be wraped into Cell, RefCell, or Unique.

@passcod
Copy link
Member

passcod commented Oct 22, 2015

:( it was written by me

No worries! All of us overlook things, it is no big deal. Bugs happen. Thank you for the swift response and contribution of a fix :)

@octplane
Copy link
Contributor

Yeah, no worries @andelf , that you found a way to fix the issue is very cool!

@andelf
Copy link
Contributor

andelf commented Oct 23, 2015

OK! :)
I haven't figure out how to add a testcase of this situation(moved watcher, since its address space changes).

@TyOverby
Copy link
Contributor Author

@andelf I was able to get @octplane 's test to segfault every time I ran it, so maybe just use that as the test case? If it doesn't segfault, it's good?

@TyOverby
Copy link
Contributor Author

No longer segfaults in my application! Thanks!

@passcod
Copy link
Member

passcod commented Nov 3, 2015

Apparently still occurs, see watchexec/cargo-watch#22.
@andelf @octplane Can you have a look?
@fritzsche Can you try to provide an LLDB stack trace as above?

@passcod passcod removed the Z-needs implementation Needs an implementation, will accept PRs label Nov 3, 2015
@octplane
Copy link
Contributor

octplane commented Nov 3, 2015

I can reproduce the issue with cargo-watch @ master. However, the original test-case do not fail, so I think we will have to take cargo-watch until we find what's going on... :-/

@octplane
Copy link
Contributor

octplane commented Nov 3, 2015

Ok, it's possible to reproduce the crash by issuing 2 watchs command.

let _ = watch("./target");
let _ = watch("./src");

@octplane
Copy link
Contributor

octplane commented Nov 3, 2015

I have updated my test case. It exhibits something going wrong immediately. Please see https://github.com/octplane/rsnotify-33

Clearly, the channel is shut down way to early. Note that if you move the main loop{} in watch() it starts behaving correctly.

@fritzsche
Copy link

I set a breakpoint on function rust-panic in LLDB and triggered a file change:

(lldb) breakpoint set -b rust_panic
Breakpoint 4: where = cargo-watch`rust_panic, address = 0x0000000100170720
(lldb) run
There is a running process, kill it and restart?: [Y/n] y
Process 5433 exited with status = 9 (0x00000009) 
Process 5548 launched: '/usr/local/bin/cargo-watch' (x86_64)
thread '<unnamed>' panicked at 'internal error: entered unreachable code', ../src/libstd/sync/mpsc/mod.rs:565
Process 5548 stopped
* thread #2: tid = 0x40af1e, 0x0000000100170720 cargo-watch`rust_panic, stop reason = breakpoint 4.1
    frame #0: 0x0000000100170720 cargo-watch`rust_panic
cargo-watch`rust_panic:
->  0x100170720 <+0>: pushq  %rbp
    0x100170721 <+1>: movq   %rsp, %rbp
    0x100170724 <+4>: pushq  %r15
    0x100170726 <+6>: pushq  %r14
(lldb) bt
error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0
* thread #2: tid = 0x40af1e, 0x0000000100170720 cargo-watch`rust_panic, stop reason = breakpoint 4.1
  * frame #0: 0x0000000100170720 cargo-watch`rust_panic
    frame #1: 0x000000010015a288 cargo-watch`sys_common::unwind::begin_unwind_inner::h263a8f3a93eff05evas + 424
    frame #2: 0x000000010010adb8 cargo-watch`notify::sys_common::unwind::begin_unwind<&str>(msg=(data_ptr = "internal error: entered unreachable code\x02", length = 40), file_line=0x000000010023c850) + 120 at mod.rs:232
    frame #3: 0x0000000100127ac8 cargo-watch`notify::sync::mpsc::Sender<T>::send(self=0x000000010245e000, t=Event @ 0x0000700000290638) + 408 at macros.rs:44
    frame #4: 0x0000000100126de2 cargo-watch`notify::fsevent::callback(stream_ref=0x0000000100600360, info=0x000000010245e000, num_events=1, event_paths=0x0000000101201730, event_flags=0x000000010138e000, event_ids=0x000000010138d000) + 1618 at fsevent.rs:191
    frame #5: 0x000000010012121d cargo-watch`fsevent::callback::h7081936fe63bd73cXia + 61
    frame #6: 0x00007fff8c2c5b0b FSEvents`implementation_callback_rpc + 1917
    frame #7: 0x00007fff8c2c36ca FSEvents`_Xcallback_rpc + 254
    frame #8: 0x00007fff8c2c379e FSEvents`FSEventsD2F_server + 54
    frame #9: 0x00007fff8c2c8fd9 FSEvents`FSEventsClientProcessMessageCallback + 44
    frame #10: 0x00007fff98f8721c CoreFoundation`__CFMachPortPerform + 252
    frame #11: 0x00007fff98f87109 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 41
    frame #12: 0x00007fff98f87079 CoreFoundation`__CFRunLoopDoSource1 + 473
    frame #13: 0x00007fff98f7eaab CoreFoundation`__CFRunLoopRun + 2171
    frame #14: 0x00007fff98f7dfc8 CoreFoundation`CFRunLoopRunSpecific + 296
    frame #15: 0x00007fff98fbf961 CoreFoundation`CFRunLoopRun + 97
    frame #16: 0x0000000100123b2d cargo-watch`fnfn + 413 at fsevent.rs:153
    frame #17: 0x0000000100123959 cargo-watch`notify::boxed::F.FnBox<A>::call_box(self=0x000000010245e090, args=<unavailable>) + 89 at boxed.rs:485
    frame #18: 0x000000010012355c cargo-watch`notify::boxed::Box<FnBox<A, Output = R>+ Send + 'a>.FnOnce<A>::call_once(self=Box<FnBox<()>> @ 0x0000700000296bc0, args=<unavailable>) + 60 at boxed.rs:501
    frame #19: 0x000000010012313e cargo-watch`fnfn + 78 at mod.rs:280
    frame #20: 0x00000001001230ea cargo-watch`notify::sys_common::unwind::try::try_fn<closure>(opt_closure="") + 58 at mod.rs:164
    frame #21: 0x0000000100170719 cargo-watch`__rust_try + 9
    frame #22: 0x0000000100168471 cargo-watch`sys_common::unwind::try::inner_try::h93da745d827b7cb5T6r + 97
    frame #23: 0x0000000100123054 cargo-watch`notify::sys_common::unwind::try<closure>(f=closure @ 0x0000700000296d08) + 100 at mod.rs:136
    frame #24: 0x0000000100122ecc cargo-watch`fnfn + 412 at mod.rs:280
    frame #25: 0x000000010012377d cargo-watch`notify::boxed::F.FnBox<A>::call_box(self=0x000000010242c660, args=<unavailable>) + 77 at boxed.rs:485
    frame #26: 0x000000010017314e cargo-watch`sys::thread::Thread::new::thread_start::h770a3938c0de9490puw + 142
    frame #27: 0x00007fff96e6f9b1 libsystem_pthread.dylib`_pthread_body + 131
    frame #28: 0x00007fff96e6f92e libsystem_pthread.dylib`_pthread_start + 168
    frame #29: 0x00007fff96e6d385 libsystem_pthread.dylib`thread_start + 13
(lldb) 

@octplane
Copy link
Contributor

octplane commented Nov 4, 2015

     let mut watcher = notify::new(tx).unwrap();
    watcher.watch(file).unwrap();

    watcher.stop();
    watcher.run();

Is enough to crash the watcher at first change, too. I propose we move to a new bug.

@octplane
Copy link
Contributor

octplane commented Nov 4, 2015

Also, my first investigation shows that the self.runloop is not correctly set in time and so, when stop() is called, it does not stop the runloop and still deallocates the corresponding data structure.

One way to try and fix is to ensure that run does not return until self.runloop has been correctly set.

passcod added a commit that referenced this issue Nov 5, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants