macos ci build times out every now and again #2837
Comments
i've made the following change on the macos ci:
- make test
- #ctest --output-on-failure
+ make test || { ctest --output-on-failure ; false ; }
so we'll still fail out if |
huh we could just run |
it's important to note that this does not happen all the time, so a successful build doesn't mean it's fixed. |
i just had a
(gdb) thread apply all bt
Thread 2 (Thread 0x7f57fab966c0 (LWP 2761558) "notcurses-demo"):
#0 futex_wait (futex_word=0x5653640b1d18, expected=2, private=0) at ../sysdeps/nptl/futex-internal.h:146
#1 __GI___lll_lock_wait (futex=futex@entry=0x5653640b1d18, private=0) at ./nptl/lowlevellock.c:49
#2 0x00007f57ffc503a2 in lll_mutex_lock_optimized (mutex=0x5653640b1d18) at ./nptl/pthread_mutex_lock.c:48
#3 ___pthread_mutex_lock (mutex=mutex@entry=0x5653640b1d18) at ./nptl/pthread_mutex_lock.c:93
#4 0x00007f57ffdd52ef in load_ncinput (ictx=ictx@entry=0x5653640adc90, tni=tni@entry=0x7f57fab95ac0) at ./src/lib/in.c:522
#5 0x00007f57ffdd5560 in kitty_kbd_txt (ictx=ictx@entry=0x5653640adc90, val=113, mods=<optimized out>, mods@entry=1, txt=txt@entry=0x0, evtype=3) at ./src/lib/in.c:839
#6 0x00007f57ffdd5dff in kitty_kbd (ictx=0x5653640adc90, val=<optimized out>, mods=1, evtype=<optimized out>) at ./src/lib/in.c:844
#7 kitty_cb_complex (ictx=0x5653640adc90) at ./src/lib/in.c:1151
#8 0x00007f57ffdd6220 in process_escape (ictx=ictx@entry=0x5653640adc90, buf=buf@entry=0x5653640afc90 "\033[113;1:3uu:afaf/afaf/d7d7\033\\\033]4;147;rgb:afaf/afaf/ffff\033\\\033]4;148;rgb:afaf/d7d7/0000\033\\\033]4;149;rgb:afaf/d7d7/5f5f\033\\\033]4;150;rgb:afaf/d7d7/8787\033\\\033]4;151;rgb:afaf/d7d7/afaf\033\\\033]4;152;rgb:afaf/d7d7/d7d7\033\\\033]4;"..., buflen=10) at ./src/lib/in.c:2238
#9 0x00007f57ffdd77a8 in process_escape (ictx=0x5653640adc90, buf=<optimized out>, buflen=<optimized out>) at ./src/lib/in.c:2214
#10 process_melange (ictx=0x5653640adc90, buf=0x5653640afc90 "\033[113;1:3uu:afaf/afaf/d7d7\033\\\033]4;147;rgb:afaf/afaf/ffff\033\\\033]4;148;rgb:afaf/d7d7/0000\033\\\033]4;149;rgb:afaf/d7d7/5f5f\033\\\033]4;150;rgb:afaf/d7d7/8787\033\\\033]4;151;rgb:afaf/d7d7/afaf\033\\\033]4;152;rgb:afaf/d7d7/d7d7\033\\\033]4;"..., bufused=0x5653640b1cd0) at ./src/lib/in.c:2397
#11 0x00007f57ffdd80f9 in process_ibuf (ictx=<optimized out>) at ./src/lib/in.c:2451
#12 input_thread (vmarshall=0x5653640adc90) at ./src/lib/in.c:2622
#13 0x00007f57ffc4d083 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#14 0x00007f57ffccb7b8 in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
Thread 1 (Thread 0x7f57fb779880 (LWP 2761557) "notcurses-demo"):
#0 0x00007f57ffc49a1e in __futex_abstimed_wait_common64 (private=128, futex_word=0x7f57fab96990, expected=2761558, op=265, abstime=0x0, cancel=true) at ./nptl/futex-internal.c:57
#1 __futex_abstimed_wait_common (futex_word=futex_word@entry=0x7f57fab96990, expected=2761558, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=128, cancel=cancel@entry=true) at ./nptl/futex-internal.c:87
#2 0x00007f57ffc49a9b in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7f57fab96990, expected=<optimized out>, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=128) at ./nptl/futex-internal.c:139
#3 0x00007f57ffc4ebc3 in __pthread_clockjoin_ex (threadid=threadid@entry=140015845336768, thread_return=thread_return@entry=0x0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, block=block@entry=true) at ./nptl/pthread_join_common.c:102
#4 0x00007f57ffc4ea6f in ___pthread_join (threadid=threadid@entry=140015845336768, thread_return=thread_return@entry=0x0) at ./nptl/pthread_join.c:24
#5 0x00007f57ffdda471 in cancel_and_join (name=0x7f57ffe1c754 "input", res=0x0, tid=140015845336768) at ./src/lib/internal.h:1840
#6 stop_inputlayer (ti=ti@entry=0x5653640a5660) at ./src/lib/in.c:2651
#7 0x00007f57ffe0c082 in free_terminfo_cache (ti=ti@entry=0x5653640a5660) at ./src/lib/termdesc.c:198
#8 0x00007f57ffde9dce in notcurses_stop (nc=0x5653640a5360) at ./src/lib/notcurses.c:1472
#9 0x00005653471d4ddb in ?? ()
#10 0x00007f57ffbe4d68 in __libc_start_call_main (main=main@entry=0x5653471d3d60, argc=argc@entry=3, argv=argv@entry=0x7ffdc548da18) at ../sysdeps/nptl/libc_start_call_main.h:58
#11 0x00007f57ffbe4e25 in __libc_start_main_impl (main=0x5653471d3d60, argc=3, argv=0x7ffdc548da18, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffdc548da08) at ../csu/libc-start.c:360
#12 0x00005653471d50c1 in ?? ()
(gdb) |
hrmmm so input thread is blocking on a lock, not in |
wow, that locks hard, too -- neither ^C nor ^Z break it |
hrmmm it's not obvious how the input thread could be blocking on taking |
huh |
so i'm seeing this when running |
hrmm i can't reproduce the |
i've added some debug code. normally when we see
process_escape:2246:walk result on 49 (1): 0 273
process_escape:2246:walk result on 58 (:): 0 290
process_escape:2246:walk result on 51 (3): 0 291
kitty_kbd_txt:808:v/m/e 1115121 1 3
load_ncinput:522:taking lock
load_ncinput:532:got lock
mark_pipe_ready:475:wrote to readiness pipe
load_ncinput:554:unlocking
process_escape:2246:walk result on 117 (u): 2 292
block_on_input:2484:blocking on input availability
...
process_escape:2246:walk result on 49 (1): 0 275
process_escape:2246:walk result on 49 (1): 0 275
process_escape:2246:walk result on 51 (3): 0 275
kitty_kbd_txt:808:v/m/e 113 0 0
load_ncinput:522:taking lock
load_ncinput:532:got lock
mark_pipe_ready:475:wrote to readiness pipe
load_ncinput:554:unlocking
process_escape:2246:walk result on 117 (u): 2 280
block_on_input:2484:blocking on input availability
block_on_input:2541:waiting on 1 fds (ibuf: 0/8192)
internal_get:2738:draining event readiness pipe 0
ncplane_destroy:1030:destroying 5x25 plane "hud" @ 33x32
ncplane_destroy:1030:destroying 6x72 plane "fps
but then we don't:
process_escape:2246:walk result on 58 (:): 0 290
process_escape:2246:walk result on 51 (3): 0 291
kitty_kbd_txt:808:v/m/e 113 1 3
load_ncinput:522:taking lock
notcurses_rasterize_inner:1276:pile 0x55989260b210 ymax: 45 xmax: 90
notcurses_rasterize_inner:1282:sprixel phase 1
clean_sprixels:892:phase 1 sprixel 551486 state 5 loc 13/43
clean_sprixels:892:phase 1 sprixel 551486 state 3 loc 13/43
sprite_redraw:733:sprixel 551486 state 3
kitty_draw:1180:dumping 683013b for 551486 at 0 0
notcurses_rasterize_inner:1287:glyph phase 1
notcurses_rasterize_inner:1291:sprixel phase 2 |
looks like maybe a path in |
also, if you get cancelled in a condition variable wait, don't you reacquire the lock? i think you do. yeah:
so we need a cleanup handler there, too. |
yep, looks like that got it, w00t! not sure if this is the same thing responsible for the macos ci hang but i suspect it is. |
looks like we just hit this again at https://github.com/dankamongmen/notcurses/actions/runs/13054111564 arrrrrr |
ok the macos issue does not necessarily look input related. i've got a (huge) logfile.
|
so can we not run with
also, we get some compiler warnings on the macos build, it seems:
Z /Users/runner/work/notcurses/notcurses/src/lib/unixsig.c:7:34: warning: macro 'ATOMIC_VAR_INIT' has bee
2025-01-30T14:09:15.3206940Z static void* _Atomic signal_nc = ATOMIC_VAR_INIT(NULL);
2025-01-30T14:09:15.3327660Z ^
2025-01-30T14:09:15.3391600Z /Applications/Xcode_15.4.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.
2025-01-30T14:09:15.3392270Z [ 21%] Building C object CMakeFiles/notcurses-core-static.dir/src/lib/util.c.o
2025-01-30T14:09:15.3458170Z #pragma clang deprecated(ATOMIC_VAR_INIT)
2025-01-30T14:09:15.3526420Z ^
2025-01-30T14:09:15.3526730Z [ 21%] Building C object CMakeFiles/notcurses-core.dir/src/lib/util.c.o
2025-01-30T14:09:15.3543930Z /Users/runner/work/notcurses/notcurses/src/lib/unixsig.c:7:34: warning: macro 'ATOMIC_VAR_INIT' has bee
2025-01-30T14:09:15.3569590Z static void* _Atomic signal_nc = ATOMIC_VAR_INIT(NULL);
2025-01-30T14:09:15.3636340Z ^
2025-01-30T14:09:15.3666930Z /Applications/Xcode_15.4.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.
2025-01-30T14:09:15.3669020Z #pragma clang deprecated(ATOMIC_VAR_INIT)
2025-01-30T14:09:15.3670650Z ^
2025-01-30T14:09:15.3671320Z 1 warning generated.
2025-01-30T14:09:15.3671950Z 1 warning generated. |
2025-01-30T14:09:38.8099230Z /Users/runner/work/notcurses/notcurses/src/tests/metric.cpp:95:5: warning: 'sprintf' is deprecated: This function is provided for compatibility reasons only. Due to security concerns inherent in the design of sprintf(3), it is highly recommended that you use snprintf(3) instead. [
2025-01-30T14:09:38.8202020Z sprintf(gold, "%.2fEi", ((double)(INTMAX_MAX - 1ull)) / (1ull << 60));
2025-01-30T14:09:38.8303760Z ^
2025-01-30T14:09:38.8335010Z /Applications/Xcode_15.4.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.5.sdk/usr/include/stdio.h:180:1: note: 'sprintf' has been explicitly marked deprecated here
2025-01-30T14:09:38.8336200Z __deprecated_msg("This function is provided for compatibility reasons only. Due to security concerns inherent in the design of sprintf(3), it is highly recommended that you use snprintf(3) instead.")
2025-01-30T14:09:38.8336820Z ^
2025-01-30T14:09:38.8337410Z /Applications/Xcode_15.4.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.5.sdk/usr/include/sys/cdefs.h:218:48: note: expanded from macro '__deprecated_msg'
2025-01-30T14:09:38.8338140Z #define __deprecated_msg(_msg) __attribute__((__deprecated__(_msg)))
2025-01-30T14:09:38.8459120Z ^
2025-01-30T14:09:38.8460040Z /Users/runner/work/notcurses/notcurses/src/tests/metric.cpp:98:5: warning: 'sprintf' is deprecated: This function is provided for compatibility reasons only. Due to security concerns inherent in the design of sprintf(3), it is highly recommended that you use snprintf(3) instead. [
|
See for instance https://github.com/dankamongmen/notcurses/actions/runs/12726507286/job/35474975934#step:6:44 (a 3.0.12-dirty build):
so there are two problems here: one, we ought to rerun with --output-on-failure in the CI, to have more info on this kind of thing. two, we're seeing this timeout. i think i've seen it locally, too, though i thought it was one of the input test programs. maybe it's a general problem? either way, let's get this figured out.