coredumps #1067

Open
henkmet opened this issue Dec 21, 2023 · 24 comments

@henkmet

henkmet commented Dec 21, 2023

Current Behavior:
After updating webkit to 2.42.4 on archlinux (and rebuilding luakit), I'm experiencing instability: two SIGSEGVs and one SIGABRT so far.

Desired Behavior:
don't crash

How can we reproduce it (step by step):
As before, it seems archlinux.org triggers something. I'd need to do some more testing for step-by-step reproduction.

Environment:

Linux Distribution & Version:
archlinux, linux 6.6.7.arch1-1
Output of luakit --version:
luakit 2.3.3-15-gb143b383
built with: webkit 2.42.4 (installed version: 2.42.4)
GTK 3.24.38
GLIB 2.78.3
SOUP 3.4.4

Note about webkit issues:

If you're reporting a rendering issue, please test it with the GNOME
browser Epiphany as well. If the issue occurs there too, we're very
likely not able to help. These issues should be reported to webkit:
https://bugs.webkit.org

I don't know how useful backtraces are but there seems to be an issue with malloc? (This is from the SIGABRT coredump)
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007fa4dfefe8a3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007fa4dfeae668 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007fa4dfe964b8 in __GI_abort () at abort.c:79
#4 0x00007fa4dfe97390 in __libc_message (fmt=fmt@entry=0x7fa4e000e55d "%s\n") at ../sysdeps/posix/libc_fatal.c:150
#5 0x00007fa4dff087b7 in malloc_printerr (str=str@entry=0x7fa4e00117d8 "malloc(): mismatching next->prev_size (unsorted)") at malloc.c:5765
#6 0x00007fa4dff0bd9c in _int_malloc (av=av@entry=0x7fa438000030, bytes=bytes@entry=88) at malloc.c:4076
#7 0x00007fa4dff0ccad in __GI___libc_malloc (bytes=bytes@entry=88) at malloc.c:3329
#8 0x00007fa4e00b4763 in g_malloc (n_bytes=88) at ../glib/glib/gmem.c:130
#9 0x000056105d128605 in ipc_endpoint_new ()
#10 0x000056105d125875 in ()
#11 0x00007fa4e00dda05 in g_thread_proxy (data=0x56105f21ddf0) at ../glib/glib/gthread.c:831
#12 0x00007fa4dfefc9eb in start_thread (arg=<optimized out>) at pthread_create.c:444
#13 0x00007fa4dff807cc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

@c0dev0id c0dev0id added the bug label Jan 24, 2024
@ghost

ghost commented Jan 26, 2024

Crashes for me too, but not if I pass the URL on the command line (luakit https://archlinux.org); then it works fine.

Using Linux localhost 6.6.13-0-lts #1-Alpine SMP PREEMPT_DYNAMIC Wed, 24 Jan 2024 13:26:05 +0000 x86_64 Linux.
Luakit:

  built with: webkit 2.42.3 (installed version: 2.42.3)
                 GTK 3.24.38
                GLIB 2.78.4
                SOUP 3.4.4

@c0dev0id
Member

c0dev0id commented Jan 30, 2024

Is this the full backtrace you're seeing? There's no luakit and no webkit in there...

Can you try to hit the issue with luakit --log=debug?

EDIT: Never mind, I'm able to reproduce the bug. It just took opening the website many times.

@c0dev0id
Member

c0dev0id commented Feb 2, 2024

Can you retest with the latest commit? I cannot reproduce the issue with it anymore.

@henkmet
Author

henkmet commented Feb 2, 2024

It occurs too unpredictably to be certain, but I haven't had any problems since. If you close this, the issue can always be reopened if problems recur.

@c0dev0id
Member

c0dev0id commented Feb 2, 2024

I just didn't want to be rude and close it without being certain. But there's a high chance that it's fixed now. I'll close the issue. Please reopen it if it happens again. Thanks!

@c0dev0id c0dev0id closed this as completed Feb 2, 2024
@ghost

ghost commented Feb 2, 2024

Still crashes for me, albeit less often. If the site loads fine, reloading it a couple of times results in a crash.

@c0dev0id
Member

c0dev0id commented Feb 2, 2024

Alright, we'll keep the issue open then. Can you run a luakit debug build in gdb until it crashes and then post a stack trace?

I reloaded archlinux.org 50 times this morning - also in multiple tabs at the same time. No crash here.

@c0dev0id c0dev0id reopened this Feb 2, 2024
@ghost

ghost commented Feb 2, 2024

I don't use gdb, how would I debug luakit using it? Here's the debug log though:
out.log

@c0dev0id
Member

c0dev0id commented Feb 2, 2024

At first glance, nothing in the log catches my eye. The easiest way is to start it with gdb -ex run luakit in a terminal, which launches luakit under gdb. When the crash happens, gdb drops to a prompt where you can type bt to get the stack trace.

@ghost

ghost commented Feb 2, 2024

Thread 34 "send_thread" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 24619]
0x00007ffff29abe14 in g_io_channel_write_chars () from /usr/lib/libglib-2.0.so.0
(gdb) bt
#0  0x00007ffff29abe14 in g_io_channel_write_chars () at /usr/lib/libglib-2.0.so.0
#1  0x000055555556fe9f in ipc_send_thread (UNUSED_user_data=0x0) at common/ipc.c:81
#2  0x00007ffff29e8391 in ??? () at /usr/lib/libglib-2.0.so.0
#3  0x00007ffff7fb822e in start (p=0x7fff913fc1d0) at src/thread/pthread_create.c:207
#4  0x00007ffff7fba82f in __clone () at src/thread/x86_64/clone.s:22

@c0dev0id
Member

c0dev0id commented Feb 3, 2024

I guarded more of the IPC and lua stack operations. I ran the test suite in a loop for an hour and did 100 refreshes on archlinux.org and got no crash. Please try again with the latest version.

@ghost

ghost commented Feb 3, 2024

Yup, still crashes for me:

[    7.019614] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[    7.847867] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[    8.745841] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[    9.388399] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[    9.898108] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   10.372404] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   10.818088] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   11.266965] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   11.688157] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   12.112472] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   12.544171] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   12.976512] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   13.359990] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   13.736810] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   14.098360] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   14.212835] E [core/common/ipc]: Trying to send an ipc message, but the endpoint went away.
[   14.470711] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   14.484320] E [core/common/ipc]: Trying to send an ipc message, but the endpoint went away.
[   14.484438] E [core/common/ipc]: Trying to send an ipc message, but the endpoint went away.
[   14.582370] E [core/common/ipc]: Trying to send an ipc message, but the endpoint went away.
[   14.834234] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
[   14.840385] E [core/common/ipc]: Trying to send an ipc message, but the endpoint went away.
[   14.840500] E [core/common/ipc]: Trying to send an ipc message, but the endpoint went away.
[   14.934864] E [core/common/ipc]: Trying to send an ipc message, but the endpoint went away.
[   16.976210] I [lua/webview]: Requested link: https://archlinux.org/ (text/html)
Segmentation fault (core dumped)

stacktrace:

Thread 1 "luakit" received signal SIGSEGV, Segmentation fault.
get_nominal_size (end=0x7fff95f42f2c "", p=0x7fff95f42ed0 "\004")
    at src/malloc/mallocng/meta.h:169
warning: 169    src/malloc/mallocng/meta.h: No such file or directory
(gdb) bt
#0  get_nominal_size (end=0x7fff95f42f2c "", p=0x7fff95f42ed0 "\004")
    at src/malloc/mallocng/meta.h:169
#1  __libc_free (p=0x7fff95f42ed0) at src/malloc/mallocng/free.c:110
#2  0x00007ffff2eff71c in cairo_region_destroy () at /usr/lib/libcairo.so.2
#3  0x00007ffff7ea8471 in ??? () at /usr/lib/libgdk-3.so.0
#4  0x00007ffff7ea866d in ??? () at /usr/lib/libgdk-3.so.0
#5  0x00007ffff2ad9ce3 in ??? () at /usr/lib/libgobject-2.0.so.0
#6  0x00007ffff2ad9dd3 in g_signal_emit_valist () at /usr/lib/libgobject-2.0.so.0
#7  0x00007ffff2ad9e90 in g_signal_emit () at /usr/lib/libgobject-2.0.so.0
#8  0x00007ffff7e9f659 in ??? () at /usr/lib/libgdk-3.so.0
#9  0x00007ffff7e8c62a in ??? () at /usr/lib/libgdk-3.so.0
#10 0x00007ffff29b931a in ??? () at /usr/lib/libglib-2.0.so.0
#11 0x00007ffff29b817a in ??? () at /usr/lib/libglib-2.0.so.0
#12 0x00007ffff2a19547 in ??? () at /usr/lib/libglib-2.0.so.0
#13 0x00007ffff29b8bd7 in g_main_loop_run () at /usr/lib/libglib-2.0.so.0
#14 0x00007ffff31d64ef in gtk_main () at /usr/lib/libgtk-3.so.0
#15 0x000055555556f882 in main ()

Version info:

luakit 2.3.6-7-g0bc0e394
  built with: webkit 2.42.3 (installed version: 2.42.3)
                 GTK 3.24.38
                GLIB 2.78.4
                SOUP 3.4.4

@ghost

ghost commented Feb 3, 2024

So, I did some more testing, and it seems like the problem occurs if hardened_malloc is enabled.
Commenting out or removing /usr/local/lib/libhardened_malloc.so from /etc/ld-musl-x86_64.path seems to have fixed the issue for me, and I also no longer get the "Trying to send an ipc message, but the endpoint went away." message.

I'll continue to do some testing, but this might be the culprit (at least for me; I don't know if @henkmet uses hardened_malloc too).

EDIT: Never mind, still crashes. Different stacktrace though:

Thread 1 "luakit" received signal SIGSEGV, Segmentation fault.
get_nominal_size (end=0x7fff95de5b0c "", p=0x7fff95de5ab0 "") at src/malloc/mallocng/meta.h:169
warning: 169    src/malloc/mallocng/meta.h: No such file or directory
(gdb) bt
#0  get_nominal_size (end=0x7fff95de5b0c "", p=0x7fff95de5ab0 "") at src/malloc/mallocng/meta.h:169
#1  __libc_free (p=0x7fff95de5ab0) at src/malloc/mallocng/free.c:110
#2  0x00007ffff2fc433f in ??? () at /usr/lib/libpango-1.0.so.0
#3  0x00007ffff2fc437e in ??? () at /usr/lib/libpango-1.0.so.0
#4  0x00007ffff2ac9074 in g_object_unref () at /usr/lib/libgobject-2.0.so.0
#5  0x00007ffff31bf9f2 in ??? () at /usr/lib/libgtk-3.so.0
#6  0x00007ffff31c28cb in gtk_label_set_markup () at /usr/lib/libgtk-3.so.0
#7  0x0000555555589acd in luaH_label_newindex ()
#8  0x000055555558256b in luaH_widget_newindex ()
#9  0x00007ffff2b0f426 in ??? () at /usr/lib/libluajit-5.1.so.2
#10 0x00007ffff2b6c45f in lua_pcall () at /usr/lib/libluajit-5.1.so.2
#11 0x00005555555735d6 in luaH_dofunction ()
#12 0x00005555555743b0 in luaH_object_emit_signal ()
#13 0x0000555555591ea7 in load_changed_cb ()
#14 0x00007ffff2abb300 in g_closure_invoke () at /usr/lib/libgobject-2.0.so.0
#15 0x00007ffff2ae89f6 in ??? () at /usr/lib/libgobject-2.0.so.0
#16 0x00007ffff2ad9bb2 in ??? () at /usr/lib/libgobject-2.0.so.0
#17 0x00007ffff2ad9dd3 in g_signal_emit_valist () at /usr/lib/libgobject-2.0.so.0
#18 0x00007ffff2ad9e90 in g_signal_emit () at /usr/lib/libgobject-2.0.so.0
#19 0x00007ffff4d0844c in ??? () at /usr/lib/libwebkit2gtk-4.1.so.0
#20 0x00007ffff4c28223 in ??? () at /usr/lib/libwebkit2gtk-4.1.so.0
#21 0x00007ffff4845f72 in ??? () at /usr/lib/libwebkit2gtk-4.1.so.0
#22 0x00007ffff4841df4 in ??? () at /usr/lib/libwebkit2gtk-4.1.so.0
#23 0x00007ffff4b7e815 in ??? () at /usr/lib/libwebkit2gtk-4.1.so.0
#24 0x00007ffff4c81bfb in ??? () at /usr/lib/libwebkit2gtk-4.1.so.0
#25 0x00007ffff4b78363 in ??? () at /usr/lib/libwebkit2gtk-4.1.so.0
#26 0x00007ffff4b789ee in ??? () at /usr/lib/libwebkit2gtk-4.1.so.0
#27 0x00007ffff27d3dbb in ??? () at /usr/lib/libjavascriptcoregtk-4.1.so.0
#28 0x00007ffff284a2b7 in ??? () at /usr/lib/libjavascriptcoregtk-4.1.so.0
#29 0x00007ffff2848a3d in ??? () at /usr/lib/libjavascriptcoregtk-4.1.so.0
#30 0x00007ffff29b817a in ??? () at /usr/lib/libglib-2.0.so.0
#31 0x00007ffff2a19547 in ??? () at /usr/lib/libglib-2.0.so.0
#32 0x00007ffff29b8bd7 in g_main_loop_run () at /usr/lib/libglib-2.0.so.0
#33 0x00007ffff31d64ef in gtk_main () at /usr/lib/libgtk-3.so.0
#34 0x000055555556f882 in main ()

@henkmet
Author

henkmet commented Feb 3, 2024

Just crashed but all the bt I get is this; not sure why so short:

Core was generated by `luakit'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  g_io_channel_write_chars (channel=0x71b, buf=0x5beae0727740 "4", count=8, bytes_written=0x0, error=0x0) at ../glib/glib/giochannel.c:2214
Downloading source file /usr/src/debug/glib2/build/../glib/glib/giochannel.c
2214      g_return_val_if_fail (channel->is_writeable, G_IO_STATUS_ERROR);
[Current thread is 1 (Thread 0x7b1d834006c0 (LWP 1593))]
(gdb) bt
#0  g_io_channel_write_chars (channel=0x71b, buf=0x5beae0727740 "4", count=8, bytes_written=0x0, error=0x0) at ../glib/glib/giochannel.c:2214
#1  0x00005beadda4fbbb in ??? ()
#2  0x00007b1e08f3fa45 in g_thread_proxy (data=0x5beadee54740) at ../glib/glib/gthread.c:831
#3  0x00007b1e08d5e9eb in start_thread (arg=<optimized out>) at pthread_create.c:444
#4  0x00007b1e08de28ac in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

@c0dev0id
Member

c0dev0id commented Feb 3, 2024

Webkit spawns many threads, and these threads communicate with the luakit main thread using GIO channels. What is happening here is that a webkit thread tries to write to a nonexistent channel.

This is why the backtrace is short and doesn't show anything from luakit. I'm not sure how to debug this methodically. So I'm trying to find situations where a channel could get closed, and to make sure that nothing is sent or received on such a channel afterwards.

Meanwhile, I think this crash and the fact that hints are not visible on archlinux.org are related. There's an early EOF, which closes the channel. Closing the channel on EOF is intended, but the EOF being sent at all is not OK: the thread stays around while its communication with luakit is cut off. The effect is that no further signals reach this thread, so neither scrolling nor hints work.

I think this situation also leads to the segfault, when the thread still finds a way to send a signal to the closed IPC channel.

I'm going to read a bit more about GIO to back up my assumptions.
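
As an illustration of the failure mode described above: in the core dump earlier in this thread, g_io_channel_write_chars was called with channel=0x71b, which does not look like a valid pointer and suggests the sender was using an endpoint after it had been torn down. One common way to keep such an object alive while another thread still holds it is reference counting. The following is a minimal sketch of that idea only; rc_endpoint_t and its functions are hypothetical names and are not luakit's actual IPC code, and a plain int stands in for the real channel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical type for illustration; not luakit's real endpoint struct. */
typedef struct {
    atomic_int refcount;  /* number of holders, including the owner */
    int channel_fd;       /* placeholder for the real channel handle */
} rc_endpoint_t;

/* Allocate an endpoint owned by the caller (refcount starts at 1). */
rc_endpoint_t *rc_endpoint_new(int fd)
{
    rc_endpoint_t *ep = malloc(sizeof *ep);
    if (ep == NULL)
        return NULL;
    atomic_init(&ep->refcount, 1);
    ep->channel_fd = fd;
    return ep;
}

/* Take an extra reference; the caller must already hold one. */
void rc_endpoint_ref(rc_endpoint_t *ep)
{
    int old = atomic_fetch_add(&ep->refcount, 1);
    assert(old > 0);
    (void)old;
}

/* Drop a reference; returns true when this call freed the endpoint. */
bool rc_endpoint_unref(rc_endpoint_t *ep)
{
    int old = atomic_fetch_sub(&ep->refcount, 1);
    assert(old > 0);
    if (old == 1) {
        free(ep);
        return true;
    }
    return false;
}
```

In this scheme a queued message would take a reference when it is enqueued and drop it after the send thread has written it, so the owner closing the endpoint in between cannot free the memory under the sender. Whether this matches the actual cause of the crash is not established in this thread.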

@henkmet
Author

henkmet commented Feb 4, 2024

> Just crashed but all the bt I get is this; not sure why so short:
>
> Core was generated by `luakit'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  g_io_channel_write_chars (channel=0x71b, buf=0x5beae0727740 "4", count=8, bytes_written=0x0, error=0x0) at ../glib/glib/giochannel.c:2214
> Downloading source file /usr/src/debug/glib2/build/../glib/glib/giochannel.c
> 2214      g_return_val_if_fail (channel->is_writeable, G_IO_STATUS_ERROR);
> [Current thread is 1 (Thread 0x7b1d834006c0 (LWP 1593))]
> (gdb) bt
> #0  g_io_channel_write_chars (channel=0x71b, buf=0x5beae0727740 "4", count=8, bytes_written=0x0, error=0x0) at ../glib/glib/giochannel.c:2214
> #1  0x00005beadda4fbbb in ??? ()
> #2  0x00007b1e08f3fa45 in g_thread_proxy (data=0x5beadee54740) at ../glib/glib/gthread.c:831
> #3  0x00007b1e08d5e9eb in start_thread (arg=<optimized out>) at pthread_create.c:444
> #4  0x00007b1e08de28ac in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

The original post was from a recent build of luakit, but the backtrace I quote here was from a machine where I forgot to rebuild. Apologies for that.

@apprehensions

I have this issue with #1079 on a stable release. If I build from git, the problem is gone.

luakit 2.3.6-14-g100e0601
  built with: webkit 2.42.5 (installed version: 2.42.5)
                 GTK 3.24.41 
                GLIB 2.78.4 
                SOUP 3.4.4 

@c0dev0id
Member

c0dev0id commented Feb 10, 2024

I'm currently working on this. It happens in various circumstances and it depends on the website.

You're likely seeing this error on the terminal in the latest git version: "Trying to send an ipc message, but the endpoint went away."

I added a check which prevents luakit from crashing (mostly). BUT it does not fix the underlying problem: the webview still fails to initialize properly. This means all functions/keybinds that emit signals on the webview won't work (follow mode / scroll via keyboard).

If you open another page in the same webview (tab), the issue carries over. If you close the tab and open a new one, it works again.

EDIT: Regarding "You're likely seeing this error on the terminal in the latest git version": not in all cases, because I started to check the Lua objects before reaching this point.
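
To make the trade-off of such a check concrete, here is a minimal single-threaded sketch of a send function that refuses to write once the endpoint has been marked as disconnected. All names (guarded_endpoint_t, guarded_send, guarded_endpoint_close) are hypothetical and a fixed buffer stands in for the real channel; this is not the check that was added to luakit. In a multi-threaded program the status test and the write would also need to happen under a lock, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names for illustration only. */
typedef enum { EP_CONNECTED, EP_DISCONNECTED } ep_status_t;

typedef struct {
    ep_status_t status;  /* flipped to EP_DISCONNECTED on EOF/hangup */
    char out[64];        /* stands in for the real channel */
    size_t used;         /* bytes written so far */
} guarded_endpoint_t;

/* Mark the endpoint dead so that later senders can notice. */
void guarded_endpoint_close(guarded_endpoint_t *ep)
{
    assert(ep != NULL);
    ep->status = EP_DISCONNECTED;
}

/* Write len bytes, or drop the message and return false when the
 * endpoint is missing, disconnected, or out of buffer space. */
bool guarded_send(guarded_endpoint_t *ep, const void *data, size_t len)
{
    assert(data != NULL);
    if (ep == NULL || ep->status != EP_CONNECTED) {
        fprintf(stderr, "endpoint went away, dropping %zu bytes\n", len);
        return false;
    }
    if (len > sizeof ep->out - ep->used)
        return false;
    memcpy(ep->out + ep->used, data, len);
    ep->used += len;
    return true;
}
```

The guard keeps the process alive, but the message is still lost. That is consistent with the behaviour described above: no crash, yet signals that should reach the webview never arrive.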

@henkmet
Author

henkmet commented Feb 12, 2024

I just got a longer bt from a crash, again going back and forth in history on an archlinux website. Indeed one of the most recent calls is ipc_send_lua(). Since I already have the bt, I'll attach it:

#0  0x0000764eb4e3abd3 in g_list_last (list=0x31 = {...}, list@entry=0x764ce8005ad0 = {...}) at ../glib/glib/glist.c:1009
#1  0x0000764eb4e3b008 in g_list_append (list=0x764ce8005ad0 = {...}, data=0x5a8ecd9dbe40) at ../glib/glib/glist.c:293
#2  0x0000764eb4e57dc7 in g_queue_push_tail (queue=0x764ce8006ba0, data=<optimized out>) at ../glib/glib/gqueue.c:447
#3  0x00005a8ecacb58e6 in ipc_send_lua ()
#4  0x00005a8ecacd0e97 in ??? ()
#5  0x0000764eb4f9def6 in lj_BC_FUNCC () at buildvm_x86.dasc:857
#6  0x0000764eb4fb0cb3 in lua_pcall (L=0x764ea368c380, nargs=<optimized out>, nresults=-1, errfunc=<optimized out>)
#7  0x00005a8ecacb87b2 in luaH_object_emit_signal ()
#8  0x00005a8ecaccdfe6 in ??? ()
#9  0x0000764eb568c6cd in _gtk_marshal_BOOLEAN__BOXED
    (closure=0x5a8ecd9b43d0, return_value=0x7ffef921db70, param_values=0x7ffef921dc00, marshal_data=<optimized out>, invocation_hint=<optimized out>, n_param_values=<optimized out>) at gtk/gtkmarshalers.c:84
#10 0x0000764eb59363a3 in _gtk_marshal_BOOLEAN__BOXED
    (marshal_data=0x0, invocation_hint=<optimized out>, param_values=0x7ffef921dc00, n_param_values=2, return_value=0x7ffef921db70, closure=0x5a8ecd9b43d0) at gtk/gtkmarshalers.c:70
#11 gtk_widget_draw_marshaller
    (closure=0x5a8ecd9b43d0, return_value=0x7ffef921db70, n_param_values=2, param_values=0x7ffef921dc00, invocation_hint=<optimized out>, marshal_data=0x0) at ../gtk/gtk/gtkwidget.c:951
#12 0x0000764eb4f466c0 in g_closure_invoke
    (closure=0x5a8ecd9b43d0, return_value=0x7ffef921db70, n_param_values=2, param_values=0x7ffef921dc00, invocation_hint=0x7ffef921db50) at ../glib/gobject/gclosure.c:832
#13 0x0000764eb4f74a36 in signal_emit_unlocked_R.isra.0
    (node=node@entry=0x7ffef921dcf0, detail=detail@entry=0, instance=instance@entry=0x5a8ecd973080, emission_return=emission_return@entry=0x7ffef921dd70, instance_and_params=instance_and_params@entry=0x7ffef921dc00) at ../glib/gobject/gsignal.c:3980
#14 0x0000764eb4f65335 in signal_emit_valist_unlocked
    (instance=instance@entry=0x5a8ecd973080, signal_id=signal_id@entry=74, detail=detail@entry=0, var_args=var_args@entry=0x7ffef921de50) at ../glib/gobject/gsignal.c:3625
#15 0x0000764eb4f65c77 in g_signal_emit_valist
    (instance=0x5a8ecd973080, signal_id=74, detail=0, var_args=var_args@entry=0x7ffef921de50) at ../glib/gobject/gsignal.c:3355
#16 0x0000764eb4f65d34 in g_signal_emit (instance=instance@entry=0x5a8ecd973080, signal_id=<optimized out>, detail=detail@entry=0)
    at ../glib/gobject/gsignal.c:3675
#17 0x0000764eb5948ac3 in gtk_widget_draw_internal (widget=0x5a8ecd973080, cr=0x5a8ecc8ea700, clip_to_size=<optimized out>)
    at ../gtk/gtk/gtkwidget.c:7077
#18 0x0000764eb570a7a5 in gtk_container_propagate_draw (container=<optimized out>, child=0x5a8ecd973080, cr=0x5a8ecc8ea700)
    at ../gtk/gtk/gtkcontainer.c:3854
#19 0x0000764eb5815cd6 in gtk_notebook_draw_stack
#20 0x0000764eb570b0b1 in gtk_css_custom_gadget_draw (gadget=<optimized out>, cr=<optimized out>, x=<optimized out>, y=<optimized out>, width=<optimized out>, height=<optimized out>) at ../gtk/gtk/gtkcsscustomgadget.c:159
#21 0x0000764eb5717f1c in gtk_css_gadget_draw (gadget=0x5a8ecc6cd410, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcssgadget.c:885
#22 0x0000764eb56bf42a in gtk_box_gadget_draw (gadget=<optimized out>, cr=0x5a8ecc8ea700, x=<optimized out>, y=<optimized out>, width=<optimized out>, height=<optimized out>) at ../gtk/gtk/gtkboxgadget.c:512
#23 0x0000764eb5717f1c in gtk_css_gadget_draw (gadget=0x5a8ecc6cd2b0, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcssgadget.c:885
#24 0x0000764eb5815d2c in gtk_notebook_draw (widget=<optimized out>, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtknotebook.c:2560
#25 0x0000764eb59489aa in gtk_widget_draw_internal (widget=0x5a8ecc6ccca0, cr=0x5a8ecc8ea700, clip_to_size=<optimized out>) at ../gtk/gtk/gtkwidget.c:7084
#26 0x0000764eb570a7a5 in gtk_container_propagate_draw (container=<optimized out>, child=0x5a8ecc6ccca0, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcontainer.c:3854
#27 0x0000764eb570a8b6 in gtk_container_draw (widget=0x5a8ecc6c5560, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcontainer.c:3674
#28 0x0000764eb59489aa in gtk_widget_draw_internal (widget=0x5a8ecc6c5560, cr=0x5a8ecc8ea700, clip_to_size=<optimized out>) at ../gtk/gtk/gtkwidget.c:7084
#29 0x0000764eb570a7a5 in gtk_container_propagate_draw (container=<optimized out>, child=0x5a8ecc6c5560, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcontainer.c:3854
#30 0x0000764eb570a8b6 in gtk_container_draw (widget=0x5a8ecc6cb780, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcontainer.c:3674
#31 0x0000764eb56b6251 in gtk_box_draw_contents (gadget=<optimized out>, cr=<optimized out>, x=<optimized out>, y=<optimized out>, width=<optimized out>, height=<optimized out>, unused=0x0) at ../gtk/gtk/gtkbox.c:453
#32 0x0000764eb570b0b1 in gtk_css_custom_gadget_draw (gadget=<optimized out>, cr=<optimized out>, x=<optimized out>, y=<optimized out>, width=<optimized out>, height=<optimized out>) at ../gtk/gtk/gtkcsscustomgadget.c:159
#33 0x0000764eb5717f1c in gtk_css_gadget_draw (gadget=0x5a8ecc6cbde0, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcssgadget.c:885
#34 0x0000764eb56b67c5 in gtk_box_draw (widget=<optimized out>, cr=<optimized out>) at ../gtk/gtk/gtkbox.c:462
#35 0x0000764eb59489aa in gtk_widget_draw_internal (widget=0x5a8ecc6cb780, cr=0x5a8ecc8ea700, clip_to_size=<optimized out>) at ../gtk/gtk/gtkwidget.c:7084
#36 0x0000764eb570a7a5 in gtk_container_propagate_draw (container=<optimized out>, child=0x5a8ecc6cb780, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcontainer.c:3854
#37 0x0000764eb570a8b6 in gtk_container_draw (widget=0x5a8ecc6ca430, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcontainer.c:3674
#38 0x0000764eb5768fbf in gtk_event_box_draw (widget=0x5a8ecc6ca430, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkeventbox.c:619
#39 0x0000764eb59489aa in gtk_widget_draw_internal (widget=0x5a8ecc6ca430, cr=0x5a8ecc8ea700, clip_to_size=<optimized out>) at ../gtk/gtk/gtkwidget.c:7084
#40 0x0000764eb570a7a5 in gtk_container_propagate_draw (container=<optimized out>, child=0x5a8ecc6ca430, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcontainer.c:3854
#41 0x0000764eb570a8b6 in gtk_container_draw (widget=0x5a8ecc6c7460, cr=0x5a8ecc8ea700) at ../gtk/gtk/gtkcontainer.c:3674
#42 0x0000764eb59489aa in gtk_widget_draw_internal (widget=0x5a8ecc6c7460, cr=0x5a8ecc8ea700, clip_to_size=<optimized out>) at ../gtk/gtk/gtkwidget.c:7084
#43 0x0000764eb59538d3 in gtk_widget_render (widget=0x5a8ecc6c7460, window=0x5a8ecbe4f030, region=<optimized out>) at ../gtk/gtk/gtkwidget.c:17610
#44 0x0000764eb57efc6b in gtk_main_do_event (event=0x7ffef921e820) at ../gtk/gtk/gtkmain.c:1844
#45 gtk_main_do_event (event=<optimized out>) at ../gtk/gtk/gtkmain.c:1691
#46 0x0000764eb5539b77 in _gdk_event_emit (event=0x7ffef921e820) at ../gtk/gdk/gdkevents.c:73
#47 _gdk_event_emit (event=0x7ffef921e820) at ../gtk/gdk/gdkevents.c:67
#48 0x0000764eb554bb02 in _gdk_window_process_updates_recurse_helper (window=0x5a8ecbe4f030, expose_region=<optimized out>) at ../gtk/gdk/gdkwindow.c:3874
#49 0x0000764eb5550158 in gdk_window_process_updates_internal (window=0x5a8ecbe4f030) at ../gtk/gdk/gdkwindow.c:4020
#50 0x0000764eb5550375 in gdk_window_process_updates_with_mode (recurse_mode=<optimized out>, window=<optimized out>) at ../gtk/gdk/gdkwindow.c:4215
#51 gdk_window_process_updates_with_mode (window=<optimized out>, recurse_mode=<optimized out>) at ../gtk/gdk/gdkwindow.c:4186
#55 0x0000764eb4f65d34 in <emit signal '???' on instance ???> (instance=instance@entry=0x5a8ecbe08180, signal_id=<optimized out>, detail=detail@entry=0) at ../glib/gobject/gsignal.c:3675
    #52 0x0000764eb4f65b73 in _g_closure_invoke_va (param_types=0x0, n_params=<optimized out>, args=0x7ffef921eb50, instance=0x5a8ecbe08180, return_value=0x0, closure=0x5a8ecc76a9e0) at ../glib/gobject/gclosure.c:895
    #53 signal_emit_valist_unlocked (instance=instance@entry=0x5a8ecbe08180, signal_id=signal_id@entry=32, detail=detail@entry=0, var_args=var_args@entry=0x7ffef921eb50) at ../glib/gobject/gsignal.c:3516
    #54 0x0000764eb4f65c77 in g_signal_emit_valist (instance=0x5a8ecbe08180, signal_id=32, detail=0, var_args=var_args@entry=0x7ffef921eb50) at ../glib/gobject/gsignal.c:3355
#56 0x0000764eb5546fe9 in _gdk_frame_clock_emit_paint (frame_clock=0x5a8ecbe08180) at ../gtk/gdk/gdkframeclock.c:657
#57 gdk_frame_clock_paint_idle (data=0x5a8ecbe08180) at ../gtk/gdk/gdkframeclockidle.c:597
#58 0x0000764eb553369e in gdk_threads_dispatch (data=0x5a8ecd9b92c0, data@entry=<error reading variable: value has been optimized out>) at ../gtk/gdk/gdk.c:769
#59 0x0000764eb4e413ee in g_timeout_dispatch (source=0x5a8ecd9b79c0, callback=<optimized out>, user_data=<optimized out>) at ../glib/glib/gmain.c:5121
#60 0x0000764eb4e3ff69 in g_main_dispatch (context=0x5a8ecbd73140) at ../glib/glib/gmain.c:3476
#61 0x0000764eb4e9e3a7 in g_main_context_dispatch_unlocked (context=0x5a8ecbd73140) at ../glib/glib/gmain.c:4284
#62 g_main_context_iterate_unlocked.isra.0 (context=0x5a8ecbd73140, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/glib/gmain.c:4349
#63 0x0000764eb4e40b97 in g_main_loop_run (loop=0x5a8ecbf058e0) at ../glib/glib/gmain.c:4551
#64 0x0000764eb57ed2bf in gtk_main () at ../gtk/gtk/gtkmain.c:1329
#65 0x00005a8ecacb21b3 in main ()

@c0dev0id
Member

Just an update - no solution yet.

When tracing the called functions, this is what is happening:

[    0.143536] E [core/ipc]: ipc_init
[    0.144004] E [core/ipc]: web_extension_connect_thread
[    0.144076] E [core/ipc]: build_socket_path
[    0.145036] I [core/luah]: Loading rc: ./config/rc.lua
[    0.160944] I [lua/lib/adblock]: found 3 filter lists
[    0.496400] I [lua/lib/styles]: found 1 user stylesheet
[    0.583637] E [core/ipc]: initialize_web_extensions_cb
[    0.766051] E [core/ipc]: ipc_recv_extension_init
[    0.786918] E [core/extension/ipc]: web_page_created_cb
[    0.798004] E [core/extension/ipc]: ipc_recv_extension_init
[    0.798032] E [core/extension/ipc]: emit_pending_page_creation_ipc
[    0.798046] E [core/extension/ipc]: emit_page_created_ipc
[    0.798069] E [core/ipc]: ipc_recv_page_created, id: 16
[    0.798078] E [core/widgets/webview]: webview_connect_to_endpoint
[    0.829705] E [core/ipc]: ipc_recv_eval_js
[    0.830579] E [core/ipc]: ipc_recv_eval_js
[    0.830772] E [core/ipc]: ipc_recv_eval_js
[    0.830893] E [core/ipc]: ipc_recv_eval_js
[    0.888124] E [core/ipc]: ipc_recv_eval_js
[    0.888200] E [core/ipc]: ipc_recv_eval_js
[    0.997070] E [core/ipc]: ipc_recv_eval_js

Up to here, everything looks normal. Usually the webview starts up here and the navigation request is handled (with page id 16 in this case).

However, in the archlinux.org case, a lua_ipc message is sent which leads to the re-initialization of the webextension...

[    1.008159] E [core/ipc]: initialize_web_extensions_cb
[    1.184447] E [core/ipc]: ipc_recv_extension_init
[    1.206604] E [core/extension/ipc]: web_page_created_cb
[    1.212188] E [core/extension/ipc]: ipc_recv_extension_init
[    1.212264] E [core/extension/ipc]: emit_pending_page_creation_ipc
[    1.212314] E [core/extension/ipc]: emit_page_created_ipc
[    1.212387] E [core/ipc]: ipc_recv_page_created, id: 20
[    1.212409] E [core/ipc]: webview_get_by_id failed, id: 20
[    1.220767] I [lua/lib/webview]: Requested link: https://archlinux.org/ (text/html)

The reinitialization is triggered by an EOF, which is received on the GIO channel. All we know at this point is that we can't communicate over this channel anymore, so we close it down. I assume this leads to the reinitialization of the webextension.

The questions now are:

  • What causes the EOF message we get from the webview thread?
  • Assuming the EOF is correct: Why can't we find webview 20, right after creating it?

I still have trouble putting my finger on why these things are happening.
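
The second question can be pictured with a toy model of a lookup by page id. In the trace above, page id 16 is connected to a webview, while the later page id 20 is announced and then not found. The sketch below only illustrates that a lookup fails when nothing was ever registered under the announced id; how luakit actually maps page ids to webviews is not shown in this thread, and view_registry_t, registry_add and registry_get_by_id are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VIEWS 8

/* Hypothetical registry for illustration only. */
typedef struct {
    uint64_t page_id;   /* 0 means the slot is free */
    const char *name;   /* stands in for the webview widget */
} view_slot_t;

typedef struct {
    view_slot_t slots[MAX_VIEWS];
} view_registry_t;

/* Register a view under a page id; returns 0 on success, -1 if full. */
int registry_add(view_registry_t *reg, uint64_t page_id, const char *name)
{
    assert(page_id != 0);
    for (size_t i = 0; i < MAX_VIEWS; i++) {
        if (reg->slots[i].page_id == 0) {
            reg->slots[i].page_id = page_id;
            reg->slots[i].name = name;
            return 0;
        }
    }
    return -1;
}

/* Return the slot for page_id, or NULL when no view is registered. */
const view_slot_t *registry_get_by_id(const view_registry_t *reg,
                                      uint64_t page_id)
{
    if (page_id == 0)
        return NULL;
    for (size_t i = 0; i < MAX_VIEWS; i++) {
        if (reg->slots[i].page_id == page_id)
            return &reg->slots[i];
    }
    return NULL;
}
```

With only id 16 registered, a lookup for id 20 returns NULL. In this model, a re-initialized extension that announces a fresh page id which the UI side never associated with a widget would produce exactly that kind of miss.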

@c0dev0id
Member

I have tried a few times to figure this out and have failed. Any help is welcome here.

@intr-cx

intr-cx commented Sep 9, 2024

I've been bitten by this issue countless times now and I'd like to help if I can. So far I've noticed that the same issue happens in Vimb (so I'm guessing Luakit, at least, is not at fault). It's always reproducible when visiting switchedtolinux.com: sometimes as a segfault, sometimes just as the EOF.

@c0dev0id I'm not exactly new to debugging, but Webkit is quite daunting to me. Still, if you could point me in the right direction for debugging this, I'd love to see if I can find something. Luakit is incredibly comfy when it works, and I'd hate to see it die.

@c0dev0id
Member

@intr-cx If it's reliably crashing for you, can you try to roll back the libsoup3 update and see if that fixes it?
The diff on top of the current luakit is here: https://ptrace.org/0001-Downgrade-from-libsoup3-to-libsoup2.patch

I tried to debug the issue, but I didn't manage to get a handle on the signal flow (and where it goes wrong).

@intr-cx

intr-cx commented Sep 12, 2024

@c0dev0id I've compiled luakit with the provided patch, but the segfaults still happen. Either it segfaults or scrolling/hinting breaks, every time. It seems every coredump looks the same.

#0  0x000068b25bb502dd in g_queue_push_tail () at /usr/lib/libglib-2.0.so.0
#1  0x00000890093580df in ipc_send ()
#2  0x000008900935850a in ipc_send_lua ()
#3  0x0000089009376b65 in luaH_webview_eval_js ()
#4  0x000068b2614b4f06 in ??? () at /usr/lib/libluajit-5.1.so.2
#5  0x000068b2614c8684 in lua_pcall () at /usr/lib/libluajit-5.1.so.2
#6  0x000008900935b3e8 in luaH_dofunction ()
#7  0x000008900935c1b4 in luaH_object_emit_signal ()
#8  0x000008900937b833 in expose_cb ()
#9  0x000068b25be786cd in ??? () at /usr/lib/libgtk-3.so.0
#10 0x000068b25c124283 in ??? () at /usr/lib/libgtk-3.so.0
#11 0x000068b261552750 in g_closure_invoke () at /usr/lib/libgobject-2.0.so.0
#12 0x000068b261581ce5 in ??? () at /usr/lib/libgobject-2.0.so.0
#13 0x000068b2615723a0 in ??? () at /usr/lib/libgobject-2.0.so.0
#14 0x000068b261572da7 in g_signal_emit_valist () at /usr/lib/libgobject-2.0.so.0
#15 0x000068b261572e64 in g_signal_emit () at /usr/lib/libgobject-2.0.so.0
#16 0x000068b25c133b93 in ??? () at /usr/lib/libgtk-3.so.0
#17 0x000068b25bef51e1 in gtk_container_propagate_draw () at /usr/lib/libgtk-3.so.0
#18 0x000068b25c001ce6 in ??? () at /usr/lib/libgtk-3.so.0
#19 0x000068b25bef5cc1 in ??? () at /usr/lib/libgtk-3.so.0
#20 0x000068b25bf0a3e2 in ??? () at /usr/lib/libgtk-3.so.0
#21 0x000068b25beab301 in ??? () at /usr/lib/libgtk-3.so.0
#22 0x000068b25bf0a3e2 in ??? () at /usr/lib/libgtk-3.so.0
#23 0x000068b25c001d3c in ??? () at /usr/lib/libgtk-3.so.0
#24 0x000068b25c133a7a in ??? () at /usr/lib/libgtk-3.so.0
#25 0x000068b25bef51e1 in gtk_container_propagate_draw () at /usr/lib/libgtk-3.so.0
#26 0x000068b25bef530e in ??? () at /usr/lib/libgtk-3.so.0
#27 0x000068b25c133a7a in ??? () at /usr/lib/libgtk-3.so.0
#28 0x000068b25bef51e1 in gtk_container_propagate_draw () at /usr/lib/libgtk-3.so.0
#29 0x000068b25bef530e in ??? () at /usr/lib/libgtk-3.so.0
#30 0x000068b25bea2441 in ??? () at /usr/lib/libgtk-3.so.0
#31 0x000068b25bef5cc1 in ??? () at /usr/lib/libgtk-3.so.0
#32 0x000068b25bf0a3e2 in ??? () at /usr/lib/libgtk-3.so.0
#33 0x000068b25bea29b5 in ??? () at /usr/lib/libgtk-3.so.0
#34 0x000068b25c133a7a in ??? () at /usr/lib/libgtk-3.so.0
#35 0x000068b25bef51e1 in gtk_container_propagate_draw () at /usr/lib/libgtk-3.so.0
#36 0x000068b25bef530e in ??? () at /usr/lib/libgtk-3.so.0
#37 0x000068b25bf5530f in ??? () at /usr/lib/libgtk-3.so.0
#38 0x000068b25c133a7a in ??? () at /usr/lib/libgtk-3.so.0
#39 0x000068b25bef51e1 in gtk_container_propagate_draw () at /usr/lib/libgtk-3.so.0
#40 0x000068b25bef530e in ??? () at /usr/lib/libgtk-3.so.0
#41 0x000068b25c133a7a in ??? () at /usr/lib/libgtk-3.so.0
#42 0x000068b25c1419a3 in ??? () at /usr/lib/libgtk-3.so.0
#43 0x000068b25bfdbefb in gtk_main_do_event () at /usr/lib/libgtk-3.so.0
#44 0x000068b2619d1bd7 in ??? () at /usr/lib/libgdk-3.so.0
#45 0x000068b2619e3be2 in ??? () at /usr/lib/libgdk-3.so.0
#46 0x000068b2619e8328 in ??? () at /usr/lib/libgdk-3.so.0
#47 0x000068b2619e854d in ??? () at /usr/lib/libgdk-3.so.0
#48 0x000068b261572ca2 in ??? () at /usr/lib/libgobject-2.0.so.0
#49 0x000068b261572da7 in g_signal_emit_valist () at /usr/lib/libgobject-2.0.so.0
#50 0x000068b261572e64 in g_signal_emit () at /usr/lib/libgobject-2.0.so.0
#51 0x000068b2619df1b1 in ??? () at /usr/lib/libgdk-3.so.0
#52 0x000068b2619cb6ee in ??? () at /usr/lib/libgdk-3.so.0
#53 0x000068b25bb3afce in ??? () at /usr/lib/libglib-2.0.so.0
#54 0x000068b25bb39c19 in ??? () at /usr/lib/libglib-2.0.so.0
#55 0x000068b25bb9a2af in ??? () at /usr/lib/libglib-2.0.so.0
#56 0x000068b25bb3a887 in g_main_loop_run () at /usr/lib/libglib-2.0.so.0
#57 0x000068b25bfd947f in gtk_main () at /usr/lib/libgtk-3.so.0
#58 0x0000089009357752 in main ()
