RESOLVED FIXED 280003
REGRESSION(283414@main): [WPE][GTK] Network process crash when writing to pid socket
https://bugs.webkit.org/show_bug.cgi?id=280003
Michael Catanzaro
Reported 2024-09-19 12:35:02 PDT
So I found a few crashes in my coredumpctl today:

Thu 2024-09-19 10:03:03 CDT 110106 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 10:05:28 CDT 111069 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 10:06:03 CDT 111704 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 10:06:07 CDT 111836 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 10:06:49 CDT 112310 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 10:07:21 CDT 112671 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 10:27:40 CDT 116131 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 10:30:35 CDT 116661 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 10:30:37 CDT 116859 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 11:09:15 CDT 138281 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:04:28 CDT 159776 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:05:13 CDT 160735 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:06:00 CDT 161122 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:10:19 CDT 162120 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:10:36 CDT 162455 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:10:53 CDT 162794 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:40:45 CDT 10583 1000 1000 SIGABRT present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:42:40 CDT 173060 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:58:12 CDT 175517 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:58:28 CDT 175969 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 12:58:36 CDT 176337 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 13:00:58 CDT 176767 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 13:03:30 CDT 177475 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 13:11:05 CDT 179595 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 13:11:39 CDT 179823 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 13:11:46 CDT 180051 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 14:05:38 CDT 202981 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 14:06:12 CDT 203416 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 14:07:53 CDT 204229 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 14:07:53 CDT 204347 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 14:10:26 CDT 210788 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 14:10:35 CDT 211294 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess
Thu 2024-09-19 14:11:27 CDT 212373 1000 1000 SIGTRAP present /usr/libexec/webkitgtk-6.0/WebKitNetworkProcess

That's only half of them. I'll spare you the rest.
Backtrace:

#0 g_log_structured_array (log_level=<optimized out>, fields=0x7ffd526365f0, n_fields=3) at ../glib/gmessages.c:426
#1 0x00007ff10ef1c346 in g_log_default_handler (log_domain=log_domain@entry=0x0, log_level=log_level@entry=6, message=message@entry=0x55b1f8ef0e90 "sendPIDToPeer: Failed to send pid: Broken pipe", unused_data=unused_data@entry=0x0) at ../glib/gmessages.c:3357
#2 0x00007ff10ef1c5cc in g_logv (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, format=<optimized out>, args=args@entry=0x7ffd52636740) at ../glib/gmessages.c:1246
#3 0x00007ff10ef1c943 in g_log (log_domain=<optimized out>, log_level=<optimized out>, format=<optimized out>) at ../glib/gmessages.c:1315
#4 0x00007ff1140a3a63 in IPC::sendPIDToPeer (socket=<optimized out>) at /buildstream/gnome/sdk/webkitgtk-6.0.bst/Source/WebKit/Platform/IPC/unix/ConnectionUnix.cpp:608
#5 0x00007ff1140cb8b3 in WebKit::AuxiliaryProcessMainCommon::parseCommandLine (this=0x7ffd526368d8, argc=<optimized out>, argv=<optimized out>) at /buildstream/gnome/sdk/webkitgtk-6.0.bst/Source/WebKit/Shared/unix/AuxiliaryProcessMain.cpp:90
#6 0x00007ff1140277c4 in WebKit::AuxiliaryProcessMainBase<WebKit::NetworkProcess, false>::run (this=0x7ffd526368d0, argc=4, argv=0x7ffd52636a68) at /buildstream/gnome/sdk/webkitgtk-6.0.bst/Source/WebKit/Shared/AuxiliaryProcessMain.h:66
#7 WebKit::AuxiliaryProcessMain<WebKit::NetworkProcessMainSoup> (argc=4, argv=0x7ffd52636a68) at /buildstream/gnome/sdk/webkitgtk-6.0.bst/Source/WebKit/Shared/AuxiliaryProcessMain.h:98
#8 0x00007ff11342b188 in __libc_start_call_main (main=main@entry=0x55b1e0011150 <main(int, char**)>, argc=argc@entry=4, argv=argv@entry=0x7ffd52636a68) at ../sysdeps/nptl/libc_start_call_main.h:58
#9 0x00007ff11342b24b in __libc_start_main_impl (main=0x55b1e0011150 <main(int, char**)>, argc=4, argv=0x7ffd52636a68, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd52636a58) at ../csu/libc-start.c:360
#10 0x000055b1e0011085 in _start () at ../sysdeps/x86_64/start.S:115

Since 283414@main, this would surely occur if the UI process quits after it begins launching the network process but before it completes. But that also makes no sense. That should be an extremely small race window, less than 200 milliseconds and much too small to explain the large number of crashes here. You can also see it's common to crash 3 times within 1 minute, but I'm not opening and closing Epiphany that frequently.

I think we should make the network process exit gracefully with exit status 1 when this happens, to avoid generating a core dump. That's easy enough, but we should try to understand why this is happening.
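
For illustration only, here is a minimal sketch of the "exit with status 1 instead of aborting" idea. It is not the patch that landed and does not use WebKit's real IPC code: the helper names trySendPIDToPeer and exitStatusAfterSendingPID are hypothetical, and it assumes a Linux AF_UNIX socket where MSG_NOSIGNAL is available.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not WebKit's sendPIDToPeer): try to write our pid to
// the peer. Returns 0 on success, or the errno value describing the failure.
static int trySendPIDToPeer(int socketFD)
{
    assert(socketFD >= 0);
    pid_t pid = getpid();
    ssize_t result;
    do {
        // MSG_NOSIGNAL (Linux): a closed peer yields EPIPE via errno
        // instead of raising SIGPIPE.
        result = send(socketFD, &pid, sizeof(pid), MSG_NOSIGNAL);
    } while (result == -1 && errno == EINTR);

    if (result == -1)
        return errno;
    // A short write of a pid-sized message is not expected; treat it as an I/O error.
    return result == static_cast<ssize_t>(sizeof(pid)) ? 0 : EIO;
}

// Hypothetical caller: returns the exit status the child process should use.
// A failed send is logged as a plain message rather than a fatal error, so
// the process can exit with status 1 without generating a core dump.
static int exitStatusAfterSendingPID(int socketFD)
{
    int error = trySendPIDToPeer(socketFD);
    if (!error)
        return 0;
    std::fprintf(stderr, "Failed to send pid: %s\n", std::strerror(error));
    return 1;
}
```

In this sketch the child's startup code would call exitStatusAfterSendingPID() on the pid socket and pass a nonzero result straight to exit(), which is the behavior proposed above; whether that matches the eventual fix is not something this thread states.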
Michael Catanzaro
Comment 1 2024-09-19 12:42:07 PDT
(In reply to Michael Catanzaro from comment #0)
> I think we should make the network process exit gracefully with exit status
> 1 when this happens, to avoid generating a core dump.

This should be safe to do regardless because the UI process will crash (g_error("Failed to read pid from child process")) *unless* the ProcessLauncher has been destroyed (destroying its GSocketMonitor), so we're not going to fail to notice any problems. Maybe process launch is being canceled for some reason?
Michael Catanzaro
Comment 2 2024-09-19 14:19:20 PDT
(In reply to Michael Catanzaro from comment #1)
> This should be safe to do regardless because the UI process will crash
> (g_error("Failed to read pid from child process")) *unless* the
> ProcessLauncher has been destroyed (destroying its GSocketMonitor), so we're
> not going to fail to notice any problems. Maybe process launch is being
> canceled for some reason?

If this hypothesis is true, that would mean the NetworkProcessProxy is being quickly created and then destroyed. But that seems very strange?
Michael Catanzaro
Comment 3 2024-09-19 14:37:24 PDT
Michael Catanzaro
Comment 4 2024-09-20 06:22:08 PDT
OK, surprise! When closing Epiphany just now with Ctrl+Q, I noticed a very slight UI process hang before it quit. That was this network process crash. For some reason, WebKit is launching a new network process right when it quits! No wonder the ProcessLauncher gets destroyed. But it usually doesn't happen. Unfortunately I don't know how to reproduce it. And most of the crashes I see occur when the UI process is running normally, not just before it is closed.
EWS
Comment 5 2024-09-20 06:32:55 PDT
Committed 283981@main (1f6c2306d3ed): <https://commits.webkit.org/283981@main> Reviewed commits have been landed. Closing PR #33934 and removing active labels.
Michael Catanzaro
Comment 6 2024-09-20 06:43:03 PDT
(In reply to Michael Catanzaro from comment #2)
> If this hypothesis is true, that would mean the NetworkProcessProxy is being
> quickly created and then destroyed. But that seems very strange?

Reported bug #280061 for follow up.