WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 264824
[GStreamer] Crash after 10 seconds on watchdog thread due to loop when destroying ~ImageDecoderGStreamerSample
https://bugs.webkit.org/show_bug.cgi?id=264824
Summary
[GStreamer] Crash after 10 seconds on watchdog thread due to loop when destro...
Michael Catanzaro
Reported
2023-11-14 10:46:35 PST
Created
attachment 468593
[details]
backtrace Similar to
bug #257551
,
bug #259504
, and
bug #260796
, here's another crash after 10 seconds when the main thread has deadlocked deep inside GStreamer. The main thread is stuck here: hread 2 (Thread 0x7f115d3e7f40 (LWP 2)): #0 0x00007f11655113cb in __GI_sched_yield () at ../sysdeps/unix/syscall-template.S:120 #1 0x00007f11610b1279 in g_thread_yield () at ../glib/gthread-posix.c:1329 #2 0x00007f1161d55ad5 in default_stop (pool=0x7f0e941fcf70 [GstVideoBufferPool|videobufferpool1602]) at ../gst/gstbufferpool.c:415 #3 0x00007f1161d4eb53 in do_stop (pool=pool@entry=0x7f0e941fcf70 [GstVideoBufferPool|videobufferpool1602]) at ../gst/gstbufferpool.c:440 #4 0x00007f1161d56b70 in dec_outstanding (pool=0x7f0e941fcf70 [GstVideoBufferPool|videobufferpool1602]) at ../gst/gstbufferpool.c:1207 #5 dec_outstanding (pool=0x7f0e941fcf70 [GstVideoBufferPool|videobufferpool1602]) at ../gst/gstbufferpool.c:1197 #6 gst_buffer_pool_release_buffer (buffer=0x7f0e941fff80 [GstBuffer], pool=0x7f0e941fcf70 [GstVideoBufferPool|videobufferpool1602]) at ../gst/gstbufferpool.c:1393 #7 gst_buffer_pool_release_buffer (pool=0x7f0e941fcf70 [GstVideoBufferPool|videobufferpool1602], buffer=0x7f0e941fff80 [GstBuffer]) at ../gst/gstbufferpool.c:1372 #8 0x00007f1161d56bb8 in _gst_buffer_dispose (buffer=0x7f0e941fff80 [GstBuffer]) at ../gst/gstbuffer.c:776 #9 0x00007f1161d8c0d4 in gst_mini_object_unref (mini_object=0x7f0e941fff80 [GstBuffer]) at ../gst/gstminiobject.c:669 #10 0x00007f1161dbefca in _gst_sample_free (sample=0x55b845127710 [GstSample]) at ../gst/gstsample.c:87 #11 0x00007f1167d89570 in WTF::GRefPtr<_GstSample>::clear() (this=0x7f109a9f07b8) at WTF/Headers/wtf/glib/GRefPtr.h:90 #12 WebCore::ImageGStreamer::~ImageGStreamer() (this=0x7f109a9f07b0) at /buildstream/gnome/sdk/webkitgtk-6.0.bst/Source/WebCore/platform/graphics/gstreamer/ImageGStreamerCairo.cpp:175 All-threads backtrace attached.
Attachments
backtrace
(94.92 KB, text/plain)
2023-11-14 10:46 PST
,
Michael Catanzaro
no flags
Details
View All
Add attachment
proposed patch, testcase, etc.
Philippe Normand
Comment 1
2023-11-15 07:00:03 PST
Deadlock or infinite loop?
Michael Catanzaro
Comment 2
2023-11-15 07:05:28 PST
Hm, I guess it's indeed an infinite (or excessively slow) loop inside default_stop().
Philippe Normand
Comment 3
2023-11-15 07:18:32 PST
It's a bit weird, maybe something buggy in gst_video_convert_sample()... The returned sample is ref'ed in the decoder buffer pool, which was disposed before returning from gst_video_convert_sample() ... We might need to do a deep copy of the sample I think...
Philippe Normand
Comment 4
2023-11-15 07:41:48 PST
I filed a GStreamer bug about this:
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3131
Michael Catanzaro
Comment 5
2023-11-15 08:04:45 PST
I just hit this again when viewing:
https://news.sky.com/story/plane-forced-to-return-to-airport-after-horse-gets-loose-on-board-13008688
This crash occurred shortly after a different web process died to
bug #264674
when attempting to display the same page. Notably, the UI process died as well, although there is no core dump for it, so presumably it quit itself and we'll never learn why. :/ I also found a huge spam of warnings in my journal. The first look like this: Nov 15 09:55:54 chargestone-cave org.gnome.Epiphany.Devel.desktop[35157]: (WebKitWebProcess:2): GStreamer-CRITICAL **: 09:55:54.594: gst_poll_write_control: assertion 'set != NULL' failed Nov 15 09:55:54 chargestone-cave org.gnome.Epiphany.Devel.desktop[35157]: (WebKitWebProcess:2): GStreamer-CRITICAL **: 09:55:54.594: gst_poll_write_control: assertion 'set != NULL' failed Nov 15 09:55:54 chargestone-cave org.gnome.Epiphany.Devel.desktop[35157]: (WebKitWebProcess:2): GStreamer-CRITICAL **: 09:55:54.594: gst_poll_read_control: assertion 'set != NULL' failed So that explains why the loop never quits: while (!gst_poll_read_control (priv->poll)) It will always return FALSE due to the precondition violation.
Philippe Normand
Comment 6
2023-11-15 10:00:42 PST
This makes no sense and I am unable to reproduce this issue. Sorry.
Michael Catanzaro
Comment 7
2024-01-01 06:50:34 PST
I believe I've effectively lost my system journal to this bug; I now have only a few days of history in my journal instead of the several years of history that I would expect. It's all spammed with: gst_poll_read_control: assertion 'set != NULL' failed I see about 30,000 of these in the first two seconds of my journal; after that, there's a 30 second gap and then more of the same. I haven't checked beyond that but I think it's safe to assume that my system journal is probably >99% this. Can we turn this into a g_assert() to crash the process instead of spam the journal? (Arguably there is a systemd bug here for even allowing one process to create this much spam, but oh well.)
Michael Catanzaro
Comment 8
2024-01-25 07:23:26 PST
How about we add an assertion to crash the process when this bug is hit, so it doesn't spam the journal? I'm trying to investigate a hardware error that happened yesterday at the same time that I hit this bug, and having to wade through thousands of lines of this is not helpful. I gave up on hitting PageDown after 100,000 lines of this error message, and I suspect that might be only a small fraction of the spam; it's so much that journalctl isn't even able to page to the end.
Philippe Normand
Comment 9
2024-01-25 10:36:12 PST
I'm not sure to follow. iiuc: gst_poll_read_control: assertion 'set != NULL' failed happens because we run out of descriptors. In any case that's deep in GStreamer. So not sure what you want to do about this. Obviously the proper fix would be to fix the leaks :)
Michael Catanzaro
Comment 10
2024-01-25 11:39:43 PST
What I'm suggesting as a stopgap measure until this can be properly investigated and fixed. Something like: diff --git a/subprojects/gstreamer/gst/gstbufferpool.c b/subprojects/gstreamer/gst/gstbufferpool.c index d7da0cd4ad..d06c78fd6a 100644 --- a/subprojects/gstreamer/gst/gstbufferpool.c +++ b/subprojects/gstreamer/gst/gstbufferpool.c @@ -408,6 +408,7 @@ default_stop (GstBufferPool * pool) /* clear the pool */ while ((buffer = gst_atomic_queue_pop (priv->queue))) { + g_assert (priv->poll != NULL); while (!gst_poll_read_control (priv->poll)) { if (errno == EWOULDBLOCK) { /* We put the buffer into the queue but did not finish writing control While a crash of course is not great, it would be *much* less inconvenient than a denial of service attack against the system journal. :)
Philippe Normand
Comment 11
2024-02-16 08:39:14 PST
(In reply to Michael Catanzaro from
comment #5
)
> I just hit this again when viewing: > >
https://news.sky.com/story/plane-forced-to-return-to-airport-after-horse
- > gets-loose-on-board-13008688 >
After disabling my pihole I can reproduce the warning (GStreamer-CRITICAL **: 16:30:29.127: gst_poll_write_control: assertion 'set != NULL' failed) in TP (gotta scroll down to the bottom and wait). I can't reproduce this with current git main, so ... shrug. Maybe this was fixed then? There were a bunch of leak fixes in the last couple months.
Philippe Normand
Comment 12
2024-02-16 09:24:52 PST
OK I can reproduce the issue after setting autoplay-policy=deny and enabling leak tracing...
Philippe Normand
Comment 13
2024-02-17 05:11:11 PST
Pull request:
https://github.com/WebKit/WebKit/pull/24684
EWS
Comment 14
2024-02-20 03:50:15 PST
Committed
275032@main
(e653f5a19d03): <
https://commits.webkit.org/275032@main
> Reviewed commits have been landed. Closing PR #24684 and removing active labels.
Radar WebKit Bug Importer
Comment 15
2024-02-20 03:51:20 PST
<
rdar://problem/123304014
>
Michael Catanzaro
Comment 16
2024-03-01 06:28:18 PST
***
Bug 266573
has been marked as a duplicate of this bug. ***
Philippe Normand
Comment 17
2024-03-08 09:40:24 PST
***
Bug 270698
has been marked as a duplicate of this bug. ***
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug