Bug 222763 - [GStreamer] Crashes deep in GStreamer under gst_element_add_pad
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Media
Version: WebKit Nightly Build
Hardware: PC Linux
Importance: P2 Normal
Assignee: Philippe Normand
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-03-04 15:20 PST by Michael Catanzaro
Modified: 2021-03-23 08:05 PDT

Attachments
bt full (11.40 KB, text/plain), 2021-03-04 15:21 PST, Michael Catanzaro
thread apply all bt (163.45 KB, text/plain), 2021-03-04 15:23 PST, Michael Catanzaro
log? (677.27 KB, text/plain), 2021-03-07 16:50 PST, Michael Catanzaro
log! (870.94 KB, text/plain), 2021-03-08 10:38 PST, Michael Catanzaro
Patch (5.08 KB, patch), 2021-03-09 04:46 PST, Philippe Normand
Patch (5.19 KB, patch), 2021-03-09 08:11 PST, Philippe Normand

Description Michael Catanzaro 2021-03-04 15:20:59 PST
All pages on vox.com have become very crashy. I can look at my coredumpctl history to see when I take my lunch break. :P

Thu 2021-03-04 13:28:13 CST  165981  1000  1000   6 present   /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
Thu 2021-03-04 13:28:41 CST  166244  1000  1000   6 present   /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
Thu 2021-03-04 13:29:15 CST  166901  1000  1000   6 present   /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
Thu 2021-03-04 13:30:21 CST  168105  1000  1000   6 present   /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
Thu 2021-03-04 17:11:34 CST  255740  1000  1000   6 present   /usr/libexec/webkit2gtk-4.0/WebKitWebProcess

I think this started happening with 2.31.90, but I'm not certain. Backtrace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ff1de354855 in __GI_abort () at abort.c:79
#2  0x00007ff1deb5e6eb in  () at /usr/lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
#3  0x00007ff1d9f3df75 in ffi_call_unix64 () at ../src/x86/unix64.S:101
#4  0x00007ff1d9f3d369 in ffi_call_int
    (cif=<optimized out>, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>, closure=<optimized out>) at ../src/x86/ffi64.c:669
#9  0x00007ff1de0119c3 in <emit signal ??? on instance 0x7fefa000c020 [GstDecodebin3]>
    (instance=instance@entry=0x7fefa000c020, signal_id=<optimized out>, detail=detail@entry=0)
    at ../gobject/gsignal.c:3553
    #5  0x00007ff1ddff8a0c in g_cclosure_marshal_generic
    (closure=closure@entry=0x7fefa0010db0, return_gvalue=return_gvalue@entry=0x0, n_param_values=n_param_values@entry=2, param_values=param_values@entry=0x7ff00d7f93f0, invocation_hint=invocation_hint@entry=0x7ff00d7f9370, marshal_data=marshal_data@entry=0x0) at ../gobject/gclosure.c:1510
    #6  0x00007ff1ddff7f3f in g_closure_invoke
    (closure=0x7fefa0010db0, return_value=return_value@entry=0x0, n_param_values=2, param_values=param_values@entry=0x7ff00d7f93f0, invocation_hint=invocation_hint@entry=0x7ff00d7f9370) at ../gobject/gclosure.c:810
    #7  0x00007ff1de00ad4b in signal_emit_unlocked_R
    (node=node@entry=0x560ea74f5600, detail=detail@entry=0, instance=instance@entry=0x7fefa000c020, emission_return=emission_return@entry=0x0, instance_and_params=instance_and_params@entry=0x7ff00d7f93f0) at ../gobject/gsignal.c:3741
    #8  0x00007ff1de011861 in g_signal_emit_valist
    (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ff00d7f9590) at ../gobject/gsignal.c:3497
#10 0x00007ff1db3405a0 in gst_element_add_pad
    (element=element@entry=0x7fefa000c020 [GstDecodebin3], pad=0x7fefa000f610 [GstGhostPad])
    at ../gst/gstelement.c:714
#11 0x00007ff154189433 in reconfigure_output_stream (output=0x7fef9c0024d0, slot=0x7fef9801edc0)
    at ../gst/playback/gstdecodebin3.c:2254
#12 0x00007ff154189b4f in multiqueue_src_probe
    (pad=pad@entry=0x7fef980111c0 [GstPad], info=info@entry=0x7ff00d7f9950, slot=0x7fef9801edc0)
    at ../gst/playback/gstdecodebin3.c:1791
#13 0x00007ff1db35c2ee in probe_hook_marshal (hook=0x7fef980122c0, data=0x7ff00d7f9820) at ../gst/gstpad.c:3565
#14 0x00007ff1ddef0466 in g_hook_list_marshal
    (hook_list=hook_list@entry=0x7fef98011258, may_recurse=may_recurse@entry=1, marshaller=marshaller@entry=0x7ff1db35bee0 <probe_hook_marshal>, data=data@entry=0x7ff00d7f9820) at ../glib/ghook.c:672
#15 0x00007ff1db35b9d9 in do_probe_callbacks
    (pad=pad@entry=0x7fef980111c0 [GstPad], info=<optimized out>, defaultval=defaultval@entry=GST_FLOW_OK)
    at ../gst/gstpad.c:3728
#16 0x00007ff1db35f1c5 in gst_pad_push_event_unchecked
    (pad=pad@entry=0x7fef980111c0 [GstPad], event=0x7fef9801f6c0 [GstEvent], type=type@entry=GST_PAD_PROBE_TYPE_EVENT_DOWNSTREAM) at ../gst/gstpad.c:5376
#17 0x00007ff1db35f758 in push_sticky
    (pad=pad@entry=0x7fef980111c0 [GstPad], ev=ev@entry=0x7ff00d7f9a30, user_data=user_data@entry=0x7ff00d7f9aa0)
    at ../gst/gstevent.h:438
#18 0x00007ff1db35d0b0 in events_foreach
    (pad=pad@entry=0x7fef980111c0 [GstPad], func=func@entry=0x7ff1db35f700 <push_sticky>, user_data=user_data@entry=0x7ff00d7f9aa0) at ../gst/gstpad.c:608
#19 0x00007ff1db368400 in check_sticky (event=0x7fef9801f6c0 [GstEvent], pad=0x7fef980111c0 [GstPad])
    at ../gst/gstpad.c:3986
#20 gst_pad_push_event (pad=0x7fef980111c0 [GstPad], event=event@entry=0x7fef9801f6c0 [GstEvent])
    at ../gst/gstpad.c:5542
#21 0x00007ff15c258474 in gst_single_queue_push_one
    (allow_drop=<synthetic pointer>, object=0x7fef9801f6c0 [GstEvent], sq=0x7fef98020ff0, mq=0x560ea79906f0 [GstMultiQueue]) at ../plugins/elements/gstmultiqueue.c:1688
#22 gst_multi_queue_loop (pad=<optimized out>) at ../plugins/elements/gstmultiqueue.c:1959
#23 0x00007ff1db396017 in gst_task_func (task=0x7fefa4029dd0 [GstTask]) at ../gst/gsttask.c:328
#24 0x00007ff1ddf2bea4 in g_thread_pool_thread_proxy (data=<optimized out>) at ../glib/gthreadpool.c:354
#25 0x00007ff1ddf2b5a1 in g_thread_proxy (data=0x7fef98003460) at ../glib/gthread.c:826
#26 0x00007ff1da68b4d2 in start_thread (arg=<optimized out>) at pthread_create.c:477
#27 0x00007ff1de430323 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Will attach a 'bt full' and a 'thread apply all bt full'.
Comment 1 Michael Catanzaro 2021-03-04 15:21:54 PST
Created attachment 422296
bt full
Comment 2 Michael Catanzaro 2021-03-04 15:23:45 PST
Created attachment 422298
thread apply all bt

I see several threads doing GStreamery things at the same time...
Comment 3 Philippe Normand 2021-03-05 01:15:54 PST
I'm gonna need logs, as usual.
Comment 4 Michael Catanzaro 2021-03-05 07:24:56 PST
Sadly it seems the bug does not occur today. :S
Comment 5 Philippe Normand 2021-03-05 10:37:11 PST
There's a lot going on in the full trace. I suspect the issue is related to the ImageDecoder; having the assert message would be helpful.
Comment 6 Michael Catanzaro 2021-03-05 11:00:04 PST
It was super crashy yesterday but survived my entire lunch break today. I can only assume that the web content itself has changed since yesterday. Sorry I forgot to take a GStreamer debug log; I always forget that. :/

(In reply to Philippe Normand from comment #5)
> There's a lot going on in the full trace. I suspect the issue is related to
> the ImageDecoder; having the assert message would be helpful.

I don't think there's an assertion failure here.
Comment 7 Michael Catanzaro 2021-03-07 16:50:46 PST
Created attachment 422540
log?

This may or may not help... it's too sporadic to catch it when I try to catch it, so I just left WebKit running with GST_DEBUG=6. This is all the terminal scrollback I have, but it covers only one second before the crash and there are multiple browser tabs using GStreamer, so it's probably an unreadable mess. I don't even know if I captured anything interesting. But I'm not sure that I'm going to be able to catch this on its own....
Comment 8 Philippe Normand 2021-03-08 01:32:34 PST
Not very useful indeed, please set GST_DEBUG="3,webkit*:6"
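
For readers unfamiliar with the syntax: the value is a comma-separated list in which a bare number sets the default level and `pattern:level` sets the level for categories whose names match the glob. The sketch below is a standalone model of that resolution, not GStreamer's own parser. `globMatch` and `levelForCategory` are hypothetical helpers, and the sketch assumes numeric levels only and that later entries override earlier ones.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Minimal glob matcher supporting '*' (any run of characters) and '?' (one character).
static bool globMatch(const std::string& pattern, const std::string& text, std::size_t p = 0, std::size_t t = 0)
{
    while (p < pattern.size()) {
        if (pattern[p] == '*') {
            // Try every possible length for the run matched by '*'.
            for (std::size_t skip = t; skip <= text.size(); ++skip) {
                if (globMatch(pattern, text, p + 1, skip))
                    return true;
            }
            return false;
        }
        if (t >= text.size() || (pattern[p] != '?' && pattern[p] != text[t]))
            return false;
        ++p;
        ++t;
    }
    return t == text.size();
}

// Hypothetical helper: resolves the numeric level a category gets from a
// GST_DEBUG-style spec such as "3,webkit*:6".
static int levelForCategory(const std::string& spec, const std::string& category)
{
    int level = 0; // Assumption: nothing is logged unless the spec says so.
    std::istringstream stream(spec);
    std::string entry;
    while (std::getline(stream, entry, ',')) {
        if (entry.empty())
            continue;
        const auto colon = entry.rfind(':');
        // A bare number applies to every category.
        const std::string pattern = colon == std::string::npos ? "*" : entry.substr(0, colon);
        const std::string levelText = colon == std::string::npos ? entry : entry.substr(colon + 1);
        if (globMatch(pattern, category))
            level = std::stoi(levelText); // Assumption: later entries override earlier ones.
    }
    return level;
}
```

Under these assumptions, `levelForCategory("3,webkit*:6", "webkitimagedecoder")` resolves to 6 while an unrelated category such as `decodebin3` stays at 3, which should keep the log focused on the WebKit side.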
Comment 9 Michael Catanzaro 2021-03-08 06:11:23 PST
(In reply to Philippe Normand from comment #8)
> Not very useful indeed, please set GST_DEBUG="3,webkit*:6"

OK. Maybe you could set up a wiki page somewhere with instructions for reporting GStreamer bugs? Even if I'm the only person who knows to look at it, it might make a difference. :P

A couple more observations:

 * I believe this crash also occurs frequently on cnn.com
 * I'm increasingly confident this is a regression introduced between 2.31.1 and 2.31.90
Comment 10 Philippe Normand 2021-03-08 08:26:05 PST
(In reply to Michael Catanzaro from comment #9)
> (In reply to Philippe Normand from comment #8)
> > Not very useful indeed, please set GST_DEBUG="3,webkit*:6"
> 
> OK. Maybe you could set up a wiki page somewhere with instructions for
> reporting GStreamer bugs? Even if I'm the only person who knows to look at
> it, it might make a difference. :P
> 

It's been in the wiki for a while:
https://trac.webkit.org/wiki/WebKitGTK/Debugging#Debuggingmultimediastuff
Comment 11 Michael Catanzaro 2021-03-08 10:38:59 PST
Created attachment 422583
log!

OK, hopefully this one is better. I had two tabs crash at roughly the same time, at the bottom of the log. Sadly, since we don't log process identifiers, I guess it will likely still be very hard to follow.
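
One possible way to untangle interleaved output is to split each line on its leading columns and group by process id and thread. The sketch assumes the default GStreamer debug layout (timestamp, process id, thread pointer, level, category, then the message) and uncolored output. `LogOrigin` and `parseLogOrigin` are hypothetical names. Sandboxed web processes may report the same process id, in which case the thread pointer is likely the more useful key.

```cpp
#include <sstream>
#include <string>

// Leading columns of one debug log line (assumed default layout).
struct LogOrigin {
    std::string timestamp;
    long pid = 0;
    std::string thread;
    std::string level;
    std::string category;
};

// Hypothetical helper: fills `origin` and returns true when the line starts
// with the five expected columns, false for anything else (e.g. GLib warnings).
static bool parseLogOrigin(const std::string& line, LogOrigin& origin)
{
    std::istringstream stream(line);
    if (!(stream >> origin.timestamp >> origin.pid >> origin.thread >> origin.level >> origin.category))
        return false;
    // The thread column is printed as a pointer, so it should start with "0x".
    return origin.thread.rfind("0x", 0) == 0;
}
```

Lines for which `parseLogOrigin` returns false can simply be kept with the preceding group, since they usually come from other loggers sharing the terminal.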
Comment 12 Philippe Normand 2021-03-08 10:50:15 PST
bingo:


0:00:00.322674527     2 0x556178f75f60 DEBUG     webkitimagedecoder ImageDecoderGStreamer.cpp:259:connectDecoderPad:<image-decoder-0> New decodebin pad <decodebin3-0:audio_0> caps: audio/x-raw, format=(string)S16LE, layout=(string)interleaved, rate=(int)[ 8000, 96000 ], channels=(int)[ 1, 8 ]

(epiphany:2): epiphany-WARNING **: 12:34:41.832: Web process crashed

I suspect this is hit:

 RELEASE_ASSERT(doCapsHaveType(padCaps.get(), "video"));
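
To illustrate why an audio pad trips that check, here is a standalone model working on serialized caps strings rather than real GstCaps. It is not the WebKit implementation and not the patch attached to this bug; `mediaTypeOf`, `capsHaveType`, `PadAction` and `actionForNewPad` are hypothetical names. It only sketches the idea of ignoring non-video pads instead of asserting on them.

```cpp
#include <string>

// Returns the media type of a serialized caps string, i.e. everything
// before the first comma ("audio/x-raw, format=..." -> "audio/x-raw").
static std::string mediaTypeOf(const std::string& caps)
{
    return caps.substr(0, caps.find(','));
}

// True when the media type starts with the given prefix, e.g. "video".
static bool capsHaveType(const std::string& caps, const std::string& type)
{
    return mediaTypeOf(caps).compare(0, type.size(), type) == 0;
}

enum class PadAction { Connect, Ignore };

// Instead of aborting on unexpected pads, decide what to do with them.
static PadAction actionForNewPad(const std::string& caps)
{
    return capsHaveType(caps, "video") ? PadAction::Connect : PadAction::Ignore;
}
```

With the caps from the log line above (`audio/x-raw, format=(string)S16LE, ...`), `actionForNewPad` returns `PadAction::Ignore`, whereas a hard assertion on the same input would abort the web process.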
Comment 13 Michael Catanzaro 2021-03-08 12:48:03 PST
Hm, could it be a regression from r271396? That's the only commit after 2.31.1 that touched ImageDecoderGStreamer.
Comment 14 Philippe Normand 2021-03-08 12:53:52 PST
(In reply to Michael Catanzaro from comment #13)
> Hm, could it be a regression from r271396? That's the only commit after
> 2.31.1 that touched ImageDecoderGStreamer.

I doubt this is a recent regression. I think you started seeing this when that website started serving videos (with audio) in <img> tags, that's all.
Comment 15 Philippe Normand 2021-03-09 04:46:59 PST
Created attachment 422688
Patch
Comment 16 Philippe Normand 2021-03-09 04:47:42 PST
Michael, I wasn't able to reproduce the bug but this patch should make the image decoder a bit more robust... Can you test?
Comment 17 Thibault Saunier 2021-03-09 04:50:44 PST
Comment on attachment 422688
Patch

lgtm.
Comment 18 Michael Catanzaro 2021-03-09 05:33:49 PST
(In reply to Philippe Normand from comment #16)
> Michael, I wasn't able to reproduce the bug but this patch should make the
> image decoder a bit more robust... Can you test?

The only way to know for sure is to actually run the code all day long for a couple days. I'll add your patch to the GNOME runtime.
Comment 19 Xabier Rodríguez Calvar 2021-03-09 07:29:48 PST
Comment on attachment 422688
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=422688&action=review

> Source/WebCore/platform/graphics/gstreamer/ImageDecoderGStreamer.cpp:335
> +            gst_element_send_event(m_decodebin.get(), gst_event_new_select_streams(streams));

The streams argument of gst_event_new_select_streams is [transfer none]. Shouldn't we free the list after creating the event?
Comment 20 Philippe Normand 2021-03-09 08:11:18 PST
Created attachment 422696
Patch
Comment 21 Michael Catanzaro 2021-03-09 08:29:13 PST
(In reply to Michael Catanzaro from comment #18)
> The only way to know for sure is to actually run the code all day long for a
> couple days. I'll add your patch to the GNOME runtime.

https://gitlab.gnome.org/GNOME/gnome-build-meta/-/merge_requests/1058
Comment 22 Michael Catanzaro 2021-03-10 15:37:58 PST
Zero crashes today. Lots of crashes yesterday. I think you fixed it.
Comment 23 Michael Catanzaro 2021-03-12 07:19:03 PST
I'm now very confident this fix works. Ping multimedia reviewers, let's try to get this into 2.32.0.
Comment 24 EWS 2021-03-12 08:25:35 PST
Committed r274358: <https://commits.webkit.org/r274358>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 422696.
Comment 25 Radar WebKit Bug Importer 2021-03-12 08:26:15 PST
<rdar://problem/75362586>