All pages on vox.com have become very crashy. I can look at my coredumpctl history to see when I take my lunch break. :P
Thu 2021-03-04 13:28:13 CST 165981 1000 1000 6 present /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
Thu 2021-03-04 13:28:41 CST 166244 1000 1000 6 present /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
Thu 2021-03-04 13:29:15 CST 166901 1000 1000 6 present /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
Thu 2021-03-04 13:30:21 CST 168105 1000 1000 6 present /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
Thu 2021-03-04 17:11:34 CST 255740 1000 1000 6 present /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
I think this started happening with 2.31.90, but I'm not certain. Backtrace of the crashing thread:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ff1de354855 in __GI_abort () at abort.c:79
#2 0x00007ff1deb5e6eb in () at /usr/lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
#3 0x00007ff1d9f3df75 in ffi_call_unix64 () at ../src/x86/unix64.S:101
#4 0x00007ff1d9f3d369 in ffi_call_int
(cif=<optimized out>, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>, closure=<optimized out>) at ../src/x86/ffi64.c:669
#5 0x00007ff1ddff8a0c in g_cclosure_marshal_generic
(closure=closure@entry=0x7fefa0010db0, return_gvalue=return_gvalue@entry=0x0, n_param_values=n_param_values@entry=2, param_values=param_values@entry=0x7ff00d7f93f0, invocation_hint=invocation_hint@entry=0x7ff00d7f9370, marshal_data=marshal_data@entry=0x0) at ../gobject/gclosure.c:1510
#6 0x00007ff1ddff7f3f in g_closure_invoke
(closure=0x7fefa0010db0, return_value=return_value@entry=0x0, n_param_values=2, param_values=param_values@entry=0x7ff00d7f93f0, invocation_hint=invocation_hint@entry=0x7ff00d7f9370) at ../gobject/gclosure.c:810
#7 0x00007ff1de00ad4b in signal_emit_unlocked_R
(node=node@entry=0x560ea74f5600, detail=detail@entry=0, instance=instance@entry=0x7fefa000c020, emission_return=emission_return@entry=0x0, instance_and_params=instance_and_params@entry=0x7ff00d7f93f0) at ../gobject/gsignal.c:3741
#8 0x00007ff1de011861 in g_signal_emit_valist
(instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ff00d7f9590) at ../gobject/gsignal.c:3497
#9 0x00007ff1de0119c3 in <emit signal ??? on instance 0x7fefa000c020 [GstDecodebin3]>
(instance=instance@entry=0x7fefa000c020, signal_id=<optimized out>, detail=detail@entry=0)
#10 0x00007ff1db3405a0 in gst_element_add_pad
(element=element@entry=0x7fefa000c020 [GstDecodebin3], pad=0x7fefa000f610 [GstGhostPad])
#11 0x00007ff154189433 in reconfigure_output_stream (output=0x7fef9c0024d0, slot=0x7fef9801edc0)
#12 0x00007ff154189b4f in multiqueue_src_probe
(pad=pad@entry=0x7fef980111c0 [GstPad], info=info@entry=0x7ff00d7f9950, slot=0x7fef9801edc0)
#13 0x00007ff1db35c2ee in probe_hook_marshal (hook=0x7fef980122c0, data=0x7ff00d7f9820) at ../gst/gstpad.c:3565
#14 0x00007ff1ddef0466 in g_hook_list_marshal
(hook_list=hook_list@entry=0x7fef98011258, may_recurse=may_recurse@entry=1, marshaller=marshaller@entry=0x7ff1db35bee0 <probe_hook_marshal>, data=data@entry=0x7ff00d7f9820) at ../glib/ghook.c:672
#15 0x00007ff1db35b9d9 in do_probe_callbacks
(pad=pad@entry=0x7fef980111c0 [GstPad], info=<optimized out>, defaultval=defaultval@entry=GST_FLOW_OK)
#16 0x00007ff1db35f1c5 in gst_pad_push_event_unchecked
(pad=pad@entry=0x7fef980111c0 [GstPad], event=0x7fef9801f6c0 [GstEvent], type=type@entry=GST_PAD_PROBE_TYPE_EVENT_DOWNSTREAM) at ../gst/gstpad.c:5376
#17 0x00007ff1db35f758 in push_sticky
(pad=pad@entry=0x7fef980111c0 [GstPad], ev=ev@entry=0x7ff00d7f9a30, user_data=user_data@entry=0x7ff00d7f9aa0)
#18 0x00007ff1db35d0b0 in events_foreach
(pad=pad@entry=0x7fef980111c0 [GstPad], func=func@entry=0x7ff1db35f700 <push_sticky>, user_data=user_data@entry=0x7ff00d7f9aa0) at ../gst/gstpad.c:608
#19 0x00007ff1db368400 in check_sticky (event=0x7fef9801f6c0 [GstEvent], pad=0x7fef980111c0 [GstPad])
#20 gst_pad_push_event (pad=0x7fef980111c0 [GstPad], event=event@entry=0x7fef9801f6c0 [GstEvent])
#21 0x00007ff15c258474 in gst_single_queue_push_one
(allow_drop=<synthetic pointer>, object=0x7fef9801f6c0 [GstEvent], sq=0x7fef98020ff0, mq=0x560ea79906f0 [GstMultiQueue]) at ../plugins/elements/gstmultiqueue.c:1688
#22 gst_multi_queue_loop (pad=<optimized out>) at ../plugins/elements/gstmultiqueue.c:1959
#23 0x00007ff1db396017 in gst_task_func (task=0x7fefa4029dd0 [GstTask]) at ../gst/gsttask.c:328
#24 0x00007ff1ddf2bea4 in g_thread_pool_thread_proxy (data=<optimized out>) at ../glib/gthreadpool.c:354
#25 0x00007ff1ddf2b5a1 in g_thread_proxy (data=0x7fef98003460) at ../glib/gthread.c:826
#26 0x00007ff1da68b4d2 in start_thread (arg=<optimized out>) at pthread_create.c:477
#27 0x00007ff1de430323 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Will attach a 'bt full' and a 'thread apply all bt full'.
Created attachment 422296 [details]
Created attachment 422298 [details]
thread apply all bt
I see several threads doing GStreamery things at the same time...
I'm gonna need logs, as usual.
Sadly it seems the bug does not occur today. :S
There's a lot going on in the full trace. I suspect the issue is related to the ImageDecoder; having the assert message would be helpful.
It was super crashy yesterday but survived my entire lunch break today. I can only assume that the web content itself has changed since yesterday. Sorry I forgot to take a GStreamer debug log; I always forget that. :/
(In reply to Philippe Normand from comment #5)
> There's a lot going on in the full trace. I suspect the issue is related to
> the ImageDecoder; having the assert message would be helpful.
I don't think there's an assertion failure here.
Created attachment 422540 [details]
This may or may not help... it's too sporadic to catch it when I try to catch it, so I just left WebKit running with GST_DEBUG=6. This is all the terminal scrollback I have, but it covers only one second before the crash and there are multiple browser tabs using GStreamer, so it's probably an unreadable mess. I don't even know if I captured anything interesting. But I'm not sure that I'm going to be able to catch this on its own....
Not very useful indeed, please set GST_DEBUG="3,webkit*:6"
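For anyone unsure what that value means: as far as I understand it, GST_DEBUG is a comma-separated list where a bare number sets the default level and `pattern:level` entries override it for matching categories, so "3,webkit*:6" keeps everything at 3 except the webkit* categories, which go to 6. Below is a rough standalone sketch of that reading. It is not GStreamer's own parser; `DebugSpec`, `parseDebugSpec`, `globMatch` and `levelFor` are made-up names, only numeric levels and the `*` wildcard are handled, and letting a later matching entry win is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative model only; this is NOT GStreamer's implementation.
struct DebugSpec {
    int defaultLevel = 0; // level used when no pattern matches
    std::vector<std::pair<std::string, int>> patterns; // (glob pattern, level), in the order given
};

// Minimal glob matcher that understands '*' only.
bool globMatch(const std::string& pattern, const std::string& text, std::size_t p = 0, std::size_t t = 0)
{
    if (p == pattern.size())
        return t == text.size();
    if (pattern[p] == '*') {
        // Let '*' swallow zero or more characters.
        for (std::size_t skip = t; skip <= text.size(); ++skip) {
            if (globMatch(pattern, text, p + 1, skip))
                return true;
        }
        return false;
    }
    return t < text.size() && pattern[p] == text[t] && globMatch(pattern, text, p + 1, t + 1);
}

// Splits the spec on ',' and sorts entries into the default level or a pattern override.
DebugSpec parseDebugSpec(const std::string& spec)
{
    DebugSpec result;
    std::size_t start = 0;
    while (start <= spec.size()) {
        std::size_t end = spec.find(',', start);
        if (end == std::string::npos)
            end = spec.size();
        const std::string entry = spec.substr(start, end - start);
        const std::size_t colon = entry.find(':');
        if (!entry.empty()) {
            if (colon == std::string::npos)
                result.defaultLevel = std::stoi(entry); // bare number: default level
            else
                result.patterns.emplace_back(entry.substr(0, colon), std::stoi(entry.substr(colon + 1)));
        }
        start = end + 1;
    }
    return result;
}

// Effective level for one category; a later matching entry overrides an earlier one (assumption).
int levelFor(const DebugSpec& spec, const std::string& category)
{
    int level = spec.defaultLevel;
    for (const auto& entry : spec.patterns) {
        if (globMatch(entry.first, category))
            level = entry.second;
    }
    return level;
}
```

With this model, `levelFor(parseDebugSpec("3,webkit*:6"), "webkitimagedecoder")` gives 6, while a non-WebKit category such as "multiqueue" stays at 3, which is what keeps the log readable.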
(In reply to Philippe Normand from comment #8)
> Not very useful indeed, please set GST_DEBUG="3,webkit*:6"
OK. Maybe you could set up a wiki page somewhere with instructions for reporting GStreamer bugs? Even if I'm the only person who knows to look at it, it might make a difference. :P
A couple more observations:
* I believe this crash also occurs frequently on cnn.com
* I'm increasingly confident this is a regression introduced between 2.31.1 and 2.31.90
(In reply to Michael Catanzaro from comment #9)
> (In reply to Philippe Normand from comment #8)
> > Not very useful indeed, please set GST_DEBUG="3,webkit*:6"
> OK. Maybe you could set up a wiki page somewhere with instructions for
> reporting GStreamer bugs? Even if I'm the only person who knows to look at
> it, it might make a difference. :P
It's been in the wiki for a while:
Created attachment 422583 [details]
OK, hopefully this one is better. I had two tabs crash at roughly the same time, at the bottom of the log. Sadly, since we don't log process identifiers, I guess it will likely still be very hard to follow.
0:00:00.322674527 2 0x556178f75f60 DEBUG webkitimagedecoder ImageDecoderGStreamer.cpp:259:connectDecoderPad:<image-decoder-0> New decodebin pad <decodebin3-0:audio_0> caps: audio/x-raw, format=(string)S16LE, layout=(string)interleaved, rate=(int)[ 8000, 96000 ], channels=(int)[ 1, 8 ]
(epiphany:2): epiphany-WARNING **: 12:34:41.832: Web process crashed
I suspect this is hit:
Hm, could it be a regression from r271396? That's the only commit after 2.31.1 that touched ImageDecoderGStreamer.
(In reply to Michael Catanzaro from comment #13)
> Hm, could it be a regression from r271396? That's the only commit after
> 2.31.1 that touched ImageDecoderGStreamer.
I doubt this is a recent regression. I think you started seeing this when that website started serving videos (with audio) in <img> tags, that's all.
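To make the audio point concrete: the pad reported in the log above carries caps starting with audio/x-raw, which an image decoder has no use for. Here is a tiny standalone sketch of telling the two kinds of pads apart from a serialized caps string. `mediaTypeOf` and `isVideoCaps` are hypothetical helpers written for illustration; this is not what the patch on this bug does and it does not use the GStreamer caps API.

```cpp
#include <string>

// Returns the media type of a serialized caps string, i.e. the text before
// the first comma, such as "audio/x-raw". Hypothetical helper, plain string handling only.
std::string mediaTypeOf(const std::string& caps)
{
    std::string type = caps.substr(0, caps.find(','));
    // Drop any trailing spaces left before the comma.
    while (!type.empty() && type.back() == ' ')
        type.pop_back();
    return type;
}

// True when the caps describe a video stream, which is the only kind an
// image decoder would want to connect.
bool isVideoCaps(const std::string& caps)
{
    return mediaTypeOf(caps).rfind("video/", 0) == 0;
}
```

Fed the caps from the log line (audio/x-raw, format=(string)S16LE, ...), `isVideoCaps` returns false, so a decoder using a check like this would simply skip that pad.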
Created attachment 422688 [details]
Michael, I wasn't able to reproduce the bug but this patch should make the image decoder a bit more robust... Can you test?
Comment on attachment 422688 [details]
(In reply to Philippe Normand from comment #16)
> Michael, I wasn't able to reproduce the bug but this patch should make the
> image decoder a bit more robust... Can you test?
The only way to know for sure is to actually run the code all day long for a couple days. I'll add your patch to the GNOME runtime.
Comment on attachment 422688 [details]
View in context: https://bugs.webkit.org/attachment.cgi?id=422688&action=review
> + gst_element_send_event(m_decodebin.get(), gst_event_new_select_streams(streams));
The streams argument of gst_event_new_select_streams is [transfer none]. Shouldn't we free the list after creating the event?
Created attachment 422696 [details]
(In reply to Michael Catanzaro from comment #18)
> The only way to know for sure is to actually run the code all day long for a
> couple days. I'll add your patch to the GNOME runtime.
Zero crashes today. Lots of crashes yesterday. I think you fixed it.
I'm now very confident this fix works. Ping multimedia reviewers, let's try to get this into 2.32.0.
Committed r274358: <https://commits.webkit.org/r274358>
All reviewed patches have been landed. Closing bug and clearing flags on attachment 422696 [details].