Bug 274802

Summary: REGRESSION(274853@main): [GLib][MSE] Excessive CPU usage from cached web process after closing ksdk.com page, failure to display web content after reopening page, part 2
Product: WebKit Reporter: Michael Catanzaro <mcatanzaro>
Component: WebKitGTKAssignee: Philippe Normand <philn>
Status: RESOLVED FIXED    
Severity: Normal CC: bugs-noreply, jean-yves.avenard, mcatanzaro, philn
Priority: P2    
Version: Other   
Hardware: PC   
OS: Linux   
See Also: https://bugs.webkit.org/show_bug.cgi?id=274261
https://bugs.webkit.org/show_bug.cgi?id=269461
Attachments:
Description Flags
gst.log
none
dots none

Michael Catanzaro
Reported 2024-05-28 14:52:18 PDT
This is almost exactly the same as bug #274261, except the reproducer website is different: https://www.ksdk.com/article/news/local/business-journal/why-fewer-cicadas-in-city-of-st-louis/63-387e0227-992c-43f0-80a7-ceaa892a3561?fbclid=IwZXh0bgNhZW0CMTEAAR2haCCONxEqI_SUqSL7ZfjyCQp1vUNTda_lcr7amZH9LuEdFxb-gllbGhQ_aem_AYtd79h6GJYg-D0MbLsiyXo6X6xoUami6ydQijOVtcKrfCeEoXhbuoNmTHbIlNU8PuzYdRluirZ57BMzdQ07BrYE instead of nbcnews.com, and also the underlying cause of the bug is different. Bug #274261 was introduced by 276798@main and fixed in 278929@main. But the the problem on this website is not fixed. I'm having trouble tracking it down, though, because I can only reproduce in Epiphany Tech Preview using WebKitGTK 2.45.2. When I attempt to reproduce the issue in my jhbuild environment so that I can bisect it, the video just doesn't play at all. I wonder if I need to build a bunch of GStreamer stuff in my jhbuild to make the video work.
Attachments
gst.log (618.56 KB, text/x-log)
2024-05-31 06:38 PDT, Michael Catanzaro
no flags
dots (284.04 KB, application/zip)
2024-05-31 06:38 PDT, Michael Catanzaro
no flags
Michael Catanzaro
Comment 1 2024-05-30 13:55:34 PDT
After some debugging with Phil, I eventually noticed that I simply didn't have gstreamer1-plugin-openh264 installed on my host system, so no wonder the video didn't work. Then I was able to bisect. This is caused by 274802@main. Normally I would revert the commit and ask questions later, but this one looks pretty big and is likely going to be awkward to back out and reland. It would be better if we could figure out what is wrong.
Michael Catanzaro
Comment 2 2024-05-30 13:57:39 PDT
(In reply to Michael Catanzaro from comment #1) > This is caused by 274802@main. Sorry, it's caused by 274853@main "[MSE] Add MediaSourceHandle object and update to MSE's latest spec (including MSE in worker)" (274802 is the ID of this bug.)
Philippe Normand
Comment 3 2024-05-31 01:52:46 PDT
I can't load that here... is that geoblocked? Access Denied You don't have permission to access "http://www.ksdk.com/article/news/local/business-journal/why-fewer-cicadas-in-city-of-st-louis/63-387e0227-992c-43f0-80a7-ceaa892a3561" on this server. Reference #18.11b91402.1717145532.33a35968 https://errors.edgesuite.net/18.11b91402.1717145532.33a35968
Michael Catanzaro
Comment 4 2024-05-31 05:48:13 PDT
Drat, it's probably because you're in a GDPR country. This bug is also happening very regularly on reddit.com, but I haven't found a reliable reproducer for it there yet unfortunately. The www.ksdk.com reproducer is 100%. :(
Michael Catanzaro
Comment 5 2024-05-31 06:36:00 PDT
Another reproducer: https://www.firstalert4.com/2024/05/30/time-is-running-out-get-up-500-walmart-part-class-action-settlement/?tbref=hp same procedure Another reproducer: https://apnews.com/live/trump-guilty-verdict-updates, scroll down to "Trump to hold a news conference at Trump Tower", click on the video to start it playing, then close the page
Michael Catanzaro
Comment 6 2024-05-31 06:38:23 PDT
Created attachment 471553 [details] gst.log Here's a debug log taken on ksdk.com
Michael Catanzaro
Comment 7 2024-05-31 06:38:56 PDT
Philippe Normand
Comment 8 2024-05-31 08:35:51 PDT
HTMLMediaElement::mediaPlayerReadyStateChanged() peeked my interest in the perf logs... Turns out something calls it in an infinite loop. I don't what yet, I can continue working on that in the coming days...
Philippe Normand
Comment 9 2024-06-01 02:57:40 PDT
Is the HTMLMediaElement supposed to be destroyed when the tab is closed? Because currently it's not, and thus mediaPlayerReadyStateChanged() loops on itself forever. I'm not familiar with the process suspension logic and this is not related with GStreamer, so I'm not sure I'm the right person to debug this, sorry.
Michael Catanzaro
Comment 10 2024-06-01 05:37:24 PDT
(In reply to Philippe Normand from comment #9) > Is the HTMLMediaElement supposed to be destroyed when the tab is closed? I would expect the entire web page ought to be destroyed? But perhaps a leak keeps the HTMLMediaElement alive? > Because currently it's not, and thus mediaPlayerReadyStateChanged() loops on > itself forever. > > I'm not familiar with the process suspension logic and this is not related > with GStreamer, so I'm not sure I'm the right person to debug this, sorry. This is unrelated to process suspension. Currently there is no process suspension (bug #262794), but even if that was fixed, it's just SIGSTOP/SIGCONT, so suspending the process would only delay the bug until the process is restored from page cache. Anyway, I'll try to revert for now. We'll just have to try again.
Michael Catanzaro
Comment 11 2024-06-01 05:42:24 PDT
There are merge conflicts in 20 files, so a revert won't be a weekend project. I'll probably additionally revert many other commits that depend on this one. This will be pretty disruptive unfortunately :( but we need to get back to a good state. The good news is this hasn't hit 2.44 so stable branch users are not affected yet. Anyway, thanks for investigating. At least now we know the problem is related to HTMLMediaElement not being destroyed. I wonder if we'd be able to reland 276798@main now.... (In reply to Michael Catanzaro from comment #10) > so suspending the process would only delay the bug until > the process is restored from page cache. I meant "web process cache," not page cache.
Michael Catanzaro
Comment 12 2024-06-01 06:39:07 PDT
Phil is pretty nervous about reverting. We're not sure what to do here. What I'm going to do instead is just downgrade Ephy Tech Preview all the way back to 2.44 for now. That's really bad as it means significantly less testing for the past several months of changes in WebKit, but it will eliminate the urgency here at least.
Michael Catanzaro
Comment 13 2024-06-06 08:39:15 PDT
I'm not sure we made the right choice here. The longer we wait to revert, the harder it's going to eventually be. Jean-Yves, your opinion would be helpful. Anyway, looks like there is a reference leak in a loop somewhere. I added a debug patch: diff --git a/Source/WebCore/html/HTMLMediaElement.h b/Source/WebCore/html/HTMLMediaElement.h index 0a87c1f42471..fa15c9453359 100644 --- a/Source/WebCore/html/HTMLMediaElement.h +++ b/Source/WebCore/html/HTMLMediaElement.h @@ -174,8 +174,8 @@ public: using CanMakeWeakPtr<HTMLMediaElement, WeakPtrFactoryInitialization::Eager>::WeakPtrImplType; // ActiveDOMObject. - void ref() const final { HTMLElement::ref(); } - void deref() const final { HTMLElement::deref(); } + void ref() const final { WTFLogAlways("before ref: this=%p refcount=%u", this, refCount()); HTMLElement::ref(); } + void deref() const final { WTFLogAlways("before deref: this=%p refcount=%u", this, refCount()); HTMLElement::deref(); } MediaPlayer* player() const { return m_player.get(); } RefPtr<MediaPlayer> protectedPlayer() const { return m_player; } There are a bunch of HTMLMediaElements on this page, but only one of them suffers from a continually increasing refcount. For example: before deref: this=0x7f36d6084660 refcount=41 before deref: this=0x7f36d6084660 refcount=40 before ref: this=0x7f36d60880b0 refcount=2 before ref: this=0x7f36d60880b0 refcount=3 before deref: this=0x7f36d60880b0 refcount=4 before deref: this=0x7f36d60880b0 refcount=3 before ref: this=0x7f36d6087400 refcount=1 before ref: this=0x7f36d6087400 refcount=2 before deref: this=0x7f36d6087400 refcount=3 before deref: this=0x7f36d6087400 refcount=2 When the page is open, every so often the refcount suddenly drops down to a normal level (less than 10 refs). But after closing the page, the refcount remains permanently stuck at the elevated level. Excessive CPU usage begin immediately when the page is closed and never sooner. I also added breakpoints to the ref and deref functions, but there are a huge number of refs and derefs so I don't think that's a viable path for investigation.
Michael Catanzaro
Comment 14 2024-06-06 08:47:16 PDT
I pressed Ctrl+C to get a web process backtrace: #0 0x00007f1d82b1e86d in __GI___libc_write (fd=2, buf=0x7ffd6a4022b0, nbytes=45) at ../sysdeps/unix/sysv/linux/write.c:26 #1 __GI___libc_write (fd=2, buf=0x7ffd6a4022b0, nbytes=45) at ../sysdeps/unix/sysv/linux/write.c:24 #2 0x00007f1d82aa1175 in _IO_new_file_write (f=0x7f1d82bf74e0 <_IO_2_1_stderr_>, data=0x7ffd6a4022b0, n=45) at fileops.c:1181 #3 0x00007f1d82a9f31d in new_do_write (fp=fp@entry=0x7f1d82bf74e0 <_IO_2_1_stderr_>, data=data@entry=0x7ffd6a4022b0 "before deref: this=0x7f1cf2089680 refcount=7\n", to_do=to_do@entry=45) at /usr/src/debug/glibc-2.39-13.fc40.x86_64/libio/libioP.h:1030 #4 0x00007f1d82aa1381 in _IO_new_file_xsputn (f=0x7f1d82bf74e0 <_IO_2_1_stderr_>, data=<optimized out>, n=45) at fileops.c:1255 #5 _IO_new_file_xsputn (f=0x7f1d82bf74e0 <_IO_2_1_stderr_>, data=<optimized out>, n=45) at fileops.c:1197 #6 0x00007f1d82a6eaf9 in __printf_buffer_flush_to_file (buf=buf@entry=0x7ffd6a402280) at ../libio/libioP.h:1030 #7 0x00007f1d82a6ebbc in __printf_buffer_to_file_done (buf=buf@entry=0x7ffd6a402280) at printf_buffer_to_file.c:120 #8 0x00007f1d82a7a07b in __vfprintf_internal (s=0x7f1d82bf74e0 <_IO_2_1_stderr_>, format=0x7f1cb7e69da0 "before deref: this=%p refcount=%u\n", ap=0x7ffd6a4023b0, mode_flags=0) at vfprintf-internal.c:1560 #9 0x00007f1d868d4021 in vprintf_stderr_common (format=0x7f1cb7e69da0 "before deref: this=%p refcount=%u\n", args=0x7ffd6a4023b0) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/Assertions.cpp:179 #10 vprintf_stderr_with_trailing_newline (format=0x7f1d8bd43e80 "before deref: this=%p refcount=%u", args=args@entry=0x7ffd6a4023b0) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/Assertions.cpp:212 #11 0x00007f1d868d40d8 in WTFLogAlwaysV (format=0x2 <error: Cannot access memory at address 0x2>, args=0x7ffd6a4023b0) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/Assertions.cpp:529 #12 WTFLogAlways (format=0x2 <error: Cannot access memory at address 0x2>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/Assertions.cpp:536 #13 0x00007f1d8a98dc06 in WebCore::HTMLMediaElement::deref (this=0x7f1cf2089680) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/html/HTMLMediaElement.h:178 #14 WTF::DefaultRefDerefTraits<WebCore::HTMLMediaElement>::derefIfNotNull (ptr=0x7f1cf2089680) at WTF/Headers/wtf/Ref.h:62 #15 WTF::Ref<WebCore::HTMLMediaElement, WTF::RawPtrTraits<WebCore::HTMLMediaElement>, WTF::DefaultRefDerefTraits<WebCore::HTMLMediaElement> >::~Ref (this=0x7f1cb7e18c88) at WTF/Headers/wtf/Ref.h:82 #16 WebCore::ActiveDOMObject::PendingActivity<WebCore::HTMLMediaElement>::~PendingActivity (this=0x7f1cb7e18c80) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/ActiveDOMObject.h:94 #17 WTF::RefCounted<WebCore::ActiveDOMObject::PendingActivity<WebCore::HTMLMediaElement> >::deref (this=0x7f1cb7e18c80) at WTF/Headers/wtf/RefCounted.h:220 #18 WTF::DefaultRefDerefTraits<WebCore::ActiveDOMObject::PendingActivity<WebCore::HTMLMediaElement> >::derefIfNotNull (ptr=0x7f1cb7e18c80) at WTF/Headers/wtf/Ref.h:62 #19 WTF::Ref<WebCore::ActiveDOMObject::PendingActivity<WebCore::HTMLMediaElement>, WTF::RawPtrTraits<WebCore::ActiveDOMObject::PendingActivity<WebCore::HTMLMediaElement> >, WTF::DefaultRefDerefTraits<WebCore::ActiveDOMObject::PendingActivity<WebCore::HTMLMediaElement> > >::~Ref (this=0x7f1cc36c75b0) at WTF/Headers/wtf/Ref.h:82 #20 WebCore::ActiveDOMObject::queueTaskKeepingObjectAlive<WebCore::HTMLMediaElement>(WebCore::HTMLMediaElement&, WebCore::TaskSource, WTF::Function<void ()>&&)::{lambda()#1}::~Function() (this=0x7f1cc36c75a8) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/ActiveDOMObject.h:114 #21 0x00007f1d8a98dcb7 in WTF::Detail::CallableWrapper<WebCore::ActiveDOMObject::queueTaskKeepingObjectAlive<WebCore::HTMLMediaElement>(WebCore::HTMLMediaElement&, WebCore::TaskSource, WTF::Function<void ()>&&)::{lambda()#1}, void>::~CallableWrapper() (this=0x7f1cc36c75a0) at WTF/Headers/wtf/Function.h:47 #22 WTF::Detail::CallableWrapper<WebCore::ActiveDOMObject::queueTaskKeepingObjectAlive<WebCore::HTMLMediaElement>(WebCore::HTMLMediaElement&, WebCore::TaskSource, WTF::Function<void ()>&&)::{lambda()#1}, void>::~CallableWrapper() (this=0x7f1cc36c75a0) at WTF/Headers/wtf/Function.h:47 #23 0x00007f1d8a7684f0 in std::default_delete<WTF::Detail::CallableWrapperBase<void> >::operator() (this=<optimized out>, __ptr=0x2c) at /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:93 #24 std::unique_ptr<WTF::Detail::CallableWrapperBase<void>, std::default_delete<WTF::Detail::CallableWrapperBase<void> > >::~unique_ptr (this=0x7f1cc36c75d8) at /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:398 #25 WTF::Function<void ()>::~Function() (this=0x7f1cc36c75d8) at WTF/Headers/wtf/Forward.h:86 #26 WebCore::EventLoopFunctionDispatchTask::~EventLoopFunctionDispatchTask (this=0x7f1cc36c75c0) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/EventLoop.cpp:461 #27 WebCore::EventLoopFunctionDispatchTask::~EventLoopFunctionDispatchTask (this=0x7f1cc36c75c0) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/EventLoop.cpp:461 #28 0x00007f1d8a766260 in std::default_delete<WebCore::EventLoopTask>::operator() (this=0x7f1c767e4a00, __ptr=0x2) at /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:93 #29 std::unique_ptr<WebCore::EventLoopTask, std::default_delete<WebCore::EventLoopTask> >::~unique_ptr (this=0x7f1c767e4a00) at /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:398 #30 WTF::VectorDestructor<true, std::unique_ptr<WebCore::EventLoopTask, std::default_delete<WebCore::EventLoopTask> > >::destruct (begin=0x7f1c767e4a00, end=<optimized out>) at WTF/Headers/wtf/Vector.h:70 #31 WTF::VectorTypeOperations<std::unique_ptr<WebCore::EventLoopTask, std::default_delete<WebCore::EventLoopTask> > >::destruct (begin=0x7f1c767e4a00, end=<optimized out>) at WTF/Headers/wtf/Vector.h:253 #32 WTF::Vector<std::unique_ptr<WebCore::EventLoopTask, std::default_delete<WebCore::EventLoopTask> >, 0ul, WTF::CrashOnOverflow, 16ul, WTF::FastMalloc>::~Vector (this=<optimized out>) at WTF/Headers/wtf/Vector.h:785 #33 WebCore::EventLoop::run (this=0x7f1d6a0ecd20, deadline=<error reading variable: That operation is not available on integers of more than 8 bytes.>) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/EventLoop.cpp:337 #34 0x00007f1d8a800ee3 in WebCore::WindowEventLoop::didReachTimeToRun (this=0x7f1d6a0ecd20) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/WindowEventLoop.cpp:209 #35 0x00007f1d8ae8ceaf in WebCore::ThreadTimers::sharedTimerFiredInternal (this=0x7f1d6a0e0a80) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/platform/ThreadTimers.cpp:125 #36 0x00007f1d8695c72e in WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::$_0::operator()(void*) const (userData=userData@entry=0x7f1d8d1d7f98 <WebCore::MainThreadSharedTimer::singleton()::instance+16>, this=<optimized out>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:177 #37 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::$_0::__invoke(void*) (userData=userData@entry=0x7f1d8d1d7f98 <WebCore::MainThreadSharedTimer::singleton()::instance+16>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:169 #38 0x00007f1d8695b9aa in WTF::RunLoop::$_0::operator() (source=0x26332b0, callback=0x7f1d8695c6a0 <WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::$_0::__invoke(void*)>, userData=0x7f1d8d1d7f98 <WebCore::MainThreadSharedTimer::singleton()::instance+16>, this=<optimized out>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:53 #39 WTF::RunLoop::$_0::__invoke (source=0x26332b0, callback=0x7f1d8695c6a0 <WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::$_0::__invoke(void*)>, userData=0x7f1d8d1d7f98 <WebCore::MainThreadSharedTimer::singleton()::instance+16>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:45 #40 0x00007f1d86eecd3b in g_main_dispatch (context=0x23f6c40) at ../../../../Projects/glib/glib/gmain.c:3348 #41 0x00007f1d86eee0f0 in g_main_context_dispatch_unlocked (context=0x23f6c40) at ../../../../Projects/glib/glib/gmain.c:4197 #42 0x00007f1d86eee2b1 in g_main_context_iterate_unlocked (context=0x23f6c40, block=1, dispatch=1, self=0x2440920) at ../../../../Projects/glib/glib/gmain.c:4262 #43 0x00007f1d86eee7c5 in g_main_loop_run (loop=0x2437220) at ../../../../Projects/glib/glib/gmain.c:4464 #44 0x00007f1d8695bf50 in WTF::RunLoop::run () at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:108 #45 0x00007f1d8972a00b in WebKit::AuxiliaryProcessMainBase<WebKit::WebProcess, true>::run (this=0x7ffd6a4027a0, argc=<optimized out>, argv=<optimized out>) at /home/mcatanzaro/Projects/WebKit/Source/WebKit/Shared/AuxiliaryProcessMain.h:72 #46 WebKit::AuxiliaryProcessMain<WebKit::WebProcessMainGtk> (argc=<optimized out>, argv=<optimized out>) at /home/mcatanzaro/Projects/WebKit/Source/WebKit/Shared/AuxiliaryProcessMain.h:98 #47 WebKit::WebProcessMain (argc=3, argv=0x7ffd6a402928) at /home/mcatanzaro/Projects/WebKit/Source/WebKit/WebProcess/gtk/WebProcessMainGtk.cpp:98 #48 0x00007f1d82a3d088 in __libc_start_call_main (main=main@entry=0x401130 <main(int, char**)>, argc=argc@entry=3, argv=argv@entry=0x7ffd6a402928) at ../sysdeps/nptl/libc_start_call_main.h:58 #49 0x00007f1d82a3d14b in __libc_start_main_impl (main=0x401130 <main(int, char**)>, argc=3, argv=0x7ffd6a402928, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd6a402918) at ../csu/libc-start.c:360 #50 0x0000000000401065 in _start () I caught it while printing my debug WTFLogAlways. :) I took two mark backtraces and got one more deref at the same location, and also a ref: (gdb) bt #0 0x00007f1d82b1e86d in __GI___libc_write (fd=2, buf=0x7ffd6a402270, nbytes=43) at ../sysdeps/unix/sysv/linux/write.c:26 #1 __GI___libc_write (fd=2, buf=0x7ffd6a402270, nbytes=43) at ../sysdeps/unix/sysv/linux/write.c:24 #2 0x00007f1d82aa1175 in _IO_new_file_write (f=0x7f1d82bf74e0 <_IO_2_1_stderr_>, data=0x7ffd6a402270, n=43) at fileops.c:1181 #3 0x00007f1d82a9f31d in new_do_write (fp=fp@entry=0x7f1d82bf74e0 <_IO_2_1_stderr_>, data=data@entry=0x7ffd6a402270 "before ref: this=0x7f1cf208b150 refcount=8\n", to_do=to_do@entry=43) at /usr/src/debug/glibc-2.39-13.fc40.x86_64/libio/libioP.h:1030 #4 0x00007f1d82aa1381 in _IO_new_file_xsputn (f=0x7f1d82bf74e0 <_IO_2_1_stderr_>, data=<optimized out>, n=43) at fileops.c:1255 #5 _IO_new_file_xsputn (f=0x7f1d82bf74e0 <_IO_2_1_stderr_>, data=<optimized out>, n=43) at fileops.c:1197 #6 0x00007f1d82a6eaf9 in __printf_buffer_flush_to_file (buf=buf@entry=0x7ffd6a402240) at ../libio/libioP.h:1030 #7 0x00007f1d82a6ebbc in __printf_buffer_to_file_done (buf=buf@entry=0x7ffd6a402240) at printf_buffer_to_file.c:120 #8 0x00007f1d82a7a07b in __vfprintf_internal (s=0x7f1d82bf74e0 <_IO_2_1_stderr_>, format=0x7f1cb7e009c0 "before ref: this=%p refcount=%u\n", ap=0x7ffd6a402370, mode_flags=0) at vfprintf-internal.c:1560 #9 0x00007f1d868d4021 in vprintf_stderr_common (format=0x7f1cb7e009c0 "before ref: this=%p refcount=%u\n", args=0x7ffd6a402370) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/Assertions.cpp:179 #10 vprintf_stderr_with_trailing_newline (format=0x7f1d8bd43e60 "before ref: this=%p refcount=%u", args=args@entry=0x7ffd6a402370) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/Assertions.cpp:212 #11 0x00007f1d868d40d8 in WTFLogAlwaysV (format=0x2 <error: Cannot access memory at address 0x2>, args=0x7ffd6a402370) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/Assertions.cpp:529 #12 WTFLogAlways (format=0x2 <error: Cannot access memory at address 0x2>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/Assertions.cpp:536 #13 0x00007f1d8a964ddb in WebCore::HTMLMediaElement::ref (this=0x7f1cf208b150) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/html/HTMLMediaElement.h:177 #14 WTF::DefaultRefDerefTraits<WebCore::HTMLMediaElement>::ref (ref=...) at WTF/Headers/wtf/Ref.h:55 #15 WTF::Ref<WebCore::HTMLMediaElement, WTF::RawPtrTraits<WebCore::HTMLMediaElement>, WTF::DefaultRefDerefTraits<WebCore::HTMLMediaElement> >::Ref (this=0x7ffd6a402460, object=...) at WTF/Headers/wtf/Ref.h:86 #16 WebCore::ActiveDOMObject::queueTaskKeepingObjectAlive<WebCore::HTMLMediaElement>(WebCore::HTMLMediaElement&, WebCore::TaskSource, WTF::Function<void ()>&&) (object=..., source=source@entry=WebCore::TaskSource::MediaElement, task=...) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/ActiveDOMObject.h:114 #17 0x00007f1d8a971b61 in WebCore::HTMLMediaElement::mediaPlayerReadyStateChanged (this=0x7f1cf208b150) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/html/HTMLMediaElement.cpp:2889 #18 0x00007f1d8a76602d in WebCore::EventLoop::run (this=0x7f1d6a0ecd20, deadline=<error reading variable: That operation is not available on integers of more than 8 bytes.>) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/EventLoop.cpp:327 #19 0x00007f1d8a800ee3 in WebCore::WindowEventLoop::didReachTimeToRun (this=0x7f1d6a0ecd20) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/WindowEventLoop.cpp:209 #20 0x00007f1d8ae8ceaf in WebCore::ThreadTimers::sharedTimerFiredInternal (this=0x7f1d6a0e0a80) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/platform/ThreadTimers.cpp:125 #21 0x00007f1d8695c72e in WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::$_0::operator()(void*) const (userData=userData@entry=0x7f1d8d1d7f98 <WebCore::MainThreadSharedTimer::singleton()::instance+16>, this=<optimized out>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:177 #22 WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::$_0::__invoke(void*) (userData=userData@entry=0x7f1d8d1d7f98 <WebCore::MainThreadSharedTimer::singleton()::instance+16>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:169 #23 0x00007f1d8695b9aa in WTF::RunLoop::$_0::operator() (source=0x26332b0, callback=0x7f1d8695c6a0 <WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::$_0::__invoke(void*)>, userData=0x7f1d8d1d7f98 <WebCore::MainThreadSharedTimer::singleton()::instance+16>, this=<optimized out>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:53 #24 WTF::RunLoop::$_0::__invoke (source=0x26332b0, callback=0x7f1d8695c6a0 <WTF::RunLoop::TimerBase::TimerBase(WTF::RunLoop&)::$_0::__invoke(void*)>, userData=0x7f1d8d1d7f98 <WebCore::MainThreadSharedTimer::singleton()::instance+16>) at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:45 #25 0x00007f1d86eecd3b in g_main_dispatch (context=0x23f6c40) at ../../../../Projects/glib/glib/gmain.c:3348 #26 0x00007f1d86eee0f0 in g_main_context_dispatch_unlocked (context=0x23f6c40) at ../../../../Projects/glib/glib/gmain.c:4197 #27 0x00007f1d86eee2b1 in g_main_context_iterate_unlocked (context=0x23f6c40, block=1, dispatch=1, self=0x2440920) at ../../../../Projects/glib/glib/gmain.c:4262 #28 0x00007f1d86eee7c5 in g_main_loop_run (loop=0x2437220) at ../../../../Projects/glib/glib/gmain.c:4464 #29 0x00007f1d8695bf50 in WTF::RunLoop::run () at /home/mcatanzaro/Projects/WebKit/Source/WTF/wtf/glib/RunLoopGLib.cpp:108 #30 0x00007f1d8972a00b in WebKit::AuxiliaryProcessMainBase<WebKit::WebProcess, true>::run (this=0x7ffd6a4027a0, argc=<optimized out>, argv=<optimized out>) at /home/mcatanzaro/Projects/WebKit/Source/WebKit/Shared/AuxiliaryProcessMain.h:72 #31 WebKit::AuxiliaryProcessMain<WebKit::WebProcessMainGtk> (argc=<optimized out>, argv=<optimized out>) at /home/mcatanzaro/Projects/WebKit/Source/WebKit/Shared/AuxiliaryProcessMain.h:98 #32 WebKit::WebProcessMain (argc=3, argv=0x7ffd6a402928) at /home/mcatanzaro/Projects/WebKit/Source/WebKit/WebProcess/gtk/WebProcessMainGtk.cpp:98 #33 0x00007f1d82a3d088 in __libc_start_call_main (main=main@entry=0x401130 <main(int, char**)>, argc=argc@entry=3, argv=argv@entry=0x7ffd6a402928) at ../sysdeps/nptl/libc_start_call_main.h:58 #34 0x00007f1d82a3d14b in __libc_start_main_impl (main=0x401130 <main(int, char**)>, argc=3, argv=0x7ffd6a402928, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd6a402918) at ../csu/libc-start.c:360 #35 0x0000000000401065 in _start () What task is being queued? (gdb) frame 16 #16 WebCore::ActiveDOMObject::queueTaskKeepingObjectAlive<WebCore::HTMLMediaElement>(WebCore::HTMLMediaElement&, WebCore::TaskSource, WTF::Function<void ()>&&) (object=..., source=source@entry=WebCore::TaskSource::MediaElement, task=...) at /home/mcatanzaro/Projects/WebKit/Source/WebCore/dom/ActiveDOMObject.h:114 114 object.queueTaskInEventLoop(source, [protectedObject = Ref { object }, activity = WTFMove(activity), task = WTFMove(task)] () { WebCore::TaskSource::MediaElement Unfortunately those tasks can be queued from a bunch of different places: $ git grep 'TaskSource::MediaElement' | wc -l 80 Hope that helps a little....
Michael Catanzaro
Comment 15 2024-06-06 08:48:48 PDT
> I took two mark backtraces I meant "two more backtraces"
Philippe Normand
Comment 16 2024-06-11 03:00:16 PDT
Did you see comment 9 ? It's obvious if you read the code: void HTMLMediaElement::mediaPlayerReadyStateChanged() { if (isSuspended()) { queueTaskKeepingObjectAlive(*this, TaskSource::MediaElement, [this] { mediaPlayerReadyStateChanged(); }); return; } ... } As nothing makes isSuspended() false, the method loops on itself forever.
Philippe Normand
Comment 17 2024-06-11 03:25:03 PDT
(In reply to Philippe Normand from comment #16) > Did you see comment 9 ? It's obvious if you read the code: > > void HTMLMediaElement::mediaPlayerReadyStateChanged() > { > if (isSuspended()) { > queueTaskKeepingObjectAlive(*this, TaskSource::MediaElement, [this] { > mediaPlayerReadyStateChanged(); > }); > return; > } > > ... > } > git blame linked this code to MediaSession handling. After disabling MEDIA_SESSION in my build I am unable to reproduce the issue anymore... So this seems likely related to our MEDIA_SESSION stuff. It's not the first time that backend triggers apocalypse. :(
Philippe Normand
Comment 18 2024-06-11 07:55:18 PDT
I'm not entirely sure this is due to MediaSession... According to my log here is what happens: 1. MediaSource::detachFromElement() is called 2. that triggers a call to setReadyState(ReadyState::Closed) 3. which then end up calling MediaPlayerPrivateGStreamerMSE::setReadyState(HaveMetadata) 4. which calls MediaPlayerPrivateGStreamerMSE::propagateReadyStateToPlayer() 5. which calls player->readyStateChanged() 6. which end up to HTMLMediaElement::mediaPlayerReadyStateChanged() called with isSuspended() returning true. 7. infinite loop starts, we're doomed
Philippe Normand
Comment 19 2024-06-16 02:37:56 PDT
EWS
Comment 20 2024-06-19 05:15:54 PDT
Committed 280171@main (fc158b61f4be): <https://commits.webkit.org/280171@main> Reviewed commits have been landed. Closing PR #29872 and removing active labels.
Note You need to log in before you can comment on or make changes to this bug.