WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
208642
REGRESSION: (
r257905
) [ Mac wk2 Debug ] ASSERTION FAILED: !m_isolatedWorld->isNormal() || m_wrapper || !m_jsFunction
https://bugs.webkit.org/show_bug.cgi?id=208642
Summary
REGRESSION: (r257905) [ Mac wk2 Debug ] ASSERTION FAILED: !m_isolatedWorld->i...
Jason Lawrence
Reported
2020-03-05 08:11:41 PST
media/modern-media-controls/tracks-support/tracks-support-no-tracks.html Description: This test is flaky crashing on Mac wk2 Debug. It appears to have a regression point of
r257905
. History:
https://results.webkit.org/?suite=layout-tests&test=media%2Fmodern-media-controls%2Ftracks-support%2Ftracks-support-no-tracks.html&style=debug&limit=50000
Crash log attached; Thread 0 Crashed:: Dispatch queue: com.apple.main-thread 0 com.apple.JavaScriptCore 0x0000000821882b2e WTFCrash + 14 (Assertions.cpp:305) 1 com.apple.WebCore 0x0000000804d70b5b WTFCrashWithInfo(int, char const*, char const*, int) + 27 2 com.apple.WebCore 0x000000080706142a WebCore::JSEventListener::jsFunction(WebCore::ScriptExecutionContext&) const + 426 (JSEventListener.h:111) 3 com.apple.WebCore 0x000000080706080b WebCore::JSEventListener::handleEvent(WebCore::ScriptExecutionContext&, WebCore::Event&) + 219 (JSEventListener.cpp:113) 4 com.apple.WebCore 0x000000080771857c WebCore::EventTarget::innerInvokeEventListeners(WebCore::Event&, WTF::Vector<WTF::RefPtr<WebCore::RegisteredEventListener, WTF::DumbPtrTraits<WebCore::RegisteredEventListener> >, 1ul, WTF::CrashOnOverflow, 16ul, WTF::FastMalloc>, WebCore::EventTarget::EventInvokePhase) + 956 (EventTarget.cpp:315) 5 com.apple.WebCore 0x00000008077147f2 WebCore::EventTarget::fireEventListeners(WebCore::Event&, WebCore::EventTarget::EventInvokePhase) + 354 (EventTarget.cpp:252) 6 com.apple.WebCore 0x000000080771815d WebCore::EventTarget::dispatchEvent(WebCore::Event&) + 333 (EventTarget.cpp:211) 7 com.apple.WebCore 0x0000000807737310 WebCore::MainThreadGenericEventQueue::dispatchOneEvent() + 608 (GenericEventQueue.cpp:73) 8 com.apple.WebCore 0x000000080773e011 decltype(*(std::__1::forward<WebCore::MainThreadGenericEventQueue*&>(fp0)).*fp()) std::__1::__invoke<void (WebCore::MainThreadGenericEventQueue::*&)(), WebCore::MainThreadGenericEventQueue*&, void>(void (WebCore::MainThreadGenericEventQueue::*&)(), WebCore::MainThreadGenericEventQueue*&) + 113 (type_traits:4302) 9 com.apple.WebCore 0x000000080773df70 std::__1::__bind_return<void (WebCore::MainThreadGenericEventQueue::*)(), std::__1::tuple<WebCore::MainThreadGenericEventQueue*>, std::__1::tuple<>, __is_valid_bind_return<void (WebCore::MainThreadGenericEventQueue::*)(), std::__1::tuple<WebCore::MainThreadGenericEventQueue*>, std::__1::tuple<> >::value>::type std::__1::__apply_functor<void (WebCore::MainThreadGenericEventQueue::*)(), std::__1::tuple<WebCore::MainThreadGenericEventQueue*>, 0ul, std::__1::tuple<> >(void (WebCore::MainThreadGenericEventQueue::*&)(), std::__1::tuple<WebCore::MainThreadGenericEventQueue*>&, std::__1::__tuple_indices<0ul>, std::__1::tuple<>&&) + 64 (functional:2644) 10 com.apple.WebCore 0x000000080773df1c std::__1::__bind_return<void (WebCore::MainThreadGenericEventQueue::*)(), std::__1::tuple<WebCore::MainThreadGenericEventQueue*>, std::__1::tuple<>, __is_valid_bind_return<void (WebCore::MainThreadGenericEventQueue::*)(), std::__1::tuple<WebCore::MainThreadGenericEventQueue*>, std::__1::tuple<> >::value>::type std::__1::__bind<void (WebCore::MainThreadGenericEventQueue::*)(), WebCore::MainThreadGenericEventQueue*>::operator()<>() + 60 (functional:2677) 11 com.apple.WebCore 0x000000080773deb9 WTF::Detail::CallableWrapper<std::__1::__bind<void (WebCore::MainThreadGenericEventQueue::*)(), WebCore::MainThreadGenericEventQueue*>, void>::call() + 25 (Function.h:52) 12 com.apple.WebCore 0x0000000804d8353a WTF::Function<void ()>::operator()() const + 138 (Function.h:84) 13 com.apple.WebCore 0x0000000805061aeb WebCore::GenericTaskQueue<WebCore::Timer>::enqueueTask(WTF::Function<void ()>&&)::'lambda'()::operator()() const + 171 (GenericTaskQueue.h:108) 14 com.apple.WebCore 0x0000000805061939 WTF::Detail::CallableWrapper<WebCore::GenericTaskQueue<WebCore::Timer>::enqueueTask(WTF::Function<void ()>&&)::'lambda'(), void>::call() + 25 (Function.h:52) 15 com.apple.WebCore 0x0000000804d8353a WTF::Function<void ()>::operator()() const + 138 (Function.h:84) 16 com.apple.WebCore 0x0000000808610f4e WebCore::TaskDispatcher<WebCore::Timer>::dispatchOneTask() + 222 (GenericTaskQueue.cpp:111) 17 com.apple.WebCore 0x0000000808610c00 WebCore::TaskDispatcher<WebCore::Timer>::sharedTimerFired() + 256 (GenericTaskQueue.cpp:86) 18 com.apple.WebCore 0x00000008086170c1 WebCore::TaskDispatcher<WebCore::Timer>::sharedTimer()::$_1::operator()() const + 17 (GenericTaskQueue.cpp:60) 19 com.apple.WebCore 0x0000000808617089 WTF::Detail::CallableWrapper<WebCore::TaskDispatcher<WebCore::Timer>::sharedTimer()::$_1, void>::call() + 25 (Function.h:52) 20 com.apple.WebCore 0x0000000804d8353a WTF::Function<void ()>::operator()() const + 138 (Function.h:84) 21 com.apple.WebCore 0x0000000804e37269 WebCore::Timer::fired() + 25 (Timer.h:127) 22 com.apple.WebCore 0x000000080866018a WebCore::ThreadTimers::sharedTimerFiredInternal() + 650 (ThreadTimers.cpp:129) 23 com.apple.WebCore 0x000000080866a5f1 WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::$_0::operator()() const + 33 (ThreadTimers.cpp:69) 24 com.apple.WebCore 0x000000080866a5a9 WTF::Detail::CallableWrapper<WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::$_0, void>::call() + 25 (Function.h:52) 25 com.apple.WebCore 0x0000000804d8353a WTF::Function<void ()>::operator()() const + 138 (Function.h:84) 26 com.apple.WebCore 0x000000080862a397 WebCore::MainThreadSharedTimer::fired() + 135 (MainThreadSharedTimer.cpp:84) 27 com.apple.WebCore 0x00000008086d2486 WebCore::timerFired(__CFRunLoopTimer*, void*) + 38 (MainThreadSharedTimerCF.cpp:75) 28 com.apple.CoreFoundation 0x00007fff36015804 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 20 29 com.apple.CoreFoundation 0x00007fff360153be __CFRunLoopDoTimer + 859 30 com.apple.CoreFoundation 0x00007fff36014e9e __CFRunLoopDoTimers + 317 31 com.apple.CoreFoundation 0x00007fff35ff9aed __CFRunLoopRun + 2213 32 com.apple.CoreFoundation 0x00007fff35ff8bd3 CFRunLoopRunSpecific + 499 33 com.apple.Foundation 0x00007fff3869c1a8 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 34 com.apple.Foundation 0x00007fff3874fd8b -[NSRunLoop(NSRunLoop) run] + 76 35 libxpc.dylib 0x00007fff6d8ed0e1 _xpc_objc_main.cold.4 + 49 36 libxpc.dylib 0x00007fff6d8ed027 _xpc_objc_main + 559 37 libxpc.dylib 0x00007fff6d8ecb5a xpc_main + 377 38 com.apple.WebKit 0x00000007f889321c WebKit::XPCServiceMain(int, char const**) + 1276 (XPCServiceMain.mm:166) 39 com.apple.WebKit 0x00000007f9bdcc4b WKXPCServiceMain + 27 (WKMain.mm:33) 40 com.apple.WebKit.WebContent 0x00000001061e2ec2 main + 34 (AuxiliaryProcessMain.cpp:30) 41 libdyld.dylib 0x00007fff6d69e7fd start + 1
Attachments
tracks-support-no-tracks-crash-log
(117.23 KB, text/plain)
2020-03-05 08:13 PST
,
Jason Lawrence
no flags
Details
Patch
(4.45 KB, patch)
2020-03-06 03:36 PST
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(10.05 KB, patch)
2020-03-07 13:44 PST
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(10.05 KB, patch)
2020-03-07 13:50 PST
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(9.98 KB, patch)
2020-03-07 13:54 PST
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(9.89 KB, patch)
2020-03-07 13:58 PST
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(10.10 KB, patch)
2020-03-07 16:39 PST
,
Yusuke Suzuki
darin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Radar WebKit Bug Importer
Comment 1
2020-03-05 08:12:14 PST
<
rdar://problem/60084741
>
Jason Lawrence
Comment 2
2020-03-05 08:13:19 PST
Created
attachment 392571
[details]
tracks-support-no-tracks-crash-log
Jason Lawrence
Comment 3
2020-03-05 08:23:25 PST
I can reproduce this issue with
r257915
(debug) and
r257905
(debug), but cannot reproduce the issue with
r257904
(debug). run-webkit-tests --iterations 1000 --exit-after-n-failures 5 --force -f --no-retry --debug media/modern-media-controls/tracks-support/tracks-support-no-tracks.html
r257915
: [51/1000] media/modern-media-controls/tracks-support/tracks-support-no-tracks.html failed unexpectedly (com.apple.WebKit.WebContent.Development crashed [pid=49295]) [51/1000] media/modern-media-controls/tracks-support/tracks-support-no-tracks.html failed unexpectedly (com.apple.WebKit.WebContent.Development crashed [pid=49293]) [61/1000] media/modern-media-controls/tracks-support/tracks-support-no-tracks.html failed unexpectedly (com.apple.WebKit.WebContent.Development crashed [pid=49288]) [89/1000] media/modern-media-controls/tracks-support/tracks-support-no-tracks.html failed unexpectedly (com.apple.WebKit.WebContent.Development crashed [pid=49297]) [91/1000] media/modern-media-controls/tracks-support/tracks-support-no-tracks.html failed unexpectedly (com.apple.WebKit.WebContent.Development crashed [pid=49412]) Exiting early after 5 failures. 88 tests run. 83 tests ran as expected, 5 didn't (912 didn't run):
r257905
(I switched from 5 to 3 failures here): [67/1000] media/modern-media-controls/tracks-support/tracks-support-no-tracks.html failed unexpectedly (com.apple.WebKit.WebContent.Development crashed [pid=50171]) [76/1000] media/modern-media-controls/tracks-support/tracks-support-no-tracks.html failed unexpectedly (com.apple.WebKit.WebContent.Development crashed [pid=50165]) [78/1000] media/modern-media-controls/tracks-support/tracks-support-no-tracks.html failed unexpectedly (com.apple.WebKit.WebContent.Development crashed [pid=50174]) Exiting early after 3 failures. 75 tests run. 72 tests ran as expected, 3 didn't (925 didn't run):
r257904
: All 1000 tests ran as expected.
Ryan Haddad
Comment 4
2020-03-05 09:26:36 PST
ASSERTION FAILED: !m_isolatedWorld->isNormal() || m_wrapper || !m_jsFunction ./bindings/js/JSEventListener.h(111) : JSC::JSObject *WebCore::JSEventListener::jsFunction(WebCore::ScriptExecutionContext &) const 1 0x7d16190b9 WTFCrash 2 0x7b4ca328b WTFCrashWithInfo(int, char const*, char const*, int) 3 0x7b6f982ea WebCore::JSEventListener::jsFunction(WebCore::ScriptExecutionContext&) const
Jason Lawrence
Comment 5
2020-03-05 10:03:46 PST
I have rolled out
r257905
here:
https://trac.webkit.org/changeset/257922/webkit
Yusuke Suzuki
Comment 6
2020-03-05 13:22:00 PST
This assertion looks wrong. It is assuming that, if m_wrapper goes away, m_jsFunction should go away too. That is not guaranteed by JSC GC.
Yusuke Suzuki
Comment 7
2020-03-05 14:41:43 PST
It is saying, 1. If the world is normal. 2. If m_wrapper is gone. 3. If m_jsFunction is not gone. Then, the assertion hits. But this is wrong. We have no guarantee that m_jsFunction is gone when m_wrapper is gone. We have conservative GC. We have generational GC. Everything makes it possible that m_jsFunction is live while m_wrapper is gone. I'll remove this assertion.
Alexey Proskuryakov
Comment 8
2020-03-05 15:51:27 PST
DOM is supposed to guarantee that wrappers stay alive for event listeners to fire. Otherwise, code in the listeners will or will not run semi-randomly depending on whether GC happened, which would be incorrect behavior. GC must not be observable from user code.
Ryosuke Niwa
Comment 9
2020-03-05 16:03:19 PST
The assertion is saying that if m_jsFunction is not null, then m_wrapper shouldn't be null. This must be true because any event target with a valid JS event listener should have had a wrapper when the event listener was added, and the wrapper should never be collected as long as the event listener can be invoked.
Ryosuke Niwa
Comment 10
2020-03-05 16:04:14 PST
In other words, removing the assertion is not the right solution here. It's an indicative of a bug that we're collecting a wrapper of an event target even when an event can be dispatched in the future on it.
Yusuke Suzuki
Comment 11
2020-03-05 17:20:20 PST
> The assertion is saying that if m_jsFunction is not null, then m_wrapper shouldn't be null. This must be true because any event target with a valid JS event listener should have had a wrapper when the event listener was added, and the wrapper should never be collected as long as the event listener can be invoked.
No. m_jsFunction can be any value basically if m_wrapper is null. This can be collected. But this can be retained because of conservative GC, because user puts listener function to somewhere, and the other any reasons. So, m_jsFunction in this assertion is completely unrelated to what we should test. The correct assertion should be at least, ASSERT(!m_isolatedWorld->isNormal() || m_wrapper);
Yusuke Suzuki
Comment 12
2020-03-05 17:21:43 PST
Currently, I'm checking two possibilities, 1. My patch had a bug. And this allows m_wrapper collected by GC. I'm now checking this possibility. 2. This was pre-existing bug. We are not hitting the crash just because m_jsFunction was dead too.
Yusuke Suzuki
Comment 13
2020-03-05 17:23:31 PST
(In reply to Yusuke Suzuki from
comment #11
)
> > The assertion is saying that if m_jsFunction is not null, then m_wrapper shouldn't be null. This must be true because any event target with a valid JS event listener should have had a wrapper when the event listener was added, and the wrapper should never be collected as long as the event listener can be invoked. > > No. m_jsFunction can be any value basically if m_wrapper is null. This can > be collected. But this can be retained because of conservative GC, because > user puts listener function to somewhere, and the other any reasons. > So, m_jsFunction in this assertion is completely unrelated to what we should > test. > The correct assertion should be at least, > > ASSERT(!m_isolatedWorld->isNormal() || m_wrapper);
Maybe, we can also ensure that, if (m_wrapper) ASSERT(m_jsFunction); But current assertion is saying, if (!m_wrapper) ASSERT(!m_jsFunction); This is not guaranteed (you can consider conservative GC. Any objects can be marked as live).
Yusuke Suzuki
Comment 14
2020-03-05 17:28:11 PST
(In reply to Yusuke Suzuki from
comment #13
)
> (In reply to Yusuke Suzuki from
comment #11
) > > > The assertion is saying that if m_jsFunction is not null, then m_wrapper shouldn't be null. This must be true because any event target with a valid JS event listener should have had a wrapper when the event listener was added, and the wrapper should never be collected as long as the event listener can be invoked. > > > > No. m_jsFunction can be any value basically if m_wrapper is null. This can > > be collected. But this can be retained because of conservative GC, because > > user puts listener function to somewhere, and the other any reasons. > > So, m_jsFunction in this assertion is completely unrelated to what we should > > test. > > The correct assertion should be at least, > > > > ASSERT(!m_isolatedWorld->isNormal() || m_wrapper); > > Maybe, we can also ensure that, > > if (m_wrapper) > ASSERT(m_jsFunction); > > But current assertion is saying, > > if (!m_wrapper) > ASSERT(!m_jsFunction); > > This is not guaranteed (you can consider conservative GC. Any objects can be > marked as live).
And I ensured that nobody is clearing m_jsFunction when m_wrapper is gone. So the current assertion's `!m_jsFunction` part is incorrect.
Ryosuke Niwa
Comment 15
2020-03-05 18:40:04 PST
Given the backtrace, it looks like someone using GenericEventQueue isn't keeping event target's wrapper alive. GenericEventQueue is problematic because it doesn't keep event target's wrappers alive automatically. Each event target which gets added to GenericEventQueue needs to keep its wrapper alive.
Yusuke Suzuki
Comment 16
2020-03-06 00:04:51 PST
Several updates. 1. The assertion's normal-world check looks also wrong. We are skipping event emission if the world is not normal, this does not make sense. These events can be used to implement the internal feature, and it could emit user-visible events later. 2. So far, this looks like an existing bug due to
https://bugs.webkit.org/show_bug.cgi?id=165713
. By using IsoSubspace, we start putting some lower-tier cells in PreciseAllocation, and reuse them. This makes GC behavior different, and exposing the existing bug: we should re-register root if we changed the root after we register the root. I could create a test case which reproduces this assertion failure without my patch.
Yusuke Suzuki
Comment 17
2020-03-06 00:12:15 PST
(In reply to Yusuke Suzuki from
comment #16
)
> 2. So far, this looks like an existing bug due to >
https://bugs.webkit.org/show_bug.cgi?id=165713
. By using IsoSubspace, we > start putting some lower-tier cells in PreciseAllocation, and reuse them. > This makes GC behavior different, and exposing the existing bug: we should > re-register root if we changed the root after we register the root. I could > create a test case which reproduces this assertion failure without my patch.
Let's describe what is happening. 1. HTMLVideoElement is created in the test under the current Document. So root is Document. 2. Concurrent GC starts working. 3. Marking (1)'s HTMLVideoElement and registering Document as a root. 4. The executed code removes HTMLVideoElement from Document. 5. HTMLVideoTrackList in HTMLVideoElement queries the opaque root. Since the root of HTMLVideoElement is changed to itself, HTMLVideoTrackList says "I'm live if the root set includes HTMLVideoElement". But this is not included since HTMLVideoElement registers Document as a root. And after the root is changed, it is not re-registering the new root. 6. HTMLVideoTrackList is saying I'm not reachable.
Ryosuke Niwa
Comment 18
2020-03-06 00:25:43 PST
(In reply to Yusuke Suzuki from
comment #17
)
> (In reply to Yusuke Suzuki from
comment #16
) > > 2. So far, this looks like an existing bug due to > >
https://bugs.webkit.org/show_bug.cgi?id=165713
. By using IsoSubspace, we > > start putting some lower-tier cells in PreciseAllocation, and reuse them. > > This makes GC behavior different, and exposing the existing bug: we should > > re-register root if we changed the root after we register the root. I could > > create a test case which reproduces this assertion failure without my patch. > > Let's describe what is happening. > > 1. HTMLVideoElement is created in the test under the current Document. So > root is Document. > 2. Concurrent GC starts working. > 3. Marking (1)'s HTMLVideoElement and registering Document as a root. > 4. The executed code removes HTMLVideoElement from Document. > 5. HTMLVideoTrackList in HTMLVideoElement queries the opaque root. Since the > root of HTMLVideoElement is changed to itself, HTMLVideoTrackList says "I'm > live if the root set includes HTMLVideoElement". But this is not included > since HTMLVideoElement registers Document as a root. And after the root is > changed, it is not re-registering the new root. > 6. HTMLVideoTrackList is saying I'm not reachable.
In this scenario, HTMLVideoElement needs to have a pending activity in ActiveDOMObjet while HTMLVideoTrackList/HTMLVideoElement sits in GenericEventQueue. That would prevent GC from collecting the JS wrapper since it would be reachableFromOpaqueRoot regardless of where it is.
Yusuke Suzuki
Comment 19
2020-03-06 00:34:27 PST
(In reply to Ryosuke Niwa from
comment #18
)
> (In reply to Yusuke Suzuki from
comment #17
) > > (In reply to Yusuke Suzuki from
comment #16
) > > > 2. So far, this looks like an existing bug due to > > >
https://bugs.webkit.org/show_bug.cgi?id=165713
. By using IsoSubspace, we > > > start putting some lower-tier cells in PreciseAllocation, and reuse them. > > > This makes GC behavior different, and exposing the existing bug: we should > > > re-register root if we changed the root after we register the root. I could > > > create a test case which reproduces this assertion failure without my patch. > > > > Let's describe what is happening. > > > > 1. HTMLVideoElement is created in the test under the current Document. So > > root is Document. > > 2. Concurrent GC starts working. > > 3. Marking (1)'s HTMLVideoElement and registering Document as a root. > > 4. The executed code removes HTMLVideoElement from Document. > > 5. HTMLVideoTrackList in HTMLVideoElement queries the opaque root. Since the > > root of HTMLVideoElement is changed to itself, HTMLVideoTrackList says "I'm > > live if the root set includes HTMLVideoElement". But this is not included > > since HTMLVideoElement registers Document as a root. And after the root is > > changed, it is not re-registering the new root. > > 6. HTMLVideoTrackList is saying I'm not reachable. > > In this scenario, HTMLVideoElement needs to have a pending activity in > ActiveDOMObjet while HTMLVideoTrackList/HTMLVideoElement sits in > GenericEventQueue. That would prevent GC from collecting the JS wrapper > since it would be reachableFromOpaqueRoot regardless of where it is.
No, it does not fix the issue unfortunately :( Even if HTMLVideoElement is live, HTMLVideoTrackList goes away since HTMLVideoTrackList is live only if it can find an opaque root, which is not properly registered by HTMLVideoElement due to race condition. Fundamental fix should be, in some way, re-registering an root & saying "Marking" happens if the root is changed. This is some form of write-barrier. This issue itself exists for 4 years, and this issue is hidden by this wrong assertion I think (this happens frequently for HTMLVideoElement, but this assertion does not hit due to `m_jsFunction` check and world-is-normal check since this pattern is used in non-normal world (media controls).
Ryosuke Niwa
Comment 20
2020-03-06 00:48:45 PST
(In reply to Yusuke Suzuki from
comment #19
)
> (In reply to Ryosuke Niwa from
comment #18
)
>
> > In this scenario, HTMLVideoElement needs to have a pending activity in > > ActiveDOMObjet while HTMLVideoTrackList/HTMLVideoElement sits in > > GenericEventQueue. That would prevent GC from collecting the JS wrapper > > since it would be reachableFromOpaqueRoot regardless of where it is. > > No, it does not fix the issue unfortunately :( > Even if HTMLVideoElement is live, HTMLVideoTrackList goes away since > HTMLVideoTrackList is live only if it can find an opaque root, which is not > properly registered by HTMLVideoElement due to race condition.
Okay, so VideoTrackList needs to be an ActiveDOMObject itself and have a pending activity whenever it's in HTMLVideoElement's m_asyncEventQueue.
> Fundamental fix should be, in some way, re-registering an root & saying > "Marking" happens if the root is changed. This is some form of write-barrier.
Hm... that sounds like a brittle approach but if that's what we need to do, then dispatchChildRemovalEvents in ContainerNode.cpp has a code to force creating JS wrapper on the removed node so perhaps we can emit a write barrier there. Alternatively, can we use MarkingConstraint to relate these objects?
Ryosuke Niwa
Comment 21
2020-03-06 01:00:39 PST
(In reply to Ryosuke Niwa from
comment #20
)
> (In reply to Yusuke Suzuki from
comment #19
) > > (In reply to Ryosuke Niwa from
comment #18
) > > > > > In this scenario, HTMLVideoElement needs to have a pending activity in > > > ActiveDOMObjet while HTMLVideoTrackList/HTMLVideoElement sits in > > > GenericEventQueue. That would prevent GC from collecting the JS wrapper > > > since it would be reachableFromOpaqueRoot regardless of where it is. > > > > No, it does not fix the issue unfortunately :( > > Even if HTMLVideoElement is live, HTMLVideoTrackList goes away since > > HTMLVideoTrackList is live only if it can find an opaque root, which is not > > properly registered by HTMLVideoElement due to race condition. > > Okay, so VideoTrackList needs to be an ActiveDOMObject itself and have a > pending activity whenever it's in HTMLVideoElement's m_asyncEventQueue.
Let another approach here is to use CustomIsReachable and write a custom JSVideoTrackList::isReachableFromOpaqueRoots, or add a new variant of GenerateIsReachable like ImplMediaElementRoot which checks not just that the root of the media element is an opaque root but also if the media element has a pending activity or not.
Yusuke Suzuki
Comment 22
2020-03-06 01:20:20 PST
I think this is more fundamental problem which is not tied to HTMLVideoElement case. For any node, this race condition happens if, 1. Node (A) is attached to the tree (TA). 2. (A)'s wrapper is marked and (TA) is registered as an opaque root. 3. The user code removes (A) from (TA). 4. Node (B), which is under (A), is invoking is-reachable-from-opaque-root query. 5. Node (B)'s root is (A) since (A) is detached from the tree (TA). 6. (4) cannot find the (A) from the set of opaque roots. Let's consider about the fundamental fix. Every time we change the root, we need to emit "write-barrier" to this node. Then, GC rescans it, re-register the updated root. So, the problem is how to emit "write-barrier" while we only have C++ node. So the question is how to achieve this effect. One idea I have right now is, (1) when changing the root, (2) if GC is not running right now, noop, (2) otherwise, let's check the old root is included in the opaque root, (3) if it is included, we put the new root into the opaque root as something like remember-set mechanism. Then, conservatively we collect the necessary opaque roots. And once GC cycles pass, opaque root set is anyway cleared again. This works at least for main thread since we can access these data via commonVM().
Ryosuke Niwa
Comment 23
2020-03-06 01:44:34 PST
(In reply to Yusuke Suzuki from
comment #22
)
> I think this is more fundamental problem which is not tied to > HTMLVideoElement case. > For any node, this race condition happens if, > > 1. Node (A) is attached to the tree (TA). > 2. (A)'s wrapper is marked and (TA) is registered as an opaque root. > 3. The user code removes (A) from (TA). > 4. Node (B), which is under (A), is invoking is-reachable-from-opaque-root query. > 5. Node (B)'s root is (A) since (A) is detached from the tree (TA). > 6. (4) cannot find the (A) from the set of opaque roots.
That indeed sounds like a real bug.
> Let's consider about the fundamental fix. Every time we change the root, we > need to emit "write-barrier" to this node. > Then, GC rescans it, re-register the updated root. So, the problem is how to > emit "write-barrier" while we only have C++ node. > > So the question is how to achieve this effect. One idea I have right now is, > (1) when changing the root,
This would be ContainerNode::removeAllChildrenWithScriptAssertion, ContainerNode::removeNodeWithScriptAssertion, and executeNodeInsertionWithScriptAssertion.
> (2) if GC is not running right now, noop, (2) > otherwise, let's check the old root is included in the opaque root, (3) if > it is included, we put the new root into the opaque root as something like > remember-set mechanism.
Hm... I'm thinking if we can re-use something like GCReachableRefMap or add more logic to isReachableFromDOM in JSNodeCustom.cpp but I don't think we could because the issue isn't so much about the root node, which is an opaque root but rather anyone which uses a node has an opaque root. For that matter, this seems like an issue with anything that can dynamically change its opaque root, not just Node.
Yusuke Suzuki
Comment 24
2020-03-06 01:49:22 PST
Or, super easy way (I'm not sure this is good for performance for now), I'm now considering: let's register opaque roots in output constraints!
Yusuke Suzuki
Comment 25
2020-03-06 02:01:49 PST
(In reply to Yusuke Suzuki from
comment #24
)
> Or, super easy way (I'm not sure this is good for performance for now), I'm > now considering: let's register opaque roots in output constraints!
Not sure, some wiring should be required. But this sounds the cleanest solution.
Yusuke Suzuki
Comment 26
2020-03-06 02:09:29 PST
(In reply to Yusuke Suzuki from
comment #25
)
> (In reply to Yusuke Suzuki from
comment #24
) > > Or, super easy way (I'm not sure this is good for performance for now), I'm > > now considering: let's register opaque roots in output constraints! > > Not sure, some wiring should be required. But this sounds the cleanest > solution.
Need to check. It would be possible that I'm missing something.
Ryosuke Niwa
Comment 27
2020-03-06 02:13:01 PST
(In reply to Yusuke Suzuki from
comment #24
)
> Or, super easy way (I'm not sure this is good for performance for now), I'm > now considering: let's register opaque roots in output constraints!
When you say register opaque roots in output constraints, do you mean to do that for all opaque roots?
Yusuke Suzuki
Comment 28
2020-03-06 02:52:22 PST
Ooops, maybe I found the bug. I'm now checking but maybe this is caused by my silly mistake. Sorry. I'm rolling out the patch for now. And ensuring the new patch is fixing this issue. But assertion is anyway wrong so I'll create a different patch for fixing the assertion too.
Yusuke Suzuki
Comment 29
2020-03-06 03:04:12 PST
OK, I found the super silly bug in my original patch
https://bugs.webkit.org/show_bug.cgi?id=205107#c13
. Really sorry :( But at the same time, this assertion is still wrong. I'll use this bug for fixing the assertion to the correct form: ASSERT(m_wrapper && m_jsFunction); since, 1. Whether the world is normal or not does not matter to event emission. This is introduced to suppress some assertion failures originally, this is wrong. 2. The original assertion was saying, "if m_wrapper is dead, m_jsFunction must be dead". This is wrong since m_jsFunction can be alive due to whatever reasons. For GC managed objects, we cannot say "This is dead since it is not referenced from anywhere" since conservative GC can find it even if it is not logically referenced. Only thing we can say is, "if m_wrapper is live, m_jsFunction should be live".
Ryosuke Niwa
Comment 30
2020-03-06 03:30:26 PST
Do we still have the issue with opaque roots dynamically updating? Or was it just red herring due to that bug?
Yusuke Suzuki
Comment 31
2020-03-06 03:36:57 PST
Created
attachment 392702
[details]
Patch
Yusuke Suzuki
Comment 32
2020-03-06 03:37:36 PST
(In reply to Ryosuke Niwa from
comment #30
)
> Do we still have the issue with opaque roots dynamically updating? Or was it > just red herring due to that bug?
I've checked that and as I thought we are already emitting opaque root registering in the output-constraints. So this works correctly :)
Yusuke Suzuki
Comment 33
2020-03-06 03:43:11 PST
Comment on
attachment 392702
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=392702&action=review
> Source/WebCore/bindings/js/JSEventListener.h:110 > + ASSERT(!m_isolatedWorld->isNormal() || m_wrapper);
We could even say, `ASSERT(!m_isolatedWorld->isNormal() || (m_wrapper && m_jsFunction));` but not sure whether we have a case that m_jsFunction is null initialized while the m_wrapper is non-null initialized. And ideal ASSERT would be `ASSERT(m_wrapper && m_jsFunction)` since normal-world or not does not matter whether we should skip the event handlers.
Saam Barati
Comment 34
2020-03-06 08:04:55 PST
(In reply to Yusuke Suzuki from
comment #29
)
> OK, I found the super silly bug in my original patch >
https://bugs.webkit.org/show_bug.cgi?id=205107#c13
. Really sorry :(
So if we don’t visit any of our output constraint spaces, we only have a single test that fails in a flaky manner? Sounds like we need better tests :-)
> But at the same time, this assertion is still wrong. I'll use this bug for > fixing the assertion to the correct form: ASSERT(m_wrapper && m_jsFunction); > since, > > 1. Whether the world is normal or not does not matter to event emission. > This is introduced to suppress some assertion failures originally, this is > wrong. > 2. The original assertion was saying, "if m_wrapper is dead, m_jsFunction > must be dead". This is wrong since m_jsFunction can be alive due to whatever > reasons. For GC managed objects, we cannot say "This is dead since it is not > referenced from anywhere" since conservative GC can find it even if it is > not logically referenced. Only thing we can say is, "if m_wrapper is live, > m_jsFunction should be live".
Alexey Proskuryakov
Comment 35
2020-03-06 08:57:52 PST
Comment on
attachment 392702
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=392702&action=review
Since the change only makes the assertion stronger, it is good as far as I'm concerned. But tests are now asserting, so mechanically r-.
> Source/WebCore/ChangeLog:12 > + GC, we cannot say such an condition. Even if m_wrapper is dead, m_jsFunction can be alive by various reasons: conservative > + GC finds it, user code stores this function somewhere reachable from the root, etc. The correct thing is `m_wrapper` must
There are a couple things here that make me a little skeptical, from a position of someone who only touched this a long time ago and is absolutely not an expert. 1. If this assertion can misfire due to conservative GC finding it, why haven't we ever seen it fire without a real bug? 2. The assertion would fire if user code stored the function somewhere, so it's essentially saying that user code must not do that. That isn't a bug in the assertion necessarily. Is there even a reason for user code to do this? 3. Checking m_jsFunction for being null is not the same as checking whether the function is alive. Once can just assign null to the member, and arguably should do this without waiting for GC. Is it even right that m_jsFunction is a weak pointer?
> Source/WebCore/ChangeLog:15 > + 2. The comments about "zombie m_jsFunctions" is wrong. We are using JSC::Weak<>. So if the object gets collected, it returns > + nullptr, not getting a zombie pointer.
Yes, must have been written before these were weak pointers.
> Source/WebCore/ChangeLog:17 > + 3. We are emitting write-barrier in a wrong order. In the heavily stressed scenario, it is possible that concurrent marking > + scans JSEventListener just after we emit the write-barrier, and this marking misses the assigned value. We must emit
Great find! Is there any test that hits this, at least in StressGC configuration?
> Source/WebCore/ChangeLog:21 > + Still keeping normal world check. But this looks also wrong. The assertion is allowing non-normal world to get dead m_wrapper.
Yes, pretty sure it's been understood to be a dirty hack when added. Interaction between worlds and event listeners was bogus back then, and I don't know if it has been improved since.
Yusuke Suzuki
Comment 36
2020-03-06 13:10:08 PST
Comment on
attachment 392702
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=392702&action=review
The test failure is saying that we invoke JSEventListener::jsFunction with `!m_wrapper` legally. This sounds like we are also using `m_wrapper` for whether it is initialized or not. This sounds like wrong too.
>> Source/WebCore/ChangeLog:12 >> + GC finds it, user code stores this function somewhere reachable from the root, etc. The correct thing is `m_wrapper` must > > There are a couple things here that make me a little skeptical, from a position of someone who only touched this a long time ago and is absolutely not an expert. > > 1. If this assertion can misfire due to conservative GC finding it, why haven't we ever seen it fire without a real bug? > > 2. The assertion would fire if user code stored the function somewhere, so it's essentially saying that user code must not do that. That isn't a bug in the assertion necessarily. Is there even a reason for user code to do this? > > 3. Checking m_jsFunction for being null is not the same as checking whether the function is alive. Once can just assign null to the member, and arguably should do this without waiting for GC. Is it even right that m_jsFunction is a weak pointer?
1. This is because this assertion is using `!m_jsFuncion` wrong purpose. We are *not* firing if `m_jsFunction` is successfully collected even if `m_wrapper` is collected. This weakens the assertion so we could miss the real bugs due to this wrong condition. And again, checking `m_jsFunction`'s liveness here is unrelated to the validity of JSEventListener. I think the key is that, previously this m_jsFunction is not Weak<>. So, when we are evaluating `!m_jsFuncion`, we are not checking the liveness of this. Rather we are checking whether we initialized it once before. However, now it is replaced with Weak<> and this `!m_jsFunction` becomes "it is not initialized OR it is collected". This is clearly wrong, and that's why this worked so far. We made this assertion's meaning weak unintentionally when we introduced Weak<> to this JSEventListener. I guess that this could hide the real bug so far. And making the meaning of this assertion looks very wrong because this is now checking m_jsFunction's liveness too. The assertion is saying "Even if the m_wrapper is collected, if m_jsFunction is collected too, this is OK, let's skip event triggering". No, this is not OK, this is the bug. 2. Yes, it is possible. You can write a code, function handler() { }; imgElement.onload = handler; This `handler` is m_jsFunction. You can put it anywhere. So making `m_jsFunction` alive is very easy. The key here is that, the original intention of `m_jsFunction` check is not looking into the liveness. It is checking whether we initialized this field once (because we were using WriteBarrier<>). But now, this is changed to Weak<>, and the meaning becomes broken. 3. Yes. And checking `m_jsFunction` for being null is not the same as checking it is not initialized too. But this assertion is using `m_jsFunction` condition to check wether it is initialized. This is wrong. Liveness should not be checked.
Yusuke Suzuki
Comment 37
2020-03-07 13:03:54 PST
Comment on
attachment 392702
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=392702&action=review
>> Source/WebCore/ChangeLog:17 >> + scans JSEventListener just after we emit the write-barrier, and this marking misses the assigned value. We must emit > > Great find! Is there any test that hits this, at least in StressGC configuration?
It is possible that if we stress this GC situation enough, we could eventually hit GC crash. But not sure whether our GC bots have the tests which stresses this path.
> Source/WebCore/bindings/js/JSEventListener.h:101 > if (!m_jsFunction) {
Now, I think this condition is also wrong. I think this is originally inserted as "this listener is not initialized yet". But now, this `!m_jsFunction`'s meaning is different from this. Initializing function again if the function is gone sounds wrong.
Yusuke Suzuki
Comment 38
2020-03-07 13:44:54 PST
Created
attachment 392888
[details]
Patch
Yusuke Suzuki
Comment 39
2020-03-07 13:50:32 PST
Created
attachment 392892
[details]
Patch
Yusuke Suzuki
Comment 40
2020-03-07 13:54:46 PST
Created
attachment 392893
[details]
Patch
Yusuke Suzuki
Comment 41
2020-03-07 13:58:46 PST
Created
attachment 392895
[details]
Patch
Yusuke Suzuki
Comment 42
2020-03-07 16:38:37 PST
Comment on
attachment 392895
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=392895&action=review
> Source/WebCore/bindings/js/JSEventListener.cpp:56 > m_jsFunction = JSC::Weak<JSC::JSObject>(function);
I think the assertion is saying this can be nullptr.
Yusuke Suzuki
Comment 43
2020-03-07 16:39:58 PST
Created
attachment 392911
[details]
Patch
Yusuke Suzuki
Comment 44
2020-03-08 01:36:56 PST
I’m now thinking that corrected assertion found the real issue.
https://ews-build.webkit.org/results/macOS-Mojave-Debug-WK1-Tests-EWS/r392911-4549/results.html
Yusuke Suzuki
Comment 45
2020-03-08 01:56:51 PST
(In reply to Yusuke Suzuki from
comment #44
)
> I’m now thinking that corrected assertion found the real issue. >
https://ews-build.webkit.org/results/macOS-Mojave-Debug-WK1-Tests-EWS/
>
r392911
-4549/results.html
LoadableTextTrack should be kept alive while it is loading.
Yusuke Suzuki
Comment 46
2020-03-08 01:57:08 PST
(In reply to Yusuke Suzuki from
comment #45
)
> (In reply to Yusuke Suzuki from
comment #44
) > > I’m now thinking that corrected assertion found the real issue. > >
https://ews-build.webkit.org/results/macOS-Mojave-Debug-WK1-Tests-EWS/
> >
r392911
-4549/results.html > > LoadableTextTrack should be kept alive while it is loading.
LoadableTextTrack's HTMLTrackElement's wrapper should be kept alive.
Yusuke Suzuki
Comment 47
2020-03-08 04:04:46 PDT
(In reply to Yusuke Suzuki from
comment #46
)
> (In reply to Yusuke Suzuki from
comment #45
) > > (In reply to Yusuke Suzuki from
comment #44
) > > > I’m now thinking that corrected assertion found the real issue. > > >
https://ews-build.webkit.org/results/macOS-Mojave-Debug-WK1-Tests-EWS/
> > >
r392911
-4549/results.html > > > > LoadableTextTrack should be kept alive while it is loading. > > LoadableTextTrack's HTMLTrackElement's wrapper should be kept alive.
If LoadableTextTrack is loading a track, and firing events, if HTMLTrackElement is removed from the tree and nobody is looking into it (while event listener is set), I think this assertion hits.
Darin Adler
Comment 48
2020-03-08 23:03:42 PDT
Comment on
attachment 392911
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=392911&action=review
> Source/WebCore/bindings/js/JSEventListener.cpp:57 > + if (!wrapper) { > ASSERT(!function); > + return; > + }
Inconsistent to use early return for (!wrapper), but a normal if statement for (function). I suggest we write this instead: if (function) { ASSERT(wrapper); ...
Yusuke Suzuki
Comment 49
2020-03-09 00:23:38 PDT
Comment on
attachment 392911
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=392911&action=review
>> Source/WebCore/bindings/js/JSEventListener.cpp:57 >> + } > > Inconsistent to use early return for (!wrapper), but a normal if statement for (function). I suggest we write this instead: > > if (function) { > ASSERT(wrapper); > ...
Thanks, fixed.
Yusuke Suzuki
Comment 50
2020-03-10 01:33:01 PDT
Committed
r258189
: <
https://trac.webkit.org/changeset/258189
>
Enrique Ocaña
Comment 51
2021-02-10 05:09:28 PST
This patch is solving an issue we found downstream on wpe-2.22 (and 2.28 but not on 2.30) on ARM 32bit devices (at least, as that's what we use). We were directly accessing mediaSource.sourceBuffers from JavaScript to register an event listener on the addsourcebuffer event, but when the event happened, the event listener had been declared "dead" by the garbage collector and never ran. If we held mediaSource.sourceBuffers in a temporary variable in an outer scope, the behaviour didn't happen. It took a long bisect to locate this fix. Thanks a lot for it!
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug