RESOLVED FIXED 171406
[macOS] Flaky Crash under EventTarget::fireEventListeners on imported/blink/paint/deprecatedpaintlayer/non-self-painting-layer-overrides-visibility.html
https://bugs.webkit.org/show_bug.cgi?id=171406
Summary [macOS] Flaky Crash under EventTarget::fireEventListeners on imported/blink/p...
Attachments
Patch (1.39 KB, patch)
2017-04-27 17:39 PDT, Matt Lewis
no flags
Patch (10.25 KB, patch)
2017-05-02 10:40 PDT, Chris Dumez
no flags
Matt Lewis
Comment 1 2017-04-27 16:48:47 PDT
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread 0 com.apple.WebCore 0x000000011d29d0e1 0x1196f9000 + 62537953 1 com.apple.WebCore 0x00000001197b3305 WTF::TypeCastTraits<WebCore::Document const, WebCore::ScriptExecutionContext const, false>::isOfType(WebCore::ScriptExecutionContext const&) + 21 (Document.h:1796) 2 com.apple.WebCore 0x0000000119ae14d7 bool WTF::is<WebCore::Document, WebCore::ScriptExecutionContext>(WebCore::ScriptExecutionContext*) + 39 (TypeCasts.h:66) 3 com.apple.WebCore 0x000000011a052b69 WebCore::EventTarget::fireEventListeners(WebCore::Event&, WTF::Vector<WTF::RefPtr<WebCore::RegisteredEventListener>, 1ul, WTF::CrashOnOverflow, 16ul>) + 137 (EventTarget.cpp:237) 4 com.apple.WebCore 0x000000011a05298e WebCore::EventTarget::fireEventListeners(WebCore::Event&) + 318 (EventTarget.cpp:209) 5 com.apple.WebCore 0x000000011a052829 WebCore::EventTarget::dispatchEvent(WebCore::Event&) + 233 (EventTarget.cpp:166) 6 com.apple.WebCore 0x000000011a2a3e2f WebCore::GenericEventQueue::dispatchOneEvent() + 239 (GenericEventQueue.cpp:70) 7 com.apple.WebCore 0x000000011a2a6b0d WTF::Function<void ()>::CallableWrapper<std::__1::__bind<void (WebCore::GenericEventQueue::*)(), WebCore::GenericEventQueue*> >::call() + 221 (Function.h:89) 8 com.apple.WebCore 0x000000011977cdfe WTF::Function<void ()>::operator()() const + 94 (Function.h:50) 9 com.apple.WebCore 0x000000011a2a6399 WebCore::GenericTaskQueue<WebCore::Timer>::enqueueTask(WTF::Function<void ()>&&)::'lambda'()::operator()() const + 137 (GenericTaskQueue.h:99) 10 com.apple.WebCore 0x000000011a2a6269 WTF::Function<void ()>::CallableWrapper<WebCore::GenericTaskQueue<WebCore::Timer>::enqueueTask(WTF::Function<void ()>&&)::'lambda'()>::call() + 25 (Function.h:89) 11 com.apple.WebCore 0x000000011977cdfe WTF::Function<void ()>::operator()() const + 94 (Function.h:50) 12 com.apple.WebCore 0x000000011a2a7487 WebCore::TaskDispatcher<WebCore::Timer>::dispatchOneTask() + 119 (GenericTaskQueue.cpp:81) 13 com.apple.WebCore 0x000000011a2a721f WebCore::TaskDispatcher<WebCore::Timer>::sharedTimerFired() + 255 (GenericTaskQueue.cpp:67) 14 com.apple.WebCore 0x000000011a2a98da void std::__1::__invoke_void_return_wrapper<void>::__call<void (*&)()>(void (*&&&)()) + 42 (__functional_base:469) 15 com.apple.WebCore 0x000000011a2a9889 std::__1::__function::__func<void (*)(), std::__1::allocator<void (*)()>, void ()>::operator()() + 41 (functional:1437) 16 com.apple.WebCore 0x000000011975b3ea std::__1::function<void ()>::operator()() const + 26 (functional:1817) 17 com.apple.WebCore 0x000000011975b309 WebCore::Timer::fired() + 25 (Timer.h:135) 18 com.apple.WebCore 0x000000011c23c930 WebCore::ThreadTimers::sharedTimerFiredInternal() + 480 (ThreadTimers.cpp:121) 19 com.apple.WebCore 0x000000011c23db71 WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::$_0::operator()() const + 33 (ThreadTimers.cpp:70) 20 com.apple.WebCore 0x000000011c23db3d void std::__1::__invoke_void_return_wrapper<void>::__call<WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::$_0&>(WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::$_0&&&) + 45 (__functional_base:469) 21 com.apple.WebCore 0x000000011c23dae9 std::__1::__function::__func<WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::$_0, std::__1::allocator<WebCore::ThreadTimers::setSharedTimer(WebCore::SharedTimer*)::$_0>, void ()>::operator()() + 41 (functional:1437) 22 com.apple.WebCore 0x000000011975b3ea std::__1::function<void ()>::operator()() const + 26 (functional:1817) 23 com.apple.WebCore 0x000000011b4ee2a8 WebCore::MainThreadSharedTimer::fired() + 104 (MainThreadSharedTimer.cpp:53) 24 com.apple.WebCore 0x000000011b4ee639 WebCore::timerFired(__CFRunLoopTimer*, void*) + 41 (MainThreadSharedTimerCF.cpp:74) 25 com.apple.CoreFoundation 0x00007fffaf2a0de4 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 20 26 com.apple.CoreFoundation 0x00007fffaf2a0a73 __CFRunLoopDoTimer + 1075 27 com.apple.CoreFoundation 0x00007fffaf2a05ca __CFRunLoopDoTimers + 298 28 com.apple.CoreFoundation 0x00007fffaf297fa1 __CFRunLoopRun + 2081 29 com.apple.CoreFoundation 0x00007fffaf297524 CFRunLoopRunSpecific + 420 30 com.apple.HIToolbox 0x00007fffae7f7ebc RunCurrentEventLoopInMode + 240 31 com.apple.HIToolbox 0x00007fffae7f7cf1 ReceiveNextEventCommon + 432 32 com.apple.HIToolbox 0x00007fffae7f7b26 _BlockUntilNextEventMatchingListInModeWithFilter + 71 33 com.apple.AppKit 0x00007fffacd92e24 _DPSNextEvent + 1120 34 com.apple.AppKit 0x00007fffad50e85e -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 2796 35 com.apple.AppKit 0x00007fffacd877ab -[NSApplication run] + 926 36 com.apple.AppKit 0x00007fffacd521de NSApplicationMain + 1237 37 libxpc.dylib 0x00007fffc52118c7 _xpc_objc_main + 775 38 libxpc.dylib 0x00007fffc52102e4 xpc_main + 494 39 com.apple.WebKit.WebContent 0x000000010c2e8105 main + 1189 (XPCServiceMain.mm:148) 40 libdyld.dylib 0x00007fffc4fb8235 start + 1
Matt Lewis
Comment 2 2017-04-27 16:49:43 PDT
This does not appear to be a recent regression and may already be tracked with: rdar://problem/30945281
Matt Lewis
Comment 3 2017-04-27 17:39:02 PDT
WebKit Commit Bot
Comment 4 2017-04-27 18:23:48 PDT
Comment on attachment 308485 [details] Patch Clearing flags on attachment: 308485 Committed r215912: <http://trac.webkit.org/changeset/215912>
WebKit Commit Bot
Comment 5 2017-04-27 18:23:50 PDT
All reviewed patches have been landed. Closing bug.
Ryan Haddad
Comment 6 2017-04-27 18:25:49 PDT
Reopening because the patch only marked the test as flaky.
Joseph Pecoraro
Comment 7 2017-04-27 23:57:11 PDT
At this point the crash log is not available via the links. It would be great if crash logs were attached to the bugzilla bug when filing bugs about a crash.
Joseph Pecoraro
Comment 8 2017-04-28 00:09:11 PDT
This test just loads a page with an <object>. I think its possible that this test itself is not the cause of the crash, but another test (perhaps the one or two preceeding it) that did not clean up properly. So skipping this test if the actual cause is another test does little good. The crash is happening due to a timer from a GenericEventQueue firing. GenericTaskQueues are primarily used by Media code, and there aren't many. Perhaps we can just audit uses of GenericTaskQueues.
Joseph Pecoraro
Comment 9 2017-04-28 00:30:26 PDT
It doesn't look like an assert or a nullptr crash. From the snipper of a crash log provided in the earlier comment this looks like it could be a dangling pointer. Since the number of GenericEventQueues is small I looked around. HTMLMediaElement closes all its GenericTaskQueues in HTMLMediaElement::contextDestroyed() but doesn't close its GenericEventQueue. That seems rather suspect to me: (1) GenericTaskQueue was in the backtrace (2) The ScriptExecutionContext is apparently being destroyed here, so if a timer eventually fires for this queue it would be bad. Hence the crash apparently attempting to check the type of a ScriptExecutionContext. Jer / Eric, what do you think? Should HTMLMediaElement be closing the m_asyncEventQueue here? Would it have been guaranteed to have been closed / cleared earlier?
Eric Carlson
Comment 10 2017-04-28 08:20:45 PDT
(In reply to Joseph Pecoraro from comment #9) > It doesn't look like an assert or a nullptr crash. From the snipper of a > crash log provided in the earlier comment this looks like it could be a > dangling pointer. > > Since the number of GenericEventQueues is small I looked around. > > HTMLMediaElement closes all its GenericTaskQueues in > HTMLMediaElement::contextDestroyed() but doesn't close its > GenericEventQueue. That seems rather suspect to me: > > (1) GenericTaskQueue was in the backtrace > > (2) The ScriptExecutionContext is apparently being destroyed here, so if > a timer eventually fires for this queue it would be bad. Hence the > crash > apparently attempting to check the type of a ScriptExecutionContext. > > Jer / Eric, what do you think? Should HTMLMediaElement be closing the > m_asyncEventQueue here? Would it have been guaranteed to have been closed / > cleared earlier? The m_asyncEventQueue is closed in HTMLMediaElement::stop, which should be called before contextDestroyed, but it does seem wrong that all of the other queues are closed in HTMLMediaElement::contextDestroyed. m_promiseTaskQueue is closed in both HTMLMediaElement::contextDestroyed and HTMLMediaElement::stop, and m_asyncEventQueue is closed in HTMLMediaElement::stop and in the destructor. I don't know if it will fix this crash, but it may be useful to put all of the queue/timer cleanup one place and add asserts in the places we assume everything has been cleaned up.
Jer Noble
Comment 11 2017-04-28 08:30:04 PDT
(In reply to Eric Carlson from comment #10) > > The m_asyncEventQueue is closed in HTMLMediaElement::stop, which should be > called before contextDestroyed, but it does seem wrong that all of the other > queues are closed in HTMLMediaElement::contextDestroyed. > > m_promiseTaskQueue is closed in both HTMLMediaElement::contextDestroyed and > HTMLMediaElement::stop, and m_asyncEventQueue is closed in > HTMLMediaElement::stop and in the destructor. I don't know if it will fix > this crash, but it may be useful to put all of the queue/timer cleanup one > place and add asserts in the places we assume everything has been cleaned up. We could probably fix this narrow issue by closing the queue in contextDestroyed(), but perhaps the more interesting issue is why the ScriptExecutionContext/Document was destroyed without first calling stopActiveDOMObjects(), or why the HTMLMediaElement wasn't considered to be an ActiveDOMObject at destruction time.
Chris Dumez
Comment 12 2017-05-02 09:13:08 PDT
Assuming the EventTarget is a Node (as it is likely the case since most GenericTaskQueue queues are on HTMLMediaElement), then scriptExecution() context is supposed to return the Node's document(). A node's document() is always valid & alive as long as the Node is itself alive. Therefore, I suspect that either: 1. The EventTarget is not a Node 2. The HTMLMediaElement is itself dead.
Chris Dumez
Comment 13 2017-05-02 09:27:33 PDT
Based on the trace, it is a GenericEventQueue so it could be: - MediaKeySession::m_eventQueue - WebKitMediaKeySession::m_asyncEventQueue - MediaSource::m_asyncEventQueue - SourceBuffer::m_asyncEventQueue - SourceBufferList::m_asyncEventQueue - AudioContext::m_eventQueue - HTMLMediaElement::m_asyncEventQueue - TrackListBase::m_asyncEventQueue As Joe pointed out, the crash is unlikely to be caused by imported/blink/paint/deprecatedpaintlayer/non-self-painting-layer-overrides-visibility.html but rather a test that ran shortly before it. Given that this is the first test in LayoutTests/imported/blink/paint/ folder, I think it is likely caused by a test inside LayoutTests/imported/blink/media/ folder.
Chris Dumez
Comment 14 2017-05-02 09:44:48 PDT
SourceBufferList is potentially suspicious as it: - Has a GenericEventQueue data member - overrides EventTarget::scriptExecutionContext() to return its m_scriptExecutionContext member which is an unprotected raw pointer. - I see no code clearing m_scriptExecutionContext when the document gets destroyed We probably want SourceBufferList to subclass ContextDestructionObserver instead. Similar issue for TrackListBase. TextTrack::m_scriptExecution context also looks potentially unsafe, although TextTrack does not have a GenericEventQueue (it could still be the Event::target() of an Event that is inside the GenericEventQueue).
Chris Dumez
Comment 15 2017-05-02 10:19:20 PDT
Chris Dumez
Comment 16 2017-05-02 10:40:15 PDT
Chris Dumez
Comment 17 2017-05-02 12:04:03 PDT
Comment on attachment 308832 [details] Patch Clearing flags on attachment: 308832 Committed r216084: <http://trac.webkit.org/changeset/216084>
Chris Dumez
Comment 18 2017-05-02 12:04:05 PDT
All reviewed patches have been landed. Closing bug.
Lucas Forschler
Comment 19 2019-02-06 09:18:32 PST
Mass move bugs into the DOM component.
Note You need to log in before you can comment on or make changes to this bug.