Summary: | REGRESSION: Layout Test storage/indexeddb/detached-iframe.html is a flaky crash | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Matt Lewis <jlewis3> | ||||||||
Component: | Web Inspector | Assignee: | Matt Baker <mattbaker> | ||||||||
Status: | RESOLVED DUPLICATE | ||||||||||
Severity: | Normal | CC: | achristensen, ap, cdumez, inspector-bugzilla-changes, joepeck, mattbaker, sihui_liu, webkit-bug-importer | ||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||
Version: | WebKit Nightly Build | ||||||||||
Hardware: | Unspecified | ||||||||||
OS: | Unspecified | ||||||||||
Attachments: |
|
Description
Matt Lewis
2017-11-29 16:44:21 PST
Created attachment 327925 [details]
WK2 Crash Log
Attaching the WK2 crash log as well
The first failure on the bots was on 2017-11-14 around 1 am. It's not super frequent, but clearly a recent regression. There are also some STP crashes with similar signatures, which I didn't look into carefully. The crash looks like a null Frame: > Exception Type: EXC_BAD_ACCESS (SIGSEGV) > Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000040 > > ... > > Thread 0 Crashed:: Dispatch queue: com.apple.main-thread > 0 com.apple.WebCore 0x0000000450402a8c WebCore::Frame::page() const + 12 (Frame.h:373) > 1 com.apple.WebCore 0x0000000451830325 WebCore::InspectorInstrumentation::instrumentingAgentsForFrame(WebCore::Frame&) + 21 (InspectorInstrumentation.h:1431) > 2 com.apple.WebCore 0x0000000452480116 WebCore::InspectorInstrumentation::didDispatchPostMessage(WebCore::Frame&, WebCore::TimerBase&) + 54 (InspectorInstrumentation.h:731) > 3 com.apple.WebCore 0x000000045247fe57 WebCore::DOMWindow::postMessageTimerFired(WebCore::PostMessageTimer&) + 551 (DOMWindow.cpp:973) Coming from: > void DOMWindow::postMessageTimerFired(PostMessageTimer& timer) > { > if (!document() || !isCurrentlyDisplayedInFrame()) > return; > > ... > > InspectorInstrumentation::willDispatchPostMessage(*m_frame, timer); > > dispatchEvent(timer.event(*document())); > > InspectorInstrumentation::didDispatchPostMessage(*m_frame, timer); > } So my guess is `m_frame` is nullptr by didDispatchPostMessage time (and presumably wasn't at willDispatchPostMessage time). (In reply to Joseph Pecoraro from comment #4) > The crash looks like a null Frame: > > > Exception Type: EXC_BAD_ACCESS (SIGSEGV) > > Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000040 > > > > ... > > > > Thread 0 Crashed:: Dispatch queue: com.apple.main-thread > > 0 com.apple.WebCore 0x0000000450402a8c WebCore::Frame::page() const + 12 (Frame.h:373) > > 1 com.apple.WebCore 0x0000000451830325 WebCore::InspectorInstrumentation::instrumentingAgentsForFrame(WebCore::Frame&) + 21 (InspectorInstrumentation.h:1431) > > 2 com.apple.WebCore 0x0000000452480116 WebCore::InspectorInstrumentation::didDispatchPostMessage(WebCore::Frame&, WebCore::TimerBase&) + 54 (InspectorInstrumentation.h:731) > > 3 com.apple.WebCore 0x000000045247fe57 WebCore::DOMWindow::postMessageTimerFired(WebCore::PostMessageTimer&) + 551 (DOMWindow.cpp:973) > > Coming from: > > > void DOMWindow::postMessageTimerFired(PostMessageTimer& timer) > > { > > if (!document() || !isCurrentlyDisplayedInFrame()) > > return; > > > > ... > > > > InspectorInstrumentation::willDispatchPostMessage(*m_frame, timer); > > > > dispatchEvent(timer.event(*document())); > > > > InspectorInstrumentation::didDispatchPostMessage(*m_frame, timer); > > } > > So my guess is `m_frame` is nullptr by didDispatchPostMessage time (and > presumably wasn't at willDispatchPostMessage time). So something is calling FrameDestructionObserver::frameDestroyed() during event dispatch. If this is the case, can we still get the PageDebuggerAgent for the frame, in order to clean up async call stack state? Is the agent even around anymore? > > So my guess is `m_frame` is nullptr by didDispatchPostMessage time (and
> > presumably wasn't at willDispatchPostMessage time).
>
> So something is calling FrameDestructionObserver::frameDestroyed() during
> event dispatch. If this is the case, can we still get the PageDebuggerAgent
> for the frame, in order to clean up async call stack state? Is the agent
> even around anymore?
The crash happens in looking up the list of agents from the frame (InspectorInstrumentation level), but the frame is null.
And PageDebuggerAgent is 1 per Page anyways, so I would expect it to exist, but we still crash before reaching that point.
Created attachment 328353 [details]
Patch
Comment on attachment 328353 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328353&action=review > Source/WebCore/page/DOMWindow.cpp:954 > - InspectorInstrumentation::didPostMessage(*m_frame, *timer, state); > + InspectorInstrumentation::didPostMessage(*document(), *timer, state); I don't understand how this will fix anything. We are just changing the assumption that the frame is non-null to an assumption that the document is non-null. Is that assumption more valid? Why don't we want an early return or something? Comment on attachment 328353 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328353&action=review >> Source/WebCore/page/DOMWindow.cpp:954 >> + InspectorInstrumentation::didPostMessage(*document(), *timer, state); > > I don't understand how this will fix anything. We are just changing the assumption that the frame is non-null to an assumption that the document is non-null. Is that assumption more valid? Why don't we want an early return or something? Yeah, I agree. I think the specific concern was at line 984, after dispatchEvent() the document and frame might not exist so we should check instead of assuming either one. Comment on attachment 328353 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328353&action=review >>> Source/WebCore/page/DOMWindow.cpp:954 >>> + InspectorInstrumentation::didPostMessage(*document(), *timer, state); >> >> I don't understand how this will fix anything. We are just changing the assumption that the frame is non-null to an assumption that the document is non-null. Is that assumption more valid? Why don't we want an early return or something? > > Yeah, I agree. I think the specific concern was at line 984, after dispatchEvent() the document and frame might not exist so we should check instead of assuming either one. And if possible we should write a test. Script on the page can register for that dispatchEvent and close the frame in the JS handler. Comment on attachment 328353 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328353&action=review >>>> Source/WebCore/page/DOMWindow.cpp:954 >>>> + InspectorInstrumentation::didPostMessage(*document(), *timer, state); >>> >>> I don't understand how this will fix anything. We are just changing the assumption that the frame is non-null to an assumption that the document is non-null. Is that assumption more valid? Why don't we want an early return or something? >> >> Yeah, I agree. I think the specific concern was at line 984, after dispatchEvent() the document and frame might not exist so we should check instead of assuming either one. > > And if possible we should write a test. Script on the page can register for that dispatchEvent and close the frame in the JS handler. The document/frame/whatever that is passed to Inspector instrumentation is only used to retrieve instrumenting agents, so we just need something that will outlive the detached frame. We can use page() instead. As far as a test goes, detached-frame.html performs the same steps you describe (closing the frame in the JS handler during dispatchEvent). Is the existing test insufficient? > >> Yeah, I agree. I think the specific concern was at line 984, after dispatchEvent() the document and frame might not exist so we should check instead of assuming either one. > > > > And if possible we should write a test. Script on the page can register for that dispatchEvent and close the frame in the JS handler. > > The document/frame/whatever that is passed to Inspector instrumentation is > only used to retrieve instrumenting agents, so we just need something that > will outlive the detached frame. We can use page() instead. page() just uses m_frame under the hood, and can be null: Page* DOMWindow::page() { return frame() ? frame()->page() : nullptr; } So it is no different, and would still need an if check. > As far as a test goes, detached-frame.html performs the same steps you > describe (closing the frame in the JS handler during dispatchEvent). > Is the existing test insufficient? I suppose it is, but its flakey and could go away, be skipped, or changed and we'd never know if there is coverage for this issue which we want to fix. Ideally there would be a specific test covering this specific scenario. If one can't be constructed, then we can fallback to the existing test. *** This bug has been marked as a duplicate of bug 191217 *** |