Summary: | ScrollViewInsetTests.RestoreInitialContentOffsetAfterCrash API test is failing with async delegates | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Chris Dumez <cdumez> | ||||||||||||||
Component: | WebCore Misc. | Assignee: | Chris Dumez <cdumez> | ||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||
Severity: | Normal | CC: | achristensen, commit-queue, dbates, ews-watchlist, japhet, rniwa, ryanhaddad, webkit-bug-importer, wenson_hsieh | ||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||
Version: | WebKit Nightly Build | ||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||
OS: | Unspecified | ||||||||||||||||
Bug Depends on: | 183886, 184206, 184210 | ||||||||||||||||
Bug Blocks: | 180568 | ||||||||||||||||
Attachments: |
|
Description
Chris Dumez
2018-03-20 11:47:44 PDT
I see this error which could be related: 2018-03-20 11:50:30.865 TestWebKitAPI[34074:5841571] Encountered error: Error Domain=WKErrorDomain Code=4 "A JavaScript exception occurred" UserInfo={WKJavaScriptExceptionLineNumber=1, WKJavaScriptExceptionMessage=TypeError: null is not an object (evaluating 'editor.querySelector('a').href'), WKJavaScriptExceptionColumnNumber=26, WKJavaScriptExceptionSourceURL=file:///Volumes/Data/WebKit/OpenSource/WebKitBuild/Release-iphonesimulator/TestWebKitAPI.resources/selected-text-image-link-and-editable.html, NSLocalizedDescription=A JavaScript exception occurred} while evaluating script: editor.querySelector('a').href (In reply to Chris Dumez from comment #2) > I see this error which could be related: > 2018-03-20 11:50:30.865 TestWebKitAPI[34074:5841571] Encountered error: > Error Domain=WKErrorDomain Code=4 "A JavaScript exception occurred" > UserInfo={WKJavaScriptExceptionLineNumber=1, > WKJavaScriptExceptionMessage=TypeError: null is not an object (evaluating > 'editor.querySelector('a').href'), WKJavaScriptExceptionColumnNumber=26, > WKJavaScriptExceptionSourceURL=file:///Volumes/Data/WebKit/OpenSource/ > WebKitBuild/Release-iphonesimulator/TestWebKitAPI.resources/selected-text- > image-link-and-editable.html, NSLocalizedDescription=A JavaScript exception > occurred} while evaluating script: editor.querySelector('a').href That is because we fail to perform the drop, so when test subsequently queries the DOM and expects a dropped anchor element, we throw an exception. Potentially timing related. Adding [webView waitForNextPresentationUpdate] calls after the navigation has completed and before the drag and drop actually starts seems to fix most API tests. I need to make both the navigation AND the response policy delegates synchronous again to get the API tests to pass again. Making only one of those 2 synchronous does not fix the tests. Looking at ScrollViewInsetTests.RestoreInitialContentOffsetAfterCrash test that seems simpler and appears to be failing for the same reason: In the success case: RemoteLayerTreeDrawingArea::flushLayers() RemoteLayerTreeDrawingArea::flushLayers() m_isFlushingSuspended 0x7f918900f018 - WebPageProxy::didCommitLoadForFrame() PageClientImpl::didCommitLoadForMainFrame() _didCommitLoadForMainFrame sets _firstPaintAfterCommitLoadTransactionID & _needsResetViewStateAfterCommitLoadForMainFrame RemoteLayerTreeDrawingArea::flushLayers() 0x7f918900f018 - WebPageProxy::didFinishLoadForFrame() PASS ScrollViewInsetTests.RestoreInitialContentOffsetAfterCrash In the error case: Failure: RemoteLayerTreeDrawingArea::flushLayers() RemoteLayerTreeDrawingArea::flushLayers() sends CommitLayer tree IPC RemoteLayerTreeDrawingAreaProxy::commitLayerTree() RemoteLayerTreeDrawingArea::flushLayers() RemoteLayerTreeDrawingArea::flushLayers() m_waitingForBackingStoreSwap 0x7fa90b80ae18 - WebPageProxy::didCommitLayerTree() PageClientImpl::didCommitLayerTree() _didCommitLayerTree 0x7fa90b80ae18 - WebPageProxy::didCommitLoadForFrame() PageClientImpl::didCommitLoadForMainFrame() _didCommitLoadForMainFrame sets _firstPaintAfterCommitLoadTransactionID & _needsResetViewStateAfterCommitLoadForMainFrame 0x7fa90b80ae18 - WebPageProxy::didFinishLoadForFrame() RemoteLayerTreeDrawingArea::flushLayers() FAIL ScrollViewInsetTests.RestoreInitialContentOffsetAfterCrash The test stores scrollView.contentOffset right after the load and expects scrollView.contentOffset.y to be -400. However, it is -299 after my change. scrollView.contentOffset gets updated in WKWebView._didCommitLayerTree. Note that in the success case, _didCommitLayerTree does not get called before the test finishes running. Therefore, the test was truly storing the *initial* contentOffset in initialContentOffset. However, with the async policy delegates, _didCommitLayerTree gets called and overwrites scrollView.contentOffset *before* didFinishNavigation is called. As a result, the test stores the updated contentOffset in initialContentOffset and therefore fails. This looks pretty racy to me. So overall, the issue seems to be that we're now committing the layer tree during the navigation policy decision: policyChecker().checkNavigationPolicy() BEGIN RemoteLayerTreeDrawingArea::flushLayers() RemoteLayerTreeDrawingArea::flushLayers() sends CommitLayer tree IPC RemoteLayerTreeDrawingArea::flushLayers() RemoteLayerTreeDrawingArea::flushLayers() m_waitingForBackingStoreSwap policyChecker().checkNavigationPolicy() END (In reply to Chris Dumez from comment #6) > So overall, the issue seems to be that we're now committing the layer tree > during the navigation policy decision: > policyChecker().checkNavigationPolicy() BEGIN > RemoteLayerTreeDrawingArea::flushLayers() > RemoteLayerTreeDrawingArea::flushLayers() sends CommitLayer tree IPC > RemoteLayerTreeDrawingArea::flushLayers() > RemoteLayerTreeDrawingArea::flushLayers() m_waitingForBackingStoreSwap > policyChecker().checkNavigationPolicy() END This happens because when we create the view, it creates a WebPage which ends up calling RemoteLayerTreeDrawingArea::scheduleCompositingLayerFlush() before we've started any real load: 1 0x10ee13853 WebKit::RemoteLayerTreeDrawingArea::scheduleCompositingLayerFlush() 2 0x11aaf565d WebCore::RenderLayerCompositor::attachRootLayer(WebCore::RenderLayerCompositor::RootLayerAttachment) 3 0x11aaf061a WebCore::RenderLayerCompositor::ensureRootLayer() 4 0x11aacbbf8 WebCore::RenderLayerCompositor::updateCompositingLayers(WebCore::CompositingUpdateType, WebCore::RenderLayer*) 5 0x11a2f6e90 WebCore::Document::resolveStyle(WebCore::Document::ResolveStyleType) 6 0x11a2f8720 WebCore::Document::createRenderTree() 7 0x11a2f876b WebCore::Document::didBecomeCurrentDocumentInFrame() 8 0x11a77b02b WebCore::Frame::setDocument(WTF::RefPtr<WebCore::Document, WTF::DumbPtrTraits<WebCore::Document> >&&) 9 0x11a6a9864 WebCore::DocumentWriter::begin(WebCore::URL const&, bool, WebCore::Document*) 10 0x11a69711b WebCore::DocumentLoader::commitData(char const*, unsigned long) 11 0x11a696973 WebCore::DocumentLoader::finishedLoading() 12 0x11a69d809 WebCore::DocumentLoader::maybeLoadEmpty() 13 0x11a69dab1 WebCore::DocumentLoader::startLoadingMainResource() 14 0x11a6b2566 WebCore::FrameLoader::init() 15 0x10eebd6a1 WebKit::WebFrame::createWithCoreMainFrame(WebKit::WebPage*, WebCore::Frame*) 16 0x10eef12fd WebKit::WebPage::WebPage(unsigned long long, WebKit::WebPageCreationParameters&&) 17 0x10eef03b8 WebKit::WebPage::create(unsigned long long, WebKit::WebPageCreationParameters&&) 18 0x10ef9f5cf WebKit::WebProcess::createWebPage(unsigned long long, WebKit::WebPageCreationParameters&&) This schedules a timer that calls flushLayers() after a short amount of time. Now that the navigation policy decision is async, this timer can fire during the policy decision and we thus end up committing the layer tree for about:blank, before we've actually started loading the test URL. Ok, so I believe the difference in behavior may come from the fact that when the navigation policy delegate is synchronous: 1. about:blank is in the view, there is a timer started to flush layers. 2. We do the synchronous navigation policy for the test page 3. We start provisional load which calls WebFrameLoaderClient::provisionalLoadStarted() which calls didStartPageTransition() which freezes the layer tree. So when the timer to flush layers fires during the provisional load, we return early because the layer tree is frozen. When the navigation policy delegate is asynchronous. The timer to flush layers fires during the policy decision, before we've frozen the layer tree (because provisional has not started yet). As a result, we commit the layer tree for about:blank. I'll experiment tomorrow with freezing the layer tree earlier, when we start the navigation policy decision, to see if this restores previous behavior. (In reply to Chris Dumez from comment #8) > Ok, so I believe the difference in behavior may come from the fact that when > the navigation policy delegate is synchronous: > 1. about:blank is in the view, there is a timer started to flush layers. > 2. We do the synchronous navigation policy for the test page > 3. We start provisional load which calls > WebFrameLoaderClient::provisionalLoadStarted() which calls > didStartPageTransition() which freezes the layer tree. > So when the timer to flush layers fires during the provisional load, we > return early because the layer tree is frozen. > > When the navigation policy delegate is asynchronous. The timer to flush > layers fires during the policy decision, before we've frozen the layer tree > (because provisional has not started yet). As a result, we commit the layer > tree for about:blank. > > I'll experiment tomorrow with freezing the layer tree earlier, when we start > the navigation policy decision, to see if this restores previous behavior. Yep, looks like this is it. If I freeze the layer tree before the navigation policy decision instead of after, then it restores previous behavior and the API tests start passing again. I'll work on a patch tomorrow. Created attachment 336201 [details]
Patch
Comment on attachment 336201 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=336201&action=review > Source/WebKit/WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:904 > + if (!m_frame->isMainFrame()) This seems to be the case, but just wanted to make sure — it's not possible for m_frame->isMainFrame() to be true in WebFrameLoaderClient::dispatchDecidePolicyForNavigationAction but false in WebFrameLoaderClient::didDecidePolicyForNavigationAction, correct? > Tools/TestWebKitAPI/Tests/ios/ScrollViewInsetTests.mm:36 > +static bool navigationComplete = false; Nit - I think we now generally avoid using static state variables in API tests if possible (though many older tests still do this). Could we fold this into AsyncPolicyDelegateForInsetTest? Created attachment 336202 [details]
Patch
Comment on attachment 336202 [details] Patch Attachment 336202 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/7053708 New failing tests: svg/as-image/svg-image-with-data-uri-reloading.html fast/shadow-dom/clear-active-state-in-shadow.html Created attachment 336212 [details]
Archive of layout-test-results from ews107 for mac-sierra-wk2
The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews107 Port: mac-sierra-wk2 Platform: Mac OS X 10.12.6
Comment on attachment 336202 [details] Patch Attachment 336202 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/7053757 New failing tests: svg/as-image/svg-image-with-data-uri-reloading.html Created attachment 336214 [details]
Archive of layout-test-results from ews122 for ios-simulator-wk2
The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews122 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.6
Created attachment 336219 [details]
Patch
Created attachment 336224 [details]
Patch
Comment on attachment 336224 [details] Patch Clearing flags on attachment: 336224 Committed r229828: <https://trac.webkit.org/changeset/229828> All reviewed patches have been landed. Closing bug. It appears that this may have caused an API test timeout on iOS: Tests that timed out: WebKit.NoHistoryItemScrollToFragment https://build.webkit.org/builders/Apple%20iOS%2011%20Simulator%20Release%20WK2%20%28Tests%29/builds/3857 (In reply to Ryan Haddad from comment #22) > It appears that this may have caused an API test timeout on iOS: > > Tests that timed out: > WebKit.NoHistoryItemScrollToFragment > > https://build.webkit.org/builders/ > Apple%20iOS%2011%20Simulator%20Release%20WK2%20%28Tests%29/builds/3857 Looking now. (In reply to Chris Dumez from comment #23) > (In reply to Ryan Haddad from comment #22) > > It appears that this may have caused an API test timeout on iOS: > > > > Tests that timed out: > > WebKit.NoHistoryItemScrollToFragment > > > > https://build.webkit.org/builders/ > > Apple%20iOS%2011%20Simulator%20Release%20WK2%20%28Tests%29/builds/3857 > > Looking now. via https://bugs.webkit.org/show_bug.cgi?id=183886 |