Many tests are flaky with blank images either in the actual.png or expected.png. In some cases, it fails both runs and happen to be registered as failure. Link to first test run with the flakies: https://build.webkit.org/builders/WPE-Linux-64-bit-Release-Tests/builds/20967 Testing locally, the issue is reproducible most of the time (>80% here) with the following command: ./Tools/Scripts/run-webkit-tests --verbose --debug-rwt-logging --no-retry-failures --no-new-test-results --no-build --no-show-results --clobber-old-results --display-server=xvfb --results-directory layout-test-results --release --wpe fast/css/sticky/inflow-sticky.html Also, running only on fast/css/sticky/, the following 3 tests (that use window.scrollTo) fail: fast/css/sticky/inflow-sticky.html fast/css/sticky/inline-sticky-abspos-child.html fast/css/sticky/sticky-both-sides.html One complication is that it seems to depend on the hardware, as Chris could not reproduce it on his desktop but only on his laptop.
Just to add a note that I'm working on this - I've narrowed it down to the layer UpdateScope that gets used in the Frame and Overflow scrolling nodes causing some kind of either deadlock or oscillation, but I've not been able to pin down exactly what's happening due to the difficulty in reproducing and its lack of effect in anything except testing situations (at least that I've found so far). I've been trying various fixes and work-arounds, but haven't had any success so far.
Also a note, the reproduction instructions in comment one don't actually apply on my laptop (where I can reproduce this error). If I do 5 iterations of fast/css, I get a variable number of image-only failures and fast/css/pseudo-valid-fieldset.html gets highlighted as unexpected flakiness consistently. If I run that test in isolation, even for 50 iterations, it succeeds. All of the tests mentioned in the first comment succeed for me. 5 iterations might be overkill, I might be able to get away with 2 or 3. On my desktop, all tests pass fine even with 10 iterations.
Created attachment 414214 [details] Patch
I believe this was happening due to updating the layer scroll node ID without flushing composition state and thus breaking the mapping of layer->scroll node. There was no actual deadlock (at least that I could find), but rather a race. This doesn't affect GTK because GTK doesn't run tests with async composition enabled, afaik. This also, as far as I can tell, doesn't have any visually noticeable effect in practical use, I think the problem mostly just affected test results.
Comment on attachment 414214 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=414214&action=review > Source/WebCore/page/scrolling/nicosia/ScrollingCoordinatorNicosia.cpp:107 > + if (auto* frameView = frameViewForScrollingNode(nodeID)) > + frameView->flushCompositingStateIncludingSubframes(); This should not be necessary here. This is called from compositing updates, so we're already going to flush.
(In reply to Simon Fraser (smfr) from comment #5) > Comment on attachment 414214 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=414214&action=review > > > Source/WebCore/page/scrolling/nicosia/ScrollingCoordinatorNicosia.cpp:107 > > + if (auto* frameView = frameViewForScrollingNode(nodeID)) > > + frameView->flushCompositingStateIncludingSubframes(); > > This should not be necessary here. This is called from compositing updates, > so we're already going to flush. That's even better, so this patch, minus this, is valid? I couldn't trace through enough to really understand if this was a situation where a flush was guaranteed or not.
(In reply to Chris Lord from comment #6) > (In reply to Simon Fraser (smfr) from comment #5) > > Comment on attachment 414214 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=414214&action=review > > > > > Source/WebCore/page/scrolling/nicosia/ScrollingCoordinatorNicosia.cpp:107 > > > + if (auto* frameView = frameViewForScrollingNode(nodeID)) > > > + frameView->flushCompositingStateIncludingSubframes(); > > > > This should not be necessary here. This is called from compositing updates, > > so we're already going to flush. > > That's even better, so this patch, minus this, is valid? I couldn't trace > through enough to really understand if this was a situation where a flush > was guaranteed or not. Any changes to a ScrollingState* node will schedule a flush.
Comment on attachment 414214 [details] Patch Actually this patch is wrong :)
Accessing platformLayers from ScrollingCoordinatorNicosia code seems wrong. On macOS, the scrolling tree nodeIDs are accessed on the scrolling thread by going from CALayer -> PlatformCALayer. Can Nicosia do something similar?
Yes, it can. Node ID assignments should be handled through GraphicsLayer.
Created attachment 414326 [details] Patch
Ok, that was much simpler once I knew where to look...
Created attachment 414346 [details] Patch
Committed r269909: <https://trac.webkit.org/changeset/269909> All reviewed patches have been landed. Closing bug and clearing flags on attachment 414346 [details].
<rdar://problem/71490175>