The following layout test is flaky on iOS and macOS release bots performance-api/performance-observer-entry-sort.html Probable cause: Unknown. The earliest failure visible on the dashboard was on 4/23/2018 and blames https://trac.webkit.org/log/?verbose=on&rev=230918&stop_rev=230929 Flakiness Dashboard: https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=performance-api%2Fperformance-observer-entry-sort.html --- /Volumes/Data/slave/highsierra-release-tests-wk1/build/layout-test-results/performance-api/performance-observer-entry-sort-expected.txt +++ /Volumes/Data/slave/highsierra-release-tests-wk1/build/layout-test-results/performance-api/performance-observer-entry-sort-actual.txt @@ -11,16 +11,16 @@ PASS list.getEntries()[2].entryType is "measure" PASS list.getEntries()[3].entryType is "mark" PASS list.getEntries()[3].name is "mark1" -PASS list.getEntries()[7].entryType is "mark" -PASS list.getEntries()[7].name is "mark3" +FAIL list.getEntries()[7].entryType should be mark. Was measure. +FAIL list.getEntries()[7].name should be mark3. Was measure-matching-mark2-2. PASS list.getEntries()[0].name is "measure1" PASS list.getEntries()[1].name is "measure2" PASS list.getEntries()[2].name is "measure3" PASS list.getEntries()[3].name is "mark1" PASS list.getEntries()[4].name is "mark2" PASS list.getEntries()[5].name is "measure-matching-mark2-1" -PASS list.getEntries()[6].name is "measure-matching-mark2-2" -PASS list.getEntries()[7].name is "mark3" +FAIL list.getEntries()[6].name should be measure-matching-mark2-2. Was mark3. +FAIL list.getEntries()[7].name should be mark3. Was measure-matching-mark2-2. PASS successfullyParsed is true TEST COMPLETE
<rdar://problem/40049391>
There is a race condition in the test here. We have the following code: function wait() { let now = performance.now(); while (now === performance.now()) continue; } ... wait(); // Ensure mark2 !== mark3 startTime by making sure performance.now advances. performance.mark("mark3"); performance.measure("measure3"); performance.measure("measure-matching-mark2-2", "mark2"); Here, the test expects the start time of "mark3" and "measure-matching-mark2-2" to be identical since we just aligned at 1ms granularity of performance.now(). If that were always true, then we would expect "mark3" to always appear after "measure-matching-mark2-2" because UserTiming::measure places a "measure" entry before a "mark" entry with the same start time. However, this isn't always guaranteed. On a highly contentious computer, we might get pre-emptied right after enqueuing "mark3" but before enqueuing "measure-matching-mark2-2", thereby making "measure-matching-mark2-2"'s start time to be later than "mark3"'s start time.
Hm... the above analysis maybe incorrect. I've introduced an artificial delay between measure & mark but not seeing the same failure.
Saam and I looked at the test more closely, and the only explanation we can come up with is that performance.now() is drifting backwards either due to some bug in monotonic clock itself or how std::floor is compiled / executed in CPU.
Created attachment 340243 [details] Adds assertions
Since I can't reproduce this flakiness on my machine, here's a patch to add more diagnostic logging.
Comment on attachment 340243 [details] Adds assertions Attachment 340243 [details] did not pass win-ews (win): Output: http://webkit-queues.webkit.org/results/7659078 New failing tests: http/tests/security/contentSecurityPolicy/userAgentShadowDOM/allow-audio.html
Created attachment 340247 [details] Archive of layout-test-results from ews204 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews204 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Committed r232033: <https://trac.webkit.org/changeset/232033>
Committed r232166: <https://trac.webkit.org/changeset/232166>
Committed r232181: <https://trac.webkit.org/changeset/232181>