Bug 185385 - REGRESSION: performance-api/performance-observer-entry-sort.html is flaky
Summary: REGRESSION: performance-api/performance-observer-entry-sort.html is flaky
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Ryosuke Niwa
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2018-05-07 10:44 PDT by Ryan Haddad
Modified: 2018-05-24 20:21 PDT (History)
7 users (show)

See Also:


Attachments
Adds assertions (2.79 KB, patch)
2018-05-11 21:13 PDT, Ryosuke Niwa
sbarati: review+
ews-watchlist: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews204 for win-future (12.72 MB, application/zip)
2018-05-12 01:31 PDT, EWS Watchlist
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ryan Haddad 2018-05-07 10:44:46 PDT
The following layout test is flaky on iOS and macOS release bots

performance-api/performance-observer-entry-sort.html

Probable cause:

Unknown. The earliest failure visible on the dashboard was on 4/23/2018  and blames https://trac.webkit.org/log/?verbose=on&rev=230918&stop_rev=230929

Flakiness Dashboard:

https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=performance-api%2Fperformance-observer-entry-sort.html

--- /Volumes/Data/slave/highsierra-release-tests-wk1/build/layout-test-results/performance-api/performance-observer-entry-sort-expected.txt
+++ /Volumes/Data/slave/highsierra-release-tests-wk1/build/layout-test-results/performance-api/performance-observer-entry-sort-actual.txt
@@ -11,16 +11,16 @@
 PASS list.getEntries()[2].entryType is "measure"
 PASS list.getEntries()[3].entryType is "mark"
 PASS list.getEntries()[3].name is "mark1"
-PASS list.getEntries()[7].entryType is "mark"
-PASS list.getEntries()[7].name is "mark3"
+FAIL list.getEntries()[7].entryType should be mark. Was measure.
+FAIL list.getEntries()[7].name should be mark3. Was measure-matching-mark2-2.
 PASS list.getEntries()[0].name is "measure1"
 PASS list.getEntries()[1].name is "measure2"
 PASS list.getEntries()[2].name is "measure3"
 PASS list.getEntries()[3].name is "mark1"
 PASS list.getEntries()[4].name is "mark2"
 PASS list.getEntries()[5].name is "measure-matching-mark2-1"
-PASS list.getEntries()[6].name is "measure-matching-mark2-2"
-PASS list.getEntries()[7].name is "mark3"
+FAIL list.getEntries()[6].name should be measure-matching-mark2-2. Was mark3.
+FAIL list.getEntries()[7].name should be mark3. Was measure-matching-mark2-2.
 PASS successfullyParsed is true
 
 TEST COMPLETE
Comment 1 Radar WebKit Bug Importer 2018-05-07 22:59:12 PDT
<rdar://problem/40049391>
Comment 2 Ryosuke Niwa 2018-05-10 20:22:07 PDT
There is a race condition in the test here.

We have the following code:

function wait() {
    let now = performance.now();
    while (now === performance.now())
        continue;
}
...
wait(); // Ensure mark2 !== mark3 startTime by making sure performance.now advances.
performance.mark("mark3");
performance.measure("measure3");
performance.measure("measure-matching-mark2-2", "mark2");

Here, the test expects the start time of "mark3" and "measure-matching-mark2-2" to be identical since we just aligned at 1ms granularity of performance.now().

If that were always true, then we would expect "mark3" to always appear after "measure-matching-mark2-2" because UserTiming::measure places a "measure" entry before a "mark" entry with the same start time.

However, this isn't always guaranteed. On a highly contentious computer, we might get pre-emptied right after enqueuing "mark3" but before enqueuing "measure-matching-mark2-2", thereby making "measure-matching-mark2-2"'s start time to be later than "mark3"'s start time.
Comment 3 Ryosuke Niwa 2018-05-11 19:15:49 PDT
Hm... the above analysis maybe incorrect. I've introduced an artificial delay between measure & mark but not seeing the same failure.
Comment 4 Ryosuke Niwa 2018-05-11 21:09:03 PDT
Saam and I looked at the test more closely, and the only explanation we can come up with is that performance.now() is drifting backwards either due to some bug in monotonic clock itself or how std::floor is compiled / executed in CPU.
Comment 5 Ryosuke Niwa 2018-05-11 21:13:07 PDT
Created attachment 340243 [details]
Adds assertions
Comment 6 Ryosuke Niwa 2018-05-11 21:18:35 PDT
Since I can't reproduce this flakiness on my machine, here's a patch to add more diagnostic logging.
Comment 7 EWS Watchlist 2018-05-12 01:30:48 PDT
Comment on attachment 340243 [details]
Adds assertions

Attachment 340243 [details] did not pass win-ews (win):
Output: http://webkit-queues.webkit.org/results/7659078

New failing tests:
http/tests/security/contentSecurityPolicy/userAgentShadowDOM/allow-audio.html
Comment 8 EWS Watchlist 2018-05-12 01:31:00 PDT
Created attachment 340247 [details]
Archive of layout-test-results from ews204 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews204  Port: win-future  Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Comment 9 Ryosuke Niwa 2018-05-21 16:28:41 PDT
Committed r232033: <https://trac.webkit.org/changeset/232033>
Comment 10 Ryosuke Niwa 2018-05-24 15:06:41 PDT
Committed r232166: <https://trac.webkit.org/changeset/232166>
Comment 11 Ryosuke Niwa 2018-05-24 20:21:30 PDT
Committed r232181: <https://trac.webkit.org/changeset/232181>