Bug 185385

Summary:

REGRESSION: performance-api/performance-observer-entry-sort.html is flaky

Product:

WebKit

Reporter:

Ryan Haddad <ryanhaddad>

Component:

New Bugs

Assignee:

Ryosuke Niwa <rniwa>

Status:

NEW

Severity:

Normal

CC:

achristensen, ap, ews-watchlist, joepeck, rniwa, saam, webkit-bug-importer

Priority:

Keywords:

InRadar

Version:

Other

Hardware:

Unspecified

OS:

Unspecified

Attachments:

Description	Flags
Adds assertions	saam: review+, ews-watchlist: commit-queue-
Archive of layout-test-results from ews204 for win-future	none

Ryan Haddad

Reported 2018-05-07 10:44:46 PDT

The following layout test is flaky on iOS and macOS release bots performance-api/performance-observer-entry-sort.html Probable cause: Unknown. The earliest failure visible on the dashboard was on 4/23/2018 and blames https://trac.webkit.org/log/?verbose=on&rev=230918&stop_rev=230929 Flakiness Dashboard: https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=performance-api%2Fperformance-observer-entry-sort.html --- /Volumes/Data/slave/highsierra-release-tests-wk1/build/layout-test-results/performance-api/performance-observer-entry-sort-expected.txt +++ /Volumes/Data/slave/highsierra-release-tests-wk1/build/layout-test-results/performance-api/performance-observer-entry-sort-actual.txt @@ -11,16 +11,16 @@ PASS list.getEntries()[2].entryType is "measure" PASS list.getEntries()[3].entryType is "mark" PASS list.getEntries()[3].name is "mark1" -PASS list.getEntries()[7].entryType is "mark" -PASS list.getEntries()[7].name is "mark3" +FAIL list.getEntries()[7].entryType should be mark. Was measure. +FAIL list.getEntries()[7].name should be mark3. Was measure-matching-mark2-2. PASS list.getEntries()[0].name is "measure1" PASS list.getEntries()[1].name is "measure2" PASS list.getEntries()[2].name is "measure3" PASS list.getEntries()[3].name is "mark1" PASS list.getEntries()[4].name is "mark2" PASS list.getEntries()[5].name is "measure-matching-mark2-1" -PASS list.getEntries()[6].name is "measure-matching-mark2-2" -PASS list.getEntries()[7].name is "mark3" +FAIL list.getEntries()[6].name should be measure-matching-mark2-2. Was mark3. +FAIL list.getEntries()[7].name should be mark3. Was measure-matching-mark2-2. PASS successfullyParsed is true TEST COMPLETE

Attachments
Adds assertions (2.79 KB, patch) 2018-05-11 21:13 PDT, Ryosuke Niwa	saam: review+ ews-watchlist: commit-queue-	Details Formatted Diff Diff
Archive of layout-test-results from ews204 for win-future (12.72 MB, application/zip) 2018-05-12 01:31 PDT, EWS Watchlist	no flags	Details
View All Add attachment proposed patch, testcase, etc.

Radar WebKit Bug Importer

Comment 1 2018-05-07 22:59:12 PDT

<rdar://problem/40049391>

Ryosuke Niwa

Comment 2 2018-05-10 20:22:07 PDT

There is a race condition in the test here. We have the following code: function wait() { let now = performance.now(); while (now === performance.now()) continue; } ... wait(); // Ensure mark2 !== mark3 startTime by making sure performance.now advances. performance.mark("mark3"); performance.measure("measure3"); performance.measure("measure-matching-mark2-2", "mark2"); Here, the test expects the start time of "mark3" and "measure-matching-mark2-2" to be identical since we just aligned at 1ms granularity of performance.now(). If that were always true, then we would expect "mark3" to always appear after "measure-matching-mark2-2" because UserTiming::measure places a "measure" entry before a "mark" entry with the same start time. However, this isn't always guaranteed. On a highly contentious computer, we might get pre-emptied right after enqueuing "mark3" but before enqueuing "measure-matching-mark2-2", thereby making "measure-matching-mark2-2"'s start time to be later than "mark3"'s start time.

Ryosuke Niwa

Comment 3 2018-05-11 19:15:49 PDT

Hm... the above analysis maybe incorrect. I've introduced an artificial delay between measure & mark but not seeing the same failure.

Ryosuke Niwa

Comment 4 2018-05-11 21:09:03 PDT

Saam and I looked at the test more closely, and the only explanation we can come up with is that performance.now() is drifting backwards either due to some bug in monotonic clock itself or how std::floor is compiled / executed in CPU.

Ryosuke Niwa

Comment 5 2018-05-11 21:13:07 PDT

Created attachment 340243 [details] Adds assertions

Ryosuke Niwa

Comment 6 2018-05-11 21:18:35 PDT

Since I can't reproduce this flakiness on my machine, here's a patch to add more diagnostic logging.

EWS Watchlist

Comment 7 2018-05-12 01:30:48 PDT

Comment on attachment 340243 [details] Adds assertions Attachment 340243 [details] did not pass win-ews (win): Output: http://webkit-queues.webkit.org/results/7659078 New failing tests: http/tests/security/contentSecurityPolicy/userAgentShadowDOM/allow-audio.html

EWS Watchlist

Comment 8 2018-05-12 01:31:00 PDT

Created attachment 340247 [details] Archive of layout-test-results from ews204 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews204 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit

Ryosuke Niwa

Comment 9 2018-05-21 16:28:41 PDT

Committed r232033: <https://trac.webkit.org/changeset/232033>

Ryosuke Niwa

Comment 10 2018-05-24 15:06:41 PDT

Committed r232166: <https://trac.webkit.org/changeset/232166>

Ryosuke Niwa

Comment 11 2018-05-24 20:21:30 PDT

Committed r232181: <https://trac.webkit.org/changeset/232181>

Note You need to log in before you can comment on or make changes to this bug.