Bug 209006 - REGRESSION: [ Mac iOS wk2 ] imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/execution-timing/085.html is failing
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-03-12 11:03 PDT by Truitt Savell
Modified: 2020-04-22 13:26 PDT
Description Truitt Savell 2020-03-12 11:03:20 PDT
imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/execution-timing/085.html

This issue began recently and is very flaky. The diffs are the same on iOS and Mac.

Reproduce with:
run-webkit-tests --iterations 2000 --exit-after-n-failures 1 --no-retry --no-build -f imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/execution-timing/085.html

history:
https://results.webkit.org/?suite=layout-tests&test=imported%2Fw3c%2Fweb-platform-tests%2Fhtml%2Fsemantics%2Fscripting-1%2Fthe-script-element%2Fexecution-timing%2F085.html

Diff:
--- /Volumes/Data/slave/catalina-debug-tests-wk2/build/layout-test-results/imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/execution-timing/085-expected.txt
+++ /Volumes/Data/slave/catalina-debug-tests-wk2/build/layout-test-results/imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/execution-timing/085-actual.txt
@@ -1,4 +1,4 @@
 FAILED (This TC requires JavaScript enabled)
 
-PASS  scheduler: async script and slow-loading defer script 
+FAIL  scheduler: async script and slow-loading defer script assert_array_equals: property 0, expected "external script #2" but got "external script #1"
Comment 1 Radar WebKit Bug Importer 2020-03-12 11:22:20 PDT
<rdar://problem/60379091>
Comment 2 Truitt Savell 2020-03-12 11:26:13 PDT
This test was last updated in https://trac.webkit.org/changeset/249886/webkit
Comment 3 Truitt Savell 2020-03-12 11:27:09 PDT
The failure seems to reproduce no matter which revision I run the test on.
Comment 4 Truitt Savell 2020-03-12 11:27:48 PDT
History shows this started some time after r258250, but there is currently no way to narrow that down.
Comment 5 Truitt Savell 2020-03-12 11:35:11 PDT
Marked test as failing while it is investigated:
https://trac.webkit.org/changeset/258345/webkit
Comment 6 Alexey Proskuryakov 2020-03-12 17:08:22 PDT
This quite certainly is a regression from r258268.
Comment 7 Chris Dumez 2020-04-22 13:03:59 PDT
(In reply to Alexey Proskuryakov from comment #6)
> This quite certainly is a regression from r258268.

Flakiness is likely triggered by when the first paint actually happens since the script is delayed until first paint.
Comment 8 Chris Dumez 2020-04-22 13:21:22 PDT
(In reply to Chris Dumez from comment #7)
> (In reply to Alexey Proskuryakov from comment #6)
> > This quite certainly is a regression from r258268.
> 
> Flakiness is likely triggered by when the first paint actually happens since
> the script is delayed until first paint.

I think the test is inherently flaky. It has a slow loading defer script (1 second to load) and an async script. It expects the async script to always run first.
Comment 9 Chris Dumez 2020-04-22 13:26:02 PDT
(In reply to Chris Dumez from comment #8)
> (In reply to Chris Dumez from comment #7)
> > (In reply to Alexey Proskuryakov from comment #6)
> > > This quite certainly is a regression from r258268.
> > 
> > Flakiness is likely triggered by when the first paint actually happens since
> > the script is delayed until first paint.
> 
> I think the test is inherently flaky. It has a slow loading defer script (1
> second to load) and an async script. It expects the async script to always
> run first.

To be clear, there is no expectation that async scripts run before defer scripts. Async scripts run whenever they are done loading, which may happen before or after DOMContentLoaded. Defer scripts always run after DOMContentLoaded. This test uses a 1-second delay to try to make sure that the async script finishes loading before DOMContentLoaded, but this is bound to be flaky. It is definitely flakier now that we defer async scripts further in some cases.

I think marking this test as flaky is the right thing to do given that this is an upstream test.
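
For anyone following along, the race can be sketched with a toy timing model. This is only an illustration, not WebKit's scheduler: the function name, its parameters, and the millisecond values are all made up, and it assumes that the async script runs as soon as it has loaded (or at first paint if it is held back until then), while the defer script runs once it has loaded and parsing is done.

```python
def execution_order(async_loaded_ms, defer_loaded_ms, parsing_done_ms,
                    first_paint_ms=None):
    """Return the order in which the two scripts run in this toy model.

    All parameters are hypothetical timestamps in milliseconds.
    first_paint_ms models the case where the async script is held
    back until first paint; None means no such delay.
    """
    # Assumption: an async script runs as soon as it has finished loading,
    # unless it is additionally held back until first paint.
    async_run = async_loaded_ms
    if first_paint_ms is not None:
        async_run = max(async_run, first_paint_ms)

    # Assumption: a defer script runs once it has loaded and parsing is done.
    defer_run = max(defer_loaded_ms, parsing_done_ms)

    events = [(async_run, "async"), (defer_run, "defer")]
    return [name for _, name in sorted(events)]


# Order the test expects: the async script beats the slow defer script.
print(execution_order(async_loaded_ms=50, defer_loaded_ms=1000,
                      parsing_done_ms=20))

# Same load times, but the async script is held until a late first paint.
print(execution_order(async_loaded_ms=50, defer_loaded_ms=1000,
                      parsing_done_ms=20, first_paint_ms=1200))
```

In this model the order flips as soon as the async script's start time lands after the defer script's, which is why a fixed 1-second delay cannot guarantee the order the test asserts.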