Bug 186045

Summary: webkit-test-runner: Add support for the reftest-wait class name
Product: WebKit Reporter: Frédéric Wang (:fredw) <fred.wang>
Component: Tools / TestsAssignee: Darin Adler <darin>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, cdumez, changseok, clopez, darin, dbarton, emilio, esprehn+autocc, ews-watchlist, fred.wang, glenn, gsnedders, gyuyoung.kim, hector_i_lopez, jfernandez, koivisto, lforschler, Ms2ger, rego, rniwa, rwlbuis, simon.fraser, tsavell, webkit-bot-watchers-bugzilla, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
URL: https://web-platform-tests.org/writing-tests/reftests.html
See Also: https://bugs.webkit.org/show_bug.cgi?id=203477
https://bugs.webkit.org/show_bug.cgi?id=202859
https://bugs.webkit.org/show_bug.cgi?id=209649
https://bugs.webkit.org/show_bug.cgi?id=214892
https://bugs.webkit.org/show_bug.cgi?id=216397
https://bugs.webkit.org/show_bug.cgi?id=216428
https://bugs.webkit.org/show_bug.cgi?id=216440
Bug Depends on: 187095    
Bug Blocks: 207734, 170784, 186127, 203477    
Attachments:
Description Flags
Patch (move reftest_wait_0 to LayoutTests root)
none
Patch (proof of concept)
none
Patch (WIP)
none
Patch
none
Archive of layout-test-results from ews103 for mac-sierra
none
Archive of layout-test-results from ews104 for mac-sierra-wk2
none
Archive of layout-test-results from ews114 for mac-sierra
none
Archive of layout-test-results from ews201 for win-future
none
Archive of layout-test-results from ews121 for ios-simulator-wk2
none
Patch
ews-watchlist: commit-queue-
Archive of layout-test-results from ews101 for mac-sierra
none
Archive of layout-test-results from ews105 for mac-sierra-wk2
none
Archive of layout-test-results from ews121 for ios-simulator-wk2
none
Archive of layout-test-results from ews200 for win-future
none
Archive of layout-test-results from ews114 for mac-sierra
none
Patch
none
Patch
none
Archive of layout-test-results from ews206 for win-future
none
Patch
none
Patch
koivisto: review+
Patch for landing
ews-feeder: commit-queue-
Patch for landing none

Description Frédéric Wang (:fredw) 2018-05-28 23:34:16 PDT
From the WPT doc, reftest-wait is a mechanism similar to testRunner's waitUntilDone / notifyDone():

"Controlling When Comparison Occurs

By default reftest screenshots are taken after the load event has fired, and web fonts (if any) are loaded. In some cases it is necessary to delay the screenshot later than this, for example because some DOM manipulation is required to set up the desired test conditions. To enable this, the test may have a class="reftest-wait" attribute specified on the root element. This will cause the screenshot to be delayed until the load event has fired and the reftest-wait class has been removed from the root element. Note that in neither case is exact timing of the screenshot guaranteed: it is only guaranteed to be after those events."

The reftest-wait attributed is already used by the following tests imported into WebKit:

./FileAPI/url/url_xmlhttprequest_img.html
./css/css-grid/abspos/absolute-positioning-changing-containing-block-001.html
./css/css-grid/abspos/grid-item-absolute-positioning-dynamic-001.html
./css/css-grid/abspos/grid-positioned-item-dynamic-change-001.html
./css/css-ui/text-overflow-005.html
./css/css-ui/text-overflow-017.html
./css/css-ui/text-overflow-021.html
./css/mediaqueries/min-width-tables-001.html
./css/selectors/focus-within-001.html
./css/selectors/focus-within-002.html
./css/selectors/focus-within-003.html
./css/selectors/focus-within-004.html
./css/selectors/focus-within-005.html
./css/selectors/focus-within-006-expected.html
./css/selectors/focus-within-006.html
./css/selectors/focus-within-007.html
./css/selectors/focus-within-008.html
./css/selectors/focus-within-010.html
./css/selectors/focus-within-shadow-001.html
./css/selectors/focus-within-shadow-002.html
./css/selectors/focus-within-shadow-003.html
./css/selectors/focus-within-shadow-004.html
./css/selectors/focus-within-shadow-005.html
./css/selectors/focus-within-shadow-006.html
./css/selectors/selector-placeholder-shown-type-change-001.html
./css/selectors/selector-placeholder-shown-type-change-002.html
./css/selectors/selector-placeholder-shown-type-change-003.html
./css/selectors/selector-read-write-type-change-002.html
./css/selectors/selector-required-type-change-002.html
./mathml/relations/html5-tree/href-click-1.html
./mathml/relations/html5-tree/href-click-2.html

Probably, most of them pass because the expected final rendering happens fast enough. However, this is a bit fragile and might lead to flacky tests. Attached is a patch that moves WPT's reftest_wait_0 test so that one can run it with Tools/Scripts/run-webkit-tests reftest_wait_0.html. As indicated in the test description, it is expected to fail since: 

- reftest_wait_0.html is a doc with a red background but after 2 seconds the background color is changed to green (The end of the test is controlled via the reftest-wait attribute).
- reftest_wait_0-expected.html is a doc with a red background.

However, because webkit-test-runner does not support reftest-wait, the screenshot of reftest_wait_0.html is actually taken before the background color is turned into green.
Comment 1 Frédéric Wang (:fredw) 2018-05-28 23:35:11 PDT
Created attachment 341479 [details]
Patch (move reftest_wait_0 to LayoutTests root)
Comment 2 Frédéric Wang (:fredw) 2018-05-30 05:02:41 PDT
Created attachment 341562 [details]
Patch (proof of concept)

This patch shows how to emulate reftest-wait with some helper JS script that relies on testRunner's waitUntiDone/notifyDone. I wonder if we should make the test runner script inject such JS code or if class="reftest-wait" should be handled in any other way...
Comment 3 Frédéric Wang (:fredw) 2018-06-06 09:46:48 PDT
Created attachment 342053 [details]
Patch (WIP)
Comment 4 Frédéric Wang (:fredw) 2018-06-06 23:45:33 PDT
Created attachment 342129 [details]
Patch
Comment 5 EWS Watchlist 2018-06-07 00:11:31 PDT
Comment on attachment 342129 [details]
Patch

Attachment 342129 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/8048035

Number of test failures exceeded the failure limit.
Comment 6 EWS Watchlist 2018-06-07 00:11:32 PDT
Created attachment 342130 [details]
Archive of layout-test-results from ews103 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews103  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 7 EWS Watchlist 2018-06-07 00:37:31 PDT
Comment on attachment 342129 [details]
Patch

Attachment 342129 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/8048232

New failing tests:
compositing/debug-borders-dynamic.html
Comment 8 EWS Watchlist 2018-06-07 00:37:32 PDT
Created attachment 342134 [details]
Archive of layout-test-results from ews104 for mac-sierra-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-sierra-wk2  Platform: Mac OS X 10.12.6
Comment 9 EWS Watchlist 2018-06-07 00:38:42 PDT
Comment on attachment 342129 [details]
Patch

Attachment 342129 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/8048159

Number of test failures exceeded the failure limit.
Comment 10 EWS Watchlist 2018-06-07 00:38:43 PDT
Created attachment 342135 [details]
Archive of layout-test-results from ews114 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews114  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 11 EWS Watchlist 2018-06-07 01:08:15 PDT
Comment on attachment 342129 [details]
Patch

Attachment 342129 [details] did not pass win-ews (win):
Output: http://webkit-queues.webkit.org/results/8049058

Number of test failures exceeded the failure limit.
Comment 12 EWS Watchlist 2018-06-07 01:08:17 PDT
Created attachment 342137 [details]
Archive of layout-test-results from ews201 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews201  Port: win-future  Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Comment 13 EWS Watchlist 2018-06-07 01:28:57 PDT
Comment on attachment 342129 [details]
Patch

Attachment 342129 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/8048826

New failing tests:
http/tests/webfont/font-loading-system-fallback-visibility.html
http/tests/webfont/font-loading-system-fallback-visibility-FontRanges.html
Comment 14 EWS Watchlist 2018-06-07 01:28:59 PDT
Created attachment 342140 [details]
Archive of layout-test-results from ews121 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews121  Port: ios-simulator-wk2  Platform: Mac OS X 10.13.4
Comment 15 Frédéric Wang (:fredw) 2018-06-07 01:30:28 PDT
Created attachment 342141 [details]
Patch
Comment 16 EWS Watchlist 2018-06-07 02:18:01 PDT
Comment on attachment 342141 [details]
Patch

Attachment 342141 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/8050515

New failing tests:
testing/reftest-wait-long.html
testing/reftest-wait-other-class-names.html
Comment 17 EWS Watchlist 2018-06-07 02:18:03 PDT
Created attachment 342142 [details]
Archive of layout-test-results from ews101 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 18 EWS Watchlist 2018-06-07 02:21:17 PDT
Comment on attachment 342141 [details]
Patch

Attachment 342141 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/8050556

New failing tests:
compositing/debug-borders-dynamic.html
Comment 19 EWS Watchlist 2018-06-07 02:21:19 PDT
Created attachment 342143 [details]
Archive of layout-test-results from ews105 for mac-sierra-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews105  Port: mac-sierra-wk2  Platform: Mac OS X 10.12.6
Comment 20 EWS Watchlist 2018-06-07 02:54:16 PDT
Comment on attachment 342141 [details]
Patch

Attachment 342141 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/8050675

New failing tests:
http/tests/webfont/font-loading-system-fallback-visibility.html
http/tests/webfont/font-loading-system-fallback-visibility-FontRanges.html
Comment 21 EWS Watchlist 2018-06-07 02:54:18 PDT
Created attachment 342144 [details]
Archive of layout-test-results from ews121 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews121  Port: ios-simulator-wk2  Platform: Mac OS X 10.13.4
Comment 22 EWS Watchlist 2018-06-07 03:08:51 PDT
Comment on attachment 342141 [details]
Patch

Attachment 342141 [details] did not pass win-ews (win):
Output: http://webkit-queues.webkit.org/results/8051148

New failing tests:
testing/reftest-wait-long.html
testing/reftest-wait-other-class-names.html
Comment 23 EWS Watchlist 2018-06-07 03:09:02 PDT
Created attachment 342145 [details]
Archive of layout-test-results from ews200 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews200  Port: win-future  Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Comment 24 EWS Watchlist 2018-06-07 03:09:51 PDT
Comment on attachment 342141 [details]
Patch

Attachment 342141 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/8050924

New failing tests:
testing/reftest-wait-long.html
testing/reftest-wait-other-class-names.html
Comment 25 EWS Watchlist 2018-06-07 03:09:52 PDT
Created attachment 342146 [details]
Archive of layout-test-results from ews114 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews114  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 26 Frédéric Wang (:fredw) 2018-06-07 06:34:39 PDT
Created attachment 342154 [details]
Patch
Comment 27 Frédéric Wang (:fredw) 2018-06-07 07:25:41 PDT
Created attachment 342158 [details]
Patch
Comment 28 Ms2ger (he/him; ⌚ UTC+1/+2) 2018-06-07 07:55:18 PDT
Comment on attachment 342158 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=342158&action=review

> Tools/Scripts/webkitpy/port/driver.py:510
> +                command += "'--check-reftest-wait"

Something seems to be off about the quotes at the start here.
Comment 29 Frédéric Wang (:fredw) 2018-06-07 09:37:09 PDT
Comment on attachment 342158 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=342158&action=review

>> Tools/Scripts/webkitpy/port/driver.py:510
>> +                command += "'--check-reftest-wait"
> 
> Something seems to be off about the quotes at the start here.

The single quote is just a separator between arguments, it does not need to be closed (see the rest of the _command_from_driver_input function for more context).
Comment 30 youenn fablet 2018-06-07 10:29:12 PDT
Do you know how Chrome is handling ref-test?

If we have to handle various classes of documents, maybe it would be better to handle them in JS directly.
Like updating testharnessreport.js to handle these classes, or maybe a script dedicated to handle these classes.
This would require updating the tests to include that script (which would be specialized for each browser) so we would need to discuss this on WPT GitHub first.
Comment 31 Frédéric Wang (:fredw) 2018-06-07 10:45:01 PDT
(In reply to youenn fablet from comment #30)
> Do you know how Chrome is handling ref-test?
>

No, I have not checked. I would need to investigate that...

> If we have to handle various classes of documents, maybe it would be better
> to handle them in JS directly.
> Like updating testharnessreport.js to handle these classes, or maybe a
> script dedicated to handle these classes.
> This would require updating the tests to include that script (which would be
> specialized for each browser) so we would need to discuss this on WPT GitHub
> first.

Yeah, I had initially tried testharnessreport.js, however reftests generally don't include such scripts. Injecting the script in the TestRunner is maybe not the best approach but I'm not sure it will be better to have the import script insert a script or to have to modify all the WPT reftests that use reftest-wait...
Comment 32 Frédéric Wang (:fredw) 2018-06-07 11:06:50 PDT
(In reply to Frédéric Wang (:fredw) from comment #31)
> (In reply to youenn fablet from comment #30)
> > Do you know how Chrome is handling ref-test?
> >
> 
> No, I have not checked. I would need to investigate that...

Apparently, Chromium does a similar thing as my patch and injects a script to emulate reftest-wait support:

https://cs.chromium.org/chromium/src/content/shell/test_runner/test_runner.cc?rcl=383ea4e718310f7d214061c8081202cc2f2937c2&l=314

> > If we have to handle various classes of documents, maybe it would be better
> > to handle them in JS directly.

Just to clarify, I think "reftest-wait" is the only document class with a special semantics in WPT tests.
Comment 33 EWS Watchlist 2018-06-07 11:22:38 PDT
Comment on attachment 342158 [details]
Patch

Attachment 342158 [details] did not pass win-ews (win):
Output: http://webkit-queues.webkit.org/results/8059600

New failing tests:
http/tests/security/canvas-remote-read-remote-video-blocked-no-crossorigin.html
http/tests/preload/onload_event.html
Comment 34 EWS Watchlist 2018-06-07 11:22:49 PDT
Created attachment 342193 [details]
Archive of layout-test-results from ews206 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews206  Port: win-future  Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Comment 35 Ms2ger (he/him; ⌚ UTC+1/+2) 2018-06-08 03:00:27 PDT
We're definitely not requiring upstream that reftests must contain a script, so something along the lines of what's proposed here makes sense to me.
Comment 36 Frédéric Wang (:fredw) 2018-06-11 00:55:00 PDT
Comment on attachment 342158 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=342158&action=review

> Tools/WebKitTestRunner/InjectedBundle/InjectedBundle.cpp:426
> +    "})();";

I think Chromium's injected script has a few improvements over the one I proposed in attachment 342158 [details], so I think we can take it: https://cs.chromium.org/chromium/src/content/shell/test_runner/test_runner.cc?rcl=383ea4e718310f7d214061c8081202cc2f2937c2&l=314

In particular, Chromium and upstream WPT also wait for web fonts to be loaded (Chromium does not do it when reftest-wait is absent, see https://bugs.chromium.org/p/chromium/issues/detail?id=848804) so "checkReftestWait" is maybe not a good name. Note that document.font.ready is buggy in WebKit see bug 174030 and https://github.com/web-platform-tests/wpt/pull/10025
Comment 37 Frédéric Wang (:fredw) 2018-06-27 01:47:26 PDT
Comment on attachment 342158 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=342158&action=review

> LayoutTests/ChangeLog:11
> +        * http/wpt/reftest-wait/reftest-wait-events-expected.html: Added.

I noticed that WPT actually has tests for reftest-wait support, so probably we should import infrastructure/.
Comment 38 Ryosuke Niwa 2018-06-27 15:52:56 PDT
Ugh... this is so ugly. Why can't WPT define a specific JS function helper for browsers to hook in instead of relying on having some browser specific detection of the class name?
Comment 39 Ryosuke Niwa 2018-06-27 15:54:46 PDT
(In reply to Ms2ger from comment #35)
> We're definitely not requiring upstream that reftests must contain a script,
> so something along the lines of what's proposed here makes sense to me.

Why not? If the way the tests identify the time to record the actual result is removing the class name from the document element, why can't that code simply invoke a function instead? Given the amount of boilerplate WPT requires elsewhere, that seems to have a low cost.

I'm not at all excited to hard-code this insane ad-hoc convention WPT has into WTR or DRT.
Comment 40 Ryosuke Niwa 2018-06-27 16:12:24 PDT
Comment on attachment 342158 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=342158&action=review

>>> Tools/Scripts/webkitpy/port/driver.py:510
>>> +                command += "'--check-reftest-wait"
>> 
>> Something seems to be off about the quotes at the start here.
> 
> The single quote is just a separator between arguments, it does not need to be closed (see the rest of the _command_from_driver_input function for more context).

Why do we need to have a runtime option for this? Why can't we always check for this condition.

> Tools/WebKitTestRunner/InjectedBundle/InjectedBundle.cpp:404
> +static const char* checkReftestWaitScript =

I don't think it makes sense to implement a logic like this in JS if we're modifying WTR
since we already have a code to wait for sub-resources to load.
If we're going with the injected script route, we're much better off adding a generic mechanism to inject arbitrary JS into the test.
I really don't want WPT specific logic creeping into WTR or DRT. r- because of this.
Comment 41 Frédéric Wang (:fredw) 2018-06-28 01:29:00 PDT
Comment on attachment 342158 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=342158&action=review

>>>> Tools/Scripts/webkitpy/port/driver.py:510
>>>> +                command += "'--check-reftest-wait"
>>> 
>>> Something seems to be off about the quotes at the start here.
>> 
>> The single quote is just a separator between arguments, it does not need to be closed (see the rest of the _command_from_driver_input function for more context).
> 
> Why do we need to have a runtime option for this? Why can't we always check for this condition.

Not sure I understand your question, but this --check-reftest-wait command is passed to the WTR and DRT binaries. I'm not sure they know whether we are executing a reftest or whether a test is a WPT, unless you suggest to copy the logic into WTR and DRT too.

>> Tools/WebKitTestRunner/InjectedBundle/InjectedBundle.cpp:404
>> +static const char* checkReftestWaitScript =
> 
> I don't think it makes sense to implement a logic like this in JS if we're modifying WTR
> since we already have a code to wait for sub-resources to load.
> If we're going with the injected script route, we're much better off adding a generic mechanism to inject arbitrary JS into the test.
> I really don't want WPT specific logic creeping into WTR or DRT. r- because of this.

I also didn't like this approach, it was mostly a first attempt to open discussion however I was surprised that Chromium people actually did the same thing and supported this approach in a mail exchanges Youenn and I had with them. One argument in favor of this is that the sync logic is nontrivial, apparently one needs to properly check that web fonts are ready before taking the screenshot. This is needed even for tests without reftest-wait class, so that is one more argument in favor of a generic injection mechanism (another argument is that the same code would probably need to be injected in DRT too).
Comment 42 Frédéric Wang (:fredw) 2018-06-28 01:49:09 PDT
(In reply to Ryosuke Niwa from comment #38)
> Ugh... this is so ugly. Why can't WPT define a specific JS function helper
> for browsers to hook in instead of relying on having some browser specific
> detection of the class name?

(In reply to Ryosuke Niwa from comment #39)
> (In reply to Ms2ger from comment #35)
> > We're definitely not requiring upstream that reftests must contain a script,
> > so something along the lines of what's proposed here makes sense to me.
> 
> Why not? If the way the tests identify the time to record the actual result
> is removing the class name from the document element, why can't that code
> simply invoke a function instead? Given the amount of boilerplate WPT
> requires elsewhere, that seems to have a low cost.

Regarding this reftest-wait class, I see tree options:
1) Implement support in WebKit
2) Implement a conversion mechanism in the script importer/exporter (reftest-wait <-> wait/done) 
3) Convince upstream WPT to implement a new wait/done mechanism (and update tests)

(2) seems a bit fragile. We already have import/export issue with reftests, so I don't really want to make that worse.

reftest-wait is already a well-established convention inherited from initial 2007 implementation in Mozilla and is used everywhere in upstream WPT, so I doubt (3) will be easy, but we can try.

(1) is what I tried here and that's what Chromium people decided too and they seem to be happy with that.

We've already discussed this by private emails with Youenn and some Google/WPT people. Maybe we should open a public discussion with the rest of the browser vendors and WPT authors...
Comment 43 Carlos Alberto Lopez Perez 2019-10-10 06:52:42 PDT
I have been hit by this on bug 202783. 

This is affecting a lot of tests already, many of them incorrectly marked.

$ grep -rl reftest-wait LayoutTests/|grep html$|wc -l
118
Comment 44 Antti Koivisto 2019-11-22 03:30:13 PST
Would be nice to get this landed. I'm running into reftest-wait tests I'd like to enable.
Comment 45 Simon Fraser (smfr) 2020-01-05 09:53:31 PST
Ditto. I'd like imported/w3c/web-platform-tests/css/css-display/display-none-inline-img.html to work.
Comment 46 Frédéric Wang (:fredw) 2020-02-04 07:00:28 PST
@Simon @Antti @Carlos Just for the record, I'm not working on this as we didn't have an agreement on the way to proceed. I think we should restart discussion on blink-dev. The status is basically what I said in comment 42.
Comment 47 Frédéric Wang (:fredw) 2020-02-04 07:04:09 PST
(In reply to Frédéric Wang (:fredw) from comment #46)
> discussion on blink-dev.

webkit-dev sorry -- multi-browser work :-)
Comment 48 Radar WebKit Bug Importer 2020-02-18 11:24:41 PST
<rdar://problem/59558132>
Comment 49 Hector Lopez 2020-08-24 10:02:45 PDT
imported/w3c/web-platform-tests/css/css-display/display-none-inline-img.html

Test expectation changed to ImageOnlyFailure

Test expectation changed:

https://trac.webkit.org/changeset/266065/webkit
Comment 50 Darin Adler 2020-09-04 15:27:26 PDT
This test is marked as failing with this bug number:

imported/w3c/web-platform-tests/css/css-animations/animation-delay-010.html

But it’s passing for me. What’s the actual situation?
Comment 51 Alexey Proskuryakov 2020-09-04 16:42:07 PDT
Looks like it's failing pretty consistently on macOS, and flaky on iOS:

https://results.webkit.org/?suite=layout-tests&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-animations%2Fanimation-delay-010.html
Comment 52 Darin Adler 2020-09-04 16:59:22 PDT
Passed for me on Big Sur.
Comment 53 Sam Sneddon [:gsnedders] 2020-09-07 09:47:49 PDT
(In reply to Darin Adler from comment #50)
> This test is marked as failing with this bug number:
> 
> imported/w3c/web-platform-tests/css/css-animations/animation-delay-010.html
> 
> But it’s passing for me. What’s the actual situation?

On the whole we should expect anything affected by this to be flaky.
Comment 54 Darin Adler 2020-09-07 09:50:35 PDT
If true, then there should be a flakiness expectation rather than a failure expectation.
Comment 55 Darin Adler 2020-09-07 12:22:45 PDT
Planning to work on this today.
Comment 56 Darin Adler 2020-09-07 15:46:50 PDT Comment hidden (obsolete)
Comment 57 Darin Adler 2020-09-07 16:10:32 PDT
Created attachment 408201 [details]
Patch
Comment 58 Antti Koivisto 2020-09-08 01:06:24 PDT
Comment on attachment 408201 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408201&action=review

> Tools/WebKitTestRunner/InjectedBundle/InjectedBundlePage.cpp:1995
> -    if (injectedBundle.pageCount())
> -        injectedBundle.page()->dump();
> -    else
> +    if (!injectedBundle.pageCount()) {
>          injectedBundle.done();
> +        return;
> +    }
> +
> +    pollForRefTestWaitAttributeRemoval();

This reads poorly. Dumping is the important thing, reftest-wait is a minor feature required by a few tests.

Maybe call it 'dumpRespectingRefTestWait' or similar? (same for the other runners)

> LayoutTests/ChangeLog:9
> +        * TestExpectations: Expect a pass on the test that directly tests this feature.
> +        There are multiple other tests that should now be passing.

What's the plan for enabling them?
Comment 59 Alexey Proskuryakov 2020-09-08 09:31:26 PDT
> What's the plan for enabling them?

Someone needs to find potentially affected tests by grepping for reftest-wait, and to check result history a day or two after this lands. Perhaps bot watchers could?
Comment 60 Darin Adler 2020-09-08 09:56:22 PDT
Comment on attachment 408201 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408201&action=review

Currently have a power outage at my house so I can’t do a new version of the patch yet. Will figure out how this is affecting the tests that are failing on the bots and upload a new version that fixes the tests (more likely) or the machinery (less likely).

>> Tools/WebKitTestRunner/InjectedBundle/InjectedBundlePage.cpp:1995
>> +    pollForRefTestWaitAttributeRemoval();
> 
> This reads poorly. Dumping is the important thing, reftest-wait is a minor feature required by a few tests.
> 
> Maybe call it 'dumpRespectingRefTestWait' or similar? (same for the other runners)

I’ll rename.

The tricky thing about this naming is that the function is both the thing you call to start polling, and the function used by the polling cycle. So one name ends up serving for both. I emphasized the "poll" half because it’s what explains the strange arguments and function type signature.

>> LayoutTests/ChangeLog:9
>> +        There are multiple other tests that should now be passing.
> 
> What's the plan for enabling them?

I think Alexey’s plan makes sense. It’s easy to grep for the affected tests.
Comment 61 Antti Koivisto 2020-09-08 10:47:45 PDT
> The tricky thing about this naming is that the function is both the thing
> you call to start polling, and the function used by the polling cycle. So
> one name ends up serving for both. I emphasized the "poll" half because it’s
> what explains the strange arguments and function type signature.

One option is to keep the common case dump() in here and call pollForRefTestWaitAttributeRemoval() only when the attribute is present. Slightly more complicated than needed but also very readable.
Comment 62 Darin Adler 2020-09-08 10:59:28 PDT
(In reply to Antti Koivisto from comment #61)
> > The tricky thing about this naming is that the function is both the thing
> > you call to start polling, and the function used by the polling cycle. So
> > one name ends up serving for both. I emphasized the "poll" half because it’s
> > what explains the strange arguments and function type signature.
> 
> One option is to keep the common case dump() in here and call
> pollForRefTestWaitAttributeRemoval() only when the attribute is present.
> Slightly more complicated than needed but also very readable.

With threads it would be:

    waitUntilRefTestWaitIsRemoved();
    dump();

I’d like to write code like that. Maybe:

    afterRefTestWaitIsRemoved([] {
        dump();
    });
Comment 63 Darin Adler 2020-09-08 20:24:54 PDT
Renamed it to dumpAfterWaitAttributeIsRemoved.
Comment 64 Darin Adler 2020-09-09 13:34:32 PDT Comment hidden (obsolete)
Comment 65 Darin Adler 2020-09-09 16:59:13 PDT
Created attachment 408385 [details]
Patch for landing
Comment 66 EWS 2020-09-10 04:29:27 PDT
Committed r266817: <https://trac.webkit.org/changeset/266817>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 408385 [details].
Comment 67 Truitt Savell 2020-09-10 08:57:48 PDT
It looks like the changes in https://trac.webkit.org/changeset/266817/webkit

have broken 5 tests for all Mac:
  imported/w3c/web-platform-tests/css/css-ui/text-overflow-017.html [ Timeout ]
  imported/w3c/web-platform-tests/css/css-values/vh_not_refreshing_on_chrome.html [ Timeout ]
  imported/w3c/web-platform-tests/html/rendering/replaced-elements/embedded-content/cross-domain-iframe.sub.html [ Timeout ]
  imported/w3c/web-platform-tests/html/semantics/forms/the-option-element/dynamic-content-change-rendering.html [ Timeout ]
  imported/w3c/web-platform-tests/html/semantics/forms/the-select-element/reset-algorithm-rendering.html [ Timeout ]

and 1 test for Mac wk1:
  imported/w3c/web-platform-tests/css/css-position/fixed-z-index-blend.html [ ImageOnlyFailure ]

Diff:
https://build.webkit.org/results/Apple-Catalina-Release-WK1-Tests/r266817%20(8910)/imported/w3c/web-platform-tests/css/css-position/fixed-z-index-blend-diffs.html

History:
https://results.webkit.org/?suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&suite=layout-tests&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-position%2Ffixed-z-index-blend.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-ui%2Ftext-overflow-017.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fcss%2Fcss-values%2Fvh_not_refreshing_on_chrome.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fhtml%2Frendering%2Freplaced-elements%2Fembedded-content%2Fcross-domain-iframe.sub.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fhtml%2Fsemantics%2Fforms%2Fthe-option-element%2Fdynamic-content-change-rendering.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fhtml%2Fsemantics%2Fforms%2Fthe-select-element%2Freset-algorithm-rendering.html

The timeouts appear to have been constantly failing tests before this change.
Comment 68 Alexey Proskuryakov 2020-09-10 09:53:22 PDT
This is likely OK, and needs expectations update (skipping for timeouts). I looked at text-overflow-017.html, and it appears to be a bad test that requires a manual click - unless there is some other magical machinery that clicks elements automagically.

EWS has the design choice that it doesn't run tests with failure expectations at all, so it cannot catch it when something changes from failure to timeout or to crash.
Comment 69 Sam Sneddon [:gsnedders] 2020-09-10 10:14:20 PDT
(In reply to Alexey Proskuryakov from comment #68)
> This is likely OK, and needs expectations update (skipping for timeouts). I
> looked at text-overflow-017.html, and it appears to be a bad test that
> requires a manual click - unless there is some other magical machinery that
> clicks elements automagically.
> 
> EWS has the design choice that it doesn't run tests with failure
> expectations at all, so it cannot catch it when something changes from
> failure to timeout or to crash.

I too only looked at text-overflow-017.html, but I'll note it has a <meta name=flags content=interact> (see https://web-platform-tests.org/writing-tests/css-metadata.html?highlight=interact) so ideally we'd be treating it as a manual test.

We (read: me) should really look into https://github.com/web-platform-tests/wpt/issues/5381 again now; commented there that we should probably do something soon which means we can then more easily filter out tests that WPT considers manual.
Comment 70 Truitt Savell 2020-09-10 10:17:15 PDT
Skipped the 5 Timeouts in https://trac.webkit.org/changeset/266831/webkit
Comment 71 Alexey Proskuryakov 2020-09-11 12:41:43 PDT
I looked at another test that turned into a timeout, and it's interesting.

imported/w3c/web-platform-tests/html/semantics/forms/the-option-element/dynamic-content-change-rendering.html 

It completes after a TestRendered event fires, but there is nothing in the test that does that. It's fired by web-platform-tests/tools/wptrunner/wptrunner/executors/test-wait.js.

Which I guess means that for a complete reftest-wait implementation, we need that in DRT/WKTR?
Comment 72 Alexey Proskuryakov 2020-09-11 12:45:37 PDT
Okay, why not look at all of them.

Most tests that had to be skipped use TestRendered. One appears to fail because we don't/cannot implement multiple test domains the way WPT wants to, and one appears to be just a failure.
Comment 73 Darin Adler 2020-09-11 15:26:18 PDT
TestRendered event? Well it’s easy to fire an event; happy to add the code for that. Where is that documented?
Comment 74 Simon Fraser (smfr) 2020-09-11 15:33:42 PDT
https://web-platform-tests.org/writing-tests/reftests.html
Comment 75 Darin Adler 2020-09-11 18:10:03 PDT
(In reply to Alexey Proskuryakov from comment #72)
> Most tests that had to be skipped use TestRendered. One appears to fail
> because we don't/cannot implement multiple test domains the way WPT wants
> to, and one appears to be just a failure.

I only see 3 tests in the tree that are using TestRendered. Are there more than 3?
Comment 76 Alexey Proskuryakov 2020-09-11 20:29:52 PDT
I think that's it as far as ones that we imported go. I think that all three are in the list that needed to be skipped (either macOS, or iOS, or both).
Comment 77 Frédéric Wang (:fredw) 2020-09-14 05:33:28 PDT
(In reply to Simon Fraser (smfr) from comment #74)
> https://web-platform-tests.org/writing-tests/reftests.html

Apparently crashtests have a similar attribute but it's called test-wait:

https://web-platform-tests.org/writing-tests/crashtest.html
Comment 78 Darin Adler 2020-09-14 11:03:54 PDT
While I am. irritated to hear that, sounds like a "too many cooks" situation, it’s trivial to make the tools check for both.

I don’t want a mess of modes and options, so I’d prefer to just implement both unconditionally unless there is an obvious reason we can’t. For the same reason, I made the code always send the TestRendered event, not conditionally.

Are we running crash tests in LayoutTests right now? Is there one that is crashing or flaky because of the lack of "test-wait"? Could someone file a new bug with the details; it is so easy to fix this. (Sadly need to rename source files that I named ReftestFunctions.h/cpp.)
Comment 79 Frédéric Wang (:fredw) 2020-09-14 11:22:17 PDT
(In reply to Darin Adler from comment #78)
> While I am. irritated to hear that, sounds like a "too many cooks"
> situation, it’s trivial to make the tools check for both.
> 
> I don’t want a mess of modes and options, so I’d prefer to just implement
> both unconditionally unless there is an obvious reason we can’t. For the
> same reason, I made the code always send the TestRendered event, not
> conditionally.
> 
> Are we running crash tests in LayoutTests right now? Is there one that is
> crashing or flaky because of the lack of "test-wait"? Could someone file a
> new bug with the details; it is so easy to fix this. (Sadly need to rename
> source files that I named ReftestFunctions.h/cpp.)

AFAIK, it's a new feature. I just learned about it because I was writing my first WPT crashtest today in Chromium... Maybe someone who is more up-to-date about WPT development can give more explanation (Sam, Youenn?). Also cc'in Emilio who presented this at the last year web engines hackfest.

I don't know if we import/run crashtests in WebKit, but my understanding is that ignoring class="test-wait" would just make the crashtest exit early and so we could potential miss crashes.
Comment 80 Alexey Proskuryakov 2020-09-14 13:36:26 PDT
We don't have support for tests crashing without causing bad effects (instability due to system load; trouble collecting crash logs).
Comment 81 Hector Lopez 2020-09-14 17:58:25 PDT
imported/w3c/web-platform-tests/css/css-ui/text-overflow-017.html
imported/w3c/web-platform-tests/css/css-values/vh_not_refreshing_on_chrome.html

Test were noted on this bug as a constant timeouts. Test expectations to skip were made to Mac but also a constant timeout on iOS.

History:

https://results.webkit.org/?suite=layout-tests&suite=layout-tests&test=imported%2Fw3c%2Fweb-platform-tests%2Fhtml%2Fsemantics%2Fforms%2Fthe-option-element%2Fdynamic-content-change-rendering.html&test=imported%2Fw3c%2Fweb-platform-tests%2Fhtml%2Fsemantics%2Fforms%2Fthe-select-element%2Freset-algorithm-rendering.html

Test expectations skipped for iOS:

https://trac.webkit.org/changeset/267061/webkit
Comment 82 Emilio Cobos Álvarez (:emilio) 2020-09-15 00:31:00 PDT
(In reply to Darin Adler from comment #78)
> While I am. irritated to hear that, sounds like a "too many cooks"
> situation, it’s trivial to make the tools check for both.
> 
> I don’t want a mess of modes and options, so I’d prefer to just implement
> both unconditionally unless there is an obvious reason we can’t. For the
> same reason, I made the code always send the TestRendered event, not
> conditionally.

Yes, that seems reasonable. I'm not aware of the details of why the chosen class is different (reftest-wait predates crashtests, and my guess is that someone complained that crashtests aren't _technically_ reftests so the new class was born, but...).

(In reply to Alexey Proskuryakov from comment #80)
> We don't have support for tests crashing without causing bad effects
> (instability due to system load; trouble collecting crash logs).

Well, the point is that those tests shouldn't crash, if they do it's a bug to fix in WebKit :). I guess skipping them until the bug is fixed if it causes problems and enabling them when the problem is fixed seems the easiest path forward there.
Comment 83 Darin Adler 2020-09-15 11:56:29 PDT
Lets stop discussing "test-wait" here. Someone should file another bug to track adding it. It’s very easy to add support, but I prefer not to do it until we have at least one use of it in a test.