Bug 229992

Summary: run-webkit-tests showing as failed tests which are already included in TestExpectations
Product: WebKit Reporter: Sergio Villar Senin <svillar>
Component: Tools / TestsAssignee: Nobody <webkit-unassigned>
Status: REOPENED ---    
Severity: Normal CC: ap, jbedard, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Demo none

Description Sergio Villar Senin 2021-09-07 04:09:41 PDT
For example I've just run:

$ Tools/Scripts/run-webkit-tests LayoutTests/imported/w3c/web-platform-tests/css/css-flexbox/

[...]

[204/1076] imported/w3c/web-platform-tests/css/css-flexbox/abspos/flex-abspos-staticpos-fallback-align-content-001.html failed (reference mismatc                                                                                                                                                 [231/1076] imported/w3c/web-platform-tests/css/css-flexbox/abspos/flex-abspos-staticpos-justify-content-vertWM-001.html failed (reference mismatc                                                                                                                                                 [233/1076] imported/w3c/web-platform-tests/css/css-flexbox/abspos/flex-abspos-staticpos-justify-content-vertWM-002.html failed (reference mismatc                                                                                                                                                 worker/1 stopping                                                                                                                  
worker/0 stopping                                                                                                                        
All 1076 tests ran as expected.

It's easy to see that the three of them are already in the general TestExpectations file marked as ImageOnlyFailure. They shouldn't be shown as failed while running the command.
Comment 1 Alexey Proskuryakov 2021-09-07 09:25:16 PDT
I think that this may be by design? It didn't say "failed unexpectedly", as it would if it didn't see the expectation.
Comment 2 Jonathan Bedard 2021-09-07 10:23:57 PDT
This is behaving correctly.

Test expectations change the expected result, if you want to make a test that doesn't match it's expected result actually pass, you need to change the expected result. It's  annoying that these terminologies are so similar, I like to call "expected results" "baselines" to reduce confusion.
Comment 3 Alexey Proskuryakov 2021-09-07 10:42:29 PDT
This gets ugly with reftests though - creating an -expected.html that matches incorrect rendering is harder than just landing expected results in -expected.txt. It would also create substantial confusing, because there isn't any indication of the -expected.html being a failure, like there is with "FAILED" lines in text tests.

So I think that there is an issue here, just not sure what to do about it. Maybe stop printing "failed" for expected failures completely? Those are somewhat relevant for flaky tests though, where one wants to run it thousands of times to see if the failure reproduces. Could consider only printing them in verbose mode.
Comment 4 Sergio Villar Senin 2021-09-08 09:00:52 PDT
I think I am not explaining myself correctly. When running the tests *without* the verbose parameter -v, then you don't get any failed result because each run overwrites the previous line.

What is happening here is that test that are expected to fail, do fail (which is correct) but print a new line in the output.
Comment 5 Sergio Villar Senin 2021-09-15 04:45:53 PDT
Created attachment 438237 [details]
Demo

OK I can give now more data and a 100% reproducible procedure to observe the issue.

The key here is the size, in particular the width, of the terminal in which the command is executed. If the terminal is wide enough then expected failures are not shown and everything runs smoothly. However if you reduce the width of the terminal below a given threshold (basically the test with the longest path name plus the "failed reference mismatch...") then expected failures are printed in a new line.

I'm attaching an screenshot that demonstrates the issue.
Comment 6 Sergio Villar Senin 2021-09-15 04:46:49 PDT
Reopening since it is indeed a bug as showcased in the attached screenshot and as explained in comment 5.
Comment 7 Radar WebKit Bug Importer 2021-09-15 04:47:17 PDT
<rdar://problem/83144603>
Comment 8 Jonathan Bedard 2021-09-15 08:59:17 PDT
(In reply to Sergio Villar Senin from comment #6)
> Reopening since it is indeed a bug as showcased in the attached screenshot
> and as explained in comment 5.

If you run with `--debug-rwt-logging`, are these lines printed?
Comment 9 Sergio Villar Senin 2021-09-15 11:49:48 PDT
(In reply to Jonathan Bedard from comment #8)
> (In reply to Sergio Villar Senin from comment #6)
> > Reopening since it is indeed a bug as showcased in the attached screenshot
> > and as explained in comment 5.
> 
> If you run with `--debug-rwt-logging`, are these lines printed?

Yeah sure, in that case the lines are printed for all the tests.

If you check the screenshot, I'm running the same command in 2 terminals. In the wider one nothing is printed as expected. The problem appears in the narrower were lines that fail and are long enough are printed when they should not.
Comment 10 Sergio Villar Senin 2021-09-16 02:53:34 PDT
Some more info, cannot reproduce it on MacOS, I can reproduce it in Linux though.

I'm starting to suspect that it might be a python bug in whatever package we use to paint the lines...
Comment 11 Jonathan Bedard 2021-09-16 08:07:07 PDT
(In reply to Sergio Villar Senin from comment #10)
> Some more info, cannot reproduce it on MacOS, I can reproduce it in Linux
> though.
> 
> I'm starting to suspect that it might be a python bug in whatever package we
> use to paint the lines...

That's definitely what it is, although it's our own code that does the line painting. Tools/Scripts/webkitpy/layout_tests/views/printing.py, I think, or Tools/Scripts/webkitpy/layout_tests/views/metered_stream.py