For example I've just run:

$ Tools/Scripts/run-webkit-tests LayoutTests/imported/w3c/web-platform-tests/css/css-flexbox/
[...]
[204/1076] imported/w3c/web-platform-tests/css/css-flexbox/abspos/flex-abspos-staticpos-fallback-align-content-001.html failed (reference mismatc
[231/1076] imported/w3c/web-platform-tests/css/css-flexbox/abspos/flex-abspos-staticpos-justify-content-vertWM-001.html failed (reference mismatc
[233/1076] imported/w3c/web-platform-tests/css/css-flexbox/abspos/flex-abspos-staticpos-justify-content-vertWM-002.html failed (reference mismatc
worker/1 stopping
worker/0 stopping
All 1076 tests ran as expected.

It's easy to see that the three of them are already in the general TestExpectations file, marked as ImageOnlyFailure. They shouldn't be shown as failed while running the command.
I think that this may be by design? It didn't say "failed unexpectedly", as it would if it didn't see the expectation.
This is behaving correctly. Test expectations change the expected result; if you want to make a test that doesn't match its expected result actually pass, you need to change the expected result. It's annoying that these terms are so similar, which is why I like to call "expected results" "baselines" to reduce confusion.
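For concreteness, the kind of TestExpectations entry being discussed looks roughly like this (the bug URL is just a placeholder; I haven't checked what the real entries for these tests say):

    webkit.org/b/123456 imported/w3c/web-platform-tests/css/css-flexbox/abspos/flex-abspos-staticpos-fallback-align-content-001.html [ ImageOnlyFailure ]

The [ ImageOnlyFailure ] annotation changes what run-webkit-tests treats as the expected outcome; the -expected.txt/-expected.html files next to the test are the baselines.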
This gets ugly with reftests though - creating an -expected.html that matches incorrect rendering is harder than just landing expected results in -expected.txt. It would also create substantial confusion, because there isn't any indication of the -expected.html being a failure, like there is with "FAILED" lines in text tests. So I think that there is an issue here, I'm just not sure what to do about it. Maybe stop printing "failed" for expected failures completely? Those are somewhat relevant for flaky tests though, where one wants to run them thousands of times to see if the failure reproduces. We could consider only printing them in verbose mode.
I think I am not explaining myself correctly. When running the tests *without* the verbose parameter -v, you don't normally get any failed result printed, because each new status line overwrites the previous one. What is happening here is that tests that are expected to fail do fail (which is correct) but nevertheless print a new line in the output.
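To illustrate what I mean by "overwrites the previous line", here is a minimal plain-Python sketch of the general technique (not webkitpy's actual code):

    import sys
    import time

    previous_length = 0
    for index, name in enumerate(['short-test.html', 'a/much/longer/test/path/name.html']):
        status = '[%d/2] %s passed' % (index + 1, name)
        # Pad with spaces so a shorter status fully erases the previous one,
        # then rewind to column 0 with '\r' instead of printing a newline.
        padding = ' ' * max(0, previous_length - len(status))
        sys.stdout.write('\r' + status + padding)
        sys.stdout.flush()
        previous_length = len(status)
        time.sleep(0.5)
    sys.stdout.write('\n')

Unexpected results are meant to end up on a permanent line of their own; expected ones should only ever live on that transient, overwritten line.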
Created attachment 438237 (Demo)

OK, I can now give more data and a 100% reproducible procedure to observe the issue. The key here is the size, in particular the width, of the terminal in which the command is executed. If the terminal is wide enough then expected failures are not shown and everything runs smoothly. However, if you reduce the width of the terminal below a given threshold (basically the length of the longest test path plus the "failed (reference mismatch...)" suffix), then expected failures are printed on a new line. I'm attaching a screenshot that demonstrates the issue.
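In case it helps to reproduce this, a quick way to compare the current terminal width against the length of one of the offending status lines (using the first test from comment 0; the exact text run-webkit-tests prints after the test name may differ slightly):

    import shutil

    status = ('[204/1076] imported/w3c/web-platform-tests/css/css-flexbox/abspos/'
              'flex-abspos-staticpos-fallback-align-content-001.html failed (reference mismatch)')
    columns = shutil.get_terminal_size().columns
    print('terminal columns: %d, status line length: %d' % (columns, len(status)))
    print('line would wrap: %s' % (len(status) > columns))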
Reopening since it is indeed a bug as showcased in the attached screenshot and as explained in comment 5.
<rdar://problem/83144603>
(In reply to Sergio Villar Senin from comment #6)
> Reopening since it is indeed a bug as showcased in the attached screenshot
> and as explained in comment 5.

If you run with `--debug-rwt-logging`, are these lines printed?
(In reply to Jonathan Bedard from comment #8)
> (In reply to Sergio Villar Senin from comment #6)
> > Reopening since it is indeed a bug as showcased in the attached screenshot
> > and as explained in comment 5.
>
> If you run with `--debug-rwt-logging`, are these lines printed?

Yeah, sure; in that case the lines are printed for all the tests. If you check the screenshot, I'm running the same command in two terminals. In the wider one nothing is printed, as expected. The problem appears in the narrower one, where lines for tests that fail and are long enough are printed when they should not be.
Some more info: I cannot reproduce it on macOS, but I can reproduce it on Linux.

I'm starting to suspect that it might be a Python bug in whatever package we use to paint the lines...
(In reply to Sergio Villar Senin from comment #10)
> Some more info: I cannot reproduce it on macOS, but I can reproduce it on Linux.
>
> I'm starting to suspect that it might be a Python bug in whatever package we
> use to paint the lines...

That's definitely what it is, although it's our own code that does the line painting: Tools/Scripts/webkitpy/layout_tests/views/printing.py, I think, or Tools/Scripts/webkitpy/layout_tests/views/metered_stream.py.
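Assuming the meter works by writing '\r' plus the status line (like the overwrite sketch above), a line wider than the terminal wraps onto a second row and the carriage return only rewinds to the start of that last row, which would explain why narrow terminals leave these lines behind. A possible fix, sketched here outside the real code since I haven't checked how metered_stream.py is actually structured, would be to clamp the line to the terminal width before writing it:

    import shutil
    import sys

    def write_metered_line(line):
        # Clamp the status line to the terminal width so it can never wrap,
        # which keeps the '\r'-based overwriting working on narrow terminals.
        columns = shutil.get_terminal_size(fallback=(80, 24)).columns
        if len(line) > columns:
            line = line[:columns - 1]
        sys.stdout.write('\r' + line)
        sys.stdout.flush()

The function name and placement are illustrative only; the real change would go wherever printing.py or metered_stream.py emit the line that is meant to be overwritten.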