Bug 69444 - The new run-webkit-tests needs to dump out pixel hash failures even if the pixel test passes.
Summary: The new run-webkit-tests needs to dump out pixel hash failures even if the pi...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Dirk Pranke
URL:
Keywords:
Depends on:
Blocks: 64491
  Show dependency treegraph
 
Reported: 2011-10-05 10:48 PDT by Dave Hyatt
Modified: 2012-02-17 17:01 PST (History)
7 users (show)

See Also:


Attachments
Patch (4.21 KB, patch)
2012-02-15 20:00 PST, Dirk Pranke
no flags Details | Formatted Diff | Diff
make the tests pass (7.65 KB, patch)
2012-02-15 20:44 PST, Dirk Pranke
no flags Details | Formatted Diff | Diff
ignore - wrong bug (7.16 KB, patch)
2012-02-15 20:46 PST, Dirk Pranke
no flags Details | Formatted Diff | Diff
change warning message to match ORWT (7.53 KB, patch)
2012-02-16 20:21 PST, Dirk Pranke
eric: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dave Hyatt 2011-10-05 10:48:10 PDT
To those of us running pixel tests, it's very important to be able to see when pixel hash failures occur, since they often indicate a legitimate failure that is going unnoticed. old-run-webkit-tests properly reports these pixel hash failures so that you can see if you caused a minute pixel test failure and fix it. The new tool does not, which is really painful for me. I'm having to stick with old-run-webkit-tests because of this issue.
Comment 1 Dirk Pranke 2011-10-05 14:27:32 PDT
I assume you're running on the apple mac port ... is this a problem because by default the apple mac port runs with a tolerance set and allows some amount of fuzziness to match? IIRC, this happens even if you specify --tolerance=0.0.

Also, to clarify, do you want to see the pixel hash mismatches on the console, in the results html file, or both?
Comment 2 Dave Hyatt 2011-10-27 15:19:16 PDT
Console would be sufficient for me.
Comment 3 Eric Seidel (no email) 2011-10-27 15:43:25 PDT
Interesting.  It sounds like we just want --tolerance=0.0 by default.  But it's also possible to spit out these errors to the console.  

I'm not sure I understand the current --tolerance vs. not decisions on various ports.
Comment 4 Dirk Pranke 2011-10-27 15:47:33 PDT
Note that we added the %age of the pixel mismatch in bug 67253, and as I noted in comment #1, using a tolerance of 0.0 may still mask some differences on the apple ports. 

It should be easy to note the hash mismatch in either the console or the results file.
Comment 5 Dirk Pranke 2012-02-15 19:49:52 PST
On a related note, currently, if you use the 'test' port, we have a test case failures/expected/checksum.html, where the checksum fails to match but the image matches; when you run rwt --platform test, this is reported as an 'unexpected pass', when it should be reported as an image hash mismatch
Comment 6 Dirk Pranke 2012-02-15 19:57:07 PST
Adding Ojan and Tony to this ... 

If memory serves, Chromium has always considered the case where the image hashes don't match but the images themselves do to be a passing case. I think we used to report a warning in the results.html file, but at some point we stopped doing that (either when we made the results file generated from the json data, or when we embedded the checksums into the image, I think).

The upshot is we currently ignore the problem, which seems wrong, but I'm not sure which is the right thing to do. I think we have four choices:

1) report a warning to the console but otherwise treat the test as passing across the board
2) report a warning to the console and include the warning in results.html / results.json, but consider the test as passing (similar to how we handle stderr output now)
3) consider the test as failing, but only if --tolerance is set to 0.0
4) consider the test as failing, regardless of what --tolerance is set to

In any situation, we'd probably need to change the results.html output to identify this case properly and not just confuse the user.

I have a patch that does #1 but only logs an error to the console that I will post, just to demonstrate some of the scope of work involved in fixing it.

I have no idea how often this will come up in practice so we probably don't want to overengineer this.
Comment 7 Dirk Pranke 2012-02-15 20:00:18 PST
Created attachment 127303 [details]
Patch
Comment 8 Dirk Pranke 2012-02-15 20:44:33 PST
Created attachment 127305 [details]
make the tests pass
Comment 9 Dirk Pranke 2012-02-15 20:46:33 PST
Created attachment 127306 [details]
ignore - wrong bug
Comment 10 Tony Chang 2012-02-16 10:12:32 PST
Comment on attachment 127305 [details]
make the tests pass

View in context: https://bugs.webkit.org/attachment.cgi?id=127305&action=review

We should just match old-run-webkit-tests behavior, which is to log to the console.

> Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:270
> +                _log.warning('%s had mismatched checksums but matching images (exp %s, got %s)' %
> +                            (self._test_name, expected_driver_output.image_hash, driver_output.image_hash))

Can we match ORWT's text?  If not, we should at least include the image diff percent.
Comment 11 Dirk Pranke 2012-02-16 11:41:54 PST
(In reply to comment #10)
> (From update of attachment 127305 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=127305&action=review
> 
> We should just match old-run-webkit-tests behavior, which is to log to the console.
> 
> > Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:270
> > +                _log.warning('%s had mismatched checksums but matching images (exp %s, got %s)' %
> > +                            (self._test_name, expected_driver_output.image_hash, driver_output.image_hash))
> 
> Can we match ORWT's text?  If not, we should at least include the image diff percent.

Good idea. I will take a look.
Comment 12 Dirk Pranke 2012-02-16 20:21:32 PST
Created attachment 127506 [details]
change warning message to match ORWT
Comment 13 Dirk Pranke 2012-02-16 20:22:46 PST
Output now is:

$ ./new-run-webkit-tests -p fast/html/keygen.html
fast/html/keygen.html -> pixel hash failed (but pixel test still passes)
All 1 tests ran as expected.

$

I would be happy to add the tolerance value to both NRWT and ORWT as well if desired.
Comment 14 Eric Seidel (no email) 2012-02-17 16:18:01 PST
Comment on attachment 127506 [details]
change warning message to match ORWT

OK.
Comment 15 Dirk Pranke 2012-02-17 17:01:39 PST
Committed r108144: <http://trac.webkit.org/changeset/108144>