80644 – Test expectations and tolerance interfere

NEW 80644

https://bugs.webkit.org/show_bug.cgi?id=80644

Tim Horton

Reported 2012-03-08 15:02:53 PST

Scenario:

Have some pixel tests which pass at the default tolerance, but fail at tolerance=0.

Run tests with tolerance=0, add the failing tests to test_expectations.txt as IMAGE failures.

Run tests with default tolerance, and the tests result in:

css3/filters/crash-filter-change.html -> pixel hash failed (but pixel test still passes)
css3/filters/crash-filter-change.html -> unexpected pass

This is kind of unfortunate. I'm not sure how one would solve it, though. Perhaps if pixel hash fails and it's an expected failure, don't PASS it?
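The interaction in the scenario can be modelled in a few lines. This is only a sketch under stated assumptions, not the test runner's actual code: `classify`, `checksum` and `percent_different` are hypothetical names, images are modelled as flat lists of byte values, and the tolerance of 2.0 percent is an arbitrary stand-in for "the default tolerance".

```python
import hashlib


def checksum(pixels):
    """Hash of the raw pixel bytes (stand-in for the image hash)."""
    return hashlib.md5(bytes(pixels)).hexdigest()


def percent_different(actual, expected):
    """Percentage of pixel values that differ (hypothetical metric)."""
    differing = sum(1 for a, b in zip(actual, expected) if a != b)
    return 100.0 * differing / len(expected)


def classify(actual, expected, tolerance, expected_outcomes):
    """Return (outcome, log lines) for one pixel test in this toy model."""
    lines = []
    if checksum(actual) == checksum(expected):
        image_ok = True
    else:
        # Hash mismatch: fall back to a fuzzy comparison against the tolerance.
        image_ok = percent_different(actual, expected) <= tolerance
        if image_ok:
            lines.append("pixel hash failed (but pixel test still passes)")
    outcome = "PASS" if image_ok else "IMAGE"
    if outcome not in expected_outcomes:
        lines.append("unexpected pass" if outcome == "PASS" else "unexpected failure")
    return outcome, lines


expected_image = [0] * 100
actual_image = [0] * 99 + [1]  # one value in 100 differs, i.e. 1 percent

# Step 1: tolerance=0 with no expectation listed: the test fails unexpectedly.
print(classify(actual_image, expected_image, 0, {"PASS"}))
# Step 2: same run after listing the test as an expected IMAGE failure.
print(classify(actual_image, expected_image, 0, {"IMAGE"}))
# Step 3: a looser tolerance with the IMAGE expectation still in place.
print(classify(actual_image, expected_image, 2.0, {"IMAGE"}))
```

In this model, step 3 produces both lines from the report: the hash mismatch is logged, the fuzzy comparison passes, and the pass conflicts with the IMAGE expectation.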

Dirk Pranke

Comment 1 2012-03-09 12:00:39 PST

(In reply to comment #0)
> Scenario:
>
> Have some pixel tests which pass at the default tolerance, but fail at tolerance=0.
>
> Run tests with tolerance=0, add the failing tests to test_expectations.txt as IMAGE failures.
>
> Run tests with default tolerance, and the tests result in:
>
> css3/filters/crash-filter-change.html -> pixel hash failed (but pixel test still passes)
> css3/filters/crash-filter-change.html -> unexpected pass
>
> This is kind of unfortunate. I'm not sure how one would solve it, though. Perhaps if pixel hash fails and it's an expected failure, don't PASS it?

I don't think I understand what you're looking for ... the difference between an expected failure and an unexpected failure is that the former doesn't cause the test run itself to fail (since the tree goes red only on unexpected failures).

In your example, do you want css3/filters/crash-filter-change.html to turn the tree red if the pixel hash doesn't match? If so, why list it as an expected IMAGE failure at all?

Perhaps we need some sort of expectation of fuzzy pass but exact failure?

Tim Horton

Comment 2 2012-03-09 13:07:15 PST

(In reply to comment #1)
> I don't think I understand what you're looking for ... the difference between an expected failure and an unexpected failure is that the former doesn't cause the test run itself to fail (since the tree goes red only on unexpected failures).
>
> In your example, do you want css3/filters/crash-filter-change.html to turn the tree red if the pixel hash doesn't match? If so, why list it as an expected IMAGE failure at all?
>
> Perhaps we need some sort of expectation of fuzzy pass but exact failure?

Yes, I think that's what we want. A fuzzy pass expectation (the overall result may be a pass because the tolerance can change, but the pixel hash will not pass).

Dirk Pranke

Comment 3 2012-03-09 13:40:32 PST

(In reply to comment #2)
> (In reply to comment #1)
> > I don't think I understand what you're looking for ... the difference between an expected failure and an unexpected failure is that the former doesn't cause the test run itself to fail (since the tree goes red only on unexpected failures).
> >
> > In your example, do you want css3/filters/crash-filter-change.html to turn the tree red if the pixel hash doesn't match? If so, why list it as an expected IMAGE failure at all?
> >
> > Perhaps we need some sort of expectation of fuzzy pass but exact failure?
>
> Yes, I think that's what we want. A fuzzy pass expectation (the overall result may be a pass because the tolerance can change, but the pixel hash will not pass).

Okay, so is it that you don't want the "unexpected pass" line to show up, and to just see the pixel hash failed/pixel test passes line?

Or do you want pixel hash failures to show up in results.html as well?

Or do you want the tests to actually fail and turn the step red?

Either of the first two is fine with me; I'm just trying to understand what to implement. If you want the third behavior, I'm still confused.

Tim Horton

Comment 4 2012-03-09 13:51:55 PST

(In reply to comment #3)
> Okay, so is it that you don't want the "unexpected pass" line to show up, and to just see the pixel hash failed/pixel test passes line?
>
> Or do you want pixel hash failures to show up in results.html as well?
>
> Or do you want the tests to actually fail and turn the step red?
>
> Either of the first two is fine with me; I'm just trying to understand what to implement. If you want the third behavior, I'm still confused.

Hmm, I think #1 is sufficient.

Dirk Pranke

Comment 5 2012-03-09 13:54:08 PST

(In reply to comment #4)
> Hmm, I think #1 is sufficient.

Okay, that's doable. I will mull over the best way to accomplish such a thing ...
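One way the agreed behavior (#1: keep the pixel hash line, drop the "unexpected pass" line) might look is sketched below. No patch is attached to this bug, so this is purely illustrative: `report_lines` and its parameters are hypothetical and do not correspond to any function in the test runner.

```python
PIXEL_HASH_NOTE = "pixel hash failed (but pixel test still passes)"


def report_lines(hash_matched, fuzzy_passed, expected_outcomes):
    """Return the log lines for one pixel test under option #1 (toy model)."""
    lines = []
    passed = hash_matched or fuzzy_passed
    outcome = "PASS" if passed else "IMAGE"

    if passed and not hash_matched:
        lines.append(PIXEL_HASH_NOTE)

    if outcome not in expected_outcomes:
        # Option #1: a pass that was only reached through the tolerance does
        # not contradict an expected IMAGE failure, so stay quiet about it.
        tolerated_pass = (
            outcome == "PASS" and not hash_matched and "IMAGE" in expected_outcomes
        )
        if not tolerated_pass:
            lines.append(
                "unexpected pass" if outcome == "PASS" else "unexpected failure"
            )
    return lines


# Hash mismatch, fuzzy pass, listed as IMAGE: only the hash note is printed.
print(report_lines(False, True, {"IMAGE"}))
# Exact hash match while listed as IMAGE: a genuine unexpected pass.
print(report_lines(True, True, {"IMAGE"}))
```

The design choice here is that an exact hash match on a test listed as IMAGE is still reported, since that case means the expectation is stale at every tolerance.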

Rafael Brandao

Comment 6 2012-07-20 11:26:04 PDT

Is this hash calculated automatically, or should we update it somehow once we change an expected image result? I can't find much about it, which is why I'm asking.

Dirk Pranke

Comment 7 2012-07-20 12:22:56 PDT

(In reply to comment #6)
> Is this hash calculated automatically or should we update it somehow once we change an expected image result? I can't find much about it, this is why I'm asking.

The hash is calculated automatically and embedded into the PNG in a TEXT comment field. You can view the hashes with the read-checksum-from-png script in Tools/Scripts.
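For readers who want to see what such an embedded comment looks like, here is a minimal sketch that walks the chunks of a PNG and collects its text comments. It assumes the hash is stored in a standard PNG `tEXt` chunk under the keyword `checksum`; the thread does not name the chunk type or keyword, so treat both as assumptions. This is not the read-checksum-from-png script, and `read_text_chunks` and `make_chunk` are hypothetical helpers.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(png_bytes):
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt data is: keyword, a NUL separator, then Latin-1 text.
            keyword, _, text = data.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 8 + length + 4
    return texts


def make_chunk(chunk_type, data):
    """Build one PNG chunk, used here only to construct a demo image."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


# Demo: a 1x1 grayscale PNG carrying a made-up 32-character hash.
sample_hash = "0123456789abcdef0123456789abcdef"
demo_png = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"tEXt", b"checksum\x00" + sample_hash.encode("latin-1"))
    + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
    + make_chunk(b"IEND", b"")
)
print(read_text_chunks(demo_png).get("checksum"))
```

To inspect a real expected image, the same function could be given the bytes of the file, for example `read_text_chunks(open(path, "rb").read())`.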

Status: NEW
Priority: P2
Severity: Normal
Classification: Unclassified
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Product: WebKit
Component: Tools / Tests
Assignee: Nobody
Reported: 2012-03-08 15:02 PST
Modified: 2012-07-20 12:22 PDT
CC List: 5 users
Keywords: NRWT