NEW 80644
Test expectations and tolerance interfere
https://bugs.webkit.org/show_bug.cgi?id=80644
Tim Horton
Reported 2012-03-08 15:02:53 PST
Scenario:

Have some pixel tests which pass at the default tolerance, but fail at tolerance=0.

Run tests with tolerance=0, and add the failing tests to test_expectations.txt as IMAGE failures.

Run tests with the default tolerance, and the tests result in:

css3/filters/crash-filter-change.html -> pixel hash failed (but pixel test still passes)
css3/filters/crash-filter-change.html -> unexpected pass

This is kind of unfortunate. I'm not sure how one would solve it, though. Perhaps if the pixel hash fails and it's an expected failure, don't PASS it?
Dirk Pranke
Comment 1 2012-03-09 12:00:39 PST
(In reply to comment #0)
> Scenario:
>
> Have some pixel tests which pass at the default tolerance, but fail at tolerance=0.
>
> Run tests with tolerance=0, and add the failing tests to test_expectations.txt as IMAGE failures.
>
> Run tests with the default tolerance, and the tests result in:
>
> css3/filters/crash-filter-change.html -> pixel hash failed (but pixel test still passes)
> css3/filters/crash-filter-change.html -> unexpected pass
>
> This is kind of unfortunate. I'm not sure how one would solve it, though. Perhaps if the pixel hash fails and it's an expected failure, don't PASS it?

I don't think I understand what you're looking for ... the difference between an expected failure and an unexpected failure is that the former doesn't cause the test run itself to fail (since the tree goes red only on unexpected failures).

In your example, do you want css3/filters/crash-filter-change.html to turn the tree red if the pixel hash doesn't match? If so, why list it as an expected IMAGE failure at all?

Perhaps we need some sort of expectation of fuzzy pass but exact failure?
Tim Horton
Comment 2 2012-03-09 13:07:15 PST
(In reply to comment #1)
> I don't think I understand what you're looking for ... the difference between an expected failure and an unexpected failure is that the former doesn't cause the test run itself to fail (since the tree goes red only on unexpected failures).
>
> In your example, do you want css3/filters/crash-filter-change.html to turn the tree red if the pixel hash doesn't match? If so, why list it as an expected IMAGE failure at all?
>
> Perhaps we need some sort of expectation of fuzzy pass but exact failure?

Yes, I think that's what we want. A fuzzy pass expectation (the overall result may be a pass because the tolerance can change, but the pixel hash will not pass).
Dirk Pranke
Comment 3 2012-03-09 13:40:32 PST
(In reply to comment #2)
> (In reply to comment #1)
> > I don't think I understand what you're looking for ... the difference between an expected failure and an unexpected failure is that the former doesn't cause the test run itself to fail (since the tree goes red only on unexpected failures).
> >
> > In your example, do you want css3/filters/crash-filter-change.html to turn the tree red if the pixel hash doesn't match? If so, why list it as an expected IMAGE failure at all?
> >
> > Perhaps we need some sort of expectation of fuzzy pass but exact failure?
>
> Yes, I think that's what we want. A fuzzy pass expectation (the overall result may be a pass because the tolerance can change, but the pixel hash will not pass).

Okay, so is it that you don't want the "unexpected pass" line to show up, and to just see the pixel hash failed/pixel test passes line?

Or do you want pixel hash failures to show up in results.html as well?

Or do you want the tests to actually fail and turn the step red?

Either of the first two is fine with me, I'm just trying to understand what to implement. If you want the third behavior, I'm still confused.
Tim Horton
Comment 4 2012-03-09 13:51:55 PST
(In reply to comment #3)
> Okay, so is it that you don't want the "unexpected pass" line to show up, and to just see the pixel hash failed/pixel test passes line?
>
> Or do you want pixel hash failures to show up in results.html as well?
>
> Or do you want the tests to actually fail and turn the step red?
>
> Either of the first two is fine with me, I'm just trying to understand what to implement. If you want the third behavior, I'm still confused.

Hmm, I think #1 is sufficient.
Dirk Pranke
Comment 5 2012-03-09 13:54:08 PST
(In reply to comment #4)
> Hmm, I think #1 is sufficient.

Okay, that's doable. I will mull over the best way to accomplish such a thing ...
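For readers following along, the behavior agreed on above (option #1: keep the "pixel hash failed" line, drop the "unexpected pass" line when an IMAGE failure is expected) could look roughly like the sketch below. This is not the harness's actual code. `PixelResult`, `actual_outcome`, `report_lines`, the `diff_percent` field, and the tolerance values are all hypothetical names and numbers chosen for illustration; only the two message formats come from the report.

```python
from dataclasses import dataclass

PASS = "PASS"
IMAGE = "IMAGE"


@dataclass
class PixelResult:
    """Hypothetical container for one pixel test's comparison data."""
    hash_matched: bool   # exact checksum comparison against the expected image
    diff_percent: float  # assumed: percentage difference reported by the image diff


def actual_outcome(result, tolerance):
    """A hash mismatch still counts as a pass if the diff is within tolerance."""
    if result.hash_matched or result.diff_percent <= tolerance:
        return PASS
    return IMAGE


def report_lines(test, result, tolerance, expected):
    """Return the lines to print for one test, applying option #1 from the thread."""
    lines = []
    outcome = actual_outcome(result, tolerance)
    fuzzy_pass = outcome == PASS and not result.hash_matched
    if fuzzy_pass:
        lines.append(f"{test} -> pixel hash failed (but pixel test still passes)")
    if outcome in expected:
        return lines
    if fuzzy_pass and IMAGE in expected:
        # Option #1: the hash mismatch already explains the listed IMAGE
        # failure, so do not also report an unexpected pass.
        return lines
    # Hypothetical wording for any other unexpected outcome.
    lines.append(f"{test} -> unexpected {outcome.lower()}")
    return lines
```

With this sketch, a test listed as an expected IMAGE failure that differs by 0.05 (in whatever unit the diff reports) prints only the "pixel hash failed" line at a tolerance of 0.1, prints nothing at tolerance 0, and still reports "unexpected pass" if its hash matches exactly.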
Rafael Brandao
Comment 6 2012-07-20 11:26:04 PDT
Is this hash calculated automatically, or should we update it somehow once we change an expected image result? I can't find much about it, which is why I'm asking.
Dirk Pranke
Comment 7 2012-07-20 12:22:56 PDT
(In reply to comment #6)
> Is this hash calculated automatically, or should we update it somehow once we change an expected image result? I can't find much about it, which is why I'm asking.

The hash is calculated automatically and embedded into the PNG in a TEXT comment field. You can view the hashes with the read-checksum-from-png script in Tools/Scripts.
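As a rough illustration of what "embedded into the PNG in a TEXT comment field" means, the sketch below walks the chunks of a PNG file and collects its tEXt entries. It is a minimal standalone example, not the read-checksum-from-png script itself; `read_text_chunks` is a hypothetical helper name, and the assumption that the hash is stored under the keyword `checksum` is mine and is not stated in this thread.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(data: bytes) -> dict:
    """Return the keyword/value pairs of all tEXt chunks in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    offset = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    while offset + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        body = data[offset + 8:offset + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt data is a Latin-1 keyword, a NUL separator, then the text.
            keyword, _, value = body.partition(b"\x00")
            texts[keyword.decode("latin-1")] = value.decode("latin-1")
        if chunk_type == b"IEND":
            break
        offset += 8 + length + 4  # skip header, data, and CRC
    return texts
```

Given the bytes of an expected-image file, `read_text_chunks(data).get("checksum")` would return the embedded hash if it is stored under that keyword, and `None` otherwise.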