168221 – Nwtr unexpectedly passes mismatch ref test if the hashes doesn't match but no diff

CLOSED INVALID 168221

Nwtr unexpectedly passes mismatch ref test if the hashes doesn't match but no diff

https://bugs.webkit.org/show_bug.cgi?id=168221

Summary Nwtr unexpectedly passes mismatch ref test if the hashes doesn't match but no...

Fujii Hironori

Reported 2017-02-12 22:08:46 PST

Nwtr unexpectedly passes mismatch ref test if the hashes doesn't match but no diff There is a logic error in comparing mismatch ref test images. In mismatch ref test images, it should be failed if two images has no diff. But, if the hashes are different, nwtr unexpectedly pass the mismatch ref test. It is difficult to test this bug. First, you need to find a *match* ref test which does not match the hashes. For example, in GTK port, I found a following test is the case: > fast/css/sticky/sticky-left-percentage.html -> ref test hashes didn't match but diff passed Next, convert this match ref test to a mismatch ref test: > mv LayoutTests/fast/css/sticky/sticky-left-percentage-expected{,-mismatch}.html And, run the test. In this time, it should fail: > $ ./Tools/Scripts/run-webkit-tests --gtk --release --no-new-test-results fast/css/sticky/sticky-left-percentage.html -v But, passed: > [1/1] fast/css/sticky/sticky-left-percentage.html passed

Attachments
Patch (2.61 KB, patch) 2017-02-12 22:21 PST, Fujii Hironori	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Fujii Hironori

Comment 1 2017-02-12 22:21:28 PST

Created attachment 301332 [details] Patch

Alexey Proskuryakov

Comment 2 2017-02-13 10:06:48 PST

Comment on attachment 301332 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=301332&action=review > Tools/ChangeLog:3 > + Nwtr unexpectedly passes mismatch ref test if the hashes doesn't match but no diff Why does the logic for expected mismatch run ImageDiff in this case? I'd expect the test to pass as soon as there is a hash mismatch.

Ryosuke Niwa

Comment 3 2017-02-13 11:35:03 PST

Comment on attachment 301332 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=301332&action=review >> Tools/ChangeLog:3 >> + Nwtr unexpectedly passes mismatch ref test if the hashes doesn't match but no diff > > Why does the logic for expected mismatch run ImageDiff in this case? > > I'd expect the test to pass as soon as there is a hash mismatch. The problem is that sometimes two images look identical even though hashes are different. e.g. due to anti-aliasing differences on Mac when a layer is composed.

WebKit Commit Bot

Comment 4 2017-02-13 12:00:15 PST

Comment on attachment 301332 [details] Patch Clearing flags on attachment: 301332 Committed r212237: <http://trac.webkit.org/changeset/212237>

WebKit Commit Bot

Comment 5 2017-02-13 12:00:22 PST

All reviewed patches have been landed. Closing bug.

Alexey Proskuryakov

Comment 6 2017-02-13 12:02:34 PST

> The problem is that sometimes two images look identical even though hashes are different. > e.g. due to anti-aliasing differences on Mac when a layer is composed. I understand this, but that doesn't answer my question. I still think that this patch is not the right approach to the issue.

Ryosuke Niwa

Comment 7 2017-02-13 12:08:38 PST

(In reply to comment #6) > > The problem is that sometimes two images look identical even though hashes are different. > > e.g. due to anti-aliasing differences on Mac when a layer is composed. > > I understand this, but that doesn't answer my question. I still think that > this patch is not the right approach to the issue. What do you think will be the right fix?

Alexey Proskuryakov

Comment 8 2017-02-13 12:26:13 PST

I think that we should report pass as soon as there is a hash difference. We already know that there is some mismatch at this point, and it's not the tool's responsibility to make guesses about how much of a difference is really a difference.

Ryosuke Niwa

Comment 9 2017-02-13 13:38:29 PST

(In reply to comment #8) > I think that we should report pass as soon as there is a hash difference. We > already know that there is some mismatch at this point, and it's not the > tool's responsibility to make guesses about how much of a difference is > really a difference. The problem is that tests with mismatch expectation MUST fail when two images look the same, and hashes don't seem to be a good indicator of whether they're the same or not since many of the existing ref tests with match expectation only pass under image-diff, meaning that many tests have different hashes from their respective expected results. Conversely, it is then possible for many tests with mismatch expectation to have different hashes from expected results yet look identical (failing condition).

Alexey Proskuryakov

Comment 10 2017-02-13 14:22:57 PST

I don't think that mismatch is about looking the same. For example, if you have a mismatch test that confirms that two different forms of antialiasing are not identical, the two results will look almost the same. Do you have a specific counter example where running an image diff would be helpful? We don't have that many "failed hash but passed diff" tests to serve as a generic counter example.

Fujii Hironori

Comment 11 2017-02-13 16:58:08 PST

I don't understand your question correctly. I have a question. Do you mean you are against using ImageDiff only for mismatch ref tests, even though you think it's ok to use ImageDiff for match ref test?

Fujii Hironori

Comment 12 2017-02-13 21:05:16 PST

There were some ideas suggested here: https://lists.webkit.org/pipermail/webkit-dev/2015-January/027179.html * Add ImageDiff or ImageHashMismatch expectation * Expose a new internals or testRunner methods to mark a test such

Fujii Hironori

Comment 13 2017-02-13 22:42:05 PST

Bug 94704 – REGRESSION(r126189): Reftest mismatches are (again) run through ImageDiff with 0.1 tolerance tolerance=0.1 is given to pixel tests, and tolerance=0 for ref tests. Pixels should be equal in ref tests.

Ryosuke Niwa

Comment 14 2017-02-13 22:52:19 PST

(In reply to comment #13) > Bug 94704 – REGRESSION(r126189): Reftest mismatches are (again) run > through ImageDiff with 0.1 tolerance > > tolerance=0.1 is given to pixel tests, and tolerance=0 for ref tests. > Pixels should be equal in ref tests. Wait, if that's the case how could this code path ever be useful?

Fujii Hironori

Comment 15 2017-02-13 23:09:23 PST

I had a big mistake. I was using fast/css/sticky/sticky-left-percentage.html for testing (comment 0). But this test has other problem (bug 168033 comment 7).

Fujii Hironori

Comment 16 2017-02-13 23:11:56 PST

(In reply to comment #14) > Wait, if that's the case how could this code path ever be useful? That's right. Calling diff_image with tolerance=0 is useless. Checking checksum seems enough.

Alexey Proskuryakov

Comment 17 2017-02-13 23:17:28 PST

Note that there is some tolerance forced by ImageDiffCG.cpp even when tolerance=0 is requested. That may be not easy to clean up.

Fujii Hironori

Comment 18 2017-02-13 23:20:21 PST

(In reply to comment #17) > Note that there is some tolerance forced by ImageDiffCG.cpp even when > tolerance=0 is requested. That may be not easy to clean up. I can not find such code in ImageDiffCG.cpp. Could you tell me which code?

Alexey Proskuryakov

Comment 19 2017-02-13 23:56:01 PST

Distances lower than 1/255 are ignored: if (distance >= 1.0f / 255.0f) { count += 1.0f; sum += distance; if (distance > maxDistance) maxDistance = distance; }

Fujii Hironori

Comment 20 2017-02-14 01:11:25 PST

(In reply to comment #19) > Distances lower than 1/255 are ignored: > > if (distance >= 1.0f / 255.0f) { > count += 1.0f; > sum += distance; > if (distance > maxDistance) > maxDistance = distance; > } Thank you so much. This color distance threshold is needed for match ref tests, but for mismatch ref tests at the moment. Should I stop calling image_diff and just do checksum checking for mismatch ref tests?

Fujii Hironori

Comment 21 2017-02-14 02:10:29 PST

(In reply to comment #20) > Should I stop calling image_diff and just do checksum checking > for mismatch ref tests? I checked some mismatch ref tests to see how do they expect. Most of them expect substantial mismatch. But, a following test expects tiny mismatch. I understand Alexey's comment 10. css3/filters/drop-shadow-blur-radius.html Original code before my change <http://trac.webkit.org/changeset/212237> was actually just doing checksum checking for mismatch ref tests. So, I prefer rolling back r212237 and closing this bug as invalid. I'll remove calling image_diff for mismatch ref tests in other bug. I don't have a permission to roll back. Could you roll back r212237?

WebKit Commit Bot

Comment 22 2017-02-14 02:48:45 PST

Re-opened since this is blocked by bug 168298

Fujii Hironori

Comment 23 2017-02-14 05:24:17 PST

Thank you, Ryosuke. Closed this bug as INVALID.

Alexey Proskuryakov

Comment 24 2017-02-14 09:21:21 PST

I do think that there is a bug here though, as you've demonstrated how the logic is clearly wrong. It just seems better to not run ImageDiff in this case.

Simon Fraser (smfr)

Comment 25 2017-02-14 11:01:57 PST

There's still a valid need to have ref tests that are a close, but not exact match (for example, software-rendered clipping vs. GPU-rendered clipping).

Fujii Hironori

Comment 26 2017-02-14 16:14:39 PST

(In reply to comment #24) > I do think that there is a bug here though, as you've demonstrated how the > logic is clearly wrong. It just seems better to not run ImageDiff in this > case. I agree and I'll do it (comment 21).

Fujii Hironori

Comment 27 2017-02-14 16:16:49 PST

(In reply to comment #25) > There's still a valid need to have ref tests that are a close, but not exact > match (for example, software-rendered clipping vs. GPU-rendered clipping). Right. If someone will add a new mismatch ref test which needs the color distance threshold, it will be the counter example which Alexey requested (comment 10).

Note You need to log in before you can comment on or make changes to this bug.

Status CLOSED

Resolution INVALID

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

Fujii Hironori

Reported

2017-02-12 22:08 PST

Modified

2017-02-14 16:16 PST History

CC List

6 users Show

URL

Keywords

Depends on

168298

Blocks

168033

Dependencies

tree graph