WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
168033
Nwtr ignores ImageDiff's errors for ref test
https://bugs.webkit.org/show_bug.cgi?id=168033
Summary
Nwtr ignores ImageDiff's errors for ref test
Fujii Hironori
Reported
2017-02-08 19:58:55 PST
nrwt: Reftests pass unexpectedly in case of ImageDiff's crash
> fast/css/cascade/box-shadow-and-webkit-box-shadow-cascade-order.html -> ref test hashes didn't match but diff passed
On the other hand, pixel test fails and reports "ImageDiff crashed" as expected.
> fast/css/background-clip-radius-values.html : ImageDiff crashed >[1/1] fast/css/background-clip-radius-values.html failed unexpectedly (image diff)
Attachments
Patch
(3.75 KB, patch)
2017-02-08 20:23 PST
,
Fujii Hironori
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews117 for mac-elcapitan
(1.62 MB, application/zip)
2017-02-08 21:33 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from ews105 for mac-elcapitan-wk2
(976.81 KB, application/zip)
2017-02-08 21:38 PST
,
Build Bot
no flags
Details
Patch
(7.10 KB, patch)
2017-02-09 00:57 PST
,
Fujii Hironori
no flags
Details
Formatted Diff
Diff
Patch
(3.46 KB, patch)
2017-02-15 00:46 PST
,
Fujii Hironori
no flags
Details
Formatted Diff
Diff
Patch
(3.46 KB, patch)
2017-02-15 01:04 PST
,
Fujii Hironori
no flags
Details
Formatted Diff
Diff
Patch
(5.61 KB, patch)
2017-02-15 01:07 PST
,
Fujii Hironori
no flags
Details
Formatted Diff
Diff
Patch
(5.67 KB, patch)
2017-02-15 20:16 PST
,
Fujii Hironori
no flags
Details
Formatted Diff
Diff
Patch
(4.55 KB, patch)
2017-03-07 00:54 PST
,
Fujii Hironori
no flags
Details
Formatted Diff
Diff
Patch
(4.63 KB, patch)
2017-03-12 22:32 PDT
,
Fujii Hironori
no flags
Details
Formatted Diff
Diff
Show Obsolete
(9)
View All
Add attachment
proposed patch, testcase, etc.
Fujii Hironori
Comment 1
2017-02-08 20:23:33 PST
Created
attachment 301004
[details]
Patch
Build Bot
Comment 2
2017-02-08 21:33:44 PST
Comment on
attachment 301004
[details]
Patch
Attachment 301004
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/3030141
New failing tests: compositing/transitions/transform-on-large-layer.html fast/css/sticky/sticky-left-percentage.html
Build Bot
Comment 3
2017-02-08 21:33:47 PST
Created
attachment 301013
[details]
Archive of layout-test-results from ews117 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews117 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 4
2017-02-08 21:38:03 PST
Comment on
attachment 301004
[details]
Patch
Attachment 301004
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/3030161
New failing tests: compositing/transitions/transform-on-large-layer.html fast/css/sticky/sticky-left-percentage.html
Build Bot
Comment 5
2017-02-08 21:38:06 PST
Created
attachment 301015
[details]
Archive of layout-test-results from ews105 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews105 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6
Alexey Proskuryakov
Comment 6
2017-02-08 22:03:46 PST
Comment on
attachment 301004
[details]
Patch Looks like this breaks tests. But also, why does ImageDiff crash? In a crazy situation like this, it's probably appropriate to terminate run-webkit-tests with an exception.
Fujii Hironori
Comment 7
2017-02-08 23:33:49 PST
(In reply to
comment #4
)
> New failing tests: > compositing/transitions/transform-on-large-layer.html > fast/css/sticky/sticky-left-percentage.html
This change discovers problems which have been hidden by false negative.
> compositing/transitions/transform-on-large-layer.html : ImageDiff produced stderr output: > Error: test and reference images have different sizes. Test image is 800x600, reference image is 1600x977 > [1/10] compositing/transitions/transform-on-large-layer.html failed unexpectedly (reference didn't generate pixel results.) > fast/css/sticky/sticky-left-percentage.html : ImageDiff produced stderr output: > Error: test and reference images have different sizes. Test image is 800x700, reference image is 800x600 > [2/10] fast/css/sticky/sticky-left-percentage.html failed unexpectedly (reference didn't generate pixel results.)
Fujii Hironori
Comment 8
2017-02-08 23:54:16 PST
(In reply to
comment #6
)
> But also, why does ImageDiff crash? In a crazy situation like this, it's > probably appropriate to terminate run-webkit-tests with an exception.
My ImageDiff crashed just because mis-configuration of my system (
Bug 168036 comment 1
). But, this change isn't only for ImageDiff's crash. Nwtr hides tree types of other problems at the moment: * ImageDiff produced stderr output * ImageDiff timed out * ImageDiff crashed
https://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/port/image_diff.py?rev=211931#L97
Fujii Hironori
Comment 9
2017-02-09 00:06:08 PST
Nwtr ignores ImageDiff's errors only for ref test. Nwtr checks the err_str of pixel test.
Fujii Hironori
Comment 10
2017-02-09 00:57:13 PST
Created
attachment 301022
[details]
Patch
Alexey Proskuryakov
Comment 11
2017-02-10 16:52:08 PST
Comment on
attachment 301022
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=301022&action=review
> Tools/ChangeLog:3 > + Nwtr ignores ImageDiff's errors for ref test
Looking at the patch, but it's taking me some effort to understand all that it changes. In particular, I'm still not sure what the problems with these two tests were. What exactly makes them fail now? They clearly don't crash or hang. When I run them locally, I get a warning saying "ref test hashes didn't match but diff passed", but many more tests produce this warning, so what is the difference with those? compositing/masks/compositing-clip-path.html -> ref test hashes didn't match but diff passed compositing/overlap-blending/children-opacity-huge.html -> ref test hashes didn't match but diff passed compositing/overlap-blending/children-opacity-no-overlap.html -> ref test hashes didn't match but diff passed compositing/overlap-blending/nested-non-overlap-clipping.html -> ref test hashes didn't match but diff passed compositing/transitions/transform-on-large-layer.html -> ref test hashes didn't match but diff passed css3/filters/backdrop/backdrop-filter-with-border-radius-and-reflection-add.html -> ref test hashes didn't match but diff passed css3/filters/backdrop/backdrop-filter-with-border-radius-and-reflection.html -> ref test hashes didn't match but diff passed css3/filters/filters-on-svg-root.html -> ref test hashes didn't match but diff passed
> LayoutTests/ChangeLog:13 > + Set a tiny margin-top to #sandbox not to overwrap with a preceding > + text.
Why do we need it to overlap?
Alexey Proskuryakov
Comment 12
2017-02-10 16:54:12 PST
Comment on
attachment 301022
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=301022&action=review
>> LayoutTests/ChangeLog:13 >> + text. > > Why do we need it to overlap?
I meant, why do we need it to not overlap?
Ryosuke Niwa
Comment 13
2017-02-10 17:07:20 PST
Comment on
attachment 301022
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=301022&action=review
> Tools/ChangeLog:9 > + expected. Bacause _compare_image checks returned err_str of
We use a single space after period.
> Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:339 > diff_result = self._port.diff_image(reference_driver_output.image, actual_driver_output.image, tolerance=0) > - if not diff_result[0]: > - failures.append(test_failures.FailureReftestMismatchDidNotOccur(reference_filename)) > + err_str = diff_result[2] > + if err_str: > + _log.warning(' %s : %s' % (self._test_name, err_str)) > + failures.append(test_failures.FailureReftestNoImagesGenerated(reference_filename))
Can we extract this as a helper method with _ prefix so that we don't have to repeat the same code twice?
> Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:343 > else: > - _log.warning(" %s -> ref test hashes matched but diff failed" % self._test_name) > + if not diff_result[0]: > + failures.append(test_failures.FailureReftestMismatchDidNotOccur(reference_filename))
Why don't we use elif here?
> Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:355 > else: > - _log.warning(" %s -> ref test hashes didn't match but diff passed" % self._test_name) > + if diff_result[0]:
Ditto.
> LayoutTests/ChangeLog:9 > + ImageDiff reports errors for different image sizes of expected and > + actual image of ref tests. They should have same window size.
Can we make this change in a separate patch?
Fujii Hironori
Comment 14
2017-02-12 20:38:15 PST
Thank you for reviewing. (In reply to
comment #11
)
> Comment on
attachment 301022
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=301022&action=review
> > > Tools/ChangeLog:3 > > + Nwtr ignores ImageDiff's errors for ref test > > Looking at the patch, but it's taking me some effort to understand all that > it changes. > > In particular, I'm still not sure what the problems with these two tests > were. What exactly makes them fail now? They clearly don't crash or hang. > When I run them locally, I get a warning saying "ref test hashes didn't > match but diff passed", but many more tests produce this warning, so what is > the difference with those? > > compositing/masks/compositing-clip-path.html -> ref test hashes didn't > match but diff passed > compositing/overlap-blending/children-opacity-huge.html -> ref test hashes > didn't match but diff passed > compositing/overlap-blending/children-opacity-no-overlap.html -> ref test > hashes didn't match but diff passed > compositing/overlap-blending/nested-non-overlap-clipping.html -> ref test > hashes didn't match but diff passed > compositing/transitions/transform-on-large-layer.html -> ref test hashes > didn't match but diff passed > > css3/filters/backdrop/backdrop-filter-with-border-radius-and-reflection-add. > html -> ref test hashes didn't match but diff passed > > css3/filters/backdrop/backdrop-filter-with-border-radius-and-reflection.html > -> ref test hashes didn't match but diff passed > css3/filters/filters-on-svg-root.html -> ref test hashes didn't match but > diff passed
You didn't apply my patch. It's difficult to know what's an actual failure without my patch because nwtr doesn't report ImageDiff's error. You can see the ImageDiff's error message by applying out patch (
Comment 7
):
> Error: test and reference images have different sizes. Test image is 800x600, reference image is 1600x977
And, ImageDiff doen't check any pixels if the comparing two image have different sizes. So, it can not fail the test by modifying the text in -expected.html.
Fujii Hironori
Comment 15
2017-02-12 22:37:49 PST
(In reply to
comment #12
)
> Comment on
attachment 301022
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=301022&action=review
> > >> LayoutTests/ChangeLog:13 > >> + text. > > > > Why do we need it to overlap? > > I meant, why do we need it to not overlap?
It caused pixel diffs while I was runnning nwtr of GTK port. I thought it was caused by z-index specifed in the test. But, I check the test again, there is no such z-index specifed, but just translate3d. This is my fault. I need to investigate the failure of GTK port. Thank you for pointing out.
Fujii Hironori
Comment 16
2017-02-12 22:51:01 PST
Thank you for reviewing. (In reply to
comment #13
)
> Comment on
attachment 301022
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=301022&action=review
> > > Tools/ChangeLog:9 > > + expected. Bacause _compare_image checks returned err_str of > > We use a single space after period.
Oh, this is Emacs default behavior of fill-paragraph. I changed my config (sentence-end-double-space) now.
> > Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:339 > > diff_result = self._port.diff_image(reference_driver_output.image, actual_driver_output.image, tolerance=0) > > - if not diff_result[0]: > > - failures.append(test_failures.FailureReftestMismatchDidNotOccur(reference_filename)) > > + err_str = diff_result[2] > > + if err_str: > > + _log.warning(' %s : %s' % (self._test_name, err_str)) > > + failures.append(test_failures.FailureReftestNoImagesGenerated(reference_filename)) > > Can we extract this as a helper method with _ prefix so that we don't have > to repeat the same code twice?
I found another problem (
Bug 168221
). The login of match and mismatch ref tests are similar but same. I think it's difficult. I'll revisit this bug after finishing
Bug 168221
.
> > Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:343 > > else: > > - _log.warning(" %s -> ref test hashes matched but diff failed" % self._test_name) > > + if not diff_result[0]: > > + failures.append(test_failures.FailureReftestMismatchDidNotOccur(reference_filename)) > > Why don't we use elif here? > > > Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:355 > > else: > > - _log.warning(" %s -> ref test hashes didn't match but diff passed" % self._test_name) > > + if diff_result[0]: > > Ditto.
I just copied the code of pixel tests. You're right. I'll revise.
> > LayoutTests/ChangeLog:9 > > + ImageDiff reports errors for different image sizes of expected and > > + actual image of ref tests. They should have same window size. > > Can we make this change in a separate patch?
That's good idea. Filed (
Bug 168217
and
Bug 168218
).
Alexey Proskuryakov
Comment 17
2017-02-12 23:14:35 PST
> You didn't apply my patch. It's difficult to know what's an > actual failure without my patch because nwtr doesn't report > ImageDiff's error.
I don't think that this answers my question. Can you explain what is different about the two tests that you modified in this patch, in contrast with the others that currently have the same output?
> And, ImageDiff doen't check any pixels if the comparing two image have different sizes.
Are you sure? My recollection is that is does, at least on the Mac.
Fujii Hironori
Comment 18
2017-02-13 00:09:59 PST
(In reply to
comment #17
)
> > You didn't apply my patch. It's difficult to know what's an > > actual failure without my patch because nwtr doesn't report > > ImageDiff's error. > > I don't think that this answers my question. Can you explain what is > different about the two tests that you modified in this patch, in contrast > with the others that currently have the same output?
The two tests call window.resize, but others don't. Is this an answer to your question? I think I don't understand your question.
> > And, ImageDiff doen't check any pixels if the comparing two image have different sizes. > > Are you sure? My recollection is that is does, at least on the Mac.
I tested in my mac mini now. My result is different with yours.
> fujii@fujihiro-mac-mini $ ./Tools/Scripts/run-webkit-tests compositing/transitions/transform-on-large-layer.html > Using port 'mac-elcapitan-wk2' > Test configuration: <elcapitan, x86_64, release> > Placing test results in /Users/fujii/work/webkit/g1/WebKitBuild/Release/layout-test-results > Baseline search path: platform/mac-wk2 -> platform/wk2 -> platform/mac-elcapitan -> platform/mac -> generic > Using Release build > Pixel tests disabled > Regular timeout: 30000, slow test timeout: 150000 > Command line: /Users/fujii/work/webkit/g1/WebKitBuild/Release/WebKitTestRunner - > > --lint-test-files warnings: > LayoutTests/TestExpectations:384 Path does not exist. imported/w3c/web-platform-tests/tools > > Found 1 test; running 1, skipping 0. > > Running 1 test > > Starting helper ...2017-02-13 00:00:03.439 LayoutTestHelper[28724:1359551] Could not determine current color profile, so it will not be reset after running the tests. Running 1 WebKitTestRunner. > > compositing/transitions/transform-on-large-layer.html -> ref test hashes didn't match but diff passed > The test ran as expected. > > fujii@fujihiro-mac-mini $ sed -ie 's/turns green/XXX/' LayoutTests/compositing/transitions/transform-on-large-layer-expected.html > fujii@fujihiro-mac-mini $ ./Tools/Scripts/run-webkit-tests compositing/transitions/transform-on-large-layer.html > Using port 'mac-elcapitan-wk2' > Test configuration: <elcapitan, x86_64, release> > Placing test results in /Users/fujii/work/webkit/g1/WebKitBuild/Release/layout-test-results > Baseline search path: platform/mac-wk2 -> platform/wk2 -> platform/mac-elcapitan -> platform/mac -> generic > Using Release build > Pixel tests disabled > Regular timeout: 30000, slow test timeout: 150000 > Command line: /Users/fujii/work/webkit/g1/WebKitBuild/Release/WebKitTestRunner - > > --lint-test-files warnings: > LayoutTests/TestExpectations:384 Path does not exist. imported/w3c/web-platform-tests/tools > > Found 1 test; running 1, skipping 0. > > Running 1 test > > Starting helper ...2017-02-13 00:02:06.395 LayoutTestHelper[28882:1360579] Could not determine current color profile, so it will not be reset after running the tests. Running 1 WebKitTestRunner. > > compositing/transitions/transform-on-large-layer.html -> ref test hashes didn't match but diff passed > The test ran as expected. > > fujii@fujihiro-mac-mini $
Here is a source code of ImageDiff of Mac port.
https://trac.webkit.org/browser/trunk/Tools/DumpRenderTree/cg/ImageDiffCG.cpp?rev=198559#L216
It doesn't check any pixels if the sizes are different.
Fujii Hironori
Comment 19
2017-02-13 18:41:54 PST
I discovered old discussions. This problem was discussed two years ago. [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
https://lists.webkit.org/pipermail/webkit-dev/2015-January/027166.html
Fujii Hironori
Comment 20
2017-02-15 00:46:44 PST
Created
attachment 301589
[details]
Patch
Fujii Hironori
Comment 21
2017-02-15 01:04:31 PST
Created
attachment 301593
[details]
Patch
Fujii Hironori
Comment 22
2017-02-15 01:07:24 PST
Created
attachment 301594
[details]
Patch
Fujii Hironori
Comment 23
2017-02-15 20:16:40 PST
Created
attachment 301694
[details]
Patch
Fujii Hironori
Comment 24
2017-02-27 18:52:45 PST
Comment on
attachment 301593
[details]
Patch I've finished fixing depending bugs, please review again.
Ryosuke Niwa
Comment 25
2017-03-06 19:14:11 PST
Comment on
attachment 301593
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=301593&action=review
> Tools/ChangeLog:8 > + On the other hand, pixel test fails and reports the error as
It's strange to start your log by saying "on the other hand" given you haven't stated what on one hand happens.
> Tools/ChangeLog:9 > + expected. Bacause _compare_image checks returned err_str of
Typo: B"a"cause.
> Tools/ChangeLog:16 > +
Please also explain the change you're making to expected match case.
> Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:340 > + err_str = diff_result[2]
Please spell out "error" instead of abbreviating it as "err".
Fujii Hironori
Comment 26
2017-03-07 00:54:44 PST
Created
attachment 303635
[details]
Patch
Alexey Proskuryakov
Comment 27
2017-03-09 16:13:51 PST
Comment on
attachment 303635
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=303635&action=review
Looks good to me. Can automated tests be added for this fix?
> Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:276 > + error = diff_result[2]
I think that error_string would be a better name, there are too many things that an "error" can contain.
> Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:344 > + actual_driver_output.error = (actual_driver_output.error or '') + error
I don't see any other place where we set actual_driver_output.error. Why is this needed?
Fujii Hironori
Comment 28
2017-03-12 22:15:00 PDT
Comment on
attachment 303635
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=303635&action=review
>> Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:276 >> + error = diff_result[2] > > I think that error_string would be a better name, there are too many things that an "error" can contain.
I agree. I'll revise.
>> Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py:344 >> + actual_driver_output.error = (actual_driver_output.error or '') + error > > I don't see any other place where we set actual_driver_output.error. Why is this needed?
actual_driver_output is the return value of self._driver.run_test. Then, actual_driver_output.error may contain a error of executing WTR for actual case.
Fujii Hironori
Comment 29
2017-03-12 22:23:23 PDT
(In reply to
comment #27
)
> Looks good to me. Can automated tests be added for this fix?
I don't think so. I have no idea.
Fujii Hironori
Comment 30
2017-03-12 22:32:32 PDT
Created
attachment 304227
[details]
Patch
Alexey Proskuryakov
Comment 31
2017-03-12 23:35:18 PDT
CC'ing some folks who may have an idea for how to make a regression test for this fix.
WebKit Commit Bot
Comment 32
2017-03-14 11:22:09 PDT
Comment on
attachment 304227
[details]
Patch Clearing flags on attachment: 304227 Committed
r213918
: <
http://trac.webkit.org/changeset/213918
>
WebKit Commit Bot
Comment 33
2017-03-14 11:22:19 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug