RESOLVED FIXED Bug 9496
Pixel tests failing on BuildBot
https://bugs.webkit.org/show_bug.cgi?id=9496
mitz
Reported 2006-06-18 10:03:32 PDT
142 pixel tests are reported as failing on BuildBot, which makes it hard to notice real regressions. The failures consist of:

1. Minor discrepancies due to color matching, image decoding or other graphics APIs (some of these pass locally, some fail differently):
    css1/box_properties/float_elements_in_series
    editing/selection/iframe
    editing/selection/inline-table
    fast/css/first-letter-detach
    fast/css/imageTileOpacity
    fast/selectors/159
    tables/mozilla/bugs/bug5797
    tables/mozilla/bugs/bug10565
    tables/mozilla/bugs/bug11026
    tables/mozilla/bugs/bug12908-1
    tables/mozilla/bugs/bug12908-2
    tables/mozilla/bugs/bug12910-2
    tables/mozilla/bugs/bug13169
    tables/mozilla/bugs/bug15544
    tables/mozilla/bugs/bug17138
    tables/mozilla/bugs/bug29314
    tables/mozilla/bugs/bug82946-2
    tables/mozilla/bugs/bug120107
    tables/mozilla/bugs/bug196870
    tables/mozilla/bugs/bug1271
    tables/mozilla/bugs/bug25074
    tables/mozilla/bugs/bug625
    tables/mozilla/bugs/bug1188
    tables/mozilla/bugs/bug1296
    tables/mozilla/bugs/bug1430
    tables/mozilla/bugs/bug2981-2
    tables/mozilla/bugs/bug4093
    tables/mozilla/bugs/bug4284
    tables/mozilla/bugs/bug4427
    tables/mozilla/bugs/bug4523
    tables/mozilla/bugs/bug6404
    tables/mozilla/bugs/bug50695-2
    tables/mozilla/bugs/bug56563
    tables/mozilla/core/bloomberg
    tables/mozilla/core/col_widths_auto_autoFix
    tables/mozilla/core/misc
    tables/mozilla/marvin/tbody_valign_baseline
    tables/mozilla/marvin/tbody_valign_bottom
    tables/mozilla/marvin/tbody_valign_middle
    tables/mozilla/marvin/tbody_valign_top
    tables/mozilla/marvin/td_valign_baseline
    tables/mozilla/marvin/td_valign_bottom
    tables/mozilla/marvin/td_valign_middle
    tables/mozilla/marvin/td_valign_top
    tables/mozilla/marvin/tfoot_valign_baseline
    tables/mozilla/marvin/tfoot_valign_bottom
    tables/mozilla/marvin/tfoot_valign_middle
    tables/mozilla/marvin/tfoot_valign_top
    tables/mozilla/marvin/th_valign_baseline
    tables/mozilla/marvin/th_valign_bottom
    tables/mozilla/marvin/th_valign_middle
    tables/mozilla/marvin/th_valign_top
    tables/mozilla/marvin/thead_valign_baseline
    tables/mozilla/marvin/thead_valign_bottom
    tables/mozilla/marvin/thead_valign_middle
    tables/mozilla/marvin/thead_valign_top
    tables/mozilla/marvin/tr_valign_baseline
    tables/mozilla/marvin/tr_valign_bottom
    tables/mozilla/marvin/tr_valign_middle
    tables/mozilla/marvin/tr_valign_top
    tables/mozilla/other/cell_widths
    tables/mozilla_expected_failures/bugs/bug6933
    tables/mozilla_expected_failures/bugs/bug85016
    tables/mozilla_expected_failures/bugs/bug101674
    css2.1/t0804-c5510-padn-00-b-ag
    css2.1/t100801-c544-valgn-02-d-agi
    css2.1/t100801-c544-valgn-03-d-agi
    css2.1/t100801-c544-valgn-04-d-agi
    fast/backgrounds/size/backgroundSize10
    fast/backgrounds/size/backgroundSize12
    fast/backgrounds/size/backgroundSize18
    fast/backgrounds/size/backgroundSize19
    fast/box-sizing/percentage-height
    fast/replaced/image-sizing
    fast/replaced/maxheight-percent
    fast/replaced/maxheight-pxs
    fast/replaced/maxwidth-percent
    fast/replaced/maxwidth-pxs
    tables/mozilla/bugs/bug14929
    tables/mozilla/bugs/bug16252
    tables/mozilla/bugs/bug97383

2. Pixel results not updated after fixing bug 3297:
    editing/selection/3690719
    fast/invalid/018
    fast/table/colspanMinWidth
    tables/mozilla/bugs/bug6304
    tables/mozilla/bugs/bug25086
    tables/mozilla/bugs/bug28928
    tables/mozilla/bugs/bug44523
    tables/mozilla/bugs/bug97138
    tables/mozilla/core/col_widths_fix_auto
    tables/mozilla/core/row_span
    tables/mozilla_expected_failures/bugs/bug1262
    tables/mozilla_expected_failures/bugs/bug11945
    tables/mozilla_expected_failures/bugs/bug23847
    tables/mozilla_expected_failures/bugs/bug32205-1
    tables/mozilla_expected_failures/marvin/backgr_border-table-cell
    tables/mozilla_expected_failures/marvin/backgr_border-table-column-group
    tables/mozilla_expected_failures/marvin/backgr_border-table
    fast/encoding/utf-16-big-endian
    fast/encoding/utf-16-little-endian
    fast/table/cell-absolute-child

3. Pixel results not updated for r13868:
    fast/css/word-space-extra
    fast/overflow/image-selection-highlight

4. A possible regression:
    editing/style/smoosh-styles-003

5. Expected results generated with a debug build, bot running a release build:
    tables/mozilla_expected_failures/bugs/bug178855

6. SVG failures:
    svg/W3C-SVG-1.1/coords-units-01-b
    svg/W3C-SVG-1.1/coords-viewattr-02-b
    svg/W3C-SVG-1.1/filters-blend-01-b
    svg/W3C-SVG-1.1/filters-color-01-b
    svg/W3C-SVG-1.1/filters-composite-02-b
    svg/W3C-SVG-1.1/filters-comptran-01-b
    svg/W3C-SVG-1.1/filters-diffuse-01-f
    svg/W3C-SVG-1.1/filters-displace-01-f
    svg/W3C-SVG-1.1/filters-example-01-b
    svg/W3C-SVG-1.1/filters-gauss-01-b
    svg/W3C-SVG-1.1/filters-image-01-b
    svg/W3C-SVG-1.1/filters-light-01-f
    svg/W3C-SVG-1.1/filters-offset-01-b
    svg/W3C-SVG-1.1/filters-specular-01-f
    svg/W3C-SVG-1.1/paths-data-04-t
    svg/W3C-SVG-1.1/pservers-grad-02-b
    svg/W3C-SVG-1.1/pservers-grad-04-b
    svg/W3C-SVG-1.1/pservers-grad-05-b
    svg/W3C-SVG-1.1/pservers-grad-06-b
    svg/W3C-SVG-1.1/pservers-grad-11-b
    svg/W3C-SVG-1.1/pservers-grad-12-b
    svg/W3C-SVG-1.1/render-groups-01-b
    svg/W3C-SVG-1.1/render-groups-03-t
    svg/W3C-SVG-1.1/struct-image-01-t
    svg/W3C-SVG-1.1/struct-image-02-b
    svg/W3C-SVG-1.1/struct-image-04-t
    svg/W3C-SVG-1.1/styling-inherit-01-b
    svg/custom/feComponentTransfer-Discrete
    svg/custom/feComponentTransfer-Gamma
    svg/custom/feComponentTransfer-Linear
    svg/custom/feComponentTransfer-Table
    svg/custom/feDisplacementMap-01
    svg/custom/filter-source-alpha
    svg/custom/image-with-transform-clip-filter
    svg/custom/invalid-css
    svg/custom/text-filter
    svg/custom/text-image-opacity

With the exception of 4., I think the bot's current results should be checked in as the expected results. The tricky part is that if you run the tests on your machines, some of your results may differ from the bot's (due to different architectures, OS build or graphics hardware or color profiles), so perhaps somebody with access to the build slaves can run the tests on one of them and pull the results from it.
Attachments
download-differing-buildbot-pixel-results.pl (3.50 KB, text/x-perl-script)
2006-06-18 14:18 PDT, David Kilzer (:ddkilzer)
no flags
Pixel test differences for three machines (as of r19001) (27.23 KB, text/plain)
2007-01-29 08:18 PST, mitz
no flags
Pixel failures, classified (not including SVG) (18.48 KB, text/plain)
2007-01-29 09:34 PST, mitz
no flags
Pixel failures, annotated (SVG only) (5.20 KB, text/plain)
2007-01-29 11:53 PST, mitz
no flags
Pixel failures, annotated (SVG only) (7.48 KB, text/plain)
2007-01-29 12:01 PST, mitz
no flags
Patch that lets you ignore small differences (5.67 KB, patch)
2007-02-17 05:52 PST, mitz
no flags
Patch that lets you ignore small differences (6.34 KB, patch)
2007-11-07 17:42 PST, mitz
no flags
David Kilzer (:ddkilzer)
Comment 1 2006-06-18 14:18:13 PDT
Created attachment 8907 [details]
download-differing-buildbot-pixel-results.pl

A handy Perl script to download all of the differing images from the BuildBot web site into your local WebKit/LayoutTests directory structure. Note that it does not reset checksums; this will occur when rerunning run-webkit-tests if the image diff succeeds. (I tried the css2.1 images but they did not work locally.)

It would be nice if we could figure out why some of the tests with no apparent pixel differences are failing to compare.
mitz
Comment 2 2006-06-18 14:50:08 PDT
(In reply to comment #1)
> It would be nice if we could figure out why some of the tests with no apparent
> pixel differences are failing to compare.

Tests with no visible pixel differences actually have small differences due to color matching and image decoding issues. One issue is that image decoding and rescaling changed in 10.4.6, and the build slaves are on 10.4.5 or earlier. They should probably be upgraded before proceeding.

The next step after upgrading would be to examine the new set of failing tests, then run "run-webkit-tests --pixel --reset" on a build slave and commit the generated results to the repository for all tests except the suspected regression (group 4 above).
Alexey Proskuryakov
Comment 3 2006-06-19 11:38:19 PDT
Comparing http://build.webkit.org/post-commit-pixel-powerpc-mac-os-x/builds/1007 and http://build.webkit.org/post-commit-pixel-powerpc-mac-os-x/builds/1008 (coming from apple-slave-6 and apple-slave-5 respectively, no significant differences in the codebase), I see several pixel tests that fail only in one of these:
    editing/selection/drag-in-iframe
    editing/selection/drag-to-contenteditable-iframe
    svg/W3C-SVG-1.1/pservers-grad-08-b
    svg/custom/filter-source-alpha
    tables/mozilla/bugs/bug86708
mitz
Comment 4 2007-01-29 08:18:07 PST
Created attachment 12741 [details]
Pixel test differences for three machines (as of r19001)

This file lists tests that failed on at least one of three machines (a build slave <http://build.webkit.org/post-commit-pixel-powerpc-mac-os-x/builds/3397>, an iMac G5 and a MacBook Pro, the latter two running Mac OS X 10.4.8). For each test, the three numbers are the distances from the expected result observed on each of the machines (in decreasing order). The tests are listed in descending order of the biggest difference.

The difference metric is the maximum, over all pixels, of the L_1 distance (sum of absolute differences of R, G and B) between the actual pixel and the expected pixel.

Hopefully I will follow up with some analysis. (What I really should have measured was the distances between the different machines' actual results, but laziness and total lack of {perl,python,ruby} fu have so far stopped me from doing it.)
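
For readers who want to reproduce the metric described above, here is a minimal Python sketch. It is not the code used to produce the attachment; the function name max_l1_distance and the representation of an image as a flat list of (R, G, B) tuples are assumptions made for illustration only.

```python
def max_l1_distance(actual, expected):
    """Return the largest per-pixel L_1 distance between two RGB images.

    Each image is assumed to be a flat sequence of (r, g, b) tuples of
    equal length (a hypothetical representation, chosen for simplicity).
    """
    if len(actual) != len(expected):
        raise ValueError("images must have the same number of pixels")
    return max(
        (
            # L_1 distance for one pixel: sum of absolute channel differences.
            sum(abs(a - e) for a, e in zip(pixel_a, pixel_e))
            for pixel_a, pixel_e in zip(actual, expected)
        ),
        default=0,  # two empty images are treated as identical
    )


# Made-up three-pixel images, only to show the arithmetic.
expected = [(255, 255, 255), (0, 0, 0), (10, 20, 30)]
actual = [(255, 254, 255), (3, 0, 4), (10, 20, 30)]
print(max_l1_distance(actual, expected))  # 7, from the second pixel (3 + 0 + 4)
```

Because the metric takes the maximum rather than an average, a single badly mismatched pixel determines the reported distance for the whole test.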
mitz
Comment 5 2007-01-29 09:34:03 PST
Created attachment 12745 [details]
Pixel failures, classified (not including SVG)

This suggests that for non-SVG tests, an acceptance threshold can be set at or slightly above 10, provided that 10 or so problematic tests are changed (to not use Arial, animated and other problematic GIFs or unpredictable caret visibility).
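
To illustrate what an acceptance threshold of 10 would mean in practice, here is a rough Python sketch. It is not taken from any patch on this bug: the function name classify, the test names and the distances are all hypothetical, and treating a distance exactly equal to the threshold as acceptable is an assumption.

```python
def classify(distances, threshold=10):
    """Split tests into those within the threshold and those above it.

    `distances` maps a test name to its maximum per-pixel distance from
    the expected image. A distance equal to the threshold is accepted
    here, which is an assumption rather than documented behavior.
    """
    accepted = sorted(name for name, d in distances.items() if d <= threshold)
    rejected = sorted(name for name, d in distances.items() if d > threshold)
    return accepted, rejected


# Hypothetical test names and distances, for illustration only.
sample = {
    "example/color-matching": 3,
    "example/image-rescaling": 10,
    "example/font-fallback": 87,
}
accepted, rejected = classify(sample)
print(accepted)  # ['example/color-matching', 'example/image-rescaling']
print(rejected)  # ['example/font-fallback']
```

Under this scheme, small discrepancies from color matching or image decoding would stop being reported as failures, while large differences such as a font substitution would still be flagged.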
mitz
Comment 6 2007-01-29 11:53:02 PST
Created attachment 12748 [details]
Pixel failures, annotated (SVG only)

Looks like the biggest problem with SVG is fonts.
mitz
Comment 7 2007-01-29 12:01:44 PST
Created attachment 12749 [details]
Pixel failures, annotated (SVG only)

Changed tabs to spaces.
mitz
Comment 8 2007-02-17 05:52:19 PST
Created attachment 13211 [details]
Patch that lets you ignore small differences

I'm using this with --threshold 10.
mitz
Comment 9 2007-11-07 17:42:10 PST
Created attachment 17115 [details]
Patch that lets you ignore small differences

Updated to merge with TOT.
Darin Adler
Comment 10 2007-11-07 17:44:38 PST
Comment on attachment 17115 [details]
Patch that lets you ignore small differences

r=me

Should we set an appropriate default threshold?
mitz
Comment 11 2007-11-07 17:48:49 PST
Comment on attachment 17115 [details]
Patch that lets you ignore small differences

This patch landed in r27584. Removing the review flag to keep it out of the commit queue.
mitz
Comment 12 2007-11-07 17:54:32 PST
(In reply to comment #10)
> Should we set an appropriate default threshold?

(A) default threshold(s) should be part of the plan to get people to run the tests on their own machines, but I would like to improve the reporting from run-webkit-tests before doing that (so that it tells you about "silent failures" and potentially suggests that you cache alternative checksums).
Simon Fraser (smfr)
Comment 13 2009-01-03 19:45:57 PST
How much of this is still relevant?
mitz
Comment 14 2009-01-03 19:53:23 PST
(In reply to comment #13)
> How much of this is still relevant?

Nothing that justifies keeping the bug open.