For example, see http://test-results.appspot.com/dashboards/flakiness_dashboard.html#showExpectations=true&tests=fast%2Ftable%2Fempty-section-crash.html. The repaint background in the platform/chromium-win and platform/chromium-linux results is #575757. On the Chromium Mac bot, some pixels are #575757 and some are #565656. I'm not sure how image_diff turns that into the whole background not matching. The platform/mac result has a #6a6a6a background, with a color profile. Diffing the chromium win/linux result against the platform/mac result shows a diff only on the text, though (i.e. the backgrounds match), so I assume image_diff accounts for the color profile somehow. Also, curiously, some of the Snow Leopard bots sometimes produce a result that matches the platform/mac result: http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20-%20webkit.org&showExpectations=true&tests=fast%2Ftable%2Fempty-section-crash.html
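To put numbers on the background colors mentioned above, here is a minimal sketch that computes the per-channel difference between two hex colors. It is not how image_diff compares images; the helper names (`hex_to_rgb`, `channel_deltas`, `within_tolerance`) and the idea of a tolerance are assumptions made purely for illustration, and the sketch ignores color profiles entirely.

```python
def hex_to_rgb(color):
    """Convert a '#rrggbb' string into an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def channel_deltas(a, b):
    """Absolute per-channel difference between two '#rrggbb' colors."""
    return tuple(abs(x - y) for x, y in zip(hex_to_rgb(a), hex_to_rgb(b)))


def within_tolerance(a, b, tolerance=0):
    """Hypothetical check: True if every channel differs by at most `tolerance`."""
    return max(channel_deltas(a, b)) <= tolerance


# The two values seen on the Chromium Mac bot differ by one step per channel.
print(channel_deltas("#575757", "#565656"))
# The platform/mac background is much further away in raw RGB terms.
print(channel_deltas("#575757", "#6a6a6a"))
```

Under an exact-match comparison (tolerance 0), the off-by-one pixels count as different, while a tolerance of 1 would treat them as equal. Whether the real tool applies any tolerance is not established in this thread.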
Probably this https://codereview.appspot.com/5758043
Here's the Skia bug: http://code.google.com/p/skia/issues/detail?id=420
Ojan, can you please tell me how to get a list of the "hundreds" of tests that fall into this category? (Or at least a good chunk of them?) Taking the example given, empty-section-crash, I see (via svn blame) that it was marked as expected to IMAGE fail at least 16 months ago (http://trac.webkit.org/changeset/74070/trunk/LayoutTests/platform/chromium/test_expectations.txt). Its expectations line refers to http://crbug.com/23489 , which was filed in Sept 2009...
(In reply to comment #3)
> Ojan, can you please tell me how to get a list of the "hundreds" of tests that fall into this category? (Or at least a good chunk of them?)

1. load "webkit-patch garden-o-matic"
2. go to the expected failures tab
3. examine the fast/repaint tests

Fixing the color issue probably won't make all these tests pass, but it will reduce the pixel diffs to just text-rendering/antialiasing differences, so they would be straightforward rebaselines.

I think a large percentage of these have also been rebaselined. I know I rebaselined a lot of these for the Lion port before I realized that it was a more widespread problem. So fixing this would cause a number of tests to "fail" that would need rebaselining.
Marked LayoutTest bugs, bugs with Chromium IDs, and some others as WontFix. Test failure bugs are still trackable via TestExpectations or disabled unit tests.