38063 – new-run-webkit-tests complains about missing pixel results instead of plopping down new expectations

RESOLVED FIXED 38063

new-run-webkit-tests complains about missing pixel results instead of plopping down new expectations

https://bugs.webkit.org/show_bug.cgi?id=38063

Summary new-run-webkit-tests complains about missing pixel results instead of ploppin...

James Robinson

Reported 2010-04-23 15:59:30 PDT

Run a test that does not have any -expected.png or -expected.checksum files checked in with the "-p" option. old-run-webkit-tests generates a new .png and .checksum file and sticks them into the appropriate place (LayoutTests/platform/mac when running on my mac). new-run-webkit-tests fails the test and complains about missing expectation

Attachments
Work in progress (8.57 KB, patch) 2011-06-29 09:58 PDT, Adam Barth	no flags	Details Formatted Diff Diff
Patch (14.67 KB, patch) 2011-06-29 12:00 PDT, Adam Barth	no flags	Details Formatted Diff Diff
Patch for landing (15.13 KB, patch) 2011-06-29 13:37 PDT, Adam Barth	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Ojan Vafai

Comment 1 2010-04-25 07:57:47 PDT

Moving to the old-run-webkit-tests behavior is fine with me as long as we print something out that we're doing so. I don't like that old-run-webkit-tests does so silently.

Eric Seidel (no email)

Comment 2 2011-06-26 23:16:11 PDT

*** Bug 62497 has been marked as a duplicate of this bug. ***

Adam Barth

Comment 3 2011-06-27 18:03:57 PDT

NRWT does the same thing for text results. Do you think we should switch both to match the ORWT behavior, or do you think pixel tests should work different than text tests?

Eric Seidel (no email)

Comment 4 2011-06-28 01:09:33 PDT

I recommend switching both to the ORWT behavior and adding a --no-new-test-results flag. Chromium has a wrapper script around NRWT for their bots iirc. We could add that flag to that wrapper for easy transition. We could also make the chromium port imply --no-new-test-results for the time-being (or alternatively, make the run-webkit-tests wrapper imply --new-test-results when running new-run-webkit-tests. That's part of what it's there for, to make this migration easier: http://trac.webkit.org/browser/trunk/Tools/Scripts/run-webkit-tests

Ryosuke Niwa

Comment 5 2011-06-28 06:13:01 PDT

I'd prefer having --no-new-test-results since that behavior seems to be only useful in bots.

Adam Barth

Comment 6 2011-06-28 13:57:35 PDT

Ok. Will do. Thanks.

Adam Barth

Comment 7 2011-06-29 09:58:06 PDT

Created attachment 99108 [details] Work in progress

Adam Barth

Comment 8 2011-06-29 12:00:43 PDT

Created attachment 99121 [details] Patch

Adam Barth

Comment 9 2011-06-29 12:01:46 PDT

There are some niceties missing (as noted in the ChangeLog). Also the print statements that say which files are being created don't seem to be appearing. If I bump the log level, then they show up turning unit testing, which is annoying. Anyway, more polish required.

Dirk Pranke

Comment 10 2011-06-29 12:57:57 PDT

Comment on attachment 99121 [details] Patch Generally looks good ... R+'ing it although there are some nits and at least one change that looks correct but unnecessary that I have a question about. View in context: https://bugs.webkit.org/attachment.cgi?id=99121&action=review > Tools/Scripts/webkitpy/common/net/layouttestresults.py:164 > + return [result for result in self._test_results if result.has_failure_matching_types(*failure_types)] Why make this change (and the one in test_results.py)? It seems unnecessary. > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py:309 > + dest="save_missing_results", default=True, Nit: might this be better as dest="new_test_results"? That's more in keeping with how we normally name options. > Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:798 > + failures = self._get_failures(result_summary, include_crashes=False, include_missing=False) Hm. I'm not sure about this change, given that we can have flaky MISSING results if a text-only test crashes or fails prior to calling dumpAsText(). I agree that there's no point in retrying a truly missing test, and maybe the flaky case is rare enough that this is the right thing to do. > Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:936 > + def _get_failures(self, result_summary, include_crashes=True, include_missing=True): Nit: I'd probably leave off the default values, since it's an internal function and it looks like they're not being used anyway. > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:131 > + if self._options.save_missing_results: Change to options.new_test_results as per above? > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:145 > + def _add_missing_baselines(self, test_result, driver_output): Nit: maybe add comments or docstrings indicating which options trigger _add_missing_baselines and _overwrite_baselines? I know you can figure this out from looking at the callers, but it might be helpful to list them here as well. > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:150 > + # than for text tests? If I remember correctly, with old-run-webkit-tests, all new test results go in the generic directories alongside the test, and you need --add-platform-exceptions to tell it to put the new results in the platform dir instead. Also, I suppose theoretically at least some render tree dumps could be generic (enough for more platforms, anyway). Regardless, there's no way to distinguish a render tree dump from a non-rendertree dump without parsing the text output, and I'd prefer to not have to start doing that. Maybe we can modify DRT to output custom headers when dumping the layer tree? > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:160 > # DumpRenderTree may not output utf-8 text (e.g. webarchives). As an aside, it turns out webarchives are just XML files and I think they're always UTF-8 as a result. Should probably update this comment at some point ... > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:190 > + _log.debug('Writing new baseline result "%s"' % output_path) I'm tempted to say that these should be _log.info() calls ... that would address Ojan's comments about the "silent" updates, and probably is generally more useful. > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:193 > + _log.debug('Resetting baseline result "%s"' % output_path) Same as above.

Adam Barth

Comment 11 2011-06-29 13:29:42 PDT

Comment on attachment 99121 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=99121&action=review >> Tools/Scripts/webkitpy/common/net/layouttestresults.py:164 >> - return [result for result in self._test_results if result.has_failure_matching_types(failure_types)] >> + return [result for result in self._test_results if result.has_failure_matching_types(*failure_types)] > > Why make this change (and the one in test_results.py)? It seems unnecessary. I just did that to make the code less ugly. The varags approach looked prettier than calling the function like foo([ddd]). >> Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py:309 >> + optparse.make_option("--no-new-test-results", action="store_false", >> + dest="save_missing_results", default=True, > > Nit: might this be better as dest="new_test_results"? That's more in keeping with how we normally name options. Fixed. >> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:798 >> - failures = self._get_failures(result_summary, include_crashes=False) >> + failures = self._get_failures(result_summary, include_crashes=False, include_missing=False) > > Hm. I'm not sure about this change, given that we can have flaky MISSING results if a text-only test crashes or fails prior to calling dumpAsText(). I agree that there's no point in retrying a truly missing test, and maybe the flaky case is rare enough that this is the right thing to do. Without this check, the behavior is really braindead because the first pass generates the results and then the second pass thinks the test passes. If this is a big source of flakiness, we can re-evaluate. >> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:936 >> - def _get_failures(self, result_summary, include_crashes): >> + def _get_failures(self, result_summary, include_crashes=True, include_missing=True): > > Nit: I'd probably leave off the default values, since it's an internal function and it looks like they're not being used anyway. Ok. >> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:131 >> + if self._options.save_missing_results: > > Change to options.new_test_results as per above? Done. >> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:145 >> - def _save_baselines(self, driver_output): >> + def _add_missing_baselines(self, test_result, driver_output): > > Nit: maybe add comments or docstrings indicating which options trigger _add_missing_baselines and _overwrite_baselines? I know you can figure this out from looking at the callers, but it might be helpful to list them here as well. We've been trying to follow the WebKit style in document webkitpy, which is to avoid including comments that are redundant with the code and/or can become out of date when the code changes. Listing who the callers of a function are and why they call this function seems to fall into that category. >> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:150 >> + # than for text tests? > > If I remember correctly, with old-run-webkit-tests, all new test results go in the generic directories alongside the test, and you need --add-platform-exceptions to tell it to put the new results in the platform dir instead. > > Also, I suppose theoretically at least some render tree dumps could be generic (enough for more platforms, anyway). > > Regardless, there's no way to distinguish a render tree dump from a non-rendertree dump without parsing the text output, and I'd prefer to not have to start doing that. > > Maybe we can modify DRT to output custom headers when dumping the layer tree? I don't think your recollection of how old-run-webkit-tests works is correct. Tests that produce render tree dumps and/or images go into the platform-specific directory by default and normal text tests go into the generic directory. (Where "generic" means outside the "platform" directory.) I'm not sure who it figures that out. >> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:160 >> # DumpRenderTree may not output utf-8 text (e.g. webarchives). > > As an aside, it turns out webarchives are just XML files and I think they're always UTF-8 as a result. Should probably update this comment at some point ... Ok. I don't really understand that, so I'll let you update the comment next time you're editing this file so folks who "svn blame" will ask you about it instead of me. :) >> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:190 >> + _log.debug('Writing new baseline result "%s"' % output_path) > > I'm tempted to say that these should be _log.info() calls ... that would address Ojan's comments about the "silent" updates, and probably is generally more useful. Yes. The trouble was that then they showed up in the test-webkitpy output! I'll fix that in a subsequent patch.

Adam Barth

Comment 12 2011-06-29 13:37:35 PDT

Created attachment 99139 [details] Patch for landing

WebKit Review Bot

Comment 13 2011-06-29 15:26:11 PDT

Comment on attachment 99139 [details] Patch for landing Clearing flags on attachment: 99139 Committed r90055: <http://trac.webkit.org/changeset/90055>

WebKit Review Bot

Comment 14 2011-06-29 15:26:16 PDT

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware PC

OS OS X 10.5

Product WebKit

Component Tools / Tests

Assignee

Adam Barth

Reported

2010-04-23 15:59 PDT

Modified

2011-06-29 15:26 PDT History

CC List

6 users Show

URL

Keywords

Duplicates (1)

62497 View as bug list

Depends on

Blocks

34984

Dependencies

tree graph