WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 63494
REGRESSION(
r84294
): new-run-webkit-tests results.html generate links to diffs.html or diff.png that don't exist
https://bugs.webkit.org/show_bug.cgi?id=63494
Summary
REGRESSION(r84294): new-run-webkit-tests results.html generate links to diffs...
Ryosuke Niwa
Reported
2011-06-27 16:51:06 PDT
Tools/Scripts/new-run-webkit-tests --debug --pixel do not generate -diffs.html or diff.png for image failures on Mac port I can see links on results.html, and in fact, layout-test-results's corresponding directories do have -expected.png and -actual.png but there are no -diffs.html or -diff.png
Attachments
Work in progress
(2.64 KB, patch)
2011-06-29 19:50 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
Work in progress
(675 bytes, patch)
2011-06-29 19:58 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
Work in progress
(20.06 KB, patch)
2011-06-29 21:27 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
Work in progress (might actually work)
(23.16 KB, patch)
2011-06-29 22:15 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
Patch
(25.48 KB, patch)
2011-06-29 22:34 PDT
,
Adam Barth
no flags
Details
Formatted Diff
Diff
Patch
(25.65 KB, patch)
2011-06-30 11:31 PDT
,
Adam Barth
eric
: review+
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Ryosuke Niwa
Comment 1
2011-06-27 17:05:18 PDT
It appears that diffs.html and diff.png aren't generated when tests fail due to checksum differences.
Ryosuke Niwa
Comment 2
2011-06-27 17:16:45 PDT
At this point I'm pretty certain this is a bug in new results.html page.
Dirk Pranke
Comment 3
2011-06-27 17:21:03 PDT
On further investigation (e.g., fast/box-shadow/basic-shadows.html), we're getting cases where the checksums don't match but the images do match, as reported by ImageDiff. So, no -diff files will be created. However, the test is still reported as failing. I think the json results generator doesn't know how to distinguish the different kinds of failures, and so it reports the generic Image failure. We might need to add another flag to the test_dict in manager.py: summarize_results() to handle this case. Once again, the attempt to map the broader set of TestFailures onto the smaller set of expectation types might be hurting us.
Adam Barth
Comment 4
2011-06-29 14:08:12 PDT
Is someone stepping up to own this issue. Ojan?
Ojan Vafai
Comment 5
2011-06-29 16:54:57 PDT
(In reply to
comment #3
)
> On further investigation (e.g., fast/box-shadow/basic-shadows.html), we're getting cases where the checksums don't match but the images do match, as reported by ImageDiff.
Hm. That should not happen now that checksums are embedded in pngs!
> So, no -diff files will be created. However, the test is still reported as failing. I think the json results generator doesn't know how to distinguish the different kinds of failures, and so it reports the generic Image failure. > > We might need to add another flag to the test_dict in manager.py: summarize_results() to handle this case. > > Once again, the attempt to map the broader set of TestFailures onto the smaller set of expectation types might be hurting us.
The right solution IMO, is to make the results types serialized to the JSON file have a one-to-one mapping with the TestFailure types. That would also help cleanup code. (In reply to
comment #4
)
> Is someone stepping up to own this issue. Ojan?
I won't realistically get to it anytime soon.
Dirk Pranke
Comment 6
2011-06-29 17:12:36 PDT
(In reply to
comment #5
)
> (In reply to
comment #3
) > > On further investigation (e.g., fast/box-shadow/basic-shadows.html), we're getting cases where the checksums don't match but the images do match, as reported by ImageDiff. > > Hm. That should not happen now that checksums are embedded in pngs! >
Tony pointed out a difference a week or two ago where the apple mac ImageDiff implementation will report some PNGs as identical even when they are not bit-for-bit identical (i.e., even when --tolerance=0.0 is specified). I think that that's a bug in ImageDiff that should be fixed, but perhaps that is debatable. Regardless, that is probably what is causing this.
> The right solution IMO, is to make the results types serialized to the JSON file have a one-to-one mapping with the TestFailure types. That would also help cleanup code.
That would probably work, and certainly simplify some things.
Adam Barth
Comment 7
2011-06-29 17:13:46 PDT
Ojan, if you're not going to fix your regression, should we roll out your patch?
Adam Barth
Comment 8
2011-06-29 18:02:20 PDT
Hum... That patch is from 04/19/2011 14:18:52 (2 months ago). :(
Adam Barth
Comment 9
2011-06-29 18:06:23 PDT
Reading this more carefully, this doesn't need to block
Bug 34984
. This appears to happen only in the case were the images differ bit-by-bit but ImageDiff says they match, which should be rare.
Ryosuke Niwa
Comment 10
2011-06-29 18:11:45 PDT
(In reply to
comment #9
)
> Reading this more carefully, this doesn't need to block
Bug 34984
. This appears to happen only in the case were the images differ bit-by-bit but ImageDiff says they match, which should be rare.
It happens on Mac port all the time because it has the default non-zero tolerance.
Ryosuke Niwa
Comment 11
2011-06-29 18:14:41 PDT
So it should block the
bug 34984
.
Adam Barth
Comment 12
2011-06-29 18:19:31 PDT
I see. Ok. Thanks.
Adam Barth
Comment 13
2011-06-29 19:50:43 PDT
Created
attachment 99218
[details]
Work in progress
Adam Barth
Comment 14
2011-06-29 19:58:33 PDT
Created
attachment 99222
[details]
Work in progress
Adam Barth
Comment 15
2011-06-29 21:27:13 PDT
Created
attachment 99231
[details]
Work in progress
Adam Barth
Comment 16
2011-06-29 22:15:04 PDT
Created
attachment 99236
[details]
Work in progress (might actually work)
Eric Seidel (no email)
Comment 17
2011-06-29 22:20:03 PDT
Comment on
attachment 99236
[details]
Work in progress (might actually work) View in context:
https://bugs.webkit.org/attachment.cgi?id=99236&action=review
I like the idea. I'd have to review it in the morning.
> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:143 > + # FIXME: It's not clear what we should return in this case. > + # Maybe we should throw an exception? > return True
Seems you could easily raise ValueError here.
Adam Barth
Comment 18
2011-06-29 22:34:29 PDT
Created
attachment 99239
[details]
Patch
Dirk Pranke
Comment 19
2011-06-29 22:39:27 PDT
Looks plausible to me also. I have only skimmed the changes and need to review it more carefully as well Adding Tom Hudson to this since this is related to some work he was doing that would allow us to return more metrics than just true/false for the diff, so that we could sort results by the degree and type of differences. Tom, this patch has much if not all of the restructuring we'd probably want for the things you were working on ...
Eric Seidel (no email)
Comment 20
2011-06-29 22:56:17 PDT
Comment on
attachment 99239
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=99239&action=review
> Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:573 > + def diff_image(self, expected_contents, actual_contents): > self.tolerance_used_for_diff_image = self._options.tolerance > return True
Isn't this supposed to return an image diff?
Adam Barth
Comment 21
2011-06-29 22:58:55 PDT
Comment on
attachment 99239
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=99239&action=review
>> Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:573 >> return True > > Isn't this supposed to return an image diff?
This is some fake thing for testing something ridiculous. I think it just has to return something that's not falsy.
Eric Seidel (no email)
Comment 22
2011-06-30 11:18:34 PDT
Comment on
attachment 99239
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=99239&action=review
r- for nits.
> Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:268 > + res, buildbot_output, regular_output, user = logging_run(['--print-last-failures'], filesystem=fs) > + self.assertEqual(regular_output.get(), [u'failures/expected/checksum.html\n\n'])
I'm confused what would prompt this test case to complain about checksum.html? This is slightly concerning. Would be nice to hear dirks thoughts on this.
> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:275 > - if not expected_driver_outputs.image: > + if not expected_driver_output.image: > failures.append(test_failures.FailureMissingImage())
Why if the expected output is missing an image do we add FailureMissingImage here?
> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:281 > + elif driver_output.image_hash != expected_driver_output.image_hash: > + driver_output.image_diff = self._port.diff_image(driver_output.image, expected_driver_output.image) > + if driver_output.image_diff: > + failures.append(test_failures.FailureImageHashMismatch())
Its a little subtle that this image_diff creation is here. We may need to rename this method or add furhter commetns.
> Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:72 > + # FIXME: This work should be done earlier in the pipeline.
Can you be more specific as to when? You're saying that the image diffing should be done before writing out files?
> Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:265 > + # FIXME: This seems like a text file, not a binary file.
All .diff/.patch files are binary, but if this is the diffs.html file (which cycles through the images) then yes, it seems like text rather than binary.
> Tools/Scripts/webkitpy/layout_tests/port/base.py:937 > + self.image_diff = None # image_diff gets filled in after construction.
It's odd that we even need to store this on driver_output. It doesn't come from the driver.
> Tools/Scripts/webkitpy/layout_tests/port/chromium.py:179 > - raise e > + raise
Yay! I wonder how many other places we get this wrong.
> Tools/Scripts/webkitpy/layout_tests/port/chromium.py:182 > + if exit_code == 1: > + result = self._filesystem.read_binary_file(diff_filename)
Although this seems awkward for chromium, this makes much more sense for WebKit who sends the binary over the wire instead of through the filesystem.
> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:143 > + # FIXME: It's not clear what we should return in this case. > + # Maybe we should throw an exception? > return True
I would raise ValueError (or any other error which seems reasonable for bad arguments).
> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:145 > + process = self._create_image_diff_process(expected_contents, actual_contents)
Maybe thsi should say "start"? "create" blends wiht image_diff "create_image_diff" as though you're asking for the process which creates the image diffs.
> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:160 > + (len(actual_contents), actual_contents, len(expected_contents), expected_contents))
Did you mean to indent this again?
> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:189 > + if output.startswith('diff'): > + m = re.match('diff: (.+)% (passed|failed)', output) > + if m.group(2) == 'passed': > + return None > + return output
I wonder if ORWT saves the diffs even when they "pass"?
Adam Barth
Comment 23
2011-06-30 11:30:08 PDT
Comment on
attachment 99239
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=99239&action=review
>> Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:268 >> + self.assertEqual(regular_output.get(), [u'failures/expected/checksum.html\n\n']) > > I'm confused what would prompt this test case to complain about checksum.html? This is slightly concerning. Would be nice to hear dirks thoughts on this.
I suspect this is because we change how this kind of failure is treated. /me will investigate.
>> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:275 >> failures.append(test_failures.FailureMissingImage()) > > Why if the expected output is missing an image do we add FailureMissingImage here?
Because that's what FailureMissingImage means!
>> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:281 >> + failures.append(test_failures.FailureImageHashMismatch()) > > Its a little subtle that this image_diff creation is here. We may need to rename this method or add furhter commetns.
Ok.
>> Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:72 >> + # FIXME: This work should be done earlier in the pipeline. > > Can you be more specific as to when? You're saying that the image diffing should be done before writing out files?
I've updated the comment. We should compare image for ref and non-ref tests at the same layer. That's all.
>> Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:265 >> + # FIXME: This seems like a text file, not a binary file. > > All .diff/.patch files are binary, but if this is the diffs.html file (which cycles through the images) then yes, it seems like text rather than binary.
I'm going to change that in a future patch.
>> Tools/Scripts/webkitpy/layout_tests/port/base.py:937 >> + self.image_diff = None # image_diff gets filled in after construction. > > It's odd that we even need to store this on driver_output. It doesn't come from the driver.
That seems like a problem for a future patch.
>> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:143 >> return True > > I would raise ValueError (or any other error which seems reasonable for bad arguments).
I'm not sure what the consequences of that would be. I'm inclined to leave the comment and worry about it later.
>> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:145 >> + process = self._create_image_diff_process(expected_contents, actual_contents) > > Maybe thsi should say "start"? "create" blends wiht image_diff "create_image_diff" as though you're asking for the process which creates the image diffs.
Fixed.
>> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:160 >> + (len(actual_contents), actual_contents, len(expected_contents), expected_contents)) > > Did you mean to indent this again?
Fixed.
>> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:189 >> + return output > > I wonder if ORWT saves the diffs even when they "pass"?
The diffs don't get created in that case, which is the whole problem.
Adam Barth
Comment 24
2011-06-30 11:31:26 PDT
Created
attachment 99340
[details]
Patch
Eric Seidel (no email)
Comment 25
2011-06-30 11:33:08 PDT
Comment on
attachment 99340
[details]
Patch OK.
Adam Barth
Comment 26
2011-06-30 11:37:22 PDT
Committed
r90138
: <
http://trac.webkit.org/changeset/90138
>
Dirk Pranke
Comment 27
2011-06-30 12:37:05 PDT
Comment on
attachment 99340
[details]
Patch Patch looks generally good. Lots of minor stuff that could be changed, but most if not all of it could be done in a follow-on patch. The only major thing I don't like is hanging the image_diff off of the driver_output. I would probably have R-'ed the patch if I thought that fixing that wouldn't complicate the patch even more, so hopefully we can clean that up soon rather than forget about it. I don't think I saw anything that looked actually incorrect in the sense that it would break things.
> Tools/ChangeLog:18 > +
It would've been nice to do these two things in two different patches, since they're both nontrivial, but I'm not going to lose a lot of sleep over it.
> Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:268 > + self.assertEqual(regular_output.get(), [u'failures/expected/checksum.html\n\n'])
Both you and Eric seem to be unwrapping lines as you go along in these patches, as well as making actual changes. Since I've had this feedback on my patches, I'll pass it along: it would be nice if you separated out the formatting changes from the semantic changes into different patches. In addition, since there's nothing actually wrong with the wrapped code, it seems like these changes are kinda unnecessary (one could argue that I would be just as justified in submitting patches that added the line breaks back in, because I find the old way easier to read ...).
> Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:426 > + 'failures/unexpected/checksum-with-matching-image.html',
If this test no longer fails, we should rename it to passes/checksum-with-matching-image.html
> Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:431 > + self.assertTrue(fs.read_text_file('/tmp/layout-test-results/unexpected_results.json').find('"num_regressions":0') != -1)
res contains the # of unexpected failures. Just do self.assertEquals(res, 0).
> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:276 > + if not expected_driver_output.image:
This is concerning ... if these typos weren't causing failures in the test suite before, then these lines weren't being executed. We should figure out why that is/was, and if it's no longer true. Do you know why they were executing before and that they now are?
> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:281 > + driver_output.image_diff = self._port.diff_image(driver_output.image, expected_driver_output.image)
The driver isn't producing the image_diff, so I'm not pleased with hanging the image_diff off of it. Can we return the diff separately alongside the failures? Maybe we should pass in the TestResult object in to the compare_* routines and modify that instead?
> Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:283 > + failures.append(test_failures.FailureImageHashMismatch())
Probably a separate patch, but I wonder if we should just get rid of the FailureImageHash* failure types and just have FailureImageMismatch ?
> Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:61 > + writer.write_image_diff_files(driver_output.image_diff)
See above re: image_diff hanging off of driver_output. If we modify write_test_result() to take the TestResult object as a parameter instead of just the list of failures, then the changes I made above would allow you to just pull the existing image_diff off of that object. This can probably all be done in a follow-up patch, though.
> Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:75 > + writer.write_image_diff_files(image_diff)
similar to above.
> Tools/Scripts/webkitpy/layout_tests/port/base.py:937 > + self.image_diff = None # image_diff gets filled in after construction.
see comments above.
> Tools/Scripts/webkitpy/layout_tests/port/chromium.py:142 > # FIXME: need unit tests for this.
I think we can actually remove this comment, given that there are unit tests in port_testcase.py for this.
> Tools/Scripts/webkitpy/layout_tests/port/chromium_unittest.py:229 > + self.assertEquals(None, port.diff_image("EXPECTED", "ACTUAL"))
I wonder why these tests are here rather than in port_testcase ... I don't see anything obviously chromium-specific about them. Maybe I wrote them before I wrote the port_testcase ones, and never moved them over.
> Tools/Scripts/webkitpy/layout_tests/port/port_testcase.py:235 > + self.assertFalse(port.diff_image('', ''))
These should be testing against None explicitly. A port implementation that didn't get updated and returned False to indicate that there was a image diff would incorrectly pass here.
> Tools/Scripts/webkitpy/layout_tests/port/test.py:148 > + actual_checksum='text-image-checksum_fail-checksum')
See comment way earlier ... this should be renamed if it no longer fails.
> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:142 > + # Maybe we should throw an exception?
Obviously, the tests expect you return that there was no diff. I don't remember why I thought this was acceptable or correct, unfortunately :( Maybe there's a comment in the patch that added them?
> Tools/Scripts/webkitpy/layout_tests/port/webkit.py:190 > + return output
The re-ordering of these branches looks correct, but it would be nice if we had tests that tested these cases one way or another to know that we were fixing and not introducing bugs. Oh, and I'd like a pony.
Adam Barth
Comment 28
2011-06-30 13:58:46 PDT
(In reply to
comment #27
)
> (From update of
attachment 99340
[details]
) > Patch looks generally good.
Thanks for looking at the patch.
> > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:268 > > + self.assertEqual(regular_output.get(), [u'failures/expected/checksum.html\n\n']) > > Both you and Eric seem to be unwrapping lines as you go along in these patches, as well as making actual changes. > > Since I've had this feedback on my patches, I'll pass it along: it would be nice if you separated out the formatting changes from the semantic changes into different patches. In addition, since there's nothing actually wrong with the wrapped code, it seems like these changes are kinda unnecessary (one could argue that I would be just as justified in submitting patches that added the line breaks back in, because I find the old way easier to read ...).
We're following PEP8, with the exception of the line limit, where we're following WebKit style. In any case, if our biggest problem is line length, then we're in good shape.
> > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:426 > > + 'failures/unexpected/checksum-with-matching-image.html', > > If this test no longer fails, we should rename it to passes/checksum-with-matching-image.html
Ok.
> > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:431 > > + self.assertTrue(fs.read_text_file('/tmp/layout-test-results/unexpected_results.json').find('"num_regressions":0') != -1) > > res contains the # of unexpected failures. Just do self.assertEquals(res, 0).
The two seem interchangable. What the advantage of one or the other?
> > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:276 > > + if not expected_driver_output.image: > > This is concerning ... if these typos weren't causing failures in the test suite before, then these lines weren't being executed. We should figure out why that is/was, and if it's no longer true.
I just renamed a bound variable.
> Do you know why they were executing before and that they now are? > > > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:281 > > + driver_output.image_diff = self._port.diff_image(driver_output.image, expected_driver_output.image) > > The driver isn't producing the image_diff, so I'm not pleased with hanging the image_diff off of it. > > Can we return the diff separately alongside the failures? Maybe we should pass in the TestResult object in to the compare_* routines and modify that instead?
Yeah, this isn't ideal, but that's not a blocking issue. Hence, it's a problem for another day.
> > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:283 > > + failures.append(test_failures.FailureImageHashMismatch()) > > Probably a separate patch, but I wonder if we should just get rid of the FailureImageHash* failure types and just have FailureImageMismatch ?
We're going to have to rip out the multiple representations of the result of running a test and replace it with something consistent. We should do that then.
> > Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:61 > > + writer.write_image_diff_files(driver_output.image_diff) > > See above re: image_diff hanging off of driver_output. If we modify write_test_result() to take the TestResult object as a parameter instead of just the list of failures, then the changes I made above would allow you to just pull the existing image_diff off of that object. This can probably all be done in a follow-up patch, though.
This all needs to be re-architected globally.
> > Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:75 > > + writer.write_image_diff_files(image_diff) > > similar to above. > > > Tools/Scripts/webkitpy/layout_tests/port/base.py:937 > > + self.image_diff = None # image_diff gets filled in after construction. > > see comments above. > > > Tools/Scripts/webkitpy/layout_tests/port/chromium.py:142 > > # FIXME: need unit tests for this. > > I think we can actually remove this comment, given that there are unit tests in port_testcase.py for this.
Please feel to write such a patch.
> > Tools/Scripts/webkitpy/layout_tests/port/chromium_unittest.py:229 > > + self.assertEquals(None, port.diff_image("EXPECTED", "ACTUAL")) > > I wonder why these tests are here rather than in port_testcase ... I don't see anything obviously chromium-specific about them. Maybe I wrote them before I wrote the port_testcase ones, and never moved them over.
Dunno.
> > Tools/Scripts/webkitpy/layout_tests/port/port_testcase.py:235 > > + self.assertFalse(port.diff_image('', '')) > > These should be testing against None explicitly. A port implementation that didn't get updated and returned False to indicate that there was a image diff would incorrectly pass here.
Why is that an incorrect pass? Returning anything falsy is probably ok.
> > Tools/Scripts/webkitpy/layout_tests/port/test.py:148 > > + actual_checksum='text-image-checksum_fail-checksum') > > See comment way earlier ... this should be renamed if it no longer fails. > > > Tools/Scripts/webkitpy/layout_tests/port/webkit.py:142 > > + # Maybe we should throw an exception? > > Obviously, the tests expect you return that there was no diff. I don't remember why I thought this was acceptable or correct, unfortunately :( Maybe there's a comment in the patch that added them?
The problem is that they're returning that there is a diff (which makes very little sense)!
> > Tools/Scripts/webkitpy/layout_tests/port/webkit.py:190 > > + return output > > The re-ordering of these branches looks correct, but it would be nice if we had tests that tested these cases one way or another to know that we were fixing and not introducing bugs. Oh, and I'd like a pony.
I don't think I changed any behavior here.
Dirk Pranke
Comment 29
2011-06-30 15:13:58 PDT
(In reply to
comment #28
)
> (In reply to
comment #27
) > > (From update of
attachment 99340
[details]
[details]) > > Patch looks generally good. > > Thanks for looking at the patch. > > > > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:268 > > > + self.assertEqual(regular_output.get(), [u'failures/expected/checksum.html\n\n']) > > > > Both you and Eric seem to be unwrapping lines as you go along in these patches, as well as making actual changes. > > > > Since I've had this feedback on my patches, I'll pass it along: it would be nice if you separated out the formatting changes from the semantic changes into different patches. In addition, since there's nothing actually wrong with the wrapped code, it seems like these changes are kinda unnecessary (one could argue that I would be just as justified in submitting patches that added the line breaks back in, because I find the old way easier to read ...). > > We're following PEP8, with the exception of the line limit, where we're following WebKit style. > > In any case, if our biggest problem is line length, then we're in good shape. >
Yes. I'm mostly just teasing.
> > > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:426 > > > + 'failures/unexpected/checksum-with-matching-image.html', > > > > If this test no longer fails, we should rename it to passes/checksum-with-matching-image.html > > Ok. > > > > Tools/Scripts/webkitpy/layout_tests/run_webkit_tests_integrationtest.py:431 > > > + self.assertTrue(fs.read_text_file('/tmp/layout-test-results/unexpected_results.json').find('"num_regressions":0') != -1) > > > > res contains the # of unexpected failures. Just do self.assertEquals(res, 0). > > The two seem interchangable. What the advantage of one or the other? >
It's shorter, faster, clearer (IMO), and does not have a dependency on the format of the unexpected_results.json file.
> > > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:276 > > > + if not expected_driver_output.image: > > > > This is concerning ... if these typos weren't causing failures in the test suite before, then these lines weren't being executed. We should figure out why that is/was, and if it's no longer true. > > I just renamed a bound variable. >
Yes, my point was that you found what are at least gaps in test coverage and perhaps actual bugs. We should figure out what those are.
> > Do you know why they were executing before and that they now are? > > > > > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:281 > > > + driver_output.image_diff = self._port.diff_image(driver_output.image, expected_driver_output.image) > > > > The driver isn't producing the image_diff, so I'm not pleased with hanging the image_diff off of it. > > > > Can we return the diff separately alongside the failures? Maybe we should pass in the TestResult object in to the compare_* routines and modify that instead? > > Yeah, this isn't ideal, but that's not a blocking issue. Hence, it's a problem for another day. > > > > Tools/Scripts/webkitpy/layout_tests/layout_package/single_test_runner.py:283 > > > + failures.append(test_failures.FailureImageHashMismatch()) > > > > Probably a separate patch, but I wonder if we should just get rid of the FailureImageHash* failure types and just have FailureImageMismatch ? > > We're going to have to rip out the multiple representations of the result of running a test and replace it with something consistent. We should do that then. > > > > Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:61 > > > + writer.write_image_diff_files(driver_output.image_diff) > > > > See above re: image_diff hanging off of driver_output. If we modify write_test_result() to take the TestResult object as a parameter instead of just the list of failures, then the changes I made above would allow you to just pull the existing image_diff off of that object. This can probably all be done in a follow-up patch, though. > > This all needs to be re-architected globally.
I don't think you should dismiss my comments quite so lightly. You are introducing a design flaw in the code (by confusing what DriverOutput is doing). My suggestions would be straightforward to fix that and would hardly be a "global re-architecting".
> > > > Tools/Scripts/webkitpy/layout_tests/layout_package/test_result_writer.py:75 > > > + writer.write_image_diff_files(image_diff) > > > > similar to above. > > > > > Tools/Scripts/webkitpy/layout_tests/port/base.py:937 > > > + self.image_diff = None # image_diff gets filled in after construction. > > > > see comments above. > > > > > Tools/Scripts/webkitpy/layout_tests/port/chromium.py:142 > > > # FIXME: need unit tests for this. > > > > I think we can actually remove this comment, given that there are unit tests in port_testcase.py for this. > > Please feel to write such a patch. > > > > Tools/Scripts/webkitpy/layout_tests/port/chromium_unittest.py:229 > > > + self.assertEquals(None, port.diff_image("EXPECTED", "ACTUAL")) > > > > I wonder why these tests are here rather than in port_testcase ... I don't see anything obviously chromium-specific about them. Maybe I wrote them before I wrote the port_testcase ones, and never moved them over. > > Dunno. > > > > Tools/Scripts/webkitpy/layout_tests/port/port_testcase.py:235 > > > + self.assertFalse(port.diff_image('', '')) > > > > These should be testing against None explicitly. A port implementation that didn't get updated and returned False to indicate that there was a image diff would incorrectly pass here. > > Why is that an incorrect pass? Returning anything falsy is probably ok. >
The point is that by testing against False, you are missing cases where the routine returns False instead of a string or None (certainly a bug), and missing cases where it's returning a diff of '' instead of None (probalby a bug).
> > > Tools/Scripts/webkitpy/layout_tests/port/test.py:148 > > > + actual_checksum='text-image-checksum_fail-checksum') > > > > See comment way earlier ... this should be renamed if it no longer fails. > > > > > Tools/Scripts/webkitpy/layout_tests/port/webkit.py:142 > > > + # Maybe we should throw an exception? > > > > Obviously, the tests expect you return that there was no diff. I don't remember why I thought this was acceptable or correct, unfortunately :( Maybe there's a comment in the patch that added them? > > The problem is that they're returning that there is a diff (which makes very little sense)! >
If one file exists and is a valid image, and the other exists but is empty (what this branch is testing), I would say that that is a (sensible) diff.
> > > Tools/Scripts/webkitpy/layout_tests/port/webkit.py:190 > > > + return output > > > > The re-ordering of these branches looks correct, but it would be nice if we had tests that tested these cases one way or another to know that we were fixing and not introducing bugs. Oh, and I'd like a pony. > > I don't think I changed any behavior here.
Well, you clearly changed the logic of the code, and that logic should be visibly (testably) different, even if current tests don't happen to illustrate that. It might be nice to have such tests, but I don't think it's super important (hence the pony comment).
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug