Bug 57481 - [chromium] update the rebaseline tool to know about pngs with checksums
Summary: [chromium] update the rebaseline tool to know about pngs with checksums
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: Tony Chang
URL:
Keywords:
Depends on:
Blocks: 56286
  Show dependency treegraph
 
Reported: 2011-03-30 11:22 PDT by Tony Chang
Modified: 2011-04-01 10:29 PDT (History)
6 users (show)

See Also:


Attachments
Patch (6.56 KB, patch)
2011-03-30 11:25 PDT, Tony Chang
no flags Details | Formatted Diff | Diff
Patch (6.92 KB, patch)
2011-03-30 13:49 PDT, Tony Chang
no flags Details | Formatted Diff | Diff
Patch (10.49 KB, patch)
2011-03-31 10:53 PDT, Tony Chang
ojan: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tony Chang 2011-03-30 11:22:52 PDT
[chromium] update the rebaseline tool to know about pngs with checksums
Comment 1 Tony Chang 2011-03-30 11:25:10 PDT
Created attachment 87578 [details]
Patch
Comment 2 Dirk Pranke 2011-03-30 11:49:46 PDT
Comment on attachment 87578 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=87578&action=review

> Tools/Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py:399
> +

So the logic here is that if the checksum is embedded in the PNG, we don't need the .checksum file?

If it isn't, then presumably either the PNG doesn't have a checksum and we behave as before (which is fine), or it does have a checksum and they don't match. What should happen in that case, and how can you distinguish it from the other?

Also, what happens if the file we're rebaselining is in a directory shared with other ports (e.g. platform/mac, or a generic file) that may not support embedded PNGs?
Comment 3 Tony Chang 2011-03-30 11:58:20 PDT
(In reply to comment #2)
> (From update of attachment 87578 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=87578&action=review
> 
> > Tools/Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py:399
> > +
> 
> So the logic here is that if the checksum is embedded in the PNG, we don't need the .checksum file?

Yes.

> If it isn't, then presumably either the PNG doesn't have a checksum and we behave as before (which is fine), or it does have a checksum and they don't match. What should happen in that case, and how can you distinguish it from the other?

If the checksums don't match, we prefer the .checksum file and go ahead and copy it into place.  This matches NRWT which also prefers the .checksum file to the png.

In practice, I don't think this can happen, but if it does, I think the user should just use the .checksum file and move on.

> Also, what happens if the file we're rebaselining is in a directory shared with other ports (e.g. platform/mac, or a generic file) that may not support embedded PNGs?

I thought this script only writes to platform/chromium* directories.  How do I use the script to write files to platform/mac?
Comment 4 Tony Chang 2011-03-30 12:01:07 PDT
(In reply to comment #3)
> I thought this script only writes to platform/chromium* directories.  How do I use the script to write files to platform/mac?

Oh, I see, there's a --target-platform flag.  Does any port other than chromium use this script (I mean, it has chromium in the file name)?  Wouldn't they just use webkit-patch rebaseline?
Comment 5 Dirk Pranke 2011-03-30 12:07:22 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > (From update of attachment 87578 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=87578&action=review
> > 
> > > Tools/Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py:399
> > > +
> > 
> > So the logic here is that if the checksum is embedded in the PNG, we don't need the .checksum file?
> 
> Yes.
> 
> > If it isn't, then presumably either the PNG doesn't have a checksum and we behave as before (which is fine), or it does have a checksum and they don't match. What should happen in that case, and how can you distinguish it from the other?
> 
> If the checksums don't match, we prefer the .checksum file and go ahead and copy it into place.  This matches NRWT which also prefers the .checksum file to the png.
> 
> In practice, I don't think this can happen, but if it does, I think the user should just use the .checksum file and move on.
> 

Hm. I wonder if raising an error and bailing out might be better in this case.

> > Also, what happens if the file we're rebaselining is in a directory shared with other ports (e.g. platform/mac, or a generic file) that may not support embedded PNGs?
> 
> I thought this script only writes to platform/chromium* directories.  How do I use the script to write files to platform/mac?

Ah, you're right. I was thinking it would update the generic baseline if it needed to, but it doesn't, it writes to the baseline_path() and counts on de-duping later bubble the files up.

While the tool theoretically supports non-chromium ports, I don't think anyone actually uses it for that, and theoretically the same logic should apply there anyway, so I wasn't worried about that.
Comment 6 Dirk Pranke 2011-03-30 12:08:03 PDT
(In reply to comment #4)
> (In reply to comment #3)
> > I thought this script only writes to platform/chromium* directories.  How do I use the script to write files to platform/mac?
> 
> Oh, I see, there's a --target-platform flag.  Does any port other than chromium use this script (I mean, it has chromium in the file name)?  Wouldn't they just use webkit-patch rebaseline?

I don't know if anyone does (or should) use the --target-platform flag. Frankly, I should probably remove all of the flags that likely lead to broken behavior.
Comment 7 Tony Chang 2011-03-30 12:11:22 PDT
(In reply to comment #5)
> (In reply to comment #3)
> > If the checksums don't match, we prefer the .checksum file and go ahead and copy it into place.  This matches NRWT which also prefers the .checksum file to the png.
> > 
> > In practice, I don't think this can happen, but if it does, I think the user should just use the .checksum file and move on.
> > 
> 
> Hm. I wonder if raising an error and bailing out might be better in this case.

Raising an error is easy, but what would the user do in this case?  For someone gardening, I think it's easier to just check in the file that will cause the tree to go green.

When the time comes to try to actively remove all .checksum files from the tree, we can verify that these all match and handle the mismatch files on a case by case basis.
Comment 8 Dirk Pranke 2011-03-30 12:28:32 PDT
(In reply to comment #7)
> (In reply to comment #5)
> > (In reply to comment #3)
> > > If the checksums don't match, we prefer the .checksum file and go ahead and copy it into place.  This matches NRWT which also prefers the .checksum file to the png.
> > > 
> > > In practice, I don't think this can happen, but if it does, I think the user should just use the .checksum file and move on.
> > > 
> > 
> > Hm. I wonder if raising an error and bailing out might be better in this case.
> 
> Raising an error is easy, but what would the user do in this case?  For someone gardening, I think it's easier to just check in the file that will cause the tree to go green.
> 

Well, let's consider what the implications of a checksum mismatch would be.

1) The .checksum file is wrong. Presumably DRT is either somehow generating the checksum two different ways, or somehow the wrong checksum is getting copied into the .checksum file (e.g., from a bug in NRWT). In this case, subsequent NRWT runs will complain about a hash mismatch. Not the end of the world, but the tree still won't be green, and we'll be papering over a bug.

On the other hand, if we raised an error, the gardener would either have to roll out the change that caused this to fail, or suppress it in test_expectations instead of rebaselining. 

2) The embedded checksum is wrong. No big deal, because we'll ignore it until we eventually delete the .checksums, at which point the bug will be revealed. Again, we'd be ignoring a bug.

In either case, papering over a bug seems bad.

> When the time comes to try to actively remove all .checksum files from the tree, we can verify that these all match and handle the mismatch files on a case by case basis.

Which will be long after it actually happened, presumably, making it harder to figure out what the heck happened.
Comment 9 Tony Chang 2011-03-30 12:53:36 PDT
(In reply to comment #8)
> 1) The .checksum file is wrong. Presumably DRT is either somehow generating the checksum two different ways, or somehow the wrong checksum is getting copied into the .checksum file (e.g., from a bug in NRWT). In this case, subsequent NRWT runs will complain about a hash mismatch. Not the end of the world, but the tree still won't be green, and we'll be papering over a bug.
> 
> On the other hand, if we raised an error, the gardener would either have to roll out the change that caused this to fail, or suppress it in test_expectations instead of rebaselining. 

If the .checksum file is wrong, then this is no different than before this change.  As in, the gardener would still check in the .checksum file and the tree would stay red.

> 2) The embedded checksum is wrong. No big deal, because we'll ignore it until we eventually delete the .checksums, at which point the bug will be revealed. Again, we'd be ignoring a bug.
> 
> In either case, papering over a bug seems bad.

I agree that papering over a bug seems bad, I don't think raising an error about this helps fix either error because it's not clear which checksum is wrong.  As a gardener, I don't think one has the time to investigate this and should just try to get the tree green.

How do you anticipate the gardener would figure out what caused the error and fix it?

Both of these errors exist today, you just can't see them.  It won't be until we try to eliminate all .checksum files from the tree that we will see the existing errors.
Comment 10 Dirk Pranke 2011-03-30 13:23:39 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > 1) The .checksum file is wrong. Presumably DRT is either somehow generating the checksum two different ways, or somehow the wrong checksum is getting copied into the .checksum file (e.g., from a bug in NRWT). In this case, subsequent NRWT runs will complain about a hash mismatch. Not the end of the world, but the tree still won't be green, and we'll be papering over a bug.
> > 
> > On the other hand, if we raised an error, the gardener would either have to roll out the change that caused this to fail, or suppress it in test_expectations instead of rebaselining. 
> 
> If the .checksum file is wrong, then this is no different than before this change.  As in, the gardener would still check in the .checksum file and the tree would stay red.
> 

Well, since the (presumably correct) checksum is now embedded in the file, we can now notice and catch bugs in NRWT that we couldn't before, so there is a difference.

> > 2) The embedded checksum is wrong. No big deal, because we'll ignore it until we eventually delete the .checksums, at which point the bug will be revealed. Again, we'd be ignoring a bug.
> > 
> > In either case, papering over a bug seems bad.
> 
> I agree that papering over a bug seems bad, I don't think raising an error about this helps fix either error because it's not clear which checksum is wrong.  As a gardener, I don't think one has the time to investigate this and should just try to get the tree green.

I agree that the gardener may not have the time or knowledge to investigate this, but checking in a potentially incorrect file seems like the wrong way to work around it. Better to add a line to test_expectations.txt so that the failure is less likely to be forgotten.

> 
> How do you anticipate the gardener would figure out what caused the error and fix it?
> 

The same way he or she finds any other cause of a problem ... by investigating and/or asking around.

> Both of these errors exist today, you just can't see them.  It won't be until we try to eliminate all .checksum files from the tree that we will see the existing errors.

Not true; see above.
Comment 11 Dirk Pranke 2011-03-30 13:29:50 PDT
Note that admittedly this is a pretty minor issue and I don't want to beat it to death.
Comment 12 Tony Chang 2011-03-30 13:49:10 PDT
Created attachment 87607 [details]
Patch
Comment 13 Tony Chang 2011-03-30 13:51:47 PDT
(In reply to comment #12)
> Created an attachment (id=87607) [details]
> Patch

The only difference with the last patch is that this will now log an error if the .checksum and checksum in png file don't match.  In practice, I don't think this will happen and if it does, the gardener can notice when the change has both a .checksum and .png file.

In general, I think we should try to make the tools pick reasonable default actions and not prompt the user.
Comment 14 Dirk Pranke 2011-03-30 14:04:46 PDT
(In reply to comment #13)
> (In reply to comment #12)
> > Created an attachment (id=87607) [details] [details]
> > Patch
> 
> The only difference with the last patch is that this will now log an error if the .checksum and checksum in png file don't match.  In practice, I don't think this will happen and if it does, the gardener can notice when the change has both a .checksum and .png file.
> 
> In general, I think we should try to make the tools pick reasonable default actions and not prompt the user.

This looks fine. I agree that this shouldn't happen in practice. In fact, I think it's the very definition of "exceptional" and I'd probably be fine with the tool aborting.

That said, it occurred to me that we should just have a presubmit hook that checked that pngs do in fact have the checksums expected by the .checksum files (or maybe even the comments as well, although I don't know if I'm that paranoid). Does that seem like a good idea?
Comment 15 Dirk Pranke 2011-03-30 14:06:16 PDT
(In reply to comment #13)
> In general, I think we should try to make the tools pick reasonable default actions and not prompt the user.

I agree. However, in this case it's not obvious what the default action should be, or that there should even be one, because clearly something's quite wrong.
Comment 16 Tony Chang 2011-03-31 10:53:48 PDT
Created attachment 87764 [details]
Patch
Comment 17 Tony Chang 2011-03-31 10:54:18 PDT
(In reply to comment #16)
> Created an attachment (id=87764) [details]
> Patch

Fix a bug (pointed out by Ryosuke) where we don't delete stale .checksum files when writing new pngs.
Comment 18 Dirk Pranke 2011-03-31 12:12:13 PDT
Comment on attachment 87764 [details]
Patch

LGTM. One minor comment about a related issue ...

View in context: https://bugs.webkit.org/attachment.cgi?id=87764&action=review



> Tools/Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py:400
> +                continue

Do we need to put this logic into NRWT for --reset-results and --new-baseline as well?
Comment 19 Tony Chang 2011-03-31 14:04:23 PDT
(In reply to comment #18)
> (From update of attachment 87764 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=87764&action=review
> > Tools/Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py:400
> > +                continue
> 
> Do we need to put this logic into NRWT for --reset-results and --new-baseline as well?

Yes, that's right.  Bug 57575 filed.
Comment 20 Ojan Vafai 2011-03-31 17:28:50 PDT
Comment on attachment 87764 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=87764&action=review

> Tools/Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py:429
> +    def _png_has_same_checksum(self, temp_name, checksum_expected_fullpath):

I know this is from the calling code, but temp_name is not the most clear variable name. temp_file_name? temp_file_path?

> Tools/Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py:435
> +            return False

Maybe log an error here as well? If a checksum file exists, it's an error that an associated png file doesn't, right?
Comment 21 Tony Chang 2011-04-01 09:37:55 PDT
Committed r82688: <http://trac.webkit.org/changeset/82688>
Comment 22 WebKit Review Bot 2011-04-01 10:29:23 PDT
http://trac.webkit.org/changeset/82688 might have broken SnowLeopard Intel Release (Tests)
The following tests are not passing:
platform/mac/accessibility/aria-treegrid.html