84762 – webkit-patch rebaseline-test is retrieving stale expectations

RESOLVED FIXED 84762

webkit-patch rebaseline-test is retrieving stale expectations

https://bugs.webkit.org/show_bug.cgi?id=84762

Summary webkit-patch rebaseline-test is retrieving stale expectations

Dirk Pranke

Reported 2012-04-24 14:04:56 PDT

webkit-patch rebaseline-test is fetching individual files from the layout_tests archive directories, rather than fetching them from the most recent zipfile. Since we don't empty out / clobber the layout_test_results directory before extracting the build into it, it's easy for there to be stale results in the directory (in fact, the flakiness dashboard leverages this to show you the "last failure on the bot"). There are probably three different ways to fix this: The easiest and probably least invasive is to fetch the zip file and then extract the member instead of fetching the member. This is what rebaseline-chromium-webkit-tests used to do, and it's guaranteed to be correct. It's also faster if you realize that you can cache the zip file locally and are rebaselining a lot of files at once (although adding caching adds complexity, obviously). The second would be to clobber the results directory - this would break the "feature" of the flakiness dashboard, which can be confusing but is also useful. The third would be to modify the tools to be smart enough to realize that the test was only failing with (say) an IMAGE failure and to then ignore the .txt file.

Attachments
Patch (5.83 KB, patch) 2012-04-24 14:57 PDT, Dirk Pranke	no flags	Details Formatted Diff Diff
Now with updated tests :) (15.62 KB, patch) 2012-04-24 16:28 PDT, Dirk Pranke	no flags	Details Formatted Diff Diff
update w/ more review feedback - patch for landing (15.44 KB, patch) 2012-04-24 17:04 PDT, Dirk Pranke	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Ojan Vafai

Comment 1 2012-04-24 14:11:31 PDT

As I said offline, my preference would be to do the zip-file + caching approach. In addition to fixing this bug, it would also make bulk rebaselines many orders of magnitude faster. When I was rebaselining for the Chromium Lion port, I would queue up a couple hundred rebaselines, then wait 30-60 minutes for garden-o-matic to catch up. That said, I think we should also modify garden-o-matic to only try to rebaseline the expectations type that is failing (e.g. only the png if it's a IMAGE failure), but that's a separate bug. Notably, I think we want UI in garden-o-matic to just rebaseline one of the txt or the png for cases where one of them is correct.

Dirk Pranke

Comment 2 2012-04-24 14:57:16 PDT

Created attachment 138655 [details] Patch

Dirk Pranke

Comment 3 2012-04-24 14:58:39 PDT

Patch uploaded, but this is intended solely as a workaround until abarth can chime in on this.

Ojan Vafai

Comment 4 2012-04-24 15:40:35 PDT

Comment on attachment 138655 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=138655&action=review > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:143 > + # correct results. Also, this fetch_from_zip test is really sleazy because it > + # is bypassing all of the unit tests ... > + fetch_from_zip = self._tool.__class__.__name__ != 'MockTool' > + if fetch_from_zip: Can't we just update the tests with whatever the new output is? I don't think we should be checking things like this in even temporarily. Some percentage of code rot like this tends to stick around a lot longer than expected.

Dirk Pranke

Comment 5 2012-04-24 15:52:59 PDT

(In reply to comment #4) > Can't we just update the tests with whatever the new output is? I don't think we should be checking things like this in even temporarily. Bah :)

Dirk Pranke

Comment 6 2012-04-24 16:28:09 PDT

Created attachment 138680 [details] Now with updated tests :)

Dirk Pranke

Comment 7 2012-04-24 16:28:44 PDT

Comment on attachment 138680 [details] Now with updated tests :) Ojan is a harsh task master.

Ojan Vafai

Comment 8 2012-04-24 16:38:29 PDT

Comment on attachment 138680 [details] Now with updated tests :) View in context: https://bugs.webkit.org/attachment.cgi?id=138680&action=review Thanks for fixing up the tests. > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:144 > + # correct results. Also, this fetch_from_zip test is really sleazy because it > + # is bypassing all of the unit tests ... Nit: comment outdated. > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:146 > + zip_url = self._results_url(builder_name).replace('results/layout-test-results', 'layout-test-results.zip') Now that we don't have the else clause, this is the only use of this method. Maybe make that method do the right thing? If you hvae it call results_url instead of accumulated_results_url and then append layout-test-results.zip, I think it'll do what you want. > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:152 > + print " Found " + member_name In the spirit of keeping the log output minimal, I'm not sure this is really useful. If the rebaseline continues without error, we can assume it's found. The error-case below is good though.

Dirk Pranke

Comment 9 2012-04-24 16:51:55 PDT

(In reply to comment #8) > (From update of attachment 138680 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=138680&action=review > > Thanks for fixing up the tests. > > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:144 > > + # correct results. Also, this fetch_from_zip test is really sleazy because it > > + # is bypassing all of the unit tests ... > > Nit: comment outdated. > Good catch. > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:146 > > + zip_url = self._results_url(builder_name).replace('results/layout-test-results', 'layout-test-results.zip') > > Now that we don't have the else clause, this is the only use of this method. Maybe make that method do the right thing? If you hvae it call results_url instead of accumulated_results_url and then append layout-test-results.zip, I think it'll do what you want. Done. > > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:152 > > + print " Found " + member_name > > In the spirit of keeping the log output minimal, I'm not sure this is really useful. If the rebaseline continues without error, we can assume it's found. The error-case below is good though. The error case is not really an error case (since we look for a 'wav' file for every test). If you want to reduce what we log, I think it would be better to only log the ones we do find. (Will be less verbose for text-only tests, at least).

Ojan Vafai

Comment 10 2012-04-24 16:59:09 PDT

Comment on attachment 138680 [details] Now with updated tests :) View in context: https://bugs.webkit.org/attachment.cgi?id=138680&action=review >>> Tools/Scripts/webkitpy/tool/commands/rebaseline.py:152 >>> + print " Found " + member_name >> >> In the spirit of keeping the log output minimal, I'm not sure this is really useful. If the rebaseline continues without error, we can assume it's found. The error-case below is good though. > > The error case is not really an error case (since we look for a 'wav' file for every test). If you want to reduce what we log, I think it would be better to only log the ones we do find. (Will be less verbose for text-only tests, at least). oic. I guess this is better output than the old output. I'm fine with it as is. We can always tweak it later if we think it's too verbose.

Dirk Pranke

Comment 11 2012-04-24 17:01:57 PDT

(In reply to comment #9) > > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:146 > > > + zip_url = self._results_url(builder_name).replace('results/layout-test-results', 'layout-test-results.zip') > > > > Now that we don't have the else clause, this is the only use of this method. Maybe make that method do the right thing? If you hvae it call results_url instead of accumulated_results_url and then append layout-test-results.zip, I think it'll do what you want. > Actually, it looks like results_url() doesn't return what you would want (it omits the "f" directive and points you to a "results" dir).

Dirk Pranke

Comment 12 2012-04-24 17:04:05 PDT

Created attachment 138694 [details] update w/ more review feedback - patch for landing

Dirk Pranke

Comment 13 2012-04-24 17:11:04 PDT

Committed r115142: <http://trac.webkit.org/changeset/115142>

Eric Seidel (no email)

Comment 14 2012-05-10 22:59:09 PDT

I think the caching mechanism either doesn't work, or the idea of having to ever download a 34mb zip file in order to rebasline is broken. :) Or possibly both. :)

Ojan Vafai

Comment 15 2012-05-11 00:07:07 PDT

(In reply to comment #14) > I think the caching mechanism either doesn't work, or the idea of having to ever download a 34mb zip file in order to rebasline is broken. :) Or possibly both. :) Well there is no caching mechanism yet. The hypothetical caching mechanism could make this a lot faster though. :) It is a problem though. rebaselines are really slow with these larger zip files. IMO, slow+correct is better than fast+mostly correct. But, it'd be good to get the caching in soon.

Ojan Vafai

Comment 16 2012-05-11 00:08:10 PDT

(In reply to comment #15) > (In reply to comment #14) > > I think the caching mechanism either doesn't work, or the idea of having to ever download a 34mb zip file in order to rebasline is broken. :) Or possibly both. :) > > Well there is no caching mechanism yet. The hypothetical caching mechanism could make this a lot faster though. :) > > It is a problem though. rebaselines are really slow with these larger zip files. IMO, slow+correct is better than fast+mostly correct. But, it'd be good to get the caching in soon. Also, theoretically, once we have the caching, rebaselines will be much faster than they were in the old scheme, not just faster than the current slowness.

Eric Seidel (no email)

Comment 17 2012-05-11 00:15:20 PDT

I suspect that the sydney office may beg to differ with your assessment that the curent situation is better than 2 weeks ago. :)

Adam Barth

Comment 18 2012-05-11 00:37:44 PDT

@noel Perhaps this is what caused Bug 86171

noel gordon

Comment 19 2012-05-11 02:21:35 PDT

Yes I believe so.

Dirk Pranke

Comment 20 2012-05-11 10:20:38 PDT

Sounds like we should either revert this, or add a flag to get the old behavior back while we're waiting for a better solution.

Adam Barth

Comment 21 2012-05-11 10:27:02 PDT

I'd be inclined to roll back the change given the performance impact. We might want to try a different approach for solving this problem, for example by having garden-o-matic indiciate which results it wants the script to download.

Dirk Pranke

Comment 22 2012-05-11 10:36:08 PDT

Okay, I will roll it out until I can come up with a better solution.

Dirk Pranke

Comment 23 2012-05-11 10:42:36 PDT

Rollout posted in bug 86231 .

Dirk Pranke

Comment 24 2012-05-11 12:40:03 PDT

Okay, i've posted a patch for garden-o-matic to tell webkit-patch which types of results to rebaseline in bug 86242 ...

noel gordon

Comment 25 2012-05-12 03:23:55 PDT

Question: would it be possible to extract needed files from the .zip on the server-side?

Dirk Pranke

Comment 26 2012-05-13 09:41:28 PDT

(In reply to comment #25) > Question: would it be possible to extract needed files from the .zip on the server-side? Not the way things are currently set up; the server is a dumb file server. However, bug 86242 should fix things.

Dirk Pranke

Comment 27 2012-05-18 15:09:08 PDT

closing ... this should be fixed now.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

Dirk Pranke

Reported

2012-04-24 14:04 PDT

Modified

2012-05-18 15:09 PDT History

CC List

5 users Show

URL

Keywords

Depends on

86242

Blocks

86171

Dependencies

tree graph