Bug 84762

Summary: webkit-patch rebaseline-test is retrieving stale expectations
Product: WebKit Reporter: Dirk Pranke <dpranke>
Component: Tools / TestsAssignee: Dirk Pranke <dpranke>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, danakj, eric, noel.gordon, ojan
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 86242    
Bug Blocks: 86171    
Attachments:
Description Flags
Patch
none
Now with updated tests :)
none
update w/ more review feedback - patch for landing none

Dirk Pranke
Reported 2012-04-24 14:04:56 PDT
webkit-patch rebaseline-test is fetching individual files from the layout_tests archive directories, rather than fetching them from the most recent zipfile. Since we don't empty out / clobber the layout_test_results directory before extracting the build into it, it's easy for there to be stale results in the directory (in fact, the flakiness dashboard leverages this to show you the "last failure on the bot"). There are probably three different ways to fix this: The easiest and probably least invasive is to fetch the zip file and then extract the member instead of fetching the member. This is what rebaseline-chromium-webkit-tests used to do, and it's guaranteed to be correct. It's also faster if you realize that you can cache the zip file locally and are rebaselining a lot of files at once (although adding caching adds complexity, obviously). The second would be to clobber the results directory - this would break the "feature" of the flakiness dashboard, which can be confusing but is also useful. The third would be to modify the tools to be smart enough to realize that the test was only failing with (say) an IMAGE failure and to then ignore the .txt file.
Attachments
Patch (5.83 KB, patch)
2012-04-24 14:57 PDT, Dirk Pranke
no flags
Now with updated tests :) (15.62 KB, patch)
2012-04-24 16:28 PDT, Dirk Pranke
no flags
update w/ more review feedback - patch for landing (15.44 KB, patch)
2012-04-24 17:04 PDT, Dirk Pranke
no flags
Ojan Vafai
Comment 1 2012-04-24 14:11:31 PDT
As I said offline, my preference would be to do the zip-file + caching approach. In addition to fixing this bug, it would also make bulk rebaselines many orders of magnitude faster. When I was rebaselining for the Chromium Lion port, I would queue up a couple hundred rebaselines, then wait 30-60 minutes for garden-o-matic to catch up. That said, I think we should also modify garden-o-matic to only try to rebaseline the expectations type that is failing (e.g. only the png if it's a IMAGE failure), but that's a separate bug. Notably, I think we want UI in garden-o-matic to just rebaseline one of the txt or the png for cases where one of them is correct.
Dirk Pranke
Comment 2 2012-04-24 14:57:16 PDT
Dirk Pranke
Comment 3 2012-04-24 14:58:39 PDT
Patch uploaded, but this is intended solely as a workaround until abarth can chime in on this.
Ojan Vafai
Comment 4 2012-04-24 15:40:35 PDT
Comment on attachment 138655 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=138655&action=review > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:143 > + # correct results. Also, this fetch_from_zip test is really sleazy because it > + # is bypassing all of the unit tests ... > + fetch_from_zip = self._tool.__class__.__name__ != 'MockTool' > + if fetch_from_zip: Can't we just update the tests with whatever the new output is? I don't think we should be checking things like this in even temporarily. Some percentage of code rot like this tends to stick around a lot longer than expected.
Dirk Pranke
Comment 5 2012-04-24 15:52:59 PDT
(In reply to comment #4) > Can't we just update the tests with whatever the new output is? I don't think we should be checking things like this in even temporarily. Bah :)
Dirk Pranke
Comment 6 2012-04-24 16:28:09 PDT
Created attachment 138680 [details] Now with updated tests :)
Dirk Pranke
Comment 7 2012-04-24 16:28:44 PDT
Comment on attachment 138680 [details] Now with updated tests :) Ojan is a harsh task master.
Ojan Vafai
Comment 8 2012-04-24 16:38:29 PDT
Comment on attachment 138680 [details] Now with updated tests :) View in context: https://bugs.webkit.org/attachment.cgi?id=138680&action=review Thanks for fixing up the tests. > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:144 > + # correct results. Also, this fetch_from_zip test is really sleazy because it > + # is bypassing all of the unit tests ... Nit: comment outdated. > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:146 > + zip_url = self._results_url(builder_name).replace('results/layout-test-results', 'layout-test-results.zip') Now that we don't have the else clause, this is the only use of this method. Maybe make that method do the right thing? If you hvae it call results_url instead of accumulated_results_url and then append layout-test-results.zip, I think it'll do what you want. > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:152 > + print " Found " + member_name In the spirit of keeping the log output minimal, I'm not sure this is really useful. If the rebaseline continues without error, we can assume it's found. The error-case below is good though.
Dirk Pranke
Comment 9 2012-04-24 16:51:55 PDT
(In reply to comment #8) > (From update of attachment 138680 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=138680&action=review > > Thanks for fixing up the tests. > > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:144 > > + # correct results. Also, this fetch_from_zip test is really sleazy because it > > + # is bypassing all of the unit tests ... > > Nit: comment outdated. > Good catch. > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:146 > > + zip_url = self._results_url(builder_name).replace('results/layout-test-results', 'layout-test-results.zip') > > Now that we don't have the else clause, this is the only use of this method. Maybe make that method do the right thing? If you hvae it call results_url instead of accumulated_results_url and then append layout-test-results.zip, I think it'll do what you want. Done. > > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:152 > > + print " Found " + member_name > > In the spirit of keeping the log output minimal, I'm not sure this is really useful. If the rebaseline continues without error, we can assume it's found. The error-case below is good though. The error case is not really an error case (since we look for a 'wav' file for every test). If you want to reduce what we log, I think it would be better to only log the ones we do find. (Will be less verbose for text-only tests, at least).
Ojan Vafai
Comment 10 2012-04-24 16:59:09 PDT
Comment on attachment 138680 [details] Now with updated tests :) View in context: https://bugs.webkit.org/attachment.cgi?id=138680&action=review >>> Tools/Scripts/webkitpy/tool/commands/rebaseline.py:152 >>> + print " Found " + member_name >> >> In the spirit of keeping the log output minimal, I'm not sure this is really useful. If the rebaseline continues without error, we can assume it's found. The error-case below is good though. > > The error case is not really an error case (since we look for a 'wav' file for every test). If you want to reduce what we log, I think it would be better to only log the ones we do find. (Will be less verbose for text-only tests, at least). oic. I guess this is better output than the old output. I'm fine with it as is. We can always tweak it later if we think it's too verbose.
Dirk Pranke
Comment 11 2012-04-24 17:01:57 PDT
(In reply to comment #9) > > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:146 > > > + zip_url = self._results_url(builder_name).replace('results/layout-test-results', 'layout-test-results.zip') > > > > Now that we don't have the else clause, this is the only use of this method. Maybe make that method do the right thing? If you hvae it call results_url instead of accumulated_results_url and then append layout-test-results.zip, I think it'll do what you want. > Actually, it looks like results_url() doesn't return what you would want (it omits the "f" directive and points you to a "results" dir).
Dirk Pranke
Comment 12 2012-04-24 17:04:05 PDT
Created attachment 138694 [details] update w/ more review feedback - patch for landing
Dirk Pranke
Comment 13 2012-04-24 17:11:04 PDT
Eric Seidel (no email)
Comment 14 2012-05-10 22:59:09 PDT
I think the caching mechanism either doesn't work, or the idea of having to ever download a 34mb zip file in order to rebasline is broken. :) Or possibly both. :)
Ojan Vafai
Comment 15 2012-05-11 00:07:07 PDT
(In reply to comment #14) > I think the caching mechanism either doesn't work, or the idea of having to ever download a 34mb zip file in order to rebasline is broken. :) Or possibly both. :) Well there is no caching mechanism yet. The hypothetical caching mechanism could make this a lot faster though. :) It is a problem though. rebaselines are really slow with these larger zip files. IMO, slow+correct is better than fast+mostly correct. But, it'd be good to get the caching in soon.
Ojan Vafai
Comment 16 2012-05-11 00:08:10 PDT
(In reply to comment #15) > (In reply to comment #14) > > I think the caching mechanism either doesn't work, or the idea of having to ever download a 34mb zip file in order to rebasline is broken. :) Or possibly both. :) > > Well there is no caching mechanism yet. The hypothetical caching mechanism could make this a lot faster though. :) > > It is a problem though. rebaselines are really slow with these larger zip files. IMO, slow+correct is better than fast+mostly correct. But, it'd be good to get the caching in soon. Also, theoretically, once we have the caching, rebaselines will be much faster than they were in the old scheme, not just faster than the current slowness.
Eric Seidel (no email)
Comment 17 2012-05-11 00:15:20 PDT
I suspect that the sydney office may beg to differ with your assessment that the curent situation is better than 2 weeks ago. :)
Adam Barth
Comment 18 2012-05-11 00:37:44 PDT
@noel Perhaps this is what caused Bug 86171
noel gordon
Comment 19 2012-05-11 02:21:35 PDT
Yes I believe so.
Dirk Pranke
Comment 20 2012-05-11 10:20:38 PDT
Sounds like we should either revert this, or add a flag to get the old behavior back while we're waiting for a better solution.
Adam Barth
Comment 21 2012-05-11 10:27:02 PDT
I'd be inclined to roll back the change given the performance impact. We might want to try a different approach for solving this problem, for example by having garden-o-matic indiciate which results it wants the script to download.
Dirk Pranke
Comment 22 2012-05-11 10:36:08 PDT
Okay, I will roll it out until I can come up with a better solution.
Dirk Pranke
Comment 23 2012-05-11 10:42:36 PDT
Rollout posted in bug 86231 .
Dirk Pranke
Comment 24 2012-05-11 12:40:03 PDT
Okay, i've posted a patch for garden-o-matic to tell webkit-patch which types of results to rebaseline in bug 86242 ...
noel gordon
Comment 25 2012-05-12 03:23:55 PDT
Question: would it be possible to extract needed files from the .zip on the server-side?
Dirk Pranke
Comment 26 2012-05-13 09:41:28 PDT
(In reply to comment #25) > Question: would it be possible to extract needed files from the .zip on the server-side? Not the way things are currently set up; the server is a dumb file server. However, bug 86242 should fix things.
Dirk Pranke
Comment 27 2012-05-18 15:09:08 PDT
closing ... this should be fixed now.
Note You need to log in before you can comment on or make changes to this bug.