webkit-patch rebaseline-test is fetching individual files from the layout_tests archive directories, rather than fetching them from the most recent zipfile. Since we don't empty out / clobber the layout_test_results directory before extracting the build into it, it's easy for there to be stale results in the directory (in fact, the flakiness dashboard leverages this to show you the "last failure on the bot"). There are probably three different ways to fix this: The easiest and probably least invasive is to fetch the zip file and then extract the member instead of fetching the member. This is what rebaseline-chromium-webkit-tests used to do, and it's guaranteed to be correct. It's also faster if you realize that you can cache the zip file locally and are rebaselining a lot of files at once (although adding caching adds complexity, obviously). The second would be to clobber the results directory - this would break the "feature" of the flakiness dashboard, which can be confusing but is also useful. The third would be to modify the tools to be smart enough to realize that the test was only failing with (say) an IMAGE failure and to then ignore the .txt file.
As I said offline, my preference would be to do the zip-file + caching approach. In addition to fixing this bug, it would also make bulk rebaselines many orders of magnitude faster. When I was rebaselining for the Chromium Lion port, I would queue up a couple hundred rebaselines, then wait 30-60 minutes for garden-o-matic to catch up. That said, I think we should also modify garden-o-matic to only try to rebaseline the expectations type that is failing (e.g. only the png if it's a IMAGE failure), but that's a separate bug. Notably, I think we want UI in garden-o-matic to just rebaseline one of the txt or the png for cases where one of them is correct.
Created attachment 138655 [details] Patch
Patch uploaded, but this is intended solely as a workaround until abarth can chime in on this.
Comment on attachment 138655 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=138655&action=review > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:143 > + # correct results. Also, this fetch_from_zip test is really sleazy because it > + # is bypassing all of the unit tests ... > + fetch_from_zip = self._tool.__class__.__name__ != 'MockTool' > + if fetch_from_zip: Can't we just update the tests with whatever the new output is? I don't think we should be checking things like this in even temporarily. Some percentage of code rot like this tends to stick around a lot longer than expected.
(In reply to comment #4) > Can't we just update the tests with whatever the new output is? I don't think we should be checking things like this in even temporarily. Bah :)
Created attachment 138680 [details] Now with updated tests :)
Comment on attachment 138680 [details] Now with updated tests :) Ojan is a harsh task master.
Comment on attachment 138680 [details] Now with updated tests :) View in context: https://bugs.webkit.org/attachment.cgi?id=138680&action=review Thanks for fixing up the tests. > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:144 > + # correct results. Also, this fetch_from_zip test is really sleazy because it > + # is bypassing all of the unit tests ... Nit: comment outdated. > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:146 > + zip_url = self._results_url(builder_name).replace('results/layout-test-results', 'layout-test-results.zip') Now that we don't have the else clause, this is the only use of this method. Maybe make that method do the right thing? If you hvae it call results_url instead of accumulated_results_url and then append layout-test-results.zip, I think it'll do what you want. > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:152 > + print " Found " + member_name In the spirit of keeping the log output minimal, I'm not sure this is really useful. If the rebaseline continues without error, we can assume it's found. The error-case below is good though.
(In reply to comment #8) > (From update of attachment 138680 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=138680&action=review > > Thanks for fixing up the tests. > > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:144 > > + # correct results. Also, this fetch_from_zip test is really sleazy because it > > + # is bypassing all of the unit tests ... > > Nit: comment outdated. > Good catch. > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:146 > > + zip_url = self._results_url(builder_name).replace('results/layout-test-results', 'layout-test-results.zip') > > Now that we don't have the else clause, this is the only use of this method. Maybe make that method do the right thing? If you hvae it call results_url instead of accumulated_results_url and then append layout-test-results.zip, I think it'll do what you want. Done. > > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:152 > > + print " Found " + member_name > > In the spirit of keeping the log output minimal, I'm not sure this is really useful. If the rebaseline continues without error, we can assume it's found. The error-case below is good though. The error case is not really an error case (since we look for a 'wav' file for every test). If you want to reduce what we log, I think it would be better to only log the ones we do find. (Will be less verbose for text-only tests, at least).
Comment on attachment 138680 [details] Now with updated tests :) View in context: https://bugs.webkit.org/attachment.cgi?id=138680&action=review >>> Tools/Scripts/webkitpy/tool/commands/rebaseline.py:152 >>> + print " Found " + member_name >> >> In the spirit of keeping the log output minimal, I'm not sure this is really useful. If the rebaseline continues without error, we can assume it's found. The error-case below is good though. > > The error case is not really an error case (since we look for a 'wav' file for every test). If you want to reduce what we log, I think it would be better to only log the ones we do find. (Will be less verbose for text-only tests, at least). oic. I guess this is better output than the old output. I'm fine with it as is. We can always tweak it later if we think it's too verbose.
(In reply to comment #9) > > > Tools/Scripts/webkitpy/tool/commands/rebaseline.py:146 > > > + zip_url = self._results_url(builder_name).replace('results/layout-test-results', 'layout-test-results.zip') > > > > Now that we don't have the else clause, this is the only use of this method. Maybe make that method do the right thing? If you hvae it call results_url instead of accumulated_results_url and then append layout-test-results.zip, I think it'll do what you want. > Actually, it looks like results_url() doesn't return what you would want (it omits the "f" directive and points you to a "results" dir).
Created attachment 138694 [details] update w/ more review feedback - patch for landing
Committed r115142: <http://trac.webkit.org/changeset/115142>
I think the caching mechanism either doesn't work, or the idea of having to ever download a 34mb zip file in order to rebasline is broken. :) Or possibly both. :)
(In reply to comment #14) > I think the caching mechanism either doesn't work, or the idea of having to ever download a 34mb zip file in order to rebasline is broken. :) Or possibly both. :) Well there is no caching mechanism yet. The hypothetical caching mechanism could make this a lot faster though. :) It is a problem though. rebaselines are really slow with these larger zip files. IMO, slow+correct is better than fast+mostly correct. But, it'd be good to get the caching in soon.
(In reply to comment #15) > (In reply to comment #14) > > I think the caching mechanism either doesn't work, or the idea of having to ever download a 34mb zip file in order to rebasline is broken. :) Or possibly both. :) > > Well there is no caching mechanism yet. The hypothetical caching mechanism could make this a lot faster though. :) > > It is a problem though. rebaselines are really slow with these larger zip files. IMO, slow+correct is better than fast+mostly correct. But, it'd be good to get the caching in soon. Also, theoretically, once we have the caching, rebaselines will be much faster than they were in the old scheme, not just faster than the current slowness.
I suspect that the sydney office may beg to differ with your assessment that the curent situation is better than 2 weeks ago. :)
@noel Perhaps this is what caused Bug 86171
Yes I believe so.
Sounds like we should either revert this, or add a flag to get the old behavior back while we're waiting for a better solution.
I'd be inclined to roll back the change given the performance impact. We might want to try a different approach for solving this problem, for example by having garden-o-matic indiciate which results it wants the script to download.
Okay, I will roll it out until I can come up with a better solution.
Rollout posted in bug 86231 .
Okay, i've posted a patch for garden-o-matic to tell webkit-patch which types of results to rebaseline in bug 86242 ...
Question: would it be possible to extract needed files from the .zip on the server-side?
(In reply to comment #25) > Question: would it be possible to extract needed files from the .zip on the server-side? Not the way things are currently set up; the server is a dumb file server. However, bug 86242 should fix things.
closing ... this should be fixed now.