RESOLVED FIXED 99793
make move_overwritten_baselines_to work again while rebaselining
https://bugs.webkit.org/show_bug.cgi?id=99793
Summary make move_overwritten_baselines_to work again while rebaselining
Dirk Pranke
Reported 2012-10-18 18:49:38 PDT
make move_overwritten_baselines_to work again while rebaselining
Attachments
Patch (13.47 KB, patch)
2012-10-18 18:51 PDT, Dirk Pranke
ojan: review+
Dirk Pranke
Comment 1 2012-10-18 18:51:32 PDT
Created attachment 169530 [details]
Patch
Dirk Pranke
Comment 2 2012-10-18 18:58:26 PDT
It looks like this code broke at some point (possibly at least partially in r121706 / bug 90396, but probably partially in other patches as well). Since I need to bring up the new Chromium Mountain Lion baselines, it would be nice to make this work again in garden-o-matic. This patch, I believe, restores the previous functionality, and I think we should land it as-is before trying something else. However, two ideas:

1) It is kinda inconvenient to have to locally modify builders.js to take advantage of this functionality. I'm thinking it might make more sense to also support a command line flag to garden-o-matic, like "webkit-patch garden-o-matic --move-overwritten-baselines-to 'WebKit Mac10.8=chromium-mac-lion'" or something. I don't think the syntax needs to be terribly elegant.

2) Obviously better would be to just make garden-o-matic handle this rebaselining case better. Namely, when updating a baseline, we should check to see whether doing so will potentially alter the results for any other builder that might use that baseline and, if so, and if we aren't already planning to rebaseline that builder too, copy the overwritten result down. I think, since we rebaseline a test for all bots at once with rebaseline-json, this should be pretty straightforward, and I'm not sure that this strategy has any flaws. But obviously I'd want to do this in a separate patch.

Thoughts?
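[Editor's note: to make idea 2 concrete, here is a minimal Python sketch of the "copy the overwritten result down" step. This is not the actual webkitpy/garden-o-matic code; the helper names (fallback_path_for_builder, copy_overwritten_baseline) and the flat per-directory baseline layout it assumes are hypothetical.]

```python
# Hypothetical sketch of idea 2; not the real webkitpy implementation.
import os
import shutil


def builders_affected_by_overwrite(baseline_dir, all_builders, fallback_path_for_builder):
    """Return builders whose baseline search path includes baseline_dir, i.e.
    builders whose results could silently change if a file in baseline_dir is
    overwritten. fallback_path_for_builder is an assumed callable mapping a
    builder name to its ordered list of baseline directories."""
    return [builder for builder in all_builders
            if baseline_dir in fallback_path_for_builder(builder)]


def copy_overwritten_baseline(test_name, suffix, baseline_dir, target_dir):
    """Before overwriting <baseline_dir>/<test>-expected.<suffix>, copy the
    existing file into target_dir so a builder that falls back through
    baseline_dir keeps seeing the old result."""
    baseline_name = os.path.splitext(test_name)[0] + '-expected.' + suffix
    source = os.path.join(baseline_dir, baseline_name)
    if not os.path.exists(source):
        return  # Nothing to preserve.
    destination = os.path.join(target_dir, baseline_name)
    if os.path.exists(destination):
        return  # A more specific baseline already exists; don't clobber it.
    os.makedirs(target_dir, exist_ok=True)
    shutil.copyfile(source, destination)
```

In this sketch, a rebaseline-json pass would call copy_overwritten_baseline() once per affected builder that is not itself being rebaselined in the same run, before writing the new baseline.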
Ojan Vafai
Comment 3 2012-10-18 22:02:28 PDT
I strongly prefer 2. But for unexpected failures, I think we decided on keeping the current behavior since the common case is that the test hasn't run on all the bots yet, but that they will share results (e.g. all the mac bots will have the same result). This minimizes churn and waiting on bots to cycle for unexpected failures, but gets 100% correctness for expected failures or new bots. So, for new bots (e.g. the mtlion bot), I think we still want a way to mark it as doing this fallback behavior, just not needing to hard-code the fallback.
Ojan Vafai
Comment 4 2012-10-18 22:18:38 PDT
Comment on attachment 169530 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=169530&action=review

> Tools/Scripts/webkitpy/layout_tests/port/builders.py:44
> +# FIXME: When do you need this?

This is a workaround for ports that don't have full bot coverage, e.g. win-7sp1 actually corresponds to platform/win, but there's no win-7sp1 bot, so we hack the win-7sp0 bot to put baselines in platform/win, and qt-linux is the only qt bot on the webkit.org waterfall to run tests. I'm not sure at the moment why mac-mountainlion needs this, though. That might be a bug.

It's a total hack. Not sure what else we could do. I suppose we could try to encourage ports to get better coverage, except platform/qt is like platform/chromium (i.e. there's no such thing as a plain qt bot). So they would need full coverage for every platform/qt-* directory before we could remove it for them.

There's probably a better way to do all this, but it hasn't yet occurred to me. :)
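[Editor's note: for readers who haven't looked at the builder-to-port mapping, here is a rough, illustrative sketch of the kind of entry being discussed. The builder names and dictionary keys below are not copied from the real builders.py and may not match it exactly.]

```python
# Illustrative only; not the actual contents of webkitpy's builders.py.
# "rebaseline_override_dir" stands in for the hack that writes one bot's
# baselines into a more generic platform directory, and
# "move_overwritten_baselines_to" lists the port directories whose existing
# baselines should be copied down before they are overwritten.
_exact_matches = {
    # No win-7sp1 bot exists, so the win-7sp0 bot's new baselines go into
    # the generic platform/win directory.
    "Apple Win 7 Release (Tests)": {
        "port_name": "win-7sp0",
        "rebaseline_override_dir": "win",
    },
    # Rebaselining the Mountain Lion bot should preserve whatever the
    # chromium-mac-lion port was inheriting from the overwritten directory.
    "WebKit Mac10.8": {
        "port_name": "chromium-mac-mountainlion",
        "move_overwritten_baselines_to": ["chromium-mac-lion"],
    },
}
```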
Dirk Pranke
Comment 5 2012-10-18 22:23:49 PDT
(In reply to comment #4)
> (From update of attachment 169530 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=169530&action=review
>
> > Tools/Scripts/webkitpy/layout_tests/port/builders.py:44
> > +# FIXME: When do you need this?
>
> This is a workaround for ports that don't have full bot coverage, e.g. win-7sp1 actually corresponds to platform/win, but there's no win-7sp1 bot, so we hack the win-7sp0 bot to put baselines in platform/win, and qt-linux is the only qt bot on the webkit.org waterfall to run tests. I'm not sure at the moment why mac-mountainlion needs this, though. That might be a bug.
>
> It's a total hack. Not sure what else we could do. I suppose we could try to encourage ports to get better coverage, except platform/qt is like platform/chromium (i.e. there's no such thing as a plain qt bot). So they would need full coverage for every platform/qt-* directory before we could remove it for them.
>
> There's probably a better way to do all this, but it hasn't yet occurred to me. :)

Okay, I'll add something about this to the comments.
Dirk Pranke
Comment 6 2012-10-18 22:26:06 PDT
(In reply to comment #3)
> I strongly prefer 2. But for unexpected failures, I think we decided on keeping the current behavior since the common case is that the test hasn't run on all the bots yet, but that they will share results (e.g. all the mac bots will have the same result).

I think you've convinced me that we shouldn't move overwritten baselines by default for unexpected failures. I'm not sure that it's worth providing a way to choose to move overwritten baselines in the UI; it might be sufficient to never move unexpected and always move expected?

> This minimizes churn and waiting on bots to cycle for unexpected failures, but gets 100% correctness for expected failures or new bots.

Is there a difference between expected failures and new bots here?

> So, for new bots (e.g. the mtlion bot), I think we still want a way to mark it as doing this fallback behavior, just not needing to hard-code the fallback.

I'm not sure what you mean by this sentence.
Dirk Pranke
Comment 7 2012-10-18 22:32:27 PDT
Ojan Vafai
Comment 8 2012-10-18 22:34:14 PDT
(In reply to comment #6)
> (In reply to comment #3)
> > I strongly prefer 2. But for unexpected failures, I think we decided on keeping the current behavior since the common case is that the test hasn't run on all the bots yet, but that they will share results (e.g. all the mac bots will have the same result).
>
> I think you've convinced me that we shouldn't move overwritten baselines by default for unexpected failures. I'm not sure that it's worth providing a way to choose to move overwritten baselines in the UI; it might be sufficient to never move unexpected and always move expected?

The latter is what I'm picturing.

> > This minimizes churn and waiting on bots to cycle for unexpected failures, but gets 100% correctness for expected failures or new bots.
>
> Is there a difference between expected failures and new bots here?

No.

> > So, for new bots (e.g. the mtlion bot), I think we still want a way to mark it as doing this fallback behavior, just not needing to hard-code the fallback.
>
> I'm not sure what you mean by this sentence.

All I meant is the above. We need some way to mark new bots as new so that we know to treat their unexpected failures as expected failures. Oh, I suppose the way you have it now, with a special TestExpectations file, they already all show up under expected failures, so there's no extra plumbing. That seems better.
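[Editor's note: the policy being converged on here can be stated as a tiny sketch; the function and flag names are made up, but the decision is the one described in the comments above.]

```python
def should_move_overwritten_baselines(failure_is_expected, builder_is_new):
    """Hypothetical policy check, not real garden-o-matic code. Preserve
    (move) overwritten baselines only for expected failures and for brand-new
    bots, whose failures all show up as expected via their own
    TestExpectations file anyway. For unexpected failures, keep the current
    behavior and don't move anything, since the common case is that the other
    bots simply haven't cycled yet and will end up sharing the same result."""
    return failure_is_expected or builder_is_new
```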
Dirk Pranke
Comment 9 2012-10-18 22:38:30 PDT
(In reply to comment #8)
> > > So, for new bots (e.g. the mtlion bot), I think we still want a way to mark it as doing this fallback behavior, just not needing to hard-code the fallback.
> >
> > I'm not sure what you mean by this sentence.
>
> All I meant is the above. We need some way to mark new bots as new so that we know to treat their unexpected failures as expected failures. Oh, I suppose the way you have it now, with a special TestExpectations file, they already all show up under expected failures, so there's no extra plumbing. That seems better.

Yeah, it's really easy to make unexpected failures expected at this point. I'm not even sure that the separate TestExpectations file is worth it assuming the tools will just work (although I suppose for non-Chromium ports a separate file is kinda the default anyway).
Ojan Vafai
Comment 10 2012-10-18 22:42:28 PDT
> Yeah, it's really easy to make unexpected failures expected at this point. I'm not even sure that the separate TestExpectations file is worth it assuming the tools will just work (although I suppose for non-Chromium ports a separate file is kinda the default anyway).

The only problem with this is that the current garden-o-matic UI makes it hard to work through just the list of MtLion expected failures. With unexpected failures, it groups them in a more convenient way for that specific use case.
Dirk Pranke
Comment 11 2012-10-18 22:45:17 PDT
(In reply to comment #10)
> > Yeah, it's really easy to make unexpected failures expected at this point. I'm not even sure that the separate TestExpectations file is worth it assuming the tools will just work (although I suppose for non-Chromium ports a separate file is kinda the default anyway).
>
> The only problem with this is that the current garden-o-matic UI makes it hard to work through just the list of MtLion expected failures. With unexpected failures, it groups them in a more convenient way for that specific use case.

Yeah, although at least this evening I was finding it really convenient to see the failures by name and just look to see whether a given test was failing on ML alone or on ML plus other bots; the unexpected-results page might not give you this context (although I need to look again). Fortunately, I still have a few thousand tests to go, so I'll have plenty more opportunity to think about what might work better than what we have now :).