Running garden-o-matic as of ~12:45pm, synced to r116981, I am getting the results in the attached screenshots. garden-o-matic claims that there are 8 tests failing in fast/repaint, for example, with IMAGE failures on all the Mac bots. However, if you look at the canaries, there are no failing fast/repaint tests. The tests in that directory are:

fast/repaint/inline-relative-positioned.html
fast/repaint/lines-with-layout-delta.html
fast/repaint/overflow-clip-subtree-layout.html
fast/repaint/repaint-resized-overflow.html
fast/repaint/subtree-layoutstate-transform.html
fast/repaint/subtree-root-clip-2.html
fast/repaint/subtree-root-clip.html
fast/repaint/subtree-root-skipped.html

There are suppressions for them on LEOPARD and SNOWLEOPARD in test_expectations.txt (lines 3249-3256) but not LION. So, I'm not sure what's going on here.
Created attachment 141774 [details] garden-o-matic screenshot
Created attachment 141775 [details] current state of the canaries
Looking at http://test-results.appspot.com/testfile?builder=Webkit%20Mac10.6&name=full_results.json:

"overflow-clip-subtree-layout.html": {"expected": "IMAGE", "actual": "IMAGE", "image_diff_percent": 0}

So, the data in the latest full_results.json file is correct. That leads me to believe the bug is on the garden-o-matic side: either it's pulling the wrong full_results.json (which seems unlikely) or it has a bug interpreting the data.
Looks like the state in the model is correct as well:

model.state.resultsByBuilder['Webkit Mac10.6'].tests['fast']['repaint']['inline-relative-positioned.html']:
  actual: "IMAGE"
  expected: "IMAGE"
  image_diff_percent: 0
The tree is currently flaming red. The tool tries to be conservative. Once we clear up all the other higher priority failures, we'll see if it can figure out what's going on.
Found the problem. Here's one of the fast/repaint results returned by results.unexpectedFailuresByTest(model.state.resultsByBuilder). All the other fast/repaint results were the same (failing only on Mac 10.6 debug):

fast/repaint/subtree-root-skipped.html:
  Webkit Mac10.6 (dbg):
    actual: "IMAGE"
    expected: "IMAGE+TEXT"
    image_diff_percent: 0

Here's another test that is somehow also grouped with these fast/repaint tests:

fast/replaced/width100percent-searchfield.html:
  Webkit Mac10.5:
    actual: "IMAGE"
    expected: "PASS IMAGE+TEXT"
    image_diff_percent: 0
  Webkit Mac10.5 (dbg)(2):
    actual: "IMAGE"
    expected: "PASS IMAGE+TEXT"
    image_diff_percent: 0
  Webkit Mac10.6:
    actual: "IMAGE"
    expected: "PASS IMAGE+TEXT"
    image_diff_percent: 0
  Webkit Mac10.6 (dbg):
    actual: "IMAGE"
    expected: "PASS IMAGE+TEXT"
    image_diff_percent: 0
  Webkit Mac10.7:
    actual: "IMAGE"
    expected: "PASS"
    image_diff_percent: 0

What we show in garden-o-matic is the union of the failure types.
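To make the "union of the failure types" behavior concrete, here is a minimal sketch of what that aggregation looks like. The function name and input shape are illustrative assumptions, not garden-o-matic's actual code:

```javascript
// Hypothetical sketch: collect the union of failure types seen across
// all builders for one test. "actual" is a space-separated list of
// result types such as "IMAGE" or "IMAGE TEXT".
function unionOfFailureTypes(resultsByBuilder) {
  const union = new Set();
  for (const builder of Object.keys(resultsByBuilder)) {
    for (const type of resultsByBuilder[builder].actual.split(' '))
      union.add(type);
  }
  return Array.from(union).sort();
}
```

With a Mac 10.6 debug IMAGE failure and a Mac 10.7 TEXT failure, this would display both IMAGE and TEXT for the whole group, even though no single bot shows both.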
So, this is technically correct behavior. I'm assuming all these tests started failing at r116965, and thus they are grouped together. The union of the failures shown is correct, even though all the fast/repaint tests are still failing only on Mac 10.6 debug. The only way to fix this would be to change the grouping logic to take into account which bots each test failed on, and to put tests that failed on the same builders in a separate group. I'm not sure this can be done without making the UI confusing. Although, whenever I've used garden-o-matic, I've found the current grouping confusing for the same reason as this bug. I'd rather have tests that all fail on the same bots grouped separately, I think.
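The alternative grouping floated here could be sketched roughly as follows: bucket tests by the exact set of builders they fail on, so tests failing only on Mac 10.6 debug would not be mixed in with tests failing everywhere. The input shape mirrors the results.unexpectedFailuresByTest dump above; the function name is an assumption, not the tool's real API:

```javascript
// Hypothetical sketch: group tests by the sorted list of builders on
// which each test is failing, so each group's bot list is exact.
function groupByFailingBuilders(failuresByTest) {
  const groups = {};
  for (const [test, byBuilder] of Object.entries(failuresByTest)) {
    const key = Object.keys(byBuilder).sort().join(', ');
    (groups[key] = groups[key] || []).push(test);
  }
  return groups;
}
```

The trade-off is exactly the one discussed below: with many bots, this can shatter one logical regression into many small groups.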
I'm not sure I'm seeing the same thing you are (or at least, I'm not understanding it yet).

The way the information is grouped, it makes it look like all of the tests have IMAGE failures on all of the bots (excluding the TEXT failures, of course). If that's not the case, we should change that. I thought that the bots that stopped failing used to drop off of the display, but maybe the fact that there's still one test failing on LION is keeping it in the grouping?

Second, if I click on "Examine", it takes me to a page where only two tests show up:

fast/replaced/replaced-breaking.html
fast/replaced/width100percent-searchfield.html

Why aren't the other tests showing up on the Examine page?
(In reply to comment #8)
> I'm not sure I'm seeing the same thing you are (or at least, I'm not understanding it yet).
>
> The way the information is grouped, it makes it look like all of the tests have IMAGE failures on all of the bots (excluding the TEXT failures, of course). If that's not the case, we should change that. I thought that the bots that stopped failing used to drop off of the display, but maybe the fact that there's still one test failing on LION is keeping it in the grouping?
>
> Second, if I click on "Examine", it takes me to a page where only two tests show up:
>
> fast/replaced/replaced-breaking.html
> fast/replaced/width100percent-searchfield.html
>
> Why aren't the other tests showing up on the Examine page?

Well, I no longer see the fast/repaint tests at all, presumably because the Mac 10.6 debug bot finally cycled. So, I'm not sure what else we can do to further debug this without using local dummy data. It looks to me like everything is working as intended, but maybe I'm missing something. I think the current grouping is confusing and should probably take into account which bots the test is failing on, and possibly the type of failure.
> I think the current grouping is confusing and should probably take into account which bots the test is failing on and possibly the type of failure.

There's a trade-off here between being too coarse-grained and overloading the gardener with information. Any thoughts you two have on improving the UI are much appreciated. :)
(In reply to comment #10)
> > I think the current grouping is confusing and should probably take into account which bots the test is failing on and possibly the type of failure.
>
> There's a trade-off here between being too coarse-grained and overloading the gardener with information. Any thoughts you two have on improving the UI are much appreciated. :)

Perhaps we could do something like highlight the tests that are actually still failing on that bot when you hover over the bot name?
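The filtering behind that hover idea is simple to sketch: given the per-test, per-builder failure map, compute which tests are still failing on the hovered builder so the UI could highlight just those rows. Names here are hypothetical, not garden-o-matic's real API:

```javascript
// Hypothetical sketch: return the tests in a group that are still
// failing on the given builder, for highlighting on hover.
function testsFailingOn(failuresByTest, builderName) {
  return Object.keys(failuresByTest)
      .filter(test => builderName in failuresByTest[test]);
}
```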
That's a nice idea.
(In reply to comment #11)
> (In reply to comment #10)
> > > I think the current grouping is confusing and should probably take into account which bots the test is failing on and possibly the type of failure.
> >
> > There's a trade-off here between being too coarse-grained and overloading the gardener with information. Any thoughts you two have on improving the UI are much appreciated. :)
>
> Perhaps we could do something like highlight the tests that are actually still failing on that bot when you hover over the bot name?

Clever. Certainly worth a try before cluttering the UI in some other way.