Bug 86393 - garden-o-matic should highlight which platforms/configurations apply to each test in the failure stream when you hover over the test
Summary: garden-o-matic should highlight which platforms/configurations apply to each test in the failure stream when you hover over the test
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-05-14 13:00 PDT by Dirk Pranke
Modified: 2017-07-18 08:29 PDT
CC List: 3 users

See Also:


Attachments
garden-o-matic screenshot (98.91 KB, image/png)
2012-05-14 13:02 PDT, Dirk Pranke
no flags
current state of the canaries (238.63 KB, image/png)
2012-05-14 13:03 PDT, Dirk Pranke
no flags

Description Dirk Pranke 2012-05-14 13:00:27 PDT
Running garden-o-matic as of ~12:45pm synced to r116981, I am getting the results in the attached screenshots. garden-o-matic claims that there are 8 tests failing in fast/repaint, for example, with IMAGE failures on all the Mac bots.

However, if you look at the canaries, there are no failing fast/repaint tests. 

The list of tests in that dir is:

fast/repaint/inline-relative-positioned.html
fast/repaint/lines-with-layout-delta.html
fast/repaint/overflow-clip-subtree-layout.html
fast/repaint/repaint-resized-overflow.html
fast/repaint/subtree-layoutstate-transform.html
fast/repaint/subtree-root-clip-2.html
fast/repaint/subtree-root-clip.html
fast/repaint/subtree-root-skipped.html

There are suppressions for them on LEOPARD and SNOWLEOPARD in test_expectations.txt (lines 3249-3256) but not LION. So, I'm not sure what's going on here.
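
For reference, old-style suppressions in test_expectations.txt were roughly of the form below (an illustrative example with a placeholder bug number, not the literal contents of lines 3249-3256):

  BUGWK12345 LEOPARD SNOWLEOPARD : fast/repaint/overflow-clip-subtree-layout.html = IMAGE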
Comment 1 Dirk Pranke 2012-05-14 13:02:51 PDT
Created attachment 141774 [details]
garden-o-matic screenshot
Comment 2 Dirk Pranke 2012-05-14 13:03:12 PDT
Created attachment 141775 [details]
current state of the canaries
Comment 3 Ojan Vafai 2012-05-14 13:34:14 PDT
Looking at http://test-results.appspot.com/testfile?builder=Webkit%20Mac10.6&name=full_results.json:
"overflow-clip-subtree-layout.html":{"expected":"IMAGE","actual":"IMAGE","image_diff_percent":0}

So, the data in the latest full_results.json file is correct. That leads me to believe the bug is on the garden-o-matic side. Either it's pulling the wrong full_results.json (which seems unlikely) or it has a bug interpreting the data.
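
A quick way to double-check that entry from the browser console, assuming the JSON has already been fetched and parsed into a fullResults object (a sketch only; the exact payload wrapping served by test-results.appspot.com may differ):

  var entry = fullResults.tests['fast']['repaint']['overflow-clip-subtree-layout.html'];
  console.log(entry.expected, entry.actual, entry.image_diff_percent);
  // For the run above this logs: IMAGE IMAGE 0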
Comment 4 Ojan Vafai 2012-05-14 13:40:30 PDT
Looks like the state in the model is correct as well.

model.state.resultsByBuilder['Webkit Mac10.6'].tests['fast']['repaint']['inline-relative-positioned.html']:
Object
actual: "IMAGE"
expected: "IMAGE"
image_diff_percent: 0
__proto__: Object
Comment 5 Adam Barth 2012-05-14 13:45:30 PDT
The tree is currently flaming red.  The tool tries to be conservative.  Once we clear up all the other higher priority failures, we'll see if it can figure out what's going on.
Comment 6 Ojan Vafai 2012-05-14 13:46:39 PDT
Found the problem.

Here's one of the fast/repaint results returned by results.unexpectedFailuresByTest(model.state.resultsByBuilder). All the other fast/repaint results were the same (failing only on Mac10.6 debug).

fast/repaint/subtree-root-skipped.html: Object
Webkit Mac10.6 (dbg): Object
actual: "IMAGE"
expected: "IMAGE+TEXT"
image_diff_percent: 0
__proto__: Object

Here's another test that is somehow also grouped with these fast/repaint tests:
fast/replaced/width100percent-searchfield.html: Object
Webkit Mac10.5: Object
actual: "IMAGE"
expected: "PASS IMAGE+TEXT"
image_diff_percent: 0
__proto__: Object
Webkit Mac10.5 (dbg)(2): Object
actual: "IMAGE"
expected: "PASS IMAGE+TEXT"
image_diff_percent: 0
__proto__: Object
Webkit Mac10.6: Object
actual: "IMAGE"
expected: "PASS IMAGE+TEXT"
image_diff_percent: 0
__proto__: Object
Webkit Mac10.6 (dbg): Object
actual: "IMAGE"
expected: "PASS IMAGE+TEXT"
image_diff_percent: 0
__proto__: Object
Webkit Mac10.7: Object
actual: "IMAGE"
expected: "PASS"
image_diff_percent: 0
__proto__: Object

What we show in garden-o-matic is the union of the failure types.
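
Roughly, that union could be computed like this (a sketch for illustration, not the actual garden-o-matic code; failuresByTest is a hypothetical map shaped like the dumps above, test name -> builder name -> result object):

  function unionOfFailureTypes(failuresByTest, testName) {
      var types = {};
      var byBuilder = failuresByTest[testName];
      Object.keys(byBuilder).forEach(function(builderName) {
          byBuilder[builderName].actual.split(' ').forEach(function(type) {
              types[type] = true;
          });
      });
      return Object.keys(types);
  }

  // e.g. for width100percent-searchfield.html above this yields ["IMAGE"],
  // which is what the failure stream ends up showing for the whole group.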
Comment 7 Ojan Vafai 2012-05-14 13:49:16 PDT
So, this is technically correct behavior. I'm assuming all these tests started failing at r116965, and thus they are grouped together. The union of the failures shown is correct even though all the fast/repaint tests are only still failing on Mac 10.6 debug.

The only way to fix this would be to change the grouping logic to take into account which bots each test failed on, and to group tests that failed on the same builders separately; a rough sketch follows.
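
Something like this could do that grouping (illustrative only, not the real grouping code; unexpectedFailuresByTest is shaped like the dumps in comment 6):

  function groupByFailingBuilders(unexpectedFailuresByTest) {
      var groups = {};
      Object.keys(unexpectedFailuresByTest).forEach(function(testName) {
          var key = Object.keys(unexpectedFailuresByTest[testName]).sort().join(', ');
          (groups[key] = groups[key] || []).push(testName);
      });
      return groups;
  }

  // With the data above, the fast/repaint tests would all land under the
  // "Webkit Mac10.6 (dbg)" key, while width100percent-searchfield.html would
  // land under the key listing all five Mac builders.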

I'm not sure this can be done without making the UI confusing. Although, whenever I've used garden-o-matic, I've found the current grouping confusing for the same reason as this bug. I'd rather have tests that all fail on the same bots grouped separately, I think.
Comment 8 Dirk Pranke 2012-05-14 13:56:36 PDT
I'm not sure I'm seeing the same thing you are (or at least, I'm not understanding it yet).

The way the information is grouped makes it look like all of the tests have IMAGE failures on all of the bots (excluding the TEXT failures, of course). If that's not the case, we should change that. I thought that the bots that stopped failing used to drop off of the display, but maybe the fact that there's still one test failing on LION is keeping it in the grouping?

Second, if I click on "Examine", it takes me to a page where only two tests show up: 

fast/replaced/replaced-breaking.html
fast/replaced/width100percent-searchfield.html

Why aren't the other tests showing up on the Examine page?
Comment 9 Ojan Vafai 2012-05-14 14:08:42 PDT
(In reply to comment #8)
> I'm not sure I'm seeing the same thing you are (or at least, I'm not understanding it yet).
> 
> The way the information is grouped makes it look like all of the tests have IMAGE failures on all of the bots (excluding the TEXT failures, of course). If that's not the case, we should change that. I thought that the bots that stopped failing used to drop off of the display, but maybe the fact that there's still one test failing on LION is keeping it in the grouping?
> 
> Second, if I click on "Examine", it takes me to a page where only two tests show up: 
> 
> fast/replaced/replaced-breaking.html
> fast/replaced/width100percent-searchfield.html
> 
> Why aren't the other tests showing up on the Examine page?

Well, I no longer see the fast/repaint tests at all, presumably because the Mac 10.6 debug bot finally cycled. So, I'm not sure what else we can do to further debug this without using local dummy data.

It looks to me like everything is working as intended, but maybe I'm missing something. I think the current grouping is confusing and should probably take into account which bots the test is failing on and possibly the type of failure.
Comment 10 Adam Barth 2012-05-14 14:15:15 PDT
> I think the current grouping is confusing and should probably take into account which bots the test is failing on and possibly the type of failure.

There's a trade-off here between being too coarse-grained and overloading the gardener with information.  Any thoughts you two have on improving the UI are much appreciated.  :)
Comment 11 Dirk Pranke 2012-05-14 14:32:41 PDT
(In reply to comment #10)
> > I think the current grouping is confusing and should probably take into account which bots the test is failing on and possibly the type of failure.
> 
> There's a trade-off here between being too coarse-grained and overloading the gardener with information.  Any thoughts you two have on improving the UI are much appreciated.  :)

Perhaps we could do something like highlight the tests that are actually still failing on that bot when you hover over the bot name?
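
Something along these lines might work (a sketch with hypothetical DOM structure, data attributes, and class names; the real garden-o-matic markup may differ):

  // When the pointer enters a builder's name, highlight only the test rows
  // that are still failing on that builder; clear the highlight on mouseout.
  function wireUpBuilderHover(builderElement, failuresByTest) {
      var builderName = builderElement.textContent;
      builderElement.addEventListener('mouseover', function() {
          Object.keys(failuresByTest).forEach(function(testName) {
              var row = document.querySelector('[data-test="' + testName + '"]');
              if (row)
                  row.classList.toggle('highlighted', builderName in failuresByTest[testName]);
          });
      });
      builderElement.addEventListener('mouseout', function() {
          Object.keys(failuresByTest).forEach(function(testName) {
              var row = document.querySelector('[data-test="' + testName + '"]');
              if (row)
                  row.classList.remove('highlighted');
          });
      });
  }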
Comment 12 Adam Barth 2012-05-14 14:38:06 PDT
That's a nice idea.
Comment 13 Ojan Vafai 2012-05-14 14:48:44 PDT
(In reply to comment #11)
> (In reply to comment #10)
> > > I think the current grouping is confusing and should probably take into account which bots the test is failing on and possibly the type of failure.
> > 
> > There's a trade-off here between being too coarse-grained and overloading the gardener with information.  Any thoughts you two have on improving the UI are much appreciated.  :)
> 
> Perhaps we could do something like highlight the tests that are actually still failing on that bot when you hover over the bot name?

Clever. Certainly worth a try before cluttering the UI in some other way.