Right now, if you select "failing" tests on the builder pane, the new flakiness dashboard lists all failing tests, including ones whose failure matches their test expectation. It should instead list only the tests that are failing without a matching expectation, since those are the ones making the bots red.
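The requested change can be sketched as below. This is a minimal Python illustration of the two behaviors, not the dashboard's actual code: the `Result` record, its field names, the result strings, and the sample test names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical record of one test's latest run on a builder."""
    name: str
    actual: str            # what the bot observed, e.g. "PASS", "TEXT", "CRASH"
    expected: frozenset    # outcomes the test expectations allow


def all_failures(results):
    """Old behavior: every test that did not pass, expected or not."""
    return [r.name for r in results if r.actual != "PASS"]


def unexpected_failures(results):
    """Requested behavior: only failures the expectations do not allow."""
    return [
        r.name
        for r in results
        if r.actual != "PASS" and r.actual not in r.expected
    ]


# Made-up sample data for illustration only.
results = [
    Result("fast/a.html", "PASS", frozenset({"PASS"})),
    Result("fast/b.html", "TEXT", frozenset({"TEXT"})),      # expected failure
    Result("fast/c.html", "CRASH", frozenset({"PASS"})),     # unexpected
    Result("fast/d.html", "TIMEOUT", frozenset({"PASS", "TEXT"})),  # unexpected
]

print(all_failures(results))
print(unexpected_failures(results))
```

Under these assumptions, `fast/b.html` appears in the first list but not the second, because its failure is already covered by its expectation and so does not turn the bot red.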
Created attachment 215240 [details] Changes the behavior
Comment on attachment 215240 [details]
Changes the behavior

I've never used this feature on the old dashboard, so it's not clear to me whether either behavior is useful. What are the use cases? If this is a replacement for the regular dashboard, then we should consider just removing the duplicate functionality.

r=me
(In reply to comment #2)
> (From update of attachment 215240 [details])
> I've never used this feature on the old dashboard, so it's not clear to me whether either behavior is useful. What are the use cases? If this is a replacement for the regular dashboard, then we should consider just removing the duplicate functionality.

This shows the list of failing tests on the bots.
Comment on attachment 215240 [details]
Changes the behavior

Clearing flags on attachment: 215240

Committed r158093: <http://trac.webkit.org/changeset/158093>
All reviewed patches have been landed. Closing bug.
> This shows the list of failing tests on the bots.

I don't think that this answers my question about use cases. Listing tests that are currently failing is not a job for the dashboard, which is for historic analysis of results.
(In reply to comment #6)
> > This shows the list of failing tests on the bots.
>
> I don't think that this answers my question about use cases. Listing tests that are currently failing is not a job for the dashboard, which is for historic analysis of results.

If you're talking about http://build.webkit.org/dashboard/, I find it impossible to use because it doesn't link to the builders' pages and it sets -webkit-user-select: none, along with dozens of other problems.
Can you please file bugs for those? That is the tool intended for looking at the immediate state of the bots, and adding duplicate functionality to other tools is not the best path forward. We'll just end up with a set of tools that no one but their creators understands or uses.

build.webkit.org/dashboard is also meant to be the primary entry point into the regression test bot system for most people, because checking historic flakiness is an activity that is secondary to checking immediate state. The Buildbot waterfall and console certainly have their uses, but in my opinion mostly for people who administer the system, not for WebKit developers.

A number of bugs and enhancement requests have been filed already; you can find them by searching for "build.webkit.org/dashboard" in Bugzilla titles. I encourage you to file bugs in terms of use cases that aren't addressed well (i.e. not simply "please remove user-select:none", but "I often need to do XXX when bot watching, and it's difficult to do now").
(In reply to comment #8)
> build.webkit.org/dashboard is also meant to be the primary entry point into the regression test bot system for most people, because checking historic flakiness is an activity that is secondary to checking immediate state. Buildbot waterfall and console certainly have their use, but mostly for people who administer the system, not for WebKit developers in my opinion.

I don't see a point in doing that, given that I'm satisfied with what build.webkit.org/waterfall and build.webkit.org/console provide. Those two pages provide exactly the kind of information I need.
> I'm satisfied with what build.webkit.org/waterfall and build.webkit.org/console provide

In this case, can we just get rid of the "failing" display in the new flakiness dashboard?
(In reply to comment #10)
> > I'm satisfied with what build.webkit.org/waterfall and build.webkit.org/console provide
>
> In this case, can we just get rid of the "failing" display in the new flakiness dashboard?

Why? The historical results of currently failing tests are exactly what bot watchers need to see in order to determine which patch caused a failure and whether the tests have been flaky.
I think I'm disagreeing with the statement that "checking historic flakiness is an activity that is secondary to checking immediate state".

In my experience, viewing the historical results of a test has been essential in determining the culprit and the correct test expectation to add. Knowing how many tests are failing on a builder doesn't get me anywhere as a bot watcher, because my primary job as a bot watcher (contacting the patch author, etc…) cannot be carried out until the culprit is determined.

I don't know what revision number http://build.webkit.org/dashboard/ is showing, but automatically determining the culprit has already been tried by TestFailures and garden-o-magic. Both have failed miserably to deliver on that promise. A task of this sort is best done by humans.
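To illustrate the kind of reading a bot watcher does on a test's history, here is a minimal Python sketch. It is not taken from the flakiness dashboard, TestFailures, or garden-o-magic; the function names, the `(revision, passed)` history format, and the revision numbers are hypothetical. It only narrows down where to look and leaves picking the culprit to a person.

```python
def regression_range(history):
    """Return (last_good, first_bad) for the current run of failures.

    `history` is a list of (revision, passed) pairs, oldest first.
    Returns None if the history is empty or the latest run passed.
    `last_good` is None if no passing run is in the history.
    """
    if not history or history[-1][1]:
        return None
    first_bad = history[-1][0]
    last_good = None
    # Walk backwards through the trailing failures until a pass is found.
    for revision, passed in reversed(history):
        if passed:
            last_good = revision
            break
        first_bad = revision
    return (last_good, first_bad)


def flip_count(history):
    """Count pass/fail transitions; many flips suggest a flaky test."""
    return sum(
        1 for (_, a), (_, b) in zip(history, history[1:]) if a != b
    )


# Made-up histories for illustration only.
steady = [(100, True), (101, True), (102, False), (103, False)]
flaky = [(100, True), (101, False), (102, True), (103, False)]

print(regression_range(steady), flip_count(steady))
print(regression_range(flaky), flip_count(flaky))
```

In the first made-up history the failure starts cleanly between two revisions, which points at a patch to investigate; in the second the result flips on nearly every run, which suggests flakiness and a different test expectation instead.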