142425 – Crash count should be listed separately from failure count when layout tests fail

RESOLVED WONTFIX 142425

Crash count should be listed separately from failure count when layout tests fail

https://bugs.webkit.org/show_bug.cgi?id=142425

Summary Crash count should be listed separately from failure count when layout tests ...

David Kilzer (:ddkilzer)

Reported 2015-03-06 20:05:52 PST

When a layout tests fail, it's useful to print the crash count separately (as a subset of the failure count). This is because crashes are more severe than most other failures, and when running tests with ASan or GuardMalloc, the primary intent is to find tests that crash since those are the most interesting results for those bots.

Attachments
Patch v1 (3.40 KB, patch) 2015-03-06 20:23 PST, David Kilzer (:ddkilzer)	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

David Kilzer (:ddkilzer)

Comment 1 2015-03-06 20:07:36 PST

<rdar://problem/20078699>

David Kilzer (:ddkilzer)

Comment 2 2015-03-06 20:23:30 PST

Created attachment 248137 [details] Patch v1

Csaba Osztrogonác

Comment 3 2015-03-07 01:50:26 PST

(In reply to comment #0) > When a layout tests fail, it's useful to print the crash count separately > (as a subset of the failure count). > > This is because crashes are more severe than most other failures, and when > running tests with ASan or GuardMalloc, the primary intent is to find tests > that crash since those are the most interesting results for those bots. I like the idea to show crashes separately, but in my opinion counting them twice (as crashed and failures too) is very confusing if we don't change the output format. I could imagine a more detailed output, but it should be evident what means what. And maybe the bot watchers dashboard should be fixed to be able parse the new output. for example: (where x = y + u + v) x unexpected regressions: y crashes, u timeouts, v failures j new passes, k flakes, l missing results For example let's see the latest result from the GTK bot, where we can found all kind of failures: actual output: 22 failures 32 new passes 10 flakes 8 missing results proposed output, which is confusing for me: 22 failures 4 crashes 32 new passes 10 flakes 8 missing results Or what about: 22 failures (4 of them crashes) ... ? detailed results (actual): -------------------------- Expected to crash, but passed: (1) Expected to timeout, but passed: (2) Expected to fail, but passed: (29) ==> 32 new passes Unexpected flakiness: text-only failures (4) Unexpected flakiness: image-only failures (1) Unexpected flakiness: timeouts (5) ==> 10 flakes Regressions: Unexpected text-only failures (4) Regressions: Unexpected image-only failures (6) Regressions: Unexpected crashes (4) Regressions: Unexpected timeouts (8) ==> 22 failures Regressions: Unexpected missing results (8) ==> 8 missing results

Alexey Proskuryakov

Comment 4 2015-03-07 08:03:24 PST

Comment on attachment 248137 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=248137&action=review > Tools/BuildSlaveSupport/build.webkit.org-config/master.cfg:377 > + # Add crash count to failure count. I don't understand this part. Isn't it already included?

Alexey Proskuryakov

Comment 5 2015-03-07 08:07:00 PST

Ossy, I think that the intention here is to display crashes for crashesOnly queues (already supported by the dashboard), such as a potential GuardMalloc queue. I would object to displaying crash count separately for normal queues, that's too much information.

David Kilzer (:ddkilzer)

Comment 6 2015-03-08 15:27:15 PDT

(In reply to comment #4) > Comment on attachment 248137 [details] > Patch v1 > > View in context: > https://bugs.webkit.org/attachment.cgi?id=248137&action=review > > > Tools/BuildSlaveSupport/build.webkit.org-config/master.cfg:377 > > + # Add crash count to failure count. > > I don't understand this part. Isn't it already included? Prior to this change, crashes were counted as failures but not listed separately. This piece of code keeps that invariant the same so that crash count is listed separately, but crashes are still also counted as failures. FWIW, this matches how we report crashes on our internal buildbot instances (although the code accomplishes this slightly differently). I don't have strong opinions on whether crash count should be a subset of failure count or not (I originally implemented them as separate counts). (In reply to comment #5) > Ossy, I think that the intention here is to display crashes for crashesOnly > queues (already supported by the dashboard), such as a potential GuardMalloc > queue. Yes, that's the intention. > I would object to displaying crash count separately for normal queues, > that's too much information. I don't see the potential harm here, though. Can you explain what exactly your concern is other than "too much information"? Is it that the Info column might wrap to a second line? How is having a crash count a bad thing in this situation? Personally, I think crashes are an interesting enough subset of failures that I'd like to have a separate count in the summary text. It's especially useful for GuardMalloc or ASan queues, but I feel crashes in other queues are interesting as well.

Alexey Proskuryakov

Comment 7 2015-03-09 12:30:00 PDT

> > > Tools/BuildSlaveSupport/build.webkit.org-config/master.cfg:377 > > > + # Add crash count to failure count. > > > > I don't understand this part. Isn't it already included? > > Prior to this change, crashes were counted as failures but not listed separately. What I'm saying is that failures come from all "Regressions: Unexpected" lines, and crashes come from "Regressions: Unexpected crash" lines. Which seems to mean that failures already include crashes, but then you add crashes to failures. Aren't they double-counted as a result? Am I misreading the code? > Can you explain what exactly your concern is other than "too much information"? Too much information is exactly what I'm concerned about. We do have a bug about improving summary strings for failures though - bug 122901 - please feel free to add the idea of separating crashes if you think that it would help those looking at the page.

Alexey Proskuryakov

Comment 8 2015-03-09 12:31:36 PDT

One thing we could do without adding noise is to say "N crashes" when all the failures are crashes, and to say "N failures" when there is a mix of crashes, text failures, audio failures and/or image failures.

Blaze Burg

Comment 9 2015-11-23 15:22:39 PST

(In reply to comment #8) > One thing we could do without adding noise is to say "N crashes" when all > the failures are crashes, and to say "N failures" when there is a mix of > crashes, text failures, audio failures and/or image failures. So, I think whether crashes should be included in the big red number or not is a dashboard UI issue. It would be nice to simply have the crash count included in the raw data (I would prefer it be subtracted from failures, unless we double count other types of failures already).

David Kilzer (:ddkilzer)

Comment 10 2016-06-21 11:39:52 PDT

Comment on attachment 248137 [details] Patch v1 Clearing review flags.

David Kilzer (:ddkilzer)

Comment 11 2016-06-21 11:41:05 PDT

Apparently I was trying to solve a problem that no one was having, so moving to RESOLVED/WONTFIX.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution WONTFIX

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

David Kilzer (:ddkilzer)

Reported

2015-03-06 20:05 PST

Modified

2016-06-21 11:41 PDT History

CC List

11 users Show

URL

Keywords InRadar

Depends on

Blocks