Summary: | Retry crashing tests serially at the end of NRWT on Apple Mac | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Mark Hahnenberg <mhahnenberg> | ||||||
Component: | Tools / Tests | Assignee: | Mark Hahnenberg <mhahnenberg> | ||||||
Status: | RESOLVED FIXED | ||||||||
Severity: | Normal | CC: | abarth, ap, dpranke, ggaren, mitz, ojan, ossy, webkit.review.bot | ||||||
Priority: | P2 | ||||||||
Version: | 528+ (Nightly build) | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Attachments: |
|
Description
Mark Hahnenberg
2012-03-26 12:18:48 PDT
Created attachment 133873 [details]
Patch
Comment on attachment 133873 [details]
Patch
Note that the tests that are retried are still actually run concurrently, so this patch may not do what you think it should do.
I'm sure I had a patch at one point that did run things serially, but I guess I never committed it; it would be easy to make that change and I can't really think of a good reason not to do it, so you could either add that to this or add a separate patch to do so, if you like, or you could file a bug and I could do it at some point.
All that aside, I'm a bit torn by this patch. I can see the argument in retrying the crashes to see if they're consistently crashing or just flaky under load; I don't really have a problem with that.
However, I'm not sure that i think a flaky crash should be ignored (which is what would happen by default).
Is it hard (or impossible) to fix things so that DRT and WTR don't actually crash?
As an actual comment on the crash, I think I would prefer this to be a method on the port itself (like port.should_retry_crashes()), rather than a test against the port name.
> Is it hard (or impossible) to fix things so that DRT and WTR don't actually crash? Like the bug description says, the crash isn't in DRT or WTR (or WebKit, for that matter): It's a bug in certain "daemons/system services/frameworks on the Mac bots". Fixing those crashes it out of scope for the WebKit project. > Note that the tests that are retried are still actually run concurrently, so this patch may not do what you think it should do. Mark, it sounds like you should write a follow-up patch to have retried tests run serially. (Possibly just change the concurrency setting before retrying?) Based on my testing a year ago, this would also fix flakiness in filesystem/database and animation tests, so it's probably appropriate for all ports. > However, I'm not sure that i think a flaky crash should be ignored (which is what would happen by default). What would you suggest instead? We'd like a solution that can give an unambiguous "PASS" (green) or "FAIL" (red) result for a given test run. Created attachment 133879 [details]
Patch
(In reply to comment #3) > > Is it hard (or impossible) to fix things so that DRT and WTR don't actually crash? > > Like the bug description says, the crash isn't in DRT or WTR (or WebKit, for that matter): It's a bug in certain "daemons/system services/frameworks on the Mac bots". Fixing those crashes it out of scope for the WebKit project. > Well, for daemons and system services, presumably you could make DRT/WTR/WebKit resilient to crashes in other processes. For frameworks obviously that's harder; however, in either case I would be concerned that crashes represent potential security issues and shouldn't be ignored. Obviously, having tests crash all the time is a good way to make sure they aren't ignored, but I agree that randomly flaky tests don't help anything. > > However, I'm not sure that i think a flaky crash should be ignored (which is what would happen by default). > > What would you suggest instead? > The alternative I was considering was that we could retry the test to see if it crashed twice, but still consider it a FAIL even if it only crashed once. > We'd like a solution that can give an unambiguous "PASS" (green) or "FAIL" (red) result for a given test run. Sure, wouldn't we all? :) Comment on attachment 133879 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=133879&action=review > Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:901 > + # On Apple Mac, we retry crashes due to https://bugs.webkit.org/show_bug.cgi?id=82233 this comment should move to mac.py now. Please fix that before landing ... This was fixed in http://trac.webkit.org/changeset/112189. Is it still needed? Apple buildbots run with --no-retry-failures for a while without a regular flakiness. Should we still retry crashes for local users only? > Apple buildbots run with --no-retry-failures for a while
We still have retrying enabled on debug bots and on EWS.
But as far as retrying crashed tests goes, I do not understand why we do that. AFAIK we still report it as a hard failure when first try crashes, and second one doesn't.
|