WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
82233
Retry crashing tests serially at the end of NRWT on Apple Mac
https://bugs.webkit.org/show_bug.cgi?id=82233
Summary
Retry crashing tests serially at the end of NRWT on Apple Mac
Mark Hahnenberg
Reported
2012-03-26 12:18:48 PDT
A few of the daemons/system services/frameworks on the Mac bots can't handle being pummeled by many threads simultaneously while we're running all the layout tests concurrently. This causes some of the tests to randomly fail occasionally when it isn't actually an issue with the test. If we were to run all of these potentially crashing tests serially at the end of NRWT (which we already do for non-crashing tests), this might alleviate the problem.
Attachments
Patch
(1.71 KB, patch)
2012-03-26 13:05 PDT
,
Mark Hahnenberg
no flags
Details
Formatted Diff
Diff
Patch
(3.14 KB, patch)
2012-03-26 13:35 PDT
,
Mark Hahnenberg
dpranke
: review+
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Mark Hahnenberg
Comment 1
2012-03-26 13:05:12 PDT
Created
attachment 133873
[details]
Patch
Dirk Pranke
Comment 2
2012-03-26 13:19:42 PDT
Comment on
attachment 133873
[details]
Patch Note that the tests that are retried are still actually run concurrently, so this patch may not do what you think it should do. I'm sure I had a patch at one point that did run things serially, but I guess I never committed it; it would be easy to make that change and I can't really think of a good reason not to do it, so you could either add that to this or add a separate patch to do so, if you like, or you could file a bug and I could do it at some point. All that aside, I'm a bit torn by this patch. I can see the argument in retrying the crashes to see if they're consistently crashing or just flaky under load; I don't really have a problem with that. However, I'm not sure that i think a flaky crash should be ignored (which is what would happen by default). Is it hard (or impossible) to fix things so that DRT and WTR don't actually crash? As an actual comment on the crash, I think I would prefer this to be a method on the port itself (like port.should_retry_crashes()), rather than a test against the port name.
Geoffrey Garen
Comment 3
2012-03-26 13:32:27 PDT
> Is it hard (or impossible) to fix things so that DRT and WTR don't actually crash?
Like the bug description says, the crash isn't in DRT or WTR (or WebKit, for that matter): It's a bug in certain "daemons/system services/frameworks on the Mac bots". Fixing those crashes it out of scope for the WebKit project.
> Note that the tests that are retried are still actually run concurrently, so this patch may not do what you think it should do.
Mark, it sounds like you should write a follow-up patch to have retried tests run serially. (Possibly just change the concurrency setting before retrying?) Based on my testing a year ago, this would also fix flakiness in filesystem/database and animation tests, so it's probably appropriate for all ports.
> However, I'm not sure that i think a flaky crash should be ignored (which is what would happen by default).
What would you suggest instead? We'd like a solution that can give an unambiguous "PASS" (green) or "FAIL" (red) result for a given test run.
Mark Hahnenberg
Comment 4
2012-03-26 13:35:35 PDT
Created
attachment 133879
[details]
Patch
Dirk Pranke
Comment 5
2012-03-26 14:00:09 PDT
(In reply to
comment #3
)
> > Is it hard (or impossible) to fix things so that DRT and WTR don't actually crash? > > Like the bug description says, the crash isn't in DRT or WTR (or WebKit, for that matter): It's a bug in certain "daemons/system services/frameworks on the Mac bots". Fixing those crashes it out of scope for the WebKit project. >
Well, for daemons and system services, presumably you could make DRT/WTR/WebKit resilient to crashes in other processes. For frameworks obviously that's harder; however, in either case I would be concerned that crashes represent potential security issues and shouldn't be ignored. Obviously, having tests crash all the time is a good way to make sure they aren't ignored, but I agree that randomly flaky tests don't help anything.
> > However, I'm not sure that i think a flaky crash should be ignored (which is what would happen by default). > > What would you suggest instead? >
The alternative I was considering was that we could retry the test to see if it crashed twice, but still consider it a FAIL even if it only crashed once.
> We'd like a solution that can give an unambiguous "PASS" (green) or "FAIL" (red) result for a given test run.
Sure, wouldn't we all? :)
Dirk Pranke
Comment 6
2012-03-26 14:01:04 PDT
Comment on
attachment 133879
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=133879&action=review
> Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:901 > + # On Apple Mac, we retry crashes due to
https://bugs.webkit.org/show_bug.cgi?id=82233
this comment should move to mac.py now. Please fix that before landing ...
Mark Hahnenberg
Comment 7
2012-03-28 17:34:20 PDT
This was fixed in
http://trac.webkit.org/changeset/112189
.
Csaba Osztrogonác
Comment 8
2015-02-05 05:31:58 PST
Is it still needed? Apple buildbots run with --no-retry-failures for a while without a regular flakiness. Should we still retry crashes for local users only?
Alexey Proskuryakov
Comment 9
2015-02-05 09:51:22 PST
> Apple buildbots run with --no-retry-failures for a while
We still have retrying enabled on debug bots and on EWS. But as far as retrying crashed tests goes, I do not understand why we do that. AFAIK we still report it as a hard failure when first try crashes, and second one doesn't.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug