WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED WORKSFORME
59570
new-run-webkit-tests is flaky on mac
https://bugs.webkit.org/show_bug.cgi?id=59570
Summary
new-run-webkit-tests is flaky on mac
Geoffrey Garen
Reported
2011-04-26 18:39:46 PDT
I have a patch that bundles up all flaky tests to run at the end. Seems to fail the http tests, though.
Attachments
patch
(6.62 KB, text/plain)
2011-04-26 18:40 PDT
,
Geoffrey Garen
no flags
Details
patch
(4.59 KB, patch)
2011-04-26 18:56 PDT
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
patch
(6.43 KB, patch)
2011-04-27 18:23 PDT
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
Patch
(7.79 KB, patch)
2011-04-27 21:53 PDT
,
Geoffrey Garen
levin
: review-
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Geoffrey Garen
Comment 1
2011-04-26 18:40:29 PDT
Created
attachment 91206
[details]
patch Here's the patch.
Dirk Pranke
Comment 2
2011-04-26 18:51:01 PDT
when you say "seems to fail the http tests", you mean that running with this patch failed on the http tests, right? without this patch, do the http tests run okay?
Geoffrey Garen
Comment 3
2011-04-26 18:56:42 PDT
Created
attachment 91209
[details]
patch Now with http tests fixed.
Geoffrey Garen
Comment 4
2011-04-26 18:58:56 PDT
> when you say "seems to fail the http tests", you mean that running with this patch failed on the http tests, right?
Yes.
> without this patch, do the http tests run okay?
Yes. Should be fixed in the latest patch. It was a bad rename.
Geoffrey Garen
Comment 5
2011-04-27 18:23:56 PDT
Created
attachment 91396
[details]
patch Updated to retry more reliably and serialize Java and a few more things.
Dirk Pranke
Comment 6
2011-04-27 18:37:04 PDT
how well do the tests work when you run w/o this patch? If we retry the failures, does everything pass for you on the second run?
Geoffrey Garen
Comment 7
2011-04-27 21:50:28 PDT
(In reply to
comment #6
)
> how well do the tests work when you run w/o this patch? If we retry the failures, does everything pass for you on the second run?
Without this patch, running with 16-32 parallel instances, I got ~10 failures per run, and the failures didn't necessarily pass on retry. The failures seemed to change unpredictably between runs. (At least, there were enough failures with enough variety that I couldn't suss out a pattern.)
Geoffrey Garen
Comment 8
2011-04-27 21:53:10 PDT
Created
attachment 91425
[details]
Patch
Geoffrey Garen
Comment 9
2011-04-27 21:58:18 PDT
I'm not super-familiar with this code, and I've never programmed in Python before, but I think this patch is worth an r?. On my SnowLeopard Mac Pro, it makes new-run-webkit-tests basically as reliable as run-webkit-tests.
Dirk Pranke
Comment 10
2011-04-27 22:16:15 PDT
Comment on
attachment 91425
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=91425&action=review
Given what you responded to, I'm okay with adding a concept like this to the test harness in order to make tests more reliable, but we wouldn't want to land this patch as is. I've got some specific comments below, but unless you really want to learn the code better and hack on this yourself, you can leave this for a day or two and I'll be happy to implement it.
> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:72 > +_locked_tests = [
We definitely wouldn't want to hard-code the list of tests and directories into this file, which is generic code shared across all of the implementations. One possibility would be to add methods on the Port objects (e.g. webkitpy/layout_tests/port/webkit.py) to return lists of files and directories that needed to be serialized. However, it's probably better to pull this out of code altogether and use the test_expectations.txt file to track this. I think the right way to go would be to add a SERIAL modifier (or some similar name), just like we have modifiers to indicate that tests can be SLOW.
> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:103 > + # These tests rely on filesystem state, so they can't run in parallel.
This is almost certainly not true in general. It isn't for the chromium ports, at least. There may be some tests that have state, but not all of them, and you would lose a lot of parallelism by enforcing this this broadly.
> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:-595 > - int(self._options.child_processes) > 1 and not self._options.experimental_fully_parallel)
By itself, I think the idea of running all of the retried tests serially makes sense. There should rarely be many of them. However, it might make sense to indicate this simply by changing the value of options.child_processes to 1 before the retry (or passing the child_processes value in as a parameter instead).
> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:696 > + worker_number = num_workers + 1
This block of code duplicates the above block, and should be refactored into a separate method.
David Levin
Comment 11
2011-04-28 13:09:53 PDT
Comment on
attachment 91425
[details]
Patch Based on Dirk's comments, it looks like there are some reasonable things to address before this can be landed,so r-.
Ojan Vafai
Comment 12
2011-05-10 11:14:50 PDT
(In reply to
comment #10
)
> (From update of
attachment 91425
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=91425&action=review
> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:72 > > +_locked_tests = [ > > We definitely wouldn't want to hard-code the list of tests and directories into this file, which is generic code shared across all of the implementations. One possibility would be to add methods on the Port objects (e.g. webkitpy/layout_tests/port/webkit.py) to return lists of files and directories that needed to be serialized. > > However, it's probably better to pull this out of code altogether and use the test_expectations.txt file to track this. I think the right way to go would be to add a SERIAL modifier (or some similar name), just like we have modifiers to indicate that tests can be SLOW.
I don't think I agree. In general, the problem here is often with the tests themselves, so the list will want to be shared across ports. That said, it might make sense to put it in port/base.py and that way different ports can add to/override the list in base.py.
> > Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:103 > > + # These tests rely on filesystem state, so they can't run in parallel. > > This is almost certainly not true in general. It isn't for the chromium ports, at least. There may be some tests that have state, but not all of them, and you would lose a lot of parallelism by enforcing this this broadly. > > > Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:-595 > > - int(self._options.child_processes) > 1 and not self._options.experimental_fully_parallel) > > By itself, I think the idea of running all of the retried tests serially makes sense. There should rarely be many of them. However, it might make sense to indicate this simply by changing the value of options.child_processes to 1 before the retry (or passing the child_processes value in as a parameter instead).
Have we tried whether just making the retry step serial is sufficient? It would be great if we didn't need to maintain a list of tests at all. Certainly we could commit that part of this patch and see where that gets us on the NRWT bot.
Geoffrey Garen
Comment 13
2011-05-10 11:46:55 PDT
> Have we tried whether just making the retry step serial is sufficient? It would be great if we didn't need to maintain a list of tests at all. Certainly we could commit that part of this patch and see where that gets us on the NRWT bot.
I haven't tried that. Seems like it could work.
Eric Seidel (no email)
Comment 14
2011-07-05 13:29:54 PDT
Comment on
attachment 91425
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=91425&action=review
>>> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:72 >>> +_locked_tests = [ >> >> We definitely wouldn't want to hard-code the list of tests and directories into this file, which is generic code shared across all of the implementations. One possibility would be to add methods on the Port objects (e.g. webkitpy/layout_tests/port/webkit.py) to return lists of files and directories that needed to be serialized. >> >> However, it's probably better to pull this out of code altogether and use the test_expectations.txt file to track this. I think the right way to go would be to add a SERIAL modifier (or some similar name), just like we have modifiers to indicate that tests can be SLOW. > > I don't think I agree. In general, the problem here is often with the tests themselves, so the list will want to be shared across ports. That said, it might make sense to put it in port/base.py and that way different ports can add to/override the list in base.py.
I thought we were using test_expectations for this sort of thing?
http://trac.webkit.org/browser/trunk/LayoutTests/platform/mac/test_expectations.txt#L66
>>> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:103 >>> + # These tests rely on filesystem state, so they can't run in parallel. >> >> This is almost certainly not true in general. It isn't for the chromium ports, at least. There may be some tests that have state, but not all of them, and you would lose a lot of parallelism by enforcing this this broadly. > > Have we tried whether just making the retry step serial is sufficient? It would be great if we didn't need to maintain a list of tests at all. Certainly we could commit that part of this patch and see where that gets us on the NRWT bot.
I believe the current approach is that tests which fail will just be retried (in serial). That's not ideal though because then the tests are reported as "flaky" even though we expected them to fail when run in parallel. I think we should work on encoding this information in the LayoutTests file system, and making DRT more parallel-friendly. Durring the proposed transition (this week) we're just running NRWT with a single child process on the bots to avoid these issues.
> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:114 > + # Java plug-in seems to malfunction unpredictably when used under parallel load.
Java seems to malfunction in general. :) the commit-queue has filed several flaky-test bugs about java related tests.
Dirk Pranke
Comment 15
2011-07-05 13:38:01 PDT
(In reply to
comment #14
)
> (From update of
attachment 91425
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=91425&action=review
> > >>> Tools/Scripts/webkitpy/layout_tests/layout_package/manager.py:72 > >>> +_locked_tests = [ > >> > >> We definitely wouldn't want to hard-code the list of tests and directories into this file, which is generic code shared across all of the implementations. One possibility would be to add methods on the Port objects (e.g. webkitpy/layout_tests/port/webkit.py) to return lists of files and directories that needed to be serialized. > >> > >> However, it's probably better to pull this out of code altogether and use the test_expectations.txt file to track this. I think the right way to go would be to add a SERIAL modifier (or some similar name), just like we have modifiers to indicate that tests can be SLOW. > > > > I don't think I agree. In general, the problem here is often with the tests themselves, so the list will want to be shared across ports. That said, it might make sense to put it in port/base.py and that way different ports can add to/override the list in base.py. > > I thought we were using test_expectations for this sort of thing? >
http://trac.webkit.org/browser/trunk/LayoutTests/platform/mac/test_expectations.txt#L66
>
We were (or I was, at least). Three problems with this approach ... 1) It is unclear if non-Chromium ports want to use test expectations at all. 2) It is unclear how to share the knowledge of tests that are generally flaky across ports if each port has its own test_expectations file. 3) Given the current implementation of test_expectations, the best we can do is to mark a test as "PASS FAIL", which is pretty broad and can cause an awful lot of things to be hidden. Per my comment in #10, If we wanted to make things cleaner via the test_expectations.txt route, I think the best approach to tracking this would be to introduce a new modifier like "SERIAL" to complement "SLOW" to indicate that the test should be run serially.
Eric Seidel (no email)
Comment 16
2011-08-08 10:14:44 PDT
I'm not sure what to do with this bug. I guess we'll keep it around until we complete the transition of all bots over to parallel NRWT. We'll have to solve this sort of issue somehow to do that conversion.
Dirk Pranke
Comment 17
2011-12-21 14:08:08 PST
ggaren, do you think we can close this now? It seems like retrying the failures is probably good enough?
Dirk Pranke
Comment 18
2012-06-08 16:09:06 PDT
I'm gonna close this; please reopen or file a new bug if you wish for us to do something more.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug