RESOLVED FIXED 51572
new-run-webkit-tests: shut up and exit when all the threads are wedged
https://bugs.webkit.org/show_bug.cgi?id=51572
Summary new-run-webkit-tests: shut up and exit when all the threads are wedged
Dirk Pranke
Reported 2010-12-23 15:57:18 PST
shut up and exit when all the threads are wedged
Attachments
Patch (4.17 KB, patch)
2010-12-23 16:00 PST, Dirk Pranke
no flags
use MockTestRunner() (4.27 KB, patch)
2010-12-23 18:55 PST, Dirk Pranke
no flags
update w/ fix in r76073, try again (5.02 KB, patch)
2011-02-02 18:18 PST, Dirk Pranke
no flags
remove duplicated code (5.78 KB, patch)
2011-02-03 19:12 PST, Dirk Pranke
no flags
Dirk Pranke
Comment 1 2010-12-23 16:00:18 PST
Mihai Parparita
Comment 2 2010-12-23 17:06:20 PST
Comment on attachment 77380 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=77380&action=review > Tools/Scripts/webkitpy/layout_tests/layout_package/message_broker_unittest.py:148 > + def update(self): There's a MockTestRunner in this file, can you use that?
Dirk Pranke
Comment 3 2010-12-23 17:24:54 PST
(In reply to comment #2) > (From update of attachment 77380 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=77380&action=review > > > Tools/Scripts/webkitpy/layout_tests/layout_package/message_broker_unittest.py:148 > > + def update(self): > > There's a MockTestRunner in this file, can you use that? Ah, good point. That's actually in a different class, but there's no compelling need to have two separate classes here, so I'll combine them.
Dirk Pranke
Comment 4 2010-12-23 18:55:39 PST
Created attachment 77397 [details] use MockTestRunner()
Dirk Pranke
Comment 5 2010-12-23 18:57:01 PST
Adding some more reviewers ... one of you want to review and/or rubber-stamp this? It's very small and should help the bots out (and anyone who runs the layout tests manually ...)
Adam Barth
Comment 6 2010-12-23 18:59:52 PST
Comment on attachment 77397 [details] use MockTestRunner() I love the name of this bug.
Eric Seidel (no email)
Comment 7 2010-12-23 19:00:21 PST
Comment on attachment 77397 [details] use MockTestRunner() I'm not sure I understand what this does.
Eric Seidel (no email)
Comment 8 2010-12-23 19:01:07 PST
Comment on attachment 77397 [details] use MockTestRunner() I think the cq will try to land this if it has a cq+, even if it still has r? (and will fail). Setting to cq? to prevent that.
Kenneth Russell
Comment 9 2010-12-23 19:41:12 PST
Comment on attachment 77397 [details] use MockTestRunner() I gather that this causes us to log only once per wedged thread and to abort if all threads get wedged. Two questions: (1) if this happens, is bailing out the right thing to do? Or should some other action (like spawning more threads) be taken? (2) Will the warning and early exit be properly handled as a failure?
Dirk Pranke
Comment 10 2010-12-23 20:36:48 PST
(In reply to comment #9) > (From update of attachment 77397 [details]) > I gather that this causes us to log only once per wedged thread and to abort if all threads get wedged. Ken's analysis is right. Right now if we get wedged we will log very frequently and run for a very long time before eventually either getting killed by the bot or we make enough progress to exit (although I'm not sure why the bot doesn't always get stuck). This causes things to fail fast. > Two questions: (1) if this happens, is bailing out the right thing to do? Or should some other action (like spawning more threads) be taken? If threads get wedged, then the process will never exit cleanly and the test run will be considered a failure. In addition, one or more tests will never complete. Given the way the code currently handles sharding the tests, large numbers of tests might not get to run. This is by no means a real fix, it is just intended to make the behavior less bad. The fix is the other set of patches I've been working on. > (2) Will the warning and early exit be properly handled as a failure? Yes.
Kenneth Russell
Comment 11 2010-12-23 21:15:43 PST
Comment on attachment 77397 [details] use MockTestRunner() Even if this isn't the final fix it's still a major improvement over the current situation. r=me
WebKit Commit Bot
Comment 12 2010-12-24 04:12:44 PST
Comment on attachment 77397 [details] use MockTestRunner() Clearing flags on attachment: 77397 Committed r74632: <http://trac.webkit.org/changeset/74632>
WebKit Commit Bot
Comment 13 2010-12-24 04:12:52 PST
All reviewed patches have been landed. Closing bug.
Philippe Normand
Comment 14 2010-12-24 13:45:39 PST
Sorry but I had to rollout this patch. The unittest makes test-webkitpy hang forever on the GTK bots. See also bug 51600
Dirk Pranke
Comment 15 2011-01-04 14:37:54 PST
Sigh. Sorry for the brokenness - I have no idea what's different about your config that is causing it to hang, but Python will wait for every spawned thread to exit before exiting the process, so that's probably what's going on here. What version of Linux runs on the GTK bots?
Eric Seidel (no email)
Comment 16 2011-01-05 19:03:58 PST
I just saw a commit-queue node hang in the python tests 9 hours ago, so unless this was re-landed, this was not the only cause.
Philippe Normand
Comment 17 2011-01-10 07:04:40 PST
(In reply to comment #15) > Sigh. Sorry for the brokenness - I have no idea what's different about your config that is causing it to hang, but Python will wait for every spawned thread to exit before exiting the process, so that's probably what's going on here. > > What version of Linux runs on the GTK bots? I could reproduce the issue on my laptop running Debian Sid. The GTK Debug bots run on Debian Sid as well, IIRC.
Dirk Pranke
Comment 18 2011-01-10 12:29:41 PST
Hmm. I don't know that I have an easy way to get an instance of Debian Sid to test this on. After I poke into this some more, and if it works for me on the Ubuntu instances I have, I may need your help to figure this one out.
Dirk Pranke
Comment 19 2011-02-02 18:18:44 PST
Created attachment 81020 [details] update w/ fix in r76073, try again
Kenneth Russell
Comment 20 2011-02-02 18:54:36 PST
Comment on attachment 81020 [details] update w/ fix in r76073, try again OK.
Dirk Pranke
Comment 21 2011-02-03 19:12:19 PST
Created attachment 81173 [details] remove duplicated code
Dirk Pranke
Comment 22 2011-02-03 19:12:45 PST
Dirk Pranke
Comment 23 2011-02-03 20:54:05 PST
Dirk Pranke
Comment 24 2011-02-17 17:30:19 PST
*** Bug 46833 has been marked as a duplicate of this bug. ***
Dirk Pranke
Comment 25 2011-02-17 17:34:02 PST
*** Bug 46320 has been marked as a duplicate of this bug. ***
Note You need to log in before you can comment on or make changes to this bug.