51572 – new-run-webkit-tests: shut up and exit when all the threads are wedged

RESOLVED FIXED 51572

new-run-webkit-tests: shut up and exit when all the threads are wedged

https://bugs.webkit.org/show_bug.cgi?id=51572

Summary new-run-webkit-tests: shut up and exit when all the threads are wedged

Dirk Pranke

Reported 2010-12-23 15:57:18 PST

shut up and exit when all the threads are wedged

Attachments
Patch (4.17 KB, patch) 2010-12-23 16:00 PST, Dirk Pranke	no flags	Details Formatted Diff Diff
use MockTestRunner() (4.27 KB, patch) 2010-12-23 18:55 PST, Dirk Pranke	no flags	Details Formatted Diff Diff
update w/ fix in r76073, try again (5.02 KB, patch) 2011-02-02 18:18 PST, Dirk Pranke	no flags	Details Formatted Diff Diff
remove duplicated code (5.78 KB, patch) 2011-02-03 19:12 PST, Dirk Pranke	no flags	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Dirk Pranke

Comment 1 2010-12-23 16:00:18 PST

Created attachment 77380 [details] Patch

Mihai Parparita

Comment 2 2010-12-23 17:06:20 PST

Comment on attachment 77380 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=77380&action=review > Tools/Scripts/webkitpy/layout_tests/layout_package/message_broker_unittest.py:148 > + def update(self): There's a MockTestRunner in this file, can you use that?

Dirk Pranke

Comment 3 2010-12-23 17:24:54 PST

(In reply to comment #2) > (From update of attachment 77380 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=77380&action=review > > > Tools/Scripts/webkitpy/layout_tests/layout_package/message_broker_unittest.py:148 > > + def update(self): > > There's a MockTestRunner in this file, can you use that? Ah, good point. That's actually in a different class, but there's no compelling need to have two separate classes here, so I'll combine them.

Dirk Pranke

Comment 4 2010-12-23 18:55:39 PST

Created attachment 77397 [details] use MockTestRunner()

Dirk Pranke

Comment 5 2010-12-23 18:57:01 PST

Adding some more reviewers ... one of you want to review and/or rubber-stamp this? It's very small and should help the bots out (and anyone who runs the layout tests manually ...)

Adam Barth

Comment 6 2010-12-23 18:59:52 PST

Comment on attachment 77397 [details] use MockTestRunner() I love the name of this bug.

Eric Seidel (no email)

Comment 7 2010-12-23 19:00:21 PST

Comment on attachment 77397 [details] use MockTestRunner() I'm not sure I understand what this does.

Eric Seidel (no email)

Comment 8 2010-12-23 19:01:07 PST

Comment on attachment 77397 [details] use MockTestRunner() I think the cq will try to land this if it has a cq+, even if it still has r? (and will fail). Setting to cq? to prevent that.

Kenneth Russell

Comment 9 2010-12-23 19:41:12 PST

Comment on attachment 77397 [details] use MockTestRunner() I gather that this causes us to log only once per wedged thread and to abort if all threads get wedged. Two questions: (1) if this happens, is bailing out the right thing to do? Or should some other action (like spawning more threads) be taken? (2) Will the warning and early exit be properly handled as a failure?

Dirk Pranke

Comment 10 2010-12-23 20:36:48 PST

(In reply to comment #9) > (From update of attachment 77397 [details]) > I gather that this causes us to log only once per wedged thread and to abort if all threads get wedged. Ken's analysis is right. Right now if we get wedged we will log very frequently and run for a very long time before eventually either getting killed by the bot or we make enough progress to exit (although I'm not sure why the bot doesn't always get stuck). This causes things to fail fast. > Two questions: (1) if this happens, is bailing out the right thing to do? Or should some other action (like spawning more threads) be taken? If threads get wedged, then the process will never exit cleanly and the test run will be considered a failure. In addition, one or more tests will never complete. Given the way the code currently handles sharding the tests, large numbers of tests might not get to run. This is by no means a real fix, it is just intended to make the behavior less bad. The fix is the other set of patches I've been working on. > (2) Will the warning and early exit be properly handled as a failure? Yes.

Kenneth Russell

Comment 11 2010-12-23 21:15:43 PST

Comment on attachment 77397 [details] use MockTestRunner() Even if this isn't the final fix it's still a major improvement over the current situation. r=me

WebKit Commit Bot

Comment 12 2010-12-24 04:12:44 PST

Comment on attachment 77397 [details] use MockTestRunner() Clearing flags on attachment: 77397 Committed r74632: <http://trac.webkit.org/changeset/74632>

WebKit Commit Bot

Comment 13 2010-12-24 04:12:52 PST

All reviewed patches have been landed. Closing bug.

Philippe Normand

Comment 14 2010-12-24 13:45:39 PST

Sorry but I had to rollout this patch. The unittest makes test-webkitpy hang forever on the GTK bots. See also bug 51600

Dirk Pranke

Comment 15 2011-01-04 14:37:54 PST

Sigh. Sorry for the brokenness - I have no idea what's different about your config that is causing it to hang, but Python will wait for every spawned thread to exit before exiting the process, so that's probably what's going on here. What version of Linux runs on the GTK bots?

Eric Seidel (no email)

Comment 16 2011-01-05 19:03:58 PST

I just saw a commit-queue node hang in the python tests 9 hours ago, so unless this was re-landed, this was not the only cause.

Philippe Normand

Comment 17 2011-01-10 07:04:40 PST

(In reply to comment #15) > Sigh. Sorry for the brokenness - I have no idea what's different about your config that is causing it to hang, but Python will wait for every spawned thread to exit before exiting the process, so that's probably what's going on here. > > What version of Linux runs on the GTK bots? I could reproduce the issue on my laptop running Debian Sid. The GTK Debug bots run on Debian Sid as well, IIRC.

Dirk Pranke

Comment 18 2011-01-10 12:29:41 PST

Hmm. I don't know that I have an easy way to get an instance of Debian Sid to test this on. After I poke into this some more, and if it works for me on the Ubuntu instances I have, I may need your help to figure this one out.

Dirk Pranke

Comment 19 2011-02-02 18:18:44 PST

Created attachment 81020 [details] update w/ fix in r76073, try again

Kenneth Russell

Comment 20 2011-02-02 18:54:36 PST

Comment on attachment 81020 [details] update w/ fix in r76073, try again OK.

Dirk Pranke

Comment 21 2011-02-03 19:12:19 PST

Created attachment 81173 [details] remove duplicated code

Dirk Pranke

Comment 22 2011-02-03 19:12:45 PST

Committed r77586: <http://trac.webkit.org/changeset/77586>

Dirk Pranke

Comment 23 2011-02-03 20:54:05 PST

Committed r77595: <http://trac.webkit.org/changeset/77595>

Dirk Pranke

Comment 24 2011-02-17 17:30:19 PST

*** Bug 46833 has been marked as a duplicate of this bug. ***

Dirk Pranke

Comment 25 2011-02-17 17:34:02 PST

*** Bug 46320 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Other

OS OS X 10.5

Product WebKit

Component New Bugs

Assignee

Dirk Pranke

Reported

2010-12-23 15:57 PST

Modified

2011-02-17 17:34 PST History

CC List

8 users Show

URL

Keywords

Duplicates (2)

46320 46833 View as bug list

Depends on

51600

Blocks

Dependencies

tree graph