RESOLVED FIXED 38300
new-run-webkit-tests is hitting a python bug, and hanging/crashing on Chromium Mac Bots
https://bugs.webkit.org/show_bug.cgi?id=38300
Eric Seidel (no email)
Reported 2010-04-28 17:50:14 PDT
new-run-webkit-tests is hanging on the Chromium bots. Chromium actually runs its own wrapper, run_webkit_tests.py, but same deal. This is a continuation of bug 37987. We thought it was resolved by http://trac.webkit.org/changeset/58314, but it does not appear to be. I've been able to reproduce two different hangs locally. This bug will cover the crazy python logging hang. Bug 38298 will cover the blocking I/O hang.
Attachments
Sample from new-run-webkit-tests when hung in logging code. (7.58 KB, text/plain)
2010-04-28 17:52 PDT, Eric Seidel (no email)
no flags
crash report from python assert crash (29.20 KB, text/plain)
2010-04-28 18:46 PDT, Eric Seidel (no email)
no flags
Eric Seidel (no email)
Comment 1 2010-04-28 17:52:38 PDT
Created attachment 54650: Sample from new-run-webkit-tests when hung in logging code.
I looked at this sample with python developer Jeffrey Yasskin. We were not able to find the cause by inspection. I updated the stack dumping code (locally) to also print out "logging._lock", which should tell us which thread is holding the lock.
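For anyone wanting to try the same thing, here is a minimal sketch of that kind of stack dump. It is not the actual new-run-webkit-tests code: the function name dump_thread_stacks is made up for illustration, and logging._lock is a private CPython detail that may change or disappear, so it is read defensively. On recent Pythons the repr of the lock includes the owning thread id while it is held.

```python
import logging
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report of every thread's stack plus the logging module lock.

    Hypothetical helper for illustration; not taken from new-run-webkit-tests.
    """
    # Map thread idents to names so the report is easier to read.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    lines = []
    # sys._current_frames() maps each thread ident to its current frame.
    for ident, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (names.get(ident, "unknown"), ident))
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
    # Assumption: logging._lock is the module-level lock used by the logging
    # package. It is private, so fall back to None if it is missing.
    lines.append("logging._lock: %r" % (getattr(logging, "_lock", None),))
    return "\n".join(lines)


if __name__ == "__main__":
    print(dump_thread_stacks())
```

Calling this from a signal handler or a watchdog thread when the run appears hung gives one report per call; comparing the thread id in the lock line against the thread headers shows who is holding it.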
Eric Seidel (no email)
Comment 2 2010-04-28 18:23:52 PDT
Each [snip] represents 100 or so repetitions of the same line. Note that it appears multiple threads are printing the same debug message. This definitely seems to be a python bug, since it crashed. Not sure how or why we're tickling it.

100428 18:15:13 dump_render_tree_thread.py:349 DEBUG Thread-3 http/tests/navigation/reload-subframe-iframe.html passed
100428 18:15:13 dump_render_tree_thread.py:349 DEBUG Thread-2 fast/block/float/marquee-shrink-to-avoid-floats.html passed
pthread_cond_wait: Invalid argument
[snip]
pthread_cond_wait: Invalid argument
pthread_cond_wait: Invalipthread_cond_wait: Invalid argument
pthread_cond_wait: Invalid argument
[snip]
pthread_cond_wait: Invalid argument
pthread_cond_wait: Invalid argument
d argument
pthread_cond_wait: Invalid argument
pthread_cond_wait: Invalid argument
[snip]
pthread_cond_wait: Invalid argument
pthread_cond_wait: Invalid argument
100428 18:15:14 dump_render_tree_thread.py:349 DEBUG Thread-2 fast/block/float/multiple-float-positioning.html passed
pthread_cond_wait: Invalid argument
pthread_cond_wait: Invalid argument
[snip]
pthread_cond_wait: Invalid argument
pthread_cond_wait: Invalid argument
Assertion failed: (tstate != NULL), function PyEval_EvalCodeEx, file Python/ceval.c, line 2664.
Eric Seidel (no email)
Comment 3 2010-04-28 18:46:01 PDT
Created attachment 54659: crash report from python assert crash
Eric Seidel (no email)
Comment 4 2010-04-28 18:47:37 PDT
I wonder if these asserts are during the fork/exec process.
Eric Seidel (no email)
Comment 5 2010-04-28 21:04:54 PDT
*** Bug 38252 has been marked as a duplicate of this bug. ***
Eric Seidel (no email)
Comment 6 2010-05-03 16:53:49 PDT
So far we've only seen this reported for Mac, and only for run_webkit_tests.py (Chromium's new-run-webkit-tests wrapper). I suspect it exists for both the Chromium and WebKit ports, however (they use slightly different python code). I suspect it may exist on non-Mac python versions as well, although it may be specific to Python 2.5. More investigation is required. Anyone who has seen this should add their platform information/python version to the bug, so I can get a sense of where we're seeing this and how often.
Dirk Pranke
Comment 7 2010-05-04 16:13:56 PDT
From a fairly quick look over the past few days on the Chromium bots, several bots (WebKit Mac, WebKit Mac (dbg)(3) at least) are hanging with several different sets of symptoms. It looks like most of those hangs are probably not related to pretty patch, since it looks like pretty patch may not be available. I will try to dig up some representative stack traces.
Dirk Pranke
Comment 8 2011-02-18 19:24:36 PST
This should be fixed as of r79062.