RESOLVED FIXED 51894
[Qt] run-qtwebkit-tests needs timeout
https://bugs.webkit.org/show_bug.cgi?id=51894
Summary [Qt] run-qtwebkit-tests needs timeout
Csaba Osztrogonác
Reported 2011-01-04 14:03:54 PST
Nowadays run-qtwebkit-tests hangs regularly on buildbots. It should have timeout and kill the bad testcases. INFO:Exec:Running... WebKitBuild/Release/WebKit/qt/tests/benchmarks/painting/tst_painting command timed out: 1200 seconds without output, killing pid .... I prefer run-qtwebkit-tests kills the tst_culprit_test in 1-2 minutes to buildbot kills run-qtwebkit-tests after 20 minutes.
Attachments
patch (4.15 KB, patch)
2011-01-06 07:40 PST, Jędrzej Nowacki
ossy: review-
Patch (5.73 KB, patch)
2011-01-07 04:09 PST, Jędrzej Nowacki
no flags
Jędrzej Nowacki
Comment 1 2011-01-05 00:06:45 PST
Sure, the script should be able to kill a test, but in the first place the test should not hang, it have to fail ASAP ;-). I'm going to add the new functionality.
Jędrzej Nowacki
Comment 2 2011-01-05 01:01:49 PST
Just to be precise, you mean kill a bad testsuite, not a testcase, right? I would describe them as: Testsuite is something like tst_painting Testcase is something like tst_painting::paint() QTestLib doesn't provide timeout functionality. I may change run-qtwebkit-tests script so it can kill a subprocess (which is a testsuite) after specified timeout. But that solution is essentially similar to kill the script by buildbot. If you want to kill a testcase it would be much more complex, as the script have to run each testcase in a separate process. That may introduce an issue; results of running tests manually and by the script could be different, because of possible dependency between testcases.
Csaba Osztrogonác
Comment 3 2011-01-05 04:06:06 PST
(In reply to comment #2) > Just to be precise, you mean kill a bad testsuite, not a testcase, right? > > I would describe them as: > Testsuite is something like tst_painting > Testcase is something like tst_painting::paint() I was a little ambiguous, I mean killing the bad testsuite, of course. > QTestLib doesn't provide timeout functionality. I may change > run-qtwebkit-tests script so it can kill a subprocess (which > is a testsuite) after specified timeout. But that solution is > essentially similar to kill the script by buildbot. It would be absolutely correct solution. ;) I only would like to ensure that run-qtwebkit-tests won't block a buildslave for 20 minutes if something goes wrong in a testsuite. Now the runtime in parallel mode is ~14-15 seconds and ~38-40 seconds with single thread. So I think a 30 seconds timeout would be more than enough for each testsuite. Otherwise I supsect a network error/timeout might cause this problem, because the occurrences aren't random, but constant in a random (max. 1-2 hours long) time interval.
Jędrzej Nowacki
Comment 4 2011-01-06 07:40:09 PST
Created attachment 78120 [details] patch I do not set cq to '?' as some configuration changes may by applied on the buildbot side.
Csaba Osztrogonác
Comment 5 2011-01-06 08:06:14 PST
Comment on attachment 78120 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=78120&action=review I got many errors like this: Exception in thread Thread-1: Traceback (most recent call last): File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner self.run() File "/usr/lib/python2.5/threading.py", line 663, in run self.function(*self.args, **self.kwargs) File "Tools/Scripts/run-qtwebkit-tests", line 133, in process_killer tst.terminate() AttributeError: 'Popen' object has no attribute 'terminate' > Tools/Scripts/run-qtwebkit-tests:72 > + opt.add_option("-t", "--timeout", action="store", type="int", > + dest="timeout", default=0, > + help="Timeout in seconds for each testsuite. Zero value means that there is not timeout. Default: %default.") I prefer 30 seconds as default value.
Peter Gal
Comment 6 2011-01-06 08:08:25 PST
Just some minor things: > Tools/Scripts/run-qtwebkit-tests:125 > + tst = Popen([test_suite.test_file_name(), ] + options.split(), stdout=PIPE, stderr=None, shell=False) The comma inside the brackets isn't required, it will be a list without it too. Also the shell=False is unnecessary, that is the default behavior for Popen. > Tools/Scripts/run-qtwebkit-tests:127 > + tst = Popen([test_suite.test_file_name(), ] + options.split(), stdout=None, stderr=STDOUT, shell=False) ditto.
Peter Gal
Comment 7 2011-01-06 08:10:13 PST
(In reply to comment #5) > (From update of attachment 78120 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=78120&action=review > > I got many errors like this: > > Exception in thread Thread-1: > Traceback (most recent call last): > File "/usr/lib/python2.5/threading.py", line 486, in __bootstrap_inner > self.run() > File "/usr/lib/python2.5/threading.py", line 663, in run > self.function(*self.args, **self.kwargs) > File "Tools/Scripts/run-qtwebkit-tests", line 133, in process_killer > tst.terminate() > AttributeError: 'Popen' object has no attribute 'terminate' Popen's terminate() method is only available from python 2.6
Peter Gal
Comment 8 2011-01-06 08:15:31 PST
> Tools/Scripts/run-qtwebkit-tests:133 > + log.debug("Setting timeout timer %i sec on %s (process %s)", timeout, test_suite.test_file_name(), tst.pid) > + tst.terminate() maybe 'os.kill(tst.pid, signal.SIGTERM)' could resolve the problem for python 2.5
Csaba Osztrogonác
Comment 9 2011-01-06 08:30:29 PST
Comment on attachment 78120 [details] patch Please fix issues mentioned in Comment #6 and Comment #8 and add 30 seconds default timeout and I'll give an r+.
Jędrzej Nowacki
Comment 10 2011-01-07 00:59:04 PST
(In reply to comment #5) > I prefer 30 seconds as default value. I don't think that it is a good idea. 30 sec is good for buildbot, but is not good enough for others. For example I need 2 min to run tests, more in debug mode and It takes only a few "weeks" on my old laptop :-). Personally I prefer to be not bothered by timeouts. (In reply to comment #6) > The comma inside the brackets isn't required, it will be a list without it too. Also the shell=False is unnecessary, that is the default behavior for Popen. True. (In reply to comment #8) > maybe 'os.kill(tst.pid, signal.SIGTERM)' could resolve the problem for python 2.5 It won't work on windows until python 2.7 :/. Anyway I'm not aware of any Windows user who is using the script. Probably one of http://code.activestate.com/recipes/347462-terminating-a-subprocess-on-windows/ solutions should be used instead, but I don't have environment to check.
Jędrzej Nowacki
Comment 11 2011-01-07 01:08:00 PST
(In reply to comment #10) > (In reply to comment #8) > > maybe 'os.kill(tst.pid, signal.SIGTERM)' could resolve the problem for python 2.5 > It won't work on windows until python 2.7 :/. Anyway I'm not aware of any Windows user who is using the script. Probably one of http://code.activestate.com/recipes/347462-terminating-a-subprocess-on-windows/ solutions should be used instead, but I don't have environment to check. I would write something like that: try: tst.terminate() except AttributeError: # python version < 2.6 os.kill(tst.pid, signal.SIGTERM) # FIXME: Windows support what do you think? Is it enough?
Peter Gal
Comment 12 2011-01-07 01:57:35 PST
(In reply to comment #11) > (In reply to comment #10) > > (In reply to comment #8) > > > maybe 'os.kill(tst.pid, signal.SIGTERM)' could resolve the problem for python 2.5 > > It won't work on windows until python 2.7 :/. Anyway I'm not aware of any Windows user who is using the script. Probably one of http://code.activestate.com/recipes/347462-terminating-a-subprocess-on-windows/ solutions should be used instead, but I don't have environment to check. > > I would write something like that: > > try: > tst.terminate() > except AttributeError: > # python version < 2.6 > os.kill(tst.pid, signal.SIGTERM) > # FIXME: Windows support > > what do you think? Is it enough? What if we check the python version? ( http://docs.python.org/library/platform.html#platform.python_version_tuple ) I personally don't like exceptions if I can found an other way for it. so maybe like this: ver = platform.python_version_tuple() if ver[0] == 2 and ver[1] < 6: if platform.system == "win32": # win .. else: os.kill(tst.pid, signal.SIGTERM) else: tst.terminate()
Csaba Osztrogonác
Comment 13 2011-01-07 03:39:52 PST
(In reply to comment #10) > I don't think that it is a good idea. 30 sec is good for buildbot, but is not good enough for others. For example I need 2 min to run tests, more in debug mode and It takes only a few "weeks" on my old laptop :-). Personally I prefer to be not bothered by timeouts. OK, you're right. Please add "--timeout 30" to Tools/BuildSlaveSupport/build.webkit.org-config/master.cfg instead of the default timeout to make buildbots happy. (It must be enough, because the maximum runtime of a test suite is 10-12 seconds on the bot.) (In reply to comment #12) > What if we check the python version? ( http://docs.python.org/library/platform.html#platform.python_version_tuple ) > I personally don't like exceptions if I can found an other way for it. > > so maybe like this: > > ver = platform.python_version_tuple() > if ver[0] == 2 and ver[1] < 6: > if platform.system == "win32": > # win > .. > else: > os.kill(tst.pid, signal.SIGTERM) > else: > tst.terminate() It looks good to me with one of the windows implementation mentioned in comment #10 .
Jędrzej Nowacki
Comment 14 2011-01-07 04:07:13 PST
(In reply to comment #12) > What if we check the python version? ( http://docs.python.org/library/platform.html#platform.python_version_tuple ) > I personally don't like exceptions if I can found an other way for it. You shouldn't avoid exceptions, it is nice feature that simplify your code ;-). Python encourages usage of exceptions. Please check http://docs.python.org/glossary.html#term-eafp
Jędrzej Nowacki
Comment 15 2011-01-07 04:09:36 PST
Created attachment 78220 [details] Patch I haven't tested that on Win, but I checked docs and it should work(TM) :-)
WebKit Commit Bot
Comment 16 2011-01-07 05:07:34 PST
Comment on attachment 78220 [details] Patch Clearing flags on attachment: 78220 Committed r75240: <http://trac.webkit.org/changeset/75240>
WebKit Commit Bot
Comment 17 2011-01-07 05:07:41 PST
All reviewed patches have been landed. Closing bug.
Csaba Osztrogonác
Comment 18 2011-01-07 05:39:20 PST
Bill, could you update the master to make our bot happier, please?
William Siegrist
Comment 19 2011-01-07 07:28:53 PST
Buildbot updated.
Note You need to log in before you can comment on or make changes to this bug.