RESOLVED FIXED 106838
kill whole lighttpd process tree for chromium win
https://bugs.webkit.org/show_bug.cgi?id=106838
Summary kill whole lighttpd process tree for chromium win
Dirk Pranke
Reported 2013-01-14 16:04:04 PST
This is a speculative bug fix for http://code.google.com/p/chromium/issues/detail?id=169530 . My guess is that while we are able to kill the initial lighttp process, some child process is getting hung (as evidenced by a bunch of tests timing out) and killing the parent outright doesn't kill the children.
Attachments
Patch (5.21 KB, patch)
2013-01-14 16:32 PST, Dirk Pranke
no flags
patch for landing (4.96 KB, patch)
2013-01-14 17:05 PST, Dirk Pranke
no flags
Dirk Pranke
Comment 1 2013-01-14 16:32:55 PST
Dirk Pranke
Comment 2 2013-01-14 16:33:48 PST
Tony, what do you think of this? I can add lighttpd.exe to the list of processes to kill as well, but this seems like a "more correct" change.
Tony Chang
Comment 3 2013-01-14 16:43:13 PST
Comment on attachment 182658 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=182658&action=review I think the change is a fine way to prevent the bot from hanging, but the test run will still be junk because of too many timeouts. It'll probably confuse garden-o-matic a bit. > Tools/ChangeLog:9 > + Land a speculative fix for lighttpd.exe hanging on some google > + internal win bots; I think a test is causing a httpd server child process to I would remove internal since this is failing on the public build.chromium.org too (the Win bots used by the gardening tools). > Tools/Scripts/webkitpy/layout_tests/servers/http_server_unittest.py:36 > +from webkitpy.thirdparty.autoinstalled import unittest2 as unittest I would make this change separately.
Dirk Pranke
Comment 4 2013-01-14 16:52:03 PST
(In reply to comment #3) > (From update of attachment 182658 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=182658&action=review > > I think the change is a fine way to prevent the bot from hanging, but the test run will still be junk because of too many timeouts. It'll probably confuse garden-o-matic a bit. > Yeah, but that's no worse than any other too-many-timeouts build run failure, right? Note for the record that at the moment it's not clear what's causing the too many timeouts, possibly it's a server bug caused by a particular test, or possibly its the parallelism in the test harness ... > > Tools/ChangeLog:9 > > + Land a speculative fix for lighttpd.exe hanging on some google > > + internal win bots; I think a test is causing a httpd server child process to > > I would remove internal since this is failing on the public build.chromium.org too (the Win bots used by the gardening tools). > It is? I looked earlier today and didn't see any hangs on either the XP or Win7 canaries. I thought I saw one report of a hang earlier on a webkit.org bot, but I didn't see anyting recent there, either. At any rate, I'm happy to remove the comment. > > Tools/Scripts/webkitpy/layout_tests/servers/http_server_unittest.py:36 > > +from webkitpy.thirdparty.autoinstalled import unittest2 as unittest > > I would make this change separately. This was needed for the self.assertIn() method I'm using in the test (since that's new in 2.7 and we need unittest2 for 2.6 compatibility). Does that make this okay or do you want me to switch to a 2.6-compatible method like assertEquals() ?
Tony Chang
Comment 5 2013-01-14 17:00:03 PST
(In reply to comment #4) > (In reply to comment #3) > > > Tools/ChangeLog:9 > > > + Land a speculative fix for lighttpd.exe hanging on some google > > > + internal win bots; I think a test is causing a httpd server child process to > > > > I would remove internal since this is failing on the public build.chromium.org too (the Win bots used by the gardening tools). > > > > It is? I looked earlier today and didn't see any hangs on either the XP or Win7 canaries. I thought I saw one report of a hang earlier on a webkit.org bot, but I didn't see anyting recent there, either. WebKit Win7 has been hung since 5am and WebKit Win7 (dbg)(2) has been hung for days. http://build.chromium.org/p/chromium.webkit/waterfall?show=WebKit%20Win7&show=WebKit%20Win7%20(dbg)(2) The dbg hang might not be the same. I don't see lots of tests timing out. > > > Tools/Scripts/webkitpy/layout_tests/servers/http_server_unittest.py:36 > > > +from webkitpy.thirdparty.autoinstalled import unittest2 as unittest > > > > I would make this change separately. > > This was needed for the self.assertIn() method I'm using in the test (since that's new in 2.7 and we need unittest2 for 2.6 compatibility). Does that make this okay or do you want me to switch to a 2.6-compatible method like assertEquals() ? I would make the switch to unittest2 first, then land this patch.
Dirk Pranke
Comment 6 2013-01-14 17:02:24 PST
(In reply to comment #5) > > It is? I looked earlier today and didn't see any hangs on either the XP or Win7 canaries. I thought I saw one report of a hang earlier on a webkit.org bot, but I didn't see anyting recent there, either. > > WebKit Win7 has been hung since 5am and WebKit Win7 (dbg)(2) has been hung for days. > http://build.chromium.org/p/chromium.webkit/waterfall?show=WebKit%20Win7&show=WebKit%20Win7%20(dbg)(2) > Ah, you're right. I misread that one.
Dirk Pranke
Comment 7 2013-01-14 17:05:31 PST
Created attachment 182663 [details] patch for landing
WebKit Review Bot
Comment 8 2013-01-14 17:42:20 PST
Comment on attachment 182663 [details] patch for landing Clearing flags on attachment: 182663 Committed r139697: <http://trac.webkit.org/changeset/139697>
WebKit Review Bot
Comment 9 2013-01-14 17:42:24 PST
All reviewed patches have been landed. Closing bug.
Tony Chang
Comment 10 2013-01-15 11:21:10 PST
Comment on attachment 182663 [details] patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=182663&action=review > Tools/Scripts/webkitpy/layout_tests/servers/http_server.py:224 > + self._executive.run_command(["taskkill.exe", "/f", "/t", self._pid], error_handler=self._executive.ignore_error) This command is wrong: D:\src\src>taskkill.exe /f /t 4608 ERROR: Invalid argument/option - '4608'. Type "TASKKILL /?" for usage. It's missing a /pid before the number.
Eric Seidel (no email)
Comment 11 2013-01-15 11:25:05 PST
Executive has a kill() function which does taskkill.exe on windows iirc.
Eric Seidel (no email)
Comment 12 2013-01-15 11:25:35 PST
(In reply to comment #11) > Executive has a kill() function which does taskkill.exe on windows iirc. I see the patch now. nevermind.
Tony Chang
Comment 13 2013-01-15 11:37:16 PST
Fixed in http://trac.webkit.org/changeset/139764 and tested on my Windows machine using new-run-webkit-httpd --server {start,stop}. I'll fix all the hung Chromium Windows bots :(
Dirk Pranke
Comment 14 2013-01-15 12:40:29 PST
whoops. sigh :(.
Note You need to log in before you can comment on or make changes to this bug.