Bug 141235 - Tests don't work on some bots: Failed to stop wptwk
Summary: Tests don't work on some bots: Failed to stop wptwk
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P1 Critical
Assignee: youenn fablet
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-02-03 23:39 PST by Alexey Proskuryakov
Modified: 2015-02-21 12:22 PST (History)
5 users (show)

See Also:


Attachments
Patch (8.41 KB, patch)
2015-02-04 13:04 PST, youenn fablet
no flags Details | Formatted Diff | Diff
Adding unit tests (12.53 KB, patch)
2015-02-16 03:09 PST, youenn fablet
no flags Details | Formatted Diff | Diff
Updated changelog (12.42 KB, patch)
2015-02-21 11:30 PST, youenn fablet
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alexey Proskuryakov 2015-02-03 23:39:17 PST
I see tests fail on multiple bots tonight:

https://build.webkit.org/builders/Apple%20Mavericks%20Release%20WK2%20(Tests)/builds/11276
https://build.webkit.org/builders/Apple%20Mavericks%20Debug%20WK2%20(Tests)/builds/9271

ServerError: Failed to stop wptwk: please go to web platform test server log file to get the PID list
Comment 1 youenn fablet 2015-02-04 00:26:02 PST
OK.
I will mark as failed the two tests so that it does not disturb the bots and will investigate this further.
Comment 2 Alexey Proskuryakov 2015-02-04 09:45:55 PST
The bots are still completely broken - it's not just the two tests, it's an exception in run-webkit-tests that blocks reporting any results at all.
Comment 3 youenn fablet 2015-02-04 13:04:30 PST
Created attachment 246043 [details]
Patch
Comment 4 youenn fablet 2015-02-04 13:45:09 PST
(In reply to comment #3)
> Created attachment 246043 [details]
> Patch

This patch fixes the issue that broke the bots (uncaught exception).

The underlying issue is that run-webkit-tests is not always able to stop the wpt server properly. Reasons are unclear to me.

To make things more robust, this patch kills not only the launcher process but also all server processes. This should allow clean restarting.
Comment 5 youenn fablet 2015-02-05 07:39:50 PST
(In reply to comment #4)
> (In reply to comment #3)
> > Created attachment 246043 [details]
> > Patch
> 
> The underlying issue is that run-webkit-tests is not always able to stop the
> wpt server properly. Reasons are unclear to me.

I looked at the logs.
The bots stopped in the middle of layout test runs.
My previous patch was not handling the recovery correctly :(
The uploaded fix for bug 141235 should handle that issue, and probably a bit more.
 
It may be interesting to do "ps -A | grep python" on one or both of the corrupted bots when they do not run any task.
There may be 4 or 5 python processes remaining, with ids close one to the others. It might be best to kill them or restart the bots.

The current patch should be able to cleanly kill such processes in the case of a layout test run failure.
Comment 6 youenn fablet 2015-02-16 03:09:36 PST
Created attachment 246640 [details]
Adding unit tests
Comment 7 Ryosuke Niwa 2015-02-19 23:38:49 PST
Comment on attachment 246640 [details]
Adding unit tests

View in context: https://bugs.webkit.org/attachment.cgi?id=246640&action=review

Looks sane to me.

> Tools/ChangeLog:9
> +        Added killing of server subprocesses if still running.

Just say "kill the server wpt subprocess " in stop().

> Tools/Scripts/webkitpy/layout_tests/servers/web_platform_test_server.py:67
> +        self._servers_file = self._filesystem.join(self._runtime_path, '%s_servers.json' % (self._name))

Can't we just create a temporary file instead?
Comment 8 youenn fablet 2015-02-19 23:44:50 PST
Thanks for the review.

> > Tools/ChangeLog:9
> > +        Added killing of server subprocesses if still running.
> 
> Just say "kill the server wpt subprocess " in stop().

OK

> > Tools/Scripts/webkitpy/layout_tests/servers/web_platform_test_server.py:67
> > +        self._servers_file = self._filesystem.join(self._runtime_path, '%s_servers.json' % (self._name))
> 
> Can't we just create a temporary file instead?

self._runtime_path is a temp dir in which one can find the pid file for apache server or wpt server launcher.
Comment 9 youenn fablet 2015-02-21 11:30:13 PST
Created attachment 247051 [details]
Updated changelog
Comment 10 WebKit Commit Bot 2015-02-21 12:22:22 PST
Comment on attachment 247051 [details]
Updated changelog

Clearing flags on attachment: 247051

Committed r180480: <http://trac.webkit.org/changeset/180480>
Comment 11 WebKit Commit Bot 2015-02-21 12:22:29 PST
All reviewed patches have been landed.  Closing bug.