Bug 84177 - Lion Production Test failing with error: "Failed to stop httpd: pid file still exists"
Summary: Lion Production Test failing with error: "Failed to stop httpd: pid file stil...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Jer Noble
URL: http://build.webkit.org/builders/Lion...
Keywords:
Depends on:
Blocks:
 
Reported: 2012-04-17 11:55 PDT by Jer Noble
Modified: 2012-04-20 12:34 PDT (History)
5 users (show)

See Also:


Attachments
Try to remove apache's pid file in the case it's stale (1.72 KB, patch)
2012-04-17 12:59 PDT, Ryosuke Niwa
dpranke: review-
Details | Formatted Diff | Diff
Patch (2.04 KB, patch)
2012-04-18 13:17 PDT, Jer Noble
no flags Details | Formatted Diff | Diff
Patch (1.82 KB, patch)
2012-04-18 13:43 PDT, Jer Noble
dpranke: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jer Noble 2012-04-17 11:55:49 PDT
run_webkit_tests is failing with the following stack trace:

Traceback (most recent call last):
  File "/Volumes/Data/slave/lion-intel-release-tests/build/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py", line 459, in <module>
    sys.exit(main())
...snip...
    raise http_server_base.ServerError('Failed to stop %s: pid file still exists' % self._name)
webkitpy.layout_tests.servers.http_server_base.ServerError: Failed to stop httpd: pid file still exists
Failed to execute Tools/Scripts/new-run-webkit-tests at ./Tools/Scripts/run-webkit-tests line 122.
Comment 1 Ryosuke Niwa 2012-04-17 12:59:31 PDT
Created attachment 137586 [details]
Try to remove apache's pid file in the case it's stale
Comment 2 Jer Noble 2012-04-17 13:03:02 PDT
Looks like it might be a hardware problem.  Rebooted the offending bot. <http://build.webkit.org/buildslaves/apple-xserve-10>
Comment 3 Dirk Pranke 2012-04-17 13:20:44 PDT
Comment on attachment 137586 [details]
Try to remove apache's pid file in the case it's stale

See the comment right after this ... you can't assume apache has exited until the pid file is removed. If you delete the pid file too early, you may attempt to restart the server while the previous server is still running, and fail. You should only delete the pid file is the pid is gone.

You should probably reuse the log in http_server.py, which checks for the running process.
Comment 4 Ryosuke Niwa 2012-04-17 13:22:15 PDT
Comment on attachment 137586 [details]
Try to remove apache's pid file in the case it's stale

(In reply to comment #3)
> (From update of attachment 137586 [details])
> See the comment right after this ... you can't assume apache has exited until the pid file is removed. If you delete the pid file too early, you may attempt to restart the server while the previous server is still running, and fail. You should only delete the pid file is the pid is gone.
> 
> You should probably reuse the log in http_server.py, which checks for the running process.

Okay, please make those changes then. I don't understand this code and don't have a time for it.
Comment 5 Jer Noble 2012-04-18 13:17:24 PDT
Created attachment 137753 [details]
Patch
Comment 6 Dirk Pranke 2012-04-18 13:34:52 PDT
Comment on attachment 137753 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=137753&action=review

> Tools/Scripts/webkitpy/layout_tests/servers/apache_http_server.py:144
> +        # If apache was forcefully killed, the pid file will not have been deleted, so check

At the beginning of this routine, (before we call self._run(self._stop_cmd), we should check to see if the server is running ... in the context of the problem we're seeing, we know that self._pid is set to the pid we care about (see start() in http_server_base.py:78).

So, I would rewrite the whole routine as something like:

def _stop_running_server(self):
    if self._pid and not self._executive.check_running_pid(pid):
        self._filesystem.remove(self._pid_file)
        return

   retval, err = self._run(self.stop_cmd)
   if retval or len(err):
       raise http_server_base.ServerError('Failed to stop %s: %s' % (self._name, err))
  # For some reason ...
  ...

In this case you'll raise an error if you can't delete the pidfile, but I think that's probably good.
Comment 7 Jer Noble 2012-04-18 13:43:06 PDT
Created attachment 137757 [details]
Patch

Updated with dpranke's rewrite.
Comment 8 Dirk Pranke 2012-04-18 13:44:43 PDT
Comment on attachment 137757 [details]
Patch

r+ if you add a "return" after line 143 so that we don't execute the rest of the function.
Comment 9 Ryosuke Niwa 2012-04-18 14:47:23 PDT
Thanks for the fix!
Comment 10 Jer Noble 2012-04-20 12:34:33 PDT
Committed r114555: <http://trac.webkit.org/changeset/114555>