RESOLVED FIXED 49122
webkitpy/tool/* unittests change cwd and don't clean up properly
https://bugs.webkit.org/show_bug.cgi?id=49122
Summary webkitpy/tool/* unittests change cwd and don't clean up properly
Dirk Pranke
Reported 2010-11-05 20:33:05 PDT
In the course of merging and preparing to land the patch for bug 48280 , I stumbled into this. various unit tests under webkitpy/tool/* change the current working directory and don't seem to change back when they're done. For example, download_unittest.test_apply_attachment creates a MockSCM() object, and then invokes a series of steps including steps/cleanworkingdirectory.py (which has the effect of changing to /private/tmp ). This by itself isn't that bad. However, the real scm checkout code uses the current working directory to figure out where the checkout root is, in default_scm(): def default_scm(): cwd = os.getcwd() scm_system = detect_scm_system(cwd) if not scm_system: script_directory = os.path.abspath(sys.path[0]) scm_system = detect_scm_system(script_directory) if scm_system: log("The current directory (%s) is not a WebKit checkout, using %s" % (cwd, scm_system.checkout_root)) else: error("FATAL: Failed to determine the SCM system for either %s or %s" % (cwd, script_directory)) return scm_system The first part fails, so we fall back to os.path.abspath(sys.path[0]). This by itself also isn't that bad. Normally, in python, sys.path[0] is defined to be the directory of the script that was invoked (i.e., test-webkitpy, so WebKitTools/Scripts) . Sadly, in the monkeying with paths that we do now in test-webkit that seems to have something to do with making sure that QueueStatusServer is in our path, we call out to the WebKitTools/QueueStatusServer/__init__.py:fix_sys_path() which manipulates sys.path . Even that's not so bad. Sadly, that code also calls google_appengine.fix_sys_path(). And THAT code, at least in the version I happen to have installed on my machine (1.3.8), *prepends* EXTRA_PATHS to sys.paths, which breaks the assumption that sys.path[0] will be inside the tree. So, if any test code then attempts to call the real scm.default_scm() code, it will fail. Which happens after the above unittests execute (and we've changed the directory), when we run commands/rebaseline_unittest, when that code constructs a layout_tests/port object, which calls the real scm code when I've merged in the changes in 48280. A comedy of cascading errors, so to speak. There are several things that can/should be fixed here. (1) The AppEngine code is probably wrong. (2) the tool/commands unittests should not actually be changing directories. (3) It may be bad to be relying only on sys.path[0] and cwd, since clearly both of those things can be outside the tree. The only other mechanism I can think of which should be safe would be to look at __file__ from inside scm.py . I think there are times when even that doesn't work right, because there was a reason I started using sys.path[0] instead of __file__ , but those reasons may not apply in this usage. I don't remember exactly what they were. (4) Arguably the rebaseline_unittest command should be trying to get a test port, not a real port, and it should be passing in a MockConfig() object so that we don't use the real scm detection code. But there might need to be other stuff that changes in that test, I haven't really explored this yet. I think (3) is safe, so I'll upload a patch for that.
Attachments
Patch (1.32 KB, patch)
2010-11-05 20:35 PDT, Dirk Pranke
no flags
Dirk Pranke
Comment 1 2010-11-05 20:35:57 PDT
Eric Seidel (no email)
Comment 2 2010-11-05 20:41:31 PDT
Comment on attachment 73161 [details] Patch Hmmm... I'm not sure this is right. There was a reason why I used sys.path[0], but now I'm not sure I remember. The bug title doesn't seem to align with what the change actually does. :)
Eric Seidel (no email)
Comment 3 2010-11-05 23:11:40 PDT
Sorry, I didn't read your long bug description before reviewing the first time.
Eric Seidel (no email)
Comment 4 2010-11-05 23:16:53 PDT
(In reply to comment #0) > In the course of merging and preparing to land the patch for bug 48280 , I stumbled into this. > > various unit tests under webkitpy/tool/* change the current working directory and don't seem to change back when they're done. For example, download_unittest.test_apply_attachment creates a MockSCM() object, and then invokes a series of steps including steps/cleanworkingdirectory.py (which has the effect of changing to /private/tmp ). Yeah, we have to be careful to use MockSCM().checkout_root and avoid using os.getcwd() as otherwise tests may or may not have manipulated the cwd. > This by itself isn't that bad. > > However, the real scm checkout code uses the current working directory to figure out where the checkout root is, in default_scm(): > > def default_scm(): > cwd = os.getcwd() > scm_system = detect_scm_system(cwd) > if not scm_system: > script_directory = os.path.abspath(sys.path[0]) > scm_system = detect_scm_system(script_directory) > if scm_system: > log("The current directory (%s) is not a WebKit checkout, using %s" % (cwd, scm_system.checkout_root)) > else: > error("FATAL: Failed to determine the SCM system for either %s or %s" % (cwd, script_directory)) > return scm_system > > The first part fails, so we fall back to os.path.abspath(sys.path[0]). > > This by itself also isn't that bad. Normally, in python, sys.path[0] is defined to be the directory of the script that was invoked (i.e., test-webkitpy, so WebKitTools/Scripts) . > > Sadly, in the monkeying with paths that we do now in test-webkit that seems to have something to do with making sure that QueueStatusServer is in our path, we call out to the WebKitTools/QueueStatusServer/__init__.py:fix_sys_path() which manipulates sys.path . I *just* added this fix_sys_path code. I should probably remove it and or fix it. It seems to be the major problem here. > Even that's not so bad. Sadly, that code also calls google_appengine.fix_sys_path(). And THAT code, at least in the version I happen to have installed on my machine (1.3.8), *prepends* EXTRA_PATHS to sys.paths, which breaks the assumption that sys.path[0] will be inside the tree. > > So, if any test code then attempts to call the real scm.default_scm() code, it will fail. Which happens after the above unittests execute (and we've changed the directory), when we run commands/rebaseline_unittest, when that code constructs a layout_tests/port object, which calls the real scm code when I've merged in the changes in 48280. Hmm.. seems slightly fishy that we use real scm code... > A comedy of cascading errors, so to speak. Agreed. :) > There are several things that can/should be fixed here. > > (1) The AppEngine code is probably wrong. > > (2) the tool/commands unittests should not actually be changing directories. > > (3) It may be bad to be relying only on sys.path[0] and cwd, since clearly both of those things can be outside the tree. The only other mechanism I can think of which should be safe would be to look at __file__ from inside scm.py . I think there are times when even that doesn't work right, because there was a reason I started using sys.path[0] instead of __file__ , but those reasons may not apply in this usage. I don't remember exactly what they were. Yeah. :) webkit-patch tries hard to make it so that you can have webkit-patch in your path, but use it with a different checkout. As long as that use case doesn't beak, we can basically do whatever we want here. :) > (4) Arguably the rebaseline_unittest command should be trying to get a test port, not a real port, and it should be passing in a MockConfig() object so that we don't use the real scm detection code. But there might need to be other stuff that changes in that test, I haven't really explored this yet. > > I think (3) is safe, so I'll upload a patch for that. I'd like to better understand the contract behind sys.path[0] before r+'ing this. I should go read some python docs and see if I can figure out how we ended up using sys.path[0]. Maybe other things will break now as a result of this new AppEngine code?
Eric Seidel (no email)
Comment 5 2010-11-05 23:20:18 PDT
http://docs.python.org/library/sys.html#sys.path "As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH." It seems changing to __file__ here is safe since assuming path[0] is a valid directory from which the python interpreter was invoked sounds like it will break. If we're invoking webkit-patch inside another checkout the previous detect_scm_system(cwd) will have already succeeded, so __file__ seems fine. Sorry you had to run down this goose. I think we'll need to consider some of the other fixes you mention as well for future patches.
Dirk Pranke
Comment 6 2010-11-06 19:31:06 PDT
Filed bug http://code.google.com/p/googleappengine/issues/detail?id=4031 against the app engine SDK as well.
Dirk Pranke
Comment 7 2010-11-06 19:31:13 PDT
Comment on attachment 73161 [details] Patch Clearing flags on attachment: 73161 Committed r71472: <http://trac.webkit.org/changeset/71472>
Dirk Pranke
Comment 8 2010-11-06 19:31:18 PDT
All reviewed patches have been landed. Closing bug.
Dirk Pranke
Comment 9 2010-11-06 19:44:29 PDT
Note You need to log in before you can comment on or make changes to this bug.