Summary: | [NRWT] XvfbDriver should choose the next free display | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Kristóf Kosztyó <kkristof> | ||||||||||||
Component: | Tools / Tests | Assignee: | Nobody <webkit-unassigned> | ||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||
Severity: | Normal | CC: | abarth, cdumez, dpranke, galpeter, ojan, ossy, rakuco, rgabor, webkit.review.bot, zan | ||||||||||||
Priority: | P2 | Keywords: | NRWT | ||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||
Hardware: | Unspecified | ||||||||||||||
OS: | Unspecified | ||||||||||||||
Bug Depends on: | 89956, 90441, 98346, 103806 | ||||||||||||||
Bug Blocks: | 77730, 89880, 98162 | ||||||||||||||
Attachments: |
|
Description
Kristóf Kosztyó
2012-06-06 07:08:18 PDT
Created attachment 146020 [details]
WIP patch
We've discussed with Ossy that maybe we could copy the method the way xvfb-run scripts choose the next unreserved display. It works locally but I want to do more tests.
Comment on attachment 146020 [details] WIP patch View in context: https://bugs.webkit.org/attachment.cgi?id=146020&action=review > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:62 > - display_id = self._worker_number * 2 + running_displays > + display_id = next_free_id() > + print 'start new xvfb with %d id' % display_id > if pixel_tests: > display_id += 1 But what if next_free_id() + 1 isn't free? Comment on attachment 146020 [details] WIP patch View in context: https://bugs.webkit.org/attachment.cgi?id=146020&action=review > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:54 > + def next_free_id(): > + for i in range(99): > + if not os.path.exists('/tmp/.X%d-lock' % i): > + return i Hmm ... I think checking /tmp/.X%d-lock isn't enough ... On the machine where run 3 buildslave, I have only these lock files: .X0-lock, .X90-lock . But there are three xvfb-run.XXXXXX directories with Xauthority files contain the following numbers: 84, 39, 90. (These numbers were passed to xvfb-run when we started the bots.) And there are two sockets in /tmp/.X11-unix dir: X0 and X90. Committed r120688: <http://trac.webkit.org/changeset/120688> (In reply to comment #4) > Committed r120688: <http://trac.webkit.org/changeset/120688> Temporarily we disable the xvfb driver on qt Is there any progression? (In reply to comment #6) > Is there any progression? Yes there is. Now I check the first free xvfb id with the check of the already existing .X#-lock files. After when I found the first free one I lock this number with the webkitpy's FileLock. At locally it works fine also when I run more than one NRWT in parallel. Created attachment 148786 [details]
WIP patch
Work in progress patch with the FileLock. I would like to do some tests with it, but if everything go well I will upload the final patch.
View in context: https://bugs.webkit.org/attachment.cgi?id=148786&action=review > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:71 > + environment['DUMPRENDERTREE_TEMP'] = str(self._driver_tempdir) I did a little digging: Where will be this _driver_tempdir be set to something? In the parent class it'll be set in the _start method, but we override it here. Or am I missing something here? (In reply to comment #9) > View in context: https://bugs.webkit.org/attachment.cgi?id=148786&action=review > > > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:71 > > + environment['DUMPRENDERTREE_TEMP'] = str(self._driver_tempdir) > > I did a little digging: Where will be this _driver_tempdir be set to something? In the parent class it'll be set in the _start method, but we override it here. Or am I missing something here? You're right I missed out that. Created attachment 149029 [details]
proposed fix
Comment on attachment 149029 [details] proposed fix View in context: https://bugs.webkit.org/attachment.cgi?id=149029&action=review the patch basically looks fine with a couple nits ... I'll leave cq? in case you want to change things. > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:53 > + _guard_lock_file = self._port._filesystem.join('/tmp', 'WebKitXvfb.lock.%i' % i) nit: there's no reason to use an underscore in front of the name. I'd probably just combine lines 53 and 54 ... > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:58 > + self.display_id = next_free_id() should this be self._display_id to indicate that it's a private variable? Thank you. I commited this in r121236: <http://trac.webkit.org/changeset/121236> if everything go well on the bots I will close the bug. (In reply to comment #13) > Thank you. > I commited this in r121236: <http://trac.webkit.org/changeset/121236> if everything go well on the bots I will close the bug. Rollouted in r121239 because it produced two type of error: * start the xvfb with the same display * or everything went fine until it starts getting warnings from the file lock and the tests are timed out or crashed Comment on attachment 149029 [details]
proposed fix
It is obsolete patch.
After Balázs's patch - https://trac.webkit.org/changeset/124581 - NRWT doesn't run two DRT for non-pixel and pixel tests, so XVFBDriver's display handling should be refactored with this fix too. (In reply to comment #14) > (In reply to comment #13) > > Thank you. > > I commited this in r121236: <http://trac.webkit.org/changeset/121236> if everything go well on the bots I will close the bug. > > Rollouted in r121239 because it produced two type of error: > * start the xvfb with the same display > * or everything went fine until it starts getting warnings from the file lock and the tests are timed out or crashed I think I got the root of the problem. I saw the following failing webkitpy test on the Qt buildbot: Traceback (most recent call last): File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/common/system/filesystem_unittest.py", line 220, in test_read_and_write_file text_contents = fs.read_text_file(binary_path) File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/common/system/filesystem.py", line 213, in read_text_file with codecs.open(path, 'r', 'utf8') as f: File "/usr/lib/python2.6/codecs.py", line 881, in open file = __builtin__.open(filename, mode, buffering) IOError: [Errno 2] No such file or directory: '/tmp/tree_unittest_9TdIXB' [283/1600] webkitpy.common.system.file_lock_integrationtest.FileLockTest.test_lock_lifecycle passed ------------------------------------- It seems something is wrong with the FileLock class and it caused that NRWT started two driver on same display. Could you check it, please? Instead of checking for the X locks in /tmp directory, how about listing all the current Xorg and Xvfb processes, parsing their commands to get the displays they are using and then using the first free display for the new Xvfb instance? Perhaps -nolock could be passed to new Xvfb processes as well to avoid Xvfb-spawned lock files, but I think using a FileLock to denote that Xvfb (spawned by NRWT) is using a specific display is a good idea. Created attachment 164148 [details]
draft patch
Now I check the next free display by the running processes. It seems more stable than the xvfb's lock files based approach. If this approach is good I will rewrite some unittest to fit this change.
Created attachment 165091 [details]
proposed fix
It is process based choose of the next free display as the draft patch was but it contains unit tests :)
Committed r129891: <http://trac.webkit.org/changeset/129891> Do you guys have any idea why this keeps failing every once in a while? See, for example, <http://build.webkit.org/builders/EFL%20Linux%2064-bit%20Debug/builds/6866/steps/webkitpy-test/logs/stdio>: [1580/1585] webkitpy.layout_tests.port.xvfbdriver_unittest.XvfbDriverTest.test_next_free_display failed: Traceback (most recent call last): File "/home/buildslave-1/webkit-buildslave/efl-linux-64-debug/build/Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver_unittest.py", line 86, in test_next_free_display self.assertEqual(driver._next_free_display(), 2) AssertionError: 3 != 2 (In reply to comment #22) > Do you guys have any idea why this keeps failing every once in a while? > > See, for example, <http://build.webkit.org/builders/EFL%20Linux%2064-bit%20Debug/builds/6866/steps/webkitpy-test/logs/stdio>: > > [1580/1585] webkitpy.layout_tests.port.xvfbdriver_unittest.XvfbDriverTest.test_next_free_display failed: > Traceback (most recent call last): > File "/home/buildslave-1/webkit-buildslave/efl-linux-64-debug/build/Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver_unittest.py", line 86, in test_next_free_display > self.assertEqual(driver._next_free_display(), 2) > AssertionError: 3 != 2 Set the bug #98346 to be blocking this one. As for the failure, it's strange. I'm pretty sure that if somehow there are any Xvfb processes left running after the NRWT is done, it still shouldn't affect this unit test as the mock executive is used. As much as I tried, though, I couldn't reproduce even one such failure locally. Reopening so the work can continue. As far as I know, xvfbdriver works fine, only the unit test is flakey because of stucked xvfbs. Am I right? (In reply to comment #25) > As far as I know, xvfbdriver works fine, only the unit test is flakey > because of stucked xvfbs. Am I right? As far as I can tell, that's the case. I think the main problem with the flaky unit tests is that the running Xvfb instances somehow affect them even though they shouldn't. Yup, I can reproduce this problem by running NRWT and test-webkitpy in parallel. I'll try to come up with a patch, but probably in another bug. (In reply to comment #27) > Yup, I can reproduce this problem by running NRWT and test-webkitpy in parallel. I'll try to come up with a patch, but probably in another bug. Handled in bug #103806, landed in r136314. http://trac.webkit.org/changeset/136314 Closing the bug. |