Bug 88414

Summary:

[NRWT] XvfbDriver should choose the next free display

Product:

WebKit

Reporter:

Kristóf Kosztyó <kkristof>

Component:

Tools / Tests

Assignee:

Nobody <webkit-unassigned>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

abarth, cdumez, dpranke, galpeter, ojan, ossy, rakuco, rgabor, webkit.review.bot, zan

Priority:

Keywords:

NRWT

Version:

528+ (Nightly build)

Hardware:

Unspecified

OS:

Unspecified

Bug Depends on:

89956, 90441, 98346, 103806

Bug Blocks:

77730, 89880, 98162

Attachments:

Description	Flags
WIP patch	none
WIP patch	none
proposed fix	none
draft patch	none
proposed fix	none

Kristóf Kosztyó

Reported 2012-06-06 07:08:18 PDT

Now the XfvbDriver chose the next display id with these simple way: running_displays = len(Executive().running_pids(x_filter)) display_id = self._worker_number * 2 + running_displays But if we ran some parallel nrwt's they could choose an already existing display which can cause problems.

Attachments
WIP patch (1.13 KB, patch) 2012-06-06 07:25 PDT, Kristóf Kosztyó	no flags	Details Formatted Diff Diff
WIP patch (4.14 KB, patch) 2012-06-21 06:37 PDT, Kristóf Kosztyó	no flags	Details Formatted Diff Diff
proposed fix (4.99 KB, patch) 2012-06-22 08:00 PDT, Kristóf Kosztyó	no flags	Details Formatted Diff Diff
draft patch (2.64 KB, patch) 2012-09-14 07:29 PDT, Kristóf Kosztyó	no flags	Details Formatted Diff Diff
proposed fix (9.40 KB, patch) 2012-09-21 02:29 PDT, Kristóf Kosztyó	no flags	Details Formatted Diff Diff
Show Obsolete (5) View All Add attachment proposed patch, testcase, etc.

Kristóf Kosztyó

Comment 1 2012-06-06 07:25:24 PDT

Created attachment 146020 [details] WIP patch We've discussed with Ossy that maybe we could copy the method the way xvfb-run scripts choose the next unreserved display. It works locally but I want to do more tests.

Csaba Osztrogonác

Comment 2 2012-06-07 02:45:02 PDT

Comment on attachment 146020 [details] WIP patch View in context: https://bugs.webkit.org/attachment.cgi?id=146020&action=review > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:62 > - display_id = self._worker_number * 2 + running_displays > + display_id = next_free_id() > + print 'start new xvfb with %d id' % display_id > if pixel_tests: > display_id += 1 But what if next_free_id() + 1 isn't free?

Csaba Osztrogonác

Comment 3 2012-06-07 02:49:19 PDT

Comment on attachment 146020 [details] WIP patch View in context: https://bugs.webkit.org/attachment.cgi?id=146020&action=review > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:54 > + def next_free_id(): > + for i in range(99): > + if not os.path.exists('/tmp/.X%d-lock' % i): > + return i Hmm ... I think checking /tmp/.X%d-lock isn't enough ... On the machine where run 3 buildslave, I have only these lock files: .X0-lock, .X90-lock . But there are three xvfb-run.XXXXXX directories with Xauthority files contain the following numbers: 84, 39, 90. (These numbers were passed to xvfb-run when we started the bots.) And there are two sockets in /tmp/.X11-unix dir: X0 and X90.

Kristóf Kosztyó

Comment 4 2012-06-19 00:41:08 PDT

Committed r120688: <http://trac.webkit.org/changeset/120688>

Kristóf Kosztyó

Comment 5 2012-06-19 00:42:37 PDT

(In reply to comment #4) > Committed r120688: <http://trac.webkit.org/changeset/120688> Temporarily we disable the xvfb driver on qt

Csaba Osztrogonác

Comment 6 2012-06-21 04:07:21 PDT

Is there any progression?

Kristóf Kosztyó

Comment 7 2012-06-21 06:32:34 PDT

(In reply to comment #6) > Is there any progression? Yes there is. Now I check the first free xvfb id with the check of the already existing .X#-lock files. After when I found the first free one I lock this number with the webkitpy's FileLock. At locally it works fine also when I run more than one NRWT in parallel.

Kristóf Kosztyó

Comment 8 2012-06-21 06:37:37 PDT

Created attachment 148786 [details] WIP patch Work in progress patch with the FileLock. I would like to do some tests with it, but if everything go well I will upload the final patch.

Peter Gal

Comment 9 2012-06-22 02:25:42 PDT

View in context: https://bugs.webkit.org/attachment.cgi?id=148786&action=review > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:71 > + environment['DUMPRENDERTREE_TEMP'] = str(self._driver_tempdir) I did a little digging: Where will be this _driver_tempdir be set to something? In the parent class it'll be set in the _start method, but we override it here. Or am I missing something here?

Kristóf Kosztyó

Comment 10 2012-06-22 04:27:55 PDT

(In reply to comment #9) > View in context: https://bugs.webkit.org/attachment.cgi?id=148786&action=review > > > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:71 > > + environment['DUMPRENDERTREE_TEMP'] = str(self._driver_tempdir) > > I did a little digging: Where will be this _driver_tempdir be set to something? In the parent class it'll be set in the _start method, but we override it here. Or am I missing something here? You're right I missed out that.

Kristóf Kosztyó

Comment 11 2012-06-22 08:00:34 PDT

Created attachment 149029 [details] proposed fix

Dirk Pranke

Comment 12 2012-06-22 14:53:53 PDT

Comment on attachment 149029 [details] proposed fix View in context: https://bugs.webkit.org/attachment.cgi?id=149029&action=review the patch basically looks fine with a couple nits ... I'll leave cq? in case you want to change things. > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:53 > + _guard_lock_file = self._port._filesystem.join('/tmp', 'WebKitXvfb.lock.%i' % i) nit: there's no reason to use an underscore in front of the name. I'd probably just combine lines 53 and 54 ... > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:58 > + self.display_id = next_free_id() should this be self._display_id to indicate that it's a private variable?

Kristóf Kosztyó

Comment 13 2012-06-26 00:58:20 PDT

Thank you. I commited this in r121236: <http://trac.webkit.org/changeset/121236> if everything go well on the bots I will close the bug.

Kristóf Kosztyó

Comment 14 2012-06-26 01:59:06 PDT

(In reply to comment #13) > Thank you. > I commited this in r121236: <http://trac.webkit.org/changeset/121236> if everything go well on the bots I will close the bug. Rollouted in r121239 because it produced two type of error: * start the xvfb with the same display * or everything went fine until it starts getting warnings from the file lock and the tests are timed out or crashed

Csaba Osztrogonác

Comment 15 2012-07-12 03:29:50 PDT

Comment on attachment 149029 [details] proposed fix It is obsolete patch.

Csaba Osztrogonác

Comment 16 2012-08-23 03:23:15 PDT

After Balázs's patch - https://trac.webkit.org/changeset/124581 - NRWT doesn't run two DRT for non-pixel and pixel tests, so XVFBDriver's display handling should be refactored with this fix too.

Csaba Osztrogonác

Comment 17 2012-08-23 07:36:17 PDT

(In reply to comment #14) > (In reply to comment #13) > > Thank you. > > I commited this in r121236: <http://trac.webkit.org/changeset/121236> if everything go well on the bots I will close the bug. > > Rollouted in r121239 because it produced two type of error: > * start the xvfb with the same display > * or everything went fine until it starts getting warnings from the file lock and the tests are timed out or crashed I think I got the root of the problem. I saw the following failing webkitpy test on the Qt buildbot: Traceback (most recent call last): File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/common/system/filesystem_unittest.py", line 220, in test_read_and_write_file text_contents = fs.read_text_file(binary_path) File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/common/system/filesystem.py", line 213, in read_text_file with codecs.open(path, 'r', 'utf8') as f: File "/usr/lib/python2.6/codecs.py", line 881, in open file = __builtin__.open(filename, mode, buffering) IOError: [Errno 2] No such file or directory: '/tmp/tree_unittest_9TdIXB' [283/1600] webkitpy.common.system.file_lock_integrationtest.FileLockTest.test_lock_lifecycle passed ------------------------------------- It seems something is wrong with the FileLock class and it caused that NRWT started two driver on same display. Could you check it, please?

Zan Dobersek

Comment 18 2012-08-29 00:59:23 PDT

Instead of checking for the X locks in /tmp directory, how about listing all the current Xorg and Xvfb processes, parsing their commands to get the displays they are using and then using the first free display for the new Xvfb instance? Perhaps -nolock could be passed to new Xvfb processes as well to avoid Xvfb-spawned lock files, but I think using a FileLock to denote that Xvfb (spawned by NRWT) is using a specific display is a good idea.

Kristóf Kosztyó

Comment 19 2012-09-14 07:29:24 PDT

Created attachment 164148 [details] draft patch Now I check the next free display by the running processes. It seems more stable than the xvfb's lock files based approach. If this approach is good I will rewrite some unittest to fit this change.

Kristóf Kosztyó

Comment 20 2012-09-21 02:29:11 PDT

Created attachment 165091 [details] proposed fix It is process based choose of the next free display as the draft patch was but it contains unit tests :)

Kristóf Kosztyó

Comment 21 2012-09-28 06:57:54 PDT

Committed r129891: <http://trac.webkit.org/changeset/129891>

Raphael Kubo da Costa (:rakuco)

Comment 22 2012-10-08 01:56:32 PDT

Do you guys have any idea why this keeps failing every once in a while? See, for example, <http://build.webkit.org/builders/EFL%20Linux%2064-bit%20Debug/builds/6866/steps/webkitpy-test/logs/stdio>: [1580/1585] webkitpy.layout_tests.port.xvfbdriver_unittest.XvfbDriverTest.test_next_free_display failed: Traceback (most recent call last): File "/home/buildslave-1/webkit-buildslave/efl-linux-64-debug/build/Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver_unittest.py", line 86, in test_next_free_display self.assertEqual(driver._next_free_display(), 2) AssertionError: 3 != 2

Zan Dobersek

Comment 23 2012-10-08 02:04:22 PDT

(In reply to comment #22) > Do you guys have any idea why this keeps failing every once in a while? > > See, for example, <http://build.webkit.org/builders/EFL%20Linux%2064-bit%20Debug/builds/6866/steps/webkitpy-test/logs/stdio>: > > [1580/1585] webkitpy.layout_tests.port.xvfbdriver_unittest.XvfbDriverTest.test_next_free_display failed: > Traceback (most recent call last): > File "/home/buildslave-1/webkit-buildslave/efl-linux-64-debug/build/Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver_unittest.py", line 86, in test_next_free_display > self.assertEqual(driver._next_free_display(), 2) > AssertionError: 3 != 2 Set the bug #98346 to be blocking this one. As for the failure, it's strange. I'm pretty sure that if somehow there are any Xvfb processes left running after the NRWT is done, it still shouldn't affect this unit test as the mock executive is used. As much as I tried, though, I couldn't reproduce even one such failure locally.

Zan Dobersek

Comment 24 2012-10-09 12:39:05 PDT

Reopening so the work can continue.

Csaba Osztrogonác

Comment 25 2012-11-22 03:38:46 PST

As far as I know, xvfbdriver works fine, only the unit test is flakey because of stucked xvfbs. Am I right?

Zan Dobersek

Comment 26 2012-11-22 05:40:59 PST

(In reply to comment #25) > As far as I know, xvfbdriver works fine, only the unit test is flakey > because of stucked xvfbs. Am I right? As far as I can tell, that's the case. I think the main problem with the flaky unit tests is that the running Xvfb instances somehow affect them even though they shouldn't.

Zan Dobersek

Comment 27 2012-11-30 09:31:53 PST

Yup, I can reproduce this problem by running NRWT and test-webkitpy in parallel. I'll try to come up with a patch, but probably in another bug.

Zan Dobersek

Comment 28 2013-01-14 23:14:57 PST

(In reply to comment #27) > Yup, I can reproduce this problem by running NRWT and test-webkitpy in parallel. I'll try to come up with a patch, but probably in another bug. Handled in bug #103806, landed in r136314. http://trac.webkit.org/changeset/136314 Closing the bug.

Note You need to log in before you can comment on or make changes to this bug.