Bug 77335

Summary: [Qt][NRWT] Run each DRT in it's own xvfb
Product: WebKit Reporter: János Badics <jbadics>
Component: WebKit QtAssignee: Nobody <webkit-unassigned>
Status: RESOLVED INVALID    
Severity: Normal CC: abarth, dpranke, eric, kkristof, mrobinson, ojan, ossy, pnormand, tony, webkit.review.bot
Priority: P2 Keywords: Qt, QtTriaged
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 84928, 85654    
Bug Blocks: 77730    
Attachments:
Description Flags
proposed fix
tony: review-, tony: commit-queue-
proposed fix
none
proposed fix none

Description János Badics 2012-01-30 06:00:27 PST
running xvfb-run ...../DumpRenderTree [testname] gives the right output;
but putting xvfb-run in cmd_line(self) causes tests to timeout and give no results.
Comment 1 Kristóf Kosztyó 2012-01-30 06:42:13 PST
Hi,
I looked after it and I have found this problem is caused by the xvfb-run because it redirects the stderr to stdout.
You can try to modify the xvfb-run and put into the webkit dir and call this modified script.
Comment 2 Kristóf Kosztyó 2012-02-03 05:36:17 PST
Created attachment 125312 [details]
proposed fix

Hi
I modified the xvfb-run script which now doesn't redirect the stderr to the stdout, so the nrwt get the stderr from the driver.
I also add a new --xvfb option to the nrwt.
Comment 3 Csaba Osztrogonác 2012-02-03 05:49:10 PST
We need it to be able run layout tests parallel with NRWT (https://bugs.webkit.org/show_bug.cgi?id=77730).

There are strange flakey crashes now in the editing tests,
because they try to use same clipboard at the same time.

There are several ways to solve this problem:
- run all DRT in separated XVFB (with separated clipboards) - best solution
- serialize editing tests - it works, but it makes testing slow
Comment 4 Eric Seidel (no email) 2012-02-03 11:06:52 PST
Comment on attachment 125312 [details]
proposed fix

The cr-linux EWS uses xvfb.  I wonder how adam set that up.
Comment 5 Csaba Osztrogonác 2012-02-03 11:11:41 PST
Our buildbots run in xvfb long time ago. The problem is that now the buildslave
run in xvfb, and its DRT instances run in one xvfb with its one clipboard, etc.
Comment 6 Tony Chang 2012-02-03 11:34:57 PST
GTK+ has the same problem and has code for this already: http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/layout_tests/port/gtk.py#L46

Maybe Qt and GTK can somehow refactor and share this code?

Chromium Linux doesn't have this problem because we use a mock clipboard in DRT.
Comment 7 Adam Barth 2012-02-03 14:22:15 PST
> The cr-linux EWS uses xvfb.  I wonder how adam set that up.

I created a chromium-xvfb port:

http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/config/ports.py#L261

Why doesn't the buildbot just run-webkit-tests inside an xvfb?  This doesn't seem like a concern for run-webkit-tests itself.
Comment 8 Dirk Pranke 2012-02-03 14:32:00 PST
(In reply to comment #7)
> > The cr-linux EWS uses xvfb.  I wonder how adam set that up.
> 
> I created a chromium-xvfb port:
> 
> http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/config/ports.py#L261
> 
> Why doesn't the buildbot just run-webkit-tests inside an xvfb?  This doesn't seem like a concern for run-webkit-tests itself.

See comment #3 - we need one xvfb per worker, not one for all of NRWT.
Comment 9 Adam Barth 2012-02-03 14:35:35 PST
Is this a problem for all ports?  Do we need to solve this problem for everyone (not just buildbots)?
Comment 10 Csaba Osztrogonác 2012-02-04 00:09:09 PST
(In reply to comment #9)
> Is this a problem for all ports?  Do we need to solve this problem for everyone (not just buildbots)?

I don't know how other ports works or not with parallel testing. But it is a blocker to run parallel tests on Qt now. 

It isn't problem only for the buildbots. Developers can't run parallel tests because of this bug.
Comment 11 Csaba Osztrogonác 2012-04-17 05:49:05 PDT
Kristóf, could you pick up this bug again?
Comment 12 Kristóf Kosztyó 2012-04-25 07:58:42 PDT
Created attachment 138810 [details]
proposed fix

This patch uses the same way as the gtk to start each DRT in separate xvfb hence I moved the GtkDriver to the new XvfbDriver class and use this in the qt as well.
Comment 13 Eric Seidel (no email) 2012-04-25 13:32:00 PDT
Odd that you'r eusing xvfb for isolation.  I guess it works.

Chromium has code for this, for running tests in cr-linux-ews.  But it's in a different place.  I believe we run the whole session in a single xvfb.

http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/common/config/ports.py#L196
Comment 14 WebKit Review Bot 2012-04-25 13:33:46 PDT
Comment on attachment 138810 [details]
proposed fix

Clearing flags on attachment: 138810

Committed r115240: <http://trac.webkit.org/changeset/115240>
Comment 15 WebKit Review Bot 2012-04-25 13:34:01 PDT
All reviewed patches have been landed.  Closing bug.
Comment 16 Ojan Vafai 2012-04-25 13:37:22 PDT
(In reply to comment #13)
> Odd that you'r eusing xvfb for isolation.  I guess it works.
> 
> Chromium has code for this, for running tests in cr-linux-ews.  But it's in a different place.  I believe we run the whole session in a single xvfb.

If that's true, we should change it. Running tests in parallel is blocked on the xvfb process if there's only one. I thought we changed to used multiple xvfbs ages ago, but I don't see the code now.
Comment 17 Dirk Pranke 2012-04-25 13:38:47 PDT
(In reply to comment #16)
> (In reply to comment #13)
> > Odd that you'r eusing xvfb for isolation.  I guess it works.
> > 
> > Chromium has code for this, for running tests in cr-linux-ews.  But it's in a different place.  I believe we run the whole session in a single xvfb.
> 
> If that's true, we should change it. Running tests in parallel is blocked on the xvfb process if there's only one. I thought we changed to used multiple xvfbs ages ago, but I don't see the code now.

See the earlier comments on this bug ... the problems arise on the Qt and Gtk ports from using the shared clipboard; chromium's DRT uses a Mock clipboard and so doesn't have to worry about this.
Comment 18 Eric Seidel (no email) 2012-04-25 13:49:58 PDT
Historically the approach has been to mock out things like clipboard at a system level in DRT.  But it's possible xvfb is a better solution for these linux ports.  It won't help them on non-linux platforms of course :)
Comment 19 Ojan Vafai 2012-04-25 14:31:40 PDT
(In reply to comment #17)
> (In reply to comment #16)
> > (In reply to comment #13)
> > > Odd that you'r eusing xvfb for isolation.  I guess it works.
> > > 
> > > Chromium has code for this, for running tests in cr-linux-ews.  But it's in a different place.  I believe we run the whole session in a single xvfb.
> > 
> > If that's true, we should change it. Running tests in parallel is blocked on the xvfb process if there's only one. I thought we changed to used multiple xvfbs ages ago, but I don't see the code now.
> 
> See the earlier comments on this bug ... the problems arise on the Qt and Gtk ports from using the shared clipboard; chromium's DRT uses a Mock clipboard and so doesn't have to worry about this.

We don't need it for correctness, but chromium should use multiple xvfbs because the tests will run faster because they won't be gated on the single xvfb process.
Comment 20 Dirk Pranke 2012-04-25 14:46:38 PDT
(In reply to comment #19)
> 
> We don't need it for correctness, but chromium should use multiple xvfbs because the tests will run faster because they won't be gated on the single xvfb process.

Maybe. I've not seen xvfb be a bottleneck, but then again I haven't really tested it lately. It could be that having multiple X servers will just increase contention on the CPUs ...
Comment 21 Csaba Osztrogonác 2012-04-25 21:58:00 PDT
Reopen, because it was rolled out - http://trac.webkit.org/changeset/115286

Please check the bot history. :(
Comment 22 Dirk Pranke 2012-04-26 17:16:54 PDT
Interesting, I didn't realize wait() would raise an OSError if the process was already dead.

You can either catch that, or use _xvfb_process.terminate() instead of executive.kill_process and/or use _xvfb_process.join() instead of wait(), since join() definitely won't throw an error.
Comment 23 Tony Chang 2012-04-27 13:57:18 PDT
(In reply to comment #20)
> (In reply to comment #19)
> > 
> > We don't need it for correctness, but chromium should use multiple xvfbs because the tests will run faster because they won't be gated on the single xvfb process.
> 
> Maybe. I've not seen xvfb be a bottleneck, but then again I haven't really tested it lately. It could be that having multiple X servers will just increase contention on the CPUs ...

xvfb isn't a bottleneck since we don't paint to the screen.  I forget when we made that change, but it's been a while.

Chromium only wants a single Xvfb.
Comment 24 Kristóf Kosztyó 2012-05-04 10:26:02 PDT
(In reply to comment #22)
> Interesting, I didn't realize wait() would raise an OSError if the process was already dead.
> 
> You can either catch that, or use _xvfb_process.terminate() instead of executive.kill_process and/or use _xvfb_process.join() instead of wait(), since join() definitely won't throw an error.

I've tried to reproduce that when the wait throw an OSError but it wasn't successful. Anyway I've modified the patch as you said.
Comment 25 Kristóf Kosztyó 2012-05-04 10:26:37 PDT
Created attachment 140265 [details]
proposed fix
Comment 26 Dirk Pranke 2012-05-04 11:39:49 PDT
Comment on attachment 140265 [details]
proposed fix

looks fine.
Comment 27 Csaba Osztrogonác 2012-05-04 11:43:57 PDT
Comment on attachment 140265 [details]
proposed fix

Clearing flags on attachment: 140265

Committed r116134: <http://trac.webkit.org/changeset/116134>
Comment 28 Csaba Osztrogonác 2012-05-04 11:44:07 PDT
All reviewed patches have been landed.  Closing bug.
Comment 29 Csaba Osztrogonác 2012-08-23 03:19:46 PDT
Reopen, because XVFBDriver was disabled on Qt temporarily: https://trac.webkit.org/changeset/120688

See https://bugs.webkit.org/show_bug.cgi?id=88414 for details.
Comment 30 Jocelyn Turcotte 2014-02-03 03:19:50 PST
=== Bulk closing of Qt bugs ===

If you believe that this bug report is still relevant for a non-Qt port of webkit.org, please re-open it and remove [Qt] from the summary.

If you believe that this is still an important QtWebKit bug, please fill a new report at https://bugreports.qt-project.org and add a link to this issue. See http://qt-project.org/wiki/ReportingBugsInQt for additional guidelines.