WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
229758
[GTK] The Xvfb display server may fail to start sometimes causing tests to randomly crash
https://bugs.webkit.org/show_bug.cgi?id=229758
Summary
[GTK] The Xvfb display server may fail to start sometimes causing tests to ra...
Carlos Alberto Lopez Perez
Reported
2021-09-01 04:33:53 PDT
I have observed this here:
https://build.webkit.org/#/builders/199/builds/1062
- If you take a look at that build there are 11 unexpected crashes which only happened on that build (didn't happened not on the build before or after) - There are no crash log for any of the crashes, this is weird.. it is an evidence it was not an actual crash but a failure to start - And if you take a closer look to the stdio log you see the crashes happened on a few workers. As soon as this workers were started all the tests on that workers crashed with a common stderr message: "Gtk-WARNING **: 15:52:29.549: cannot open display: :$X". And as soon the driver was restarted the tests started to work back again. So, I think that what happened here was that the Xvfb driver failed to start on this workers, leading to crashes on all tests until it was restarted. We need to make the Xvfb driver more robust and deal better with an eventual failure to start Xvfb.
Attachments
Patch
(21.56 KB, patch)
2021-09-01 10:47 PDT
,
Carlos Alberto Lopez Perez
no flags
Details
Formatted Diff
Diff
Patch
(21.56 KB, patch)
2021-09-01 12:16 PDT
,
Carlos Alberto Lopez Perez
no flags
Details
Formatted Diff
Diff
Patch
(22.29 KB, patch)
2021-09-06 06:32 PDT
,
Carlos Alberto Lopez Perez
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Carlos Alberto Lopez Perez
Comment 1
2021-09-01 09:24:30 PDT
Looks like this was more common than I expected.. I have been able to reproduce the issue of Xvfb not starting properly locally when testing this. Patch incoming
Carlos Alberto Lopez Perez
Comment 2
2021-09-01 10:47:18 PDT
Created
attachment 437043
[details]
Patch
Philippe Normand
Comment 3
2021-09-01 12:05:23 PDT
EWS is red, python2 test failure: webkitpy.port.xvfbdriver_unittest.XvfbDriverTest.test_xvfb_not_replying I'm all-in-favor for fixing this, but it begs the question, should we soon migrate the bots to wayland and consider X as a legacy thing?
Carlos Alberto Lopez Perez
Comment 4
2021-09-01 12:16:07 PDT
Created
attachment 437051
[details]
Patch
Carlos Alberto Lopez Perez
Comment 5
2021-09-01 12:24:33 PDT
(In reply to Philippe Normand from
comment #3
)
> EWS is red, python2 test failure: > webkitpy.port.xvfbdriver_unittest.XvfbDriverTest.test_xvfb_not_replying > > I'm all-in-favor for fixing this, but it begs the question, should we soon > migrate the bots to wayland and consider X as a legacy thing?
Running on a real wayland display is a major issue because it complicates a lot the hardware maintenance and deployment (makes difficult to use containers, also depending on a specific graphics hardware drivers makes failures less reproducible, etc). I would be all in favour of migrating to wayland if there is a way of having the tests to run in a virtualized wayland environment. Fortunately we have that, currently we can pass --display-server=weston to run-webkit-tests which runs the tests inside weston inside Xvfb. Is a bit hackish but it works. This patch also fixes the startup of Xvfb for the --display-server=weston option (weston in Xvfb--- fixes the Xvfb part). I plan to also fix the weston startup part (likely there are race conditions there as well) on a follow-up patch soon. Then we would need to check how different are the results from the Weston driver compared to the Xvfb one. Also I'm not really found of this solution of Weston inside Xvfb.. if there is a better way of achieving a virtualized display for wayland please let me know that.
Philippe Normand
Comment 6
2021-09-01 12:47:39 PDT
> if there is a better way of achieving a virtualized display for wayland please let me know that.
Maybe there's a way to run weston in swrast mode? I don't know, I haven't done research in that field :) But yeah it's a bit sad to run weston within xvfb :(
Michael Catanzaro
Comment 7
2021-09-01 12:51:17 PDT
(In reply to Carlos Alberto Lopez Perez from
comment #5
)
> Also I'm not really found of this solution of Weston inside Xvfb.. if there > is a better way of achieving a virtualized display for wayland please let me > know that.
I think we can do: $ weston --backend=headless-backend.so If that doesn't work, mutter has a headless mode added in
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1698
. You could try: $ mutter --wayland --headless --virtual-monitor 1280x720 I know wlroots also has a headless mode.
Carlos Alberto Lopez Perez
Comment 8
2021-09-01 13:20:55 PDT
Comment on
attachment 437051
[details]
Patch Clearing flags on attachment: 437051 Committed
r281870
(
241200@main
): <
https://commits.webkit.org/241200@main
>
Carlos Alberto Lopez Perez
Comment 9
2021-09-01 13:21:01 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 10
2021-09-01 13:21:27 PDT
<
rdar://problem/82643101
>
Carlos Alberto Lopez Perez
Comment 11
2021-09-01 13:52:57 PDT
This broke GTK API tests :\ ... it seems the GTK API test runner script runs with python2 and the code I added is not working properly there .. I will revert this.
Carlos Alberto Lopez Perez
Comment 12
2021-09-01 13:56:24 PDT
Reverted
r281870
for reason: It Committed
r281874
(
241204@main
): <
https://commits.webkit.org/241204@main
>
Jonathan Bedard
Comment 13
2021-09-01 14:06:18 PDT
(In reply to Carlos Alberto Lopez Perez from
comment #11
)
> This broke GTK API tests :\ ... it seems the GTK API test runner script runs > with python2 and the code I added is not working properly there .. I will > revert this.
On this note, we've started changing invocations of all Apple's Python scripts to Python3. Other ports will get some of this work for free (already happened with run-webkit-tests, for example) but Apple folks won't make that change for scripts we don't own or partially own
Carlos Alberto Lopez Perez
Comment 14
2021-09-01 17:28:38 PDT
(In reply to Jonathan Bedard from
comment #13
)
> (In reply to Carlos Alberto Lopez Perez from
comment #11
) > > This broke GTK API tests :\ ... it seems the GTK API test runner script runs > > with python2 and the code I added is not working properly there .. I will > > revert this. > > On this note, we've started changing invocations of all Apple's Python > scripts to Python3. Other ports will get some of this work for free (already > happened with run-webkit-tests, for example) but Apple folks won't make that > change for scripts we don't own or partially own
That's for the note. I think that is fair and makes sense. Thanks for working on this, btw. I opened
bug 229782
for porting the GTK/WPE API test runner to python3 Also opened
bug 229783
for run-perf-tests
Carlos Alberto Lopez Perez
Comment 15
2021-09-06 06:21:40 PDT
I converted run-gtk-tests, run-wpe-tests and run-perf-tests to use python3. But I realized that there are still more scripts running with python2 like run-webdriver-tests and maybe even others which I still didn't realized about. So I will upload now a new version of this patch that is compatible with python2, to not block here on migrating every consumer of it to python3.
Carlos Alberto Lopez Perez
Comment 16
2021-09-06 06:32:18 PDT
Created
attachment 437409
[details]
Patch Make code compatible with python2. This is the diff against the previous patch:
http://sprunge.us/BfNZVc?diff
Carlos Alberto Lopez Perez
Comment 17
2021-09-06 12:48:39 PDT
Comment on
attachment 437409
[details]
Patch Clearing flags on attachment: 437409 Committed
r282064
(
241364@main
): <
https://commits.webkit.org/241364@main
>
Carlos Alberto Lopez Perez
Comment 18
2021-09-06 12:48:45 PDT
All reviewed patches have been landed. Closing bug.
WebKit Commit Bot
Comment 19
2021-09-07 03:28:32 PDT
Re-opened since this is blocked by
bug 229990
Carlos Alberto Lopez Perez
Comment 20
2021-09-07 07:02:23 PDT
(In reply to WebKit Commit Bot from
comment #19
)
> Re-opened since this is blocked by
bug 229990
An update of what happened here.. On
r282064
I messed things up and by mistake I redirected stderr to stdout on the subprocess call to check the screen size of the Xvfb display, so when for any reason the program that checks the size prints anything on stderr (like a gtk/glib warning), then it would mix stderr with stdout and it will make the check fail (since the screen dimensions of the display it reports on stdout are compared with the expected ones) In the reported case this was happening: 04:13:30.337 2 worker/5 Waiting for Xvfb display server to be ready. 04:13:30.338 2 worker/1 The queried Xvfb screen size "(print-screen-size:258): dbind-WARNING **: 04:13:30.304: Couldn't connect to accessibility bus: Failed to connect to socket /tmp/dbus-WDVOI6JrL3: No such file or directory 1024x768" does not match the expectation "1024x768". The string """(print-screen-size:258): dbind-WARNING **: 04:13:30.304: Couldn't connect to accessibility bus: Failed to connect to socket /tmp/dbus-WDVOI6JrL3: No such file or directory""" should belong only to stderr. This patch on top fixes the issue:
http://sprunge.us/r865Ui
I will re-land it with the above fix
Carlos Alberto Lopez Perez
Comment 21
2021-09-07 07:14:01 PDT
Committed
r282082
(
241382@main
): <
https://commits.webkit.org/241382@main
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug