RESOLVED CONFIGURATION CHANGED 218375
[GTK] imported WPT tests are very flaky on Ubuntu 20.04 (xdg-desktop-portal 1.6)
https://bugs.webkit.org/show_bug.cgi?id=218375
Fujii Hironori
Reported 2020-10-29 23:58:30 PDT
[GTK] wpt is very flaky on my Linux box

wpt reports a lot of flaky failures on my Linux box.

./Tools/Scripts/run-webkit-tests --gtk --release imported/w3c/web-platform-tests

I'm using Ubuntu 20.04 on VirtualBox on Windows.
Attachments
debug log (5.50 MB, application/gzip)
2020-11-01 12:54 PST, Fujii Hironori
no flags
dbus session log (34.98 KB, application/gzip)
2020-11-04 17:40 PST, Fujii Hironori
no flags
dbus-session-2.log.gz (68.49 KB, application/gzip)
2020-11-04 18:34 PST, Fujii Hironori
no flags
dbus-system-2.log.gz (5.03 KB, application/gzip)
2020-11-04 18:35 PST, Fujii Hironori
no flags
Fujii Hironori
Comment 1 2020-10-30 00:00:22 PDT
This is reproducible by running a single test case repeatedly.

./Tools/Scripts/run-webkit-tests --gtk --release imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html --repeat-each=3000

> Running 1 test
>
> Running 1 WebKitTestRunner.
>
> [181/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [363/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [548/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [727/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [914/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [1110/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [1306/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [1496/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [1678/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [1860/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [2045/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [2220/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [2395/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [2571/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [2754/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [2941/3000] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
>
> Retrying 1 unexpected failure ...
>
> Running 1 WebKitTestRunner.
>
>
> 2984 tests ran as expected, 16 didn't:

Hmm, it fails consistently, roughly every 180-190 iterations.
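For anyone repeating this measurement, the spacing between failures can be computed rather than read off by eye. This is only a minimal sketch: failure_gaps is a hypothetical helper name, and it assumes the run-webkit-tests output was saved with one "[N/TOTAL] ... failed unexpectedly" entry per line, as in the log above.

```shell
# failure_gaps: read run-webkit-tests output on stdin and print the
# number of iterations between consecutive unexpected failures.
# Assumption: one "[N/TOTAL] ... failed unexpectedly" entry per line.
failure_gaps() {
    # Keep only the iteration number N from each failure line.
    sed -n 's/.*\[\([0-9][0-9]*\)\/[0-9][0-9]*\].*failed unexpectedly.*/\1/p' |
        # Print the difference between each number and the previous one.
        awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}

# Usage (run.log is a hypothetical file holding the saved output):
#   failure_gaps < run.log
```

A narrow spread of gaps would point to something periodic, such as a timer or a restarting service, rather than to random test flakiness.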
Fujii Hironori
Comment 2 2020-10-30 00:09:04 PDT
If I copy all-prop-001.html and all-prop-001-expected.html into LayoutTests/fast and LayoutTests/http/tests/css directories, it doesn't reproduce the flaky failures.

./Tools/Scripts/run-webkit-tests --gtk --release fast/all-prop-001.html --repeat-each=3000 -f
./Tools/Scripts/run-webkit-tests --gtk --release http/tests/css/all-prop-001.html --repeat-each=3000 -f
Fujii Hironori
Comment 3 2020-10-30 00:17:27 PDT
This issue can be reproduced by invoking WebKitTestRunner manually.

./Tools/Scripts/run-webkit-httpd --no-httpd
./Tools/Scripts/webkit-flatpak --gtk --release -c /usr/bin/bash
export TEST_RUNNER_TEST_PLUGIN_PATH=$PWD/WebKitBuild/GTK/Release/lib
yes http://localhost:8800/css/css-cascade/all-prop-001.html | head -3000 | WebKitBuild/GTK/Release/bin/WebKitTestRunner - | tee wpt.log

WebKitTestRunner renders blank pages intermittently.

> Content-Type: text/plain
> layer at (0,0) size 800x600
> RenderView at (0,0) size 800x600
> layer at (0,0) size 800x158
> RenderBlock {HTML} at (0,0) size 800x158
> RenderBody {BODY} at (8,16) size 784x134
> RenderBlock {P} at (0,0) size 784x18
> RenderText {#text} at (0,0) size 294x17
> text run at (0,0) width 294: "Test passes if there is a filled green square and "
> RenderInline {STRONG} at (0,0) size 45x17
> RenderText {#text} at (293,0) size 45x17
> text run at (293,0) width 45: "no red"
> RenderText {#text} at (337,0) size 5x17
> text run at (337,0) width 5: "."
> layer at (8,50) size 784x100
> RenderBlock (relative positioned) {DIV} at (0,34) size 784x100
> RenderBlock {DIV} at (684,0) size 100x100 [bgcolor=#FF0000]
> layer at (692,50) size 100x100
> RenderBlock (positioned) {DIV} at (684,0) size 100x100 [bgcolor=#008000]
> #EOF
> #EOF
> Content-Type: text/plain
> layer at (0,0) size 800x600
> RenderView at (0,0) size 800x600
> layer at (0,0) size 800x600
> RenderBlock {HTML} at (0,0) size 800x600
> RenderBody {BODY} at (8,8) size 784x584
> #EOF
> #EOF

Grepping for RenderBody, numbering the matches, and then grepping for the blank page:

grep RenderBody wpt.log | cat -n | grep 784x584
293 RenderBody {BODY} at (8,8) size 784x584
586 RenderBody {BODY} at (8,8) size 784x584
886 RenderBody {BODY} at (8,8) size 784x584
1163 RenderBody {BODY} at (8,8) size 784x584
1442 RenderBody {BODY} at (8,8) size 784x584
1716 RenderBody {BODY} at (8,8) size 784x584
1995 RenderBody {BODY} at (8,8) size 784x584
2263 RenderBody {BODY} at (8,8) size 784x584
2545 RenderBody {BODY} at (8,8) size 784x584
2827 RenderBody {BODY} at (8,8) size 784x584

It fails at regular intervals.
Carlos Alberto Lopez Perez
Comment 4 2020-10-30 05:32:47 PDT
Strange. I wonder if it can be related to bug 212622? Perhaps you have a left-over HTTP server running from a previous run? Can you reboot the Linux box and see if you can reproduce the issue after a fresh boot?
Fujii Hironori
Comment 5 2020-10-30 12:49:55 PDT
Thanks, but still no luck after a fresh boot. I'm going to enable debug logging.
Fujii Hironori
Comment 6 2020-11-01 12:54:26 PST
Created attachment 412869 [details] debug log
Fujii Hironori
Comment 7 2020-11-01 12:56:07 PST
Every time the blank page is shown, the following error messages are reported.

< HTTP/1.1 7 GDBus.Error:org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
< Soup-Debug-Timestamp: 1604263197
< Soup-Debug: SoupMessage 0 (0x5597b84550b0)
(WebProcess) WebResourceLoader::didFailResourceLoad for 'http://localhost:8800/css/css-cascade/all-prop-001.html'
Failed to load 'http://localhost:8800/css/css-cascade/all-prop-001.html'.
Fujii Hironori
Comment 8 2020-11-01 13:15:14 PST
I don't think this is an issue of wpt.py because I observe no issues when requesting the URL with curl.

seq 30000 | xargs -n 1 -P 30 curl http://localhost:8800/apng/supported-in-source-type.html -o
md5sum * | sort
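The checksum comparison above can be reduced to a single number. A minimal sketch, assuming the curl responses were saved as separate files in one directory (count_distinct_checksums is a hypothetical helper, not part of the WebKit tooling):

```shell
# count_distinct_checksums DIR: print how many different MD5 sums occur
# among the regular files directly inside DIR.
count_distinct_checksums() {
    find "$1" -maxdepth 1 -type f -exec md5sum {} + |
        awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Usage: run in the directory that holds the downloaded responses.
#   count_distinct_checksums .
```

A result of 1 means every saved response was byte-identical. A failed request may leave no file at all, so the file count is worth checking separately.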
Michael Catanzaro
Comment 9 2020-11-04 15:32:08 PST
Ouch. You can try to use bustle (recommended) or dbus-monitor to figure out what message is being sent. Be sure to check both the session bus and the system bus. If you don't see anything, then I guess the next step is to try to figure out how to run xdg-dbus-proxy in some debugging mode to see if it's blocking anything.
Fujii Hironori
Comment 10 2020-11-04 17:40:53 PST
Created attachment 413232 [details] dbus session log

I recorded the session bus with dbus-monitor while running run-webkit-tests.

> ./Tools/Scripts/run-webkit-tests --gtk --debug imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html --repeat-each=300 --no-retry-failures --no-show-results

While run-webkit-tests reported three flaky failures,

> [27/300] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [98/300] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)
> [171/300] imported/w3c/web-platform-tests/css/css-cascade/all-prop-001.html failed unexpectedly (reference mismatch)

dbus-monitor reported 3 org.freedesktop.DBus.Error.NoReply errors.
Fujii Hironori
Comment 11 2020-11-04 18:34:44 PST
Created attachment 413239 [details] dbus-session-2.log.gz

I recorded the system bus and session bus at the same time while running run-webkit-tests. This time, run-webkit-tests reported 2 flaky failures. The session bus reported 2 NoReply errors.

> error time=1604542788.420320 sender=org.freedesktop.DBus -> destination=:1.1175 error_name=org.freedesktop.DBus.Error.NoReply reply_serial=310
> error time=1604542849.392397 sender=org.freedesktop.DBus -> destination=:1.1175 error_name=org.freedesktop.DBus.Error.NoReply reply_serial=591

The system bus reported several signals at the same times.
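To line up the NoReply errors with events on the other bus, the timestamps can be pulled out of a dbus-monitor log. A minimal sketch, assuming the log uses the "error time=... error_name=..." line format quoted above (noreply_times is a hypothetical helper name):

```shell
# noreply_times: read dbus-monitor output on stdin and print the time=
# value of every org.freedesktop.DBus.Error.NoReply error line.
noreply_times() {
    grep 'error_name=org\.freedesktop\.DBus\.Error\.NoReply' |
        sed -n 's/.*time=\([0-9.]*\).*/\1/p'
}

# Usage (session.log is a hypothetical capture of the session bus):
#   noreply_times < session.log
```

The printed values are the same timestamps that dbus-monitor writes for every message, so they can be searched for directly in the system bus log.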
Fujii Hironori
Comment 12 2020-11-04 18:35:05 PST
Created attachment 413240 [details] dbus-system-2.log.gz
Fujii Hironori
Comment 13 2020-11-04 19:11:35 PST
There is the word 'coredump' in dbus-system-2.log.gz. coredumpctl lists a lot of xdg-desktop-portal core dumps.

$ coredumpctl -r | head
TIME PID UID GID SIG COREFILE EXE
Thu 2020-11-05 12:07:59 JST 104497 1000 1000 11 present /usr/libexec/xdg-desktop-portal
Thu 2020-11-05 12:07:28 JST 103814 1000 1000 11 present /usr/libexec/xdg-desktop-portal
Thu 2020-11-05 12:06:58 JST 103100 1000 1000 11 present /usr/libexec/xdg-desktop-portal
Thu 2020-11-05 12:06:28 JST 100552 1000 1000 11 present /usr/libexec/xdg-desktop-portal
Thu 2020-11-05 11:20:49 JST 99870 1000 1000 11 present /usr/libexec/xdg-desktop-portal
Thu 2020-11-05 11:20:19 JST 99220 1000 1000 11 present /usr/libexec/xdg-desktop-portal
Thu 2020-11-05 11:19:48 JST 98562 1000 1000 11 present /usr/libexec/xdg-desktop-portal
Thu 2020-11-05 11:19:18 JST 97888 1000 1000 11 present /usr/libexec/xdg-desktop-portal
Thu 2020-11-05 11:15:33 JST 95785 1000 1000 11 present /usr/libexec/xdg-desktop-portal

I confirmed that running run-webkit-tests produces more xdg-desktop-portal core dumps.
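One way to confirm that a test run adds core dumps is to count the entries per executable before and after the run. A minimal sketch, assuming the coredumpctl listing has the column layout shown above, with the executable as the only field that starts with "/" (coredumps_per_exe is a hypothetical helper name):

```shell
# coredumps_per_exe: read a coredumpctl listing on stdin and print
# "<count> <executable>" for each executable, most frequent first.
coredumps_per_exe() {
    # Take the first field that looks like an absolute path; the header
    # line has none and is skipped.
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}

# Usage:
#   coredumpctl | coredumps_per_exe
```

Running it once before and once after run-webkit-tests and comparing the xdg-desktop-portal count gives the increase directly.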
Fujii Hironori
Comment 14 2020-11-04 19:55:03 PST
This is the backtrace (without debug symbols):

#0 0x00007f711bc4c494 in g_str_hash () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#1 0x00007f711bc4b5dc in g_hash_table_lookup () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2 0x0000556f8b75f282 in ?? ()
#3 0x0000556f8b75f778 in ?? ()
#4 0x00007f711bc87931 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5 0x00007f711bbf2609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6 0x00007f711bb19293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

It looks like this issue:

segfault when running org.freedesktop.Platform/x86_64/19.08 · Issue #433 · flatpak/xdg-desktop-portal · GitHub
https://github.com/flatpak/xdg-desktop-portal/issues/433
Fujii Hironori
Comment 15 2020-11-04 20:31:25 PST
I upgraded to Ubuntu 20.10 (xdg-desktop-portal 1.8). It works fine now.
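For checking whether another machine is likely to be affected, the installed xdg-desktop-portal version can be compared against the version that worked here. A minimal sketch: version_at_least is a hypothetical helper, it relies on GNU sort -V, and 1.8 is used only because it is the version reported working in this bug, not because it is known to be the first fixed release.

```shell
# version_at_least HAVE WANT: succeed if version HAVE sorts at or after
# version WANT under GNU "sort -V" ordering.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a Debian or Ubuntu system:
#   have=$(dpkg-query -W -f='${Version}' xdg-desktop-portal)
#   version_at_least "$have" 1.8 || echo "xdg-desktop-portal $have may be affected"
```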
Michael Catanzaro
Comment 16 2020-11-05 06:16:31 PST
Wow.