Bug 132134

Summary: [GTK] TestWebKitAccessibility unit test is flaky.
Product: WebKit
Reporter: Carlos Alberto Lopez Perez <clopez>
Component: Tools / Tests
Assignee: Nobody <webkit-unassigned>
Status: NEW
Severity: Normal
CC: bugs-noreply, cgarcia, jdiggs, mario, mcatanzaro, svillar
Priority: P2
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
See Also: https://bugs.webkit.org/show_bug.cgi?id=127352
Attachments:
Description: broken patch (Flags: none)

Description Carlos Alberto Lopez Perez 2014-04-24 10:16:55 PDT
The TestWebKitAccessibility unit test is flaky.

The test usually (but not always) passes if you run it directly like:

Tools/Scripts/run-gtk-tests --release WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility

The failure seems easier to reproduce if you run the complete test suite like:

Tools/Scripts/run-gtk-tests --release

Probably this is caused by some race condition.

A related bug is https://bugs.webkit.org/show_bug.cgi?id=100408

Some relevant errors may be:

TEST: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility... (pid=18529)

  /webkit2/WebKitAccessibility/atspi-basic-hierarchy:                  **

ERROR:../../Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitAccessibility.cpp:154:void testAtspiBasicHierarchy(WebViewTest*, gconstpointer): assertion failed: (ATSPI_IS_ACCESSIBLE(testServerApp.get()))

FAIL

GTester: last random seed: R02S13299414e127a0424403f994eb6d6bfb

(pid=18582)

Error receiving IPC message on socket -1 in process 18581: Bad file descriptor

FAIL: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility


Complete log: http://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release/builds/46795/steps/API%20tests/logs/stdio
Comment 1 Carlos Alberto Lopez Perez 2014-04-24 14:57:05 PDT
This test has been marked to be skipped because of its flaky behaviour in r167769 <http://trac.webkit.org/changeset/167769>
Comment 2 Mario Sanchez Prada 2014-04-25 00:48:14 PDT
Thanks for reporting this, Carlos. I'll try to take a look at this next week, although I suspect it is going to be a tricky one to reproduce and fix without direct access to the bots, since part of the problem could be due to the specific environment (AT-SPI2, D-Bus...).

Still I will try, but next week, this one is being too surreal for me already :)
Comment 3 Carlos Garcia Campos 2014-04-25 00:56:04 PDT
I wonder why we have a unit test for accessibility since we don't expose any API for accessibility. Shouldn't it be tested with layout tests instead?
Comment 4 Mario Sanchez Prada 2014-04-25 01:25:17 PDT
(In reply to comment #3)
> I wonder why we have a unit test for accessibility since we don't expose any API for accessibility. Shouldn't it be tested with layout tests instead?

Accessibility is a tricky beast :)

You are right that WebKit2GTK+ does not explicitly expose any accessibility-related API in the way other APIs are exposed in the UIProcess (in UIProcess/API/gtk).

But that does not mean there is no API exposed at all. What happens here is that most of the accessibility is exposed directly from WebCore, by implementing different ATK interfaces (there you have the exposed API). This, in WebKit2, means that the accessibility ATK-based API is actually exposed to the world right from the WebProcess, not the UI Process.

However, there's still something we need to do from the UIProcess in WebKit2, so that ATs can see the whole ATK hierarchy exposed in the Web Process as "connected" to the very tiny ATK hierarchy present in the UIProcess (basically, an AtkObject associated with the WebView widget):

We need to connect both hierarchies using AtkSocket and AtkPlug so that ATs, which "speak" only AT-SPI (not ATK), see everything as a single hierarchy, starting at the level of the AtkObject for the WebView and going all the way down into the domain of the WebProcess, where the "real accessibility stuff" lives.

So, this is what this TestWebKitAccessibility unit test checks: that the connection between those two worlds is not broken. That's precisely why it uses AT-SPI instead of ATK to navigate top-down, so it can check that it finds the relevant AT-SPI objects wrapping ATK objects in both the UI Process and the Web Process.

Layout tests are not enough to test this because WebKitTestRunner uses the InjectedBundle to get "direct access" to the WebProcess for testing purposes. So, having an accessibility layout test passing using WKTR does not guarantee that this connection using AtkSocket/AtkPlug is not broken.
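
To make the idea concrete, here is a toy model in Python. It is only a sketch: Node, Socket and walk are hypothetical names made up for illustration, not ATK or AT-SPI API, and the real WebView accessible is more involved than a single socket object. It shows why a top-down walk from the UI process only reaches the web content when the socket/plug link resolves, while direct access to the web tree (the situation of a layout test) succeeds either way.

```python
class Node:
    """One accessible object in a toy hierarchy (not a real ATK or AT-SPI type)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class Socket(Node):
    """UI-process leaf that stands in for an embedded subtree, referenced by plug id."""

    def __init__(self, name, plug_id):
        super().__init__(name)
        self.plug_id = plug_id


def walk(node, plugs):
    """Yield names top-down, crossing a socket only if its plug id resolves."""
    yield node.name
    for child in node.children:
        yield from walk(child, plugs)
    if isinstance(node, Socket) and node.plug_id in plugs:
        yield from walk(plugs[node.plug_id], plugs)


# Web-process side: the "real" accessibility tree, rooted at a plug.
web_root = Node("document", [Node("heading"), Node("paragraph")])

# UI-process side: a tiny tree whose WebView object acts as the socket.
ui_root = Node("application", [Node("window", [Socket("webview", plug_id="plug-1")])])

connected = list(walk(ui_root, {"plug-1": web_root}))  # link intact
broken = list(walk(ui_root, {}))                       # link missing
direct = list(walk(web_root, {}))                      # bypasses the link entirely
```

In this model, `direct` looks the same whether or not the link is intact, which is the reason a passing layout test says nothing about the socket/plug connection.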

You can check this post of mine from last year where I tried to explain the whole thing all together, since it can be quite confusing: http://mariospr.org/2013/02/03/accessibility-in-webkitgtk/
Comment 5 Carlos Garcia Campos 2014-04-25 01:51:55 PDT
Understood, thanks for the detailed explanation :-)
Comment 6 Carlos Alberto Lopez Perez 2014-04-25 03:18:58 PDT
(In reply to comment #2)
> Thanks for reporting this, Carlos. I'll try to take a look at this next week, although I suspect it is going to be a tricky one to reproduce and fix without direct access to the bots, since part of the problem could be due to the specific environment (AT-SPI2, D-Bus...).
> 

For the record: I'm able to reproduce the reported behaviour on my laptop. But it is also true that both machines (my laptop and the buildbot) are running Debian testing, so it could also be some behaviour specific to Debian.

If you are not able to reproduce it, then ping me on IRC.
Comment 7 Mario Sanchez Prada 2014-05-29 03:05:58 PDT
(In reply to comment #6)
> (In reply to comment #2)
> > Thanks for reporting this, Carlos. I'll try to take a look at this next week, although I suspect it is going to be a tricky one to reproduce and fix without direct access to the bots, since part of the problem could be due to the specific environment (AT-SPI2, D-Bus...).
> > 
> 
> For the record: I'm able to reproduce the reported behaviour on my laptop. But it is also true that both machines (my laptop and the buildbot) are running Debian testing, so it could also be some behaviour specific to Debian.
> 
> If you are not able to reproduce it, then ping me on IRC.

Hi Carlos, I was able to reproduce this test failing (only when running the full test suite, though), but the error I got is different:

   TEST: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility... (pid=32387)

   ** (./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility:32387): ERROR **: AT-SPI: COuldn't connect to accessibility bus. Is at-spi-bus-launcher running?
   GTester: last random seed: R02S135fb01e6858c1cc208988c3589f63d1
   (pid=32430)
   FAIL: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility


So, I'm not sure I'm seeing the same issue as you. Could you please confirm that what you can reproduce locally matches this bug's description and not this other issue?

Thanks!
Comment 8 Mario Sanchez Prada 2014-07-03 05:30:10 PDT
(In reply to comment #7)
> So, I'm not sure I'm seeing the same issue as you. Could you please confirm that what you can reproduce locally matches this bug's description and not this other issue?
> 
> Thanks!

Ping?
Comment 9 Carlos Alberto Lopez Perez 2014-07-08 07:50:23 PDT
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #2)
> > > Thanks for reporting this, Carlos. I'll try to take a look at this next week, although I suspect it is going to be a tricky one to reproduce and fix without direct access to the bots, since part of the problem could be due to the specific environment (AT-SPI2, D-Bus...).
> > > 
> > 
> > For the record: I'm able to reproduce the reported behaviour on my laptop. But it is also true that both machines (my laptop and the buildbot) are running Debian testing, so it could also be some behaviour specific to Debian.
> > 
> > If you are not able to reproduce it, then ping me on IRC.
> 
> Hi Carlos, I was able to reproduce this test failing (only when running the full test suite, though), but the error I got is different:
> 
>    TEST: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility... (pid=32387)
> 
>    ** (./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility:32387): ERROR **: AT-SPI: COuldn't connect to accessibility bus. Is at-spi-bus-launcher running?
>    GTester: last random seed: R02S135fb01e6858c1cc208988c3589f63d1
>    (pid=32430)
>    FAIL: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility
> 
> 
> So, I'm not sure I'm seeing the same issue as you. Could you please confirm that what you can reproduce locally matches this bug's description and not this other issue?
> 
> Thanks!

I have retested this on r170881 (after un-skipping this test in Tools/Scripts/run-gtk-tests).

I can't make the test fail by running it alone, but if I run the whole test suite, the test fails.


The failure I get is:

TEST: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility... (pid=2165)

** (./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility:2165): ERROR **: AT-SPI: COuldn't connect to accessibility bus. Is at-spi-bus-launcher running?
GTester: last random seed: R02Sa59b6626786d9147f3d265f28c5b4d50
(pid=2198)
Error receiving IPC message on socket -1 in process 2196: Bad file descriptor
FAIL: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestWebKitAccessibility


So there is probably a different problem now, since the error is different.

But what is clear to me is that something is wrong with this test. We can't unskip it until it runs reliably (when running the whole test suite).
Comment 10 Joanmarie Diggs (irc: joanie) 2014-07-09 18:05:09 PDT
Is it possible that the problem is being caused by some other issue, and the TestWebKitAccessibility test is merely an unfortunate casualty?

As an experiment, I skipped not just atspi-basic-hierarchy, but the full suite of accessibility tests. Then I ran the (remaining) complete test suite. I'm seeing both accessibility and non-accessibility dbus errors for non-accessibility tests:


TEST: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestDOMNode... (pid=31957)

** (WebKitWebProcess:31969): WARNING **: Error retrieving accessibility bus address: org.freedesktop.DBus.Error.ServiceUnknown: The name org.a11y.Bus was not provided by any .service files
g_dbus_connection_real_closed: Remote peer vanished with error: Underlying GIOStream returned 0 bytes on an async read (g-io-error-quark, 0). Exiting.

PASS: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestDOMNode

------------

TEST: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestFrame... (pid=32047)

** (WebKitWebProcess:32059): WARNING **: Error retrieving accessibility bus address: org.freedesktop.DBus.Error.ServiceUnknown: The name org.a11y.Bus was not provided by any .service files

PASS: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestFrame

------------

TEST: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestLoaderClient... (pid=32140)

g_dbus_connection_real_closed: Remote peer vanished with error: Underlying GIOStream returned 0 bytes on an async read (g-io-error-quark, 0). Exiting.

PASS: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestLoaderClient

------------

TEST: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestDOMXPathNSResolver... (pid=32365)

** (WebKitWebProcess:32377): WARNING **: Error retrieving accessibility bus address: org.freedesktop.DBus.Error.ServiceUnknown: The name org.a11y.Bus was not provided by any .service files

g_dbus_connection_real_closed: Remote peer vanished with error: Underlying GIOStream returned 0 bytes on an async read (g-io-error-quark, 0). Exiting.

------------

TEST: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestDOMNodeFilter... (pid=32725)

** (WebKitWebProcess:32737): WARNING **: Error retrieving accessibility bus address: org.freedesktop.DBus.Error.ServiceUnknown: The name org.a11y.Bus was not provided by any .service files

g_dbus_connection_real_closed: Remote peer vanished with error: Underlying GIOStream returned 0 bytes on an async read (g-io-error-quark, 0). Exiting.

PASS: ./Tools/gtk/../../WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestDOMNodeFilter

------------

Are you guys also seeing these errors for the non-accessibility tests? If something (the harness, some other test, whatever) is mucking with dbus and/or the at-spi2 registry, it might explain why the at-spi2 test doesn't fail reliably when run alone but does when run as part of the full suite.
Comment 11 Carlos Alberto Lopez Perez 2014-07-10 04:00:59 PDT
(In reply to comment #10)
> Are you guys also seeing these errors for the non-accessibility tests? If something (the harness, some other test, whatever) is mucking with dbus and/or the at-spi2 registry, it might explain why the at-spi2 test doesn't fail reliably when run alone but does when run as part of the full suite.

I also see this kind of warning on several of the other tests when running the test suite. However, the other tests pass. Check the output you pasted: all the tests "PASS".

So I'm not sure whether these dbus errors are relevant to this bug or just noise.
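
One way to check whether the warnings line up with failures is to tabulate them per test from a saved log. The following is a minimal sketch that assumes the log keeps the "TEST: ... (pid=N)" and "PASS:/FAIL:" layout pasted in this bug; the function name summarize and the list of marker strings are illustrative choices, not part of run-gtk-tests.

```python
import re

# Line shapes taken from the logs pasted in this bug.
TEST_RE = re.compile(r"^\s*TEST: (?P<path>\S+?)\.\.\. \(pid=\d+\)")
RESULT_RE = re.compile(r"^\s*(?P<status>PASS|FAIL): (?P<path>\S+)")

# Substrings that mark a line as bus-related (an assumption; extend as needed).
BUS_MARKERS = ("org.a11y.Bus", "g_dbus_connection_real_closed", "accessibility bus")


def summarize(log_text):
    """Map each test path to its result and the number of bus-related lines seen."""
    summary = {}
    current = None
    for line in log_text.splitlines():
        started = TEST_RE.match(line)
        if started:
            current = started.group("path")
            summary[current] = {"status": None, "bus_lines": 0}
            continue
        finished = RESULT_RE.match(line)
        if finished and finished.group("path") in summary:
            summary[finished.group("path")]["status"] = finished.group("status")
            continue
        if current and any(marker in line for marker in BUS_MARKERS):
            summary[current]["bus_lines"] += 1
    return summary
```

A test whose result line is missing from the log keeps a status of None, so it can be told apart from a pass or a failure.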
Comment 12 Joanmarie Diggs (irc: joanie) 2014-07-10 17:33:00 PDT
Quick and dirty "fix" below. At least in my environment, what seems to be happening is:

1. run-gtk-tests sets up the environment, including starting
   up the at-spi-bus-launcher 
2. If the first and/or only test is TestWebKitAccessibility, it
   will work as expected, even when unskipped.
3. For every subsequent test, main() comes along and unsets the
   DBUS_SESSION_BUS_ADDRESS environment var. Nothing comes along
   and resets it. As a result, for all but the first test, the
   harness cannot find the session bus, cannot connect to the
   at-spi2 registry, and anything that depends upon that connection
   to pass (like the accessibility tests) subsequently fails.
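
The effect described in step 3 can be illustrated outside WebKit with plain environment inheritance. This is only a sketch: the two inline scripts are stand-ins for a test binary's main() and for a helper process it spawns, and the bus address is a made-up value.

```python
import os
import subprocess
import sys

VAR = "DBUS_SESSION_BUS_ADDRESS"

# Stand-in for a helper process spawned by a test binary: it reports what it inherited.
GRANDCHILD = f"import os; print(os.environ.get({VAR!r}, '<unset>'))"

# Stand-in for a test binary's main(): optionally unset the variable, then spawn the helper.
CHILD = (
    "import os, subprocess, sys\n"
    "if sys.argv[1] == 'unset':\n"
    f"    os.environ.pop({VAR!r}, None)\n"
    f"subprocess.run([sys.executable, '-c', {GRANDCHILD!r}], check=True)\n"
)


def run_child(mode, address="unix:path=/tmp/hypothetical-bus"):
    """Run the stand-in test binary and return what its helper process saw."""
    env = dict(os.environ, **{VAR: address})
    result = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("kept: ", run_child("keep"))
    print("unset:", run_child("unset"))
```

Once the stand-in main() drops the variable, anything it launches afterwards no longer knows where the session bus is.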

You don't even need to run a11y tests for this to happen. Just use run-gtk-tests to run the same test twice in a row. For instance:

Tools/Scripts/run-gtk-tests --release WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestDOMNode WebKitBuild/Release/bin/TestWebKitAPI/WebKit2Gtk/TestDOMNode

In between the two runs, you'll get an error:

** (WebKitWebProcess:7471): WARNING **: Error retrieving accessibility bus address: org.freedesktop.DBus.Error.ServiceUnknown: The name org.a11y.Bus was not provided by any .service files

There's probably some good reason for unsetting DBUS_SESSION_BUS_ADDRESS, but since I don't know what it is, I'm not in the best position to provide an alternative solution. And if there's not a good reason, could we just stop unsetting it so that we can unskip this accessibility test? :)

diff --git a/Tools/Scripts/run-gtk-tests b/Tools/Scripts/run-gtk-tests
index d530f9d..eececf5 100755
--- a/Tools/Scripts/run-gtk-tests
+++ b/Tools/Scripts/run-gtk-tests
@@ -63,7 +63,6 @@ class TestRunner:
     SKIPPED = [
         SkippedTest("WebKit2Gtk/TestUIClient", "/webkit2/WebKitWebView/mouse-target", "Test times out after r150890", 117689),
         SkippedTest("WebKit2Gtk/TestContextMenu", SkippedTest.ENTIRE_SUITE, "Test times out after r150890", 117689),
-        SkippedTest("WebKit2APITests/TestWebKitAccessibility", "/webkit2/WebKitAccessibility/atspi-basic-hierarchy", "Test is flaky"
         SkippedTest("WebKit2Gtk/TestWebKitWebView", "/webkit2/WebKitWebView/snapshot", "Test fails", 120404),
         SkippedTest("WebKit2Gtk/TestWebKitWebView", "/webkit2/WebKitWebView/page-visibility", "Test fails or times out", 131731),
         SkippedTest("WebKit2Gtk/TestCookieManager", "/webkit2/WebKitCookieManager/persistent-storage", "Test is flaky", 134580),
diff --git a/Tools/TestWebKitAPI/gtk/WebKit2Gtk/TestMain.cpp b/Tools/TestWebKitAPI/gtk/WebKit2Gtk/TestMain.cpp
index 8043554..7f1aa63 100644
--- a/Tools/TestWebKitAPI/gtk/WebKit2Gtk/TestMain.cpp
+++ b/Tools/TestWebKitAPI/gtk/WebKit2Gtk/TestMain.cpp
@@ -53,7 +53,6 @@ static void removeNonEmptyDirectory(const char* directoryPath)
 
 int main(int argc, char** argv)
 {
-    g_unsetenv("DBUS_SESSION_BUS_ADDRESS");
     gtk_test_init(&argc, &argv, 0);
     g_setenv("WEBKIT_EXEC_PATH", WEBKIT_EXEC_PATH, FALSE);
     g_setenv("WEBKIT_INJECTED_BUNDLE_PATH", WEBKIT_INJECTED_BUNDLE_PATH, FALSE);
Comment 13 Carlos Garcia Campos 2014-07-11 04:29:27 PDT
Is the a11y bus daemon required for all tests or only for TestWebKitAccessibility? If it's only needed by that one, it should probably be spawned by the test itself (like all other tests that use their own bus daemon), instead of by the script. We really need to unset DBUS_SESSION_BUS_ADDRESS, see https://bugs.webkit.org/show_bug.cgi?id=118427


    * UIProcess/API/gtk/tests/TestMain.cpp:
    (main): Unset DBUS_SESSION_BUS_ADDRESS environment variable to
    make sure that the GLib bus singleton is initialized by the
    private DBus session bus created by the tests.
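
As a rough illustration of scoping a bus daemon to a single test, here is a Python sketch. It is not how the C++ tests do it; the helper name private_session_bus is hypothetical, and it assumes a dbus-daemon binary that prints its address on stdout when given --print-address. Any accessibility daemons would then have to be started with the yielded environment, which is not shown here.

```python
import os
import subprocess
from contextlib import contextmanager

# Assumed launcher command: dbus-daemon printing its address on stdout.
DEFAULT_COMMAND = ("dbus-daemon", "--session", "--nofork", "--print-address")


@contextmanager
def private_session_bus(command=DEFAULT_COMMAND):
    """Start a private bus daemon and yield an environment that points at it."""
    daemon = subprocess.Popen(list(command), stdout=subprocess.PIPE, text=True)
    try:
        # The first line printed by the daemon is taken as the bus address.
        address = daemon.stdout.readline().strip()
        if not address:
            raise RuntimeError("bus daemon exited without printing an address")
        yield dict(os.environ, DBUS_SESSION_BUS_ADDRESS=address)
    finally:
        # Tear the daemon down when the test is done, pass or fail.
        daemon.terminate()
        daemon.wait()
        daemon.stdout.close()
```

The yielded dictionary can be passed as env= to whatever the test launches, so the daemon's lifetime and its address stay tied to that one test.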
Comment 14 Michael Catanzaro 2015-01-13 20:07:11 PST
*** Bug 140413 has been marked as a duplicate of this bug. ***
Comment 15 Michael Catanzaro 2015-01-13 20:12:30 PST
Created attachment 244572 [details]
broken patch

Here I implement Carlos's suggestion, but the test is still broken when running with all other tests, and passes even without accessibility daemons(!) (comment out the call to startAccessibilityDaemons()) when run alone. Hopefully I've done something wrong. I'm just uploading the patch because it will save time for anyone who tries to look at this in the future.