Bug 127352

Summary: [GTK] Accessibility daemons are killing the bots
Product: WebKit
Reporter: Sergio Villar Senin <svillar>
Component: Tools / Tests
Assignee: Nobody <webkit-unassigned>
Status: REOPENED
Severity: Major
CC: apinheiro, bugs-noreply, cgarcia, clopez, commit-queue, jdiggs, mario, mcatanzaro, ossy, svillar, zan, zsborbely.u-szeged
Priority: P2
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
See Also: https://bugs.webkit.org/show_bug.cgi?id=132134
https://bugs.webkit.org/show_bug.cgi?id=153483
Attachments:
Description Flags
Patch none

Description Sergio Villar Senin 2014-01-21 09:36:57 PST
I know that we fixed something like this in the past, but I have just logged in to our bots and there were literally hundreds of dbus-daemon and at-spi processes running on the bot machines.
Comment 1 Mario Sanchez Prada 2014-02-04 06:09:38 PST
What we fixed was a typo in run-gtk-tests that was preventing the daemon from shutting down properly, so I thought that would be enough to fix the problem on the bots.

Unfortunately, that does not seem to have been enough, and it's quite hard for me to find the real issue and fix it because I can neither reproduce the problem locally nor ssh into the bots. Thus, if someone else with access to the bots could investigate this issue, that would be great.

Of course, I'd be happy to help with it, but at this point I think some investigation inside the actual bot would be the way to go.
Comment 2 Sergio Villar Senin 2014-04-04 02:12:14 PDT
Raising importance.
Comment 3 Carlos Alberto Lopez Perez 2014-04-04 10:48:31 PDT
Created attachment 228604 [details]
Patch
Comment 4 Carlos Alberto Lopez Perez 2014-04-04 10:49:48 PDT
I can confirm the issue with the dbus-daemon processes, but not with the at-spi ones.

I have attached a patch for this.
Comment 5 WebKit Commit Bot 2014-04-04 11:37:28 PDT
Comment on attachment 228604 [details]
Patch

Clearing flags on attachment: 228604

Committed r166798: <http://trac.webkit.org/changeset/166798>
Comment 6 WebKit Commit Bot 2014-04-04 11:37:32 PDT
All reviewed patches have been landed.  Closing bug.
Comment 7 Zsolt Borbely 2014-04-08 04:52:47 PDT
Killing the dbus-daemon is not a good solution, since it locks the screen of the EFL performance bot. In that case every test will fail.
Comment 8 Carlos Alberto Lopez Perez 2014-04-08 05:00:24 PDT
(In reply to comment #7)
> Killing the dbus-daemon is not a good solution, since it locks the screen of the EFL performance bot. In that case every test will fail.

How can that happen?

You should be running the bot as an unprivileged user, so the bot can't kill the processes (dbus-daemon) of the system or of other users.
Comment 9 Zsolt Borbely 2014-04-08 05:37:27 PDT
Every user has his/her own dbus-daemon, and the EFL performance bot kills the current user's daemon.
Comment 10 Carlos Alberto Lopez Perez 2014-04-08 05:51:15 PDT
(In reply to comment #9)
> Every user has his/her own dbus-daemon, and the EFL performance bot kills the current user's daemon.

Does the EFL performance bot run as root? Otherwise I don't understand.
Comment 11 Zsolt Borbely 2014-04-10 07:37:24 PDT
We do not run it as root; the buildbot killed its own dbus-daemon, not anyone else's. The dbus-daemon is necessary for GNOME and Unity as well.
Comment 12 Csaba Osztrogonác 2014-04-10 07:57:37 PDT
I agree with Zsolt: killing dbus-daemon is not a proper fix. The issue you
raised on GTK should be found and fixed instead.

If you really want to kill all dbus-daemon processes, please improve
kill-old-processes to accept --efl/--gtk/... options and kill dbus-daemon
only on GTK. Then you can simply make the buildmaster pass the
platform to the script by calling appendCustomBuildFlag() in master.cfg,
similar to CompileWebKit or RunWebKitTests.
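
A minimal sketch of the per-platform option suggested above. This is not the actual kill-old-processes script: the process lists, the function names tasks_to_kill() and parse_args(), and the exact flag handling are assumptions made for illustration only.

```python
import argparse

# Hypothetical process lists; the real kill-old-processes script may differ.
COMMON_TASKS = ["WebKitTestRunner", "DumpRenderTree"]
PLATFORM_TASKS = {
    "gtk": ["dbus-daemon", "at-spi-bus-launcher"],
    "efl": [],  # EFL bots keep their session dbus-daemon alive
}


def tasks_to_kill(platform=None):
    """Return the process names to kill for the given platform (or None)."""
    return COMMON_TASKS + PLATFORM_TASKS.get(platform, [])


def parse_args(argv):
    """Parse --gtk / --efl into a single 'platform' value (None if absent)."""
    parser = argparse.ArgumentParser(description="Kill stale test processes.")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--gtk", dest="platform", action="store_const", const="gtk")
    group.add_argument("--efl", dest="platform", action="store_const", const="efl")
    return parser.parse_args(argv)
```

With this layout, tasks_to_kill(parse_args(["--gtk"]).platform) includes dbus-daemon, while the same call with ["--efl"] does not, so an EFL bot running inside a desktop session keeps its own daemon.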
Comment 13 Carlos Alberto Lopez Perez 2014-04-14 05:43:46 PDT
(In reply to comment #12)
> I agree with Zsolt: killing dbus-daemon is not a proper fix. The issue you
> raised on GTK should be found and fixed instead.
> 

I don't know if there is another possible fix for this issue other than killing all the dbus-daemon processes. Ideas?

> If you really want to kill all dbus-daemon processes, please improve
> kill-old-processes to accept --efl/--gtk/... options and kill dbus-daemon
> only on GTK. Then you can simply make the buildmaster pass the
> platform to the script by calling appendCustomBuildFlag() in master.cfg,
> similar to CompileWebKit or RunWebKitTests.

This looks like a good idea, I think we can do it.
Comment 14 Carlos Alberto Lopez Perez 2014-05-20 04:37:57 PDT
I have been investigating this issue, and my tests reveal that the rogue dbus-daemon processes created when running Tools/Scripts/run-gtk-tests are caused by bug https://bugs.webkit.org/show_bug.cgi?id=131675

I ran run-gtk-tests several times with and without the patch attached to https://bugs.webkit.org/show_bug.cgi?id=131675 and compared the dbus-daemon processes before and after each execution of run-gtk-tests.
I can confirm that with the patch from https://bugs.webkit.org/show_bug.cgi?id=131675 applied, the problem no longer shows up, and no extra dbus-daemon processes are left after the tests finish.
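
A before/after comparison of this kind can be sketched roughly as follows. The helper names parse_pids(), snapshot() and leaked() are hypothetical, and the sketch assumes a Linux procps `ps` that accepts `-eo pid=,comm=`; it is not the procedure actually used for the tests described above.

```python
import subprocess


def parse_pids(ps_output, name="dbus-daemon"):
    """Return the set of PIDs whose command name equals `name`.

    Expects lines of the form '<pid> <comm>', as printed by `ps -eo pid=,comm=`.
    """
    pids = set()
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[1].strip() == name:
            pids.add(int(fields[0]))
    return pids


def snapshot(name="dbus-daemon"):
    """Take a snapshot of the PIDs currently running under `name`."""
    out = subprocess.check_output(["ps", "-eo", "pid=,comm="], text=True)
    return parse_pids(out, name)


def leaked(before, after):
    """Return the sorted PIDs present in `after` but not in `before`."""
    return sorted(after - before)
```

Calling snapshot() before and after a run of run-gtk-tests and passing both results to leaked() lists the dbus-daemon processes that outlived the run; an empty list means nothing was left behind.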


The approach of killing all the dbus-daemon processes in the kill-old-processes step only hides the real problem. It also makes it very difficult to run the bot inside a GNOME session (which is convenient for a perf bot). So I suggest rolling out r166798 <http://trac.webkit.org/changeset/166798> and getting the patch on https://bugs.webkit.org/show_bug.cgi?id=131675 applied.
Comment 15 Carlos Alberto Lopez Perez 2014-05-23 07:42:16 PDT
(In reply to comment #14)
> The approach of killing all the dbus-daemon processes in the kill-old-processes step only hides the real problem. It also makes it very difficult to run the bot inside a GNOME session (which is convenient for a perf bot). So I suggest rolling out r166798 <http://trac.webkit.org/changeset/166798> and getting the patch on https://bugs.webkit.org/show_bug.cgi?id=131675 applied.

Created bug 133215 for rolling out r166798.

But before rolling out r166798, it would be great if we could get bug 131675 fixed.
Comment 16 Carlos Alberto Lopez Perez 2015-10-01 04:26:20 PDT
We have detected that this has started happening again.

There are lots of rogue at-spi-bus-launcher and dbus-daemon processes on the GTK+ bots.

There were about 600 processes like these on the test bots:

slave     9795  0.0  0.0  34120  1016 ?        Ss   Sep30   0:00 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
slave     9797  0.0  0.0 255720  3276 ?        Sl   Sep30   0:00 /home/slave/webkitgtk/gtk-linux-64-debug-tests/build/WebKitBuild/DependenciesGTK/Root/libexec/at-spi-bus-launcher
slave     9800  0.0  0.0  34120  1640 ?        S    Sep30   0:00  \_ /usr/bin/dbus-daemon --config-file=/home/slave/webkitgtk/gtk-linux-64-debug-tests/build/WebKitBuild/DependenciesGTK/Root/etc/at-spi2/accessibility.conf --nofork --print-address 3
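
For reference, a listing in this format can be tallied per executable with a short script. This is only a sketch: count_commands() is a hypothetical helper, and it assumes the 11-column layout shown above (as produced by `ps aux`, optionally with `\_` tree markers before child commands).

```python
import os
from collections import Counter


def count_commands(ps_lines):
    """Count processes per executable name from `ps aux`-style lines."""
    counts = Counter()
    for line in ps_lines:
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # not a process line
        command = fields[10].strip()
        if command.startswith("\\_ "):
            command = command[3:]  # drop the process-tree marker
        executable = command.split()[0]
        counts[os.path.basename(executable)] += 1
    return counts
```

Applied to the three lines above, it counts two dbus-daemon processes and one at-spi-bus-launcher.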
Comment 17 Carlos Alberto Lopez Perez 2016-01-26 06:32:12 PST
I have proposed in bug 153483 a patch to allow defining an extra list of tasks to kill on each bot. That way I can configure dbus-daemon and the related at-spi processes to be killed on the GTK test bots as needed.