I knew that we fixed something like this in the past, but I have just logged in to our bots and there were literally hundreds of dbus-daemon and at-spi processes running on the bot machines.
What we fixed was a typo in run-gtk-tests that was preventing the daemon from shutting down properly, so I thought that would be enough to fix the problem on the bots. Unfortunately, it was not, and it's quite hard for me to find the real issue and fix it because I can neither reproduce the problem locally nor ssh into the bots. Thus, if someone else with access to the bots could investigate this issue, that would be great. Of course, I'd be happy to help with it, but at this point I think some investigation on the actual bots would be the way to go.
Raising importance.
Created attachment 228604 [details] Patch
I can confirm the issue with the dbus-daemon processes, but not with at-spi ones. I have attached a patch for this.
Comment on attachment 228604 [details]
Patch

Clearing flags on attachment: 228604

Committed r166798: <http://trac.webkit.org/changeset/166798>
All reviewed patches have been landed. Closing bug.
Killing the dbus-daemon is not a good solution, since it locks the screen of the EFL performance bot. In that case every test will fail.
(In reply to comment #7)
> Killing the dbus-daemon is not a good solution, since it locks the screen of the EFL performance bot. In this case every test will fail.

How can that happen? You should be running the bot as an unprivileged user, so the bot can't kill the processes (dbus-daemon) of the system or of other users.
Every user has his/her own dbus-daemon, and the EFL performance bot kills the current user's daemon.
(In reply to comment #9)
> Every user has his/her own dbus-daemon, and the EFL performance bot kills the current user's daemon.

Does the EFL performance bot run as root? Otherwise I don't understand.
We do not run it as root. The buildbot killed its own dbus-daemon, not anyone else's. The dbus-daemon is necessary under GNOME and Unity as well.
I agree with Zsolt: killing dbus-daemon is not the proper fix. The issue you raised on GTK should be found and fixed instead. If you really want to kill all dbus-daemon processes, please improve kill-old-processes to accept --efl/--gtk/... options and kill dbus-daemon only on GTK. Then you can simply make the buildmaster pass the platform to the script by calling appendCustomBuildFlag() in master.cfg, similar to CompileWebKit or RunWebKitTests.
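For illustration, the platform option suggested above could look roughly like the sketch below. This is not the real kill-old-processes script: the task lists and the helper names `parse_platform` and `tasks_to_kill` are hypothetical, and the only thing taken from the comment is the idea that dbus-daemon is added to the kill list only when --gtk is passed.

```python
import argparse

# Hypothetical task lists for illustration; the real script's contents differ.
COMMON_TASKS = ["WebKitTestRunner", "DumpRenderTree"]
PLATFORM_TASKS = {
    "gtk": ["dbus-daemon"],  # only GTK bots get dbus-daemon killed
    "efl": [],               # EFL bots keep their session daemon alive
}


def parse_platform(argv):
    """Return 'gtk', 'efl', or None depending on the flag passed."""
    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--gtk", dest="platform", action="store_const", const="gtk")
    group.add_argument("--efl", dest="platform", action="store_const", const="efl")
    return parser.parse_args(argv).platform


def tasks_to_kill(platform):
    """Common tasks plus any platform-specific ones."""
    return COMMON_TASKS + PLATFORM_TASKS.get(platform, [])
```

With this shape, the buildmaster only has to append the platform flag to the command line, and a bot that passes no flag (or --efl) never touches dbus-daemon.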
(In reply to comment #12)
> I agree with Zsolt, killing dbus-daemon is not the proper fix
> instead of finding and fixing the issue you raised on GTK.

I don't know if there is another possible fix for this issue other than killing all the dbus-daemon processes. Ideas?

> If you really want to kill all dbus-daemon processes, please improve
> kill-old-processes to receive --efl/--gtk/... options and kill dbus-daemon
> only on GTK. And then you can simply make the buildmaster pass the
> platform to the script by calling appendCustomBuildFlag() in master.cfg
> similar to CompileWebKit or RunWebKitTests.

This looks like a good idea, I think we can do it.
I have been investigating this issue, and my tests reveal that the rogue dbus-daemon processes left behind by Tools/Scripts/run-gtk-tests are caused by bug https://bugs.webkit.org/show_bug.cgi?id=131675

I ran run-gtk-tests several times with and without the patch attached to https://bugs.webkit.org/show_bug.cgi?id=131675 and compared the dbus-daemon processes before and after each run. I can confirm that with the patch on https://bugs.webkit.org/show_bug.cgi?id=131675 this problem no longer shows up, and no extra dbus-daemon processes are left after the tests finish.

The approach of killing all the dbus-daemon processes in the kill-old-processes step only hides the real problem. It also makes it very difficult to run the bot inside a GNOME session (which is convenient for a perf bot).

So I suggest rolling out r166798 <http://trac.webkit.org/changeset/166798> and getting the patch on https://bugs.webkit.org/show_bug.cgi?id=131675 applied.
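The before/after comparison described above could be scripted roughly as follows. This is a minimal sketch, not the procedure actually used in the investigation: the helper names `parse_ps`, `snapshot`, and `leaked` are made up for illustration, and it assumes a `ps` that accepts the POSIX `-e -o pid= -o args=` options.

```python
import os
import subprocess


def parse_ps(output, name):
    """Return the PIDs in `ps -o pid= -o args=` output whose executable is `name`."""
    pids = set()
    for line in output.splitlines():
        fields = line.split(None, 1)
        if len(fields) < 2:
            continue  # blank line or a process without a command line
        pid, args = fields
        executable = args.split()[0]
        if os.path.basename(executable) == name:
            pids.add(int(pid))
    return pids


def snapshot(name="dbus-daemon"):
    """PIDs of all currently running processes named `name`."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out, name)


def leaked(before, after):
    """PIDs present after the run that were not there before it."""
    return sorted(after - before)


# Intended use (assumption: run from a WebKit checkout):
#   before = snapshot()
#   subprocess.run(["Tools/Scripts/run-gtk-tests"])
#   print(leaked(before, snapshot()))
```

An empty list after the run would mean that no new dbus-daemon process was left behind by that run.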
(In reply to comment #14)
> The approach of killing all the dbus-daemon processes in the kill-old-processes step only hides the real problem. It also makes it very difficult to run the bot inside a GNOME session (which is convenient for a perf bot). So I suggest rolling out r166798 <http://trac.webkit.org/changeset/166798> and getting the patch on https://bugs.webkit.org/show_bug.cgi?id=131675 applied.

Created bug 133215 for rolling out r166798. But before rolling out r166798, it would be great if we could get bug 131675 fixed.
We have detected that this has started to happen again. There are lots of rogue at-spi-bus-launcher and dbus-daemon processes on the GTK+ bots. There were about 600 processes like these on the test bots:

slave 9795 0.0 0.0 34120 1016 ? Ss Sep30 0:00 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
slave 9797 0.0 0.0 255720 3276 ? Sl Sep30 0:00 /home/slave/webkitgtk/gtk-linux-64-debug-tests/build/WebKitBuild/DependenciesGTK/Root/libexec/at-spi-bus-launcher
slave 9800 0.0 0.0 34120 1640 ? S Sep30 0:00 \_ /usr/bin/dbus-daemon --config-file=/home/slave/webkitgtk/gtk-linux-64-debug-tests/build/WebKitBuild/DependenciesGTK/Root/etc/at-spi2/accessibility.conf --nofork --print-address 3
In bug 153483 I have proposed a patch that allows defining an extra list of tasks to kill on each bot. That way I can configure dbus-daemon and the related at-spi processes to be killed on the GTK test bots as needed.
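As a rough idea of what a per-bot extra kill list might look like, here is a minimal sketch. It is not the patch from bug 153483: the environment variable name `EXTRA_TASKS_TO_KILL`, the default task list, and the function names are all assumptions chosen for illustration.

```python
import os

# Hypothetical default list; the real script's defaults differ.
DEFAULT_TASKS = ["WebKitTestRunner"]


def extra_tasks(environ):
    """Parse a comma-separated list of extra task names from the environment."""
    raw = environ.get("EXTRA_TASKS_TO_KILL", "")  # variable name is an assumption
    return [task.strip() for task in raw.split(",") if task.strip()]


def all_tasks(environ=os.environ):
    """Default tasks plus whatever this particular bot has configured."""
    return DEFAULT_TASKS + extra_tasks(environ)
```

A GTK test bot could then set the variable to `dbus-daemon,at-spi-bus-launcher`, while bots that need their session daemon leave it unset.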
It seems that this is no longer an issue. Let's reopen this if it happens again.