Bug 137725
| Summary: | layout test on EFL buildbot is too often broken | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Gyuyoung Kim <gyuyoung.kim> |
| Component: | Tools / Tests | Assignee: | Nobody <webkit-unassigned> |
| Status: | RESOLVED FIXED | ||
| Severity: | Normal | CC: | ap, ossy |
| Priority: | P2 | ||
| Version: | 528+ (Nightly build) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
Gyuyoung Kim
The layout tests keep breaking even though I restart the buildbot very often. It looks like the Apache server too often ends up locked after the layout tests have run many times.
We need to fix this problem!
16:53:14.728 14762 Using port 'efl'
16:53:14.728 14762 Test configuration: <, x86, release>
16:53:14.728 14762 Placing test results in /home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/layout-test-results
16:53:14.728 14762 Baseline search path: efl -> wk2 -> generic
16:53:14.728 14762 Using Release build
16:53:14.728 14762 Pixel tests disabled
16:53:14.728 14762 Regular timeout: 35000, slow test timeout: 175000
16:53:14.765 14762 "perl Tools/Scripts/webkit-build-directory --configuration --release --efl" took 0.04s
16:53:14.765 14762 Command line: /home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/jhbuild/jhbuild-wrapper --efl run /home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/WebKitBuild/Release/bin/WebKitTestRunner -
16:53:14.765 14762
16:53:14.765 14762 Collecting tests ...
16:53:16.245 14762 Parsing expectations ...
16:53:23.550 14762 Found 38184 tests; running 30186, skipping 7998.
16:53:23.550 14762 Checking build ...
16:53:23.600 14762 "Tools/Scripts/build-dumprendertree --release --efl" took 0.04s
16:53:23.600 14762 Output of ['Tools/Scripts/build-dumprendertree', '--release', '--efl']:
16:53:23.646 14762 "Tools/Scripts/build-webkittestrunner --release --efl" took 0.05s
16:53:23.646 14762 Output of ['Tools/Scripts/build-webkittestrunner', '--release', '--efl']:
16:53:23.647 14762 Starting helper ...
16:53:23.647 14762 Checking system dependencies ...
16:53:23.687 14762 "/usr/sbin/apache2 -v" took 0.02s
16:53:23.767 14762 "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/jhbuild/jhbuild-wrapper --efl run which Xvfb" took 0.08s
16:53:23.832 14762 Expect: 29170 passes (29170 now, 0 wontfix)
16:53:23.833 14762 Expect: 659 failures ( 658 now, 1 wontfix)
16:53:23.833 14762 Expect: 357 flaky ( 357 now, 0 wontfix)
16:53:23.833 14762
16:53:23.920 14762 Sharding tests ...
16:53:23.936 14762 Acquiring http lock ...
16:53:23.937 14762 Creating lock file: /tmp/WebKitHttpd.lock.34
16:53:23.939 14762 Retrieving current lock pid from /tmp/WebKitHttpd.lock.33
16:53:23.939 14762 Checking current lock on pid 12032
16:53:23.939 14762 Removing stuck lock file: /tmp/WebKitHttpd.lock.33
16:53:24.943 14762 Retrieving current lock pid from /tmp/WebKitHttpd.lock.34
16:53:24.944 14762 Checking current lock on pid 14762
16:53:24.944 14762 HTTP lock acquired
16:53:24.944 14762 Starting HTTP server ...
16:53:24.986 14762 "/usr/sbin/apache2 -v" took 0.04s
16:53:25.019 14762 "/usr/sbin/apache2 -v" took 0.03s
16:53:25.020 14762 Starting httpd server, cmd="/usr/sbin/apache2 -f "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/layout-test-results/httpd.conf" -C 'DocumentRoot "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/LayoutTests/http/tests"' -c 'Alias /js-test-resources "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/LayoutTests/resources"' -c 'Alias /media-resources "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/LayoutTests/media"' -c 'TypesConfig "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/LayoutTests/http/conf/mime.types"' -c 'CustomLog "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/layout-test-results/access_log.txt" common' -c 'ErrorLog "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/layout-test-results/error_log.txt"' -C 'User "buildbot"' -c 'PidFile /tmp/WebKit/httpd.pid' -k start -C 'Listen 127.0.0.1:8000' -C 'Listen [::1]:8000' -C 'Listen 127.0.0.1:8080' -C 'Listen [::1]:8080' -C 'Listen 127.0.0.1:8443' -C 'Listen [::1]:8443' -c 'StartServers 2' -c 'MinSpareServers 2' -c 'MaxSpareServers 2' -c 'SSLCertificateFile /home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/LayoutTests/http/conf/webkit-httpd.pem'"
16:53:25.041 14762 Waiting for action: <function <lambda> at 0x7f9fb4369848>
16:53:26.043 14762 Server isn't running at all
16:53:26.043 14762 Flushing stdout
16:53:26.043 14762 Flushing stderr
16:53:26.043 14762 Stopping helper
16:53:26.043 14762 Cleaning up port
ServerError raised: Server exited
Traceback (most recent call last):
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py", line 80, in main
run_details = run(port, options, args, stderr)
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py", line 419, in run
run_details = manager.run(args)
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py", line 200, in run
int(self._options.child_processes), retrying=False)
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py", line 257, in _run_tests
return self._runner.run_tests(self._expectations, test_inputs, tests_to_skip, num_workers, needs_http, needs_websockets, retrying)
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/layout_tests/controllers/layout_test_runner.py", line 120, in run_tests
self.start_servers_with_lock(2 * min(num_workers, len(locked_shards)))
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/layout_tests/controllers/layout_test_runner.py", line 205, in start_servers_with_lock
self._port.start_http_server(number_of_servers=number_of_servers)
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/port/base.py", line 888, in start_http_server
server.start()
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/layout_tests/servers/http_server_base.py", line 92, in start
if self._wait_for_action(self._is_server_running_on_all_ports):
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/layout_tests/servers/http_server_base.py", line 174, in _wait_for_action
if action():
File "/home/buildslave/efl-buildslave-2/efl-linux-64-release-wk2/build/Tools/Scripts/webkitpy/layout_tests/servers/http_server_base.py", line 185, in _is_server_running_on_all_ports
raise ServerError("Server exited")
ServerError: Server exited
program finished with exit code 254
elapsedTime=11.567065
Csaba Osztrogonác
It fails again. I checked the apache log on the bot:
https://build.webkit.org/results/EFL%20Linux%2064-bit%20Release%20WK2/r175842%20%2817555%29/error_log.txt
[Mon Nov 10 17:27:53.699810 2014] [core:crit] [pid 22702] (28)No space left on device: AH00001: unable to create or access scoreboard "/tmp/WebKit/httpd.scoreboard" (name-based shared memory failure)
It seems the free space is leaking on the bot.
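For anyone repeating this check on another bot, here is a minimal Python sketch that pulls the severe entries and their OS error numbers out of an Apache error log. The regular expression is inferred from the single log line quoted above, and the names `ENTRY` and `severe_entries` are hypothetical helpers, not part of webkitpy or Apache.

```python
import re

# Matches Apache 2.4-style error log lines that carry an OS error number, e.g.
# [date] [core:crit] [pid 22702] (28)No space left on device: AH00001: ...
# The pattern is an assumption based on the one line quoted in this bug.
ENTRY = re.compile(
    r"\[(?P<module>\w+):(?P<level>\w+)\] "
    r"\[pid (?P<pid>\d+)[^\]]*\] "
    r"\((?P<errno>\d+)\)(?P<message>.*)"
)


def severe_entries(log_text, levels=("crit", "alert", "emerg")):
    """Return parsed entries whose log level is one of `levels`."""
    found = []
    for line in log_text.splitlines():
        match = ENTRY.search(line)
        if match and match.group("level") in levels:
            found.append({
                "module": match.group("module"),
                "pid": int(match.group("pid")),
                "errno": int(match.group("errno")),
                "message": match.group("message").strip(),
            })
    return found
```

Feeding it the contents of error_log.txt should surface the errno 28 entry without reading the whole file by eye.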
Gyuyoung Kim
(In reply to comment #1)
> It fails again. I checked the apache log on the bot:
> https://build.webkit.org/results/EFL%20Linux%2064-bit%20Release%20WK2/
> r175842%20%2817555%29/error_log.txt
>
> [Mon Nov 10 17:27:53.699810 2014] [core:crit] [pid 22702] (28)No space left
> on device: AH00001: unable to create or access scoreboard
> "/tmp/WebKit/httpd.scoreboard" (name-based shared memory failure)
>
> It seems the free space is leaking on the bot.
I found an article that deals with this problem. It looks like this problem can occur when shared memory isn't freed up.
http://www.kattare.com/docs/faq_view/702/apache-file-exists-unable-to-create-scoreboard-name-based-shared-memory-failure.html
However, I don't know how to fix this without stepping in manually every time it happens.
Csaba Osztrogonác
I remember that we had a similar, though not identical, problem long ago
in the QtWebKit era, with leaking semaphores and shared memory.
It wasn't related to Apache, but to a buggy IPC implementation.
Until the proper fix, we had a magic script to clean up
the leftovers regularly with a cron job:
ipcs -m| awk '{ print $2 }'|xargs ipcrm shm >/dev/null
ipcs -s| awk '{ print $2 }'|xargs ipcrm sem >/dev/null
I'm not sure it is entirely safe, but it worked. I can't
remember it ever removing anything necessary.
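A more conservative variant of that cleanup could select only segments that belong to the bot user and have no attached processes, instead of everything `ipcs -m` prints. The Python sketch below assumes the usual Linux (util-linux) `ipcs -m` column order of key, shmid, owner, perms, bytes, nattch, status. The function names and the `buildbot` user in the usage note are illustrative assumptions, not an existing tool.

```python
import subprocess


def stale_shm_ids(ipcs_output, owner):
    """Return shmids owned by `owner` that no process is attached to."""
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows look like: key shmid owner perms bytes nattch [status].
        # Banner and header rows do not start with a hex key, so skip them.
        if len(fields) < 6 or not fields[0].startswith("0x"):
            continue
        shmid, seg_owner, nattch = fields[1], fields[2], fields[5]
        if seg_owner == owner and nattch == "0":
            ids.append(shmid)
    return ids


def remove_segments(shm_ids, dry_run=True):
    """Print, and optionally run, one `ipcrm -m <shmid>` per segment."""
    for shmid in shm_ids:
        command = ["ipcrm", "-m", shmid]
        print(" ".join(command))
        if not dry_run:
            subprocess.call(command)
```

A cron job could then call `remove_segments(stale_shm_ids(output, "buildbot"))`, where `output` is the text captured from running `ipcs -m`, and pass `dry_run=False` once the printed commands look right. Semaphores would need a similar filter over `ipcs -s`, which this sketch does not cover.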
Gyuyoung Kim
(In reply to comment #3)
> I remember that we have not the same, but a similar problem long long
> ago in the QtWebKit era with leaking semaphores and shared memory.
> It wasn't related to apache, but buggy IPC implementation.
>
> Until the proper fix, we had a magic script to clean up
> trashes regularly with a cron job:
> ipcs -m| awk '{ print $2 }'|xargs ipcrm shm >/dev/null
> ipcs -s| awk '{ print $2 }'|xargs ipcrm sem >/dev/null
>
> I'm not sure it is entirely safe, but it worked. I can't
> remember it ever removing anything necessary.
Ossy, I increased the shared memory limit on the EFL buildbot yesterday. In /etc/sysctl.conf, I set "kernel.shmmax" to "4294967296" (4GB).
Since then, the layout tests on the EFL buildbot have looked fine. Let's see whether this change solves the issue.
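To confirm that the new limit is actually in effect (changes to /etc/sysctl.conf only apply after a reboot or `sysctl -p`), the live values can be read back from /proc. This is a small Python sketch. `read_shm_limits` is a hypothetical helper, and the `proc_dir` parameter exists only so it can be pointed at a test directory. As far as the shmget(2) man page describes, ENOSPC is tied to the `shmmni` and `shmall` limits rather than `shmmax`, so the sketch reports all three.

```python
import os


def read_shm_limits(proc_dir="/proc/sys/kernel"):
    """Return the current System V shared memory limits as integers.

    A limit maps to None when its file is missing or unreadable.
    """
    limits = {}
    for name in ("shmmax", "shmall", "shmmni"):
        path = os.path.join(proc_dir, name)
        try:
            with open(path) as handle:
                limits[name] = int(handle.read().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits
```

On the bot, `read_shm_limits()["shmmax"]` should come back as 4294967296 if the setting above has been applied.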
Gyuyoung Kim
(In reply to comment #3)
> I remember that we have not the same, but a similar problem long long
> ago in the QtWebKit era with leaking semaphores and shared memory.
> It wasn't related to apache, but buggy IPC implementation.
>
> Until the proper fix, we had a magic script to clean up
> trashes regularly with a cron job:
> ipcs -m| awk '{ print $2 }'|xargs ipcrm shm >/dev/null
> ipcs -s| awk '{ print $2 }'|xargs ipcrm sem >/dev/null
Ossy, I set up a crontab using the above commands for now. Thanks.
Gyuyoung Kim
(In reply to comment #5)
> (In reply to comment #3)
> > I remember that we have not the same, but a similar problem long long
> > ago in the QtWebKit era with leaking semaphores and shared memory.
> > It wasn't related to apache, but buggy IPC implementation.
> >
> > Until the proper fix, we had a magic script to clean up
> > trashes regularly with a cron job:
> > ipcs -m| awk '{ print $2 }'|xargs ipcrm shm >/dev/null
> > ipcs -s| awk '{ print $2 }'|xargs ipcrm sem >/dev/null
>
> Ossy, I set up a crontab using the above commands for now. Thanks.
This workaround seems to work well! Thanks.