RESOLVED FIXED Bug 159166
LayoutTests intermittently failing to run due to issues starting Web Platform Test server
https://bugs.webkit.org/show_bug.cgi?id=159166
Summary LayoutTests intermittently failing to run due to issues starting Web Platform...
Ryan Haddad
Reported 2016-06-27 13:52:43 PDT
This has been happening intermittently across Mac testers for the past 2 days or so. https://build.webkit.org/builders/Apple%20El%20Capitan%20Release%20WK2%20(Tests)/builds/7160 https://build.webkit.org/builders/Apple%20Yosemite%20Release%20WK1%20(Tests)/builds/15787/ 12:52:04.076 91713 Starting Web Platform Test server ... 12:52:04.083 91713 Copying WebKit resources files 12:52:04.084 91713 Copying WebKit web platform server config.json 12:52:05.096 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:05.096 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:06.104 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:06.104 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:07.112 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:07.112 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:08.114 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:08.114 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:09.123 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:09.123 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:10.129 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:10.129 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:11.141 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:11.141 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:12.146 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:12.146 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:13.158 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:13.158 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:14.162 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:14.162 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:15.165 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:15.165 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:16.177 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:16.177 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:17.179 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:17.179 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:18.181 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:18.181 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:19.193 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:19.194 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:20.196 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:20.197 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:21.208 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:21.208 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:22.212 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:22.212 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:23.215 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:23.216 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:24.226 91713 Server NOT running on 49001: [Errno 61] Connection refused 12:52:24.226 91713 Waiting for action: <bound method WebPlatformTestServer._is_server_running_on_all_ports of <webkitpy.layout_tests.servers.web_platform_test_server.WebPlatformTestServer object at 0x111e34890>> 12:52:25.231 91713 Stopping wptwk server 12:52:25.231 91713 Cleaning WPT resources files 12:52:25.233 91713 Cleaning WPT web platform server config.json 12:52:25.234 91713 Flushing stdout 12:52:25.234 91713 Flushing stderr 12:52:25.234 91713 Stopping helper 12:52:25.234 91713 Stopping LayoutTestHelper 12:52:25.240 91713 Cleaning up port ServerError raised: Failed to start wptwk server Traceback (most recent call last): File "/Volumes/Data/slave/elcapitan-release-tests-wk2/build/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py", line 77, in main run_details = run(port, options, args, stderr) File "/Volumes/Data/slave/elcapitan-release-tests-wk2/build/Tools/Scripts/webkitpy/layout_tests/run_webkit_tests.py", line 409, in run run_details = manager.run(args) File "/Volumes/Data/slave/elcapitan-release-tests-wk2/build/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py", line 203, in run int(self._options.child_processes), retrying=False) File "/Volumes/Data/slave/elcapitan-release-tests-wk2/build/Tools/Scripts/webkitpy/layout_tests/controllers/manager.py", line 262, in _run_tests return self._runner.run_tests(self._expectations, test_inputs, tests_to_skip, num_workers, needs_http, needs_websockets, needs_web_platform_test_server, retrying) File "/Volumes/Data/slave/elcapitan-release-tests-wk2/build/Tools/Scripts/webkitpy/layout_tests/controllers/layout_test_runner.py", line 111, in run_tests self.start_servers() File "/Volumes/Data/slave/elcapitan-release-tests-wk2/build/Tools/Scripts/webkitpy/layout_tests/controllers/layout_test_runner.py", line 199, in start_servers self._port.start_web_platform_test_server() File "/Volumes/Data/slave/elcapitan-release-tests-wk2/build/Tools/Scripts/webkitpy/port/base.py", line 952, in start_web_platform_test_server self._web_platform_test_server.start() File "/Volumes/Data/slave/elcapitan-release-tests-wk2/build/Tools/Scripts/webkitpy/layout_tests/servers/http_server_base.py", line 95, in start raise ServerError('Failed to start %s server' % self._name) ServerError: Failed to start wptwk server
Attachments
Removing ws port from config (1.24 KB, patch)
2016-06-28 01:36 PDT, youenn fablet
no flags
Patch (11.08 KB, patch)
2016-06-30 00:19 PDT, youenn fablet
no flags
Ryan Haddad
Comment 1 2016-06-27 13:53:54 PDT
Not sure if this could be related to <https://trac.webkit.org/changeset/202471>
youenn fablet
Comment 2 2016-06-28 01:35:38 PDT
I am not sure either but it also happens on iOS sim EWS bots, this is rather annoying :( It seems the web socket server is not able to start properly. https://trac.webkit.org/changeset/202471 is not supposed to change the web socket server implementation so I am not sure where that comes from. Do you know where I can find the full wpt_wk log?
youenn fablet
Comment 3 2016-06-28 01:36:40 PDT
Created attachment 282223 [details] Removing ws port from config
Radar WebKit Bug Importer
Comment 4 2016-06-28 02:50:38 PDT
Alexey Proskuryakov
Comment 5 2016-06-28 10:02:03 PDT
You can get to the logs by going to test results, and removing "results.html" from the URL, e.g. <https://build.webkit.org/results/Apple%20Yosemite%20Release%20WK1%20(Tests)/r202508%20(15787)/>. The proposed patch doesn't seem related to this error.
Alexey Proskuryakov
Comment 6 2016-06-29 20:57:01 PDT
This is getting very frequent, and a real problem. Youenn, could you take another look?
Alexey Proskuryakov
Comment 7 2016-06-29 20:58:43 PDT
E.g. in this patch, two EWS bubbles are orange at once because the bots can't start wtpserve: <https://bugs.webkit.org/show_bug.cgi?id=159287>.
youenn fablet
Comment 8 2016-06-29 23:53:36 PDT
(In reply to comment #7) > E.g. in this patch, two EWS bubbles are orange at once because the bots > can't start wtpserve: <https://bugs.webkit.org/show_bug.cgi?id=159287>. From the error log, it seems bots have difficulties generating certificates from openssl. We can try regenerating the certificates and I will prepare a patch accordingly. But there may be an underlying issue...
youenn fablet
Comment 9 2016-06-30 00:19:30 PDT
youenn fablet
Comment 10 2016-06-30 00:21:05 PDT
(In reply to comment #9) > Created attachment 282416 [details] > Patch Hopefully this should work. I think there may be a bug in some of openssl wpt scripts.
youenn fablet
Comment 11 2016-06-30 00:22:00 PDT
youenn fablet
Comment 12 2016-06-30 00:26:31 PDT
youenn fablet
Comment 13 2016-06-30 07:36:58 PDT
(In reply to comment #12) > Raised issue here: https://github.com/w3c/wpt-tools/issues/86 OK, the issue is already fixed in w3c repo. Instead of this patch, we could reimport the latest wpt repo.
Alexey Proskuryakov
Comment 14 2016-06-30 08:51:54 PDT
Comment on attachment 282416 [details] Patch I don't quite understand this change, nor whether it's practical to reimport quickly. Does this prevent re-generating the certificates on every run, and thus sidestep the problem in automatic generation? Either way, r=me, as we need to resolve this ASAP.
youenn fablet
Comment 15 2016-06-30 08:56:13 PDT
> I don't quite understand this change, nor whether it's practical to reimport > quickly. Reimporting would fix the bug without us having to change our current setup. > Does this prevent re-generating the certificates on every run, and thus > sidestep the problem in automatic generation? You are correct. I don't think serial is used in wpt except for generating certificates when launching servers, hence why the problem should not appear anymore.
WebKit Commit Bot
Comment 16 2016-06-30 09:15:08 PDT
Comment on attachment 282416 [details] Patch Clearing flags on attachment: 282416 Committed r202681: <http://trac.webkit.org/changeset/202681>
WebKit Commit Bot
Comment 17 2016-06-30 09:15:13 PDT
All reviewed patches have been landed. Closing bug.
Chris Dumez
Comment 18 2016-08-01 13:17:27 PDT
Comment on attachment 282416 [details] Patch After this change, LayoutTests/imported/w3c/resources/_wpt_certs/* keep getting overridden locally and showing up in my local diff / patches. This is very annoying.
youenn fablet
Comment 19 2016-08-01 13:31:09 PDT
Ah, it might be due to recent changes to wpt python scripts. We can go back to the previous strategy since not-closed-handle bug is now fixed.
Alexey Proskuryakov
Comment 21 2016-11-15 16:45:21 PST
I looked at one of the bots hitting this, and it has lots of connections that are not properly closed ("sunwebadmin" is port 8800): tcp4 0 0 localhost.sunwebadmin localhost.62045 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62049 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62048 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.61980 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.61984 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.61982 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.61986 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.61988 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.61990 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62051 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62055 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62054 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62057 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62059 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62061 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62063 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62067 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62066 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62069 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62073 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62072 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62075 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62079 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62078 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62081 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62086 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62087 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62085 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62093 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62097 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62096 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62099 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62102 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62103 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62105 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62108 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62109 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62111 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62114 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62115 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62117 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62121 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62120 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62123 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62127 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62126 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62129 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62133 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62132 TIME_WAIT tcp4 0 0 localhost.sunwebadmin localhost.62135 TIME_WAIT
Alexey Proskuryakov
Comment 22 2016-11-15 16:48:08 PST
(over 4000 connections in TIMED_WAIT on that bot)
youenn fablet
Comment 23 2016-11-16 08:37:28 PST
(In reply to comment #22) > (over 4000 connections in TIMED_WAIT on that bot) I run 50 times LayoutTests/imported/w3c/web-platform-tests/fetch/api/request/request-structure.html which has two sub-resources. We end up with 151 TIME_WAIT sockets. Looking at the server, it seems to support HTTP keep-alive and socket reuse
youenn fablet
Comment 24 2016-11-16 08:43:36 PST
(In reply to comment #23) > (In reply to comment #22) > > (over 4000 connections in TIMED_WAIT on that bot) > > I run 50 times > LayoutTests/imported/w3c/web-platform-tests/fetch/api/request/request- > structure.html which has two sub-resources. > We end up with 151 TIME_WAIT sockets. > > Looking at the server, it seems to support HTTP keep-alive and socket reuse But most responses do not have content-length headers
Note You need to log in before you can comment on or make changes to this bug.