WebKit Bugzilla
RESOLVED CONFIGURATION CHANGED
182420
EWS bots hitting network issues
https://bugs.webkit.org/show_bug.cgi?id=182420
Aakash Jain
Reported
2018-02-01 22:38:02 PST
Patch 332939 (on https://bugs.webkit.org/show_bug.cgi?id=182036) got stuck on commit-queue (bot webkit-cq-02):
https://webkit-queues.webkit.org/patch/332939/commit-queue
webkit-cq-02 had the following error in its logs:

2018-02-01 21:58:28,594 - Fetching: https://bugs.webkit.org/attachment.cgi?id=332939&action=edit
2018-02-01 21:58:29,005 - Fetching: https://bugs.webkit.org/show_bug.cgi?id=182036&ctype=xml&excludefield=attachmentdata
2018-02-01 21:58:29,303 - Running: webkit-patch --status-host=webkit-queues.webkit.org --bot-id=webkit-cq-02 update --port=mac
2018-02-01 21:58:33,932 - Updated working directory
Traceback (most recent call last):
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/bot/queueengine.py", line 103, in run
    if not self._delegate.process_work_item(work_item):
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/commands/queues.py", line 340, in process_work_item
    if task.run():
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/bot/commitqueuetask.py", line 77, in run
    if not self._update():
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/bot/patchanalysistask.py", line 121, in _update
    "Unable to update working directory")
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/bot/patchanalysistask.py", line 101, in _run_command
    self._delegate.command_passed(success_message, patch=self._patch)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/commands/queues.py", line 383, in command_passed
    self._update_status(message, patch=patch)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/commands/queues.py", line 210, in _update_status
    return self._tool.status_server.update_status(self.name, message, patch, results_file)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py", line 160, in update_status
    return NetworkTransaction().run(lambda: self._post_status_to_server(queue_name, status, patch, results_file))
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/networktransaction.py", line 53, in run
    return request()
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py", line 160, in <lambda>
    return NetworkTransaction().run(lambda: self._post_status_to_server(queue_name, status, patch, results_file))
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py", line 85, in _post_status_to_server
    self._browser.open(update_status_url)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_mechanize.py", line 230, in _mech_open
    response = UserAgentBase.open(self, request, data)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_opener.py", line 193, in open
    response = urlopen(self, req, data)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_urllib2_fork.py", line 344, in _open
    '_open', req)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_urllib2_fork.py", line 332, in _call_chain
    result = func(*args)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_urllib2_fork.py", line 1142, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_urllib2_fork.py", line 1118, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 60] Operation timed out>
2018-02-01 21:59:48,953 - Exception while preparing queue
Sleeping until 2018-02-01 22:01:48 (120 seconds).
2018-02-01 22:01:48,963 - Fetching next work item for commit-queue
Traceback (most recent call last):
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/bot/queueengine.py", line 97, in run
    work_item = self._delegate.next_work_item()
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/commands/queues.py", line 334, in next_work_item
    return self._next_patch()
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/commands/queues.py", line 217, in _next_patch
    patch_id = self._tool.status_server.next_work_item(self.name)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py", line 128, in next_work_item
    return self._fetch_url(next_patch_url)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py", line 169, in _fetch_url
    return urllib2.urlopen(url, timeout=300).read()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
    response = self._open(req, data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
    '_open', req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1227, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1197, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 60] Operation timed out>
2018-02-01 22:03:04,001 - Exception while preparing queue
Sleeping until 2018-02-01 22:05:04 (120 seconds).
2018-02-01 22:05:04,002 - Fetching next work item for commit-queue
2018-02-01 22:05:04,203 - No work item.
Sleeping until 2018-02-01 22:07:04 (120 seconds).
2018-02-01 22:07:04,204 - Delegate terminated queue.
Tim Horton
Comment 1
2018-02-03 15:45:48 PST
I just got the same thing on https://bugs.webkit.org/show_bug.cgi?id=182460
Aakash Jain
Comment 2
2018-02-03 17:55:44 PST
Workaround: if a bot is stuck processing a patch, unlock the patch at https://webkit-queues.webkit.org/release-lock (by default, the patch is unlocked automatically after 2 hours).
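
To illustrate why a stuck bot blocks a patch until the lock is released or times out, here is a minimal sketch of a per-patch lock with expiry. It is a simplified in-memory model, not the webkit-queues implementation: the class `PatchLocks`, its methods, and the injected `clock` are hypothetical names; only the 2-hour default comes from the comment above.

```python
import time

LOCK_TIMEOUT_SECONDS = 2 * 60 * 60  # the 2-hour default mentioned above


class PatchLocks:
    """Hypothetical in-memory model of per-patch locks that expire."""

    def __init__(self, clock=time.time):
        self._clock = clock    # injectable so the expiry can be tested
        self._locked_at = {}   # patch id -> time the lock was taken

    def try_lock(self, patch_id):
        """Return True if the caller now holds the lock for patch_id."""
        taken = self._locked_at.get(patch_id)
        if taken is not None and self._clock() - taken < LOCK_TIMEOUT_SECONDS:
            return False  # another bot still holds a live lock
        self._locked_at[patch_id] = self._clock()
        return True

    def release(self, patch_id):
        """Manual unlock, analogous to using the release-lock page."""
        self._locked_at.pop(patch_id, None)
```

In this model, a bot that dies after `try_lock` never calls `release`, so other bots skip the patch until either someone releases it manually or the timeout passes.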
Aakash Jain
Comment 3
2018-02-05 07:59:05 PST
Happened again with https://webkit-queues.webkit.org/patch/333081/commit-queue. I unlocked the patch so that the bot could pick it up again. Logs:

2018-02-05 07:50:50,621 - Fetching: https://bugs.webkit.org/attachment.cgi?id=333081&action=edit
2018-02-05 07:50:50,902 - Fetching: https://bugs.webkit.org/show_bug.cgi?id=179743&ctype=xml&excludefield=attachmentdata
2018-02-05 07:50:51,232 - Running: webkit-patch --status-host=webkit-queues.webkit.org --bot-id=webkit-cq-02 apply-attachment --no-update --non-interactive 333081 --port=mac
2018-02-05 07:50:56,176 - Applied patch
Traceback (most recent call last):
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/bot/queueengine.py", line 103, in run
    if not self._delegate.process_work_item(work_item):
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/commands/queues.py", line 340, in process_work_item
    if task.run():
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/bot/commitqueuetask.py", line 79, in run
    if not self._apply():
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/bot/patchanalysistask.py", line 131, in _apply
    "Patch does not apply")
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/bot/patchanalysistask.py", line 101, in _run_command
    self._delegate.command_passed(success_message, patch=self._patch)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/commands/queues.py", line 383, in command_passed
    self._update_status(message, patch=patch)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/tool/commands/queues.py", line 210, in _update_status
    return self._tool.status_server.update_status(self.name, message, patch, results_file)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py", line 160, in update_status
    return NetworkTransaction().run(lambda: self._post_status_to_server(queue_name, status, patch, results_file))
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/networktransaction.py", line 53, in run
    return request()
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py", line 160, in <lambda>
    return NetworkTransaction().run(lambda: self._post_status_to_server(queue_name, status, patch, results_file))
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/common/net/statusserver.py", line 85, in _post_status_to_server
    self._browser.open(update_status_url)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_mechanize.py", line 230, in _mech_open
    response = UserAgentBase.open(self, request, data)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_opener.py", line 193, in open
    response = urlopen(self, req, data)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_urllib2_fork.py", line 344, in _open
    '_open', req)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_urllib2_fork.py", line 332, in _call_chain
    result = func(*args)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_urllib2_fork.py", line 1142, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/Volumes/Data/EWS/WebKit/Tools/Scripts/webkitpy/thirdparty/autoinstalled/mechanize/_urllib2_fork.py", line 1118, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 60] Operation timed out>
2018-02-05 07:52:11,198 - Exception while preparing queue
Sleeping until 2018-02-05 07:54:11 (120 seconds).
Aakash Jain
Comment 4
2018-02-05 07:59:46 PST
This seems to be happening frequently, and it completely breaks commit-queue. Raising to P1. We need to investigate ASAP.
Aakash Jain
Comment 5
2018-02-05 08:09:04 PST
We can consider adding a retry when posting status to the webkit-queues server fails. We should also figure out why this network issue has been happening so frequently lately.
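
A retry of this kind could look roughly like the sketch below. It is an illustration in Python 3, not the Python 2.7 webkitpy code shown in the tracebacks; `run_with_retry`, its parameters, and the backoff values are hypothetical and chosen only for this example.

```python
import time
import urllib.error


def run_with_retry(request, attempts=4, initial_backoff=1.0, sleep=time.sleep):
    """Call request() until it succeeds or the attempts are exhausted.

    Hypothetical helper, not part of webkitpy. It retries on URLError
    (and its subclasses, such as HTTPError), which covers connection
    timeouts like "[Errno 60] Operation timed out", and doubles the
    wait between attempts.
    """
    backoff = initial_backoff
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except urllib.error.URLError:
            if attempt == attempts:
                raise  # give up and let the caller see the last error
            sleep(backoff)
            backoff *= 2
```

A caller would pass the network operation as a callable, for example `run_with_retry(lambda: post_status(url))`, where `post_status` stands in for whatever function performs the request. Retrying only on network-level errors keeps genuine programming errors visible instead of masking them behind repeated attempts.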
Jonathan Bedard
Comment 6
2018-02-05 08:30:02 PST
I think a retry is the best option. If this is being caused by network flakiness (which seems likely), we can investigate the root cause, but that will take a few days, at least. It seems like we need this back up and running reliably as fast as possible.
Wenson Hsieh
Comment 7
2018-02-05 08:30:48 PST
Another data point, this time on mac-debug-ews: https://webkit-queues.webkit.org/results/6361047 for https://bugs.webkit.org/show_bug.cgi?id=182472
Tim Horton
Comment 8
2018-02-19 10:42:04 PST
And https://bugs.webkit.org/show_bug.cgi?id=182919#attach_334163
Any idea why this seems to have gotten worse in the last few months?
Aakash Jain
Comment 9
2018-02-20 15:59:58 PST
This issue is happening with both the Bugzilla server and the webkit-queues.webkit.org server. It looks like an intermittent network issue, maybe something specific to the lab network the machines are on. Debugging it further in <rdar://problem/37716391>. Meanwhile, adding retry logic for webkit-queues network transactions in https://bugs.webkit.org/show_bug.cgi?id=182987
Aakash Jain
Comment 10
2018-02-26 15:43:13 PST
(In reply to Tim Horton from comment #8)
> Any idea why this seems to have gotten worse in the last few months?
No idea why these network issues have become so frequent now. Maybe something changed in our lab network that is causing frequent intermittent network issues. Adding a retry when talking to webkit-queues helped (https://bugs.webkit.org/show_bug.cgi?id=182987). We should add a similar retry in the Bugzilla code as well (the Bugzilla code should use the NetworkTransaction class instead of using mechanize or urllib directly).
Aakash Jain
Comment 11
2020-03-21 04:17:18 PDT
EWS has been re-implemented from scratch and is now based on Buildbot. If there is a network issue between the worker and buildbot-master, the build is automatically retried. In most cases of a network issue with Bugzilla, the build is either retried or handled appropriately. Please file a new bug if you notice any issue.