Bug 182987

Summary: webkitpy NetworkTransaction should retry on URLError
Product: WebKit
Reporter: Aakash Jain <aakash_jain>
Component: Tools / Tests
Assignee: Aakash Jain <aakash_jain>
Status: RESOLVED FIXED
Severity: Normal
CC: aakash_jain, ap, commit-queue, ews-watchlist, glenn, jbedard, lforschler, webkit-bug-importer
Priority: P1
Keywords: InRadar
Version: Other
Hardware: Unspecified
OS: Unspecified
See Also: https://bugs.webkit.org/show_bug.cgi?id=182420
          https://bugs.webkit.org/show_bug.cgi?id=183156
Attachments:
Description Flags
Proposed patch none

Description Aakash Jain 2018-02-20 15:50:10 PST
We have been seeing frequent network errors recently in https://bugs.webkit.org/show_bug.cgi?id=182420. We are debugging the root cause so that the underlying network issue can be fixed in <rdar://problem/37716391>.

However, our code should be more robust against network failures, and it should retry when it encounters a network issue such as: URLError: <urlopen error [Errno 60] Operation timed out>.
Comment 1 Aakash Jain 2018-02-20 15:53:55 PST
Created attachment 334310 [details]
Proposed patch
Comment 2 Aakash Jain 2018-02-20 16:06:53 PST
This adds retries for webkit-queues network transactions, since they use the NetworkTransaction class. It won't fix network issues with Bugzilla, because the Bugzilla code uses Mechanize directly.
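For readers following along, here is a minimal sketch of the kind of retry loop being discussed. It is not the landed patch: it is written for Python 3 (`urllib.error`) rather than the Python 2 `urllib2` that webkitpy used at the time, and the names `run_with_retry`, `NetworkTimeout`, and the `sleep` parameter are hypothetical. The defaults are assumptions taken from this bug: a 10 second first delay and 1.5x growth (matching the "10 seconds" and "15.0 seconds" log lines) and the 10 minute timeout.

```python
import time
import urllib.error


class NetworkTimeout(Exception):
    """Raised when the retry budget is used up (hypothetical name)."""


def run_with_retry(request, initial_backoff_seconds=10, grow_factor=1.5,
                   timeout_seconds=10 * 60, sleep=time.sleep):
    """Call request() until it succeeds, sleeping longer after each URLError.

    Assumption: the timeout only counts time spent sleeping, not time spent
    waiting inside request() itself.
    """
    total_sleep = 0
    backoff = initial_backoff_seconds
    while True:
        try:
            return request()
        except urllib.error.URLError as e:
            # Note: HTTPError subclasses URLError, so this sketch retries on
            # HTTP errors too. Real code would likely inspect the status code.
            if total_sleep + backoff > timeout_seconds:
                raise NetworkTimeout() from e
            print("Received URLError: %s. Retrying in %s seconds..."
                  % (e.reason, backoff))
            sleep(backoff)
            total_sleep += backoff
            backoff *= grow_factor
```

A caller would wrap the network operation in a callable, for example `run_with_retry(lambda: urllib.request.urlopen(url).read())`, so that the whole fetch is repeated on failure.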
Comment 3 Jonathan Bedard 2018-02-21 08:29:00 PST
Unofficial R+.
Comment 4 Alexey Proskuryakov 2018-02-21 11:03:15 PST
Comment on attachment 334310 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=334310&action=review

Seems fine given that e.filename doesn't appear to work. But it would be good to fix printing the URL both in this case and in the existing HTTPError case.

> Tools/ChangeLog:3
> +        webkitpy NetworkTransaction should retry on URLError

Is there any possibility of deadlocking here?
Comment 5 WebKit Commit Bot 2018-02-21 11:27:56 PST
Comment on attachment 334310 [details]
Proposed patch

Clearing flags on attachment: 334310

Committed r228885: <https://trac.webkit.org/changeset/228885>
Comment 6 WebKit Commit Bot 2018-02-21 11:27:58 PST
All reviewed patches have been landed.  Closing bug.
Comment 7 Radar WebKit Bug Importer 2018-02-21 11:28:20 PST
<rdar://problem/37754241>
Comment 8 Aakash Jain 2018-02-21 12:38:29 PST
> Is there any possibility of deadlocking here?
This class has a maximum timeout of 10 minutes (timeout_seconds=(10 * 60)), so it won't get stuck retrying indefinitely on network errors.
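To make the bound concrete, the sketch below lists the delays a backoff loop would sleep before giving up. It assumes a 10 second initial delay and a 1.5x growth factor (consistent with the "10 seconds" and "15.0 seconds" retries in the EWS logs in this bug), and it assumes the loop gives up once the next sleep would push the total sleep time past timeout_seconds. The helper name `backoff_schedule` is hypothetical.

```python
def backoff_schedule(initial_backoff_seconds=10, grow_factor=1.5,
                     timeout_seconds=10 * 60):
    """Return the list of sleep delays taken before the timeout is hit."""
    delays = []
    total_sleep = 0
    backoff = initial_backoff_seconds
    # Stop as soon as the next sleep would exceed the total sleep budget.
    while total_sleep + backoff <= timeout_seconds:
        delays.append(backoff)
        total_sleep += backoff
        backoff *= grow_factor
    return delays


schedule = backoff_schedule()
print(len(schedule), "retries, sleeping", sum(schedule), "seconds in total")
```

Under these assumptions the loop makes a bounded number of retries. The budget covers sleep time only, so wall-clock time also includes however long each failed request takes to time out.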
Comment 9 Aakash Jain 2018-02-23 12:57:59 PST
This patch has definitely helped with network issues. Judging by the logs, all of the patches below would previously have failed EWS with network exceptions; they are now processed properly.

e.g.:
ews109:
2018-02-21 12:55:17,769 - Started processing patch
2018-02-21 12:56:32,778 - Received URLError: [Errno 60] Operation timed out. Retrying in 10 seconds...
2018-02-21 12:56:43,353 - Fetching: https://bugs.webkit.org/attachment.cgi?id=334405&action=edit
2018-02-21 12:56:43,628 - Fetching: https://bugs.webkit.org/show_bug.cgi?id=183013&ctype=xml&excludefield=attachmentdata
2018-02-21 12:56:43,891 - Running: webkit-patch --status-host=webkit-queues.webkit.org --bot-id=ews109 clean --port=ios-device --architecture=arm64


2018-02-21 16:22:46,539 - Applied patch
2018-02-21 16:24:01,662 - Received URLError: [Errno 60] Operation timed out. Retrying in 10 seconds...
2018-02-21 16:24:12,036 - Fetching: https://bugs.webkit.org/attachment.cgi?id=334424&action=edit
2018-02-21 16:24:12,322 - Fetching: https://bugs.webkit.org/show_bug.cgi?id=182891&ctype=xml&excludefield=attachmentdata

ews108:

2018-02-23 06:57:01,397 - Built patch
2018-02-23 06:58:18,323 - Received URLError: [Errno 60] Operation timed out. Retrying in 10 seconds...
2018-02-23 06:59:43,470 - Received URLError: [Errno 60] Operation timed out. Retrying in 15.0 seconds...
2018-02-23 06:59:58,908 - Pass


2018-02-23 07:40:12,103 - Cleaned working directory
2018-02-23 07:41:27,849 - Received URLError: [Errno 60] Operation timed out. Retrying in 10 seconds...
2018-02-23 07:41:39,461 - Fetching: https://bugs.webkit.org/attachment.cgi?id=334533&action=edit
2018-02-23 07:41:39,752 - Fetching: https://bugs.webkit.org/show_bug.cgi?id=183027&ctype=xml&excludefield=attachmentdata