Bug 186919 - Navigation Timing data is corrupt in WebView (UIWebView/WKWebView)
Summary: Navigation Timing data is corrupt in WebView (UIWebView/WKWebView)
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: iPhone / iPad iOS 11
: P2 Major
Assignee: Alex Christensen
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2018-06-22 07:09 PDT by krinklemail
Modified: 2022-03-29 13:55 PDT (History)
8 users (show)

See Also:


Attachments
Test case (1.77 KB, text/html)
2018-06-23 09:37 PDT, krinklemail
no flags Details
Test case (response/request order) (1.73 KB, text/html)
2018-06-23 09:43 PDT, krinklemail
no flags Details
Patch (4.10 KB, patch)
2022-03-28 13:03 PDT, Alex Christensen
simon.fraser: review+
ews-feeder: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description krinklemail 2018-06-22 07:09:55 PDT
iOS 8.0 introduced support for the Navigation Timing API, which had issues and was removed in 8.1, then re-introduced in iOS 9.0 with different issues. These were fixed for the Mobile Safari app in 9.1, but WebView remained a source of invalid Navigation Timing data and has continued to have the same issues to this day in iOS 11.

Downstream report: https://phabricator.wikimedia.org/T190977

Specifically, the following two types of data corruption are the most common in data received from Wikipedia users:

* fetchStart went backwards from unloadEventEnd.
* responseStart went backwards from requestStart.

Test url:
https://codepen.io/Krinkle/pen/pKVqPy?editors=0100

Steps to reproduce:
Intermittent, but loading this page within an app (e.g. Apple Dictionary, Facebook Messenger, Feedly, etc.), either directly from a native view, or as secondary navigation after initially entering an embedded view, most often when refreshing the view, will eventually cause the error to occur.

Impact:
For Wikipedia, the data from Navigation Timing API is central to measuring performance from users. iOS WebView is a significant source of traffic (at between 1% and 2% of Wikipedia page views, per Jan-May 2018). Unfortunately, (almost) all of this data is discarded before recording because it does not pass a basic data compliance check.

Here's a dump from the logs of our data validation service from last few hours (sorted by frequency)

| Rank | Hits | Normalised message
| ---- | ---- | ------------------
| #1   | 1987 | (is_compliant:443) Discarding event because responseEnd is out of order [{'os_minor': '4', 'os_major': '11', 'device_family': 'iPhone', 'os_family': 'iOS', 'browser_minor': '0', 'browser_major': '11', 'browser_family': 'Mobile Safari'}]
| #2   | 1566 | (is_compliant:443) Discarding event because requestStart is out of order [{'os_minor': '3', 'os_major': '9', 'device_family': 'iPad', 'os_family': 'iOS', 'browser_minor': '0', 'browser_major': '9', 'browser_family': 'Mobile Safari'}]
| #3   |  821 | (is_compliant:443) Discarding event because responseEnd is out of order [{'os_minor': '3', 'os_major': '11', 'device_family': 'iPhone', 'os_family': 'iOS', 'browser_minor': '0', 'browser_major': '11', 'browser_family': 'Mobile Safari'}]
| #4   |  786 | (is_compliant:443) Discarding event because requestStart is out of order [{'os_minor': '3', 'os_major': '9', 'device_family': 'iPhone', 'os_family': 'iOS', 'browser_minor': '0', 'browser_major': '9', 'browser_family': 'Mobile Safari'}]
| #5   |  468 | (is_compliant:443) Discarding event because responseEnd is out of order [{'os_minor': '2', 'os_major': '11', 'device_family': 'iPhone', 'os_family': 'iOS', 'browser_minor': '0', 'browser_major': '11', 'browser_family': 'Mobile Safari'}]

There used to be a Chrome entry in the top 5 as well (Chrome 62 and earlier produced invalid timing data on redirected navigations), but that was fixed per https://bugs.chromium.org/p/chromium/issues/detail?id=813889.

At this time, Mobile Safari (WebView) accounts for 98% of navigation timing errors.

See also:
* https://bugs.webkit.org/show_bug.cgi?id=184363
* Downstream investigation: https://phabricator.wikimedia.org/T190977
* Similar issue in Chromium (affecting redirects only, FIXED) - https://bugs.chromium.org/p/chromium/issues/detail?id=813889
* Wikipedia performance data validation - https://grafana.wikimedia.org/dashboard/db/eventlogging-schema?var-schema=NavigationTiming (currently around 0.3 errors per second, from a sampling of 1:1000 page views)
Comment 1 Radar WebKit Bug Importer 2018-06-22 23:15:41 PDT
<rdar://problem/41393423>
Comment 2 krinklemail 2018-06-23 09:37:52 PDT
Created attachment 343438 [details]
Test case

Simple HTML file (the pen url only works when using CodePen's Debug View, due to the framed wrapper somehow causing the issue not to happen).
Comment 3 krinklemail 2018-06-23 09:43:14 PDT
Created attachment 343439 [details]
Test case (response/request order)
Comment 4 krinklemail 2020-05-26 12:37:47 PDT
This is still seen from Mobile Safari 13 clients.


May 26
Discarding event because requestStart is out of order [{u'wmf_app_version': u'-', u'os_minor': u'4', u'os_major': u'13', u'is_bot': False, u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'1', u'is_mediawiki': False, u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]
Discarding event because requestStart is out of order [{u'wmf_app_version': u'-', u'os_minor': u'4', u'os_major': u'13', u'is_bot': False, u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'1', u'is_mediawiki': False, u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]
Discarding event because requestStart is out of order [{u'wmf_app_version': u'-', u'os_minor': u'4', u'os_major': u'13', u'is_bot': False, u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'1', u'is_mediawiki': False, u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]
…

Current downstream report:
https://phabricator.wikimedia.org/T197954
Comment 5 krinklemail 2020-10-02 22:47:18 PDT
Updated log now startingb to include iOS 14 as we start to see adoption.

Below is the aggregregate of a few hours of 1:1000 navtiming samples from Wikimedia, with first column indicating how many similar errors were seen (same out of order, same OS/browser version)

3848  Discarding event because requestStart is out of order [{u'os_minor': u'7', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'1', u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]
1612  Discarding event because requestStart is out of order [{u'os_minor': u'15', u'os_major': u'10', u'device_family': u'Mac', u'os_family': u'Mac OS X', u'browser_minor': u'1', u'browser_major': u'13', u'browser_family': u'Safari'}]
1474  Discarding event because requestStart is out of order [{u'os_minor': u'6', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'1', u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]
711  Discarding event because requestStart is out of order [{u'os_minor': u'7', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'2', u'browser_major': u'125', u'browser_family': u'Google'}]
672  Discarding event because requestStart is out of order [{u'os_minor': u'5', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'1', u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]
665  Discarding event because requestStart is out of order [{u'os_minor': u'3', u'os_major': u'9', u'device_family': u'iPad', u'os_family': u'iOS', u'browser_minor': u'0', u'browser_major': u'9', u'browser_family': u'Mobile Safari'}]
592  Discarding event because requestStart is out of order [{u'os_minor': u'7', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'0', u'browser_major': u'85', u'browser_family': u'Chrome Mobile iOS'}]
528  Discarding event because responseEnd is out of order [{u'os_minor': u'0', u'os_major': u'14', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'0', u'browser_major': u'14', u'browser_family': u'Mobile Safari'}]
454  Discarding event because responseEnd is out of order [{u'os_minor': u'7', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'1', u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]
344  Discarding event because responseEnd is out of order [{u'os_minor': u'15', u'os_major': u'10', u'device_family': u'Mac', u'os_family': u'Mac OS X', u'browser_minor': u'1', u'browser_major': u'13', u'browser_family': u'Safari'}]
295  Discarding event because requestStart is out of order [{u'os_minor': u'15', u'os_major': u'10', u'device_family': u'Mac', u'os_family': u'Mac OS X', u'browser_minor': u'0', u'browser_major': u'14', u'browser_family': u'Safari'}]
259  Discarding event because responseEnd is out of order [{u'os_minor': u'15', u'os_major': u'10', u'device_family': u'Mac', u'os_family': u'Mac OS X', u'browser_minor': u'0', u'browser_major': u'14', u'browser_family': u'Safari'}]
244  Discarding event because requestStart is out of order [{u'os_minor': u'0', u'os_major': u'14', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'0', u'browser_major': u'14', u'browser_family': u'Mobile Safari'}]
235  Discarding event because requestStart is out of order [{u'os_minor': u'3', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'0', u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]
193  Discarding event because requestStart is out of order [{u'os_minor': u'4', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'1', u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]
175  Discarding event because requestStart is out of order [{u'os_minor': u'6', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'2', u'browser_major': u'125', u'browser_family': u'Google'}]
174  Discarding event because requestStart is out of order [{u'os_minor': u'6', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'0', u'browser_major': u'85', u'browser_family': u'Chrome Mobile iOS'}]
166  Discarding event because responseEnd is out of order [{u'os_minor': u'6', u'os_major': u'13', u'device_family': u'iPhone', u'os_family': u'iOS', u'browser_minor': u'1', u'browser_major': u'13', u'browser_family': u'Mobile Safari'}]

Safari is (still) the only browser reporting errors with significant frequency.

'requestStart' being out of order is far far the most frequent, however we are still seeing out of order 'responseEnd' as well as you can see including on iOS 14.
Comment 6 krinklemail 2022-02-23 04:40:00 PST
Still seen from iOS 15.3 and Mac Safari 15.

Estimated affected Wikipedia pageviews per day (before 1:1000 sampling):

| pageviews | wikimedia_navtiming/is_compliant/error_invalid
|-----------|---------------------------------------
|   500,000 | Discarding event because requestStart is out of order [('Mobile_Safari', '15_3')]
|   400,000 | Discarding event because requestStart is out of order [('Mobile_Safari', '14_1')]
|   400,000 | Discarding event because requestStart is out of order [('Mobile_Safari', '15_2')]
|   300,000 | Discarding event because requestStart is out of order [('Mobile_Safari', '13_1')]
|   300,000 | Discarding event because requestStart is out of order [('Safari', '15')]
|   220,000 | Discarding event because requestStart is out of order [('Safari', '14')]
|   220,000 | Discarding event because requestStart is out of order [('Mobile_Safari', '15_1')]
|   200,000 | Discarding event because responseEnd is out of order [('Mobile_Safari', '14_1')]
|   150,000 | Discarding event because requestStart is out of order [('Safari', '13')]
|   100,000 | Discarding event because responseEnd is out of order [('Safari', '14')]
|   100,000 | Discarding event because requestStart is out of order [('Mobile_Safari', '15_0')]
|   100,000 | Discarding event because requestStart is out of order [('Mobile_Safari', '14_0')]
|    90,000 | Discarding event because requestStart is out of order [('Mobile_Safari', '13_0')]
|    70,000 | Discarding event because connectStart is out of order [('Safari', '15')]
|    50,000 | Discarding event because responseEnd is out of order [('Mobile_Safari', '14_0')]
|    50,000 | Discarding event because responseEnd is out of order [('Safari', '13')]
|    30,000 | Discarding event because responseEnd is out of order [('Mobile_Safari', '13_0')]
|    30,000 | Discarding event because responseEnd is out of order [('Mobile_Safari', '12_1')]
|    20,000 | Discarding event because requestStart is out of order [('Mobile_Safari', '12_0')]
Comment 7 Alex Christensen 2022-03-28 13:03:56 PDT
Created attachment 455946 [details]
Patch
Comment 8 Simon Fraser (smfr) 2022-03-28 13:07:30 PDT
Comment on attachment 455946 [details]
Patch

Why would this affect WKWebView but not Safari?
Comment 9 Alex Christensen 2022-03-28 13:13:41 PDT
I think in 2018 this was a bug about the WebView implementation having some issues, some of which I fixed last year.  I think in 2022 it's about one more remaining issue which affects Safari, WKWebView, WebView, and UIWebView.
Comment 10 Simon Fraser (smfr) 2022-03-28 15:57:32 PDT
Comment on attachment 455946 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=455946&action=review

> Source/WebCore/platform/network/cocoa/NetworkLoadMetrics.mm:57
> +    // Sometimes in production responseStart is before requestStart, possibly due to clock updates. If this happens, use the later of the two.

I don't think "in production" adds anything here. Should we reference a radar?
Comment 11 Alex Christensen 2022-03-29 12:13:00 PDT
I'll reference rdar://90997689 instead.
Comment 12 Alex Christensen 2022-03-29 12:15:51 PDT
r292055
Comment 13 Alex Christensen 2022-03-29 13:53:40 PDT
Reverted in r292066
Comment 14 Alex Christensen 2022-03-29 13:55:46 PDT
r292067 on trunk