1. Track user-triggered load cancels at the 5 second mark. 2. Track stalled page loads at the 20 second mark. 3. Track successful page loads. I think we should use the same metric to claim a page has finished loading as Safari does for showing a full progress bar. <rdar://problem/36549013>
Created attachment 334762 [details] Patch Preliminary version to check the approach. Final version will have logging removed and will actually upload the telemetry data.
Comment on attachment 334762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=334762&action=review > Source/WebCore/loader/FrameLoader.cpp:237 > + m_frame.page()->progress().progressCompleted(m_frame, -999); I do not think it is a good idea to hardcode and NSError code in non-platform-specific code. -999 is so obscure here too. > Source/WebCore/loader/FrameLoader.cpp:1831 > + m_progressTracker->progressCompleted(-999); I do not think it is a good idea to hardcode and NSError code in non-platform-specific code. -999 is so obscure here too. > Source/WebCore/loader/FrameLoader.cpp:2365 > + m_progressTracker->progressCompleted(error.isNull() ? 0 : error.errorCode()); Why the isNull() check? errorCode() is 0 by default. > Source/WebCore/loader/ProgressTracker.h:52 > + void progressCompleted(Frame&, int errorCode); Could we use a std::optional<ResourceError> or ResourceError::Type? so we don't have to hardcode some error codes in a few places? Especially considering error codes are platform-specific. > Source/WebCore/page/DiagnosticLoggingKeys.cpp:573 > +String DiagnosticLoggingKeys::telemetryPageLoadCanceledKey() The method name is too specific. Given the string value, using a more generic name would make it more reusable. As a matter of fact, I wouldn't be surprised if we already had already getters for some of those strings. > Source/WebCore/page/DiagnosticLoggingKeys.cpp:578 > +String DiagnosticLoggingKeys::telemetryPageLoadFailedKey() The method name is too specific. Given the string value, using a more generic name would make it more reusable. > Source/WebCore/page/DiagnosticLoggingKeys.cpp:583 > +String DiagnosticLoggingKeys::telemetryPageLoadHungKey() The method name is too specific. Given the string value, using a more generic name would make it more reusable. > Source/WebCore/page/DiagnosticLoggingKeys.cpp:588 > +String DiagnosticLoggingKeys::telemetryPageLoadOccurredKey() The method name is too specific. Given the string value, using a more generic name would make it more reusable. > Source/WebCore/page/DiagnosticLoggingKeys.cpp:593 > +String DiagnosticLoggingKeys::telemetryPageLoadSuccessfulKey() The method name is too specific. Given the string value, using a more generic name would make it more reusable. > Source/WebCore/page/Page.cpp:193 > +const static Seconds kCancelTimeThreshold { 5_s }; No need for static since it is global and const. > Source/WebCore/page/Page.cpp:194 > +const static Seconds kHangTimeThreshold { 20_s }; ditto. > Source/WebCore/page/Page.cpp:195 > +const static int kCanceledErrorCode { -999 }; ditto. Also, I do not think it is a good idea to hardcode and NSError code in non-platform-specific code. > Source/WebCore/page/Page.cpp:197 > +class TelemetryProgressTrackerClient : public ProgressTrackerClient { Can this be moved to its own file? This is a decent amount of new code in Page.cpp :/ Is this ever subclassed? If not, it should be marked as final? > Source/WebCore/page/Page.cpp:199 > + virtual ~TelemetryProgressTrackerClient() = default; I do not think this brings anything? > Source/WebCore/page/Page.cpp:208 > + virtual void progressTrackerDestroyed() Can some / most of these methods be private? Also, as per WebKit coding style, we omit the virtual and use either final/override. > Source/WebCore/page/Page.cpp:263 > + m_reportedResults = false; Per coding style, boolean variables need a prefix. In this case, I suggest m_hasResportedResults. > Source/WebCore/page/Page.cpp:268 > + if (m_lastChangedTime - m_previousChangedTime >= kHangTimeThreshold) We do not usually use k prefix for global variables. Just "hangTimeThreshold" would do. > Source/WebCore/page/Page.cpp:283 > + if (*m_errorCode == kCanceledErrorCode) { I suggest using ResourceError and rely on ResourceError::isCancellation(). Or if you only care about the error type, we could use ResourceError::Type. > Source/WebCore/page/Page.cpp:299 > + void logDiagnosticMessage(String message) const String&
Comment on attachment 334762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=334762&action=review > Source/WebCore/loader/FrameLoader.cpp:2363 > + const ResourceError& error = dl->mainDocumentError(); auto& >> Source/WebCore/page/Page.cpp:197 >> +class TelemetryProgressTrackerClient : public ProgressTrackerClient { > > Can this be moved to its own file? This is a decent amount of new code in Page.cpp :/ > > Is this ever subclassed? If not, it should be marked as final? Also, I think it is missing WTF_MAKE_FAST_ALLOCATED; > Source/WebCore/page/Page.cpp:203 > + : m_page(page), m_client(wrappedClient) One per line as per coding style. > Source/WebCore/page/Page.cpp:205 > + RELEASE_LOG(ResourceLoading, "%p - logDiagnosticMessage: creating", this); Shouldn't all these RELEASE_LOG check that we are not in an ephemeral session. > Source/WebCore/page/Page.cpp:278 > + if (!m_errorCode.has_value()) { if (!m_errorCode) would it be enough.
Comment on attachment 334762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=334762&action=review I think after addressing Chris's comments this patch will be correct. >> Source/WebCore/loader/FrameLoader.cpp:237 >> + m_frame.page()->progress().progressCompleted(m_frame, -999); > > I do not think it is a good idea to hardcode and NSError code in non-platform-specific code. -999 is so obscure here too. Yeah, could we just use ResourceError::Type::Cancellation in these places somehow? >> Source/WebCore/page/DiagnosticLoggingKeys.cpp:578 >> +String DiagnosticLoggingKeys::telemetryPageLoadFailedKey() > > The method name is too specific. Given the string value, using a more generic name would make it more reusable. Chris: I don't understand this (and the following) comments. This is just a value to be appended to 'telemetry.pageload' -- what kind of more generic name would be used to log a failed load event? Should "cancelled", "failed", "hung", etc., just be defined in the TelemetryProgressTrackerClient class, since they are only used there? >> Source/WebCore/page/Page.cpp:195 >> +const static int kCanceledErrorCode { -999 }; > > ditto. Also, I do not think it is a good idea to hardcode and NSError code in non-platform-specific code. Yeah, I think it might be enough to just pass the ResourceError::Type here, since we only really want to track successes, failures (of any kind), and user cancellation happening after some amount of time that indicates the user got angry and gave up. I don't think we need the specific error number for those uses. > Source/WebKit/WebProcess/WebCoreSupport/WebProgressTrackerClient.cpp:68 > +void WebProgressTrackerClient::progressFinished(Frame& originatingProgressFrame, int /* errorCode */) As Chris suggested, I do think "ResourceError::Type" might be a better thing to pass.
Comment on attachment 334762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=334762&action=review >>> Source/WebCore/page/DiagnosticLoggingKeys.cpp:578 >>> +String DiagnosticLoggingKeys::telemetryPageLoadFailedKey() >> >> The method name is too specific. Given the string value, using a more generic name would make it more reusable. > > Chris: I don't understand this (and the following) comments. This is just a value to be appended to 'telemetry.pageload' -- what kind of more generic name would be used to log a failed load event? > > Should "cancelled", "failed", "hung", etc., just be defined in the TelemetryProgressTrackerClient class, since they are only used there? Just follow the pattern in this file: String DiagnosticLoggingKeys::failedKey() { return ASCIILiteral("failed"); } so that it is reusable. There is no reason we need / want "telemetryPageLoad" in the method name as it only returns the String "failed".
Comment on attachment 334762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=334762&action=review > Source/WebCore/page/Page.cpp:212 > + reportResults(); As commented offline, I do not think we need to report anything in progressTrackerDestroyed().progressFinished() should have been called already. > Source/WebCore/page/Page.cpp:241 > + checkForHang(); We only call checkForHang() when progressEstimateChanged() is called it seems. In a true hang, I guess progressEstimateChanged() may not even get called, no? > Source/WebCore/page/Page.cpp:288 > + logDiagnosticMessage(DiagnosticLoggingKeys::telemetryPageLoadCanceledKey()); If the user cancels because the page load is hung, we report "canceled" and not "hung", is this intentional?
Created attachment 334795 [details] Patch Another preliminary patch. This responds to the previous comments, and provides the new strings to be registered with MessageTracer. It also still contains logging that will ultimately be removed.
Attachment 334795 [details] did not pass style-queue: ERROR: Source/WebCore/loader/FrameLoader.cpp:238: Missing space before { [whitespace/braces] [5] ERROR: Source/WebCore/loader/FrameLoader.cpp:1836: Missing space before { [whitespace/braces] [5] Total errors found: 2 in 22 files If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 334796 [details] Patch Fix Windows build
(In reply to Chris Dumez from comment #6) > Comment on attachment 334762 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=334762&action=review > > > Source/WebCore/page/Page.cpp:212 > > + reportResults(); > > As commented offline, I do not think we need to report anything in > progressTrackerDestroyed().progressFinished() should have been called > already. > > > Source/WebCore/page/Page.cpp:241 > > + checkForHang(); > > We only call checkForHang() when progressEstimateChanged() is called it > seems. In a true hang, I guess progressEstimateChanged() may not even get > called, no? > > > Source/WebCore/page/Page.cpp:288 > > + logDiagnosticMessage(DiagnosticLoggingKeys::telemetryPageLoadCanceledKey()); > > If the user cancels because the page load is hung, we report "canceled" and > not "hung", is this intentional? Did you see those comments?
Created attachment 335106 [details] Patch
Comment on attachment 335106 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=335106&action=review > Source/WebKit/UIProcess/WebPageProxy.cpp:775 > + reportPageLoadResult(WebCore::ResourceError { WebCore::ResourceError::Type::Cancellation }); WebCore:: is not needed. > Source/WebKit/UIProcess/WebPageProxy.cpp:3255 > + reportPageLoadResult(WebCore::ResourceError { WebCore::ResourceError::Type::Cancellation }); WebCore:: is not needed. > Source/WebKit/UIProcess/WebPageProxy.cpp:3539 > + reportPageLoadResult(WebCore::ResourceError { WebCore::ResourceError::Type::Null }); reportPageLoadResult({ }) should work no? I would also suggest making the parameter to reportPageLoadResult() optional with a default value of { } so that such call sites look nicer. > Source/WebKit/UIProcess/WebPageProxy.cpp:7353 > + Cancel, Cancelation for consistency with the 2 others? > Source/WebKit/UIProcess/WebPageProxy.cpp:7357 > +struct MessageMapType { MessageMap is confusing since not used in a Map. > Source/WebKit/UIProcess/WebPageProxy.cpp:7362 > +static const int CFNetworkCancelErrorCode = -999; As mentioned before, I'd rather we do not hardcode a NSError code in cross-platform code. This error code is already in CFNetworkErrors.h as kCFURLErrorCancelled. I think we should instead fix ResourceError so that it's type is Cancel if the error code in -999. So in ResourceError::ResourceError(CFErrorRef cfError): if (cfError) setType((CFErrorGetCode(m_platformError.get()) == kCFURLErrorTimedOut) ? Type::Timeout : Type::General); Should also probably recognize kCFURLErrorCancelled too so that you don't have to work around it. > Source/WebKit/UIProcess/WebPageProxy.cpp:7367 > + static NeverDestroyed<Vector<MessageMapType>> messageMap(std::initializer_list<MessageMapType> { messageMap is confusing since this is not a map. Also, this can be marked as const. > Source/WebKit/UIProcess/WebPageProxy.cpp:7376 > + MessageMapType { Seconds::infinity(), CompletionCondition::Error, DiagnosticLoggingKeys::failedMoreThan20SecondsKey() }, Given the nature of this work, wouldn't we be interested in distinguishing errors that are timeouts? (ResourceError::isTimeout()). > Source/WebKit/UIProcess/WebPageProxy.cpp:7393 > + else if (!error.isNull() || error.errorCode()) Do we have cases where the error is null but has an error code? If so, it seems like a bug. > Source/WebKit/UIProcess/WebPageProxy.cpp:7397 > + if (pageLoadTime < mapItem.seconds && condition == mapItem.condition) { It would look more logical to me if the 2 conditions were reversed. First look for the right CompletionCondition, then the right load time. > Source/WebKit/UIProcess/WebPageProxy.cpp:7399 > + logDiagnosticMessage(DiagnosticLoggingKeys::telemetryPageLoadKey(), DiagnosticLoggingKeys::occurredKey(), WebCore::ShouldSample::No); What's this line? > Source/WebKit/UIProcess/WebPageProxy.h:2109 > + std::optional<MonotonicTime> m_startPageLoadTime; Your english is better than mine so your call but I would have named this m_pageLoadStart or m_pageLoadStartTime;
Comment on attachment 335106 [details] Patch Attachment 335106 [details] did not pass win-ews (win): Output: http://webkit-queues.webkit.org/results/6833774 New failing tests: http/tests/security/contentSecurityPolicy/video-with-https-url-allowed-by-csp-media-src-star.html
Created attachment 335167 [details] Archive of layout-test-results from ews206 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews206 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Created attachment 335343 [details] Patch
Comment on attachment 335343 [details] Patch This looks great! r+ as long as EWS is happy.
Comment on attachment 335343 [details] Patch Attachment 335343 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/6864050 New failing tests: imported/w3c/web-platform-tests/service-workers/service-worker/unregister-then-register-new-script.https.html imported/w3c/web-platform-tests/service-workers/service-worker/registration-script.https.html http/wpt/beacon/contentextensions/beacon-redirect-blocked.html http/tests/security/http-0.9/iframe-blocked.html http/tests/security/http-0.9/xhr-blocked.html imported/w3c/web-platform-tests/service-workers/service-worker/registration-security-error.https.html http/tests/workers/service/basic-register.html imported/w3c/web-platform-tests/service-workers/service-worker/registration-iframe.https.html imported/w3c/web-platform-tests/service-workers/service-worker/register-same-scope-different-script-url.https.html http/wpt/beacon/connect-src-beacon-redirect-blocked.sub.html
Created attachment 335360 [details] Archive of layout-test-results from ews107 for mac-sierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-sierra-wk2 Platform: Mac OS X 10.12.6
Comment on attachment 335343 [details] Patch Attachment 335343 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/6863829 New failing tests: imported/w3c/web-platform-tests/service-workers/service-worker/registration-security-error.https.html imported/w3c/web-platform-tests/service-workers/service-worker/registration-script.https.html imported/w3c/web-platform-tests/service-workers/service-worker/unregister-then-register-new-script.https.html http/tests/security/http-0.9/iframe-blocked.html http/tests/security/http-0.9/xhr-blocked.html http/wpt/beacon/connect-src-beacon-redirect-blocked.sub.html http/tests/workers/service/basic-register.html imported/w3c/web-platform-tests/service-workers/service-worker/registration-iframe.https.html
Created attachment 335361 [details] Archive of layout-test-results from ews121 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews121 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.6
Comment on attachment 335343 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=335343&action=review > Source/WebCore/platform/network/mac/ResourceErrorMac.mm:-116 > - setType(([m_platformError.get() code] == NSURLErrorTimedOut) ? Type::Timeout : Type::General); I would suggest the minimal change: if (nsError) setType(([m_platformError.get() code] == NSURLErrorTimedOut) ? Type::Timeout : ([m_platformError.get() code] == NSURLErrorCancelled) ? Type::Cancellation : Type::General); No test failures with that.
Comment on attachment 335343 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=335343&action=review > Source/WebCore/platform/network/mac/ResourceErrorMac.mm:141 > + if (!ourDomain.isEmpty() && ourErrorCode) { Now if I construct a ResourceError { errorDomainWebKitInternal, , url(), "Failed to load script", ResourceError::Type::Cancellation }; Its Type will be reset to General, even though I explicitly asked for Cancellation.
Comment on attachment 335343 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=335343&action=review > Source/WebCore/platform/network/cf/ResourceError.h:58 > + mapPlatformError(); I don't think we should be calling this. This will override the type that is being given
(In reply to Chris Dumez from comment #21) > I would suggest the minimal change: > if (nsError) > setType(([m_platformError.get() code] == NSURLErrorTimedOut) ? > Type::Timeout : ([m_platformError.get() code] == NSURLErrorCancelled) ? > Type::Cancellation : Type::General); > > No test failures with that. True, but that doesn't mean that the new logging will work correctly. Without thorough coverage, it's possible for some Type=General,Code=-999 errors to not get translated to Type=Cancellation errors. > I don't think we should be calling this. This will override the type that is being given Overriding the type to be Cancellation is kind of the point, though. I've got a new version that tries to be thorough, but also passes all the tests. I'll upload in a bit.
Created attachment 335373 [details] Patch
(In reply to Keith Rollin from comment #24) > (In reply to Chris Dumez from comment #21) > > I would suggest the minimal change: > > if (nsError) > > setType(([m_platformError.get() code] == NSURLErrorTimedOut) ? > > Type::Timeout : ([m_platformError.get() code] == NSURLErrorCancelled) ? > > Type::Cancellation : Type::General); > > > > No test failures with that. > > True, but that doesn't mean that the new logging will work correctly. > Without thorough coverage, it's possible for some Type=General,Code=-999 > errors to not get translated to Type=Cancellation errors. Do you have evidence this is an issue, thus justifying more complex code? The type will get set properly for all resource errors constructed from a NSError / CFError. Which is what we want. I do not want the constructor not to obey me when I give it an explicit type and use whatever type it wants. > > > I don't think we should be calling this. This will override the type that is being given > > Overriding the type to be Cancellation is kind of the point, though. > > I've got a new version that tries to be thorough, but also passes all the > tests. I'll upload in a bit. I disagree that this is the point, as commented above. A constructor that takes in a type but sometimes ignores it because it thinks it knows better is a terrible pattern IMO.
Comment on attachment 335373 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=335373&action=review > Source/WebCore/platform/network/cf/ResourceError.h:58 > + mapPlatformError(); As commented earlier, I think this is not a good idea. We should obey the type being given. > Source/WebCore/platform/network/mac/ResourceErrorMac.mm:121 > + mapPlatformError(); This is not needed since the constructor above already does it. > Source/WebCore/platform/network/mac/ResourceErrorMac.mm:124 > +void ResourceError::mapPlatformError() This claims to map platform error codes but tries to deal with cases where we do not have a platform error, this is confusing.
(In reply to Chris Dumez from comment #26) > (In reply to Keith Rollin from comment #24) > > (In reply to Chris Dumez from comment #21) > > > I would suggest the minimal change: > > > if (nsError) > > > setType(([m_platformError.get() code] == NSURLErrorTimedOut) ? > > > Type::Timeout : ([m_platformError.get() code] == NSURLErrorCancelled) ? > > > Type::Cancellation : Type::General); > > > > > > No test failures with that. > > > > True, but that doesn't mean that the new logging will work correctly. > > Without thorough coverage, it's possible for some Type=General,Code=-999 > > errors to not get translated to Type=Cancellation errors. > > Do you have evidence this is an issue, thus justifying more complex code? > The type will get set properly for all resource errors constructed from a > NSError / CFError. Which is what we want. I do not want the constructor not > to obey me when I give it an explicit type and use whatever type it wants. > Such evidence would be easy to get by greping our codebase. My understanding is that WebKit always constructs resource errors using its own domain or no domain. Therefore, this function call would do nothing and the extra logic would be unwarranted. > > > > > I don't think we should be calling this. This will override the type that is being given > > > > Overriding the type to be Cancellation is kind of the point, though. > > > > I've got a new version that tries to be thorough, but also passes all the > > tests. I'll upload in a bit. > > I disagree that this is the point, as commented above. A constructor that > takes in a type but sometimes ignores it because it thinks it knows better > is a terrible pattern IMO.
Created attachment 335793 [details] Patch
Comment on attachment 335793 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=335793&action=review > Source/WebCore/platform/network/cf/ResourceError.h:57 > +#if PLATFORM(COCOA) && !ASSERT_DISABLED !ASSERT_DISABLED should not be needed since this block only contains ASSERTs. > Source/WebCore/platform/network/cf/ResourceError.h:89 > +#if !ASSERT_DISABLED Not sure it is worth disabling this when asserts are disabled. > Source/WebCore/platform/network/mac/ResourceErrorMac.mm:144 > + if (m_platformError) { We usually do early returns in WebKit: if (!m_platformError) return; > Source/WebCore/platform/network/mac/ResourceErrorMac.mm:145 > + auto ourDomain = [m_platformError.get() domain]; Not sure we need the "our" prefix, it looks a bit weird. "domain" or "errorDomain" would look fine IMO. > Source/WebCore/platform/network/mac/ResourceErrorMac.mm:146 > + auto ourErrorCode = [m_platformError.get() code]; ditto. "errorCode"? > Source/WebKit/UIProcess/WebPageProxy.cpp:779 > + reportPageLoadResult(ResourceError { ResourceError::Type::Cancellation }); I think this would look better: if (m_pageLoadStart) reportPageLoadResult(ResourceError { ResourceError::Type::Cancellation }); This avoid constructing a ResourceError unnecessarily in the common case where there is no pending load. It also looks weird otherwise to unconditionally report a cancellation when starting a new load, like you do below. > Source/WebKit/UIProcess/WebPageProxy.cpp:3262 > + reportPageLoadResult(ResourceError { ResourceError::Type::Cancellation }); I think this would look better: if (m_pageLoadStart) reportPageLoadResult(ResourceError { ResourceError::Type::Cancellation }); > Source/WebKit/UIProcess/WebPageProxy.cpp:7386 > +enum class CompletionCondition { Can we use Error::Type? What's the benefit of using another enum class here? It seems to be used only for lookup in messages table. I think Error::Type would work just as well for this. > Source/WebKit/UIProcess/WebPageProxy.cpp:7420 > + if (!m_pageLoadStart) We may be able to make this an ASSERT() > Source/WebKit/UIProcess/WebPageProxy.cpp:7434 > + for (const auto& messageItem : messages.get()) { We usually omit the "const" when we use auto&, to be concise.
Comment on attachment 335793 [details] Patch r=me with suggested changes.
Created attachment 335874 [details] Patch
Comment on attachment 335874 [details] Patch Clearing flags on attachment: 335874 Committed r229643: <https://trac.webkit.org/changeset/229643>
All reviewed patches have been landed. Closing bug.