WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
162811
Update Resource Load Statistics
https://bugs.webkit.org/show_bug.cgi?id=162811
Summary
Update Resource Load Statistics
John Wilander
Reported
2016-09-30 14:40:18 PDT
Update resource load statistics. Use the Public Suffix list, capture web sockets statistics, capture synchronous XHR requests, skip main resource loads done in SubresourceLoader, and clear data records.
Attachments
Patch
(27.84 KB, patch)
2016-09-30 15:14 PDT
,
John Wilander
no flags
Details
Formatted Diff
Diff
Patch
(28.03 KB, patch)
2016-10-03 17:34 PDT
,
John Wilander
no flags
Details
Formatted Diff
Diff
Patch
(28.20 KB, patch)
2016-10-03 18:09 PDT
,
John Wilander
no flags
Details
Formatted Diff
Diff
Patch
(30.06 KB, patch)
2016-10-05 10:50 PDT
,
John Wilander
no flags
Details
Formatted Diff
Diff
Patch
(28.98 KB, patch)
2016-10-05 12:37 PDT
,
John Wilander
no flags
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
John Wilander
Comment 1
2016-09-30 15:14:33 PDT
Created
attachment 290395
[details]
Patch
Andy Estes
Comment 2
2016-09-30 17:00:44 PDT
Comment on
attachment 290395
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=290395&action=review
> Source/WebCore/Modules/websockets/WebSocket.cpp:323 > + } else { > + ResourceLoadObserver::sharedObserver().logWebSocketLoading(document.frame(), m_url);
Our style is to omit braces for single-line control clauses.
> Source/WebCore/loader/ResourceLoadObserver.cpp:61 > +static inline bool is3xxRedirect(const ResourceResponse& response)
Can you use ResourceResponse::isRedirected() instead of this function?
> Source/WebCore/loader/ResourceLoadObserver.cpp:63 > + return !response.isNull() && response.httpStatusCode() / 100 == 3;
Probably ok to omit the isNull() check. Null responses should have a status code of 0, and 0 / 100 == 0.
> Source/WebCore/loader/ResourceLoadObserver.cpp:66 > +static inline bool shouldLog(bool usesEphemeralSession, RefPtr<ResourceLoadStatisticsStore> store)
Passing a RefPtr by value here causes unnecessary ref count churn. This would make more sense a private member function that takes a Page* as an argument, since you can derive usesEphemeralSession from a Page, and can access m_store from 'this'.
> Source/WebCore/loader/ResourceLoadObserver.cpp:163 > + if (frame && frame->page() && frame->page()->usesEphemeralSession())
You assert that frame and frame->page() are non-null above, so you don't need to check for that here.
> Source/WebCore/loader/ResourceLoadStatisticsStore.h:67 > + WEBCORE_EXPORT Vector<String> getPrevalentResourceDomainsWithoutUserInteraction();
Our style is to omit 'get' from getters. This should be called 'prevalentResourceDomainsWithoutUserInteraction()'.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:47 > +static const double numberOfSecondsBetweenClearingDataRecords = 600.0; > +static const double featureVectorLengthThreshold = 3.0;
Our style is to omit the '.0' from literals. Do these really need to be doubles? Seems like auto would work fine here.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:69 > + unsigned featureOne = resourceStatistic.subresourceUnderTopFrameOrigins.size(); > + unsigned featureTwo = resourceStatistic.subresourceUniqueRedirectsTo.size(); > + unsigned featureThree = resourceStatistic.subframeUnderTopFrameOrigins.size();
I'd use auto instead of unsigned. Also, 'feature{One,Two,Three}' seem like poor names. I'd call these subresourceUnderTopFrameOriginsCount, etc.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:79 > + double vectorLength = 0.0;
Our style is to omit the '.0' from literals.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:86 > + vectorLength += pow(featureOne, 2); > + vectorLength += pow(featureTwo, 2); > + vectorLength += pow(featureThree, 2); > + > + ASSERT(vectorLength > 0.0); > + > + return sqrt(vectorLength) > featureVectorLengthThreshold;
This could use a 'why' comment.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:96 > +void WebResourceLoadStatisticsStore::clearDataRecords(Vector<String> prevalentResources)
Passing a Vector by value doesn't seem like the right thing to do here. From looking at this function's caller, I think you want to pass an rvalue reference (Vector<String>&&).
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:114 > + RunLoop::main().dispatch([prevalentResources, webResourceLoadStatisticsStore = this] {
Capturing prevalentResources by value makes a copy. This can be avoided by using move capture: [prevalentResources = WTFMove(prevalentResources), ...]. Also, I think you can just capture 'this' instead of assigning it to a new variable.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:117 > + websiteDataStore.fetchData(WebsiteDataType::Cookies, { }, [prevalentResources, webResourceLoadStatisticsStore](auto websiteDataRecords) {
Ditto.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:132 > + websiteDataStore.removeData(WebsiteDataType::Cookies, { WTFMove(dataRecords) }, [webResourceLoadStatisticsStore] {
Ditto regarding the 'this' capture issue.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:147 > + Vector<String> prevalentResources; > + webResourceLoadStatisticsStore->classifyResource(resourceStatistic); > + prevalentResources = webResourceLoadStatisticsStore->coreStore().getPrevalentResourceDomainsWithoutUserInteraction();
Instead of default-constructing prevalentResources then assigning it a new value, you should construct it with the return value of getPrevalentResourceDomainsWithoutUserInteraction() to take advantage of the return value optimization.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:149 > + if (prevalentResources.size() > 0) > + webResourceLoadStatisticsStore->clearDataRecords(prevalentResources);
I'd move the 'prevalentResources.size() > 0' check into WebResourceLoadStatisticsStore::clearDataRecords(), then either WTFMove() prevalentResources or just move the call to getPrevalentResourceDomainsWithoutUserInteraction() into this statement.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.h:92 > + double m_lastTimeDataRecordsWereCleared { 0.0 };
Our style is to omit the '.0' from literals.
Andy Estes
Comment 3
2016-09-30 17:08:57 PDT
Comment on
attachment 290395
[details]
Patch r- due to the EWS failures.
Andy Estes
Comment 4
2016-09-30 17:22:32 PDT
Comment on
attachment 290395
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=290395&action=review
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:122 > + if (websiteDataRecord.displayName == prevalentResource || websiteDataRecord.displayName.endsWith("." + prevalentResource)) > + dataRecords.append(websiteDataRecord);
Do websiteDataRecord.displayName and prevalentResource represent hostnames? If so, what if the names differ in case but are otherwise equal? It's not great to do two separate string comparisons here. You could do something like this instead: if (websiteDataRecord.displayName.endsWith(prevalentResource)) { // should we use endsWithIgnoringASCIICase() instead? unsigned suffixStart = websiteDataRecord.displayName.length() - prevalentResource.length(); if (suffixStart == 0 || websiteDataRecord.displayName[suffixStart - 1] == '.') dataRecords.append(websiteDataRecord); }
Andy Estes
Comment 5
2016-09-30 17:25:44 PDT
(In reply to
comment #4
)
> Comment on
attachment 290395
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=290395&action=review
> > > Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:122 > > + if (websiteDataRecord.displayName == prevalentResource || websiteDataRecord.displayName.endsWith("." + prevalentResource)) > > + dataRecords.append(websiteDataRecord); > > Do websiteDataRecord.displayName and prevalentResource represent hostnames? > If so, what if the names differ in case but are otherwise equal? > > It's not great to do two separate string comparisons here. You could do > something like this instead: > > if (websiteDataRecord.displayName.endsWith(prevalentResource)) { // > should we use endsWithIgnoringASCIICase() instead? > unsigned suffixStart = websiteDataRecord.displayName.length() - > prevalentResource.length(); > if (suffixStart == 0 || websiteDataRecord.displayName[suffixStart - > 1] == '.') > dataRecords.append(websiteDataRecord); > }
Note that we already do this for computing cookie partitions, so there is a similar function in NetworkStorageSessionCFNet.cpp: static inline bool hostIsInDomain(). I wonder if we should move that somewhere where it can be reused.
John Wilander
Comment 6
2016-10-03 17:34:02 PDT
Created
attachment 290541
[details]
Patch
John Wilander
Comment 7
2016-10-03 17:38:34 PDT
Thanks for great review comments, Andy! I picked up all of it except: 1. ResourceResponse::isRedirected(). I did see this function and considered it. But I want to capture "isRedirecting" which is the 3xx response whereas "isRedirected" I believe is a response that has already been redirected. I might be wrong. We should figure it out and make sure the naming is correct. Maybe also introduce ResourceResponse::isRedirecting(). 2. hostIsInDomain(). I agree this would be nice to have as a shared function.
John Wilander
Comment 8
2016-10-03 18:09:47 PDT
Created
attachment 290542
[details]
Patch
John Wilander
Comment 9
2016-10-03 18:10:36 PDT
Fixed a failing test case where a web socket was constructed in a detached frame.
John Wilander
Comment 10
2016-10-03 19:03:53 PDT
r206763
seems to be failing on Windows.
John Wilander
Comment 11
2016-10-05 10:50:24 PDT
Created
attachment 290725
[details]
Patch
John Wilander
Comment 12
2016-10-05 10:51:09 PDT
Fixed Windows build error.
Alex Christensen
Comment 13
2016-10-05 12:10:22 PDT
Comment on
attachment 290725
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=290725&action=review
> Source/WebCore/ChangeLog:9 > + No new tests. The counting is based on top privately owned domains > + which currently is not supported by layout tests nor API tests.
We need to make a way to synthesize things to test this. We can't have such a large feature completely untested.
> Source/WebCore/loader/ResourceLoadObserver.cpp:64 > +static inline bool is3xxRedirect(const ResourceResponse& response) > { > - if (!Settings::resourceLoadStatisticsEnabled()) > - return; > + return response.httpStatusCode() / 100 == 3; > +}
I'd prefer range checks >= 300 && <= 399 Do we want to support unknown status codes, like 366 (which has no defined meaning)?
> Source/WebCore/loader/ResourceLoadObserver.cpp:68 > + // Err on the safe side until we have sorted out what to do in worker contexts
This should be a FIXME. Workers should have started with something that was once connected to a page. We'll probably need to pipe something refcounted around like I did in
https://trac.webkit.org/changeset/202930
> Source/WebCore/loader/ResourceLoadObserver.cpp:231 > + const URL& mainFrameURL = frame ? frame->mainFrame().document()->url() : URL();
We already did an early return if frame were null.
> Source/WebCore/loader/ResourceLoadObserver.cpp:278 > String ResourceLoadObserver::primaryDomain(const URL& url)
This logic ought to be in URL and it should definitely have unit tests. It should just return a substring of URL.m_string instead of using a StringBuilder to reconstruct a String we've split apart.
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:88 > + vectorLength += pow(subresourceUnderTopFrameOriginsCount, 2); > + vectorLength += pow(subresourceUniqueRedirectsToCount, 2); > + vectorLength += pow(subframeUnderTopFrameOriginsCount, 2);
Don't use pow(x, 2). Just use x * x.
John Wilander
Comment 14
2016-10-05 12:37:18 PDT
Created
attachment 290740
[details]
Patch
John Wilander
Comment 15
2016-10-05 12:42:10 PDT
Thanks for the review, Alex! Fixed all of it. Some comments inline below. (In reply to
comment #13
)
> Comment on
attachment 290725
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=290725&action=review
> > > Source/WebCore/ChangeLog:9 > > + No new tests. The counting is based on top privately owned domains > > + which currently is not supported by layout tests nor API tests. > > We need to make a way to synthesize things to test this. We can't have such > a large feature completely untested.
I agree fully and I do have a plan.
> > Source/WebCore/loader/ResourceLoadObserver.cpp:64 > > +static inline bool is3xxRedirect(const ResourceResponse& response) > > { > > - if (!Settings::resourceLoadStatisticsEnabled()) > > - return; > > + return response.httpStatusCode() / 100 == 3; > > +} > > I'd prefer range checks >= 300 && <= 399
Fixed.
> Do we want to support unknown status codes, like 366 (which has no defined > meaning)?
I believe our network stack handles all of HTTP 300-399 as redirects. That's why I did it this way.
> > Source/WebCore/loader/ResourceLoadObserver.cpp:68 > > + // Err on the safe side until we have sorted out what to do in worker contexts > > This should be a FIXME. Workers should have started with something that was > once connected to a page. We'll probably need to pipe something refcounted > around like I did in
https://trac.webkit.org/changeset/202930
Fixed.
> > Source/WebCore/loader/ResourceLoadObserver.cpp:231 > > + const URL& mainFrameURL = frame ? frame->mainFrame().document()->url() : URL(); > > We already did an early return if frame were null.
Fixed.
> > Source/WebCore/loader/ResourceLoadObserver.cpp:278 > > String ResourceLoadObserver::primaryDomain(const URL& url) > > This logic ought to be in URL and it should definitely have unit tests. It > should just return a substring of URL.m_string instead of using a > StringBuilder to reconstruct a String we've split apart.
Although that code didn't really change with this patch I agree it should have tests. So I removed it instead. WebKit ports without public suffix list support can decide how to go about this if they want to use the feature.
> > Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:88 > > + vectorLength += pow(subresourceUnderTopFrameOriginsCount, 2); > > + vectorLength += pow(subresourceUniqueRedirectsToCount, 2); > > + vectorLength += pow(subframeUnderTopFrameOriginsCount, 2); > > Don't use pow(x, 2). Just use x * x.
Fixed.
Alex Christensen
Comment 16
2016-10-06 09:21:11 PDT
Comment on
attachment 290740
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=290740&action=review
> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:85 > + // Vector length for n dimensions is sqrt(a^2 + (...) + n^2). > + double vectorLength = 0;
I'm not sure this comment is necessary. I think it would be obvious if we did something like this: double vectorLength = sqrt( a * a + b * b + c * c);
John Wilander
Comment 17
2016-10-06 10:19:51 PDT
Comment on
attachment 290740
[details]
Patch Thanks, Alex! I'll revisit the vector bit in a subsequent patch since we have more logic to add there anyway.
WebKit Commit Bot
Comment 18
2016-10-06 10:42:43 PDT
Comment on
attachment 290740
[details]
Patch Clearing flags on attachment: 290740 Committed
r206869
: <
http://trac.webkit.org/changeset/206869
>
WebKit Commit Bot
Comment 19
2016-10-06 10:42:47 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug