Bug 155242

Summary: Move prevalent resource classifier from WebCore to WebKit
Product: WebKit Reporter: John Wilander <wilander>
Component: WebKit2Assignee: John Wilander <wilander>
Status: RESOLVED FIXED    
Severity: Normal CC: aestes, bfulgham, cdumez, commit-queue, japhet, webkit-bug-importer, wilander, zalan
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch none

Description John Wilander 2016-03-09 11:06:02 PST
Move prevalent resource classifier from WebCore to WebKit after the WebKit2 implementation.
Comment 1 John Wilander 2016-03-09 11:06:45 PST
rdar://problem/24913272
Comment 2 John Wilander 2016-03-09 11:21:45 PST
Created attachment 273445 [details]
Patch
Comment 3 WebKit Commit Bot 2016-03-09 11:24:15 PST
Attachment 273445 [details] did not pass style-queue:


ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:56:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:57:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:58:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 3 in 7 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 John Wilander 2016-03-09 12:48:14 PST
Created attachment 273462 [details]
Patch
Comment 5 John Wilander 2016-03-09 12:49:11 PST
Filed https://bugs.webkit.org/show_bug.cgi?id=155243 regarding the style errors.
Comment 6 WebKit Commit Bot 2016-03-09 12:49:54 PST
Attachment 273462 [details] did not pass style-queue:


ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:56:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:57:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:58:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 3 in 7 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 7 John Wilander 2016-03-09 13:15:28 PST
Created attachment 273464 [details]
Patch
Comment 8 WebKit Commit Bot 2016-03-09 13:16:52 PST
Attachment 273464 [details] did not pass style-queue:


ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:57:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:58:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:59:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 3 in 7 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 John Wilander 2016-03-09 13:17:18 PST
The classifier needs to include StringHash.h except for on debug builds.
Comment 10 Brent Fulgham 2016-03-09 13:48:33 PST
Comment on attachment 273464 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273464&action=review

This is looking pretty good, but it needs a bit more work.

> Source/WebCore/ChangeLog:4
> +        rdar://problem/24913272

We write this as <rdar://problem/24913272>

> Source/WebCore/ChangeLog:5
> +        https://bugs.webkit.org/show_bug.cgi?id=155242

Bugzilla bug goes BEFORE the corresponding radar. Most other people don't care about the Radar number! :-)

> Source/WebCore/ChangeLog:24
> +2016-03-09  John Wilander  <wilander@apple.com>

Whoops! It looks like you accidentally ran "prepare-ChangeLog" twice. Please delete the duplicate data below.

> Source/WebKit2/ChangeLog:4
> +        rdar://problem/24913272

Please put inside <>

> Source/WebKit2/ChangeLog:5
> +        https://bugs.webkit.org/show_bug.cgi?id=155242

Please move above the Radar.

> Source/WebKit2/ChangeLog:18
> +2016-03-09  John Wilander  <wilander@apple.com>

Please get rid of this duplicate ChangeLog info.

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:61
> +    Vector<String> prevalentResourcesWithUserInteraction;

This could just be Vector<String> prevalentResources, prevalentResourcesWithUserInteraction;

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:52
> +        if (resourceStatistic.isPrevalentResource && resourceStatistic.hadUserInteraction) {

We don't use curly braces when there is only one line in the conditional clause.

This might read better as follows:

if (resourceStatistics.isPrevalentResource) {
    if (resourceStatistic.hadUserInteraction)
        prevalentResourcesWithUserInteraction.append(resourceStatistic.highLevelDomain);
    else
        prevalentResources.append(resourceStatistic.highLevelDomain);
    continue;
}

... then you can get rid of the 'else' at line 56.

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:55
> +            prevalentResources.append(resourceStatistic.highLevelDomain);

Ditto.

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:56
> +        } else if (resourceStatistic.subframeUnderTopFrameOrigins.size() > subframeUnderTopFrameOriginsThreshold

... this could just be:

if (hasPrevalentResourceCharacteristics(resourceStatistics) {
   ... stuff ...
}

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:59
> +                   || resourceStatistic.redirectedToOtherPrevalentResourceOrigins.size() > redirectedToOtherPrevalentResourceOriginsThreshold) {

I liked this much better as a predicate:

static bool hasPrevalentResourceCharacteristics(const ResourceStatistics& ...), like when it was in WebCore. Please change that back.

This is also the cause of your stylebot failures. The "||" should align with one indent below the '}' in the "} else if" statement, not the "if".

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.h:38
> +class WebResourcePrevalenceClassifier {

It doesn't make sense for this to be a class when its only purpose is to be a holder for a single static method.

Instead, I think this should just be a namespace containing one static function.
Comment 11 Brent Fulgham 2016-03-09 13:54:39 PST
Note: GTK (and probably EFL) is failing because they don't know what the new WebResourcePrevalenceClassifier is:

error: undefined reference to 'WebKit::WebResourcePrevalenceClassifier::classify

Please add it to the the three CMake files where the "UIProcess/WebResourceLoadStatisticsStore.cpp" files are added:

PlatformEfl.cmake:    UIProcess/WebResourceLoadStatisticsStore.cpp
PlatformGTK.cmake:    UIProcess/WebResourceLoadStatisticsStore.cpp
PlatformMac.cmake:    UIProcess/WebResourceLoadStatisticsStore.cpp
Comment 12 John Wilander 2016-03-09 14:02:42 PST
Created attachment 273470 [details]
Patch
Comment 13 WebKit Commit Bot 2016-03-09 14:03:57 PST
Attachment 273470 [details] did not pass style-queue:


ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:57:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:58:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:59:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 3 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 14 John Wilander 2016-03-09 14:05:44 PST
Created attachment 273471 [details]
Patch
Comment 15 WebKit Commit Bot 2016-03-09 14:06:30 PST
Attachment 273471 [details] did not pass style-queue:


ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:57:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:58:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:59:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
Total errors found: 3 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 16 John Wilander 2016-03-09 14:13:50 PST
Created attachment 273473 [details]
Patch
Comment 17 Brent Fulgham 2016-03-09 14:53:39 PST
Comment on attachment 273473 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273473&action=review

You did not correct the ChangeLog per my earlier review comments. r- Until you address those issues.

> Source/WebCore/ChangeLog:4
> +        rdar://problem/24913272

Please place this inside <>.

> Source/WebCore/ChangeLog:5
> +        https://bugs.webkit.org/show_bug.cgi?id=155242

Please move this above the Radar.

> Source/WebKit2/ChangeLog:4
> +        rdar://problem/24913272

Please place this inside <>.

> Source/WebKit2/ChangeLog:5
> +        https://bugs.webkit.org/show_bug.cgi?id=155242

This needs to go above the Radar.

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:56
> +        } else if (resourceStatistic.subframeUnderTopFrameOrigins.size() > subframeUnderTopFrameOriginsThreshold

You did not address my request that you make these four lines of "||" into a static predicate method.

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.h:38
> +class WebResourcePrevalenceClassifier {

You did not address my comment that a class with a single static method is not the right approach. We are not programming in Java!

This should be a namespace holding this static method.
Comment 18 Andy Estes 2016-03-09 17:07:20 PST
Comment on attachment 273473 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273473&action=review

> Source/WebCore/loader/ResourceLoadStatisticsStore.h:33
> +namespace WebKit {
> +class WebResourceLoadStatisticsStore;
> +}

This is a layering violation. WebCore shouldn't know things about WebKit. We need to find another solution.

>> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.h:38
>> +class WebResourcePrevalenceClassifier {
> 
> You did not address my comment that a class with a single static method is not the right approach. We are not programming in Java!
> 
> This should be a namespace holding this static method.

Or just a free function in the WebKit namespace :)
Comment 19 John Wilander 2016-03-09 17:16:41 PST
Created attachment 273513 [details]
Patch
Comment 20 John Wilander 2016-03-09 17:18:45 PST
Sorry, Brent! I did not see your comments between my previous patch changes. I was busy fixing the various build errors on the bots.

Andy, the layering violation is still in this patch. I'll find a way to address it.
Comment 21 Brent Fulgham 2016-03-09 20:26:02 PST
Comment on attachment 273513 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273513&action=review

Patch doesn't apply for some reason! Please do an "update-webkit" and try uploading again...

> Source/WebCore/ChangeLog:22
> +           - Made this class friend of WebKit::WebResourceLoadStatisticsStore since the latter needs to send the statistics map for classification.

ResourceLoadStatisticsStore.h has no changes anymore.

> Source/WebCore/loader/ResourceLoadStatisticsStore.h:34
> +

This isn't needed anymore.

> Source/WebCore/loader/ResourceLoadStatisticsStore.h:67
> +    friend class WebKit::WebResourceLoadStatisticsStore;

This isn't needed anymore.

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:69
> +bool hasPrevalentResourceCharacteristics(const ResourceLoadStatistics& resourceStatistic)

This should be a static function. Move it above your 'classify' function.

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:74
> +    || resourceStatistic.redirectedToOtherPrevalentResourceOrigins.size() > redirectedToOtherPrevalentResourceOriginsThreshold;

Each "||" should be indented another level.

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.h:40
> +bool hasPrevalentResourceCharacteristics(const WebCore::ResourceLoadStatistics& resourceStatistic);

This doesn't need to be exposed outside this compilation unit. Just remove this line.
Comment 22 John Wilander 2016-03-10 11:35:39 PST
Created attachment 273588 [details]
Patch
Comment 23 John Wilander 2016-03-10 11:37:46 PST
Thanks, Brent! I've addressed all your comments and changed the layering violation to a model where WebKit supplies a classifier lambda for the WebCore part to apply to its statistics map.
Comment 24 Brent Fulgham 2016-03-10 11:44:29 PST
Comment on attachment 273588 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273588&action=review

Looks good! Since this touches WK2 code we need to get Andy to do the final r+. Maybe you could adjust my minor nits and upload a final clean patch for him to review.

> Source/WebCore/loader/ResourceLoadStatisticsStore.cpp:140
> +static const unsigned minimumOriginsLoadedForProcessing = 100;

I recommend putting all of these magic numbers together at the top of the file where they can be easily tweaked.

> Source/WebCore/loader/ResourceLoadStatisticsStore.cpp:148
> +        processFunction(resourceStatistic);

Nice!

> Source/WebCore/loader/ResourceLoadStatisticsStore.h:63
> +    WEBCORE_EXPORT void processStatistics(std::function<void(ResourceLoadStatistics&)>&& processFunction);

You don't need the argument name here.

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:65
>      // TODO: Notify individual WebProcesses of prevalent domains. <rdar://problem/24703099>

For completeness, you should mention that this TODO should consume the two lists.

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.h:39
> +void classify(WebCore::ResourceLoadStatistics& resourceStatistic, Vector<String>& prevalentResources, Vector<String>& prevalentResourcesWithUserInteraction);

You can get rid of the "resourceStatistic" argument here, it doesn't provide any new information.
Comment 25 John Wilander 2016-03-10 13:06:57 PST
Created attachment 273605 [details]
Patch
Comment 26 John Wilander 2016-03-10 13:07:35 PST
Thanks again, Brent! All those comments addressed.
Comment 27 Brent Fulgham 2016-03-10 13:14:57 PST
(In reply to comment #26)
> Thanks again, Brent! All those comments addressed.

Great! It looks good. Let's get Andy to do the final review and we can land it.
Comment 28 zalan 2016-03-10 14:25:22 PST
Comment on attachment 273605 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273605&action=review

> Source/WebKit2/UIProcess/WebResourcePrevalenceClassifier.cpp:67
> +    if (resourceStatistic.isPrevalentResource) {
> +        if (resourceStatistic.hadUserInteraction)
> +            prevalentResourcesWithUserInteraction.append(resourceStatistic.highLevelDomain);
> +        else
> +            prevalentResources.append(resourceStatistic.highLevelDomain);
> +    } else if (hasPrevalentResourceCharacteristics(resourceStatistic)) {
> +        if (resourceStatistic.hadUserInteraction)
> +            prevalentResourcesWithUserInteraction.append(resourceStatistic.highLevelDomain);
> +        else {
> +            prevalentResources.append(resourceStatistic.highLevelDomain);
> +            resourceStatistic.isPrevalentResource = true;
> +        }
> +    }

I'd rather not duplicate the logic here.

if (resourceStatistic.isPrevalentResource || hasPrevalentResourceCharacteristics(resourceStatistic)) {
    if (resourceStatistic.hadUserInteraction)
        prevalentResourcesWithUserInteraction.append(resourceStatistic.highLevelDomain);
    else {
        prevalentResources.append(resourceStatistic.highLevelDomain);
        resourceStatistic.isPrevalentResource = true
    }
}
Comment 29 John Wilander 2016-03-10 14:46:45 PST
That means we will set resourceStatistic.isPrevalentResource = true over and over when it is already true but I guess that's OK. Do others agree?
Comment 30 zalan 2016-03-10 14:47:50 PST
(In reply to comment #29)
> That means we will set resourceStatistic.isPrevalentResource = true over and
> over when it is already true but I guess that's OK. Do others agree?
I think it's better than duplicating logic which is highly error prone.
Comment 31 Brent Fulgham 2016-03-10 16:22:36 PST
(In reply to comment #30)
> (In reply to comment #29)
> > That means we will set resourceStatistic.isPrevalentResource = true over and
> > over when it is already true but I guess that's OK. Do others agree?
> I think it's better than duplicating logic which is highly error prone.

That's a good point. Setting that boolean over and over is very low cost. The multiple branch statements were probably far more expensive!
Comment 32 Brent Fulgham 2016-03-10 16:23:30 PST
BTW: The iOS build error does not seem to be related to this patch.
Comment 33 Andy Estes 2016-03-10 16:24:11 PST
Comment on attachment 273605 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273605&action=review

> Source/WebCore/loader/ResourceLoadStatisticsStore.cpp:148
> +    if (m_resourceStatisticsMap.size() < minimumOriginsLoadedForProcessing)
> +        return;
> +
> +    for (auto& resourceStatistic : m_resourceStatisticsMap.values())
> +        processFunction(resourceStatistic);

Somewhat unfortunate that WebResourceLoadStatisticsStore has to repeatedly call this function even when there isn't enough data for processing. A better design would be where the processing function is registered up front and then called automatically at the appropriate times.

I'm ok with you landing this as-is, but at some point we should move ResourceLoadStatisticsStore out of WebCore and fold it into WebResourceLoadStatisticsStore. Having this class split between WebCore and WebKit is awkward and no longer seems necessary with WebKit1 out of the picture. In its place could just be an abstract class that ResourceLoadObserver calls into and that is subclassed by something in WebKit. That's the typical pattern we follow for delegating from WebCore to WebKit.

> Source/WebKit2/ChangeLog:18
> +        (WebKit::WebResourcePrevalenceClassifier::hasPrevalentResourceCharacteristics):
> +        (WebKit::WebResourcePrevalenceClassifier::classify):
> +        * UIProcess/WebResourcePrevalenceClassifier.h: Added.
> +           - Added new classifier for prevalent subresources.

A new .cpp file and namespace seem like overkill just to add a function that is only called in one place. Why not just define classify() and its helpers as static functions in WebResourceLoadStatisticsStore.cpp?

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:65
> +    // TODO: Notify individual WebProcesses of prevalent domains using the two vectors populated by the classifier. <rdar://problem/24703099>

WebKit style says to use FIXME instead of TODO.
Comment 34 John Wilander 2016-03-10 16:55:50 PST
Zalan, your comment actually caught a subtle bug. This is how it should look:
if (resourceStatistic.isPrevalentResource || hasPrevalentResourceCharacteristics(resourceStatistic)) {
    resourceStatistic.isPrevalentResource = true;
    if (resourceStatistic.hadUserInteraction)
        prevalentResourcesWithUserInteraction.append(resourceStatistic.highLevelDomain);
    else
        prevalentResources.append(resourceStatistic.highLevelDomain);
}

Andy, I like your comments. I've been onto this split of where the code resides but haven't really been able to suggest the right change. Anyhow, I'll gate the lamda call on an initial check and also get rid of the classifier files by moving the code to WebResourceLoadStatisticsStore.

New patch coming up.

Thanks!
Comment 35 John Wilander 2016-03-10 19:23:31 PST
Created attachment 273668 [details]
Patch
Comment 36 Andy Estes 2016-03-10 19:50:09 PST
Comment on attachment 273668 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273668&action=review

> Source/WebCore/loader/ResourceLoadStatisticsStore.cpp:149
> +    if (hasEnoughDataForStatisticsProcessing()) {

You only call processStatistics() when there's enough data, so I'd change this if statement to ASSERT(hasEnoughDataForStatisticsProcessing()). No need to check twice.

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:179
> +bool WebResourceLoadStatisticsStore::hasPrevalentResourceCharacteristics(const ResourceLoadStatistics& resourceStatistic)

This function doesn't access an instance of WebResourceLoadStatisticsStore, so it shouldn't be a member function. It should be a static inline free function:

    static inline bool hasPrevalentResourceCharacteristics(const ResourceLoadStatistics&) { ... }

You'll want to move the definition above the first call site to avoid having to forward-declare.

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:184
> +    return resourceStatistic.subframeUnderTopFrameOrigins.size() > subframeUnderTopFrameOriginsThreshold
> +    || resourceStatistic.subresourceUnderTopFrameOrigins.size() > subresourceUnderTopFrameOriginsThreshold
> +    || resourceStatistic.subresourceUniqueRedirectsTo.size() > subresourceHasBeenRedirectedFromToUniqueDomainsThreshold
> +    || resourceStatistic.redirectedToOtherPrevalentResourceOrigins.size() > redirectedToOtherPrevalentResourceOriginsThreshold;

Be sure to re-indent this.

> Source/WebKit2/UIProcess/WebResourceLoadStatisticsStore.cpp:187
> +void WebResourceLoadStatisticsStore::classifyPrevalentResources(ResourceLoadStatistics& resourceStatistic, Vector<String>& prevalentResources, Vector<String>& prevalentResourcesWithUserInteraction)

Previous comment about member functions applies here too.
Comment 37 John Wilander 2016-03-11 15:10:01 PST
Created attachment 273776 [details]
Patch
Comment 38 John Wilander 2016-03-11 15:10:39 PST
Thanks, Andy! All your comments covered by the new patch.
Comment 39 Andy Estes 2016-03-11 15:27:59 PST
Comment on attachment 273776 [details]
Patch

Looks good, thanks!

FYI: I r+'d the previous patch, so it would've been ok for you to just land your change directly after addressing the feedback.
Comment 40 WebKit Commit Bot 2016-03-11 16:36:07 PST
Comment on attachment 273776 [details]
Patch

Clearing flags on attachment: 273776

Committed r198055: <http://trac.webkit.org/changeset/198055>
Comment 41 WebKit Commit Bot 2016-03-11 16:36:13 PST
All reviewed patches have been landed.  Closing bug.