Bug 207523 - Expose prevalent domains on a per-page basis
Summary: Expose prevalent domains on a per-page basis
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: katherine_cheney
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-02-10 17:08 PST by katherine_cheney
Modified: 2020-02-19 09:11 PST (History)
12 users (show)

See Also:


Attachments
Patch (32.88 KB, patch)
2020-02-10 18:13 PST, katherine_cheney
no flags Details | Formatted Diff | Diff
Patch (32.95 KB, patch)
2020-02-11 08:16 PST, katherine_cheney
no flags Details | Formatted Diff | Diff
Patch (41.57 KB, patch)
2020-02-11 17:43 PST, katherine_cheney
no flags Details | Formatted Diff | Diff
Patch (41.80 KB, patch)
2020-02-11 17:50 PST, katherine_cheney
no flags Details | Formatted Diff | Diff
Patch (41.93 KB, patch)
2020-02-12 08:11 PST, katherine_cheney
no flags Details | Formatted Diff | Diff
Patch (43.54 KB, patch)
2020-02-12 11:38 PST, katherine_cheney
no flags Details | Formatted Diff | Diff
Patch (43.58 KB, patch)
2020-02-12 11:55 PST, katherine_cheney
no flags Details | Formatted Diff | Diff
Patch (40.70 KB, patch)
2020-02-12 16:10 PST, katherine_cheney
no flags Details | Formatted Diff | Diff
Patch (41.79 KB, patch)
2020-02-13 17:07 PST, katherine_cheney
no flags Details | Formatted Diff | Diff
Patch for landing (41.69 KB, patch)
2020-02-13 18:43 PST, katherine_cheney
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description katherine_cheney 2020-02-10 17:08:55 PST
There should be an API for prevalent domains on a per page basis.
Comment 1 katherine_cheney 2020-02-10 17:09:16 PST
rdar://problem/59270758
Comment 2 katherine_cheney 2020-02-10 18:13:27 PST
Created attachment 390332 [details]
Patch
Comment 3 katherine_cheney 2020-02-11 08:16:33 PST
Created attachment 390370 [details]
Patch
Comment 4 Chris Dumez 2020-02-11 08:24:15 PST
Comment on attachment 390370 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390370&action=review

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:302
> +        updatePrevalentDomains(request.url());

For every sub-resource load, we may now send a NetworkProcessProxy::UpdatePrevalentDomains() to the UIProcess. Considering that there may be hundreds of sub-resources loads in a single page, many of them for the same registrable domain, this seems excessive.

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:342
> +    const auto domain = RegistrableDomain(url);

We usually don't use const locals in WebKit.

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:413
> +        completionHandler({ });

Shouldn't this be completionHandler(nil); ?

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:417
> +    auto webPageProxy = [webView _page];

auto*

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:419
> +        completionHandler({ });

Shouldn't this be completionHandler(nil); ?

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:436
> +    auto webPageProxy = [webView _page];

auto*

> Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:521
> +void NetworkProcessProxy::updatePrevalentDomains(WebPageProxyIdentifier pageID, RegistrableDomain domain)

RegistrableDomain&& since it is coming from IPC and you're moving it inside the implementation.

> Source/WebKit/UIProcess/WebPageProxy.cpp:9859
> +void WebPageProxy::updatePrevalentDomains(WebCore::RegistrableDomain&& domain)

WebCore:: is likely not needed.

> Source/WebKit/UIProcess/WebPageProxy.h:1688
> +    HashSet<WebCore::RegistrableDomain>& prevalentDomains() { return m_prevalentDomains; }

should this be a const getter returning a const HashSet?
Comment 5 Chris Dumez 2020-02-11 08:29:22 PST
Comment on attachment 390370 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390370&action=review

> Tools/WebKitTestRunner/InjectedBundle/InjectedBundle.cpp:419
> +                domains.append(toWTFString(static_cast<WKStringRef>(item)));

You can use uncheckedAppend() since you called reserveInitialCapacity()

> Tools/WebKitTestRunner/InjectedBundle/InjectedBundle.cpp:422
> +        m_testRunner->callDidReceivePrevalentDomainsCallback(domains);

WTFMove(domains)

> Tools/WebKitTestRunner/InjectedBundle/TestRunner.cpp:2361
> +    WKBundlePostSynchronousMessage(InjectedBundle::singleton().bundle(), messageName.get(), nullptr, nullptr);

No, this should not be a synchronous message. It should be an asynchronous one since you're taking a callback.

> Tools/WebKitTestRunner/InjectedBundle/TestRunner.cpp:2364
> +void TestRunner::callDidReceivePrevalentDomainsCallback(Vector<String>& domains)

Vector<String>&&

> Tools/WebKitTestRunner/InjectedBundle/TestRunner.cpp:2370
> +    stringBuilder.appendLiteral("[");

append('[')

> Tools/WebKitTestRunner/InjectedBundle/TestRunner.cpp:2371
> +    bool firstDomain = true;

Boolean variables need a prefix: isFirstDomain for e.g.

> Tools/WebKitTestRunner/InjectedBundle/TestRunner.cpp:2381
> +    stringBuilder.appendLiteral("]");

append(']')

> Tools/WebKitTestRunner/TestInvocation.cpp:938
> +    if (WKStringIsEqualToUTF8CString(messageName, "GetPrevalentDomains")) {

Should not be in didReceiveSynchronousMessage*(), use the async message version please.

> Tools/WebKitTestRunner/TestInvocation.h:89
> +    void didReceivePrevalentDomains(Vector<String>& domains);

Vector<String>&&

> Tools/WebKitTestRunner/cocoa/TestControllerCocoa.mm:346
> +        m_currentInvocation->didReceivePrevalentDomains(domains);

WTFMove(domains)
Comment 6 katherine_cheney 2020-02-11 08:31:35 PST
(In reply to Chris Dumez from comment #4)
> Comment on attachment 390370 [details]
> Patch
> 

Thanks for the comments! I'll make these changes. 

> View in context:
> https://bugs.webkit.org/attachment.cgi?id=390370&action=review
> 
> > Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:302
> > +        updatePrevalentDomains(request.url());
> 
> For every sub-resource load, we may now send a
> NetworkProcessProxy::UpdatePrevalentDomains() to the UIProcess. Considering
> that there may be hundreds of sub-resources loads in a single page, many of
> them for the same registrable domain, this seems excessive.
> 

Do you think a good solution is to batch in a HashSet and send every few seconds, like WebResourceLoadObserver?
Comment 7 katherine_cheney 2020-02-11 17:43:54 PST
Created attachment 390472 [details]
Patch
Comment 8 katherine_cheney 2020-02-11 17:46:24 PST
(In reply to Chris Dumez from comment #5)
> Comment on attachment 390370 [details]
> Patch
> 

Thanks for the comments! 

(In reply to Chris Dumez from comment #4)
> Comment on attachment 390370 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=390370&action=review
> 

> > Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:417
> > +    auto webPageProxy = [webView _page];
> 
> auto*
> 

This one gave me a compiler error, but I addressed the rest.
Comment 9 katherine_cheney 2020-02-11 17:50:01 PST
Created attachment 390474 [details]
Patch
Comment 10 katherine_cheney 2020-02-12 08:11:33 PST
Created attachment 390524 [details]
Patch
Comment 11 Chris Dumez 2020-02-12 08:29:02 PST
Comment on attachment 390474 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390474&action=review

This patch seems to completely ignore loads served by the memory cache and ping/beacon loads, which seems wrong? I think you likely want centralized logging in CachedResourceLoader::requestResource() which I believe is our bottleneck *before* the memory cache (Youenn can confirm if this would cover everything or not). I think it would cover everything.

> Source/WebKit/NetworkProcess/NetworkConnectionToWebProcess.cpp:791
> +void NetworkConnectionToWebProcess::checkForPrevalentSubresourceLoad(WebPageProxyIdentifier pageID, RegistrableDomain&& domain)

Please use webPageProxyID instead of pageID. We stopped using pageID now that we have 2 identifier types for pages (one for WebPageProxy and another for WebPage/Page).

> Source/WebKit/NetworkProcess/NetworkConnectionToWebProcess.cpp:797
> +                    m_networkProcess->parentProcessConnection()->send(Messages::NetworkProcessProxy::UpdatePrevalentDomains(pageID, domain), 0);

You should be able to call send() on m_networkProcess directly since NetworkProcess class subclasses IPC::MessageSender.

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:417
> +    auto webPageProxy = [webView _page];

auto*, assuming it returns a raw pointer.

> Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:1151
> +void NetworkProcessProxy::updatePrevalentDomains(WebKit::WebPageProxyIdentifier pageID, WebCore::RegistrableDomain&& domain)

WebKit:: is unnecessary. WebCore:: likely is too.

> Source/WebKit/UIProcess/Network/NetworkProcessProxy.messages.in:53
> +    UpdatePrevalentDomains(WebKit::WebPageProxyIdentifier pageIdentifier, WebCore::RegistrableDomain domain)

Shouldn't this be called AddPrevalentDomainForPage() ?

> Source/WebKit/UIProcess/WebPageProxy.cpp:9860
> +void WebPageProxy::updatePrevalentDomains(RegistrableDomain&& domain)

Seems like this should be named addPrevalentDomain().

> Source/WebKit/UIProcess/WebPageProxy.cpp:9868
> +    send(Messages::WebPage::ClearLoadedSubDomains());

What's a SubDomain? I don't think this is a term we normally use in WebKit. We use registrable domain, domain or host. Here I assume you mean registrable domain.

> Source/WebKit/UIProcess/WebPageProxy.h:1548
> +    HashSet<WebCore::RegistrableDomain>& prevalentDomains() { return m_prevalentDomains; }

As previously commented, this getter should likely be const and return a const HashSet<>&.

> Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:108
>  void WebLoaderStrategy::loadResource(Frame& frame, CachedResource& resource, ResourceRequest&& request, const ResourceLoaderOptions& options, CompletionHandler<void(RefPtr<SubresourceLoader>&&)>&& completionHandler)

How about ping loads (or beacons), do you intend to log those too? If so, you likely need to add logging to startPingLoad()

> Source/WebKit/WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:908
> +    webPage->updateLoadedSubDomains(RegistrableDomain(request.url()));

This is not quite right, at this point we have not started to load the domain yet and the client may refuse to proceed with the load via the policy delegate. Odds are that you want to do this in didStartProvisionalLoad().
Also, you are again using an odd naming for this method. Seems to me it should be called something like addLoadedRegistrableDomain().

Also you should check but I believe decidePolicyForNavigationAction() / didStartProvisionalLoadForFrame() and WebLoaderStrategy::loadResource() are all called for main resources. Therefore, your code in WebLoaderStrategy::loadResource() may be sufficient. I believe WebLoaderStrategy::loadResource() is called for all async resource loads (main and sub resources). Not sure which domains you are attempting to log exactly (I assume all).

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6672
> +    if (!m_loadedSubDomains.contains(domain)) {

You're doing a double hash lookup here, once in contains, and then again in add(). You should call add() and use the returned AddResult to check it is was new or not.

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6674
> +        WebProcess::singleton().ensureNetworkProcessConnection().connection().send(Messages::NetworkConnectionToWebProcess::CheckForPrevalentSubresourceLoad(m_webPageProxyIdentifier, domain), 0);

You're calling this "Check" but it does not actually expect a response. Seems like you are "Reporting" instead.
Comment 12 Chris Dumez 2020-02-12 08:29:21 PST
Comment on attachment 390524 [details]
Patch

See comments on previous patch.
Comment 13 katherine_cheney 2020-02-12 08:35:35 PST
(In reply to Chris Dumez from comment #11)
> Comment on attachment 390474 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=390474&action=review
> 
> This patch seems to completely ignore loads served by the memory cache and
> ping/beacon loads, which seems wrong? I think you likely want centralized
> logging in CachedResourceLoader::requestResource() which I believe is our
> bottleneck *before* the memory cache (Youenn can confirm if this would cover
> everything or not). I think it would cover everything.
> 

I think this covers cached loads. I say this because with the original patch logging domains in NetworkResourceLoader, the layout tests I added were failing because the iframe was in the memory cache and never made it to the Network Process. But here the tests run fine. I could be wrong though, Youenn should definitely confirm or deny.
Comment 14 Chris Dumez 2020-02-12 08:53:15 PST
(In reply to katherine_cheney from comment #13)
> (In reply to Chris Dumez from comment #11)
> > Comment on attachment 390474 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=390474&action=review
> > 
> > This patch seems to completely ignore loads served by the memory cache and
> > ping/beacon loads, which seems wrong? I think you likely want centralized
> > logging in CachedResourceLoader::requestResource() which I believe is our
> > bottleneck *before* the memory cache (Youenn can confirm if this would cover
> > everything or not). I think it would cover everything.
> > 
> 
> I think this covers cached loads. I say this because with the original patch
> logging domains in NetworkResourceLoader, the layout tests I added were
> failing because the iframe was in the memory cache and never made it to the
> Network Process. But here the tests run fine. I could be wrong though,
> Youenn should definitely confirm or deny.


You are NOT covering *memory* cache loads. The only improvement in your new patch is that you are now covering *disk* cache loads.
Comment 15 youenn fablet 2020-02-12 09:07:11 PST
Comment on attachment 390524 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390524&action=review

> Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:112
> +    auto* webPage = webFrame ? webFrame->page() : nullptr;

Ping/beacon loads nowadays should go through this code path.
This is indeed missing all loads that are served by MemoryCache (content served directly by an existing CachedResource).
CachedResourceLoader::requestResource is the place for all loads.

At the same time, if it is in the memory cache, we should have already queried network process to know whether that is a prevalent resource.
I guess we might miss the case where a cached resource, not prevalent at the time, is becoming prevalent after this load.
Not sure whether this has an impact here.

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6670
> +void WebPage::updateLoadedSubDomains(WebCore::RegistrableDomain domain)

s/WebCore::RegistrableDomain/RegistrableDomain&&/ or const&

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6674
> +        WebProcess::singleton().ensureNetworkProcessConnection().connection().send(Messages::NetworkConnectionToWebProcess::CheckForPrevalentSubresourceLoad(m_webPageProxyIdentifier, domain), 0);

We will send two IPC messages for every request (at least those that are not prevalent).
If we do not need to check for cached resources, maybe we could just send the regular request and add an option to the regular request to query the prevalent load.
Comment 16 Chris Dumez 2020-02-12 09:41:50 PST
Comment on attachment 390524 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390524&action=review

> Source/WebKit/UIProcess/WebPageProxy.cpp:9867
> +    m_prevalentDomains.clear();

Shouldn't we clear these when the page commits a new main frame load (i.e. the user navigated away)?

>> Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:112
>> +    auto* webPage = webFrame ? webFrame->page() : nullptr;
> 
> Ping/beacon loads nowadays should go through this code path.
> This is indeed missing all loads that are served by MemoryCache (content served directly by an existing CachedResource).
> CachedResourceLoader::requestResource is the place for all loads.
> 
> At the same time, if it is in the memory cache, we should have already queried network process to know whether that is a prevalent resource.
> I guess we might miss the case where a cached resource, not prevalent at the time, is becoming prevalent after this load.
> Not sure whether this has an impact here.

I was referring to this branch in CachedResource::load():
if (m_options.keepAlive && shouldUsePingLoad(type()) && platformStrategies()->loaderStrategy()->usePingLoad()) {
}
Comment 17 Brent Fulgham 2020-02-12 09:42:07 PST
Comment on attachment 390524 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390524&action=review

> Source/WebKit/ChangeLog:5
> +        rdar://problem/59270758

Nit: <rdar://problem/59270758>

>> Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:112
>> +    auto* webPage = webFrame ? webFrame->page() : nullptr;
> 
> Ping/beacon loads nowadays should go through this code path.
> This is indeed missing all loads that are served by MemoryCache (content served directly by an existing CachedResource).
> CachedResourceLoader::requestResource is the place for all loads.
> 
> At the same time, if it is in the memory cache, we should have already queried network process to know whether that is a prevalent resource.
> I guess we might miss the case where a cached resource, not prevalent at the time, is becoming prevalent after this load.
> Not sure whether this has an impact here.

I think we do want to capture information about prevalent resources that were requested from cached content, even though we would have taken corrective action. Ideally, a cold page load and a warm load from the cache would produce the same set of prevalent resources encountered. Of course, unclassified domains that later became classified would introduce some variance, but I think that is acceptable for this use case.

>> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6674
>> +        WebProcess::singleton().ensureNetworkProcessConnection().connection().send(Messages::NetworkConnectionToWebProcess::CheckForPrevalentSubresourceLoad(m_webPageProxyIdentifier, domain), 0);
> 
> We will send two IPC messages for every request (at least those that are not prevalent).
> If we do not need to check for cached resources, maybe we could just send the regular request and add an option to the regular request to query the prevalent load.

Maybe we could collect these domains as the sub resource loads occur, then send one message when the page load completes?
Comment 18 youenn fablet 2020-02-12 10:04:11 PST
I missed the fact that the hash set is for subdomains, not prevalent subdomains.
Might not be a big deal then.
Comment 19 katherine_cheney 2020-02-12 11:38:30 PST
Created attachment 390541 [details]
Patch
Comment 20 katherine_cheney 2020-02-12 11:39:26 PST
Thanks for the feedback everyone! This patch addresses all comments except for this one (from Chris):

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:417
> +    auto webPageProxy = [webView _page];


auto*, assuming it returns a raw pointer.

I get error: variable 'webPageProxy' with type 'auto *' has incompatible initializer of type 'NakedPtr<WebKit::WebPageProxy>’
Comment 21 katherine_cheney 2020-02-12 11:55:38 PST
Created attachment 390542 [details]
Patch
Comment 22 Chris Dumez 2020-02-12 13:15:01 PST
Comment on attachment 390542 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390542&action=review

This is much better but I am not sure I understand why we want to store these registrable domains in the UIProcess at all.

> Source/WebCore/loader/FrameLoaderClient.h:37
> +#include "RegistrableDomain.h"

Can be forward-declared instead.

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:794
> +    m_documentLoader->frameLoader()->client().addPrevalentDomain(RegistrableDomain(request.resourceRequest().url()));

Seems like you could get the frame loader like so:
frame->loader()

since you already have the frame. Seems more straightforward

> Source/WebKit/NetworkProcess/NetworkConnectionToWebProcess.cpp:791
> +void NetworkConnectionToWebProcess::reportSubresourceLoad(WebPageProxyIdentifier webPageProxyID, RegistrableDomain&& domain)

I don't really like the name, seems too generic. Maybe something like reportSubresourceLoadToDomain()

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStorePrivate.h:65
> +- (void)_getPrevalentDomainsFor:(WKWebView *)webView completionHandler:(void (^)(NSArray<NSString *> *domains))completionHandler WK_API_AVAILABLE(macos(WK_MAC_TBA), ios(WK_IOS_TBA));

Do we expect this to get called often? Given that this takes in a completion handler, another (simpler) approach would have been to keep the registrable domains on the WebPage (not the WebPageProxy) and then send an async IPC to the WebProcess to get the list of Registrable domains. This would avoid the awkward IPC from the NetworkProcess to the UIProcess to tell it to add a registrable domain for a WebPage for e.g. This would also simplify invalidation of registrable domains when necessary.

> Source/WebKit/UIProcess/WebPageProxy.cpp:4529
> +    if (frame->isMainFrame())

Can you reuse the if (frame->isMainFrame()) { block at line 4570.

> Source/WebKit/WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:163
> +    webPage->addLoadedRegistrableDomain(WTFMove(domain));

Something weird here: addPrevalentDomain() calls addLoadedRegistrableDomain() ??
Comment 23 katherine_cheney 2020-02-12 16:10:50 PST
Created attachment 390580 [details]
Patch
Comment 24 Chris Dumez 2020-02-13 14:55:40 PST
Comment on attachment 390580 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390580&action=review

Almost perfect, just a few more improvements.

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:794
> +    frame.loader().client().addLoadedRegistrableDomain(RegistrableDomain(request.resourceRequest().url()));

You may need something in CachedResource::redirectReceived() too to handle redirects. Please double check to Youenn where is the best place to add the addLoadedRegistrableDomain() for handling redirect cases.

> Source/WebKit/NetworkProcess/NetworkConnectionToWebProcess.cpp:791
> +void NetworkConnectionToWebProcess::isPrevalentSubresourceLoad(WebPageProxyIdentifier webPageProxyID, RegistrableDomain&& domain, CompletionHandler<void(bool)>&& completionHandler)

webPageProxyID parameter is unused and should be dropped.

> Source/WebKit/NetworkProcess/NetworkConnectionToWebProcess.cpp:801
> +#endif

Please add a blank line after this one.

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:423
> +#if ENABLE(RESOURCE_LOAD_STATISTICS)

Why isn't this the first line of this method?

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:424
> +    webPageProxy->getPrevalentDomains([completionHandler = makeBlockPtr(completionHandler)] (HashSet<WebCore::RegistrableDomain> prevalentDomains) {

HashSet<WebCore::RegistrableDomain>&&

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:428
> +            apiDomains.uncheckedAppend(API::String::create(domain.string()));

WTFMove(domain.string())

but you will need to add this to RegistrableDomain:
String& string() { return m_registrableDomain; }

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:445
> +#if ENABLE(RESOURCE_LOAD_STATISTICS)

Why isn't this the first line of this method?

> Source/WebKit/UIProcess/WebPageProxy.cpp:4567
> +        clearPrevalentDomains();

This is weird, it should be in the didCommitLoadForFrame() in the WebProcess instead, now that we don't store them in the UIProcess. It does unnecessary IPC otherwise.

> Source/WebKit/WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:159
> +    WebPage* webPage = m_frame->page();

auto*

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6593
> +    clearPrevalentDomains();

Not needed anymore now that things are only stored in the WebProcess. You want this in didCommitLoadForFrame() when we are the main frame though (in WebProcess side).

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6673
> +void WebPage::addLoadedRegistrableDomain(WebCore::RegistrableDomain&& domain)

I doubt WebCore:: is really needed.

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6677
> +        WebProcess::singleton().ensureNetworkProcessConnection().connection().sendWithAsyncReply(Messages::NetworkConnectionToWebProcess::IsPrevalentSubresourceLoad(m_webPageProxyIdentifier, domain), [this, protectedThis = makeRefPtr(*this), domain] (bool isPrevalent) {

m_webPageProxyIdentifier parameter is unnecessary.
Comment 25 Chris Dumez 2020-02-13 15:42:57 PST
Comment on attachment 390580 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390580&action=review

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:427
> +        for (auto& domain : prevalentDomains)

Youenn speaking through my hands:
"this should use WTF::map(prevalentDomains, [](auto& domain) {
    return API::String::create(WTFMove(domain.string()))
});"
Comment 26 katherine_cheney 2020-02-13 17:07:45 PST
Created attachment 390706 [details]
Patch
Comment 27 Chris Dumez 2020-02-13 17:56:17 PST
Comment on attachment 390706 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390706&action=review

r=me with changes

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:794
> +    frame.loader().client().addLoadedRegistrableDomain(RegistrableDomain(request.resourceRequest().url()));

I think this should be moved towards the end of the method to take into account content extensions/blockers for example (we may block the load or update the url). Probably by this line at the end:
 m_documentResources.set(resource->url(), resource);

> Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm:426
> +        apiDomains = WTF::map(prevalentDomains, [](auto& domain) {

this can be on the previous line:
Vector<RefPtr<API::Object>> apiDomains = WTF::map();

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6678
> +        WebProcess::singleton().ensureNetworkProcessConnection().connection().sendWithAsyncReply(Messages::NetworkConnectionToWebProcess::IsPrevalentSubresourceLoad(domain), [this, protectedThis = makeRefPtr(*this), domain] (bool isPrevalent) {

makeRef(), not makeRefPtr()

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:6687
> +    

nit: extra blank line
Comment 28 katherine_cheney 2020-02-13 18:43:37 PST
Created attachment 390715 [details]
Patch for landing
Comment 29 katherine_cheney 2020-02-13 18:44:05 PST
Thanks Chris and Youenn!
Comment 30 WebKit Commit Bot 2020-02-13 19:40:23 PST
The commit-queue encountered the following flaky tests while processing attachment 390715 [details]:

editing/spelling/spellcheck-attribute.html bug 206178 (authors: g.czajkowski@samsung.com, mark.lam@apple.com, and rniwa@webkit.org)
The commit-queue is continuing to process your patch.
Comment 31 WebKit Commit Bot 2020-02-13 19:41:11 PST
Comment on attachment 390715 [details]
Patch for landing

Clearing flags on attachment: 390715

Committed r256583: <https://trac.webkit.org/changeset/256583>
Comment 32 WebKit Commit Bot 2020-02-13 19:41:13 PST
All reviewed patches have been landed.  Closing bug.
Comment 33 Radar WebKit Bug Importer 2020-02-13 19:42:17 PST
<rdar://problem/59448257>
Comment 34 Truitt Savell 2020-02-19 09:11:10 PST
The new test http/tests/resourceLoadStatistics/prevalent-domains-per-page-database.html

is a flaky timeout on iOS, tracking in https://bugs.webkit.org/show_bug.cgi?id=207944