Bug 213291

Summary: Add support for fetching registrable domains with resource load statistics
Product: WebKit Reporter: Carlos Garcia Campos <cgarcia>
Component: WebKit2Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, aperez, bugs-noreply, webkit-bug-importer, wilander, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
aperez: review+
Patch for landing
none
Patch for landing none

Description Carlos Garcia Campos 2020-06-17 07:09:49 PDT
WebsiteDataStore::fetchData() doesn't return anything for resource load statistics because NetworkProcess::fetchWebsiteData() doesn't handle WebsiteDataType::ResourceLoadStatistics
Comment 1 Carlos Garcia Campos 2020-06-17 07:19:15 PDT
Created attachment 402105 [details]
Patch
Comment 2 John Wilander 2020-06-17 10:42:02 PDT
Comment on attachment 402105 [details]
Patch

Just as a general comment, we've struggled with the term website data here because ITP data isn't website data, i.e. it isn't data that the website creates but that WebKit creates *about* websites. That is in part why this has not been folded into the reporting structure.
Comment 3 Adrian Perez 2020-06-18 00:46:09 PDT
Comment on attachment 402105 [details]
Patch

Patch looks reasonable, but the API tests are failing. Could you take
check that up, please?
Comment 4 Carlos Garcia Campos 2020-06-18 01:00:52 PDT
(In reply to John Wilander from comment #2)
> Comment on attachment 402105 [details]
> Patch
> 
> Just as a general comment, we've struggled with the term website data here
> because ITP data isn't website data, i.e. it isn't data that the website
> creates but that WebKit creates *about* websites. That is in part why this
> has not been folded into the reporting structure.

I see. I still think it makes sense to allow the user to remove the resource load stats, like other (website) data and caches. In epiphany we always query the data  before showing the option to remove it. In the case of itp data we don't know if there's data to remove. That's how I noticed this.
Comment 5 Carlos Garcia Campos 2020-06-18 01:04:09 PDT
(In reply to Adrian Perez from comment #3)
> Comment on attachment 402105 [details]
> Patch
> 
> Patch looks reasonable, but the API tests are failing. Could you take
> check that up, please?

I'll check, tests pass for me locally. It might be flaky.
Comment 6 Carlos Garcia Campos 2020-06-18 01:08:50 PDT
The API test failure could also be because of bug #213290, if there are registrations in the default location in the system, I'll run EWS again after landing that patch.
Comment 7 Adrian Perez 2020-06-18 01:13:28 PDT
(In reply to Carlos Garcia Campos from comment #6)
> The API test failure could also be because of bug #213290, if there are
> registrations in the default location in the system, I'll run EWS again
> after landing that patch.

I suppose it would make sense to reset to a known state (in this case:
remove all) at the beginning of the API tests, before doing checks :)
Comment 8 Adrian Perez 2020-06-18 01:15:56 PDT
Comment on attachment 402105 [details]
Patch

r=me, but let's please make sure the tests pass before landing :]
Comment 9 Carlos Garcia Campos 2020-06-18 01:21:01 PDT
(In reply to Adrian Perez from comment #8)
> Comment on attachment 402105 [details]
> Patch
> 
> r=me, but let's please make sure the tests pass before landing :]

Sure, I'll also wait for WebKit2 owners to confirm they are ok with this change.
Comment 10 Adrian Perez 2020-06-18 01:22:23 PDT
(In reply to Carlos Garcia Campos from comment #9)
> (In reply to Adrian Perez from comment #8)
> > Comment on attachment 402105 [details]
> > Patch
> > 
> > r=me, but let's please make sure the tests pass before landing :]
> 
> Sure, I'll also wait for WebKit2 owners to confirm they are ok with this
> change.

Perfect, thanks!
Comment 11 youenn fablet 2020-06-18 02:16:52 PDT
Comment on attachment 402105 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=402105&action=review

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:147
> +    Vector<RegistrableDomain> allDomains() const override;

final?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.h:116
> +    Vector<RegistrableDomain> allDomains() const override;

final?

> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1375
> +        postTaskReply([domains = WTFMove(domains), completionHandler = WTFMove(completionHandler)]() mutable {

Hopping to another thread, this requires isolatedCopying domains.

> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1378
> +    });

Can we write it with a ternary, something like:
auto domains = m_statisticsStore ? m_statisticsStore->allDomains() : { };
postTaskReply([domains = crossThreadCopy(WTFMove(domains)), completionHandler = WTFMove(completionHandler)]() mutable {
    completionHandler(WTFMove(domains));
});

> Source/WebKit/NetworkProcess/NetworkProcess.cpp:1586
> +                resourceLoadStatistics->registrableDomains([callbackAggregator = callbackAggregator.copyRef()](Vector<RegistrableDomain>&& domains) mutable {

s/Vector<RegistrableDomain>/auto
Comment 12 youenn fablet 2020-06-18 02:18:33 PDT
LGTM though I'd like to understand John comment more.

(In reply to John Wilander from comment #2)
> Comment on attachment 402105 [details]
> Patch
> 
> Just as a general comment, we've struggled with the term website data here
> because ITP data isn't website data, i.e. it isn't data that the website
> creates but that WebKit creates *about* websites. That is in part why this
> has not been folded into the reporting structure.

It seems good to have a way to retrieve that information.
I guess your suggestion might be that the UI might be different and maybe the API should be different from other website data?
Comment 13 Carlos Garcia Campos 2020-06-18 05:58:42 PDT
(In reply to youenn fablet from comment #11)
> Comment on attachment 402105 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=402105&action=review
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:147
> > +    Vector<RegistrableDomain> allDomains() const override;
> 
> final?
> 
> > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.h:116
> > +    Vector<RegistrableDomain> allDomains() const override;
> 
> final?
> 
> > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1375
> > +        postTaskReply([domains = WTFMove(domains), completionHandler = WTFMove(completionHandler)]() mutable {
> 
> Hopping to another thread, this requires isolatedCopying domains.

Also when moving?

> > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1378
> > +    });
> 
> Can we write it with a ternary, something like:
> auto domains = m_statisticsStore ? m_statisticsStore->allDomains() : { };
> postTaskReply([domains = crossThreadCopy(WTFMove(domains)),
> completionHandler = WTFMove(completionHandler)]() mutable {
>     completionHandler(WTFMove(domains));
> });

Yes, I started with something like this but then I decided to follow what other similar methods in the file were doing.

> > Source/WebKit/NetworkProcess/NetworkProcess.cpp:1586
> > +                resourceLoadStatistics->registrableDomains([callbackAggregator = callbackAggregator.copyRef()](Vector<RegistrableDomain>&& domains) mutable {
> 
> s/Vector<RegistrableDomain>/auto
Comment 14 youenn fablet 2020-06-18 06:07:52 PDT
> > > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:1375
> > > +        postTaskReply([domains = WTFMove(domains), completionHandler = WTFMove(completionHandler)]() mutable {
> > 
> > Hopping to another thread, this requires isolatedCopying domains.
> 
> Also when moving?

Yes, moving will only prevent refing when capturing.
It is safe to hop to another thread with a String which counter is 1. But otherwise, like in the in memory case, the strings may be derefed in both threads.
There is no guarantee that this will be the case in your code (especially for in memory stores).
Also isolatedCopy tries to optimise the case of Strings that are moved and which counter is 1, see String String::isolatedCopy() &&.

We could try to optimise the case of a Vector<T>&& so as to reuse the allocated vector and call isolatedCopy() && on each item as well.
Comment 15 Carlos Garcia Campos 2020-06-18 06:24:25 PDT
Created attachment 402202 [details]
Patch for landing
Comment 16 Carlos Garcia Campos 2020-06-18 07:28:53 PDT
Created attachment 402206 [details]
Patch for landing
Comment 17 Carlos Garcia Campos 2020-06-19 00:49:31 PDT
Committed r263258: <https://trac.webkit.org/changeset/263258>
Comment 18 Radar WebKit Bug Importer 2020-06-19 00:50:21 PDT
<rdar://problem/64522896>