Bug 213493

Summary: [WPE][GTK] Add API to support "Privacy Report"
Product: WebKit Reporter: Michael Catanzaro <mcatanzaro>
Component: WebKitGTKAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: aperez, benjamin, berto, bugs-noreply, cdumez, cgarcia, cmarcelo, ews-watchlist, gustavo, katherine_cheney, mcatanzaro, wilander, youennf
Priority: P2    
Version: WebKit Nightly Build   
Hardware: PC   
OS: Linux   
See Also: https://bugs.webkit.org/show_bug.cgi?id=214590
Bug Depends on: 214401    
Bug Blocks: 210184    
Attachments:
Description Flags
Patch aperez: review+

Michael Catanzaro
Reported 2020-06-22 15:57:43 PDT
""" Apple also announced that its upcoming macOS Big Sur will include an update to the company’s web browser that gives users an easy-to-access “Privacy Report” right up next to the URL field. In a single click, the report will tell users which website trackers the browser has blocked. """ https://cdn.vox-cdn.com/thumbor/nuA_JNtgc1U3xSaOVIpoYcgR8VY=/0x0:653x653/920x0/filters:focal(0x0:653x653):format(webp):no_upscale()/cdn.vox-cdn.com/uploads/chorus_asset/file/20047451/tracker.jpg There must be some Cocoa SPI somewhere to enable this (where?). We could expose an equivalent API.
Attachments
Patch (27.86 KB, patch)
2020-07-22 03:41 PDT, Carlos Garcia Campos
aperez: review+
Carlos Garcia Campos
Comment 1 2020-06-24 06:42:14 PDT
John, Youenn is WebsiteDataStore::getResourceLoadStatisticsDataSummary() the function returning the data needed to create the privacy report?
Carlos Garcia Campos
Comment 2 2020-07-09 23:44:16 PDT
Ping.
John Wilander
Comment 3 2020-07-10 00:25:19 PDT
Sorry for the delay and thanks for the ping. I’m cc’ing my coworker who implemented the data reporting functionality from ITP.
Kate Cheney
Comment 4 2020-07-10 10:24:30 PDT
(In reply to Carlos Garcia Campos from comment #1) > John, Youenn is WebsiteDataStore::getResourceLoadStatisticsDataSummary() the > function returning the data needed to create the privacy report? Yes, WebsiteDataStore::getResourceLoadStatisticsDataSummary() is the correct function.
John Wilander
Comment 5 2020-07-10 10:33:02 PDT
One thing to note here is that the reporting relies on full third-party cookie blocking. If your implementation only blocks some third-party cookies, you need to make sure the reporting only reports on those.
Michael Catanzaro
Comment 6 2020-07-10 11:04:05 PDT
We use ThirdPartyCookieBlockingMode::All since that's default in Safari. Our goal is to match what Safari does. I wonder if the other ThirdPartyCookieBlockingMode options can be removed? I guess nobody is using ITP with any of the other modes anymore?
John Wilander
Comment 7 2020-07-10 11:10:36 PDT
(In reply to Michael Catanzaro from comment #6) > We use ThirdPartyCookieBlockingMode::All since that's default in Safari. Our > goal is to match what Safari does. > > I wonder if the other ThirdPartyCookieBlockingMode options can be removed? I > guess nobody is using ITP with any of the other modes anymore? Yes, I started working on removing all of that in https://bugs.webkit.org/show_bug.cgi?id=208819 but we decided we wanted to wait with such refactoring until we've branched trunk for the upcoming releases.
Carlos Garcia Campos
Comment 8 2020-07-16 03:37:31 PDT
(In reply to katherine_cheney from comment #4) > (In reply to Carlos Garcia Campos from comment #1) > > John, Youenn is WebsiteDataStore::getResourceLoadStatisticsDataSummary() the > > function returning the data needed to create the privacy report? > > Yes, WebsiteDataStore::getResourceLoadStatisticsDataSummary() is the correct > function. Thanks! I started to play with this, but I noticed I was not getting the same results than Safari, actually I was getting no entries in the report. So, I tried using the database store and then I got the same results. It seems to be a bug in the memory store (see bug #214401). This makes me wonder if we should be using the database store like Safari. Why do we have memory and db stores? When are you supposed to use one or the other? Should we also expose a setting?
Carlos Garcia Campos
Comment 9 2020-07-16 03:56:07 PDT
I've also noticed that with the memory store ThirdPartyDataForSpecificFirstParty::timeLastUpdated is always NoExistingTimestamp
Michael Catanzaro
Comment 10 2020-07-16 05:55:06 PDT
(In reply to Carlos Garcia Campos from comment #8) > Thanks! I started to play with this, but I noticed I was not getting the > same results than Safari, actually I was getting no entries in the report. > So, I tried using the database store and then I got the same results. It > seems to be a bug in the memory store (see bug #214401). This makes me > wonder if we should be using the database store like Safari. Why do we have > memory and db stores? When are you supposed to use one or the other? Should > we also expose a setting? Good catch. I don't think we need a setting for this. Memory store is for ephemeral mode and database store is for normal browsing mode. You can see if you restart Epiphany that all ITP data tracked in the Clear Data dialog gets deleted... oops, right? That's supposed to be persistent.
Carlos Garcia Campos
Comment 11 2020-07-16 06:04:53 PDT
(In reply to Michael Catanzaro from comment #10) > (In reply to Carlos Garcia Campos from comment #8) > > Thanks! I started to play with this, but I noticed I was not getting the > > same results than Safari, actually I was getting no entries in the report. > > So, I tried using the database store and then I got the same results. It > > seems to be a bug in the memory store (see bug #214401). This makes me > > wonder if we should be using the database store like Safari. Why do we have > > memory and db stores? When are you supposed to use one or the other? Should > > we also expose a setting? > > Good catch. > > I don't think we need a setting for this. Memory store is for ephemeral mode > and database store is for normal browsing mode. You can see if you restart > Epiphany that all ITP data tracked in the Clear Data dialog gets deleted... > oops, right? That's supposed to be persistent. It's persistent, stored in full_browsing_session_resourceLog.plist
Michael Catanzaro
Comment 12 2020-07-16 06:22:40 PDT
Hm... well then I don't understand either. How can it be a "memory store" if it's stored on disk? Maybe it's considered serialized memory, similar to WebKit session state? It really does disappear after restarting the browser btw. I'm seeing *very* strange behavior. After restarting the browser and reloading this bug, I have no ITP data stored. After loading a reddit.com, I see ITP data stored for webkit.org (plausible since this bug was reloaded), mozilla.net (???), and firefox.com (???). After doing nothing and just waiting a few more seconds, ITP data appears for several other domains. This can't be right.
Michael Catanzaro
Comment 13 2020-07-19 12:09:53 PDT
I found this in WebResourceLoadStatisticsStore: auto legacyPlistFilePath = FileSystem::pathByAppendingComponent(resourceLoadStatisticsDirectory, "full_browsing_session_resourceLog.plist"); if (FileSystem::fileExists(legacyPlistFilePath)) FileSystem::deleteFile(legacyPlistFilePath); Seems that ResourceLoadStatisticsMemoryStore is obsoleted by ResourceLoadStatisticsDatabaseStore, and enabling the database store will cause it to clean up the plist file.
Carlos Garcia Campos
Comment 14 2020-07-20 01:12:51 PDT
(In reply to Michael Catanzaro from comment #13) > I found this in WebResourceLoadStatisticsStore: > > auto legacyPlistFilePath = > FileSystem::pathByAppendingComponent(resourceLoadStatisticsDirectory, > "full_browsing_session_resourceLog.plist"); > if (FileSystem::fileExists(legacyPlistFilePath)) > FileSystem::deleteFile(legacyPlistFilePath); > > Seems that ResourceLoadStatisticsMemoryStore is obsoleted by > ResourceLoadStatisticsDatabaseStore, and enabling the database store will > cause it to clean up the plist file. Let's switch to use the database store unconditionally then, but I would like someone from Apple to confirm we are right first.
Kate Cheney
Comment 15 2020-07-20 10:14:49 PDT
The database store is a SQLite version of the full_browsing_session_resourceLog.plist file. The "memory" store actually consists of both memory and persistent storage, and in this version data is stored in memory and flushed to disk periodically. The database store is a complete replacement for the memory store, and as you noticed, switching to the memory store from the database store is not supported. I would recommend using the database store unconditionally, as we may not continue to support new behavior with the memory store. Currently we are writing and maintaining tests for both versions, but this is not something we want to keep up indefinitely.
Carlos Garcia Campos
Comment 16 2020-07-22 03:41:31 PDT
EWS Watchlist
Comment 17 2020-07-22 03:42:33 PDT
Thanks for the patch. If this patch contains new public API please make sure it follows the guidelines for new WebKit2 GTK+ API. See http://trac.webkit.org/wiki/WebKitGTK/AddingNewWebKit2API
Adrian Perez
Comment 18 2020-07-22 06:23:03 PDT
Comment on attachment 404913 [details] Patch Patch LGTM overall, but there are a couple of types and a an introspection annotation which needs fixing, so please check the comments below before landing. View in context: https://bugs.webkit.org/attachment.cgi?id=404913&action=review > Source/WTF/ChangeLog:8 > + Add support for using GRePtr with GDateTime. Typo: GRePtr → GRefPtr > Source/WebKit/ChangeLog:8 > + Add WebKitITPFirstParty and WebKitITPThirdParty bosex types to expose the data returned by the ITP summary. Typo: bosex → boxed > Source/WebKit/UIProcess/API/glib/WebKitWebsiteDataManager.cpp:1221 > + * Returns: (transfer full): the last update time as a #GDateTime The implementation returns the pointer directly, without adding a reference, therefore the annotation here should be “(transfer none)”.
Michael Catanzaro
Comment 19 2020-07-22 09:38:46 PDT
Comment on attachment 404913 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=404913&action=review API looks good. > Source/WebKit/UIProcess/API/glib/WebKitWebsiteDataManager.cpp:1202 > + * Get whether website data access has been granted for a #WebKitITPThirdParty > + * under @itp_first_party. Hm, I understand the #WebkitITPFirstParty object corresponds to exactly one #WebKitITPThirdParty, yes? That is not very clear. Perhaps: Get whether @itp_first_party has granted website data access to its #WebKitITPThirdParty. Each @WebKitITPFirstParty is created by webkit_itp_third_party_get_first_parties() and therefore corresponds to exactly one #WebKitITPThirdParty.
Carlos Garcia Campos
Comment 20 2020-07-27 01:31:30 PDT
(In reply to Adrian Perez from comment #18) > Comment on attachment 404913 [details] > Patch > > Patch LGTM overall, but there are a couple of types and a an introspection > annotation which needs fixing, so please check the comments below before > landing. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=404913&action=review > > > Source/WTF/ChangeLog:8 > > + Add support for using GRePtr with GDateTime. > > Typo: GRePtr → GRefPtr > > > Source/WebKit/ChangeLog:8 > > + Add WebKitITPFirstParty and WebKitITPThirdParty bosex types to expose the data returned by the ITP summary. > > Typo: bosex → boxed > > > Source/WebKit/UIProcess/API/glib/WebKitWebsiteDataManager.cpp:1221 > > + * Returns: (transfer full): the last update time as a #GDateTime > > The implementation returns the pointer directly, without adding a > reference, therefore the annotation here should be “(transfer none)”. Right!
Carlos Garcia Campos
Comment 21 2020-07-27 01:32:22 PDT
(In reply to Michael Catanzaro from comment #19) > Comment on attachment 404913 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=404913&action=review > > API looks good. > > > Source/WebKit/UIProcess/API/glib/WebKitWebsiteDataManager.cpp:1202 > > + * Get whether website data access has been granted for a #WebKitITPThirdParty > > + * under @itp_first_party. > > Hm, I understand the #WebkitITPFirstParty object corresponds to exactly one > #WebKitITPThirdParty, yes? That is not very clear. Perhaps: Right. > Get whether @itp_first_party has granted website data access to its > #WebKitITPThirdParty. Each @WebKitITPFirstParty is created by > webkit_itp_third_party_get_first_parties() and therefore corresponds to > exactly one #WebKitITPThirdParty. Great, thanks!
Carlos Garcia Campos
Comment 22 2020-07-27 01:35:44 PDT
Note You need to log in before you can comment on or make changes to this bug.