Summary: | Add timeStamp to ITP database | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Kate Cheney <katherine_cheney> | ||||||||||||||||
Component: | WebKit Misc. | Assignee: | Kate Cheney <katherine_cheney> | ||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||
Severity: | Normal | CC: | bfulgham, cdumez, commit-queue, dbates, ews-watchlist, japhet, wilander | ||||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||||
Version: | WebKit Nightly Build | ||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||
Attachments: |
|
Description
Kate Cheney
2019-12-11 09:42:19 PST
Created attachment 385405 [details]
Patch
Created attachment 385433 [details]
Patch
Comment on attachment 385433 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=385433&action=review It's hard for me to see what we're trying to achieve here. I think it is a timestamp of the latest time a third-party was loaded under a specific first-party either as a sub frame or as a subresource. The change log comment on "frame navigation" and "cross site load with link decoration" indicates that there are exceptions. Are there? Also, frame navigation is used for both top and sub frame navigations. Shouldn't these changes be done in the persistent store too? Or are we certain we will never go back? > Source/WebKit/ChangeLog:14 > + 2. On a cross site load with link decoration Can you reason about why these two and not just page loads? > Source/WebKit/ChangeLog:17 > + should be exposed via API. You probably want to land the bug fix in a separate patch. That's good practice to 1) not get bug fixes rolled out because something else needs to be rolled out. and 2) patches should have a single purpose if possible. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:829 > + return timesUnderFirstParty > 0 && isPrevalentResource(domain); This is the bug fix, right? Looks like something that could land in its own patch with an updated test case to make sure the API doesn't report non-prevalent things. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1841 > + return Seconds { -1 }; This is reusing what we interpret as "never," right? A comment on what -1 means would be good. But since you use it three times here I don't know. We should probably have a const declaration of -1 Seconds that spells out what it means. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1853 > + return Seconds { -1 }; Ditto. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1857 > + return Seconds { -1 }; Ditto. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1859 > + double max = m_getMostRecentlyUpdatedTimestampStatement.getColumnDouble(0); The name max here is vague. If it were the result of calculating the max out of two values I'd get it but now it just gets a value. Would mostRecentlyUpdatedTimestamp work? > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1861 > + return Seconds { max }; Here max gets even more confusing. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.cpp:108 > + ThirdPartyDataForSpecificFirstParty thirdPartyDataForSpecificFirstParty { firstPartyDomain, thirdPartyHasStorageAccess, Seconds { -1 }}; Another case of the -1 Seconds issue. > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:88 > + return makeString("Has been granted storage access under ", firstPartyDomain.string(), ": ", storageAccessGranted ? '1' : '0', "; Has been seen under ", firstPartyDomain.string(), " within 24 hours: ", WallTime::now().secondsSinceEpoch() - timeLastUpdated < 24_h ? '1' : '0'); I think "in the last 24 hours" is better than "within." > LayoutTests/ChangeLog:12 > + prevalent. Ah, so there's your test of the bug fix. > LayoutTests/ChangeLog:1624 > + [ MacOS ] http/tests/resourceLoadStatistics/cookie-deletion.html is timing out This is unrelated, right? Better to land in an unreviewed test gardening patch. > LayoutTests/ChangeLog:2192 > + [ MacOS ] http/tests/resourceLoadStatistics/cookie-deletion.html is timing out Ditto. Comment on attachment 385433 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=385433&action=review Thanks John! I think it would be best to make any updates to the memory store in a separate patch to keep this one smaller. >> Source/WebKit/ChangeLog:14 >> + 2. On a cross site load with link decoration > > Can you reason about why these two and not just page loads? I did not include subresource loading in this list, but this patch will still update the timestamp for subresource loads when those statistics are merged into the database. I will update the ChangeLog to reflect this. >> Source/WebKit/ChangeLog:17 >> + should be exposed via API. > > You probably want to land the bug fix in a separate patch. That's good practice to 1) not get bug fixes rolled out because something else needs to be rolled out. and 2) patches should have a single purpose if possible. Good idea, I will put this in a new patch. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1841 >> + return Seconds { -1 }; > > This is reusing what we interpret as "never," right? A comment on what -1 means would be good. But since you use it three times here I don't know. We should probably have a const declaration of -1 Seconds that spells out what it means. Good idea. I'll add something like #define NO_EXISTING_TIMESTAMP -1 Created attachment 385778 [details]
Patch
Created attachment 385782 [details]
Patch
> > You probably want to land the bug fix in a separate patch. That's good practice to 1) not get bug fixes rolled out because something else needs to be rolled out. and 2) patches should have a single purpose if possible. > > Good idea, I will put this in a new patch. > Link: https://bugs.webkit.org/show_bug.cgi?id=205281 Created attachment 385925 [details]
Patch
making this patch slightly bigger, but now that https://bugs.webkit.org/show_bug.cgi?id=204932 has landed, this patch should include the timestamp in the new API classes. *** Bug 205289 has been marked as a duplicate of this bug. *** Comment on attachment 385925 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=385925&action=review I think we're good here. Please see my comments and questions before landing. > Source/WebKit/ChangeLog:13 > + _getResourceLoadStatisticsDataSummaryAPI. Missing space before API. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:664 > + if (insertRelationshipStatement.bindDouble(2, WallTime::now().secondsSinceEpoch().value()) != SQLITE_OK) { Just double checking: 1) Does this cover all the types of data ITP collects? Subresource, sub frame, web sockets (which map to subresource)? 2) Is this timestamp double-keyed so that it's for the third-party under a specific first-party? I was expecting a timestamp capture where we capture the loads but you can absolutely do it at the sink too. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1859 > + double mostRecentlyUpdatedTimestamp = m_getMostRecentlyUpdatedTimestampStatement.getColumnDouble(0); Much clearer variable name. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.cpp:108 > + WebResourceLoadStatisticsStore::ThirdPartyDataForSpecificFirstParty thirdPartyDataForSpecificFirstParty { firstPartyDomain, thirdPartyHasStorageAccess, Seconds { NO_EXISTING_TIMESTAMP }}; Thanks for introducing a constant. See my comment on what its definition though. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:42 > +#define NO_EXISTING_TIMESTAMP -1 I think this should instead be … static constexpr Seconds NoExistingTimestamp = { -1 }; … in the ResourceLoadStatistics namespace and then used as ResourceLoadStatistics::NoExistingTimestamp. Note that you then don't have to instantiate new Seconds objects in all those places. See for instance LoggedInStatus::TimeToLiveShort and WatchDog::noTimeLimit. > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:140 > + return WTF::nullopt; Will this lead to us dumping all the existing data? Because it won't have these timestamps. In the case of ResourceLoadStatistics.cpp, there's a modelVersion to check for and a graceful import of data when the source is less than the new model. > Tools/ChangeLog:10 > + in the Resource Load Statistics database backend. You're mixing Resource Load Statistics and ITP in this entry. Let's stick to one. > Tools/ChangeLog:11 > + This also adds an api test case using the ITP database store. api -> API > Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:901 > + // Teach ITP about bad origins: Change colon to period. > Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:923 > + // capture time for comparison later Capital c and period. > Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:926 > + Remove double linefeed. > Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:942 > + Remove double linefeed. > Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:1020 > + // it appears under or redirects to Period. > Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:1049 > + // check timestamp for evil3 is reported as being within the correct range Start with capital C and end with a period. > Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:1083 > + // check timestamp for evil1 is reported as being within the correct range Start with capital C and end with a period. > Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:1108 > + // check timestamp for evil2 is reported as being within the correct range Start with capital C and end with a period. Comment on attachment 385925 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=385925&action=review Thanks! Addressing these before uploading. >> Source/WebKit/ChangeLog:13 >> + _getResourceLoadStatisticsDataSummaryAPI. > > Missing space before API. good catch! >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:664 >> + if (insertRelationshipStatement.bindDouble(2, WallTime::now().secondsSinceEpoch().value()) != SQLITE_OK) { > > Just double checking: > 1) Does this cover all the types of data ITP collects? Subresource, sub frame, web sockets (which map to subresource)? > 2) Is this timestamp double-keyed so that it's for the third-party under a specific first-party? > > I was expecting a timestamp capture where we capture the loads but you can absolutely do it at the sink too. 1) It captures the following relationships: subresourceUnderTopFrame, subFrameUnderTopFrame, subresourceUniqueRedirectTo, and crossSiteLoadWithLinkDecoration. I see web sockets are added to subresourcesUnderTopFrameDomains in WebResourceLoadObserver.cpp, so they will be captured with a timestamp. 2) the schema is | thirdParty | firstParty | timestamp |, so the timestamp will always be associated with a specific pair. I thought about that to, but this came out a lot cleaner and has everything in one place (as opposed to trying to log subresource and subframe loads at different places). >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:42 >> +#define NO_EXISTING_TIMESTAMP -1 > > I think this should instead be … > > static constexpr Seconds NoExistingTimestamp = { -1 }; > > … in the ResourceLoadStatistics namespace and then used as ResourceLoadStatistics::NoExistingTimestamp. Note that you then don't have to instantiate new Seconds objects in all those places. > > See for instance LoggedInStatus::TimeToLiveShort and WatchDog::noTimeLimit. Noted, I will change this >> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:140 >> + return WTF::nullopt; > > Will this lead to us dumping all the existing data? Because it won't have these timestamps. In the case of ResourceLoadStatistics.cpp, there's a modelVersion to check for and a graceful import of data when the source is less than the new model. Thats a very good point, I'll try to copy that logic. >> Tools/ChangeLog:10 >> + in the Resource Load Statistics database backend. > > You're mixing Resource Load Statistics and ITP in this entry. Let's stick to one. Agreed.
> >> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:140
> >> + return WTF::nullopt;
> >
> > Will this lead to us dumping all the existing data? Because it won't have these timestamps. In the case of ResourceLoadStatistics.cpp, there's a modelVersion to check for and a graceful import of data when the source is less than the new model.
>
> Thats a very good point, I'll try to copy that logic.
Update: John and I talked offline and this is a larger fix that will be a separate patch.
Created attachment 386170 [details]
Patch
Going to let bots run their course before landing Comment on attachment 386170 [details] Patch Rejecting attachment 386170 [details] from commit-queue. Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-03', 'apply-attachment', '--no-update', '--non-interactive', 386170, '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit Logging in as commit-queue@webkit.org... Fetching: https://bugs.webkit.org/attachment.cgi?id=386170&action=edit Fetching: https://bugs.webkit.org/show_bug.cgi?id=205121&ctype=xml&excludefield=attachmentdata Processing 1 patch from 1 bug. Processing patch 386170 from bug 205121. Fetching: https://bugs.webkit.org/attachment.cgi?id=386170 Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit Parsed 20 diffs from patch file(s). patching file Source/WebKit/ChangeLog patching file Source/WebCore/loader/ResourceLoadStatistics.h patching file Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp patching file Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h patching file Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.cpp patching file Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h patching file Source/WebKit/UIProcess/API/APIResourceLoadStatisticsFirstParty.h patching file Source/WebKit/UIProcess/API/Cocoa/WKWebViewConfigurationPrivate.h patching file Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm patching file Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStorePrivate.h patching file Source/WebKit/UIProcess/API/Cocoa/_WKResourceLoadStatisticsFirstParty.h patching file Source/WebKit/UIProcess/API/Cocoa/_WKResourceLoadStatisticsFirstParty.mm patching file Tools/ChangeLog Hunk #1 succeeded at 1 with fuzz 3. patching file Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm patching file LayoutTests/ChangeLog Hunk #1 succeeded at 1 with fuzz 3. Hunk #2 succeeded at 2772 (offset 60 lines). Hunk #3 succeeded at 3340 (offset 60 lines). patching file LayoutTests/http/tests/resourceLoadStatistics/aggregate-sorted-data-no-storage-access-database-expected.txt patching file LayoutTests/http/tests/resourceLoadStatistics/aggregate-sorted-data-no-storage-access-expected.txt patching file LayoutTests/http/tests/storageAccess/aggregate-sorted-data-with-storage-access-database-expected.txt patching file LayoutTests/http/tests/storageAccess/aggregate-sorted-data-with-storage-access-expected.txt patching file 'WebKit.xcworkspace/xcshareddata/xcschemes/All Source.xcscheme' Hunk #1 FAILED at 57. 1 out of 1 hunk FAILED -- saving rejects to file 'WebKit.xcworkspace/xcshareddata/xcschemes/All Source.xcscheme.rej' Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit Full output: https://webkit-queues.webkit.org/results/13293768 Created attachment 386286 [details]
Patch
Comment on attachment 386286 [details] Patch Clearing flags on attachment: 386286 Committed r253863: <https://trac.webkit.org/changeset/253863> All reviewed patches have been landed. Closing bug. |