WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
205121
Add timeStamp to ITP database
https://bugs.webkit.org/show_bug.cgi?id=205121
Summary
Add timeStamp to ITP database
Kate Cheney
Reported
2019-12-11 09:42:19 PST
Add a way to store most-recently-updated timestamps in the ITP database
Attachments
Patch
(45.38 KB, patch)
2019-12-11 10:23 PST
,
Kate Cheney
no flags
Details
Formatted Diff
Diff
Patch
(46.03 KB, patch)
2019-12-11 13:31 PST
,
Kate Cheney
no flags
Details
Formatted Diff
Diff
Patch
(32.95 KB, patch)
2019-12-16 10:14 PST
,
Kate Cheney
no flags
Details
Formatted Diff
Diff
Patch
(34.53 KB, patch)
2019-12-16 10:25 PST
,
Kate Cheney
no flags
Details
Formatted Diff
Diff
Patch
(53.26 KB, patch)
2019-12-17 15:54 PST
,
Kate Cheney
no flags
Details
Formatted Diff
Diff
Patch
(50.05 KB, patch)
2019-12-19 17:47 PST
,
Kate Cheney
no flags
Details
Formatted Diff
Diff
Patch
(48.29 KB, patch)
2019-12-20 21:09 PST
,
Kate Cheney
no flags
Details
Formatted Diff
Diff
Show Obsolete
(6)
View All
Add attachment
proposed patch, testcase, etc.
Kate Cheney
Comment 1
2019-12-11 09:42:50 PST
<
rdar://problem/57633021
>
Kate Cheney
Comment 2
2019-12-11 10:23:48 PST
Created
attachment 385405
[details]
Patch
Kate Cheney
Comment 3
2019-12-11 13:31:58 PST
Created
attachment 385433
[details]
Patch
John Wilander
Comment 4
2019-12-13 16:22:50 PST
Comment on
attachment 385433
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=385433&action=review
It's hard for me to see what we're trying to achieve here. I think it is a timestamp of the latest time a third-party was loaded under a specific first-party either as a sub frame or as a subresource. The change log comment on "frame navigation" and "cross site load with link decoration" indicates that there are exceptions. Are there? Also, frame navigation is used for both top and sub frame navigations. Shouldn't these changes be done in the persistent store too? Or are we certain we will never go back?
> Source/WebKit/ChangeLog:14 > + 2. On a cross site load with link decoration
Can you reason about why these two and not just page loads?
> Source/WebKit/ChangeLog:17 > + should be exposed via API.
You probably want to land the bug fix in a separate patch. That's good practice to 1) not get bug fixes rolled out because something else needs to be rolled out. and 2) patches should have a single purpose if possible.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:829 > + return timesUnderFirstParty > 0 && isPrevalentResource(domain);
This is the bug fix, right? Looks like something that could land in its own patch with an updated test case to make sure the API doesn't report non-prevalent things.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1841 > + return Seconds { -1 };
This is reusing what we interpret as "never," right? A comment on what -1 means would be good. But since you use it three times here I don't know. We should probably have a const declaration of -1 Seconds that spells out what it means.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1853 > + return Seconds { -1 };
Ditto.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1857 > + return Seconds { -1 };
Ditto.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1859 > + double max = m_getMostRecentlyUpdatedTimestampStatement.getColumnDouble(0);
The name max here is vague. If it were the result of calculating the max out of two values I'd get it but now it just gets a value. Would mostRecentlyUpdatedTimestamp work?
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1861 > + return Seconds { max };
Here max gets even more confusing.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.cpp:108 > + ThirdPartyDataForSpecificFirstParty thirdPartyDataForSpecificFirstParty { firstPartyDomain, thirdPartyHasStorageAccess, Seconds { -1 }};
Another case of the -1 Seconds issue.
> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:88 > + return makeString("Has been granted storage access under ", firstPartyDomain.string(), ": ", storageAccessGranted ? '1' : '0', "; Has been seen under ", firstPartyDomain.string(), " within 24 hours: ", WallTime::now().secondsSinceEpoch() - timeLastUpdated < 24_h ? '1' : '0');
I think "in the last 24 hours" is better than "within."
> LayoutTests/ChangeLog:12 > + prevalent.
Ah, so there's your test of the bug fix.
> LayoutTests/ChangeLog:1624 > + [ MacOS ] http/tests/resourceLoadStatistics/cookie-deletion.html is timing out
This is unrelated, right? Better to land in an unreviewed test gardening patch.
> LayoutTests/ChangeLog:2192 > + [ MacOS ] http/tests/resourceLoadStatistics/cookie-deletion.html is timing out
Ditto.
Kate Cheney
Comment 5
2019-12-16 09:23:11 PST
Comment on
attachment 385433
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=385433&action=review
Thanks John! I think it would be best to make any updates to the memory store in a separate patch to keep this one smaller.
>> Source/WebKit/ChangeLog:14 >> + 2. On a cross site load with link decoration > > Can you reason about why these two and not just page loads?
I did not include subresource loading in this list, but this patch will still update the timestamp for subresource loads when those statistics are merged into the database. I will update the ChangeLog to reflect this.
>> Source/WebKit/ChangeLog:17 >> + should be exposed via API. > > You probably want to land the bug fix in a separate patch. That's good practice to 1) not get bug fixes rolled out because something else needs to be rolled out. and 2) patches should have a single purpose if possible.
Good idea, I will put this in a new patch.
>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1841 >> + return Seconds { -1 }; > > This is reusing what we interpret as "never," right? A comment on what -1 means would be good. But since you use it three times here I don't know. We should probably have a const declaration of -1 Seconds that spells out what it means.
Good idea. I'll add something like #define NO_EXISTING_TIMESTAMP -1
Kate Cheney
Comment 6
2019-12-16 10:14:13 PST
Created
attachment 385778
[details]
Patch
Kate Cheney
Comment 7
2019-12-16 10:25:14 PST
Created
attachment 385782
[details]
Patch
Kate Cheney
Comment 8
2019-12-16 17:08:56 PST
> > You probably want to land the bug fix in a separate patch. That's good practice to 1) not get bug fixes rolled out because something else needs to be rolled out. and 2) patches should have a single purpose if possible. > > Good idea, I will put this in a new patch. >
Link:
https://bugs.webkit.org/show_bug.cgi?id=205281
Kate Cheney
Comment 9
2019-12-17 15:54:00 PST
Created
attachment 385925
[details]
Patch
Kate Cheney
Comment 10
2019-12-17 15:54:16 PST
making this patch slightly bigger, but now that
https://bugs.webkit.org/show_bug.cgi?id=204932
has landed, this patch should include the timestamp in the new API classes.
Kate Cheney
Comment 11
2019-12-19 14:16:08 PST
***
Bug 205289
has been marked as a duplicate of this bug. ***
John Wilander
Comment 12
2019-12-19 16:56:14 PST
Comment on
attachment 385925
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=385925&action=review
I think we're good here. Please see my comments and questions before landing.
> Source/WebKit/ChangeLog:13 > + _getResourceLoadStatisticsDataSummaryAPI.
Missing space before API.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:664 > + if (insertRelationshipStatement.bindDouble(2, WallTime::now().secondsSinceEpoch().value()) != SQLITE_OK) {
Just double checking: 1) Does this cover all the types of data ITP collects? Subresource, sub frame, web sockets (which map to subresource)? 2) Is this timestamp double-keyed so that it's for the third-party under a specific first-party? I was expecting a timestamp capture where we capture the loads but you can absolutely do it at the sink too.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1859 > + double mostRecentlyUpdatedTimestamp = m_getMostRecentlyUpdatedTimestampStatement.getColumnDouble(0);
Much clearer variable name.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.cpp:108 > + WebResourceLoadStatisticsStore::ThirdPartyDataForSpecificFirstParty thirdPartyDataForSpecificFirstParty { firstPartyDomain, thirdPartyHasStorageAccess, Seconds { NO_EXISTING_TIMESTAMP }};
Thanks for introducing a constant. See my comment on what its definition though.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:42 > +#define NO_EXISTING_TIMESTAMP -1
I think this should instead be … static constexpr Seconds NoExistingTimestamp = { -1 }; … in the ResourceLoadStatistics namespace and then used as ResourceLoadStatistics::NoExistingTimestamp. Note that you then don't have to instantiate new Seconds objects in all those places. See for instance LoggedInStatus::TimeToLiveShort and WatchDog::noTimeLimit.
> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:140 > + return WTF::nullopt;
Will this lead to us dumping all the existing data? Because it won't have these timestamps. In the case of ResourceLoadStatistics.cpp, there's a modelVersion to check for and a graceful import of data when the source is less than the new model.
> Tools/ChangeLog:10 > + in the Resource Load Statistics database backend.
You're mixing Resource Load Statistics and ITP in this entry. Let's stick to one.
> Tools/ChangeLog:11 > + This also adds an api test case using the ITP database store.
api -> API
> Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:901 > + // Teach ITP about bad origins:
Change colon to period.
> Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:923 > + // capture time for comparison later
Capital c and period.
> Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:926 > +
Remove double linefeed.
> Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:942 > +
Remove double linefeed.
> Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:1020 > + // it appears under or redirects to
Period.
> Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:1049 > + // check timestamp for evil3 is reported as being within the correct range
Start with capital C and end with a period.
> Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:1083 > + // check timestamp for evil1 is reported as being within the correct range
Start with capital C and end with a period.
> Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm:1108 > + // check timestamp for evil2 is reported as being within the correct range
Start with capital C and end with a period.
Kate Cheney
Comment 13
2019-12-19 17:16:30 PST
Comment on
attachment 385925
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=385925&action=review
Thanks! Addressing these before uploading.
>> Source/WebKit/ChangeLog:13 >> + _getResourceLoadStatisticsDataSummaryAPI. > > Missing space before API.
good catch!
>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:664 >> + if (insertRelationshipStatement.bindDouble(2, WallTime::now().secondsSinceEpoch().value()) != SQLITE_OK) { > > Just double checking: > 1) Does this cover all the types of data ITP collects? Subresource, sub frame, web sockets (which map to subresource)? > 2) Is this timestamp double-keyed so that it's for the third-party under a specific first-party? > > I was expecting a timestamp capture where we capture the loads but you can absolutely do it at the sink too.
1) It captures the following relationships: subresourceUnderTopFrame, subFrameUnderTopFrame, subresourceUniqueRedirectTo, and crossSiteLoadWithLinkDecoration. I see web sockets are added to subresourcesUnderTopFrameDomains in WebResourceLoadObserver.cpp, so they will be captured with a timestamp. 2) the schema is | thirdParty | firstParty | timestamp |, so the timestamp will always be associated with a specific pair. I thought about that to, but this came out a lot cleaner and has everything in one place (as opposed to trying to log subresource and subframe loads at different places).
>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:42 >> +#define NO_EXISTING_TIMESTAMP -1 > > I think this should instead be … > > static constexpr Seconds NoExistingTimestamp = { -1 }; > > … in the ResourceLoadStatistics namespace and then used as ResourceLoadStatistics::NoExistingTimestamp. Note that you then don't have to instantiate new Seconds objects in all those places. > > See for instance LoggedInStatus::TimeToLiveShort and WatchDog::noTimeLimit.
Noted, I will change this
>> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:140 >> + return WTF::nullopt; > > Will this lead to us dumping all the existing data? Because it won't have these timestamps. In the case of ResourceLoadStatistics.cpp, there's a modelVersion to check for and a graceful import of data when the source is less than the new model.
Thats a very good point, I'll try to copy that logic.
>> Tools/ChangeLog:10 >> + in the Resource Load Statistics database backend. > > You're mixing Resource Load Statistics and ITP in this entry. Let's stick to one.
Agreed.
Kate Cheney
Comment 14
2019-12-19 17:37:51 PST
> >> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:140 > >> + return WTF::nullopt; > > > > Will this lead to us dumping all the existing data? Because it won't have these timestamps. In the case of ResourceLoadStatistics.cpp, there's a modelVersion to check for and a graceful import of data when the source is less than the new model. > > Thats a very good point, I'll try to copy that logic.
Update: John and I talked offline and this is a larger fix that will be a separate patch.
Kate Cheney
Comment 15
2019-12-19 17:47:14 PST
Created
attachment 386170
[details]
Patch
Kate Cheney
Comment 16
2019-12-19 17:48:00 PST
Going to let bots run their course before landing
WebKit Commit Bot
Comment 17
2019-12-20 00:11:51 PST
Comment on
attachment 386170
[details]
Patch Rejecting
attachment 386170
[details]
from commit-queue. Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-03', 'apply-attachment', '--no-update', '--non-interactive', 386170, '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit Logging in as
commit-queue@webkit.org
... Fetching:
https://bugs.webkit.org/attachment.cgi?id=386170&action=edit
Fetching:
https://bugs.webkit.org/show_bug.cgi?id=205121
&ctype=xml&excludefield=attachmentdata Processing 1 patch from 1 bug. Processing patch 386170 from
bug 205121
. Fetching:
https://bugs.webkit.org/attachment.cgi?id=386170
Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit Parsed 20 diffs from patch file(s). patching file Source/WebKit/ChangeLog patching file Source/WebCore/loader/ResourceLoadStatistics.h patching file Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp patching file Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h patching file Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.cpp patching file Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h patching file Source/WebKit/UIProcess/API/APIResourceLoadStatisticsFirstParty.h patching file Source/WebKit/UIProcess/API/Cocoa/WKWebViewConfigurationPrivate.h patching file Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStore.mm patching file Source/WebKit/UIProcess/API/Cocoa/WKWebsiteDataStorePrivate.h patching file Source/WebKit/UIProcess/API/Cocoa/_WKResourceLoadStatisticsFirstParty.h patching file Source/WebKit/UIProcess/API/Cocoa/_WKResourceLoadStatisticsFirstParty.mm patching file Tools/ChangeLog Hunk #1 succeeded at 1 with fuzz 3. patching file Tools/TestWebKitAPI/Tests/WebKitCocoa/ResourceLoadStatistics.mm patching file LayoutTests/ChangeLog Hunk #1 succeeded at 1 with fuzz 3. Hunk #2 succeeded at 2772 (offset 60 lines). Hunk #3 succeeded at 3340 (offset 60 lines). patching file LayoutTests/http/tests/resourceLoadStatistics/aggregate-sorted-data-no-storage-access-database-expected.txt patching file LayoutTests/http/tests/resourceLoadStatistics/aggregate-sorted-data-no-storage-access-expected.txt patching file LayoutTests/http/tests/storageAccess/aggregate-sorted-data-with-storage-access-database-expected.txt patching file LayoutTests/http/tests/storageAccess/aggregate-sorted-data-with-storage-access-expected.txt patching file 'WebKit.xcworkspace/xcshareddata/xcschemes/All Source.xcscheme' Hunk #1 FAILED at 57. 1 out of 1 hunk FAILED -- saving rejects to file 'WebKit.xcworkspace/xcshareddata/xcschemes/All Source.xcscheme.rej' Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit Full output:
https://webkit-queues.webkit.org/results/13293768
Kate Cheney
Comment 18
2019-12-20 21:09:14 PST
Created
attachment 386286
[details]
Patch
WebKit Commit Bot
Comment 19
2019-12-21 06:58:09 PST
Comment on
attachment 386286
[details]
Patch Clearing flags on attachment: 386286 Committed
r253863
: <
https://trac.webkit.org/changeset/253863
>
WebKit Commit Bot
Comment 20
2019-12-21 06:58:11 PST
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug