WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
229527
Separate PrivateClickMeasurement database from ResourceLoadStatistics database and add SPI to set its location
https://bugs.webkit.org/show_bug.cgi?id=229527
Summary
Separate PrivateClickMeasurement database from ResourceLoadStatistics databas...
Alex Christensen
Reported
2021-08-25 16:27:14 PDT
Separate PrivateClickMeasurement database from ResourceLoadStatistics database and add SPI to set its location
Attachments
Patch
(198.05 KB, patch)
2021-08-25 16:35 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(198.18 KB, patch)
2021-08-25 20:15 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(201.45 KB, patch)
2021-08-25 22:19 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(215.35 KB, patch)
2021-08-26 11:41 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(235.75 KB, patch)
2021-08-26 17:40 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(242.23 KB, patch)
2021-08-27 11:39 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(242.24 KB, patch)
2021-08-27 12:08 PDT
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Show Obsolete
(6)
View All
Add attachment
proposed patch, testcase, etc.
Alex Christensen
Comment 1
2021-08-25 16:35:43 PDT
Created
attachment 436437
[details]
Patch
Alex Christensen
Comment 2
2021-08-25 20:15:42 PDT
Created
attachment 436461
[details]
Patch
Alex Christensen
Comment 3
2021-08-25 22:19:40 PDT
Created
attachment 436473
[details]
Patch
Alex Christensen
Comment 4
2021-08-26 11:41:55 PDT
Created
attachment 436540
[details]
Patch
Kate Cheney
Comment 5
2021-08-26 13:26:14 PDT
Comment on
attachment 436540
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436540&action=review
I like this change, it's very clever. My main question is why we are storing the PCM store inside the ITP store? I think it could be separate and use separate queues now that they are in different databases.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:561 > void ResourceLoadStatisticsDatabaseStore::renameColumnInTable(String&& tableName, ExistingColumnName&& existingColumnName, ExpectedColumnName&& expectedColumnName)
You should delete this function, it is only used by the functions you deleted.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:795 > m_updateAttributionsEarliestTimeToSendStatement = nullptr;
This seems like it can be deleted.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:60 > using ExpectedColumnName = String;
ExistingColumns, ExpectedColumns, ExistingColumnName and ExpectedColumnName are all used in the migration functions you deleted. You can delete them here as well.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:133 > Vector<String> columnsForTable(const String&);
Ditto, you should delete this function now that it is no longer used.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:142 > void addMissingTablesIfNecessary();
Ditto.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:145 > void renameColumnInTable(String&&, ExistingColumnName&&, ExpectedColumnName&&);
Ditto.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:146 > void addMissingColumnsToTable(String&&, const ExistingColumns&, const ExpectedColumns&);
Ditto.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:147 > void renameColumnInTable();
Ditto.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:149 > void updatePrivateClickMeasurementSchemaIfNecessary();
Ditto.
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:267 > std::unique_ptr<WebCore::SQLiteStatement> m_updateAttributionsEarliestTimeToSendStatement;
Please delete this.
> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.cpp:170 > + , m_pcmStore(PCM::Store::create(sharedStatisticsQueue(), pcmStoreDirectory(networkSession, resourceLoadStatisticsDirectory, privateClickMeasurementStorageDirectory)))
We don't need PCM and ITP to share a queue because they have different databases now. Could we separate them out so we can do more work in parallel?
> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:243 > + PCM::Store& privateClickMeasurementStore() { return m_pcmStore.get();}
It seems strange to me to keep the PCM::Store as a member of the WebResourceLoadStatisticsStore instead of in the NetworkSession. Is there a reason you did it this way?
> Source/WebKit/NetworkProcess/NetworkSession.cpp:131 > + return completionHandler(ResourceError(ResourceError::Type::Cancellation), { }, { });
Nice catch.
> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementStore.cpp:43 > + if (!databaseDirectory.isEmpty()) {
I can't find the place in code where you set this to be empty if the session is ephemeral, where does that happen?
Alex Christensen
Comment 6
2021-08-26 13:38:17 PDT
(In reply to Kate Cheney from
comment #5
)
> Comment on
attachment 436540
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=436540&action=review
> > I like this change, it's very clever. My main question is why we are storing > the PCM store inside the ITP store? I think it could be separate and use > separate queues now that they are in different databases.
I'm just writing the two database files to the same directory by default so we don't have to add another directory, which would hurt startup time. I like the idea of separate queues, though.
> > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:243 > > + PCM::Store& privateClickMeasurementStore() { return m_pcmStore.get();} > > It seems strange to me to keep the PCM::Store as a member of the > WebResourceLoadStatisticsStore instead of in the NetworkSession. Is there a > reason you did it this way?
That was necessary for the incremental approach with which I wrote this patch. Now that it's a self-contained object, it can go anywhere. I'll move it.
> > Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementStore.cpp:43 > > + if (!databaseDirectory.isEmpty()) { > > I can't find the place in code where you set this to be empty if the session > is ephemeral, where does that happen?
currently in a function called pcmStoreDirectory.
Kate Cheney
Comment 7
2021-08-26 15:24:59 PDT
Comment on
attachment 436540
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436540&action=review
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:238 > const int expectedIndexCount = 14;
I think you should decrement this now that you've removed some indexes.
Alex Christensen
Comment 8
2021-08-26 17:36:38 PDT
(In reply to Kate Cheney from
comment #7
)
> Comment on
attachment 436540
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=436540&action=review
> > > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:238 > > const int expectedIndexCount = 14; > > I think you should decrement this now that you've removed some indexes.
Yes, now it needs to be 12. (In reply to Alex Christensen from
comment #6
)
> > > Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:243 > > > + PCM::Store& privateClickMeasurementStore() { return m_pcmStore.get();} > > > > It seems strange to me to keep the PCM::Store as a member of the > > WebResourceLoadStatisticsStore instead of in the NetworkSession. Is there a > > reason you did it this way? > That was necessary for the incremental approach with which I wrote this > patch. Now that it's a self-contained object, it can go anywhere. I'll > move it.
Although it's easy to move now, I think I'm going to keep it owned by WebResourceLoadStatisticsStore because of the lifetime of that object, which I want to be the same as the lifetime of PCM::Store. Adding that lifetime management logic would look stranger than leaving this.
Alex Christensen
Comment 9
2021-08-26 17:40:05 PDT
Created
attachment 436593
[details]
Patch
Alex Christensen
Comment 10
2021-08-26 17:40:24 PDT
Still no data migration, but this is getting closer.
Kate Cheney
Comment 11
2021-08-27 10:33:44 PDT
Comment on
attachment 436593
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436593&action=review
Ok, took a closer pass today and I think this is almost there. I suggested one optimization for clearing now that we have a new schema, and had a few other nits. Then the grandfathering API tests seem to be failing as well. Grandfathering happens in WebResourceLoadStatisticsStore::populateMemoryStoreFromDisk() based on whether the file is new or not, so maybe a bug in the database open code?
> Source/WebKit/NetworkProcess/DatabaseUtilities.cpp:110 > + return createdNewFile;
This return is not needed.
> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementDatabase.cpp:425 > +String Database::attributionToString(WebCore::SQLiteStatement* statement, PrivateClickMeasurementAttributionType attributionType)
Optional, could add ForTesting at the end of this function.
> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementDatabase.cpp:482 > + ASSERT(!RunLoop::isMain());
Now that we have a separate table with all PCM domains and foreign key constraints, we can create one query to remove the domains from the PCMObservedDomains table to accomplish this task. Then we could replace this function with something simpler, like this: void Database::clearPrivateClickMeasurement(std::optional<RegistrableDomain> domain) { ASSERT(!RunLoop::isMain()); // Default to clear all entries if no domain is specified. String bindParameter = "%"; if (domain) { auto domainIDToMatch = domainID(*domain); if (!domainIDToMatch) return; bindParameter = String::number(*domainIDToMatch); } auto transactionScope = beginTransactionIfNecessary(); auto clearAllPrivateClickMeasurementScopedStatement = this->scopedStatement(m_ clearAllPrivateClickMeasurementStatement, clearAllPrivateClickMeasurementQuery, "clearPrivateClickMeasurement"_s); if (!clearAllPrivateClickMeasurementScopedStatement || clearAllPrivateClickMeasurementScopedStatement->bindText(1, bindParameter) != SQLITE_OK || clearAllPrivateClickMeasurementScopedStatement->step() != SQLITE_DONE) { ITP_RELEASE_LOG_ERROR(m_sessionID, "%p - ResourceLoadStatisticsStore::clearPrivateClickMeasurement clearAllPrivateClickMeasurementScopedStatement, error message: %" PRIVATE_LOG_STRING, this, m_database.lastErrorMsg()); ASSERT_NOT_REACHED(); } } You will need a new clearAllPrivateClickMeasurementQuery (DELETE FROM PCMObservedDomains WHERE domainID LIKE ?) and statement that remove the domains from the PCMObservedDomains table, but can remove the two others used in this function.
> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementDatabase.cpp:679 > + RELEASE_LOG_ERROR(PrivateClickMeasurement, "%p - Database::insertObservedDomain failed to bind, error message: %" PRIVATE_LOG_STRING, this, m_database.lastErrorMsg());
Database::insertObservedDomain -> Database::ensureDomainID
> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementDatabase.cpp:685 > + RELEASE_LOG_ERROR(PrivateClickMeasurement, "%p - Database::insertObservedDomain failed to commit, error message: %" PRIVATE_LOG_STRING, this, m_database.lastErrorMsg());
Ditto.
> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementStore.cpp:102 > + postTask([this, protectedThis = makeRef(*this), sourceSite = sourceSite.isolatedCopy(), destinationSite = destinationSite.isolatedCopy(), attributionTriggerData = WTFMove(attributionTriggerData), ephemeralMeasurement = crossThreadCopy(ephemeralMeasurement), completionHandler = WTFMove(completionHandler)] () mutable {
Why a crossThreadCopy of ephemeralMeasurement instead of an isolatedCopy?
Kate Cheney
Comment 12
2021-08-27 10:38:13 PDT
And, I think doing migration in a separate patch is fine, this one is quite big.
Kate Cheney
Comment 13
2021-08-27 11:18:35 PDT
Comment on
attachment 436593
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436593&action=review
> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:310 > enableForeignKeys();
You should add this in the PCM database as well, so you can use PCMObservedDomains to perform cascading deletions.
Alex Christensen
Comment 14
2021-08-27 11:38:53 PDT
(In reply to Kate Cheney from
comment #11
)
> > + postTask([this, protectedThis = makeRef(*this), sourceSite = sourceSite.isolatedCopy(), destinationSite = destinationSite.isolatedCopy(), attributionTriggerData = WTFMove(attributionTriggerData), ephemeralMeasurement = crossThreadCopy(ephemeralMeasurement), completionHandler = WTFMove(completionHandler)] () mutable { > > Why a crossThreadCopy of ephemeralMeasurement instead of an isolatedCopy?
ephemeralMeasurement is a std::optional, which has no isolatedCopy member. crossThreadCopy deals with this and calls isolatedCopy() of the contained type if the optional is engaged.
Alex Christensen
Comment 15
2021-08-27 11:39:30 PDT
Created
attachment 436646
[details]
Patch
Kate Cheney
Comment 16
2021-08-27 12:01:21 PDT
Comment on
attachment 436646
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=436646&action=review
r=me as soon as EWS is happy.
> Source/WebKit/NetworkProcess/DatabaseUtilities.cpp:109 > + return createdNewFile;
This return is still not needed :P
Alex Christensen
Comment 17
2021-08-27 12:08:45 PDT
Created
attachment 436654
[details]
Patch
Alex Christensen
Comment 18
2021-08-27 14:22:52 PDT
http://trac.webkit.org/r281721
Radar WebKit Bug Importer
Comment 19
2021-08-27 14:23:21 PDT
<
rdar://problem/82454337
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug