RESOLVED FIXED 223726
Implement PCM SQLite changes based on spec review
https://bugs.webkit.org/show_bug.cgi?id=223726
Summary Implement PCM SQLite changes based on spec review
Kate Cheney
Reported 2021-03-24 20:06:29 PDT
We should update some things in PCM’s SQLite database based on spec review.
Attachments
Patch (217.72 KB, patch)
2021-03-24 20:08 PDT, Kate Cheney
no flags
Patch (217.76 KB, patch)
2021-03-25 17:23 PDT, Kate Cheney
no flags
Patch (223.06 KB, patch)
2021-03-25 20:09 PDT, Kate Cheney
no flags
Patch (223.12 KB, patch)
2021-03-26 09:11 PDT, Kate Cheney
no flags
Patch for landing (223.13 KB, patch)
2021-03-26 11:39 PDT, Kate Cheney
no flags
Kate Cheney
Comment 1 2021-03-24 20:08:15 PDT
Kate Cheney
Comment 2 2021-03-25 13:03:27 PDT
Kate Cheney
Comment 3 2021-03-25 13:06:45 PDT
cq- was for an unrelated flaky test.
John Wilander
Comment 4 2021-03-25 14:58:08 PDT
Comment on attachment 424212 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=424212&action=review Ugh. This is why I don't like databases. But I do like folks like yourself who are proficient at it and willing to make these kind of changes. "Name change? Let me just write hundreds of lines of code." I'll let a DB expert formally review this. From a spec standpoint, this looks good. > Source/WebKit/ChangeLog:11 > + based on spec review. Second, it adds support for sending reports to You can add this link as a reference for anyone who wants to dig into the spec and name change: https://trac.webkit.org/changeset/275046/webkit
Brady Eidson
Comment 5 2021-03-25 16:43:46 PDT
SQLite stuff looks fine, but I didn't review the rest in depth.
Sihui Liu
Comment 6 2021-03-25 16:44:13 PDT
Comment on attachment 424212 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=424212&action=review > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:284 > +static Vector<String> expectedUnattributedColumns() static const Vector<String> expectedUnattributedColumns() { static const auto columns = makeNeverDestroyed(Vector<String> { ... }); return columns; } > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:291 > + return { "sourceSiteDomainID", "destinationSiteDomainID", "sourceID", "attributionTriggerData", "priority", "timeOfAdClick", "earliestTimeToSendToSource", "token", "signature", "keyID", "earliestTimeToSendToDestination" }; Ditto. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:296 > + return !schema.isEmpty() && schema.contains("REFERENCES TopLevelDomains"); !schema.isEmpty() is not needed. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:451 > + auto currentSchema = tableSchema("AttributedPrivateClickMeasurement"); You are checking a different database scheme with the change. How can we make sure the old table schema is correct if this check pass? It seems we can add missing tables at different times. Maybe add asserts that the other tables are correct or add separate checks for each. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:540 > +{ You may want to assert size of existingColumns is less than or equal to size of expectedColumns; and no existingColumns is in unexpected > Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.h:199 > + void statisticsDatabaseColumnsForTable(const String&, CompletionHandler<void(Vector<String>&&)>&&); I don't quite like this SPI because it is checking the low-level data, and I would expect an high-level error when a column is missing in the table. For example the test may verify this like: if the database is upgraded successfully, we can get the data from the store; otherwise we should see a failure/error.
Kate Cheney
Comment 7 2021-03-25 17:04:59 PDT
Comment on attachment 424212 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=424212&action=review >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:284 >> +static Vector<String> expectedUnattributedColumns() > > static const Vector<String> expectedUnattributedColumns() > { > static const auto columns = makeNeverDestroyed(Vector<String> { ... }); > return columns; > } Will fix. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:291 >> + return { "sourceSiteDomainID", "destinationSiteDomainID", "sourceID", "attributionTriggerData", "priority", "timeOfAdClick", "earliestTimeToSendToSource", "token", "signature", "keyID", "earliestTimeToSendToDestination" }; > > Ditto. Ditto. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:296 >> + return !schema.isEmpty() && schema.contains("REFERENCES TopLevelDomains"); > > !schema.isEmpty() is not needed. Will remove. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:451 >> + auto currentSchema = tableSchema("AttributedPrivateClickMeasurement"); > > You are checking a different database scheme with the change. How can we make sure the old table schema is correct if this check pass? > > It seems we can add missing tables at different times. Maybe add asserts that the other tables are correct or add separate checks for each. This will be the last thing that happens -- first we will add missing tables, add missing columns, and rename columns, then we check for something that requires a full migration. In this case, attributedPrivateClickMeasurementSchemaV1(). I don't think I understand your comment fully. I can add asserts for the other tables if we think that is not wasteful or excessive. I hesitated on doing that because most other tables haven't changed since the initial implementation. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:540 >> +{ > > You may want to assert size of existingColumns is less than or equal to size of expectedColumns; and no existingColumns is in unexpected Good idea. >> Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.h:199 >> + void statisticsDatabaseColumnsForTable(const String&, CompletionHandler<void(Vector<String>&&)>&&); > > I don't quite like this SPI because it is checking the low-level data, and I would expect an high-level error when a column is missing in the table. > For example the test may verify this like: if the database is upgraded successfully, we can get the data from the store; otherwise we should see a failure/error. I think this is good in theory. In practice it may be difficult because of the way we fail if there is a database error. We mostly rely on debug asserts. I think it would be good in the future to surface errors in this case.
Kate Cheney
Comment 8 2021-03-25 17:23:59 PDT
Kate Cheney
Comment 9 2021-03-25 20:09:07 PDT
Kate Cheney
Comment 10 2021-03-25 20:13:37 PDT
(In reply to katherine_cheney from comment #9) > Created attachment 424312 [details] > Patch This patch fixes everything from the comments except removing the testing SPI. We don't have a great way to unit test SQLite right now, or surface errors to the UI process if a schema migration fails to do what Sihui suggested. I think an SPI to check the schema is the best way to test this right now, but we should consider adding better infrastructure to test in the future.
Kate Cheney
Comment 11 2021-03-26 09:11:51 PDT
Brent Fulgham
Comment 12 2021-03-26 10:32:26 PDT
Comment on attachment 424362 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=424362&action=review r=me. I have a few minor suggestions, but otherwise looks good. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:274 > +static const Vector<String> expectedUnattributedColumns() Since this is a never-destroyed object, I would return a const reference to avoid making a copy. Also, why not use the ExpectedColumns type alias? static const ExpectedColumns& expectedUnattributedColumns() > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:280 > +static const Vector<String> expectedAttributedColumns() Ditto "const ExpectedColumns&" > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:444 > + auto currentSchema = tableSchema(table); Maybe: auto currentSchema = tableSchema("AttributedPrivateClickMeasurement"_s) > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:456 > + auto oldSchema = tableSchema(table); I wonder if the compiler could do a better job if you didn't stack allocate 'table' every time you hit this call. Maybe: auto oldSchema = tableSchema("TopFrameUniqueRedirectsTo"_s); > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:536 > + Vector<String> columns; Depending on how many columns table tend to have, there might be benefit to hinting at the capacity of 'columns'. (Vector::reserveCapacity) > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:571 > + auto attributedTableName = "AttributedPrivateClickMeasurement"_s; I would suggest making these 'const'. > Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:600 > + auto foundAttributedTableColumns = columnsForTable(attributedTableName); I think all of the above could be 'const'.
Kate Cheney
Comment 13 2021-03-26 11:33:29 PDT
Comment on attachment 424362 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=424362&action=review Thanks for the review! >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:274 >> +static const Vector<String> expectedUnattributedColumns() > > Since this is a never-destroyed object, I would return a const reference to avoid making a copy. Also, why not use the ExpectedColumns type alias? > > static const ExpectedColumns& expectedUnattributedColumns() Oops -- good catch. Will fix. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:280 >> +static const Vector<String> expectedAttributedColumns() > > Ditto "const ExpectedColumns&" Will fix. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:444 >> + auto currentSchema = tableSchema(table); > > Maybe: > auto currentSchema = tableSchema("AttributedPrivateClickMeasurement"_s) Will fix. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:456 >> + auto oldSchema = tableSchema(table); > > I wonder if the compiler could do a better job if you didn't stack allocate 'table' every time you hit this call. > > Maybe: > auto oldSchema = tableSchema("TopFrameUniqueRedirectsTo"_s); Will fix. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:536 >> + Vector<String> columns; > > Depending on how many columns table tend to have, there might be benefit to hinting at the capacity of 'columns'. (Vector::reserveCapacity) I think this would require knowing the expected number of columns, which we don't know in all cases (like WebResourceLoadStatisticsStore::statisticsDatabaseColumnsForTable). I could always allocate the max number of columns, but I don't know enough about vector allocation to know whether this is better than not reserving capacity at all. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:571 >> + auto attributedTableName = "AttributedPrivateClickMeasurement"_s; > > I would suggest making these 'const'. Will fix. >> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:600 >> + auto foundAttributedTableColumns = columnsForTable(attributedTableName); > > I think all of the above could be 'const'. Ditto.
Kate Cheney
Comment 14 2021-03-26 11:39:58 PDT
Created attachment 424384 [details] Patch for landing
EWS
Comment 15 2021-03-26 12:14:08 PDT
Committed r275106: <https://commits.webkit.org/r275106> All reviewed patches have been landed. Closing bug and clearing flags on attachment 424384 [details].
Note You need to log in before you can comment on or make changes to this bug.