Bug 229646 - Migrate PrivateClickMeasurements from ResourceLoadStatistics database to new database
Summary: Migrate PrivateClickMeasurements from ResourceLoadStatistics database to new ...
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Alex Christensen
Keywords: InRadar
Depends on:
Reported: 2021-08-28 00:56 PDT by Alex Christensen
Modified: 2021-08-30 15:36 PDT (History)
2 users (show)

See Also:

Patch (220.06 KB, patch)
2021-08-28 00:59 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (219.90 KB, patch)
2021-08-28 12:46 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (212.69 KB, patch)
2021-08-30 14:40 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alex Christensen 2021-08-28 00:56:30 PDT
Migrate PrivateClickMeasurements from ResourceLoadStatistics database to new database
Comment 1 Alex Christensen 2021-08-28 00:59:45 PDT
Created attachment 436713 [details]
Comment 2 Alex Christensen 2021-08-28 12:46:20 PDT
Created attachment 436722 [details]
Comment 3 Kate Cheney 2021-08-30 12:49:26 PDT
Comment on attachment 436722 [details]

View in context: https://bugs.webkit.org/attachment.cgi?id=436722&action=review

This is fine but I think we can remove a lot of migration code now that we convert the database entries into PCM objects before re-inserting. We shouldn't need to rename columns or even add new ones I don't think.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:537
> +        constexpr auto allAttributedPrivateClickMeasurementQuery = "SELECT *, MIN(earliestTimeToSendToSource, earliestTimeToSendToDestination) as minVal "

We sort these later in PrivateClickMeasurementDatabase when we pass them to PrivateClickMeasurementManager, so no need to sort them here. You can just use SELECT * FROM AttributedPrivateClickMeasurement.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:639
> +void ResourceLoadStatisticsDatabaseStore::renameColumnInTable(String&& tableName, String&& existingColumnName, String&& expectedColumnName)

You shouldn't need this anymore if you use SELECT * statements when migrating to the new schema, because they don't rely on the name of the columns to migrate the data.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:654
> +void ResourceLoadStatisticsDatabaseStore::addMissingColumnsToTable(String&& tableName, const Vector<String>& existingColumns, const Vector<String>& expectedColumns)

You shouldn't need this either. I think SQLite returns null values if the requested column doesn't exist, so the call to buildPrivateClickMeasurementFromDatabase() will return null values for keyID, token, signature, and earliestTimeToSend even if the columns don't exist. It might be nice to be safe and check for null strings before calling attribution.setSourceSecretToken() so we don't create a token with only null values.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:679
> +void ResourceLoadStatisticsDatabaseStore::addMissingColumnsIfNecessary()

Ditto, you can remove this.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:735
> +void ResourceLoadStatisticsDatabaseStore::renameColumnsIfNecessary()

Ditto, I think we can remove.

> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementDatabase.cpp:153
> +        ASSERT(sourceEarliestTimeToSend != -1);

I think we should adjust this to be ASSERT(sourceEarliestTimeToSend != -1 || destinationEarliestTimeToSend != -1);

It is possible you will migrate an entry that was already sent to the source but not the destination, resulting in a sourceEarliestTimeToSend value of -1.

> Tools/TestWebKitAPI/Tests/WebKitCocoa/PrivateClickMeasurement.mm:58
> +void addValuesToTable(WebCore::SQLiteDatabase& database, ASCIILiteral query, std::array<Variant<StringView, int, double>, size> values)

fancy tests!

> Tools/TestWebKitAPI/Tests/WebKitCocoa/PrivateClickMeasurement.mm:88
> +        "earliestTimeToSend REAL, "

To mimic the actual v1, earliestTimeToSend should be REAL NOT NULL.

> Tools/TestWebKitAPI/Tests/WebKitCocoa/PrivateClickMeasurement.mm:117
> +        "earliestTimeToSend REAL, token TEXT, signature TEXT, keyID TEXT, "

Ditto, this should be REAL NOT NULL for an accurate test.
Comment 4 Alex Christensen 2021-08-30 14:40:48 PDT
Created attachment 436814 [details]
Comment 5 Kate Cheney 2021-08-30 15:28:54 PDT
Comment on attachment 436814 [details]

View in context: https://bugs.webkit.org/attachment.cgi?id=436814&action=review


> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:261
> +    mutable std::unique_ptr<WebCore::SQLiteStatement> m_checkIfTableExistsStatement;

I don't think this needs to be mutable.

> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementDatabase.h:92
> +    mutable Statement m_insertObservedDomainStatement;

Do these all need to be mutable? I don't think most are used in const functions.
Comment 6 Alex Christensen 2021-08-30 15:31:14 PDT
Comment on attachment 436814 [details]

They're all conceptually caches, which is what "mutable" is for.  Making them mutable allows them to be used in const contexts.  I needed to add some const, so I made them all mutable so that we don't need to hurt future change history by adding it.
Comment 7 EWS 2021-08-30 15:35:37 PDT
Committed r281779 (241116@main): <https://commits.webkit.org/241116@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 436814 [details].
Comment 8 Radar WebKit Bug Importer 2021-08-30 15:36:25 PDT