Bug 229646

Summary: Migrate PrivateClickMeasurements from ResourceLoadStatistics database to new database
Product: WebKit Reporter: Alex Christensen <achristensen>
Component: New BugsAssignee: Alex Christensen <achristensen>
Status: RESOLVED FIXED    
Severity: Normal CC: katherine_cheney, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch none

Description Alex Christensen 2021-08-28 00:56:30 PDT
Migrate PrivateClickMeasurements from ResourceLoadStatistics database to new database
Comment 1 Alex Christensen 2021-08-28 00:59:45 PDT
Created attachment 436713 [details]
Patch
Comment 2 Alex Christensen 2021-08-28 12:46:20 PDT
Created attachment 436722 [details]
Patch
Comment 3 Kate Cheney 2021-08-30 12:49:26 PDT
Comment on attachment 436722 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=436722&action=review

This is fine but I think we can remove a lot of migration code now that we convert the database entries into PCM objects before re-inserting. We shouldn't need to rename columns or even add new ones I don't think.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:537
> +        constexpr auto allAttributedPrivateClickMeasurementQuery = "SELECT *, MIN(earliestTimeToSendToSource, earliestTimeToSendToDestination) as minVal "

We sort these later in PrivateClickMeasurementDatabase when we pass them to PrivateClickMeasurementManager, so no need to sort them here. You can just use SELECT * FROM AttributedPrivateClickMeasurement.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:639
> +void ResourceLoadStatisticsDatabaseStore::renameColumnInTable(String&& tableName, String&& existingColumnName, String&& expectedColumnName)

You shouldn't need this anymore if you use SELECT * statements when migrating to the new schema, because they don't rely on the name of the columns to migrate the data.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:654
> +void ResourceLoadStatisticsDatabaseStore::addMissingColumnsToTable(String&& tableName, const Vector<String>& existingColumns, const Vector<String>& expectedColumns)

You shouldn't need this either. I think SQLite returns null values if the requested column doesn't exist, so the call to buildPrivateClickMeasurementFromDatabase() will return null values for keyID, token, signature, and earliestTimeToSend even if the columns don't exist. It might be nice to be safe and check for null strings before calling attribution.setSourceSecretToken() so we don't create a token with only null values.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:679
> +void ResourceLoadStatisticsDatabaseStore::addMissingColumnsIfNecessary()

Ditto, you can remove this.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:735
> +void ResourceLoadStatisticsDatabaseStore::renameColumnsIfNecessary()

Ditto, I think we can remove.

> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementDatabase.cpp:153
> +        ASSERT(sourceEarliestTimeToSend != -1);

I think we should adjust this to be ASSERT(sourceEarliestTimeToSend != -1 || destinationEarliestTimeToSend != -1);

It is possible you will migrate an entry that was already sent to the source but not the destination, resulting in a sourceEarliestTimeToSend value of -1.

> Tools/TestWebKitAPI/Tests/WebKitCocoa/PrivateClickMeasurement.mm:58
> +void addValuesToTable(WebCore::SQLiteDatabase& database, ASCIILiteral query, std::array<Variant<StringView, int, double>, size> values)

fancy tests!

> Tools/TestWebKitAPI/Tests/WebKitCocoa/PrivateClickMeasurement.mm:88
> +        "earliestTimeToSend REAL, "

To mimic the actual v1, earliestTimeToSend should be REAL NOT NULL.

> Tools/TestWebKitAPI/Tests/WebKitCocoa/PrivateClickMeasurement.mm:117
> +        "earliestTimeToSend REAL, token TEXT, signature TEXT, keyID TEXT, "

Ditto, this should be REAL NOT NULL for an accurate test.
Comment 4 Alex Christensen 2021-08-30 14:40:48 PDT
Created attachment 436814 [details]
Patch
Comment 5 Kate Cheney 2021-08-30 15:28:54 PDT
Comment on attachment 436814 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=436814&action=review

r=me

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:261
> +    mutable std::unique_ptr<WebCore::SQLiteStatement> m_checkIfTableExistsStatement;

I don't think this needs to be mutable.

> Source/WebKit/NetworkProcess/PrivateClickMeasurement/PrivateClickMeasurementDatabase.h:92
> +    mutable Statement m_insertObservedDomainStatement;

Do these all need to be mutable? I don't think most are used in const functions.
Comment 6 Alex Christensen 2021-08-30 15:31:14 PDT
Comment on attachment 436814 [details]
Patch

They're all conceptually caches, which is what "mutable" is for.  Making them mutable allows them to be used in const contexts.  I needed to add some const, so I made them all mutable so that we don't need to hurt future change history by adding it.
Comment 7 EWS 2021-08-30 15:35:37 PDT
Committed r281779 (241116@main): <https://commits.webkit.org/241116@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 436814 [details].
Comment 8 Radar WebKit Bug Importer 2021-08-30 15:36:25 PDT
<rdar://problem/82550832>