Bug 194867

Summary: Use a SQLite database to hold the ResourceLoadStatistics data
Product: WebKit Reporter: Brent Fulgham <bfulgham>
Component: WebKit Misc.Assignee: Brent Fulgham <bfulgham>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, beidson, bfulgham, cdumez, commit-queue, ews-watchlist, Hironori.Fujii, mark.lam, simon.fraser, webkit-bug-importer, wilander
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 195088, 195294, 195419    
Attachments:
Description Flags
WIP Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch for landing
none
Patch for landing
none
Patch for landing
none
Patch for landing none

Description Brent Fulgham 2019-02-20 13:05:47 PST
The ResourceLoadStatistics database plist is a silly way to store the data. It requires a lot of stuff to be held in memory, and forces lots of looping and string comparisons to find out information.

This problem has already been solved in a much better way in the form of relational databases. We use them elsewhere in WebKit, and should do so for this storage as well.

This patch creates an optional SQLite database to handle ITP operations.

1. It adds a new internal experimental feature flag to activate it. It requires the user exit and restart the process to take effect.
2. It populates itself from the existing plist file (if it exists).
3. It stops using the plist in favor of the database.
4. It does queries and other operations using the database instead of the old hash table implementation.
Comment 1 Brent Fulgham 2019-02-20 13:54:06 PST
<rdar://problem/24240854>
Comment 2 Brent Fulgham 2019-02-21 10:12:26 PST
Created attachment 362617 [details]
WIP Patch
Comment 3 Brent Fulgham 2019-02-26 16:03:25 PST
Created attachment 363032 [details]
Patch
Comment 4 Brent Fulgham 2019-02-26 21:28:29 PST
Created attachment 363065 [details]
Patch
Comment 5 Brent Fulgham 2019-02-28 13:08:06 PST
Created attachment 363255 [details]
Patch
Comment 6 Simon Fraser (smfr) 2019-03-01 10:53:07 PST
Comment on attachment 363255 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363255&action=review

I didn't do a full review. This is a lot of code!

> Source/WebCore/page/RuntimeEnabledFeatures.h:540
> +    bool m_itpDatabaseMode { false };

This should be called m_itpDatabaseModeEnabled

> Source/WebCore/platform/sql/SQLiteDatabase.h:67
> +    WEBCORE_EXPORT int runVacuumCommand();

Comment should say what the return value means.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:94
> +    , m_observedDomainCount(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT COUNT(*) FROM ObservedDomains"_s))
> +    , m_insertObservedDomainStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "INSERT INTO ObservedDomains (registrableDomain, lastSeen, hadUserInteraction, mostRecentUserInteractionTime, grandfathered, isPrevalent, isVeryPrevalent, dataRecordsRemoved, timesAccessedAsFirstPartyDueToUserInteraction, timesAccessedAsFirstPartyDueToStorageAccessAPI) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"_s))
> +    , m_insertTopLevelDomainStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "INSERT INTO TopLevelDomains VALUES (?)"_s))
> +    , m_domainIDFromStringStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT domainID FROM ObservedDomains WHERE registrableDomain = ?"_s))
> +    , m_storageAccessUnderTopFrameDomainsStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "INSERT INTO StorageAccessUnderTopFrameDomains (domainID, topLevelDomainID) SELECT ?, domainID FROM ObservedDomains WHERE registrableDomain == ?"_s))
> +    , m_topFrameUniqueRedirectsTo(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "INSERT INTO TopFrameUniqueRedirectsTo (sourceDomainID, toDomainID) SELECT ?, domainID FROM ObservedDomains WHERE registrableDomain == ?"_s))
> +    , m_topFrameUniqueRedirectsToExists(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT EXISTS (SELECT 1 FROM TopFrameUniqueRedirectsTo WHERE sourceDomainID = ? AND toDomainID = (SELECT domainID FROM ObservedDomains WHERE registrableDomain = ?))"_s))
> +    , m_topFrameUniqueRedirectsFrom(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "INSERT INTO TopFrameUniqueRedirectsFrom (targetDomainID, fromDomainID) SELECT ?, domainID FROM ObservedDomains WHERE registrableDomain = ?"_s))
> +    , m_topFrameUniqueRedirectsFromExists(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT EXISTS (SELECT 1 FROM TopFrameUniqueRedirectsFrom WHERE targetDomainID = ? AND fromDomainID = (SELECT domainID FROM ObservedDomains WHERE registrableDomain = ?))"_s))
> +    , m_subframeUnderTopFrameDomains(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "INSERT INTO SubframeUnderTopFrameDomains (subFrameDomainID, topFrameDomainID) SELECT ?, domainID FROM ObservedDomains WHERE registrableDomain = ?"_s))
> +    , m_subframeUnderTopFrameDomainExists(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT EXISTS (SELECT 1 FROM SubframeUnderTopFrameDomains WHERE subFrameDomainID = ? AND topFrameDomainID = (SELECT domainID FROM ObservedDomains WHERE registrableDomain = ?))"_s))
> +    , m_subresourceUnderTopFrameDomains(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "INSERT INTO SubresourceUnderTopFrameDomains (subresourceDomainID, topFrameDomainID) SELECT ?, domainID FROM ObservedDomains WHERE registrableDomain = ?"_s))
> +    , m_subresourceUnderTopFrameDomainExists(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT EXISTS (SELECT 1 FROM SubresourceUnderTopFrameDomains WHERE subresourceDomainID = ? AND topFrameDomainID = (SELECT domainID FROM ObservedDomains WHERE registrableDomain = ?))"_s))
> +    , m_subresourceUniqueRedirectsTo(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "INSERT INTO SubresourceUniqueRedirectsTo (subresourceDomainID, toDomainID) SELECT ?, domainID FROM ObservedDomains WHERE registrableDomain == ?"_s))
> +    , m_subresourceUniqueRedirectsToExists(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT EXISTS (SELECT 1 FROM SubresourceUniqueRedirectsTo WHERE subresourceDomainID = ? AND toDomainID = (SELECT domainID FROM ObservedDomains WHERE registrableDomain = ?))"_s))
> +    , m_subresourceUniqueRedirectsFrom(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "INSERT INTO SubresourceUniqueRedirectsFrom (subresourceDomainID, fromDomainID) SELECT ?, domainID FROM ObservedDomains WHERE registrableDomain == ?"_s))
> +    , m_subresourceUniqueRedirectsFromExists(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT EXISTS (SELECT 1 FROM SubresourceUniqueRedirectsFrom WHERE subresourceDomainID = ? AND fromDomainID = (SELECT domainID FROM ObservedDomains WHERE registrableDomain = ?))"_s))
> +    , m_mostRecentUserInteractionStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET hadUserInteraction = ?, mostRecentUserInteractionTime = ? WHERE registrableDomain = ?"_s))
> +    , m_updateLastSeenStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET lastSeen = ? WHERE registrableDomain = ?"_s))
> +    , m_updatePrevalentResourceStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET isPrevalent = ? WHERE registrableDomain = ?"_s))
> +    , m_isPrevalentResourceStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT isPrevalent FROM ObservedDomains WHERE registrableDomain = ?"_s))
> +    , m_updateVeryPrevalentResourceStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET isVeryPrevalent = ? WHERE registrableDomain = ?"_s))
> +    , m_isVeryPrevalentResourceStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT isVeryPrevalent FROM ObservedDomains WHERE registrableDomain = ?"_s))
> +    , m_clearPrevalentResourceStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET isPrevalent = 0, isVeryPrevalent = 0 WHERE registrableDomain = ?"_s))
> +    , m_updateHadUserInteractionStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET hadUserInteraction = ? WHERE registrableDomain = ?"_s))
> +    , m_hadUserInteractionStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT hadUserInteraction, mostRecentUserInteractionTime FROM ObservedDomains WHERE registrableDomain = ?"_s))
> +    , m_updateMostRecentUserInteractionTimeStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET mostRecentUserInteractionTime = ? WHERE registrableDomain = ?"_s))
> +    , m_updateGrandfatheredStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET grandfathered = ? WHERE registrableDomain = ?"_s))
> +    , m_isGrandfatheredStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT grandfathered FROM ObservedDomains WHERE registrableDomain = ?"_s))
> +    , m_updateDataRecordsRemovedStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET dataRecordsRemoved = ? WHERE registrableDomain = ?"_s))
> +    , m_updateTimesAccessedAsFirstPartyDueToUserInteractionStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET timesAccessedAsFirstPartyDueToUserInteraction = ? WHERE registrableDomain = ?"_s))
> +    , m_updateTimesAccessedAsFirstPartyDueToStorageAccessAPIStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "UPDATE ObservedDomains SET timesAccessedAsFirstPartyDueToStorageAccessAPI = ? WHERE registrableDomain = ?"_s))
> +    , m_findExpiredUserInteractionStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT domainID FROM ObservedDomains WHERE hadUserInteraction = 1 AND mostRecentUserInteractionTime < ?"_s))

Would be nice if all these strings were in a const char* const [] to make them more readable.
Comment 7 Chris Dumez 2019-03-01 11:13:13 PST
Comment on attachment 363255 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363255&action=review

This is huge. I did not have time to go further.

> Source/WebCore/page/RuntimeEnabledFeatures.h:151
> +    void setItpDatabaseModeEnabled(bool isEnabled) { m_itpDatabaseMode = isEnabled; }

Don't we use all caps for abbreviations? setITP...

>> Source/WebCore/page/RuntimeEnabledFeatures.h:540
>> +    bool m_itpDatabaseMode { false };
> 
> This should be called m_itpDatabaseModeEnabled

Actually, Boolean variables need a prefix as per WebKit coding style so:
m_isITPDatabaseModeEnabled ?

Am I not even sure what this means though? Do we really want "Mode" in there?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:55
> +#define RELEASE_LOG_IF_ALLOWED(sessionID, fmt, ...) RELEASE_LOG_IF(sessionID.isAlwaysOnLoggingAllowed(), Network, "%p - NetworkHTTPSUpgradeChecker::" fmt, this, ##__VA_ARGS__)

NetworkHTTPSUpgradeChecker, haha funny. copy/pasta.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:56
> +#define RELEASE_LOG_ERROR_IF_ALLOWED(sessionID, fmt, ...) RELEASE_LOG_ERROR_IF(sessionID.isAlwaysOnLoggingAllowed(), Network, "%p - NetworkHTTPSUpgradeChecker::" fmt, this, ##__VA_ARGS__)

NetworkHTTPSUpgradeChecker, haha funny.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:60
> +    , m_storageDirectoryPath(storageDirectoryPath + String("/observations.db"))

Why String() ?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:382
> +    for (const auto& topFrameDomain : loadStatistics.storageAccessUnderTopFrameDomains)

We usually omit the "const" when using auto. Same comment below.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:397
> +    for (const auto& topFrameDomain : loadStatistics.subresourceUniqueRedirectsTo)

Should this still be named topFrameDomain?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:400
> +    for (const auto& topFrameDomain : loadStatistics.subresourceUniqueRedirectsFrom)

ditto.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:413
> +        const auto& statisticsMap = weakMemoryStore->data();

we normally omit const when using auto.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:414
> +        for (const auto& statistic : statisticsMap)

ditto.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:419
> +        for (const auto& statistic : statisticsMap)

ditto.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:439
> +        builder.appendLiteral("\"");

builder.append('"');

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:441
> +        builder.appendLiteral("\"");

ditto.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:464
> +        RELEASE_LOG(ResourceLoadStatistics, "Hit %u recursive calls in redirect backtrace. Returning early.", maxNumberOfRecursiveCallsInRedirectTraceBack);

Should this be a RELEASE_LOG_ERROR?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:468
> +    numberOfRecursiveCalls++;

++numberOfRecursiveCalls;

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:699
> +#if !RELEASE_LOG_DISABLED

Why do we need all these #if !RELEASE_LOG_DISABLED?
Cannot be make RELEASE_LOG_INFO_IF() a no-op in this case instead?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1031
> +    ASSERT(0);

ASSERT_NOT_REACHED() ?
Comment 8 Chris Dumez 2019-03-01 12:04:25 PST
Comment on attachment 363255 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363255&action=review

Ok, I did a full pass now.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:49
> +class ResourceLoadStatisticsDatabaseStore : public ResourceLoadStatisticsStore {

final ?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:168
> +    UniqueRef<WebCore::SQLiteStatement> m_observedDomainCount;

Could we mark some of these as mutable to avoid some of the const casting?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.h:48
> +class ResourceLoadStatisticsMemoryStore : public ResourceLoadStatisticsStore {

final ?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:93
> +    ResourceLoadStatisticsStore(WebResourceLoadStatisticsStore&, WorkQueue&);

This constructor should probably be protected.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:221
> +protected:

It is considered bad practice to have protected data members in a base class. Data members should ideally be private. You can have protected getters / setters for them as needed.

> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:195
> +    std::unique_ptr<ResourceLoadStatisticsDatabaseStore> m_databaseStore;

It seems you cannot have both a m_memoryStore and a m_databaseStore. Also, they both inherit the same base class. Seems to me there should be a single data member whose type is the base class. It would avoid all those if (m_databaseStore) checks in the cpp.
Comment 9 Alex Christensen 2019-03-01 12:18:29 PST
Comment on attachment 363255 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363255&action=review

How are we going to handle versioning and future schema upgrades?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:178
> +    mutable UniqueRef<WebCore::SQLiteStatement> m_subframeUnderTopFrameDomainExists;

Why are these all UniqueRef's?  No inheritance is involved, and this uses the same amount of memory, it just makes them separate allocations and adds a pointer.
Comment 10 Brent Fulgham 2019-03-01 12:45:38 PST
Comment on attachment 363255 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363255&action=review

>>> Source/WebCore/page/RuntimeEnabledFeatures.h:540
>>> +    bool m_itpDatabaseMode { false };
>> 
>> This should be called m_itpDatabaseModeEnabled
> 
> Actually, Boolean variables need a prefix as per WebKit coding style so:
> m_isITPDatabaseModeEnabled ?
> 
> Am I not even sure what this means though? Do we really want "Mode" in there?

Sure -- it could be m_isITPDatabaseEnabled.

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:55
>> +#define RELEASE_LOG_IF_ALLOWED(sessionID, fmt, ...) RELEASE_LOG_IF(sessionID.isAlwaysOnLoggingAllowed(), Network, "%p - NetworkHTTPSUpgradeChecker::" fmt, this, ##__VA_ARGS__)
> 
> NetworkHTTPSUpgradeChecker, haha funny. copy/pasta.

Now you know what I used as my inspiration ...

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:94
>> +    , m_findExpiredUserInteractionStatement(makeUniqueRef<WebCore::SQLiteStatement>(m_database.get(), "SELECT domainID FROM ObservedDomains WHERE hadUserInteraction = 1 AND mostRecentUserInteractionTime < ?"_s))
> 
> Would be nice if all these strings were in a const char* const [] to make them more readable.

Will do!

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:382
>> +    for (const auto& topFrameDomain : loadStatistics.storageAccessUnderTopFrameDomains)
> 
> We usually omit the "const" when using auto. Same comment below.

Will fix.

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:397
>> +    for (const auto& topFrameDomain : loadStatistics.subresourceUniqueRedirectsTo)
> 
> Should this still be named topFrameDomain?

No. Copy/paste foo on my part. I'll fix.

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:464
>> +        RELEASE_LOG(ResourceLoadStatistics, "Hit %u recursive calls in redirect backtrace. Returning early.", maxNumberOfRecursiveCallsInRedirectTraceBack);
> 
> Should this be a RELEASE_LOG_ERROR?

Yes, good catch.

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:699
>> +#if !RELEASE_LOG_DISABLED
> 
> Why do we need all these #if !RELEASE_LOG_DISABLED?
> Cannot be make RELEASE_LOG_INFO_IF() a no-op in this case instead?

Good point.

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1031
>> +    ASSERT(0);
> 
> ASSERT_NOT_REACHED() ?

Yes.

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:49
>> +class ResourceLoadStatisticsDatabaseStore : public ResourceLoadStatisticsStore {
> 
> final ?

Sure!

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:168
>> +    UniqueRef<WebCore::SQLiteStatement> m_observedDomainCount;
> 
> Could we mark some of these as mutable to avoid some of the const casting?

Yes. That's a better idea.

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:178
>> +    mutable UniqueRef<WebCore::SQLiteStatement> m_subframeUnderTopFrameDomainExists;
> 
> Why are these all UniqueRef's?  No inheritance is involved, and this uses the same amount of memory, it just makes them separate allocations and adds a pointer.

Do you mean why not just make them members (i.e., WebCore::SQLiteStatement m_subframeUnder...Blah)?

Sure - I can do so. I was mimicking other SQLite code I found in WebKit and though this was needed for some reason.

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.h:48
>> +class ResourceLoadStatisticsMemoryStore : public ResourceLoadStatisticsStore {
> 
> final ?

Yes

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:93
>> +    ResourceLoadStatisticsStore(WebResourceLoadStatisticsStore&, WorkQueue&);
> 
> This constructor should probably be protected.

Good idea.

>> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:221
>> +protected:
> 
> It is considered bad practice to have protected data members in a base class. Data members should ideally be private. You can have protected getters / setters for them as needed.

Okay -- I'll do so.

>> Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsStore.h:195
>> +    std::unique_ptr<ResourceLoadStatisticsDatabaseStore> m_databaseStore;
> 
> It seems you cannot have both a m_memoryStore and a m_databaseStore. Also, they both inherit the same base class. Seems to me there should be a single data member whose type is the base class. It would avoid all those if (m_databaseStore) checks in the cpp.

Oh! What an idiot I am! Of course -- I'll change that.
Comment 11 Brent Fulgham 2019-03-03 19:01:23 PST
Created attachment 363476 [details]
Patch
Comment 12 Brent Fulgham 2019-03-04 08:24:07 PST
Created attachment 363513 [details]
Patch
Comment 13 EWS Watchlist 2019-03-04 08:26:09 PST
Attachment 363513 [details] did not pass style-queue:


ERROR: Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:291:  Multi line control clauses should use braces.  [whitespace/braces] [4]
Total errors found: 1 in 27 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 14 Chris Dumez 2019-03-04 10:57:50 PST
Comment on attachment 363513 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=363513&action=review

r=me with changes.

> Source/WebCore/page/RuntimeEnabledFeatures.h:152
> +    bool iTPDatabaseEnabled() const { return m_isITPDatabaseEnabled; }

typo: iTPDatabaseEnabled -> isTPDatabaseEnabled

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:145
> +    this->workQueue()->dispatch([this, path = m_storageDirectoryPath.isolatedCopy()] {

this->workQueue()->dispatch() could be workQueue.dispatch(). But really, why are we dispatching here? We are already on the background queue.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:155
> +        if (!m_database.tableExists("ObservedDomains")) {

"ObservedDomains"_s

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:182
> +        result = m_observedDomainCount.getColumnInt(0) ? false : true;

Personally, I think all these ternaries would look better like so:
result = !m_observedDomainCount.getColumnInt(0);

But your call.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:190
> +constexpr auto createObservedDomain = "CREATE TABLE ObservedDomains ("

Could we move those to the top of the file with the others?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:329
> +{

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:363
> +    if (statement.bindInt(1, firstDomainID) != SQLITE_OK

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:371
> +    bool result = statement.getColumnInt(0) ? true : false;

I think it would look better like so: bool relationShipExists = !!statement.getColumnInt(0);

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:381
> +    if (statement.bindInt(1, domainID) != SQLITE_OK

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:419
> +    unsigned domainID = 0;

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:438
> +    auto registrableDomainID = domainID(loadStatistics.registrableDomain);

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:463
> +    workQueue()->dispatch([this, weakMemoryStore = makeWeakPtr(memoryStore)] {

Why are we dispatching? I believe this is already called on the background queue since all methods of the stores are supposed to be always called on the background queue. On a related note, we should assert we are on the right queue.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:506
> +    SQLiteStatement domainsToUpdateStatement(m_database, makeString("UPDATE ObservedDomains SET dataRecordsRemoved = dataRecordsRemoved + 1 WHERE registrableDomain IN (", domainsToString(domains), ")"));

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:539
> +            auto newDomain = findTopFrames.getColumnInt(0);

isNewDomain? or more likely newDomainID?

The name is confusing considering this is an int.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:594
> +    HashMap<unsigned, NotVeryPrevalentResources> results;

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:674
> +    ensurePrevalentResourcesForDebugMode();

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:680
> +{

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:685
> +{

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:822
> +{

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:833
> +    if (!debugModeEnabled())

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:942
> +    if (m_mostRecentUserInteractionStatement.bindInt(1, hadUserInteraction)

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1041
> +    SQLiteStatement domainsToUpdateStatement(m_database, makeString("UPDATE ObservedDomains SET isPrevalent = 1 WHERE domainID IN (", buildList(WTF::IteratorRange<StdSet<unsigned>::iterator>(domains.begin(), domains.end())), ")"));

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1055
> +    ASSERT_NOT_REACHED(0);

ASSERT_NOT_REACHED(0); -> ASSERT_NOT_REACHED();

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1268
> +    SQLiteStatement statement(m_database, makeString("SELECT isPrevalent, hadUserInteraction FROM ObservedDomains WHERE registrableDomain = ", domain.string()));

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:1284
> +    auto firstPartyPrimaryDomainID = domainID(firstPartyDomain);

ASSERT(!RunLoop::isMain());

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsMemoryStore.cpp:79
> +    this->workQueue()->dispatchAfter(5_s, [weakThis = makeWeakPtr(*this)] {

workQueue.dispatchAfter() ?

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:219
> +    Vector<OperatingDate>& operatingDates() { return m_operatingDates; }

Please add a proper setter instead.

> Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsStore.h:221
> +    WallTime& endOfGrandfatheringTimestamp() { return m_endOfGrandfatheringTimestamp; }

Please add a proper setter instead: setEndOfGrandfatheringTimestamp(WallTime);
Comment 15 Alex Christensen 2019-03-04 12:10:14 PST
(In reply to Alex Christensen from comment #9)
> How are we going to handle versioning and future schema upgrades?
This was not answered.
Comment 16 Brent Fulgham 2019-03-04 14:23:25 PST
(In reply to Alex Christensen from comment #9)
> Comment on attachment 363255 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=363255&action=review
> 
> How are we going to handle versioning and future schema upgrades?

I'll add a new radar to track that. I'll probably need to have a revision stored somewhere. I was thinking about adding a table to hold "operating dates" as well, so I can do a separate patch for that once this one is landed. It's already enormous.
Comment 17 Brent Fulgham 2019-03-04 14:42:56 PST
(In reply to Brent Fulgham from comment #16)
> (In reply to Alex Christensen from comment #9)
> > Comment on attachment 363255 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=363255&action=review
> > 
> > How are we going to handle versioning and future schema upgrades?
> 
> I'll add a new radar to track that. I'll probably need to have a revision
> stored somewhere. I was thinking about adding a table to hold "operating
> dates" as well, so I can do a separate patch for that once this one is
> landed. It's already enormous.

Bug 195294
Comment 18 Brent Fulgham 2019-03-04 14:51:42 PST
Created attachment 363552 [details]
Patch for landing
Comment 19 Brent Fulgham 2019-03-04 16:13:24 PST
Created attachment 363566 [details]
Patch for landing
Comment 20 WebKit Commit Bot 2019-03-04 16:45:56 PST
Comment on attachment 363566 [details]
Patch for landing

Rejecting attachment 363566 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-03', 'build', '--no-clean', '--no-update', '--build-style=release', '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit

Last 5000 characters of output:
MATTING_CONTEXT -DENABLE_LEGACY_CSS_VENDOR_PREFIXES -DENABLE_LEGACY_CUSTOM_PROTOCOL_MANAGER -DENABLE_LEGACY_ENCRYPTED_MEDIA -DENABLE_MATHML -DENABLE_MEDIA_CONTROLS_SCRIPT -DENABLE_MEDIA_SOURCE -DENABLE_MEDIA_STREAM -DENABLE_METER_ELEMENT -DENABLE_MOUSE_CURSOR_SCALE -DENABLE_NOTIFICATIONS -DENABLE_PAYMENT_REQUEST -DENABLE_PDFKIT_PLUGIN -DENABLE_POINTER_EVENTS -DENABLE_POINTER_LOCK -DENABLE_PUBLIC_SUFFIX_LIST -DENABLE_REMOTE_INSPECTOR -DENABLE_RESOURCE_LOAD_STATISTICS -DENABLE_RESOURCE_USAGE -DENABLE_RUBBER_BANDING -DENABLE_SERVICE_CONTROLS -DENABLE_SERVICE_WORKER -DENABLE_SPEECH_SYNTHESIS -DENABLE_STREAMS_API -DENABLE_WEB_CRYPTO -DENABLE_SVG_FONTS -DENABLE_TELEPHONE_NUMBER_DETECTION -DENABLE_TEXT_AUTOSIZING -DENABLE_USER_MESSAGE_HANDLERS -DENABLE_USERSELECT_ALL -DENABLE_VARIATION_FONTS -DENABLE_VIDEO -DENABLE_VIDEO_PRESENTATION_MODE -DENABLE_VIDEO_TRACK -DENABLE_VIDEO_USES_ELEMENT_FULLSCREEN -DENABLE_WEB_AUDIO -DENABLE_WEB_AUTHN -DENABLE_WEB_RTC -DENABLE_WEBGL -DENABLE_WEBGL2 -DENABLE_WEBGPU -DENABLE_WEBMETAL -DENABLE_WIRELESS_PLAYBACK_TARGET -DENABLE_XSLT -DENABLE_MANUAL_SANDBOXING -DHAVE_CORE_PREDICTION -DU_HIDE_DEPRECATED_API -DU_DISABLE_RENAMING=1 -DU_SHOW_CPLUSPLUS_API=0 -DFRAMEWORK_NAME=WebKit -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -fasm-blocks -fstrict-aliasing -Wdeprecated-declarations -Winvalid-offsetof -mmacosx-version-min=10.13 -g -fvisibility=hidden -fvisibility-inlines-hidden -fno-threadsafe-statics -Wno-sign-conversion -Winfinite-recursion -Wmove -Wcomma -Wblock-capture-autoreleasing -Wstrict-prototypes -Wrange-loop-analysis -Wno-semicolon-before-method-body -iquote /Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/WebKit-generated-files.hmap -I/Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/WebKit-own-target-headers.hmap -I/Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/WebKit-all-target-headers.hmap -iquote /Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/WebKit-project-headers.hmap -I/Volumes/Data/EWS/WebKit/WebKitBuild/Release/include -I/Volumes/Data/EWS/WebKit/WebKitBuild/Release/usr/local/include -I/Volumes/Data/EWS/WebKit/WebKitBuild/Release/WebCore.framework/PrivateHeaders/ForwardingHeaders -I/Volumes/Data/EWS/WebKit/WebKitBuild/Release/DerivedSources/WebKit2 -I/Volumes/Data/EWS/WebKit/WebKitBuild/Release/usr/local/include/WebKitAdditions -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/local/include/WebKitAdditions -I/Volumes/Data/EWS/WebKit/WebKitBuild/Release/usr/local/include/webrtc -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/local/include/webrtc -I/Volumes/Data/EWS/WebKit/Source/WebKit -I/Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/DerivedSources/x86_64 -I/Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/DerivedSources -Wall -Wextra -Wcast-qual -Wchar-subscripts -Wextra-tokens -Wformat-security -Winit-self -Wmissing-format-attribute -Wmissing-noreturn -Wno-unused-parameter -Wpacked -Wpointer-arith -Wredundant-decls -Wundef -Wwrite-strings -Wexit-time-destructors -Wglobal-constructors -Wtautological-compare -Wimplicit-fallthrough -F/Volumes/Data/EWS/WebKit/WebKitBuild/Release -iframework /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/PrivateFrameworks -iframework /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks -iframework /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/PrivateFrameworks -isystem /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/System.framework/PrivateHeaders -include /Volumes/Data/EWS/WebKit/WebKitBuild/PrecompiledHeaders/WebKit2Prefix-bzwgnnttmtoavwhghxyzrzozdnuw/WebKit2Prefix.h -MMD -MT dependencies -MF /Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/Objects-normal/x86_64/WebSocketStreamMessageReceiver.d --serialize-diagnostics /Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/Objects-normal/x86_64/WebSocketStreamMessageReceiver.dia -c /Volumes/Data/EWS/WebKit/WebKitBuild/Release/DerivedSources/WebKit2/WebSocketStreamMessageReceiver.cpp -o /Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/Objects-normal/x86_64/WebSocketStreamMessageReceiver.o

** BUILD FAILED **


The following build commands failed:
	CompileC /Volumes/Data/EWS/WebKit/WebKitBuild/WebKit.build/Release/WebKit.build/Objects-normal/x86_64/WebPageUpdatePreferences.o /Volumes/Data/EWS/WebKit/WebKitBuild/Release/DerivedSources/WebKit2/WebPageUpdatePreferences.cpp normal x86_64 c++ com.apple.compilers.llvm.clang.1_0.compiler
(1 failure)

Full output: https://webkit-queues.webkit.org/results/11369087
Comment 21 Brent Fulgham 2019-03-04 18:15:30 PST
Created attachment 363577 [details]
Patch for landing
Comment 22 Brent Fulgham 2019-03-04 18:23:15 PST
Created attachment 363578 [details]
Patch for landing
Comment 23 WebKit Commit Bot 2019-03-04 19:00:36 PST
Comment on attachment 363578 [details]
Patch for landing

Clearing flags on attachment: 363578

Committed r242406: <https://trac.webkit.org/changeset/242406>
Comment 24 WebKit Commit Bot 2019-03-04 19:00:38 PST
All reviewed patches have been landed.  Closing bug.
Comment 25 Fujii Hironori 2019-03-04 20:26:03 PST
Committed r242409: <https://trac.webkit.org/changeset/242409>
Comment 26 Brent Fulgham 2019-08-12 10:04:12 PDT
*** Bug 154642 has been marked as a duplicate of this bug. ***