Bug 234194

Summary: Add PushDatabase
Product: WebKit Reporter: Ben Nham <nham>
Component: New BugsAssignee: Ben Nham <nham>
Status: RESOLVED FIXED    
Severity: Normal CC: annulen, beidson, ews-watchlist, gyuyoung.kim, nham, ryuan.choi, sergio, sihui_liu, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 235857, 238010    
Attachments:
Description Flags
Patch
ews-feeder: commit-queue-
Patch
ews-feeder: commit-queue-
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
rebase
none
patch feedback none

Description Ben Nham 2021-12-10 22:27:30 PST
Add PushDatabase
Comment 1 Ben Nham 2021-12-10 22:31:21 PST
Created attachment 446862 [details]
Patch
Comment 2 Radar WebKit Bug Importer 2021-12-10 22:32:41 PST
<rdar://problem/86353439>
Comment 3 Ben Nham 2021-12-10 22:34:27 PST
Created attachment 446863 [details]
Patch
Comment 4 Ben Nham 2021-12-10 22:53:02 PST
Created attachment 446864 [details]
Patch
Comment 5 Ben Nham 2021-12-11 12:28:56 PST
Created attachment 446896 [details]
Patch
Comment 6 Ben Nham 2021-12-14 17:44:24 PST
Created attachment 447185 [details]
Patch
Comment 7 Ben Nham 2022-01-28 22:31:23 PST
Created attachment 450314 [details]
Patch
Comment 8 Ben Nham 2022-01-28 22:32:56 PST
Created attachment 450315 [details]
Patch
Comment 9 Ben Nham 2022-01-30 22:00:00 PST
Created attachment 450379 [details]
rebase
Comment 10 youenn fablet 2022-02-01 23:19:11 PST
Comment on attachment 450379 [details]
rebase

Some nits here and there.
LGTM overall but it may be good to have a closer SQL-dedicated look.

Question: in service worker we encode the schema ID in the filename, here it is added in the database itself.
Any reason why we are not using a single model?

View in context: https://bugs.webkit.org/attachment.cgi?id=450379&action=review

> Source/WebCore/Modules/push-api/PushDatabase.cpp:2
> + * Copyright (C) 2021 Apple Inc. All rights reserved.

2022

> Source/WebCore/Modules/push-api/PushDatabase.cpp:131
> +                failOnMainQueue(WTFMove(completionHandler));

Shouldn't we call createFreshPushDatabase?

> Source/WebCore/Modules/push-api/PushDatabase.cpp:152
> +                    failOnMainQueue(WTFMove(completionHandler));

Shouldn't we call createFreshPushDatabase here as well?

> Source/WebCore/Modules/push-api/PushDatabase.cpp:216
> +static void completeOnMainQueue(CompletionHandler<void(T)>&& completionHandler, const U& result)

s/const U&/U&&/

> Source/WebCore/Modules/push-api/PushDatabase.cpp:218
> +    WorkQueue::main().dispatch([completionHandler = WTFMove(completionHandler), result = crossThreadCopy(result)]() mutable {

s/result/WTFMove(result) to get the potential optimization of reusing what can be reused (one ref strings, vectors...)

> Source/WebCore/Modules/push-api/PushDatabase.cpp:468
> +        Vector<Vector<String>> topicsByWakeState(numberOfWakeStates); // used because HashMap::isolatedCopy doesn't exist

Can it be a Vector<Vector<String>, PushWakeState::NumberOfStates>?

> Source/WebCore/Modules/push-api/PushDatabase.cpp:490
> +        WorkQueue::main().dispatch([completionHandler = WTFMove(completionHandler), topicsByWakeState = crossThreadCopy(topicsByWakeState)]() mutable {

WTFMove(topicsByWakeState) to optimise it.

> Source/WebCore/Modules/push-api/PushDatabase.cpp:492
> +            for (int i = 0; i < static_cast<int>(PushWakeState::NumberOfStates); ++i)

Would uint8_t allow to remove the static_cast?

> Source/WebCore/Modules/push-api/PushDatabase.h:33
> +#include "SQLiteStatement.h"

Needed?

> Source/WebCore/Modules/push-api/PushDatabase.h:37
> +#include <wtf/Function.h>

Function probably not needed.

> Source/WebCore/Modules/push-api/PushDatabase.h:68
> +    WEBCORE_EXPORT PushRecord isolatedCopy() const;

A WEBCORE_EXPORT PushRecord isolatedCopy() && version might be useful to reuse all those Vectors and probably one-ref Strings.

> Source/WebCore/Modules/push-api/PushDatabase.h:88
> +    explicit PushDatabase(Ref<WorkQueue>&&, std::unique_ptr<WebCore::SQLiteDatabase>&&);

s/explicit//

> Source/WebCore/Modules/push-api/PushDatabase.h:92
> +    std::unique_ptr<WebCore::SQLiteDatabase> m_db;

I guess it could be a UniqueRef.

> Tools/ChangeLog:17
> +        (TestWebKitAPI::TEST_F):

Great tests!
Can we add another one that would do something like:
- Start an action on a database
- Synchronously destroy explicitly the database and make sure nothing crashes (dispatchSync probably ensures this).
Comment 11 Sihui Liu 2022-02-02 10:32:26 PST
Comment on attachment 450379 [details]
rebase

View in context: https://bugs.webkit.org/attachment.cgi?id=450379&action=review

> Source/WebCore/Modules/push-api/PushDatabase.cpp:69
> +    "PRAGMA user_version = 1"_s

Do we want to ensure user_version is the same as currentPushDatabaseVersion, e.g. by creating this query from currentPushDatabaseVersion.

> Source/WebCore/Modules/push-api/PushDatabase.cpp:134
> +            version = sql->columnInt(0);

I think columnInt() runs step(); may just check if version is 0.

> Source/WebCore/Modules/push-api/PushDatabase.cpp:137
> +        if (version > currentPushDatabaseVersion) {

Why do we have different policies for incorrect versions if there is only one valid version in current code? We may just create a new database if version does not match, and add code to migrate data when different versions are possible.

> Source/WebCore/Modules/push-api/PushDatabase.cpp:140
> +                createFreshPushDatabase(path, WTFMove(completionHandler));

Why do we need to dispatch back to main thread? It seems more efficient to just close the database, delete file, and open again here?

> Source/WebCore/Modules/push-api/PushDatabase.cpp:174
> +{

Can we assert this is main thread?

> Source/WebCore/Modules/push-api/PushDatabase.cpp:177
> +PushDatabase::~PushDatabase()

Can we avoid file I/O in destructor by adding a public PushDatabase::close() and assert database is closed here?

> Source/WebCore/Modules/push-api/PushDatabase.cpp:178
> +{

Can we assert this is main thread?

> Source/WebCore/Modules/push-api/PushDatabase.cpp:179
> +    m_queue->dispatchSync([db = WTFMove(m_db), statements = WTFMove(m_statements)]() mutable {

Is it safe to access m_db from both m_queue and main thread?

> Source/WebCore/Modules/push-api/PushDatabase.cpp:186
> +{

Can we assert this is non-main thread?

> Source/WebCore/Modules/push-api/PushDatabase.cpp:217
> +{

Can we assert this is non-main thread.

> Source/WebCore/Modules/push-api/PushDatabase.cpp:223
> +void PushDatabase::insertRecord(const PushRecord& record, CompletionHandler<void(std::optional<PushRecord>&&)>&& completionHandler)

If we are not changing the record, can we rely on call site to remember the record (e.g. in the completionHandler) and only return true/false for result? It seems we need to isolated copy again when passing record back to main thread.

> Source/WebCore/Modules/push-api/PushDatabase.cpp:404
> +            "CROSS JOIN SubscriptionSets ss "

Do we want INNER JOIN instead? (https://www.sqlite.org/lang_select.html 2.2. Special handling of CROSS JOIN)

> Source/WebCore/Modules/push-api/PushDatabase.h:2
> + * Copyright (C) 2021 Apple Inc. All rights reserved.

2022

> Source/WebCore/platform/sql/SQLiteDatabase.cpp:151
> +    bool isInMemory = filename == ":memory:"_s;

Can we add a function SQLiteDatabase::openInMemory(OpenMode openMode) which calls open(":memory:", openMode), so the call sites don't need to know the special name.
In PushDatabase we can check empty string for in-memory database.

> Tools/TestWebKitAPI/Tests/WebCore/PushDatabase.cpp:2
> + * Copyright (C) 2020 Apple Inc. All rights reserved.

2022
Comment 12 Ben Nham 2022-02-02 16:47:07 PST
(In reply to youenn fablet from comment #10)
> Comment on attachment 450379 [details]

> Question: in service worker we encode the schema ID in the filename, here it
> is added in the database itself.
> Any reason why we are not using a single model?

We will eventually need to support data migrations, so we need to store the schema version somewhere, and PRAGMA user_version is designed to do that. It is also protected by the same SQL transaction that migrates the schema, so we can be sure that the migration is atomic.

Changing a filename occurs outside of the context of a SQL transaction, so that seems more likely to be buggy. Why do we do that?

> > Source/WebCore/Modules/push-api/PushDatabase.cpp:492
> > +            for (int i = 0; i < static_cast<int>(PushWakeState::NumberOfStates); ++i)
> 
> Would uint8_t allow to remove the static_cast?

I don't think so, an enum class doesn't get implicitly converted to its underlying type.

> > Source/WebCore/Modules/push-api/PushDatabase.h:33
> > +#include "SQLiteStatement.h"
> 
> Needed?

This one is actually needed or else I get a bunch of errors from clients of this class using an incomplete type (I think this is due to the way UniqueRef is implemented).

I think I've addressed the rest of the comments in the latest patch.
Comment 13 Ben Nham 2022-02-02 16:53:33 PST
(In reply to Sihui Liu from comment #11)
> Comment on attachment 450379 [details]

> > Source/WebCore/Modules/push-api/PushDatabase.cpp:134
> > +            version = sql->columnInt(0);
> 
> I think columnInt() runs step(); may just check if version is 0.

Yeah, I just checked step() for SQLITE_ROW here anyway since technically I need to check the return result of prepareStatement.

> > Source/WebCore/Modules/push-api/PushDatabase.cpp:140
> > +                createFreshPushDatabase(path, WTFMove(completionHandler));
> 
> Why do we need to dispatch back to main thread? It seems more efficient to
> just close the database, delete file, and open again here?

I've gotten rid of the thread hopping in the upcoming version of the patch.

> > Source/WebCore/Modules/push-api/PushDatabase.cpp:179
> > +    m_queue->dispatchSync([db = WTFMove(m_db), statements = WTFMove(m_statements)]() mutable {
> 
> Is it safe to access m_db from both m_queue and main thread?

It should be safe in this limited case, as moving from those member variables is basically doing a bunch of pointer swaps, and the queue dispatch (dispatch_async) should emit a full memory barrier.

> > Source/WebCore/Modules/push-api/PushDatabase.cpp:223
> > +void PushDatabase::insertRecord(const PushRecord& record, CompletionHandler<void(std::optional<PushRecord>&&)>&& completionHandler)
> 
> If we are not changing the record, can we rely on call site to remember the
> record (e.g. in the completionHandler) and only return true/false for
> result? It seems we need to isolated copy again when passing record back to
> main thread.

We are changing both the identifier and wakeState fields in the record on insertion. We could just pass back both of those fields, but it makes the calling code cleaner to just pass back the whole record. Creating a push subscription shouldn't happen very often, so I think it's probably okay to do that if it makes the code easier to reason about.

> > Source/WebCore/Modules/push-api/PushDatabase.cpp:404
> > +            "CROSS JOIN SubscriptionSets ss "
> 
> Do we want INNER JOIN instead? (https://www.sqlite.org/lang_select.html 2.2.
> Special handling of CROSS JOIN)

Yeah, in the comment I am trying to mention that we are using CROSS JOIN on purpose to get a specific lookup ordering.

I think I've addressed the other comments in the upcoming version of the patch.
Comment 14 Ben Nham 2022-02-02 17:30:22 PST
I forgot to reply to one concern.

(In reply to Sihui Liu from comment #11)
> > Source/WebCore/Modules/push-api/PushDatabase.cpp:177
> > +PushDatabase::~PushDatabase()
> 
> Can we avoid file I/O in destructor by adding a public PushDatabase::close()
> and assert database is closed here?

I would prefer to keep things simple here with the dispatchSync rather than making the object track additional state about whether it's closed. In reality this object is eventually pointed to by a NeverDestroyed<WebPushDaemon> singleton object so the destructor will never actually run anyway. So it seems better to just keep the destructor implementation simple-but-correct.
Comment 15 Ben Nham 2022-02-02 22:07:18 PST
Created attachment 450736 [details]
patch feedback
Comment 16 Brady Eidson 2022-02-04 15:00:44 PST
Comment on attachment 450736 [details]
patch feedback

Based on the review back and forth already, and a glance at the patch, seems fine.
Comment 17 EWS 2022-02-04 15:22:47 PST
Committed r289139 (246835@main): <https://commits.webkit.org/246835@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 450736 [details].
Comment 18 Nikolas Zimmermann 2022-03-17 01:30:41 PDT
For me clean rebuilds are broken after this change, since PushSubscriptionIdentifier.h is present twice in the Xcode project file, which causes the "copy phase" to fail on clean builds.

I've fixed this in r291394.