Bug 201057 - IndexedDB: update size of database when database operation is completed
Summary: IndexedDB: update size of database when database operation is completed
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Sihui Liu
URL:
Keywords: InRadar
Depends on: 169621
Blocks:
  Show dependency treegraph
 
Reported: 2019-08-22 16:37 PDT by Sihui Liu
Modified: 2019-08-30 10:20 PDT (History)
8 users (show)

See Also:


Attachments
Patch (21.15 KB, patch)
2019-08-22 17:05 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (22.66 KB, patch)
2019-08-26 12:46 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (26.38 KB, patch)
2019-08-26 17:54 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews210 for win-future (13.56 MB, application/zip)
2019-08-26 21:13 PDT, Build Bot
no flags Details
Patch (19.70 KB, patch)
2019-08-27 15:40 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch for landing (19.44 KB, patch)
2019-08-28 11:18 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch for landing (19.46 KB, patch)
2019-08-28 11:19 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from webkit-cq-01 for mac-highsierra (3.32 MB, application/zip)
2019-08-28 13:19 PDT, WebKit Commit Bot
no flags Details
Patch for landing (19.53 KB, patch)
2019-08-30 09:36 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sihui Liu 2019-08-22 16:37:04 PDT
Currently when a database operation is completed and its result gets posted to main thread, we re-compute the disk usage of the origin. 
This computation lists all databases in the origin directory and read the size of each database file. This is very inefficient because the completed operation should only affect one database.
Comment 1 Sihui Liu 2019-08-22 17:05:39 PDT
Created attachment 377079 [details]
Patch
Comment 2 Daniel Bates 2019-08-23 00:41:07 PDT
Comment on attachment 377079 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377079&action=review

This patch looks sane. All of my comments are optional: code is OK as-is.

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:745
> +void IDBServer::QuotaUser::initializeSpaceUsed(HashMap<String, uint64_t> databaseSizes)

I was not expecting databaseSizes to be passed by-value. I was expecting it to either

1. Be passed by rvalue-reference and WTFMove()ed into m_databaseSizes.
    ^^^ This is the post C++11 way we do things. Caller that still needs to have this data would perform the copy.

2. Be passed by const lvalue reference and copied into m_databaseSizes.
    ^^^ This is the pre-C++11 way we use to pass things.

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:755
> +void IDBServer::QuotaUser::updateSpaceUsed(const String& encodeDatabaseName, uint64_t newSize)

I was not expecting the name encodeDatabaseName because:

1. encode is a verb and the variable reads like an action, but it's not an action.
2. this isn't the same name as the parameter in updateSpaceUsedForDatabase(). 
3. caller's caller passes result of SQLiteFileSystem::computeHashForFileName() for first argument, which is a "hash" according to the name of that function.

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:767
> +uint64_t IDBServer::QuotaUser::spaceUsed() const
> +{
> +    uint64_t result = 0;
> +    for (auto size : m_databaseSizes.values())
> +        result += size;
> +
> +    return result + m_estimatedSpaceIncrease;
> +}

Theoretically if there is a bug in this code this code could yield an arithmetic overflow. In my opinion, this patch would benefit by adding a remark in the ChangeLog that points out this theoretical possibility, but explains that in all practicality that this is an impossibility. Adding a debug assert(s) to ensure this invariant could also complement this, though this may be excessive given the impossibility.

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:792
> +void IDBServer::finishComputingSpaceUsedForOrigin(const ClientOrigin& origin, HashMap<String, uint64_t> databaseSizes)

databaseSizes is a by-value param. This could be a good thing because threads are involved. I did not expect to see a by-value param. This could be because:

1. I have this feeling that passing by rvalue-reference could work out here and ultimately calling through this function transfers ownership of the HashMap into the IDBServer::QuotaUser::m_databaseSizes.
2. If I am wrong about (1) then I would have expected an explanation in the ChangeLog on why pass by-value is used.
3. I expected to see a WTFMove() used in this function to avoid a duplicate HashMap copy.

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:794
> +    ensureQuotaUser(origin).initializeSpaceUsed(databaseSizes);

The modification of this line is not what I expected to see. I expected to see one of the following:

1. A WTFMove() here.
2. A comment on why there is not a WTFMove() here.
3. An explanation in the ChangeLog why there is not a WTFMove() here.

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:814
> +void IDBServer::updateSpaceUsedForDatabase(const ClientOrigin& origin, const String& databaseName, uint64_t databaseSize)

In my opinion, we can squeeze out a little better performance if we pass the database name as a rvalue-reference and WTFMove() it around.

> Source/WebCore/Modules/indexeddb/server/IDBServer.h:174
> +        HashMap<String, uint64_t> m_databaseSizes;

Something about this name threw me off. Pluralization made me think this was an array/vector.

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:848
> +static HashMap<String, uint64_t> databaseSizesForFolder(const String& folder)

FYI, I think I recall that the Apple Style Guide recommends against using "folder" in developer code/documentation and suggests using "directory". If I'm still recalling this correctly, "folder" is suggested in non developer  (i.e. regular users) documentation.

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:851
> +    for (auto& databaseDirectory : FileSystem::listDirectory(folder, "*")) {

I expected to see "*"_s here as it is just slightly more efficient.

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:865
> +    String oldVersionOriginDirectory = IDBDatabaseIdentifier::databaseDirectoryRelativeToRoot(topLevelOrigin, openingOrigin, directory, "v0");

databaseDirectoryRelativeToRoot() takes a const char*? If not, adding the  _s suffix on the strings would ever so slightly optimize this code.

In my opinion, auto could be used here.

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:868
> +    String newVersionOriginDirectory = IDBDatabaseIdentifier::databaseDirectoryRelativeToRoot(topLevelOrigin, openingOrigin, directory, "v1");

Ditto.

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:869
> +    for (auto[encodedDatabaseName, size] : databaseSizesForFolder(newVersionOriginDirectory))

In my opinion I would put a space between auto and [. Otherwise, it looks like an array subscript access.

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.h:97
> +    static HashMap<String, uint64_t> databaseSizesForOrigin(const SecurityOriginData&, const SecurityOriginData&, const String&);

In my opinion, this code would be clearer if there were names for the first and second argument because they have the same type.

In my opinion this patch would benefit by defining a type alias for HashMap<String, uint64_t> because:

1. It’s kinda long to type out this type.
2. Less chance for misspelling with a shorter type => less re-compiles => increase developer productivity.
3. The alias’s name could be more succinct/canonical and easier to refer to both in new code and conversations about this code.
4. Allow us to re-define this type later in one central place, reducing the amount of future code changes.
Comment 3 youenn fablet 2019-08-23 06:25:20 PDT
Comment on attachment 377079 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377079&action=review

> Source/WebCore/ChangeLog:16
> +        the size only when completing an operation that may increase disk usage.

It makes sense to me for UniqueIDBDatabase to store its database size.
That said, do we really need a map<String, size> in the QuotaUser? It seems (or could be made) somehow partially redundant with m_uniqueIDBDatabaseMap.
Would it be possible to initialise the quota user size as we do now, and then have UniqueIDBDatabase know the size before and after an operation so as to compute the operation real cost?
Also, is the code working in case a database is deleted?
Comment 4 Sihui Liu 2019-08-26 12:44:16 PDT
Comment on attachment 377079 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377079&action=review

>> Source/WebCore/ChangeLog:16
>> +        the size only when completing an operation that may increase disk usage.
> 
> It makes sense to me for UniqueIDBDatabase to store its database size.
> That said, do we really need a map<String, size> in the QuotaUser? It seems (or could be made) somehow partially redundant with m_uniqueIDBDatabaseMap.
> Would it be possible to initialise the quota user size as we do now, and then have UniqueIDBDatabase know the size before and after an operation so as to compute the operation real cost?
> Also, is the code working in case a database is deleted?

1. It's possible. I did not take that approach in consideration of accuracy. An implementation I thought about was: 
[1]QuotaUser computes its m_spaceUsed of origin directory during initialization 
[2]UniqueIDBDatabase is opened and computes its m_databaseSpaceUsed
[3]UniqueIDBDatabase finishes operations, re-computes m_databaseSpaceUsed
[4]UniqueIDBDatabase reports delta(m_databaseSpaceUsed) to QuotaUser and QuotaUser updates with m_spaceUsed plus the delta
The underlying assumption is UniqueIDBDatabase occupies size m_databaseSpaceUsed of m_spaceUsed. In the implementation above, initialization of  m_databaseSpaceUsed and m_spaceUsed happens at different time, which could introduce some error.
This patch aims to reduce the cost of listing the directory while keeping the old behavior. We may improve the quota computation later.

2. Nice catch. I should add an database size update to deleteBackingStore.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:745
>> +void IDBServer::QuotaUser::initializeSpaceUsed(HashMap<String, uint64_t> databaseSizes)
> 
> I was not expecting databaseSizes to be passed by-value. I was expecting it to either
> 
> 1. Be passed by rvalue-reference and WTFMove()ed into m_databaseSizes.
>     ^^^ This is the post C++11 way we do things. Caller that still needs to have this data would perform the copy.
> 
> 2. Be passed by const lvalue reference and copied into m_databaseSizes.
>     ^^^ This is the pre-C++11 way we use to pass things.

Okay, will use method 1.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:755
>> +void IDBServer::QuotaUser::updateSpaceUsed(const String& encodeDatabaseName, uint64_t newSize)
> 
> I was not expecting the name encodeDatabaseName because:
> 
> 1. encode is a verb and the variable reads like an action, but it's not an action.
> 2. this isn't the same name as the parameter in updateSpaceUsedForDatabase(). 
> 3. caller's caller passes result of SQLiteFileSystem::computeHashForFileName() for first argument, which is a "hash" according to the name of that function.

Changed to databaseHash, and apply to other places.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:767
>> +}
> 
> Theoretically if there is a bug in this code this code could yield an arithmetic overflow. In my opinion, this patch would benefit by adding a remark in the ChangeLog that points out this theoretical possibility, but explains that in all practicality that this is an impossibility. Adding a debug assert(s) to ensure this invariant could also complement this, though this may be excessive given the impossibility.

Will add debug asserts.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:792
>> +void IDBServer::finishComputingSpaceUsedForOrigin(const ClientOrigin& origin, HashMap<String, uint64_t> databaseSizes)
> 
> databaseSizes is a by-value param. This could be a good thing because threads are involved. I did not expect to see a by-value param. This could be because:
> 
> 1. I have this feeling that passing by rvalue-reference could work out here and ultimately calling through this function transfers ownership of the HashMap into the IDBServer::QuotaUser::m_databaseSizes.
> 2. If I am wrong about (1) then I would have expected an explanation in the ChangeLog on why pass by-value is used.
> 3. I expected to see a WTFMove() used in this function to avoid a duplicate HashMap copy.

rvalue-reference should work as createCrossThreadTask makes isolated copy of parameters.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:794
>> +    ensureQuotaUser(origin).initializeSpaceUsed(databaseSizes);
> 
> The modification of this line is not what I expected to see. I expected to see one of the following:
> 
> 1. A WTFMove() here.
> 2. A comment on why there is not a WTFMove() here.
> 3. An explanation in the ChangeLog why there is not a WTFMove() here.

Sure, will use 1.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:814
>> +void IDBServer::updateSpaceUsedForDatabase(const ClientOrigin& origin, const String& databaseName, uint64_t databaseSize)
> 
> In my opinion, we can squeeze out a little better performance if we pass the database name as a rvalue-reference and WTFMove() it around.

Okay.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.h:174
>> +        HashMap<String, uint64_t> m_databaseSizes;
> 
> Something about this name threw me off. Pluralization made me think this was an array/vector.

Will use m_databaseSizeMap.

>> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:848
>> +static HashMap<String, uint64_t> databaseSizesForFolder(const String& folder)
> 
> FYI, I think I recall that the Apple Style Guide recommends against using "folder" in developer code/documentation and suggests using "directory". If I'm still recalling this correctly, "folder" is suggested in non developer  (i.e. regular users) documentation.

Will change to "directory" and move this function to IDBServer.

>> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:851
>> +    for (auto& databaseDirectory : FileSystem::listDirectory(folder, "*")) {
> 
> I expected to see "*"_s here as it is just slightly more efficient.

Okay.

>> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:865
>> +    String oldVersionOriginDirectory = IDBDatabaseIdentifier::databaseDirectoryRelativeToRoot(topLevelOrigin, openingOrigin, directory, "v0");
> 
> databaseDirectoryRelativeToRoot() takes a const char*? If not, adding the  _s suffix on the strings would ever so slightly optimize this code.
> 
> In my opinion, auto could be used here.

Okay.

>> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:869
>> +    for (auto[encodedDatabaseName, size] : databaseSizesForFolder(newVersionOriginDirectory))
> 
> In my opinion I would put a space between auto and [. Otherwise, it looks like an array subscript access.

Okay.

>> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.h:97
>> +    static HashMap<String, uint64_t> databaseSizesForOrigin(const SecurityOriginData&, const SecurityOriginData&, const String&);
> 
> In my opinion, this code would be clearer if there were names for the first and second argument because they have the same type.
> 
> In my opinion this patch would benefit by defining a type alias for HashMap<String, uint64_t> because:
> 
> 1. It’s kinda long to type out this type.
> 2. Less chance for misspelling with a shorter type => less re-compiles => increase developer productivity.
> 3. The alias’s name could be more succinct/canonical and easier to refer to both in new code and conversations about this code.
> 4. Allow us to re-define this type later in one central place, reducing the amount of future code changes.

Sure would change to DatabaseSizeMap.
Comment 5 Sihui Liu 2019-08-26 12:46:37 PDT
Created attachment 377261 [details]
Patch
Comment 6 youenn fablet 2019-08-26 13:20:38 PDT
Comment on attachment 377261 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377261&action=review

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:760
> +        m_databaseSizeMap.set(databaseHash, newSize);

WTFMove(databaseHash) or maybe do not pass an r-value to updateSpaceUsed.

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:798
> +        result.set(databaseDirectoryName, spaceUsed);

result.add and WTFMove(databaseDirectoryName)

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:812
> +    for (auto[databaseHash, size] : databaseSizeMapForDirectory(newVersionOriginDirectory))

s/auto[/auto [/
s/databaseHash/databaseName/

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:813
> +        databaseSizeMap.set(databaseHash, databaseSizeMap.get(databaseHash) + size);

WTFMove(databaseHash)

The use of set() is somehow weird here. If we have two different files, we should in theory add the file sizes.
But since they have the same name, I would think we should remove the v0 version.

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:815
> +    postDatabaseTaskReply(createCrossThreadTask(*this, &IDBServer::finishComputingSpaceUsedForOrigin, origin, databaseSizeMap));

Since you are cross-thread copying the map, I wonder whether a Vector would not be more efficient, since we would create the map and its hash table only once instead of twice.

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:447
> +    m_server->updateSpaceUsedForDatabase(m_identifier.origin(), SQLiteFileSystem::computeHashForFileName(m_identifier.databaseName()), m_databaseSize);

It seems not efficient to compute the hash for filename every time we update the database size.
Should we store it somewhere?
Comment 7 youenn fablet 2019-08-26 13:21:03 PDT
> The underlying assumption is UniqueIDBDatabase occupies size
> m_databaseSpaceUsed of m_spaceUsed. In the implementation above,
> initialization of  m_databaseSpaceUsed and m_spaceUsed happens at different
> time, which could introduce some error.

Are you fearing some kind of attack or some general accuracy issues?
I am not sure this is worth the extra complexity.
Comment 8 youenn fablet 2019-08-26 13:25:31 PDT
Comment on attachment 377261 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377261&action=review

> Source/WebCore/ChangeLog:10
> +        because the completed operation should only affect one database.

There are two potential gains here.
One is to recompute the size for the mutating operations only.
The other one is to recompute only for the modified files.
Do we know where are the biggest gains?

Recomputing the size of the whole folder has some benefits if SQL is sometime asynchronously updating some files.
Comment 9 Sihui Liu 2019-08-26 16:35:02 PDT
Comment on attachment 377261 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377261&action=review

>> Source/WebCore/ChangeLog:10
>> +        because the completed operation should only affect one database.
> 
> There are two potential gains here.
> One is to recompute the size for the mutating operations only.
> The other one is to recompute only for the modified files.
> Do we know where are the biggest gains?
> 
> Recomputing the size of the whole folder has some benefits if SQL is sometime asynchronously updating some files.

We spent a lot of time listing files in the origin directory and getting stat(size) of all files. Basically the file system operations are slow.

If we find SQLite always updates files some time after executing commands and that causes big error in quota computation, we probably want to delay the update of database file size instead of iterating the directory, which is very expensive.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:760
>> +        m_databaseSizeMap.set(databaseHash, newSize);
> 
> WTFMove(databaseHash) or maybe do not pass an r-value to updateSpaceUsed.

Okay.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:812
>> +    for (auto[databaseHash, size] : databaseSizeMapForDirectory(newVersionOriginDirectory))
> 
> s/auto[/auto [/
> s/databaseHash/databaseName/

Okay.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:813
>> +        databaseSizeMap.set(databaseHash, databaseSizeMap.get(databaseHash) + size);
> 
> WTFMove(databaseHash)
> 
> The use of set() is somehow weird here. If we have two different files, we should in theory add the file sizes.
> But since they have the same name, I would think we should remove the v0 version.

Yes. Same database should only have files under one version directory, so no need to add the sizes.

>> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:815
>> +    postDatabaseTaskReply(createCrossThreadTask(*this, &IDBServer::finishComputingSpaceUsedForOrigin, origin, databaseSizeMap));
> 
> Since you are cross-thread copying the map, I wonder whether a Vector would not be more efficient, since we would create the map and its hash table only once instead of twice.

For cross-thread copying, vector should be more efficient, so I will replace the map<key, value> here with vector<pair<key, value>>

>> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:447
>> +    m_server->updateSpaceUsedForDatabase(m_identifier.origin(), SQLiteFileSystem::computeHashForFileName(m_identifier.databaseName()), m_databaseSize);
> 
> It seems not efficient to compute the hash for filename every time we update the database size.
> Should we store it somewhere?

Will store it in IDBDatabaseIdentifier.
Comment 10 Sihui Liu 2019-08-26 17:37:25 PDT
(In reply to youenn fablet from comment #7)
> > The underlying assumption is UniqueIDBDatabase occupies size
> > m_databaseSpaceUsed of m_spaceUsed. In the implementation above,
> > initialization of  m_databaseSpaceUsed and m_spaceUsed happens at different
> > time, which could introduce some error.
> 
> Are you fearing some kind of attack or some general accuracy issues?
> I am not sure this is worth the extra complexity.

No. I mean numerical error.

For example:
1) QuotaUser reads directory and initializes m_spaceUsed as 100MB, where the testDB.sqlite3 and its temporary files use 20MB.
2) After we open the database testDB, the file sizes of testDB may change due to some SQLite internal operations (checkpointing/merging temp files, etc. And say the total file size becomes 15MB.
3) Next we perform some add operations on testDB, and say the size increases to 25MB.
4) UniqueIDBDatabase would report +10MB to QuotaUser, and QuotaUser updates its space to 110MB.

Also having the database sizes in memory can help decide whether a database already exists when we try to open the database, though this patch does not include this.
Comment 11 Sihui Liu 2019-08-26 17:54:13 PDT
Created attachment 377302 [details]
Patch
Comment 12 Build Bot 2019-08-26 17:56:52 PDT
Attachment 377302 [details] did not pass style-queue:


ERROR: Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:749:  Extra space before [.  [whitespace/brackets] [5]
ERROR: Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:815:  Extra space before [.  [whitespace/brackets] [5]
Total errors found: 2 in 12 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 13 Build Bot 2019-08-26 21:13:19 PDT
Comment on attachment 377302 [details]
Patch

Attachment 377302 [details] did not pass win-ews (win):
Output: https://webkit-queues.webkit.org/results/12971611

Number of test failures exceeded the failure limit.
Comment 14 Build Bot 2019-08-26 21:13:21 PDT
Created attachment 377315 [details]
Archive of layout-test-results from ews210 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews210  Port: win-future  Platform: CYGWIN_NT-10.0-17763-3.0.5-338.x86_64-x86_64-64bit
Comment 15 youenn fablet 2019-08-26 23:48:36 PDT
> For example:
> 1) QuotaUser reads directory and initializes m_spaceUsed as 100MB, where the
> testDB.sqlite3 and its temporary files use 20MB.
> 2) After we open the database testDB, the file sizes of testDB may change
> due to some SQLite internal operations (checkpointing/merging temp files,
> etc. And say the total file size becomes 15MB.

Can we initialize the database size before opening it, say in UniqueIDBDatabase::openBackingStore, before calling getOrEstablishDatabaseInfo?
This will trigger the database computation twice, at quota user initialization time and at database opening.
This is probably ok.

open then becomes a regular operation that will change the space being used.
This somehow makes sense since temporary files might be created at that time.

> Also having the database sizes in memory can help decide whether a database
> already exists when we try to open the database, though this patch does not
> include this.

The potential gain is to not hop to a background thread in some cases?
Comment 16 Sihui Liu 2019-08-27 15:40:37 PDT
Created attachment 377393 [details]
Patch
Comment 17 Sihui Liu 2019-08-27 15:42:30 PDT
(In reply to youenn fablet from comment #15)
> > For example:
> > 1) QuotaUser reads directory and initializes m_spaceUsed as 100MB, where the
> > testDB.sqlite3 and its temporary files use 20MB.
> > 2) After we open the database testDB, the file sizes of testDB may change
> > due to some SQLite internal operations (checkpointing/merging temp files,
> > etc. And say the total file size becomes 15MB.
> 
> Can we initialize the database size before opening it, say in
> UniqueIDBDatabase::openBackingStore, before calling
> getOrEstablishDatabaseInfo?
> This will trigger the database computation twice, at quota user
> initialization time and at database opening.
> This is probably ok.
> 
> open then becomes a regular operation that will change the space being used.
> This somehow makes sense since temporary files might be created at that time.
> 

Updated the patch as suggested! This should work.
Comment 18 youenn fablet 2019-08-28 00:08:07 PDT
Comment on attachment 377393 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377393&action=review

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:783
> +static uint64_t databasesSizeForDirectory(const String& directory)

I quite liked that SQLLiteIDBBackingStore was solely responsible to know that the extension is sqlite3.
Now, we have two different places, IDBServer and SQLiteIDBBackingStore.

> Source/WebCore/Modules/indexeddb/server/MemoryIDBBackingStore.cpp:606
> +    return;

return; not needed.

> Source/WebCore/Modules/indexeddb/server/MemoryIDBBackingStore.h:82
> +    uint64_t databaseSize() const final;

Could be made private.

> Source/WebCore/Modules/indexeddb/server/MemoryIDBBackingStore.h:89
> +    void close();

final and private?

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:850
> +    return SQLiteFileSystem::getDatabaseFileSize(fullDatabasePath());

Add ASSERT(!isMainThread());

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.h:85
> +    uint64_t databaseSize() const final;

Could be made private.

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.h:106
> +    void close();

final and private?

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:354
> +        m_reportedDatabaseSize = backingStore->databaseSize();

m_reportedDatabaseSize is not really reported per se.
Can we have more precise terms like: m_currentDatabaseSize and m_newDatabaseSize.
Or m_databaseSizeBeforeTask and m_databaseSizeAfterTask

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:789
> +    // QuotaUser should already initiliazes storage usage, which contains the

s/initiliazes/have initialized/
Comment 19 Sihui Liu 2019-08-28 11:18:13 PDT
Created attachment 377464 [details]
Patch for landing
Comment 20 Sihui Liu 2019-08-28 11:19:34 PDT
Created attachment 377465 [details]
Patch for landing
Comment 21 WebKit Commit Bot 2019-08-28 13:19:13 PDT
Comment on attachment 377465 [details]
Patch for landing

Rejecting attachment 377465 [details] from commit-queue.

New failing tests:
imported/w3c/web-platform-tests/IndexedDB/fire-error-event-exception.html
Full output: https://webkit-queues.webkit.org/results/12977393
Comment 22 WebKit Commit Bot 2019-08-28 13:19:15 PDT
Created attachment 377471 [details]
Archive of layout-test-results from webkit-cq-01 for mac-highsierra

The attached test failures were seen while running run-webkit-tests on the commit-queue.
Bot: webkit-cq-01  Port: mac-highsierra  Platform: Mac OS X 10.13.6
Comment 23 Sihui Liu 2019-08-29 17:42:22 PDT
(In reply to WebKit Commit Bot from comment #22)
> Created attachment 377471 [details]
> Archive of layout-test-results from webkit-cq-01 for mac-highsierra
> 
> The attached test failures were seen while running run-webkit-tests on the
> commit-queue.
> Bot: webkit-cq-01  Port: mac-highsierra  Platform: Mac OS X 10.13.6

I believe the test has been flaky due to existing bug since it was added, and it's somehow exposed by this patch. Will fix the bug in webkit.org/b/169621 first.
Comment 24 WebKit Commit Bot 2019-08-30 01:59:49 PDT
Comment on attachment 377465 [details]
Patch for landing

Rejecting attachment 377465 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-01', 'apply-attachment', '--no-update', '--non-interactive', 377465, '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit

Logging in as commit-queue@webkit.org...
Fetching: https://bugs.webkit.org/attachment.cgi?id=377465&action=edit
Fetching: https://bugs.webkit.org/show_bug.cgi?id=201057&ctype=xml&excludefield=attachmentdata
Processing 1 patch from 1 bug.
Processing patch 377465 from bug 201057.
Fetching: https://bugs.webkit.org/attachment.cgi?id=377465
Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Parsed 10 diffs from patch file(s).
patching file Source/WebCore/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file Source/WebCore/Modules/indexeddb/server/IDBBackingStore.h
patching file Source/WebCore/Modules/indexeddb/server/IDBServer.cpp
Hunk #2 succeeded at 787 with fuzz 1.
patching file Source/WebCore/Modules/indexeddb/server/IDBServer.h
patching file Source/WebCore/Modules/indexeddb/server/MemoryIDBBackingStore.cpp
patching file Source/WebCore/Modules/indexeddb/server/MemoryIDBBackingStore.h
patching file Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp
Hunk #1 FAILED at 845.
1 out of 2 hunks FAILED -- saving rejects to file Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp.rej
patching file Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.h
patching file Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp
patching file Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.h

Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Full output: https://webkit-queues.webkit.org/results/12983411
Comment 25 Sihui Liu 2019-08-30 09:36:59 PDT
Created attachment 377719 [details]
Patch for landing
Comment 26 WebKit Commit Bot 2019-08-30 10:19:45 PDT
Comment on attachment 377719 [details]
Patch for landing

Clearing flags on attachment: 377719

Committed r249333: <https://trac.webkit.org/changeset/249333>
Comment 27 WebKit Commit Bot 2019-08-30 10:19:46 PDT
All reviewed patches have been landed.  Closing bug.
Comment 28 Radar WebKit Bug Importer 2019-08-30 10:20:31 PDT
<rdar://problem/54886923>