Bug 201957 - IndexedDB: update size to actual disk usage only when estimated increase is bigger than space available
Summary: IndexedDB: update size to actual disk usage only when estimated increase is b...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Sihui Liu
URL:
Keywords: InRadar
: 201728 (view as bug list)
Depends on: 202480
Blocks: 202483
  Show dependency treegraph
 
Reported: 2019-09-18 17:28 PDT by Sihui Liu
Modified: 2019-10-09 18:44 PDT (History)
10 users (show)

See Also:


Attachments
Patch (52.02 KB, patch)
2019-09-18 17:49 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (54.03 KB, patch)
2019-09-19 12:31 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (78.31 KB, patch)
2019-09-19 21:47 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (79.98 KB, patch)
2019-09-24 18:42 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (79.87 KB, patch)
2019-09-25 11:43 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (48.91 KB, patch)
2019-09-27 12:23 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (48.95 KB, patch)
2019-09-27 13:23 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (50.05 KB, patch)
2019-09-27 18:31 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (50.07 KB, patch)
2019-09-29 13:18 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (48.66 KB, patch)
2019-09-30 13:07 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (48.53 KB, patch)
2019-10-01 13:22 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (9.08 KB, patch)
2019-10-02 14:00 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (31.88 KB, patch)
2019-10-03 13:34 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (32.50 KB, patch)
2019-10-07 14:46 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch (32.81 KB, patch)
2019-10-07 19:36 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch for landing (32.93 KB, patch)
2019-10-09 14:50 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff
Patch for landing (2.58 KB, patch)
2019-10-09 17:56 PDT, Sihui Liu
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sihui Liu 2019-09-18 17:28:57 PDT
...
Comment 1 Sihui Liu 2019-09-18 17:49:05 PDT
Created attachment 379089 [details]
Patch
Comment 2 Geoffrey Garen 2019-09-18 19:43:49 PDT
Comment on attachment 379089 [details]
Patch

Looks like this needs an update to trunk.

Can you include performance data for this change?
Comment 3 Sihui Liu 2019-09-19 12:31:03 PDT
Created attachment 379152 [details]
Patch
Comment 4 Sihui Liu 2019-09-19 21:47:07 PDT
Created attachment 379204 [details]
Patch
Comment 5 Sihui Liu 2019-09-20 11:14:43 PDT
This patch cuts about 100ms for one iteration of PerformanceTests/IndexedDB/basic/objectstore-add.html (which does a lot of small adds) when running with "run-minibrowser --release".
Comment 6 Geoffrey Garen 2019-09-20 14:14:12 PDT
Comment on attachment 379204 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=379204&action=review

> Source/WebCore/ChangeLog:11
> +        the file size only when the accumulated estimated increase is bigger than space availabe.

available

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:715
> +    m_shouldSyncSpaceCompletely = true;
> +    if (m_manager)
> +        m_manager->userNeedSync(*this);

It looks like the m_shouldSyncSpaceCompletely boolean is only used for the sake of userNeedSync(). Is that right? If so, can you reorganize the code so that userNeedSync() just calls a separate function? It seems like there's one function that initializes all values from disk, and another function that updates values based on changes that have been recorded. I would call those two functions "computeSpaceUsed" and "updateSpaceUsed".

> Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:723
> +    if (m_shouldSyncSpaceCompletely) {
> +        m_server.startComputingSpaceUsedForOrigin(m_origin);

It looks like we use two words to describe the operation of reading sizes on disk: "compute" and "sync". Also for describing a space delta: "modify" and "change". I find code a lot more readable when we use one term consistently. I like "compute" and "update", so I'll suggest renames that use them:

syncSpaceUsed => computeSpaceUsed
m_syncCallbacks => m_computeSpaceUsedCallbacks
didSyncSpaceUsed => didComputeSpaceUsed
userNeedSync => userNeedsToComputeSpaceUsed
askUserToSync => askUserToComputeSpaceUsed

computeSpaceUsedChangeForOrigin => updateSpaceUsedForOrigin
startComputingSpaceUsedChangeForOrigin => startUpdatingSpaceUsedChangeForOrigin
finishComputingSpaceUsedChangeForOrigin => finishUpdatingSpaceUsedForOrigin

m_shouldSyncSpaceCompletely => [ remove ]
didSyncSpaceUsedModifiedOnly => didUpdateSpaceUsed
canSyncSpaceModifiedOnly => canUpdateSpaceUsed

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:220
> +{

startSpaceIncreasingTask => startSpaceIncreaseTask

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:226
> +void UniqueIDBDatabase::completeSpaceIncreasingTask(uint64_t identifier, uint64_t actualTaskSize)
> +{

completeSpaceIncreasingTask => finishSpaceIncreaseTask

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:227
> +    auto iterator = m_pendingSpaceIncreasingTasks.find(identifier);

m_pendingSpaceIncreasingTasks => m_pendingSpaceIncreaseTasks
Comment 7 Sihui Liu 2019-09-24 18:42:45 PDT
Created attachment 379521 [details]
Patch
Comment 8 youenn fablet 2019-09-25 02:02:16 PDT
Comment on attachment 379521 [details]
Patch

Some tests are failing, patch has some WTFLogAlways.

View in context: https://bugs.webkit.org/attachment.cgi?id=379521&action=review

> Source/WebCore/ChangeLog:12
> +        the accurate file size only when the accumulated estimated increase is bigger than space available.

This seems like a fine plan.
I do not really understand why the current model of StorageQuotaManager needs any change to implement this plan.
Can we do something like:
- IDBServer receives a new action
- IDBServer queries the StorageQuotaManager to check whether the action will get the space go above a given limit, for instance in IDBServer::requestSpace
- If it does not go above the limit, continue as usual
- If it goes above the limit, update the actual IDB size before continuing with IDBServer::requestSpace.
Comment 9 Sihui Liu 2019-09-25 11:33:30 PDT
(In reply to youenn fablet from comment #8)
> Comment on attachment 379521 [details]
> Patch
> 
> Some tests are failing, patch has some WTFLogAlways.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=379521&action=review
> 
> > Source/WebCore/ChangeLog:12
> > +        the accurate file size only when the accumulated estimated increase is bigger than space available.
> 
> This seems like a fine plan.
> I do not really understand why the current model of StorageQuotaManager
> needs any change to implement this plan.
> Can we do something like:
> - IDBServer receives a new action
> - IDBServer queries the StorageQuotaManager to check whether the action will
> get the space go above a given limit, for instance in IDBServer::requestSpace
> - If it does not go above the limit, continue as usual
> - If it goes above the limit, update the actual IDB size before continuing
> with IDBServer::requestSpace.

This would probably work in general cases. The flow may be: 
- IDBServer receives a new action with size S: asks StorageQuotaManager pre-RequestSpaceCheck(S)
- StorageQuotaManager says NoEnoughSpace, IDBServer dispatches a task to updateDatabaseSizes
- IDBServer receives a new action, queues it up because the update is not finished
- IDBServer didUpdateDatabaseSizes, requestSpace(S), this request gets handled or queued up in StorageQuotaManager
- IDBServer handles next requests in queue

Then do we do pre-RequestSpaceCheck for Cache?

The reasons I made the change in StorageQuotaManager instead of IDBServer:
1. StorageQuotaManager is aware of all pending requests (compared to queueing up requests in IDBServer) so it could decide how much more space it wants when askForMoreSpace.
2. StorageQuotaManager would block and update IDB database sizes when Cache requestSpace gets Deny. (maybe we can improve this a bit that if we find IDBServer has not executed any action since last update, we just skip this.)
3. It might be easier for us to integrate other storage types, which might also use estimatedSizeChange, in the future. We only need to add the capabilities of QuotaUser instead of adding a queue in StorageTypeServer.
Comment 10 Sihui Liu 2019-09-25 11:43:30 PDT
Created attachment 379567 [details]
Patch
Comment 11 youenn fablet 2019-09-26 01:50:38 PDT
(In reply to Sihui Liu from comment #9)
> (In reply to youenn fablet from comment #8)
> > Comment on attachment 379521 [details]
> > Patch
> > 
> > Some tests are failing, patch has some WTFLogAlways.
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=379521&action=review
> > 
> > > Source/WebCore/ChangeLog:12
> > > +        the accurate file size only when the accumulated estimated increase is bigger than space available.
> > 
> > This seems like a fine plan.
> > I do not really understand why the current model of StorageQuotaManager
> > needs any change to implement this plan.
> > Can we do something like:
> > - IDBServer receives a new action
> > - IDBServer queries the StorageQuotaManager to check whether the action will
> > get the space go above a given limit, for instance in IDBServer::requestSpace
> > - If it does not go above the limit, continue as usual
> > - If it goes above the limit, update the actual IDB size before continuing
> > with IDBServer::requestSpace.
> 
> This would probably work in general cases. The flow may be: 
> - IDBServer receives a new action with size S: asks StorageQuotaManager
> pre-RequestSpaceCheck(S)
> - StorageQuotaManager says NoEnoughSpace, IDBServer dispatches a task to
> updateDatabaseSizes
> - IDBServer receives a new action, queues it up because the update is not
> finished
> - IDBServer didUpdateDatabaseSizes, requestSpace(S), this request gets
> handled or queued up in StorageQuotaManager
> - IDBServer handles next requests in queue
> 
> Then do we do pre-RequestSpaceCheck for Cache?

I don't believe we need pre-RequestSpaceCheck for Cache.
We could reuse requestSpace for pre-RequestSpaceCheck by adding a new parameter (RequestMoreSpaceIfNeeded::No).

> 
> The reasons I made the change in StorageQuotaManager instead of IDBServer:
> 1. StorageQuotaManager is aware of all pending requests (compared to
> queueing up requests in IDBServer) so it could decide how much more space it
> wants when askForMoreSpace.
> 2. StorageQuotaManager would block and update IDB database sizes when Cache
> requestSpace gets Deny. (maybe we can improve this a bit that if we find
> IDBServer has not executed any action since last update, we just skip this.)

Good point. IIUC, the scenario is that IDB fills up almost all space.
Cache API adds a new entry and hits the limit. In that case, we should recompute IDB actual size before answering cache API.

Maybe we should do a refactoring for having QuotaUser::spaceUsed() take a lambda.
In most cases, the lambda would be called synchronously.
quota manager would store a lastSpaceUsed, that would use IDB to either return synchronously or do a full asynchronous computation.
Would you be able to investigate this?

Another approach might be to delay full size computation based on a timer.
Every time a new action happens, we would delay the timer.
When no action happens for some time (one minute?), we would update the IDB size.

> 3. It might be easier for us to integrate other storage types, which might
> also use estimatedSizeChange, in the future. We only need to add the
> capabilities of QuotaUser instead of adding a queue in StorageTypeServer.

It seems to me IDB is very specific there.
Comment 12 Geoffrey Garen 2019-09-26 17:34:54 PDT
Comment on attachment 379567 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=379567&action=review

Ideas for simplifying:

(1) Remove updateSpaceUsed(). Always computeSpaceUsed(). We've already taken care to make it rare, so we don't need to be so precise.

(2) Remove m_estimatedSpaceIncrease and m_spaceReserved. Instead, the quota manager can track the number of bytes written, and computeSpaceUsed() once the number exceeds half the available space (or some other reasonable fraction).

(3) Remove the code to compute space used when shouldAskForMoreSpace() is true. Instead, based on (2), we can assume that when shouldAskForMoreSpace() is true, we really have to ask. (shouldAskForMoreSpace should check whether bytesWritten exceeds bytesAvailable. It's OK if this check is a little aggressive or imperfect. When the check fires, it is certain that we need more space to operate efficiently, and very likely to be true that we're close to our space limit.)

> Source/WebCore/ChangeLog:9
> +        operation could slow.

could slow => is slow
Comment 13 youenn fablet 2019-09-26 22:59:17 PDT
If we go with async spaceUsed, we can also probably remove whenInitialized code as well.
Comment 14 Sihui Liu 2019-09-27 12:23:53 PDT
Created attachment 379752 [details]
Patch
Comment 15 Sihui Liu 2019-09-27 12:34:24 PDT
(In reply to youenn fablet from comment #13)
> If we go with async spaceUsed, we can also probably remove whenInitialized
> code as well.

There are details that I am not so sure about your proposed approach, so I chose to keep the main structure of QuotaStorageManager. Let me know what do you think about the new patch. The idea is to reset UniqueIDBDatabase usage if we think the requested size is bigger than allowed, and try processing the request after the usage is reset.
Comment 16 Sihui Liu 2019-09-27 13:23:56 PDT
Created attachment 379757 [details]
Patch
Comment 17 Sihui Liu 2019-09-27 13:24:56 PDT
*** Bug 201728 has been marked as a duplicate of this bug. ***
Comment 18 Sihui Liu 2019-09-27 18:31:29 PDT
Created attachment 379781 [details]
Patch
Comment 19 Sihui Liu 2019-09-29 13:18:31 PDT
Created attachment 379819 [details]
Patch
Comment 20 Sihui Liu 2019-09-30 09:18:57 PDT
Comment on attachment 379819 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=379819&action=review

> Source/WebCore/ChangeLog:6
> +        Update disk usage of IndexedDB only when estimated increase is bigger than space available
> +        https://bugs.webkit.org/show_bug.cgi?id=201957
> +
> +        Reviewed by NOBODY (OOPS!).

Double Changelog.. will fix this in next patch.
Comment 21 youenn fablet 2019-09-30 09:52:37 PDT
Comment on attachment 379819 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=379819&action=review

> Source/WebCore/Modules/indexeddb/server/IDBServer.h:170
> +        uint64_t spaceUsed() const final {

"{" in its own line.

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:988
> +    return IDBError { WTF::nullopt, { }, sizeIncrease };

It might be good to add a static IDBError method, to remove the first two parameters and make it clear this is a success case, not an error case.
Or maybe we should use something like Expected<uint64_t, IDBError>.

> Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:1793
> +    sizeIncrease += error.size();

This line reads a bit odd (Expected might be better here I guess).
Also, error should be null, so probably no need to add its size?

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.h:306
> +    HashMap<uint64_t, uint64_t> m_pendingSpaceIncreaseTasks;

Maybe we should introduce a typed identifier for this map key.

> Source/WebCore/storage/StorageQuotaManager.cpp:154
> +        for (auto user : copyToVector(m_users))

auto&

> Source/WebCore/storage/StorageQuotaManager.h:91
> +        State state;

Why do we need this state for each pending request?
Isn't it that this state should be a member of StorageQuotaManager instead?
In that case, we could merge it with m_hasStartedHandlingRequests maybe?

> Source/WebCore/storage/StorageQuotaUser.h:38
> +    virtual void resetSpaceUsed(Condition) { return; }

The name is not very clear and StorageQuotaManager is only calling with Condition::IndexedDBOnly.
How about:
- renaming it to computeExactSpaceUsed() with no parameters.


Also s/{  return; }//


One case to check is the case where there is no IDB at all.
In that case, we want to be sure the code is still working and the tasks will be processed.
I am not sure this will be the case since, if there is no IDB, all users will be still initialized.
Comment 22 Sihui Liu 2019-09-30 13:03:32 PDT
(In reply to youenn fablet from comment #21)
> Comment on attachment 379819 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=379819&action=review
> 
> > Source/WebCore/Modules/indexeddb/server/IDBServer.h:170
> > +        uint64_t spaceUsed() const final {
> 
> "{" in its own line.
> 
Okay.

> > Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:988
> > +    return IDBError { WTF::nullopt, { }, sizeIncrease };
> 
> It might be good to add a static IDBError method, to remove the first two
> parameters and make it clear this is a success case, not an error case.
> Or maybe we should use something like Expected<uint64_t, IDBError>.
> 
Will add a static method.

> > Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:1793
> > +    sizeIncrease += error.size();
> 
> This line reads a bit odd (Expected might be better here I guess).
> Also, error should be null, so probably no need to add its size?
> 
IDBError's size will be set as the task size (space increase) when task successfully finishes, so size should be added only when error is null.

> > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.h:306
> > +    HashMap<uint64_t, uint64_t> m_pendingSpaceIncreaseTasks;
> 
> Maybe we should introduce a typed identifier for this map key.
> 
Sure. This would introduce changes to other maps, so will do in https://bugs.webkit.org/show_bug.cgi?id=202366.

> > Source/WebCore/storage/StorageQuotaManager.cpp:154
> > +        for (auto user : copyToVector(m_users))
> 
> auto&
> 
Sure.

> > Source/WebCore/storage/StorageQuotaManager.h:91
> > +        State state;
> 
> Why do we need this state for each pending request?
> Isn't it that this state should be a member of StorageQuotaManager instead?
> In that case, we could merge it with m_hasStartedHandlingRequests maybe?
> 
This is used to ensure that we would compute space used/ask for space increase on the same request once and only once.

Currently for StorageQuotaManger, its states are
1. waiting users to compute/initialize (m_pendingInitializationUsers is not null)
2. waiting space increase response (m_isWaitingForSpaceIncreaseResponse is true)
3. can handle request

m_hasStartedHandlingRequests can be true or false when state 3.


For PendingRequest, its states are
1. never handled before (can become 2 or 4)
2. already request for usage update (can become 3 or 4)
3. already request for space increase(can become 4)
4. handled - Grant or Deny

For StorageQuotaManger, QuotaUser update/initialize/reset are all implemented by adding that user to m_pendingInitializationUsers users. Therefore when StorageQuotaManger goes from state 1 to state 3, it does not know what action to take for the first pending request in queue.

> > Source/WebCore/storage/StorageQuotaUser.h:38
> > +    virtual void resetSpaceUsed(Condition) { return; }
> 
> The name is not very clear and StorageQuotaManager is only calling with
> Condition::IndexedDBOnly.
> How about:
> - renaming it to computeExactSpaceUsed() with no parameters.
> 
Sure.

> 
> Also s/{  return; }//
>
Okay.
 
>
> One case to check is the case where there is no IDB at all.
> In that case, we want to be sure the code is still working and the tasks
> will be processed.
> I am not sure this will be the case since, if there is no IDB, all users
> will be still initialized.
>
Yes, this case would not happen under current implementation - IDBServer would be there in the initialization of QuotaUsers(NetworkProcess::initializeQuotaUsers).

We can also add a check for pending users after computeExactSpaceUsed so we can continue to process the same request if IDB is not present.
Comment 23 Sihui Liu 2019-09-30 13:07:48 PDT
Created attachment 379838 [details]
Patch
Comment 24 youenn fablet 2019-10-01 05:29:06 PDT
> > > Source/WebCore/storage/StorageQuotaManager.h:91
> > > +        State state;
> > 
> > Why do we need this state for each pending request?
> > Isn't it that this state should be a member of StorageQuotaManager instead?
> > In that case, we could merge it with m_hasStartedHandlingRequests maybe?
> > 
> This is used to ensure that we would compute space used/ask for space
> increase on the same request once and only once.
> 
> Currently for StorageQuotaManger, its states are
> 1. waiting users to compute/initialize (m_pendingInitializationUsers is not
> null)
> 2. waiting space increase response (m_isWaitingForSpaceIncreaseResponse is
> true)
> 3. can handle request
> 
> m_hasStartedHandlingRequests can be true or false when state 3.
> 
> 
> For PendingRequest, its states are
> 1. never handled before (can become 2 or 4)
> 2. already request for usage update (can become 3 or 4)
> 3. already request for space increase(can become 4)
> 4. handled - Grant or Deny
> 
> For StorageQuotaManger, QuotaUser update/initialize/reset are all
> implemented by adding that user to m_pendingInitializationUsers users.
> Therefore when StorageQuotaManger goes from state 1 to state 3, it does not
> know what action to take for the first pending request in queue.

Agreed, but this is only useful for the first pending requests.
All other pending requests are Unhandled.
It seems best to have the additional information needed to go from state 1 to state 3 in StorageQuotaManager.
Or maybe we could introduce a fourth state for StorageQuotaManager which is PendingRecomputationofActualSpaceUsage.

> > > Source/WebCore/storage/StorageQuotaUser.h:38
> > > +    virtual void resetSpaceUsed(Condition) { return; }
> > 
> > The name is not very clear and StorageQuotaManager is only calling with
> > Condition::IndexedDBOnly.
> > How about:
> > - renaming it to computeExactSpaceUsed() with no parameters.
> > 
> Sure.
> 
> > 
> > Also s/{  return; }//
> >
> Okay.
>  
> >
> > One case to check is the case where there is no IDB at all.
> > In that case, we want to be sure the code is still working and the tasks
> > will be processed.
> > I am not sure this will be the case since, if there is no IDB, all users
> > will be still initialized.
> >
> Yes, this case would not happen under current implementation - IDBServer
> would be there in the initialization of
> QuotaUsers(NetworkProcess::initializeQuotaUsers).
> 
> We can also add a check for pending users after computeExactSpaceUsed so we
> can continue to process the same request if IDB is not present.

Yes, it seems more robust to cover that case.
Comment 25 Geoffrey Garen 2019-10-01 10:09:24 PDT
> We can also add a check for pending users after computeExactSpaceUsed so we
> can continue to process the same request if IDB is not present.

Do we really need to make IDB requests wait for computeExactSpaceUsed?

Could the implementation be simpler if we didn't wait?

If we didn't wait, the consequence would be that some requests near the space limit, but still below it, would trigger a user approval for more space. I think that consequence is acceptable, if it simplifies the implementation. 99.9% of websites will never even be near the space limit.
Comment 26 Geoffrey Garen 2019-10-01 10:35:30 PDT
Comment on attachment 379838 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=379838&action=review

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:236
> +void UniqueIDBDatabase::startSpaceIncreaseTask(uint64_t identifier, uint64_t taskSize)
> +{
> +    m_server->increasePotentialSpaceUsed(m_identifier.origin(), taskSize);
> +    m_pendingSpaceIncreaseTasks.add(identifier, taskSize);
> +}
> +
> +void UniqueIDBDatabase::finishSpaceIncreaseTask(uint64_t identifier, uint64_t actualTaskSize)
> +{
> +    auto iterator = m_pendingSpaceIncreaseTasks.find(identifier);
> +    if (iterator != m_pendingSpaceIncreaseTasks.end()) {
> +        m_server->decreasePotentialSpaceUsed(m_identifier.origin(), iterator->value);
> +        if (actualTaskSize)
> +            m_server->increaseSpaceUsed(m_identifier.origin(), actualTaskSize);
> +        m_pendingSpaceIncreaseTasks.remove(iterator);

Is it really essential that we guess how much space a task will use in advance, and then update to a precise value afterwards, and keep a precise tally in memory?

I believe it should be sufficient only to guess how much space each task will use, and accumulate guesses, and only measure the amount of space used on disk periodically.

It's OK if our guess is a little inaccurate, as long as it doesn't prompt the user too much, and it doesn't allow a webpage to use way more than 500MB on disk.

For example, even a really simple guess that only counted space increases and never counted space decreases, and synchronized with the disk when the size of space increases reach half of available capacity, would probably be good enough, no?
Comment 27 Sihui Liu 2019-10-01 12:29:09 PDT
(In reply to Geoffrey Garen from comment #26)
> Comment on attachment 379838 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=379838&action=review
> 
> > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:236
> > +void UniqueIDBDatabase::startSpaceIncreaseTask(uint64_t identifier, uint64_t taskSize)
> > +{
> > +    m_server->increasePotentialSpaceUsed(m_identifier.origin(), taskSize);
> > +    m_pendingSpaceIncreaseTasks.add(identifier, taskSize);
> > +}
> > +
> > +void UniqueIDBDatabase::finishSpaceIncreaseTask(uint64_t identifier, uint64_t actualTaskSize)
> > +{
> > +    auto iterator = m_pendingSpaceIncreaseTasks.find(identifier);
> > +    if (iterator != m_pendingSpaceIncreaseTasks.end()) {
> > +        m_server->decreasePotentialSpaceUsed(m_identifier.origin(), iterator->value);
> > +        if (actualTaskSize)
> > +            m_server->increaseSpaceUsed(m_identifier.origin(), actualTaskSize);
> > +        m_pendingSpaceIncreaseTasks.remove(iterator);
> 
> Is it really essential that we guess how much space a task will use in
> advance, and then update to a precise value afterwards, and keep a precise
> tally in memory?
> 
> I believe it should be sufficient only to guess how much space each task
> will use, and accumulate guesses, and only measure the amount of space used
> on disk periodically.
> 
> It's OK if our guess is a little inaccurate, as long as it doesn't prompt
> the user too much, and it doesn't allow a webpage to use way more than 500MB
> on disk.
> 
> For example, even a really simple guess that only counted space increases
> and never counted space decreases, and synchronized with the disk when the
> size of space increases reach half of available capacity, would probably be
> good enough, no?

Performance concern:
We only want to update disk usage(conduct file system operation) when it's necessary. Estimated space increase reaches some amount of available capacity can be too aggressive (or conservative, depending on how we do the estimate).

Correctness concern:
It's hard to estimate space increase for operation like put before it's executed, based on our current IDB implementation. The updating estimatedSizeIncrease after task done is a way to make the guess become "a little inaccurate" instead of "very inaccurate". For example, put(key, value)
(1) Size estimate before task: size of key + size of value
(2) Size estimate after task: size of serialized key + size of serialized value + size of records for different indexes(which may be double (1)).
Comment 28 Sihui Liu 2019-10-01 13:22:04 PDT
Created attachment 379938 [details]
Patch
Comment 29 Sihui Liu 2019-10-01 13:48:55 PDT
(In reply to Geoffrey Garen from comment #25)
> > We can also add a check for pending users after computeExactSpaceUsed so we
> > can continue to process the same request if IDB is not present.
> 
> Do we really need to make IDB requests wait for computeExactSpaceUsed?
> 
> Could the implementation be simpler if we didn't wait?
> 
> If we didn't wait, the consequence would be that some requests near the
> space limit, but still below it, would trigger a user approval for more
> space. I think that consequence is acceptable, if it simplifies the
> implementation. 99.9% of websites will never even be near the space limit.

This implementation is relatively simple now since we already have the code path to blocked update usage of QuotaUsers (IDBServer::QuotaUser::resetSpaceUsed). We barely modify QuotaUser.

If we do not block the requests during update, in some edge case, we may stat the file system multiple times(e.g. reset user usage during user usage update) or prompt multiple times(usage update running in the background thread). And this does not seem easier as we may need to keep states in QuotaUser or StorageQuotaManager.

If we believe most website will never reach this state, then blocking update should not happen very often so may be less a concern.
Comment 30 youenn fablet 2019-10-02 09:26:11 PDT
This patch seems to do 3 things:
1. Improve the task estimate before calling the quota manager
2. Improve the task estimate after evaluating the task size so that our overall estimate is overall better
3. Trigger actual computation of real size when reaching quota

1 is straightforward and could be landed quickly.
2 is bigger but I think is also ok given our current approach
Could we split these in 3 patches to make progress?
Comment 31 youenn fablet 2019-10-02 09:29:59 PDT
Comment on attachment 379938 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=379938&action=review

> Source/WebCore/storage/StorageQuotaManager.h:96
> +    Action m_actionTakenOnNextRequest { Action::Never };

I do not think Action is the right term. It might be State { NeedSpaceUsageInitialization, ComputingSpaceUsage, AskingForMoreSpace }
I wonder whether None could be equal to ComputingExactSpaceUsage.
Comment 32 Sihui Liu 2019-10-02 09:39:20 PDT
(In reply to youenn fablet from comment #30)
> This patch seems to do 3 things:
> 1. Improve the task estimate before calling the quota manager
> 2. Improve the task estimate after evaluating the task size so that our
> overall estimate is overall better
> 3. Trigger actual computation of real size when reaching quota
> 
> 1 is straightforward and could be landed quickly.
> 2 is bigger but I think is also ok given our current approach
> Could we split these in 3 patches to make progress?

Sure.(In reply to youenn fablet from comment #30)
> This patch seems to do 3 things:
> 1. Improve the task estimate before calling the quota manager
> 2. Improve the task estimate after evaluating the task size so that our
> overall estimate is overall better
> 3. Trigger actual computation of real size when reaching quota
> 
> 1 is straightforward and could be landed quickly.
> 2 is bigger but I think is also ok given our current approach
> Could we split these in 3 patches to make progress?

Sure, will do.
Comment 33 Sihui Liu 2019-10-02 10:05:10 PDT
(In reply to youenn fablet from comment #31)
> Comment on attachment 379938 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=379938&action=review
> 
> > Source/WebCore/storage/StorageQuotaManager.h:96
> > +    Action m_actionTakenOnNextRequest { Action::Never };
> 
> I do not think Action is the right term. It might be State {
> NeedSpaceUsageInitialization, ComputingSpaceUsage, AskingForMoreSpace }
> I wonder whether None could be equal to ComputingExactSpaceUsage.

Never // Has never handled request, needs to initialize m_quota first
None // No action has been taken on first pending request, next action should be ComputeExactSpaceUsed or making a decision
ComputeExactSpaceUsed // ComputeExactSpaceUsed has been called, next action should be AskForMoreSpace or making a decision
AskForMoreSpace // AskForMoreSpace has been called, next action should be making decision

So None is not quite the same as ComputeExactSpaceUsed. If we are using state, it's probably more like
enum class State {
NeedInitializeQuota, 
Idle,
ComputingSpaceUsage, 
AskingForMoreSpace,
};
State m_lastState;
Comment 34 Sihui Liu 2019-10-02 12:09:01 PDT
(In reply to youenn fablet from comment #30)
> This patch seems to do 3 things:
> 1. Improve the task estimate before calling the quota manager
> 2. Improve the task estimate after evaluating the task size so that our
> overall estimate is overall better
> 3. Trigger actual computation of real size when reaching quota
> 
> 1 is straightforward and could be landed quickly.
> 2 is bigger but I think is also ok given our current approach
> Could we split these in 3 patches to make progress?

Patch 1: https://bugs.webkit.org/show_bug.cgi?id=202480
Patch 2: https://bugs.webkit.org/show_bug.cgi?id=202483
Comment 35 Sihui Liu 2019-10-02 14:00:12 PDT
Created attachment 380055 [details]
Patch
Comment 36 Sihui Liu 2019-10-03 13:34:43 PDT
Created attachment 380158 [details]
Patch
Comment 37 Sihui Liu 2019-10-03 13:55:35 PDT
(In reply to youenn fablet from comment #30)
> This patch seems to do 3 things:
> 1. Improve the task estimate before calling the quota manager
> 2. Improve the task estimate after evaluating the task size so that our
> overall estimate is overall better
> 3. Trigger actual computation of real size when reaching quota
> 
> 1 is straightforward and could be landed quickly.
> 2 is bigger but I think is also ok given our current approach
> Could we split these in 3 patches to make progress?

It turns out the order of these patches should be:
1. Improve the task estimate before calling the quota manager (202480).
2. Use estimated usage and trigger actual computation when estimated usage reaches limit (201957, this one). Use estimated usage without triggering actual computation would cause correctness problem as the estimated increase accumulates and never reset. 
3. Improve the task estimate after task execution so that the estimated usage in 2 is more accurate (202483).

So we still need to solve 2 first.
Comment 38 youenn fablet 2019-10-07 11:38:37 PDT
Comment on attachment 380158 [details]
Patch

This looks almost good to me.
Some suggestions below.

View in context: https://bugs.webkit.org/attachment.cgi?id=380158&action=review

> Source/WebCore/storage/StorageQuotaManager.cpp:-87
> -        updateQuotaBasedOnSpaceUsage();

Why removing this call?
It should be the one that advance m_previousState to Uninitialized to MakingDecisionForRequest

> Source/WebCore/storage/StorageQuotaManager.cpp:149
> +    }

Why do we have that code?
Normally, either we are initialised, in which case  the call in askUserToInitialize would move us out of Uninitialized, or the !m_pendingInitializationUsers.isEmpty() check would kick in.

> Source/WebCore/storage/StorageQuotaManager.cpp:189
> +void StorageQuotaManager::processPendingRequests(Optional<uint64_t> newQuota)

I wonder whether we should split processPendingRequests in two methods, one for processing the request after a quota increase request and another one to restart the queue on various cases.
If you do not do it in this patch I can take a stab.

> Source/WebCore/storage/StorageQuotaManager.cpp:216
> +        m_previousState = State::MakingDecisionForRequest;

We should probably make set m_previousState before callback call for compatibility with other call sites.

> Source/WebCore/storage/StorageQuotaManager.h:96
> +    State m_previousState { State::Uninitialized };

s/m_state/m_previousState.
Also, can we merge m_isWaitingForSpaceIncreaseResponse with m_state?
Comment 39 Sihui Liu 2019-10-07 14:46:02 PDT
Created attachment 380364 [details]
Patch
Comment 40 Sihui Liu 2019-10-07 19:36:46 PDT
Created attachment 380385 [details]
Patch
Comment 41 youenn fablet 2019-10-07 23:28:55 PDT
Comment on attachment 380385 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=380385&action=review

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:225
> +    m_pendingSpaceIncreaseTasks.add(identifier, taskSize);

Add ASSERT(!m_pendingSpaceIncreaseTasks.add(identifier))

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:230
> +    auto iterator = m_pendingSpaceIncreaseTasks.find(identifier);

Can we add an ASSERT, something like ASSERT(iterator != m_pendingSpaceIncreaseTasks.end() || !isTaskSuccessful);

> Source/WebCore/storage/StorageQuotaManager.h:96
> +    State m_state { State::Uninitialized };

We might want to add some API tests for StorageQuotaManager like we are doing for other utility classes like Strings.
Comment 42 Sihui Liu 2019-10-08 11:53:22 PDT
Comment on attachment 380385 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=380385&action=review

>> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:225
>> +    m_pendingSpaceIncreaseTasks.add(identifier, taskSize);
> 
> Add ASSERT(!m_pendingSpaceIncreaseTasks.add(identifier))

Okay.

>> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:230
>> +    auto iterator = m_pendingSpaceIncreaseTasks.find(identifier);
> 
> Can we add an ASSERT, something like ASSERT(iterator != m_pendingSpaceIncreaseTasks.end() || !isTaskSuccessful);

Sure.

>> Source/WebCore/storage/StorageQuotaManager.h:96
>> +    State m_state { State::Uninitialized };
> 
> We might want to add some API tests for StorageQuotaManager like we are doing for other utility classes like Strings.

What kind of API tests? 
I see we have some related API tests in StorageQuota.mm, this patch should not change their current behavior?
Comment 43 youenn fablet 2019-10-09 00:18:06 PDT
> > We might want to add some API tests for StorageQuotaManager like we are doing for other utility classes like Strings.
> 
> What kind of API tests? 
> I see we have some related API tests in StorageQuota.mm, this patch should
> not change their current behavior?

See for instance TestWebKitAPI/Tests/WTF/Vector.cpp.
Comment 44 Sihui Liu 2019-10-09 14:27:45 PDT
(In reply to youenn fablet from comment #43)
> > > We might want to add some API tests for StorageQuotaManager like we are doing for other utility classes like Strings.
> > 
> > What kind of API tests? 
> > I see we have some related API tests in StorageQuota.mm, this patch should
> > not change their current behavior?
> 
> See for instance TestWebKitAPI/Tests/WTF/Vector.cpp.

Will add a unit test in https://bugs.webkit.org/show_bug.cgi?id=202755. I will get this patch land first for some performance testing.
Comment 45 Sihui Liu 2019-10-09 14:50:54 PDT
Created attachment 380569 [details]
Patch for landing
Comment 46 WebKit Commit Bot 2019-10-09 15:39:04 PDT
Comment on attachment 380569 [details]
Patch for landing

Clearing flags on attachment: 380569

Committed r250937: <https://trac.webkit.org/changeset/250937>
Comment 47 WebKit Commit Bot 2019-10-09 15:39:06 PDT
All reviewed patches have been landed.  Closing bug.
Comment 48 Radar WebKit Bug Importer 2019-10-09 15:40:19 PDT
<rdar://problem/56133205>
Comment 49 Sihui Liu 2019-10-09 17:44:19 PDT
Reopened to upload test fix.
Comment 50 Sihui Liu 2019-10-09 17:56:44 PDT
Created attachment 380597 [details]
Patch for landing
Comment 51 WebKit Commit Bot 2019-10-09 18:44:57 PDT
Comment on attachment 380597 [details]
Patch for landing

Clearing flags on attachment: 380597

Committed r250951: <https://trac.webkit.org/changeset/250951>
Comment 52 WebKit Commit Bot 2019-10-09 18:44:59 PDT
All reviewed patches have been landed.  Closing bug.