RESOLVED FIXED 201957
IndexedDB: update size to actual disk usage only when estimated increase is bigger than space available
https://bugs.webkit.org/show_bug.cgi?id=201957
Summary IndexedDB: update size to actual disk usage only when estimated increase is b...
Sihui Liu
Reported 2019-09-18 17:28:57 PDT
...
Attachments
Patch (52.02 KB, patch)
2019-09-18 17:49 PDT, Sihui Liu
no flags
Patch (54.03 KB, patch)
2019-09-19 12:31 PDT, Sihui Liu
no flags
Patch (78.31 KB, patch)
2019-09-19 21:47 PDT, Sihui Liu
no flags
Patch (79.98 KB, patch)
2019-09-24 18:42 PDT, Sihui Liu
no flags
Patch (79.87 KB, patch)
2019-09-25 11:43 PDT, Sihui Liu
no flags
Patch (48.91 KB, patch)
2019-09-27 12:23 PDT, Sihui Liu
no flags
Patch (48.95 KB, patch)
2019-09-27 13:23 PDT, Sihui Liu
no flags
Patch (50.05 KB, patch)
2019-09-27 18:31 PDT, Sihui Liu
no flags
Patch (50.07 KB, patch)
2019-09-29 13:18 PDT, Sihui Liu
no flags
Patch (48.66 KB, patch)
2019-09-30 13:07 PDT, Sihui Liu
no flags
Patch (48.53 KB, patch)
2019-10-01 13:22 PDT, Sihui Liu
no flags
Patch (9.08 KB, patch)
2019-10-02 14:00 PDT, Sihui Liu
no flags
Patch (31.88 KB, patch)
2019-10-03 13:34 PDT, Sihui Liu
no flags
Patch (32.50 KB, patch)
2019-10-07 14:46 PDT, Sihui Liu
no flags
Patch (32.81 KB, patch)
2019-10-07 19:36 PDT, Sihui Liu
no flags
Patch for landing (32.93 KB, patch)
2019-10-09 14:50 PDT, Sihui Liu
no flags
Patch for landing (2.58 KB, patch)
2019-10-09 17:56 PDT, Sihui Liu
no flags
Sihui Liu
Comment 1 2019-09-18 17:49:05 PDT
Geoffrey Garen
Comment 2 2019-09-18 19:43:49 PDT
Comment on attachment 379089 [details] Patch Looks like this needs an update to trunk. Can you include performance data for this change?
Sihui Liu
Comment 3 2019-09-19 12:31:03 PDT
Sihui Liu
Comment 4 2019-09-19 21:47:07 PDT
Sihui Liu
Comment 5 2019-09-20 11:14:43 PDT
This patch cuts about 100ms for one iteration of PerformanceTests/IndexedDB/basic/objectstore-add.html (which does a lot of small adds) when running with "run-minibrowser --release".
Geoffrey Garen
Comment 6 2019-09-20 14:14:12 PDT
Comment on attachment 379204 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379204&action=review > Source/WebCore/ChangeLog:11 > + the file size only when the accumulated estimated increase is bigger than space availabe. available > Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:715 > + m_shouldSyncSpaceCompletely = true; > + if (m_manager) > + m_manager->userNeedSync(*this); It looks like the m_shouldSyncSpaceCompletely boolean is only used for the sake of userNeedSync(). Is that right? If so, can you reorganize the code so that userNeedSync() just calls a separate function? It seems like there's one function that initializes all values from disk, and another function that updates values based on changes that have been recorded. I would call those two functions "computeSpaceUsed" and "updateSpaceUsed". > Source/WebCore/Modules/indexeddb/server/IDBServer.cpp:723 > + if (m_shouldSyncSpaceCompletely) { > + m_server.startComputingSpaceUsedForOrigin(m_origin); It looks like we use two words to describe the operation of reading sizes on disk: "compute" and "sync". Also for describing a space delta: "modify" and "change". I find code a lot more readable when we use one term consistently. I like "compute" and "update", so I'll suggest renames that use them: syncSpaceUsed => computeSpaceUsed m_syncCallbacks => m_computeSpaceUsedCallbacks didSyncSpaceUsed => didComputeSpaceUsed userNeedSync => userNeedsToComputeSpaceUsed askUserToSync => askUserToComputeSpaceUsed computeSpaceUsedChangeForOrigin => updateSpaceUsedForOrigin startComputingSpaceUsedChangeForOrigin => startUpdatingSpaceUsedChangeForOrigin finishComputingSpaceUsedChangeForOrigin => finishUpdatingSpaceUsedForOrigin m_shouldSyncSpaceCompletely => [ remove ] didSyncSpaceUsedModifiedOnly => didUpdateSpaceUsed canSyncSpaceModifiedOnly => canUpdateSpaceUsed > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:220 > +{ startSpaceIncreasingTask => startSpaceIncreaseTask > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:226 > +void UniqueIDBDatabase::completeSpaceIncreasingTask(uint64_t identifier, uint64_t actualTaskSize) > +{ completeSpaceIncreasingTask => finishSpaceIncreaseTask > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:227 > + auto iterator = m_pendingSpaceIncreasingTasks.find(identifier); m_pendingSpaceIncreasingTasks => m_pendingSpaceIncreaseTasks
Sihui Liu
Comment 7 2019-09-24 18:42:45 PDT
youenn fablet
Comment 8 2019-09-25 02:02:16 PDT
Comment on attachment 379521 [details] Patch Some tests are failing, patch has some WTFLogAlways. View in context: https://bugs.webkit.org/attachment.cgi?id=379521&action=review > Source/WebCore/ChangeLog:12 > + the accurate file size only when the accumulated estimated increase is bigger than space available. This seems like a fine plan. I do not really understand why the current model of StorageQuotaManager needs any change to implement this plan. Can we do something like: - IDBServer receives a new action - IDBServer queries the StorageQuotaManager to check whether the action will get the space go above a given limit, for instance in IDBServer::requestSpace - If it does not go above the limit, continue as usual - If it goes above the limit, update the actual IDB size before continuing with IDBServer::requestSpace.
Sihui Liu
Comment 9 2019-09-25 11:33:30 PDT
(In reply to youenn fablet from comment #8) > Comment on attachment 379521 [details] > Patch > > Some tests are failing, patch has some WTFLogAlways. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=379521&action=review > > > Source/WebCore/ChangeLog:12 > > + the accurate file size only when the accumulated estimated increase is bigger than space available. > > This seems like a fine plan. > I do not really understand why the current model of StorageQuotaManager > needs any change to implement this plan. > Can we do something like: > - IDBServer receives a new action > - IDBServer queries the StorageQuotaManager to check whether the action will > get the space go above a given limit, for instance in IDBServer::requestSpace > - If it does not go above the limit, continue as usual > - If it goes above the limit, update the actual IDB size before continuing > with IDBServer::requestSpace. This would probably work in general cases. The flow may be: - IDBServer receives a new action with size S: asks StorageQuotaManager pre-RequestSpaceCheck(S) - StorageQuotaManager says NoEnoughSpace, IDBServer dispatches a task to updateDatabaseSizes - IDBServer receives a new action, queues it up because the update is not finished - IDBServer didUpdateDatabaseSizes, requestSpace(S), this request gets handled or queued up in StorageQuotaManager - IDBServer handles next requests in queue Then do we do pre-RequestSpaceCheck for Cache? The reasons I made the change in StorageQuotaManager instead of IDBServer: 1. StorageQuotaManager is aware of all pending requests (compared to queueing up requests in IDBServer) so it could decide how much more space it wants when askForMoreSpace. 2. StorageQuotaManager would block and update IDB database sizes when Cache requestSpace gets Deny. (maybe we can improve this a bit that if we find IDBServer has not executed any action since last update, we just skip this.) 3. It might be easier for us to integrate other storage types, which might also use estimatedSizeChange, in the future. We only need to add the capabilities of QuotaUser instead of adding a queue in StorageTypeServer.
Sihui Liu
Comment 10 2019-09-25 11:43:30 PDT
youenn fablet
Comment 11 2019-09-26 01:50:38 PDT
(In reply to Sihui Liu from comment #9) > (In reply to youenn fablet from comment #8) > > Comment on attachment 379521 [details] > > Patch > > > > Some tests are failing, patch has some WTFLogAlways. > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=379521&action=review > > > > > Source/WebCore/ChangeLog:12 > > > + the accurate file size only when the accumulated estimated increase is bigger than space available. > > > > This seems like a fine plan. > > I do not really understand why the current model of StorageQuotaManager > > needs any change to implement this plan. > > Can we do something like: > > - IDBServer receives a new action > > - IDBServer queries the StorageQuotaManager to check whether the action will > > get the space go above a given limit, for instance in IDBServer::requestSpace > > - If it does not go above the limit, continue as usual > > - If it goes above the limit, update the actual IDB size before continuing > > with IDBServer::requestSpace. > > This would probably work in general cases. The flow may be: > - IDBServer receives a new action with size S: asks StorageQuotaManager > pre-RequestSpaceCheck(S) > - StorageQuotaManager says NoEnoughSpace, IDBServer dispatches a task to > updateDatabaseSizes > - IDBServer receives a new action, queues it up because the update is not > finished > - IDBServer didUpdateDatabaseSizes, requestSpace(S), this request gets > handled or queued up in StorageQuotaManager > - IDBServer handles next requests in queue > > Then do we do pre-RequestSpaceCheck for Cache? I don't believe we need pre-RequestSpaceCheck for Cache. We could reuse requestSpace for pre-RequestSpaceCheck by adding a new parameter (RequestMoreSpaceIfNeeded::No). > > The reasons I made the change in StorageQuotaManager instead of IDBServer: > 1. StorageQuotaManager is aware of all pending requests (compared to > queueing up requests in IDBServer) so it could decide how much more space it > wants when askForMoreSpace. > 2. StorageQuotaManager would block and update IDB database sizes when Cache > requestSpace gets Deny. (maybe we can improve this a bit that if we find > IDBServer has not executed any action since last update, we just skip this.) Good point. IIUC, the scenario is that IDB fills up almost all space. Cache API adds a new entry and hits the limit. In that case, we should recompute IDB actual size before answering cache API. Maybe we should do a refactoring for having QuotaUser::spaceUsed() take a lambda. In most cases, the lambda would be called synchronously. quota manager would store a lastSpaceUsed, that would use IDB to either return synchronously or do a full asynchronous computation. Would you be able to investigate this? Another approach might be to delay full size computation based on a timer. Every time a new action happens, we would delay the timer. When no action happens for some time (one minute?), we would update the IDB size. > 3. It might be easier for us to integrate other storage types, which might > also use estimatedSizeChange, in the future. We only need to add the > capabilities of QuotaUser instead of adding a queue in StorageTypeServer. It seems to me IDB is very specific there.
Geoffrey Garen
Comment 12 2019-09-26 17:34:54 PDT
Comment on attachment 379567 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379567&action=review Ideas for simplifying: (1) Remove updateSpaceUsed(). Always computeSpaceUsed(). We've already taken care to make it rare, so we don't need to be so precise. (2) Remove m_estimatedSpaceIncrease and m_spaceReserved. Instead, the quota manager can track the number of bytes written, and computeSpaceUsed() once the number exceeds half the available space (or some other reasonable fraction). (3) Remove the code to compute space used when shouldAskForMoreSpace() is true. Instead, based on (2), we can assume that when shouldAskForMoreSpace() is true, we really have to ask. (shouldAskForMoreSpace should check whether bytesWritten exceeds bytesAvailable. It's OK if this check is a little aggressive or imperfect. When the check fires, it is certain that we need more space to operate efficiently, and very likely to be true that we're close to our space limit.) > Source/WebCore/ChangeLog:9 > + operation could slow. could slow => is slow
youenn fablet
Comment 13 2019-09-26 22:59:17 PDT
If we go with async spaceUsed, we can also probably remove whenInitialized code as well.
Sihui Liu
Comment 14 2019-09-27 12:23:53 PDT
Sihui Liu
Comment 15 2019-09-27 12:34:24 PDT
(In reply to youenn fablet from comment #13) > If we go with async spaceUsed, we can also probably remove whenInitialized > code as well. There are details that I am not so sure about your proposed approach, so I chose to keep the main structure of QuotaStorageManager. Let me know what do you think about the new patch. The idea is to reset UniqueIDBDatabase usage if we think the requested size is bigger than allowed, and try processing the request after the usage is reset.
Sihui Liu
Comment 16 2019-09-27 13:23:56 PDT
Sihui Liu
Comment 17 2019-09-27 13:24:56 PDT
*** Bug 201728 has been marked as a duplicate of this bug. ***
Sihui Liu
Comment 18 2019-09-27 18:31:29 PDT
Sihui Liu
Comment 19 2019-09-29 13:18:31 PDT
Sihui Liu
Comment 20 2019-09-30 09:18:57 PDT
Comment on attachment 379819 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379819&action=review > Source/WebCore/ChangeLog:6 > + Update disk usage of IndexedDB only when estimated increase is bigger than space available > + https://bugs.webkit.org/show_bug.cgi?id=201957 > + > + Reviewed by NOBODY (OOPS!). Double Changelog.. will fix this in next patch.
youenn fablet
Comment 21 2019-09-30 09:52:37 PDT
Comment on attachment 379819 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379819&action=review > Source/WebCore/Modules/indexeddb/server/IDBServer.h:170 > + uint64_t spaceUsed() const final { "{" in its own line. > Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:988 > + return IDBError { WTF::nullopt, { }, sizeIncrease }; It might be good to add a static IDBError method, to remove the first two parameters and make it clear this is a success case, not an error case. Or maybe we should use something like Expected<uint64_t, IDBError>. > Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:1793 > + sizeIncrease += error.size(); This line reads a bit odd (Expected might be better here I guess). Also, error should be null, so probably no need to add its size? > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.h:306 > + HashMap<uint64_t, uint64_t> m_pendingSpaceIncreaseTasks; Maybe we should introduce a typed identifier for this map key. > Source/WebCore/storage/StorageQuotaManager.cpp:154 > + for (auto user : copyToVector(m_users)) auto& > Source/WebCore/storage/StorageQuotaManager.h:91 > + State state; Why do we need this state for each pending request? Isn't it that this state should be a member of StorageQuotaManager instead? In that case, we could merge it with m_hasStartedHandlingRequests maybe? > Source/WebCore/storage/StorageQuotaUser.h:38 > + virtual void resetSpaceUsed(Condition) { return; } The name is not very clear and StorageQuotaManager is only calling with Condition::IndexedDBOnly. How about: - renaming it to computeExactSpaceUsed() with no parameters. Also s/{ return; }// One case to check is the case where there is no IDB at all. In that case, we want to be sure the code is still working and the tasks will be processed. I am not sure this will be the case since, if there is no IDB, all users will be still initialized.
Sihui Liu
Comment 22 2019-09-30 13:03:32 PDT
(In reply to youenn fablet from comment #21) > Comment on attachment 379819 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=379819&action=review > > > Source/WebCore/Modules/indexeddb/server/IDBServer.h:170 > > + uint64_t spaceUsed() const final { > > "{" in its own line. > Okay. > > Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:988 > > + return IDBError { WTF::nullopt, { }, sizeIncrease }; > > It might be good to add a static IDBError method, to remove the first two > parameters and make it clear this is a success case, not an error case. > Or maybe we should use something like Expected<uint64_t, IDBError>. > Will add a static method. > > Source/WebCore/Modules/indexeddb/server/SQLiteIDBBackingStore.cpp:1793 > > + sizeIncrease += error.size(); > > This line reads a bit odd (Expected might be better here I guess). > Also, error should be null, so probably no need to add its size? > IDBError's size will be set as the task size (space increase) when task successfully finishes, so size should be added only when error is null. > > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.h:306 > > + HashMap<uint64_t, uint64_t> m_pendingSpaceIncreaseTasks; > > Maybe we should introduce a typed identifier for this map key. > Sure. This would introduce changes to other maps, so will do in https://bugs.webkit.org/show_bug.cgi?id=202366. > > Source/WebCore/storage/StorageQuotaManager.cpp:154 > > + for (auto user : copyToVector(m_users)) > > auto& > Sure. > > Source/WebCore/storage/StorageQuotaManager.h:91 > > + State state; > > Why do we need this state for each pending request? > Isn't it that this state should be a member of StorageQuotaManager instead? > In that case, we could merge it with m_hasStartedHandlingRequests maybe? > This is used to ensure that we would compute space used/ask for space increase on the same request once and only once. Currently for StorageQuotaManger, its states are 1. waiting users to compute/initialize (m_pendingInitializationUsers is not null) 2. waiting space increase response (m_isWaitingForSpaceIncreaseResponse is true) 3. can handle request m_hasStartedHandlingRequests can be true or false when state 3. For PendingRequest, its states are 1. never handled before (can become 2 or 4) 2. already request for usage update (can become 3 or 4) 3. already request for space increase(can become 4) 4. handled - Grant or Deny For StorageQuotaManger, QuotaUser update/initialize/reset are all implemented by adding that user to m_pendingInitializationUsers users. Therefore when StorageQuotaManger goes from state 1 to state 3, it does not know what action to take for the first pending request in queue. > > Source/WebCore/storage/StorageQuotaUser.h:38 > > + virtual void resetSpaceUsed(Condition) { return; } > > The name is not very clear and StorageQuotaManager is only calling with > Condition::IndexedDBOnly. > How about: > - renaming it to computeExactSpaceUsed() with no parameters. > Sure. > > Also s/{ return; }// > Okay. > > One case to check is the case where there is no IDB at all. > In that case, we want to be sure the code is still working and the tasks > will be processed. > I am not sure this will be the case since, if there is no IDB, all users > will be still initialized. > Yes, this case would not happen under current implementation - IDBServer would be there in the initialization of QuotaUsers(NetworkProcess::initializeQuotaUsers). We can also add a check for pending users after computeExactSpaceUsed so we can continue to process the same request if IDB is not present.
Sihui Liu
Comment 23 2019-09-30 13:07:48 PDT
youenn fablet
Comment 24 2019-10-01 05:29:06 PDT
> > > Source/WebCore/storage/StorageQuotaManager.h:91 > > > + State state; > > > > Why do we need this state for each pending request? > > Isn't it that this state should be a member of StorageQuotaManager instead? > > In that case, we could merge it with m_hasStartedHandlingRequests maybe? > > > This is used to ensure that we would compute space used/ask for space > increase on the same request once and only once. > > Currently for StorageQuotaManger, its states are > 1. waiting users to compute/initialize (m_pendingInitializationUsers is not > null) > 2. waiting space increase response (m_isWaitingForSpaceIncreaseResponse is > true) > 3. can handle request > > m_hasStartedHandlingRequests can be true or false when state 3. > > > For PendingRequest, its states are > 1. never handled before (can become 2 or 4) > 2. already request for usage update (can become 3 or 4) > 3. already request for space increase(can become 4) > 4. handled - Grant or Deny > > For StorageQuotaManger, QuotaUser update/initialize/reset are all > implemented by adding that user to m_pendingInitializationUsers users. > Therefore when StorageQuotaManger goes from state 1 to state 3, it does not > know what action to take for the first pending request in queue. Agreed, but this is only useful for the first pending requests. All other pending requests are Unhandled. It seems best to have the additional information needed to go from state 1 to state 3 in StorageQuotaManager. Or maybe we could introduce a fourth state for StorageQuotaManager which is PendingRecomputationofActualSpaceUsage. > > > Source/WebCore/storage/StorageQuotaUser.h:38 > > > + virtual void resetSpaceUsed(Condition) { return; } > > > > The name is not very clear and StorageQuotaManager is only calling with > > Condition::IndexedDBOnly. > > How about: > > - renaming it to computeExactSpaceUsed() with no parameters. > > > Sure. > > > > > Also s/{ return; }// > > > Okay. > > > > > One case to check is the case where there is no IDB at all. > > In that case, we want to be sure the code is still working and the tasks > > will be processed. > > I am not sure this will be the case since, if there is no IDB, all users > > will be still initialized. > > > Yes, this case would not happen under current implementation - IDBServer > would be there in the initialization of > QuotaUsers(NetworkProcess::initializeQuotaUsers). > > We can also add a check for pending users after computeExactSpaceUsed so we > can continue to process the same request if IDB is not present. Yes, it seems more robust to cover that case.
Geoffrey Garen
Comment 25 2019-10-01 10:09:24 PDT
> We can also add a check for pending users after computeExactSpaceUsed so we > can continue to process the same request if IDB is not present. Do we really need to make IDB requests wait for computeExactSpaceUsed? Could the implementation be simpler if we didn't wait? If we didn't wait, the consequence would be that some requests near the space limit, but still below it, would trigger a user approval for more space. I think that consequence is acceptable, if it simplifies the implementation. 99.9% of websites will never even be near the space limit.
Geoffrey Garen
Comment 26 2019-10-01 10:35:30 PDT
Comment on attachment 379838 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379838&action=review > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:236 > +void UniqueIDBDatabase::startSpaceIncreaseTask(uint64_t identifier, uint64_t taskSize) > +{ > + m_server->increasePotentialSpaceUsed(m_identifier.origin(), taskSize); > + m_pendingSpaceIncreaseTasks.add(identifier, taskSize); > +} > + > +void UniqueIDBDatabase::finishSpaceIncreaseTask(uint64_t identifier, uint64_t actualTaskSize) > +{ > + auto iterator = m_pendingSpaceIncreaseTasks.find(identifier); > + if (iterator != m_pendingSpaceIncreaseTasks.end()) { > + m_server->decreasePotentialSpaceUsed(m_identifier.origin(), iterator->value); > + if (actualTaskSize) > + m_server->increaseSpaceUsed(m_identifier.origin(), actualTaskSize); > + m_pendingSpaceIncreaseTasks.remove(iterator); Is it really essential that we guess how much space a task will use in advance, and then update to a precise value afterwards, and keep a precise tally in memory? I believe it should be sufficient only to guess how much space each task will use, and accumulate guesses, and only measure the amount of space used on disk periodically. It's OK if our guess is a little inaccurate, as long as it doesn't prompt the user too much, and it doesn't allow a webpage to use way more than 500MB on disk. For example, even a really simple guess that only counted space increases and never counted space decreases, and synchronized with the disk when the size of space increases reach half of available capacity, would probably be good enough, no?
Sihui Liu
Comment 27 2019-10-01 12:29:09 PDT
(In reply to Geoffrey Garen from comment #26) > Comment on attachment 379838 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=379838&action=review > > > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:236 > > +void UniqueIDBDatabase::startSpaceIncreaseTask(uint64_t identifier, uint64_t taskSize) > > +{ > > + m_server->increasePotentialSpaceUsed(m_identifier.origin(), taskSize); > > + m_pendingSpaceIncreaseTasks.add(identifier, taskSize); > > +} > > + > > +void UniqueIDBDatabase::finishSpaceIncreaseTask(uint64_t identifier, uint64_t actualTaskSize) > > +{ > > + auto iterator = m_pendingSpaceIncreaseTasks.find(identifier); > > + if (iterator != m_pendingSpaceIncreaseTasks.end()) { > > + m_server->decreasePotentialSpaceUsed(m_identifier.origin(), iterator->value); > > + if (actualTaskSize) > > + m_server->increaseSpaceUsed(m_identifier.origin(), actualTaskSize); > > + m_pendingSpaceIncreaseTasks.remove(iterator); > > Is it really essential that we guess how much space a task will use in > advance, and then update to a precise value afterwards, and keep a precise > tally in memory? > > I believe it should be sufficient only to guess how much space each task > will use, and accumulate guesses, and only measure the amount of space used > on disk periodically. > > It's OK if our guess is a little inaccurate, as long as it doesn't prompt > the user too much, and it doesn't allow a webpage to use way more than 500MB > on disk. > > For example, even a really simple guess that only counted space increases > and never counted space decreases, and synchronized with the disk when the > size of space increases reach half of available capacity, would probably be > good enough, no? Performance concern: We only want to update disk usage(conduct file system operation) when it's necessary. Estimated space increase reaches some amount of available capacity can be too aggressive (or conservative, depending on how we do the estimate). Correctness concern: It's hard to estimate space increase for operation like put before it's executed, based on our current IDB implementation. The updating estimatedSizeIncrease after task done is a way to make the guess become "a little inaccurate" instead of "very inaccurate". For example, put(key, value) (1) Size estimate before task: size of key + size of value (2) Size estimate after task: size of serialized key + size of serialized value + size of records for different indexes(which may be double (1)).
Sihui Liu
Comment 28 2019-10-01 13:22:04 PDT
Sihui Liu
Comment 29 2019-10-01 13:48:55 PDT
(In reply to Geoffrey Garen from comment #25) > > We can also add a check for pending users after computeExactSpaceUsed so we > > can continue to process the same request if IDB is not present. > > Do we really need to make IDB requests wait for computeExactSpaceUsed? > > Could the implementation be simpler if we didn't wait? > > If we didn't wait, the consequence would be that some requests near the > space limit, but still below it, would trigger a user approval for more > space. I think that consequence is acceptable, if it simplifies the > implementation. 99.9% of websites will never even be near the space limit. This implementation is relatively simple now since we already have the code path to blocked update usage of QuotaUsers (IDBServer::QuotaUser::resetSpaceUsed). We barely modify QuotaUser. If we do not block the requests during update, in some edge case, we may stat the file system multiple times(e.g. reset user usage during user usage update) or prompt multiple times(usage update running in the background thread). And this does not seem easier as we may need to keep states in QuotaUser or StorageQuotaManager. If we believe most website will never reach this state, then blocking update should not happen very often so may be less a concern.
youenn fablet
Comment 30 2019-10-02 09:26:11 PDT
This patch seems to do 3 things: 1. Improve the task estimate before calling the quota manager 2. Improve the task estimate after evaluating the task size so that our overall estimate is overall better 3. Trigger actual computation of real size when reaching quota 1 is straightforward and could be landed quickly. 2 is bigger but I think is also ok given our current approach Could we split these in 3 patches to make progress?
youenn fablet
Comment 31 2019-10-02 09:29:59 PDT
Comment on attachment 379938 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379938&action=review > Source/WebCore/storage/StorageQuotaManager.h:96 > + Action m_actionTakenOnNextRequest { Action::Never }; I do not think Action is the right term. It might be State { NeedSpaceUsageInitialization, ComputingSpaceUsage, AskingForMoreSpace } I wonder whether None could be equal to ComputingExactSpaceUsage.
Sihui Liu
Comment 32 2019-10-02 09:39:20 PDT
(In reply to youenn fablet from comment #30) > This patch seems to do 3 things: > 1. Improve the task estimate before calling the quota manager > 2. Improve the task estimate after evaluating the task size so that our > overall estimate is overall better > 3. Trigger actual computation of real size when reaching quota > > 1 is straightforward and could be landed quickly. > 2 is bigger but I think is also ok given our current approach > Could we split these in 3 patches to make progress? Sure.(In reply to youenn fablet from comment #30) > This patch seems to do 3 things: > 1. Improve the task estimate before calling the quota manager > 2. Improve the task estimate after evaluating the task size so that our > overall estimate is overall better > 3. Trigger actual computation of real size when reaching quota > > 1 is straightforward and could be landed quickly. > 2 is bigger but I think is also ok given our current approach > Could we split these in 3 patches to make progress? Sure, will do.
Sihui Liu
Comment 33 2019-10-02 10:05:10 PDT
(In reply to youenn fablet from comment #31) > Comment on attachment 379938 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=379938&action=review > > > Source/WebCore/storage/StorageQuotaManager.h:96 > > + Action m_actionTakenOnNextRequest { Action::Never }; > > I do not think Action is the right term. It might be State { > NeedSpaceUsageInitialization, ComputingSpaceUsage, AskingForMoreSpace } > I wonder whether None could be equal to ComputingExactSpaceUsage. Never // Has never handled request, needs to initialize m_quota first None // No action has been taken on first pending request, next action should be ComputeExactSpaceUsed or making a decision ComputeExactSpaceUsed // ComputeExactSpaceUsed has been called, next action should be AskForMoreSpace or making a decision AskForMoreSpace // AskForMoreSpace has been called, next action should be making decision So None is not quite the same as ComputeExactSpaceUsed. If we are using state, it's probably more like enum class State { NeedInitializeQuota, Idle, ComputingSpaceUsage, AskingForMoreSpace, }; State m_lastState;
Sihui Liu
Comment 34 2019-10-02 12:09:01 PDT
(In reply to youenn fablet from comment #30) > This patch seems to do 3 things: > 1. Improve the task estimate before calling the quota manager > 2. Improve the task estimate after evaluating the task size so that our > overall estimate is overall better > 3. Trigger actual computation of real size when reaching quota > > 1 is straightforward and could be landed quickly. > 2 is bigger but I think is also ok given our current approach > Could we split these in 3 patches to make progress? Patch 1: https://bugs.webkit.org/show_bug.cgi?id=202480 Patch 2: https://bugs.webkit.org/show_bug.cgi?id=202483
Sihui Liu
Comment 35 2019-10-02 14:00:12 PDT
Sihui Liu
Comment 36 2019-10-03 13:34:43 PDT
Sihui Liu
Comment 37 2019-10-03 13:55:35 PDT
(In reply to youenn fablet from comment #30) > This patch seems to do 3 things: > 1. Improve the task estimate before calling the quota manager > 2. Improve the task estimate after evaluating the task size so that our > overall estimate is overall better > 3. Trigger actual computation of real size when reaching quota > > 1 is straightforward and could be landed quickly. > 2 is bigger but I think is also ok given our current approach > Could we split these in 3 patches to make progress? It turns out the order of these patches should be: 1. Improve the task estimate before calling the quota manager (202480). 2. Use estimated usage and trigger actual computation when estimated usage reaches limit (201957, this one). Use estimated usage without triggering actual computation would cause correctness problem as the estimated increase accumulates and never reset. 3. Improve the task estimate after task execution so that the estimated usage in 2 is more accurate (202483). So we still need to solve 2 first.
youenn fablet
Comment 38 2019-10-07 11:38:37 PDT
Comment on attachment 380158 [details] Patch This looks almost good to me. Some suggestions below. View in context: https://bugs.webkit.org/attachment.cgi?id=380158&action=review > Source/WebCore/storage/StorageQuotaManager.cpp:-87 > - updateQuotaBasedOnSpaceUsage(); Why removing this call? It should be the one that advance m_previousState to Uninitialized to MakingDecisionForRequest > Source/WebCore/storage/StorageQuotaManager.cpp:149 > + } Why do we have that code? Normally, either we are initialised, in which case the call in askUserToInitialize would move us out of Uninitialized, or the !m_pendingInitializationUsers.isEmpty() check would kick in. > Source/WebCore/storage/StorageQuotaManager.cpp:189 > +void StorageQuotaManager::processPendingRequests(Optional<uint64_t> newQuota) I wonder whether we should split processPendingRequests in two methods, one for processing the request after a quota increase request and another one to restart the queue on various cases. If you do not do it in this patch I can take a stab. > Source/WebCore/storage/StorageQuotaManager.cpp:216 > + m_previousState = State::MakingDecisionForRequest; We should probably make set m_previousState before callback call for compatibility with other call sites. > Source/WebCore/storage/StorageQuotaManager.h:96 > + State m_previousState { State::Uninitialized }; s/m_state/m_previousState. Also, can we merge m_isWaitingForSpaceIncreaseResponse with m_state?
Sihui Liu
Comment 39 2019-10-07 14:46:02 PDT
Sihui Liu
Comment 40 2019-10-07 19:36:46 PDT
youenn fablet
Comment 41 2019-10-07 23:28:55 PDT
Comment on attachment 380385 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=380385&action=review > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:225 > + m_pendingSpaceIncreaseTasks.add(identifier, taskSize); Add ASSERT(!m_pendingSpaceIncreaseTasks.add(identifier)) > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:230 > + auto iterator = m_pendingSpaceIncreaseTasks.find(identifier); Can we add an ASSERT, something like ASSERT(iterator != m_pendingSpaceIncreaseTasks.end() || !isTaskSuccessful); > Source/WebCore/storage/StorageQuotaManager.h:96 > + State m_state { State::Uninitialized }; We might want to add some API tests for StorageQuotaManager like we are doing for other utility classes like Strings.
Sihui Liu
Comment 42 2019-10-08 11:53:22 PDT
Comment on attachment 380385 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=380385&action=review >> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:225 >> + m_pendingSpaceIncreaseTasks.add(identifier, taskSize); > > Add ASSERT(!m_pendingSpaceIncreaseTasks.add(identifier)) Okay. >> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:230 >> + auto iterator = m_pendingSpaceIncreaseTasks.find(identifier); > > Can we add an ASSERT, something like ASSERT(iterator != m_pendingSpaceIncreaseTasks.end() || !isTaskSuccessful); Sure. >> Source/WebCore/storage/StorageQuotaManager.h:96 >> + State m_state { State::Uninitialized }; > > We might want to add some API tests for StorageQuotaManager like we are doing for other utility classes like Strings. What kind of API tests? I see we have some related API tests in StorageQuota.mm, this patch should not change their current behavior?
youenn fablet
Comment 43 2019-10-09 00:18:06 PDT
> > We might want to add some API tests for StorageQuotaManager like we are doing for other utility classes like Strings. > > What kind of API tests? > I see we have some related API tests in StorageQuota.mm, this patch should > not change their current behavior? See for instance TestWebKitAPI/Tests/WTF/Vector.cpp.
Sihui Liu
Comment 44 2019-10-09 14:27:45 PDT
(In reply to youenn fablet from comment #43) > > > We might want to add some API tests for StorageQuotaManager like we are doing for other utility classes like Strings. > > > > What kind of API tests? > > I see we have some related API tests in StorageQuota.mm, this patch should > > not change their current behavior? > > See for instance TestWebKitAPI/Tests/WTF/Vector.cpp. Will add a unit test in https://bugs.webkit.org/show_bug.cgi?id=202755. I will get this patch land first for some performance testing.
Sihui Liu
Comment 45 2019-10-09 14:50:54 PDT
Created attachment 380569 [details] Patch for landing
WebKit Commit Bot
Comment 46 2019-10-09 15:39:04 PDT
Comment on attachment 380569 [details] Patch for landing Clearing flags on attachment: 380569 Committed r250937: <https://trac.webkit.org/changeset/250937>
WebKit Commit Bot
Comment 47 2019-10-09 15:39:06 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 48 2019-10-09 15:40:19 PDT
Sihui Liu
Comment 49 2019-10-09 17:44:19 PDT
Reopened to upload test fix.
Sihui Liu
Comment 50 2019-10-09 17:56:44 PDT
Created attachment 380597 [details] Patch for landing
WebKit Commit Bot
Comment 51 2019-10-09 18:44:57 PDT
Comment on attachment 380597 [details] Patch for landing Clearing flags on attachment: 380597 Committed r250951: <https://trac.webkit.org/changeset/250951>
WebKit Commit Bot
Comment 52 2019-10-09 18:44:59 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.