WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
202483
IndexedDB: include size of index records in size estimate of put/add task
https://bugs.webkit.org/show_bug.cgi?id=202483
Summary
IndexedDB: include size of index records in size estimate of put/add task
Sihui Liu
Reported
2019-10-02 11:04:41 PDT
...
Attachments
Patch
(38.03 KB, patch)
2019-10-02 11:59 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(38.01 KB, patch)
2019-10-02 14:21 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
test.html
(1.91 KB, text/html)
2019-10-02 18:59 PDT
,
Sihui Liu
no flags
Details
Patch
(64.89 KB, patch)
2019-10-02 20:32 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(2.24 KB, patch)
2019-10-04 17:37 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(2.24 KB, patch)
2019-10-07 08:46 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(9.75 KB, patch)
2019-10-09 13:07 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch for landing
(9.74 KB, patch)
2019-10-09 14:16 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch for landing
(9.80 KB, patch)
2019-10-09 14:19 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Show Obsolete
(7)
View All
Add attachment
proposed patch, testcase, etc.
Sihui Liu
Comment 1
2019-10-02 11:59:37 PDT
Created
attachment 380045
[details]
Patch
Geoffrey Garen
Comment 2
2019-10-02 12:44:58 PDT
Comment on
attachment 380045
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=380045&action=review
> Source/WebCore/Modules/indexeddb/shared/IDBError.cpp:70 > +uint64_t IDBError::taskSize() const > +{ > + ASSERT(isNull() || (!isNull() && !m_size)); > + return m_size; > +}
It's an unfortunate / surprising design to use an error object to communicate a successful result (size). Usually, an error communicates that something failed. But in this case, you're using it to communicate that something succeeded. Can you return the size of the task some other way? Or Make an abstraction like a Variant that's a either an error or a size, depending on whether the task succeeded?
Geoffrey Garen
Comment 3
2019-10-02 12:52:36 PDT
Comment on
attachment 380045
[details]
Patch Is the purpose of this design to avoid storing something too big, or to avoid prompting the user too much? Can you include a test case that demonstrates the bug we're trying to fix by being more precise? It's probably easier to talk about whether a test passes or fails than it is to discuss design in the abstract. If the purpose of this design is to avoid storing something too big, I'm not sure we've achieved that goal. An item that is estimated to be small, but turns out to be big, will be stored. Later, we will update our estimate to know that we've gone over quota. But by that time, we're already over quota, possibly by a lot. So, I'd like to understand what we're trying to do with this more precise estimate.
Sihui Liu
Comment 4
2019-10-02 14:21:36 PDT
Created
attachment 380059
[details]
Patch
Sihui Liu
Comment 5
2019-10-02 18:59:45 PDT
Created
attachment 380076
[details]
test.html
Sihui Liu
Comment 6
2019-10-02 19:17:40 PDT
(In reply to Geoffrey Garen from
comment #3
)
> Comment on
attachment 380045
[details]
> Patch > > Is the purpose of this design to avoid storing something too big, or to > avoid prompting the user too much? > > Can you include a test case that demonstrates the bug we're trying to fix by > being more precise? It's probably easier to talk about whether a test passes > or fails than it is to discuss design in the abstract. >
Test file uploaded. For each add operation, estimated task size before execution is about 58KB, actual task size reported after execution is about 1.09MB. Database file size after db.close() is about 9MB.
> If the purpose of this design is to avoid storing something too big, I'm not > sure we've achieved that goal. An item that is estimated to be small, but > turns out to be big, will be stored. Later, we will update our estimate to > know that we've gone over quota. But by that time, we're already over quota, > possibly by a lot. So, I'd like to understand what we're trying to do with > this more precise estimate.
That's true. This is an effort to reduce effect of this problem but not to solve the problem completely. One case where updating the size after task execution could be meaningful is when we have multiple transactions. As in the test file, it executes two transactions. When we set the quota to be 1MB and do not update the task size, both transactions would succeed. When we update the task size estimate, the second transaction would fail as the estimated usage would be updated at the end of transaction. Bette size estimate also helps the case where one origin has multiple databases or multiple storage types for the same reason.
Sihui Liu
Comment 7
2019-10-02 20:32:07 PDT
Created
attachment 380077
[details]
Patch
youenn fablet
Comment 8
2019-10-03 00:01:14 PDT
Comment on
attachment 380077
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=380077&action=review
> Source/WebCore/Modules/indexeddb/shared/IDBError.h:100 > +class IDBErrorOrSize {
I would rename it to something like OperationSizeOrError or TaskSizeOrError. An Expected<uint64_t, IDBError> might be better in principle, it might require adding some makeUnexpected. Otherwise a Variant.
Geoffrey Garen
Comment 9
2019-10-03 21:25:02 PDT
> > Can you include a test case that demonstrates the bug we're trying to fix by > > being more precise? It's probably easier to talk about whether a test passes > > or fails than it is to discuss design in the abstract. > > > Test file uploaded. > > For each add operation, estimated task size before execution is about 58KB, > actual task size reported after execution is about 1.09MB. Database file > size after db.close() is about 9MB.
Thank you! I think this test demonstrates two issues: (1) In the worst case, when we store an object, we replicate the store in every index. (2) In the worst worst case, SQLite may use even more space than (1), due to internal implementation details. It looks like this patch attempts to address (1) but not (2). So, let's set aside (2) for the moment, and treat it as a separate bug. (Probably the fix for that bug will be to use "PRAGMA auto_vacuum = FULL" or similar.) In the case of (1), it looks like this patch attempts to address the subset of cases where the indexing overhead accumulates gradually over many stores, but not immediately in one store. So, if I set indexCount to 1000000, or I change createObject() to call randomArray(1000000)), this patch does not stop me. Is that right? Consider this algorithm, which I'll name INDEX_ESTIMATE_SYNC: bytes = (object size) * (number of indices) if (bytes > bytesAvailable) bytesAvailable = synchronously read file sizes on disk if (bytes > bytesAvailable) bytesAvailable = ask user for more space if (bytes > bytesAvailable) return an error bytesAvailable -= bytes And this algorithm, which I'll name INDEX_ESTIMATE_ASYNC: bytes = (object size) * (number of indices) if (bytes > bytesAvailable) if (bytesAvailableNeedsUpdate) schedule a task: bytesAvailable = synchronously read file sizes on disk bytesAvailableNeedsUpdate = false restart this algorithm else bytesAvailable = ask user for more space if (bytes > bytesAvailable) return an error bytesAvailableNeedsUpdate = true bytesAvailable -= bytes Is INDEX_ESTIMATE or INDEX_ESTIMATE_ASYNC practical to implement in our database task system? I believe that INDEX_ESTIMATE_* would solve the problem in (1) in all cases. I also believe it has the benefit of being a little more straightforward. The process of estimating a size, reserving space for it, and then passing back the difference between the estimated size and the true size (which is itself still an estimate) is challenging to follow. What do you think?
Sihui Liu
Comment 10
2019-10-04 01:02:17 PDT
(In reply to Geoffrey Garen from
comment #9
)
> > > Can you include a test case that demonstrates the bug we're trying to fix by > > > being more precise? It's probably easier to talk about whether a test passes > > > or fails than it is to discuss design in the abstract. > > > > > Test file uploaded. > > > > For each add operation, estimated task size before execution is about 58KB, > > actual task size reported after execution is about 1.09MB. Database file > > size after db.close() is about 9MB. > > Thank you! > > I think this test demonstrates two issues: > > (1) In the worst case, when we store an object, we replicate the store in > every index. > > (2) In the worst worst case, SQLite may use even more space than (1), due to > internal implementation details. > > It looks like this patch attempts to address (1) but not (2). So, let's set > aside (2) for the moment, and treat it as a separate bug. (Probably the fix > for that bug will be to use "PRAGMA auto_vacuum = FULL" or similar.) > > In the case of (1), it looks like this patch attempts to address the subset > of cases where the indexing overhead accumulates gradually over many stores, > but not immediately in one store. So, if I set indexCount to 1000000, or I > change createObject() to call randomArray(1000000)), this patch does not > stop me. > > Is that right? >
Right.
> Consider this algorithm, which I'll name INDEX_ESTIMATE_SYNC: > > bytes = (object size) * (number of indices) > if (bytes > bytesAvailable) > bytesAvailable = synchronously read file sizes on disk > if (bytes > bytesAvailable) > bytesAvailable = ask user for more space > if (bytes > bytesAvailable) > return an error > bytesAvailable -= bytes > > And this algorithm, which I'll name INDEX_ESTIMATE_ASYNC: > > bytes = (object size) * (number of indices) > if (bytes > bytesAvailable) > if (bytesAvailableNeedsUpdate) > schedule a task: > bytesAvailable = synchronously read file sizes on disk > bytesAvailableNeedsUpdate = false > restart this algorithm > else > bytesAvailable = ask user for more space > if (bytes > bytesAvailable) > return an error > > bytesAvailableNeedsUpdate = true > bytesAvailable -= bytes > > Is INDEX_ESTIMATE or INDEX_ESTIMATE_ASYNC practical to implement in our > database task system? > > I believe that INDEX_ESTIMATE_* would solve the problem in (1) in all cases. > I also believe it has the benefit of being a little more straightforward. > The process of estimating a size, reserving space for it, and then passing > back the difference between the estimated size and the true size (which is > itself still an estimate) is challenging to follow. > > What do you think?
I think this proposal is doable. Current patch would give a more accurate estimate than the proposed one but it would update the size estimate only after task execution, before which StorageQuotaManager may already grant some requests that it should not. It's more safe and cleaner to have a good task estimate when requesting the space.
Sihui Liu
Comment 11
2019-10-04 17:37:55 PDT
Created
attachment 380273
[details]
Patch
Sihui Liu
Comment 12
2019-10-07 08:46:19 PDT
Created
attachment 380330
[details]
Patch
Geoffrey Garen
Comment 13
2019-10-07 20:10:01 PDT
Comment on
attachment 380330
[details]
Patch r=me Unless the result is impractical, I think we should land this change with a test case similar to the one you uploaded, with indexCount set to 1000000, and/or createObject() set to call randomArray(1000000)). That way, the test documents the reason we made this implementation choice, and it can be used to qualified other implementations we may want to try out in the future.
Geoffrey Garen
Comment 14
2019-10-07 20:12:11 PDT
(The test case should verify that the store fails.)
Sihui Liu
Comment 15
2019-10-09 13:07:57 PDT
Created
attachment 380558
[details]
Patch
Sihui Liu
Comment 16
2019-10-09 14:16:17 PDT
Created
attachment 380564
[details]
Patch for landing
Sihui Liu
Comment 17
2019-10-09 14:19:00 PDT
Created
attachment 380566
[details]
Patch for landing
WebKit Commit Bot
Comment 18
2019-10-09 15:20:36 PDT
Comment on
attachment 380566
[details]
Patch for landing Clearing flags on attachment: 380566 Committed
r250936
: <
https://trac.webkit.org/changeset/250936
>
WebKit Commit Bot
Comment 19
2019-10-09 15:20:38 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 20
2019-10-09 15:21:17 PDT
<
rdar://problem/56132514
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug