Bug 91125

Summary: IndexedDB: generate index keys for existing data in createIndex in front end
Product: WebKit Reporter: Alec Flett <alecflett>
Component: WebCore Misc.Assignee: Alec Flett <alecflett>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, dglazkov, dgrogan, fishd, jamesr, jsbell, levin+threading, ojan, tkent+wkapi, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 91123, 93410    
Bug Blocks: 91146, 94023    
Attachments:
Description Flags
Patch
none
Patch
none
Archive of layout-test-results from gce-cr-linux-05
none
Patch
none
Patch for landing
none
Patch for landing
none
Archive of layout-test-results from gce-cq-02
none
Patch for landing
none
Patch
none
Archive of layout-test-results from gce-cr-linux-07 none

Description Alec Flett 2012-07-12 11:25:52 PDT
At the moment the IDBObjectStoreBackendImpl generates index data, but that has architectural issues for chromium - specifically because a JS engine (V8 or JSC) needs to be available to extract key data.

Rather than doing it in the backend, we need to create a cursor to iterate through the existing data, creating index data on the fly and notifying the backend, and then call the final success handlers only when this cursor has been exhausted.
Comment 1 Alec Flett 2012-08-01 16:12:27 PDT
I started to explain what I was doing in bug 91146, but I think this is a better place.

The basic problem I'm trying to deal with is to use render-side cursors to iterate existing data and send index data to the backend, when createIndex() is called.

The basic use case is a script that looks like:

objectStore.put({foo: 1}, 'x')
objectStore.put({foo: 2}, 'y')
objectStore.createIndex('foo', 'foo');
objectStore.index('foo').get(1)

so this seems simple enough, the index gets created, asynchronously populated in between createIndex() and index().get()

So initially I used the backend taskQueue to queue up asynchronous events, and run the cursor from the frontend.

IDBObjectStore.cpp more or less looked like (pardon the JS-like pseudocode)

IDBObjectStore::createIndex(...)
    m_backend->createIndex(...)
    openCursor().onsuccess = addIndexData;

The problem here is that the events get run through the taskQueue like

1. createIndexInternal
2. putInternal
3. putInternal
4. openCursorInternal
5. continueFunctionInternal
6. continueFunctionInternal
7. setIndexReadyInternal (which addIndexData sends when it is done, to let the backend know the index is ready for use)

(they don't all necessarily live in the task queue at the same time, but that's the order)

So the answer seems simple, right? use the "Early task queue" from bug 91146, and you end up doing the openCursorInternal and continueFunctionInternal in that queue instead of the normal one.

The problem there is that this breaks down for this case (back to JS)

objectStore.put({foo: 1}, 'x')
objectStore.put({foo: 2}, 'y')
objectStore.createIndex('foo', 'foo');
objectStore.createIndex('bar', 'foo');
objectStore.index('foo').get(1)

now events should queued up like this (normal event queue on the left, early on the right)
1. putInternal
2. putInternal
3. createIndexInternal (foo)
                              openCursorInternal (foo)
4. createIndexInternal (bar)
                              openCursorInternal (bar)
                              continueFunctionInternal (foo)
                              continueFunctionInternal (bar)
                              continueFunctionInternal (foo)
                              continueFunctionInternal (bar)
                              setIndexReadyInternal (foo)
                              setIndexReadyInternal (bar)
5. getInternal

But really what happens is that the 2nd createIndexInternal can't fire because we're in "early" mode, and the early queue is empty because we're waiting for a continueFunction from the renderer, so what actually happens is something like this:

1. createIndexInternal (foo)
                              openCursorInternal (foo)
                              continueFunctionInternal (foo)
                              continueFunctionInternal (foo)
                              setIndexReadyInternal (foo)
2. createIndexInternal (bar)
                              openCursorInternal (bar)
                              continueFunctionInternal (bar)
                              continueFunctionInternal (bar)
                              setIndexReadyInternal (bar)
3. putInternal
4. putInternal

BUT... when you're in single-process mode (aka DumpRenderTree) the events come in slightly differently. I now can't remember exactly but it was in some chicken-and-egg situation where I had events in the early queue but the backend couldn't figure out which queue it was supposed to be processing.

so ultimately, an idea that I outlined for David, is that TaskQueue should be a priority queue, and the events should have monotonically increasing sequence numbers.. except that when createIndex goes to open the cursor, it should use the same sequence number as the createIndexInternal event. This way, the frontend can continue to "pump" the cursor at the same position in the task queue, until it is exhausted.

I have a hacky fix right now without a priority queue, that passes all the tests (except for the one in bug 91146, where I'm arguing  that this is open for interpretation) - I'm cleaning up the patch to see how bad it is with all my debugging cruft removed.
Comment 2 Alec Flett 2012-08-01 16:13:29 PDT
I just realized the very last task queue example should look like:
1. putInternal
2. putInternal
3. createIndexInternal (foo)
                              openCursorInternal (foo)
                              continueFunctionInternal (foo)
                              continueFunctionInternal (foo)
                              setIndexReadyInternal (foo)
4. createIndexInternal (bar)
                              openCursorInternal (bar)
                              continueFunctionInternal (bar)
                              continueFunctionInternal (bar)
                              setIndexReadyInternal (bar)
5. getInternal
Comment 3 Alec Flett 2012-08-10 13:30:17 PDT
Created attachment 157800 [details]
Patch
Comment 4 WebKit Review Bot 2012-08-10 13:33:07 PDT
Please wait for approval from abarth@webkit.org, dglazkov@chromium.org, fishd@chromium.org, jamesr@chromium.org or tkent@chromium.org before submitting, as this patch contains changes to the Chromium public API. See also https://trac.webkit.org/wiki/ChromiumWebKitAPI.
Comment 5 Alec Flett 2012-08-13 10:21:00 PDT
and here's the webkit patch with the meat of the changes.. jsbell/dgrogan, mind taking a look? I'll also add a comment in the unit test here about the spec interpretation issue (on when an error should throw for createIndex / put
Comment 6 Joshua Bell 2012-08-13 17:13:44 PDT
Comment on attachment 157800 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=157800&action=review

overall LGTM with various nits/notes

> Source/WebCore/ChangeLog:18
> +        is still in discussion as the spec is vague on the proper error

"...still in discussion on the standards list as..."

> Source/WebCore/ChangeLog:23
> +        * Modules/indexeddb/IDBCursorBackendImpl.cpp:

Provide some high level details about the per-file/per-function changes here, unless they're trivial.

> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:287
> +class PopulateIndex : public EventListener {

Naming: how about "IndexPopulator" ?

> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:307
> +    { }

Put these { } on the previous line.

> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:309
> +    virtual void handleEvent(ScriptExecutionContext*, Event* event)

Just out of curiosity, would making this a subclass of IDBRequest instead of an EventListener simplify it at all? It would change the plumbing through to the openCursor() call, but might be a better encapsulation. Dunno.

> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:421
> +    if (ec)

Do you need to call indexRequest->markEarlyDeath() here to handle the back-end transaction asynchronously aborting?

> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:424
> +    RefPtr<PopulateIndex> indexPopulater = PopulateIndex::create(m_backend, m_transaction->backend(), indexRequest, metadata);

spelling: "populator"

> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:425
> +    indexRequest->setOnsuccess(indexPopulater);

What manages the lifetime of this object? Is it kept alive by the request, and the request is kept alive only by the transaction?

> Source/WebCore/Modules/indexeddb/IDBObjectStoreBackendImpl.cpp:265
> +    // synchronous. Should be async...

Capitalize.

> Source/WebCore/Modules/indexeddb/IDBObjectStoreBackendImpl.cpp:275
> +        // FIXME: need to deal with errorMessage here

Capitalize, punctuate.

> Source/WebCore/Modules/indexeddb/IDBObjectStoreBackendImpl.cpp:348
> +        // Are we supposed to abort here?

Mark this FIXME, don't leave in commented-out code.

What does a false return code from makeIndexWriters indicate?

> Source/WebCore/Modules/indexeddb/IDBRequest.cpp:-49
> -    RefPtr<IDBRequest> request(adoptRef(new IDBRequest(context, source, transaction)));

Should probably have this ::create() call the other one, rather than duplicating the logic e.g. suspendIfNeeded(). Possibly even do this in the .h file.

> Source/WebCore/Modules/indexeddb/IDBRequest.cpp:170
> +    onError(IDBDatabaseError::create(IDBDatabaseException::IDB_ABORT_ERR, "The request was aborted, so the request cannot be fulfilled."));

I disagree with this change - this error occurs when the /transaction/ aborted before this /request/ was processed. From an outside perspective, requests don't abort, transactions do. (Plus it's redundant to say "the request was aborted, so it was aborted"). The message is still kinda lame, though.

> Source/WebCore/Modules/indexeddb/IDBTransactionBackendImpl.cpp:255
> +    // Just process a single event here, in case the event iself

Typo: "iself"

> Source/WebCore/Modules/indexeddb/IDBTransactionBackendImpl.h:66
> +    void addEarlyEvent() { m_pendingEarlyRequests++; }

Name this addPreemptiveEvent() for consistent terminology?

> Source/WebCore/Modules/indexeddb/IDBTransactionBackendImpl.h:97
> +    TaskQueue m_earlyTaskQueue;

Name this m_preemptiveTaskQueue for consistent terminology?

> Source/WebCore/Modules/indexeddb/IDBTransactionBackendImpl.h:105
> +    int m_pendingEarlyRequests;

Ditto.

> LayoutTests/ChangeLog:12
> +        * storage/indexeddb/lazy-index-population.html: Added.

Hey, this looks familiar. :)
Comment 7 Alec Flett 2012-08-13 20:54:10 PDT
Comment on attachment 157800 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=157800&action=review

>> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:287
>> +class PopulateIndex : public EventListener {
> 
> Naming: how about "IndexPopulator" ?

Done.

>> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:309
>> +    virtual void handleEvent(ScriptExecutionContext*, Event* event)
> 
> Just out of curiosity, would making this a subclass of IDBRequest instead of an EventListener simplify it at all? It would change the plumbing through to the openCursor() call, but might be a better encapsulation. Dunno.

I think it would increase the complexity quite a bit - the markEarlyDeath() issue below is just one example. The EventListener class has just one method to implement and the API is spec'ed in stone. An IDBRequest subclass would have to consistently manage a lot of IDB state in an architecture that will be changing internally quite a bit. (This is one of those cases where I wish we could just implement this in JavaScript!)

>> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:421
>> +    if (ec)
> 
> Do you need to call indexRequest->markEarlyDeath() here to handle the back-end transaction asynchronously aborting?

Nope, that's the great thing about using the front-end API - you only have to do what JavaScript does, as the request events are dispatched by normal means and request is accounting is done in IDBRequest::dispatchRequest. Added some class comments to explain this.

(As an aside, this is why I think that the inspector should be using the front-end APIs rather then mucking around with IDBRequest subclassing)

>> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:425
>> +    indexRequest->setOnsuccess(indexPopulater);
> 
> What manages the lifetime of this object? Is it kept alive by the request, and the request is kept alive only by the transaction?

Yep, exactly. Just like a JS cursor handler on the frontend.

>> Source/WebCore/Modules/indexeddb/IDBObjectStoreBackendImpl.cpp:348
>> +        // Are we supposed to abort here?
> 
> Mark this FIXME, don't leave in commented-out code.
> 
> What does a false return code from makeIndexWriters indicate?

oops, yeah I realized the error was enough to cause the abort.

>> Source/WebCore/Modules/indexeddb/IDBRequest.cpp:170
>> +    onError(IDBDatabaseError::create(IDBDatabaseException::IDB_ABORT_ERR, "The request was aborted, so the request cannot be fulfilled."));
> 
> I disagree with this change - this error occurs when the /transaction/ aborted before this /request/ was processed. From an outside perspective, requests don't abort, transactions do. (Plus it's redundant to say "the request was aborted, so it was aborted"). The message is still kinda lame, though.

I totally agree, and I have no idea why I made that change.

>> Source/WebCore/Modules/indexeddb/IDBTransactionBackendImpl.h:66
>> +    void addEarlyEvent() { m_pendingEarlyRequests++; }
> 
> Name this addPreemptiveEvent() for consistent terminology?

done - I totally missed these when I renamed EarlyEvent -> PreemptiveEvent! oops.

>> Source/WebCore/Modules/indexeddb/IDBTransactionBackendImpl.h:97
>> +    TaskQueue m_earlyTaskQueue;
> 
> Name this m_preemptiveTaskQueue for consistent terminology?

done

>> Source/WebCore/Modules/indexeddb/IDBTransactionBackendImpl.h:105
>> +    int m_pendingEarlyRequests;
> 
> Ditto.

done.
Comment 8 Alec Flett 2012-08-13 21:03:36 PDT
Created attachment 158200 [details]
Patch
Comment 9 WebKit Review Bot 2012-08-13 21:58:57 PDT
Comment on attachment 158200 [details]
Patch

Attachment 158200 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13493434

New failing tests:
storage/indexeddb/index-multientry.html
storage/indexeddb/objectstore-basics-workers.html
storage/indexeddb/objectstore-basics.html
storage/indexeddb/index-unique.html
Comment 10 WebKit Review Bot 2012-08-13 21:59:01 PDT
Created attachment 158209 [details]
Archive of layout-test-results from gce-cr-linux-05

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: gce-cr-linux-05  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.39-gcg-201203291735-x86_64-with-Ubuntu-10.04-lucid
Comment 11 Alec Flett 2012-08-14 11:23:11 PDT
Created attachment 158377 [details]
Patch
Comment 12 Alec Flett 2012-08-14 11:24:57 PDT
ok, new patch works. One issue is that when makeIndexWriters fails, we should fail with a ConstraintError - oddly, I discovered a few places (as you can see in the updated tests) that were expecting a DATA_ERR in a put() when the put was failing due to uniqueness constraints. I'm not sure how we missed that up until this point? Or maybe we previously just didn't have a way to distinguish because of the way the code was previously structured.
Comment 13 Alec Flett 2012-08-14 16:54:41 PDT
abarth@ - mind a review of the chromium side, and if you're feeling generous, the rest of the patch?
Comment 14 Adam Barth 2012-08-14 22:35:03 PDT
Comment on attachment 158377 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=158377&action=review

API change LGTM.  The rest of the patch is too complicated for me.  :(

> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:311
> +            : EventListener(CPPEventListenerType)
> +            , m_objectStoreBackend(backend)
> +            , m_transaction(transaction)
> +            , m_indexMetadata(indexMetadata) { }

four-space indent pls.  Also, the { and } should be on separate lines.
Comment 15 Tony Chang 2012-08-14 22:52:44 PDT
Comment on attachment 158377 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=158377&action=review

> Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:359
> +

Nit: extra blank line?

> Source/WebCore/Modules/indexeddb/IDBObjectStoreBackendImpl.cpp:223
> +static bool makeIndexWriters(IDBObjectStoreBackendImpl* objectStore, PassRefPtr<IDBKey> primaryKey, bool keyWasGenerated, const Vector<String> indexNames, const Vector<IDBObjectStoreBackendInterface::IndexKeys>& indexKeys, Vector<OwnPtr<IndexWriter> > *indexWriters, String *errorMessage)

Nit: * in the wrong spot for indexWriters and errorMessage.

> Source/WebCore/Modules/indexeddb/IDBObjectStoreBackendImpl.cpp:245
> +        indexWriter = adoptPtr(new IndexWriter(index->metadata(), keys));

Nit: Maybe move the variable declaration into the same line to have 1 less line of code?

> Source/WebKit/chromium/src/WebIDBObjectStoreImpl.cpp:97
> +    for (size_t i = 0; i < webIndexNames.size(); i++)

Nit: ++i
Comment 16 Alec Flett 2012-08-15 10:20:13 PDT
Created attachment 158588 [details]
Patch for landing
Comment 17 Alec Flett 2012-08-15 11:06:36 PDT
Created attachment 158596 [details]
Patch for landing
Comment 18 WebKit Review Bot 2012-08-15 12:44:04 PDT
Comment on attachment 158596 [details]
Patch for landing

Rejecting attachment 158596 [details] from commit-queue.

New failing tests:
storage/indexeddb/index-multientry.html
storage/indexeddb/index-population.html
storage/indexeddb/mozilla/create-index-with-integer-keys.html
storage/indexeddb/transaction-error.html
storage/indexeddb/opencursor-key.html
storage/indexeddb/mozilla/indexes.html
storage/indexeddb/mozilla/index-prev-no-duplicate.html
storage/indexeddb/lazy-index-population.html
storage/indexeddb/index-basics.html
storage/indexeddb/index-basics-workers.html
Full output: http://queues.webkit.org/results/13501755
Comment 19 WebKit Review Bot 2012-08-15 12:44:09 PDT
Created attachment 158618 [details]
Archive of layout-test-results from gce-cq-02

The attached test failures were seen while running run-webkit-tests on the commit-queue.
Bot: gce-cq-02  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.39-gcg-201203291735-x86_64-with-Ubuntu-10.04-lucid
Comment 20 Alec Flett 2012-08-15 15:14:32 PDT
Created attachment 158646 [details]
Patch for landing
Comment 21 Alec Flett 2012-08-15 15:15:52 PDT
(last land was a bad merge, dropping a parameter by accident)
Comment 22 WebKit Review Bot 2012-08-15 15:18:11 PDT
Attachment 158646 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/stor..." exit_code: 1
Source/WebCore/Modules/indexeddb/IDBObjectStore.cpp:311:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 1 in 29 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 23 Alec Flett 2012-08-15 15:22:42 PDT
Created attachment 158648 [details]
Patch
Comment 24 WebKit Review Bot 2012-08-15 16:28:37 PDT
Comment on attachment 158648 [details]
Patch

Attachment 158648 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13515167

New failing tests:
http/tests/cache/post-with-cached-subresources.php
Comment 25 WebKit Review Bot 2012-08-15 16:28:41 PDT
Created attachment 158658 [details]
Archive of layout-test-results from gce-cr-linux-07

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: gce-cr-linux-07  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.39-gcg-201203291735-x86_64-with-Ubuntu-10.04-lucid
Comment 26 WebKit Review Bot 2012-08-15 17:51:08 PDT
Comment on attachment 158648 [details]
Patch

Clearing flags on attachment: 158648

Committed r125728: <http://trac.webkit.org/changeset/125728>
Comment 27 WebKit Review Bot 2012-08-15 17:51:14 PDT
All reviewed patches have been landed.  Closing bug.