Bug 157448

Summary: Modern IDB: IDBOpenDBRequests that are stop()'ed don't notify the IDBServer of that fact
Product: WebKit Reporter: Brady Eidson <beidson>
Component: WebCore Misc.Assignee: Brady Eidson <beidson>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, alecflett, ap, benjamin, buildbot, cdumez, cmarcelo, commit-queue, darin, jsbell, rniwa
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 149117, 154968, 149953    
Attachments:
Description Flags
Patch
none
Archive of layout-test-results from ews106 for mac-yosemite-wk2
none
Archive of layout-test-results from ews113 for mac-yosemite
none
Archive of layout-test-results from ews121 for ios-simulator-wk2
none
Archive of layout-test-results from ews101 for mac-yosemite
none
Patch
darin: review-
Patch
achristensen: review+, buildbot: commit-queue-
Archive of layout-test-results from ews107 for mac-yosemite-wk2
none
Patch for EWS + landing none

Description Brady Eidson 2016-05-06 22:13:00 PDT
Modern IDB: IDBRequests that are stop()'ed don't notify the IDBServer of that fact

Scenario:
A ScriptExecutionContext has a pending IDBOpenDBRequest that is blocked because there's already another connection to the UniqueIDBDatabase.
The IDBServer is waiting to hear back from this request.
Then the user navigates away from the page (in the Document case) or the Worker thread is terminated (in the WorkerGlobalScope case).

The server will prevent all future access to that UniqueIDBDatabase because it doesn't know that IDBOpenDBRequest has been torn down.

This is definitely preventing us from enabled all existing Worker tests, and is likely the cause of at least *some* of the flakiness tracked in https://bugs.webkit.org/show_bug.cgi?id=154968
Comment 1 Brady Eidson 2016-05-06 22:17:49 PDT
There's (3) ActiveDOMObjects in play in the IDB family of objects.

1 - IDBDatabases already close() themselves when the ScriptExecutionContext stops.
2 - IDBTransactions already abort() themselves when the context stops.

When transactions abort, that also happens to clean up *most* of the queued up IDBRequests associated with that transaction.

3 - IDBRequests do *not* do any cleanup when they are stop()'ed. As mentioned above in #2, this is usually not an issue for most requests.

But for IDBOpenDBRequests that have not yet resulted in an open IDBDatabase connection or version change IDBTransaction, they leave the IDBServer in a bad state.

To be conservative to start (and not risk flooding the IDBServer with cross thread traffic or IPC traffic just because of a page navigation), I'll just implement the cleanup in IDBOpenDBRequest.
Comment 2 Brady Eidson 2016-05-06 22:18:33 PDT
Since my code change here will be only about IDBOpenDBRequests, retitling to:
Modern IDB: IDBOpenDBRequests that are stop()'ed don't notify the IDBServer of that fact
Comment 3 Brady Eidson 2016-05-10 16:57:30 PDT
Created attachment 278551 [details]
Patch
Comment 4 Alex Christensen 2016-05-10 17:25:09 PDT
Comment on attachment 278551 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=278551&action=review

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:1176
>      m_openDatabaseConnections.remove(&connection);

This can also be removed.  This connection was removed on line 1148.
Comment 5 Build Bot 2016-05-10 17:59:22 PDT
Comment on attachment 278551 [details]
Patch

Attachment 278551 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1300917

Number of test failures exceeded the failure limit.
Comment 6 Build Bot 2016-05-10 17:59:26 PDT
Created attachment 278560 [details]
Archive of layout-test-results from ews106 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 7 Build Bot 2016-05-10 18:06:08 PDT
Comment on attachment 278551 [details]
Patch

Attachment 278551 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1300922

Number of test failures exceeded the failure limit.
Comment 8 Build Bot 2016-05-10 18:06:12 PDT
Created attachment 278561 [details]
Archive of layout-test-results from ews113 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews113  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 9 Build Bot 2016-05-10 18:30:10 PDT
Comment on attachment 278551 [details]
Patch

Attachment 278551 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/1300960

Number of test failures exceeded the failure limit.
Comment 10 Build Bot 2016-05-10 18:30:14 PDT
Created attachment 278566 [details]
Archive of layout-test-results from ews121 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews121  Port: ios-simulator-wk2  Platform: Mac OS X 10.11.4
Comment 11 Build Bot 2016-05-10 18:43:24 PDT
Comment on attachment 278551 [details]
Patch

Attachment 278551 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1301087

Number of test failures exceeded the failure limit.
Comment 12 Build Bot 2016-05-10 18:43:30 PDT
Created attachment 278568 [details]
Archive of layout-test-results from ews101 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 13 Brady Eidson 2016-05-10 20:15:58 PDT
Well yikes! 

All of storage/indexeddb passed, but I forgot to give the imported tests a go.

Exploring.
Comment 14 Brady Eidson 2016-05-10 20:20:59 PDT
Created attachment 278573 [details]
Patch
Comment 15 Brady Eidson 2016-05-10 20:22:50 PDT
The layout test failures were resolved by a simple and appropriate null check.

(In reply to comment #4)
> > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:1176
> >      m_openDatabaseConnections.remove(&connection);
> 
> This can also be removed.  This connection was removed on line 1148.

Fixed in the new patch.
Comment 16 Darin Adler 2016-05-10 21:22:19 PDT
Comment on attachment 278573 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=278573&action=review

> Source/WTF/wtf/Deque.h:86
> +    T& operator[](size_t i) { ASSERT(m_start != m_end); return m_buffer.buffer()[m_start + i]; }
> +    const T& operator[](size_t i) const { ASSERT(m_start != m_end); return m_buffer.buffer()[m_start + i]; }

This isn’t correct. The Deque implementation uses a circular buffer, so there needs to be code that handles the wraparound when m_start + i is >= m_buffer.capacity(). We also should assert that i < size(), not that m_start != m_end.

A correct implementation would look more like this:

    ASSERT(i < size());
    ASSERT(m_start <= m_buffer.capacity());
    size_t sizeAfterStart = m_buffer.capacity() - m_start;
    if (i < sizeAfterStart)
        return m_buffer.buffer()[m_start + i];
    return m_buffer.buffer()[i - sizeAfterStart];

> Source/WebCore/Modules/indexeddb/IDBOpenDBRequest.cpp:116
> +    connectionProxy().openDBRequestCancelled(IDBRequestData(connectionProxy(), *this));

Would be more elegant to just write it like this:

    connectionProxy().openDBRequestCancelled({ connectionProxy(), *this });

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:489
> +#ifndef NDEBUG
> +        if (m_versionChangeTransaction) {
> +            ASSERT(m_versionChangeTransaction->databaseConnection().openRequestIdentifier() == requestIdentifier);
> +            ASSERT(&m_versionChangeTransaction->databaseConnection() == m_versionChangeDatabaseConnection);
> +        }
> +#endif

Could we put the "!m_versionChangeTransaction ||" inside the asserts to avoid the #ifndef NDEBUG?

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:499
> +    for (size_t i = 0; i < m_pendingOpenDBRequests.size(); ++i) {
> +        if (m_pendingOpenDBRequests[i]->requestData().requestIdentifier() == requestIdentifier) {
> +            m_pendingOpenDBRequests.remove(i);
> +            return;
> +        }
> +    }

Why does this have to be written with indices instead of with begin/end? If it was written with begin/end, then we would not need to make any changes to Deque.h, right?

    for (auto iterator = m_pendingOpenDBRequests.begin(); iterator != m_pendingOpenDBRequests.end(); ++iterator) {
        if (iterator->get().requestData().requestIdentifier() == requestIdentifier) {
            m_pendingOpenDBRequests.remove(iterator);
            return;
        }
    }

But also, seems like we are using the wrong kind of collection if we have to iterate through it like this when canceling.

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:1149
> +    RefPtr<UniqueIDBDatabaseConnection> refConnection(&connection);

Our name for this is normally protectedConnection. Also, we should use Ref for this, not RefPtr.

> Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabaseConnection.cpp:76
> +    RefPtr<UniqueIDBDatabaseConnection> protector(this);

I think protectedThis would be a better name.

> Source/WebCore/Modules/indexeddb/shared/InProcessIDBServer.cpp:404
> +    RefPtr<InProcessIDBServer> self(this);

Should be named protectedThis, not "self", even though other functions in this file call it self.
Comment 17 Brady Eidson 2016-05-10 21:40:54 PDT
(In reply to comment #16)
> Comment on attachment 278573 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=278573&action=review
> 
> > Source/WTF/wtf/Deque.h:86
> > +    T& operator[](size_t i) { ASSERT(m_start != m_end); return m_buffer.buffer()[m_start + i]; }
> > +    const T& operator[](size_t i) const { ASSERT(m_start != m_end); return m_buffer.buffer()[m_start + i]; }
> 
> This isn’t correct. The Deque implementation uses a circular buffer, so
> there needs to be code that handles the wraparound when m_start + i is >=
> m_buffer.capacity(). We also should assert that i < size(), not that m_start
> != m_end.
> 
> A correct implementation would look more like this:
> 
>     ASSERT(i < size());
>     ASSERT(m_start <= m_buffer.capacity());
>     size_t sizeAfterStart = m_buffer.capacity() - m_start;
>     if (i < sizeAfterStart)
>         return m_buffer.buffer()[m_start + i];
>     return m_buffer.buffer()[i - sizeAfterStart];

Holy moly you're so right. I thought it was just using a sliding window. Yikes!

> > Source/WebCore/Modules/indexeddb/IDBOpenDBRequest.cpp:116
> > +    connectionProxy().openDBRequestCancelled(IDBRequestData(connectionProxy(), *this));
> 
> Would be more elegant to just write it like this:
> 
>     connectionProxy().openDBRequestCancelled({ connectionProxy(), *this });

k

> > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:489
> > +#ifndef NDEBUG
> > +        if (m_versionChangeTransaction) {
> > +            ASSERT(m_versionChangeTransaction->databaseConnection().openRequestIdentifier() == requestIdentifier);
> > +            ASSERT(&m_versionChangeTransaction->databaseConnection() == m_versionChangeDatabaseConnection);
> > +        }
> > +#endif
> 
> Could we put the "!m_versionChangeTransaction ||" inside the asserts to
> avoid the #ifndef NDEBUG?

k
 
> > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:499
> > +    for (size_t i = 0; i < m_pendingOpenDBRequests.size(); ++i) {
> > +        if (m_pendingOpenDBRequests[i]->requestData().requestIdentifier() == requestIdentifier) {
> > +            m_pendingOpenDBRequests.remove(i);
> > +            return;
> > +        }
> > +    }
> 
> Why does this have to be written with indices instead of with begin/end? If
> it was written with begin/end, then we would not need to make any changes to
> Deque.h, right?

In my boneheaded short-sightedness, I saw that Deque didn't have the right iterator types to work with modern C++ for() iteration, and completely didn't think of falling back to our old iterator style.

This is what I would do, except...

> But also, seems like we are using the wrong kind of collection if we have to
> iterate through it like this when canceling.

Yup, this is a ListHashSet now.

> > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:1149
> > +    RefPtr<UniqueIDBDatabaseConnection> refConnection(&connection);
> 
> Our name for this is normally protectedConnection. Also, we should use Ref
> for this, not RefPtr.

k

> > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabaseConnection.cpp:76
> > +    RefPtr<UniqueIDBDatabaseConnection> protector(this);
> 
> I think protectedThis would be a better name.

There's a lot more "protector" than "protectedThis" in WebCore, though neither is used an incredible amount.

I wonder if we should ditch the grey area for black and white and implement a coding style guideline for this?

> > Source/WebCore/Modules/indexeddb/shared/InProcessIDBServer.cpp:404
> > +    RefPtr<InProcessIDBServer> self(this);
> 
> Should be named protectedThis, not "self", even though other functions in
> this file call it self.

I agree that the self()s in this file should be whatever we decide above.

I can't in good OCD-consciousness land one in this file that is different than the rest in the file, but I will *gladly* update them all - along with all the other protector/protectedThis instances in WebCore - once we decide what the standard should be.
Comment 18 Brady Eidson 2016-05-10 22:38:09 PDT
Created attachment 278588 [details]
Patch

This gets rid of the WTF/Deque change and makes the m_pendingOpenDBRequests a ListHashSet instead.

It also includes most of the other review feedback.

While local tests are in-progress as I hit the sack, I'll rely on EWS for this one if anybody can review tonight.
Comment 19 Alex Christensen 2016-05-10 23:19:30 PDT
Comment on attachment 278588 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=278588&action=review

> Source/WebCore/ChangeLog:3
> +        Modern IDB: IDBOpenDBRequests that are stop()'ed don't notify the IDBServer of that fact.

stop()'ed

> Source/WebCore/ChangeLog:12
> +        1 - When an IDBOpenDBRequest is stop()ed due to page navigation or worker termination,

stop()ed.  This is inconsistent ;)

> Source/WebCore/Modules/indexeddb/IDBRequest.cpp:254
> +    // The base IDBRequest class has nothing additional to do here.

Isn't that implied by an empty function?
Comment 20 Build Bot 2016-05-10 23:26:44 PDT
Comment on attachment 278588 [details]
Patch

Attachment 278588 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1302130

New failing tests:
imported/w3c/web-platform-tests/IndexedDB/transaction-lifetime.htm
imported/w3c/web-platform-tests/IndexedDB/idbfactory_open5.htm
Comment 21 Build Bot 2016-05-10 23:26:48 PDT
Created attachment 278596 [details]
Archive of layout-test-results from ews107 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews107  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 22 Brady Eidson 2016-05-11 08:54:00 PDT
(In reply to comment #20)
> Comment on attachment 278588 [details]
> Patch
> 
> Attachment 278588 [details] did not pass mac-wk2-ews (mac-wk2):
> Output: http://webkit-queues.webkit.org/results/1302130
> 
> New failing tests:
> imported/w3c/web-platform-tests/IndexedDB/transaction-lifetime.htm
> imported/w3c/web-platform-tests/IndexedDB/idbfactory_open5.htm

I cannot reproduce these failures locally.

Nor do I see how this patch has any behaviorally meaningful difference from the last patch, which passed EWS.
Comment 23 Brady Eidson 2016-05-11 08:55:28 PDT
The crashes are in the Databases process, and both look like:

Thread 2 Crashed:: IndexedDatabase Server
0   com.apple.WebCore             	0x0000000106d81c29 WebCore::IDBServer::SQLiteIDBBackingStore::filenameForDatabaseName() const + 25
1   com.apple.WebCore             	0x0000000106d81d83 WebCore::IDBServer::SQLiteIDBBackingStore::fullDatabasePath() const + 35
2   com.apple.WebCore             	0x0000000106d81f47 WebCore::IDBServer::SQLiteIDBBackingStore::getOrEstablishDatabaseInfo(WebCore::IDBDatabaseInfo&) + 279
3   com.apple.WebCore             	0x0000000106f4111c WebCore::IDBServer::UniqueIDBDatabase::openBackingStore(WebCore::IDBDatabaseIdentifier const&) + 188
4   com.apple.WebCore             	0x00000001064b9f6a WebCore::IDBServer::IDBServer::databaseRunLoop() + 122
5   com.apple.JavaScriptCore      	0x00000001058ee962 WTF::threadEntryPoint(void*) + 178 (functional:1742)
6   com.apple.JavaScriptCore      	0x00000001058eed7f WTF::wtfThreadEntryPoint(void*) + 15 (memory:2657)
7   libsystem_pthread.dylib       	0x00007fff9353e05a _pthread_body + 131
8   libsystem_pthread.dylib       	0x00007fff9353dfd7 _pthread_start + 176
9   libsystem_pthread.dylib       	0x00007fff9353b3ed thread_start + 13
Comment 24 Brady Eidson 2016-05-11 09:00:53 PDT
(In reply to comment #23)
> The crashes are in the Databases process, and both look like:
> 
> Thread 2 Crashed:: IndexedDatabase Server
> 0   com.apple.WebCore             	0x0000000106d81c29
> WebCore::IDBServer::SQLiteIDBBackingStore::filenameForDatabaseName() const +
> 25
> 1   com.apple.WebCore             	0x0000000106d81d83
> WebCore::IDBServer::SQLiteIDBBackingStore::fullDatabasePath() const + 35
> 2   com.apple.WebCore             	0x0000000106d81f47
> WebCore::IDBServer::SQLiteIDBBackingStore::
> getOrEstablishDatabaseInfo(WebCore::IDBDatabaseInfo&) + 279
> 3   com.apple.WebCore             	0x0000000106f4111c
> WebCore::IDBServer::UniqueIDBDatabase::openBackingStore(WebCore::
> IDBDatabaseIdentifier const&) + 188
> 4   com.apple.WebCore             	0x00000001064b9f6a
> WebCore::IDBServer::IDBServer::databaseRunLoop() + 122
> 5   com.apple.JavaScriptCore      	0x00000001058ee962
> WTF::threadEntryPoint(void*) + 178 (functional:1742)
> 6   com.apple.JavaScriptCore      	0x00000001058eed7f
> WTF::wtfThreadEntryPoint(void*) + 15 (memory:2657)
> 7   libsystem_pthread.dylib       	0x00007fff9353e05a _pthread_body + 131
> 8   libsystem_pthread.dylib       	0x00007fff9353dfd7 _pthread_start + 176
> 9   libsystem_pthread.dylib       	0x00007fff9353b3ed thread_start + 13

Nothing in there looks suspect, at all.

The runloop is calling a task for "UniqueIDBDatabase::openBackingStore"

A UniqueIDBDatabase object itself posted that task, which is a CrossThreadTask that keeps the UniqueIDBDatabase alive.
So the UniqueIDBDatabase object is fine.

Then inside of UniqueIDBDatabase::openBackingStore we create a SQLiteIDBBackingStore and immediately start using it.

It's definitely non-null.

And looking at what the SQLiteIDBBacking store is doing to end up crashing here... nothing fancy.

I'm stumped.
Comment 25 Brady Eidson 2016-05-11 09:24:49 PDT
(In reply to comment #22)
> (In reply to comment #20)
> > Comment on attachment 278588 [details]
> > Patch
> > 
> > Attachment 278588 [details] did not pass mac-wk2-ews (mac-wk2):
> > Output: http://webkit-queues.webkit.org/results/1302130
> > 
> > New failing tests:
> > imported/w3c/web-platform-tests/IndexedDB/transaction-lifetime.htm
> > imported/w3c/web-platform-tests/IndexedDB/idbfactory_open5.htm
> 
> I cannot reproduce these failures locally.


Scratch that - I can't reproduce locally when running just subdirectories of IDB tests, but I can reproduce a similar crash locally when running *all* the tests. 

For me it was imported/w3c/web-platform-tests/IndexedDB/idbfactory_open6.htm instead of open5.

Curious.
Comment 26 Brady Eidson 2016-05-11 09:25:58 PDT
(In reply to comment #25)
> (In reply to comment #22)
> > (In reply to comment #20)
> > > Comment on attachment 278588 [details]
> > > Patch
> > > 
> > > Attachment 278588 [details] did not pass mac-wk2-ews (mac-wk2):
> > > Output: http://webkit-queues.webkit.org/results/1302130
> > > 
> > > New failing tests:
> > > imported/w3c/web-platform-tests/IndexedDB/transaction-lifetime.htm
> > > imported/w3c/web-platform-tests/IndexedDB/idbfactory_open5.htm
> > 
> > I cannot reproduce these failures locally.
> 
> 
> Scratch that - I can't reproduce locally when running just subdirectories of
> IDB tests, but I can reproduce a similar crash locally when running *all*
> the tests. 
> 
> For me it was imported/w3c/web-platform-tests/IndexedDB/idbfactory_open6.htm
> instead of open5.
> 
> Curious.


Never mind - I saw a database process crash, but it was *not* the same.  However, I'm running a debug build and got an assertion, so maybe the root cause is the same and I'm just seeing debug build symptoms instead of release build symptoms.
Comment 27 Brady Eidson 2016-05-11 09:29:25 PDT
Indeed, the ASSERT I see is as follows:

void UniqueIDBDatabase::didDeleteBackingStore(uint64_t deletedVersion)
{
...
    ASSERT(m_openDatabaseConnections.isEmpty());

So a database delete is finishing yet there *are* open database connections when it finishes.

This is bad, and could feasibly be related to the re-opening of the backing store afterwards with things in a bad state, which could be what the release build crashes are seeing.

I'll try to narrow down reproducibility so I can get all sorts of logging and catch in the debugger.
Comment 28 Brady Eidson 2016-05-11 10:39:53 PDT
Created attachment 278634 [details]
Patch for EWS + landing

Previously, m_currentOpenDBRequest hung around much longer, until the end of the actual delete operation, and that prevented other new openDBRequests from happening in the meantime.

Now, m_currentOpenDBRequest might be cleared out while a delete operation is in progress, which can then lead to other open requests being handled at the same time.

This patch makes one change that explicitly prevents any new openDB requests from being handled while a delete request is in progress, and therefore should restore previous behavior.
Comment 29 WebKit Commit Bot 2016-05-11 11:41:15 PDT
Comment on attachment 278634 [details]
Patch for EWS + landing

Clearing flags on attachment: 278634

Committed r200695: <http://trac.webkit.org/changeset/200695>
Comment 30 Darin Adler 2016-05-11 14:58:26 PDT
(In reply to comment #17)
> I saw that Deque didn't have the right
> iterator types to work with modern C++ for() iteration

I think you mean that we can’t delete something in a modern for loop. Or is it really true that we can’t iterate a Deque with a modern for loop at all?

> Yup, this is a ListHashSet now.

Great!

> There's a lot more "protector" than "protectedThis" in WebCore, though
> neither is used an incredible amount.
> 
> I wonder if we should ditch the grey area for black and white and implement
> a coding style guideline for this?

Seems worth doing. I like protectedXXX a lot better than "protector" or "protect". Or "self" ;-)