Modern IDB: IDBRequests that are stop()'ed don't notify the IDBServer of that fact Scenario: A ScriptExecutionContext has a pending IDBOpenDBRequest that is blocked because there's already another connection to the UniqueIDBDatabase. The IDBServer is waiting to hear back from this request. Then the user navigates away from the page (in the Document case) or the Worker thread is terminated (in the WorkerGlobalScope case). The server will prevent all future access to that UniqueIDBDatabase because it doesn't know that IDBOpenDBRequest has been torn down. This is definitely preventing us from enabled all existing Worker tests, and is likely the cause of at least *some* of the flakiness tracked in https://bugs.webkit.org/show_bug.cgi?id=154968
There's (3) ActiveDOMObjects in play in the IDB family of objects. 1 - IDBDatabases already close() themselves when the ScriptExecutionContext stops. 2 - IDBTransactions already abort() themselves when the context stops. When transactions abort, that also happens to clean up *most* of the queued up IDBRequests associated with that transaction. 3 - IDBRequests do *not* do any cleanup when they are stop()'ed. As mentioned above in #2, this is usually not an issue for most requests. But for IDBOpenDBRequests that have not yet resulted in an open IDBDatabase connection or version change IDBTransaction, they leave the IDBServer in a bad state. To be conservative to start (and not risk flooding the IDBServer with cross thread traffic or IPC traffic just because of a page navigation), I'll just implement the cleanup in IDBOpenDBRequest.
Since my code change here will be only about IDBOpenDBRequests, retitling to: Modern IDB: IDBOpenDBRequests that are stop()'ed don't notify the IDBServer of that fact
Created attachment 278551 [details] Patch
Comment on attachment 278551 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=278551&action=review > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:1176 > m_openDatabaseConnections.remove(&connection); This can also be removed. This connection was removed on line 1148.
Comment on attachment 278551 [details] Patch Attachment 278551 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/1300917 Number of test failures exceeded the failure limit.
Created attachment 278560 [details] Archive of layout-test-results from ews106 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews106 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5
Comment on attachment 278551 [details] Patch Attachment 278551 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/1300922 Number of test failures exceeded the failure limit.
Created attachment 278561 [details] Archive of layout-test-results from ews113 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews113 Port: mac-yosemite Platform: Mac OS X 10.10.5
Comment on attachment 278551 [details] Patch Attachment 278551 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/1300960 Number of test failures exceeded the failure limit.
Created attachment 278566 [details] Archive of layout-test-results from ews121 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews121 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.4
Comment on attachment 278551 [details] Patch Attachment 278551 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/1301087 Number of test failures exceeded the failure limit.
Created attachment 278568 [details] Archive of layout-test-results from ews101 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-yosemite Platform: Mac OS X 10.10.5
Well yikes! All of storage/indexeddb passed, but I forgot to give the imported tests a go. Exploring.
Created attachment 278573 [details] Patch
The layout test failures were resolved by a simple and appropriate null check. (In reply to comment #4) > > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:1176 > > m_openDatabaseConnections.remove(&connection); > > This can also be removed. This connection was removed on line 1148. Fixed in the new patch.
Comment on attachment 278573 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=278573&action=review > Source/WTF/wtf/Deque.h:86 > + T& operator[](size_t i) { ASSERT(m_start != m_end); return m_buffer.buffer()[m_start + i]; } > + const T& operator[](size_t i) const { ASSERT(m_start != m_end); return m_buffer.buffer()[m_start + i]; } This isn’t correct. The Deque implementation uses a circular buffer, so there needs to be code that handles the wraparound when m_start + i is >= m_buffer.capacity(). We also should assert that i < size(), not that m_start != m_end. A correct implementation would look more like this: ASSERT(i < size()); ASSERT(m_start <= m_buffer.capacity()); size_t sizeAfterStart = m_buffer.capacity() - m_start; if (i < sizeAfterStart) return m_buffer.buffer()[m_start + i]; return m_buffer.buffer()[i - sizeAfterStart]; > Source/WebCore/Modules/indexeddb/IDBOpenDBRequest.cpp:116 > + connectionProxy().openDBRequestCancelled(IDBRequestData(connectionProxy(), *this)); Would be more elegant to just write it like this: connectionProxy().openDBRequestCancelled({ connectionProxy(), *this }); > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:489 > +#ifndef NDEBUG > + if (m_versionChangeTransaction) { > + ASSERT(m_versionChangeTransaction->databaseConnection().openRequestIdentifier() == requestIdentifier); > + ASSERT(&m_versionChangeTransaction->databaseConnection() == m_versionChangeDatabaseConnection); > + } > +#endif Could we put the "!m_versionChangeTransaction ||" inside the asserts to avoid the #ifndef NDEBUG? > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:499 > + for (size_t i = 0; i < m_pendingOpenDBRequests.size(); ++i) { > + if (m_pendingOpenDBRequests[i]->requestData().requestIdentifier() == requestIdentifier) { > + m_pendingOpenDBRequests.remove(i); > + return; > + } > + } Why does this have to be written with indices instead of with begin/end? If it was written with begin/end, then we would not need to make any changes to Deque.h, right? for (auto iterator = m_pendingOpenDBRequests.begin(); iterator != m_pendingOpenDBRequests.end(); ++iterator) { if (iterator->get().requestData().requestIdentifier() == requestIdentifier) { m_pendingOpenDBRequests.remove(iterator); return; } } But also, seems like we are using the wrong kind of collection if we have to iterate through it like this when canceling. > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:1149 > + RefPtr<UniqueIDBDatabaseConnection> refConnection(&connection); Our name for this is normally protectedConnection. Also, we should use Ref for this, not RefPtr. > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabaseConnection.cpp:76 > + RefPtr<UniqueIDBDatabaseConnection> protector(this); I think protectedThis would be a better name. > Source/WebCore/Modules/indexeddb/shared/InProcessIDBServer.cpp:404 > + RefPtr<InProcessIDBServer> self(this); Should be named protectedThis, not "self", even though other functions in this file call it self.
(In reply to comment #16) > Comment on attachment 278573 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=278573&action=review > > > Source/WTF/wtf/Deque.h:86 > > + T& operator[](size_t i) { ASSERT(m_start != m_end); return m_buffer.buffer()[m_start + i]; } > > + const T& operator[](size_t i) const { ASSERT(m_start != m_end); return m_buffer.buffer()[m_start + i]; } > > This isn’t correct. The Deque implementation uses a circular buffer, so > there needs to be code that handles the wraparound when m_start + i is >= > m_buffer.capacity(). We also should assert that i < size(), not that m_start > != m_end. > > A correct implementation would look more like this: > > ASSERT(i < size()); > ASSERT(m_start <= m_buffer.capacity()); > size_t sizeAfterStart = m_buffer.capacity() - m_start; > if (i < sizeAfterStart) > return m_buffer.buffer()[m_start + i]; > return m_buffer.buffer()[i - sizeAfterStart]; Holy moly you're so right. I thought it was just using a sliding window. Yikes! > > Source/WebCore/Modules/indexeddb/IDBOpenDBRequest.cpp:116 > > + connectionProxy().openDBRequestCancelled(IDBRequestData(connectionProxy(), *this)); > > Would be more elegant to just write it like this: > > connectionProxy().openDBRequestCancelled({ connectionProxy(), *this }); k > > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:489 > > +#ifndef NDEBUG > > + if (m_versionChangeTransaction) { > > + ASSERT(m_versionChangeTransaction->databaseConnection().openRequestIdentifier() == requestIdentifier); > > + ASSERT(&m_versionChangeTransaction->databaseConnection() == m_versionChangeDatabaseConnection); > > + } > > +#endif > > Could we put the "!m_versionChangeTransaction ||" inside the asserts to > avoid the #ifndef NDEBUG? k > > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:499 > > + for (size_t i = 0; i < m_pendingOpenDBRequests.size(); ++i) { > > + if (m_pendingOpenDBRequests[i]->requestData().requestIdentifier() == requestIdentifier) { > > + m_pendingOpenDBRequests.remove(i); > > + return; > > + } > > + } > > Why does this have to be written with indices instead of with begin/end? If > it was written with begin/end, then we would not need to make any changes to > Deque.h, right? In my boneheaded short-sightedness, I saw that Deque didn't have the right iterator types to work with modern C++ for() iteration, and completely didn't think of falling back to our old iterator style. This is what I would do, except... > But also, seems like we are using the wrong kind of collection if we have to > iterate through it like this when canceling. Yup, this is a ListHashSet now. > > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabase.cpp:1149 > > + RefPtr<UniqueIDBDatabaseConnection> refConnection(&connection); > > Our name for this is normally protectedConnection. Also, we should use Ref > for this, not RefPtr. k > > Source/WebCore/Modules/indexeddb/server/UniqueIDBDatabaseConnection.cpp:76 > > + RefPtr<UniqueIDBDatabaseConnection> protector(this); > > I think protectedThis would be a better name. There's a lot more "protector" than "protectedThis" in WebCore, though neither is used an incredible amount. I wonder if we should ditch the grey area for black and white and implement a coding style guideline for this? > > Source/WebCore/Modules/indexeddb/shared/InProcessIDBServer.cpp:404 > > + RefPtr<InProcessIDBServer> self(this); > > Should be named protectedThis, not "self", even though other functions in > this file call it self. I agree that the self()s in this file should be whatever we decide above. I can't in good OCD-consciousness land one in this file that is different than the rest in the file, but I will *gladly* update them all - along with all the other protector/protectedThis instances in WebCore - once we decide what the standard should be.
Created attachment 278588 [details] Patch This gets rid of the WTF/Deque change and makes the m_pendingOpenDBRequests a ListHashSet instead. It also includes most of the other review feedback. While local tests are in-progress as I hit the sack, I'll rely on EWS for this one if anybody can review tonight.
Comment on attachment 278588 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=278588&action=review > Source/WebCore/ChangeLog:3 > + Modern IDB: IDBOpenDBRequests that are stop()'ed don't notify the IDBServer of that fact. stop()'ed > Source/WebCore/ChangeLog:12 > + 1 - When an IDBOpenDBRequest is stop()ed due to page navigation or worker termination, stop()ed. This is inconsistent ;) > Source/WebCore/Modules/indexeddb/IDBRequest.cpp:254 > + // The base IDBRequest class has nothing additional to do here. Isn't that implied by an empty function?
Comment on attachment 278588 [details] Patch Attachment 278588 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/1302130 New failing tests: imported/w3c/web-platform-tests/IndexedDB/transaction-lifetime.htm imported/w3c/web-platform-tests/IndexedDB/idbfactory_open5.htm
Created attachment 278596 [details] Archive of layout-test-results from ews107 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5
(In reply to comment #20) > Comment on attachment 278588 [details] > Patch > > Attachment 278588 [details] did not pass mac-wk2-ews (mac-wk2): > Output: http://webkit-queues.webkit.org/results/1302130 > > New failing tests: > imported/w3c/web-platform-tests/IndexedDB/transaction-lifetime.htm > imported/w3c/web-platform-tests/IndexedDB/idbfactory_open5.htm I cannot reproduce these failures locally. Nor do I see how this patch has any behaviorally meaningful difference from the last patch, which passed EWS.
The crashes are in the Databases process, and both look like: Thread 2 Crashed:: IndexedDatabase Server 0 com.apple.WebCore 0x0000000106d81c29 WebCore::IDBServer::SQLiteIDBBackingStore::filenameForDatabaseName() const + 25 1 com.apple.WebCore 0x0000000106d81d83 WebCore::IDBServer::SQLiteIDBBackingStore::fullDatabasePath() const + 35 2 com.apple.WebCore 0x0000000106d81f47 WebCore::IDBServer::SQLiteIDBBackingStore::getOrEstablishDatabaseInfo(WebCore::IDBDatabaseInfo&) + 279 3 com.apple.WebCore 0x0000000106f4111c WebCore::IDBServer::UniqueIDBDatabase::openBackingStore(WebCore::IDBDatabaseIdentifier const&) + 188 4 com.apple.WebCore 0x00000001064b9f6a WebCore::IDBServer::IDBServer::databaseRunLoop() + 122 5 com.apple.JavaScriptCore 0x00000001058ee962 WTF::threadEntryPoint(void*) + 178 (functional:1742) 6 com.apple.JavaScriptCore 0x00000001058eed7f WTF::wtfThreadEntryPoint(void*) + 15 (memory:2657) 7 libsystem_pthread.dylib 0x00007fff9353e05a _pthread_body + 131 8 libsystem_pthread.dylib 0x00007fff9353dfd7 _pthread_start + 176 9 libsystem_pthread.dylib 0x00007fff9353b3ed thread_start + 13
(In reply to comment #23) > The crashes are in the Databases process, and both look like: > > Thread 2 Crashed:: IndexedDatabase Server > 0 com.apple.WebCore 0x0000000106d81c29 > WebCore::IDBServer::SQLiteIDBBackingStore::filenameForDatabaseName() const + > 25 > 1 com.apple.WebCore 0x0000000106d81d83 > WebCore::IDBServer::SQLiteIDBBackingStore::fullDatabasePath() const + 35 > 2 com.apple.WebCore 0x0000000106d81f47 > WebCore::IDBServer::SQLiteIDBBackingStore:: > getOrEstablishDatabaseInfo(WebCore::IDBDatabaseInfo&) + 279 > 3 com.apple.WebCore 0x0000000106f4111c > WebCore::IDBServer::UniqueIDBDatabase::openBackingStore(WebCore:: > IDBDatabaseIdentifier const&) + 188 > 4 com.apple.WebCore 0x00000001064b9f6a > WebCore::IDBServer::IDBServer::databaseRunLoop() + 122 > 5 com.apple.JavaScriptCore 0x00000001058ee962 > WTF::threadEntryPoint(void*) + 178 (functional:1742) > 6 com.apple.JavaScriptCore 0x00000001058eed7f > WTF::wtfThreadEntryPoint(void*) + 15 (memory:2657) > 7 libsystem_pthread.dylib 0x00007fff9353e05a _pthread_body + 131 > 8 libsystem_pthread.dylib 0x00007fff9353dfd7 _pthread_start + 176 > 9 libsystem_pthread.dylib 0x00007fff9353b3ed thread_start + 13 Nothing in there looks suspect, at all. The runloop is calling a task for "UniqueIDBDatabase::openBackingStore" A UniqueIDBDatabase object itself posted that task, which is a CrossThreadTask that keeps the UniqueIDBDatabase alive. So the UniqueIDBDatabase object is fine. Then inside of UniqueIDBDatabase::openBackingStore we create a SQLiteIDBBackingStore and immediately start using it. It's definitely non-null. And looking at what the SQLiteIDBBacking store is doing to end up crashing here... nothing fancy. I'm stumped.
(In reply to comment #22) > (In reply to comment #20) > > Comment on attachment 278588 [details] > > Patch > > > > Attachment 278588 [details] did not pass mac-wk2-ews (mac-wk2): > > Output: http://webkit-queues.webkit.org/results/1302130 > > > > New failing tests: > > imported/w3c/web-platform-tests/IndexedDB/transaction-lifetime.htm > > imported/w3c/web-platform-tests/IndexedDB/idbfactory_open5.htm > > I cannot reproduce these failures locally. Scratch that - I can't reproduce locally when running just subdirectories of IDB tests, but I can reproduce a similar crash locally when running *all* the tests. For me it was imported/w3c/web-platform-tests/IndexedDB/idbfactory_open6.htm instead of open5. Curious.
(In reply to comment #25) > (In reply to comment #22) > > (In reply to comment #20) > > > Comment on attachment 278588 [details] > > > Patch > > > > > > Attachment 278588 [details] did not pass mac-wk2-ews (mac-wk2): > > > Output: http://webkit-queues.webkit.org/results/1302130 > > > > > > New failing tests: > > > imported/w3c/web-platform-tests/IndexedDB/transaction-lifetime.htm > > > imported/w3c/web-platform-tests/IndexedDB/idbfactory_open5.htm > > > > I cannot reproduce these failures locally. > > > Scratch that - I can't reproduce locally when running just subdirectories of > IDB tests, but I can reproduce a similar crash locally when running *all* > the tests. > > For me it was imported/w3c/web-platform-tests/IndexedDB/idbfactory_open6.htm > instead of open5. > > Curious. Never mind - I saw a database process crash, but it was *not* the same. However, I'm running a debug build and got an assertion, so maybe the root cause is the same and I'm just seeing debug build symptoms instead of release build symptoms.
Indeed, the ASSERT I see is as follows: void UniqueIDBDatabase::didDeleteBackingStore(uint64_t deletedVersion) { ... ASSERT(m_openDatabaseConnections.isEmpty()); So a database delete is finishing yet there *are* open database connections when it finishes. This is bad, and could feasibly be related to the re-opening of the backing store afterwards with things in a bad state, which could be what the release build crashes are seeing. I'll try to narrow down reproducibility so I can get all sorts of logging and catch in the debugger.
Created attachment 278634 [details] Patch for EWS + landing Previously, m_currentOpenDBRequest hung around much longer, until the end of the actual delete operation, and that prevented other new openDBRequests from happening in the meantime. Now, m_currentOpenDBRequest might be cleared out while a delete operation is in progress, which can then lead to other open requests being handled at the same time. This patch makes one change that explicitly prevents any new openDB requests from being handled while a delete request is in progress, and therefore should restore previous behavior.
Comment on attachment 278634 [details] Patch for EWS + landing Clearing flags on attachment: 278634 Committed r200695: <http://trac.webkit.org/changeset/200695>
(In reply to comment #17) > I saw that Deque didn't have the right > iterator types to work with modern C++ for() iteration I think you mean that we can’t delete something in a modern for loop. Or is it really true that we can’t iterate a Deque with a modern for loop at all? > Yup, this is a ListHashSet now. Great! > There's a lot more "protector" than "protectedThis" in WebCore, though > neither is used an incredible amount. > > I wonder if we should ditch the grey area for black and white and implement > a coding style guideline for this? Seems worth doing. I like protectedXXX a lot better than "protector" or "protect". Or "self" ;-)