Summary: | webdatabase: Refactor DatabaseContext lookup to be thread-safe | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Mark Lam <mark.lam> | ||||||||||||
Component: | WebCore Misc. | Assignee: | Mark Lam <mark.lam> | ||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||
Severity: | Normal | CC: | abarth, alancutter, beidson, buildbot, dcarney, dglazkov, ggaren, jochen, levin, levin+threading, ojan.autocc, rniwa, sam, webkit-ews, webkit.review.bot | ||||||||||||
Priority: | P2 | ||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||
Hardware: | Unspecified | ||||||||||||||
OS: | Unspecified | ||||||||||||||
Bug Depends on: | |||||||||||||||
Bug Blocks: | 107475 | ||||||||||||||
Attachments: |
|
Description
Mark Lam
2013-01-23 22:01:57 PST
Created attachment 184400 [details]
The patch.
Uploading the patch for EWS bot testing. I'm not requesting a review yet because I still need to give a little thought to the life-cycle management of Database objects. This patch ties the life-cycle of DatabaseContexts to the life-cycle of Databases. This is the right thing to do. However, I need to do due diligence to make sure that the life-cycle of Databases are managed appropriately so that there is no leak.
> We will fix this by changing DatabaseContext to not be a Supplement of ScriptExecutionContext. Instead, we'll manage the map from ScriptExecutionContext to DatabaseContext in the DatabaseManager, and guard it with a lock. This allows us to get the thread-safe access we need for mapping ScriptExecutionContexts to DatabaseContexts without changing the owner-thread only usage requirement of Supplement mechanism.
Ok.
What you've written sounds reasonable. I suspect that we could come up with a better design if we wanted to do more surgery, but that's likely a big project.
I seem to remember that jochen and dcarney were working on a bug about the lifetime of databases. I've CCed them in case they have some insights.
Comment on attachment 184400 [details] The patch. Attachment 184400 [details] did not pass qt-ews (qt): Output: http://queues.webkit.org/results/16082511 Comment on attachment 184400 [details] The patch. Attachment 184400 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/16074591 Comment on attachment 184400 [details] The patch. Attachment 184400 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://queues.webkit.org/results/16015034 Created attachment 184408 [details]
Patch + fix for release builds.
Comment on attachment 184408 [details] Patch + fix for release builds. Attachment 184408 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/16078671 New failing tests: fast/repaint/selection-clear.html Comment on attachment 184408 [details]
Patch + fix for release builds.
I think there is no implications with the DatabaseContext being tied to the Database life-cycle. The Database live-ness is determined by the JS garbage collector and destructs only when it is no longer needed.
However, I'm obsoleting this patch for 2 reasons:
1. I discovered a bug in the handling of the DatabaseThread (owned by the DatabaseContext). The context needs to shut it down on destruction. I have a fix for this already.
2. There seems to be some regressions in the layout tests when run with a debug build. This doesn't seem to affect release builds. I need to investigate this a little more.
(In reply to comment #8) > (From update of attachment 184408 [details]) > I think there is no implications with the DatabaseContext being tied to the Database life-cycle. The Database live-ness is determined by the JS garbage collector and destructs only when it is no longer needed. The DatabaseThread has a RefPtr to all Databases that were opened during the page's lifetime. Even if JS drops the last reference to the Database, it won't be deleted as long as the DatabaseThread is alive :-/ See https://bugs.webkit.org/show_bug.cgi?id=68303 for a repro and a patch we proposed (but nobody liked). > > However, I'm obsoleting this patch for 2 reasons: > 1. I discovered a bug in the handling of the DatabaseThread (owned by the DatabaseContext). The context needs to shut it down on destruction. I have a fix for this already. > 2. There seems to be some regressions in the layout tests when run with a debug build. This doesn't seem to affect release builds. I need to investigate this a little more. (In reply to comment #9) > (In reply to comment #8) > > (From update of attachment 184408 [details] [details]) > > I think there is no implications with the DatabaseContext being tied to the Database life-cycle. The Database live-ness is determined by the JS garbage collector and destructs only when it is no longer needed. > > The DatabaseThread has a RefPtr to all Databases that were opened during the page's lifetime. Even if JS drops the last reference to the Database, it won't be deleted as long as the DatabaseThread is alive :-/ > > See https://bugs.webkit.org/show_bug.cgi?id=68303 for a repro and a patch we proposed (but nobody liked). Thanks for the tip. This issue will certainly get in the way of my implementation. I will investigate. Created attachment 184687 [details] The final patch. In addition to the work that has already been written up in the bug description, this final patch makes DatabaseContext a sub-class of ContextDestructionObserver. This allows it to be notified when the ScriptExecutionContext is destructed, and trigger its own shutdown as well as the shutdown of its DatabaseThread and close all the Database instances held by the thread. Previously, I thought that it is sufficient to depend on garbage collection of the Database instances to trigger the shutdown of the DatabaseContext and DatabaseThread. But Jochen pointed out that the Database instances is currently kept alive by the DatabaseThread itself (https://bugs.webkit.org/show_bug.cgi?id=68303). After thinking thru the issue some more, I decided that fixing 68303 is beyond the scope of this bug. I'm also not convinced yet that the solution proposed in the 68303 patch is the best way to go (I haven't had time to digest it properly yet). I'll leave that for a later time. Also, previously, I mentioned that I saw some regression in the layout tests with debug builds. They were actually not failures, but stderr diffs in the console out, but were consistently manifesting when I ran a debug build with this patch applied (in contrast with the baseline). It turned out that the issue there is due to https://bugs.webkit.org/show_bug.cgi?id=79013 which Adam provided a fix for. Once I incorporate Adam's fix, the stderr diffs went away completely. I should point out that with this patch, I also cleaned up the shutdown code of DatabaseContext and DatabaseThread. They are still only shutdown when their associated ScriptExecutionContext is destructed. However, they are now guaranteed to be cleaned up. I'm not completely sure, but there may have been a bug before where the DatabaseThread (and its databases) associated with the MainThread may be leaked if stopDatabases() wasn't called by someone external to the database system (because I couldn't find it in the code). I've changed the code so that the thread shutdown is done if it is not already triggered by someone else before the DatabaseContext is destructed. And I did some testing with some printfs to ensure that the shutdown did indeed occurred as expected with this patch. The patch is now ready for a review please. Thanks. > In addition to the work that has already been written up in the bug description, this final patch makes DatabaseContext a sub-class of ContextDestructionObserver. This allows it to be notified when the ScriptExecutionContext is destructed, and trigger its own shutdown as well as the shutdown of its DatabaseThread and close all the Database instances held by the thread.
That's unlikely to be correct. Triggering shutdown via ContextDestructionObserver doesn't work if JavaScript can add a reference from some object retained by DatabaseContext to some object that retains ScriptExecutionContext. Instead, you'll probably need to use ActiveDOMObject, whose stop() notification is called at a predictable time.
Comment on attachment 184687 [details] The final patch. View in context: https://bugs.webkit.org/attachment.cgi?id=184687&action=review > Source/WebCore/Modules/webdatabase/AbstractDatabase.cpp:177 > +AbstractDatabase::AbstractDatabase(RefPtr<DatabaseContext>& databaseContext, const String& name, const String& expectedVersion, To pass a reference to an object, use PassRefPtr<T>, rather than RefPtr<T>&. This reduces refcount churn and is idiomatic. If you have a RefPtr<T> x, "x.release()" will set x to 0 and return a PassRefPtr<T>. Comment on attachment 184687 [details]
The final patch.
New patch with fixes coming soon.
Created attachment 185158 [details]
the final final patch?
Addressed Adam's comment: DatabaseContext now implements ActiveDOMObject as opposed to ContextDestructionObserver.
Addressed Geoff's comment: The DatabaseContext ref is now being passed as a PassRefPtr instead of a RefPtr&.
Also fixed some race condition bugs pertaining the shutdown process of the DatabaseContext. These were encountered while teasing the changes. The shutdown code is now homogenized. The triggers sources for shutdown are:
1. the destructor: i.e. when all the Databases in this context has been GC'ed, and the ScriptExecutionContext may or may not be still alive.
2. ActiveDOMObject::stop(): called for shutting down ActiveDOMObjects before the ScriptExecutionContext goes away. The ScriptExecutionContext is still alive.
3. ContextDestructionObserver ::contextDestroyed(): this needed in case the ScriptExecutionContext dies before the DatabaseContext (which is kept alive by Database objects that yet to be GC'ed). We need to detach from the ScriptExecutionContext here. To be robust, we also shutdown the databases here in case the client did not call ActiveDOMObject::stop() before hand.
4. stopDatabases(): can be called by the client directly e.g. when frame loading gets aborted. While we could just leave the DatabaseContext and its databases around until the ScriptExecutionContext dies, I left this in there so that we can clean up the databases sooner.
All of these paths now go thru / calls stopDatabases() to initiate orderly shutdown.
Comment on attachment 185158 [details]
the final final patch?
May I have a review please? Thanks.
Created attachment 185182 [details]
svn up'ed to fix patch apply problem.
Comment on attachment 185182 [details] svn up'ed to fix patch apply problem. Attachment 185182 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://queues.webkit.org/results/16151213 New failing tests: fast/workers/worker-lifecycle.html (In reply to comment #18) > (From update of attachment 185182 [details]) > Attachment 185182 [details] did not pass mac-wk2-ews (mac-wk2): > Output: http://queues.webkit.org/results/16151213 > > New failing tests: > fast/workers/worker-lifecycle.html fast/workers/worker-lifecycle.html also fails in the baseline before this patch is applied. So, it is not an issue with this patch. Comment on attachment 185182 [details] svn up'ed to fix patch apply problem. View in context: https://bugs.webkit.org/attachment.cgi?id=185182&action=review r- because I think DatabaseContext::contextDestroyed() is a bug. The rest of my comments here would be good to resolve before landing, but don't stand in the way of an r+. > Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:55 > + DatabaseManager::manager().notifyDatabaseContextConstructed(); I prefer for notifications to specify whether the event in question happened already, or is about to happen. In this case, I think we want to notify when construction has happened already, and when destruction is about to happen. Other options risk our client dereferencing an object before or after its valid state. I'd suggest moving this call to below the initializing call to suspendIfNeeded(), since this object is not fully constructed until after that call. Then, I would rename the notifications to "didConstructDatabaseContext" and "willDestructDatabaseContext". > Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:65 > + // Make sure that the DatabaseThread is stopped before we destruct. > + stopDatabases(); The comment here duplicates what the code says almost verbatim, increasing total reading time for these two lines by about 3X. I think you should remove it. > Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:70 > + // This is because the Databases that are managed by DatabaseThread still > + // relies on this ref between the context and the thread to execute the Grammar: "relies" should be "rely". > Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:77 > if (m_databaseThread) { > ASSERT(m_databaseThread->terminationRequested()); > m_databaseThread = 0; > } This code is a no-op. Clearing is the default behavior of the RefPtr destructor. I would suggest this instead: stopDatabases(); ASSERT(!m_databaseThread || m_databaseThread->terminationRequested()); Your comment about why stopDatabases() can't clear our m_databaseThread reference should move into stopDatabases(). > Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:95 > + // However, we're here because the destructor hasn't been called, and the > + // ScriptExecutionContext we're associated with is about to be destructed. I don't understand this comment or code. m_scriptExecutionContext is a RefPtr, so it should be impossible for this function to be called while m_scriptExecutionContext is non-NULL. What am I missing? > Source/WebCore/Modules/webdatabase/DatabaseManager.h:65 > + PassRefPtr<DatabaseContext> getDatabaseContext(ScriptExecutionContext*); We use the "get" prefix to indicate an out parameter. I'd rename this to "databaseContext" or "databaseContextFor". > Source/WebCore/Modules/webdatabase/DatabaseManager.h:127 > + PassRefPtr<DatabaseContext> getExistingDatabaseContext(ScriptExecutionContext*); We use the "get" prefix to indicate an out parameter. I'd rename this to "existingDatabaseContext" or "existingDatabaseContextFor". > Source/WebCore/Modules/webdatabase/DatabaseManager.h:129 > + typedef HashMap<ScriptExecutionContext*, DatabaseContext*, PtrHash<ScriptExecutionContext*> > ContextMap; PtrHash is the default hash for pointers, and doesn't need to be specified. > Source/WebCore/Modules/webdatabase/DatabaseManager.h:139 > +#if !ASSERT_DISABLED > + int m_databaseContextRegisteredCount; > + int m_databaseContextInstanceCount; > +#endif You should comment that these data members require locking m_contextMapLock. > Source/WebCore/Modules/webdatabase/DatabaseThread.cpp:58 > + // care of ensuring that a termination request has been issue. The Grammar: "issue" should be "issued". > Source/WebCore/Modules/webdatabase/DatabaseThread.cpp:60 > + // termination request will trigger an orderly shutdown of the thread, and > + // then deref itself. And then what will deref itself? The request, or the thread? > Source/WebCore/dom/ActiveDOMObject.cpp:60 > + // m_scriptExecutionContext would/should have been nullify by Grammar: "nullify" should be "nullified". Comment on attachment 185182 [details] svn up'ed to fix patch apply problem. View in context: https://bugs.webkit.org/attachment.cgi?id=185182&action=review >> Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:55 >> + DatabaseManager::manager().notifyDatabaseContextConstructed(); > > I prefer for notifications to specify whether the event in question happened already, or is about to happen. > > In this case, I think we want to notify when construction has happened already, and when destruction is about to happen. Other options risk our client dereferencing an object before or after its valid state. > > I'd suggest moving this call to below the initializing call to suspendIfNeeded(), since this object is not fully constructed until after that call. Then, I would rename the notifications to "didConstructDatabaseContext" and "willDestructDatabaseContext". Fixed. >> Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:65 >> + stopDatabases(); > > The comment here duplicates what the code says almost verbatim, increasing total reading time for these two lines by about 3X. I think you should remove it. Removed. >> Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:70 >> + // relies on this ref between the context and the thread to execute the > > Grammar: "relies" should be "rely". Fixed. >> Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:77 >> } > > This code is a no-op. Clearing is the default behavior of the RefPtr destructor. I would suggest this instead: > > stopDatabases(); > ASSERT(!m_databaseThread || m_databaseThread->terminationRequested()); > > Your comment about why stopDatabases() can't clear our m_databaseThread reference should move into stopDatabases(). Good point. Thanks for pointing that out. This code was inherited, but I'll delete it. The comment is moved to stopDatabases(). >> Source/WebCore/Modules/webdatabase/DatabaseContext.cpp:95 >> + // ScriptExecutionContext we're associated with is about to be destructed. > > I don't understand this comment or code. m_scriptExecutionContext is a RefPtr, so it should be impossible for this function to be called while m_scriptExecutionContext is non-NULL. What am I missing? m_scriptExecutionContext came from ContextDestructionObserver and is just a raw pointer, not a RefPtr. The whole purpose of contextDestroyed() is to inform the ContextDestructionObserver (which is the DatabaseContext in this case) that the ScriptExecutionContext is going down. Since the DatabaseContext can live on pass the ScriptExecutionContext, we need to detach from it. >> Source/WebCore/Modules/webdatabase/DatabaseManager.h:65 >> + PassRefPtr<DatabaseContext> getDatabaseContext(ScriptExecutionContext*); > > We use the "get" prefix to indicate an out parameter. I'd rename this to "databaseContext" or "databaseContextFor". OK. Renamed to databaseContextFor. >> Source/WebCore/Modules/webdatabase/DatabaseManager.h:127 >> + PassRefPtr<DatabaseContext> getExistingDatabaseContext(ScriptExecutionContext*); > > We use the "get" prefix to indicate an out parameter. I'd rename this to "existingDatabaseContext" or "existingDatabaseContextFor". Renamed to existingDatabaseContextFor. >> Source/WebCore/Modules/webdatabase/DatabaseManager.h:129 >> + typedef HashMap<ScriptExecutionContext*, DatabaseContext*, PtrHash<ScriptExecutionContext*> > ContextMap; > > PtrHash is the default hash for pointers, and doesn't need to be specified. Fixed. >> Source/WebCore/Modules/webdatabase/DatabaseManager.h:139 >> +#endif > > You should comment that these data members require locking m_contextMapLock. Grouped the fields together and added the comment. >> Source/WebCore/Modules/webdatabase/DatabaseThread.cpp:58 >> + // care of ensuring that a termination request has been issue. The > > Grammar: "issue" should be "issued". Typo. Fixed. >> Source/WebCore/Modules/webdatabase/DatabaseThread.cpp:60 >> + // then deref itself. > > And then what will deref itself? The request, or the thread? The databaseThread() thread function will deref the DatabaseThread object before returning. I conflate the function and the object as one because they work together. I've re-written the comment to be more clear about this. >> Source/WebCore/dom/ActiveDOMObject.cpp:60 >> + // m_scriptExecutionContext would/should have been nullify by > > Grammar: "nullify" should be "nullified". Thanks. Typo again. Fixed. Landed in r141166: <http://trac.webkit.org/changeset/141166>. r141166 seems to have introduced some thread related assert failures on the Linux debug test bot. http://build.chromium.org/p/chromium.webkit/builders/WebKit%20Linux%20%28dbg%29/builds/5080 http://build.chromium.org/f/chromium/layout_test_results/WebKit_Linux__dbg_/results/layout-test-results/fast/workers/storage/change-version-sync-crash-log.txt ASSERTION FAILED: m_verifier.isSafeToUse() (In reply to comment #23) > r141166 seems to have introduced some thread related assert failures on the Linux debug test bot. > > http://build.chromium.org/p/chromium.webkit/builders/WebKit%20Linux%20%28dbg%29/builds/5080 > http://build.chromium.org/f/chromium/layout_test_results/WebKit_Linux__dbg_/results/layout-test-results/fast/workers/storage/change-version-sync-crash-log.txt > ASSERTION FAILED: m_verifier.isSafeToUse() The bug for this is https://bugs.webkit.org/show_bug.cgi?id=108285. (In reply to comment #24) > (In reply to comment #23) > > r141166 seems to have introduced some thread related assert failures on the Linux debug test bot. > > > > http://build.chromium.org/p/chromium.webkit/builders/WebKit%20Linux%20%28dbg%29/builds/5080 > > http://build.chromium.org/f/chromium/layout_test_results/WebKit_Linux__dbg_/results/layout-test-results/fast/workers/storage/change-version-sync-crash-log.txt > > ASSERTION FAILED: m_verifier.isSafeToUse() > > The bug for this is https://bugs.webkit.org/show_bug.cgi?id=108285. Note that this indicates a thread safety issue. This likely is an issue for the OSX port as well. It is just that the OSX port decided to turn off this assert everywhere so it doesn't show up. (In reply to comment #25) > Note that this indicates a thread safety issue. This likely is an issue for the OSX port as well. It is just that the OSX port decided to turn off this assert everywhere so it doesn't show up. I'm working on it. |