Bug 55919

Summary: Yet another multi-threading bug in WebSQLDatabase.
Product: WebKit Reporter: Michael Nordman <michaeln>
Component: WebCore Misc.Assignee: Michael Nordman <michaeln>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, commit-queue, dimich, levin, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 56229    
Bug Blocks:    
Attachments:
Description Flags
fix
dimich: review-
box terminology
none
wrapper
none
wrapper2
none
wrapper
levin: review-
wrapper
levin: review+
wrapper levin: review+, commit-queue: commit-queue-

Michael Nordman
Reported 2011-03-07 19:52:07 PST
See http://code.google.com/p/chromium/issues/detail?id=71026 SQLTransaction and SQLStatement keep RefPtrs to javascript callbacks, which hold onto the tree nodes. We shouldn't release callbacks or tree nodes on a database thread.
Attachments
fix (15.14 KB, patch)
2011-03-07 20:02 PST, Michael Nordman
dimich: review-
box terminology (15.24 KB, patch)
2011-03-08 11:54 PST, Michael Nordman
no flags
wrapper (20.93 KB, patch)
2011-03-08 19:16 PST, Michael Nordman
no flags
wrapper2 (20.98 KB, patch)
2011-03-09 12:28 PST, Michael Nordman
no flags
wrapper (23.07 KB, patch)
2011-03-09 15:08 PST, Michael Nordman
levin: review-
wrapper (23.30 KB, patch)
2011-03-09 20:45 PST, Michael Nordman
levin: review+
wrapper (23.44 KB, patch)
2011-03-11 12:28 PST, Michael Nordman
levin: review+
commit-queue: commit-queue-
Michael Nordman
Comment 1 2011-03-07 20:02:06 PST
Created attachment 85012 [details] fix Wdyt? I can put the new class in a separate file if need be.
Dmitry Titov
Comment 2 2011-03-07 22:31:25 PST
Comment on attachment 85012 [details] fix View in context: https://bugs.webkit.org/attachment.cgi?id=85012&action=review Thanks for doing this! Some notes: > Source/WebCore/storage/SQLStatement.cpp:170 > + RefPtr<SQLStatementErrorCallback> errorCallback = m_statementErrorCallback.release(); This looks a bit confusing. The naming is such as if both 'callback' objects are of the same type, and the only thing that goes on is refcount manipulations. Perhaps m_statementErrorCallback -> m_statementErrorCallbackBox or some other name that would make the need for .release() (or unbox()?) obvious. Otherwise code looks like strange assignment of RefPtrs. > Source/WebCore/storage/SQLStatement.h:52 > +template<typename T> class SQLCallbackReference { Indeed could be nice to have a file for it. > Source/WebCore/storage/SQLStatement.h:90 > + bool has() const bool operator() could be clearer here... > Source/WebCore/storage/SQLStatement.h:104 > + mutable Mutex m_mutex; Hmm, why do we need a mutex in this class? We didn't have one before, and it seems posting the task to the context thread for destruction is all we need. Or, is there more then one issue this patch is fixing? > Source/WebCore/storage/SQLStatement.h:106 > + RefPtr<ScriptExecutionContext> m_scriptExecutionContent; m_scriptExecutionContent -> m_scriptExecutionContext
Michael Nordman
Comment 3 2011-03-08 09:43:00 PST
Thanks for looking. (In reply to comment #2) > > Source/WebCore/storage/SQLStatement.cpp:170 > > + RefPtr<SQLStatementErrorCallback> errorCallback = m_statementErrorCallback.release(); > > This looks a bit confusing. The naming is such as if both 'callback' objects are of the same type... The 'box' term works for me. > > Source/WebCore/storage/SQLStatement.h:52 > > +template<typename T> class SQLCallbackReference { > > Indeed could be nice to have a file for it. Definitely want to get the class name settled before updating dozens of make files to add the .h file. > > Source/WebCore/storage/SQLStatement.h:90 > > + bool has() const > > bool operator() could be clearer here... sgtm > > Source/WebCore/storage/SQLStatement.h:104 > > + mutable Mutex m_mutex; > > Hmm, why do we need a mutex in this class? We didn't have one before, and it seems posting the task to the context thread for destruction is all we need. > Or, is there more then one issue this patch is fixing? Multiple threads can test and clearing the reference. I couldn't convince myself that a mutex wasn't needed, and now it's two data members that are being cleared instead of one. Can you convince me that we don't need one? > > Source/WebCore/storage/SQLStatement.h:106 > > + RefPtr<ScriptExecutionContext> m_scriptExecutionContent; > > m_scriptExecutionContent -> m_scriptExecutionContext Doh... thnx!
Michael Nordman
Comment 4 2011-03-08 11:54:57 PST
Created attachment 85074 [details] box terminology
Michael Nordman
Comment 5 2011-03-08 12:00:14 PST
Comment on attachment 85074 [details] box terminology How's SQLCallbackBox for a class name and unbox() for the method of interest? > Source/WebCore/ChangeLog:17 > + (WebCore::SQLCallbackReference::SQLCallbackReference): oh... i'll have to update the ChangeLog too
Dmitry Titov
Comment 6 2011-03-08 13:00:09 PST
Comment on attachment 85074 [details] box terminology View in context: https://bugs.webkit.org/attachment.cgi?id=85074&action=review Looks good. > Source/WebCore/storage/SQLStatement.h:51 > +// SQLStatement and SQLTransaction on the proper thread. Lets add a comment to this class describing all 3 ways the 'boxed' reference can be dereferenced: - by destructing enclosing object - on either thread - by calling clear() - on either thread - by unboxing and then dereferencing normally - on context thread only)
Dmitry Titov
Comment 7 2011-03-08 13:01:22 PST
Oh, and lets remove taking a lock in bool operators...
Michael Nordman
Comment 8 2011-03-08 13:08:04 PST
Thnx... i still need to do the make file changes... any bets on how many builds i can break by adding a single .h file to the project :) > Source/WebCore/storage/SQLStatement.h:98 > + MutexLocker locker(m_mutex); Right... locking the mutex in the bool operators looks pointless, i think we can remove them too. Q: RefPtr and PassRefPtr use a trick discourage mis-using operator bool(), should that trick be used here?
Dmitry Titov
Comment 9 2011-03-08 14:59:32 PST
Comment on attachment 85074 [details] box terminology Removing flag for now, reflecting offline discussion: - it's good to have a separate file for this new class - lets go back from bool operators to hasCallback() (w/o lock)
Michael Nordman
Comment 10 2011-03-08 15:38:00 PST
(In reply to comment #9) > (From update of attachment 85074 [details]) > Removing flag for now, reflecting offline discussion: > - it's good to have a separate file for this new class > - lets go back from bool operators to hasCallback() (w/o lock) Sounds good... when i figure out which dozen makefiles to change and how... i'll post a new patch.
Michael Nordman
Comment 11 2011-03-08 19:16:22 PST
Created attachment 85124 [details] wrapper Made changes as discussed. I still have to poke at more make files to add the new .h to them, but the code changes should be settled now. I'm on take three for a class name so far, any better suggestions would be welcome.
Michael Nordman
Comment 12 2011-03-08 19:19:31 PST
Comment on attachment 85124 [details] wrapper View in context: https://bugs.webkit.org/attachment.cgi?id=85124&action=review > Source/WebCore/storage/SQLCallbackWrapper.h:93 > + mutable Mutex m_mutex; this doesn't have to be mutable anymore
Michael Nordman
Comment 13 2011-03-09 12:28:01 PST
Created attachment 85216 [details] wrapper2
WebKit Review Bot
Comment 14 2011-03-09 12:30:03 PST
Attachment 85216 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/storage/SQLCallbackWrapper.h:28: #ifndef header guard has wrong style, please use: WTF_SQLCallbackWrapper_h [build/header_guard] [5] Total errors found: 1 in 7 files If any of these errors are false positives, please file a bug against check-webkit-style.
David Levin
Comment 15 2011-03-09 12:36:52 PST
(In reply to comment #14) > Attachment 85216 [details] did not pass style-queue: > > Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 > > Source/WebCore/storage/SQLCallbackWrapper.h:28: #ifndef header guard has wrong style, please use: WTF_SQLCallbackWrapper_h [build/header_guard] [5] The error is real. The suggestion is wrong.
Michael Nordman
Comment 16 2011-03-09 12:54:42 PST
(In reply to comment #15) > (In reply to comment #14) > > Attachment 85216 [details] [details] did not pass style-queue: > > > > Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 > > > > Source/WebCore/storage/SQLCallbackWrapper.h:28: #ifndef header guard has wrong style, please use: WTF_SQLCallbackWrapper_h [build/header_guard] [5] > > The error is real. The suggestion is wrong. Yes, i see it. Still have poked at most of the make files, but i wanted to see what the EWS bots thought about this much.
Michael Nordman
Comment 17 2011-03-09 15:08:25 PST
Created attachment 85244 [details] wrapper fixed the header guard and added the .h file to various project/make fiels.
David Levin
Comment 18 2011-03-09 15:32:43 PST
Comment on attachment 85244 [details] wrapper View in context: https://bugs.webkit.org/attachment.cgi?id=85244&action=review Just a few comments, so r- for now (but perhaps after a bit of back and forth I could be convinced to do r+ with some changes on check in). > Source/WebCore/storage/SQLCallbackWrapper.h:72 > + context->postTask(createCallbackTask(&safeRelease, m_callback.release().leakRef())); Personally, I would prefer if postTask were done outside of the mutex (since it can be as long as m_callback.release().leakRef() is stored in a local). > Source/WebCore/storage/SQLCallbackWrapper.h:77 > + ASSERT(!m_callback || m_scriptExecutionContext->isContextThread()); Doesn't the assert need to be inside of the locker? > Source/WebCore/storage/SQLCallbackWrapper.h:83 > + bool hasCallback() const { return m_callback; } I don't understand how this method could be useful unless there is some implied locking mechanism at a higher level which ensure that it isn't called if clear or unwrap may be called on another thread. Given that clear and unwrap both have mutex guards in them, it implies that they may be called on another thread so this seems concerning. btw, I'm not arguing for a mutex here as that doesn't fix the problem that I'm talking about. It feels like this needs a comment in the code because something tricky is happening here (and in general this method feels like it is prone to leading people to write broken code).
Michael Nordman
Comment 19 2011-03-09 16:00:20 PST
(In reply to comment #18) > (From update of attachment 85244 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=85244&action=review > > Just a few comments, so r- for now (but perhaps after a bit of back and forth I could be convinced to do r+ with some changes on check in). > > > Source/WebCore/storage/SQLCallbackWrapper.h:72 > > + context->postTask(createCallbackTask(&safeRelease, m_callback.release().leakRef())); > > Personally, I would prefer if postTask were done outside of the mutex > (since it can be as long as m_callback.release().leakRef() is stored in a local). If you think it matters, i could change it, although that would decrease readability a little. > > Source/WebCore/storage/SQLCallbackWrapper.h:77 > > + ASSERT(!m_callback || m_scriptExecutionContext->isContextThread()); > > Doesn't the assert need to be inside of the locker? Yup... good point. > > Source/WebCore/storage/SQLCallbackWrapper.h:83 > > + bool hasCallback() const { return m_callback; } > > I don't understand how this method could be useful unless there is some implied locking mechanism at a higher level which ensure that it isn't called if clear or unwrap may be called on another thread. > > Given that clear and unwrap both have mutex guards in them, it implies that they may be called on another thread so this seems concerning. > > btw, I'm not arguing for a mutex here as that doesn't fix the problem that I'm talking about. It feels like this needs a comment in the code because something tricky is happening here (and in general this method feels like it is prone to leading people to write broken code). Dimich and chatted about the lock in this method and we decided to remove it as it doesn't actually accomplish anything. This method is used for optimizations in the database system to avoid some round trips to the context thread and back to the database thread. The usage is correct albeit fragile. The entire db system if fragile, i'm just trying to fix the particular problem of deref'ing the callbacks on the wrong thread w/o tackling the larger problems with the db system.
David Levin
Comment 20 2011-03-09 16:09:30 PST
(In reply to comment #19) > (In reply to comment #18) > > (From update of attachment 85244 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=85244&action=review > > > > Just a few comments, so r- for now (but perhaps after a bit of back and forth I could be convinced to do r+ with some changes on check in). > > > > > Source/WebCore/storage/SQLCallbackWrapper.h:72 > > > + context->postTask(createCallbackTask(&safeRelease, m_callback.release().leakRef())); > > > > Personally, I would prefer if postTask were done outside of the mutex > > (since it can be as long as m_callback.release().leakRef() is stored in a local). > > If you think it matters, i could change it, although that would decrease readability a little. I had to spend a few minutes time convincing myself it was safe because the postTask also takes a mutex (but fortunately this should be in a MessageQueue which will only take the mutex around queue and dequeue operations so there shouldn't be any possible lock order inversion happening) so it is ok as is but it feels mildly fragile. > > > > Source/WebCore/storage/SQLCallbackWrapper.h:83 > > > + bool hasCallback() const { return m_callback; } > > Dimich and chatted about the lock in this method and we decided to remove it as it doesn't actually accomplish anything. This method is used for optimizations in the database system to avoid some round trips to the context thread and back to the database thread. The usage is correct albeit fragile. I was thinking of when someone would use this method and it seemed that immediately after it returned the result could be wrong -- a lock wouldn't help that. Upon further reflection, this will all work as long as one always checks the return value from unwrap which it looks like the code always does (and this method is only used as a quick optimization check).
Dmitry Titov
Comment 21 2011-03-09 16:30:07 PST
We might add comment to hasCallback() that this method can only be used for optimization in threaded environment, since David is the second already who spotted potential trouble here. Of course, if one would do a more substantial redesign of Database code, this kind of things would go. It is only there because existing code does the same type of check, and so far it doesn't seem to cause a problem.
Michael Nordman
Comment 22 2011-03-09 20:45:02 PST
Created attachment 85280 [details] wrapper made the changes we talked about
David Levin
Comment 23 2011-03-10 11:42:40 PST
Comment on attachment 85280 [details] wrapper View in context: https://bugs.webkit.org/attachment.cgi?id=85280&action=review Two nits. If you fixed them before check in, that would be nice, but the check in could go in as is. > Source/WebCore/ChangeLog:7 > + which holds those referenc whose dtor will schedule the release of those references on the ScriptExecution thread. typo: referenc > Source/WebCore/storage/SQLCallbackWrapper.h:75 > + callback = m_callback.release().leakRef(); extra space after =
David Levin
Comment 24 2011-03-10 11:49:14 PST
fyi dimich was ok with this r+ as well.
Michael Nordman
Comment 25 2011-03-11 12:28:55 PST
Created attachment 85511 [details] wrapper i've fixed up the typos david mentioned and just syncd/merged the project files... so i'd like to throw this patch at the commit queue now
WebKit Commit Bot
Comment 26 2011-03-11 12:34:40 PST
Comment on attachment 85511 [details] wrapper Rejecting attachment 85511 [details] from commit-queue. Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=eseidel-cq-sf', 'ap..." exit_code: 2 Last 500 characters of output: Core/WebCore.vcproj/WebCore.vcproj.rej patching file Source/WebCore/WebCore.xcodeproj/project.pbxproj patching file Source/WebCore/storage/SQLCallbackWrapper.h patching file Source/WebCore/storage/SQLStatement.cpp patching file Source/WebCore/storage/SQLStatement.h patching file Source/WebCore/storage/SQLTransaction.cpp patching file Source/WebCore/storage/SQLTransaction.h Failed to run "[u'/Projects/CommitQueue/Tools/Scripts/svn-apply', u'--reviewer', u'David Levin', u'--force']" exit_code: 1 Full output: http://queues.webkit.org/results/8141387
Michael Nordman
Comment 27 2011-03-11 12:44:30 PST
(In reply to comment #26) > (From update of attachment 85511 [details]) > Rejecting attachment 85511 [details] from commit-queue. > > Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=eseidel-cq-sf', 'ap..." exit_code: 2 > > Last 500 characters of output: > Core/WebCore.vcproj/WebCore.vcproj.rej > patching file Source/WebCore/WebCore.xcodeproj/project.pbxproj > patching file Source/WebCore/storage/SQLCallbackWrapper.h > patching file Source/WebCore/storage/SQLStatement.cpp > patching file Source/WebCore/storage/SQLStatement.h > patching file Source/WebCore/storage/SQLTransaction.cpp > patching file Source/WebCore/storage/SQLTransaction.h > > Failed to run "[u'/Projects/CommitQueue/Tools/Scripts/svn-apply', u'--reviewer', u'David Levin', u'--force']" exit_code: 1 > > Full output: http://queues.webkit.org/results/8141387 whatever... landing manually using webkit-patch land
Adam Barth
Comment 28 2011-03-11 12:45:37 PST
> whatever... landing manually using webkit-patch land Sorry. :( We really need to improve the commit-queue's error messages for these cases.
Michael Nordman
Comment 29 2011-03-11 13:10:16 PST
Michael Nordman
Comment 30 2011-03-11 14:56:26 PST
(In reply to comment #29) > Committed r80876: <http://trac.webkit.org/changeset/80876> bah... some test failures on GTK and XP... change-version.html executesql-accepts-only-one-statement.html null-callbacks.html quota-tracking.html statement-error-callback.html ... all crash, no data on how
Michael Nordman
Comment 31 2011-03-11 15:27:55 PST
(In reply to comment #30) > (In reply to comment #29) > > Committed r80876: <http://trac.webkit.org/changeset/80876> > > bah... some test failures on GTK and XP... > > change-version.html > executesql-accepts-only-one-statement.html > null-callbacks.html > quota-tracking.html > statement-error-callback.html > > ... all crash, no data on how Gee... maybe it's that missing ~ on the dtor :(
Michael Nordman
Comment 32 2011-03-11 16:00:47 PST
Gee... maybe it's that missing ~ on the dtor? http://trac.webkit.org/changeset/80897
Note You need to log in before you can comment on or make changes to this bug.