Bug 55919 - Yet another multi-threading bug in WebSQLDatabase.
Summary: Yet another multi-threading bug in WebSQLDatabase.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Michael Nordman
URL:
Keywords:
Depends on: 56229
Blocks:
  Show dependency treegraph
 
Reported: 2011-03-07 19:52 PST by Michael Nordman
Modified: 2011-03-11 16:00 PST (History)
5 users (show)

See Also:


Attachments
fix (15.14 KB, patch)
2011-03-07 20:02 PST, Michael Nordman
dimich: review-
Details | Formatted Diff | Diff
box terminology (15.24 KB, patch)
2011-03-08 11:54 PST, Michael Nordman
no flags Details | Formatted Diff | Diff
wrapper (20.93 KB, patch)
2011-03-08 19:16 PST, Michael Nordman
no flags Details | Formatted Diff | Diff
wrapper2 (20.98 KB, patch)
2011-03-09 12:28 PST, Michael Nordman
no flags Details | Formatted Diff | Diff
wrapper (23.07 KB, patch)
2011-03-09 15:08 PST, Michael Nordman
levin: review-
Details | Formatted Diff | Diff
wrapper (23.30 KB, patch)
2011-03-09 20:45 PST, Michael Nordman
levin: review+
Details | Formatted Diff | Diff
wrapper (23.44 KB, patch)
2011-03-11 12:28 PST, Michael Nordman
levin: review+
commit-queue: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Nordman 2011-03-07 19:52:07 PST
See http://code.google.com/p/chromium/issues/detail?id=71026

SQLTransaction and SQLStatement keep RefPtrs to javascript callbacks, which hold onto the tree nodes. We shouldn't release callbacks or tree nodes on a database thread.
Comment 1 Michael Nordman 2011-03-07 20:02:06 PST
Created attachment 85012 [details]
fix

Wdyt? I can put the new class in a separate file if need be.
Comment 2 Dmitry Titov 2011-03-07 22:31:25 PST
Comment on attachment 85012 [details]
fix

View in context: https://bugs.webkit.org/attachment.cgi?id=85012&action=review

Thanks for doing this! 
Some notes:

> Source/WebCore/storage/SQLStatement.cpp:170
> +    RefPtr<SQLStatementErrorCallback> errorCallback = m_statementErrorCallback.release();

This looks a bit confusing. The naming is such as if both 'callback' objects are of the same type, and the only thing that goes on is refcount manipulations.
Perhaps m_statementErrorCallback -> m_statementErrorCallbackBox or some other name that would make the need for .release() (or unbox()?) obvious. Otherwise code looks like strange assignment of RefPtrs.

> Source/WebCore/storage/SQLStatement.h:52
> +template<typename T> class SQLCallbackReference {

Indeed could be nice to have a file for it.

> Source/WebCore/storage/SQLStatement.h:90
> +    bool has() const

bool operator() could be clearer here...

> Source/WebCore/storage/SQLStatement.h:104
> +    mutable Mutex m_mutex;

Hmm, why do we need a mutex in this class? We didn't have one before, and it seems posting the task to the context thread for destruction is all we need.
Or, is there more then one issue this patch is fixing?

> Source/WebCore/storage/SQLStatement.h:106
> +    RefPtr<ScriptExecutionContext> m_scriptExecutionContent;

m_scriptExecutionContent -> m_scriptExecutionContext
Comment 3 Michael Nordman 2011-03-08 09:43:00 PST
Thanks for looking.

(In reply to comment #2)
> > Source/WebCore/storage/SQLStatement.cpp:170
> > +    RefPtr<SQLStatementErrorCallback> errorCallback = m_statementErrorCallback.release();
> 
> This looks a bit confusing. The naming is such as if both 'callback' objects are of the same type...

The 'box' term works for me.

> > Source/WebCore/storage/SQLStatement.h:52
> > +template<typename T> class SQLCallbackReference {
> 
> Indeed could be nice to have a file for it.

Definitely want to get the class name settled before updating dozens of make files to add the .h file.

> > Source/WebCore/storage/SQLStatement.h:90
> > +    bool has() const
> 
> bool operator() could be clearer here...

sgtm

> > Source/WebCore/storage/SQLStatement.h:104
> > +    mutable Mutex m_mutex;
> 
> Hmm, why do we need a mutex in this class? We didn't have one before, and it seems posting the task to the context thread for destruction is all we need.
> Or, is there more then one issue this patch is fixing?

Multiple threads can test and clearing the reference. I couldn't convince myself that a mutex wasn't needed, and now it's two data members that are being cleared instead of one. Can you convince me that we don't need one?

> > Source/WebCore/storage/SQLStatement.h:106
> > +    RefPtr<ScriptExecutionContext> m_scriptExecutionContent;
> 
> m_scriptExecutionContent -> m_scriptExecutionContext

Doh... thnx!
Comment 4 Michael Nordman 2011-03-08 11:54:57 PST
Created attachment 85074 [details]
box terminology
Comment 5 Michael Nordman 2011-03-08 12:00:14 PST
Comment on attachment 85074 [details]
box terminology

How's SQLCallbackBox for a class name and unbox() for the method of interest?

> Source/WebCore/ChangeLog:17
> +        (WebCore::SQLCallbackReference::SQLCallbackReference):

oh... i'll have to update the ChangeLog too
Comment 6 Dmitry Titov 2011-03-08 13:00:09 PST
Comment on attachment 85074 [details]
box terminology

View in context: https://bugs.webkit.org/attachment.cgi?id=85074&action=review

Looks good.

> Source/WebCore/storage/SQLStatement.h:51
> +// SQLStatement and SQLTransaction on the proper thread.

Lets add a comment to this class describing all 3 ways the 'boxed' reference can be dereferenced:
- by destructing enclosing object - on either thread
- by calling clear() - on either thread
- by unboxing and then dereferencing normally - on context thread only)
Comment 7 Dmitry Titov 2011-03-08 13:01:22 PST
Oh, and lets remove taking a lock in bool operators...
Comment 8 Michael Nordman 2011-03-08 13:08:04 PST
Thnx... i still need to do the make file changes... any bets on how many builds i can break by adding a single .h file to the project :)

> Source/WebCore/storage/SQLStatement.h:98
> +        MutexLocker locker(m_mutex);

Right... locking the mutex in the bool operators looks pointless, i think we can remove them too.

Q: RefPtr and PassRefPtr use a trick discourage mis-using operator bool(), should that trick be used here?
Comment 9 Dmitry Titov 2011-03-08 14:59:32 PST
Comment on attachment 85074 [details]
box terminology

Removing flag for now, reflecting offline discussion:
- it's good to have a separate file for this new class
- lets go back from bool operators to hasCallback() (w/o lock)
Comment 10 Michael Nordman 2011-03-08 15:38:00 PST
(In reply to comment #9)
> (From update of attachment 85074 [details])
> Removing flag for now, reflecting offline discussion:
> - it's good to have a separate file for this new class
> - lets go back from bool operators to hasCallback() (w/o lock)

Sounds good... when i figure out which dozen makefiles to change and how... i'll post a new patch.
Comment 11 Michael Nordman 2011-03-08 19:16:22 PST
Created attachment 85124 [details]
wrapper

Made changes as discussed. I still have to poke at more make files to add the new .h to them, but the code changes should be settled now. I'm on take three for  a class name so far, any better suggestions would be welcome.
Comment 12 Michael Nordman 2011-03-08 19:19:31 PST
Comment on attachment 85124 [details]
wrapper

View in context: https://bugs.webkit.org/attachment.cgi?id=85124&action=review

> Source/WebCore/storage/SQLCallbackWrapper.h:93
> +    mutable Mutex m_mutex;

this doesn't have to be mutable anymore
Comment 13 Michael Nordman 2011-03-09 12:28:01 PST
Created attachment 85216 [details]
wrapper2
Comment 14 WebKit Review Bot 2011-03-09 12:30:03 PST
Attachment 85216 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Source/WebCore/storage/SQLCallbackWrapper.h:28:  #ifndef header guard has wrong style, please use: WTF_SQLCallbackWrapper_h  [build/header_guard] [5]
Total errors found: 1 in 7 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 15 David Levin 2011-03-09 12:36:52 PST
(In reply to comment #14)
> Attachment 85216 [details] did not pass style-queue:
> 
> Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1
> 
> Source/WebCore/storage/SQLCallbackWrapper.h:28:  #ifndef header guard has wrong style, please use: WTF_SQLCallbackWrapper_h  [build/header_guard] [5]

The error is real. The suggestion is wrong.
Comment 16 Michael Nordman 2011-03-09 12:54:42 PST
(In reply to comment #15)
> (In reply to comment #14)
> > Attachment 85216 [details] [details] did not pass style-queue:
> > 
> > Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1
> > 
> > Source/WebCore/storage/SQLCallbackWrapper.h:28:  #ifndef header guard has wrong style, please use: WTF_SQLCallbackWrapper_h  [build/header_guard] [5]
> 
> The error is real. The suggestion is wrong.

Yes, i see it. Still have poked at most of the make files, but i wanted to see what the EWS bots thought about this much.
Comment 17 Michael Nordman 2011-03-09 15:08:25 PST
Created attachment 85244 [details]
wrapper

fixed the header guard and added the .h file to various project/make fiels.
Comment 18 David Levin 2011-03-09 15:32:43 PST
Comment on attachment 85244 [details]
wrapper

View in context: https://bugs.webkit.org/attachment.cgi?id=85244&action=review

Just a few comments, so r- for now (but perhaps after a bit of back and forth I could be convinced to do r+ with some changes on check in).

> Source/WebCore/storage/SQLCallbackWrapper.h:72
> +        context->postTask(createCallbackTask(&safeRelease, m_callback.release().leakRef()));

Personally, I would prefer if postTask were done outside of the mutex (since it can be as long as m_callback.release().leakRef() is stored in a local).

> Source/WebCore/storage/SQLCallbackWrapper.h:77
> +        ASSERT(!m_callback || m_scriptExecutionContext->isContextThread());

Doesn't the assert need to be inside of the locker?

> Source/WebCore/storage/SQLCallbackWrapper.h:83
> +    bool hasCallback() const { return m_callback; }

I don't understand how this method could be useful unless there is some implied locking mechanism at a higher level which ensure that it isn't called if clear or unwrap may be called on another thread.

Given that clear and unwrap both have mutex guards in them, it implies that they may be called on another thread so this seems concerning.

btw, I'm not arguing for a mutex here as that doesn't fix the problem that I'm talking about. It feels like this needs a comment in the code because something tricky is happening here (and in general this method feels like it is prone to leading people to write broken code).
Comment 19 Michael Nordman 2011-03-09 16:00:20 PST
(In reply to comment #18)
> (From update of attachment 85244 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=85244&action=review
> 
> Just a few comments, so r- for now (but perhaps after a bit of back and forth I could be convinced to do r+ with some changes on check in).
> 
> > Source/WebCore/storage/SQLCallbackWrapper.h:72
> > +        context->postTask(createCallbackTask(&safeRelease, m_callback.release().leakRef()));
> 
> Personally, I would prefer if postTask were done outside of the mutex
> (since it can be as long as m_callback.release().leakRef() is stored in a local).

If you think it matters, i could change it, although that would decrease readability a little.

> > Source/WebCore/storage/SQLCallbackWrapper.h:77
> > +        ASSERT(!m_callback || m_scriptExecutionContext->isContextThread());
> 
> Doesn't the assert need to be inside of the locker?

Yup... good point.

> > Source/WebCore/storage/SQLCallbackWrapper.h:83
> > +    bool hasCallback() const { return m_callback; }
> 
> I don't understand how this method could be useful unless there is some implied locking mechanism at a higher level which ensure that it isn't called if clear or unwrap may be called on another thread.
> 
> Given that clear and unwrap both have mutex guards in them, it implies that they may be called on another thread so this seems concerning.
> 
> btw, I'm not arguing for a mutex here as that doesn't fix the problem that I'm talking about. It feels like this needs a comment in the code because something tricky is happening here (and in general this method feels like it is prone to leading people to write broken code).

Dimich and chatted about the lock in this method and we decided to remove it as it doesn't actually accomplish anything. This method is used for optimizations in the database system to avoid some round trips to the context thread and back to the database thread. The usage is correct albeit fragile.

The entire db system if fragile, i'm just trying to fix the particular problem of deref'ing the callbacks on the wrong thread w/o tackling the larger problems with the db system.
Comment 20 David Levin 2011-03-09 16:09:30 PST
(In reply to comment #19)
> (In reply to comment #18)
> > (From update of attachment 85244 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=85244&action=review
> > 
> > Just a few comments, so r- for now (but perhaps after a bit of back and forth I could be convinced to do r+ with some changes on check in).
> > 
> > > Source/WebCore/storage/SQLCallbackWrapper.h:72
> > > +        context->postTask(createCallbackTask(&safeRelease, m_callback.release().leakRef()));
> > 
> > Personally, I would prefer if postTask were done outside of the mutex
> > (since it can be as long as m_callback.release().leakRef() is stored in a local).
> 
> If you think it matters, i could change it, although that would decrease readability a little.

I had to spend a few minutes time convincing myself it was safe because the postTask also takes a mutex (but fortunately this should be in a MessageQueue which will only take the mutex around queue and dequeue operations so there shouldn't be any possible lock order inversion happening) so it is ok as is but it feels mildly fragile.

> 
> > > Source/WebCore/storage/SQLCallbackWrapper.h:83
> > > +    bool hasCallback() const { return m_callback; }
> 
> Dimich and chatted about the lock in this method and we decided to remove it as it doesn't actually accomplish anything. This method is used for optimizations in the database system to avoid some round trips to the context thread and back to the database thread. The usage is correct albeit fragile.

I was thinking of when someone would use this method and it seemed that immediately after it returned the result could be wrong -- a lock wouldn't help that.

Upon further reflection, this will all work as long as one always checks the return value from unwrap which it looks like the code always does (and this method is only used as a quick optimization check).
Comment 21 Dmitry Titov 2011-03-09 16:30:07 PST
We might add comment to hasCallback() that this method can only be used for optimization in threaded environment, since David is the second already who spotted potential trouble here.
Of course, if one would do a more substantial redesign of Database code, this kind of things would go. It is only there because existing code does the same type of check, and so far it doesn't seem to cause a problem.
Comment 22 Michael Nordman 2011-03-09 20:45:02 PST
Created attachment 85280 [details]
wrapper

made the changes we talked about
Comment 23 David Levin 2011-03-10 11:42:40 PST
Comment on attachment 85280 [details]
wrapper

View in context: https://bugs.webkit.org/attachment.cgi?id=85280&action=review

Two nits. If you fixed them before check in, that would be nice, but the check in could go in as is.

> Source/WebCore/ChangeLog:7
> +        which holds those referenc whose dtor will schedule the release of those references on the ScriptExecution thread.

typo: referenc

> Source/WebCore/storage/SQLCallbackWrapper.h:75
> +            callback =  m_callback.release().leakRef();

extra space after =
Comment 24 David Levin 2011-03-10 11:49:14 PST
fyi dimich was ok with this r+ as well.
Comment 25 Michael Nordman 2011-03-11 12:28:55 PST
Created attachment 85511 [details]
wrapper

i've fixed up the typos david mentioned and just syncd/merged the project files... so i'd like to throw this patch at the commit queue now
Comment 26 WebKit Commit Bot 2011-03-11 12:34:40 PST
Comment on attachment 85511 [details]
wrapper

Rejecting attachment 85511 [details] from commit-queue.

Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=eseidel-cq-sf', 'ap..." exit_code: 2

Last 500 characters of output:
Core/WebCore.vcproj/WebCore.vcproj.rej
patching file Source/WebCore/WebCore.xcodeproj/project.pbxproj
patching file Source/WebCore/storage/SQLCallbackWrapper.h
patching file Source/WebCore/storage/SQLStatement.cpp
patching file Source/WebCore/storage/SQLStatement.h
patching file Source/WebCore/storage/SQLTransaction.cpp
patching file Source/WebCore/storage/SQLTransaction.h

Failed to run "[u'/Projects/CommitQueue/Tools/Scripts/svn-apply', u'--reviewer', u'David Levin', u'--force']" exit_code: 1

Full output: http://queues.webkit.org/results/8141387
Comment 27 Michael Nordman 2011-03-11 12:44:30 PST
(In reply to comment #26)
> (From update of attachment 85511 [details])
> Rejecting attachment 85511 [details] from commit-queue.
> 
> Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=eseidel-cq-sf', 'ap..." exit_code: 2
> 
> Last 500 characters of output:
> Core/WebCore.vcproj/WebCore.vcproj.rej
> patching file Source/WebCore/WebCore.xcodeproj/project.pbxproj
> patching file Source/WebCore/storage/SQLCallbackWrapper.h
> patching file Source/WebCore/storage/SQLStatement.cpp
> patching file Source/WebCore/storage/SQLStatement.h
> patching file Source/WebCore/storage/SQLTransaction.cpp
> patching file Source/WebCore/storage/SQLTransaction.h
> 
> Failed to run "[u'/Projects/CommitQueue/Tools/Scripts/svn-apply', u'--reviewer', u'David Levin', u'--force']" exit_code: 1
> 
> Full output: http://queues.webkit.org/results/8141387

whatever... landing manually using webkit-patch land
Comment 28 Adam Barth 2011-03-11 12:45:37 PST
> whatever... landing manually using webkit-patch land

Sorry.  :(

We really need to improve the commit-queue's error messages for these cases.
Comment 29 Michael Nordman 2011-03-11 13:10:16 PST
Committed r80876: <http://trac.webkit.org/changeset/80876>
Comment 30 Michael Nordman 2011-03-11 14:56:26 PST
(In reply to comment #29)
> Committed r80876: <http://trac.webkit.org/changeset/80876>

bah... some test failures on GTK and XP...

change-version.html
executesql-accepts-only-one-statement.html
null-callbacks.html
quota-tracking.html
statement-error-callback.html

... all crash, no data on how
Comment 31 Michael Nordman 2011-03-11 15:27:55 PST
(In reply to comment #30)
> (In reply to comment #29)
> > Committed r80876: <http://trac.webkit.org/changeset/80876>
> 
> bah... some test failures on GTK and XP...
> 
> change-version.html
> executesql-accepts-only-one-statement.html
> null-callbacks.html
> quota-tracking.html
> statement-error-callback.html
> 
> ... all crash, no data on how

Gee... maybe it's that missing ~ on the dtor :(
Comment 32 Michael Nordman 2011-03-11 16:00:47 PST
Gee... maybe it's that missing ~ on the dtor?

http://trac.webkit.org/changeset/80897