WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED WONTFIX
30954
Possible race condition in Database.cpp
https://bugs.webkit.org/show_bug.cgi?id=30954
Summary
Possible race condition in Database.cpp
Xan Lopez
Reported
2009-10-30 07:02:31 PDT
While debugging some ASSERTions happening randomly in the storage code in the GTK port, I found the following with valgrind/helgrind: ==7268== Possible data race during write of size 1 at 0xa5c20a9 by thread #1 ==7268== at 0x47D4983: WebCore::Database::stop() (Database.cpp:361) ==7268== by 0x4357DAD: WebCore::Document::stopDatabases() (Document.cpp:4406) ==7268== by 0x455316C: WebCore::FrameLoader::stopLoading(WebCore::UnloadEventPolicy, WebCore::DatabasePolicy) (FrameLoader.cpp:554) ==7268== by 0x455331F: WebCore::FrameLoader::closeURL() (FrameLoader.cpp:585) ==7268== by 0x455BDC0: WebCore::FrameLoader::transitionToCommitted(WTF::PassRefPtr<WebCore::CachedPage>) (FrameLoader.cpp:2465) ==7268== by 0x455B86F: WebCore::FrameLoader::commitProvisionalLoad(WTF::PassRefPtr<WebCore::CachedPage>) (FrameLoader.cpp:2391) ==7268== by 0x453EAD1: WebCore::DocumentLoader::commitIfReady() (DocumentLoader.cpp:320) ==7268== by 0x453EB0A: WebCore::DocumentLoader::finishedLoading() (DocumentLoader.cpp:327) ==7268== by 0x455CA6D: WebCore::FrameLoader::finishedLoading() (FrameLoader.cpp:2697) ==7268== by 0x456CEA1: WebCore::MainResourceLoader::didFinishLoading() (MainResourceLoader.cpp:393) ==7268== by 0x456C660: WebCore::MainResourceLoader::continueAfterContentPolicy(WebCore::PolicyAction, WebCore::ResourceResponse const&) (MainResourceLoader.cpp:264) ==7268== by 0x456C777: WebCore::MainResourceLoader::continueAfterContentPolicy(WebCore::PolicyAction) (MainResourceLoader.cpp:278) ==7268== This conflicts with a previous read of size 1 by thread #3 ==7268== at 0x47F7BC4: WebCore::Database::stopped() const (Database.h:110) ==7268== by 0x47F5E71: WebCore::SQLTransaction::checkAndHandleClosedDatabase() (SQLTransaction.cpp:151) ==7268== by 0x47F60CD: WebCore::SQLTransaction::performNextStep() (SQLTransaction.cpp:182) ==7268== by 0x47DE215: WebCore::DatabaseTransactionTask::doPerformTask() (DatabaseTask.cpp:145) ==7268== by 0x47DDC7B: WebCore::DatabaseTask::performTask() (DatabaseTask.cpp:56) ==7268== by 0x47DEDB5: WebCore::DatabaseThread::databaseThread() (DatabaseThread.cpp:99) ==7268== by 0x47DED02: WebCore::DatabaseThread::databaseThreadStart(void*) (DatabaseThread.cpp:82) ==7268== by 0x41044B5: WTF::threadEntryPoint(void*) (Threading.cpp:64) Basically there's the possibility of the main thread being busy stopping all current transactions, while the database thread wants to check if they are stopped to do something else. This all happens without any kind of lock, so there's a real possibility of things going the wrong way I think. Not sure what would be the best fix here though, since I'm not very familiar with the code.
Attachments
patch: put m_stopped behind a lock
(1.78 KB, patch)
2010-03-31 18:53 PDT
,
Dumitru Daniliuc
levin
: review-
dumi
: commit-queue-
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Xan Lopez
Comment 1
2009-11-02 03:28:26 PST
There's another possible race condition spotted by valgrind/helgrind running the same test (the test is storage/open-database-while-transaction-in-progress.html, btw): ==18197== Possible data race during write of size 4 at 0xa643cb4 by thread #1 ==18197== at 0x47F7607: WebCore::SQLTransaction::deliverSuccessCallback() (SQLTransaction.cpp:482) ==18197== by 0x47F6351: WebCore::SQLTransaction::performPendingCallback() (SQLTransaction.cpp:204) ==18197== by 0x47D63EA: WebCore::Database::deliverPendingCallback(void*) (Database.cpp:624) ==18197== by 0x41015EF: WTF::dispatchFunctionsFromMainThread() (MainThread.cpp:94) ==18197== by 0x41072FF: WTF::timeoutFired(void*) (MainThreadGtk.cpp:43) ==18197== by 0x7998D22: g_timeout_dispatch (gmain.c:3396) ==18197== by 0x7995EB9: g_main_dispatch (gmain.c:1960) ==18197== by 0x7997116: g_main_context_dispatch (gmain.c:2513) ==18197== by 0x7997536: g_main_context_iterate (gmain.c:2591) ==18197== by 0x7997C42: g_main_loop_run (gmain.c:2799) ==18197== by 0x7324C59: gtk_main (gtkmain.c:1218) ==18197== by 0x80509B1: runTest(std::string const&) (DumpRenderTree.cpp:488) ==18197== This conflicts with a previous read of size 4 by thread #3 ==18197== at 0x47F61A7: WebCore::SQLTransaction::performNextStep() (SQLTransaction.cpp:188) ==18197== by 0x47DE2A5: WebCore::DatabaseTransactionTask::doPerformTask() (DatabaseTask.cpp:145) ==18197== by 0x47DDD0B: WebCore::DatabaseTask::performTask() (DatabaseTask.cpp:56) ==18197== by 0x47DEE45: WebCore::DatabaseThread::databaseThread() (DatabaseThread.cpp:99) ==18197== by 0x47DED92: WebCore::DatabaseThread::databaseThreadStart(void*) (DatabaseThread.cpp:82) ==18197== by 0x41044B5: WTF::threadEntryPoint(void*) (Threading.cpp:64) ==18197== by 0x4009B47: mythread_wrapper (hg_intercepts.c:201) ==18197== by 0x56F51E: start_thread (pthread_create.c:297) From reading the code a bit it would seem that in most places the lock (m_lockAcquired) is supposed to be held when running any step of a transaction. In some cases, though, it's not explicitly enforced, and in this one it seems to be completely missing.
Dumitru Daniliuc
Comment 2
2010-03-31 18:53:10 PDT
Created
attachment 52234
[details]
patch: put m_stopped behind a lock
WebKit Review Bot
Comment 3
2010-03-31 18:57:21 PDT
Attachment 52234
[details]
did not pass style-queue: Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1 WebCore/storage/Database.h:117: More than one command on the same line [whitespace/newline] [4] Total errors found: 1 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Dmitry Titov
Comment 4
2010-04-01 12:55:20 PDT
I'm not sure I understand how having a lock around assigning a bool may prevent anything. I see how it prevents the valgrind from complaining, but if there is an issue with setting/checking this bool on different threads, then adding a lock in this manner doesn't seem to fix that. The bool can be flipped by another thread right after the check anyways. Do I miss something?
Dumitru Daniliuc
Comment 5
2010-04-01 13:08:50 PDT
(In reply to
comment #4
)
> I'm not sure I understand how having a lock around assigning a bool may prevent > anything. I see how it prevents the valgrind from complaining, but if there is > an issue with setting/checking this bool on different threads, then adding a > lock in this manner doesn't seem to fix that. The bool can be flipped by > another thread right after the check anyways. > Do I miss something?
This flag is basically used to interrupt transactions. The database thread checks it before running any transaction step, and if it's set to 'true', then we roll back the transaction. So it doesn't matter when the main thread flips this bool -- we'll just keep executing transactions until we notice it's set to 'true'. This patch doesn't change any of that. It just makes sure that access to a variable used on multiple threads is properly synchronized, which should make valgrind happy.
Alexey Proskuryakov
Comment 6
2010-04-01 13:20:09 PDT
Can you use atomicIncrement() for this?
Dumitru Daniliuc
Comment 7
2010-04-01 14:58:17 PDT
(In reply to
comment #6
)
> Can you use atomicIncrement() for this?
I can, but I was told that valgrind doesn't know about atomic operations and will keep complaining about this problem. Also, there's not a single atomic operation in the entire DB code. Should I still use atomicIncrement()? I'm happy to go with whatever solution is best, just want to make sure that all opinions are taken into account.
Dmitry Titov
Comment 8
2010-04-01 15:13:38 PDT
I would rather not add any synchronization primitives if the functionality does not require it, and I take from your explanation that the code does not have a race, actually - it's just the valgrind complains that a variable not protected by a lock is used by 2 threads. Valgrind as a tool shoudl have some ways to exclude the particular false positive. It would be better to do that then add a lock that is not needed. Or is it actually needed? What ASSERT did you refer to?
David Levin
Comment 9
2010-04-01 16:05:22 PDT
Comment on
attachment 52234
[details]
patch: put m_stopped behind a lock It seems to have been noted by other folks that this doesn't really solve anything except to make an arbitrary tool happy by slowing things down and using more memory. If there is some core issue, then this definitely doesn't fix it. If there isn't an issue, then there perhaps there should be a comment in the code to explain why this is ok.
Dumitru Daniliuc
Comment 10
2010-04-01 16:44:21 PDT
OK, I guess I'm going to close this issue then. m_stopped: Doesn't need to be locked. It's just a flag used to signal that a particular database needs to be closed (either because the document is about to go away, or because we need to delete that database). The main thread and the DB thread check this flag at the beginning of every transaction step, and immediately roll it back if m_stopped is set to 'true'. m_lockAcquired: It is a flag used to make sure that no transaction gets to run any step on the DB thread without going through the SQLTransactionCoordinator first. We assert that m_lockAcquired == true in all SQLTransaction methods that run on the DB thread. However, there's no need to do that for the steps that run on the main thread, because the main thread can safely handle multiple transactions on different handles to the same database at the same time (even though in practice m_lockAcquired == true during those steps too: it gets set to 'true' at the beginning of a transaction, and is reset to 'false' after committing/rolling back the transaction).
David Levin
Comment 11
2010-04-01 17:05:02 PDT
(In reply to
comment #10
)
> OK, I guess I'm going to close this issue then. >
How is the race with reading and setting "m_nextStep" (that is mentioned in
comment #2
, ok?
Dumitru Daniliuc
Comment 12
2010-04-01 17:09:12 PDT
(In reply to
comment #11
)
> (In reply to
comment #10
) > > OK, I guess I'm going to close this issue then. > > > > How is the race with reading and setting "m_nextStep" (that is mentioned in >
comment #2
, ok?
There's no race there. At every point in time the transaction is either running on the main thread, or the DB thread (or none): it is impossible for the main thread and the DB thread to run code in the same SQLTransaction object at the same time.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug