Summary: | fast/workers/storage/execute-sql-args-worker.html crashed on Leopard Commit Bot | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Eric Seidel (no email) <eric> | ||||||||
Component: | WebCore Misc. | Assignee: | Dumitru Daniliuc <dumi> | ||||||||
Status: | RESOLVED FIXED | ||||||||||
Severity: | Normal | CC: | dimich, dumi, ericu, michaeln | ||||||||
Priority: | P2 | ||||||||||
Version: | 528+ (Nightly build) | ||||||||||
Hardware: | PC | ||||||||||
OS: | OS X 10.5 | ||||||||||
Attachments: |
|
Description
Eric Seidel (no email)
2010-06-23 14:52:27 PDT
Created attachment 59565 [details]
crash report from commit bot
http://developer.apple.com/mac/library/technotes/tn2004/tn2123.html: "EXC_BAD_ACCESS/KERN_PROTECTION_FAILURE — This is caused by the thread trying to write to read-only memory. This is always caused by a data access." Also, OriginUsageRecord:75 is: if (m_cachedDiskUsageIsValid), and m_cachedDiskUsageIsValid is just a bool field. Some forums suggest that this might be caused by bad RAM. In any case, I've never seen this crash before, and I really don't see how checking the value of a bool variable could've caused it. I think we should treat it as flakiness, unless it happens again. Exception Codes: KERN_PROTECTION_FAILURE at 0x0000000000000034 Means something is trying to grab at an offset 34 bytes from NULL. Meaning that either OriginUsageRecord (or whatever object owns the OriginUsageRecord) is NULL. Or rather something is tryihng to grab at that object from a NULL pointer. If the bug isn't useful, we can close it. Just pointing out that your diagnosis of "bad ram" is wrong. :) Looks like: unsigned long long OriginQuotaManager::diskUsage(SecurityOrigin* origin) const 126 { 127 ASSERT(m_usageRecordGuardLocked); 128 129 OriginUsageRecord* usageRecord = m_usageMap.get(origin); 130 ASSERT(usageRecord); 131 132 return usageRecord->diskUsage(); 133 } Got back a NULL OriginUsageRecord* from the m_usageMap. The question is why. How can that happen? :) That's the reason it crashed. I think I found a scenario that would lead to this crash. Short version: race condition that can happen only when using workers. Long version: In order to open a database, Database::openDatabase() calls (among other things) DatabaseTracker::canEstablishDatabase() and then DatabaseTracker::addOpenDatabase(). canEstablishDatabase() makes sure that the OriginQuotaManager instance used by DatabaseTracker starts tracking the given origin, but does not tell the DatabaseTracker that there's a new open database. addOpenDatabase() does the opposite: it tells DatabaseTracker() that a new database was opened, but it doesn't make sure that OriginQuotaManager is tracking this origin. On the other hand, when a database is closed, it calls DatabaseTracker::removeOpenDatabase(), which tells DatabaseTracker to stop keeping track of this DB. Also, if this happens to be the last DB in this context that DatabaseTracker knows about, then it tells OriginQuotaManager to stop tracking this security origin too. Now in the non-worker case, this is fine, because all these calls happen on the same context thread, so there's no race. When we have multiple workers though, the context threads are different for each worker, but they still share the same DatabaseTracker instance. So here's what I think happens: 1. Worker #1 runs DB test N. 2. OriginQuotaManager starts tracking this origin. 3. Worker #1 wants to shut down. This posts a task to the DatabaseThread of Worker #1, which will make sure all DBs are closed, all references are released, etc. 4. Meanwhile, Worker #2 starts running test (N+1), in the same security origin. 5. Worker #2 calls DatabaseTracker::canEstablishDatabase(). Nothing interesting happens, because OriginQuotaManager is still tracking this origin. 6. Worker #1 finally closes the DB used by test N, and calls DatabaseTracker::removeOpenDatabase(). At this point, DatabaseTracker doesn't know yet about the DB that will be opened in test (N+1), so as far as it's concerned, the last DB in this origin is being closed. Therefore, DatabaseTracker tells OriginQuotaManager to stop tracking this origin. 7. Worker #2 calls DatabaseTracker::addOpenDatabase(). DatabaseTracker is happy to start keeping track of this new DB, but doesn't tell OriginQuotaManager to re-start keeping track of this origin. 8. Now worker #2 gets to a transaction. Before the transaction can run, we need to know how much space is available to this origin. Since OriginQuotaManager is not tracking this origin, a NULL OriginUsageRecord is returned. Solution: I think the best solution would be to call OriginQuotaManager::trackOrigin() whenever we call DatabaseTracker::addOpenDatabase(). I'm not sure if this is the problem that led to this crash, but it looks to me like it would lead to the same stack track, and I think it's worth fixing anyway. Created attachment 59592 [details]
patch
Comment on attachment 59592 [details] patch > Index: WebCore/ChangeLog > =================================================================== > --- WebCore/ChangeLog (revision 61722) > +++ WebCore/ChangeLog (working copy) > @@ -1,3 +1,13 @@ > +2010-06-23 Dumitru Daniliuc <dumi@chromium.org> > + > + Reviewed by NOBODY (OOPS!). > + > + Fix a race condition that can happen when using DBs in workers. > + https://bugs.webkit.org/show_bug.cgi?id=41105 > + > + * storage/DatabaseTracker.cpp: > + (WebCore::DatabaseTracker::addOpenDatabase): > + > 2010-06-23 Kwang Yul Seo <skyul@company100.net> > > Reviewed by Kent Tamura. > Index: WebCore/storage/DatabaseTracker.cpp > =================================================================== > --- WebCore/storage/DatabaseTracker.cpp (revision 61722) > +++ WebCore/storage/DatabaseTracker.cpp (working copy) > @@ -508,6 +508,12 @@ void DatabaseTracker::addOpenDatabase(Ab > LOG(StorageAPI, "Added open Database %s (%p)\n", database->stringIdentifier().ascii().data(), database); > } > > + { > + Locker<OriginQuotaManager> quotaManagerLocker(originQuotaManager()); > + if (!originQuotaManager().tracksOrigin(database->securityOrigin())) > + originQuotaManager().trackOrigin(database->securityOrigin()); > + } > + > MutexLocker lockDatabase(m_databaseGuard); > doneCreatingDatabase(database->securityOrigin(), database->stringIdentifier()); > } WebCore/storage/DatabaseTracker.cpp:515 + } I think that still doesn't work. The deleting thread could just drop the openDatabaseMapLock in removeOpenDatabase, then wait there for a bit. Meanwhile we hit and pass your new code. Then the deleting thread locks OQM and removes the origin we just checked on, and we're back where we started. One approach which I think would work would be to define originQuotaManager's lock as lockable under m_openDatabaseMapGuard in the DatabaseTracker locking hierarchy, so that you could hold the mapGuard through the whole thing on both sides.. I believe that would be safe, although it would restrict such changes in the future. Created attachment 59946 [details]
patch
Should be fixed now.
Comment on attachment 59946 [details] patch > Index: WebCore/ChangeLog > =================================================================== > --- WebCore/ChangeLog (revision 62036) > +++ WebCore/ChangeLog (working copy) > @@ -1,3 +1,14 @@ > +2010-06-28 Dumitru Daniliuc <dumi@chromium.org> > + > + Reviewed by NOBODY (OOPS!). > + > + Fix a race condition that can happen when using DBs in workers. > + https://bugs.webkit.org/show_bug.cgi?id=41105 > + > + * storage/DatabaseTracker.cpp: > + (WebCore::DatabaseTracker::addOpenDatabase): > + * storage/DatabaseTracker.h: > + > 2010-06-28 Beth Dakin <bdakin@apple.com> > > Reviewed by Sam Weinig. > Index: WebCore/storage/DatabaseTracker.cpp > =================================================================== > --- WebCore/storage/DatabaseTracker.cpp (revision 62036) > +++ WebCore/storage/DatabaseTracker.cpp (working copy) > @@ -506,6 +506,10 @@ void DatabaseTracker::addOpenDatabase(Ab > databaseSet->add(database); > > LOG(StorageAPI, "Added open Database %s (%p)\n", database->stringIdentifier().ascii().data(), database); > + > + Locker<OriginQuotaManager> quotaManagerLocker(originQuotaManager()); > + if (!originQuotaManager().tracksOrigin(database->securityOrigin())) > + originQuotaManager().trackOrigin(database->securityOrigin()); > } > > MutexLocker lockDatabase(m_databaseGuard); > @@ -553,10 +557,10 @@ void DatabaseTracker::removeOpenDatabase > > m_openDatabaseMap->remove(database->securityOrigin()); > delete nameMap; > - } > > - Locker<OriginQuotaManager> quotaManagerLocker(originQuotaManager()); > - originQuotaManager().removeOrigin(database->securityOrigin()); > + Locker<OriginQuotaManager> quotaManagerLocker(originQuotaManager()); > + originQuotaManager().removeOrigin(database->securityOrigin()); > + } > } > > void DatabaseTracker::getOpenDatabases(SecurityOrigin* origin, const String& name, HashSet<RefPtr<AbstractDatabase> >* databases) > Index: WebCore/storage/DatabaseTracker.h > =================================================================== > --- WebCore/storage/DatabaseTracker.h (revision 62036) > +++ WebCore/storage/DatabaseTracker.h (working copy) > @@ -63,7 +63,7 @@ public: > // This singleton will potentially be used from multiple worker threads and the page's context thread simultaneously. To keep this safe, it's > // currently using 4 locks. In order to avoid deadlock when taking multiple locks, you must take them in the correct order: > // m_databaseGuard before quotaManager if both locks are needed. > - // no other lock is taken in the code locked on m_openDatabaseMapGuard. > + // m_openDatabaseMapGuard before quotaManager if both locks are needed. > // notificationMutex() is currently independent of the other locks. > > bool canEstablishDatabase(ScriptExecutionContext*, const String& name, const String& displayName, unsigned long estimatedSize); WebCore/storage/DatabaseTracker.h:66 + // m_openDatabaseMapGuard before quotaManager if both locks are needed. Please point out that m_databaseGuard and m_openDatabaseMapGuard currently don't overlap at all. WebCore/storage/DatabaseTracker.cpp:512 + originQuotaManager().trackOrigin(database->securityOrigin()); LGTM. > WebCore/storage/DatabaseTracker.h:66
> + // m_openDatabaseMapGuard before quotaManager if both locks are needed.
> Please point out that m_databaseGuard and m_openDatabaseMapGuard currently don't overlap at all.
Will do on landing. Thanks for the review!
Comment on attachment 59946 [details]
patch
I don't fully understand this, but I trust eric's blessing.
(In reply to comment #13) > (From update of attachment 59946 [details]) > I don't fully understand this, but I trust eric's blessing. Awesome. I'm gonna get my reviewer bit before I get my committer bit ;'>. Landed: r62043. |