Bug 74260

Summary: SnowLeopard crashes due to thread-unsafe EventListenerIterator ASSERTs
Product: WebKit Reporter: Geoffrey Garen <ggaren>
Component: DOMAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Critical CC: darin, ddkilzer, dglazkov, gustavo, kling, sam, tsepez, webkit.review.bot, xan.lopez
Priority: P1    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
darin: review+
Patch v2 darin: review+, gyuyoung.kim: commit-queue-

Description Geoffrey Garen 2011-12-11 17:19:40 PST
I believe that at least some of the SnowLeopard debug tester crashes are caused by lack of thread-safety in ASSERT-related code in EventListenerIterator. E.g.: http://build.webkit.org/results/SnowLeopard%20Intel%20Debug%20(WebKit2%20Tests)/r102544%20(3035)/editing/deleting/delete-br-013-stderr.txt.

This bug was introduced in http://trac.webkit.org/changeset/95372 ("Reduce EventTarget memory usage by deferring hash map allocation…").

The right fix is to put a mutex around modifications to m_activeIteratorCount. WTF::HashTable has a similar problem and solution, so you can copy from there.

EventListenerIterator is used on multiple threads during the GC mark phase, at least.
Comment 1 Andreas Kling 2011-12-12 08:49:11 PST
So you're saying that EventListenerIterator is used on other threads while WebCore chugs along merrily on the main thread. Doesn't this mean that we have a race condition where the EventListenerMap could be jumping from single-type to multi-type mode while an iterator is active, leaving the iterator pointing to deleted data?

If so, I'm not entirely sure what the correct solution is. Perhaps generating a list of RefPtr<EventListener> when the visitChildren() pass starts and iterating over that instead?
Comment 2 Thomas Sepez 2011-12-14 14:47:28 PST
Downstreamed as http://code.google.com/p/chromium/issues/detail?id=107620
Comment 3 David Kilzer (:ddkilzer) 2011-12-15 10:14:04 PST
Geoff, is there a radar for this issue?
Comment 4 Darin Adler 2011-12-18 18:26:09 PST
> So you're saying that EventListenerIterator is used on other threads while WebCore chugs along merrily on the main thread.

He is not saying that. JavaScript marks using multiple threads. The main thread is also doing marking. No other WebCore code is running during that marking process. So multiple threads can be reading the event listeners, but there is a guarantee that no thread is writing to it at the same time.

All we need to resolve the problem with the new debugging assertions is to add a mutex around the debug machinery we recently added.

There may be some sort of subtle thread-safety issue, but I don’t think so.
Comment 5 Andreas Kling 2011-12-22 09:09:25 PST
Created attachment 120328 [details]
Patch

Guard EventListenerMap::m_activeIteratorCount with a mutex.
Comment 6 WebKit Review Bot 2011-12-22 09:11:43 PST
Attachment 120328 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/update-webkit']" exit_code: 9

Updating OpenSource
From git://git.webkit.org/WebKit
 + 4e5cbb7...83cc6dc master     -> origin/master  (forced update)
	M	Tools/TestWebKitAPI/TestsController.cpp
	M	Tools/Scripts/run-api-tests
	M	Tools/Scripts/webkitpy/test/main.py
	M	Tools/Scripts/webkitpy/thirdparty/__init__.py
	M	Tools/ChangeLog
r103547 = 10bde81bae001a4da66ddaa326e03b261872898e (refs/remotes/origin/master)
	M	LayoutTests/platform/gtk/Skipped
	M	LayoutTests/platform/mac/accessibility/aria-describedby-on-input-expected.txt
	M	LayoutTests/accessibility/aria-describedby-on-input.html
	M	LayoutTests/ChangeLog
	M	Source/WebCore/ChangeLog
	M	Source/WebCore/accessibility/AccessibilityRenderObject.cpp
r103548 = 0639b9de03475c2b49464647c607d0249c7445d1 (refs/remotes/origin/master)
First, rewinding head to replay your work on top of it...
Applying: run-api-tests: dumpAllTests() should not use global variables
Using index info to reconstruct a base tree...
Falling back to patching base and 3-way merge...
Auto-merging Tools/ChangeLog
CONFLICT (content): Merge conflict in Tools/ChangeLog
Failed to merge in the changes.
Patch failed at 0001 run-api-tests: dumpAllTests() should not use global variables

When you have resolved this problem run "git rebase --continue".
If you would prefer to skip this patch, instead run "git rebase --skip".
To restore the original branch and stop rebasing run "git rebase --abort".

rebase refs/remotes/origin/master: command returned error: 1

Died at Tools/Scripts/update-webkit line 158.


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 7 Darin Adler 2011-12-22 09:13:59 PST
Comment on attachment 120328 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=120328&action=review

> Source/WebCore/dom/EventListenerMap.h:85
> +#ifdef NDEBUG
> +    void assertNoActiveIterators() { }
> +#else
> +    void assertNoActiveIterators();

Sometimes I like to put the declaration in unconditionally and put the inline empty implementation outside the class definition.

> Source/WebCore/dom/EventListenerMap.h:87
> +    Mutex m_activeIteratorCountMutex;

Since this is for assertion purposes only, we could consider just sharing one global mutex rather than having a separate one for each map. That could keep down the number of #if statements in the header.
Comment 8 Andreas Kling 2011-12-22 09:21:13 PST
(In reply to comment #7)
> (From update of attachment 120328 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=120328&action=review
> 
> > Source/WebCore/dom/EventListenerMap.h:85
> > +#ifdef NDEBUG
> > +    void assertNoActiveIterators() { }
> > +#else
> > +    void assertNoActiveIterators();
> 
> Sometimes I like to put the declaration in unconditionally and put the inline empty implementation outside the class definition.
> 
> > Source/WebCore/dom/EventListenerMap.h:87
> > +    Mutex m_activeIteratorCountMutex;
> 
> Since this is for assertion purposes only, we could consider just sharing one global mutex rather than having a separate one for each map. That could keep down the number of #if statements in the header.

Great suggestions, will revise!
Comment 9 Andreas Kling 2011-12-22 09:37:03 PST
Created attachment 120333 [details]
Patch v2
Comment 10 WebKit Review Bot 2011-12-22 09:38:23 PST
Attachment 120333 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/update-webkit']" exit_code: 9

Updating OpenSource
Index mismatch: e2cbe978fd74a95efeede79908d082d70edf7254 != 61426a582f9075041358d45255f438344a0219d4
rereading 3ff8cc11dd87e8ac3a4acaee763d8549687e5da1
	M	Tools/TestWebKitAPI/TestsController.cpp
	M	Tools/ChangeLog
103546 = 4e5cbb711dbbaa35b7a064791c6a6e3e2556787c already exists! Why are we refetching it?
 at /usr/lib/git-core/git-svn line 5210

Died at Tools/Scripts/update-webkit line 158.


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 11 Gyuyoung Kim 2011-12-22 10:03:06 PST
Comment on attachment 120333 [details]
Patch v2

Attachment 120333 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/11000632
Comment 12 Darin Adler 2011-12-22 10:04:12 PST
Comment on attachment 120333 [details]
Patch v2

View in context: https://bugs.webkit.org/attachment.cgi?id=120333&action=review

> Source/WebCore/dom/EventListenerMap.h:104
> +void EventListenerMap::assertNoActiveIterators() { }

You forgot the inline keyword here. That’s why the EFL builds are failing.
Comment 13 WebKit Review Bot 2011-12-22 10:11:34 PST
Comment on attachment 120333 [details]
Patch v2

Attachment 120333 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/11000636
Comment 14 Gustavo Noronha (kov) 2011-12-22 10:18:50 PST
Comment on attachment 120333 [details]
Patch v2

Attachment 120333 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/10998563
Comment 15 Andreas Kling 2011-12-22 10:38:19 PST
Committed r103556: <http://trac.webkit.org/changeset/103556>