Bug 32689 - Fix the leak of ThreadIdentifiers in threadMap across threads.
Summary: Fix the leak of ThreadIdentifiers in threadMap across threads.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 31639
  Show dependency treegraph
 
Reported: 2009-12-17 18:10 PST by Dmitry Titov
Modified: 2010-09-30 15:56 PDT (History)
6 users (show)

See Also:


Attachments
Proposed patch. (23.37 KB, patch)
2009-12-17 18:47 PST, Dmitry Titov
mjs: review-
dimich: commit-queue-
Details | Formatted Diff | Diff
Patch with test. (26.50 KB, patch)
2010-01-05 18:00 PST, Dmitry Titov
dimich: commit-queue-
Details | Formatted Diff | Diff
Patch with test and style fix. (26.62 KB, patch)
2010-01-06 13:46 PST, Dmitry Titov
dimich: commit-queue-
Details | Formatted Diff | Diff
Updated patch. (26.63 KB, patch)
2010-01-06 15:52 PST, Dmitry Titov
mjs: review+
dimich: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dmitry Titov 2009-12-17 18:10:46 PST
Working on adding asserts to RefCounted, I've added more calls to WTF::currentThread() around. This made a problem we had before more pronounced and caused crashes.

Currently, thread identifiers in Pthread-based ports are stored in a threadMap which maps them to OS-provided thread handles. WTF::currentThread() automatically generates a new ThreadIdentifier if the current thread does not yet have a map entry. Unfortunately, some OSes (OSX at least, see test in the discussion for bug 25348) reuse thread handles as soon as previous thread terminates. If the thread does not call WTF::detachThread() which removes the map entry, the new thread will have the same ThreadIdentifier which makes comparing them bug-prone. This also causes existing assertion when a new thread is created by WTF::createThread and finds out there is already ThreadIdentifier from the previously existing non-WTF thread.

In addition, current WTF::detachThread(ThreadIdentifier) is often called from another thread, which can happen after a thread with given ThreadIdentifier already finished and a new one created, with the same OS handle. If the ThreadIdentifier leaks in threadMap, it may cause detachThread to 'detach' the new thread instead of old one.

Proposed modification is to add a thread-specific memory slot to store ThreadIdentifier and remove the ThreadIdentifier from the threadMap in its thread-specific destructor call. Since other thread-specific destructors can invoke 'currentThread()' and re-establish the identifier, trigger the second pass of the destructor which happens after all regular destructors already were called. This way, the threads won't leave their ThreadIdentifiers in the map.

Other ports info: Windows threading does not have this issue because it uses unique ThreadIdentifiers supplied by OS. Chromium uses Pthreads or Windows threading so it's covered. So it leaves GTK and Qt ports which also use threadMap, but I don't know GTK not Qt API well enough to provide a fix, and hope the folks who maintain them will be able to see what, if anything, should be done there. For example, if the respected OSes do not reuse thread handles then there is no problem.

This patch will enable another attempt to fix bug 31639.
Comment 1 Dmitry Titov 2009-12-17 18:47:33 PST
Created attachment 45119 [details]
Proposed patch.
Comment 2 Dmitry Titov 2009-12-19 18:41:53 PST
One more comment on the patch: it might look that removing threadMap mapping in thread-specific destructor will cause calls like detachThread(threadID) or waitForThreadCompletion(threadID) made from other threads fail if they happen to be made just after the tread identified by threadID terminates. From what I see in debugger, Safari that uses WTF to start threads can realize this call pattern.
This is ok, actually.  Since resolving threadID via map will return 0 pthread handle, the pthread methods will simply return error and do nothing.
Comment 3 Maciej Stachowiak 2009-12-29 04:37:20 PST
Comment on attachment 45119 [details]
Proposed patch.

The code change and your explanation look good to me, but is it possible to provide a test case? It's ok if the test case only fails in debug builds with the relevant asserts enabled. If it's not possible to create a test case, please explain why in the ChangeLog.
Comment 4 Dmitry Titov 2010-01-05 18:00:07 PST
Created attachment 45951 [details]
Patch with test.

Same change, with a test.

See DumRenderTree.mm and WebKitTools/ChangeLog for the description of the test. Basically, it starts couple of trivial threads that use WTF::currentThread() method. Without the fix, there are at least 2 ASSERTS on the way of DRT to completion.
It simulates scenarios when embedders like Safari run threads not created by WTF that call into JSC/WebKit code which can use currentThread(), or  DRT running with '--threaded', to name a couple.
Comment 5 WebKit Review Bot 2010-01-05 18:12:28 PST
Attachment 45951 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
JavaScriptCore/wtf/ThreadIdentifierDataPthreads.cpp:35:  You should add a blank line after implementation file's own header.  [build/include_order] [4]
JavaScriptCore/wtf/ThreadIdentifierDataPthreads.cpp:52:  this_ptr is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
JavaScriptCore/wtf/ThreadIdentifierDataPthreads.cpp:67:  this_ptr is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
Total errors found: 3
Comment 6 Dmitry Titov 2010-01-06 13:46:11 PST
Created attachment 45988 [details]
Patch with test and style fix.

Fixed style issue noticed by style bot.
Comment 7 WebKit Review Bot 2010-01-06 13:48:05 PST
Attachment 45988 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
JavaScriptCore/wtf/ThreadIdentifierDataPthreads.cpp:35:  You should add a blank line after implementation file's own header.  [build/include_order] [4]
Total errors found: 1
Comment 8 Dmitry Titov 2010-01-06 15:52:36 PST
Created attachment 46004 [details]
Updated patch.

I should have used check-webkit-style before... Updated patch.
Comment 9 WebKit Review Bot 2010-01-06 15:55:29 PST
style-queue ran check-webkit-style on attachment 46004 [details] without any errors.
Comment 10 Eric Seidel (no email) 2010-01-06 16:04:56 PST
(In reply to comment #8)
> Created an attachment (id=46004) [details]
> Updated patch.
> 
> I should have used check-webkit-style before... Updated patch.

I've filed bug 33275 on your behalf. :)
Comment 11 Maciej Stachowiak 2010-01-22 01:53:57 PST
Comment on attachment 46004 [details]
Updated patch.

r=me

I was hoping for a test that shows what goes wrong in actual browsing if this bug is not fixed, but the unit test you added seems fine too.
Comment 12 Dmitry Titov 2010-01-22 13:28:38 PST
(In reply to comment #11)
> (From update of attachment 46004 [details])
> r=me
> 
> I was hoping for a test that shows what goes wrong in actual browsing if this
> bug is not fixed, but the unit test you added seems fine too.

Today, it doesn't fail in browsing (at least I don't have a repro case) but it prevents adding more ASSERTS with CurrentThread() verification (for example, a patch for bug 31639 is blocked by this). 

Thanks a lot for review!
Comment 13 Dmitry Titov 2010-01-22 14:03:22 PST
Landed: http://trac.webkit.org/changeset/53714
Comment 14 Mark Rowe (bdash) 2010-01-23 14:59:33 PST
This appears to have caused Tiger nightly builds to crash after loading a page.
Comment 15 Mark Rowe (bdash) 2010-01-23 14:59:52 PST
Please see bug 34039.
Comment 16 Eric Seidel (no email) 2010-01-24 22:25:29 PST
Seems we should roll out this change then.  The only reason I haven't already is that the Tiger bug is not confirmed.
Comment 17 Eric Seidel (no email) 2010-01-25 16:06:09 PST
Bug 34039 has been confirmed.  I think this should be rolled out.
Comment 18 Eric Seidel (no email) 2010-01-25 16:12:33 PST
Alexey reports the regression as being fixed in r53815.
Comment 19 Martin Robinson 2010-09-30 15:56:40 PDT
The GTK+ port seems to be hitting a lot of ASSERTs that look like this:

ASSERTION FAILED: m_thread == currentThread()

Is this related to this bug at all? Perhaps we are still missing some code? I'm happy to implement it.