Bug 156647

Summary: iTunes crashing JavaScriptCore.dll
Product: WebKit Reporter: Michael Saboff <msaboff>
Component: JavaScriptCoreAssignee: Michael Saboff <msaboff>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, benjamin, bfulgham, cdumez, cmarcelo, commit-queue, darin, ggaren, keith_miller, mark.lam, nehaganoda, ossy, saam
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 156708, 156748    
Bug Blocks:    
Attachments:
Description Flags
Patch
ggaren: review+
Patch - Fixed cast issue I didn't see with VS 2015 Express
none
Patch - Fixed the other cast that the EWS bot didn't catch the first time
darin: review+
Updated Patch - Reduced number of ThreadSpecific values in JSC
none
Removed tab characters
saam: review+
Patch with fixes for 32 bit Windows fpizlo: review+

Michael Saboff
Reported 2016-04-15 15:58:30 PDT
For Windows we create thread specific API built upon the TLS APIs in Windows. The thread specific key code for windows does not call any destructor when the key is destroyed. Instead the thread specific key destructors are called from the function ThreadSpecificThreadExit() and it is called from wtfThreadEntryPoint() right before the thread exits. The problem here is that wtfThreadEntryPoint() is only called for threads created by WTF typically via JavaScriptCore or WebKit. The crash happens when a thread created by some other means calls into JavaScript and gets added to the list of threads that MachineThread tracks AND that thread later exits. In that case, the thread specific key destructor MachineThread::removeThread() will never be called by the "foreign" thread. MachineThread tries to manage a thread, including walking its stack, but the thread no longer exists. The fix is to change the WTF ThreadSpecific wrappers to use FLS, which does of a callback when the fiber exits. Since we don't support multiple fibers, there will be only one fiber per thread. This makes FLS work just like TLS for our purposes.
Attachments
Patch (10.69 KB, patch)
2016-04-15 16:31 PDT, Michael Saboff
ggaren: review+
Patch - Fixed cast issue I didn't see with VS 2015 Express (10.69 KB, patch)
2016-04-15 17:10 PDT, Michael Saboff
no flags
Patch - Fixed the other cast that the EWS bot didn't catch the first time (10.73 KB, patch)
2016-04-15 17:37 PDT, Michael Saboff
darin: review+
Updated Patch - Reduced number of ThreadSpecific values in JSC (16.73 KB, patch)
2016-04-18 21:48 PDT, Michael Saboff
no flags
Removed tab characters (16.76 KB, patch)
2016-04-18 21:54 PDT, Michael Saboff
saam: review+
Patch with fixes for 32 bit Windows (19.11 KB, patch)
2016-04-19 18:15 PDT, Michael Saboff
fpizlo: review+
Michael Saboff
Comment 1 2016-04-15 15:58:49 PDT
Michael Saboff
Comment 2 2016-04-15 16:31:54 PDT
Geoffrey Garen
Comment 3 2016-04-15 16:42:02 PDT
Comment on attachment 276522 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=276522&action=review > Source/WTF/ChangeLog:21 > + The fix is to change from using TLS to using FLS (Fiber Local Storage). Although > + Windows allows multiple fibers per thread, WebKit is not designed to work with a > + multiple fibers per thread. When ther is only one fiber per thread, FLS works just > + like TLS, but it has the destroy callback. Let's double-check that our clients only use one fiber per thread.
Geoffrey Garen
Comment 4 2016-04-15 16:47:06 PDT
Michael Saboff
Comment 5 2016-04-15 17:10:56 PDT
Created attachment 276528 [details] Patch - Fixed cast issue I didn't see with VS 2015 Express
Michael Saboff
Comment 6 2016-04-15 17:28:06 PDT
(In reply to comment #3) > Comment on attachment 276522 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=276522&action=review > > > Source/WTF/ChangeLog:21 > > + The fix is to change from using TLS to using FLS (Fiber Local Storage). Although > > + Windows allows multiple fibers per thread, WebKit is not designed to work with a > > + multiple fibers per thread. When ther is only one fiber per thread, FLS works just > > + like TLS, but it has the destroy callback. > > Let's double-check that our clients only use one fiber per thread. I check with the iTunes team and they only use one fiber per thread.
Michael Saboff
Comment 7 2016-04-15 17:37:47 PDT
Created attachment 276531 [details] Patch - Fixed the other cast that the EWS bot didn't catch the first time
Michael Saboff
Comment 8 2016-04-15 21:52:04 PDT
Darin Adler
Comment 9 2016-04-15 22:35:52 PDT
Comment on attachment 276531 [details] Patch - Fixed the other cast that the EWS bot didn't catch the first time View in context: https://bugs.webkit.org/attachment.cgi?id=276531&action=review > Source/WTF/wtf/ThreadSpecific.h:170 > + DWORD flsKey = FlsAlloc(reinterpret_cast<PFLS_CALLBACK_FUNCTION>(destructor)); Is this reinterpret_cast safe? If so, why? > Source/WTF/wtf/ThreadSpecific.h:196 > + DWORD flsKey = FlsAlloc(reinterpret_cast<PFLS_CALLBACK_FUNCTION>(destroy)); Same question.
Darin Adler
Comment 10 2016-04-15 22:37:59 PDT
Apparently Geoff already reviewed this, but it was still set to review? for some reason.
Csaba Osztrogonác
Comment 11 2016-04-18 08:23:23 PDT
WebKit Commit Bot
Comment 12 2016-04-18 10:23:30 PDT
Re-opened since this is blocked by bug 156708
Michael Saboff
Comment 13 2016-04-18 12:23:43 PDT
It appears that at least some of the test crashes are due to running out of FLS indices. We require one for every ThreadSpecificKey and according to online documentation, it looks like there are only 128 per process. I'm looking into how we can reduce the number of keys to fix this issue. There may be other reasons for the test timeouts / crashes. I'll check after trying this first fix.
Michael Saboff
Comment 14 2016-04-18 21:48:56 PDT
Created attachment 276691 [details] Updated Patch - Reduced number of ThreadSpecific values in JSC I fixed the running out of FLS keys crash with this patch, however I wasn't able to reproduce all of the crashes seen on the build bots.
WebKit Commit Bot
Comment 15 2016-04-18 21:49:52 PDT
Attachment 276691 [details] did not pass style-queue: ERROR: LayoutTests/ChangeLog:8: Line contains tab character. [whitespace/tab] [5] ERROR: LayoutTests/ChangeLog:9: Line contains tab character. [whitespace/tab] [5] ERROR: LayoutTests/ChangeLog:10: Line contains tab character. [whitespace/tab] [5] ERROR: LayoutTests/ChangeLog:11: Line contains tab character. [whitespace/tab] [5] WARNING: Not running on native Windows. Total errors found: 4 in 7 files If any of these errors are false positives, please file a bug against check-webkit-style.
Michael Saboff
Comment 16 2016-04-18 21:54:32 PDT
Created attachment 276692 [details] Removed tab characters
Saam Barati
Comment 17 2016-04-18 23:32:58 PDT
Comment on attachment 276692 [details] Removed tab characters View in context: https://bugs.webkit.org/attachment.cgi?id=276692&action=review r=me > Source/WTF/ChangeLog:14 > + This mechanism was put in place for Windos because we layer the WTF::ThreadSpecific Typo: Windows > Source/WTF/ChangeLog:20 > + multiple fibers per thread. When ther is only one fiber per thread, FLS works just Typo: there
Michael Saboff
Comment 18 2016-04-19 07:11:01 PDT
WebKit Commit Bot
Comment 19 2016-04-19 09:24:54 PDT
Re-opened since this is blocked by bug 156748
Michael Saboff
Comment 20 2016-04-19 09:31:29 PDT
The WebKit bots build and test 32 bits. I had been building and testing 64 bits. Rolled out the latest change. Building 32 bits now and will test after that is done.
Michael Saboff
Comment 21 2016-04-19 18:15:43 PDT
Created attachment 276786 [details] Patch with fixes for 32 bit Windows
Mark Lam
Comment 22 2016-04-19 19:59:18 PDT
Comment on attachment 276786 [details] Patch with fixes for 32 bit Windows View in context: https://bugs.webkit.org/attachment.cgi?id=276786&action=review > Source/WTF/ChangeLog:14 > + This mechanism was put in place for Windos because we layer the WTF::ThreadSpecific still have typo: /Windos/Windows/. > Source/WTF/ChangeLog:16 > + a thread exiting callback the way that pthread_create_key. I think you meant to say "that pthread_create_key has." > Source/WTF/ChangeLog:20 > + multiple fibers per thread. When ther is only one fiber per thread, FLS works just typo: /ther/there/.
Michael Saboff
Comment 23 2016-04-19 22:13:10 PDT
Note You need to log in before you can comment on or make changes to this bug.