On Microsoft Windows, if the main thread of an application has fibers, they all share the same thread-local storage. Now in JSC::Heap::markRoots(), we call sanitizeStackForVM(), which via offlineasm becomes: sanitizeStackForVMImpl PROC PUBLIC mov rdx, qword ptr [51744 + rcx] cmp rsp, rdx jbe _offlineasm_zeroFillDone xor rax, rax _offlineasm_zeroFillLoop: mov qword ptr [0 + rdx], rax add rdx, 8 cmp rsp, rdx ja _offlineasm_zeroFillLoop _offlineasm_zeroFillDone: mov rdx, rsp mov qword ptr [51744 + rcx], rdx ret sanitizeStackForVMImpl ENDP qword ptr [51744 + rcx] is the value of JSC::VM::m_lastStackTop. If we look at all the reads and writes to JSC::VM::m_lastStackTop, besides the read and write above, we find: WRITES: JSC::JSLock::didAcquireLock(): JSC::JSLock::grabAllLocks(): m_vm->setLastStackTop(wtfThreadData().savedLastStackTop()); JSC::VM::VM(): setLastStackTop(wtfThreadData().stack().origin()); READS: JSC::JSLock::dropAllLocks(): wtfThreadData().setSavedLastStackTop(m_vm->lastStackTop()); So the value of JSC::VM::m_lastStackTop depends on what's in the thread-local storage provided by wtfThreadData(), and the only other write to thread-local storage not covered above is: JSC::initializeThreading(): std::call_once(initializeThreadingOnceFlag, []{ // ... wtfThreadData().setSavedLastStackTop(wtfThreadData().stack().origin()); }); There are 2 problems here. First, JSC::initializeThreading() is only called once for the whole process, so the saved last stack top in thread-local storage will only be for one thread in the process. But the bigger problem is that JSC::JSLock::didAcquireLock() and JSC::JSLock::grabAllLocks() set JSC::VM::m_lastStackTop based on this value from thread-local storage, even if it is the wrong value when JSC::JSLock::didAcquireLock() or JSC::JSLock::grabAllLocks() are called from another fiber! Interestingly, note that JSC::VM::VM() sets JSC::VM::m_lastStackTop based on a fresh call to wtfThreadData().stack().origin(), which already has a fix for fiber environments: const StackBounds& stack() { // We need to always get a fresh StackBounds from the OS due to how fibers work. // See https://bugs.webkit.org/show_bug.cgi?id=102411 #if OS(WINDOWS) m_stackBounds = StackBounds::currentThreadStackBounds(); #endif return m_stackBounds; } but this is insufficient to avoid disaster, because JSC::JSLock::didAcquireLock() and JSC::JSLock::grabAllLocks() overwrite the at-construction-time-correct JSC::VM::m_lastStackTop. Now to explain the result of this problem. The stack on an Intel machine grows toward lower memory addresses, and Windows threads by default have 1MB of virtual memory reserved for them. WTF::StackBounds::origin() points to the next byte beyond the end of this allocation: 000000CC`00000000: guard pages top of stack bottom of stack (e.g. main()) 000000CC`00100000: origin sanitizeStackForVMImpl() cares about only two locations in memory: JSC::VM::m_lastStackTop, which is rdx, and rsp. Now it appears that normally, the first time sanitization runs, rdx is supposed to be at the very bottom of the stack, and rsp will be somewhere in the middle, so cmp rsp, rdx jbe _offlineasm_zeroFillDone executes and we simply update JSC::VM::m_lastStackTop with rsp. The next time sanitization runs, rsp may have moved closer to the bottom of the stack, so rdx < rsp and we do some zero-filling. Now suppose we're on a fiber created from the main thread. JSC::VM::m_lastStackTop will be the origin of the main thread, not the fiber. If the fiber's stack has been reserved in the virtual address space at a lower address than the main thread, and we're running in the fiber, rsp < rdx, so rdx is then updated to the correct rsp and no zero-filling occurs. HOWEVER, if the fiber's stack has been reserved in the virtual address space at a higher address than the main thread, now we have rdx < rsp, which to sanitizeStackForVMImpl() looks like everything between must be zero-filled! I've seen as much as 500MB of difference on Windows 8.1, presumably due to the new high entropy ASLR. This will show up as a crash report with a stack like: JavaScriptCore!sanitizeStackForVMImpl(void)+0xf JavaScriptCore!JSC::Heap::markRoots(double JavaScriptCore!JSC::Heap::collect(JSC::HeapOperation JavaScriptCore!JSC::Heap::collectAllGarbage(void)+0x1c JavaScriptCore!JSC::MarkedAllocator::allocateSlowCase(unsigned JavaScriptCore!JSC::RegExpPrototype::create(class JavaScriptCore!JSC::JSGlobalObject::reset(class JavaScriptCore!JSC::JSGlobalObject::finishCreation(class JavaScriptCore!JSC::JSGlobalObject::create(class JavaScriptCore!JSGlobalContextCreateInGroup(struct ... ... ... kernel32!CreateFiberEx+0x265 if you're lucky, but due to the runaway zeroing of memory, other crashes are possible.
<rdar://problem/19327055>
(In reply to comment #0) ... > So the value of JSC::VM::m_lastStackTop depends on what's in the > thread-local storage provided by wtfThreadData(), and the only other write > to thread-local storage not covered above is: > > JSC::initializeThreading(): > std::call_once(initializeThreadingOnceFlag, []{ > // ... > > wtfThreadData().setSavedLastStackTop(wtfThreadData().stack().origin()); > }); > > There are 2 problems here. First, JSC::initializeThreading() is only called > once for the whole process, so the saved last stack top in thread-local > storage will only be for one thread in the process. Although we only call JSC::initializeThreading() once from the main thread, we do initialize wtfThreadData() for any other thread that accesses WebKit. This is done by the first call to wtfThreadData() for those secondary threads. The initialization of m_savedLastStackTop is done in the default constructor of WTFThreadData() where we set it to stack().origin(). To be sure, I verified this with one of the worker tests that creates a worker on a secondary thread.
(In reply to comment #2) > (In reply to comment #0) > ... > > > So the value of JSC::VM::m_lastStackTop depends on what's in the > > thread-local storage provided by wtfThreadData(), and the only other write > > to thread-local storage not covered above is: > > > > JSC::initializeThreading(): > > std::call_once(initializeThreadingOnceFlag, []{ > > // ... > > > > wtfThreadData().setSavedLastStackTop(wtfThreadData().stack().origin()); > > }); > > > > There are 2 problems here. First, JSC::initializeThreading() is only called > > once for the whole process, so the saved last stack top in thread-local > > storage will only be for one thread in the process. > > Although we only call JSC::initializeThreading() once from the main thread, > we do initialize wtfThreadData() for any other thread that accesses WebKit. > This is done by the first call to wtfThreadData() for those secondary > threads. The initialization of m_savedLastStackTop is done in the default > constructor of WTFThreadData() where we set it to stack().origin(). Yeah, I discovered this this afternoon; see my comment in <rdar://problem/18515908>, which is a similar crash report but with JSC being called on a preemptive secondary thread: 12/22/14, 3:47 PM John Lehner: OK, WTFThreadData::m_savedLastStackTop is at least initialized to StackBounds::origin(), so that's why we don't see an access violation writing to zero. But the zeroing of too much memory by the main thread per #139851 was causing the indexingType() to not match any of the supported cases in JSArray::push() (presumably because m_indexingType got zeroed) and that is strikingly similar to jitArrayModeForClassInfo() not matching any supported cases for classInfo->typedArrayStorageType here.
As was discussed, there isn't a know thread initialization issue for VM::m_lastStackTop. WebKit in general and JavaScriptCore in particular does not support Windows fibers and would require work beyond just manage the saved last stack top to support fibers. *IF* an application uses WebKit and takes advantage of fibers, it is the application's responsibility to make sure that WebKit sees only one and the same fiber per thread. This includes that any fiber that calls into WebKit, including JavaScriptCore must not change what thread the fiber executes in. All other fibers in a thread cannot make WebKit calls.