Bug 139851 - JSC::VM::m_lastStackTop initialized incorrectly when on non-main fiber; runaway memory corruption
Summary: JSC::VM::m_lastStackTop initialized incorrectly when on non-main fiber; runaw...
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC Windows 8
: P2 Critical
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2014-12-20 02:52 PST by John N. Lehner
Modified: 2015-01-05 11:59 PST (History)
9 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description John N. Lehner 2014-12-20 02:52:53 PST
On Microsoft Windows, if the main thread of an application has fibers, they all share the same thread-local storage.

Now in JSC::Heap::markRoots(), we call sanitizeStackForVM(), which via offlineasm becomes:

sanitizeStackForVMImpl PROC PUBLIC
    mov rdx, qword ptr [51744 + rcx]                         
    cmp rsp, rdx                                             
    jbe _offlineasm_zeroFillDone
    xor rax, rax                                             

  _offlineasm_zeroFillLoop:
    mov qword ptr [0 + rdx], rax                             
    add rdx, 8                                               
    cmp rsp, rdx                                             
    ja _offlineasm_zeroFillLoop

  _offlineasm_zeroFillDone:
    mov rdx, rsp                                             
    mov qword ptr [51744 + rcx], rdx                         
    ret                                                      
sanitizeStackForVMImpl ENDP

qword ptr [51744 + rcx] is the value of JSC::VM::m_lastStackTop.

If we look at all the reads and writes to JSC::VM::m_lastStackTop, besides the read and write above, we find:

WRITES:

JSC::JSLock::didAcquireLock():
JSC::JSLock::grabAllLocks():
	m_vm->setLastStackTop(wtfThreadData().savedLastStackTop());

JSC::VM::VM():
	setLastStackTop(wtfThreadData().stack().origin());

READS:
	
JSC::JSLock::dropAllLocks():
	wtfThreadData().setSavedLastStackTop(m_vm->lastStackTop());

So the value of JSC::VM::m_lastStackTop depends on what's in the thread-local storage provided by wtfThreadData(), and the only other write to thread-local storage not covered above is:

JSC::initializeThreading():
	std::call_once(initializeThreadingOnceFlag, []{
		// ...
        wtfThreadData().setSavedLastStackTop(wtfThreadData().stack().origin());
    });

There are 2 problems here.  First, JSC::initializeThreading() is only called once for the whole process, so the saved last stack top in thread-local storage will only be for one thread in the process.

But the bigger problem is that JSC::JSLock::didAcquireLock() and JSC::JSLock::grabAllLocks() set JSC::VM::m_lastStackTop based on this value from thread-local storage, even if it is the wrong value when JSC::JSLock::didAcquireLock() or JSC::JSLock::grabAllLocks() are called from another fiber!

Interestingly, note that JSC::VM::VM() sets JSC::VM::m_lastStackTop based on a fresh call to wtfThreadData().stack().origin(), which already has a fix for fiber environments:

 const StackBounds& stack()
    {
        // We need to always get a fresh StackBounds from the OS due to how fibers work.
        // See https://bugs.webkit.org/show_bug.cgi?id=102411
#if OS(WINDOWS)
        m_stackBounds = StackBounds::currentThreadStackBounds();
#endif
        return m_stackBounds;
    }

but this is insufficient to avoid disaster, because JSC::JSLock::didAcquireLock() and JSC::JSLock::grabAllLocks() overwrite the at-construction-time-correct JSC::VM::m_lastStackTop.

Now to explain the result of this problem.  The stack on an Intel machine grows toward lower memory addresses, and Windows threads by default have 1MB of virtual memory reserved for them.  WTF::StackBounds::origin() points to the next byte beyond the end of this allocation:

000000CC`00000000:
guard pages

top of stack

bottom of stack (e.g. main())
000000CC`00100000: origin

sanitizeStackForVMImpl() cares about only two locations in memory: JSC::VM::m_lastStackTop, which is rdx, and rsp. Now it appears that normally, the first time sanitization runs, rdx is supposed to be at the very bottom of the stack, and rsp will be somewhere in the middle, so 

cmp rsp, rdx                                             
jbe _offlineasm_zeroFillDone

executes and we simply update JSC::VM::m_lastStackTop with rsp.  The next time sanitization runs, rsp may have moved closer to the bottom of the stack, so rdx < rsp and we do some zero-filling.

Now suppose we're on a fiber created from the main thread.  JSC::VM::m_lastStackTop will be the origin of the main thread, not the fiber.  If the fiber's stack has been reserved in the virtual address space at a lower address than the main thread, and we're running in the fiber, rsp < rdx, so rdx is then updated to the correct rsp and no zero-filling occurs.

HOWEVER, if the fiber's stack has been reserved in the virtual address space at a higher address than the main thread, now we have rdx < rsp, which to sanitizeStackForVMImpl() looks like everything between must be zero-filled!  I've seen as much as 500MB of difference on Windows 8.1, presumably due to the new high entropy ASLR.

This will show up as a crash report with a stack like:

JavaScriptCore!sanitizeStackForVMImpl(void)+0xf
JavaScriptCore!JSC::Heap::markRoots(double
JavaScriptCore!JSC::Heap::collect(JSC::HeapOperation
JavaScriptCore!JSC::Heap::collectAllGarbage(void)+0x1c
JavaScriptCore!JSC::MarkedAllocator::allocateSlowCase(unsigned
JavaScriptCore!JSC::RegExpPrototype::create(class
JavaScriptCore!JSC::JSGlobalObject::reset(class
JavaScriptCore!JSC::JSGlobalObject::finishCreation(class
JavaScriptCore!JSC::JSGlobalObject::create(class
JavaScriptCore!JSGlobalContextCreateInGroup(struct
...
...
...
kernel32!CreateFiberEx+0x265

if you're lucky, but due to the runaway zeroing of memory, other crashes are possible.
Comment 1 Radar WebKit Bug Importer 2014-12-22 11:06:58 PST
<rdar://problem/19327055>
Comment 2 Michael Saboff 2014-12-22 17:26:27 PST
(In reply to comment #0)
...

> So the value of JSC::VM::m_lastStackTop depends on what's in the
> thread-local storage provided by wtfThreadData(), and the only other write
> to thread-local storage not covered above is:
> 
> JSC::initializeThreading():
> 	std::call_once(initializeThreadingOnceFlag, []{
> 		// ...
>        
> wtfThreadData().setSavedLastStackTop(wtfThreadData().stack().origin());
>     });
> 
> There are 2 problems here.  First, JSC::initializeThreading() is only called
> once for the whole process, so the saved last stack top in thread-local
> storage will only be for one thread in the process.

Although we only call JSC::initializeThreading() once from the main thread, we do initialize wtfThreadData() for any other thread that accesses WebKit.  This is done by the first call to wtfThreadData() for those secondary threads.  The initialization of m_savedLastStackTop is done in the default constructor of WTFThreadData() where we set it to stack().origin().

To be sure, I verified this with one of the worker tests that creates a worker on a secondary thread.
Comment 3 John N. Lehner 2014-12-22 18:00:10 PST
(In reply to comment #2)
> (In reply to comment #0)
> ...
> 
> > So the value of JSC::VM::m_lastStackTop depends on what's in the
> > thread-local storage provided by wtfThreadData(), and the only other write
> > to thread-local storage not covered above is:
> > 
> > JSC::initializeThreading():
> > 	std::call_once(initializeThreadingOnceFlag, []{
> > 		// ...
> >        
> > wtfThreadData().setSavedLastStackTop(wtfThreadData().stack().origin());
> >     });
> > 
> > There are 2 problems here.  First, JSC::initializeThreading() is only called
> > once for the whole process, so the saved last stack top in thread-local
> > storage will only be for one thread in the process.
> 
> Although we only call JSC::initializeThreading() once from the main thread,
> we do initialize wtfThreadData() for any other thread that accesses WebKit. 
> This is done by the first call to wtfThreadData() for those secondary
> threads.  The initialization of m_savedLastStackTop is done in the default
> constructor of WTFThreadData() where we set it to stack().origin().

Yeah, I discovered this this afternoon; see my comment in <rdar://problem/18515908>, which is a similar crash report but with JSC being called on a preemptive secondary thread:

12/22/14, 3:47 PM John Lehner:
OK, WTFThreadData::m_savedLastStackTop is at least initialized to StackBounds::origin(), so that's why we don't see an access violation writing to zero.  But the zeroing of too much memory by the main thread per #139851 was causing the indexingType() to not match any of the supported cases in JSArray::push() (presumably because m_indexingType got zeroed) and that is strikingly similar to jitArrayModeForClassInfo() not matching any supported cases for classInfo->typedArrayStorageType here.
Comment 4 Michael Saboff 2015-01-05 11:59:48 PST
As was discussed, there isn't a know thread initialization issue for VM::m_lastStackTop.

WebKit in general and JavaScriptCore in particular does not support Windows fibers and would require work beyond just manage the saved last stack top to support fibers.

*IF* an application uses WebKit and takes advantage of fibers, it is the application's responsibility to make sure that WebKit sees only one and the same fiber per thread.  This includes that any fiber that calls into WebKit, including JavaScriptCore must not change what thread the fiber executes in.  All other fibers in a thread cannot make WebKit calls.