Bug 129955 - Crash on a stack overflow on 32-bit x86 in http/tests/websocket/tests/hybi/workers/no-onmessage-in-sync-op.html
Summary: Crash on a stack overflow on 32-bit x86 in http/tests/websocket/tests/hybi/wo...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Lam
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2014-03-07 19:31 PST by Mark Lam
Modified: 2014-03-10 21:01 PDT (History)
7 users (show)

See Also:


Attachments
the patch. (1.55 KB, patch)
2014-03-10 19:18 PDT, Mark Lam
ggaren: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Lam 2014-03-07 19:31:04 PST
I ran across the following crash while running the layout tests for a 32-bit x86 build.  The crash is on a Worker thread which has a smaller stack.

Looking at the crash in a debugger, we see that the stack is about 512K in size:

(lldb) p/x $esp
(unsigned int) $0 = 0xb06ad000      // For the bottom frame that crashed on a "call" instruction.
(lldb) fr sel 45
frame #45: 0x99f7f485 libsystem_pthread.dylib`_pthread_start + 130
...
(lldb) p/x $esp
(unsigned int) $1 = 0xb072cfd0       // top-ish frame at the start of the stack.

(lldb) p (0xb072cfd0-0xb06ad000)/1024.0
(double) $5 = 511.953125               // Approximate stack size from start of the stack till the crash point.

The stack trace:
(lldb) bt
* thread #16: tid = 0xa380b6, 0x003a7ea8 JavaScriptCore`JSC::TypeInfo::TypeInfo(this=0x0c4c0014, type=176, inlineTypeFlags='j', outOfLineTypeFlags='?') + 8 at JSTypeInfo.h:69, name = 'WebCore: Worker', stop reason = EXC_BAD_ACCESS (code=2, address=0xb06acffc)
  * frame #0: 0x003a7ea8 JavaScriptCore`JSC::TypeInfo::TypeInfo(this=0x0c4c0014, type=176, inlineTypeFlags='j', outOfLineTypeFlags='?') + 8 at JSTypeInfo.h:69
    frame #1: 0x003a7e56 JavaScriptCore`JSC::TypeInfo::TypeInfo(this=0xb06ad110, type=ObjectType, inlineTypeFlags='\x80', outOfLineTypeFlags='\0') + 70 at JSTypeInfo.h:76
    frame #2: 0x003a7df9 JavaScriptCore`JSC::StructureIDBlob::typeInfo(this=0x0c4dd0a8, outOfLineTypeFlags='\0') const + 89 at StructureIDBlob.h:60
    frame #3: 0x0039ea86 JavaScriptCore`JSC::Structure::typeInfo(this=0x0c4dd0a0) const + 166 at Structure.h:154
    frame #4: 0x003a0f5c JavaScriptCore`JSC::JSCell::JSCell(this=0x0c4ce280, =0x7ab9c400, structure=0x0c4dd0a0) + 92 at JSCellInlines.h:49
    frame #5: 0x003a0ead JavaScriptCore`JSC::JSObject::JSObject(this=0x0c4ce280, vm=0x7ab9c400, structure=0x0c4dd0a0, butterfly=0x00000000) + 61 at JSObject.h:1198
    frame #6: 0x003a0def JavaScriptCore`JSC::JSNonFinalObject::JSNonFinalObject(this=0x0c4ce280, vm=0x7ab9c400, structure=0x0c4dd0a0, butterfly=0x00000000) + 63 at JSObject.h:1012
    frame #7: 0x003d5eff JavaScriptCore`JSC::JSDestructibleObject::JSDestructibleObject(this=0x0c4ce280, vm=0x7ab9c400, structure=0x0c4dd0a0, butterfly=0x00000000) + 79 at JSDestructibleObject.h:24
    frame #8: 0x003d5e98 JavaScriptCore`JSC::JSWrapperObject::JSWrapperObject(this=0x0c4ce280, vm=0x7ab9c400, structure=0x0c4dd0a0) + 72 at JSWrapperObject.h:71
    frame #9: 0x004b3167 JavaScriptCore`JSC::DateInstance::DateInstance(this=0x0c4ce280, vm=0x7ab9c400, structure=0x0c4dd0a0) + 55 at DateInstance.cpp:39
    frame #10: 0x004b3121 JavaScriptCore`JSC::DateInstance::DateInstance(this=0x0c4ce280, vm=0x7ab9c400, structure=0x0c4dd0a0) + 49 at DateInstance.cpp:40
    frame #11: 0x004b202d JavaScriptCore`JSC::DateInstance::create(vm=0x7ab9c400, structure=0x0c4dd0a0, date=1394248664449) + 109 at DateInstance.h:41
    frame #12: 0x004b1160 JavaScriptCore`JSC::constructDate(exec=0xb06ad798, globalObject=0x0c51fa40, args=0xb06ad4f8) + 3136 at DateConstructor.cpp:174
    frame #13: 0x004b1207 JavaScriptCore`JSC::constructWithDateConstructor(exec=0xb06ad798) + 103 at DateConstructor.cpp:180
    frame #14: 0x0080387f JavaScriptCore`handleHostCall(execCallee=0xb06ad798, callee=JSValue at 0xb06ad674, kind=CodeForConstruct) + 1247 at JITOperations.cpp:664
    frame #15: 0x0080418a JavaScriptCore`linkFor(execCallee=0xb06ad798, kind=CodeForConstruct, registers=RegisterPreservationNotRequired) + 202 at JITOperations.cpp:686
    frame #16: 0x007fddc7 JavaScriptCore`operationLinkConstruct(execCallee=0xb06ad798) + 55 at JITOperations.cpp:728
    frame #17: 0x0a254da4
    frame #18: 0x0a35bd70
    frame #19: 0x0094eee4 JavaScriptCore`callToJavaScript + 292
    frame #20: 0x007e8e40 JavaScriptCore`JSC::JITCode::execute(this=0x7a7c3140, vm=0x7ab9c400, protoCallFrame=0xb072c440) + 64 at JITCode.cpp:47
    frame #21: 0x007c5daa JavaScriptCore`JSC::Interpreter::executeCall(this=0x7a7dbf90, callFrame=0x0c51fa6c, function=0x0c54e210, callType=CallTypeJS, callData=0xb072c6d8, thisValue=JSValue at 0xb072c4e4, args=0xb072c5e0) + 1482 at Interpreter.cpp:994
    frame #22: 0x0043148d JavaScriptCore`JSC::call(exec=0x0c51fa6c, functionObject=JSValue at 0xb072c554, callType=CallTypeJS, callData=0xb072c6d8, thisValue=JSValue at 0xb072c564, args=0xb072c5e0) + 253 at CallData.cpp:39
    frame #23: 0x03063404 WebCore`WebCore::JSEventListener::handleEvent(this=0x7a7cf130, scriptExecutionContext=0x7a7e2440, event=0x7a7cd8b0) + 1604 at JSEventListener.cpp:127
    frame #24: 0x0272d5a9 WebCore`WebCore::EventTarget::fireEventListeners(this=0x7a7c96c0, event=0x7a7cd8b0, d=0x7a7c96c4, entry=0x7a7cf150) + 1337 at EventTarget.cpp:246
    frame #25: 0x0272cef1 WebCore`WebCore::EventTarget::fireEventListeners(this=0x7a7c96c0, event=0x7a7cd8b0) + 401 at EventTarget.cpp:197
    frame #26: 0x0272cd29 WebCore`WebCore::EventTarget::dispatchEvent(this=0x7a7c96c0, event=0xb072c970) + 169 at EventTarget.cpp:160
    frame #27: 0x0417d6d5 WebCore`WebCore::WebSocket::didConnect(this=0x7a7c96c0) + 501 at WebSocket.cpp:507
    frame #28: 0x0417d96c WebCore`_ZThn64_N7WebCore9WebSocket10didConnectEv(this=0x7a7c9700) + 28 at WebSocket.cpp:507
    frame #29: 0x0405d5c7 WebCore`WebCore::ThreadableWebSocketChannelClientWrapper::didConnectCallback(context=0x00000000, wrapper=0xb072c9f8) + 151 at ThreadableWebSocketChannelClientWrapper.cpp:245
    frame #30: 0x0405fe1b WebCore`WebCore::CrossThreadTask1<WTF::PassRefPtr<WebCore::ThreadableWebSocketChannelClientWrapper>, WTF::PassRefPtr<WebCore::ThreadableWebSocketChannelClientWrapper> >::performTask(this=0x7a7c1910, context=0x00000000) + 75 at CrossThreadTask.h:80
    frame #31: 0x0405d6d5 WebCore`WebCore::ThreadableWebSocketChannelClientWrapper::processPendingTasks(this=0x7a7cc0c0) + 261 at ThreadableWebSocketChannelClientWrapper.cpp:238
    frame #32: 0x0405d51c WebCore`WebCore::ThreadableWebSocketChannelClientWrapper::didConnect(this=0x7a7cc0c0) + 124 at ThreadableWebSocketChannelClientWrapper.cpp:163
    frame #33: 0x041e6a62 WebCore`WebCore::workerGlobalScopeDidConnect(context=0x7a7e2440, workerClientWrapper=0xb072cb00, subprotocol=0x7b6978cc, extensions=0x7b6978d0) + 194 at WorkerThreadableWebSocketChannel.cpp:265
    frame #34: 0x041ec02c WebCore`WebCore::CrossThreadTask3<WTF::PassRefPtr<WebCore::ThreadableWebSocketChannelClientWrapper>, WTF::PassRefPtr<WebCore::ThreadableWebSocketChannelClientWrapper>, WTF::String, WTF::String const&, WTF::String, WTF::String const&>::performTask(this=0x7b6978c0, context=0x7a7e2440) + 108 at CrossThreadTask.h:145
    frame #35: 0x041d1abd WebCore`WebCore::WorkerRunLoop::Task::performTask(this=0x7b6978e0, runLoop=0x7b3835b0, context=0x7a7e2440) + 141 at WorkerRunLoop.cpp:218
    frame #36: 0x041d1601 WebCore`WebCore::WorkerRunLoop::runInMode(this=0x7b3835b0, context=0x7a7e2440, predicate=0xb072cc50, waitMode=WaitForMessage) + 593 at WorkerRunLoop.cpp:164
    frame #37: 0x041d136c WebCore`WebCore::WorkerRunLoop::run(this=0x7b3835b0, context=0x7a7e2440) + 124 at WorkerRunLoop.cpp:133
    frame #38: 0x041db936 WebCore`WebCore::WorkerThread::runEventLoop(this=0x7b3835a0) + 54 at WorkerThread.cpp:205
    frame #39: 0x024a63c6 WebCore`WebCore::DedicatedWorkerThread::runEventLoop(this=0x7b3835a0) + 86 at DedicatedWorkerThread.cpp:65
    frame #40: 0x041db833 WebCore`WebCore::WorkerThread::workerThread(this=0x7b3835a0) + 1267 at WorkerThread.cpp:185
    frame #41: 0x041db337 WebCore`WebCore::WorkerThread::workerThreadStart(thread=0x7b3835a0) + 23 at WorkerThread.cpp:154
    frame #42: 0x00bd4455 JavaScriptCore`WTF::threadEntryPoint(contextData=0x7b38a210) + 133 at Threading.cpp:67
    frame #43: 0x00bd4ee1 JavaScriptCore`WTF::wtfThreadEntryPoint(param=0x7b38db30) + 193 at ThreadingPthreads.cpp:168
    frame #44: 0x99f7f5fb libsystem_pthread.dylib`_pthread_body + 144
    frame #45: 0x99f7f485 libsystem_pthread.dylib`_pthread_start + 130
    frame #46: 0x99f84cf2 libsystem_pthread.dylib`thread_start + 34

Comparing the stack pointer for frame 16 (operationLinkConstruct()) vs the crash frame:

(lldb) fr sel 16
frame #16: 0x007fddc7 JavaScriptCore`operationLinkConstruct(execCallee=0xb06ad798) + 55 at JITOperations.cpp:728
(lldb) p/x $esp
(unsigned int) $6 = 0xb06ad750
(lldb) p (0xb06ad750-0xb06ad000)/1024.0
(double) $7 = 1.828125

This should never be because the reserveZoneSize should be a lot larger than that.  There could be a bug in the stack checking code on 32-bit x86 or all 32-bit code.
Comment 1 Mark Lam 2014-03-09 00:20:41 PST
The test does not fail if I only run the LLINT or baseline JIT.  Hence, the issue is in the DFG.  Still investigating.
Comment 2 Mark Lam 2014-03-09 00:32:42 PST
<rdar://problem/15784697>
Comment 3 Mark Lam 2014-03-09 03:08:55 PDT
Looking through the frames in the stack trace, I see a sudden and huge jump in the stack pointer value going from frame 19 to frame 18.  At frame 19, $esp is sane and within expected bounds.  At frame 18, it is now near the end of the stack, and have exceeded the VM stackLimit.

Further investigation reveals that frame 18 is a DFG function with its  Block 1 being an OSR entry block.  If I   disable OSR entry into DFG functions, the crash does not manifest.

Next step, look at the stack adjustment code for OSR entry on the 32-bit port and see if it is at parity with the 64-bit port.
Comment 4 Mark Lam 2014-03-10 13:26:39 PDT
Bit of interesting detail:
1. The 2 piece of JIT code between callToJavaScript() and operationLinkConstruct() consists of:
    (1) a DFG function that we OSR'ed into
    (2) the linkFor() thunk.

2. The SP in the thunk frame (which is before operationConstruct()) is always 64 bytes from the VM::m_stackLimit.

What is surprising is that whatever value I reduce VM::m_stackLimit to, the thunk's SP will respond correspondingly to VM::m_stackLimit + 64.  Similarly, the DFG's SP is also set to a value near the stackLimit but not exceeding it even when I run with different the stackLimit values.  This suggests that the stack pointer value of the DFG function that we OSR enter'ed was erroneously set to something based on the VM stackLimit.
Comment 5 Mark Lam 2014-03-10 19:09:41 PDT
The culprit turned out to be getHostCallReturnValue() which is leaking 16 bytes of stack space every time it is called.  Fix coming.
Comment 6 Mark Lam 2014-03-10 19:18:40 PDT
Created attachment 226368 [details]
the patch.
Comment 7 Geoffrey Garen 2014-03-10 20:30:17 PDT
Comment on attachment 226368 [details]
the patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=226368&action=review

r=me

> Source/JavaScriptCore/ChangeLog:8
> +        The 32-bit x86 version of getHostCallReturnValue() was leaking 16 bytes

Was it 16? I count 12:

push %ebp // 4
leal -4(%esp), %esp // 4
push %ebp // 4

> Source/JavaScriptCore/ChangeLog:9
> +        stack memory every time it is called.  This is now fixed.

s/is called/was called/
Comment 8 Mark Lam 2014-03-10 20:32:42 PDT
(In reply to comment #7)
> > Source/JavaScriptCore/ChangeLog:8
> > +        The 32-bit x86 version of getHostCallReturnValue() was leaking 16 bytes
> 
> Was it 16? I count 12:
> 
> push %ebp // 4
> leal -4(%esp), %esp // 4
> push %ebp // 4

There was a return address already on the stack before the first "push %ebp".  That's what was copied into %eax, which I've removed.

> > Source/JavaScriptCore/ChangeLog:9
> > +        stack memory every time it is called.  This is now fixed.
> 
> s/is called/was called/

Will fix.
Comment 9 Geoffrey Garen 2014-03-10 20:35:37 PDT
> There was a return address already on the stack before the first "push %ebp".

Sure, but the return address wouldn't leak: we'd jump to getHostCallReturnValueWithExecState, which would ret, popping the return address. Right?

(I'm wonder because, if the return address was somehow leaking before, I don't think this patch fixes it.)
Comment 10 Mark Lam 2014-03-10 20:44:35 PDT
Please take a look at the patch again.  Here's the code before my change:

    "mov (%esp), %eax\n"     // 1st 4 bytes: ret addr was already on the stack.
    "push %ebp\n"               // 2nd 4 bytes
    "leal -4(%esp), %esp\n"  // 3rd 4 bytes
    "push %ebp\n"               // 4th 4 bytes
    "push %eax\n"               // new copy of ret addr popped when getHostCallReturnValueWithExecState returns.
    "jmp " LOCAL_REFERENCE(getHostCallReturnValueWithExecState) "\n"
Comment 11 Mark Lam 2014-03-10 21:01:27 PDT
Thanks for the review.  Landed in r165426: <http://trac.webkit.org/r165426>.