Summary: | Crash on a stack overflow on 32-bit x86 in http/tests/websocket/tests/hybi/workers/no-onmessage-in-sync-op.html | ||||||
---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Mark Lam <mark.lam> | ||||
Component: | JavaScriptCore | Assignee: | Mark Lam <mark.lam> | ||||
Status: | RESOLVED FIXED | ||||||
Severity: | Normal | CC: | fpizlo, ggaren, mhahnenberg, mmirman, msaboff, oliver, webkit-bug-importer | ||||
Priority: | P2 | Keywords: | InRadar | ||||
Version: | 528+ (Nightly build) | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Attachments: |
|
Description
Mark Lam
2014-03-07 19:31:04 PST
The test does not fail if I only run the LLINT or baseline JIT. Hence, the issue is in the DFG. Still investigating. Looking through the frames in the stack trace, I see a sudden and huge jump in the stack pointer value going from frame 19 to frame 18. At frame 19, $esp is sane and within expected bounds. At frame 18, it is now near the end of the stack, and have exceeded the VM stackLimit. Further investigation reveals that frame 18 is a DFG function with its Block 1 being an OSR entry block. If I disable OSR entry into DFG functions, the crash does not manifest. Next step, look at the stack adjustment code for OSR entry on the 32-bit port and see if it is at parity with the 64-bit port. Bit of interesting detail: 1. The 2 piece of JIT code between callToJavaScript() and operationLinkConstruct() consists of: (1) a DFG function that we OSR'ed into (2) the linkFor() thunk. 2. The SP in the thunk frame (which is before operationConstruct()) is always 64 bytes from the VM::m_stackLimit. What is surprising is that whatever value I reduce VM::m_stackLimit to, the thunk's SP will respond correspondingly to VM::m_stackLimit + 64. Similarly, the DFG's SP is also set to a value near the stackLimit but not exceeding it even when I run with different the stackLimit values. This suggests that the stack pointer value of the DFG function that we OSR enter'ed was erroneously set to something based on the VM stackLimit. The culprit turned out to be getHostCallReturnValue() which is leaking 16 bytes of stack space every time it is called. Fix coming. Created attachment 226368 [details]
the patch.
Comment on attachment 226368 [details] the patch. View in context: https://bugs.webkit.org/attachment.cgi?id=226368&action=review r=me > Source/JavaScriptCore/ChangeLog:8 > + The 32-bit x86 version of getHostCallReturnValue() was leaking 16 bytes Was it 16? I count 12: push %ebp // 4 leal -4(%esp), %esp // 4 push %ebp // 4 > Source/JavaScriptCore/ChangeLog:9 > + stack memory every time it is called. This is now fixed. s/is called/was called/ (In reply to comment #7) > > Source/JavaScriptCore/ChangeLog:8 > > + The 32-bit x86 version of getHostCallReturnValue() was leaking 16 bytes > > Was it 16? I count 12: > > push %ebp // 4 > leal -4(%esp), %esp // 4 > push %ebp // 4 There was a return address already on the stack before the first "push %ebp". That's what was copied into %eax, which I've removed. > > Source/JavaScriptCore/ChangeLog:9 > > + stack memory every time it is called. This is now fixed. > > s/is called/was called/ Will fix. > There was a return address already on the stack before the first "push %ebp".
Sure, but the return address wouldn't leak: we'd jump to getHostCallReturnValueWithExecState, which would ret, popping the return address. Right?
(I'm wonder because, if the return address was somehow leaking before, I don't think this patch fixes it.)
Please take a look at the patch again. Here's the code before my change: "mov (%esp), %eax\n" // 1st 4 bytes: ret addr was already on the stack. "push %ebp\n" // 2nd 4 bytes "leal -4(%esp), %esp\n" // 3rd 4 bytes "push %ebp\n" // 4th 4 bytes "push %eax\n" // new copy of ret addr popped when getHostCallReturnValueWithExecState returns. "jmp " LOCAL_REFERENCE(getHostCallReturnValueWithExecState) "\n" Thanks for the review. Landed in r165426: <http://trac.webkit.org/r165426>. |