Since r173031, 200+ crashes seen when performing run-layout-jsc on x86 32-bit/Linux/gcc 4.7.3
Created attachment 237463 [details] run-layout-jsc results (tested on r173161)
Created attachment 237465 [details] Workaround proposal It appears that in my env (Ubuntu Linux x86 32-bit / gcc 4.7.3): execCallee->setScope(exec->scope()); // exec pointer is valid at this point execCallee->setCodeBlock(0); // exec pointer is null now execCallee->setCallerFrame(exec); // setting null caller frame
With this patch, run-layout-jsc is running fine again: All 523 tests passed!
Could you provide a disassembly of operationCallEval? That might help understanding why exec is overwritten. I'm just curious.
(In reply to comment #4) > Could you provide a disassembly of operationCallEval? That might help understanding why exec is overwritten. I'm just curious. Sure. Here is the disassembly, taken from Source/JavaScriptCore/CMakeFiles/JavaScriptCore.dir/jit/JITOperations.cpp.o file, in release mode with -g: 00003233 <operationCallEval>: EncodedJSValue JIT_OPERATION operationCallEval(ExecState* exec, ExecState* execCallee) { 3233: 55 push %ebp 3234: 89 e5 mov %esp,%ebp 3236: 53 push %ebx 3237: 83 ec 34 sub $0x34,%esp 323a: e8 fc ff ff ff call 323b <operationCallEval+0x8> 323f: 81 c3 02 00 00 00 add $0x2,%ebx ASSERT(exec->codeBlock()->codeType() != FunctionCode 3245: 8b 45 08 mov 0x8(%ebp),%eax 3248: 89 04 24 mov %eax,(%esp) 324b: e8 fc ff ff ff call 324c <operationCallEval+0x19> 3250: 89 04 24 mov %eax,(%esp) 3253: e8 fc ff ff ff call 3254 <operationCallEval+0x21> || !exec->codeBlock()->needsActivation() || exec->hasActivation()); 3258: 83 f8 02 cmp $0x2,%eax 325b: 75 58 jne 32b5 <operationCallEval+0x82> directPutByVal(exec, asObject(baseValue), subscript, value); } EncodedJSValue JIT_OPERATION operationCallEval(ExecState* exec, ExecState* execCallee) { ASSERT(exec->codeBlock()->codeType() != FunctionCode 325d: 8b 45 08 mov 0x8(%ebp),%eax 3260: 89 04 24 mov %eax,(%esp) 3263: e8 fc ff ff ff call 3264 <operationCallEval+0x31> 3268: 89 04 24 mov %eax,(%esp) 326b: e8 fc ff ff ff call 326c <operationCallEval+0x39> 3270: 84 c0 test %al,%al 3272: 74 41 je 32b5 <operationCallEval+0x82> 3274: 8b 45 08 mov 0x8(%ebp),%eax 3277: 89 04 24 mov %eax,(%esp) 327a: e8 fc ff ff ff call 327b <operationCallEval+0x48> 327f: 83 f0 01 xor $0x1,%eax 3282: 84 c0 test %al,%al 3284: 74 2f je 32b5 <operationCallEval+0x82> 3286: 8d 83 c0 21 00 00 lea 0x21c0(%ebx),%eax 328c: 89 44 24 0c mov %eax,0xc(%esp) 3290: 8d 83 00 c0 00 00 lea 0xc000(%ebx),%eax 3296: 89 44 24 08 mov %eax,0x8(%esp) 329a: c7 44 24 04 62 02 00 movl $0x262,0x4(%esp) 32a1: 00 32a2: 8d 83 90 20 00 00 lea 0x2090(%ebx),%eax 32a8: 89 04 24 mov %eax,(%esp) 32ab: e8 fc ff ff ff call 32ac <operationCallEval+0x79> 32b0: e8 fc ff ff ff call 32b1 <operationCallEval+0x7e> || !exec->codeBlock()->needsActivation() || exec->hasActivation()); execCallee->setScope(exec->scope()); 32b5: 8b 45 08 mov 0x8(%ebp),%eax 32b8: 89 04 24 mov %eax,(%esp) 32bb: e8 fc ff ff ff call 32bc <operationCallEval+0x89> 32c0: 89 44 24 04 mov %eax,0x4(%esp) 32c4: 8b 45 0c mov 0xc(%ebp),%eax 32c7: 89 04 24 mov %eax,(%esp) 32ca: e8 fc ff ff ff call 32cb <operationCallEval+0x98> execCallee->setCodeBlock(0); 32cf: c7 44 24 04 00 00 00 movl $0x0,0x4(%esp) 32d6: 00 32d7: 8b 45 0c mov 0xc(%ebp),%eax 32da: 89 04 24 mov %eax,(%esp) 32dd: e8 fc ff ff ff call 32de <operationCallEval+0xab> execCallee->setCallerFrame(exec); 32e2: 8b 45 08 mov 0x8(%ebp),%eax 32e5: 89 44 24 04 mov %eax,0x4(%esp) 32e9: 8b 45 0c mov 0xc(%ebp),%eax 32ec: 89 04 24 mov %eax,(%esp) 32ef: e8 fc ff ff ff call 32f0 <operationCallEval+0xbd> if (!isHostFunction(execCallee->calleeAsValue(), globalFuncEval)) 32f4: 8d 45 f0 lea -0x10(%ebp),%eax 32f7: 8b 55 0c mov 0xc(%ebp),%edx 32fa: 89 54 24 04 mov %edx,0x4(%esp) 32fe: 89 04 24 mov %eax,(%esp) 3301: e8 fc ff ff ff call 3302 <operationCallEval+0xcf> [...]
Something very nasty is happening here. I've put my hands on an x86/32-bit chroot and built an EFL/debug jsc. When I run jsc --useLLInt=false eval.js, where eval.js is: var c = ""; eval(c); I also get the segfault. Running it in gdb and setting a breakpoint at JSC::operationCallEval reveals the following: (gdb) disas JSC::operationCallEval Dump of assembler code for function JSC::operationCallEval(JSC::ExecState*, JSC::ExecState*): 0x08808885 <+0>: push %ebp 0x08808886 <+1>: mov %esp,%ebp 0x08808888 <+3>: push %ebx 0x08808889 <+4>: sub $0x34,%esp 0x0880888c <+7>: call 0x86bfa30 <__x86.get_pc_thunk.bx> 0x08808891 <+12>: add $0xc8476f,%ebx => 0x08808897 <+18>: mov 0x8(%ebp),%eax At this point, esp=0xffffcda0 and ebp=0xffffcdd8 (the difference between them is 0x38, just as expected from the code above). The contents of the stack from *ebp are: 0xffffcdd8: 0xffffce28 // <=ebp, this is the result of push %ebp, i.e., before calling operationCallEval, ebp was 0xffffce28 0xffffcddc: 0xf5f1eef0 // this is the return address pushed on stack by the call to operationCallEval 0xffffcde0: 0xffffce28 // this is the first argument of operationCallEval, i.e., ExecState* exec (the previous ebp) 0xffffcde4: 0xffffcdd8 // this is the second argument of operationCallEval, i.e., ExecState* execCallee And this last line is strange: execCallee should point to a region of memory which has the layout of { callerFrame, returnAddress, codeBlock, scopeChain, callee, argCount, thisArg, args... }. However, in this case, the arguments of operationCallEval (i.e., the pointers exec and execCallee) on the stack are overlapping with the contents of execCallee itself. So, calling execCallee->setCodeBlock(0) zeros out execCallee[2], which is also exec. "Interestingly", this was also happening before the patch landed in r173031, but fortunately, exec was not used after the call to setCodeBlock, so the problem did not reveal itself. So, swapping the lines acts as a workaround but there is something nastier underneath, which would be worth of investigation. IMHO.
(In reply to comment #6) > "Interestingly", this was also happening before the patch landed in r173031, but fortunately, exec was not used after the call to setCodeBlock, so the problem did not reveal itself. So, swapping the lines acts as a workaround but there is something nastier underneath, which would be worth of investigation. IMHO. I fully agree with you, that's why I named my patch "workaround" and not "fix" :D
(In reply to comment #6) > Something very nasty is happening here. I've put my hands on an x86/32-bit chroot and built an EFL/debug jsc. When I run jsc --useLLInt=false eval.js, where eval.js is: > > var c = ""; > eval(c); > > I also get the segfault. Running it in gdb and setting a breakpoint at JSC::operationCallEval reveals the following: > > (gdb) disas JSC::operationCallEval > Dump of assembler code for function JSC::operationCallEval(JSC::ExecState*, JSC::ExecState*): > 0x08808885 <+0>: push %ebp > 0x08808886 <+1>: mov %esp,%ebp > 0x08808888 <+3>: push %ebx > 0x08808889 <+4>: sub $0x34,%esp > 0x0880888c <+7>: call 0x86bfa30 <__x86.get_pc_thunk.bx> > 0x08808891 <+12>: add $0xc8476f,%ebx > => 0x08808897 <+18>: mov 0x8(%ebp),%eax > > At this point, esp=0xffffcda0 and ebp=0xffffcdd8 (the difference between them is 0x38, just as expected from the code above). The contents of the stack from *ebp are: > > 0xffffcdd8: 0xffffce28 // <=ebp, this is the result of push %ebp, i.e., before calling operationCallEval, ebp was 0xffffce28 > 0xffffcddc: 0xf5f1eef0 // this is the return address pushed on stack by the call to operationCallEval > 0xffffcde0: 0xffffce28 // this is the first argument of operationCallEval, i.e., ExecState* exec (the previous ebp) > 0xffffcde4: 0xffffcdd8 // this is the second argument of operationCallEval, i.e., ExecState* execCallee > > And this last line is strange: execCallee should point to a region of memory which has the layout of { callerFrame, returnAddress, codeBlock, scopeChain, callee, argCount, thisArg, args... }. However, in this case, the arguments of operationCallEval (i.e., the pointers exec and execCallee) on the stack are overlapping with the contents of execCallee itself. So, calling execCallee->setCodeBlock(0) zeros out execCallee[2], which is also exec. > > "Interestingly", this was also happening before the patch landed in r173031, but fortunately, exec was not used after the call to setCodeBlock, so the problem did not reveal itself. So, swapping the lines acts as a workaround but there is something nastier underneath, which would be worth of investigation. IMHO. I don't think that this is due to r173031. The issue is likely that the space pointed to by execCallee overlaps the stacked argument location on architectures that pass arguments via the stack. In JIT::compileCallEval(), we use the stack pointer as the calleeFrame pointer and then make the call via callOperationNoExceptionCheck(). For X86-32, we use poke(reg, offset) where the offset is a positive offset of the stack pointer to pass the arguments. I think requires that we either move up execCallee in the stack (harder) or move down the outgoing argument area for architectures that use stacked arguments (easier).
Created attachment 237517 [details] Fix proposal
Created attachment 237519 [details] Patch Well, we think alike. I was working on this one and testing when you posted. I prefer keying this off of argument registers and not CPU type. BTW, X86-32 on Mac works with the old code on debug because the arguments are moved to local stack variables and for release because the args are moved into registers.
(In reply to comment #10) > Created an attachment (id=237519) [details] > Patch > > Well, we think alike. I was working on this one and testing when you posted. I prefer keying this off of argument registers and not CPU type. > > BTW, X86-32 on Mac works with the old code on debug because the arguments are moved to local stack variables and for release because the args are moved into registers. Looks good to me. As the stack must remain aligned (otherwise it will crash in an alignment check in the LLINT), do you think it's worth to assert something like this too ? ASSERT(!((sizeof(Register)*4) % stackAlignmentBytes()));
(In reply to comment #10) > Created an attachment (id=237519) [details] > Patch I was wondering why to subtract sizeof(Register) * 4 from sp, why not sizeof(CallerFrameAndPC)? (And then, the arithmetic is not even necessary, since the result is already in regT1.) A stack/call frame layout problem also exists on ARM32. It would be worth looking at https://bugs.webkit.org/show_bug.cgi?id=132740 . There, a highly similar approach has been suggested (not reviewed yet / titled dirty hack).
Created attachment 237526 [details] Updated patch from review comments.
Comment on attachment 237526 [details] Updated patch from review comments. View in context: https://bugs.webkit.org/attachment.cgi?id=237526&action=review > Source/JavaScriptCore/jit/JITCall32_64.cpp:227 > +#if NUMBER_OF_ARGUMENT_REGISTERS < 4 > + // Add stack space for arguments for architectures that pass most / all args on the stack > + int32_t addedStackSpace = (sizeof(GPRReg) * 4 + stackAlignmentBytes() - 1) & ~(stackAlignmentBytes() - 1); > + addPtr(TrustedImm32(-addedStackSpace), stackPointerRegister); Who actually writes these four GPRs to the stack? Why isn't that function responsible for making room on the stack?
Created attachment 237640 [details] Updated patch after discussion with ggaren
Comment on attachment 237640 [details] Updated patch after discussion with ggaren View in context: https://bugs.webkit.org/attachment.cgi?id=237640&action=review r=me > Source/JavaScriptCore/ChangeLog:10 > + That stack pointer provides space for the worse case number of stacked "worst" > Source/JavaScriptCore/jit/JITOperations.cpp:-615 > - execCallee->setCallerFrame(exec); Does the LLInt need updating to perform this callerFrame store as well, or does it do so already?
(In reply to comment #16) > Does the LLInt need updating to perform this callerFrame store as well, or does it do so already? The LLInt has its own slow path that includes the setting the callerFrame.
Committed r173282: <http://trac.webkit.org/changeset/173282>
*** Bug 132740 has been marked as a duplicate of this bug. ***