RESOLVED FIXED Bug 124409
[Win] JavaScript crashes on 64-bit with JIT enabled.
https://bugs.webkit.org/show_bug.cgi?id=124409
Summary [Win] JavaScript crashes on 64-bit with JIT enabled.
peavo
Reported 2013-11-15 06:52:55 PST
I'm currently getting several different crashes when running WinCairo with JIT enabled on 64-bit. I've found the following issues: The saving of the frame pointer in callToJavaScript in JITStubsMSVC64.asm is wrong. The move instruction has flipped the arguments, compared to the GCC version. The registers rsi and rdi in callToJavaScript needs to be saved and restored. This is required by the Windows 64-bit ABI. The caller stores local variables in them, and if not saved and restored, we crash when returning to the caller. The getHostCallReturnValue function needs to be updated according to it's GCC counterpart. The methods JIT::appendCallWithExceptionCheck, and JIT::appendCallWithCallFrameRollbackOnException needs to allocate stack space for the 4 argument registers, as required by the Windows 64-bit ABI.
Attachments
Patch (7.29 KB, patch)
2013-11-15 07:09 PST, peavo
no flags
Patch (5.59 KB, patch)
2013-11-15 16:07 PST, peavo
no flags
Patch (5.81 KB, patch)
2013-11-15 16:28 PST, peavo
no flags
Patch (5.50 KB, patch)
2013-11-15 16:32 PST, peavo
no flags
peavo
Comment 1 2013-11-15 07:09:19 PST
Michael Saboff
Comment 2 2013-11-15 09:54:28 PST
Comment on attachment 217049 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=217049&action=review Thanks for the work. Looking pretty good. It needs a couple of changes. In addition to what is noted inline, the value that we sub/add to sp in callToJavaScript / returnFromJavaScript need to be adjusted for the additional pushes of rbi/rdi and for the space needed to make calls out. The value should be large enough for the space needed to call AND result in a 32 byte aligned SP. I think that means the new value should be 38h. In addition to changing 28h -> 38h, update the comment to include that the calling convention requires space for 4 Dwords. > Source/JavaScriptCore/jit/JITInlines.h:123 > + > +#if OS(WINDOWS) && CPU(X86_64) > + // According to Windows 64-bit ABI, we need to allocate space on the stack for the 4 argument registers. > + subPtr(TrustedImm32(32), stackPointerRegister); > +#endif > + Instead of making space here, we should increase the amount that we decrement from SP in callToJavaScript. > Source/JavaScriptCore/jit/JITInlines.h:142 > + > +#if OS(WINDOWS) && CPU(X86_64) > + // According to Windows 64-bit ABI, we need to allocate space on the stack for the 4 argument registers. > + subPtr(TrustedImm32(32), stackPointerRegister); > +#endif > + Ditto. > Source/JavaScriptCore/jit/JITStubsMSVC64.asm:36 > - mov rbp, rax ; Save previous frame pointer > + mov rax, rbp ; Save previous frame pointer This fix was taken care of in r159290: <http://trac.webkit.org/changeset/159290>
peavo
Comment 3 2013-11-15 11:00:14 PST
(In reply to comment #2) > (From update of attachment 217049 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=217049&action=review > > Thanks for the work. Looking pretty good. It needs a couple of changes. > In addition to what is noted inline, the value that we sub/add to sp in callToJavaScript / returnFromJavaScript need to be adjusted for the additional pushes of rbi/rdi and for the space needed to make calls out. The value should be large enough for the space needed to call AND result in a 32 byte aligned SP. I think that means the new value should be 38h. In addition to changing 28h -> 38h, update the comment to include that the calling convention requires space for 4 Dwords. Thanks for the feedback :) I updated the value to 38h, and removed the other two calls, but then I get a crash in: void JIT_OPERATION operationPutByIdNonStrictOptimize(ExecState* exec, StructureStubInfo* stubInfo, EncodedJSValue encodedValue, EncodedJSValue encodedBase, StringImpl* uid) I assume the 5. parameter here goes on the stack. According to http://msdn.microsoft.com/en-us/library/ms235286.aspx, the stack should be aligned to 16 bytes, but I'm not sure what's correct?
Michael Saboff
Comment 4 2013-11-15 11:09:37 PST
(In reply to comment #3) > (In reply to comment #2) > > (From update of attachment 217049 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=217049&action=review > > > > Thanks for the work. Looking pretty good. It needs a couple of changes. > > In addition to what is noted inline, the value that we sub/add to sp in callToJavaScript / returnFromJavaScript need to be adjusted for the additional pushes of rbi/rdi and for the space needed to make calls out. The value should be large enough for the space needed to call AND result in a 32 byte aligned SP. I think that means the new value should be 38h. In addition to changing 28h -> 38h, update the comment to include that the calling convention requires space for 4 Dwords. > > Thanks for the feedback :) > > I updated the value to 38h, and removed the other two calls, but then I get a crash in: > > void JIT_OPERATION operationPutByIdNonStrictOptimize(ExecState* exec, StructureStubInfo* stubInfo, EncodedJSValue encodedValue, EncodedJSValue encodedBase, StringImpl* uid) > > I assume the 5. parameter here goes on the stack. > > According to http://msdn.microsoft.com/en-us/library/ms235286.aspx, the stack should be aligned to 16 bytes, but I'm not sure what's correct? I take it things worked with your earlier patch. My math could be wrong on the 38h. Certainly 38h though is large enough for up to 7 values. Can you provide a stack trace for the crash and the faulting instruction? IIRC, The 32 byte requirement is for spilling xmm registers.
peavo
Comment 5 2013-11-15 11:57:50 PST
(In reply to comment #4) > I take it things worked with your earlier patch. My math could be wrong on the 38h. Certainly 38h though is large enough for up to 7 values. Can you provide a stack trace for the crash and the faulting instruction? > > IIRC, The 32 byte requirement is for spilling xmm registers. Yes, it worked with the previous patch. I had the exact same crash before, it was what made me add the SP subPtr/addPtr adjustments. The faulting instruction is the last instruction in the disassembly: void JIT_OPERATION operationPutByIdNonStrictOptimize(ExecState* exec, StructureStubInfo* stubInfo, EncodedJSValue encodedValue, EncodedJSValue encodedBase, StringImpl* uid) { 0000000002D01100 mov qword ptr [rsp+18h],rbx 0000000002D01105 push rbp 0000000002D01106 push rsi 0000000002D01107 push rdi 0000000002D01108 push r12 0000000002D0110A push r13 0000000002D0110C sub rsp,50h VM* vm = &exec->vm(); 0000000002D01110 mov rax,qword ptr [rcx+18h] NativeCallFrameTracer tracer(vm, exec); Identifier ident(vm, uid); 0000000002D01114 mov rbx,qword ptr [uid] 0000000002D0111C mov rdi,rdx 0000000002D0111F and rax,0FFFFFFFFFFFF0000h 0000000002D01125 mov rsi,r9 0000000002D01128 mov r13,r8 0000000002D0112B mov rdx,qword ptr [rax+468h] 0000000002D01132 mov rbp,rcx 0000000002D01135 mov qword ptr [rdx+9DB0h],rcx 0000000002D0113C mov eax,dword ptr [rbx+18h] Stacktrace of crash: > JavaScriptCore.dll!operationPutByIdNonStrictOptimize(JSC::ExecState * exec, JSC::StructureStubInfo * stubInfo, __int64 encodedValue, __int64 encodedBase, WTF::StringImpl * uid) Line 300 + 0x28 bytes C++ 0000000003d43e32() 0000000006035420() 000000000012ded8() 0000000005f2ff98() JavaScriptCore.dll!JSC::getCallLinkInfoReturnLocation(JSC::CallLinkInfo * callLinkInfo) Line 105 C++ 130a29aa8dcc1fb4() 0000000003c4f9b0() 0000e6c375e3872c() 000000000012ded8() 0000000005f2ff98() 0000000005f2ff98() 00000000ffffffff() 000000000012f288() 0000000003c4f9b0() 0000000001c40e68() 000000000012df30() JavaScriptCore.dll!JSC::JITCode::execute(JSC::JSStack * stack, JSC::ExecState * callFrame, JSC::VM * vm) Line 49 + 0x19 bytes C++ JavaScriptCore.dll!JSC::Interpreter::execute(JSC::ProgramExecutable * program, JSC::ExecState * callFrame, JSC::JSObject * thisObj) Line 883 + 0x28 bytes C++ JavaScriptCore.dll!JSC::evaluate(JSC::ExecState * exec, const JSC::SourceCode & source, JSC::JSValue thisValue, JSC::JSValue * returnedException) Line 85 C++ WebKit.dll!WebCore::ScriptController::evaluateInWorld(const WebCore::ScriptSourceCode & sourceCode, WebCore::DOMWrapperWorld & world) Line 145 + 0x41 bytes C++ WebKit.dll!WebCore::ScriptController::evaluate(const WebCore::ScriptSourceCode & sourceCode) Line 162 C++ WebKit.dll!WebCore::ScriptElement::executeScript(const WebCore::ScriptSourceCode & sourceCode) Line 310 + 0x14 bytes C++ WebKit.dll!WebCore::ScriptElement::prepareScript(const WTF::TextPosition & scriptStartPosition, WebCore::ScriptElement::LegacyTypeSupport supportLegacyTypes) Line 241 + 0x3d bytes C++ WebKit.dll!WebCore::HTMLScriptRunner::runScript(WebCore::Element * script, const WTF::TextPosition & scriptStartPosition) Line 304 C++ WebKit.dll!WebCore::HTMLScriptRunner::execute(WTF::PassRefPtr<WebCore::Element> * scriptElement, const WTF::TextPosition & scriptStartPosition) Line 177 C++ WebKit.dll!WebCore::HTMLDocumentParser::runScriptsForPausedTreeBuilder() Line 264 C++ WebKit.dll!WebCore::HTMLDocumentParser::pumpTokenizer(WebCore::HTMLDocumentParser::SynchronousMode mode) Line 527 + 0x41 bytes C++ WebKit.dll!WebCore::HTMLDocumentParser::resumeParsingAfterYield() Line 252 C++ WebKit.dll!WebCore::ThreadTimers::sharedTimerFiredInternal() Line 130 C++ WebKit.dll!WebCore::TimerWindowWndProc(HWND__ * hWnd, unsigned int message, unsigned __int64 wParam, __int64 lParam) Line 111 C++
Michael Saboff
Comment 6 2013-11-15 12:08:39 PST
(In reply to comment #5) What are the register values?
peavo
Comment 7 2013-11-15 12:17:16 PST
(In reply to comment #6) > (In reply to comment #5) > > What are the register values? RAX = 0000000003FA0000 RBX = 130A29AA8DCC1FB4 RCX = 0000000005F2FF48 RDX = 0000000001C30FD0 RSI = 0000000003DDFFB0 RDI = 0000000006026AA0 R8  = 0000000003D7DAF0 R9  = 0000000003DDFFB0 R10 = 0000000005F3E390 R11 = 0000000002D01100 R12 = 0000000001C40E68 R13 = 0000000003D7DAF0 R14 = FFFF000000000000 R15 = FFFF000000000002 RIP = 0000000002D0113C RSP = 000000000012DD00 RBP = 0000000005F2FF48 EFL = 00010204
peavo
Comment 8 2013-11-15 13:34:05 PST
I have probably misunderstood something, but below is the disassembly which calls operationPutByIdNonStrictOptimize; I thought we had to also adjust the stack pointer before this call, not only before the call in callToJavaScript? ... ... 00000000053D3E00 mov eax,edx 00000000053D3E02 mov r9,rax 00000000053D3E05 mov rdx,60E91B0h 00000000053D3E0F mov rcx,rbp 00000000053D3E12 mov dword ptr [rbp+2Ch],1Eh 00000000053D3E19 mov r11,1C0AD80h 00000000053D3E23 mov qword ptr [r11],rbp 00000000053D3E26 mov r11,2CE1100h 00000000053D3E30 call r11 < calls operationPutByIdNonStrictOptimize
Michael Saboff
Comment 9 2013-11-15 14:27:42 PST
(In reply to comment #8) > I have probably misunderstood something, but below is the disassembly which calls operationPutByIdNonStrictOptimize; I thought we had to also adjust the stack pointer before this call, not only before the call in callToJavaScript? > > ... > ... > 00000000053D3E00 mov eax,edx > 00000000053D3E02 mov r9,rax > 00000000053D3E05 mov rdx,60E91B0h > 00000000053D3E0F mov rcx,rbp > 00000000053D3E12 mov dword ptr [rbp+2Ch],1Eh > 00000000053D3E19 mov r11,1C0AD80h > 00000000053D3E23 mov qword ptr [r11],rbp > 00000000053D3E26 mov r11,2CE1100h > 00000000053D3E30 call r11 < calls operationPutByIdNonStrictOptimize We shouldn't have to because the stack pointer shouldn't be changed by LLInt or JIT code. From the register dump, rbx is garbage. Could you provide more disassembly for the caller? I'm looking for where arg 5 is put on the stack.
Michael Saboff
Comment 10 2013-11-15 14:29:27 PST
Actually, the value for rbx is the same as the PC for the caller of getCallLinkInfoReturnLocation() in the stack trace.
peavo
Comment 11 2013-11-15 14:47:28 PST
(In reply to comment #9) > We shouldn't have to because the stack pointer shouldn't be changed by LLInt or JIT code. From the register dump, rbx is garbage. Could you provide more disassembly for the caller? I'm looking for where arg 5 is put on the stack. 0000000005463CE8 add byte ptr [rax-75h],cl 0000000005463CEB xor byte ptr [r8-75h],r9b 0000000005463CEF push rbp 0000000005463CF0 or byte ptr [rax-75h],cl 0000000005463CF3 ins dword ptr [rdi],dx 0000000005463CF4 add byte ptr [rdx-3Dh],dl 0000000005463CF7 mov dword ptr [rbp+2Ch],0Ah 0000000005463CFE mov r11,1CCAD80h 0000000005463D08 mov qword ptr [r11],rbp 0000000005463D0B add rsp,0FFFFFFFFFFFFFFF0h 0000000005463D0F mov rcx,rsp 0000000005463D12 mov rdx,rbp 0000000005463D15 mov r8,606C520h 0000000005463D1F mov r11,2E55200h 0000000005463D29 call r11 0000000005463D2C pop rax 0000000005463D2D pop rdx 0000000005463D2E mov r11,1CCC830h 0000000005463D38 mov r11,qword ptr [r11] 0000000005463D3B test r11,r11 0000000005463D3E jne 0000000005463FD2 0000000005463D44 mov rax,qword ptr [rbp-28h] 0000000005463D48 jmp 0000000005463AA1 0000000005463D4D mov dword ptr [rbp+2Ch],0Dh 0000000005463D54 mov r11,1CCAD80h 0000000005463D5E mov qword ptr [r11],rbp 0000000005463D61 add rsp,0FFFFFFFFFFFFFFF0h 0000000005463D65 mov rcx,rsp 0000000005463D68 mov rdx,rbp 0000000005463D6B mov r8,606C538h 0000000005463D75 mov r11,2E58C70h 0000000005463D7F call r11 0000000005463D82 pop rax 0000000005463D83 pop rdx 0000000005463D84 mov r11,1CCC830h 0000000005463D8E mov r11,qword ptr [r11] 0000000005463D91 test r11,r11 0000000005463D94 jne 0000000005463FD2 0000000005463D9A mov rax,qword ptr [rbp+30h] 0000000005463D9E jmp 0000000005463ADD 0000000005463DA3 mov rdx,5F68120h 0000000005463DAD mov rcx,rbp 0000000005463DB0 mov dword ptr [rbp+2Ch],11h 0000000005463DB7 mov r11,1CCAD80h 0000000005463DC1 mov qword ptr [r11],rbp 0000000005463DC4 mov r11,2D95510h 0000000005463DCE call r11 0000000005463DD1 mov r11,1CCC830h 0000000005463DDB mov r11,qword ptr [r11] 0000000005463DDE test r11,r11 0000000005463DE1 jne 0000000005463FD2 0000000005463DE7 mov qword ptr [rbp-18h],rax 0000000005463DEB jmp 0000000005463B18 0000000005463DF0 mov r11,6065810h 0000000005463DFA mov qword ptr [rsp],r11 0000000005463DFE mov r8,rdx 0000000005463E01 mov r9,rax 0000000005463E04 mov rdx,6065540h 0000000005463E0E mov rcx,rbp 0000000005463E11 mov dword ptr [rbp+2Ch],1Eh 0000000005463E18 mov r11,1CCAD80h 0000000005463E22 mov qword ptr [r11],rbp 0000000005463E25 mov r11,2D91100h 0000000005463E2F call r11
peavo
Comment 12 2013-11-15 15:20:12 PST
(In reply to comment #10) > Actually, the value for rbx is the same as the PC for the caller of getCallLinkInfoReturnLocation() in the stack trace. I found the 5. parameter (StringImpl *) on the stack, it's located 20h off from where we actually read.
Michael Saboff
Comment 13 2013-11-15 15:35:26 PST
(In reply to comment #12) > (In reply to comment #10) > > Actually, the value for rbx is the same as the PC for the caller of getCallLinkInfoReturnLocation() in the stack trace. > > I found the 5. parameter (StringImpl *) on the stack, it's located 20h off from where we actually read. Makes sense. Doesn't appear that we accounted for the first 4 args before poking arg5. The poke for arg5 is: 0000000005463DFA mov qword ptr [rsp],r11 That should be 0000000005463DFA mov qword ptr [rsp + 0x20],r11 I think the C call helper for 5 args for X86 win needs to account for the first 4 args.
peavo
Comment 14 2013-11-15 15:48:03 PST
(In reply to comment #13) > (In reply to comment #12) > > (In reply to comment #10) > > > Actually, the value for rbx is the same as the PC for the caller of getCallLinkInfoReturnLocation() in the stack trace. > > > > I found the 5. parameter (StringImpl *) on the stack, it's located 20h off from where we actually read. > > Makes sense. Doesn't appear that we accounted for the first 4 args before poking arg5. The poke for arg5 is: > 0000000005463DFA mov qword ptr [rsp],r11 > That should be 0000000005463DFA mov qword ptr [rsp + 0x20],r11 > > I think the C call helper for 5 args for X86 win needs to account for the first 4 args. Ah, I see... Thanks, will update the patch :)
peavo
Comment 15 2013-11-15 16:07:15 PST
EFL EWS Bot
Comment 16 2013-11-15 16:11:55 PST
Michael Saboff
Comment 17 2013-11-15 16:14:40 PST
Comment on attachment 217094 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=217094&action=review > Source/JavaScriptCore/jit/JITStubsMSVC64.asm:35 > + mov r10, qword ptr[esp] Shouldn't this be rsp?
Michael Saboff
Comment 18 2013-11-15 16:23:22 PST
Comment on attachment 217094 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=217094&action=review > Source/JavaScriptCore/jit/CCallHelpers.h:733 > +#if CPU(MIPS) || OS(WINDOWS) && CPU(X86_64) Parenthesize this expression.
peavo
Comment 19 2013-11-15 16:28:48 PST
peavo
Comment 20 2013-11-15 16:32:37 PST
peavo
Comment 21 2013-11-15 16:34:08 PST
(In reply to comment #17) > (From update of attachment 217094 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=217094&action=review > > > Source/JavaScriptCore/jit/JITStubsMSVC64.asm:35 > > + mov r10, qword ptr[esp] > > Shouldn't this be rsp? Yes :)
Michael Saboff
Comment 22 2013-11-15 16:34:30 PST
r=me. Let's watch the EWS bots and then I can cq+
peavo
Comment 23 2013-11-15 16:35:38 PST
(In reply to comment #22) > r=me. Let's watch the EWS bots and then I can cq+ Thanks, I hope I got it right now :)
peavo
Comment 24 2013-11-15 17:22:02 PST
(In reply to comment #22) > r=me. Let's watch the EWS bots and then I can cq+ Seems to build successfully now :)
WebKit Commit Bot
Comment 25 2013-11-15 19:23:31 PST
Comment on attachment 217099 [details] Patch Clearing flags on attachment: 217099 Committed r159376: <http://trac.webkit.org/changeset/159376>
WebKit Commit Bot
Comment 26 2013-11-15 19:23:34 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.