Summary: | [GTK][Windows] Fix the regression caused by the jsCStack branch merge | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Csaba Osztrogonác <ossy> | ||||||||||||
Component: | JavaScriptCore | Assignee: | Nobody <webkit-unassigned> | ||||||||||||
Status: | NEW --- | ||||||||||||||
Severity: | Blocker | CC: | ap, barraclough, bfulgham, bugs-noreply, eric.carlson, ggaren, gyuyoung.kim, mcatanzaro, mhahnenberg, mrobinson, msaboff, oliver, optional.anne, ossy, roger_fong, sergio, zan | ||||||||||||
Priority: | P1 | ||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||
Hardware: | Unspecified | ||||||||||||||
OS: | Unspecified | ||||||||||||||
Bug Depends on: | 127898, 127902, 127909, 127940, 128434, 128961, 129099, 132740 | ||||||||||||||
Bug Blocks: | 127763 | ||||||||||||||
Attachments: |
|
Description
Csaba Osztrogonác
2014-01-28 07:01:04 PST
Created attachment 222443 [details]
release test log with stress test
(In reply to comment #1) > Created an attachment (id=222443) [details] > run-javascriptcore-tests log Hmm, there are no failing tests on Mac, so I don't know why this seems so broken for EFL. Created attachment 222452 [details]
release test log without stress test
command to reproduce: $ Tools/Scripts/run-javascriptcore-tests --release --efl --no-jsc-stress --no-build
Created attachment 222455 [details]
debug test log without stress test
command to reproduce: $ Tools/Scripts/run-javascriptcore-tests --debug --efl --no-jsc-stress --no-build
(after I removed forcing us to use SHARED_CORE=ON in Source/cmake/OptionsEfl.cmake and build-jsc)
All tests pass in debug mode. It's so strange.
Created attachment 222460 [details]
debug test log with stress test
short summary: - release mode with stress test: 2096 failures - debug mode with stress test: 2 failures - release mode without stress test: 35 regressions - debug mode without stress test: 0 regressions I try to produce a release build with debug symbols to be able produce a useful crash log. Same issue on the GTK port. Given the fact that the changes work well on debug builds but fail on release builds, I played with using the system allocator as the default on release builds (since it's used in debug builds). Turns out this is _not_ the reason behind these regressions. The crashes disappear when optimizations in release builds are disabled (i.e. switching from -O2 to -O0). I managed to get a release build with debug symbols and could reproduce the crash: Program received signal SIGSEGV, Segmentation fault. JSC::eval (callFrame=0x7fffffffcf90) at /home/ossy/WebKit/Source/JavaScriptCore/interpreter/Interpreter.cpp:111 111 EvalExecutable* eval = callerCodeBlock->evalCodeCache().tryGet(callerCodeBlock->isStrictMode(), programSource, callerScopeChain); (gdb) bt #0 JSC::eval (callFrame=0x7fffffffcf90) at /home/ossy/WebKit/Source/JavaScriptCore/interpreter/Interpreter.cpp:111 #1 0x00000000006121a4 in JSC::operationCallEval (exec=0x7fffffffcf90, execCallee=0x7fffffffd060) at /home/ossy/WebKit/Source/JavaScriptCore/jit/JITOperations.cpp:625 #2 0x00007fffab12dac2 in ?? () #3 0x0000000000000000 in ?? () (gdb) p callerCodeBlock $1 = (JSC::CodeBlock *) 0x0 Source/JavaScriptCore/interpreter/Interpreter.cpp:111 ------------------------------------------------------ eval = callerCodeBlock->evalCodeCache().getSlow(callFrame, callerCodeBlock->ownerExecutable(), callerCodeBlock->isStrictMode(), programSource, callerScopeChain); Any pointer how can I continue debugging? Created attachment 222570 [details]
Disassembled JSC::Interpreter::execute
This is the backtrace as provided by GDB:
#0 JSC::Interpreter::execute (this=0x7ffff7ee1960, eval=0x7fffb195c270, callFrame=callFrame@entry=0x7fffffffc140, thisValue=..., scope=scope@entry=0x0) at ../../Source/JavaScriptCore/heap/MarkedBlock.h:262
No locals.
#1 0x00007ffff77ff50e in JSC::eval (callFrame=<optimized out>) at ../../Source/JavaScriptCore/interpreter/Interpreter.cpp:138
topCallFrame = {vm = @0x7ffff7ed5000, oldCallFrame = 0x7fffffffc2f0}
programSource = "Y()"
eval = <optimized out>
program = <optimized out>
interpreter = <optimized out>
#2 0x00007ffff783789b in JSC::operationCallEval (exec=0x7ffff7ee1960, execCallee=0x7fffb195c270) at ../../Source/JavaScriptCore/jit/JITOperations.cpp:625
result = {static numberOfInt52Bits = 52, static int52ShiftAmount = 12, u = {asInt64 = 0, ptr = 0x0, asBits = {payload = 0, tag = 0}}}
#3 0x00007fffb3a8359e in ?? ()
No symbol table info available.
#4 0x0000000000000000 in ?? ()
No symbol table info available.
Attached is the disassembly of JSC::Interpreter::execute.
Testing the various invocations of the sunspider-1.0/date-format-tofte.js test: The following arguments to jsc result in a crash with the backtrace detailed in comment #9: 1. --useLLInt=false The following extra arguments to jsc (besides the path to test) result in a crash with the backtrace detailed in comment #10: 1. none (just the path to test) 2. --minHeapUtilization=2.0 --minCopiedBlockUtilization=2.0 3. --enableConcurrentJIT=false --validateBytecode=true --validateGraphAtEachPhase=true 4. --thresholdForJITAfterWarmUp=10 --thresholdForJITSoon=10 --thresholdForOptimizeAfterWarmUp=20 --thresholdForOptimizeAfterLongWarmUp=20 --thresholdForOptimizeSoon=20 --thresholdForFTLOptimizeAfterWarmUp=20 --thresholdForFTLOptimizeSoon=20 5. --enableConcurrentJIT=false --validateGraph=true --thresholdForJITAfterWarmUp=10 --thresholdForJITSoon=10 --thresholdForOptimizeAfterWarmUp=20 --thresholdForOptimizeAfterLongWarmUp=20 --thresholdForOptimizeSoon=20 --thresholdForFTLOptimizeAfterWarmUp=20 --thresholdForFTLOptimizeSoon=20 It made all tests fail on Windows too: http://build.webkit.org/builders/Apple%20Win%207%20Release%20%28Tests%29/builds/42167 Now Apple Mac is the only working WebKit port :( One of the problems is in operationCallEval. http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/jit/JITOperations.cpp#L612 Here's the disassembled function for the current code: Dump of assembler code for function operationCallEval: => 0x00007ffff7837840 <+0>: mov 0x20(%rsi),%rax 0x00007ffff7837844 <+4>: mov 0x18(%rdi),%rdx 0x00007ffff7837848 <+8>: movabs $0xffff000000000002,%rcx 0x00007ffff7837852 <+18>: movq $0x0,0x10(%rsi) 0x00007ffff783785a <+26>: test %rcx,%rax 0x00007ffff783785d <+29>: mov %rdx,0x18(%rsi) 0x00007ffff7837861 <+33>: jne 0x7ffff783786c <operationCallEval+44> 0x00007ffff7837863 <+35>: mov (%rax),%rcx 0x00007ffff7837866 <+38>: cmpb $0x13,0x5c(%rcx) 0x00007ffff783786a <+42>: je 0x7ffff7837870 <operationCallEval+48> 0x00007ffff783786c <+44>: xor %eax,%eax 0x00007ffff783786e <+46>: retq 0x00007ffff783786f <+47>: nop 0x00007ffff7837870 <+48>: mov 0x18(%rax),%rax 0x00007ffff7837874 <+52>: mov 0x8(%rax),%ecx 0x00007ffff7837877 <+55>: test %ecx,%ecx 0x00007ffff7837879 <+57>: jne 0x7ffff783786c <operationCallEval+44> 0x00007ffff783787b <+59>: push %rbx 0x00007ffff783787c <+60>: mov 0x34f6ed(%rip),%rbx # 0x7ffff7b86f70 0x00007ffff7837883 <+67>: cmp %rbx,0x50(%rax) 0x00007ffff7837887 <+71>: jne 0x7ffff78378a5 <operationCallEval+101> 0x00007ffff7837889 <+73>: xor %dx,%dx 0x00007ffff783788c <+76>: mov %rsi,%rdi 0x00007ffff783788f <+79>: mov 0x468(%rdx),%rbx 0x00007ffff7837896 <+86>: callq 0x7ffff760e790 <_ZN3JSC4evalEPNS_9ExecStateE@plt> 0x00007ffff783789b <+91>: cmpq $0x0,0xbdb0(%rbx) 0x00007ffff78378a3 <+99>: je 0x7ffff78378b0 <operationCallEval+112> 0x00007ffff78378a5 <+101>: xor %eax,%eax 0x00007ffff78378a7 <+103>: nopw 0x0(%rax,%rax,1) 0x00007ffff78378b0 <+112>: pop %rbx 0x00007ffff78378b1 <+113>: retq End of assembler dump. Throwing a simple fprintf() call into the mix, in this case before isHostFunction() call, seems to shift some registers around, and the number of JSC stress test failures drops to 50. Here's the resulting disassembled function: Dump of assembler code for function operationCallEval: => 0x00007ffff78378a0 <+0>: push %rbp 0x00007ffff78378a1 <+1>: lea 0x26d0b6(%rip),%rdx # 0x7ffff7aa495e 0x00007ffff78378a8 <+8>: push %rbx 0x00007ffff78378a9 <+9>: mov %rsi,%rbx 0x00007ffff78378ac <+12>: sub $0x8,%rsp 0x00007ffff78378b0 <+16>: mov 0x18(%rdi),%rax 0x00007ffff78378b4 <+20>: movq $0x0,0x10(%rsi) 0x00007ffff78378bc <+28>: mov %rax,0x18(%rsi) 0x00007ffff78378c0 <+32>: mov 0x34f6a9(%rip),%rax # 0x7ffff7b86f70 0x00007ffff78378c7 <+39>: mov $0x1,%esi 0x00007ffff78378cc <+44>: mov (%rax),%rdi 0x00007ffff78378cf <+47>: xor %eax,%eax 0x00007ffff78378d1 <+49>: callq 0x7ffff760e790 <__fprintf_chk@plt> 0x00007ffff78378d6 <+54>: mov 0x20(%rbx),%rax 0x00007ffff78378da <+58>: movabs $0xffff000000000002,%rdx 0x00007ffff78378e4 <+68>: test %rdx,%rax 0x00007ffff78378e7 <+71>: jne 0x7ffff78378f2 <operationCallEval+82> 0x00007ffff78378e9 <+73>: mov (%rax),%rdx 0x00007ffff78378ec <+76>: cmpb $0x13,0x5c(%rdx) 0x00007ffff78378f0 <+80>: je 0x7ffff7837900 <operationCallEval+96> 0x00007ffff78378f2 <+82>: add $0x8,%rsp 0x00007ffff78378f6 <+86>: xor %eax,%eax 0x00007ffff78378f8 <+88>: pop %rbx 0x00007ffff78378f9 <+89>: pop %rbp 0x00007ffff78378fa <+90>: retq 0x00007ffff78378fb <+91>: nopl 0x0(%rax,%rax,1) 0x00007ffff7837900 <+96>: mov 0x18(%rax),%rax 0x00007ffff7837904 <+100>: mov 0x8(%rax),%edx 0x00007ffff7837907 <+103>: test %edx,%edx 0x00007ffff7837909 <+105>: jne 0x7ffff78378f2 <operationCallEval+82> 0x00007ffff783790b <+107>: mov 0x34f666(%rip),%rcx # 0x7ffff7b86f78 0x00007ffff7837912 <+114>: cmp %rcx,0x50(%rax) 0x00007ffff7837916 <+118>: jne 0x7ffff78378f2 <operationCallEval+82> 0x00007ffff7837918 <+120>: mov 0x18(%rbx),%rax 0x00007ffff783791c <+124>: mov %rbx,%rdi 0x00007ffff783791f <+127>: xor %ax,%ax 0x00007ffff7837922 <+130>: mov 0x468(%rax),%rbp 0x00007ffff7837929 <+137>: callq 0x7ffff760e7a0 <_ZN3JSC4evalEPNS_9ExecStateE@plt> 0x00007ffff783792e <+142>: cmpq $0x0,0xbdb0(%rbp) 0x00007ffff7837936 <+150>: jne 0x7ffff78378f2 <operationCallEval+82> 0x00007ffff7837938 <+152>: add $0x8,%rsp 0x00007ffff783793c <+156>: pop %rbx 0x00007ffff783793d <+157>: pop %rbp 0x00007ffff783793e <+158>: retq End of assembler dump. Some further clarification on comment #13: JSC::eval() and JSC::Interpreter::execute() rely on a non-null JSScope object being accessible through the relevant CallFrame. Currently the callerScopeChain variable in JSC::eval()[1] is assigned a null value, and the changes described in comment #13 (adding the extra fprintf call) somehow fix this. [1] http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/interpreter/Interpreter.cpp#L110 (In reply to comment #14) > Some further clarification on comment #13: > > JSC::eval() and JSC::Interpreter::execute() rely on a non-null JSScope object being accessible through the relevant CallFrame. Currently the callerScopeChain variable in JSC::eval()[1] is assigned a null value, and the changes described in comment #13 (adding the extra fprintf call) somehow fix this. > > > [1] http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/interpreter/Interpreter.cpp#L110 I believe the issue may be that the optimized version is not pushing rbp at the top of the function. The top of the stack after the push of rbp is the exec pointer passed in as the first argument. Without the push of rbp, the chain of caller frames is not complete. In the case of operationCallEval() we pass in both the caller and callee, but the callee's "CallerFrame" doesn't get set up without the push rbp. Are you compiling with the gcc option -fomit-frame-pointer? If so, you may want to eliminate that option and see what that does. The -O2 option implies -fomit-frame-pointer, and passing -fno-omit-frame-pointer explicitly helps greatly, reducing the number of failures down to 25. I'll deduce which additional optimization flags that are implied by -O2 are causing regressions. I checked EFL compiling with -fno-omit-frame-pointer and all jsc tests pass without stress tests and only two failing stress tests remained. (In reply to comment #17) > I checked EFL compiling with -fno-omit-frame-pointer and all jsc tests pass without stress tests and only two failing stress tests remained. Let's build EFL with -fno-omit-frame-pointer: https://bugs.webkit.org/show_bug.cgi?id=127898 (In reply to comment #18) > (In reply to comment #17) > > I checked EFL compiling with -fno-omit-frame-pointer and all jsc tests pass without stress tests and only two failing stress tests remained. > > Let's build EFL with -fno-omit-frame-pointer: https://bugs.webkit.org/show_bug.cgi?id=127898 Ossy, thank you for fixing this problem on EFL port ! However, it looks this js test regression still happen on efl bots and my local env. Do I need to do something to run js test for EFL port ? (In reply to comment #19) > Ossy, thank you for fixing this problem on EFL port ! However, it looks this js test regression still happen on efl bots and my local env. Do I need to do something to run js test for EFL port ? There are still 2 failing tests on GTK and EFL too - https://bugs.webkit.org/show_bug.cgi?id=127902 Maybe -fno-tree-dce could solve the other failures occur on the bots similar to GTK's fix in https://bugs.webkit.org/show_bug.cgi?id=127909 . Unfortunately I can't try if it helps, I haven't a machine with GCC 4.8. |