NEW 127777
[GTK][Windows] Fix the regression caused by the jsCStack branch merge
https://bugs.webkit.org/show_bug.cgi?id=127777
Summary [GTK][Windows] Fix the regression caused by the jsCStack branch merge
Csaba Osztrogonác
Reported 2014-01-28 07:01:04 PST
After adding the missing files to the cmake buildsystem and removed the dead code I managed to build EFLWebKit. But it seems that the merge caused a serious regression in tests. ( I don't know the impact of the merge for the layout tests, because it is impossible to build a WebKitEFL which is able to run layout tests. See https://bugs.webkit.org/show_bug.cgi?id=127776 for details.) I'll attach the detailed results.
Attachments
release test log with stress test (5.55 MB, text/plain)
2014-01-28 07:04 PST, Csaba Osztrogonác
no flags
release test log without stress test (121.71 KB, text/plain)
2014-01-28 08:59 PST, Csaba Osztrogonác
no flags
debug test log without stress test (118.42 KB, text/plain)
2014-01-28 09:01 PST, Csaba Osztrogonác
no flags
debug test log with stress test (416.16 KB, text/plain)
2014-01-28 09:17 PST, Csaba Osztrogonác
no flags
Disassembled JSC::Interpreter::execute (29.55 KB, text/plain)
2014-01-29 06:50 PST, Zan Dobersek
no flags
Csaba Osztrogonác
Comment 1 2014-01-28 07:04:25 PST
Created attachment 222443 [details] release test log with stress test
Mark Hahnenberg
Comment 2 2014-01-28 07:21:01 PST
(In reply to comment #1) > Created an attachment (id=222443) [details] > run-javascriptcore-tests log Hmm, there are no failing tests on Mac, so I don't know why this seems so broken for EFL.
Csaba Osztrogonác
Comment 3 2014-01-28 08:59:00 PST
Created attachment 222452 [details] release test log without stress test command to reproduce: $ Tools/Scripts/run-javascriptcore-tests --release --efl --no-jsc-stress --no-build
Csaba Osztrogonác
Comment 4 2014-01-28 09:01:47 PST
Created attachment 222455 [details] debug test log without stress test command to reproduce: $ Tools/Scripts/run-javascriptcore-tests --debug --efl --no-jsc-stress --no-build (after I removed forcing us to use SHARED_CORE=ON in Source/cmake/OptionsEfl.cmake and build-jsc) All tests pass in debug mode. It's so strange.
Csaba Osztrogonác
Comment 5 2014-01-28 09:17:43 PST
Created attachment 222460 [details] debug test log with stress test
Csaba Osztrogonác
Comment 6 2014-01-28 09:24:18 PST
short summary: - release mode with stress test: 2096 failures - debug mode with stress test: 2 failures - release mode without stress test: 35 regressions - debug mode without stress test: 0 regressions I try to produce a release build with debug symbols to be able produce a useful crash log.
Zan Dobersek
Comment 7 2014-01-28 15:44:08 PST
Same issue on the GTK port. Given the fact that the changes work well on debug builds but fail on release builds, I played with using the system allocator as the default on release builds (since it's used in debug builds). Turns out this is _not_ the reason behind these regressions.
Zan Dobersek
Comment 8 2014-01-29 03:52:17 PST
The crashes disappear when optimizations in release builds are disabled (i.e. switching from -O2 to -O0).
Csaba Osztrogonác
Comment 9 2014-01-29 06:28:59 PST
I managed to get a release build with debug symbols and could reproduce the crash: Program received signal SIGSEGV, Segmentation fault. JSC::eval (callFrame=0x7fffffffcf90) at /home/ossy/WebKit/Source/JavaScriptCore/interpreter/Interpreter.cpp:111 111 EvalExecutable* eval = callerCodeBlock->evalCodeCache().tryGet(callerCodeBlock->isStrictMode(), programSource, callerScopeChain); (gdb) bt #0 JSC::eval (callFrame=0x7fffffffcf90) at /home/ossy/WebKit/Source/JavaScriptCore/interpreter/Interpreter.cpp:111 #1 0x00000000006121a4 in JSC::operationCallEval (exec=0x7fffffffcf90, execCallee=0x7fffffffd060) at /home/ossy/WebKit/Source/JavaScriptCore/jit/JITOperations.cpp:625 #2 0x00007fffab12dac2 in ?? () #3 0x0000000000000000 in ?? () (gdb) p callerCodeBlock $1 = (JSC::CodeBlock *) 0x0 Source/JavaScriptCore/interpreter/Interpreter.cpp:111 ------------------------------------------------------ eval = callerCodeBlock->evalCodeCache().getSlow(callFrame, callerCodeBlock->ownerExecutable(), callerCodeBlock->isStrictMode(), programSource, callerScopeChain); Any pointer how can I continue debugging?
Zan Dobersek
Comment 10 2014-01-29 06:50:46 PST
Created attachment 222570 [details] Disassembled JSC::Interpreter::execute This is the backtrace as provided by GDB: #0 JSC::Interpreter::execute (this=0x7ffff7ee1960, eval=0x7fffb195c270, callFrame=callFrame@entry=0x7fffffffc140, thisValue=..., scope=scope@entry=0x0) at ../../Source/JavaScriptCore/heap/MarkedBlock.h:262 No locals. #1 0x00007ffff77ff50e in JSC::eval (callFrame=<optimized out>) at ../../Source/JavaScriptCore/interpreter/Interpreter.cpp:138 topCallFrame = {vm = @0x7ffff7ed5000, oldCallFrame = 0x7fffffffc2f0} programSource = "Y()" eval = <optimized out> program = <optimized out> interpreter = <optimized out> #2 0x00007ffff783789b in JSC::operationCallEval (exec=0x7ffff7ee1960, execCallee=0x7fffb195c270) at ../../Source/JavaScriptCore/jit/JITOperations.cpp:625 result = {static numberOfInt52Bits = 52, static int52ShiftAmount = 12, u = {asInt64 = 0, ptr = 0x0, asBits = {payload = 0, tag = 0}}} #3 0x00007fffb3a8359e in ?? () No symbol table info available. #4 0x0000000000000000 in ?? () No symbol table info available. Attached is the disassembly of JSC::Interpreter::execute.
Zan Dobersek
Comment 11 2014-01-29 12:13:44 PST
Testing the various invocations of the sunspider-1.0/date-format-tofte.js test: The following arguments to jsc result in a crash with the backtrace detailed in comment #9: 1. --useLLInt=false The following extra arguments to jsc (besides the path to test) result in a crash with the backtrace detailed in comment #10: 1. none (just the path to test) 2. --minHeapUtilization=2.0 --minCopiedBlockUtilization=2.0 3. --enableConcurrentJIT=false --validateBytecode=true --validateGraphAtEachPhase=true 4. --thresholdForJITAfterWarmUp=10 --thresholdForJITSoon=10 --thresholdForOptimizeAfterWarmUp=20 --thresholdForOptimizeAfterLongWarmUp=20 --thresholdForOptimizeSoon=20 --thresholdForFTLOptimizeAfterWarmUp=20 --thresholdForFTLOptimizeSoon=20 5. --enableConcurrentJIT=false --validateGraph=true --thresholdForJITAfterWarmUp=10 --thresholdForJITSoon=10 --thresholdForOptimizeAfterWarmUp=20 --thresholdForOptimizeAfterLongWarmUp=20 --thresholdForOptimizeSoon=20 --thresholdForFTLOptimizeAfterWarmUp=20 --thresholdForFTLOptimizeSoon=20
Csaba Osztrogonác
Comment 12 2014-01-29 14:26:25 PST
It made all tests fail on Windows too: http://build.webkit.org/builders/Apple%20Win%207%20Release%20%28Tests%29/builds/42167 Now Apple Mac is the only working WebKit port :(
Zan Dobersek
Comment 13 2014-01-29 14:59:53 PST
One of the problems is in operationCallEval. http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/jit/JITOperations.cpp#L612 Here's the disassembled function for the current code: Dump of assembler code for function operationCallEval: => 0x00007ffff7837840 <+0>: mov 0x20(%rsi),%rax 0x00007ffff7837844 <+4>: mov 0x18(%rdi),%rdx 0x00007ffff7837848 <+8>: movabs $0xffff000000000002,%rcx 0x00007ffff7837852 <+18>: movq $0x0,0x10(%rsi) 0x00007ffff783785a <+26>: test %rcx,%rax 0x00007ffff783785d <+29>: mov %rdx,0x18(%rsi) 0x00007ffff7837861 <+33>: jne 0x7ffff783786c <operationCallEval+44> 0x00007ffff7837863 <+35>: mov (%rax),%rcx 0x00007ffff7837866 <+38>: cmpb $0x13,0x5c(%rcx) 0x00007ffff783786a <+42>: je 0x7ffff7837870 <operationCallEval+48> 0x00007ffff783786c <+44>: xor %eax,%eax 0x00007ffff783786e <+46>: retq 0x00007ffff783786f <+47>: nop 0x00007ffff7837870 <+48>: mov 0x18(%rax),%rax 0x00007ffff7837874 <+52>: mov 0x8(%rax),%ecx 0x00007ffff7837877 <+55>: test %ecx,%ecx 0x00007ffff7837879 <+57>: jne 0x7ffff783786c <operationCallEval+44> 0x00007ffff783787b <+59>: push %rbx 0x00007ffff783787c <+60>: mov 0x34f6ed(%rip),%rbx # 0x7ffff7b86f70 0x00007ffff7837883 <+67>: cmp %rbx,0x50(%rax) 0x00007ffff7837887 <+71>: jne 0x7ffff78378a5 <operationCallEval+101> 0x00007ffff7837889 <+73>: xor %dx,%dx 0x00007ffff783788c <+76>: mov %rsi,%rdi 0x00007ffff783788f <+79>: mov 0x468(%rdx),%rbx 0x00007ffff7837896 <+86>: callq 0x7ffff760e790 <_ZN3JSC4evalEPNS_9ExecStateE@plt> 0x00007ffff783789b <+91>: cmpq $0x0,0xbdb0(%rbx) 0x00007ffff78378a3 <+99>: je 0x7ffff78378b0 <operationCallEval+112> 0x00007ffff78378a5 <+101>: xor %eax,%eax 0x00007ffff78378a7 <+103>: nopw 0x0(%rax,%rax,1) 0x00007ffff78378b0 <+112>: pop %rbx 0x00007ffff78378b1 <+113>: retq End of assembler dump. Throwing a simple fprintf() call into the mix, in this case before isHostFunction() call, seems to shift some registers around, and the number of JSC stress test failures drops to 50. Here's the resulting disassembled function: Dump of assembler code for function operationCallEval: => 0x00007ffff78378a0 <+0>: push %rbp 0x00007ffff78378a1 <+1>: lea 0x26d0b6(%rip),%rdx # 0x7ffff7aa495e 0x00007ffff78378a8 <+8>: push %rbx 0x00007ffff78378a9 <+9>: mov %rsi,%rbx 0x00007ffff78378ac <+12>: sub $0x8,%rsp 0x00007ffff78378b0 <+16>: mov 0x18(%rdi),%rax 0x00007ffff78378b4 <+20>: movq $0x0,0x10(%rsi) 0x00007ffff78378bc <+28>: mov %rax,0x18(%rsi) 0x00007ffff78378c0 <+32>: mov 0x34f6a9(%rip),%rax # 0x7ffff7b86f70 0x00007ffff78378c7 <+39>: mov $0x1,%esi 0x00007ffff78378cc <+44>: mov (%rax),%rdi 0x00007ffff78378cf <+47>: xor %eax,%eax 0x00007ffff78378d1 <+49>: callq 0x7ffff760e790 <__fprintf_chk@plt> 0x00007ffff78378d6 <+54>: mov 0x20(%rbx),%rax 0x00007ffff78378da <+58>: movabs $0xffff000000000002,%rdx 0x00007ffff78378e4 <+68>: test %rdx,%rax 0x00007ffff78378e7 <+71>: jne 0x7ffff78378f2 <operationCallEval+82> 0x00007ffff78378e9 <+73>: mov (%rax),%rdx 0x00007ffff78378ec <+76>: cmpb $0x13,0x5c(%rdx) 0x00007ffff78378f0 <+80>: je 0x7ffff7837900 <operationCallEval+96> 0x00007ffff78378f2 <+82>: add $0x8,%rsp 0x00007ffff78378f6 <+86>: xor %eax,%eax 0x00007ffff78378f8 <+88>: pop %rbx 0x00007ffff78378f9 <+89>: pop %rbp 0x00007ffff78378fa <+90>: retq 0x00007ffff78378fb <+91>: nopl 0x0(%rax,%rax,1) 0x00007ffff7837900 <+96>: mov 0x18(%rax),%rax 0x00007ffff7837904 <+100>: mov 0x8(%rax),%edx 0x00007ffff7837907 <+103>: test %edx,%edx 0x00007ffff7837909 <+105>: jne 0x7ffff78378f2 <operationCallEval+82> 0x00007ffff783790b <+107>: mov 0x34f666(%rip),%rcx # 0x7ffff7b86f78 0x00007ffff7837912 <+114>: cmp %rcx,0x50(%rax) 0x00007ffff7837916 <+118>: jne 0x7ffff78378f2 <operationCallEval+82> 0x00007ffff7837918 <+120>: mov 0x18(%rbx),%rax 0x00007ffff783791c <+124>: mov %rbx,%rdi 0x00007ffff783791f <+127>: xor %ax,%ax 0x00007ffff7837922 <+130>: mov 0x468(%rax),%rbp 0x00007ffff7837929 <+137>: callq 0x7ffff760e7a0 <_ZN3JSC4evalEPNS_9ExecStateE@plt> 0x00007ffff783792e <+142>: cmpq $0x0,0xbdb0(%rbp) 0x00007ffff7837936 <+150>: jne 0x7ffff78378f2 <operationCallEval+82> 0x00007ffff7837938 <+152>: add $0x8,%rsp 0x00007ffff783793c <+156>: pop %rbx 0x00007ffff783793d <+157>: pop %rbp 0x00007ffff783793e <+158>: retq End of assembler dump.
Zan Dobersek
Comment 14 2014-01-29 15:23:53 PST
Some further clarification on comment #13: JSC::eval() and JSC::Interpreter::execute() rely on a non-null JSScope object being accessible through the relevant CallFrame. Currently the callerScopeChain variable in JSC::eval()[1] is assigned a null value, and the changes described in comment #13 (adding the extra fprintf call) somehow fix this. [1] http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/interpreter/Interpreter.cpp#L110
Michael Saboff
Comment 15 2014-01-29 16:28:57 PST
(In reply to comment #14) > Some further clarification on comment #13: > > JSC::eval() and JSC::Interpreter::execute() rely on a non-null JSScope object being accessible through the relevant CallFrame. Currently the callerScopeChain variable in JSC::eval()[1] is assigned a null value, and the changes described in comment #13 (adding the extra fprintf call) somehow fix this. > > > [1] http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/interpreter/Interpreter.cpp#L110 I believe the issue may be that the optimized version is not pushing rbp at the top of the function. The top of the stack after the push of rbp is the exec pointer passed in as the first argument. Without the push of rbp, the chain of caller frames is not complete. In the case of operationCallEval() we pass in both the caller and callee, but the callee's "CallerFrame" doesn't get set up without the push rbp. Are you compiling with the gcc option -fomit-frame-pointer? If so, you may want to eliminate that option and see what that does.
Zan Dobersek
Comment 16 2014-01-30 00:39:06 PST
The -O2 option implies -fomit-frame-pointer, and passing -fno-omit-frame-pointer explicitly helps greatly, reducing the number of failures down to 25. I'll deduce which additional optimization flags that are implied by -O2 are causing regressions.
Csaba Osztrogonác
Comment 17 2014-01-30 02:22:03 PST
I checked EFL compiling with -fno-omit-frame-pointer and all jsc tests pass without stress tests and only two failing stress tests remained.
Csaba Osztrogonác
Comment 18 2014-01-30 02:27:27 PST
(In reply to comment #17) > I checked EFL compiling with -fno-omit-frame-pointer and all jsc tests pass without stress tests and only two failing stress tests remained. Let's build EFL with -fno-omit-frame-pointer: https://bugs.webkit.org/show_bug.cgi?id=127898
Gyuyoung Kim
Comment 19 2014-02-05 22:29:19 PST
(In reply to comment #18) > (In reply to comment #17) > > I checked EFL compiling with -fno-omit-frame-pointer and all jsc tests pass without stress tests and only two failing stress tests remained. > > Let's build EFL with -fno-omit-frame-pointer: https://bugs.webkit.org/show_bug.cgi?id=127898 Ossy, thank you for fixing this problem on EFL port ! However, it looks this js test regression still happen on efl bots and my local env. Do I need to do something to run js test for EFL port ?
Csaba Osztrogonác
Comment 20 2014-02-06 03:36:17 PST
(In reply to comment #19) > Ossy, thank you for fixing this problem on EFL port ! However, it looks this js test regression still happen on efl bots and my local env. Do I need to do something to run js test for EFL port ? There are still 2 failing tests on GTK and EFL too - https://bugs.webkit.org/show_bug.cgi?id=127902 Maybe -fno-tree-dce could solve the other failures occur on the bots similar to GTK's fix in https://bugs.webkit.org/show_bug.cgi?id=127909 . Unfortunately I can't try if it helps, I haven't a machine with GCC 4.8.
Note You need to log in before you can comment on or make changes to this bug.