Bug 138793

Summary: Allocate local ScopeChain register
Product: WebKit Reporter: Michael Saboff <msaboff>
Component: JavaScriptCoreAssignee: Michael Saboff <msaboff>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, buildbot, cmarcelo, commit-queue, fpizlo, ggaren, mark.lam, mmirman, rniwa
Priority: P2    
Version: 312.x   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 136724    
Attachments:
Description Flags
Work in Progress, Almost there for 64 bit
fpizlo: review-, buildbot: commit-queue-
Archive of layout-test-results from webkit-ews-11 for mac-mountainlion-wk2
none
Archive of layout-test-results from webkit-ews-08 for mac-mountainlion
none
Archive of layout-test-results from webkit-ews-07 for mac-mountainlion
none
Updated work in progress patch
none
Updated work in progress patch with DataLog disabled
buildbot: commit-queue-
Archive of layout-test-results from webkit-ews-11 for mac-mountainlion-wk2
none
Archive of layout-test-results from webkit-ews-06 for mac-mountainlion
none
Patch with fixes for debugger layout test failures
fpizlo: review-
Patch with fix for performance regression and an added test ggaren: review+

Michael Saboff
Reported 2014-11-17 00:39:38 PST
This is to allocate the ScopeChain register instead of using the ScopeChain slot in the CallFrame header.
Attachments
Work in Progress, Almost there for 64 bit (27.42 KB, patch)
2014-11-17 02:12 PST, Michael Saboff
fpizlo: review-
buildbot: commit-queue-
Archive of layout-test-results from webkit-ews-11 for mac-mountainlion-wk2 (658.90 KB, application/zip)
2014-11-17 03:59 PST, Build Bot
no flags
Archive of layout-test-results from webkit-ews-08 for mac-mountainlion (678.43 KB, application/zip)
2014-11-17 04:40 PST, Build Bot
no flags
Archive of layout-test-results from webkit-ews-07 for mac-mountainlion (609.43 KB, application/zip)
2014-11-17 06:48 PST, Build Bot
no flags
Updated work in progress patch (28.12 KB, patch)
2014-11-18 11:52 PST, Michael Saboff
no flags
Updated work in progress patch with DataLog disabled (27.44 KB, patch)
2014-11-18 11:58 PST, Michael Saboff
buildbot: commit-queue-
Archive of layout-test-results from webkit-ews-11 for mac-mountainlion-wk2 (541.61 KB, application/zip)
2014-11-18 13:45 PST, Build Bot
no flags
Archive of layout-test-results from webkit-ews-06 for mac-mountainlion (556.83 KB, application/zip)
2014-11-18 14:24 PST, Build Bot
no flags
Patch with fixes for debugger layout test failures (33.19 KB, patch)
2014-11-18 16:16 PST, Michael Saboff
fpizlo: review-
Patch with fix for performance regression and an added test (39.83 KB, patch)
2014-11-20 18:51 PST, Michael Saboff
ggaren: review+
Michael Saboff
Comment 1 2014-11-17 02:12:49 PST
Created attachment 241698 [details] Work in Progress, Almost there for 64 bit This compiles and runs the JSC regression tests with only 4 related failures: ** The following JSC stress test failures have been introduced: v8-v6/v8-earley-boyer.js.dfg-eager-no-cjit-validate v8-v6/v8-earley-boyer.js.ftl-eager-no-cjit v8-v6/v8-earley-boyer.js.ftl-no-cjit-validate v8-v6/v8-earley-boyer.js.no-cjit-validate-phases These failures I believe are due to not restoring the scope value properly at OSR exit time from a exit in an inlined frame. I have already worked through other liveness bugs.
Build Bot
Comment 2 2014-11-17 03:59:04 PST
Comment on attachment 241698 [details] Work in Progress, Almost there for 64 bit Attachment 241698 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/5046898922618880 New failing tests: js/promises-tests/promises-tests-2-2-6.html inspector-protocol/debugger/setBreakpoint-actions.html js/promises-tests/promises-tests-2-3-4.html inspector-protocol/debugger/didSampleProbe-multiple-probes.html inspector-protocol/debugger/hit-breakpoint-from-console.html inspector-protocol/debugger/setBreakpoint-column.html js/promises-tests/promises-tests-2-3-3.html fast/media/w3c/test_media_queries.html js/promises-tests/promises-tests-2-2-7.html inspector-protocol/debugger/setBreakpointByUrl-sourceURL.html inspector-protocol/debugger/setBreakpoint-condition.html
Build Bot
Comment 3 2014-11-17 03:59:07 PST
Created attachment 241699 [details] Archive of layout-test-results from webkit-ews-11 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-11 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 4 2014-11-17 04:40:10 PST
Comment on attachment 241698 [details] Work in Progress, Almost there for 64 bit Attachment 241698 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/4834961613914112 New failing tests: js/promises-tests/promises-tests-2-2-6.html inspector-protocol/debugger/setBreakpoint-actions.html js/promises-tests/promises-tests-2-3-4.html inspector-protocol/debugger/didSampleProbe-multiple-probes.html inspector-protocol/debugger/hit-breakpoint-from-console.html inspector-protocol/debugger/setBreakpoint-column.html js/promises-tests/promises-tests-2-3-3.html fast/media/w3c/test_media_queries.html js/promises-tests/promises-tests-2-2-7.html inspector-protocol/debugger/setBreakpointByUrl-sourceURL.html inspector-protocol/debugger/setBreakpoint-condition.html
Build Bot
Comment 5 2014-11-17 04:40:13 PST
Created attachment 241700 [details] Archive of layout-test-results from webkit-ews-08 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-08 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Build Bot
Comment 6 2014-11-17 06:48:35 PST
Comment on attachment 241698 [details] Work in Progress, Almost there for 64 bit Attachment 241698 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/5401625841631232 New failing tests: js/promises-tests/promises-tests-2-2-6.html js/promises-tests/promises-tests-2-2-7.html inspector-protocol/debugger/setBreakpoint-column.html inspector-protocol/debugger/hit-breakpoint-from-console.html inspector-protocol/debugger/didSampleProbe-multiple-probes.html inspector-protocol/debugger/setBreakpoint-actions.html inspector-protocol/debugger/setBreakpointByUrl-sourceURL.html inspector-protocol/debugger/setBreakpoint-condition.html
Build Bot
Comment 7 2014-11-17 06:48:38 PST
Created attachment 241708 [details] Archive of layout-test-results from webkit-ews-07 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-07 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Filip Pizlo
Comment 8 2014-11-17 14:01:56 PST
Comment on attachment 241698 [details] Work in Progress, Almost there for 64 bit View in context: https://bugs.webkit.org/attachment.cgi?id=241698&action=review > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3178 > + addToGraph(Phantom, get(m_codeBlock->scopeRegister())); It's dangerous to put this here. The last Phantom on a node tells the DFG that the node is dead at that point. So, if this is the last bytecode operation that used the scope, then you'd be telling the DFG that the node is dead at the start of the bytecode op. But what if one of the other nodes below exited? Then the DFG would think that the scope is no longer live in bytecode even though it is. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3242 > + addToGraph(Phantom, get(m_codeBlock->scopeRegister())); > set(VirtualRegister(dst), addToGraph(GetByIdFlush, OpInfo(identifierNumber), OpInfo(prediction), get(VirtualRegister(scope)))); As above, you're inserting the Phantom too early, and here it will cause real problems since GetByIdFlush can exit. Please move the Phantom to come after the GetByIdFlush. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3248 > + addToGraph(Phantom, get(m_codeBlock->scopeRegister())); > Node* base = cellConstantWithStructureCheck(globalObject, status[0].structureSet().onlyStructure()); > addToGraph(Phantom, get(VirtualRegister(scope))); > set(VirtualRegister(dst), handleGetByOffset(prediction, base, status[0].structureSet(), identifierNumber, operand)); Ditto. Also, aren't we already inserting a phantom - the addToGraph(Phantom, get(VirtualRegister(scope)))? That Phantom also happens to be in the right place. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3336 > + addToGraph(Phantom, get(m_codeBlock->scopeRegister())); > Node* base = cellConstantWithStructureCheck(globalObject, structure); > addToGraph(Phantom, get(VirtualRegister(scope))); Same as before - you added a Phantom in the wrong place and there was already a Phantom in the right place.
Filip Pizlo
Comment 9 2014-11-17 14:41:29 PST
Comment on attachment 241698 [details] Work in Progress, Almost there for 64 bit View in context: https://bugs.webkit.org/attachment.cgi?id=241698&action=review > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1264 > + set(VirtualRegister(codeBlock->scopeRegister()), addToGraph(GetScope, callTargetNode), ImmediateNakedSet)->variableAccessData(); This looks super wrong. Why would you do this? Doesn't the inlined code set its own scope register? >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3178 >> + addToGraph(Phantom, get(m_codeBlock->scopeRegister())); > > It's dangerous to put this here. The last Phantom on a node tells the DFG that the node is dead at that point. So, if this is the last bytecode operation that used the scope, then you'd be telling the DFG that the node is dead at the start of the bytecode op. But what if one of the other nodes below exited? Then the DFG would think that the scope is no longer live in bytecode even though it is. Also, this is using the wrong scope register. m_codeBlock is the outermost code block, not the one on top of the inline stack. You want the scope register of the code block at the top of the inline stack. >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3242 >> set(VirtualRegister(dst), addToGraph(GetByIdFlush, OpInfo(identifierNumber), OpInfo(prediction), get(VirtualRegister(scope)))); > > As above, you're inserting the Phantom too early, and here it will cause real problems since GetByIdFlush can exit. Please move the Phantom to come after the GetByIdFlush. Wrong CodeBlock, as before. >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3248 >> set(VirtualRegister(dst), handleGetByOffset(prediction, base, status[0].structureSet(), identifierNumber, operand)); > > Ditto. Also, aren't we already inserting a phantom - the addToGraph(Phantom, get(VirtualRegister(scope)))? That Phantom also happens to be in the right place. Wrong CodeBlock, as before. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3254 > + addToGraph(Phantom, get(m_codeBlock->scopeRegister())); Wrong CodeBlock. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3284 > + addToGraph(Phantom, get(m_codeBlock->scopeRegister())); Wrong CodeBlock. >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3336 >> addToGraph(Phantom, get(VirtualRegister(scope))); > > Same as before - you added a Phantom in the wrong place and there was already a Phantom in the right place. Also, wrong CodeBlock.
Michael Saboff
Comment 10 2014-11-18 11:52:34 PST
Created attachment 241798 [details] Updated work in progress patch Made suggested changes and fix other issues. Now passing all JSC regression tests. Will work on the WebKit test failures.
Michael Saboff
Comment 11 2014-11-18 11:58:39 PST
Created attachment 241799 [details] Updated work in progress patch with DataLog disabled Prior patch with DataLog turned off.
Build Bot
Comment 12 2014-11-18 13:45:11 PST
Comment on attachment 241799 [details] Updated work in progress patch with DataLog disabled Attachment 241799 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/6113215285035008 New failing tests: inspector-protocol/debugger/setBreakpoint-column.html inspector-protocol/debugger/didSampleProbe-multiple-probes.html inspector-protocol/debugger/hit-breakpoint-from-console.html inspector-protocol/debugger/setBreakpointByUrl-sourceURL.html inspector-protocol/debugger/setBreakpoint-actions.html inspector-protocol/debugger/setBreakpoint-condition.html
Build Bot
Comment 13 2014-11-18 13:45:14 PST
Created attachment 241809 [details] Archive of layout-test-results from webkit-ews-11 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-11 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 14 2014-11-18 14:24:19 PST
Comment on attachment 241799 [details] Updated work in progress patch with DataLog disabled Attachment 241799 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/4821685870002176 New failing tests: inspector-protocol/debugger/setBreakpoint-column.html inspector-protocol/debugger/didSampleProbe-multiple-probes.html inspector-protocol/debugger/hit-breakpoint-from-console.html inspector-protocol/debugger/setBreakpointByUrl-sourceURL.html inspector-protocol/debugger/setBreakpoint-actions.html inspector-protocol/debugger/setBreakpoint-condition.html
Build Bot
Comment 15 2014-11-18 14:24:23 PST
Created attachment 241812 [details] Archive of layout-test-results from webkit-ews-06 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-06 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Michael Saboff
Comment 16 2014-11-18 16:16:50 PST
Created attachment 241827 [details] Patch with fixes for debugger layout test failures
Geoffrey Garen
Comment 17 2014-11-18 17:16:44 PST
Comment on attachment 241827 [details] Patch with fixes for debugger layout test failures View in context: https://bugs.webkit.org/attachment.cgi?id=241827&action=review - Any performance change? - How should the debugger load the dynamic scope from the stack, in case the optimizing JIT did not store it? - Did we lose the optimization where an inlined function’s scope is constant? - How should eval load the dynamic scope from the stack, in case the optimizing JIT did not store it? Answer: DFG does not compile eval. - How should catch load the dynamic scope from the stack, in case the optimizing JIT did not store it? Answer: DFG does not compile catch. - How should push_name_scope and push_with_scope work? Answer: DFG does not compile them. > Source/JavaScriptCore/ChangeLog:10 > + Changed usage of JSStack::ScopeChain access to the CallFrame header to use the allocation "allocated" > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:527 > - flushDirect(inlineStackEntry->remapOperand(VirtualRegister(JSStack::ScopeChain))); > + flushDirect(inlineStackEntry->remapOperand(m_codeBlock->scopeRegister())); Do we still need this flush? Now that scope is a regular register, it should be fine through the normal mechanism. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3191 > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); Let's use the operand instead of the data member. I think the data member is a wart, and should hopefully go away eventually. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3237 > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); This seems like an unnecessary phantom. This node should only phantom things used by this bytecode -- not things we expect might be used by subsequent bytecodes. If removing this triggers some kind of test failure, that's probably an indication that something else has gone wrong. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3241 > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); Ditto. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3249 > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); Ditto. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3279 > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); Ditto. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3331 > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); Ditto.
Geoffrey Garen
Comment 18 2014-11-18 17:19:28 PST
> - How should the debugger load the dynamic scope from the stack, in case the > optimizing JIT did not store it? I think the right solution here is probably to pass the JSScope* as an explicit argument to op_debug.
Geoffrey Garen
Comment 19 2014-11-18 17:23:56 PST
(In reply to comment #18) > > - How should the debugger load the dynamic scope from the stack, in case the > > optimizing JIT did not store it? > > I think the right solution here is probably to pass the JSScope* as an > explicit argument to op_debug. Another solution is for the implementation of op_debug in the DFG to OSR exit in the case of a debugging event. If we do that, then we can continue to rely on a stack frame that allows us to peak and poke in the debugger.
Michael Saboff
Comment 20 2014-11-18 17:56:26 PST
(In reply to comment #19) > (In reply to comment #18) > > > - How should the debugger load the dynamic scope from the stack, in case the > > > optimizing JIT did not store it? > > > > I think the right solution here is probably to pass the JSScope* as an > > explicit argument to op_debug. > > Another solution is for the implementation of op_debug in the DFG to OSR > exit in the case of a debugging event. If we do that, then we can continue > to rely on a stack frame that allows us to peak and poke in the debugger. We never go into the DFG when debugging. Going into the debugger causes us to delete the code, so we start all over with the LLInt. operationOptimize() precludes tiering up to the DFG with a debugging check.
Michael Saboff
Comment 21 2014-11-18 18:01:01 PST
(In reply to comment #20) > (In reply to comment #19) > > (In reply to comment #18) > > > > - How should the debugger load the dynamic scope from the stack, in case the > > > > optimizing JIT did not store it? > > > > > > I think the right solution here is probably to pass the JSScope* as an > > > explicit argument to op_debug. > > > > Another solution is for the implementation of op_debug in the DFG to OSR > > exit in the case of a debugging event. If we do that, then we can continue > > to rely on a stack frame that allows us to peak and poke in the debugger. > > We never go into the DFG when debugging. Going into the debugger causes us > to delete the code, so we start all over with the LLInt. > operationOptimize() precludes tiering up to the DFG with a debugging check. The debugging check looks for two things, either we are single stepping anywhere or active breakpoints in the function to tier up.
Michael Saboff
Comment 22 2014-11-19 11:24:44 PST
(In reply to comment #17) > Comment on attachment 241827 [details] > Patch with fixes for debugger layout test failures > > - Any performance change? There are a few performance regressions. Most notable is a 7x slowdown in inline-arguments-aliased-access: function foo() { var a = arguments; return a[0] + a[1] + a[2]; } function bar(a, b, c) { return foo(b, c, 42); } for (var i = 0; i < 200000; ++i) { var result = bar(1, 2, 3); if (result != 47) throw "Bad result: " + result; } Before the change foo() basically becomes a constant of "47" with a little code generated for op_enter, etc. With this change, we do the full create arguments, get by val the arguments and add them together. Looking into why this is the case. > - Did we lose the optimization where an inlined function’s scope is constant? I modified the DFG code to add back in the constant scope for inlined functions. > - How should eval load the dynamic scope from the stack, in case the > optimizing JIT did not store it? Answer: DFG does not compile eval. > - How should catch load the dynamic scope from the stack, in case the > optimizing JIT did not store it? Answer: DFG does not compile catch. > - How should push_name_scope and push_with_scope work? Answer: DFG does not > compile them. > > > Source/JavaScriptCore/ChangeLog:10 > > + Changed usage of JSStack::ScopeChain access to the CallFrame header to use the allocation > > "allocated" Fixed. > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:527 > > - flushDirect(inlineStackEntry->remapOperand(VirtualRegister(JSStack::ScopeChain))); > > + flushDirect(inlineStackEntry->remapOperand(m_codeBlock->scopeRegister())); > > Do we still need this flush? Now that scope is a regular register, it should > be fine through the normal mechanism. Removed > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3191 > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > Let's use the operand instead of the data member. I think the data member is > a wart, and should hopefully go away eventually. Done. > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3237 > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > This seems like an unnecessary phantom. This node should only phantom things > used by this bytecode -- not things we expect might be used by subsequent > bytecodes. If removing this triggers some kind of test failure, that's > probably an indication that something else has gone wrong. > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3241 > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > Ditto. > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3249 > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > Ditto. > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3279 > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > Ditto. > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3331 > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > Ditto. Removed these five Phantoms.
Filip Pizlo
Comment 23 2014-11-19 11:48:23 PST
(In reply to comment #22) > (In reply to comment #17) > > Comment on attachment 241827 [details] > > Patch with fixes for debugger layout test failures > > > > - Any performance change? > > There are a few performance regressions. Most notable is a 7x slowdown in > inline-arguments-aliased-access: > function foo() { > var a = arguments; > return a[0] + a[1] + a[2]; > } > > function bar(a, b, c) { > return foo(b, c, 42); > } > > for (var i = 0; i < 200000; ++i) { > var result = bar(1, 2, 3); > if (result != 47) > throw "Bad result: " + result; > } > > Before the change foo() basically becomes a constant of "47" with a little > code generated for op_enter, etc. With this change, we do the full create > arguments, get by val the arguments and add them together. Looking into why > this is the case. That seems really broken - I don't think we should land until this is diagnosed. > > > - Did we lose the optimization where an inlined function’s scope is constant? > > I modified the DFG code to add back in the constant scope for inlined > functions. > > > - How should eval load the dynamic scope from the stack, in case the > > optimizing JIT did not store it? Answer: DFG does not compile eval. > > - How should catch load the dynamic scope from the stack, in case the > > optimizing JIT did not store it? Answer: DFG does not compile catch. > > - How should push_name_scope and push_with_scope work? Answer: DFG does not > > compile them. > > > > > Source/JavaScriptCore/ChangeLog:10 > > > + Changed usage of JSStack::ScopeChain access to the CallFrame header to use the allocation > > > > "allocated" > > Fixed. > > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:527 > > > - flushDirect(inlineStackEntry->remapOperand(VirtualRegister(JSStack::ScopeChain))); > > > + flushDirect(inlineStackEntry->remapOperand(m_codeBlock->scopeRegister())); > > > > Do we still need this flush? Now that scope is a regular register, it should > > be fine through the normal mechanism. > > Removed > > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3191 > > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > > > Let's use the operand instead of the data member. I think the data member is > > a wart, and should hopefully go away eventually. > > Done. > > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3237 > > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > > > This seems like an unnecessary phantom. This node should only phantom things > > used by this bytecode -- not things we expect might be used by subsequent > > bytecodes. If removing this triggers some kind of test failure, that's > > probably an indication that something else has gone wrong. > > > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3241 > > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > > > Ditto. > > > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3249 > > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > > > Ditto. > > > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3279 > > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > > > Ditto. > > > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:3331 > > > + addToGraph(Phantom, get(m_inlineStackTop->m_codeBlock->scopeRegister())); > > > > Ditto. > > Removed these five Phantoms.
Filip Pizlo
Comment 24 2014-11-19 11:48:39 PST
Comment on attachment 241827 [details] Patch with fixes for debugger layout test failures Marking r- due to the perf regression.
Michael Saboff
Comment 25 2014-11-20 18:51:48 PST
Created attachment 242013 [details] Patch with fix for performance regression and an added test Here are the performance test results. Benchmark report for SunSpider, LongSpider, V8Spider, Octane, Kraken, JSRegress, AsmBench, and CompressionBench on msaboff-pro (MacPro5,1). VMs tested: "Baseline" at /Volumes/Data/src/webkit.baseline/WebKitBuild/Release/jsc (r176292) "AllocScope" at /Volumes/Data/src/webkit/WebKitBuild/Release/jsc (r176292) Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline AllocScope SunSpider: 3d-cube 6.7043+-0.3428 6.6736+-0.3169 3d-morph 8.3599+-0.2674 ? 8.3893+-0.2051 ? 3d-raytrace 8.1967+-0.3417 ? 8.6642+-0.1900 ? might be 1.0570x slower access-binary-trees 2.6387+-0.1098 ? 2.7303+-0.1139 ? might be 1.0347x slower access-fannkuch 8.1331+-0.2413 8.0338+-0.0903 might be 1.0124x faster access-nbody 3.9852+-0.1437 ? 4.0507+-0.1635 ? might be 1.0164x slower access-nsieve 4.9922+-0.2055 ? 5.1135+-0.0641 ? might be 1.0243x slower bitops-3bit-bits-in-byte 1.7890+-0.1124 ? 1.8440+-0.0760 ? might be 1.0307x slower bitops-bits-in-byte 5.6580+-0.1410 5.4667+-0.0600 might be 1.0350x faster bitops-bitwise-and 2.7624+-0.0382 ? 2.8070+-0.0749 ? might be 1.0162x slower bitops-nsieve-bits 5.0430+-0.2700 4.9766+-0.1428 might be 1.0133x faster controlflow-recursive 2.8350+-0.0470 ? 2.8822+-0.1498 ? might be 1.0167x slower crypto-aes 5.5928+-0.4737 ? 5.6612+-0.4346 ? might be 1.0122x slower crypto-md5 3.3118+-0.0446 ? 3.4150+-0.2779 ? might be 1.0312x slower crypto-sha1 3.1743+-0.2099 3.1406+-0.1948 might be 1.0107x faster date-format-tofte 11.9893+-0.4418 11.8443+-0.3476 might be 1.0122x faster date-format-xparb 6.9789+-0.3294 ? 7.1750+-0.2610 ? might be 1.0281x slower math-cordic 4.2153+-0.1330 4.1816+-0.1074 math-partial-sums 8.8959+-0.1767 ? 9.0160+-0.2441 ? might be 1.0135x slower math-spectral-norm 2.7811+-0.1705 2.7462+-0.1510 might be 1.0127x faster regexp-dna 10.0225+-0.4124 9.9097+-0.4583 might be 1.0114x faster string-base64 5.6498+-0.3858 ? 5.7672+-0.2516 ? might be 1.0208x slower string-fasta 8.3633+-0.1390 ? 8.4912+-0.3163 ? might be 1.0153x slower string-tagcloud 13.2698+-0.4630 13.2123+-0.3663 string-unpack-code 27.6840+-0.8408 ? 27.8139+-1.0244 ? string-validate-input 6.3184+-0.4168 ? 6.3217+-0.1709 ? <arithmetic> * 6.8979+-0.0950 ? 6.9357+-0.0541 ? might be 1.0055x slower <geometric> 5.6949+-0.0861 ? 5.7385+-0.0676 ? might be 1.0077x slower <harmonic> 4.8207+-0.0900 ? 4.8711+-0.0643 ? might be 1.0105x slower Baseline AllocScope LongSpider: 3d-cube 1162.9269+-20.9606 1149.2098+-6.1591 might be 1.0119x faster 3d-morph 1871.2963+-8.5229 ? 1877.8032+-13.9436 ? 3d-raytrace 1046.5884+-1.2430 ! 1054.1741+-3.9386 ! definitely 1.0072x slower access-binary-trees 1379.2540+-6.4895 ^ 1347.5148+-13.0069 ^ definitely 1.0236x faster access-fannkuch 438.6202+-4.0366 ? 444.7978+-20.4072 ? might be 1.0141x slower access-nbody 1005.6719+-10.2559 ? 1006.4373+-0.4544 ? access-nsieve 1332.2664+-13.0083 ? 1353.0705+-7.8396 ? might be 1.0156x slower bitops-3bit-bits-in-byte 50.3135+-0.6475 ? 50.4483+-1.3866 ? bitops-bits-in-byte 320.5659+-10.0230 316.4203+-11.2046 might be 1.0131x faster bitops-nsieve-bits 967.2447+-10.6097 964.1804+-10.1882 controlflow-recursive 719.6461+-3.1677 ? 721.5184+-4.4546 ? crypto-aes 917.7995+-9.0508 ? 922.0488+-12.6841 ? crypto-md5 812.1070+-16.5997 808.5375+-5.0577 crypto-sha1 970.3733+-7.3637 955.7965+-30.9129 might be 1.0153x faster date-format-tofte 1031.1957+-33.6712 1013.7327+-17.7595 might be 1.0172x faster date-format-xparb 980.0178+-26.7381 973.6632+-8.2001 math-cordic 844.6575+-7.3916 ^ 681.7010+-10.7251 ^ definitely 1.2390x faster math-partial-sums 1030.4660+-4.4530 ? 1033.2065+-7.5466 ? math-spectral-norm 1074.4768+-0.3809 ! 1079.3044+-3.2843 ! definitely 1.0045x slower string-base64 475.1814+-12.7456 471.8162+-3.0550 string-fasta 559.7793+-8.5541 ? 563.0473+-3.1164 ? string-tagcloud 304.3846+-2.9190 ? 306.8835+-9.8060 ? <arithmetic> 877.0379+-5.2348 ^ 867.9688+-2.7389 ^ definitely 1.0104x faster <geometric> * 735.0754+-4.6613 726.9574+-3.4893 might be 1.0112x faster <harmonic> 457.1600+-4.0082 454.7073+-6.0031 might be 1.0054x faster Baseline AllocScope V8Spider: crypto 73.1843+-0.8335 ? 73.6766+-0.8571 ? deltablue 110.2762+-1.4757 ? 111.7315+-2.1210 ? might be 1.0132x slower earley-boyer 62.0447+-0.4333 ? 62.8017+-1.1954 ? might be 1.0122x slower raytrace 39.8492+-1.2632 ? 40.1019+-0.8317 ? regexp 95.0214+-0.6014 ? 95.7184+-1.1542 ? richards 111.7504+-3.6842 ? 113.2792+-3.4772 ? might be 1.0137x slower splay 43.9547+-1.0853 43.3693+-2.1714 might be 1.0135x faster <arithmetic> 76.5830+-0.8432 ? 77.2398+-0.6473 ? might be 1.0086x slower <geometric> * 71.2376+-0.8418 ? 71.7009+-0.4185 ? might be 1.0065x slower <harmonic> 65.9277+-0.9343 ? 66.2000+-0.3974 ? might be 1.0041x slower Baseline AllocScope Octane: encrypt 0.32285+-0.00266 ^ 0.31582+-0.00081 ^ definitely 1.0222x faster decrypt 5.65354+-0.06112 ? 5.67520+-0.03216 ? deltablue x2 0.26537+-0.00210 0.26473+-0.00408 earley 0.99587+-0.00198 ? 1.00271+-0.02671 ? boyer 8.90626+-0.06005 8.86617+-0.16320 navier-stokes x2 6.40875+-0.00866 ? 6.47074+-0.14305 ? raytrace x2 1.65271+-0.02075 ? 1.66033+-0.02245 ? richards x2 0.16688+-0.00316 ? 0.16877+-0.00520 ? might be 1.0113x slower splay x2 0.53961+-0.00826 0.53678+-0.00473 regexp x2 28.64384+-0.41212 28.38926+-0.46062 pdfjs x2 71.03765+-0.76756 70.51828+-0.27489 mandreel x2 73.64267+-0.62092 73.23315+-0.99665 gbemu x2 63.21536+-1.12651 ? 64.57322+-3.04247 ? might be 1.0215x slower closure 0.76139+-0.00432 0.75828+-0.00767 jquery 9.62650+-0.12374 9.53595+-0.06485 box2d x2 18.88796+-0.23704 ? 19.15082+-0.32939 ? might be 1.0139x slower zlib x2 648.84821+-71.18849 ? 680.70315+-6.36221 ? might be 1.0491x slower typescript x2 1078.92603+-13.72886 ? 1097.96716+-15.85686 ? might be 1.0176x slower <arithmetic> 133.69122+-4.43263 ? 137.11423+-1.40346 ? might be 1.0256x slower <geometric> * 9.31447+-0.08064 ? 9.36611+-0.03414 ? might be 1.0055x slower <harmonic> 0.97318+-0.00840 ? 0.97431+-0.00995 ? might be 1.0012x slower Baseline AllocScope Kraken: ai-astar 472.673+-1.254 471.514+-3.910 audio-beat-detection 156.513+-2.378 ? 162.059+-4.837 ? might be 1.0354x slower audio-dft 213.811+-1.450 ? 217.208+-13.526 ? might be 1.0159x slower audio-fft 108.699+-0.877 ? 110.323+-7.430 ? might be 1.0149x slower audio-oscillator 313.756+-0.639 313.626+-1.092 imaging-darkroom 244.387+-1.764 ? 245.074+-4.418 ? imaging-desaturate 95.280+-0.695 ? 95.529+-1.360 ? imaging-gaussian-blur 157.221+-0.942 ? 158.765+-4.960 ? json-parse-financial 68.164+-1.458 67.411+-0.472 might be 1.0112x faster json-stringify-tinderbox 83.764+-2.532 83.235+-2.451 stanford-crypto-aes 83.084+-1.543 82.461+-0.934 stanford-crypto-ccm 76.874+-12.507 70.157+-14.228 might be 1.0957x faster stanford-crypto-pbkdf2 228.733+-2.937 228.700+-1.991 stanford-crypto-sha256-iterative 71.135+-0.378 ? 71.931+-1.334 ? might be 1.0112x slower <arithmetic> * 169.578+-0.440 ? 169.857+-0.692 ? might be 1.0016x slower <geometric> 140.488+-1.068 140.183+-1.614 might be 1.0022x faster <harmonic> 119.951+-1.715 118.981+-2.505 might be 1.0082x faster Baseline AllocScope JSRegress: abs-boolean 3.6764+-0.1248 3.6262+-0.0951 might be 1.0138x faster adapt-to-double-divide 17.8690+-0.4461 ? 18.0078+-0.6182 ? aliased-arguments-getbyval 1.0870+-0.0661 ! 1.6476+-0.1083 ! definitely 1.5158x slower allocate-big-object 3.4558+-0.4407 2.9811+-0.1187 might be 1.1592x faster arity-mismatch-inlining 1.0253+-0.1196 ? 1.1061+-0.0316 ? might be 1.0788x slower array-access-polymorphic-structure 9.3280+-0.1489 9.1752+-0.2547 might be 1.0167x faster array-nonarray-polymorhpic-access 50.6163+-1.7265 ? 50.9747+-0.6962 ? array-prototype-every 100.1774+-1.5463 ? 103.8340+-2.7531 ? might be 1.0365x slower array-prototype-forEach 101.6600+-5.0350 100.1467+-2.1835 might be 1.0151x faster array-prototype-map 126.0287+-1.5057 125.6683+-4.9344 array-prototype-some 100.2302+-2.3311 ? 103.3180+-3.6790 ? might be 1.0308x slower array-splice-contiguous 58.7499+-2.1073 57.7918+-0.6441 might be 1.0166x faster array-with-double-add 5.5138+-0.1006 ? 5.6085+-0.2420 ? might be 1.0172x slower array-with-double-increment 4.0833+-0.0611 4.0794+-0.1559 array-with-double-mul-add 7.1872+-0.1238 7.0884+-0.2357 might be 1.0139x faster array-with-double-sum 4.2713+-0.1702 ? 4.2983+-0.1666 ? array-with-int32-add-sub 9.6030+-0.1881 9.5747+-0.3312 array-with-int32-or-double-sum 4.3140+-0.1018 ? 4.3398+-0.1517 ? ArrayBuffer-DataView-alloc-large-long-lived 47.0146+-0.9478 46.8385+-0.8624 ArrayBuffer-DataView-alloc-long-lived 18.2665+-0.3506 ? 18.5762+-1.0219 ? might be 1.0170x slower ArrayBuffer-Int32Array-byteOffset 4.5291+-0.2691 ? 4.5925+-0.2577 ? might be 1.0140x slower ArrayBuffer-Int8Array-alloc-large-long-lived 53.2470+-1.2371 51.1756+-2.2276 might be 1.0405x faster ArrayBuffer-Int8Array-alloc-long-lived-buffer 34.6345+-1.1971 33.9214+-2.7962 might be 1.0210x faster ArrayBuffer-Int8Array-alloc-long-lived 20.7325+-0.7471 ^ 18.3345+-0.4060 ^ definitely 1.1308x faster ArrayBuffer-Int8Array-alloc 18.0219+-1.1457 ^ 15.9625+-0.5868 ^ definitely 1.1290x faster asmjs_bool_bug 8.6957+-0.3862 ? 8.8565+-0.1329 ? might be 1.0185x slower assign-custom-setter-polymorphic 4.3640+-0.3273 4.2174+-0.2913 might be 1.0348x faster assign-custom-setter 5.6804+-0.1639 ? 5.7209+-0.1180 ? basic-set 13.4615+-1.2026 13.3481+-0.2270 big-int-mul 5.8542+-0.1268 5.7807+-0.0905 might be 1.0127x faster boolean-test 4.1346+-0.1465 ? 4.1917+-0.1348 ? might be 1.0138x slower branch-fold 4.5672+-0.1151 4.5050+-0.0872 might be 1.0138x faster by-val-generic 11.0765+-0.3055 ? 11.2784+-0.2224 ? might be 1.0182x slower call-spread-apply 17.2422+-0.4533 17.1450+-0.2359 call-spread-call 7.6606+-0.3389 ? 7.7745+-0.1268 ? might be 1.0149x slower captured-assignments 0.7133+-0.1664 0.6035+-0.0945 might be 1.1821x faster cast-int-to-double 8.1328+-0.1530 8.0845+-0.2462 cell-argument 9.8693+-0.3280 ? 10.2842+-0.2964 ? might be 1.0420x slower cfg-simplify 3.6359+-0.0750 3.5787+-0.1413 might be 1.0160x faster chain-getter-access 12.6329+-0.3188 ? 12.8003+-0.4957 ? might be 1.0132x slower cmpeq-obj-to-obj-other 12.1782+-0.6914 11.8269+-0.6161 might be 1.0297x faster constant-test 7.5489+-0.0359 ? 7.6478+-0.1877 ? might be 1.0131x slower DataView-custom-properties 52.2145+-0.0903 ? 53.8785+-1.9425 ? might be 1.0319x slower delay-tear-off-arguments-strictmode 3.4387+-0.1132 3.4365+-0.2280 destructuring-arguments 7.6870+-0.2368 7.6342+-0.1405 destructuring-swap 7.5900+-0.3170 ? 7.7381+-0.1703 ? might be 1.0195x slower direct-arguments-getbyval 1.1517+-0.1675 ^ 0.9191+-0.0452 ^ definitely 1.2531x faster div-boolean-double 5.4810+-0.0492 ? 5.5032+-0.0567 ? div-boolean 9.7452+-0.2691 9.7431+-0.3802 double-get-by-val-out-of-bounds 5.5693+-0.1494 ? 5.7343+-0.3437 ? might be 1.0296x slower double-pollution-getbyval 9.6404+-0.1724 ? 9.7824+-0.1009 ? might be 1.0147x slower double-pollution-putbyoffset 5.4594+-0.4715 ? 5.5230+-0.1826 ? might be 1.0116x slower double-to-int32-typed-array-no-inline 2.7709+-0.0939 2.7698+-0.1037 double-to-int32-typed-array 2.4033+-0.0805 ? 2.5190+-0.1027 ? might be 1.0482x slower double-to-uint32-typed-array-no-inline 2.8213+-0.1178 ? 2.8251+-0.1282 ? double-to-uint32-typed-array 2.6835+-0.0261 2.6577+-0.2070 elidable-new-object-dag 55.5710+-0.7832 55.5638+-2.2426 elidable-new-object-roflcopter 204.7704+-10.5802 204.3742+-8.2687 elidable-new-object-then-call 50.8530+-4.2498 ? 53.5347+-4.2786 ? might be 1.0527x slower elidable-new-object-tree 62.2794+-2.1005 62.1555+-3.4027 empty-string-plus-int 7.2466+-0.4097 7.0193+-0.3172 might be 1.0324x faster emscripten-cube2hash 46.6963+-0.9438 ? 46.8990+-1.1831 ? external-arguments-getbyval 1.9240+-0.2425 1.8337+-0.1834 might be 1.0492x faster external-arguments-putbyval 2.7590+-0.0526 2.6832+-0.0869 might be 1.0283x faster fixed-typed-array-storage-var-index 1.5190+-0.0780 ? 1.5270+-0.1766 ? fixed-typed-array-storage 1.2151+-0.0768 1.1608+-0.0128 might be 1.0469x faster Float32Array-matrix-mult 5.8548+-0.2765 5.8149+-0.2229 Float32Array-to-Float64Array-set 73.0601+-2.2853 72.8556+-3.0007 Float64Array-alloc-long-lived 87.4623+-0.5847 ? 88.5342+-3.1487 ? might be 1.0123x slower Float64Array-to-Int16Array-set 93.5583+-2.4930 92.3798+-0.4792 might be 1.0128x faster fold-double-to-int 17.5067+-0.6611 ? 18.0450+-0.2903 ? might be 1.0307x slower fold-get-by-id-to-multi-get-by-offset-rare-int 29.2514+-16.9345 24.7902+-2.5698 might be 1.1800x faster fold-get-by-id-to-multi-get-by-offset 22.5128+-2.9109 22.4556+-2.6165 fold-multi-get-by-offset-to-get-by-offset 15.7877+-0.6457 15.7153+-0.5258 fold-multi-get-by-offset-to-poly-get-by-offset 15.9230+-0.5887 15.9133+-0.8042 fold-multi-put-by-offset-to-poly-put-by-offset 15.8453+-0.4225 ? 16.0886+-0.6052 ? might be 1.0154x slower fold-multi-put-by-offset-to-put-by-offset 14.4249+-0.2503 ? 14.9904+-0.8199 ? might be 1.0392x slower fold-multi-put-by-offset-to-replace-or-transition-put-by-offset 20.0404+-0.4657 19.4069+-0.6894 might be 1.0326x faster fold-put-by-id-to-multi-put-by-offset 25.1748+-2.7838 24.3094+-1.3948 might be 1.0356x faster fold-put-structure 14.6447+-0.5376 ? 14.8774+-0.1353 ? might be 1.0159x slower for-of-iterate-array-entries 7.4240+-0.2723 7.3740+-0.4409 for-of-iterate-array-keys 3.6555+-0.1405 ? 3.7997+-0.3241 ? might be 1.0395x slower for-of-iterate-array-values 3.2305+-0.3165 3.1957+-0.2692 might be 1.0109x faster fround 22.9187+-1.0641 22.7657+-1.3975 ftl-library-inlining-dataview 93.9007+-4.8166 ? 99.0175+-6.5571 ? might be 1.0545x slower ftl-library-inlining 88.9352+-2.8026 88.5748+-0.9149 function-dot-apply 2.2293+-0.1943 ? 2.3300+-0.1334 ? might be 1.0452x slower function-test 4.5359+-0.1543 ? 4.7162+-0.0762 ? might be 1.0398x slower function-with-eval 149.7122+-0.8282 ? 151.1223+-1.2659 ? gcse-poly-get-less-obvious 24.7608+-0.4622 ? 25.0608+-0.2556 ? might be 1.0121x slower gcse-poly-get 24.7070+-0.3734 ? 24.8321+-0.5983 ? gcse 6.4166+-0.0710 ? 6.5073+-0.0722 ? might be 1.0141x slower get-by-id-bimorphic-check-structure-elimination-simple 3.2590+-0.0984 3.2446+-0.0304 get-by-id-bimorphic-check-structure-elimination 8.2725+-0.0996 8.2372+-0.0782 get-by-id-chain-from-try-block 15.5897+-0.1690 ? 15.6422+-0.4041 ? get-by-id-check-structure-elimination 7.5925+-0.1687 7.4410+-0.0285 might be 1.0204x faster get-by-id-proto-or-self 22.6260+-1.3768 ? 23.6996+-3.1440 ? might be 1.0475x slower get-by-id-quadmorphic-check-structure-elimination-simple 3.8508+-0.1067 3.8411+-0.1301 get-by-id-self-or-proto 22.9954+-3.3652 ? 24.4495+-2.6351 ? might be 1.0632x slower get-by-val-out-of-bounds 5.5443+-0.3052 5.4931+-0.2734 get_callee_monomorphic 5.0535+-0.3405 ? 5.1132+-0.4143 ? might be 1.0118x slower get_callee_polymorphic 4.1353+-0.2274 ? 4.4653+-0.2697 ? might be 1.0798x slower getter-no-activation 5.6362+-0.1178 ? 5.6824+-0.0393 ? getter-richards 163.1965+-22.0673 ? 167.1045+-19.1510 ? might be 1.0239x slower getter 6.3940+-0.0765 ? 6.4437+-0.1016 ? global-var-const-infer-fire-from-opt 1.1672+-0.2404 ? 1.2527+-0.2943 ? might be 1.0733x slower global-var-const-infer 1.2371+-0.1161 1.1105+-0.1133 might be 1.1139x faster HashMap-put-get-iterate-keys 34.4230+-0.6120 ? 35.6689+-1.8748 ? might be 1.0362x slower HashMap-put-get-iterate 34.5616+-1.7751 34.4697+-0.6628 HashMap-string-put-get-iterate 35.6707+-3.5414 33.6670+-1.1428 might be 1.0595x faster hoist-make-rope 13.1235+-0.5725 ? 13.2109+-0.7993 ? hoist-poly-check-structure-effectful-loop 6.4924+-0.1768 6.3995+-0.0472 might be 1.0145x faster hoist-poly-check-structure 4.7814+-0.1710 ? 4.8248+-0.0954 ? imul-double-only 10.0488+-0.3711 ? 10.2844+-0.2586 ? might be 1.0234x slower imul-int-only 12.8719+-0.4025 12.6343+-0.6035 might be 1.0188x faster imul-mixed 9.3062+-0.7408 ? 9.8498+-0.6182 ? might be 1.0584x slower in-four-cases 23.8660+-0.2413 ? 23.9168+-0.6333 ? in-one-case-false 12.4479+-0.2904 ? 12.6520+-0.1368 ? might be 1.0164x slower in-one-case-true 12.4335+-0.2590 ? 12.5226+-0.2796 ? in-two-cases 13.1250+-0.1738 12.9727+-0.0709 might be 1.0117x faster indexed-properties-in-objects 3.7060+-0.0829 ? 3.7606+-0.1658 ? might be 1.0147x slower infer-closure-const-then-mov-no-inline 4.2881+-0.0911 ? 4.3007+-0.1812 ? infer-closure-const-then-mov 24.7602+-0.5497 24.7127+-0.7794 infer-closure-const-then-put-to-scope-no-inline 14.5187+-0.1495 ! 14.9987+-0.0292 ! definitely 1.0331x slower infer-closure-const-then-put-to-scope 27.9048+-0.2676 ? 28.6066+-0.7300 ? might be 1.0251x slower infer-closure-const-then-reenter-no-inline 65.2096+-0.9370 64.5441+-0.5026 might be 1.0103x faster infer-closure-const-then-reenter 27.8543+-0.2367 ? 29.6765+-5.1332 ? might be 1.0654x slower infer-constant-global-property 4.6332+-0.0832 ? 4.6900+-0.1204 ? might be 1.0123x slower infer-constant-property 3.2646+-0.1270 3.2579+-0.0822 infer-one-time-closure-ten-vars 14.6298+-0.3381 ? 14.7069+-0.3025 ? infer-one-time-closure-two-vars 14.2932+-0.2705 14.1861+-0.4914 infer-one-time-closure 14.2263+-0.1288 ? 14.2864+-0.3104 ? infer-one-time-deep-closure 24.7577+-0.2740 ? 24.9965+-0.6485 ? inline-arguments-access 2.1136+-0.1931 ? 2.1733+-0.1071 ? might be 1.0282x slower inline-arguments-aliased-access 2.3752+-0.1286 ? 2.4571+-0.2240 ? might be 1.0345x slower inline-arguments-local-escape 16.1328+-0.5212 ? 16.5412+-0.3266 ? might be 1.0253x slower inline-get-scoped-var 5.5728+-0.0746 ? 5.5815+-0.0955 ? inlined-put-by-id-transition 13.1218+-0.2677 ? 13.1536+-0.2534 ? int-or-other-abs-then-get-by-val 6.8932+-0.1161 6.8524+-0.0962 int-or-other-abs-zero-then-get-by-val 26.5175+-0.4150 ? 26.6833+-0.5170 ? int-or-other-add-then-get-by-val 5.9830+-0.0584 5.7995+-0.1484 might be 1.0316x faster int-or-other-add 7.9911+-0.1038 7.8495+-0.0543 might be 1.0180x faster int-or-other-div-then-get-by-val 5.0820+-0.1165 ? 5.0970+-0.1062 ? int-or-other-max-then-get-by-val 6.3245+-0.3413 ? 6.3945+-0.3747 ? might be 1.0111x slower int-or-other-min-then-get-by-val 5.1451+-0.1230 ? 5.2484+-0.0632 ? might be 1.0201x slower int-or-other-mod-then-get-by-val 5.0458+-0.0538 ? 5.1063+-0.1203 ? might be 1.0120x slower int-or-other-mul-then-get-by-val 4.9468+-0.1014 4.9346+-0.0886 int-or-other-neg-then-get-by-val 6.0748+-0.1652 ? 6.1409+-0.0436 ? might be 1.0109x slower int-or-other-neg-zero-then-get-by-val 26.6017+-0.2408 26.5460+-0.1917 int-or-other-sub-then-get-by-val 5.8915+-0.2709 ? 5.9030+-0.1088 ? int-or-other-sub 5.1533+-0.3121 5.1406+-0.0776 int-overflow-local 5.7534+-0.1538 5.7220+-0.0714 Int16Array-alloc-long-lived 62.0265+-0.1347 61.8510+-0.6133 Int16Array-bubble-sort-with-byteLength 39.3063+-0.8136 38.5518+-0.0922 might be 1.0196x faster Int16Array-bubble-sort 37.6570+-0.7192 ! 39.0322+-0.5283 ! definitely 1.0365x slower Int16Array-load-int-mul 1.8864+-0.1138 ? 1.9506+-0.1556 ? might be 1.0340x slower Int16Array-to-Int32Array-set 74.5745+-0.9681 ! 81.2596+-3.6655 ! definitely 1.0896x slower Int32Array-alloc-large 36.3065+-1.7250 35.8463+-0.7426 might be 1.0128x faster Int32Array-alloc-long-lived 69.8823+-5.3302 69.1545+-0.7586 might be 1.0105x faster Int32Array-alloc 4.0475+-0.1102 ? 4.1019+-0.0901 ? might be 1.0134x slower Int32Array-Int8Array-view-alloc 10.6702+-0.2123 ^ 9.5333+-0.3951 ^ definitely 1.1193x faster int52-spill 8.3802+-0.1129 ? 8.5085+-0.2520 ? might be 1.0153x slower Int8Array-alloc-long-lived 56.2452+-0.7078 ? 56.9172+-0.6820 ? might be 1.0119x slower Int8Array-load-with-byteLength 4.6512+-0.1825 ? 4.6983+-0.1487 ? might be 1.0101x slower Int8Array-load 4.6245+-0.1089 ? 4.7646+-0.1820 ? might be 1.0303x slower integer-divide 14.2258+-0.1629 ? 14.3578+-0.3399 ? integer-modulo 2.6387+-0.0729 ? 2.6414+-0.1194 ? large-int-captured 9.0430+-0.0627 ! 10.4487+-1.1203 ! definitely 1.1555x slower large-int-neg 20.7550+-0.7753 ? 21.5195+-0.7429 ? might be 1.0368x slower large-int 18.5464+-0.3107 ? 18.6848+-0.3318 ? logical-not 6.1231+-0.4764 ? 6.3469+-0.4766 ? might be 1.0366x slower lots-of-fields 15.7097+-0.6060 15.6972+-0.4143 make-indexed-storage 4.0619+-0.2349 ? 4.0664+-0.2453 ? make-rope-cse 5.6105+-0.3950 5.4175+-0.4803 might be 1.0356x faster marsaglia-larger-ints 56.3358+-1.3330 55.8514+-0.6915 marsaglia-osr-entry 28.0491+-1.0303 ? 28.4642+-0.2325 ? might be 1.0148x slower max-boolean 3.3917+-0.2588 ? 3.4443+-0.4955 ? might be 1.0155x slower method-on-number 22.8705+-0.3795 22.5939+-0.3858 might be 1.0122x faster min-boolean 3.1929+-0.1465 ? 3.2636+-0.1239 ? might be 1.0221x slower minus-boolean-double 4.1637+-0.1190 4.1442+-0.0946 minus-boolean 3.1821+-0.1083 3.1805+-0.1342 misc-strict-eq 54.5053+-0.4831 54.3568+-1.1152 mod-boolean-double 11.7489+-0.1460 11.7220+-0.1696 mod-boolean 8.8439+-0.3117 ? 8.9539+-0.0865 ? might be 1.0124x slower mul-boolean-double 4.7321+-0.0542 ? 4.8182+-0.0885 ? might be 1.0182x slower mul-boolean 3.5072+-0.0775 3.4135+-0.0965 might be 1.0275x faster neg-boolean 4.2107+-0.0359 4.1937+-0.1244 negative-zero-divide 0.4852+-0.1060 0.4785+-0.1013 might be 1.0140x faster negative-zero-modulo 0.4673+-0.0823 0.4136+-0.0060 might be 1.1300x faster negative-zero-negate 0.4390+-0.1269 0.4208+-0.0906 might be 1.0432x faster nested-function-parsing 30.9075+-0.2459 ? 31.1281+-0.6007 ? new-array-buffer-dead 3.6689+-0.1519 ? 3.8789+-0.1177 ? might be 1.0572x slower new-array-buffer-push 9.2820+-0.4603 ? 9.3702+-0.5099 ? new-array-dead 15.3121+-0.8766 ? 16.0865+-1.5383 ? might be 1.0506x slower new-array-push 6.5942+-0.2661 6.5565+-0.5921 number-test 4.0115+-0.1653 ? 4.0171+-0.0668 ? object-closure-call 7.9264+-0.1664 7.7653+-0.3465 might be 1.0208x faster object-test 4.2838+-0.0826 ? 4.3638+-0.1101 ? might be 1.0187x slower obvious-sink-pathology-taken 172.1451+-2.8009 ? 175.1265+-3.2310 ? might be 1.0173x slower obvious-sink-pathology 164.9886+-5.3608 ? 166.1791+-4.6394 ? obviously-elidable-new-object 48.5589+-4.0180 48.2702+-3.8736 plus-boolean-arith 3.2858+-0.0969 3.2552+-0.0416 plus-boolean-double 4.1774+-0.2170 ? 4.1921+-0.1474 ? plus-boolean 3.1785+-0.1188 3.1607+-0.0585 poly-chain-access-different-prototypes-simple 3.7984+-0.1525 3.7981+-0.0824 poly-chain-access-different-prototypes 3.2178+-0.0860 3.0531+-0.7585 might be 1.0539x faster poly-chain-access-simpler 3.7905+-0.0280 ? 3.8519+-0.0885 ? might be 1.0162x slower poly-chain-access 2.9117+-0.5639 ? 3.3027+-0.1437 ? might be 1.1343x slower poly-stricteq 69.3650+-1.2915 69.0439+-0.9853 polymorphic-array-call 2.2970+-0.2128 2.1092+-0.0606 might be 1.0890x faster polymorphic-get-by-id 4.0952+-0.0711 ? 4.1958+-0.0908 ? might be 1.0246x slower polymorphic-put-by-id 39.2294+-2.1692 38.2294+-1.1593 might be 1.0262x faster polymorphic-structure 21.3385+-1.9279 ? 21.9272+-2.1308 ? might be 1.0276x slower polyvariant-monomorphic-get-by-id 11.4033+-0.0890 ? 11.4489+-0.1354 ? proto-getter-access 12.7279+-0.2568 ? 12.7924+-0.2428 ? put-by-id-replace-and-transition 11.0151+-0.3127 ? 11.1229+-0.4465 ? put-by-id-slightly-polymorphic 3.4852+-0.0687 ? 3.5150+-0.0920 ? put-by-id 17.9852+-0.4618 17.8322+-0.6050 put-by-val-direct 0.7482+-0.1237 0.7067+-0.0177 might be 1.0587x faster put-by-val-large-index-blank-indexing-type 7.8951+-0.4504 ? 7.9705+-0.2599 ? put-by-val-machine-int 3.2303+-0.3855 ? 3.2842+-0.1159 ? might be 1.0167x slower rare-osr-exit-on-local 18.0760+-0.1747 ? 18.1412+-0.3403 ? register-pressure-from-osr 26.7535+-0.2567 ? 26.8575+-0.5315 ? setter 6.3466+-0.1847 ? 6.4501+-0.0136 ? might be 1.0163x slower simple-activation-demo 33.1785+-0.4988 ? 34.0226+-0.5526 ? might be 1.0254x slower simple-getter-access 17.9047+-0.2099 ? 18.3495+-0.7715 ? might be 1.0248x slower simple-poly-call-nested 21.7200+-1.6979 ? 21.9705+-0.1700 ? might be 1.0115x slower simple-poly-call 1.6025+-0.1957 1.5868+-0.0401 sin-boolean 23.9661+-3.4595 ? 26.9938+-0.8365 ? might be 1.1263x slower sinkable-new-object-dag 94.0320+-2.0098 ? 94.8826+-2.8790 ? sinkable-new-object-taken 70.2483+-3.7275 69.6060+-5.2938 sinkable-new-object 53.2799+-0.8458 ? 53.8108+-2.2045 ? slow-array-profile-convergence 3.6810+-0.1600 ? 3.7615+-0.1248 ? might be 1.0219x slower slow-convergence 4.3352+-0.2208 ? 4.4637+-0.1897 ? might be 1.0297x slower sparse-conditional 1.4688+-0.0985 ? 1.5356+-0.1022 ? might be 1.0455x slower splice-to-remove 20.9070+-0.4106 ? 21.1763+-1.6196 ? might be 1.0129x slower string-char-code-at 19.8657+-0.3577 19.8585+-0.2562 string-concat-object 2.8213+-0.2193 ? 2.8554+-0.3014 ? might be 1.0121x slower string-concat-pair-object 2.7590+-0.2966 ? 2.8172+-0.3835 ? might be 1.0211x slower string-concat-pair-simple 15.8339+-0.1456 ? 16.1069+-1.3236 ? might be 1.0172x slower string-concat-simple 15.6535+-0.3696 ? 15.7447+-0.4491 ? string-cons-repeat 10.4403+-0.5543 10.1321+-0.3170 might be 1.0304x faster string-cons-tower 9.9198+-1.0037 9.5771+-0.1385 might be 1.0358x faster string-equality 22.2883+-0.1962 22.2236+-0.3905 string-get-by-val-big-char 9.2297+-0.4100 ? 9.3112+-0.2715 ? string-get-by-val-out-of-bounds-insane 5.5040+-0.1327 ? 5.5395+-0.2915 ? string-get-by-val-out-of-bounds 6.7584+-0.0932 6.7453+-0.1603 string-get-by-val 4.7267+-0.1045 ? 4.7645+-0.0789 ? string-hash 2.6585+-0.1358 2.6417+-0.1434 string-long-ident-equality 18.1840+-0.3323 18.0113+-0.1369 string-repeat-arith 39.4503+-0.5526 ! 41.0795+-0.9962 ! definitely 1.0413x slower string-sub 78.4747+-1.5069 ? 79.8898+-1.2734 ? might be 1.0180x slower string-test 4.1373+-0.0767 4.1347+-0.1294 string-var-equality 43.5780+-0.6794 ? 43.8174+-0.8919 ? structure-hoist-over-transitions 3.3333+-0.1540 3.1860+-0.1326 might be 1.0462x faster substring-concat-weird 53.6245+-1.1975 53.3493+-0.5646 substring-concat 56.6407+-1.3943 56.2551+-0.5115 substring 62.9977+-0.6464 62.0081+-1.1299 might be 1.0160x faster switch-char-constant 3.2568+-0.1646 ? 3.2861+-0.1044 ? switch-char 7.8463+-0.1058 7.8175+-0.1357 switch-constant 11.4590+-2.2027 ? 12.3243+-2.1224 ? might be 1.0755x slower switch-string-basic-big-var 21.9222+-1.2503 ? 22.0660+-2.8625 ? switch-string-basic-big 19.9141+-0.8258 ? 19.9876+-0.8594 ? switch-string-basic-var 26.0380+-1.2340 ? 26.2941+-1.1926 ? switch-string-basic 19.2064+-2.3701 19.1860+-2.0133 switch-string-big-length-tower-var 25.1396+-0.1109 25.0089+-0.2260 switch-string-length-tower-var 19.9990+-0.4334 19.9460+-0.1692 switch-string-length-tower 14.7502+-0.1301 ? 14.7640+-0.3444 ? switch-string-short 15.0298+-0.6074 ? 15.1608+-0.4644 ? switch 14.0341+-1.4057 ? 16.5824+-3.0631 ? might be 1.1816x slower tear-off-arguments-simple 2.3437+-0.1921 ! 3.1897+-0.1912 ! definitely 1.3610x slower tear-off-arguments 3.5735+-0.1720 ! 5.0122+-0.1826 ! definitely 1.4026x slower temporal-structure 17.2239+-0.4271 17.1790+-0.1013 to-int32-boolean 20.7047+-0.0645 ? 20.7589+-0.1450 ? undefined-property-access 470.1378+-2.5664 ? 472.0942+-5.2029 ? undefined-test 4.2598+-0.1003 4.2205+-0.1236 unprofiled-licm 27.5986+-1.0819 ? 27.8730+-0.5970 ? weird-inlining-const-prop 2.5485+-0.1335 2.5020+-0.1450 might be 1.0186x faster <arithmetic> 22.7825+-0.2083 ? 22.9318+-0.1107 ? might be 1.0066x slower <geometric> * 10.5657+-0.0583 ? 10.6323+-0.0271 ? might be 1.0063x slower <harmonic> 5.3331+-0.0504 5.3099+-0.0621 might be 1.0044x faster Baseline AllocScope AsmBench: bigfib.cpp 670.0192+-6.0308 ? 677.3997+-9.2395 ? might be 1.0110x slower cray.c 677.3695+-3.0056 ? 696.2100+-16.7270 ? might be 1.0278x slower dry.c 650.0281+-14.2978 ? 652.9977+-14.8855 ? FloatMM.c 958.8400+-1.5697 ^ 951.3310+-2.6594 ^ definitely 1.0079x faster gcc-loops.cpp 5878.2693+-10.8498 ^ 5843.7734+-17.5757 ^ definitely 1.0059x faster n-body.c 1677.4102+-19.6161 1674.9329+-6.0146 Quicksort.c 585.9640+-5.2865 580.9025+-7.5367 stepanov_container.cpp 4867.1586+-31.8535 ? 4909.5021+-47.3553 ? Towers.c 403.7853+-7.9287 395.2675+-1.0754 might be 1.0215x faster <arithmetic> 1818.7605+-6.7981 ? 1820.2574+-6.4697 ? might be 1.0008x slower <geometric> * 1138.4168+-5.7544 ? 1139.2204+-6.7565 ? might be 1.0007x slower <harmonic> 840.6911+-5.6773 839.6403+-6.1396 might be 1.0013x faster Baseline AllocScope CompressionBench: huffman 667.1564+-7.1866 657.6157+-22.0066 might be 1.0145x faster arithmetic-simple 574.0845+-7.1127 ? 579.2909+-2.1205 ? arithmetic-precise 423.6301+-6.3894 ! 436.7186+-2.8367 ! definitely 1.0309x slower arithmetic-complex-precise 418.9545+-2.8932 ! 434.2194+-4.3982 ! definitely 1.0364x slower arithmetic-precise-order-0 614.0566+-11.1114 ! 647.0005+-8.2223 ! definitely 1.0536x slower arithmetic-precise-order-1 471.7182+-13.9830 464.0565+-2.8325 might be 1.0165x faster arithmetic-precise-order-2 530.8475+-12.4801 523.5511+-10.5255 might be 1.0139x faster arithmetic-simple-order-1 589.6745+-3.2710 ? 597.5333+-5.0399 ? might be 1.0133x slower arithmetic-simple-order-2 659.4534+-2.8031 ! 672.6242+-6.5873 ! definitely 1.0200x slower lz-string 437.2666+-3.6018 ? 443.1809+-13.1481 ? might be 1.0135x slower <arithmetic> 538.6842+-2.2597 ! 545.5791+-0.4028 ! definitely 1.0128x slower <geometric> * 530.8374+-2.1003 ! 537.7425+-0.8430 ! definitely 1.0130x slower <harmonic> 522.9858+-1.9318 ! 529.9851+-1.3681 ! definitely 1.0134x slower Baseline AllocScope All benchmarks: <arithmetic> 137.5866+-0.5290 ? 137.6790+-0.2213 ? might be 1.0007x slower <geometric> 17.7015+-0.0921 ? 17.7919+-0.0326 ? might be 1.0051x slower <harmonic> 4.4671+-0.0407 4.4601+-0.0343 might be 1.0016x faster Baseline AllocScope Geomean of preferred means: <scaled-result> 88.1409+-0.3576 ? 88.4495+-0.1275 ? might be 1.0035x slower
Geoffrey Garen
Comment 26 2014-11-21 10:18:34 PST
Comment on attachment 242013 [details] Patch with fix for performance regression and an added test View in context: https://bugs.webkit.org/attachment.cgi?id=242013&action=review > Source/JavaScriptCore/debugger/DebuggerCallFrame.cpp:149 > + if (codeBlock && codeBlock->scopeRegister().isValid()) > + scope = m_callFrame->scope(codeBlock->scopeRegister().offset()); I think this is wrong. If A calls B, A is DFG-compiled, B hits a breakpoint, and the user inspects the stack frame for A, the debugger will read a potentially uninitialized or scribbled stack slot for A. You should test this case, see what trunk does, and probably put in a specific work-around to avoid trying to load the scope chain in this case.
Geoffrey Garen
Comment 27 2014-11-21 10:51:25 PST
Comment on attachment 242013 [details] Patch with fix for performance regression and an added test I think the right solution is for the DFG's StackLayoutPhase to treat the scope register as special. Another option might just to mark the scope register as flushed or captured if and only if the debugger is enabled, but I'm not as clear on how to get that to work.
Geoffrey Garen
Comment 28 2014-11-21 10:54:42 PST
Comment on attachment 242013 [details] Patch with fix for performance regression and an added test View in context: https://bugs.webkit.org/attachment.cgi?id=242013&action=review r=me if you fix the things mentioned above. > Source/JavaScriptCore/jit/JITOperations.cpp:1320 > exec->uncheckedR(dst) = scope; This needs a FIXME that says this won't work if called from the DFG or FTL. > Source/JavaScriptCore/jit/JITOperations.cpp:1345 > + exec->uncheckedR(dst) = JSWithScope::create(exec, o, currentScope); Ditto. > LayoutTests/inspector-protocol/debugger/resources/breakpoint.js:104 > +function dfgWithoutInline3() Can you add to this a test for loading a storing a local variable?
Michael Saboff
Comment 29 2014-11-21 10:59:28 PST
(In reply to comment #28) > Comment on attachment 242013 [details] > Patch with fix for performance regression and an added test ... > > LayoutTests/inspector-protocol/debugger/resources/breakpoint.js:104 > > +function dfgWithoutInline3() > > Can you add to this a test for loading a storing a local variable? I assume you meant "loading *and storing". Don't we expect that to generally not work? The DFG is free to move locals around on the stack.
Geoffrey Garen
Comment 30 2014-11-21 11:10:45 PST
> I assume you meant "loading *and storing". Yup. > Don't we expect that to > generally not work? The DFG is free to move locals around on the stack. The debugger marks all locals as captured, so it should work.
Michael Saboff
Comment 31 2014-11-21 15:41:13 PST
Note You need to log in before you can comment on or make changes to this bug.