WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
176601
Turn recursive tail calls into loops
https://bugs.webkit.org/show_bug.cgi?id=176601
Summary
Turn recursive tail calls into loops
Robin Morisset
Reported
2017-09-08 08:16:35 PDT
As indicated in the title, we would like to turn recursive tail calls into simple loop, preferably before optimizations like LICM for optimal performance. We expect this to be a big win on the (admittedly contrived) benchmark in
https://bugs.webkit.org/show_bug.cgi?id=175754
. There are several difficulties. The main one is the interaction with inlining. Consider the following example: A() calls B() that calls C() that tail calls B (). If the inliner inlines C into B into A, then we want to insert a jump from the end of C to the beginning of B, in A. But currently the beginning of B might be in the middle of a basic block, so we must change the inliner to split the basic block. More surprisingly, the transformation itself must also happen at the same time and not in a later phase (for example after Fixup). Otherwise the intervening passes would not see the control flow edge, and might hoist parts of B() above the entry point of B (into the original code of A). So I will do the basic block splitting in DFGBytecodeParser::inlineCall and the actual transformation in DFGBytecodeParser::handleInlining. In the first time, we will only do the optimization for monomorphic non-varargs calls for simplicity. It should be straightforward (or at least possible) to extend it to the other cases later on.
Attachments
Patch
(4.86 KB, patch)
2017-09-15 05:39 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews103 for mac-elcapitan
(399.22 KB, application/zip)
2017-09-15 06:33 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews113 for mac-elcapitan
(579.73 KB, application/zip)
2017-09-15 06:36 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews106 for mac-elcapitan-wk2
(365.14 KB, application/zip)
2017-09-15 06:37 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews125 for ios-simulator-wk2
(881.88 KB, application/zip)
2017-09-15 07:03 PDT
,
Build Bot
no flags
Details
Patch
(12.70 KB, patch)
2017-09-18 07:41 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
WIP
(37.46 KB, patch)
2017-09-22 06:32 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews114 for mac-elcapitan
(1.27 MB, application/zip)
2017-09-22 07:14 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews105 for mac-elcapitan-wk2
(847.37 KB, application/zip)
2017-09-22 07:25 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews103 for mac-elcapitan
(422.87 KB, application/zip)
2017-09-22 07:30 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews121 for ios-simulator-wk2
(1.08 MB, application/zip)
2017-09-22 07:41 PDT
,
Build Bot
no flags
Details
Still WIP, still buggy
(51.83 KB, patch)
2017-09-25 03:23 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
WIP, stil trying to split the entry block, still buggy
(51.79 KB, patch)
2017-09-25 06:13 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
WIP, optimisation commented out to test refactoring, and block splitting
(51.84 KB, patch)
2017-09-25 07:23 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews100 for mac-elcapitan
(504.85 KB, application/zip)
2017-09-25 08:09 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews107 for mac-elcapitan-wk2
(363.69 KB, application/zip)
2017-09-25 08:16 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews112 for mac-elcapitan
(496.05 KB, application/zip)
2017-09-25 08:19 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews126 for ios-simulator-wk2
(1.03 MB, application/zip)
2017-09-25 08:39 PDT
,
Build Bot
no flags
Details
WIP, frustrating heisenbug
(51.63 KB, patch)
2017-09-26 09:36 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
WIP, some more bugs
(54.17 KB, patch)
2017-09-27 08:37 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch
(55.11 KB, patch)
2017-09-27 10:52 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
WIP, fix another bug
(55.22 KB, patch)
2017-09-28 00:32 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
WIP, more refactoring, probably still buggy
(63.87 KB, patch)
2017-09-29 07:01 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews104 for mac-elcapitan-wk2
(1.47 MB, application/zip)
2017-09-29 08:12 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews112 for mac-elcapitan
(2.53 MB, application/zip)
2017-09-29 08:40 PDT
,
Build Bot
no flags
Details
Archive of layout-test-results from ews103 for mac-elcapitan
(2.16 MB, application/zip)
2017-09-29 08:55 PDT
,
Build Bot
no flags
Details
Fix stupid bug in handleInlining, everything optimizing tail calls is commented out, it should hopefully pass ews tests this time
(66.86 KB, patch)
2017-10-02 15:18 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Known buggy, submitting to EWS to find more failing tests
(66.88 KB, patch)
2017-10-03 05:20 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Commented out the buggy splitting of entry block
(66.93 KB, patch)
2017-10-03 10:43 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch
(11.40 KB, patch)
2017-10-11 13:42 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Fixed, this passes all the stress tests, but I want some more testing before asking for review
(11.88 KB, patch)
2017-10-11 15:40 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Fixed a horrible design bug, and the performance regression
(13.46 KB, patch)
2017-10-12 13:44 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Moved the splitting of op_enter to UnlinkedCodeBlocks, and correctly recognize when it has happened
(18.65 KB, patch)
2017-10-13 11:17 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews114 for mac-elcapitan
(2.00 MB, application/zip)
2017-10-13 18:37 PDT
,
Build Bot
no flags
Details
Patch
(21.41 KB, patch)
2017-10-18 07:44 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch
(22.67 KB, patch)
2017-10-18 08:57 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch
(23.66 KB, patch)
2017-10-19 04:00 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
WIP, for sharing with Saam
(26.36 KB, patch)
2017-10-31 10:29 PDT
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Patch for landing
(25.12 KB, patch)
2017-11-08 12:11 PST
,
Robin Morisset
no flags
Details
Formatted Diff
Diff
Show Obsolete
(38)
View All
Add attachment
proposed patch, testcase, etc.
Robin Morisset
Comment 1
2017-09-15 05:39:02 PDT
Created
attachment 320892
[details]
Patch
Robin Morisset
Comment 2
2017-09-15 05:39:53 PDT
This patch is WIP: it compiles but does not goes through tests cleanly yet.
Build Bot
Comment 3
2017-09-15 05:42:00 PDT
Attachment 320892
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/ChangeLog:9: Line contains tab character. [whitespace/tab] [5] ERROR: Source/JavaScriptCore/ChangeLog:10: Line contains tab character. [whitespace/tab] [5] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:6282: Wrong number of spaces before statement. (expected: 8) [whitespace/indent] [4] Total errors found: 3 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Build Bot
Comment 4
2017-09-15 06:33:10 PDT
Comment on
attachment 320892
[details]
Patch
Attachment 320892
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/4557755
Number of test failures exceeded the failure limit.
Build Bot
Comment 5
2017-09-15 06:33:11 PDT
Created
attachment 320897
[details]
Archive of layout-test-results from ews103 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 6
2017-09-15 06:36:55 PDT
Comment on
attachment 320892
[details]
Patch
Attachment 320892
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/4557737
Number of test failures exceeded the failure limit.
Build Bot
Comment 7
2017-09-15 06:36:56 PDT
Created
attachment 320898
[details]
Archive of layout-test-results from ews113 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews113 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 8
2017-09-15 06:37:49 PDT
Comment on
attachment 320892
[details]
Patch
Attachment 320892
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/4557763
Number of test failures exceeded the failure limit.
Build Bot
Comment 9
2017-09-15 06:37:51 PDT
Created
attachment 320899
[details]
Archive of layout-test-results from ews106 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews106 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6
Build Bot
Comment 10
2017-09-15 07:02:59 PDT
Comment on
attachment 320892
[details]
Patch
Attachment 320892
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
http://webkit-queues.webkit.org/results/4557797
Number of test failures exceeded the failure limit.
Build Bot
Comment 11
2017-09-15 07:03:01 PDT
Created
attachment 320900
[details]
Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.5
Robin Morisset
Comment 12
2017-09-18 07:41:34 PDT
Created
attachment 321090
[details]
Patch WIP, still crashes, and does not SetLocal arguments
Robin Morisset
Comment 13
2017-09-22 06:32:08 PDT
Created
attachment 321539
[details]
WIP
Build Bot
Comment 14
2017-09-22 07:14:43 PDT
Comment on
attachment 321539
[details]
WIP
Attachment 321539
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/4627359
Number of test failures exceeded the failure limit.
Build Bot
Comment 15
2017-09-22 07:14:44 PDT
Created
attachment 321543
[details]
Archive of layout-test-results from ews114 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 16
2017-09-22 07:25:17 PDT
Comment on
attachment 321539
[details]
WIP
Attachment 321539
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/4627484
Number of test failures exceeded the failure limit.
Build Bot
Comment 17
2017-09-22 07:25:18 PDT
Created
attachment 321544
[details]
Archive of layout-test-results from ews105 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews105 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6
Build Bot
Comment 18
2017-09-22 07:30:32 PDT
Comment on
attachment 321539
[details]
WIP
Attachment 321539
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/4627548
Number of test failures exceeded the failure limit.
Build Bot
Comment 19
2017-09-22 07:30:33 PDT
Created
attachment 321546
[details]
Archive of layout-test-results from ews103 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 20
2017-09-22 07:41:10 PDT
Comment on
attachment 321539
[details]
WIP
Attachment 321539
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
http://webkit-queues.webkit.org/results/4627420
Number of test failures exceeded the failure limit.
Build Bot
Comment 21
2017-09-22 07:41:11 PDT
Created
attachment 321547
[details]
Archive of layout-test-results from ews121 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews121 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.6
Saam Barati
Comment 22
2017-09-22 10:12:04 PDT
Comment on
attachment 321539
[details]
WIP View in context:
https://bugs.webkit.org/attachment.cgi?id=321539&action=review
> Source/JavaScriptCore/ChangeLog:15 > + This makes it a lot easier to make sure that they are in the "right" inlineStackEntry, i.e. the one corresponding to the function that their successor comes from.
I'm not sure I follow this comment. What's "right" here? What's "successor"?
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1652 > + // If the current block is not empty, we must split it so that recursive tail calls in the inlinee have somewhere to jump to.
I don't think you mean empty here given assert below. Perhaps you just mean if the block only has code for the op_call?
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1661 > + if (m_currentIndex != m_currentBlock->bytecodeBegin) { > + ASSERT(!m_currentBlock->isEmpty()); > + Node* jumpNode = addToGraph(Jump); > + Ref<BasicBlock> block = adoptRef(*new BasicBlock(m_currentIndex, m_numArguments, m_numLocals, m_currentBlock->executionCount)); > + jumpNode->targetBlock() = block.ptr(); > + m_inlineStackTop->m_blockLinkingTargets.append(block.ptr()); > + m_currentBlock = block.ptr(); > + m_graph.appendBlock(WTFMove(block)); > + }
This split needs to happen after we process the setLocalQueue we may have appended to above. Otherwise, if we split, we'll generate invalid IR. I think a simple tail call that needs arity fixup will trigger this.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:6352 > + Ref<BasicBlock> block = adoptRef(*new BasicBlock(m_currentIndex, m_numArguments, m_numLocals, 1)); // TODO: why 1 ?
Maybe we just start all blocks at 1? I know we change this value in the StaticExecutionCountEstimationPhase
Robin Morisset
Comment 23
2017-09-25 03:23:09 PDT
Created
attachment 321675
[details]
Still WIP, still buggy
Robin Morisset
Comment 24
2017-09-25 06:13:34 PDT
Created
attachment 321678
[details]
WIP, stil trying to split the entry block, still buggy
Robin Morisset
Comment 25
2017-09-25 07:23:07 PDT
Created
attachment 321681
[details]
WIP, optimisation commented out to test refactoring, and block splitting
Build Bot
Comment 26
2017-09-25 07:24:46 PDT
Attachment 321681
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1311: Should have a space between // and comment [whitespace/comments] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1312: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4136: WTF_CONCAT is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4138: Should be indented on a separate line, with the colon or comma first on that line. [whitespace/indent] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4144: Wrong number of spaces before statement. (expected: 12) [whitespace/indent] [4] Total errors found: 5 in 4 files If any of these errors are false positives, please file a bug against check-webkit-style.
Build Bot
Comment 27
2017-09-25 08:09:42 PDT
Comment on
attachment 321681
[details]
WIP, optimisation commented out to test refactoring, and block splitting
Attachment 321681
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/4649225
Number of test failures exceeded the failure limit.
Build Bot
Comment 28
2017-09-25 08:09:43 PDT
Created
attachment 321685
[details]
Archive of layout-test-results from ews100 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 29
2017-09-25 08:16:37 PDT
Comment on
attachment 321681
[details]
WIP, optimisation commented out to test refactoring, and block splitting
Attachment 321681
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/4649246
Number of test failures exceeded the failure limit.
Build Bot
Comment 30
2017-09-25 08:16:38 PDT
Created
attachment 321686
[details]
Archive of layout-test-results from ews107 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6
Build Bot
Comment 31
2017-09-25 08:19:04 PDT
Comment on
attachment 321681
[details]
WIP, optimisation commented out to test refactoring, and block splitting
Attachment 321681
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/4649236
Number of test failures exceeded the failure limit.
Build Bot
Comment 32
2017-09-25 08:19:05 PDT
Created
attachment 321687
[details]
Archive of layout-test-results from ews112 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews112 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 33
2017-09-25 08:39:46 PDT
Comment on
attachment 321681
[details]
WIP, optimisation commented out to test refactoring, and block splitting
Attachment 321681
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
http://webkit-queues.webkit.org/results/4649342
Number of test failures exceeded the failure limit.
Build Bot
Comment 34
2017-09-25 08:39:47 PDT
Created
attachment 321689
[details]
Archive of layout-test-results from ews126 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews126 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.6
Robin Morisset
Comment 35
2017-09-26 09:36:52 PDT
Created
attachment 321826
[details]
WIP, frustrating heisenbug
Keith Miller
Comment 36
2017-09-26 09:54:51 PDT
Comment on
attachment 321826
[details]
WIP, frustrating heisenbug View in context:
https://bugs.webkit.org/attachment.cgi?id=321826&action=review
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:83 > +#define VERBOSE_LOG(...) do { \ > +if (Options::verboseDFGByteCodeParsing()) \ > + dataLog(__VA_ARGS__); \ > +} while (0)
Nit: You could make this #define VERBOSE_LOG(...) dataLogIf(Options::verboseDFGByteCodeParsing(), __VA_ARGS__);
Saam Barati
Comment 37
2017-09-26 10:07:09 PDT
Comment on
attachment 321826
[details]
WIP, frustrating heisenbug View in context:
https://bugs.webkit.org/attachment.cgi?id=321826&action=review
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1278 > + for (int i = 0; i < stackEntry->m_codeBlock->m_numVars; ++i) > + set(virtualRegisterForLocal(i), undefined, ImmediateNakedSet); // TODO: check that it is still reasonable to do this "ImmediateNakedSet" stuff here.
This is wrong. You need setDirect here and you need to map the virtual register manually
Robin Morisset
Comment 38
2017-09-27 08:37:41 PDT
Created
attachment 321960
[details]
WIP, some more bugs
Robin Morisset
Comment 39
2017-09-27 10:52:49 PDT
Created
attachment 321974
[details]
Patch
Robin Morisset
Comment 40
2017-09-28 00:32:15 PDT
Created
attachment 322070
[details]
WIP, fix another bug
Robin Morisset
Comment 41
2017-09-29 07:01:05 PDT
Created
attachment 322183
[details]
WIP, more refactoring, probably still buggy
Build Bot
Comment 42
2017-09-29 07:03:33 PDT
Attachment 322183
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/ChangeLog:29: Line contains tab character. [whitespace/tab] [5] ERROR: Source/JavaScriptCore/ChangeLog:30: Line contains tab character. [whitespace/tab] [5] ERROR: Source/JavaScriptCore/ChangeLog:31: Line contains tab character. [whitespace/tab] [5] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1342: Should have a space between // and comment [whitespace/comments] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1343: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4166: WTF_CONCAT is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4168: Should be indented on a separate line, with the colon or comma first on that line. [whitespace/indent] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4173: Wrong number of spaces before statement. (expected: 12) [whitespace/indent] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:6285: Wrong number of spaces before statement. (expected: 8) [whitespace/indent] [4] Total errors found: 9 in 4 files If any of these errors are false positives, please file a bug against check-webkit-style.
Build Bot
Comment 43
2017-09-29 07:49:12 PDT
Comment on
attachment 322183
[details]
WIP, more refactoring, probably still buggy
Attachment 322183
[details]
did not pass jsc-ews (mac): Output:
http://webkit-queues.webkit.org/results/4697967
New failing tests: stress/v8-deltablue-strict.js.ftl-eager v8-v6/v8-deltablue.js.ftl-eager-no-cjit stress/v8-richards-strict.js.ftl-no-cjit-validate-sampling-profiler wasm.yaml/wasm/function-tests/stack-overflow.js.default-wasm stress/spread-calling.js.ftl-no-cjit-no-put-stack-validate microbenchmarks/delta-blue-try-catch.js.ftl-eager-no-cjit stress/spread-calling.js.ftl-no-cjit-validate-sampling-profiler v8-v6/v8-deltablue.js.default stress/v8-deltablue-strict.js.ftl-no-cjit-b3o1 v8-v6/v8-richards.js.default microbenchmarks/delta-blue-try-catch.js.ftl-no-cjit-validate-sampling-profiler stress/v8-deltablue-strict.js.ftl-eager-no-cjit-b3o1 stress/v8-deltablue-strict.js.ftl-no-cjit-validate-sampling-profiler stress/spread-non-array.js.ftl-eager stress/v8-richards-strict.js.ftl-no-cjit-no-put-stack-validate stress/arrowfunction-lexical-bind-arguments-strict.js.ftl-eager-no-cjit-b3o1 v8-v6/v8-deltablue.js.ftl-no-cjit-no-put-stack-validate stress/const-tdz.js.ftl-eager-no-cjit-b3o1 microbenchmarks/delta-blue-try-catch.js.ftl-eager-no-cjit-b3o1 microbenchmarks/getter-richards.js.default basic-tests.yaml/stress-test.js.ftl-eager-no-cjit stress/arrowfunction-lexical-bind-arguments-strict.js.ftl-eager-no-cjit v8-v6/v8-deltablue.js.ftl-eager stress/arrowfunction-lexical-bind-arguments-non-strict-2.js.ftl-eager-no-cjit stress/spread-calling.js.default stress/v8-richards-strict.js.default stress/arguments-elimination-varargs-too-many-args-arg-count.js.ftl-eager-no-cjit-b3o1 v8-v6/v8-richards.js.ftl-no-cjit-validate-sampling-profiler v8-v6/v8-richards.js.ftl-eager stress/array-copywithin.js.ftl-eager-no-cjit v8-v6/v8-richards.js.ftl-no-cjit-no-put-stack-validate stress/v8-deltablue-strict.js.default cdjs-tests.yaml/main.js.default v8-v6/v8-deltablue.js.ftl-no-cjit-validate-sampling-profiler wasm.yaml/wasm/function-tests/memory-access-past-4gib.js.wasm-no-cjit-yes-tls-context microbenchmarks/deltablue-for-of.js.ftl-eager microbenchmarks/delta-blue-try-catch.js.ftl-no-cjit-no-put-stack-validate wasm.yaml/wasm/js-api/export-arity.js.wasm-no-cjit-yes-tls-context stress/spread-non-array.js.ftl-no-cjit-no-put-stack-validate stress/arguments-elimination-varargs-too-many-args-arg-count.js.ftl-eager-no-cjit microbenchmarks/deltablue-for-of.js.ftl-eager-no-cjit microbenchmarks/fake-iterators-that-throw-when-finished.js.ftl-eager microbenchmarks/deltablue-for-of.js.ftl-eager-no-cjit-b3o1 stress/const-tdz.js.ftl-eager-no-cjit stress/arrowfunction-lexical-bind-arguments-non-strict-2.js.ftl-eager-no-cjit-b3o1 stress/v8-deltablue-strict.js.ftl-no-cjit-no-put-stack-validate microbenchmarks/delta-blue-try-catch.js.ftl-eager stress/spread-non-array.js.default basic-tests.yaml/stress-test.js.default wasm.yaml/wasm/fuzz/export-function.js.wasm-slow-memory basic-tests.yaml/stress-test.js.ftl-no-cjit microbenchmarks/fake-iterators-that-throw-when-finished.js.ftl-no-cjit-validate-sampling-profiler stress/spread-non-array.js.ftl-no-cjit-b3o1 v8-v6/v8-deltablue.js.ftl-eager-no-cjit-b3o1 cdjs-tests.yaml/main.js.ftl-no-cjit wasm.yaml/wasm/function-tests/stack-overflow.js.wasm-no-cjit-yes-tls-context microbenchmarks/delta-blue-try-catch.js.ftl-no-cjit-b3o1 v8-v6/v8-deltablue.js.ftl-no-cjit-b3o1 microbenchmarks/fake-iterators-that-throw-when-finished.js.ftl-no-cjit-b3o1 wasm.yaml/wasm/function-tests/stack-overflow.js.wasm-no-call-ic microbenchmarks/fake-iterators-that-throw-when-finished.js.default jsc-layout-tests.yaml/js/script-tests/arrowfunction-lexical-bind-arguments-non-strict.js.layout-ftl-eager-no-cjit stress/lexical-let-tdz.js.ftl-eager-no-cjit stress/lexical-let-tdz.js.ftl-eager-no-cjit-b3o1 stress/spread-non-array.js.ftl-no-cjit-validate-sampling-profiler v8-v6/v8-richards.js.ftl-no-cjit-b3o1 stress/spread-calling.js.ftl-no-cjit-b3o1 microbenchmarks/delta-blue-try-catch.js.default wasm.yaml/wasm/js-api/export-arity.js.default-wasm wasm.yaml/wasm/function-tests/stack-overflow.js.wasm-slow-memory stress/v8-richards-strict.js.ftl-no-cjit-b3o1 wasm.yaml/wasm/fuzz/memory.js.wasm-eager-jettison wasm.yaml/wasm/function-tests/stack-overflow.js.wasm-eager-jettison stress/const-loop-semantics.js.ftl-eager-no-cjit-b3o1 microbenchmarks/fake-iterators-that-throw-when-finished.js.ftl-no-cjit-no-put-stack-validate jsc-layout-tests.yaml/js/script-tests/arrowfunction-lexical-bind-arguments-strict.js.layout-ftl-eager-no-cjit stress/v8-deltablue-strict.js.ftl-eager-no-cjit stress/spread-non-array.js.ftl-eager-no-cjit-b3o1 wasm.yaml/wasm/js-api/export-arity.js.wasm-slow-memory
Build Bot
Comment 44
2017-09-29 08:12:46 PDT
Comment on
attachment 322183
[details]
WIP, more refactoring, probably still buggy
Attachment 322183
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/4698039
Number of test failures exceeded the failure limit.
Build Bot
Comment 45
2017-09-29 08:12:48 PDT
Created
attachment 322185
[details]
Archive of layout-test-results from ews104 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6
Build Bot
Comment 46
2017-09-29 08:40:07 PDT
Comment on
attachment 322183
[details]
WIP, more refactoring, probably still buggy
Attachment 322183
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/4698098
New failing tests: imported/w3c/web-platform-tests/dom/ranges/Range-mutations-dataChange.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html imported/w3c/web-platform-tests/dom/ranges/Range-intersectsNode.html imported/w3c/web-platform-tests/html/syntax/parsing/named-character-references.html imported/w3c/web-platform-tests/WebCryptoAPI/generateKey/test_failures.https.html imported/w3c/web-platform-tests/encoding/textdecoder-labels.html imported/w3c/web-platform-tests/encoding/api-invalid-label.html workers/wasm-long-compile.html imported/w3c/web-platform-tests/html/the-xhtml-syntax/parsing-xhtml-documents/xhtml-mathml-dtd-entity-10.htm
Build Bot
Comment 47
2017-09-29 08:40:09 PDT
Created
attachment 322186
[details]
Archive of layout-test-results from ews112 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews112 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 48
2017-09-29 08:55:03 PDT
Comment on
attachment 322183
[details]
WIP, more refactoring, probably still buggy
Attachment 322183
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/4698189
New failing tests: imported/w3c/web-platform-tests/dom/ranges/Range-mutations-dataChange.html imported/w3c/web-platform-tests/dom/ranges/Range-intersectsNode.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html imported/w3c/web-platform-tests/dom/ranges/Range-isPointInRange.html imported/w3c/web-platform-tests/encoding/textdecoder-labels.html imported/w3c/web-platform-tests/dom/ranges/Range-mutations.html imported/w3c/web-platform-tests/html/dom/reflection-grouping.html imported/w3c/web-platform-tests/html/dom/reflection-misc.html imported/w3c/web-platform-tests/dom/ranges/Range-compareBoundaryPoints.html imported/w3c/web-platform-tests/dom/ranges/Range-comparePoint.html imported/w3c/web-platform-tests/WebCryptoAPI/derive_bits_keys/test_hkdf.https.html imported/w3c/web-platform-tests/html/dom/reflection-metadata.html imported/w3c/web-platform-tests/html/dom/reflection-sections.html imported/w3c/web-platform-tests/WebCryptoAPI/generateKey/test_failures.https.html imported/w3c/web-platform-tests/encoding/api-invalid-label.html imported/w3c/web-platform-tests/html/dom/reflection-forms.html imported/w3c/web-platform-tests/html/dom/reflection-text.html imported/w3c/web-platform-tests/dom/ranges/Range-set.html imported/w3c/web-platform-tests/html/dom/reflection-obsolete.html imported/w3c/web-platform-tests/html/dom/reflection-tabular.html imported/w3c/web-platform-tests/html/syntax/parsing/named-character-references.html imported/w3c/web-platform-tests/html/dom/reflection-embedded.html workers/wasm-long-compile.html imported/w3c/web-platform-tests/html/the-xhtml-syntax/parsing-xhtml-documents/xhtml-mathml-dtd-entity-10.htm
Build Bot
Comment 49
2017-09-29 08:55:09 PDT
Created
attachment 322187
[details]
Archive of layout-test-results from ews103 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Robin Morisset
Comment 50
2017-10-02 15:18:15 PDT
Created
attachment 322455
[details]
Fix stupid bug in handleInlining, everything optimizing tail calls is commented out, it should hopefully pass ews tests this time
Build Bot
Comment 51
2017-10-02 15:20:32 PDT
Attachment 322455
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1342: Should have a space between // and comment [whitespace/comments] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1343: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4177: WTF_CONCAT is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4179: Should be indented on a separate line, with the colon or comma first on that line. [whitespace/indent] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4184: Wrong number of spaces before statement. (expected: 12) [whitespace/indent] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:6265: Wrong number of spaces before statement. (expected: 8) [whitespace/indent] [4] Total errors found: 6 in 6 files If any of these errors are false positives, please file a bug against check-webkit-style.
Robin Morisset
Comment 52
2017-10-03 05:20:45 PDT
Created
attachment 322511
[details]
Known buggy, submitting to EWS to find more failing tests
Build Bot
Comment 53
2017-10-03 05:21:49 PDT
Attachment 322511
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1342: Should have a space between // and comment [whitespace/comments] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1343: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4177: WTF_CONCAT is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4179: Should be indented on a separate line, with the colon or comma first on that line. [whitespace/indent] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4184: Wrong number of spaces before statement. (expected: 12) [whitespace/indent] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:6264: Wrong number of spaces before statement. (expected: 8) [whitespace/indent] [4] Total errors found: 6 in 6 files If any of these errors are false positives, please file a bug against check-webkit-style.
Build Bot
Comment 54
2017-10-03 06:00:15 PDT
Comment on
attachment 322511
[details]
Known buggy, submitting to EWS to find more failing tests
Attachment 322511
[details]
did not pass jsc-ews (mac): Output:
http://webkit-queues.webkit.org/results/4739483
New failing tests: stress/arguments-elimination-varargs-too-many-args-arg-count.js.ftl-eager-no-cjit-b3o1 stress/arguments-elimination-varargs-too-many-args-arg-count.js.ftl-eager-no-cjit
Robin Morisset
Comment 55
2017-10-03 10:43:14 PDT
Created
attachment 322545
[details]
Commented out the buggy splitting of entry block
Build Bot
Comment 56
2017-10-03 10:46:35 PDT
Attachment 322545
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1313: Multi line control clauses should use braces. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1343: Should have a space between // and comment [whitespace/comments] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1344: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4178: WTF_CONCAT is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4180: Should be indented on a separate line, with the colon or comma first on that line. [whitespace/indent] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4185: Wrong number of spaces before statement. (expected: 12) [whitespace/indent] [4] ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:6267: Wrong number of spaces before statement. (expected: 8) [whitespace/indent] [4] Total errors found: 7 in 6 files If any of these errors are false positives, please file a bug against check-webkit-style.
Keith Miller
Comment 57
2017-10-03 13:24:50 PDT
Comment on
attachment 322545
[details]
Commented out the buggy splitting of entry block View in context:
https://bugs.webkit.org/attachment.cgi?id=322545&action=review
Seems reasonable but there are some things you should clean up.
> Source/JavaScriptCore/ChangeLog:17 > + - m_linkingTargets is a dictionnary from bytecode indices to BasicBlock*
typo: dictionnary -> dictionary.
> Source/JavaScriptCore/ChangeLog:21 > + terminal that need to be taken off m_unlinkedBlocks.
typo: needs
> Source/JavaScriptCore/ChangeLog:28 > + - The 7 and 8 arguments of handleCall were inlined in its 3 arguments version for readability > + - The 9 argument version was cleaned up and simplified
I feel like something is wrong when a function has more than 6 arguments... :(
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1236 > + BasicBlock* blockPtr = block.ptr();
Nit: you could probably drop this line.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1273 > + Node* callTarget = get(VirtualRegister(pc[2].u.operand)); > + int argumentCountIncludingThis = pc[3].u.operand; > + int registerOffset = -pc[4].u.operand;
It would be great if you gave these names in the BytecodeList.json and used them here.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1325 > +bool ByteCodeParser::handleRecursiveTailCall(Node* callTargetNode, VirtualRegister thisArgument, CallLinkStatus callLinkStatus) > +{ > + // We currently only do this optimisation in the simple, non-polymorphic case. > + if (callLinkStatus.couldTakeSlowPath() || callLinkStatus.size() != 1) > + return false; > + > + auto targetExecutable = callLinkStatus[0].executable(); > + InlineStackEntry* stackEntry = m_inlineStackTop; > + do { > + if (targetExecutable != stackEntry->executable()) > + continue; > + > + VERBOSE_LOG(" We found a recursive tail, optimizing it to a jump to block ", *stackEntry->m_callsiteBlockHead, ".\n"); > + // TODO: set the arguments to the right value > + // TODO: shadow chicken log > + > + // We must also add some check that the profiling information was correct and the target of this call is what we thought > + emitFunctionChecks(callLinkStatus[0], callTargetNode, thisArgument); > + > + VERBOSE_LOG(" Repeating the work of op_enter as we are jumping to right after it. \n"); > + Node* undefined = addToGraph(JSConstant, OpInfo(m_constantUndefined)); > + // Initialize all locals to undefined. > + for (int i = 0; i < stackEntry->m_codeBlock->m_numVars; ++i) > + setDirect(stackEntry->remapOperand(virtualRegisterForLocal(i)), undefined, ImmediateSetWithFlush); > + // TODO: check that ImmediateSetWithFlush is the right pick here. > + > + Node* jumpNode = addToGraph(Jump); > + jumpNode->targetBlock() = stackEntry->m_callsiteBlockHead; > + m_currentBlock->didLink(); > + flushForTerminal(); > + return true; > + } while ((stackEntry = stackEntry->m_caller)); > + > + // The tail call was not recursive > + return false;
This function isn't used, can you remove it for now?
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1344 > + //if (op == TailCall && handleRecursiveTailCall(callTarget, thisArgument, callLinkStatus)) > + // return Terminal;
Please remove commented code.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2052 > + // TODO: we should be able to get rid of this somehow.
Make this a FIXME and link a bugzilla.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2080 > + // TODO: we can avoid this indirection in the case of normal inlined call by passing the continuation block all the way into inlineStackEntry->m_continuationBlock in inlineCall.
ditto.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4252 > + // to be true. TODO: change this.
Ditto.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:5276 > + if (m_inlineStackTop->m_returnValue.isValid()) // TODO: understand what happens if it is not.
Ditto.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:5284 > + // TODO: should there be a special case for when we are returning from the first block? I do not see why, but there was one.
Ditto.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:6257 > + BasicBlock* callsiteBlockHead, // TODO: remove from constructor, since it is set in parseBlock, when dealing with op_enter
Ditto.
Robin Morisset
Comment 58
2017-10-04 07:55:27 PDT
> > Source/JavaScriptCore/ChangeLog:28 > > + - The 7 and 8 arguments of handleCall were inlined in its 3 arguments version for readability > > + - The 9 argument version was cleaned up and simplified > > I feel like something is wrong when a function has more than 6 arguments... > :(
I totally agree, but I saw no easy way to solve this problem, and my refactoring was already pretty big.
> > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1236 > > + BasicBlock* blockPtr = block.ptr(); > > Nit: you could probably drop this line.
I am not sure it would remain correct to refer to block.ptr() after I WTFMove(block).
> > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1273 > > + Node* callTarget = get(VirtualRegister(pc[2].u.operand)); > > + int argumentCountIncludingThis = pc[3].u.operand; > > + int registerOffset = -pc[4].u.operand; > > It would be great if you gave these names in the BytecodeList.json and used > them here.
Is there a way to do so that correctly negate the registerOffset, and correctly "get"s the callTarget? If not, I am not sure what to call those before the operands are processed.
Saam Barati
Comment 59
2017-10-04 13:09:40 PDT
Comment on
attachment 322545
[details]
Commented out the buggy splitting of entry block View in context:
https://bugs.webkit.org/attachment.cgi?id=322545&action=review
Robin, is the goal of this review to get feedback on the tail loop stuff, or to first check in the refactoring? If the latter, then let's open a new bug with a title for the refactoring.
>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2052 >> + // TODO: we should be able to get rid of this somehow. > > Make this a FIXME and link a bugzilla.
What's your conclusion here? I don't vote for a bugzilla unless we know how we'll get rid of this.
>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2080 >> + // TODO: we can avoid this indirection in the case of normal inlined call by passing the continuation block all the way into inlineStackEntry->m_continuationBlock in inlineCall. > > ditto.
Seems like this might be worth just doing.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4292 > + if (false) { > + VERBOSE_LOG("Splitting the entry block for potential recursive tail calls.\n"); > + Node* jumpNode = addToGraph(Jump); > + BasicBlock* newBlock = allocateUntargetableBlock(); > + jumpNode->targetBlock() = newBlock; > + m_currentBlock->didLink(); > + m_currentBlock = newBlock; > + }
What happens when you run this code?
>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:5276 >> + if (m_inlineStackTop->m_returnValue.isValid()) // TODO: understand what happens if it is not. > > Ditto.
Going against Keith's ditto here: please don't file a bug on you not fully understanding of the code. I'm like 95% sure this has to do w/ setters, but regardless of that, you just figure out why this happens, and remove the TODO
>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:5284 >> + // TODO: should there be a special case for when we are returning from the first block? I do not see why, but there was one. > > Ditto.
Ditto here. This seems like something you need to figure out before landing
>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:6257 >> + BasicBlock* callsiteBlockHead, // TODO: remove from constructor, since it is set in parseBlock, when dealing with op_enter > > Ditto.
Ditto here. This is something you should either do or remove your TODO
Robin Morisset
Comment 60
2017-10-06 09:06:56 PDT
I moved the refactoring part of the patch to
https://bugs.webkit.org/show_bug.cgi?id=177922
so that it can be reviewed more easily.
Robin Morisset
Comment 61
2017-10-11 13:42:12 PDT
Created
attachment 323457
[details]
Patch
Robin Morisset
Comment 62
2017-10-11 15:40:12 PDT
Created
attachment 323474
[details]
Fixed, this passes all the stress tests, but I want some more testing before asking for review
Build Bot
Comment 63
2017-10-11 15:41:24 PDT
Attachment 323474
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:215: The parameter name "stackEntry" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 1 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Robin Morisset
Comment 64
2017-10-11 16:10:46 PDT
Benchmark report for Octane on MacBook-Pro-5 (MacBookPro14,3). VMs tested: "Baseline" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/Baseline/Release/jsc "TailCall" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/TailCall3/Release/jsc Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline TailCall encrypt 0.12516+-0.00200 ? 0.12576+-0.00276 ? decrypt 2.20316+-0.02487 2.19586+-0.00988 deltablue x2 0.11811+-0.00486 0.11528+-0.00436 might be 1.0245x faster earley 0.23454+-0.00292 ? 0.23467+-0.00234 ? boyer 4.09504+-0.03587 4.08283+-0.04547 navier-stokes x2 4.55212+-0.04389 4.54719+-0.02290 raytrace x2 0.63796+-0.01012 ! 2.19885+-0.00639 ! definitely 3.4467x slower richards x2 0.07406+-0.00089 0.07396+-0.00030 splay x2 0.29583+-0.00294 0.29561+-0.00234 regexp x2 14.64647+-0.37289 ! 15.44706+-0.18671 ! definitely 1.0547x slower pdfjs x2 36.56302+-0.18624 ? 36.80115+-0.34733 ? mandreel x2 35.58116+-0.36891 ? 35.85318+-0.47004 ? gbemu x2 27.57263+-0.34381 ? 27.62819+-0.47350 ? closure 0.42556+-0.00235 ? 0.42734+-0.00604 ? jquery 5.60274+-0.02243 ? 5.62641+-0.11612 ? box2d x2 7.08661+-0.03697 ? 7.10947+-0.09351 ? zlib x2 290.65739+-1.10185 ? 291.59003+-3.20236 ? typescript x2 582.97498+-17.32713 ? 588.47603+-9.79813 ? <geometric> 4.34865+-0.01869 ! 4.74215+-0.01957 ! definitely 1.0905x slower So clearly something is very wrong with this patch. Sunspider showed no performance change.
Robin Morisset
Comment 65
2017-10-11 17:36:24 PDT
After running a few tests, it is clear that this regression is entirely caused by the splitting of op_enter blocks: in particular it reproduces even if handleRecursiveTailCall is replaced by return false.
Robin Morisset
Comment 66
2017-10-12 11:27:20 PDT
Some more things about this performance regression: - It only shows on Octane/raytrace: neither sun spider, kraken or microbenchmarks have any significant regression (maybe a 1% regression on Kraken, to investigate later) - It is caused purely by the splitting of entry blocks, not by the actual optimisation - It persists even when not doing the splitting for functions at the leaves of the call graph (through testing for the emptiness of callLinkInfoMap). - It persists even when running DFGCFGSimplificationPhase right after the byte code parser (to eliminate unnecessary jumps) - It does not appear to be located in a single JS function: when attempting to bisect with JSC_bytecodeRangeToDFGCompile, the difference with the baseline progressively shrinks instead of having a cliff somewhere.
Robin Morisset
Comment 67
2017-10-12 13:44:10 PDT
Created
attachment 323558
[details]
Fixed a horrible design bug, and the performance regression
Robin Morisset
Comment 68
2017-10-12 13:45:39 PDT
Benchmark report for Octane on MacBook-Pro-5 (MacBookPro14,3). VMs tested: "Baseline" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/Baseline/Release/jsc "Tail" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/TailCall3/Release/jsc Collected 8 samples per benchmark/VM, with 8 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline Tail encrypt 0.13207+-0.00161 ? 0.13261+-0.00127 ? decrypt 2.34831+-0.04530 2.33041+-0.01246 deltablue x2 0.12342+-0.00190 ? 0.12351+-0.00322 ? earley 0.24859+-0.00332 0.24785+-0.00275 boyer 4.31797+-0.07807 4.28364+-0.03930 navier-stokes x2 4.87890+-0.06596 4.84418+-0.05741 raytrace x2 0.68536+-0.00741 0.68159+-0.00531 richards x2 0.07856+-0.00083 ? 0.07917+-0.00139 ? splay x2 0.31365+-0.00953 0.30422+-0.00139 might be 1.0310x faster regexp x2 15.52398+-0.16968 15.51919+-0.29346 pdfjs x2 39.02238+-0.21603 ? 39.06900+-0.24751 ? mandreel x2 38.14925+-0.39561 38.03272+-0.60931 gbemu x2 29.17857+-0.55032 ? 29.73977+-1.43515 ? might be 1.0192x slower closure 0.45147+-0.00498 ? 0.45274+-0.01029 ? jquery 6.08769+-0.12487 6.01421+-0.07787 might be 1.0122x faster box2d x2 7.77007+-0.22361 7.56490+-0.05053 might be 1.0271x faster zlib x2 306.73149+-3.07770 ? 307.86919+-2.63745 ? typescript x2 606.15997+-7.28700 599.21710+-6.84371 might be 1.0116x faster <geometric> 4.62426+-0.02182 4.60428+-0.01928 might be 1.0043x faster --------------------- Benchmark report for Kraken on MacBook-Pro-5 (MacBookPro14,3). VMs tested: "Baseline" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/Baseline/Release/jsc "Tail" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/TailCall3/Release/jsc Collected 8 samples per benchmark/VM, with 8 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline Tail ai-astar 71.707+-3.284 ? 74.090+-3.158 ? might be 1.0332x slower audio-beat-detection 34.792+-0.521 ? 36.688+-3.837 ? might be 1.0545x slower audio-dft 85.256+-3.956 ? 86.756+-2.058 ? might be 1.0176x slower audio-fft 24.700+-1.414 ? 24.726+-1.062 ? audio-oscillator 48.445+-1.414 ? 51.629+-4.553 ? might be 1.0657x slower imaging-darkroom 52.437+-1.327 ? 52.996+-1.902 ? might be 1.0107x slower imaging-desaturate 45.185+-2.413 44.590+-1.823 might be 1.0134x faster imaging-gaussian-blur 57.064+-3.813 ? 58.884+-5.177 ? might be 1.0319x slower json-parse-financial 31.760+-2.089 31.531+-1.447 json-stringify-tinderbox 19.834+-0.519 ? 20.088+-0.547 ? might be 1.0128x slower stanford-crypto-aes 34.609+-1.645 ? 35.127+-1.028 ? might be 1.0150x slower stanford-crypto-ccm 31.612+-2.898 30.550+-2.354 might be 1.0348x faster stanford-crypto-pbkdf2 58.973+-1.840 57.927+-1.497 might be 1.0180x faster stanford-crypto-sha256-iterative 18.413+-0.292 ? 18.814+-0.720 ? might be 1.0218x slower <arithmetic> 43.913+-0.329 ? 44.600+-0.709 ? might be 1.0156x slower So the regression on raytrace is solved, but there might be a 1.5% slowdown on Kraken.
Build Bot
Comment 69
2017-10-12 13:47:03 PDT
Attachment 323558
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:215: The parameter name "stackEntry" adds no information, so it should be removed. [readability/parameter_name] [5] ERROR: Source/JavaScriptCore/ChangeLog:12: Line contains tab character. [whitespace/tab] [5] ERROR: Source/JavaScriptCore/ChangeLog:13: Line contains tab character. [whitespace/tab] [5] Total errors found: 3 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Saam Barati
Comment 70
2017-10-12 13:48:25 PDT
Comment on
attachment 323558
[details]
Fixed a horrible design bug, and the performance regression View in context:
https://bugs.webkit.org/attachment.cgi?id=323558&action=review
> Source/JavaScriptCore/bytecode/PreciseJumpTargets.cpp:68 > + // We need to insert a jump after op_enter, so recursive tail calls have somewhere to jump to. > + // But we only want to pay that price for functions that have at least one tail call. > + CallLinkInfoMap map; > + codeBlock->getCallLinkInfoMap(map); > + for (auto info : map.values()) { > + if (info->callMode() == CallMode::Tail) { > + out.append(bytecodeOffset + opcodeLengths[op_enter]); > + break; > + } > + } > + }
You should just set a bit if a CodeBlock has an op_tail_call in it. You can probably do it in bytecode generator via UnlinkedCodeBlock.
Saam Barati
Comment 71
2017-10-12 13:53:49 PDT
Comment on
attachment 323558
[details]
Fixed a horrible design bug, and the performance regression View in context:
https://bugs.webkit.org/attachment.cgi?id=323558&action=review
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1433 > + if (auto callFrame = stackEntry->m_inlineCallFrame) {
nit: I prefer a type here, or at least, auto* instead of auto.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1436 > + if (argumentCountIncludingThis != (int) callFrame->argumentCountIncludingThis)
Style: WebKit style uses static_cast instead of C-style casts.
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1465 > + // We want to emit the SetLocals with a code origin that points to the place we are jumping to.
Nit: "with a code origin" => "with an exit origin" since the semantic origin won't be there.
Robin Morisset
Comment 72
2017-10-13 11:17:48 PDT
Created
attachment 323710
[details]
Moved the splitting of op_enter to UnlinkedCodeBlocks, and correctly recognize when it has happened
Build Bot
Comment 73
2017-10-13 11:19:27 PDT
Attachment 323710
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:215: The parameter name "stackEntry" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 1 in 7 files If any of these errors are false positives, please file a bug against check-webkit-style.
Robin Morisset
Comment 74
2017-10-13 11:21:03 PDT
We now see an improvement of TailBench9000 but it is rather disappointing: - n-body-run.js roughly went from 1.05s to 0.95s - merge-sort-run.js roughly went from 1.40s to 1.10s I have not yet ran other benchmarks on this last version of the code.
Robin Morisset
Comment 75
2017-10-13 11:39:21 PDT
It appears that the optimization currently hurts tier-up from DFG to FTL, I am investigating how to fix it.
Build Bot
Comment 76
2017-10-13 18:37:27 PDT
Comment on
attachment 323710
[details]
Moved the splitting of op_enter to UnlinkedCodeBlocks, and correctly recognize when it has happened
Attachment 323710
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/4852670
New failing tests: http/tests/loading/basic-auth-resend-wrong-credentials.html
Build Bot
Comment 77
2017-10-13 18:37:29 PDT
Created
attachment 323785
[details]
Archive of layout-test-results from ews114 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Robin Morisset
Comment 78
2017-10-17 08:56:05 PDT
Comment on
attachment 323710
[details]
Moved the splitting of op_enter to UnlinkedCodeBlocks, and correctly recognize when it has happened My experiments with inserting LoopHint have not improved performance so far, so I would like to get this patch landed as is, and make it a separate bug. The build-bot failure looks very much like a flaky test to me, I don't see how my patch could break layout. Is there a simple way to convince the buildbot to try again? Or should I just resubmit the patch?
Sam Weinig
Comment 79
2017-10-17 09:19:35 PDT
(In reply to Robin Morisset from
comment #78
) Is there a simple way to convince the
> buildbot to try again? Or should I just resubmit the patch?
I think you just need to re-upload the change unfortunately.
Saam Barati
Comment 80
2017-10-17 12:18:52 PDT
Comment on
attachment 323710
[details]
Moved the splitting of op_enter to UnlinkedCodeBlocks, and correctly recognize when it has happened View in context:
https://bugs.webkit.org/attachment.cgi?id=323710&action=review
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1418 > + // We currently only do this optimisation in the simple, non-polymorphic case.
Maybe add a FIXME with a bugzilla link here since you told me you wanted to do this and you thought it'd be pretty easy?
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1429 > + BasicBlock** entryBlockPtr = tryBinarySearch<BasicBlock*, unsigned>(stackEntry->m_blockLinkingTargets, stackEntry->m_blockLinkingTargets.size(), OPCODE_LENGTH(op_enter), getBytecodeBeginForBlock);
m_blockLInkingTargets is sorted?
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1480 > + // It would be unsound to jump over a non-tail call: the "tail" call is not really a tail call in that case.
Please add tests for this since you didn't find any when running the tests with the earlier broken patch. It'd be good to verify your earlier broken version would fail such a test.
Robin Morisset
Comment 81
2017-10-18 07:44:01 PDT
Created
attachment 324116
[details]
Patch
Robin Morisset
Comment 82
2017-10-18 07:51:07 PDT
(In reply to Saam Barati from
comment #80
)
> Comment on
attachment 323710
[details]
> Moved the splitting of op_enter to UnlinkedCodeBlocks, and correctly > recognize when it has happened > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=323710&action=review
> > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1418 > > + // We currently only do this optimisation in the simple, non-polymorphic case. > > Maybe add a FIXME with a bugzilla link here since you told me you wanted to > do this and you thought it'd be pretty easy?
Done
> > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1429 > > + BasicBlock** entryBlockPtr = tryBinarySearch<BasicBlock*, unsigned>(stackEntry->m_blockLinkingTargets, stackEntry->m_blockLinkingTargets.size(), OPCODE_LENGTH(op_enter), getBytecodeBeginForBlock); > > m_blockLInkingTargets is sorted?
Yes. It was already mentioned in a comment; I added some ASSERTs to allocateTargetableBlock and makeBlockTargetable.
> > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1480 > > + // It would be unsound to jump over a non-tail call: the "tail" call is not really a tail call in that case. > > Please add tests for this since you didn't find any when running the tests > with the earlier broken patch. It'd be good to verify your earlier broken > version would fail such a test.
I added some test, but the buggy code currently appears to be passing the test somehow, I am investigating.
Robin Morisset
Comment 83
2017-10-18 08:57:44 PDT
Created
attachment 324120
[details]
Patch
Robin Morisset
Comment 84
2017-10-18 09:07:56 PDT
Mystery solved: there was yet another bug (!) that was causing this entire code path not to be exercised by the tests for either the buggy or the "fixed" version of the code. In short, because each code block has a superset of the jump targets, a function with no control-flow was deemed to have no jump targets, even if it had a tail call, inhibiting the splitting of the basic block that includes op_enter. After fixing this bug, we can now correctly detect the other bugs from previous versions of this patch with the test. I also added another test as I realized that one code path was still not tested. The tests now (hopefully) correctly find wrong code generation; but I do not see how to test that the transformation occurred when it should.
Saam Barati
Comment 85
2017-10-18 11:46:56 PDT
Comment on
attachment 324120
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=324120&action=review
r=me with a few comments.
> Source/JavaScriptCore/bytecode/CodeBlock.h:928 > + bool hasTailCalls() const {return m_unlinkedCode->hasTailCalls();}
style nit: this should be written as: { return m_unlinkedCode->hasTailCalls(); } for the spacing to follow webkit style
> Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:1321 > + // We must add the end of op_enter as a potential jump target, because the bytecode parser may decide to split its basic block > + // to have somewhere to jump to if there is a recursive tail-call that points to this function. > + m_codeBlock->addJumpTarget(instructions().size()); > + // This disables peephole optimizations when an instruction is a jump target > + m_lastOpcodeID = op_end;
Is it worth just emitting a label since that will do this work for us? Or should we keep this as is?
> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1469 > + // We jump right after it and not before it, because of some invariant saying that an OSR entry point cannot have predecessors in the IR.
This is not an accurate comment. The invariant is that a root in the CFG cannot have a predecessor. An OSR entrypoint *can* have predecessors. For example, loops are often OSR entry points. FWIW, we could totally jump to op_enter, we'd just need to split before op_enter, not after. This may be awkward. I'm totally OK with how you've implemented it.
Robin Morisset
Comment 86
2017-10-19 02:20:25 PDT
(In reply to Saam Barati from
comment #85
)
> Comment on
attachment 324120
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=324120&action=review
> > r=me with a few comments. > > > Source/JavaScriptCore/bytecode/CodeBlock.h:928 > > + bool hasTailCalls() const {return m_unlinkedCode->hasTailCalls();} > > style nit: this should be written as: > { return m_unlinkedCode->hasTailCalls(); } > for the spacing to follow webkit style
Done
> > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:1321 > > + // We must add the end of op_enter as a potential jump target, because the bytecode parser may decide to split its basic block > > + // to have somewhere to jump to if there is a recursive tail-call that points to this function. > > + m_codeBlock->addJumpTarget(instructions().size()); > > + // This disables peephole optimizations when an instruction is a jump target > > + m_lastOpcodeID = op_end; > > Is it worth just emitting a label since that will do this work for us? Or > should we keep this as is?
That was my first idea. The problem is that we would then have a Ref<Label> with nowhere to put it. It is not a serious problem, it just felt wrong to allocate data that we would never even read back before freeing it. I can change it back if you prefer.
> > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:1469 > > + // We jump right after it and not before it, because of some invariant saying that an OSR entry point cannot have predecessors in the IR. > > This is not an accurate comment. The invariant is that a root in the CFG > cannot have a predecessor. An OSR entrypoint *can* have predecessors. For > example, loops are often OSR entry points. FWIW, we could totally jump to > op_enter, we'd just need to split before op_enter, not after. This may be > awkward. I'm totally OK with how you've implemented it.
Thanks for the comment, will fix.
Robin Morisset
Comment 87
2017-10-19 04:00:18 PDT
Created
attachment 324216
[details]
Patch
Robin Morisset
Comment 88
2017-10-19 04:02:29 PDT
This should finally be finished. Diff with the previous version of the patch: - Applied both of Saam's comments - Added more tests, and slight refactoring of the tests. - Added one RELEASE_ASSERT instead of silently failing if for whatever reason the op_enter block is not split when we are expecting it to be.
WebKit Commit Bot
Comment 89
2017-10-19 09:27:48 PDT
Comment on
attachment 324216
[details]
Patch Clearing flags on attachment: 324216 Committed
r223691
: <
https://trac.webkit.org/changeset/223691
>
WebKit Commit Bot
Comment 90
2017-10-19 09:27:51 PDT
All reviewed patches have been landed. Closing bug.
Michael Catanzaro
Comment 91
2017-10-19 14:11:45 PDT
Regression in
bug #178543
WebKit Commit Bot
Comment 92
2017-10-25 15:11:49 PDT
Re-opened since this is blocked by
bug 178834
Ryosuke Niwa
Comment 93
2017-10-25 15:18:44 PDT
This patch broke Speedometer 2's React-Redux-TodoMVC test case. To reproduce, 1. Go to
http://browserbench.org/Speedometer2.0/InteractiveRunner.html
2. Click on "Unselect all" 3. Select "React-Redux-TodoMVC" 4. Click on "Run" Open Web Inspector and observe that there is an exception being thrown. You have to close the inspector & reload the page before running it again since the bug doesn't reproduce when Web Inspector is open.
Robin Morisset
Comment 94
2017-10-26 07:56:39 PDT
Thanks for the bug report, I am looking into it. Here is what I have found about it so far: - The bug appears deep within React code, taken from
http://browserbench.org/Speedometer2.0/resources/todomvc/architecture-examples/react-redux/dist/index.html
(sadly partially minified). - The bug is that "this" is apparently undefined in "this.moveChild", in _updateChildren in that file. - _updateChildren appears many times in the stack trace, but it does not appear to make any tail call so it should not be optimized. - The bug disappear if we refuse to make jumps to inlined functions, or if we refuse to make jumps across levels of the inline stack stack. So it is a case where we are creating a jump from one inlined function to another (higher in the call stack). - The bug does not disappear if I disable the optimization when the number of arguments is different from the number of parameters of the relevant code block (for inlined frames). - It reproduces fine with JSC_useConcurrentJIT=0 - Removing the repetition of op_enter does not make the bug disappear change. - Removing the setting of extra parameters to undefined also does not make the bug disappear or change.
Ryan Haddad
Comment 95
2017-10-26 09:42:13 PDT
***
Bug 178592
has been marked as a duplicate of this bug. ***
Robin Morisset
Comment 96
2017-10-31 10:29:41 PDT
Created
attachment 325457
[details]
WIP, for sharing with Saam
Robin Morisset
Comment 97
2017-11-08 12:11:05 PST
Created
attachment 326343
[details]
Patch for landing
Robin Morisset
Comment 98
2017-11-08 12:13:21 PST
Because of how tricky it was to write a minimized regression test, and speedometer provides some protection against regression, I decided to reland this patch with the fix described in
https://bugs.webkit.org/show_bug.cgi
and no extra test. I marked it r=sbarati as we discussed this together.
Robin Morisset
Comment 99
2017-11-08 12:14:25 PST
***
Bug 178834
has been marked as a duplicate of this bug. ***
WebKit Commit Bot
Comment 100
2017-11-08 12:42:19 PST
Comment on
attachment 326343
[details]
Patch for landing Clearing flags on attachment: 326343 Committed
r224592
: <
https://trac.webkit.org/changeset/224592
>
WebKit Commit Bot
Comment 101
2017-11-08 12:42:22 PST
All reviewed patches have been landed. Closing bug.
Robin Morisset
Comment 102
2017-11-09 02:29:22 PST
Collected 10 samples per benchmark/VM, with 10 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline TailCall n-body 837.5192+-2.5306 ^ 742.2701+-2.1430 ^ definitely 1.1283x faster merge-sort 597.9785+-5.6828 ^ 521.2900+-3.2286 ^ definitely 1.1471x faster bf-interpreter 478.1371+-4.7080 ^ 334.9072+-1.8962 ^ definitely 1.4277x faster <geometric> 620.9622+-3.5729 ^ 506.0356+-1.2795 ^ definitely 1.2271x faster No visible change on other benchmarks that I've tested.
David Kilzer (:ddkilzer)
Comment 103
2017-11-15 03:15:30 PST
<
rdar://problem/35556442
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug