RESOLVED DUPLICATE of bug 160168 165979
Rename finallyActionRegister to completionTypeRegister and only store int JSValues in it.
https://bugs.webkit.org/show_bug.cgi?id=165979
Summary Rename finallyActionRegister to completionTypeRegister and only store int JSV...
Mark Lam
Reported 2016-12-16 16:55:45 PST
It is better for JIT type speculation if the type stays an int always.
Attachments
proposed patch. (26.03 KB, patch)
2016-12-18 09:49 PST, Mark Lam
no flags
Mark Lam
Comment 1 2016-12-18 09:49:07 PST
Created attachment 297438 [details] proposed patch.
Saam Barati
Comment 2 2016-12-18 10:24:35 PST
Comment on attachment 297438 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=297438&action=review > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:4874 > for (size_t i = 0; i < numberOfJumps; i++) { I know this isn't part of this patch, but why not emit a switch() statement instead of a linear set of jumps? Maybe worth doing in another patch? Or at least if the number of jumps is large we can emit a switch
Mark Lam
Comment 3 2016-12-18 10:39:55 PST
(In reply to comment #2) > Comment on attachment 297438 [details] > proposed patch. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=297438&action=review > > > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:4874 > > for (size_t i = 0; i < numberOfJumps; i++) { > > I know this isn't part of this patch, but why not emit a switch() statement > instead of a linear set of jumps? Maybe worth doing in another patch? Or at > least if the number of jumps is large we can emit a switch Good point, but I'll look into doing that separately. Thanks.
WebKit Commit Bot
Comment 4 2016-12-18 11:04:59 PST
Comment on attachment 297438 [details] proposed patch. Clearing flags on attachment: 297438 Committed r209974: <http://trac.webkit.org/changeset/209974>
WebKit Commit Bot
Comment 5 2016-12-18 11:05:03 PST
All reviewed patches have been landed. Closing bug.
Mark Lam
Comment 6 2016-12-19 09:59:07 PST
(In reply to comment #2) > > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:4874 > > for (size_t i = 0; i < numberOfJumps; i++) { > > I know this isn't part of this patch, but why not emit a switch() statement > instead of a linear set of jumps? Maybe worth doing in another patch? Or at > least if the number of jumps is large we can emit a switch I took a look at this. Here's an example of JS code that would resemble the code generated in emitFinallyCompletion(): function foo(x) { var scope1 = x; // Fake scope value var scope2 = x; // Fake scope value var scope3 = x; // Fake scope value var completionType = 0; switch(x) { case 0: break; // Normal completion. case 3: return x; // Return completion. // Simulate Break/Continue completions with non-contiguous jumpIDs. // each case needs to do 1. a move to restore scope, 2. set the completion type // back to Normal, and 3. jump to the target. case 8: x = scope1; completionType = 0; break; case 21: x = scope2; completionType = 0; break; case 55: x = scope3; completionType = 0; break; default: throw x; // Throw completion. } } Here's the best bytecode that we can generate for this at present: foo#A4jB2P:[0x10f5979e0->0x10f5a64c0, NoneFunctionCall, 102]: 102 m_instructions; 816 bytes; 2 parameter(s); 12 callee register(s); 10 variable(s) [ 0] enter [ 1] get_scope loc3 [ 3] mov loc4, loc3 [ 6] mov loc6, arg1 [ 9] mov loc5, arg1 [ 12] mov loc8, arg1 [ 15] mov loc7, Int32: 0(const0) [ 18] mov loc10, Int32: 0(const0) // Check for Normal completion. [ 21] stricteq loc10, loc10, arg1 [ 25] jtrue loc10, 45(->70) [ 28] mov loc10, Int32: 3(const1) // Check for Return completion. [ 31] stricteq loc10, loc10, arg1 [ 35] jtrue loc10, 37(->72) [ 38] mov loc10, Int32: 8(const2) // Check for non-contiguous jumpID i.e. Break/Continue completion. [ 41] stricteq loc10, loc10, arg1 [ 45] jtrue loc10, 29(->74) [ 48] mov loc10, Int32: 21(const3) // Check for non-contiguous jumpID i.e. Break/Continue completion. [ 51] stricteq loc10, loc10, arg1 [ 55] jtrue loc10, 27(->82) [ 58] mov loc10, Int32: 55(const4) // Check for non-contiguous jumpID i.e. Break/Continue completion. [ 61] stricteq loc10, loc10, arg1 [ 65] jtrue loc10, 25(->90) [ 68] jmp 30(->98) // Default Throw completion. [ 70] jmp 30(->100) [ 72] ret arg1 [ 74] mov arg1, loc6 // Restore scope. [ 77] mov loc7, Int32: 0(const0) // Restore Normal completion. [ 80] jmp 20(->100) [ 82] mov arg1, loc5 // Restore scope. [ 85] mov loc7, Int32: 0(const0) // Restore Normal completion. [ 88] jmp 12(->100) [ 90] mov arg1, loc8 // Restore scope. [ 93] mov loc7, Int32: 0(const0) // Restore Normal completion. [ 96] jmp 4(->100) [ 98] throw arg1 [ 100] ret Undefined(const5) Hence, the best representation we have for the type of non-contiguous switch cases that emitFinallyCompletion() emits is still a series of test-and-jumps. Hence, the bytecode I'm currently emitting is as efficient as it gets. The one difference might be that the switch code layout may or may not yield better cache locality in JIT generated code eventually. That may be more the case if the number of cases in the switch would be large. However, I think the number of breaks and continues in for-of loops would tend to be few in number. Hence, it is unlikely, we'll see much benefit in terms of cache locality. In summary, we can't really emit an op_switch (because the cases are not contiguous), and the current generated bytecode is already efficient. I won't pursue this inquiry further.
Mark Lam
Comment 7 2016-12-19 11:36:09 PST
(In reply to comment #6) > In summary, we can't really emit an op_switch (because the cases are not > contiguous), and the current generated bytecode is already efficient. I > won't pursue this inquiry further. Correction: I had talked with Saam offline. We can make it so that the jumpIDs are contiguous (by changing our scheme for assigning jumpIDs) thereby enabling us to use a jump table switch. We may look into this in the future if we see a perf need in this area.
Mark Lam
Comment 8 2016-12-19 17:42:34 PST
Rolling out r209974 and r209952. They break some websites in mysterious ways.
Mark Lam
Comment 9 2016-12-19 18:10:28 PST
Part 1: r209974 is rolled out in r210007: <http://trac.webkit.org/r210007>. Part 2: r209952 is rolled out in r210010: <http://trac.webkit.org/r210010>.
Mark Lam
Comment 10 2016-12-21 12:40:59 PST
(In reply to comment #9) > Part 1: r209974 is rolled out in r210007: <http://trac.webkit.org/r210007>. > Part 2: r209952 is rolled out in r210010: <http://trac.webkit.org/r210010>. I will be posting a fixed patch for https://bugs.webkit.org/show_bug.cgi?id=160168. So, I'll just roll this patch into that one. Will close this one as a dupe.
Mark Lam
Comment 11 2016-12-21 12:41:21 PST
*** This bug has been marked as a duplicate of bug 160168 ***
Note You need to log in before you can comment on or make changes to this bug.