There are 2 bugs: 1. OSR exit doesn't currently populate the bytecodeOffset field in the CallFrame when it reifies the inlined frames as well as in the outermost (a.k.a. "machine") frame. 2. OSR exit is computing the return PC with the assumption that the caller's call opcode is op_call. If the caller's opcode is op_call_varargs, then the computed return PC will be incorrect. Patch for the fix coming soon. ref: <rdar://problem/14039335>.
Created attachment 203479 [details] the patch.
Comment on attachment 203479 [details] the patch. View in context: https://bugs.webkit.org/attachment.cgi?id=203479&action=review R=me with those changes. > Source/JavaScriptCore/bytecode/Opcode.h:282 > inline size_t opcodeLength(OpcodeID opcode) > { > - switch (opcode) { > -#define OPCODE_ID_LENGTHS(id, length) case id: return OPCODE_LENGTH(id); > - FOR_EACH_OPCODE_ID(OPCODE_ID_LENGTHS) > -#undef OPCODE_ID_LENGTHS > - } > - RELEASE_ASSERT_NOT_REACHED(); > - return 0; > + return opcodeLengths[opcode]; > } Why? The intuition for this method is that it gives the compiler a better opportunity to constant fold. We were expecting that if you call opcodeLength() with a constant then the call will be folded to a constant. I think that's harder if you do an array lookup instead. If you have a good reason for this change and it doesn't affect performance, then I'm fine with it - I just want to make sure that you had a reason for this change. > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41 > + return opcodeLength(opcodeID); This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target. > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657 > + unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex > + + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex); I would love to have this assert that the caller bytecode index is in fact a call of some kind. I would prefer a RELEASE_ASSERT. Can you refiddle this code to do such an assertion? I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper. > Source/JavaScriptCore/dfg/DFGOSRExitCompiler64.cpp:621 > + unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex > + + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex); Ditto.
Thanks for the review. (In reply to comment #2) > > Source/JavaScriptCore/bytecode/Opcode.h:282 > > inline size_t opcodeLength(OpcodeID opcode) > > { > > - switch (opcode) { > > -#define OPCODE_ID_LENGTHS(id, length) case id: return OPCODE_LENGTH(id); > > - FOR_EACH_OPCODE_ID(OPCODE_ID_LENGTHS) > > -#undef OPCODE_ID_LENGTHS > > - } > > - RELEASE_ASSERT_NOT_REACHED(); > > - return 0; > > + return opcodeLengths[opcode]; > > } > > Why? > > The intuition for this method is that it gives the compiler a better opportunity to constant fold. We were expecting that if you call opcodeLength() with a constant then the call will be folded to a constant. I think that's harder if you do an array lookup instead. > > If you have a good reason for this change and it doesn't affect performance, then I'm fine with it - I just want to make sure that you had a reason for this change. Good point. I didn't think of constant folding. Will revert this change. Code that wants to do a length look up of a variable opcodeID should use opcodeLengths[] explicitly instead. > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41 > > + return opcodeLength(opcodeID); > > This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target. The opcodeID is variable in this case. Hence, the C++ compiler won't be able to constant fold. This is the reason I didn't want to use the switch statement over the entire OpcodeID space (as implemented in opcodeLength()) in the first place. Initially, I wasn't sure if the helper should be called for the top frame as well. But, now that we've determined that we should not call it for the top frame (and know that the opcode ID has to be some call opcode), we can reduce this to a simple "?:" statement that checks for op_call_varargs. > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657 > > + unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex > > + + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex); > > I would love to have this assert that the caller bytecode index is in fact a call of some kind. I would prefer a RELEASE_ASSERT. Can you refiddle this code to do such an assertion? I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper. I will add the RELEASE_ASSERT in the helper since it is guaranteed to see a call opcode of some sort. Will also rename the helper to "callOpcodeLength()" for brevity and more clearly declare its intent.
(In reply to comment #3) > Thanks for the review. > > (In reply to comment #2) > > > Source/JavaScriptCore/bytecode/Opcode.h:282 > > > inline size_t opcodeLength(OpcodeID opcode) > > > { > > > - switch (opcode) { > > > -#define OPCODE_ID_LENGTHS(id, length) case id: return OPCODE_LENGTH(id); > > > - FOR_EACH_OPCODE_ID(OPCODE_ID_LENGTHS) > > > -#undef OPCODE_ID_LENGTHS > > > - } > > > - RELEASE_ASSERT_NOT_REACHED(); > > > - return 0; > > > + return opcodeLengths[opcode]; > > > } > > > > Why? > > > > The intuition for this method is that it gives the compiler a better opportunity to constant fold. We were expecting that if you call opcodeLength() with a constant then the call will be folded to a constant. I think that's harder if you do an array lookup instead. > > > > If you have a good reason for this change and it doesn't affect performance, then I'm fine with it - I just want to make sure that you had a reason for this change. > > Good point. I didn't think of constant folding. Will revert this change. Code that wants to do a length look up of a variable opcodeID should use opcodeLengths[] explicitly instead. > > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41 > > > + return opcodeLength(opcodeID); > > > > This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target. > > The opcodeID is variable in this case. Hence, the C++ compiler won't be able to constant fold. This is the reason I didn't want to use the switch statement over the entire OpcodeID space (as implemented in opcodeLength()) in the first place. Let me say this differently. This is a *different* example of why switch is better, and it *isn't* related to constant folding. A switch statement where all cases have a return of a constant will be turned into an array lookup, just like opcodeLength[blah]. The compiler will turn it into such an array lookup precisely when it thinks it is profitable. Hence, your change will never result in better code. It may result in worse code, if the compiler decides that an array lookup was less efficient - for example if the target it's compiling for has faster branches than loads, or something. > > Initially, I wasn't sure if the helper should be called for the top frame as well. But, now that we've determined that we should not call it for the top frame (and know that the opcode ID has to be some call opcode), we can reduce this to a simple "?:" statement that checks for op_call_varargs. > > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657 > > > + unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex > > > + + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex); > > > > I would love to have this assert that the caller bytecode index is in fact a call of some kind. I would prefer a RELEASE_ASSERT. Can you refiddle this code to do such an assertion? I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper. > > I will add the RELEASE_ASSERT in the helper since it is guaranteed to see a call opcode of some sort. Will also rename the helper to "callOpcodeLength()" for brevity and more clearly declare its intent. Really? If the helper is called for the top call frame, it may see any opcode.
(In reply to comment #4) > > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41 > > > > + return opcodeLength(opcodeID); > > > > > > This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target. > > > > The opcodeID is variable in this case. Hence, the C++ compiler won't be able to constant fold. This is the reason I didn't want to use the switch statement over the entire OpcodeID space (as implemented in opcodeLength()) in the first place. > > Let me say this differently. This is a *different* example of why switch is better, and it *isn't* related to constant folding. > > A switch statement where all cases have a return of a constant will be turned into an array lookup, just like opcodeLength[blah]. The compiler will turn it into such an array lookup precisely when it thinks it is profitable. > > Hence, your change will never result in better code. It may result in worse code, if the compiler decides that an array lookup was less efficient - for example if the target it's compiling for has faster branches than loads, or something. OK, understood. But I knew that there's already an opcodeLengths[] array, and I didn't want the compiler to potentially instantiate a new one for the inlined opcodeLength() function. Anyway, here's the new code: static size_t callOpcodeLength(VM* vm, CodeBlock* codeBlock, unsigned bytecodeIndex) { Opcode opcode = codeBlock->instructions()[bytecodeIndex].u.opcode; OpcodeID opcodeID = vm->interpreter->getOpcodeID(opcode); RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs); ASSERT(OPCODE_LENGTH(op_call) == OPCODE_LENGTH(op_call_eval)); return UNLIKELY(opcodeID == op_call_varargs) ? OPCODE_LENGTH(op_call_varargs) : OPCODE_LENGTH(op_call); } > > Initially, I wasn't sure if the helper should be called for the top frame as well. But, now that we've determined that we should not call it for the top frame (and know that the opcode ID has to be some call opcode), we can reduce this to a simple "?:" statement that checks for op_call_varargs. > > > > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657 > > > > + unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex > > > > + + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex); > > > > > > I would love to have this assert that the caller bytecode index is in fact a call of some kind. I would prefer a RELEASE_ASSERT. Can you refiddle this code to do such an assertion? I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper. > > > > I will add the RELEASE_ASSERT in the helper since it is guaranteed to see a call opcode of some sort. Will also rename the helper to "callOpcodeLength()" for brevity and more clearly declare its intent. > > Really? If the helper is called for the top call frame, it may see any opcode. Let me clarify. The helper should be and is called for the top call frame to compute the bytecodeOffset of the returnPC, and this case should always see call opcodes. However, back before I determined that we shouldn't call it for computing the bytecodeOffset of the opcode that we're OSR exiting to, I was also calling the helper there. Hence, I was seeing whatever opcodes were triggering the OSR exit. But that is an invalid scenario, and we don't make that call anymore. Hence, the helper will never see any opcodes other than call opcodes.
(In reply to comment #5) > (In reply to comment #4) > > > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41 > > > > > + return opcodeLength(opcodeID); > > > > > > > > This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target. > > > > > > The opcodeID is variable in this case. Hence, the C++ compiler won't be able to constant fold. This is the reason I didn't want to use the switch statement over the entire OpcodeID space (as implemented in opcodeLength()) in the first place. > > > > Let me say this differently. This is a *different* example of why switch is better, and it *isn't* related to constant folding. > > > > A switch statement where all cases have a return of a constant will be turned into an array lookup, just like opcodeLength[blah]. The compiler will turn it into such an array lookup precisely when it thinks it is profitable. > > > > Hence, your change will never result in better code. It may result in worse code, if the compiler decides that an array lookup was less efficient - for example if the target it's compiling for has faster branches than loads, or something. > > OK, understood. But I knew that there's already an opcodeLengths[] array, and I didn't want the compiler to potentially instantiate a new one for the inlined opcodeLength() function. > > Anyway, here's the new code: > > static size_t callOpcodeLength(VM* vm, CodeBlock* codeBlock, unsigned bytecodeIndex) > { > Opcode opcode = codeBlock->instructions()[bytecodeIndex].u.opcode; > OpcodeID opcodeID = vm->interpreter->getOpcodeID(opcode); > RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs); > ASSERT(OPCODE_LENGTH(op_call) == OPCODE_LENGTH(op_call_eval)); > return UNLIKELY(opcodeID == op_call_varargs) ? OPCODE_LENGTH(op_call_varargs) : OPCODE_LENGTH(op_call); > } > > > > Initially, I wasn't sure if the helper should be called for the top frame as well. But, now that we've determined that we should not call it for the top frame (and know that the opcode ID has to be some call opcode), we can reduce this to a simple "?:" statement that checks for op_call_varargs. > > > > > > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657 > > > > > + unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex > > > > > + + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex); > > > > > > > > I would love to have this assert that the caller bytecode index is in fact a call of some kind. I would prefer a RELEASE_ASSERT. Can you refiddle this code to do such an assertion? I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper. > > > > > > I will add the RELEASE_ASSERT in the helper since it is guaranteed to see a call opcode of some sort. Will also rename the helper to "callOpcodeLength()" for brevity and more clearly declare its intent. > > > > Really? If the helper is called for the top call frame, it may see any opcode. > > Let me clarify. The helper should be and is called for the top call frame to compute the bytecodeOffset of the returnPC, and this case should always see call opcodes. > > However, back before I determined that we shouldn't call it for computing the bytecodeOffset of the opcode that we're OSR exiting to, I was also calling the helper there. Hence, I was seeing whatever opcodes were triggering the OSR exit. But that is an invalid scenario, and we don't make that call anymore. Hence, the helper will never see any opcodes other than call opcodes. OK! Let me take a peek at the new patch before you land.
Created attachment 203492 [details] updated patch.
Comment on attachment 203492 [details] updated patch. View in context: https://bugs.webkit.org/attachment.cgi?id=203492&action=review > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:44 > +static size_t callOpcodeLength(VM* vm, CodeBlock* codeBlock, unsigned bytecodeIndex) > +{ > + Opcode opcode = codeBlock->instructions()[bytecodeIndex].u.opcode; > + OpcodeID opcodeID = vm->interpreter->getOpcodeID(opcode); > + RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs); > + ASSERT(OPCODE_LENGTH(op_call) == OPCODE_LENGTH(op_call_eval)); > + return UNLIKELY(opcodeID == op_call_varargs) ? OPCODE_LENGTH(op_call_varargs) : OPCODE_LENGTH(op_call); > +} This is wrong. You forgot about op_construct. Also the name "callOpcodeLength" is really weird. Think of a better name. > Source/JavaScriptCore/dfg/DFGOSRExitCompiler64.cpp:44 > +static size_t callOpcodeLength(VM* vm, CodeBlock* codeBlock, unsigned bytecodeIndex) > +{ > + Opcode opcode = codeBlock->instructions()[bytecodeIndex].u.opcode; > + OpcodeID opcodeID = vm->interpreter->getOpcodeID(opcode); > + RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs); > + ASSERT(OPCODE_LENGTH(op_call) == OPCODE_LENGTH(op_call_eval)); > + return UNLIKELY(opcodeID == op_call_varargs) ? OPCODE_LENGTH(op_call_varargs) : OPCODE_LENGTH(op_call); > +} Ditto.
(In reply to comment #8) > (From update of attachment 203492 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=203492&action=review > > > + RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs); > > This is wrong. You forgot about op_construct. Dope! I had it right yesterday when I was testing this assert (but unfortunately, removed it), but botch it this time (when I re-added it). Will fix. > Also the name "callOpcodeLength" is really weird. Think of a better name. I presume it sounds weird because it can imply "calling" something called an "OpcodeLength". How about "lengthOfCallOpcodeAt"? Here's an example how it appears in use: bytecodeIndex += lengthOfCallOpcodeAt(m_jit.vm(), baselineCodeBlock, codeOrigin.bytecodeIndex);
Created attachment 203493 [details] updated patch #2.
Comment on attachment 203493 [details] updated patch #2. Actually, would it be possible to add a test? Like throw an exception after OSR exiting from inlined code and then check the stack trace. Make sure that this fails on trunk.
(In reply to comment #11) > (From update of attachment 203493 [details]) > Actually, would it be possible to add a test? Like throw an exception after OSR exiting from inlined code and then check the stack trace. Make sure that this fails on trunk. Strange. I can get a crash in fast/js/stack-trace.html quite easily with the flt branch (debug build, unmodified), but I can't seem to reproduce the error on trunk. I'm also not seeing any line number errors in the stack traces. Still investigating.
(In reply to comment #12) > Strange. I can get a crash in fast/js/stack-trace.html quite easily with the flt branch (debug build, unmodified), but I can't seem to reproduce the error on trunk. I'm also not seeing any line number errors in the stack traces. Still investigating. The reason that the ftl branch crashed and that ToT did not is because ToT has a fix by Oliver (http://trac.webkit.org/changeset/149404) that was not merged into the ftl branch. Next, I'll see if I can alter the test to defeat that fix by Oliver.
For this OSR exit case, we won't be able to construct a stack trace test that fails without the patch. The reason is because we've OSR'ed exited to the baseline JIT, and the CodeBlock uses the baselineJIT code's callReturnIndexVector to map the return PC to the bytecodeOffset. It does not rely on the bytecodeOffset in the ArgumentCount tag. Hence, it won't fail.
> I can't seem to reproduce the error on trunk. Sounds like this patch should land only on the branch, then.
(In reply to comment #15) > > I can't seem to reproduce the error on trunk. > > Sounds like this patch should land only on the branch, then. Sure does sound like it.
Created attachment 203719 [details] reworked patch for ftl branch With this new patch, we no longer need to opcode length to the bytecode offset because that's how the ftl branch rolls.
(In reply to comment #17) > Created an attachment (id=203719) [details] > reworked patch for ftl branch > > With this new patch, we no longer need to opcode length to the bytecode offset because that's how the ftl branch rolls. … no longer need to add opcode length ...
Comment on attachment 203719 [details] reworked patch for ftl branch r=me
Thanks for the review. Landed in r151184: <http://trac.webkit.org/changeset/151184>.