Bug 117103 - DFG should populate frame bytecodeOffsets on OSR exit
Summary: DFG should populate frame bytecodeOffsets on OSR exit
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Lam
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2013-05-31 19:23 PDT by Mark Lam
Modified: 2013-06-04 14:32 PDT (History)
5 users (show)

See Also:


Attachments
the patch. (9.62 KB, patch)
2013-05-31 20:39 PDT, Mark Lam
fpizlo: review+
Details | Formatted Diff | Diff
updated patch. (9.34 KB, patch)
2013-06-01 11:01 PDT, Mark Lam
fpizlo: review-
Details | Formatted Diff | Diff
updated patch #2. (9.56 KB, patch)
2013-06-01 11:35 PDT, Mark Lam
no flags Details | Formatted Diff | Diff
reworked patch for ftl branch (2.48 KB, patch)
2013-06-04 13:18 PDT, Mark Lam
ggaren: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Lam 2013-05-31 19:23:45 PDT
There are 2 bugs:
1. OSR exit doesn't currently populate the bytecodeOffset field in the CallFrame when it reifies the inlined frames as well as in the outermost (a.k.a. "machine") frame.
2. OSR exit is computing the return PC with the assumption that the caller's call opcode is op_call.  If the caller's opcode is op_call_varargs, then the computed return PC will be incorrect.

Patch for the fix coming soon.

ref: <rdar://problem/14039335>.
Comment 1 Mark Lam 2013-05-31 20:39:04 PDT
Created attachment 203479 [details]
the patch.
Comment 2 Filip Pizlo 2013-06-01 09:46:52 PDT
Comment on attachment 203479 [details]
the patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=203479&action=review

R=me with those changes.

> Source/JavaScriptCore/bytecode/Opcode.h:282
>  inline size_t opcodeLength(OpcodeID opcode)
>  {
> -    switch (opcode) {
> -#define OPCODE_ID_LENGTHS(id, length) case id: return OPCODE_LENGTH(id);
> -         FOR_EACH_OPCODE_ID(OPCODE_ID_LENGTHS)
> -#undef OPCODE_ID_LENGTHS
> -    }
> -    RELEASE_ASSERT_NOT_REACHED();
> -    return 0;
> +    return opcodeLengths[opcode];
>  }

Why?

The intuition for this method is that it gives the compiler a better opportunity to constant fold.  We were expecting that if you call opcodeLength() with a constant then the call will be folded to a constant.  I think that's harder if you do an array lookup instead.

If you have a good reason for this change and it doesn't affect performance, then I'm fine with it - I just want to make sure that you had a reason for this change.

> Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41
> +    return opcodeLength(opcodeID);

This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target.

> Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657
> +        unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex
> +            + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex);

I would love to have this assert that the caller bytecode index is in fact a call of some kind.  I would prefer a RELEASE_ASSERT.  Can you refiddle this code to do such an assertion?  I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper.

> Source/JavaScriptCore/dfg/DFGOSRExitCompiler64.cpp:621
> +        unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex
> +            + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex);

Ditto.
Comment 3 Mark Lam 2013-06-01 10:20:13 PDT
Thanks for the review.

(In reply to comment #2)
> > Source/JavaScriptCore/bytecode/Opcode.h:282
> >  inline size_t opcodeLength(OpcodeID opcode)
> >  {
> > -    switch (opcode) {
> > -#define OPCODE_ID_LENGTHS(id, length) case id: return OPCODE_LENGTH(id);
> > -         FOR_EACH_OPCODE_ID(OPCODE_ID_LENGTHS)
> > -#undef OPCODE_ID_LENGTHS
> > -    }
> > -    RELEASE_ASSERT_NOT_REACHED();
> > -    return 0;
> > +    return opcodeLengths[opcode];
> >  }
> 
> Why?
> 
> The intuition for this method is that it gives the compiler a better opportunity to constant fold.  We were expecting that if you call opcodeLength() with a constant then the call will be folded to a constant.  I think that's harder if you do an array lookup instead.
> 
> If you have a good reason for this change and it doesn't affect performance, then I'm fine with it - I just want to make sure that you had a reason for this change.

Good point.  I didn't think of constant folding.  Will revert this change.  Code that wants to do a length look up of a variable opcodeID should use opcodeLengths[] explicitly instead.

> > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41
> > +    return opcodeLength(opcodeID);
> 
> This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target.

The opcodeID is variable in this case.  Hence, the C++ compiler won't be able to constant fold.  This is the reason I didn't want to use the switch statement over the entire OpcodeID space (as implemented in opcodeLength()) in the first place.

Initially, I wasn't sure if the helper should be called for the top frame as well.  But, now that we've determined that we should not call it for the top frame (and know that the opcode ID has to be some call opcode), we can reduce this to a simple "?:" statement that checks for op_call_varargs. 

> > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657
> > +        unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex
> > +            + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex);
> 
> I would love to have this assert that the caller bytecode index is in fact a call of some kind.  I would prefer a RELEASE_ASSERT.  Can you refiddle this code to do such an assertion?  I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper.

I will add the RELEASE_ASSERT in the helper since it is guaranteed to see a call opcode of some sort.  Will also rename the helper to "callOpcodeLength()" for brevity and more clearly declare its intent.
Comment 4 Filip Pizlo 2013-06-01 10:26:37 PDT
(In reply to comment #3)
> Thanks for the review.
> 
> (In reply to comment #2)
> > > Source/JavaScriptCore/bytecode/Opcode.h:282
> > >  inline size_t opcodeLength(OpcodeID opcode)
> > >  {
> > > -    switch (opcode) {
> > > -#define OPCODE_ID_LENGTHS(id, length) case id: return OPCODE_LENGTH(id);
> > > -         FOR_EACH_OPCODE_ID(OPCODE_ID_LENGTHS)
> > > -#undef OPCODE_ID_LENGTHS
> > > -    }
> > > -    RELEASE_ASSERT_NOT_REACHED();
> > > -    return 0;
> > > +    return opcodeLengths[opcode];
> > >  }
> > 
> > Why?
> > 
> > The intuition for this method is that it gives the compiler a better opportunity to constant fold.  We were expecting that if you call opcodeLength() with a constant then the call will be folded to a constant.  I think that's harder if you do an array lookup instead.
> > 
> > If you have a good reason for this change and it doesn't affect performance, then I'm fine with it - I just want to make sure that you had a reason for this change.
> 
> Good point.  I didn't think of constant folding.  Will revert this change.  Code that wants to do a length look up of a variable opcodeID should use opcodeLengths[] explicitly instead.
> 
> > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41
> > > +    return opcodeLength(opcodeID);
> > 
> > This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target.
> 
> The opcodeID is variable in this case.  Hence, the C++ compiler won't be able to constant fold.  This is the reason I didn't want to use the switch statement over the entire OpcodeID space (as implemented in opcodeLength()) in the first place.

Let me say this differently.  This is a *different* example of why switch is better, and it *isn't* related to constant folding.

A switch statement where all cases have a return of a constant will be turned into an array lookup, just like opcodeLength[blah].  The compiler will turn it into such an array lookup precisely when it thinks it is profitable.

Hence, your change will never result in better code.  It may result in worse code, if the compiler decides that an array lookup was less efficient - for example if the target it's compiling for has faster branches than loads, or something.

> 
> Initially, I wasn't sure if the helper should be called for the top frame as well.  But, now that we've determined that we should not call it for the top frame (and know that the opcode ID has to be some call opcode), we can reduce this to a simple "?:" statement that checks for op_call_varargs. 
> 
> > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657
> > > +        unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex
> > > +            + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex);
> > 
> > I would love to have this assert that the caller bytecode index is in fact a call of some kind.  I would prefer a RELEASE_ASSERT.  Can you refiddle this code to do such an assertion?  I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper.
> 
> I will add the RELEASE_ASSERT in the helper since it is guaranteed to see a call opcode of some sort.  Will also rename the helper to "callOpcodeLength()" for brevity and more clearly declare its intent.

Really?  If the helper is called for the top call frame, it may see any opcode.
Comment 5 Mark Lam 2013-06-01 10:44:31 PDT
(In reply to comment #4)
> > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41
> > > > +    return opcodeLength(opcodeID);
> > > 
> > > This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target.
> > 
> > The opcodeID is variable in this case.  Hence, the C++ compiler won't be able to constant fold.  This is the reason I didn't want to use the switch statement over the entire OpcodeID space (as implemented in opcodeLength()) in the first place.
> 
> Let me say this differently.  This is a *different* example of why switch is better, and it *isn't* related to constant folding.
> 
> A switch statement where all cases have a return of a constant will be turned into an array lookup, just like opcodeLength[blah].  The compiler will turn it into such an array lookup precisely when it thinks it is profitable.
> 
> Hence, your change will never result in better code.  It may result in worse code, if the compiler decides that an array lookup was less efficient - for example if the target it's compiling for has faster branches than loads, or something.

OK, understood.  But I knew that there's already an opcodeLengths[] array, and I didn't want the compiler to potentially instantiate a new one for the inlined opcodeLength() function. 

Anyway, here's the new code:

static size_t callOpcodeLength(VM* vm, CodeBlock* codeBlock, unsigned bytecodeIndex)
{
    Opcode opcode = codeBlock->instructions()[bytecodeIndex].u.opcode;
    OpcodeID opcodeID = vm->interpreter->getOpcodeID(opcode);
    RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs);
    ASSERT(OPCODE_LENGTH(op_call) == OPCODE_LENGTH(op_call_eval));
    return UNLIKELY(opcodeID == op_call_varargs) ? OPCODE_LENGTH(op_call_varargs) : OPCODE_LENGTH(op_call);
}

> > Initially, I wasn't sure if the helper should be called for the top frame as well.  But, now that we've determined that we should not call it for the top frame (and know that the opcode ID has to be some call opcode), we can reduce this to a simple "?:" statement that checks for op_call_varargs. 
> > 
> > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657
> > > > +        unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex
> > > > +            + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex);
> > > 
> > > I would love to have this assert that the caller bytecode index is in fact a call of some kind.  I would prefer a RELEASE_ASSERT.  Can you refiddle this code to do such an assertion?  I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper.
> > 
> > I will add the RELEASE_ASSERT in the helper since it is guaranteed to see a call opcode of some sort.  Will also rename the helper to "callOpcodeLength()" for brevity and more clearly declare its intent.
> 
> Really?  If the helper is called for the top call frame, it may see any opcode.

Let me clarify.  The helper should be and is called for the top call frame to compute the bytecodeOffset of the returnPC, and this case should always see call opcodes.

However, back before I determined that we shouldn't call it for computing the bytecodeOffset of the opcode that we're OSR exiting to, I was also calling the helper there.  Hence, I was seeing whatever opcodes were triggering the OSR exit.  But that is an invalid scenario, and we don't make that call anymore.  Hence, the helper will never see any opcodes other than call opcodes.
Comment 6 Filip Pizlo 2013-06-01 10:47:27 PDT
(In reply to comment #5)
> (In reply to comment #4)
> > > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:41
> > > > > +    return opcodeLength(opcodeID);
> > > > 
> > > > This call is another example of where a switch is better than an array lookup: with a switch statement, a C++ compiler backend will either choose to use a lookup table (transforming a switch with all cases that return a constant into an array lookup) or branches depending on what it thinks is most profitable for the given target.
> > > 
> > > The opcodeID is variable in this case.  Hence, the C++ compiler won't be able to constant fold.  This is the reason I didn't want to use the switch statement over the entire OpcodeID space (as implemented in opcodeLength()) in the first place.
> > 
> > Let me say this differently.  This is a *different* example of why switch is better, and it *isn't* related to constant folding.
> > 
> > A switch statement where all cases have a return of a constant will be turned into an array lookup, just like opcodeLength[blah].  The compiler will turn it into such an array lookup precisely when it thinks it is profitable.
> > 
> > Hence, your change will never result in better code.  It may result in worse code, if the compiler decides that an array lookup was less efficient - for example if the target it's compiling for has faster branches than loads, or something.
> 
> OK, understood.  But I knew that there's already an opcodeLengths[] array, and I didn't want the compiler to potentially instantiate a new one for the inlined opcodeLength() function. 
> 
> Anyway, here's the new code:
> 
> static size_t callOpcodeLength(VM* vm, CodeBlock* codeBlock, unsigned bytecodeIndex)
> {
>     Opcode opcode = codeBlock->instructions()[bytecodeIndex].u.opcode;
>     OpcodeID opcodeID = vm->interpreter->getOpcodeID(opcode);
>     RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs);
>     ASSERT(OPCODE_LENGTH(op_call) == OPCODE_LENGTH(op_call_eval));
>     return UNLIKELY(opcodeID == op_call_varargs) ? OPCODE_LENGTH(op_call_varargs) : OPCODE_LENGTH(op_call);
> }
> 
> > > Initially, I wasn't sure if the helper should be called for the top frame as well.  But, now that we've determined that we should not call it for the top frame (and know that the opcode ID has to be some call opcode), we can reduce this to a simple "?:" statement that checks for op_call_varargs. 
> > > 
> > > > > Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:657
> > > > > +        unsigned returnBytecodeIndex = inlineCallFrame->caller.bytecodeIndex
> > > > > +            + opcodeLengthForBytecodeIndex(m_jit.vm(), baselineCodeBlockForCaller, inlineCallFrame->caller.bytecodeIndex);
> > > > 
> > > > I would love to have this assert that the caller bytecode index is in fact a call of some kind.  I would prefer a RELEASE_ASSERT.  Can you refiddle this code to do such an assertion?  I know it's awkward since you've got this helper, but the assertion is more important than having a nice helper.
> > > 
> > > I will add the RELEASE_ASSERT in the helper since it is guaranteed to see a call opcode of some sort.  Will also rename the helper to "callOpcodeLength()" for brevity and more clearly declare its intent.
> > 
> > Really?  If the helper is called for the top call frame, it may see any opcode.
> 
> Let me clarify.  The helper should be and is called for the top call frame to compute the bytecodeOffset of the returnPC, and this case should always see call opcodes.
> 
> However, back before I determined that we shouldn't call it for computing the bytecodeOffset of the opcode that we're OSR exiting to, I was also calling the helper there.  Hence, I was seeing whatever opcodes were triggering the OSR exit.  But that is an invalid scenario, and we don't make that call anymore.  Hence, the helper will never see any opcodes other than call opcodes.

OK!  Let me take a peek at the new patch before you land.
Comment 7 Mark Lam 2013-06-01 11:01:29 PDT
Created attachment 203492 [details]
updated patch.
Comment 8 Filip Pizlo 2013-06-01 11:03:36 PDT
Comment on attachment 203492 [details]
updated patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=203492&action=review

> Source/JavaScriptCore/dfg/DFGOSRExitCompiler32_64.cpp:44
> +static size_t callOpcodeLength(VM* vm, CodeBlock* codeBlock, unsigned bytecodeIndex)
> +{
> +    Opcode opcode = codeBlock->instructions()[bytecodeIndex].u.opcode;
> +    OpcodeID opcodeID = vm->interpreter->getOpcodeID(opcode);
> +    RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs);
> +    ASSERT(OPCODE_LENGTH(op_call) == OPCODE_LENGTH(op_call_eval));
> +    return UNLIKELY(opcodeID == op_call_varargs) ? OPCODE_LENGTH(op_call_varargs) : OPCODE_LENGTH(op_call);
> +}

This is wrong.  You forgot about op_construct.

Also the name "callOpcodeLength" is really weird.  Think of a better name.

> Source/JavaScriptCore/dfg/DFGOSRExitCompiler64.cpp:44
> +static size_t callOpcodeLength(VM* vm, CodeBlock* codeBlock, unsigned bytecodeIndex)
> +{
> +    Opcode opcode = codeBlock->instructions()[bytecodeIndex].u.opcode;
> +    OpcodeID opcodeID = vm->interpreter->getOpcodeID(opcode);
> +    RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs);
> +    ASSERT(OPCODE_LENGTH(op_call) == OPCODE_LENGTH(op_call_eval));
> +    return UNLIKELY(opcodeID == op_call_varargs) ? OPCODE_LENGTH(op_call_varargs) : OPCODE_LENGTH(op_call);
> +}

Ditto.
Comment 9 Mark Lam 2013-06-01 11:28:41 PDT
(In reply to comment #8)
> (From update of attachment 203492 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=203492&action=review
> > > +    RELEASE_ASSERT(opcodeID == op_call || opcodeID == op_call_eval || opcodeID == op_call_varargs);
> 
> This is wrong.  You forgot about op_construct.

Dope!  I had it right yesterday when I was testing this assert (but unfortunately, removed it), but botch it this time (when I re-added it).  Will fix.
 
> Also the name "callOpcodeLength" is really weird.  Think of a better name.

I presume it sounds weird because it can imply "calling" something called an "OpcodeLength".  How about "lengthOfCallOpcodeAt"?  Here's an example how it appears in use:

    bytecodeIndex += lengthOfCallOpcodeAt(m_jit.vm(), baselineCodeBlock, codeOrigin.bytecodeIndex);
Comment 10 Mark Lam 2013-06-01 11:35:36 PDT
Created attachment 203493 [details]
updated patch #2.
Comment 11 Filip Pizlo 2013-06-01 12:36:17 PDT
Comment on attachment 203493 [details]
updated patch #2.

Actually, would it be possible to add a test?  Like throw an exception after OSR exiting from inlined code and then check the stack trace. Make sure that this fails on trunk.
Comment 12 Mark Lam 2013-06-02 16:14:43 PDT
(In reply to comment #11)
> (From update of attachment 203493 [details])
> Actually, would it be possible to add a test?  Like throw an exception after OSR exiting from inlined code and then check the stack trace. Make sure that this fails on trunk.

Strange.  I can get a crash in fast/js/stack-trace.html quite easily with the flt branch (debug build, unmodified), but I can't seem to reproduce the error on trunk.  I'm also not seeing any line number errors in the stack traces.  Still investigating.
Comment 13 Mark Lam 2013-06-02 19:59:06 PDT
(In reply to comment #12)
> Strange.  I can get a crash in fast/js/stack-trace.html quite easily with the flt branch (debug build, unmodified), but I can't seem to reproduce the error on trunk.  I'm also not seeing any line number errors in the stack traces.  Still investigating.

The reason that the ftl branch crashed and that ToT did not is because ToT has a fix by Oliver (http://trac.webkit.org/changeset/149404) that was not merged into the ftl branch.

Next, I'll see if I can alter the test to defeat that fix by Oliver.
Comment 14 Mark Lam 2013-06-02 21:48:53 PDT
For this OSR exit case, we won't be able to construct a stack trace test that fails without the patch.  The reason is because we've OSR'ed exited to the baseline JIT, and the CodeBlock uses the baselineJIT code's callReturnIndexVector to map the return PC to the bytecodeOffset.  It does not rely on the bytecodeOffset in the ArgumentCount tag.  Hence, it won't fail.
Comment 15 Geoffrey Garen 2013-06-03 10:51:15 PDT
> I can't seem to reproduce the error on trunk.

Sounds like this patch should land only on the branch, then.
Comment 16 Filip Pizlo 2013-06-03 10:54:06 PDT
(In reply to comment #15)
> > I can't seem to reproduce the error on trunk.
> 
> Sounds like this patch should land only on the branch, then.

Sure does sound like it.
Comment 17 Mark Lam 2013-06-04 13:18:16 PDT
Created attachment 203719 [details]
reworked patch for ftl branch

With this new patch, we no longer need to opcode length to the bytecode offset because that's how the ftl branch rolls.
Comment 18 Mark Lam 2013-06-04 13:18:53 PDT
(In reply to comment #17)
> Created an attachment (id=203719) [details]
> reworked patch for ftl branch
> 
> With this new patch, we no longer need to opcode length to the bytecode offset because that's how the ftl branch rolls.

… no longer need to add opcode length ...
Comment 19 Geoffrey Garen 2013-06-04 14:26:18 PDT
Comment on attachment 203719 [details]
reworked patch for ftl branch

r=me
Comment 20 Mark Lam 2013-06-04 14:32:09 PDT
Thanks for the review.  Landed in r151184: <http://trac.webkit.org/changeset/151184>.