Bug 150745

Summary: WebInspector crashed while viewing Timeline when refreshing cnn.com while it was already loading
Product: WebKit Reporter: Michael Saboff <msaboff>
Component: JavaScriptCoreAssignee: Michael Saboff <msaboff>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, fpizlo, ggaren, keith_miller, mark.lam, saam
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Attachments:
Description Flags
Patch
msaboff: review-
Updated Patch ggaren: review+

Description Michael Saboff 2015-10-30 18:26:30 PDT
* STEPS TO REPRODUCE
1. Inspect cnn.com
2. Show Timeline tab
3. Reload
4. Repeat

We get a crash like this:
    frame #0: 0x000000010c25767e JavaScriptCore`::WTFCrash() + 62 at Assertions.cpp:321
    frame #1: 0x000000010bd50e8a JavaScriptCore`JSC::DFG::reifyInlinedCallFrames(jit=<unavailable>, exit=<unavailable>) + 1546 at DFGOSRExitCompilerCommon.cpp:193
    frame #2: 0x000000010bd4ee0b JavaScriptCore`JSC::DFG::OSRExitCompiler::compileExit(this=0x00007fff5b42b410, exit=0x0000000143288380, operands=<unavailable>, recovery=<unavailable>) + 4667 at DFGOSRExitCompiler64.cpp:387
    frame #3: 0x000000010bd4cc95 JavaScriptCore`::compileOSRExit(exec=<unavailable>) + 1493 at DFGOSRExitCompiler.cpp:162
    frame #4: 0x000036bd736098a1 prepareToShow#DyZ1GU [DFG](Cell[Object ID: 18687]: 0x14000ea80, True)
    frame #5: 0x000036bd7406e4df _showEntry#Dp5saP [Baseline](Cell[Object ID: 15664]: 0x1435c9900, Cell[Object ID: 18687]: 0x14000ea80, True)
    frame #6: 0x000036bd741638d2 showBackForwardEntryForIndex#ETQFoG [Baseline](Cell[Object ID: 15664]: 0x1435c9900, 0)
    frame #7: 0x000036bd73f00045 showContentView#BhrqjJ [Baseline](Cell[Object ID: 15664]: 0x1435c9900, Cell[Object ID: 18505]: 0x1435c97c0)
...

Looks like we don't have correct location information for an OSR exit.

We are OSR exiting from prepareToShow#DyZ1GU->_restoreFromCookie#AsCLr2->cookie#C5Hkj7->value#ApQL0d.
From a call tree perspective,
    prepareToShow#DyZ1GU called _restoreFromCookie#AsCLr2 via op_call
        _restoreFromCookie#AsCLr2 called cookie#C5Hkj7 as a getter via op_get_by_id (bc#52)
            cookie#C5Hkj7 called value#ApQL0d via a op_tail_call

When we are reifying the inlined frames looking for the jump target, we incorrectly have the call type from _restoreFromCookie#AsCLr2 to cookie#C5Hkj7 as InlineCallFrame::TailCall and try to find the CallLinkInfo for bc#52, but there isn't one.  Instead this call should be processed as a InlineCallFrame::GetterCall.
Comment 1 Michael Saboff 2015-10-30 18:27:09 PDT
rdar://problem/23209327
Comment 2 Michael Saboff 2015-11-02 13:36:07 PST
Created attachment 264620 [details]
Patch
Comment 3 Geoffrey Garen 2015-11-02 13:56:05 PST
Comment on attachment 264620 [details]
Patch

r=me

In future, I think we want the InlineCallFrame to reflect its non-tail caller, since that's what the language wants -- instead of requiring all users of InlineCallFrame to see the tail caller and then skip it, effectively correcting the data structure on the fly.
Comment 4 Michael Saboff 2015-11-02 14:00:44 PST
(In reply to comment #3)
> Comment on attachment 264620 [details]
> Patch
> 
> r=me
> 
> In future, I think we want the InlineCallFrame to reflect its non-tail
> caller, since that's what the language wants -- instead of requiring all
> users of InlineCallFrame to see the tail caller and then skip it,
> effectively correcting the data structure on the fly.

Agreed.  I filed <rdar://problem/23361344>.
Comment 5 Filip Pizlo 2015-11-02 14:05:09 PST
(In reply to comment #3)
> Comment on attachment 264620 [details]
> Patch
> 
> r=me
> 
> In future, I think we want the InlineCallFrame to reflect its non-tail
> caller, since that's what the language wants -- instead of requiring all
> users of InlineCallFrame to see the tail caller and then skip it,
> effectively correcting the data structure on the fly.

The problem is that if we have foo(), bar() like:

function bar() { things }

function foo() { things; bar(); } // tail call to bar

And we compile foo() and inline bar(), then what should the inline stack in foo->bar look like?  When we do this, we cannot have inlining pretend as if bar replaced foo, even though it actually did.  For example, bar()'s arguments aren't going to be placed in foo's caller's argument passing area.  So, bar() must have an InlineCallFrame, because it must have a stack offset, and its arguments must be in foo's locals.

Some of these things can be changed, some can't, and some aren't worth it.  The DFG expects that there are some "bytecode locals" that it uses to store meta-data about an inlined call.  It's not worth trying to place those locals anywhere but the locals area of foo; for example placing them in foo's call frame header would be too much trouble even though that's what "tail calling" usually means.  But where we place the meta-data is currently coupled to the InlineCallFrame::stackOffset, and having a non-zero stackOffset is currently coupled to having a non-null InlineCallFrame.

Since we're saying that bar() has an InlineCallFrame with a call stack that reflects what would have happened without tail calls, it therefore makes some sense to be consistent, and always say that the InlineCallFrame stack is the "non-tail-optimized" call stack.

FWIW, this also makes the most sense when looking at IR dumps and thinking about how compiler optimizations impact inlining - since inlining makes the DFG IR look as if tail calls aren't a thing, it means that optimizations don't have to know about tail call inlining.  It's true that code that does OSR must skip up the InlineCallFrame stack, and that is annoying.  But I think we should fix that by making the API nicer.  Right now we walk InlineCallFrame stacks using ad-hoc for loops, and I think that's partly why we make mistakes.
Comment 6 Geoffrey Garen 2015-11-02 14:27:34 PST
Let's assume that we do produce an InlineCallFrame in the locals area for all inlined functions, including inlined functions that do tails calls or are tailed called by the root CodeBlock, which includes arguments and so on, and let's assume that the compiler's InlineStackEntry reflects an m_caller value that includes tail callers.

Now let's say that we change InlineCallFrame, which we use for recovering stacks at runtime, for OSR exits and backtraces and stuff, to have a directCaller value that does not include tail callers.

Does this mostly work?

In other words, what are the clients of InlineCallFrame (and not InlineStackEntry) that *want* to see tail callers?
Comment 7 Michael Saboff 2015-11-02 16:33:38 PST
I have tested this fix with the original web page (cnn.com) and reloading the page with the web inspector open.  Now it is crashing after the OSR exit code jumps to the baseline and we return from the callee as if it was a getter call.

We end up returning to the baseline getter return thunk, which looks like:
Generated JIT code for baseline getter return thunk:
    Code at [0x4ef12eff3de0, 0x4ef12eff3e00):
      0x4ef12eff3de0: mov 0x20(%rsp), %rdx
      0x4ef12eff3de5: add $0x30, %rsp
      0x4ef12eff3de9: jmp %rdx

The problem is that 0x20(%rsp) doesn't contain the return location to the originating op_get_by_id, but the second argument to the tail called function.  This getter return stub expects one argument and puts the return address in the unused slot for the second argument.

In the OSR exit generator, we find (DFGOSRExitCompilerCommon.cpp, line214):
        if (trueReturnPC)
            jit.storePtr(AssemblyHelpers::TrustedImmPtr(trueReturnPC), 
                AssemblyHelpers::addressFor(inlineCallFrame->stackOffset + 
                    virtualRegisterForArgument(inlineCallFrame->arguments.size()).offset()));

For a getter, inlineCallFrame->arguments.size() will be 1.  In our case with a tail callee with two arguments, this won't work.

I've updated the new test to reproduce this issue.

More work to be done.
Comment 8 Michael Saboff 2015-11-02 19:43:56 PST
Created attachment 264657 [details]
Updated Patch
Comment 9 Filip Pizlo 2015-11-02 20:26:18 PST
(In reply to comment #6)
> Let's assume that we do produce an InlineCallFrame in the locals area for
> all inlined functions, including inlined functions that do tails calls or
> are tailed called by the root CodeBlock, which includes arguments and so on,
> and let's assume that the compiler's InlineStackEntry reflects an m_caller
> value that includes tail callers.
> 
> Now let's say that we change InlineCallFrame, which we use for recovering
> stacks at runtime, for OSR exits and backtraces and stuff, to have a
> directCaller value that does not include tail callers.

We use it for more than that. It's the compiler's way of knowing what locals mean.

> 
> Does this mostly work?
> 
> In other words, what are the clients of InlineCallFrame (and not
> InlineStackEntry) that *want* to see tail callers?

What is the value of directCaller for inlined functions that were tail called from the outermost function? If it's null, then how do we know that we're in inlined code?  Another field, perhaps?  If we did this, then a lot of code in the DFG would have to change.  There is a huge difference between how we have to treat outermost code and inlined code.  For example, the outermost code puts its arguments in the physical argument area on the stack, while inlined code puts its arguments in locals.

It's possible that we could engineer a change like this, but it feels like a very large change with lots of unknowns.

A much better approach would be to add a method to InlineCallFrame called directCaller(), which skips tail callers, and then make sure that we are careful about the use of m_caller. We can switch uses of m_caller to directCaller() and examine if this really let us simplify code. This will give us a benefit even without having to worry about there being some code that relies on the current semantics of m_caller. And if there really are no code paths that want m_caller, then we can find that out through incrementally adopting the directCaller() method, rather than in one expensive leap.
Comment 10 Geoffrey Garen 2015-11-02 21:25:05 PST
Comment on attachment 264657 [details]
Updated Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=264657&action=review

r=me

> Source/JavaScriptCore/ChangeLog:15
> +        We can also retrun directly back to a getter or setter callsite with using a thunk.

return

without

> Source/JavaScriptCore/dfg/DFGOSRExitCompilerCommon.cpp:150
> +        CodeOrigin* trueCaller = inlineCallFrame->getCallerSkippingDeadFrames(&trueCallerCallKind);

You should follow up and rename this to getCallerSkippingTailCallers.
Comment 11 Geoffrey Garen 2015-11-02 21:29:41 PST
> A much better approach would be to add a method to InlineCallFrame called
> directCaller(), which skips tail callers

👍

To clarify, InlineCallFrame has a directCaller member; InlineStackEntry has an m_caller member. 

But I think the point stands that, if we change InlineCallFrame's clients to use a helper function that excludes tail callers, and we find that nobody continues to use its directCaller member to include tail callers, then we have our answer.
Comment 12 Michael Saboff 2015-11-02 21:34:14 PST
(In reply to comment #10)
> Comment on attachment 264657 [details]
> Updated Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=264657&action=review
> 
> r=me
> 
> > Source/JavaScriptCore/ChangeLog:15
> > +        We can also retrun directly back to a getter or setter callsite with using a thunk.
> 
> return

Fixed.

> without

Fixed.

> > Source/JavaScriptCore/dfg/DFGOSRExitCompilerCommon.cpp:150
> > +        CodeOrigin* trueCaller = inlineCallFrame->getCallerSkippingDeadFrames(&trueCallerCallKind);
> 
> You should follow up and rename this to getCallerSkippingTailCallers.

Agreed.  Filed https://bugs.webkit.org/show_bug.cgi?id=150832.
Comment 13 Michael Saboff 2015-11-02 21:34:51 PST
Committed r191937: <http://trac.webkit.org/changeset/191937>