Bug 177844 - Add support for using Probe DFG OSR Exit behind a runtime flag.
Summary: Add support for using Probe DFG OSR Exit behind a runtime flag.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Lam
URL:
Keywords: InRadar
Depends on:
Blocks: 175144
  Show dependency treegraph
 
Reported: 2017-10-03 15:50 PDT by Mark Lam
Modified: 2017-10-04 13:00 PDT (History)
7 users (show)

See Also:


Attachments
proposed patch. (86.15 KB, patch)
2017-10-03 17:45 PDT, Mark Lam
saam: review+
Details | Formatted Diff | Diff
patch for landing. (85.86 KB, patch)
2017-10-04 12:57 PDT, Mark Lam
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Lam 2017-10-03 15:50:39 PDT
This is based on the code originally posted in https://bugs.webkit.org/show_bug.cgi?id=175144 with some optimizations and bug fixes added.  The probe based DFG OSR Exit is only enabled if Options::useProbeBaseOSRExit() is true.  We're landing this behind an option switch to make it easier to tune performance using the probe based OSR exit.
Comment 1 Radar WebKit Bug Importer 2017-10-03 15:51:15 PDT
<rdar://problem/34801425>
Comment 2 Mark Lam 2017-10-03 17:45:59 PDT
Created attachment 322619 [details]
proposed patch.

Let's get some feedback and some EWS testing.  I still need to run benchmarks to confirm that perf is at parity with JSC_useProbeOSRExit=false.
Comment 3 Saam Barati 2017-10-03 19:35:00 PDT
Comment on attachment 322619 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=322619&action=review

> Source/JavaScriptCore/assembler/ProbeStack.cpp:99
> +    RELEASE_ASSERT(!dirtyBits);

don't think this is needed since the above loop condition is exactly this.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:50
> +//=======================================================================================

I don't think this is needed to delineate.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:444
> +    // The only reason for using this do while look is so we can break out midway when appropriate.

Nit: I think we've started to use immediately invoked lambdas in this situation, but I'm also ok with the while loop.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:487
> +        if (extraInitializationLevel <= ExtraInitializationLevel::SpeculationRecovery)

I'm confused about this extraInitialzationLevel stuff. Is this something we cache on first exit? Does it actually help with speed? This all seems like straight forward code, so I can't imagine it would help, you don't loop anywhere here.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:557
> +    size_t nextUndefinedSpanIndex = 0;
> +    size_t nextUndefinedOperandIndex = numberOfOperands;
> +    if (numUndefinedOperandSpans)
> +        nextUndefinedOperandIndex = undefinedOperandSpans[nextUndefinedSpanIndex].firstIndex;
> +
> +    JSValue undefined = jsUndefined();
> +    for (size_t spanIndex = 0; spanIndex < numUndefinedOperandSpans; ++spanIndex) {
> +        auto& span = undefinedOperandSpans[spanIndex];
> +        int firstOffset = span.minOffset;
> +        int lastOffset = firstOffset + span.numberOfRegisters;
> +
> +        for (int offset = firstOffset; offset < lastOffset; ++offset)
> +            frame.setOperand(offset, undefined);
> +    }

I thought you said this optimization doesn't help? If it doesn't, I saw we drop it, since it complicates the code.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:671
> +    // The tag registers are needed to materialize recoveries below.

This comment seems like it has a high chance of being wrong since this is C++ code.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:903
> +//=======================================================================================

I don't think this is needed to delineate.
Comment 4 Mark Lam 2017-10-04 11:47:43 PDT
Comment on attachment 322619 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=322619&action=review

Thanks for the review.  My responses below ...

> Source/JavaScriptCore/assembler/ProbeStack.cpp:91
> +    uintptr_t dirtyBits = m_dirtyBits;

I changed this to uint64_t to fix the 32-bit build failure.

>> Source/JavaScriptCore/assembler/ProbeStack.cpp:99
>> +    RELEASE_ASSERT(!dirtyBits);
> 
> don't think this is needed since the above loop condition is exactly this.

I've removed this.

>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:50
>> +//=======================================================================================
> 
> I don't think this is needed to delineate.

I've removed this.

>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:444
>> +    // The only reason for using this do while look is so we can break out midway when appropriate.
> 
> Nit: I think we've started to use immediately invoked lambdas in this situation, but I'm also ok with the while loop.

Though we can express this early breakout more elegantly with a lambda, the last time I checked, clang still does not optimize out lambdas in cases where it should / could.  Since this function appears to be perf sensitive (ala the OSR exit perf regression), I'll stick with the do while loop.

>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:487
>> +        if (extraInitializationLevel <= ExtraInitializationLevel::SpeculationRecovery)
> 
> I'm confused about this extraInitialzationLevel stuff. Is this something we cache on first exit? Does it actually help with speed? This all seems like straight forward code, so I can't imagine it would help, you don't loop anywhere here.

The extraInitialzationLevel is based on my profiling of which of these initialization steps are actually taken during typical benchmarks.  Based on that, I ordered the initialization such that we can skip the least executed steps when not needed.

Regarding the complexity cost vs perf tradeoff, I agree in general.  However, the OSR exit perf issue is like a multi-variable problem.  Once one factor is improved, another one that wasn't previously a significant contributing factor now becomes significant.  So, I'll opt for keeping this for now until all the dust settles (at least until after I've tuned the OSR exit deopt/reopt policy).  I'll remove this complexity then if it is still not contributing to perf with any significance.

>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:557
>> +    }
> 
> I thought you said this optimization doesn't help? If it doesn't, I saw we drop it, since it complicates the code.

Same as above.  I saw that it helps on some measurements and not on others (depending on which variables changed).  I'll keep this for now until the dust settles, and will re-evaluate keeping / removing it then.

>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:671
>> +    // The tag registers are needed to materialize recoveries below.
> 
> This comment seems like it has a high chance of being wrong since this is C++ code.

I'm not sure that this is true because the tag registers are part of the callee saved registers, and we store the callee save registers to the VMEntryFrameCalleeSavesBuffer below in copyCalleeSavesToVMEntryFrameCalleeSavesBuffer().  Keeping this preserves the JIT OSR exit ramp's behavior of initializing these tag registers.  Note: the probe will update these registers to the newly set values when the probe returns as well.  Hence, setting here does have an effect.

>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:903
>> +//=======================================================================================
> 
> I don't think this is needed to delineate.

I've removed this.
Comment 5 Saam Barati 2017-10-04 11:50:42 PDT
Comment on attachment 322619 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=322619&action=review

>>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:671
>>> +    // The tag registers are needed to materialize recoveries below.
>> 
>> This comment seems like it has a high chance of being wrong since this is C++ code.
> 
> I'm not sure that this is true because the tag registers are part of the callee saved registers, and we store the callee save registers to the VMEntryFrameCalleeSavesBuffer below in copyCalleeSavesToVMEntryFrameCalleeSavesBuffer().  Keeping this preserves the JIT OSR exit ramp's behavior of initializing these tag registers.  Note: the probe will update these registers to the newly set values when the probe returns as well.  Hence, setting here does have an effect.

I'm not saying that you don't need to materialize these. All I'm saying is that I don't think these are needed for the recoveries anymore, since they're probably C++ code.
Comment 6 Mark Lam 2017-10-04 11:54:07 PDT
(In reply to Saam Barati from comment #5)
> Comment on attachment 322619 [details]
> proposed patch.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=322619&action=review
> 
> >>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:671
> >>> +    // The tag registers are needed to materialize recoveries below.
> >> 
> >> This comment seems like it has a high chance of being wrong since this is C++ code.
> > 
> > I'm not sure that this is true because the tag registers are part of the callee saved registers, and we store the callee save registers to the VMEntryFrameCalleeSavesBuffer below in copyCalleeSavesToVMEntryFrameCalleeSavesBuffer().  Keeping this preserves the JIT OSR exit ramp's behavior of initializing these tag registers.  Note: the probe will update these registers to the newly set values when the probe returns as well.  Hence, setting here does have an effect.
> 
> I'm not saying that you don't need to materialize these. All I'm saying is
> that I don't think these are needed for the recoveries anymore, since
> they're probably C++ code.

OK, I've removed the comment.
Comment 7 Mark Lam 2017-10-04 12:57:54 PDT
Created attachment 322708 [details]
patch for landing.
Comment 8 Mark Lam 2017-10-04 13:00:34 PDT
Landed in r222871: <http://trac.webkit.org/r222871>.