Bug 149600

Summary: Factoring out op_sub baseline code generation into JITSubGenerator
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, commit-queue, fpizlo, ggaren, keith_miller, msaboff, sbarati, ysuzuki
Priority: P2    
Version: WebKit Local Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 149729    
Bug Blocks:    
Attachments:
Description Flags
work in progress 1: experimental prototype for op_add and op_sub
none
32-bit (X86) benchmark results
none
64-bit (X86_64) benchmark results
none
the patch.
fpizlo: review-
patch 2: addressed feedback
ggaren: review+
32-bit (x86) benchmark results
none
64-bit (x86_64) benchmark results none

Description Mark Lam 2015-09-28 11:27:55 PDT
Currently, a lot of DFG and FTL code generation for JS operations only handles operands that are of a primitive type like int32 and double.  This is the case even when type profiling predicts that the operands will be of a non-number or polymorphic type.  If the operand is not an int32 or double, the generated DFG/FTL code will currently OSR exit to the baseline JIT.  In contrast, the baseline JIT does have code to handle all types of operands (either via generated code or via a call to a C++ runtime helper).  We should factor out the baseline code that handles these polymorphic operands into re-targettable snippet generators, and allow the DFG and FTL to use them to generate code to handle polymorphic operands when the type profiling data calls for it.  This will prevent OSR exits due to operations on polymorphic operands when they can be predicted.
Comment 1 Mark Lam 2015-09-28 11:40:44 PDT
Created attachment 262014 [details]
work in progress 1: experimental prototype for op_add and op_sub

Archiving my work in progress (1): it builds.  There's an implementation for op_add and op_sub, but there's a bug in there that I need to debug.

The style of using "_" as a local (in function only) abbreviation for "m_jit" is also experimental.  The purpose for this is to make the code more readable from the perspective of making the code less verbose so that the reader can focus on what the generator functions are doing.  Let me know if you like it, hate it, or don't care either way.  Thanks.

After fixing the bug, I expect clean up work will be needed make the code more presentable and to address style issues.

My immediate next steps:
1. fix the bug on the op_sub implementation.
2. if it's easy, test the fix on op_add as well.
3. make the DFG use the op_sub generator to see how hard / easy it is to make a DFGSnippetJIT that can use these generators.
Comment 2 Mark Lam 2015-09-28 11:43:47 PDT
(In reply to comment #1)
> After fixing the bug, I expect clean up work will be needed make the code
> more presentable and to address style issues.

I also have lots of //@ annotations to help me personally grok the code.  I will remove these in the final clean up.
Comment 3 Mark Lam 2015-09-29 13:18:16 PDT
Update: I talked to Filip offline, and we'll be changing the strategy a bit.  In general, we'll try to replicate the strategy of the JITInlineCacheGenerator.  Here are the details on what's different:

1. The snippet generator won't use a register allocator interface (fake or otherwise).  Instead, all relevant registers will be passed in to the generator constructor.
2. A used register set will also be passed in to the generator constructor if it needs more than 1 scratch register.  It will use the ScratchRegisterAllocator to allocate and restore the scratch register if needed.
3. The generator will emit code using a MacroAssembler instead of a JIT.
4. The generator will not emit any slow path calls.  Instead, it will return a JumpList that the client JIT will link to in the slow path where it emits the C runtime helper call.
5. The generator will only generate 1 blob of code that handles all "easy" (or "easier") cases of operands.  The true slow case will now only be the C runtime helper call.
   As a result, we'll be moving some code in the existing baseline slow path into their respective fast path.  Only the C runtime helper call will remain in the slow path.

As a future exercise (not for this pass), we will change the generators to only emit the easy case (integer operands only) in the fast path.  The generator will emit the "less easy" cases (e.g. integer overflow, mixed integers and doubles operands, etc) in an inline cache, with the a C runtime helper call as the last resort.

I'll update the patch to reflect these changes.
Comment 4 Filip Pizlo 2015-09-29 13:25:35 PDT
(In reply to comment #3)
> Update: I talked to Filip offline, and we'll be changing the strategy a bit.
> In general, we'll try to replicate the strategy of the
> JITInlineCacheGenerator.  Here are the details on what's different:
> 
> 1. The snippet generator won't use a register allocator interface (fake or
> otherwise).  Instead, all relevant registers will be passed in to the
> generator constructor.
> 2. A used register set will also be passed in to the generator constructor
> if it needs more than 1 scratch register.  It will use the
> ScratchRegisterAllocator to allocate and restore the scratch register if
> needed.
> 3. The generator will emit code using a MacroAssembler instead of a JIT.

You might find it helpful to use CCallHelpers or AssemblyHelpers, since it already contains methods for doing boxing/unboxing and type checks.

> 4. The generator will not emit any slow path calls.  Instead, it will return
> a JumpList that the client JIT will link to in the slow path where it emits
> the C runtime helper call.
> 5. The generator will only generate 1 blob of code that handles all "easy"
> (or "easier") cases of operands.  The true slow case will now only be the C
> runtime helper call.
>    As a result, we'll be moving some code in the existing baseline slow path
> into their respective fast path.  Only the C runtime helper call will remain
> in the slow path.
> 
> As a future exercise (not for this pass), we will change the generators to
> only emit the easy case (integer operands only) in the fast path.  The
> generator will emit the "less easy" cases (e.g. integer overflow, mixed
> integers and doubles operands, etc) in an inline cache, with the a C runtime
> helper call as the last resort.
> 
> I'll update the patch to reflect these changes.

SGTM.
Comment 5 Mark Lam 2015-10-01 14:52:50 PDT
Re-purposing this bug for implementing the op_sub snippet generator.
Comment 6 Mark Lam 2015-10-01 16:09:22 PDT
Created attachment 262295 [details]
32-bit (X86) benchmark results
Comment 7 Mark Lam 2015-10-01 16:09:50 PDT
Created attachment 262296 [details]
64-bit (X86_64) benchmark results
Comment 8 Mark Lam 2015-10-01 16:11:35 PDT
The benchmarks results (attached) show no significant difference in perf (for both 32-bit and 64-bit) for the patch that I will upload shortly.
Comment 9 Mark Lam 2015-10-01 17:05:38 PDT
Created attachment 262300 [details]
the patch.
Comment 10 WebKit Commit Bot 2015-10-01 17:08:09 PDT
Attachment 262300 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:45:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:46:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:47:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:48:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:49:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:50:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:51:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:52:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:53:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
Total errors found: 9 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 11 Mark Lam 2015-10-01 17:10:05 PDT
Comment on attachment 262300 [details]
the patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=262300&action=review

> Source/JavaScriptCore/jit/AssemblyHelpers.h:1186
> +        return branchTest64(Zero, nonIntValue.payloadGPR(), GPRInfo::tagTypeNumberRegister);

Note that, on the 64-bit port, there's no way to test if the value is not a double.  To truly test for that, we also need to verify that the value also does not have an int tag.  However, the only places where we use this emitter is where the value is known to be not an int (hence the param name "nonIntValue"). This code works under this condition that the value is a non-Int value.

If you have a better idea for how we can improve this to be more robust, please chime in.  Thanks.
Comment 12 Geoffrey Garen 2015-10-01 17:27:30 PDT
There's quite a bit of variance in these benchmark results -- enough to mask a 2%-4% performance change. The variance I see when benchmarking on a quiet system is much lower. Is there something you can do to lower the variance?
Comment 13 Geoffrey Garen 2015-10-01 17:52:38 PDT
Comment on attachment 262300 [details]
the patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=262300&action=review

I think you're going to break some ports by removing the check for "supportsFloatingPoint()". Please add that back.

> Source/JavaScriptCore/jit/AssemblyHelpers.h:1183
> +    Jump emitJumpIfNotDouble(JSValueRegs nonIntValue)

Perhaps you can make this function less convenient to call wrong by requiring the caller to pass in the non-int jump that it should have done before this jump:

// Caller must check for integer before calling this function.
Jump emitJumpIfNotDouble(Jump& notIntJump, JSValueRegs notIntValue)
{
    UNUSED_PARAM(notIntJump);
}

> Source/JavaScriptCore/jit/JITSubGenerator.h:34
> +    typedef MacroAssembler Masm;

It is bad when one thing becomes two.

Let's not do this.
Comment 14 Filip Pizlo 2015-10-01 19:25:16 PDT
Comment on attachment 262300 [details]
the patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=262300&action=review

> Source/JavaScriptCore/jit/AssemblyHelpers.h:1181
> +    Jump emitJumpIfNotInt(JSValueRegs value)
> +    {
> +#if USE(JSVALUE64)
> +        return branch64(Below, value.payloadGPR(), GPRInfo::tagTypeNumberRegister);
> +#else
> +        return branch32(NotEqual, value.tagGPR(), TrustedImm32(JSValue::Int32Tag));
> +#endif
> +    }

Is this exactly the same as branchIfNotInt32()?

> Source/JavaScriptCore/jit/AssemblyHelpers.h:1190
> +    Jump emitJumpIfNotDouble(JSValueRegs nonIntValue)
> +    {
> +#if USE(JSVALUE64)
> +        return branchTest64(Zero, nonIntValue.payloadGPR(), GPRInfo::tagTypeNumberRegister);
> +#else
> +        return branch32(Above, nonIntValue.tagGPR(), TrustedImm32(JSValue::LowestTag));
> +#endif
> +    }

Is this exactly the same as branchIfNotNumber()?

We usually branchIfNotInt32() and then branchIfNotNumber() instead of directly doing branchIfNotDouble(), since the latter is much more expensive on 64-bit.

>> Source/JavaScriptCore/jit/JITSubGenerator.h:34
>> +    typedef MacroAssembler Masm;
> 
> It is bad when one thing becomes two.
> 
> Let's not do this.

Agreed.

> Source/JavaScriptCore/jit/JITSubGenerator.h:44
> +    JITSubGenerator(JSValueRegs result, JSValueRegs left, JSValueRegs right,
> +        ResultType leftType, ResultType rightType, FPRReg leftFPR, FPRReg rightFPR,
> +        FPRReg scratchFPR, ResultBoxing resultBoxing = Unboxed)

When would resultBoxing be anything but Boxed?

> Source/JavaScriptCore/jit/JITSubGenerator.h:54
> +    { }

I think that we usually put the } on the next line.
Comment 15 Mark Lam 2015-10-01 23:20:38 PDT
Comment on attachment 262300 [details]
the patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=262300&action=review

>> Source/JavaScriptCore/jit/AssemblyHelpers.h:1181
>> +    }
> 
> Is this exactly the same as branchIfNotInt32()?

I made the mistake of not searching for this (since the baseline JIT code I'm replacing actually does not actually use this).  Will fix.

>> Source/JavaScriptCore/jit/AssemblyHelpers.h:1183
>> +    Jump emitJumpIfNotDouble(JSValueRegs nonIntValue)
> 
> Perhaps you can make this function less convenient to call wrong by requiring the caller to pass in the non-int jump that it should have done before this jump:
> 
> // Caller must check for integer before calling this function.
> Jump emitJumpIfNotDouble(Jump& notIntJump, JSValueRegs notIntValue)
> {
>     UNUSED_PARAM(notIntJump);
> }

This is a good idea.  But since there's a branchIfNotNumber() (as Fil pointed out) that I can use instead, the issue is now moot.  I'll switch to using branchIfNotNumber().

>> Source/JavaScriptCore/jit/AssemblyHelpers.h:1190
>> +    }
> 
> Is this exactly the same as branchIfNotNumber()?
> 
> We usually branchIfNotInt32() and then branchIfNotNumber() instead of directly doing branchIfNotDouble(), since the latter is much more expensive on 64-bit.

Ditto.  Will fix.

>>> Source/JavaScriptCore/jit/JITSubGenerator.h:34
>>> +    typedef MacroAssembler Masm;
>> 
>> It is bad when one thing becomes two.
>> 
>> Let's not do this.
> 
> Agreed.

My intent was just to get more compact code.  Masm is obviously short for MacroAssembler.  Anyway, I'll switch to using CCallHelpers instead, which works since JITSubGenerator works with a CCallHelpers instance.

>> Source/JavaScriptCore/jit/JITSubGenerator.h:44
>> +        FPRReg scratchFPR, ResultBoxing resultBoxing = Unboxed)
> 
> When would resultBoxing be anything but Boxed?

I was thinking of the DFG.  For example, SpeculativeJIT::compileArithSub() stores unboxed results.  But on second thought, I think I missed a detail.  The DFG will only use this generator for cases where the operands are not int32 or double.  Hence, the result should be a JSValue.  Did I get it right this time?

I'll fix it to always produce a boxed result.

>> Source/JavaScriptCore/jit/JITSubGenerator.h:54
>> +    { }
> 
> I think that we usually put the } on the next line.

A grep of Source/JavaScriptCore shows that we do this in many places:

Source/JavaScriptCore$ grep -rn '    { }' *
assembler/MacroAssemblerPrinter.h:147:    { }
assembler/MacroAssemblerPrinter.h:151:    { }
assembler/MacroAssemblerPrinter.h:155:    { }
bytecode/CodeBlock.cpp:3201:    { }
bytecode/VirtualRegister.h:53:    { }
bytecode/VirtualRegister.h:57:    { }
debugger/DebuggerCallFrame.cpp:66:    { }
dfg/DFGCommonData.h:79:    { }
ftl/FTLInlineCacheDescriptor.h:42:    { }
interpreter/CallFrame.h:44:        { }
interpreter/CallFrame.h:48:        { }
runtime/ArrayBuffer.h:47:    { }
runtime/ArrayBuffer.h:58:    { }
runtime/ControlFlowProfiler.h:42:    { }
runtime/ControlFlowProfiler.h:47:    { }
runtime/ControlFlowProfiler.h:52:    { }
runtime/Error.cpp:114:    { }
runtime/ErrorInstance.cpp:103:    { }
runtime/Lookup.h:57:        { }
runtime/Lookup.h:60:        { }
runtime/TypeProfiler.h:47:    { }
runtime/TypeProfiler.h:52:    { }
runtime/TypeProfiler.h:57:    { }

A grep of Source/WebCore shows that this is used in many places as well:

Source/WebCore$ grep -rn '    { }' * | wc
     117     354    5397

I would like to keep using it.
Comment 16 Mark Lam 2015-10-01 23:21:13 PDT
(In reply to comment #13)
> Comment on attachment 262300 [details]
> the patch.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=262300&action=review
> 
> I think you're going to break some ports by removing the check for
> "supportsFloatingPoint()". Please add that back.

I'll look into this.
Comment 17 Mark Lam 2015-10-06 12:41:56 PDT
Created attachment 262537 [details]
patch 2: addressed feedback
Comment 18 Mark Lam 2015-10-06 12:45:23 PDT
Created attachment 262538 [details]
32-bit (x86) benchmark results
Comment 19 Mark Lam 2015-10-06 12:45:47 PDT
Created attachment 262539 [details]
64-bit (x86_64) benchmark results
Comment 20 WebKit Commit Bot 2015-10-06 12:47:13 PDT
Attachment 262537 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:39:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:40:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:41:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:42:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:43:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:44:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:45:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:46:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
ERROR: Source/JavaScriptCore/jit/JITSubGenerator.h:47:  Wrong number of spaces before statement. (expected: 12)  [whitespace/indent] [4]
Total errors found: 9 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 21 Geoffrey Garen 2015-10-06 15:20:44 PDT
Comment on attachment 262537 [details]
patch 2: addressed feedback

r=me
Comment 22 Mark Lam 2015-10-06 15:30:28 PDT
Thanks.  Landed in r190649: <http://trac.webkit.org/r190649>.