Bug 226375

Summary: Put the Baseline JIT prologue and op_loop_hint code in JIT thunks.
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: REOPENED ---    
Severity: Normal CC: ews-watchlist, keith_miller, msaboff, rmorisset, saam, tzagallo, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
wip.
ews-feeder: commit-queue-
wip.
ews-feeder: commit-queue-
work in progress.
ews-feeder: commit-queue-
work in progress.
ews-feeder: commit-queue-
work in progress.
none
work in progress.
none
proposed patch.
none
proposed patch.
ews-feeder: commit-queue-
proposed patch.
ews-feeder: commit-queue-
proposed patch.
ews-feeder: commit-queue-
proposed patch.
none
proposed patch.
none
proposed patch.
keith_miller: review+
patch for landing. none

Description Mark Lam 2021-05-28 08:41:51 PDT
And also optimize op_enter a bit more.
Comment 1 Mark Lam 2021-05-28 08:49:28 PDT
Created attachment 430011 [details]
wip.
Comment 2 Mark Lam 2021-05-28 09:09:10 PDT
Created attachment 430015 [details]
wip.
Comment 3 Mark Lam 2021-05-28 10:11:42 PDT
Created attachment 430023 [details]
work in progress.
Comment 4 Mark Lam 2021-05-29 17:39:15 PDT
Created attachment 430119 [details]
work in progress.
Comment 5 Mark Lam 2021-05-29 20:31:00 PDT
Created attachment 430127 [details]
work in progress.
Comment 6 Mark Lam 2021-05-30 01:28:48 PDT
Created attachment 430133 [details]
work in progress.
Comment 7 Mark Lam 2021-06-03 17:58:03 PDT
Created attachment 430522 [details]
proposed patch.

Let's test this on the EWS first.
Comment 8 Radar WebKit Bug Importer 2021-06-04 08:42:18 PDT
<rdar://problem/78870591>
Comment 9 Mark Lam 2021-06-04 15:41:31 PDT
Created attachment 430623 [details]
proposed patch.
Comment 10 Mark Lam 2021-06-04 17:09:22 PDT
Created attachment 430632 [details]
proposed patch.
Comment 11 Mark Lam 2021-06-06 11:41:43 PDT
Created attachment 430691 [details]
proposed patch.
Comment 12 Mark Lam 2021-06-06 11:45:49 PDT
Created attachment 430692 [details]
proposed patch.
Comment 13 Robin Morisset 2021-06-06 23:04:38 PDT
Comment on attachment 430692 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=430692&action=review

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:2238
> +    void negate64(RegisterID dest)

Any reason not to do sub64(ARM64Registers::zr, dest, dest)?

> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:1809
> +            // we'll need to reload it with the high bits of the address afterwards.

I don't see where you do this reloading?

> Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:921
> +    void negate64(RegisterID dest)

There seems to be an opcode exactly for this: https://www.aldeid.com/wiki/X86-assembly/Instructions/neg

> Source/JavaScriptCore/jit/JIT.cpp:894
> +        static constexpr ThunkGenerator generators[] = {

Isn't it already defined a few dozen lines above? How do they differ?
More general the entire block that follows appear duplicated, and I don't understand why.

> Source/JavaScriptCore/jit/JIT.cpp:966
> +    // We'll be imminently returing with a retab in the normal path, which will do validation.

returing -> returning ?
retab -> retag ?
Comment 14 Mark Lam 2021-06-06 23:19:16 PDT
Comment on attachment 430692 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=430692&action=review

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:2238
>> +    void negate64(RegisterID dest)
> 
> Any reason not to do sub64(ARM64Registers::zr, dest, dest)?

No good reason.  It's only because I forgot about ARM64Registers::zr.  Will fix.

>> Source/JavaScriptCore/assembler/MacroAssemblerARMv7.h:1809
>> +            // we'll need to reload it with the high bits of the address afterwards.
> 
> I don't see where you do this reloading?

Stale comment copied from `Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, AbsoluteAddress dest)`.  Does not apply here because we are free to use addressTempRegister in this function.  I'll remove the comment.

>> Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:921
>> +    void negate64(RegisterID dest)
> 
> There seems to be an opcode exactly for this: https://www.aldeid.com/wiki/X86-assembly/Instructions/neg

k, thanks.  I'll look into it.

>> Source/JavaScriptCore/jit/JIT.cpp:894
>> +        static constexpr ThunkGenerator generators[] = {
> 
> Isn't it already defined a few dozen lines above? How do they differ?
> More general the entire block that follows appear duplicated, and I don't understand why.

The table above is for the normal prologue thunks.  This table is for the arity fixup prologue thunks.

>> Source/JavaScriptCore/jit/JIT.cpp:966
>> +    // We'll be imminently returing with a retab in the normal path, which will do validation.
> 
> returing -> returning ?
> retab -> retag ?

Will fix "returing". "retab" is not a typo.  It's the ARM64E instruction for "return with Authentication using the B key".  The ARM64E MacroAssembler automatically emits retab in place of ret when you see the `ret()` call below.
Comment 15 Mark Lam 2021-06-06 23:26:18 PDT
Comment on attachment 430692 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=430692&action=review

>>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:2238
>>> +    void negate64(RegisterID dest)
>> 
>> Any reason not to do sub64(ARM64Registers::zr, dest, dest)?
> 
> No good reason.  It's only because I forgot about ARM64Registers::zr.  Will fix.

Turns out, our MacroAssembler already has the neg64() emitter.  I just didn't think to search for neg64 instead of negate64.  Will remove these redundant negate64 emitters.
Comment 16 Mark Lam 2021-06-06 23:32:05 PDT
Created attachment 430714 [details]
proposed patch.
Comment 17 Mark Lam 2021-06-07 08:03:13 PDT
Created attachment 430746 [details]
proposed patch.
Comment 18 Keith Miller 2021-06-07 08:43:35 PDT
Comment on attachment 430746 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=430746&action=review

r=me with some nits.

> Source/JavaScriptCore/ChangeLog:45
> +        LinkBuffer size results:

Are these number from a specific workload?

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:2313
> +    Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, ImplicitAddress destAddress)

I don't know anything about MIPS so I'm just gonna assume this is correct if tests pass.

> Source/JavaScriptCore/jit/JIT.cpp:799
> +            unsigned startArgument = m_codeBlock->isConstructor() ? 1 : 0;

Nit, I think you don't need the ternary here. True implicitly converts to unsigned 1.

> Source/JavaScriptCore/jit/JIT.cpp:834
> +    unsigned generatorSelector = doesProfiling << 2 | isConstructor << 1 | hasHugeFrame << 0;

Nit: Maybe this should be an inline function next to the macro for future code changes?

> Source/JavaScriptCore/jit/JIT.cpp:898
> +        static constexpr ThunkGenerator generators[] = {
> +#define USE_PROLOGUE_GENERATOR(doesProfiling, isConstructor, hasHugeFrame, name, arityFixupName) arityFixupName,
> +            FOR_EACH_PROLOGUE_GENERATOR(USE_PROLOGUE_GENERATOR)
> +#undef USE_PROLOGUE_GENERATOR
> +        };

How does this not conflict with the generators definition on line 825? O.o

> Source/JavaScriptCore/jit/JIT.cpp:943
> +    store64(codeBlockGPR, addressFor(CallFrameSlot::codeBlock));

Nit: I think this should be storePtr.

> Source/JavaScriptCore/jit/JIT.cpp:966
> +    // We'll be imminently returning with a retab in the normal path, which will do validation.

Nit: retab => retag.

> Source/JavaScriptCore/jit/JIT.cpp:1067
> +#endif

Maybe add an else with a #error "unsupported architecture" for each of these?

> Source/JavaScriptCore/jit/JIT.cpp:1069
> +    store64(codeBlockGPR, addressFor(CallFrameSlot::codeBlock));

Ditto on storePtr.
Comment 19 Keith Miller 2021-06-07 08:44:42 PDT
Comment on attachment 430746 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=430746&action=review

> Source/JavaScriptCore/jit/JIT.cpp:696
> +    v( doesProfiling, !isConstructor, !hasHugeFrame, prologueGenerator4, arityFixup_prologueGenerator4) \
> +    v( doesProfiling, !isConstructor,  hasHugeFrame, prologueGenerator5, arityFixup_prologueGenerator5) \
> +    v( doesProfiling,  isConstructor, !hasHugeFrame, prologueGenerator6, arityFixup_prologueGenerator6) \
> +    v( doesProfiling,  isConstructor,  hasHugeFrame, prologueGenerator7, arityFixup_prologueGenerator7)

Nit: Style comment seems somewhat relevant here.
Comment 20 Mark Lam 2021-06-07 09:37:20 PDT
Comment on attachment 430746 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=430746&action=review

>> Source/JavaScriptCore/ChangeLog:45
>> +        LinkBuffer size results:
> 
> Are these number from a specific workload?

This is from a single run of Speedometer2.  I will update the ChangeLog to say so.

>> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:2313
>> +    Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, ImplicitAddress destAddress)
> 
> I don't know anything about MIPS so I'm just gonna assume this is correct if tests pass.

Ditto.  FWIW, this implementation is based on `Jump branchAdd32(ResultCondition cond, TrustedImm32 imm, AbsoluteAddress destAddress)` but tweaked based on the implementation of `void load32(ImplicitAddress address, RegisterID dest)` and `void store32(RegisterID src, ImplicitAddress address)` plus a little googled knowledge: (1) MIPS (32-bit) instructions are always 4 bytes in size, and (2) the integers in the 2nd argument of conditional branches are the number of instructions between the branch instruction and the target.

>> Source/JavaScriptCore/jit/JIT.cpp:696
>> +    v( doesProfiling,  isConstructor,  hasHugeFrame, prologueGenerator7, arityFixup_prologueGenerator7)
> 
> Nit: Style comment seems somewhat relevant here.

I deliberately kept the columns aligned (in violation of the style) because this makes it easier to verify (both for me and future readers of the code) that I got the permutations correct.  I'll leave it as is.

>> Source/JavaScriptCore/jit/JIT.cpp:799
>> +            unsigned startArgument = m_codeBlock->isConstructor() ? 1 : 0;
> 
> Nit, I think you don't need the ternary here. True implicitly converts to unsigned 1.

That's true, but I think that this makes it clear that we want to start from argument 1 if isConstructor and argument 0 if otherwise, and not require the cognitive leap to realize that true == 1 (small as the cognitive leap may be).  I'm going to leave this as is.

>> Source/JavaScriptCore/jit/JIT.cpp:834
>> +    unsigned generatorSelector = doesProfiling << 2 | isConstructor << 1 | hasHugeFrame << 0;
> 
> Nit: Maybe this should be an inline function next to the macro for future code changes?

Sounds good.  Will do.

>> Source/JavaScriptCore/jit/JIT.cpp:898
>> +        };
> 
> How does this not conflict with the generators definition on line 825? O.o

Nope.  This one is inside an if statement block i.e. nested scope.

>> Source/JavaScriptCore/jit/JIT.cpp:943
>> +    store64(codeBlockGPR, addressFor(CallFrameSlot::codeBlock));
> 
> Nit: I think this should be storePtr.

Will fix.

>> Source/JavaScriptCore/jit/JIT.cpp:966
>> +    // We'll be imminently returning with a retab in the normal path, which will do validation.
> 
> Nit: retab => retag.

Nope. retab is ARM64E's "return with Authentication using the B key".   This is the 2nd time someone got confused about this (Robin asked first).  I will update the comment to clarify.  Here's the updated comment:

    // We'll be imminently returning with a `retab` (ARM64E's return with authentication
    // using the B key) in the normal path (see MacroAssemblerARM64E's implementation of
    // ret()), which will do validation. So, extra validation here is redundant and unnecessary.

>> Source/JavaScriptCore/jit/JIT.cpp:1067
>> +#endif
> 
> Maybe add an else with a #error "unsupported architecture" for each of these?

If we do this, there will be many many places where we'll have to add this, and make the code unsightly.  As a compromise, I'll add the following at the top of JIT.cpp to alert the port developer:

#if ENABLE(EXTRA_CTI_THUNKS)
#if CPU(ARM64) || (CPU(X86_64) && !OS(WINDOWS))
// These are supported ports.
#else
// This is a courtesy reminder (and warning) that the implementation of EXTRA_CTI_THUNKS can
// use up to 6 argument registers and/or 6/7 temp registers, and make use of ARM64 like
// features. Hence, it may not work for many other ports without significant work. If you
// plan on adding EXTRA_CTI_THUNKS support for your port, please remember to search the
// EXTRA_CTI_THUNKS code for CPU(ARM64) and CPU(X86_64) conditional code, and add support
// for your port there as well.
#error "unsupported architecture"
#endif
#endif // ENABLE(EXTRA_CTI_THUNKS)

>> Source/JavaScriptCore/jit/JIT.cpp:1069
>> +    store64(codeBlockGPR, addressFor(CallFrameSlot::codeBlock));
> 
> Ditto on storePtr.

Will fix.
Comment 21 Mark Lam 2021-06-07 09:44:00 PDT
Created attachment 430756 [details]
patch for landing.
Comment 22 Mark Lam 2021-06-07 15:52:31 PDT
Thanks for the reviews.  Landed in r278576: <http://trac.webkit.org/r278576>.
Comment 23 Mark Lam 2021-06-19 01:26:09 PDT
Reverted patch in r279049: <http://trac.webkit.org/r279049> dues to suspected Speedometer2 regression.