Bug 238411

Summary: [JSC] Include argumentRegisters in identity of SlowPathCallKey when clobberAllRegsInFTLICSlowPath is enabled
Product: WebKit Reporter: Yusuke Suzuki <ysuzuki>
Component: New BugsAssignee: Yusuke Suzuki <ysuzuki>
Status: RESOLVED FIXED    
Severity: Normal CC: ews-watchlist, keith_miller, mark.lam, msaboff, saam, tzagallo, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch mark.lam: review+, ews-feeder: commit-queue-

Description Yusuke Suzuki 2022-03-25 23:24:34 PDT
[JSC] Remove clobberAllRegsInFTLICSlowPath
Comment 1 Yusuke Suzuki 2022-03-26 00:15:34 PDT
Created attachment 455835 [details]
Patch
Comment 2 Mark Lam 2022-03-26 03:22:57 PDT
Comment on attachment 455835 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=455835&action=review

r=me with suggestions.

> Source/JavaScriptCore/ChangeLog:16
> +        While SlowPathCallKey includes argumentRegisters, it is not used for identity check. But this argumentRegisters
> +        is only effective to the resulted code in FTLThunks only when Options::clobberAllRegsInFTLICSlowPath is specified.
> +        Just including argumentRegisters will cause code size regression since we will lose a chance to duplicate thunks
> +        while argumentRegisters is unrelated when Options::clobberAllRegsInFTLICSlowPath is not true. This bug causes
> +        x64 Debug JSC failures after enabling DataIC because the same FTLThunks should not be picked for the different
> +        argument registers when Options::clobberAllRegsInFTLICSlowPath is true.
> +
> +        In this patch, we make argumentRegisters in SlowPathCallKey effective only when Options::clobberAllRegsInFTLICSlowPath
> +        is specified. And we also refactor SlowPathCallKey to reduce size of it from 40 to 24 on x64.

I suggest rephrasing this as:

"While SlowPathCallKey includes argumentRegisters, it is not used for its identity check. But this argumentRegisters
is effectual on the resulting code in FTLThunks if Options::clobberAllRegsInFTLICSlowPath is set. This causes
x64 Debug JSC test failures after enabling DataIC because the same FTLThunks should not be picked for different
argument registers when Options::clobberAllRegsInFTLICSlowPath is true.

However, always including argumentRegisters in the identity check will cause a code size regression since we will
lose a chance to duplicate thunks when argumentRegisters is ineffectual. Note that Options::clobberAllRegsInFTLICSlowPath
is only set for debugging use cases. Hence, argumentRegisters is normally not effectual.

In this patch, we include argumentRegisters in SlowPathCallKey's identity check only when Options::clobberAllRegsInFTLICSlowPath
is set.  And we also refactor SlowPathCallKey to reduce size of it from 40 to 24 on x64."

If you choose to make m_offset a bitfield (see below), you should drop the "on x64" part here.

> Source/JavaScriptCore/assembler/MacroAssemblerCodeRef.h:139
> +    constexpr FunctionPtr(std::nullptr_t) { };

The semicolon is not needed.

> Source/JavaScriptCore/ftl/FTLSlowPathCall.cpp:120
> +    if (Options::clobberAllRegsInFTLICSlowPath())

Use UNLIKELY?

> Source/JavaScriptCore/ftl/FTLSlowPathCall.cpp:128
> +    if (Options::clobberAllRegsInFTLICSlowPath())

Use UNLIKELY?

> Source/JavaScriptCore/ftl/FTLSlowPathCallKey.h:129
> +        return PtrHash<void*>::hash(callTarget().executableAddress()) + m_offset + m_usedRegisters.hash() + indirectOffset() + static_cast<unsigned>(m_type);

Maybe add a comment above here saying that "m_numberOfUsedArgumentRegistersIfClobberingCheckIsEnabled is intentionally not included because it will always be 0 unless Options::clobberAllRegsInFTLICSlowPath() is set, and Options::clobberAllRegsInFTLICSlowPath() is only set in debugging use cases."

> Source/JavaScriptCore/ftl/FTLSlowPathCallKey.h:139
>      ptrdiff_t m_offset { 0 };

I think you can make SlowPathCallKey fit in 24 bytes too on ARM64 if you move this to the end and make it a 48 bit bitfield, which should be more than enough bits.  You can probably even make it a 32-bit bitfield given that the value of this ptrdiff_t is computed from number of args + number of registers if I understand the code correctly.  If you do make this a bitfield, I recommend adding a debug ASSERT in the constructor to verify that the incoming value used to initialize this does indeed fit in the bitfield.

> Source/JavaScriptCore/ftl/FTLThunks.cpp:203
> +    if (Options::clobberAllRegsInFTLICSlowPath()) {

Use UNLIKELY?
Comment 3 Yusuke Suzuki 2022-03-26 04:15:00 PDT
Comment on attachment 455835 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=455835&action=review

Thanks!

>> Source/JavaScriptCore/ChangeLog:16
>> +        is specified. And we also refactor SlowPathCallKey to reduce size of it from 40 to 24 on x64.
> 
> I suggest rephrasing this as:
> 
> "While SlowPathCallKey includes argumentRegisters, it is not used for its identity check. But this argumentRegisters
> is effectual on the resulting code in FTLThunks if Options::clobberAllRegsInFTLICSlowPath is set. This causes
> x64 Debug JSC test failures after enabling DataIC because the same FTLThunks should not be picked for different
> argument registers when Options::clobberAllRegsInFTLICSlowPath is true.
> 
> However, always including argumentRegisters in the identity check will cause a code size regression since we will
> lose a chance to duplicate thunks when argumentRegisters is ineffectual. Note that Options::clobberAllRegsInFTLICSlowPath
> is only set for debugging use cases. Hence, argumentRegisters is normally not effectual.
> 
> In this patch, we include argumentRegisters in SlowPathCallKey's identity check only when Options::clobberAllRegsInFTLICSlowPath
> is set.  And we also refactor SlowPathCallKey to reduce size of it from 40 to 24 on x64."
> 
> If you choose to make m_offset a bitfield (see below), you should drop the "on x64" part here.

Changed.

>> Source/JavaScriptCore/assembler/MacroAssemblerCodeRef.h:139
>> +    constexpr FunctionPtr(std::nullptr_t) { };
> 
> The semicolon is not needed.

Fixed.

>> Source/JavaScriptCore/ftl/FTLSlowPathCall.cpp:120
>> +    if (Options::clobberAllRegsInFTLICSlowPath())
> 
> Use UNLIKELY?

Fixed.

>> Source/JavaScriptCore/ftl/FTLSlowPathCall.cpp:128
>> +    if (Options::clobberAllRegsInFTLICSlowPath())
> 
> Use UNLIKELY?

Fixed.

>> Source/JavaScriptCore/ftl/FTLSlowPathCallKey.h:129
>> +        return PtrHash<void*>::hash(callTarget().executableAddress()) + m_offset + m_usedRegisters.hash() + indirectOffset() + static_cast<unsigned>(m_type);
> 
> Maybe add a comment above here saying that "m_numberOfUsedArgumentRegistersIfClobberingCheckIsEnabled is intentionally not included because it will always be 0 unless Options::clobberAllRegsInFTLICSlowPath() is set, and Options::clobberAllRegsInFTLICSlowPath() is only set in debugging use cases."

Added.

>> Source/JavaScriptCore/ftl/FTLSlowPathCallKey.h:139
>>      ptrdiff_t m_offset { 0 };
> 
> I think you can make SlowPathCallKey fit in 24 bytes too on ARM64 if you move this to the end and make it a 48 bit bitfield, which should be more than enough bits.  You can probably even make it a 32-bit bitfield given that the value of this ptrdiff_t is computed from number of args + number of registers if I understand the code correctly.  If you do make this a bitfield, I recommend adding a debug ASSERT in the constructor to verify that the incoming value used to initialize this does indeed fit in the bitfield.

Sounds good!

>> Source/JavaScriptCore/ftl/FTLThunks.cpp:203
>> +    if (Options::clobberAllRegsInFTLICSlowPath()) {
> 
> Use UNLIKELY?

Fixed.
Comment 4 Yusuke Suzuki 2022-03-26 04:38:35 PDT
Committed r291935 (248906@trunk): <https://commits.webkit.org/248906@trunk>
Comment 5 Radar WebKit Bug Importer 2022-03-26 04:39:17 PDT
<rdar://problem/90875203>