Bug 228543 - [JSC] load/store with BaseIndex is inefficient in ARM64
Summary: [JSC] load/store with BaseIndex is inefficient in ARM64
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Yusuke Suzuki
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-07-28 01:49 PDT by Yusuke Suzuki
Modified: 2021-07-28 13:24 PDT (History)
7 users (show)

See Also:


Attachments
Patch (27.33 KB, patch)
2021-07-28 01:51 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (27.56 KB, patch)
2021-07-28 10:32 PDT, Yusuke Suzuki
mark.lam: review+
ews-feeder: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yusuke Suzuki 2021-07-28 01:49:56 PDT
[JSC] load/store with BaseIndex is inefficient in ARM64
Comment 1 Yusuke Suzuki 2021-07-28 01:51:21 PDT
Created attachment 434411 [details]
Patch
Comment 2 Mark Lam 2021-07-28 09:08:09 PDT
Comment on attachment 434411 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434411&action=review

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1433
> +        if (!address.scale || address.scale == 3) {

Can you use the Scale enum values instead here?
    if (address.scale == TimesOne || address.scale == TimesEight) {

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1435
> +            if (auto baseGPR = tryFoldBaseAndOffsetPart(address)) {
> +                m_assembler.ldr<64>(dest, baseGPR.value(), address.index, indexExtendType(address), address.scale);

I think you can do better: if (scale == TimesEight && isUInt12(address.offset / 8)), then you can use the `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` form and just issue 1 instruction to do this load.  See also section C6.2.101 of the spec.

See also LDUR in section C6.2.136 if scale == TimesOne.
Comment 3 Mark Lam 2021-07-28 09:10:38 PDT
Comment on attachment 434411 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434411&action=review

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1435
>> +                m_assembler.ldr<64>(dest, baseGPR.value(), address.index, indexExtendType(address), address.scale);
> 
> I think you can do better: if (scale == TimesEight && isUInt12(address.offset / 8)), then you can use the `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` form and just issue 1 instruction to do this load.  See also section C6.2.101 of the spec.
> 
> See also LDUR in section C6.2.136 if scale == TimesOne.

I should clarify: my point is that you may be able to do this with just 1 instruction instead of 2.  Ditto for other cases below.  So, I think it's worth checking the spec manual for available instructions.
Comment 4 Yusuke Suzuki 2021-07-28 10:31:13 PDT
Comment on attachment 434411 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434411&action=review

>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1433
>> +        if (!address.scale || address.scale == 3) {
> 
> Can you use the Scale enum values instead here?
>     if (address.scale == TimesOne || address.scale == TimesEight) {

Changed.

>>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1435
>>> +                m_assembler.ldr<64>(dest, baseGPR.value(), address.index, indexExtendType(address), address.scale);
>> 
>> I think you can do better: if (scale == TimesEight && isUInt12(address.offset / 8)), then you can use the `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` form and just issue 1 instruction to do this load.  See also section C6.2.101 of the spec.
>> 
>> See also LDUR in section C6.2.136 if scale == TimesOne.
> 
> I should clarify: my point is that you may be able to do this with just 1 instruction instead of 2.  Ditto for other cases below.  So, I think it's worth checking the spec manual for available instructions.

I don't think it is available since here there is base and index registers (since this is BaseIndex). But `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` only takes base register.
So if we would like to use `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)`, then we first need to fold index into base. This means that it requires 2 instructions anyway.
Comment 5 Yusuke Suzuki 2021-07-28 10:32:09 PDT
Created attachment 434434 [details]
Patch
Comment 6 Mark Lam 2021-07-28 11:35:32 PDT
Comment on attachment 434434 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434434&action=review

r=me with suggestions.

> Source/JavaScriptCore/assembler/testmasm.cpp:4224
> +        uint64_t array[] = { 1, 2, 3, 4, 5, };
> +        CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 3);

nit: can you add  INT64_MAX to each of the values in array so that this test will also catch bugs where the load instruction doesn't load the full 8 bytes?

> Source/JavaScriptCore/assembler/testmasm.cpp:4234
> +        uint64_t array[] = { 1, 2, 3, 4, 5, };
> +        CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 5);

Ditto.

> Source/JavaScriptCore/assembler/testmasm.cpp:4247
> +        uint32_t array[] = { 1, 2, 3, 4, 5, };
> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3);

Ditto: add INT32_MAX.

> Source/JavaScriptCore/assembler/testmasm.cpp:4257
> +        uint32_t array[] = { 1, 2, 3, 4, 5, };
> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 5);

Ditto.

> Source/JavaScriptCore/assembler/testmasm.cpp:4269
> +        uint16_t array[] = { 1, 2, 3, 4, 5, };
> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3);

Ditto: add INT16_MAX.

> Source/JavaScriptCore/assembler/testmasm.cpp:4279
> +        uint16_t array[] = { 1, 2, 3, 4, static_cast<uint16_t>(-1), };
> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 0xffff);

Ditto.

> Source/JavaScriptCore/assembler/testmasm.cpp:4291
> +        uint16_t array[] = { 1, 2, 3, 4, 5, };
> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3);

nit: add (INT16_MAX - 10)?

> Source/JavaScriptCore/assembler/testmasm.cpp:4301
> +        uint16_t array[] = { 1, 2, 3, 4, static_cast<uint16_t>(-1), };
> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), static_cast<uint32_t>(-1));

nit: add (INT16_MAX - 10) but make leave the last one as -1?

> Source/JavaScriptCore/assembler/testmasm.cpp:4406
> +        invoke<void>(test, array, 3, 42);
> +        CHECK_EQ(array[2], 42);

nit: Add INT64_MAX to 42.

> Source/JavaScriptCore/assembler/testmasm.cpp:4417
> +        invoke<void>(test, array, 3, 42);
> +        CHECK_EQ(array[4], 42);

Ditto.

> Source/JavaScriptCore/assembler/testmasm.cpp:4431
> +        invoke<void>(test, array, 3, 42);
> +        CHECK_EQ(array[2], 42);

nit: Add INT32_MAX to 42.

> Source/JavaScriptCore/assembler/testmasm.cpp:4442
> +        invoke<void>(test, array, 3, 42);
> +        CHECK_EQ(array[4], 42);

Ditto.

> Source/JavaScriptCore/assembler/testmasm.cpp:4455
> +        invoke<void>(test, array, 3, 42);
> +        CHECK_EQ(array[2], 42);

nit: Add INT16_MAX to 42.

> Source/JavaScriptCore/assembler/testmasm.cpp:4466
> +        invoke<void>(test, array, 3, 42);
> +        CHECK_EQ(array[4], 42);

Ditto.
Comment 7 Yusuke Suzuki 2021-07-28 11:47:08 PDT
Comment on attachment 434434 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434434&action=review

Thanks!

>> Source/JavaScriptCore/assembler/testmasm.cpp:4224
>> +        CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 3);
> 
> nit: can you add  INT64_MAX to each of the values in array so that this test will also catch bugs where the load instruction doesn't load the full 8 bytes?

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4234
>> +        CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 5);
> 
> Ditto.

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4247
>> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3);
> 
> Ditto: add INT32_MAX.

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4257
>> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 5);
> 
> Ditto.

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4279
>> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 0xffff);
> 
> Ditto.

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4291
>> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3);
> 
> nit: add (INT16_MAX - 10)?

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4301
>> +        CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), static_cast<uint32_t>(-1));
> 
> nit: add (INT16_MAX - 10) but make leave the last one as -1?

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4406
>> +        CHECK_EQ(array[2], 42);
> 
> nit: Add INT64_MAX to 42.

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4417
>> +        CHECK_EQ(array[4], 42);
> 
> Ditto.

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4431
>> +        CHECK_EQ(array[2], 42);
> 
> nit: Add INT32_MAX to 42.

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4442
>> +        CHECK_EQ(array[4], 42);
> 
> Ditto.

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4455
>> +        CHECK_EQ(array[2], 42);
> 
> nit: Add INT16_MAX to 42.

Changed.

>> Source/JavaScriptCore/assembler/testmasm.cpp:4466
>> +        CHECK_EQ(array[4], 42);
> 
> Ditto.

Changed.
Comment 8 Yusuke Suzuki 2021-07-28 11:49:36 PDT
Committed r280391 (240032@main): <https://commits.webkit.org/240032@main>
Comment 9 Radar WebKit Bug Importer 2021-07-28 11:50:19 PDT
<rdar://problem/81227865>
Comment 10 Yusuke Suzuki 2021-07-28 13:23:08 PDT
There is 0% chance that https://ews-build.webkit.org/#/builders/56/builds/12242 failures are related to this patch since

1. This is x64 EWS
2. This patch does nothing on x64. It only affects on ARM64.
Comment 11 Yusuke Suzuki 2021-07-28 13:24:20 PDT
(In reply to Yusuke Suzuki from comment #10)
> There is 0% chance that
> https://ews-build.webkit.org/#/builders/56/builds/12242 failures are related
> to this patch since
> 
> 1. This is x64 EWS
> 2. This patch does nothing on x64. It only affects on ARM64.

Looks like checking-out is failing. So, it is EWS issue.