WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
228543
[JSC] load/store with BaseIndex is inefficient in ARM64
https://bugs.webkit.org/show_bug.cgi?id=228543
Summary
[JSC] load/store with BaseIndex is inefficient in ARM64
Yusuke Suzuki
Reported
2021-07-28 01:49:56 PDT
[JSC] load/store with BaseIndex is inefficient in ARM64
Attachments
Patch
(27.33 KB, patch)
2021-07-28 01:51 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(27.56 KB, patch)
2021-07-28 10:32 PDT
,
Yusuke Suzuki
mark.lam
: review+
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Yusuke Suzuki
Comment 1
2021-07-28 01:51:21 PDT
Created
attachment 434411
[details]
Patch
Mark Lam
Comment 2
2021-07-28 09:08:09 PDT
Comment on
attachment 434411
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=434411&action=review
> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1433 > + if (!address.scale || address.scale == 3) {
Can you use the Scale enum values instead here? if (address.scale == TimesOne || address.scale == TimesEight) {
> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1435 > + if (auto baseGPR = tryFoldBaseAndOffsetPart(address)) { > + m_assembler.ldr<64>(dest, baseGPR.value(), address.index, indexExtendType(address), address.scale);
I think you can do better: if (scale == TimesEight && isUInt12(address.offset / 8)), then you can use the `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` form and just issue 1 instruction to do this load. See also section C6.2.101 of the spec. See also LDUR in section C6.2.136 if scale == TimesOne.
Mark Lam
Comment 3
2021-07-28 09:10:38 PDT
Comment on
attachment 434411
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=434411&action=review
>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1435 >> + m_assembler.ldr<64>(dest, baseGPR.value(), address.index, indexExtendType(address), address.scale); > > I think you can do better: if (scale == TimesEight && isUInt12(address.offset / 8)), then you can use the `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` form and just issue 1 instruction to do this load. See also section C6.2.101 of the spec. > > See also LDUR in section C6.2.136 if scale == TimesOne.
I should clarify: my point is that you may be able to do this with just 1 instruction instead of 2. Ditto for other cases below. So, I think it's worth checking the spec manual for available instructions.
Yusuke Suzuki
Comment 4
2021-07-28 10:31:13 PDT
Comment on
attachment 434411
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=434411&action=review
>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1433 >> + if (!address.scale || address.scale == 3) { > > Can you use the Scale enum values instead here? > if (address.scale == TimesOne || address.scale == TimesEight) {
Changed.
>>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1435 >>> + m_assembler.ldr<64>(dest, baseGPR.value(), address.index, indexExtendType(address), address.scale); >> >> I think you can do better: if (scale == TimesEight && isUInt12(address.offset / 8)), then you can use the `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` form and just issue 1 instruction to do this load. See also section C6.2.101 of the spec. >> >> See also LDUR in section C6.2.136 if scale == TimesOne. > > I should clarify: my point is that you may be able to do this with just 1 instruction instead of 2. Ditto for other cases below. So, I think it's worth checking the spec manual for available instructions.
I don't think it is available since here there is base and index registers (since this is BaseIndex). But `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` only takes base register. So if we would like to use `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)`, then we first need to fold index into base. This means that it requires 2 instructions anyway.
Yusuke Suzuki
Comment 5
2021-07-28 10:32:09 PDT
Created
attachment 434434
[details]
Patch
Mark Lam
Comment 6
2021-07-28 11:35:32 PDT
Comment on
attachment 434434
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=434434&action=review
r=me with suggestions.
> Source/JavaScriptCore/assembler/testmasm.cpp:4224 > + uint64_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 3);
nit: can you add INT64_MAX to each of the values in array so that this test will also catch bugs where the load instruction doesn't load the full 8 bytes?
> Source/JavaScriptCore/assembler/testmasm.cpp:4234 > + uint64_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 5);
Ditto.
> Source/JavaScriptCore/assembler/testmasm.cpp:4247 > + uint32_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3);
Ditto: add INT32_MAX.
> Source/JavaScriptCore/assembler/testmasm.cpp:4257 > + uint32_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 5);
Ditto.
> Source/JavaScriptCore/assembler/testmasm.cpp:4269 > + uint16_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3);
Ditto: add INT16_MAX.
> Source/JavaScriptCore/assembler/testmasm.cpp:4279 > + uint16_t array[] = { 1, 2, 3, 4, static_cast<uint16_t>(-1), }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 0xffff);
Ditto.
> Source/JavaScriptCore/assembler/testmasm.cpp:4291 > + uint16_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3);
nit: add (INT16_MAX - 10)?
> Source/JavaScriptCore/assembler/testmasm.cpp:4301 > + uint16_t array[] = { 1, 2, 3, 4, static_cast<uint16_t>(-1), }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), static_cast<uint32_t>(-1));
nit: add (INT16_MAX - 10) but make leave the last one as -1?
> Source/JavaScriptCore/assembler/testmasm.cpp:4406 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[2], 42);
nit: Add INT64_MAX to 42.
> Source/JavaScriptCore/assembler/testmasm.cpp:4417 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[4], 42);
Ditto.
> Source/JavaScriptCore/assembler/testmasm.cpp:4431 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[2], 42);
nit: Add INT32_MAX to 42.
> Source/JavaScriptCore/assembler/testmasm.cpp:4442 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[4], 42);
Ditto.
> Source/JavaScriptCore/assembler/testmasm.cpp:4455 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[2], 42);
nit: Add INT16_MAX to 42.
> Source/JavaScriptCore/assembler/testmasm.cpp:4466 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[4], 42);
Ditto.
Yusuke Suzuki
Comment 7
2021-07-28 11:47:08 PDT
Comment on
attachment 434434
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=434434&action=review
Thanks!
>> Source/JavaScriptCore/assembler/testmasm.cpp:4224 >> + CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 3); > > nit: can you add INT64_MAX to each of the values in array so that this test will also catch bugs where the load instruction doesn't load the full 8 bytes?
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4234 >> + CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 5); > > Ditto.
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4247 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3); > > Ditto: add INT32_MAX.
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4257 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 5); > > Ditto.
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4279 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 0xffff); > > Ditto.
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4291 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3); > > nit: add (INT16_MAX - 10)?
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4301 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), static_cast<uint32_t>(-1)); > > nit: add (INT16_MAX - 10) but make leave the last one as -1?
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4406 >> + CHECK_EQ(array[2], 42); > > nit: Add INT64_MAX to 42.
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4417 >> + CHECK_EQ(array[4], 42); > > Ditto.
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4431 >> + CHECK_EQ(array[2], 42); > > nit: Add INT32_MAX to 42.
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4442 >> + CHECK_EQ(array[4], 42); > > Ditto.
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4455 >> + CHECK_EQ(array[2], 42); > > nit: Add INT16_MAX to 42.
Changed.
>> Source/JavaScriptCore/assembler/testmasm.cpp:4466 >> + CHECK_EQ(array[4], 42); > > Ditto.
Changed.
Yusuke Suzuki
Comment 8
2021-07-28 11:49:36 PDT
Committed
r280391
(
240032@main
): <
https://commits.webkit.org/240032@main
>
Radar WebKit Bug Importer
Comment 9
2021-07-28 11:50:19 PDT
<
rdar://problem/81227865
>
Yusuke Suzuki
Comment 10
2021-07-28 13:23:08 PDT
There is 0% chance that
https://ews-build.webkit.org/#/builders/56/builds/12242
failures are related to this patch since 1. This is x64 EWS 2. This patch does nothing on x64. It only affects on ARM64.
Yusuke Suzuki
Comment 11
2021-07-28 13:24:20 PDT
(In reply to Yusuke Suzuki from
comment #10
)
> There is 0% chance that >
https://ews-build.webkit.org/#/builders/56/builds/12242
failures are related > to this patch since > > 1. This is x64 EWS > 2. This patch does nothing on x64. It only affects on ARM64.
Looks like checking-out is failing. So, it is EWS issue.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug