[JSC] load/store with BaseIndex is inefficient in ARM64
Created attachment 434411 [details] Patch
Comment on attachment 434411 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434411&action=review > Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1433 > + if (!address.scale || address.scale == 3) { Can you use the Scale enum values instead here? if (address.scale == TimesOne || address.scale == TimesEight) { > Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1435 > + if (auto baseGPR = tryFoldBaseAndOffsetPart(address)) { > + m_assembler.ldr<64>(dest, baseGPR.value(), address.index, indexExtendType(address), address.scale); I think you can do better: if (scale == TimesEight && isUInt12(address.offset / 8)), then you can use the `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` form and just issue 1 instruction to do this load. See also section C6.2.101 of the spec. See also LDUR in section C6.2.136 if scale == TimesOne.
Comment on attachment 434411 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434411&action=review >> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1435 >> + m_assembler.ldr<64>(dest, baseGPR.value(), address.index, indexExtendType(address), address.scale); > > I think you can do better: if (scale == TimesEight && isUInt12(address.offset / 8)), then you can use the `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` form and just issue 1 instruction to do this load. See also section C6.2.101 of the spec. > > See also LDUR in section C6.2.136 if scale == TimesOne. I should clarify: my point is that you may be able to do this with just 1 instruction instead of 2. Ditto for other cases below. So, I think it's worth checking the spec manual for available instructions.
Comment on attachment 434411 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434411&action=review >> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1433 >> + if (!address.scale || address.scale == 3) { > > Can you use the Scale enum values instead here? > if (address.scale == TimesOne || address.scale == TimesEight) { Changed. >>> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:1435 >>> + m_assembler.ldr<64>(dest, baseGPR.value(), address.index, indexExtendType(address), address.scale); >> >> I think you can do better: if (scale == TimesEight && isUInt12(address.offset / 8)), then you can use the `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` form and just issue 1 instruction to do this load. See also section C6.2.101 of the spec. >> >> See also LDUR in section C6.2.136 if scale == TimesOne. > > I should clarify: my point is that you may be able to do this with just 1 instruction instead of 2. Ditto for other cases below. So, I think it's worth checking the spec manual for available instructions. I don't think it is available since here there is base and index registers (since this is BaseIndex). But `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)` only takes base register. So if we would like to use `ALWAYS_INLINE void ldr(RegisterID rt, RegisterID rn, unsigned pimm)`, then we first need to fold index into base. This means that it requires 2 instructions anyway.
Created attachment 434434 [details] Patch
Comment on attachment 434434 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434434&action=review r=me with suggestions. > Source/JavaScriptCore/assembler/testmasm.cpp:4224 > + uint64_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 3); nit: can you add INT64_MAX to each of the values in array so that this test will also catch bugs where the load instruction doesn't load the full 8 bytes? > Source/JavaScriptCore/assembler/testmasm.cpp:4234 > + uint64_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 5); Ditto. > Source/JavaScriptCore/assembler/testmasm.cpp:4247 > + uint32_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3); Ditto: add INT32_MAX. > Source/JavaScriptCore/assembler/testmasm.cpp:4257 > + uint32_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 5); Ditto. > Source/JavaScriptCore/assembler/testmasm.cpp:4269 > + uint16_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3); Ditto: add INT16_MAX. > Source/JavaScriptCore/assembler/testmasm.cpp:4279 > + uint16_t array[] = { 1, 2, 3, 4, static_cast<uint16_t>(-1), }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 0xffff); Ditto. > Source/JavaScriptCore/assembler/testmasm.cpp:4291 > + uint16_t array[] = { 1, 2, 3, 4, 5, }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3); nit: add (INT16_MAX - 10)? > Source/JavaScriptCore/assembler/testmasm.cpp:4301 > + uint16_t array[] = { 1, 2, 3, 4, static_cast<uint16_t>(-1), }; > + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), static_cast<uint32_t>(-1)); nit: add (INT16_MAX - 10) but make leave the last one as -1? > Source/JavaScriptCore/assembler/testmasm.cpp:4406 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[2], 42); nit: Add INT64_MAX to 42. > Source/JavaScriptCore/assembler/testmasm.cpp:4417 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[4], 42); Ditto. > Source/JavaScriptCore/assembler/testmasm.cpp:4431 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[2], 42); nit: Add INT32_MAX to 42. > Source/JavaScriptCore/assembler/testmasm.cpp:4442 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[4], 42); Ditto. > Source/JavaScriptCore/assembler/testmasm.cpp:4455 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[2], 42); nit: Add INT16_MAX to 42. > Source/JavaScriptCore/assembler/testmasm.cpp:4466 > + invoke<void>(test, array, 3, 42); > + CHECK_EQ(array[4], 42); Ditto.
Comment on attachment 434434 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434434&action=review Thanks! >> Source/JavaScriptCore/assembler/testmasm.cpp:4224 >> + CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 3); > > nit: can you add INT64_MAX to each of the values in array so that this test will also catch bugs where the load instruction doesn't load the full 8 bytes? Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4234 >> + CHECK_EQ(invoke<uint64_t>(test, array, static_cast<UCPURegister>(3)), 5); > > Ditto. Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4247 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3); > > Ditto: add INT32_MAX. Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4257 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 5); > > Ditto. Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4279 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 0xffff); > > Ditto. Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4291 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), 3); > > nit: add (INT16_MAX - 10)? Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4301 >> + CHECK_EQ(invoke<uint32_t>(test, array, static_cast<UCPURegister>(3)), static_cast<uint32_t>(-1)); > > nit: add (INT16_MAX - 10) but make leave the last one as -1? Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4406 >> + CHECK_EQ(array[2], 42); > > nit: Add INT64_MAX to 42. Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4417 >> + CHECK_EQ(array[4], 42); > > Ditto. Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4431 >> + CHECK_EQ(array[2], 42); > > nit: Add INT32_MAX to 42. Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4442 >> + CHECK_EQ(array[4], 42); > > Ditto. Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4455 >> + CHECK_EQ(array[2], 42); > > nit: Add INT16_MAX to 42. Changed. >> Source/JavaScriptCore/assembler/testmasm.cpp:4466 >> + CHECK_EQ(array[4], 42); > > Ditto. Changed.
Committed r280391 (240032@main): <https://commits.webkit.org/240032@main>
<rdar://problem/81227865>
There is 0% chance that https://ews-build.webkit.org/#/builders/56/builds/12242 failures are related to this patch since 1. This is x64 EWS 2. This patch does nothing on x64. It only affects on ARM64.
(In reply to Yusuke Suzuki from comment #10) > There is 0% chance that > https://ews-build.webkit.org/#/builders/56/builds/12242 failures are related > to this patch since > > 1. This is x64 EWS > 2. This patch does nothing on x64. It only affects on ARM64. Looks like checking-out is failing. So, it is EWS issue.