RESOLVED FIXED235027
[JSC][ARMv7] Cleanup GPR numbering
https://bugs.webkit.org/show_bug.cgi?id=235027
Summary [JSC][ARMv7] Cleanup GPR numbering
Geza Lore
Reported 2022-01-10 04:44:59 PST
The way ARMv7 callee save registers are numbered in GPRInfo.h is inconsistent with other targets. Also ARMv7 temporaries are ordered in a sub-optimal way that increases code size/necessitates ifdefs.
Attachments
Patch (39.70 KB, patch)
2022-01-10 04:48 PST, Geza Lore
no flags
Patch (39.67 KB, patch)
2022-03-01 07:47 PST, Geza Lore
no flags
Patch (40.41 KB, patch)
2022-03-24 03:32 PDT, Geza Lore
no flags
Patch (41.40 KB, patch)
2022-03-24 04:49 PDT, Geza Lore
no flags
Patch (40.25 KB, patch)
2022-03-28 04:18 PDT, Geza Lore
no flags
Patch (39.75 KB, patch)
2022-03-28 10:17 PDT, Geza Lore
no flags
Geza Lore
Comment 1 2022-01-10 04:48:10 PST
Radar WebKit Bug Importer
Comment 2 2022-01-17 04:45:23 PST
Geza Lore
Comment 3 2022-03-01 07:47:41 PST
Geza Lore
Comment 4 2022-03-24 03:32:26 PDT
Geza Lore
Comment 5 2022-03-24 04:49:02 PDT
Yusuke Suzuki
Comment 6 2022-03-26 05:58:14 PDT
Comment on attachment 455632 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=455632&action=review r=me with comments. > Source/JavaScriptCore/jit/BaselineJITRegisters.h:95 > + constexpr GPRReg globalObjectGPR = preferredArgumentGPR<SlowOperation, 0>(); > + constexpr JSValueRegs thrownValueJSR = preferredArgumentJSR<SlowOperation, 1>(); > + constexpr GPRReg bytecodeOffsetGPR = GPRInfo::nonArgGPR0; This change is not necessary. > Source/JavaScriptCore/llint/LLIntData.h:394 > +#if CPU(ARM) Use CPU(ARM_THUMB2) > Source/JavaScriptCore/llint/LLIntData.h:-405 > -#elif CPU(MIPS) || CPU(ARM_THUMB2) Use CPU(ARM_THUMB2)
Geza Lore
Comment 7 2022-03-28 04:18:19 PDT
Yusuke Suzuki
Comment 8 2022-03-28 09:23:54 PDT
Comment on attachment 455900 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=455900&action=review Ah, I think we have some problems. > Source/JavaScriptCore/jit/GPRInfo.h:569 > + static constexpr GPRReg regT4 = ARMRegisters::r4; regT4 is r4, then LLInt t4 must be r4. > Source/JavaScriptCore/llint/LLIntData.h:398 > +#if CPU(ARM_THUMB2) > + static constexpr GPRReg pcGPR = ARMRegisters::r8; > +#else > static constexpr GPRReg pcGPR = GPRInfo::regT4; > +#endif We should use regT4 in all architectures. > Source/JavaScriptCore/offlineasm/arm.rb:40 > +# x8 => t4 (callee-save, PC) But it seems that t5 is r8. Is it correct?
Yusuke Suzuki
Comment 9 2022-03-28 09:25:57 PDT
Comment on attachment 455900 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=455900&action=review >> Source/JavaScriptCore/offlineasm/arm.rb:40 >> +# x8 => t4 (callee-save, PC) > > But it seems that t5 is r8. Is it correct? t4 is r8 in this table, but in GPRInfo, regT4 is r4. Both needs to be the same.
Geza Lore
Comment 10 2022-03-28 09:39:40 PDT
(In reply to Yusuke Suzuki from comment #8) > > Source/JavaScriptCore/llint/LLIntData.h:398 > > +#if CPU(ARM_THUMB2) > > + static constexpr GPRReg pcGPR = ARMRegisters::r8; > > +#else > > static constexpr GPRReg pcGPR = GPRInfo::regT4; > > +#endif > > We should use regT4 in all architectures. Why is this important? (this change is what allows us LLInt t4 !== GPRInfo::regT4) More generally what is the relevance of LLInt vs GPRInfo numbering of temporaries? (CSRs are the right way around) Ideally, on ARM_THUMB2, we want GPRInfo::regT4 (and also GPRInfo::regT5) to be something less than ARMRegisters::r8 for code size (otherwise the zillions of ifdefs come back and is pretty hard to get right unless you are very conscious). This is an important saving in baseline code size (5-10%). And ideally we also want the LLInt PC to be higher or same than r8, as it's not frequently used, so we can keep the cheaper registers for other use. This is what the current patch does. If GPRInfo::regT4 must absolutely be the same as LLInt t4, I will change llint t4 to ARMRegisters::r4 as well, but that is sacrificing the llint code size/performance for the JIT. I would like to have both if I can.
Yusuke Suzuki
Comment 11 2022-03-28 09:42:58 PDT
(In reply to Geza Lore from comment #10) > (In reply to Yusuke Suzuki from comment #8) > > > > Source/JavaScriptCore/llint/LLIntData.h:398 > > > +#if CPU(ARM_THUMB2) > > > + static constexpr GPRReg pcGPR = ARMRegisters::r8; > > > +#else > > > static constexpr GPRReg pcGPR = GPRInfo::regT4; > > > +#endif > > > > We should use regT4 in all architectures. > > Why is this important? (this change is what allows us LLInt t4 !== > GPRInfo::regT4) > > > More generally what is the relevance of LLInt vs GPRInfo numbering of > temporaries? (CSRs are the right way around) > > > Ideally, on ARM_THUMB2, we want GPRInfo::regT4 (and also GPRInfo::regT5) to > be something less than ARMRegisters::r8 for code size (otherwise the > zillions of ifdefs come back and is pretty hard to get right unless you are > very conscious). This is an important saving in baseline code size (5-10%). > And ideally we also want the LLInt PC to be higher or same than r8, as it's > not frequently used, so we can keep the cheaper registers for other use. > This is what the current patch does. > > If GPRInfo::regT4 must absolutely be the same as LLInt t4, I will change > llint t4 to ARMRegisters::r4 as well, but that is sacrificing the llint code > size/performance for the JIT. I would like to have both if I can. Because offlineasm is also used as an assembler to create a asm functions. And these code strongly assumes t4 is GPRInfo::regT4 to make it aligned to online assembler. So, we must ensure regT4 is t4 in offlineasm.
Geza Lore
Comment 12 2022-03-28 10:17:25 PDT
Yusuke Suzuki
Comment 13 2022-03-28 10:19:59 PDT
Comment on attachment 455925 [details] Patch Nice, r=me
EWS
Comment 14 2022-03-29 16:22:26 PDT
Committed r292080 (249007@main): <https://commits.webkit.org/249007@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 455925 [details].
Note You need to log in before you can comment on or make changes to this bug.