The way ARMv7 callee save registers are numbered in GPRInfo.h is inconsistent with other targets. Also ARMv7 temporaries are ordered in a sub-optimal way that increases code size/necessitates ifdefs.
Created attachment 448733 [details] Patch
<rdar://problem/87676633>
Created attachment 453498 [details] Patch
Created attachment 455628 [details] Patch
Created attachment 455632 [details] Patch
Comment on attachment 455632 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=455632&action=review r=me with comments. > Source/JavaScriptCore/jit/BaselineJITRegisters.h:95 > + constexpr GPRReg globalObjectGPR = preferredArgumentGPR<SlowOperation, 0>(); > + constexpr JSValueRegs thrownValueJSR = preferredArgumentJSR<SlowOperation, 1>(); > + constexpr GPRReg bytecodeOffsetGPR = GPRInfo::nonArgGPR0; This change is not necessary. > Source/JavaScriptCore/llint/LLIntData.h:394 > +#if CPU(ARM) Use CPU(ARM_THUMB2) > Source/JavaScriptCore/llint/LLIntData.h:-405 > -#elif CPU(MIPS) || CPU(ARM_THUMB2) Use CPU(ARM_THUMB2)
Created attachment 455900 [details] Patch
Comment on attachment 455900 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=455900&action=review Ah, I think we have some problems. > Source/JavaScriptCore/jit/GPRInfo.h:569 > + static constexpr GPRReg regT4 = ARMRegisters::r4; regT4 is r4, then LLInt t4 must be r4. > Source/JavaScriptCore/llint/LLIntData.h:398 > +#if CPU(ARM_THUMB2) > + static constexpr GPRReg pcGPR = ARMRegisters::r8; > +#else > static constexpr GPRReg pcGPR = GPRInfo::regT4; > +#endif We should use regT4 in all architectures. > Source/JavaScriptCore/offlineasm/arm.rb:40 > +# x8 => t4 (callee-save, PC) But it seems that t5 is r8. Is it correct?
Comment on attachment 455900 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=455900&action=review >> Source/JavaScriptCore/offlineasm/arm.rb:40 >> +# x8 => t4 (callee-save, PC) > > But it seems that t5 is r8. Is it correct? t4 is r8 in this table, but in GPRInfo, regT4 is r4. Both needs to be the same.
(In reply to Yusuke Suzuki from comment #8) > > Source/JavaScriptCore/llint/LLIntData.h:398 > > +#if CPU(ARM_THUMB2) > > + static constexpr GPRReg pcGPR = ARMRegisters::r8; > > +#else > > static constexpr GPRReg pcGPR = GPRInfo::regT4; > > +#endif > > We should use regT4 in all architectures. Why is this important? (this change is what allows us LLInt t4 !== GPRInfo::regT4) More generally what is the relevance of LLInt vs GPRInfo numbering of temporaries? (CSRs are the right way around) Ideally, on ARM_THUMB2, we want GPRInfo::regT4 (and also GPRInfo::regT5) to be something less than ARMRegisters::r8 for code size (otherwise the zillions of ifdefs come back and is pretty hard to get right unless you are very conscious). This is an important saving in baseline code size (5-10%). And ideally we also want the LLInt PC to be higher or same than r8, as it's not frequently used, so we can keep the cheaper registers for other use. This is what the current patch does. If GPRInfo::regT4 must absolutely be the same as LLInt t4, I will change llint t4 to ARMRegisters::r4 as well, but that is sacrificing the llint code size/performance for the JIT. I would like to have both if I can.
(In reply to Geza Lore from comment #10) > (In reply to Yusuke Suzuki from comment #8) > > > > Source/JavaScriptCore/llint/LLIntData.h:398 > > > +#if CPU(ARM_THUMB2) > > > + static constexpr GPRReg pcGPR = ARMRegisters::r8; > > > +#else > > > static constexpr GPRReg pcGPR = GPRInfo::regT4; > > > +#endif > > > > We should use regT4 in all architectures. > > Why is this important? (this change is what allows us LLInt t4 !== > GPRInfo::regT4) > > > More generally what is the relevance of LLInt vs GPRInfo numbering of > temporaries? (CSRs are the right way around) > > > Ideally, on ARM_THUMB2, we want GPRInfo::regT4 (and also GPRInfo::regT5) to > be something less than ARMRegisters::r8 for code size (otherwise the > zillions of ifdefs come back and is pretty hard to get right unless you are > very conscious). This is an important saving in baseline code size (5-10%). > And ideally we also want the LLInt PC to be higher or same than r8, as it's > not frequently used, so we can keep the cheaper registers for other use. > This is what the current patch does. > > If GPRInfo::regT4 must absolutely be the same as LLInt t4, I will change > llint t4 to ARMRegisters::r4 as well, but that is sacrificing the llint code > size/performance for the JIT. I would like to have both if I can. Because offlineasm is also used as an assembler to create a asm functions. And these code strongly assumes t4 is GPRInfo::regT4 to make it aligned to online assembler. So, we must ensure regT4 is t4 in offlineasm.
Created attachment 455925 [details] Patch
Comment on attachment 455925 [details] Patch Nice, r=me
Committed r292080 (249007@main): <https://commits.webkit.org/249007@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 455925 [details].