| Summary: | Add Pre/Post-Indexed Address Mode to Air for ARM64 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Yijia Huang <yijia_huang> | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Component: | JavaScriptCore | Assignee: | Yijia Huang <yijia_huang> | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Status: | RESOLVED FIXED | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Severity: | Normal | CC: | ews-watchlist, fpizlo, keith_miller, mark.lam, msaboff, saam, tzagallo, webkit-bug-importer | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Priority: | P2 | Keywords: | InRadar | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Version: | WebKit Nightly Build | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Hardware: | Unspecified | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| OS: | Unspecified | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Description
Yijia Huang
2021-07-17 03:50:26 PDT
Created attachment 433730 [details]
Patch
Created attachment 433734 [details]
Patch
Created attachment 433738 [details]
Patch
Created attachment 433859 [details]
Patch
Created attachment 433864 [details]
Patch
Created attachment 433897 [details]
Patch
Created attachment 434069 [details]
Patch
Created attachment 434096 [details]
Patch
Comment on attachment 434069 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434069&action=review > Source/JavaScriptCore/b3/B3ReduceStrength.cpp:522 > + m_potentialPreIndex.remove(it); This remove() doesn't destroy the MemoryValue*. > Source/JavaScriptCore/b3/B3ReduceStrength.cpp:550 > + m_potentialPreIndex.remove(it); Why does this remove() destroy the MemoryValue* here? Created attachment 434098 [details]
Patch
Created attachment 434166 [details]
Patch
Created attachment 434167 [details]
Patch
Created attachment 434174 [details]
Patch
Created attachment 434178 [details]
Patch
Created attachment 434179 [details]
Patch
Created attachment 434181 [details]
Patch
Created attachment 434182 [details]
Patch
Created attachment 434197 [details]
Patch
Comment on attachment 434197 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434197&action=review > Source/JavaScriptCore/b3/B3LowerToAir.cpp:916 > + auto tryPotentialPostIndexAddress = [&] () { > + if (!right->hasIntPtr()) > + return; > + intptr_t offset = right->asIntPtr(); > + Value::OffsetType smallOffset = static_cast<Value::OffsetType>(offset); > + if (opcode != Air::Add64 || smallOffset != offset || !Arg::isValidPostIndexForm(smallOffset)) > + return; > + m_baseToInst.add(left, &m_insts.last().last()); > + }; I think it's weird to have this code here. Seems like you want it at the top of the Add case in the main lowering switch statement. > Source/JavaScriptCore/b3/B3LowerToAir.cpp:1164 > - if (isValidForm(Store32, Arg::ZeroReg, dest.kind()) && dest.isValidForm(Width32)) > - return Inst(Store32, m_value, zeroReg(), dest); > + if (isValidForm(Move32, Arg::ZeroReg, dest.kind()) && dest.isValidForm(Width32)) > + return Inst(Move32, m_value, zeroReg(), dest); Why do you need to name stores as such? > Source/JavaScriptCore/b3/B3LowerToAir.cpp:2597 > + if (totalUseCounts(address) == 1 || totalUseCounts(base) > 1) I don't get this. What are you trying to achieve here? Isn't the whole point of pre-increment loads that the address has many uses? Created attachment 434272 [details]
Patch
Created attachment 434273 [details]
Patch
Created attachment 434294 [details]
Patch
Created attachment 434376 [details]
Patch
Created attachment 434416 [details]
Patch
This is the 5th version. Created attachment 434567 [details]
Patch
Created attachment 434573 [details]
Patch
Comment on attachment 434573 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434573&action=review I think you have a bug in how your new Arg works - it says that it just Uses its input. Also, you should factor your phase out into its own file. > Source/JavaScriptCore/b3/B3ReduceStrength.cpp:3165 > +bool canonicalize(Procedure& proc) > +{ > + PhaseScope phaseScope(proc, "canonicalize"); > + ReduceStrength reduceStrength(proc); > + return reduceStrength.canonicalizeIncrementAddress(); > +} I think you need a more descriptive name at the B3 level namespace than "canonicalize". I would call it something like "canonicalizePrePostIncrements" and maybe I would even conditionalize it so it only runs on ARM64. Also, I wouldn't make this a method of ReduceStrength. It doesn't seem like it needs to be. Seems like this should be a standalone function, in its own file. Like "B3CanonicalizePrePostIncrements.h" and "B3CanonicalizePrePostIncrements.cpp". > Source/JavaScriptCore/b3/air/AirArg.h:1395 > functor(m_base, Use, GP, argRole == UseAddr ? argWidth : pointerWidth()); This is wrong now - it'll claim that pre/post increment "uses" are just Uses. They are UseDef. Created attachment 434582 [details]
Patch
Comment on attachment 434582 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434582&action=review I think this is right? > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:167 > + return canonicalizePrePostIncrementsImpl(proc); You could inline the body of the impl here. > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.h:34 > +bool canonicalizePrePostIncrementsImpl(Procedure&); Don't need this declared here since you don't call it from anywhere but the non-impl variant. > Source/JavaScriptCore/b3/air/AirArg.h:1398 > + case PreIndex: > + case PostIndex: > + functor(m_base, UseDef, GP, argRole == UseAddr ? argWidth : pointerWidth()); > + break; Fingers crossed that this works! Ideally we'd audit the compiler to make sure nobody is assuming that only a top-level Arg can be a Def. Created attachment 434610 [details]
Patch for fixing the layout test failure
Created attachment 434641 [details]
Patch for fixing the layout test failure
Created attachment 434656 [details]
Patch for landing
Committed r280493 (240125@main): <https://commits.webkit.org/240125@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 434656 [details]. Comment on attachment 434656 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=434656&action=review > Source/JavaScriptCore/b3/B3LowerToAir.cpp:2568 > + // memory = load(base, 0) > + // address = add(base, offset) This comment is not what the code is doing. The order here is backwards of the canonical form. Might also be worth noting what Air code we emit for clarity? > Source/JavaScriptCore/b3/B3LowerToAir.cpp:2573 > + Value* address = m_block->at(m_index - 1); Why do we do this if address is already locked? I guess it's sound because we are always writing to its temp? That said, if it's locked, some other operation already sucked up this add internal to the other operation. And nobody would use the result of the add. But we're emitting a move, which may not get elided. Comment on attachment 434656 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=434656&action=review Thanks for reviewing. >> Source/JavaScriptCore/b3/B3LowerToAir.cpp:2568 >> + // address = add(base, offset) > > This comment is not what the code is doing. The order here is backwards of the canonical form. > > Might also be worth noting what Air code we emit for clarity? The order is fixed in the patch for store instruction. I will add the corresponding Air code in the store patch. >> Source/JavaScriptCore/b3/B3LowerToAir.cpp:2573 >> + Value* address = m_block->at(m_index - 1); > > Why do we do this if address is already locked? I guess it's sound because we are always writing to its temp? That said, if it's locked, some other operation already sucked up this add internal to the other operation. And nobody would use the result of the add. But we're emitting a move, which may not get elided. I will add a m_locked check on address in the store patch. Comment on attachment 434656 [details]
Patch for landing
Thanks for reviewing.
Comment on attachment 434656 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=434656&action=review still reviewing, but a few more comments. > Source/JavaScriptCore/assembler/AbstractMacroAssembler.h:250 > + // Describes an address with base address and pre-increment/decrement index. nit: index => offset everywhere in these classes > Source/JavaScriptCore/assembler/AbstractMacroAssembler.h:264 > + // Describes an address with base address and post-increment/decrement index. ditto > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:55 > + BasicBlock* block { nullptr }; this field is totally unused except for one store to it in the candidate search loop > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:76 > + auto tryPrePostIndex = [&] (Value* value) { nit: this should be called something like "tryAddPrePostIndexCandidate" > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:110 > + if (smallOffset != offset || !Arg::isValidPreIndexForm(smallOffset)) I think we should just have one function, but if we stick with two, this should be isValidPostIndexForm, not isValidPreIndexForm > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:144 > + auto detect = [&] (HashMap<MemoryValue*, Value*> candidates) { this should be taking like a "const HashMap&" or HashMap&. No need to take the hash map by value. > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:145 > + for (auto itr = candidates.begin(); itr != candidates.end(); ++itr) { you should be using the good style for loops: for (auto pair : candidates) > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:146 > + MemoryValue* memory = itr->key; then pair.key > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:147 > + Value* address = itr->value; and pair.value > Source/JavaScriptCore/b3/air/AirArg.h:1296 > + template<typename Int, typename = Value::IsLegalOffset<Int>> > + static bool isValidPreIndexForm(Int offset) > + { > + if (isARM64()) > + return isValidSignedImm9(offset); > + return false; > + } > + > + template<typename Int, typename = Value::IsLegalOffset<Int>> > + static bool isValidPostIndexForm(Int offset) > + { > + if (isARM64()) > + return isValidSignedImm9(offset); > + return false; > + } Can we just make one function "isValidIncrementIndexForm"? Comment on attachment 434656 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=434656&action=review > Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:151 > + if (candidates == preIndexCandidates) { This is doing a hash table comparison by value each iteration. That's a crazy O(n) operation. We should do this by passing in a flag or something. Also, if we pass in the flag, we don't need to check full control flow equivalence. We know which should forward dominate, which should backwards dominate. |