Bug 228047 - Add Pre/Post-Indexed Address Mode to Air for ARM64
Summary: Add Pre/Post-Indexed Address Mode to Air for ARM64
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Yijia Huang
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-07-17 03:50 PDT by Yijia Huang
Modified: 2021-08-09 07:42 PDT (History)
8 users (show)

See Also:


Attachments
Patch (4.37 KB, patch)
2021-07-17 03:54 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (5.07 KB, patch)
2021-07-17 16:29 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (5.46 KB, patch)
2021-07-17 19:27 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (19.62 KB, patch)
2021-07-20 02:00 PDT, Yijia Huang
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (19.69 KB, patch)
2021-07-20 02:34 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (24.89 KB, patch)
2021-07-20 13:34 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (44.31 KB, patch)
2021-07-23 00:44 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (44.36 KB, patch)
2021-07-23 10:18 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (43.51 KB, patch)
2021-07-23 10:36 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (147.13 KB, patch)
2021-07-24 05:20 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (42.34 KB, patch)
2021-07-24 05:59 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (42.67 KB, patch)
2021-07-24 14:34 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (44.73 KB, patch)
2021-07-25 00:01 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (45.16 KB, patch)
2021-07-25 03:23 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (44.48 KB, patch)
2021-07-25 12:32 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (44.48 KB, patch)
2021-07-25 13:14 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (45.68 KB, patch)
2021-07-26 02:57 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (46.72 KB, patch)
2021-07-26 22:51 PDT, Yijia Huang
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (46.73 KB, patch)
2021-07-26 23:00 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (46.22 KB, patch)
2021-07-27 09:23 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (64.17 KB, patch)
2021-07-27 15:51 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (64.19 KB, patch)
2021-07-28 02:38 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (47.67 KB, patch)
2021-07-29 13:52 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (55.36 KB, patch)
2021-07-29 14:19 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (58.75 KB, patch)
2021-07-29 16:07 PDT, Yijia Huang
fpizlo: review+
Details | Formatted Diff | Diff
Patch for fixing the layout test failure (59.06 KB, patch)
2021-07-30 00:58 PDT, Yijia Huang
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch for fixing the layout test failure (60.19 KB, patch)
2021-07-30 10:01 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch for landing (60.22 KB, patch)
2021-07-30 12:56 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yijia Huang 2021-07-17 03:50:26 PDT
...
Comment 1 Yijia Huang 2021-07-17 03:54:42 PDT
Created attachment 433730 [details]
Patch
Comment 2 Yijia Huang 2021-07-17 16:29:25 PDT
Created attachment 433734 [details]
Patch
Comment 3 Yijia Huang 2021-07-17 19:27:25 PDT
Created attachment 433738 [details]
Patch
Comment 4 Yijia Huang 2021-07-20 02:00:29 PDT
Created attachment 433859 [details]
Patch
Comment 5 Yijia Huang 2021-07-20 02:34:39 PDT
Created attachment 433864 [details]
Patch
Comment 6 Yijia Huang 2021-07-20 13:34:01 PDT
Created attachment 433897 [details]
Patch
Comment 7 Yijia Huang 2021-07-23 00:44:44 PDT
Created attachment 434069 [details]
Patch
Comment 8 Yijia Huang 2021-07-23 10:18:22 PDT
Created attachment 434096 [details]
Patch
Comment 9 Yijia Huang 2021-07-23 10:19:54 PDT
Comment on attachment 434069 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434069&action=review

> Source/JavaScriptCore/b3/B3ReduceStrength.cpp:522
> +                                m_potentialPreIndex.remove(it);

This remove() doesn't destroy the MemoryValue*.

> Source/JavaScriptCore/b3/B3ReduceStrength.cpp:550
> +                m_potentialPreIndex.remove(it);

Why does this remove() destroy the MemoryValue* here?
Comment 10 Yijia Huang 2021-07-23 10:36:13 PDT
Created attachment 434098 [details]
Patch
Comment 11 Radar WebKit Bug Importer 2021-07-24 03:51:40 PDT
<rdar://problem/81053101>
Comment 12 Yijia Huang 2021-07-24 05:20:00 PDT
Created attachment 434166 [details]
Patch
Comment 13 Yijia Huang 2021-07-24 05:59:19 PDT
Created attachment 434167 [details]
Patch
Comment 14 Yijia Huang 2021-07-24 14:34:14 PDT
Created attachment 434174 [details]
Patch
Comment 15 Yijia Huang 2021-07-25 00:01:46 PDT
Created attachment 434178 [details]
Patch
Comment 16 Yijia Huang 2021-07-25 03:23:02 PDT
Created attachment 434179 [details]
Patch
Comment 17 Yijia Huang 2021-07-25 12:32:59 PDT
Created attachment 434181 [details]
Patch
Comment 18 Yijia Huang 2021-07-25 13:14:21 PDT
Created attachment 434182 [details]
Patch
Comment 19 Yijia Huang 2021-07-26 02:57:51 PDT
Created attachment 434197 [details]
Patch
Comment 20 Filip Pizlo 2021-07-26 20:23:41 PDT
Comment on attachment 434197 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434197&action=review

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:916
> +                    auto tryPotentialPostIndexAddress = [&] () {
> +                        if (!right->hasIntPtr())
> +                            return;
> +                        intptr_t offset = right->asIntPtr();
> +                        Value::OffsetType smallOffset = static_cast<Value::OffsetType>(offset);
> +                        if (opcode != Air::Add64 || smallOffset != offset || !Arg::isValidPostIndexForm(smallOffset))
> +                            return;
> +                        m_baseToInst.add(left, &m_insts.last().last());
> +                    };

I think it's weird to have this code here.  Seems like you want it at the top of the Add case in the main lowering switch statement.

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:1164
> -                    if (isValidForm(Store32, Arg::ZeroReg, dest.kind()) && dest.isValidForm(Width32))
> -                        return Inst(Store32, m_value, zeroReg(), dest);
> +                    if (isValidForm(Move32, Arg::ZeroReg, dest.kind()) && dest.isValidForm(Width32))
> +                        return Inst(Move32, m_value, zeroReg(), dest);

Why do you need to name stores as such?

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:2597
> +                if (totalUseCounts(address) == 1 || totalUseCounts(base) > 1)

I don't get this.  What are you trying to achieve here?

Isn't the whole point of pre-increment loads that the address has many uses?
Comment 21 Yijia Huang 2021-07-26 22:51:49 PDT
Created attachment 434272 [details]
Patch
Comment 22 Yijia Huang 2021-07-26 23:00:26 PDT
Created attachment 434273 [details]
Patch
Comment 23 Yijia Huang 2021-07-27 09:23:00 PDT
Created attachment 434294 [details]
Patch
Comment 24 Yijia Huang 2021-07-27 15:51:13 PDT
Created attachment 434376 [details]
Patch
Comment 25 Yijia Huang 2021-07-28 02:38:20 PDT
Created attachment 434416 [details]
Patch
Comment 26 Yijia Huang 2021-07-28 02:39:56 PDT
This is the 5th version.
Comment 27 Yijia Huang 2021-07-29 13:52:34 PDT
Created attachment 434567 [details]
Patch
Comment 28 Yijia Huang 2021-07-29 14:19:10 PDT
Created attachment 434573 [details]
Patch
Comment 29 Filip Pizlo 2021-07-29 14:54:14 PDT
Comment on attachment 434573 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434573&action=review

I think you have a bug in how your new Arg works - it says that it just Uses its input.

Also, you should factor your phase out into its own file.

> Source/JavaScriptCore/b3/B3ReduceStrength.cpp:3165
> +bool canonicalize(Procedure& proc)
> +{
> +    PhaseScope phaseScope(proc, "canonicalize");
> +    ReduceStrength reduceStrength(proc);
> +    return reduceStrength.canonicalizeIncrementAddress();
> +}

I think you need a more descriptive name at the B3 level namespace than "canonicalize".  I would call it something like "canonicalizePrePostIncrements" and maybe I would even conditionalize it so it only runs on ARM64.

Also, I wouldn't make this a method of ReduceStrength.  It doesn't seem like it needs to be.  Seems like this should be a standalone function, in its own file.  Like "B3CanonicalizePrePostIncrements.h" and "B3CanonicalizePrePostIncrements.cpp".

> Source/JavaScriptCore/b3/air/AirArg.h:1395
>              functor(m_base, Use, GP, argRole == UseAddr ? argWidth : pointerWidth());

This is wrong now - it'll claim that pre/post increment "uses" are just Uses.  They are UseDef.
Comment 30 Yijia Huang 2021-07-29 16:07:11 PDT
Created attachment 434582 [details]
Patch
Comment 31 Filip Pizlo 2021-07-29 17:35:11 PDT
Comment on attachment 434582 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434582&action=review

I think this is right?

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:167
> +    return canonicalizePrePostIncrementsImpl(proc);

You could inline the body of the impl here.

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.h:34
> +bool canonicalizePrePostIncrementsImpl(Procedure&);

Don't need this declared here since you don't call it from anywhere but the non-impl variant.

> Source/JavaScriptCore/b3/air/AirArg.h:1398
> +        case PreIndex:
> +        case PostIndex:
> +            functor(m_base, UseDef, GP, argRole == UseAddr ? argWidth : pointerWidth());
> +            break;

Fingers crossed that this works!

Ideally we'd audit the compiler to make sure nobody is assuming that only a top-level Arg can be a Def.
Comment 32 Yijia Huang 2021-07-30 00:58:33 PDT
Created attachment 434610 [details]
Patch for fixing the layout test failure
Comment 33 Yijia Huang 2021-07-30 10:01:23 PDT
Created attachment 434641 [details]
Patch for fixing the layout test failure
Comment 34 Yijia Huang 2021-07-30 12:56:30 PDT
Created attachment 434656 [details]
Patch for landing
Comment 35 EWS 2021-07-30 13:44:55 PDT
Committed r280493 (240125@main): <https://commits.webkit.org/240125@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 434656 [details].
Comment 36 Saam Barati 2021-08-05 20:33:52 PDT
Comment on attachment 434656 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=434656&action=review

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:2568
> +            //     memory = load(base, 0)
> +            //     address = add(base, offset)

This comment is not what the code is doing. The order here is backwards of the canonical form.

Might also be worth noting what Air code we emit for clarity?

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:2573
> +                Value* address = m_block->at(m_index - 1);

Why do we do this if address is already locked? I guess it's sound because we are always writing to its temp? That said, if it's locked, some other operation already sucked up this add internal to the other operation. And nobody would use the result of the add. But we're emitting a move, which may not get elided.
Comment 37 Yijia Huang 2021-08-05 21:03:10 PDT
Comment on attachment 434656 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=434656&action=review

Thanks for reviewing.

>> Source/JavaScriptCore/b3/B3LowerToAir.cpp:2568
>> +            //     address = add(base, offset)
> 
> This comment is not what the code is doing. The order here is backwards of the canonical form.
> 
> Might also be worth noting what Air code we emit for clarity?

The order is fixed in the patch for store instruction. I will add the corresponding Air code in the store patch.

>> Source/JavaScriptCore/b3/B3LowerToAir.cpp:2573
>> +                Value* address = m_block->at(m_index - 1);
> 
> Why do we do this if address is already locked? I guess it's sound because we are always writing to its temp? That said, if it's locked, some other operation already sucked up this add internal to the other operation. And nobody would use the result of the add. But we're emitting a move, which may not get elided.

I will add a m_locked check on address in the store patch.
Comment 38 Yijia Huang 2021-08-05 21:04:38 PDT
Comment on attachment 434656 [details]
Patch for landing

Thanks for reviewing.
Comment 39 Saam Barati 2021-08-06 12:54:16 PDT
Comment on attachment 434656 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=434656&action=review

still reviewing, but a few more comments.

> Source/JavaScriptCore/assembler/AbstractMacroAssembler.h:250
> +    // Describes an address with base address and pre-increment/decrement index.

nit: index => offset everywhere in these classes

> Source/JavaScriptCore/assembler/AbstractMacroAssembler.h:264
> +    // Describes an address with base address and post-increment/decrement index.

ditto

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:55
> +    BasicBlock* block { nullptr };

this field is totally unused except for one store to it in the candidate search loop

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:76
> +    auto tryPrePostIndex = [&] (Value* value) {

nit: this should be called something like "tryAddPrePostIndexCandidate"

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:110
> +                if (smallOffset != offset || !Arg::isValidPreIndexForm(smallOffset))

I think we should just have one function, but if we stick with two, this should be isValidPostIndexForm, not isValidPreIndexForm

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:144
> +    auto detect = [&] (HashMap<MemoryValue*, Value*> candidates) {

this should be taking like a "const HashMap&" or HashMap&. No need to take the hash map by value.

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:145
> +        for (auto itr = candidates.begin(); itr != candidates.end(); ++itr) {

you should be using the good style for loops:

for (auto pair : candidates)

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:146
> +            MemoryValue* memory = itr->key;

then pair.key

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:147
> +            Value* address = itr->value;

and pair.value

> Source/JavaScriptCore/b3/air/AirArg.h:1296
> +    template<typename Int, typename = Value::IsLegalOffset<Int>>
> +    static bool isValidPreIndexForm(Int offset)
> +    {
> +        if (isARM64())
> +            return isValidSignedImm9(offset);
> +        return false;
> +    }
> +
> +    template<typename Int, typename = Value::IsLegalOffset<Int>>
> +    static bool isValidPostIndexForm(Int offset)
> +    {
> +        if (isARM64())
> +            return isValidSignedImm9(offset);
> +        return false;
> +    }

Can we just make one function "isValidIncrementIndexForm"?
Comment 40 Saam Barati 2021-08-06 12:56:25 PDT
Comment on attachment 434656 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=434656&action=review

> Source/JavaScriptCore/b3/B3CanonicalizePrePostIncrements.cpp:151
> +            if (candidates == preIndexCandidates) {

This is doing a hash table comparison by value each iteration. That's a crazy O(n) operation. We should do this by passing in a flag or something.
Also, if we pass in the flag, we don't need to check full control flow equivalence. We know which should forward dominate, which should backwards dominate.