Bug 227202 - Add ARM64 BIC, ORN, BFXIL to AIR opcode and select them in AIR
Summary: Add ARM64 BIC, ORN, BFXIL to AIR opcode and select them in AIR
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Yijia Huang
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-06-20 21:52 PDT by Yijia Huang
Modified: 2021-06-29 08:53 PDT (History)
8 users (show)

See Also:


Attachments
Patch (15.14 KB, patch)
2021-06-24 23:05 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (16.84 KB, patch)
2021-06-25 09:13 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (14.21 KB, patch)
2021-06-25 12:48 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (14.21 KB, patch)
2021-06-25 16:23 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (34.93 KB, patch)
2021-06-26 14:49 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (42.09 KB, patch)
2021-06-26 22:03 PDT, Yijia Huang
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (42.07 KB, patch)
2021-06-26 22:14 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (44.51 KB, patch)
2021-06-27 14:16 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (64.53 KB, patch)
2021-06-28 01:23 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (64.53 KB, patch)
2021-06-28 01:39 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff
Patch (63.33 KB, patch)
2021-06-28 19:06 PDT, Yijia Huang
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yijia Huang 2021-06-20 21:52:48 PDT
...
Comment 1 Yijia Huang 2021-06-24 23:05:31 PDT
Created attachment 432238 [details]
Patch
Comment 2 Yijia Huang 2021-06-25 09:13:49 PDT
Created attachment 432278 [details]
Patch
Comment 3 Saam Barati 2021-06-25 10:06:04 PDT
Comment on attachment 432278 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=432278&action=review

> Source/JavaScriptCore/ChangeLog:59
> +        Before Adding BFI:
> +        // Old optimized AIR
> +        Move    $1, %x2,      @3
> +        Lshift %x2, %x1, %x1, @5
> +        Not    %x1, %x1,      @6
> +        And    %x0, %x1, %x0, @7
> +        Ret    %x0,           @8
> +
> +        After Adding BFI:
> +        // New optimized AIR
> +        ClearBit %x1, %x0, %x2, @7
> +        Ret      %x0,           @8

Nice

> Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:503
> +    void clearBit64(RegisterID bitToClear, RegisterID dest, RegisterID scratchForMask = InvalidGPRReg)
> +    {
> +        if (scratchForMask == InvalidGPRReg)
> +            scratchForMask = scratchRegister();
> +
> +        move(TrustedImm32(1), scratchForMask);
> +        lshift64(bitToClear, scratchForMask);
> +        clearBits64WithMask(scratchForMask, dest);

This is an interesting question for Phil.

I wonder if it's worth sucking up the lshift as part of this instruction, instead of emitting it directly in Air IR, and just adding a "clearBits32/64WithMask" in Air

> Source/JavaScriptCore/assembler/testmasm.cpp:920
> +    constexpr unsigned bitsInWord = sizeof(uint32_t) * 8;

nit: 8 => CHAR_BIT

> Source/JavaScriptCore/b3/air/AirOpcode.opcodes:384
> +arm64: ClearBit32 U:G:32, UZD:G:32, U:G:32

the last register here should be "scratch", so, "S:G:32"

> Source/JavaScriptCore/b3/air/AirOpcode.opcodes:387
> +arm64: ClearBit64 U:G:32, UD:G:64, U:G:64

ditto
Comment 4 Yijia Huang 2021-06-25 12:48:55 PDT
Created attachment 432286 [details]
Patch
Comment 5 Yijia Huang 2021-06-25 16:23:26 PDT
Created attachment 432307 [details]
Patch
Comment 6 Yijia Huang 2021-06-26 14:49:27 PDT
Created attachment 432338 [details]
Patch
Comment 7 Yijia Huang 2021-06-26 22:03:23 PDT
Created attachment 432343 [details]
Patch
Comment 8 Yijia Huang 2021-06-26 22:14:33 PDT
Created attachment 432345 [details]
Patch
Comment 9 Yijia Huang 2021-06-27 14:16:31 PDT
Created attachment 432356 [details]
Patch
Comment 10 Yijia Huang 2021-06-28 01:23:11 PDT
Created attachment 432368 [details]
Patch
Comment 11 Yijia Huang 2021-06-28 01:39:01 PDT
Created attachment 432371 [details]
Patch
Comment 12 Filip Pizlo 2021-06-28 17:36:12 PDT
Comment on attachment 432371 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=432371&action=review

Didn't you previously make (n >> a) & b the canonical form for extracting a bitfield?

It's probably worth saying the following though I'm sure you've started to figure it out: Canonical forms are like pseudoinstructions. Just as phases may look for instructions, they may look for patterns of instructions, or instructions with something special about them (like that some operand is a constant). Also, anytime we can express some logic in B3 IR using existing instructions, then rather than adding a new instruction, we pick a canonical form of existing ones. Like BitNot - there is no BitNot opcode. 

When designing canonical forms, we often have many alternative possible ways of expressing this "pseudoinstruction". So which one do we pick? There are two things that happen to instructions or instruction patterns:

1. There's a bunch of code to create them.
2. There's a bunch of code to pattern-match them.

If someone wants to write a phase that as part of a transformation needs to introduce a bitfield extract, then they will want to emit it using canonical form.

If someone wants to pattern-match the bitfield extract, then they will pattern-match the canonical form.

Therefore: we should always pick the canonical form that requires the least amount of code for case (1) and (2), with a preference for less code in (2).

Hence why I think that (n >> a) & b is the natural choice.  Your instruction selection code illustraces this. To check if (n & a) >> b is a bitfield extract, you have to count trailing zeroes, then shift down, then check if it's a mask.  But with (n >> a) & b, you just have to check that a is a mask.

> Source/JavaScriptCore/ChangeLog:36
> +            Turn this: -value - 1 
> +            Into this: value ^ -1

This is great.  I can't believe we didn't have this rule already.

Indeed, the canonical form of the BitNot(@x) operation in B3 IR is BitXor(@x, -1).  So, your patch just reinforces this existing rule.

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:2851
> +            // BFXIL Pattern: d = ((n & mask1) >> lsb) | (d & mask2)

I would have made this be (n >> lsb) & mask1

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:3022
> +            // UBFX Pattern: d = (n & mask) >> lsb

Ditto.
Comment 13 Radar WebKit Bug Importer 2021-06-28 17:36:25 PDT
<rdar://problem/79890466>
Comment 14 Yijia Huang 2021-06-28 19:06:48 PDT
Created attachment 432447 [details]
Patch
Comment 15 Filip Pizlo 2021-06-28 20:21:31 PDT
Comment on attachment 432447 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=432447&action=review

Nice work.

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:2916
> +                auto isValidMask2 = [&] () -> bool {

I like your style of lambda usage. It makes the code easy to read since you are naming this block of computation.
Comment 16 EWS 2021-06-28 21:05:21 PDT
Committed r279362 (239228@main): <https://commits.webkit.org/239228@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 432447 [details].