Bug 171955 - PinnedRegisters should be better modeled in IRC/Briggs
Summary: PinnedRegisters should be better modeled in IRC/Briggs
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Saam Barati
URL:
Keywords:
Depends on:
Blocks: 171826
  Show dependency treegraph
 
Reported: 2017-05-10 17:04 PDT by Saam Barati
Modified: 2017-05-17 12:25 PDT (History)
13 users (show)

See Also:


Attachments
WIP (16.66 KB, patch)
2017-05-10 19:14 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
WIP (25.10 KB, patch)
2017-05-12 18:59 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
WIP for alternate modeling (19.55 KB, patch)
2017-05-12 19:25 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
patch (25.07 KB, patch)
2017-05-14 21:03 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
patch (25.05 KB, patch)
2017-05-14 21:04 PDT, Saam Barati
fpizlo: review+
Details | Formatted Diff | Diff
patch for landing (24.81 KB, patch)
2017-05-16 17:53 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews105 for mac-elcapitan-wk2 (1.31 MB, application/zip)
2017-05-17 00:13 PDT, Build Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Saam Barati 2017-05-10 17:04:39 PDT
Say a patchpoint clobbers every register but regT0 (for the sake of argument), and it also says it clobbers a pinned register.
Note that the registerCount() won't include the pinned register. However, the Tmp that the patch defines will have a degree count
of registerCount(), instead of registerCount() - 1. This is because we model clobbering of pinned registers in the normal
interference graph. Instead, we should not do this. We should model it like we model interference with the FP register.

The interference graph should only model interference edges between Tmps and Tmps, and Tmps and assignable registers.
Comment 1 Saam Barati 2017-05-10 19:14:43 PDT
Created attachment 309676 [details]
WIP

I still have some concerns on us deciding to coalesce pinned regs. E.g:

```
Move pinnedGPR, temp1
Patch # writePinned
Move pinnedGPR, temp2
use temp1
use temp2
```

Can we transform this into:
```
Patch
use pinnedGPR
use pinnedGPR
```
Comment 2 Filip Pizlo 2017-05-11 15:18:29 PDT
Comment on attachment 309676 [details]
WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=309676&action=review

I think this looks OK overall.  It would be good to use something other than a vector for tracking the set of pinned regs, but then again, it probably doesn't matter very much.  You should change the names of those functiuons in Code/Procedure, though - the correct term is "mutable GPRs/FPRs".

> Source/JavaScriptCore/b3/B3Procedure.h:262
> +    // This is just used for tests, and is not meant to be efficient.
> +    // We could make it more efficient later if we wanted to.
> +    JS_EXPORT_PRIVATE RegisterSet gprBank();
> +    JS_EXPORT_PRIVATE RegisterSet fprBank();

"Bank" is the name of an enumeration that has GP and FP as its members.  That enum is relatively new, but that's roughly how we've always thought of the word.

So, I think you should pick a different name for these functions.

> Source/JavaScriptCore/b3/air/AirAllocateRegistersByGraphColoring.cpp:115
> +        for (const IndexType pinnedReg : m_pinnedRegs) {
> +            if (pinnedReg == tmpIndex)
> +                return true;
> +        }

Why not make this a bitvector lookup?

> Source/JavaScriptCore/b3/air/AirCode.cpp:111
> +RegisterSet Code::gprBank()
> +{
> +    RegisterSet result;
> +    for (Reg reg : m_gpRegsInPriorityOrder)
> +        result.set(reg);
> +    return result;
> +}
> +
> +RegisterSet Code::fprBank()
> +{
> +    RegisterSet result;
> +    for (Reg reg : m_fpRegsInPriorityOrder)
> +        result.set(reg);
> +    return result;
>  }

Why aren't you just using Code::mutableRegs()?

If you want the GP subset, you filter it with RegisterSet::allGPRs().  If you want the FP subset, you filter it with RegisterSet::allFPRs().

I think that means that if you want to provide helpers to do this for you, they should be called mutableGPRs() and mutableFPRs(), not gprBank/fprBank.
Comment 3 Filip Pizlo 2017-05-11 16:03:30 PDT
(In reply to Saam Barati from comment #1)
> Created attachment 309676 [details]
> WIP
> 
> I still have some concerns on us deciding to coalesce pinned regs. E.g:
> 
> ```
> Move pinnedGPR, temp1
> Patch # writePinned
> Move pinnedGPR, temp2
> use temp1
> use temp2
> ```
> 
> Can we transform this into:
> ```
> Patch
> use pinnedGPR
> use pinnedGPR
> ```

If Patch writesPinned then (unless the user is doing shenanigans) the Patch will also clobber the pinnedGPR.  Therefore, temp1 will not get coalesced with pinnedGPR, but temp2 could be.
Comment 4 Saam Barati 2017-05-12 18:48:33 PDT
(In reply to Saam Barati from comment #1)
> Created attachment 309676 [details]
> WIP
> 
> I still have some concerns on us deciding to coalesce pinned regs. E.g:
> 
> ```
> Move pinnedGPR, temp1
> Patch # writePinned
> Move pinnedGPR, temp2
> use temp1
> use temp2
> ```
Actually, my analysis here is wrong. IRC can not coalesce these regardless. Let's forget about pinned registers here, and just consider this program:
```
    Move %tmp0, %tmp1
    Move %tmp0, %tmp2
    Add32 %tmp1, %tmp2, %r0
    Ret32 %r0
```

Interestingly, IRC can not coalesce both moves. To understand why, note that tmp1 is live in to the second move. This means that tmp2 and tmp1 can't be assigned to the same register given how we build the interference graph. Clearly this shows a weakness in IRC. For example, it'll make this transformation:
```
    Move %tmp0, %tmp1
    Add32 %tmp1, %tmp0, %r0
    Ret32 %r0
```

But obviously this can even be further reduced to be this if we ran IRC over again:
```
    Add32 %tmp0, %tmp0, %r0
    Ret32 %r0
```


> 
> Can we transform this into:
> ```
> Patch
> use pinnedGPR
> use pinnedGPR
> ```
Comment 5 Saam Barati 2017-05-12 18:59:42 PDT
Created attachment 309994 [details]
WIP

This patch generalizes the m_interferesWithFramePointer concept to include all pinned registers. That said, I'm starting to thing that this approach is hacky. I think this whole notion could be modeled much more naturally:

We start by keeping pinned registers accounted for in registerCount(). Then, we just treat all pinned registers as always being alive. Then, we just model this as IRC normally would. I think this will just work for coalescing, etc, and be more consistent w/ the framework of IRC. I need to think about it a bit more to make sure I'm right. I'm gonna test it out now.
Comment 6 Saam Barati 2017-05-12 19:25:44 PDT
Created attachment 309995 [details]
WIP for alternate modeling

The only difference worth reading is in AirAllocateRegistersByGraphColoring
Comment 7 Saam Barati 2017-05-14 21:03:28 PDT
Created attachment 310106 [details]
patch
Comment 8 Build Bot 2017-05-14 21:04:39 PDT
Attachment 310106 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/air/testair.cpp:36:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 1 in 10 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 Saam Barati 2017-05-14 21:04:51 PDT
Created attachment 310107 [details]
patch
Comment 10 Build Bot 2017-05-14 21:07:38 PDT
Attachment 310107 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/air/testair.cpp:36:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 1 in 10 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 11 Mark Lam 2017-05-15 10:24:10 PDT
Comment on attachment 310107 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=310107&action=review

> Source/JavaScriptCore/b3/B3Procedure.h:260
> +    // This is just used for tests, and is not meant to be efficient.
> +    // We could make it more efficient later if we wanted to.

Drive by comment: I suggest making this comment plural since you're referring to both mutableGPRs and mutableFPRs:

// These are just used for tests, and are not meant to be efficient.
// We could make them more efficient later if we wanted to.

Using singular form implies the comment only applies to the first one.

> Source/JavaScriptCore/b3/air/AirCode.h:313
> +    // This is just used for tests, and is not meant to be efficient.
> +    // We could make it more efficient later if we wanted to.

Ditto.
Comment 12 Filip Pizlo 2017-05-16 12:10:13 PDT
Comment on attachment 310107 [details]
patch

This is rad.  I like how simple this turned out.
Comment 13 Saam Barati 2017-05-16 17:47:49 PDT
Comment on attachment 310107 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=310107&action=review

>> Source/JavaScriptCore/b3/B3Procedure.h:260
>> +    // We could make it more efficient later if we wanted to.
> 
> Drive by comment: I suggest making this comment plural since you're referring to both mutableGPRs and mutableFPRs:
> 
> // These are just used for tests, and are not meant to be efficient.
> // We could make them more efficient later if we wanted to.
> 
> Using singular form implies the comment only applies to the first one.

Ima just remove this comment, it preexists the code as I wrote it. I had it because a previous version of this function was worse. This function does pretty standard RegisterSet manipulations.
Comment 14 Saam Barati 2017-05-16 17:53:18 PDT
Created attachment 310334 [details]
patch for landing
Comment 15 Build Bot 2017-05-16 21:00:07 PDT
Attachment 310334 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/air/testair.cpp:36:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 1 in 10 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 16 Build Bot 2017-05-17 00:13:33 PDT
Comment on attachment 310334 [details]
patch for landing

Attachment 310334 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/3758268

New failing tests:
http/tests/loading/resourceLoadStatistics/non-prevalent-resource-without-user-interaction.html
Comment 17 Build Bot 2017-05-17 00:13:34 PDT
Created attachment 310355 [details]
Archive of layout-test-results from ews105 for mac-elcapitan-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews105  Port: mac-elcapitan-wk2  Platform: Mac OS X 10.11.6
Comment 18 Saam Barati 2017-05-17 11:56:45 PDT
Comment on attachment 310334 [details]
patch for landing

Failures look unrelated.
Comment 19 WebKit Commit Bot 2017-05-17 12:25:21 PDT
Comment on attachment 310334 [details]
patch for landing

Clearing flags on attachment: 310334

Committed r216989: <http://trac.webkit.org/changeset/216989>
Comment 20 WebKit Commit Bot 2017-05-17 12:25:23 PDT
All reviewed patches have been landed.  Closing bug.