Bug 170161 - Air should support linear scan for optLevel<2
Summary: Air should support linear scan for optLevel<2
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-27 20:11 PDT by Filip Pizlo
Modified: 2017-03-31 16:28 PDT (History)
2 users (show)

See Also:


Attachments
it begins (73.64 KB, patch)
2017-03-27 20:12 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
so close (84.08 KB, patch)
2017-03-27 21:20 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the latest (89.47 KB, patch)
2017-03-27 22:29 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
seems like it might work (96.29 KB, patch)
2017-03-28 11:16 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
it linear scanned marsaglia and didn't crash (109.94 KB, patch)
2017-03-28 16:19 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
seems like it's starting to work (109.71 KB, patch)
2017-03-28 18:34 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
passes a few tests here and there (109.68 KB, patch)
2017-03-28 18:53 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
passing more tests! (110.13 KB, patch)
2017-03-28 20:04 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
more things (112.30 KB, patch)
2017-03-29 14:16 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
fixed another bug (112.35 KB, patch)
2017-03-29 17:26 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
changed the approach to clobbers (111.71 KB, patch)
2017-03-29 20:21 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
more (111.82 KB, patch)
2017-03-29 21:27 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
not failing so many tests anymore (111.82 KB, patch)
2017-03-29 21:31 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
another big fix (112.09 KB, patch)
2017-03-29 21:52 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (132.18 KB, patch)
2017-03-30 13:15 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (132.59 KB, patch)
2017-03-30 13:23 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (132.61 KB, patch)
2017-03-30 13:50 PDT, Filip Pizlo
saam: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2017-03-27 20:11:58 PDT
Patch forthcoming.
Comment 1 Filip Pizlo 2017-03-27 20:12:28 PDT
Created attachment 305548 [details]
it begins
Comment 2 Filip Pizlo 2017-03-27 21:20:04 PDT
Created attachment 305558 [details]
so close
Comment 3 Filip Pizlo 2017-03-27 22:29:50 PDT
Created attachment 305561 [details]
the latest

I realized that I stumbled upon a super gnarly situation where linear scan has a bad time with Air's liberal use of precoloring.

I think I found a megahack to work around this.  I wrote it down in a comment.  I'll get back to this tomorrow.
Comment 4 Filip Pizlo 2017-03-28 11:16:04 PDT
Created attachment 305612 [details]
seems like it might work

Haven't tested it yet.
Comment 5 Filip Pizlo 2017-03-28 16:19:59 PDT
Created attachment 305662 [details]
it linear scanned marsaglia and didn't crash
Comment 6 Filip Pizlo 2017-03-28 18:34:28 PDT
Created attachment 305690 [details]
seems like it's starting to work

It linear scanned imaging-gaussian-blur
Comment 7 Filip Pizlo 2017-03-28 18:53:11 PDT
Created attachment 305694 [details]
passes a few tests here and there
Comment 8 Filip Pizlo 2017-03-28 20:04:30 PDT
Created attachment 305697 [details]
passing more tests!
Comment 9 Filip Pizlo 2017-03-29 14:16:44 PDT
Created attachment 305787 [details]
more things

Added a bunch of debug support, including a way to force linear scan to first spill all the things before proceeding further.  This tests that linear scan's spiller is sound.

So far it looks like the spiller is sound but the allocator itself is borked.
Comment 10 Filip Pizlo 2017-03-29 17:26:02 PDT
Created attachment 305814 [details]
fixed another bug

It's now passing a bunch of tests that it was previously failing.  Rerunning all tests now.
Comment 11 Filip Pizlo 2017-03-29 20:21:25 PDT
Created attachment 305825 [details]
changed the approach to clobbers

Switching to using one of the approaches suggested in the paper.
Comment 12 Filip Pizlo 2017-03-29 21:27:03 PDT
Created attachment 305829 [details]
more
Comment 13 Filip Pizlo 2017-03-29 21:31:22 PDT
Created attachment 305830 [details]
not failing so many tests anymore

But still failing many tests
Comment 14 Filip Pizlo 2017-03-29 21:52:43 PDT
Created attachment 305832 [details]
another big fix
Comment 15 Filip Pizlo 2017-03-29 22:22:57 PDT
(In reply to Filip Pizlo from comment #14)
> Created attachment 305832 [details]
> another big fix

With this it passes all JSC tests but I'm still able to get it to crash through fuzz tests.
Comment 16 Filip Pizlo 2017-03-30 13:15:29 PDT
Created attachment 305885 [details]
the patch
Comment 17 Filip Pizlo 2017-03-30 13:23:04 PDT
Created attachment 305890 [details]
the patch
Comment 18 Build Bot 2017-03-30 13:25:22 PDT
Attachment 305890 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:34:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:420:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:420:  Multi line control clauses should use braces.  [whitespace/braces] [4]
ERROR: Source/WTF/wtf/Range.h:88:  Missing spaces around |  [whitespace/operators] [3]
Total errors found: 4 in 65 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 19 Filip Pizlo 2017-03-30 13:50:15 PDT
Created attachment 305896 [details]
the patch
Comment 20 Build Bot 2017-03-30 13:52:17 PDT
Attachment 305896 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:34:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:420:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
ERROR: Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:420:  Multi line control clauses should use braces.  [whitespace/braces] [4]
ERROR: Source/WTF/wtf/Range.h:88:  Missing spaces around |  [whitespace/operators] [3]
Total errors found: 4 in 65 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 21 Saam Barati 2017-03-30 15:37:48 PDT
Comment on attachment 305896 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=305896&action=review

r=me. Please fix the various places you have OOPS

> Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:151
> +            index += block->size() * 2;

shouldn't this be block->size()*2+1 since block->size()*2 is the tail?
Or is the head of one block OK to be the same as the tail?

> Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:407
> +            if (m_active.size() != m_registers.size()) {
> +                bool didAssign = false;
> +                for (Reg reg : m_registers) {
> +                    if (!m_activeRegs.contains(reg) && entry.possibleRegs.contains(reg)) {
> +                        assign(tmp, reg);
> +                        didAssign = true;
> +                        break;
> +                    }
> +                }
> +                if (didAssign)
> +                    continue;
> +            }

We could totally do biased coloring here to try to remove some extra moves. Should be cheap, and benefit quality of code in some programs. Code gen quality isn't a priority for this phase, but if we get a throughput progression without slowing this phase down, we should do it. Might be worth a FIXME. I'm happy to hack it up.

> Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:422
> +                spill(tmp);

What guarantees !entry.isUnspillable here?
Comment 22 Saam Barati 2017-03-30 15:45:08 PDT
Comment on attachment 305896 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=305896&action=review

> Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:61
> +bool verbose() { return Options::airLinearScanVerbose(); }

nit: I'm not sure this really needs to be an option now that the phase works.
Comment 23 Filip Pizlo 2017-03-30 15:46:00 PDT
(In reply to Saam Barati from comment #21)
> Comment on attachment 305896 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=305896&action=review
> 
> r=me. Please fix the various places you have OOPS

Right!  I will!

> 
> > Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:151
> > +            index += block->size() * 2;
> 
> shouldn't this be block->size()*2+1 since block->size()*2 is the tail?
> Or is the head of one block OK to be the same as the tail?

Basic blocks already have "double padding" between them in the sense that we treat the point after the terminal to be distinct from the point before the first instruction in the successor(s). The indexing scheme in this patch means that if you have a block like:

    Foo
    Bar
    Baz

Then it will be indexed as follows:

0
    Foo
1
2
    Bar
3
4
    Baz
5

We don't need the extra +1, since every block is indexed with padding.

> 
> > Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:407
> > +            if (m_active.size() != m_registers.size()) {
> > +                bool didAssign = false;
> > +                for (Reg reg : m_registers) {
> > +                    if (!m_activeRegs.contains(reg) && entry.possibleRegs.contains(reg)) {
> > +                        assign(tmp, reg);
> > +                        didAssign = true;
> > +                        break;
> > +                    }
> > +                }
> > +                if (didAssign)
> > +                    continue;
> > +            }
> 
> We could totally do biased coloring here to try to remove some extra moves.
> Should be cheap, and benefit quality of code in some programs. Code gen
> quality isn't a priority for this phase, but if we get a throughput
> progression without slowing this phase down, we should do it. Might be worth
> a FIXME. I'm happy to hack it up.

I'll add a FIXME.

> 
> > Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:422
> > +                spill(tmp);
> 
> What guarantees !entry.isUnspillable here?

This means that a tmp introduced by the spiller was born at a time when none of the registers that it could legally use are available.  Since spill tmps have minuscule live ranges - literally live either just at the end, or just the beginning, or both, of some instruction - it takes some serious effort to reach this point.

Graph coloring fails on this case also.

It's like if you created a patchpoint that returns SomeRegister but clobbers the entire register file.
Comment 24 Filip Pizlo 2017-03-30 15:46:19 PDT
(In reply to Saam Barati from comment #22)
> Comment on attachment 305896 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=305896&action=review
> 
> > Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:61
> > +bool verbose() { return Options::airLinearScanVerbose(); }
> 
> nit: I'm not sure this really needs to be an option now that the phase works.

You're prolly gonna like it when you add biased coloring.
Comment 25 Filip Pizlo 2017-03-30 15:56:25 PDT
Landed in https://trac.webkit.org/changeset/214636/webkit
Comment 26 Saam Barati 2017-03-30 16:20:57 PDT
(In reply to Filip Pizlo from comment #23)
> (In reply to Saam Barati from comment #21)
> > Comment on attachment 305896 [details]
> > the patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=305896&action=review
> > 
> > r=me. Please fix the various places you have OOPS
> 
> Right!  I will!
> 
> > 
> > > Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:151
> > > +            index += block->size() * 2;
> > 
> > shouldn't this be block->size()*2+1 since block->size()*2 is the tail?
> > Or is the head of one block OK to be the same as the tail?
> 
> Basic blocks already have "double padding" between them in the sense that we
> treat the point after the terminal to be distinct from the point before the
> first instruction in the successor(s). The indexing scheme in this patch
> means that if you have a block like:
> 
>     Foo
>     Bar
>     Baz
> 
> Then it will be indexed as follows:
> 
> 0
>     Foo
> 1
> 2
>     Bar
> 3
> 4
>     Baz
> 5
> 
> We don't need the extra +1, since every block is indexed with padding.
> 
> > 
> > > Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:407
> > > +            if (m_active.size() != m_registers.size()) {
> > > +                bool didAssign = false;
> > > +                for (Reg reg : m_registers) {
> > > +                    if (!m_activeRegs.contains(reg) && entry.possibleRegs.contains(reg)) {
> > > +                        assign(tmp, reg);
> > > +                        didAssign = true;
> > > +                        break;
> > > +                    }
> > > +                }
> > > +                if (didAssign)
> > > +                    continue;
> > > +            }
> > 
> > We could totally do biased coloring here to try to remove some extra moves.
> > Should be cheap, and benefit quality of code in some programs. Code gen
> > quality isn't a priority for this phase, but if we get a throughput
> > progression without slowing this phase down, we should do it. Might be worth
> > a FIXME. I'm happy to hack it up.
> 
> I'll add a FIXME.
> 
> > 
> > > Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp:422
> > > +                spill(tmp);
> > 
> > What guarantees !entry.isUnspillable here?
> 
> This means that a tmp introduced by the spiller was born at a time when none
> of the registers that it could legally use are available.  Since spill tmps
> have minuscule live ranges - literally live either just at the end, or just
> the beginning, or both, of some instruction - it takes some serious effort
> to reach this point.
Right. I know that this is the case. I just realized the function you call into has a RELEASE_ASSERT.

> 
> Graph coloring fails on this case also.
> 
> It's like if you created a patchpoint that returns SomeRegister but clobbers
> the entire register file.
Comment 27 Saam Barati 2017-03-31 16:19:04 PDT
Is there any chance this patch could've regressed ARM64? I'm seeing a 4% JetStream regression on iOS perf bots.
Comment 28 Filip Pizlo 2017-03-31 16:28:55 PDT
(In reply to Saam Barati from comment #27)
> Is there any chance this patch could've regressed ARM64? I'm seeing a 4%
> JetStream regression on iOS perf bots.

Anything is possible but I don't see how this would have done it.