Bug 180783

Summary: Octane/richards regressed by a whopping 20% because eliminateCommonSubexpressions has a weird fixpoint requirement
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED    
Severity: Normal CC: ews-watchlist, keith_miller, mark.lam, msaboff, saam, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Attachments:
Description Flags
the patch
none
the patch saam: review+

Description Filip Pizlo 2017-12-13 16:34:04 PST
We accidentally introduced an extra load from arguments.  That load revealed the fact that our B3 CSE cannot handle redundant dependent chains:

BB#1:
a: Load(@x)
BB#2:
c: Load(@x)
d: Load(@c)
BB#3:
e: Load(@c)

Then CSE will build a map at the end of BB#2 that says that loads from @c are serviced by @d.

Then CSE will do this:

BB#1:
a: Load(@x)
BB#2:
c: Identity(@a)
d: Load(@a)
BB#3:
e: Load(@a)

But then, when we try to eliminate @e, we search for what loads service @a.  BB#2 tells us about @c, not @a.

The solution is either to fixpoint CSE or to perform substitution on the matches map.
Comment 1 Filip Pizlo 2017-12-13 16:34:47 PST
@Saam: I think that this is try-catch fallout.  We introduced some extra loads of arguments for try-catch entrypoints hacks.  CSE successfully eliminates those, but that causes it to not eliminate any loads from things loaded from arguments.
Comment 2 Filip Pizlo 2017-12-13 16:44:18 PST
(In reply to Filip Pizlo from comment #0)
> We accidentally introduced an extra load from arguments.  That load revealed
> the fact that our B3 CSE cannot handle redundant dependent chains:
> 
> BB#1:
> a: Load(@x)
> BB#2:
> c: Load(@x)
> d: Load(@c)
> BB#3:
> e: Load(@c)
> 
> Then CSE will build a map at the end of BB#2 that says that loads from @c
> are serviced by @d.
> 
> Then CSE will do this:
> 
> BB#1:
> a: Load(@x)
> BB#2:
> c: Identity(@a)
> d: Load(@a)
> BB#3:
> e: Load(@a)
> 
> But then, when we try to eliminate @e, we search for what loads service @a. 
> BB#2 tells us about @c, not @a.
> 
> The solution is either to fixpoint CSE or to perform substitution on the
> matches map.

It's even more subtle!

We fix that value map after processing a block.  The real problem is that a load may be redundant with a load that is later in RPO because of loops.  That load's map would not have been updated.

I think that the best thing to do is to fixpoint CSE.
Comment 3 Saam Barati 2017-12-13 16:49:29 PST
(In reply to Filip Pizlo from comment #1)
> @Saam: I think that this is try-catch fallout.  We introduced some extra
> loads of arguments for try-catch entrypoints hacks.  CSE successfully
> eliminates those, but that causes it to not eliminate any loads from things
> loaded from arguments.

Interesting. Do you know where I emitted more loads? I thought I just refactored things to do argument type speculations on the call entrypoint edge of the EntrySwitch.
Comment 4 Saam Barati 2017-12-13 16:50:08 PST
What are the control flow edges in your example?
Comment 5 Filip Pizlo 2017-12-13 16:50:09 PST
Created attachment 329288 [details]
the patch
Comment 6 Filip Pizlo 2017-12-13 16:50:58 PST
Created attachment 329289 [details]
the patch
Comment 7 Filip Pizlo 2017-12-13 16:55:35 PST
(In reply to Saam Barati from comment #3)
> (In reply to Filip Pizlo from comment #1)
> > @Saam: I think that this is try-catch fallout.  We introduced some extra
> > loads of arguments for try-catch entrypoints hacks.  CSE successfully
> > eliminates those, but that causes it to not eliminate any loads from things
> > loaded from arguments.
> 
> Interesting. Do you know where I emitted more loads? I thought I just
> refactored things to do argument type speculations on the call entrypoint
> edge of the EntrySwitch.

I'm not sure you actually emitted new ones.  I think that the first load is the one in lower() right after setting up multiple entrypoints.  It may be that by moving the code, it became less obvious to eliminate.  Not sure about the mechanism by which this happened...
Comment 8 Saam Barati 2017-12-13 16:57:15 PST
Comment on attachment 329289 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=329289&action=review

r=me

> Source/JavaScriptCore/b3/B3EliminateCommonSubexpressions.cpp:419
> +                    if (B3EliminateCommonSubexpressionsInternal::verbose)

so much verbose 😥

> Source/JavaScriptCore/b3/B3Generate.cpp:89
> +        if (eliminateCommonSubexpressions(procedure))
> +            eliminateCommonSubexpressions(procedure);

Why only twice?

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:74
> +static const bool verbose = true;

oops

> Source/JavaScriptCore/dfg/DFGSSAConversionPhase.cpp:86
> +        if (m_graph.m_numberOfEntrypoints > 1) {

The reason I didn't do this before is I thought it'd add nice test coverage for EntrySwitch. However, it gets eliminated almost immediately inside B3. So I agree it's probably better to just do it here because it'll make the compiler faster by processing fewer BBs.
Comment 9 Filip Pizlo 2017-12-13 17:00:29 PST
(In reply to Filip Pizlo from comment #7)
> (In reply to Saam Barati from comment #3)
> > (In reply to Filip Pizlo from comment #1)
> > > @Saam: I think that this is try-catch fallout.  We introduced some extra
> > > loads of arguments for try-catch entrypoints hacks.  CSE successfully
> > > eliminates those, but that causes it to not eliminate any loads from things
> > > loaded from arguments.
> > 
> > Interesting. Do you know where I emitted more loads? I thought I just
> > refactored things to do argument type speculations on the call entrypoint
> > edge of the EntrySwitch.
> 
> I'm not sure you actually emitted new ones.  I think that the first load is
> the one in lower() right after setting up multiple entrypoints.  It may be
> that by moving the code, it became less obvious to eliminate.  Not sure
> about the mechanism by which this happened...

Actually, no longer sure that's what did it.  It may be because we're now speculating cell on that argument, causing the load of the argument in arguments checking to not be dead.

I'm going to stop doing the archeology now.  I'm getting too confused.
Comment 10 Filip Pizlo 2017-12-13 17:02:13 PST
(In reply to Saam Barati from comment #8)
> Comment on attachment 329289 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=329289&action=review
> 
> r=me
> 
> > Source/JavaScriptCore/b3/B3EliminateCommonSubexpressions.cpp:419
> > +                    if (B3EliminateCommonSubexpressionsInternal::verbose)
> 
> so much verbose 😥
> 
> > Source/JavaScriptCore/b3/B3Generate.cpp:89
> > +        if (eliminateCommonSubexpressions(procedure))
> > +            eliminateCommonSubexpressions(procedure);
> 
> Why only twice?

I don't like unbounded fixpointing of optimizations, because *usually* you don't need to fixpoint at all, and if you need to rerun a second time, then the likelihood of a third time being profitable is epsilon.

> 
> > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:74
> > +static const bool verbose = true;
> 
> oops

Fixed.

> 
> > Source/JavaScriptCore/dfg/DFGSSAConversionPhase.cpp:86
> > +        if (m_graph.m_numberOfEntrypoints > 1) {
> 
> The reason I didn't do this before is I thought it'd add nice test coverage
> for EntrySwitch. However, it gets eliminated almost immediately inside B3.
> So I agree it's probably better to just do it here because it'll make the
> compiler faster by processing fewer BBs.

Yeah, and we've already enjoyed that test coverage before.  This is mostly useful because it makes diffing IR easier.
Comment 11 Filip Pizlo 2017-12-13 22:05:11 PST
Landed in https://trac.webkit.org/changeset/225893/webkit
Comment 12 Radar WebKit Bug Importer 2017-12-13 22:06:44 PST
<rdar://problem/36040919>