RESOLVED FIXED 180783
Octane/richards regressed by a whopping 20% because eliminateCommonSubexpressions has a weird fixpoint requirement
https://bugs.webkit.org/show_bug.cgi?id=180783
Summary Octane/richards regressed by a whopping 20% because eliminateCommonSubexpress...
Filip Pizlo
Reported 2017-12-13 16:34:04 PST
We accidentally introduced an extra load from arguments. That load revealed the fact that our B3 CSE cannot handle redundant dependent chains: BB#1: a: Load(@x) BB#2: c: Load(@x) d: Load(@c) BB#3: e: Load(@c) Then CSE will build a map at the end of BB#2 that says that loads from @c are serviced by @d. Then CSE will do this: BB#1: a: Load(@x) BB#2: c: Identity(@a) d: Load(@a) BB#3: e: Load(@a) But then, when we try to eliminate @e, we search for what loads service @a. BB#2 tells us about @c, not @a. The solution is either to fixpoint CSE or to perform substitution on the matches map.
Attachments
the patch (16.52 KB, patch)
2017-12-13 16:50 PST, Filip Pizlo
no flags
the patch (16.69 KB, patch)
2017-12-13 16:50 PST, Filip Pizlo
saam: review+
Filip Pizlo
Comment 1 2017-12-13 16:34:47 PST
@Saam: I think that this is try-catch fallout. We introduced some extra loads of arguments for try-catch entrypoints hacks. CSE successfully eliminates those, but that causes it to not eliminate any loads from things loaded from arguments.
Filip Pizlo
Comment 2 2017-12-13 16:44:18 PST
(In reply to Filip Pizlo from comment #0) > We accidentally introduced an extra load from arguments. That load revealed > the fact that our B3 CSE cannot handle redundant dependent chains: > > BB#1: > a: Load(@x) > BB#2: > c: Load(@x) > d: Load(@c) > BB#3: > e: Load(@c) > > Then CSE will build a map at the end of BB#2 that says that loads from @c > are serviced by @d. > > Then CSE will do this: > > BB#1: > a: Load(@x) > BB#2: > c: Identity(@a) > d: Load(@a) > BB#3: > e: Load(@a) > > But then, when we try to eliminate @e, we search for what loads service @a. > BB#2 tells us about @c, not @a. > > The solution is either to fixpoint CSE or to perform substitution on the > matches map. It's even more subtle! We fix that value map after processing a block. The real problem is that a load may be redundant with a load that is later in RPO because of loops. That load's map would not have been updated. I think that the best thing to do is to fixpoint CSE.
Saam Barati
Comment 3 2017-12-13 16:49:29 PST
(In reply to Filip Pizlo from comment #1) > @Saam: I think that this is try-catch fallout. We introduced some extra > loads of arguments for try-catch entrypoints hacks. CSE successfully > eliminates those, but that causes it to not eliminate any loads from things > loaded from arguments. Interesting. Do you know where I emitted more loads? I thought I just refactored things to do argument type speculations on the call entrypoint edge of the EntrySwitch.
Saam Barati
Comment 4 2017-12-13 16:50:08 PST
What are the control flow edges in your example?
Filip Pizlo
Comment 5 2017-12-13 16:50:09 PST
Created attachment 329288 [details] the patch
Filip Pizlo
Comment 6 2017-12-13 16:50:58 PST
Created attachment 329289 [details] the patch
Filip Pizlo
Comment 7 2017-12-13 16:55:35 PST
(In reply to Saam Barati from comment #3) > (In reply to Filip Pizlo from comment #1) > > @Saam: I think that this is try-catch fallout. We introduced some extra > > loads of arguments for try-catch entrypoints hacks. CSE successfully > > eliminates those, but that causes it to not eliminate any loads from things > > loaded from arguments. > > Interesting. Do you know where I emitted more loads? I thought I just > refactored things to do argument type speculations on the call entrypoint > edge of the EntrySwitch. I'm not sure you actually emitted new ones. I think that the first load is the one in lower() right after setting up multiple entrypoints. It may be that by moving the code, it became less obvious to eliminate. Not sure about the mechanism by which this happened...
Saam Barati
Comment 8 2017-12-13 16:57:15 PST
Comment on attachment 329289 [details] the patch View in context: https://bugs.webkit.org/attachment.cgi?id=329289&action=review r=me > Source/JavaScriptCore/b3/B3EliminateCommonSubexpressions.cpp:419 > + if (B3EliminateCommonSubexpressionsInternal::verbose) so much verbose 😥 > Source/JavaScriptCore/b3/B3Generate.cpp:89 > + if (eliminateCommonSubexpressions(procedure)) > + eliminateCommonSubexpressions(procedure); Why only twice? > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:74 > +static const bool verbose = true; oops > Source/JavaScriptCore/dfg/DFGSSAConversionPhase.cpp:86 > + if (m_graph.m_numberOfEntrypoints > 1) { The reason I didn't do this before is I thought it'd add nice test coverage for EntrySwitch. However, it gets eliminated almost immediately inside B3. So I agree it's probably better to just do it here because it'll make the compiler faster by processing fewer BBs.
Filip Pizlo
Comment 9 2017-12-13 17:00:29 PST
(In reply to Filip Pizlo from comment #7) > (In reply to Saam Barati from comment #3) > > (In reply to Filip Pizlo from comment #1) > > > @Saam: I think that this is try-catch fallout. We introduced some extra > > > loads of arguments for try-catch entrypoints hacks. CSE successfully > > > eliminates those, but that causes it to not eliminate any loads from things > > > loaded from arguments. > > > > Interesting. Do you know where I emitted more loads? I thought I just > > refactored things to do argument type speculations on the call entrypoint > > edge of the EntrySwitch. > > I'm not sure you actually emitted new ones. I think that the first load is > the one in lower() right after setting up multiple entrypoints. It may be > that by moving the code, it became less obvious to eliminate. Not sure > about the mechanism by which this happened... Actually, no longer sure that's what did it. It may be because we're now speculating cell on that argument, causing the load of the argument in arguments checking to not be dead. I'm going to stop doing the archeology now. I'm getting too confused.
Filip Pizlo
Comment 10 2017-12-13 17:02:13 PST
(In reply to Saam Barati from comment #8) > Comment on attachment 329289 [details] > the patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=329289&action=review > > r=me > > > Source/JavaScriptCore/b3/B3EliminateCommonSubexpressions.cpp:419 > > + if (B3EliminateCommonSubexpressionsInternal::verbose) > > so much verbose 😥 > > > Source/JavaScriptCore/b3/B3Generate.cpp:89 > > + if (eliminateCommonSubexpressions(procedure)) > > + eliminateCommonSubexpressions(procedure); > > Why only twice? I don't like unbounded fixpointing of optimizations, because *usually* you don't need to fixpoint at all, and if you need to rerun a second time, then the likelihood of a third time being profitable is epsilon. > > > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:74 > > +static const bool verbose = true; > > oops Fixed. > > > Source/JavaScriptCore/dfg/DFGSSAConversionPhase.cpp:86 > > + if (m_graph.m_numberOfEntrypoints > 1) { > > The reason I didn't do this before is I thought it'd add nice test coverage > for EntrySwitch. However, it gets eliminated almost immediately inside B3. > So I agree it's probably better to just do it here because it'll make the > compiler faster by processing fewer BBs. Yeah, and we've already enjoyed that test coverage before. This is mostly useful because it makes diffing IR easier.
Filip Pizlo
Comment 11 2017-12-13 22:05:11 PST
Radar WebKit Bug Importer
Comment 12 2017-12-13 22:06:44 PST
Note You need to log in before you can comment on or make changes to this bug.