Bug 125430 - CSE should work in SSA
Summary: CSE should work in SSA
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords:
Depends on:
Blocks: 112840 125429
  Show dependency treegraph
 
Reported: 2013-12-08 14:35 PST by Filip Pizlo
Modified: 2013-12-10 09:54 PST (History)
8 users (show)

See Also:


Attachments
the patch (3.20 KB, patch)
2013-12-08 15:37 PST, Filip Pizlo
oliver: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2013-12-08 14:35:02 PST
Patch forthcoming.
Comment 1 Filip Pizlo 2013-12-08 14:35:12 PST
This is actually non-trivial.
Comment 2 Filip Pizlo 2013-12-08 15:37:44 PST
Created attachment 218717 [details]
the patch
Comment 3 Nadav Rotem 2013-12-08 19:00:39 PST
>> m_graph.getBlocksInDepthFirstOrder(depthFirst);

Why are you scanning the code in DFS?  Is this a preparation for the multi-block CSE support? If it is then I have some code for that.  Also, I think that I need some help with my CSE patch. It boosts one of the sun spider benchmarks by 30%, but I had a crash I could not debug when I was refactoring the code to use the DoesReadOrWrite API.
Comment 4 Filip Pizlo 2013-12-08 19:06:31 PST
(In reply to comment #3)
> >> m_graph.getBlocksInDepthFirstOrder(depthFirst);
> 
> Why are you scanning the code in DFS?

It guarantees that if A dominates B then I visit A before B.  Essentially, CSE is doing a RAUW even though DFG doesn't have pointers from a node to its users.  So even though I don't find global redundancies, I may have a basic block with:

x: ValueAdd(@a, @b)
y: ValueAdd(@a, @b)

And @y gets used in some basic blocks dominated by this block.  CSE is responsible for finding all of the uses of @y and replacing them with @x.  That's why we do DFS order.

> Is this a preparation for the multi-block CSE support? If it is then I have some code for that.  Also, I think that I need some help with my CSE patch. It boosts one of the sun spider benchmarks by 30%, but I had a crash I could not debug when I was refactoring the code to use the DoesReadOrWrite API.

OK - we should chat this week!
Comment 5 Nadav Rotem 2013-12-08 20:18:51 PST
(In reply to comment #4)
> (In reply to comment #3)
> > >> m_graph.getBlocksInDepthFirstOrder(depthFirst);
> > 
> > Why are you scanning the code in DFS?
> 
> It guarantees that if A dominates B then I visit A before B.  Essentially, CSE is doing a RAUW even though DFG doesn't have pointers from a node to its users.  So even though I don't find global redundancies, I may have a basic block with:
> 
> x: ValueAdd(@a, @b)
> y: ValueAdd(@a, @b)
> 
> And @y gets used in some basic blocks dominated by this block.  CSE is responsible for finding all of the uses of @y and replacing them with @x.  That's why we do DFS order.
> 

I know how dominator based value numbering works :) I was wondering why you are implementing it this way. My understanding is that this patch will not change anything since the code still does intra block cse. You will still need to support the non SSA dfg phase with the same code base, so I am not sure how you want to implement the multi block cse. The block scan order is the least of my worries. 

> > Is this a preparation for the multi-block CSE support? If it is then I have some code for that.  Also, I think that I need some help with my CSE patch. It boosts one of the sun spider benchmarks by 30%, but I had a crash I could not debug when I was refactoring the code to use the DoesReadOrWrite API.
> 
> OK - we should chat this week!
Comment 6 Filip Pizlo 2013-12-08 20:38:13 PST
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #3)
> > > >> m_graph.getBlocksInDepthFirstOrder(depthFirst);
> > > 
> > > Why are you scanning the code in DFS?
> > 
> > It guarantees that if A dominates B then I visit A before B.  Essentially, CSE is doing a RAUW even though DFG doesn't have pointers from a node to its users.  So even though I don't find global redundancies, I may have a basic block with:
> > 
> > x: ValueAdd(@a, @b)
> > y: ValueAdd(@a, @b)
> > 
> > And @y gets used in some basic blocks dominated by this block.  CSE is responsible for finding all of the uses of @y and replacing them with @x.  That's why we do DFS order.
> > 
> 
> I know how dominator based value numbering works :) I was wondering why you are implementing it this way.

That's the thing - this isn't meant to be dominator-based value numbering.  This is still doing block-local CSE.  The only thing global about it is propagating node->misc.replacement, which is like a bulk version of RAUW.

I think we're in violent agreement over the fact that this isn't dominator based value numbering.  It's not meant to be.  This patch is just a bugfix to make it legal to run the current CSE phase in SSA mode.  Previously doing so would either assert or corrupt the IR.

> My understanding is that this patch will not change anything since the code still does intra block cse. 

Correct, for now.  This work:

https://bugs.webkit.org/show_bug.cgi?id=125253

will eventually benefit from it, since it's doing a lowering that happens after SSAification.

The long-term idea is to do some DFG->DFG lowerings only along the FTL path.  Then it makes sense to do some extra CSE along the FTL path, as well.

> You will still need to support the non SSA dfg phase with the same code base, so I am not sure how you want to implement the multi block cse. The block scan order is the least of my worries. 

I'm not implementing multi-block CSE.  This patch isn't intended to make multi-block CSE easier to implement.  I'm just making it so that it is possible to run the existing CSE phase when in SSA form.

Prior to this patch, if you needed to do *any* CSE - even the stupid block-local one - over SSA, you'd crash because the block-local CSE would either assert or corrupt the IR.  I think that this patch is mainly interesting from the standpoint of just making all tests pass when CSE is run over SSA.

Ultimately, I want all optimization phases to be runnable in SSA and to have some way of testing that they do, so that we have the option of doing SSAification prior to the optimization fixpoint in FTL mode.  I suspect that if nothing else, this ought to reduce compile times.  But it should also make the CFA and the folder more precise.

On the topic of multi-block CSE, I'm not sure how much the DFG CSE is going to end up buying us, in FTL mode.  It certainly finds *some* things that LLVM's GVN won't find (in case of nodes that call C functions that do write some memory but don't clobber the node being eliminated), but for *most* things, LLVM GVN+TBAA is likely to be strictly better.  Maybe then the DFG CSE would serve a similar purpose to LLVM's EarlyCSE, which as I understand it, has more to do with compile times than anything else?

> 
> > > Is this a preparation for the multi-block CSE support? If it is then I have some code for that.  Also, I think that I need some help with my CSE patch. It boosts one of the sun spider benchmarks by 30%, but I had a crash I could not debug when I was refactoring the code to use the DoesReadOrWrite API.
> > 
> > OK - we should chat this week!
Comment 7 Filip Pizlo 2013-12-10 09:54:45 PST
Landed in http://trac.webkit.org/changeset/160328