WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 125430
CSE should work in SSA
https://bugs.webkit.org/show_bug.cgi?id=125430
Summary
CSE should work in SSA
Filip Pizlo
Reported
2013-12-08 14:35:02 PST
Patch forthcoming.
Attachments
the patch
(3.20 KB, patch)
2013-12-08 15:37 PST
,
Filip Pizlo
oliver
: review+
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Filip Pizlo
Comment 1
2013-12-08 14:35:12 PST
This is actually non-trivial.
Filip Pizlo
Comment 2
2013-12-08 15:37:44 PST
Created
attachment 218717
[details]
the patch
Nadav Rotem
Comment 3
2013-12-08 19:00:39 PST
>> m_graph.getBlocksInDepthFirstOrder(depthFirst);
Why are you scanning the code in DFS? Is this a preparation for the multi-block CSE support? If it is then I have some code for that. Also, I think that I need some help with my CSE patch. It boosts one of the sun spider benchmarks by 30%, but I had a crash I could not debug when I was refactoring the code to use the DoesReadOrWrite API.
Filip Pizlo
Comment 4
2013-12-08 19:06:31 PST
(In reply to
comment #3
)
> >> m_graph.getBlocksInDepthFirstOrder(depthFirst); > > Why are you scanning the code in DFS?
It guarantees that if A dominates B then I visit A before B. Essentially, CSE is doing a RAUW even though DFG doesn't have pointers from a node to its users. So even though I don't find global redundancies, I may have a basic block with: x: ValueAdd(@a, @b) y: ValueAdd(@a, @b) And @y gets used in some basic blocks dominated by this block. CSE is responsible for finding all of the uses of @y and replacing them with @x. That's why we do DFS order.
> Is this a preparation for the multi-block CSE support? If it is then I have some code for that. Also, I think that I need some help with my CSE patch. It boosts one of the sun spider benchmarks by 30%, but I had a crash I could not debug when I was refactoring the code to use the DoesReadOrWrite API.
OK - we should chat this week!
Nadav Rotem
Comment 5
2013-12-08 20:18:51 PST
(In reply to
comment #4
)
> (In reply to
comment #3
) > > >> m_graph.getBlocksInDepthFirstOrder(depthFirst); > > > > Why are you scanning the code in DFS? > > It guarantees that if A dominates B then I visit A before B. Essentially, CSE is doing a RAUW even though DFG doesn't have pointers from a node to its users. So even though I don't find global redundancies, I may have a basic block with: > > x: ValueAdd(@a, @b) > y: ValueAdd(@a, @b) > > And @y gets used in some basic blocks dominated by this block. CSE is responsible for finding all of the uses of @y and replacing them with @x. That's why we do DFS order. >
I know how dominator based value numbering works :) I was wondering why you are implementing it this way. My understanding is that this patch will not change anything since the code still does intra block cse. You will still need to support the non SSA dfg phase with the same code base, so I am not sure how you want to implement the multi block cse. The block scan order is the least of my worries.
> > Is this a preparation for the multi-block CSE support? If it is then I have some code for that. Also, I think that I need some help with my CSE patch. It boosts one of the sun spider benchmarks by 30%, but I had a crash I could not debug when I was refactoring the code to use the DoesReadOrWrite API. > > OK - we should chat this week!
Filip Pizlo
Comment 6
2013-12-08 20:38:13 PST
(In reply to
comment #5
)
> (In reply to
comment #4
) > > (In reply to
comment #3
) > > > >> m_graph.getBlocksInDepthFirstOrder(depthFirst); > > > > > > Why are you scanning the code in DFS? > > > > It guarantees that if A dominates B then I visit A before B. Essentially, CSE is doing a RAUW even though DFG doesn't have pointers from a node to its users. So even though I don't find global redundancies, I may have a basic block with: > > > > x: ValueAdd(@a, @b) > > y: ValueAdd(@a, @b) > > > > And @y gets used in some basic blocks dominated by this block. CSE is responsible for finding all of the uses of @y and replacing them with @x. That's why we do DFS order. > > > > I know how dominator based value numbering works :) I was wondering why you are implementing it this way.
That's the thing - this isn't meant to be dominator-based value numbering. This is still doing block-local CSE. The only thing global about it is propagating node->misc.replacement, which is like a bulk version of RAUW. I think we're in violent agreement over the fact that this isn't dominator based value numbering. It's not meant to be. This patch is just a bugfix to make it legal to run the current CSE phase in SSA mode. Previously doing so would either assert or corrupt the IR.
> My understanding is that this patch will not change anything since the code still does intra block cse.
Correct, for now. This work:
https://bugs.webkit.org/show_bug.cgi?id=125253
will eventually benefit from it, since it's doing a lowering that happens after SSAification. The long-term idea is to do some DFG->DFG lowerings only along the FTL path. Then it makes sense to do some extra CSE along the FTL path, as well.
> You will still need to support the non SSA dfg phase with the same code base, so I am not sure how you want to implement the multi block cse. The block scan order is the least of my worries.
I'm not implementing multi-block CSE. This patch isn't intended to make multi-block CSE easier to implement. I'm just making it so that it is possible to run the existing CSE phase when in SSA form. Prior to this patch, if you needed to do *any* CSE - even the stupid block-local one - over SSA, you'd crash because the block-local CSE would either assert or corrupt the IR. I think that this patch is mainly interesting from the standpoint of just making all tests pass when CSE is run over SSA. Ultimately, I want all optimization phases to be runnable in SSA and to have some way of testing that they do, so that we have the option of doing SSAification prior to the optimization fixpoint in FTL mode. I suspect that if nothing else, this ought to reduce compile times. But it should also make the CFA and the folder more precise. On the topic of multi-block CSE, I'm not sure how much the DFG CSE is going to end up buying us, in FTL mode. It certainly finds *some* things that LLVM's GVN won't find (in case of nodes that call C functions that do write some memory but don't clobber the node being eliminated), but for *most* things, LLVM GVN+TBAA is likely to be strictly better. Maybe then the DFG CSE would serve a similar purpose to LLVM's EarlyCSE, which as I understand it, has more to do with compile times than anything else?
> > > > Is this a preparation for the multi-block CSE support? If it is then I have some code for that. Also, I think that I need some help with my CSE patch. It boosts one of the sun spider benchmarks by 30%, but I had a crash I could not debug when I was refactoring the code to use the DoesReadOrWrite API. > > > > OK - we should chat this week!
Filip Pizlo
Comment 7
2013-12-10 09:54:45 PST
Landed in
http://trac.webkit.org/changeset/160328
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug