80415 – Eliminate redundant Phis in DFG

RESOLVED FIXED 80415

Eliminate redundant Phis in DFG

https://bugs.webkit.org/show_bug.cgi?id=80415

Summary Eliminate redundant Phis in DFG

Yuqiang Xian

Reported 2012-03-06 06:17:20 PST

Although this may not have any advantage at current stage, this is towards minimal SSA to make more high level optimizations (like bug 76770) easier. We have the choices either to build minimal SSA from scratch or to keep current simple Phi insertion mechanism and remove the redundancy in another phase. Currently we choose the latter because the change could be smaller.

Attachments
WIP patch (25.17 KB, patch) 2012-03-06 07:13 PST, Yuqiang Xian	no flags	Details Formatted Diff Diff
Initial performance result (7.03 KB, text/plain) 2012-03-06 07:16 PST, Yuqiang Xian	no flags	Details
WIP patch (25.13 KB, patch) 2012-03-06 07:53 PST, Yuqiang Xian	no flags	Details Formatted Diff Diff
Case showing the problem of type predications propagation after Phi elimination (6.73 KB, text/plain) 2012-03-06 17:39 PST, Yuqiang Xian	no flags	Details
proposed patch (25.89 KB, patch) 2012-03-07 02:54 PST, Yuqiang Xian	no flags	Details Formatted Diff Diff
Latest performance result (6.96 KB, text/plain) 2012-03-07 02:56 PST, Yuqiang Xian	no flags	Details
updated patch (25.51 KB, patch) 2012-03-07 06:51 PST, Yuqiang Xian	fpizlo: review+	Details Formatted Diff Diff
patch updated (18.69 KB, patch) 2012-03-07 17:26 PST, Yuqiang Xian	fpizlo: review+	Details Formatted Diff Diff
Show Obsolete (5) View All Add attachment proposed patch, testcase, etc.

Yuqiang Xian

Comment 1 2012-03-06 07:13:42 PST

Created attachment 130375 [details] WIP patch WIP - Please let me know if this is the right approach or not. Thanks.

Yuqiang Xian

Comment 2 2012-03-06 07:16:56 PST

Created attachment 130377 [details] Initial performance result There seems to be a 9% regression on SunSpider-crypto-sha1, which should be caused by the additional compilation overhead. Will try to investigate it later.

Yuqiang Xian

Comment 3 2012-03-06 07:53:29 PST

Created attachment 130384 [details] WIP patch This partially reduces the 9% regression on sunspider-crypto-sha1 to 6-7%. The remaining overhead mainly comes from recording Phi uses when building the graph.

Filip Pizlo

Comment 4 2012-03-06 13:55:51 PST

Comment on attachment 130384 [details] WIP patch View in context: https://bugs.webkit.org/attachment.cgi?id=130384&action=review I think this is great. My intuition is that we could simplify this further by doing away with the phi use map. But maybe I'm wrong? Have you tried it without the use map? > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:1006 > - if (!node.refCount()) > + // We don't want to ignore the skipped Phi nodes, otherwise we may fail to > + // propagate the type predictions to other blocks. > + if (!node.refCount() && node.op != Phi) Why? The idea here is that if we have a node that claims that it is using a variable from another block, and its uses of that variable are dead, then we have no work to do. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:594 > + void addPhiUse(NodeIndex phi, NodeIndex node, unsigned childIndex, NodeIndex replace = NoNode) > + { > + ASSERT(phi != NoNode && m_graph[phi].op == Phi); > + > + Vector<PhiUse>* phiUse; > + PhiUseMap::iterator iter = m_graph.m_phiUses.find(phi); > + if (iter != m_graph.m_phiUses.end()) > + phiUse = &iter->second; > + else { > + Vector<PhiUse> emptyUse; > + pair<PhiUseMap::iterator, bool> result = m_graph.m_phiUses.add(phi, emptyUse); > + phiUse = &result.first->second; > + } > + > + // If a replacement is needed, we want to remove the old use entry. > + // This would find and delete an entry in the middle of a vector, > + // but we expect this would be rarely happened in practice - only when > + // a Phi node in a joint block needs to handle more than 3 incoming edges. > + if (replace != NoNode) { > + size_t i = phiUse->find(PhiUse(replace, childIndex)); > + ASSERT(i != notFound); > + phiUse->remove(i); > + } > + > + phiUse->append(PhiUse(node, childIndex)); > + } Not sure I buy that the phi use map is necessary. I kind of don't like it because it means we cannot rerun the redundant phi replacement algorithm a second time unless we also maintain the consistency of this map. In general, in the future I want to be able to have a phase that introduces new nodes or switches the children of nodes, and then rerun the phi simplification if I had also done some control flow simplification. But playing with the children of nodes seems hard if I have to maintain the phi use map as I do it. > Source/JavaScriptCore/dfg/DFGRedundantPhiEliminationPhase.cpp:87 > + switch (use.childIndex()) { > + case 1: > + ASSERT(useNode.child1().indexUnchecked() == phi); > + useNode.children.child1().setIndex(replacement); > + m_graph[replacement].ref(); > + break; > + case 2: > + ASSERT(useNode.child2().indexUnchecked() == phi); > + useNode.children.child2().setIndex(replacement); > + m_graph[replacement].ref(); > + break; > + case 3: > + ASSERT(useNode.child3().indexUnchecked() == phi); > + useNode.children.child3().setIndex(replacement); > + m_graph[replacement].ref(); > + break; > + default: > + break; > + } Why not kill the switch statement and say useNode.children.child(use.childIndex() - 1)? > Source/JavaScriptCore/dfg/DFGRedundantPhiEliminationPhase.cpp:126 > + NodeIndex getRedundantReplacement(NodeIndex phi) > + { > + NodeIndex child1 = m_graph[phi].child1().indexUnchecked(); > + NodeIndex candidate = child1 == phi ? NoNode : child1; > + > + NodeIndex child2 = m_graph[phi].child2().indexUnchecked(); > + if (candidate != NoNode) { > + if (child2 != NoNode && child2 != candidate && child2 != phi) > + return NoNode; > + } else if (child2 != phi) > + candidate = child2; > + > + NodeIndex child3 = m_graph[phi].child3().indexUnchecked(); > + if (candidate != NoNode) { > + if (child3 != NoNode && child3 != candidate && child3 != phi) > + return NoNode; > + } else if (child3 != phi) > + candidate = child3; > + > + return candidate; > + } Can't we run this algorithm without having a phi use map? Seems to me that we could: 1) Wrap the algorithm in a fixpoint. You're using a worklist; maybe we could do that too. 2) Walk over the graph inside the fixpoint. 3) For every child that is a phi and fits the simplification criteria, attempt to simplify.

Filip Pizlo

Comment 5 2012-03-06 13:58:42 PST

(In reply to comment #2) > Created an attachment (id=130377) [details] > Initial performance result > > There seems to be a 9% regression on SunSpider-crypto-sha1, which should be caused by the additional compilation overhead. Will try to investigate it later. I would not worry about this too much, since it hardly affects the overall average. Looking at the perf results, it does not feel like a blocker for landing this patch. Plus, maybe it's just because the DFG is recompiling like crazy on that benchmark because it is doing something that our speculations don't support? If that were the case then that would further imply to me that we should ignore that result, and file a separate bug to reduce compile times on that program.

Yuqiang Xian

Comment 6 2012-03-06 17:30:51 PST

(In reply to comment #4) > I think this is great. My intuition is that we could simplify this further by doing away with the phi use map. But maybe I'm wrong? Have you tried it without the use map? No, I haven't tried it - I was just worried that traversing the whole graph for Phi replacement may be time consuming especially when the traversal can be iterative. We can try it and I think we should only need the iterative traversal on the Phi nodes (if we have the fix of bug 80361 it would be easier) and a one-time traversal on other nodes in the graph (actually, at current stage, the GetLocals) after all of the Phi replacements are identified. > > > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:1006 > > - if (!node.refCount()) > > + // We don't want to ignore the skipped Phi nodes, otherwise we may fail to > > + // propagate the type predictions to other blocks. > > + if (!node.refCount() && node.op != Phi) > > Why? The idea here is that if we have a node that claims that it is using a variable from another block, and its uses of that variable are dead, then we have no work to do. > I will attach a case to explain this problem soon. > > Not sure I buy that the phi use map is necessary. I kind of don't like it because it means we cannot rerun the redundant phi replacement algorithm a second time unless we also maintain the consistency of this map. In general, in the future I want to be able to have a phase that introduces new nodes or switches the children of nodes, and then rerun the phi simplification if I had also done some control flow simplification. But playing with the children of nodes seems hard if I have to maintain the phi use map as I do it. > Yes. In the redundant phi elimination, we maintain the consistency of this map. But on the other hand, it really brought troubles when we play with the children, though currently only GetLocals have Phi children (except for Phis), but in the future if we get rid of GetLocal/SetLocals (will?) that may be a problem. > > Source/JavaScriptCore/dfg/DFGRedundantPhiEliminationPhase.cpp:87 > > Why not kill the switch statement and say useNode.children.child(use.childIndex() - 1)? > Agree. > > Source/JavaScriptCore/dfg/DFGRedundantPhiEliminationPhase.cpp:126 > > Can't we run this algorithm without having a phi use map? Seems to me that we could: > > 1) Wrap the algorithm in a fixpoint. You're using a worklist; maybe we could do that too. > > 2) Walk over the graph inside the fixpoint. > > 3) For every child that is a phi and fits the simplification criteria, attempt to simplify. It's possible to go without phi use map. I will try and see if it can be more efficient and simpler. Thanks for the review!

Filip Pizlo

Comment 7 2012-03-06 17:34:41 PST

(In reply to comment #6) > (In reply to comment #4) > > I think this is great. My intuition is that we could simplify this further by doing away with the phi use map. But maybe I'm wrong? Have you tried it without the use map? > > No, I haven't tried it - I was just worried that traversing the whole graph for Phi replacement may be time consuming especially when the traversal can be iterative. We can try it and I think we should only need the iterative traversal on the Phi nodes (if we have the fix of bug 80361 it would be easier) and a one-time traversal on other nodes in the graph (actually, at current stage, the GetLocals) after all of the Phi replacements are identified. It seems like one way around this is to just track the Phis, and not the Phi uses. We can first do a fixpoint where we simplify Phis only - which may leave dummy Phis like: somePhi: Phi(@something) where @something is either another Phi or something real (like a SetLocal), and somePhi is still referenced from various other places (GetLocal, etc). Then we can do a second pass that considers all nodes (not just Phis) and checks if those nodes reference trivial Phis. If they do, then replace. I think that this should be sufficient, since the fixpoint (as I understand it) only depends on the shape of Phis and not the shapes of users of Phis. So you only need to visit Phis repeatedly - and after that you can visit all non-Phi uses of Phis in one go. Would that work?

Yuqiang Xian

Comment 8 2012-03-06 17:39:24 PST

Created attachment 130493 [details] Case showing the problem of type predications propagation after Phi elimination This is the case showing the problem of type predications propagation after Phi elimination. Please pay attention to block 3, where node 51 and node 49 are eliminated since those two variables each have only one definition (in block 0). They're not used in block 3 but they should still be alive because they will be used in block 1 (block 3's successor). But after we eliminate 51 and 49 in block 3, the CFA ignores them and the type predictions of the two variables are not propagated to the end of the block, and hence not to the successors.

Yuqiang Xian

Comment 9 2012-03-06 17:42:28 PST

(In reply to comment #7) > > > > No, I haven't tried it - I was just worried that traversing the whole graph for Phi replacement may be time consuming especially when the traversal can be iterative. We can try it and I think we should only need the iterative traversal on the Phi nodes (if we have the fix of bug 80361 it would be easier) and a one-time traversal on other nodes in the graph (actually, at current stage, the GetLocals) after all of the Phi replacements are identified. > > It seems like one way around this is to just track the Phis, and not the Phi uses. > > We can first do a fixpoint where we simplify Phis only - which may leave dummy Phis like: > > somePhi: Phi(@something) > > where @something is either another Phi or something real (like a SetLocal), and somePhi is still referenced from various other places (GetLocal, etc). > > Then we can do a second pass that considers all nodes (not just Phis) and checks if those nodes reference trivial Phis. If they do, then replace. > > I think that this should be sufficient, since the fixpoint (as I understand it) only depends on the shape of Phis and not the shapes of users of Phis. So you only need to visit Phis repeatedly - and after that you can visit all non-Phi uses of Phis in one go. > > Would that work? Yes. I think that's exactly what I mean for "iterative traversal on the Phi nodes only" at first and then "one-time traversal on other nodes in the graph". - if I understand you correctly. :)

Filip Pizlo

Comment 10 2012-03-06 17:43:22 PST

(In reply to comment #9) > (In reply to comment #7) > > > > > > No, I haven't tried it - I was just worried that traversing the whole graph for Phi replacement may be time consuming especially when the traversal can be iterative. We can try it and I think we should only need the iterative traversal on the Phi nodes (if we have the fix of bug 80361 it would be easier) and a one-time traversal on other nodes in the graph (actually, at current stage, the GetLocals) after all of the Phi replacements are identified. > > > > It seems like one way around this is to just track the Phis, and not the Phi uses. > > > > We can first do a fixpoint where we simplify Phis only - which may leave dummy Phis like: > > > > somePhi: Phi(@something) > > > > where @something is either another Phi or something real (like a SetLocal), and somePhi is still referenced from various other places (GetLocal, etc). > > > > Then we can do a second pass that considers all nodes (not just Phis) and checks if those nodes reference trivial Phis. If they do, then replace. > > > > I think that this should be sufficient, since the fixpoint (as I understand it) only depends on the shape of Phis and not the shapes of users of Phis. So you only need to visit Phis repeatedly - and after that you can visit all non-Phi uses of Phis in one go. > > > > Would that work? > > Yes. I think that's exactly what I mean for "iterative traversal on the Phi nodes only" at first and then "one-time traversal on other nodes in the graph". - if I understand you correctly. :) Ooops, sorry, yes, we're thinking the same thing. :-) I didn't read carefully enough the first time.

Filip Pizlo

Comment 11 2012-03-06 17:44:01 PST

(In reply to comment #8) > Created an attachment (id=130493) [details] > Case showing the problem of type predications propagation after Phi elimination > > This is the case showing the problem of type predications propagation after Phi elimination. Please pay attention to block 3, where node 51 and node 49 are eliminated since those two variables each have only one definition (in block 0). They're not used in block 3 but they should still be alive because they will be used in block 1 (block 3's successor). But after we eliminate 51 and 49 in block 3, the CFA ignores them and the type predictions of the two variables are not propagated to the end of the block, and hence not to the successors. Aha! I think that the thing that our dummy Phi insertion was giving us was liveness, and you're saying that we've lost this. This is dangerous since we need to know liveness to compute DFG OSR entry constraints. So we have two options: 1) Have a different way of computing liveness. I'm OK with this if you want to do it, since we'll need it eventually! 2) If we eliminate a Phi from a block, then have the block's references to that Phi (i.e. variablesAtHead/AtTail) refer to its replacement. I like (2) better for this patch, since it feels like a more minimal change. What do you think?

Yuqiang Xian

Comment 12 2012-03-06 17:53:51 PST

(In reply to comment #11) > > Aha! I think that the thing that our dummy Phi insertion was giving us was liveness, and you're saying that we've lost this. > > This is dangerous since we need to know liveness to compute DFG OSR entry constraints. > > So we have two options: > > 1) Have a different way of computing liveness. I'm OK with this if you want to do it, since we'll need it eventually! > > 2) If we eliminate a Phi from a block, then have the block's references to that Phi (i.e. variablesAtHead/AtTail) refer to its replacement. > > I like (2) better for this patch, since it feels like a more minimal change. > > What do you think? I agree. Then the variablesAtHead/Tail of a block may reference a node outside the block and we need to extract the type predictions from it. But it _should_ not be a problem as we now keep the abstract values of all the nodes in the graph in one AbstractState - will verify it.

Yuqiang Xian

Comment 13 2012-03-07 02:54:30 PST

Created attachment 130579 [details] proposed patch Patch w/o the Phi use map. Additionally, we don't observe any performance regression on the 3 benchmarks with this version.

Yuqiang Xian

Comment 14 2012-03-07 02:56:44 PST

Created attachment 130581 [details] Latest performance result As you could see, no obvious performance regression on the 3 major benchmarks.

WebKit Review Bot

Comment 15 2012-03-07 02:58:17 PST

Attachment 130579 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/CMakeLists.txt', u'S..." exit_code: 1 Source/JavaScriptCore/dfg/DFGRedundantPhiEliminationPhase.cpp:128: Tab found; better to use spaces [whitespace/tab] [1] Total errors found: 1 in 11 files If any of these errors are false positives, please file a bug against check-webkit-style.

Yuqiang Xian

Comment 16 2012-03-07 06:51:21 PST

Created attachment 130619 [details] updated patch

Filip Pizlo

Comment 17 2012-03-07 11:59:18 PST

Comment on attachment 130619 [details] updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=130619&action=review R=me as it is, but it might be worth considering doing away with the Phi-to-block links, if possible. > Source/JavaScriptCore/dfg/DFGRedundantPhiEliminationPhase.cpp:83 > + if (replacement != NoNode) { > + // Current Phi node will be skipped, we need to update > + // the variable information if it references this node. > + BasicBlock* basicBlock = m_graph.m_blocks[node.blockIndex()].get(); > + VirtualRegister operand = node.local(); > + if (operandIsArgument(operand)) { > + unsigned arg = operandToArgument(operand); > + if (basicBlock->variablesAtHead.argument(arg) == index) { > + // This argument must be unused in this block. > + ASSERT(basicBlock->variablesAtTail.argument(arg) == index); > + basicBlock->variablesAtHead.argument(arg) = replacement; > + basicBlock->variablesAtTail.argument(arg) = replacement; > + } > + } else { > + if (basicBlock->variablesAtHead.local(operand) == index) { > + // This local variable must be unused in this block. > + ASSERT(basicBlock->variablesAtTail.local(operand) == index); > + basicBlock->variablesAtHead.local(operand) = replacement; > + basicBlock->variablesAtTail.local(operand) = replacement; > + } > + } I think this is fine, but why didn't you just walk all blocks' variablesAtHead|Tail lists after the Phi fixup fixpoint? That way, you wouldn't have had to save the block index in the Phi nodes' OpInfos.

Yuqiang Xian

Comment 18 2012-03-07 17:26:58 PST

Created attachment 130732 [details] patch updated Yes. It should be far cleaner! Patch is updated. Thanks for the review comments!

Filip Pizlo

Comment 19 2012-03-07 17:53:37 PST

Comment on attachment 130732 [details] patch updated Yay! :-)

Yuqiang Xian

Comment 20 2012-03-07 19:10:19 PST

Landed as http://trac.webkit.org/changeset/110132

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component JavaScriptCore

Assignee

Nobody

Reported

2012-03-06 06:17 PST

Modified

2012-03-07 19:10 PST History

CC List

4 users Show

URL

Keywords

Depends on

80361

Blocks

76770

Dependencies

tree graph