Bug 84553

Summary: DFG should have control flow graph simplification
Product: WebKit Reporter: Filip Pizlo <fpizlo@apple.com>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned@lists.webkit.org>
Status: RESOLVED FIXED    
Severity: Normal    
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 86918, 86939    
Attachments:
Description Flags
work in progress
none
it seems to work
none
getting close
none
getting closer
none
more
none
more
none
starting to run SunSpider
none
it appears to work
none
almost there
none
and now, passing more tests than ever!
none
the patch oliver: review+

Description From 2012-04-22 15:09:58 PST
Patch forthcoming.
------- Comment #1 From 2012-04-22 15:13:50 PST -------
Created an attachment (id=138277) [details]
work in progress
------- Comment #2 From 2012-04-22 17:06:19 PST -------
Created an attachment (id=138280) [details]
it seems to work

There appear to still be some issues, like that the fixpoint is not converging as fast as it probably should.  I'll investigate more and then start running bigger tests.
------- Comment #3 From 2012-04-22 17:58:19 PST -------
Created an attachment (id=138281) [details]
getting close

Fixed the convergence problem, which was due to constant abstract values being deconstantified upon inter-block merging. This then revealed a bunch of bugs in constant folding of GetLocal. So I fixed them.

Right now the code is kind of performing as expected for simple code.  But I still need to test it more and do more investigation to ensure that I haven't introduced some horrible problems.
------- Comment #4 From 2012-04-22 19:48:55 PST -------
Found some fun issues:

- Merging the taken successor when the branch condition is known is wrong if that successor has multiple predecessors. We should convert the branch to a jump in that case, and insert phantoms for those things that were live in the not-taken successor.

- The handling of constant folding on GetLocal is wrong. First, it fails to deref the child of the GetLocal, which means that the original SetLocal is still executed even if it need not be. This leads to the graph being in an inconsistent state where the SetLocal will have a ref count than is greater than the number of nodes using it. Second, it's not clear that it will correctly inform OSR exit of the value of the local, if the original SetLocal is killed. Likely the best way to fix this would be that for any block in which the valueAtHead is a constant, we insert a JSConstant/dead-SetLocal pair at the top of the block, and link variablesAtHead to that SetLocal. Link variablesAtTail to that SetLocal also, if variablesAtTail was equal to variablesAtHead. Constant folding a GetLocal ought to assert that the local has been relinked in this fashion in this basic block.

But this appears to be dangerous, since a subsequent run of the CFA might get massively confused. Consider that we might have a block that does not use a variable but knows that the variable is a constant. Then it will have a dead SetLocal at the top. But since it's dead, the CFA will assume that the block kills the variable. So the constant value that flowed through this block will not be propagated correctly. Ouch!  I need to think about this more.

All of this does bring up an interesting question: could we perform constant propagation as a block-local analysis separate from CFA, and then propagate results between blocks by observing when a block uses a GetLocal that is linked to SetLocal of a JSConstant. The problem with that approach is that it won't work cleanly with value speculation or sparse conditional constant propagation, since it will depend too much on interleaving with CFG simplification.
------- Comment #5 From 2012-04-24 01:33:11 PST -------
I've decided that:

- We should have Phantom nodes in place of GetLocals in the constant folder. This means that Phantom nodes participate in operand-related liveness. So you may have Phantom(SetLocal), Phantom(Flush), Phantom(SetArgument), or Phantom(Phi). The control flow simplification will forward Phantom(operand instruction) to Phantom(source of the operand instruction) if the operand's value is known locally due to a SetLocal.

- We need to clean up unreachable blocks correctly. This means removing dead Phi edges. More subtly, it means that if we jettison a block - which may not render it unreachable, yet - then we must remove those Phi edges outgoing from that block that targeted the predecessor that we are killing.

- We need IR verification. Many of the invariants that we're allegedly upholding in this phase are not documented anywhere and may already be invalidated by bugs in other parts of the compiler. We need a phase that enforces the invariants that we're interested in. We can disable this phase for release builds, but it should always be on in debug builds.
------- Comment #6 From 2012-04-24 01:35:09 PST -------
Created an attachment (id=138517) [details]
getting closer

- Implemented (hopefully) correct handling of Phantom keep-alive for children of GetLocals killed by constant folding.

- Implemented (hopefully) correct handling of Branch(constant), in that we now only merge blocks if the taken successor of the branch has only one predecessor.

- Implemented (hopefully) correct unreachable block destruction.

- Started to implement, but did not finish, the handling of jettisoned blocks with respect to Phi edges.
------- Comment #7 From 2012-04-24 01:54:03 PST -------
Created an attachment (id=138519) [details]
more

- Implemented more of the handling of removing a block from the predecessors list.

- Also chose to rely on redundant phi elimination not being run, since that would only make things more complicated.
------- Comment #8 From 2012-04-24 16:35:41 PST -------
Created an attachment (id=138683) [details]
more

- Ensured that the state of variablesAtHead/variablesAtTail after constant folding and CFG simplification complies with our DFG-style CPS conventions.

- Fixed a bug where the value of the condition passed to Branch was not being interpreted correctly.
------- Comment #9 From 2012-04-24 19:27:23 PST -------
Created an attachment (id=138720) [details]
starting to run SunSpider

Fixed miscellaneous bugs.  I can now run some of the SunSpider benchmarks.
------- Comment #10 From 2012-04-24 20:32:52 PST -------
Created an attachment (id=138729) [details]
it appears to work

Runs SunSpider, V8, and Kraken.

Still need to test performance and run more tests.
------- Comment #11 From 2012-04-25 22:17:28 PST -------
Created an attachment (id=138934) [details]
almost there

This appears to work, and is performance neutral, except on Kraken, where it is an ever-so-slight win.  Still running more tests.  Will set to r? if I find it to work on all the relevant ones, otherwise it's back to work for me.
------- Comment #12 From 2012-04-25 22:41:59 PST -------
Yup, I found another bug.  Consider that we have block A that branches to blocks B and C.  Both blocks B and C then jump to block D.  We notice that block A will always jump to block B, so we jettison block C.  We currently successfully clean up C's phis so that they do not point to A anymore. But let's assume that there are no other jumps to C, so C is unreachable now. We will fail to clean up D's incoming phis to reflect this.

Ooops!
------- Comment #13 From 2012-04-26 19:37:49 PST -------
Created an attachment (id=139123) [details]
and now, passing more tests than ever!
------- Comment #14 From 2012-04-27 00:10:31 PST -------
Created an attachment (id=139140) [details]
the patch
------- Comment #15 From 2012-04-27 10:50:26 PST -------
(From update of attachment 139140 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=139140&action=review

r=me, but make sure we don't have any custom toBoolean(CallFrame*) overrides in WebCore -- I recall that there was one, something like document.all -- but it's possible we hanlde that with the MasqueradesAsUndefined flag.

r+ with the few changes i suggested.

> Source/JavaScriptCore/dfg/DFGAbstractState.cpp:153
> +            root->valuesAtHead.local(i).makeTop();

Sadness.  Maybe one day we'll be able to do some analysis of capture sites.  Oh well.

> Source/JavaScriptCore/dfg/DFGAdjacencyList.h:45
> +    #define AdjacencyListSize 3

could we use enum { AdjacencyListSize = 3 } rather than a #define here?  Aside from anything else #define's are hard to see in the debugger. (i'd suggest static const but i see we use it as an array length below. Silly C++ const expr requirements.

> Source/JavaScriptCore/dfg/DFGEdge.h:83
> +    // use an Edge where you meant to use Edge::index(), you will instead get Edge::isSet(). Gross!

There's a horrible trick we use in other parts of the code where you can do this:

    typedef void* (DFGEdge::*UnspecifiedBoolType);
    operator UnspecifiedBoolType*() const { return isSet ? reinterpret_cast<UnspecifiedBoolType*>(1) : 0; }

This gives you (essentially) operator bool() but without the horrible implied int conversions.

> Source/JavaScriptCore/dfg/DFGGraph.cpp:130
> -    if (mustGenerate) {
> -        ASSERT(refCount);
> +    if (mustGenerate)
>          --refCount;

Seems reasonable to keep this assertion.

> Source/JavaScriptCore/dfg/DFGNode.h:478
> +            return 1;
> +        case Branch:
> +            return 2;

Mmmmmm tasty constants ;) (not saying that there's anything better though :-(

> Source/JavaScriptCore/heap/Heap.cpp:860
>      double lastGCEndTime = WTF::currentTime();
>      m_lastGCLength = lastGCEndTime - lastGCStartTime;
>      JAVASCRIPTCORE_GC_END();
> -
> +    
>      (*m_activityCallback)();
>  }
>  

revert Heap.cpp?
------- Comment #16 From 2012-04-27 11:38:27 PST -------
(In reply to comment #15)
> (From update of attachment 139140 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=139140&action=review
> 
> r=me, but make sure we don't have any custom toBoolean(CallFrame*) overrides in WebCore -- I recall that there was one, something like document.all -- but it's possible we hanlde that with the MasqueradesAsUndefined flag.
> 
> r+ with the few changes i suggested.
> 
> > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:153
> > +            root->valuesAtHead.local(i).makeTop();
> 
> Sadness.  Maybe one day we'll be able to do some analysis of capture sites.  Oh well.

It's not that hard.  But I don't want to pile on more dangerous stuff all at once.

> 
> > Source/JavaScriptCore/dfg/DFGAdjacencyList.h:45
> > +    #define AdjacencyListSize 3
> 
> could we use enum { AdjacencyListSize = 3 } rather than a #define here?  Aside from anything else #define's are hard to see in the debugger. (i'd suggest static const but i see we use it as an array length below. Silly C++ const expr requirements.

Ah, good point about using an enum.  I'll do that.

> 
> > Source/JavaScriptCore/dfg/DFGEdge.h:83
> > +    // use an Edge where you meant to use Edge::index(), you will instead get Edge::isSet(). Gross!
> 
> There's a horrible trick we use in other parts of the code where you can do this:
> 
>     typedef void* (DFGEdge::*UnspecifiedBoolType);
>     operator UnspecifiedBoolType*() const { return isSet ? reinterpret_cast<UnspecifiedBoolType*>(1) : 0; }
> 
> This gives you (essentially) operator bool() but without the horrible implied int conversions.

I'll add that.  Though, right now, nobody will use it.  I've gotten used to having !!foo.child1() in places.

> 
> > Source/JavaScriptCore/dfg/DFGGraph.cpp:130
> > -    if (mustGenerate) {
> > -        ASSERT(refCount);
> > +    if (mustGenerate)
> >          --refCount;
> 
> Seems reasonable to keep this assertion.

It isn't.  If you have this assertion and the validation fails, then instead of getting a nice pretty graph dump showing you why validation failed, you have a good chance of getting an assertion failure in the graph dump.

The graph dump should not try to validate the graph.  It should try hard to print it even if it's bogus.

> 
> > Source/JavaScriptCore/dfg/DFGNode.h:478
> > +            return 1;
> > +        case Branch:
> > +            return 2;
> 
> Mmmmmm tasty constants ;) (not saying that there's anything better though :-(

They're infallible constants of nature and truth.  A jump has one successor because that's the definition of a jump.  A branch has two successors because that's the definition of a branch.

> 
> > Source/JavaScriptCore/heap/Heap.cpp:860
> >      double lastGCEndTime = WTF::currentTime();
> >      m_lastGCLength = lastGCEndTime - lastGCStartTime;
> >      JAVASCRIPTCORE_GC_END();
> > -
> > +    
> >      (*m_activityCallback)();
> >  }
> >  
> 
> revert Heap.cpp?

Ooops!  Yes, will do that.
------- Comment #17 From 2012-04-27 15:06:08 PST -------
(In reply to comment #15)
> (From update of attachment 139140 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=139140&action=review
> 
> r=me, but make sure we don't have any custom toBoolean(CallFrame*) overrides in WebCore -- I recall that there was one, something like document.all -- but it's possible we hanlde that with the MasqueradesAsUndefined flag.

Would overrides be possible, given that toBoolean() is not in the methodTable?
------- Comment #18 From 2012-04-27 16:35:08 PST -------
Landed in http://trac.webkit.org/changeset/115512
------- Comment #19 From 2012-05-18 15:29:59 PST -------
Merged in http://trac.webkit.org/changeset/117646