Bug 156891

Summary: [JSC] PredictionPropagation should not be in the top 5 heaviest phases
Product: WebKit Reporter: Benjamin Poulain <benjamin>
Component: New BugsAssignee: Benjamin Poulain <benjamin>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, fpizlo, keith_miller, mark.lam, msaboff, saam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch for landing none

Description Benjamin Poulain 2016-04-21 22:09:56 PDT
[JSC] PredictionPropagation should not be in the top 5 heaviest phases
Comment 1 Benjamin Poulain 2016-04-21 22:25:06 PDT
Created attachment 277009 [details]
Patch
Comment 2 Benjamin Poulain 2016-04-21 23:15:02 PDT
From run to run, it is between 0.2 and 0.8% faster. Here is a good run:

                                    Conf#1                    Conf#2                                      

3d-cube                         4.9343+-0.0898     ?      4.9911+-0.1312        ? might be 1.0115x slower
3d-morph                        5.0066+-0.0786     ?      5.1369+-0.2216        ? might be 1.0260x slower
3d-raytrace                     5.2229+-0.0652            5.2187+-0.0457        
access-binary-trees             2.1478+-0.0362            2.1227+-0.0393          might be 1.0118x faster
access-fannkuch                 6.1786+-0.2340            5.9789+-0.1307          might be 1.0334x faster
access-nbody                    2.6457+-0.1253            2.5420+-0.0463          might be 1.0408x faster
access-nsieve                   3.0282+-0.0550            3.0154+-0.0763        
bitops-3bit-bits-in-byte        1.1365+-0.0187     ?      1.1588+-0.0366        ? might be 1.0196x slower
bitops-bits-in-byte             2.8563+-0.0470            2.8536+-0.1076        
bitops-bitwise-and              2.0675+-0.0569     ?      2.0695+-0.0485        ?
bitops-nsieve-bits              3.1162+-0.0383     ?      3.1651+-0.0915        ? might be 1.0157x slower
controlflow-recursive           2.3721+-0.0394            2.3328+-0.0417          might be 1.0168x faster
crypto-aes                      4.2856+-0.1929            4.1238+-0.0575          might be 1.0393x faster
crypto-md5                      2.4638+-0.0602     ?      2.4813+-0.0456        ?
crypto-sha1                     2.3946+-0.0979            2.3788+-0.0678        
date-format-tofte               6.8225+-0.1669            6.7144+-0.1163          might be 1.0161x faster
date-format-xparb               4.6748+-0.1307     ?      4.7250+-0.1635        ? might be 1.0108x slower
math-cordic                     2.9013+-0.0712            2.9000+-0.1035        
math-partial-sums               4.7603+-0.0547     ?      4.7903+-0.0770        ?
math-spectral-norm              2.0250+-0.0351     ?      2.0333+-0.0394        ?
regexp-dna                      6.5777+-0.2425     ?      6.6352+-0.2959        ?
string-base64                   4.7766+-0.1619            4.6521+-0.1108          might be 1.0268x faster
string-fasta                    5.7766+-0.0823     ?      5.8047+-0.1863        ?
string-tagcloud                 8.1666+-0.2112            8.1428+-0.1602        
string-unpack-code             18.9451+-0.5965           18.4644+-0.2174          might be 1.0260x faster
string-validate-input           4.4505+-0.1446            4.3418+-0.0949          might be 1.0250x faster

<arithmetic>                    4.6051+-0.0399            4.5682+-0.0340          might be 1.0081x faster
Comment 3 Mark Lam 2016-04-22 09:53:00 PDT
(In reply to comment #2)
> From run to run, it is between 0.2 and 0.8% faster. Here is a good run:
> 
> <arithmetic>                    4.6051+-0.0399            4.5682+-0.0340    
> might be 1.0081x faster

Just curious: do we accept "might be" numbers as a believable indicator now?
Comment 4 Benjamin Poulain 2016-04-22 10:17:27 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > From run to run, it is between 0.2 and 0.8% faster. Here is a good run:
> > 
> > <arithmetic>                    4.6051+-0.0399            4.5682+-0.0340    
> > might be 1.0081x faster
> 
> Just curious: do we accept "might be" numbers as a believable indicator now?

This is Sunspider!
Any gain/regression less than a millisecond is well within the noise of a modern CPU.

What I do is run it multiple times, typically 3-5 times with outer=20. If all runs are in the same direction, it is likely the right thing.

Most regressions can't be tracked on the bots for the same reason: the average is going up/down, but every change is within the noise.
Comment 5 Mark Lam 2016-04-22 10:50:58 PDT
Comment on attachment 277009 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277009&action=review

nice work.  r=me.

> Source/JavaScriptCore/ChangeLog:21
> +

Please remove this extra empty line.

> Source/JavaScriptCore/ChangeLog:30
> +

Please remove this extra empty line.

> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:39
> +bool verboseFixPointLoops = false;

If you make this const static, then the compiler can elide the "verbose" code.  As is, it still needs to emit code for them and branch around them.
Comment 6 Filip Pizlo 2016-04-22 11:45:53 PDT
Comment on attachment 277009 [details]
Patch

Super cool!
Comment 7 Benjamin Poulain 2016-04-22 11:46:58 PDT
> > Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:39
> > +bool verboseFixPointLoops = false;
> 
> If you make this const static, then the compiler can elide the "verbose"
> code.  As is, it still needs to emit code for them and branch around them.

Why would constant propagation fail here?
Comment 8 Mark Lam 2016-04-22 12:46:48 PDT
(In reply to comment #7)
> > > Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:39
> > > +bool verboseFixPointLoops = false;
> > 
> > If you make this const static, then the compiler can elide the "verbose"
> > code.  As is, it still needs to emit code for them and branch around them.
> 
> Why would constant propagation fail here?

Because it is not const.  How would the compiler know that another file won't do this:

    extern bool verboseFixPointLoops;
    ...
    verboseFixPointLoops = true;

... that is unless there's link time optimizations involved.  Is link time optimization capable of eliding code like this?
Comment 9 Mark Lam 2016-04-22 12:53:33 PDT
(In reply to comment #8)
> (In reply to comment #7)
> > > > Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:39
> > > > +bool verboseFixPointLoops = false;
> > > 
> > > If you make this const static, then the compiler can elide the "verbose"
> > > code.  As is, it still needs to emit code for them and branch around them.
> > 
> > Why would constant propagation fail here?
> 
> Because it is not const.  How would the compiler know that another file
> won't do this:
> 
>     extern bool verboseFixPointLoops;
>     ...
>     verboseFixPointLoops = true;
> 
> ... that is unless there's link time optimizations involved.  Is link time
> optimization capable of eliding code like this?

Another reason to put a "static const" attribute on it is to document that this flag is only used in this file and is not meant to be mutable at runtime.
Comment 10 Benjamin Poulain 2016-04-22 15:11:06 PDT
Created attachment 277104 [details]
Patch for landing
Comment 11 WebKit Commit Bot 2016-04-22 16:10:07 PDT
Comment on attachment 277104 [details]
Patch for landing

Clearing flags on attachment: 277104

Committed r199933: <http://trac.webkit.org/changeset/199933>
Comment 12 WebKit Commit Bot 2016-04-22 16:10:12 PDT
All reviewed patches have been landed.  Closing bug.
Comment 13 Mark Lam 2016-04-22 16:14:08 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > > > Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:39
> > > > > +bool verboseFixPointLoops = false;
> > > > 
> > > > If you make this const static, then the compiler can elide the "verbose"
> > > > code.  As is, it still needs to emit code for them and branch around them.
> > > 
> > > Why would constant propagation fail here?
> > 

To follow up for those reading along: Ben pointed out that verboseFixPointLoops is in an anonymous namespace, which makes it effectively static to this compilation unit.  Hence, no need for the static const attributes.