Bug 127639

Summary: FTL should do polyvariant GetById inlining
Product: WebKit
Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCore
Assignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED
Severity: Normal
CC: barraclough, ggaren, mark.lam, mhahnenberg, mmirman, msaboff, oliver, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 127646    
Bug Blocks: 127325    
Attachments:
   Description      Flags
   it begins        none
   almost there     none
   the patch        oliver: review+

Description Filip Pizlo 2014-01-25 20:04:19 PST
Patch forthcoming.
Comment 1 Filip Pizlo 2014-01-25 22:44:10 PST
Created attachment 222257 [details]
it begins
Comment 2 Filip Pizlo 2014-01-26 10:50:56 PST
Created attachment 222279 [details]
almost there

It inlined a GetById.
Comment 3 Filip Pizlo 2014-01-26 12:16:09 PST
This, combined with polyvariant Call/Construct inlining, is a 15-16% speed-up on Raytrace.
Comment 4 Oliver Hunt 2014-01-26 12:20:13 PST
(In reply to comment #3)
> This, combined with polyvariant Call/Construct inlining, is a 15-16% speed-up on Raytrace.

ooh, nice
Comment 5 Filip Pizlo 2014-01-26 12:35:02 PST
(In reply to comment #4)
> (In reply to comment #3)
> > This, combined with polyvariant Call/Construct inlining, is a 15-16% speed-up on Raytrace.
> 
> ooh, nice

Yup.
Comment 6 Filip Pizlo 2014-01-26 12:39:12 PST
Here's the combined impact of polyvariant call, construct, and get_by_id inlining, as well as our loosening of the restrictions on recursive inlining.


Benchmark report for SunSpider, Octane, Kraken, and AsmBench on oldmac (MacPro4,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/cStack/OpenSource/WebKitBuild/Release/jsc (r162775)
"Polyvariant" at /Volumes/Data/fromMiniMe/cStack/OpenSource/WebKitBuild/Release/jsc (r162802)

Collected 7 samples per benchmark/VM, with 7 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.
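
For readers unfamiliar with the right-hand column of the tables, here is a minimal sketch of how a "mean+-interval" pair and the "definitely"/"might be" labels could be derived. It is not the benchmark tool's code. The Student's t interval, the interval-overlap rule, and the 1.01 ratio threshold are assumptions inferred from the per-benchmark rows (the summary rows appear to label smaller ratios as well), and `mean_with_ci` and `compare` are hypothetical names.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 6 degrees of freedom
# (7 samples). Assumption: the report does not say which interval it uses.
T_95_DF6 = 2.447


def mean_with_ci(samples):
    """Return (mean, half_width) of a 95% confidence interval for 7 samples."""
    if len(samples) != 7:
        raise ValueError("T_95_DF6 is only valid for exactly 7 samples")
    mean = statistics.mean(samples)
    standard_error = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, T_95_DF6 * standard_error


def compare(base, base_ci, new, new_ci):
    """Label a row the way the report's right-hand column reads.

    Lower times are better. "definitely" is used when the two intervals
    do not overlap, "might be" when they overlap but the ratio is at
    least 1.01. Both rules are inferred from the table, not from the tool.
    """
    faster = new < base
    ratio = base / new if faster else new / base
    direction = "faster" if faster else "slower"
    overlap = abs(base - new) <= base_ci + new_ci
    if not overlap:
        return f"definitely {ratio:.4f}x {direction}"
    if ratio >= 1.01:
        return f"might be {ratio:.4f}x {direction}"
    return ""


# Octane raytrace row: prints "definitely 1.1790x faster"
print(compare(4.08674, 0.11312, 3.46633, 0.03018))
```

Because the printed means are already rounded, recomputing a ratio from them can differ from the reported ratio in the last digit.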

                                                TipOfTree                Polyvariant                                    
SunSpider:
   3d-cube                                    8.2543+-0.0749            8.2333+-0.1511        
   3d-morph                                   9.2846+-0.2250            9.2691+-0.1157        
   3d-raytrace                                9.7788+-0.2722            9.7150+-0.0529        
   access-binary-trees                        2.7546+-0.0173     !      2.9426+-0.0121        ! definitely 1.0683x slower
   access-fannkuch                            8.1062+-0.1008     ?      8.1651+-0.0838        ?
   access-nbody                               4.6833+-0.1771            4.4671+-0.1296          might be 1.0484x faster
   access-nsieve                              5.5225+-0.2111            5.4650+-0.1239          might be 1.0105x faster
   bitops-3bit-bits-in-byte                   1.9982+-0.0312     ?      2.0090+-0.0279        ?
   bitops-bits-in-byte                        6.6814+-0.0321            6.6639+-0.0190        
   bitops-bitwise-and                         3.0589+-0.0673            3.0476+-0.0139        
   bitops-nsieve-bits                         5.8736+-0.0810            5.7569+-0.1250          might be 1.0203x faster
   controlflow-recursive                      3.9936+-0.0468     ^      3.3387+-0.0278        ^ definitely 1.1962x faster
   crypto-aes                                 6.2213+-0.0279     ?      6.2234+-0.0255        ?
   crypto-md5                                 3.8075+-0.0526     ^      3.6968+-0.0291        ^ definitely 1.0299x faster
   crypto-sha1                                3.7536+-0.0368     ?      3.7608+-0.0212        ?
   date-format-tofte                         12.3529+-0.0818     !     12.6441+-0.1839        ! definitely 1.0236x slower
   date-format-xparb                          8.9583+-0.0575     ^      8.7369+-0.0866        ^ definitely 1.0253x faster
   math-cordic                                4.8908+-0.0517            4.8886+-0.0249        
   math-partial-sums                         10.3546+-0.1096     ?     10.4173+-0.1506        ?
   math-spectral-norm                         3.2556+-0.0364     ?      3.2887+-0.0389        ? might be 1.0102x slower
   regexp-dna                                13.0241+-0.1945     ?     13.0838+-0.1657        ?
   string-base64                              5.9731+-0.0482     ?      5.9746+-0.0953        ?
   string-fasta                              11.1139+-0.2209           11.1097+-0.1173        
   string-tagcloud                           15.8351+-0.1706           15.6483+-0.2020          might be 1.0119x faster
   string-unpack-code                        33.6274+-0.6313     ?     33.8924+-0.1845        ?
   string-validate-input                      7.8653+-0.1073            7.8404+-0.0598        

   <arithmetic> *                             8.1163+-0.0253            8.0877+-0.0243          might be 1.0035x faster
   <geometric>                                6.6451+-0.0221     ^      6.5917+-0.0170        ^ definitely 1.0081x faster
   <harmonic>                                 5.5843+-0.0167     ^      5.5303+-0.0152        ^ definitely 1.0098x faster

                                                TipOfTree                Polyvariant                                    
Octane and V8v7:
   encrypt                                   0.41947+-0.00054    ?     0.42183+-0.00353       ?
   decrypt                                   7.59646+-0.05317          7.56945+-0.04025       
   deltablue                        x2       0.50688+-0.00434          0.50290+-0.00519       
   earley                                    0.98659+-0.02694          0.96930+-0.00832         might be 1.0178x faster
   boyer                                    12.25735+-0.09908         12.23577+-0.04478       
   raytrace                         x2       4.08674+-0.11312    ^     3.46633+-0.03018       ^ definitely 1.1790x faster
   regexp                           x2      32.47881+-0.10753    ?    32.58171+-0.17481       ?
   richards                         x2       0.21225+-0.00624          0.21141+-0.00340       
   splay                            x2       0.61052+-0.00422    ^     0.58871+-0.00244       ^ definitely 1.0370x faster
   navier-stokes                    x2       7.79889+-0.00486    ?     7.80587+-0.03070       ?
   closure                                   0.74937+-0.01671    ?     0.76622+-0.00554       ? might be 1.0225x slower
   jquery                                   10.81645+-0.07306    !    11.03254+-0.02684       ! definitely 1.0200x slower
   gbemu                            x2      69.14262+-0.79314    ^    62.30284+-0.78711       ^ definitely 1.1098x faster
   mandreel                         x2      97.96271+-2.83544         97.44957+-0.68272       
   pdfjs                            x2     101.85877+-0.27247    ?   102.26682+-0.30497       ?
   box2d                            x2      30.28804+-0.64902         30.23497+-0.16425       

V8v7:
   <arithmetic>                              7.04050+-0.01758    ^     6.96939+-0.02719       ^ definitely 1.0102x faster
   <geometric> *                             2.12872+-0.00887    ^     2.07182+-0.00711       ^ definitely 1.0275x faster
   <harmonic>                                0.75943+-0.00996          0.74952+-0.00610         might be 1.0132x faster

Octane including V8v7:
   <arithmetic>                             27.79685+-0.25170    ^    27.22375+-0.10254       ^ definitely 1.0211x faster
   <geometric> *                             6.30850+-0.01485    ^     6.16355+-0.01226       ^ definitely 1.0235x faster
   <harmonic>                                1.14888+-0.01537          1.13631+-0.00882         might be 1.0111x faster

                                                TipOfTree                Polyvariant                                    
Kraken:
   ai-astar                                  495.650+-0.810            495.638+-0.390         
   audio-beat-detection                      221.692+-1.536      ?     222.753+-1.646         ?
   audio-dft                                 301.221+-5.226            299.042+-6.949         
   audio-fft                                 130.039+-0.116      ?     131.248+-2.567         ?
   audio-oscillator                          592.556+-9.839      ^     578.174+-3.019         ^ definitely 1.0249x faster
   imaging-darkroom                          298.977+-0.823            298.561+-1.151         
   imaging-desaturate                        105.858+-0.138      ?     105.867+-0.526         ?
   imaging-gaussian-blur                     202.816+-35.705           189.334+-1.013           might be 1.0712x faster
   json-parse-financial                       82.880+-0.263             82.461+-0.503         
   json-stringify-tinderbox                  106.786+-2.419            104.643+-0.793           might be 1.0205x faster
   stanford-crypto-aes                       160.896+-2.225      ^     157.776+-0.839         ^ definitely 1.0198x faster
   stanford-crypto-ccm                       108.134+-3.806            106.794+-0.659           might be 1.0125x faster
   stanford-crypto-pbkdf2                    272.840+-3.271            271.732+-5.495         
   stanford-crypto-sha256-iterative          116.669+-0.647      ?     117.601+-2.366         ?

   <arithmetic> *                            228.358+-2.643            225.830+-0.827           might be 1.0112x faster
   <geometric>                               189.878+-2.074            188.136+-0.744           might be 1.0093x faster
   <harmonic>                                162.401+-1.325            161.188+-0.647           might be 1.0075x faster

                                                TipOfTree                Polyvariant                                    
AsmBench:
   bigfib.cpp                              1194.8333+-19.1009        1187.6722+-11.3044       
   cray.c                                    56.2320+-0.2700     ?     56.7966+-1.3339        ? might be 1.0100x slower
   dry.c                                    881.8271+-56.9067    ?    898.1438+-62.3191       ? might be 1.0185x slower
   FloatMM.c                               1777.0444+-28.7809        1762.6009+-1.0136        
   gcc-loops.cpp                           2177.6453+-1.1674     ?   2177.7537+-1.4576        ?
   n-body.c                                3069.9110+-1.4905     ?   3081.2242+-28.9799       ?
   Quicksort.c                               83.4500+-0.4061           83.0268+-0.3604        
   stepanov_container.cpp                  9341.3638+-31.3451        9304.7015+-39.0187       
   Towers.c                                  69.8374+-0.5867     ?     70.6055+-0.3869        ? might be 1.0110x slower

   <arithmetic>                            2072.4605+-5.4676         2069.1694+-7.9309          might be 1.0016x faster
   <geometric> *                            695.8181+-4.7783     ?    697.3273+-6.1953        ? might be 1.0022x slower
   <harmonic>                               189.4043+-0.7306     ?    190.5190+-1.8482        ? might be 1.0059x slower

                                                TipOfTree                Polyvariant                                    
All benchmarks:
   <arithmetic>                             303.7720+-0.8340          302.6966+-1.0118          might be 1.0036x faster
   <geometric>                               21.3231+-0.0741     ^     21.0623+-0.0531        ^ definitely 1.0124x faster
   <harmonic>                                 2.7351+-0.0305            2.7059+-0.0180          might be 1.0108x faster

                                                TipOfTree                Polyvariant                                    
Geomean of preferred means:
   <scaled-result>                           53.4062+-0.2108     ^     52.9316+-0.1753        ^ definitely 1.0090x faster
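
The <arithmetic>, <geometric>, and <harmonic> rows summarize each suite in three ways. As a rough sketch, the TipOfTree column of the AsmBench section can be fed to Python's `statistics` module. The arithmetic mean should reproduce the reported 2072.4605. The other two should land near the reported 695.8181 and 189.4043, though not necessarily on them, since the tool may average per sample rather than over the rounded per-benchmark means (that is an assumption; the report does not say).

```python
import statistics

# TipOfTree column of the AsmBench section, copied from the report (ms).
asmbench_tip_of_tree = [
    1194.8333, 56.2320, 881.8271, 1777.0444, 2177.6453,
    3069.9110, 83.4500, 9341.3638, 69.8374,
]

arithmetic = statistics.mean(asmbench_tip_of_tree)
geometric = statistics.geometric_mean(asmbench_tip_of_tree)
harmonic = statistics.harmonic_mean(asmbench_tip_of_tree)

print(f"<arithmetic> {arithmetic:.4f}")
print(f"<geometric>  {geometric:.4f}")
print(f"<harmonic>   {harmonic:.4f}")
```

The arithmetic mean is dominated by the longest-running benchmark (stepanov_container.cpp here), while the geometric and harmonic means give short benchmarks more weight, which is why the three rows can disagree about the direction of a small change.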
Comment 7 Filip Pizlo 2014-01-26 12:45:50 PST
Created attachment 222282 [details]
the patch
Comment 8 Filip Pizlo 2014-01-26 14:35:33 PST
Landed in http://trac.webkit.org/changeset/162811