Bug 68136

Summary: Tiered compilation should be enabled by default on platforms that support the DFG JIT
Product: WebKit
Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCore
Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal    
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description                       Flags
the patch                         webkit-ews: commit-queue-
the patch - fix everything        webkit-ews: commit-queue-
the patch - for real this time    sam: review+

Description Filip Pizlo 2011-09-14 18:10:57 PDT
Tiered compilation is now a pure win on all benchmarks, both in the command-line bencher harness and in the in-browser harnesses.  It should be turned on by default.



Performance results are below.  Command-line results from bencher come first, followed by the in-browser results.

Summary:
SunSpider: no change
V8: 2.6% speed-up in command line, 3% in browser
Kraken: 6% speed-up in command line, 6.9% in browser.
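
As a rough cross-check of the in-browser figures in this summary, a speed-up percentage can be derived from a pair of totals. The sketch below is only an illustration: the helper name `speedup_percent` and its `higher_is_better` flag are assumptions and are not part of bencher or of either harness. It uses the in-browser Kraken totals (times, lower is better) and the in-browser V8 scores (higher is better) reported further down in this description.

```python
def speedup_percent(before, after, higher_is_better=False):
    """Return the speed-up of `after` relative to `before`, in percent.

    For times (lower is better) the ratio is before/after; for scores
    (higher is better) it is after/before.
    """
    ratio = after / before if higher_is_better else before / after
    return (ratio - 1.0) * 100.0


# In-browser Kraken totals in ms: TipOfTree vs. Tiered Compilation.
kraken = speedup_percent(7619.7, 7129.9)
# In-browser V8 scores: TipOfTree vs. TieredCompilation.
v8 = speedup_percent(4660, 4806, higher_is_better=True)

print(f"Kraken in browser: {kraken:.1f}% faster")
print(f"V8 in browser: {v8:.1f}% faster")
```

Both values round to the percentages quoted in the summary for the browser runs.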



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/jsc
"TipOfTreeDyn" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.
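
For readers who want to reproduce a "mean +- interval" figure from raw timings, here is a minimal sketch. It assumes a two-sided 95% interval based on Student's t-distribution with 11 degrees of freedom (12 samples); whether bencher computes its intervals exactly this way is an assumption, and the function name `mean_with_ci` is hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 11 degrees of freedom
# (12 samples). Using the t-distribution here is an assumption about the
# harness, not something stated in the report.
T_CRIT_95_DF11 = 2.201


def mean_with_ci(samples, t_crit=T_CRIT_95_DF11):
    """Return (mean, half_width) so the result reads as mean +- half_width."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    return mean, t_crit * sem


# Twelve made-up timings in milliseconds, purely for illustration.
example = [8.41, 8.43, 8.44, 8.42, 8.45, 8.43, 8.42, 8.44, 8.43, 8.41, 8.45, 8.43]
mean, half_width = mean_with_ci(example)
print(f"{mean:.4f}+-{half_width:.4f}")
```

The timings in `example` are invented; only the sample count of 12 comes from the report.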

                                           TipOfTree              TipOfTreeDyn                                  
SunSpider:
  3d-cube                                8.4302+-0.0222    ^     8.3198+-0.0266       ^ definitely 1.0133x faster
  3d-morph                               8.0814+-0.0199          8.0445+-0.0216       
  3d-raytrace                            8.1582+-0.1157          8.1228+-0.0901       
  access-binary-trees                    2.5265+-0.0284    ?     2.5576+-0.0348       ? might be 1.0123x slower
  access-fannkuch                       12.8357+-0.0537    ?    12.8565+-0.1030       ?
  access-nbody                           4.9249+-0.0131    ^     4.8798+-0.0267       ^ definitely 1.0092x faster
  access-nsieve                          3.0600+-0.0317    !     3.1244+-0.0218       ! definitely 1.0211x slower
  bitops-3bit-bits-in-byte               1.8164+-0.0165    ^     1.7234+-0.0227       ^ definitely 1.0539x faster
  bitops-bits-in-byte                    4.5293+-0.0526          4.4825+-0.0408         might be 1.0104x faster
  bitops-bitwise-and                     4.1277+-0.0398          4.1165+-0.0351       
  bitops-nsieve-bits                     5.6712+-0.0435    ?     5.7023+-0.0300       ?
  controlflow-recursive                  2.3055+-0.0273          2.2834+-0.0296       
  crypto-aes                             6.9125+-0.0430    !     7.2886+-0.1815       ! definitely 1.0544x slower
  crypto-md5                             3.1190+-0.0591          3.0258+-0.0589         might be 1.0308x faster
  crypto-sha1                            2.4611+-0.0318          2.4469+-0.0317       
  date-format-tofte                     10.8462+-0.0381    ?    11.0204+-0.1420       ? might be 1.0161x slower
  date-format-xparb                      9.4480+-0.0600    ^     9.0535+-0.1051       ^ definitely 1.0436x faster
  math-cordic                            7.1222+-0.0299          7.1005+-0.0217       
  math-partial-sums                     10.7093+-0.0171         10.5327+-0.1899         might be 1.0168x faster
  math-spectral-norm                     2.7595+-0.0303    !     2.8293+-0.0251       ! definitely 1.0253x slower
  regexp-dna                            12.6190+-0.1450         12.5731+-0.1250       
  string-base64                          6.5514+-0.0865    ^     6.1306+-0.0649       ^ definitely 1.0687x faster
  string-fasta                           8.3384+-0.0543    !     9.0432+-0.0120       ! definitely 1.0845x slower
  string-tagcloud                       13.4682+-0.0862         13.3401+-0.0473       
  string-unpack-code                    21.2893+-0.1294    ^    20.8310+-0.1483       ^ definitely 1.0220x faster
  string-validate-input                  7.4814+-0.2716    ^     7.0612+-0.0435       ^ definitely 1.0595x faster

  <arithmetic>                           7.2920+-0.0323          7.2496+-0.0274       
  <geometric>                            6.0408+-0.0301          6.0064+-0.0286       
  <harmonic>                             4.9374+-0.0297          4.8995+-0.0310       

                                           TipOfTree              TipOfTreeDyn                                  
V8:
  crypto                               103.2552+-0.4108    ^    93.5304+-0.3761       ^ definitely 1.1040x faster
  deltablue                            292.5504+-2.2242    ^   285.5507+-2.4859       ^ definitely 1.0245x faster
  earley-boyer                         112.3912+-0.2282    !   114.2831+-0.4305       ! definitely 1.0168x slower
  raytrace                              87.6601+-0.5881         86.8542+-0.2573       
  regexp                               132.9250+-0.5788    ^   130.0159+-0.5122       ^ definitely 1.0224x faster
  richards                             300.6141+-1.8467    ^   284.7736+-1.3857       ^ definitely 1.0556x faster
  splay                                145.5287+-1.1222    ?   146.5276+-0.8034       ?

  <arithmetic>                         167.8464+-0.4878    ^   163.0765+-0.5047       ^ definitely 1.0292x faster
  <geometric>                          150.2703+-0.3533    ^   146.3499+-0.2865       ^ definitely 1.0268x faster
  <harmonic>                           136.8999+-0.2936    ^   133.3658+-0.1962       ^ definitely 1.0265x faster

                                           TipOfTree              TipOfTreeDyn                                  
Kraken:
  ai-astar                            1667.9173+-16.6919   !  1710.0280+-18.8938      ! definitely 1.0252x slower
  audio-beat-detection                 541.8131+-1.8633    ^   533.0010+-1.7733       ^ definitely 1.0165x faster
  audio-dft                            452.0701+-8.0429    ?   458.7987+-6.4440       ? might be 1.0149x slower
  audio-fft                            421.3918+-0.8967    ^   416.0566+-0.8896       ^ definitely 1.0128x faster
  audio-oscillator                     403.1402+-0.6423    ^   372.7753+-4.5984       ^ definitely 1.0815x faster
  imaging-darkroom                     585.7165+-13.8667   ^   467.8477+-1.8063       ^ definitely 1.2519x faster
  imaging-desaturate                   652.0937+-0.1374    ^   251.4218+-0.0225       ^ definitely 2.5936x faster
  imaging-gaussian-blur               1844.7490+-1.9134       1844.4333+-1.5254       
  json-parse-financial                  63.1134+-0.3853         62.6635+-0.2933       
  json-stringify-tinderbox              83.7694+-0.3490         83.7252+-0.4869       
  stanford-crypto-aes                  166.9100+-1.2094    ^   164.1889+-0.4856       ^ definitely 1.0166x faster
  stanford-crypto-ccm                  131.0104+-0.6111    ?   131.2150+-1.6770       ?
  stanford-crypto-pbkdf2               370.9821+-1.3787    !   436.4344+-3.0435       ! definitely 1.1764x slower
  stanford-crypto-sha256-iterative     143.6550+-0.1511    !   163.3373+-0.4790       ! definitely 1.1370x slower

  <arithmetic>                         537.7380+-0.8751    ^   506.8519+-1.8089       ^ definitely 1.0609x faster
  <geometric>                          342.1178+-0.8278    ^   319.0591+-0.9121       ^ definitely 1.0723x faster
  <harmonic>                           217.5167+-0.4851    ^   210.7909+-0.6950       ^ definitely 1.0319x faster

                                           TipOfTree              TipOfTreeDyn                                  
All benchmarks:
  <arithmetic>                         189.2096+-0.2348    ^   179.2756+-0.5298       ^ definitely 1.0554x faster
  <geometric>                           32.4465+-0.0987    ^    31.5542+-0.0959       ^ definitely 1.0283x faster
  <harmonic>                             8.7337+-0.0514          8.6626+-0.0538       
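
The markers in the report above (`^`, `!`, `?`, or none) appear to follow a simple rule: a result is "definitely" faster or slower when the two confidence intervals do not overlap, and a "might be" note is printed when the intervals overlap but the means differ by roughly 1% or more. The sketch below encodes that reading. It is inferred from the rows of the table and is not taken from bencher's source; the function name `annotate` and the 1.01 threshold are assumptions.

```python
def annotate(base_mean, base_ci, new_mean, new_ci, might_threshold=1.01):
    """Return (marker, note) for one row, following the rule inferred above."""
    base_lo, base_hi = base_mean - base_ci, base_mean + base_ci
    new_lo, new_hi = new_mean - new_ci, new_mean + new_ci

    faster = new_mean < base_mean
    # Express the ratio as a value >= 1 in whichever direction applies.
    ratio = base_mean / new_mean if faster else new_mean / base_mean
    direction = "faster" if faster else "slower"

    overlap = base_lo <= new_hi and new_lo <= base_hi
    if not overlap:
        marker = "^" if faster else "!"
        return marker, f"definitely {ratio:.4f}x {direction}"

    # Overlapping intervals: inconclusive. Slower rows get "?", faster rows
    # get no marker, and a note appears only above the threshold (assumed 1%).
    marker = "" if faster else "?"
    note = f"might be {ratio:.4f}x {direction}" if ratio >= might_threshold else ""
    return marker, note


# Rows taken from the SunSpider table above.
print(annotate(8.4302, 0.0222, 8.3198, 0.0266))  # 3d-cube
print(annotate(2.5265, 0.0284, 2.5576, 0.0348))  # access-binary-trees
print(annotate(8.0814, 0.0199, 8.0445, 0.0216))  # 3d-morph
```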



Tests in browser:


FROM = TipOfTree
TO = Tiered Compilation


TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           ??                191.3ms +/- 1.0%   191.6ms +/- 0.6%     not conclusive: might be *1.002x as slow*

=============================================================================

 3d:                  -                  25.1ms +/- 0.9%    25.0ms +/- 0.0% 
   cube:              -                   9.0ms +/- 0.0%     9.0ms +/- 0.0% 
   morph:             -                   8.0ms +/- 0.0%     8.0ms +/- 0.0% 
   raytrace:          -                   8.1ms +/- 2.8%     8.0ms +/- 0.0% 

 access:              ??                 22.8ms +/- 3.2%    23.5ms +/- 2.6%     not conclusive: might be *1.031x as slow*
   binary-trees:      *1.21x as slow*     2.4ms +/- 15.4%     2.9ms +/- 7.8%     significant
   fannkuch:          -                  12.8ms +/- 2.4%    12.8ms +/- 2.4% 
   nbody:             ??                  4.6ms +/- 8.0%     4.7ms +/- 7.3%     not conclusive: might be *1.022x as slow*
   nsieve:            ??                  3.0ms +/- 0.0%     3.1ms +/- 7.3%     not conclusive: might be *1.033x as slow*

 bitops:              ??                 15.5ms +/- 5.9%    16.0ms +/- 2.1%     not conclusive: might be *1.032x as slow*
   3bit-bits-in-byte: -                   1.6ms +/- 23.1%     1.5ms +/- 25.1% 
   bits-in-byte:      ??                  4.4ms +/- 8.4%     4.5ms +/- 8.4%     not conclusive: might be *1.023x as slow*
   bitwise-and:       -                   4.0ms +/- 0.0%     4.0ms +/- 0.0% 
   nsieve-bits:       *1.091x as slow*    5.5ms +/- 6.8%     6.0ms +/- 0.0%     significant

 controlflow:         ??                  2.0ms +/- 0.0%     2.1ms +/- 10.8%     not conclusive: might be *1.050x as slow*
   recursive:         ??                  2.0ms +/- 0.0%     2.1ms +/- 10.8%     not conclusive: might be *1.050x as slow*

 crypto:              *1.092x as slow*   12.0ms +/- 4.0%    13.1ms +/- 3.1%     significant
   aes:               *1.125x as slow*    7.2ms +/- 4.2%     8.1ms +/- 2.8%     significant
   md5:               ??                  2.6ms +/- 14.2%     2.9ms +/- 7.8%     not conclusive: might be *1.115x as slow*
   sha1:              -                   2.2ms +/- 13.7%     2.1ms +/- 10.8% 

 date:                -                  23.0ms +/- 2.1%    22.7ms +/- 1.5% 
   format-tofte:      ??                 12.8ms +/- 3.5%    12.9ms +/- 3.1%     not conclusive: might be *1.008x as slow*
   format-xparb:      1.041x as fast     10.2ms +/- 3.0%     9.8ms +/- 3.1%     significant

 math:                -                  20.5ms +/- 3.8%    20.2ms +/- 2.2% 
   cordic:            -                   7.2ms +/- 6.3%     7.1ms +/- 3.2% 
   partial-sums:      -                  10.6ms +/- 3.5%    10.3ms +/- 3.4% 
   spectral-norm:     ??                  2.7ms +/- 12.8%     2.8ms +/- 10.8%     not conclusive: might be *1.037x as slow*

 regexp:              -                  11.9ms +/- 3.4%    11.4ms +/- 3.2% 
   dna:               -                  11.9ms +/- 3.4%    11.4ms +/- 3.2% 

 string:              -                  58.5ms +/- 1.9%    57.6ms +/- 2.0% 
   base64:            1.167x as fast      7.7ms +/- 8.8%     6.6ms +/- 5.6%     significant
   fasta:             *1.059x as slow*    8.5ms +/- 4.4%     9.0ms +/- 0.0%     significant
   tagcloud:          ??                 12.6ms +/- 2.9%    12.9ms +/- 1.8%     not conclusive: might be *1.024x as slow*
   unpack-code:       ??                 21.3ms +/- 2.3%    21.4ms +/- 3.2%     not conclusive: might be *1.005x as slow*
   validate-input:    1.091x as fast      8.4ms +/- 5.9%     7.7ms +/- 4.5%     significant



TipOfTree:
Score: 4660
Richards: 4084
DeltaBlue: 3587
Crypto: 9954
RayTrace: 5228
EarleyBoyer: 8052
RegExp: 1787
Splay: 4352

TieredCompilation:   (3% faster)
Score: 4806
Richards: 4377
DeltaBlue: 3623
Crypto: 10748
RayTrace: 5416
EarleyBoyer: 7892
RegExp: 1785
Splay: 4555
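
The V8 suite's overall score is the geometric mean of the individual benchmark scores, so the two "Score" lines above can be checked against the per-benchmark numbers. The following is a small sketch of that check; the dictionary and function names are made up for illustration.

```python
import statistics

# In-browser V8 scores copied from the two listings above (higher is better).
tip_of_tree = {
    "Richards": 4084, "DeltaBlue": 3587, "Crypto": 9954, "RayTrace": 5228,
    "EarleyBoyer": 8052, "RegExp": 1787, "Splay": 4352,
}
tiered = {
    "Richards": 4377, "DeltaBlue": 3623, "Crypto": 10748, "RayTrace": 5416,
    "EarleyBoyer": 7892, "RegExp": 1785, "Splay": 4555,
}


def suite_score(scores):
    """Geometric mean of the individual benchmark scores."""
    return statistics.geometric_mean(scores.values())


before = suite_score(tip_of_tree)
after = suite_score(tiered)
print(f"TipOfTree: {before:.0f}")
print(f"TieredCompilation: {after:.0f}")
print(f"Change: {(after / before - 1) * 100:.1f}%")
```

Because the overall score is a geometric mean, a large gain on one benchmark (Crypto here) is not outweighed by small losses on others (EarleyBoyer, RegExp) the way it could be in a sum of raw scores.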



FROM = TipOfTree
TO = Tiered Compilation


TEST                         COMPARISON            FROM                 TO               DETAILS

====================================================================================

** TOTAL **:                 1.069x as fast    7619.7ms +/- 0.2%   7129.9ms +/- 0.4%     significant

====================================================================================

 ai:                        ??                1690.9ms +/- 1.1%   1717.7ms +/- 1.5%     might be *1.016x as slow*
   astar:                   ??                1690.9ms +/- 1.1%   1717.7ms +/- 1.5%     might be *1.016x as slow*

 audio:                     1.033x as fast    1842.6ms +/- 0.2%   1784.4ms +/- 0.3%     significant
   beat-detection:          1.024x as fast     563.0ms +/- 0.5%    549.6ms +/- 0.7%     significant
   dft:                     *1.009x as slow*   450.9ms +/- 0.3%    455.1ms +/- 0.7%     significant
   fft:                     1.016x as fast     425.6ms +/- 0.7%    419.0ms +/- 0.6%     significant
   oscillator:              1.118x as fast     403.1ms +/- 0.4%    360.7ms +/- 0.4%     significant

 imaging:                   1.21x as fast     3095.3ms +/- 0.1%   2557.2ms +/- 0.2%     significant
   gaussian-blur:           1.002x as fast    1819.6ms +/- 0.1%   1816.3ms +/- 0.1%     significant
   darkroom:                1.27x as fast      622.6ms +/- 0.3%    488.9ms +/- 1.2%     significant
   desaturate:              2.59x as fast      653.1ms +/- 0.2%    252.0ms +/- 0.4%     significant

 json:                      -                  159.1ms +/- 0.6%    159.1ms +/- 0.3% 
   parse-financial:         -                   75.2ms +/- 1.1%     74.6ms +/- 0.5% 
   stringify-tinderbox:     *1.007x as slow*    83.9ms +/- 0.5%     84.5ms +/- 0.4%     significant

 stanford:                  *1.096x as slow*   831.8ms +/- 0.5%    911.5ms +/- 0.3%     significant
   crypto-aes:              ??                 166.2ms +/- 1.3%    167.4ms +/- 1.5%     might be *1.007x as slow*
   crypto-ccm:              -                  142.5ms +/- 1.1%    142.0ms +/- 0.8% 
   crypto-pbkdf2:           *1.167x as slow*   373.3ms +/- 0.8%    435.8ms +/- 0.7%     significant
   crypto-sha256-iterative: *1.110x as slow*   149.8ms +/- 0.7%    166.3ms +/- 0.7%     significant
Comment 1 Filip Pizlo 2011-09-14 18:18:55 PDT
Created attachment 107435
the patch
Comment 2 Early Warning System Bot 2011-09-14 18:36:43 PDT
Comment on attachment 107435
the patch

Attachment 107435 did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/9661764
Comment 3 Filip Pizlo 2011-09-14 18:39:04 PDT
Created attachment 107437
the patch - fix everything
Comment 4 Early Warning System Bot 2011-09-14 18:51:18 PDT
Comment on attachment 107437
the patch - fix everything

Attachment 107437 did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/9655784
Comment 5 Filip Pizlo 2011-09-14 19:05:04 PDT
Created attachment 107442
the patch - for real this time
Comment 6 Filip Pizlo 2011-09-14 20:42:14 PDT
Will wait for the bots to go green before committing, since I'm not yet 100% sure that I've got all those #ifdefs right.
Comment 7 Filip Pizlo 2011-09-15 11:54:38 PDT
Landed in r95206.


The final numbers (comparing r95205 to r95206):



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc
"TieredCompilation" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.

                                            TipOfTree           TieredCompilation                                
SunSpider:
   3d-cube                                7.9594+-0.2150          7.9262+-0.1658       
   3d-morph                               7.5548+-0.1317    ?     7.5557+-0.1786       ?
   3d-raytrace                            8.2024+-0.3451          7.6801+-0.2302         might be 1.0680x faster
   access-binary-trees                    2.2097+-0.0229    !     2.2953+-0.0507       ! definitely 1.0387x slower
   access-fannkuch                       11.6981+-0.2218    ?    11.8111+-0.2361       ?
   access-nbody                           4.3410+-0.1316          4.2497+-0.1391         might be 1.0215x faster
   access-nsieve                          2.5923+-0.0997    ?     2.6411+-0.0757       ? might be 1.0188x slower
   bitops-3bit-bits-in-byte               1.7042+-0.0569    ?     1.7102+-0.0685       ?
   bitops-bits-in-byte                    2.7652+-0.0643    ?     2.8248+-0.1128       ? might be 1.0215x slower
   bitops-bitwise-and                     3.7014+-0.1329          3.6661+-0.1160       
   bitops-nsieve-bits                     5.4193+-0.0787          5.2836+-0.1220         might be 1.0257x faster
   controlflow-recursive                  1.9813+-0.0341    ?     2.0904+-0.0850       ? might be 1.0551x slower
   crypto-aes                             6.7605+-0.2467    ?     7.1447+-0.3279       ? might be 1.0568x slower
   crypto-md5                             2.7752+-0.0483    ?     3.3646+-0.6840       ? might be 1.2124x slower
   crypto-sha1                            2.2489+-0.0621          2.1552+-0.0704         might be 1.0435x faster
   date-format-tofte                     11.2852+-0.3376    ^    10.3442+-0.2719       ^ definitely 1.0910x faster
   date-format-xparb                      8.8027+-0.2859          8.7357+-0.2227       
   math-cordic                            6.3806+-0.1088          6.2527+-0.1303         might be 1.0204x faster
   math-partial-sums                      7.6559+-0.1588          7.3674+-0.1612         might be 1.0392x faster
   math-spectral-norm                     2.4454+-0.0401    !     2.5832+-0.0443       ! definitely 1.0563x slower
   regexp-dna                            11.1329+-0.2861         10.7977+-0.2008         might be 1.0310x faster
   string-base64                          5.8202+-0.1199          5.8175+-0.1956       
   string-fasta                           7.1207+-0.0827    !     8.5540+-0.2553       ! definitely 1.2013x slower
   string-tagcloud                       11.8230+-0.2290    ?    11.8602+-0.2660       ?
   string-unpack-code                    19.1518+-0.4605         19.1024+-0.4931       
   string-validate-input                  7.5095+-0.3377    ^     6.6418+-0.1659       ^ definitely 1.1307x faster

   <arithmetic>                           6.5785+-0.0437          6.5560+-0.0377       
   <geometric>                            5.3869+-0.0303    ?     5.4105+-0.0375       ?
   <harmonic>                             4.3522+-0.0249    ?     4.4074+-0.0392       ? might be 1.0127x slower

                                            TipOfTree           TieredCompilation                                
V8:
   crypto                                90.8545+-0.8925    ^    85.2666+-2.3793       ^ definitely 1.0655x faster
   deltablue                            265.8815+-3.5027    ^   247.8677+-2.3100       ^ definitely 1.0727x faster
   earley-boyer                          94.4787+-0.7301    ?    95.5216+-0.8552       ? might be 1.0110x slower
   raytrace                              76.1163+-1.1740    ^    71.9430+-0.4490       ^ definitely 1.0580x faster
   regexp                               108.4889+-1.3826        107.0564+-0.5068         might be 1.0134x faster
   richards                             239.9382+-2.8300    ^   220.9082+-2.7744       ^ definitely 1.0861x faster
   splay                                 99.3576+-0.2353    ?    99.8593+-0.4493       ?

   <arithmetic>                         139.3022+-0.6465    ^   132.6318+-0.5601       ^ definitely 1.0503x faster
   <geometric>                          123.9271+-0.5100    ^   119.2335+-0.4920       ^ definitely 1.0394x faster
   <harmonic>                           112.9873+-0.5110    ^   109.3138+-0.5171       ^ definitely 1.0336x faster

                                            TipOfTree           TieredCompilation                                
Kraken:
   ai-astar                            1220.2982+-33.6219   ^   651.2664+-13.5760      ^ definitely 1.8737x faster
   audio-beat-detection                 481.6183+-3.7772        476.8139+-5.5267         might be 1.0101x faster
   audio-dft                            423.7553+-5.9952        417.3201+-3.1266         might be 1.0154x faster
   audio-fft                            369.7725+-0.7480        369.4925+-4.5536       
   audio-oscillator                     392.0498+-16.2092   ^   323.1123+-7.5327       ^ definitely 1.2134x faster
   imaging-darkroom                     538.4024+-9.1255    ^   417.7482+-3.7546       ^ definitely 1.2888x faster
   imaging-desaturate                   629.2546+-14.4686   ^   211.4487+-1.9027       ^ definitely 2.9759x faster
   imaging-gaussian-blur               1729.7578+-6.4878    ?  1739.8427+-19.4150      ?
   json-parse-financial                  48.4246+-0.4196    ?    49.0291+-0.8502       ? might be 1.0125x slower
   json-stringify-tinderbox              70.2918+-2.9485         68.1125+-0.4787         might be 1.0320x faster
   stanford-crypto-aes                  147.8755+-2.6111    ^   144.7200+-0.4577       ^ definitely 1.0218x faster
   stanford-crypto-ccm                  113.2348+-1.2837        111.8843+-0.7783         might be 1.0121x faster
   stanford-crypto-pbkdf2               339.8100+-9.8956    !   404.5472+-2.2667       ! definitely 1.1905x slower
   stanford-crypto-sha256-iterative     132.5923+-0.9317    !   150.5893+-1.3005       ! definitely 1.1357x slower

   <arithmetic>                         474.0813+-3.5488    ^   395.4234+-1.9613       ^ definitely 1.1989x faster
   <geometric>                          302.5975+-1.5819    ^   263.5863+-0.6790       ^ definitely 1.1480x faster
   <harmonic>                           185.2691+-1.1003    ^   175.9015+-0.5961       ^ definitely 1.0533x faster

                                            TipOfTree           TieredCompilation                                
All benchmarks:
   <arithmetic>                         165.6020+-1.0020    ^   141.1661+-0.5955       ^ definitely 1.1731x faster
   <geometric>                           28.5295+-0.1039    ^    27.2891+-0.1128       ^ definitely 1.0455x faster
   <harmonic>                             7.6904+-0.0429    ?     7.7777+-0.0675       ? might be 1.0114x slower
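
The three summary rows can point in different directions, as in "All benchmarks" above, where the arithmetic mean shows a clear win while the harmonic mean leans slightly slower. The arithmetic mean is dominated by the longest-running tests (Kraken), whereas the harmonic mean is dominated by the shortest ones (SunSpider). The sketch below computes all three means for the V8 columns of this final report. The arithmetic means match the reported rows; the geometric and harmonic values are computed here from the rounded per-benchmark means, so they need not agree with the report to the last digit. The helper name `three_means` is hypothetical.

```python
import statistics

# V8 per-benchmark mean times (ms) from the final report above.
tip_of_tree = [90.8545, 265.8815, 94.4787, 76.1163, 108.4889, 239.9382, 99.3576]
tiered = [85.2666, 247.8677, 95.5216, 71.9430, 107.0564, 220.9082, 99.8593]


def three_means(times):
    """Return (arithmetic, geometric, harmonic) means of a list of times."""
    return (
        statistics.fmean(times),
        statistics.geometric_mean(times),
        statistics.harmonic_mean(times),
    )


for name, times in (("TipOfTree", tip_of_tree), ("TieredCompilation", tiered)):
    arithmetic, geometric, harmonic = three_means(times)
    print(f"{name}: arithmetic={arithmetic:.4f} "
          f"geometric={geometric:.4f} harmonic={harmonic:.4f}")
```

For any set of positive times the harmonic mean is at most the geometric mean, which is at most the arithmetic mean, which is why the three rows always appear in decreasing order in the report.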