Bug 113010

Summary: Fix some minor issues in the DFG's profiling of heap accesses
Product: WebKit
Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCore
Assignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED
Severity: Normal
CC: barraclough, buildbot, ggaren, mark.lam, mhahnenberg, msaboff, oliver, rniwa, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 113093    
Bug Blocks:    
Attachments:
the patch (flags: ggaren: review+, buildbot: commit-queue-)
Archive of layout-test-results from webkit-ews-10 (flags: none)
Archive of layout-test-results from webkit-ews-11 (flags: none)

Description Filip Pizlo 2013-03-21 23:54:07 PDT
1) If a CodeBlock gets jettisoned by the GC, we should count its exit sites.

2) If a CodeBlock clears a structure stub during GC, it should record this, and the DFG should prefer to not inline that access (i.e. treat it as if it had an exit site).

3) If a PutById was seen by the baseline JIT, and the JIT attempted to cache it but chose not to, then assume that it will take the slow path.

4) If we frequently exited because of a structure check on a weak constant, don't try to inline that access in the future.

5) Treat every exit that was counted as frequent.
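
A rough sketch of how the five rules above might combine into a single inlining decision, written in Python for brevity. AccessProfile, its fields, should_inline, and count_exits_on_jettison are hypothetical names used only for illustration; they are not JavaScriptCore types, and the actual patch may structure this quite differently.

```python
from dataclasses import dataclass


@dataclass
class AccessProfile:
    """Hypothetical profiling record for one heap access site (not a JSC type)."""
    exit_count: int = 0                 # exits counted at this site (items 1 and 5)
    stub_cleared_by_gc: bool = False    # structure stub was cleared during GC (item 2)
    cache_attempted: bool = False       # baseline JIT tried to cache the access (item 3)
    cache_installed: bool = False       # baseline JIT actually installed a cache (item 3)
    weak_constant_check_exits: int = 0  # exits from structure checks on weak constants (item 4)


def count_exits_on_jettison(profiles, exit_sites):
    """Item 1: when the GC jettisons optimized code, keep its exit sites in the profile."""
    for site in exit_sites:
        profiles.setdefault(site, AccessProfile()).exit_count += 1


def should_inline(profile):
    """Return True if this sketch would let the optimizing compiler inline the access."""
    if profile.exit_count > 0:
        # Item 5: any counted exit is treated as frequent.
        return False
    if profile.stub_cleared_by_gc:
        # Item 2: a cleared stub is treated as if the site had an exit.
        return False
    if profile.cache_attempted and not profile.cache_installed:
        # Item 3: caching was attempted but declined, so assume the slow path.
        return False
    if profile.weak_constant_check_exits > 0:
        # Item 4: structure checks on a weak constant kept failing here.
        return False
    return True
```

In this sketch every signal is a veto: a site is inlined only when none of the five conditions has been observed, which matches the conservative direction of the list above.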
Comment 1 Filip Pizlo 2013-03-21 23:54:52 PDT
Here's why:


Benchmark report for SunSpider, V8Spider, Octane, Kraken, and JSRegress on oldmac (MacPro4,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc (r146548)
"FixGBEMU" at /Volumes/Data/fromMiniMe/tertiary/OpenSource/WebKitBuild/Release/jsc (r146548)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get
microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                     TipOfTree                  FixGBEMU                                     
SunSpider:
   3d-cube                                         9.0846+-0.1484            9.0303+-0.1859        
   3d-morph                                        8.8789+-0.1491            8.8739+-0.1287        
   3d-raytrace                                    10.5132+-0.1462     ?     10.5844+-0.1862        ?
   access-binary-trees                             1.9329+-0.0172            1.9249+-0.0091        
   access-fannkuch                                 7.8821+-0.1174            7.8035+-0.1064          might be 1.0101x faster
   access-nbody                                    4.5769+-0.0745     ?      4.6414+-0.0440        ? might be 1.0141x slower
   access-nsieve                                   4.9256+-0.0388     ?      5.0202+-0.0831        ? might be 1.0192x slower
   bitops-3bit-bits-in-byte                        1.8394+-0.0104     ?      1.8557+-0.0361        ?
   bitops-bits-in-byte                             7.1097+-0.1183            7.0279+-0.1173          might be 1.0116x faster
   bitops-bitwise-and                              2.5459+-0.0933     ?      2.5787+-0.0812        ? might be 1.0129x slower
   bitops-nsieve-bits                              4.1692+-0.0247            4.1654+-0.0165        
   controlflow-recursive                           3.1190+-0.0508     ?      3.1247+-0.0456        ?
   crypto-aes                                      7.7251+-0.1143     ?      7.7696+-0.1197        ?
   crypto-md5                                      4.1234+-0.0303     !      4.2755+-0.0454        ! definitely 1.0369x slower
   crypto-sha1                                     3.2872+-0.0295            3.2739+-0.0177        
   date-format-tofte                              15.0805+-0.1832           14.9441+-0.1574        
   date-format-xparb                               9.4615+-0.1663     ?      9.5086+-0.1850        ?
   math-cordic                                     4.0385+-0.0102     ?      4.0715+-0.0343        ?
   math-partial-sums                              12.5037+-0.1265     ^     12.1629+-0.1017        ^ definitely 1.0280x faster
   math-spectral-norm                              3.1927+-0.0645            3.1522+-0.0145          might be 1.0129x faster
   regexp-dna                                     11.4206+-0.1681     ?     11.5162+-0.1477        ?
   string-base64                                   4.7809+-0.0564            4.7635+-0.0727        
   string-fasta                                   10.8818+-0.1146     ?     11.0069+-0.1722        ? might be 1.0115x slower
   string-tagcloud                                14.1358+-0.2240     ?     14.2511+-0.2715        ?
   string-unpack-code                             28.0697+-0.2002           27.9494+-0.1280        
   string-validate-input                           7.3010+-0.1486     ?      7.3107+-0.1245        ?

   <arithmetic> *                                  7.7915+-0.0580     ?      7.7918+-0.0609        ? might be 1.0000x slower
   <geometric>                                     6.2628+-0.0399     ?      6.2756+-0.0417        ? might be 1.0020x slower
   <harmonic>                                      5.0539+-0.0286     ?      5.0710+-0.0273        ? might be 1.0034x slower

                                                     TipOfTree                  FixGBEMU                                     
V8Spider:
   crypto                                         87.2903+-0.1972     !     88.9008+-0.3482        ! definitely 1.0185x slower
   deltablue                                     125.1053+-0.4582          125.0692+-0.4684        
   earley-boyer                                   82.9335+-0.2887           82.6560+-0.4435        
   raytrace                                       63.1818+-2.1770           60.8824+-0.1580          might be 1.0378x faster
   regexp                                        102.0758+-0.6559     ?    102.1525+-0.5032        ?
   richards                                      119.2178+-0.5237          118.5554+-0.2717        
   splay                                          48.9179+-0.2876     ?     48.9539+-0.3058        ?

   <arithmetic>                                   89.8175+-0.3433           89.5957+-0.1935          might be 1.0025x faster
   <geometric> *                                  85.7258+-0.4322           85.4148+-0.1842          might be 1.0036x faster
   <harmonic>                                     81.4000+-0.5233           80.9978+-0.1861          might be 1.0050x faster

                                                     TipOfTree                  FixGBEMU                                     
Octane and V8v7:
   encrypt                                        0.46742+-0.00044    ?     0.46790+-0.00092       ?
   decrypt                                        8.67076+-0.02814    ?     8.70455+-0.01366       ?
   deltablue                             x2       0.56021+-0.00193          0.55772+-0.00233       
   earley                                         0.89547+-0.00456    ^     0.87711+-0.00613       ^ definitely 1.0209x faster
   boyer                                         12.78182+-0.03693    ?    12.80769+-0.05286       ?
   raytrace                              x2       4.43545+-0.02235          4.40852+-0.02031       
   regexp                                x2      32.45109+-0.24764    ?    32.72819+-0.23218       ?
   richards                              x2       0.30599+-0.00161    ?     0.30776+-0.00198       ?
   splay                                 x2       0.65689+-0.02856          0.65385+-0.02684       
   navier-stokes                         x2      10.82387+-0.03515    ?    10.85164+-0.02743       ?
   closure                                        0.31254+-0.03420          0.30932+-0.03447         might be 1.0104x faster
   jquery                                         4.44463+-0.56936          4.42281+-0.55988       
   gbemu                                 x2     251.60661+-16.89947   ^   138.40108+-2.11660       ^ definitely 1.8180x faster
   box2d                                 x2      32.08002+-0.34375    ^    31.61052+-0.08567       ^ definitely 1.0149x faster

V8v7:
   <arithmetic>                                   7.58016+-0.03249    ?     7.61704+-0.02779       ? might be 1.0049x slower
   <geometric> *                                  2.42413+-0.01196          2.42261+-0.01337         might be 1.0006x faster
   <harmonic>                                     0.92439+-0.00654          0.92366+-0.00725         might be 1.0008x faster

Octane including V8v7:
   <arithmetic>                                  31.51877+-1.55654    ^    21.21036+-0.21867       ^ definitely 1.4860x faster
   <geometric> *                                  4.37137+-0.06648    ^     4.13131+-0.04573       ^ definitely 1.0581x faster
   <harmonic>                                     1.05350+-0.01680          1.05056+-0.01754         might be 1.0028x faster

                                                     TipOfTree                  FixGBEMU                                     
Kraken:
   ai-astar                                       494.758+-0.801      ?     495.763+-0.695         ?
   audio-beat-detection                           246.158+-1.986      ?     246.574+-2.388         ?
   audio-dft                                      313.488+-1.670            312.659+-1.383         
   audio-fft                                      144.092+-0.324      ?     144.675+-1.318         ?
   audio-oscillator                               234.445+-1.201      !     250.644+-5.527         ! definitely 1.0691x slower
   imaging-darkroom                               294.169+-0.881      ^     287.658+-1.235         ^ definitely 1.0226x faster
   imaging-desaturate                             160.377+-0.426      ?     160.758+-0.197         ?
   imaging-gaussian-blur                          398.601+-0.674            397.679+-0.345         
   json-parse-financial                            79.868+-0.512      !      80.953+-0.524         ! definitely 1.0136x slower
   json-stringify-tinderbox                       100.330+-0.321      ?     101.021+-0.398         ?
   stanford-crypto-aes                             97.031+-0.576             96.596+-0.506         
   stanford-crypto-ccm                            106.567+-4.187      ^      85.570+-0.311         ^ definitely 1.2454x faster
   stanford-crypto-pbkdf2                         268.390+-0.820      !     274.963+-5.034         ! definitely 1.0245x slower
   stanford-crypto-sha256-iterative               115.556+-0.462      !     117.353+-0.494         ! definitely 1.0155x slower

   <arithmetic> *                                 218.131+-0.498            218.062+-0.453           might be 1.0003x faster
   <geometric>                                    186.634+-0.674      ^     185.124+-0.330         ^ definitely 1.0082x faster
   <harmonic>                                     160.528+-0.829      ^     157.578+-0.280         ^ definitely 1.0187x faster

                                                     TipOfTree                  FixGBEMU                                     
JSRegress:
   adapt-to-double-divide                         22.7069+-0.1195           22.5107+-0.0916        
   aliased-arguments-getbyval                      0.9086+-0.0099     ?      0.9089+-0.0114        ?
   allocate-big-object                             2.5346+-0.0326            2.5236+-0.0328        
   arity-mismatch-inlining                         0.7638+-0.0146     ?      0.7689+-0.0117        ?
   array-access-polymorphic-structure              7.0468+-0.0897     ?      7.0817+-0.0929        ?
   array-with-double-add                           5.7665+-0.0800     ?      5.8684+-0.0686        ? might be 1.0177x slower
   array-with-double-increment                     4.0895+-0.0118            4.0800+-0.0438        
   array-with-double-mul-add                       7.0919+-0.0883     ?      7.1110+-0.1112        ?
   array-with-double-sum                           7.9038+-0.1000            7.8465+-0.1184        
   array-with-int32-add-sub                       10.6536+-0.1905           10.5396+-0.0943          might be 1.0108x faster
   array-with-int32-or-double-sum                  8.0082+-0.0724            7.9646+-0.1144        
   big-int-mul                                     4.9960+-0.0200     ?      5.0022+-0.0241        ?
   boolean-test                                    4.4271+-0.0569            4.4119+-0.0584        
   cast-int-to-double                             14.0392+-0.1205           13.8658+-0.1085          might be 1.0125x faster
   cell-argument                                  14.4384+-0.1296     ?     14.5406+-0.1469        ?
   cfg-simplify                                    4.0097+-0.0700            3.9860+-0.0481        
   cmpeq-obj-to-obj-other                         11.3426+-0.3078           11.2898+-0.2939        
   constant-test                                   8.4800+-0.1338     ?      8.5250+-0.1476        ?
   direct-arguments-getbyval                       0.8349+-0.0101     ?      0.8351+-0.0089        ?
   double-pollution-getbyval                      10.7254+-0.0947     ?     10.7322+-0.1117        ?
   double-pollution-putbyoffset                    5.4535+-0.0412            5.4515+-0.0393        
   empty-string-plus-int                          10.7412+-0.2120           10.7248+-0.2359        
   external-arguments-getbyval                     2.2378+-0.0287            2.2082+-0.0401          might be 1.0134x faster
   external-arguments-putbyval                     3.3289+-0.0105            3.3162+-0.0239        
   Float32Array-matrix-mult                       14.1383+-0.0852     ^     13.9549+-0.0907        ^ definitely 1.0131x faster
   fold-double-to-int                             22.1569+-0.1782     ?     22.2348+-0.4448        ?
   function-dot-apply                              3.1789+-0.0084     ?      3.1812+-0.0086        ?
   function-test                                   4.9985+-0.0473     ?      5.0035+-0.0535        ?
   get-by-id-chain-from-try-block                  7.5162+-0.0664            7.4838+-0.0779        
   HashMap-put-get-iterate-keys                   88.3111+-0.6365           87.5419+-0.3942        
   HashMap-put-get-iterate                        90.9449+-0.7129           90.6892+-0.7596        
   HashMap-string-put-get-iterate                 73.3387+-0.5397     ?     74.5178+-0.6842        ? might be 1.0161x slower
   indexed-properties-in-objects                   4.5374+-0.0351     ?      4.5550+-0.0165        ?
   inline-arguments-access                         1.2504+-0.0112            1.2479+-0.0082        
   inline-arguments-local-escape                  23.0647+-0.1096     !     23.4042+-0.1115        ! definitely 1.0147x slower
   inline-get-scoped-var                           6.5635+-0.1417     ?      6.6958+-0.2209        ? might be 1.0202x slower
   inlined-put-by-id-transition                   16.7447+-0.1516           16.5957+-0.1912        
   int-or-other-abs-then-get-by-val                8.9181+-0.1110     ?      8.9639+-0.1115        ?
   int-or-other-abs-zero-then-get-by-val          39.6458+-0.1748     ?     39.8097+-0.2201        ?
   int-or-other-add-then-get-by-val               10.3220+-0.1349           10.2902+-0.1044        
   int-or-other-add                               10.4913+-0.1089     ?     10.5148+-0.1095        ?
   int-or-other-div-then-get-by-val                8.0230+-0.1222     ?      8.1364+-0.0987        ? might be 1.0141x slower
   int-or-other-max-then-get-by-val               10.1837+-0.1274            9.9508+-0.2539          might be 1.0234x faster
   int-or-other-min-then-get-by-val                8.1716+-0.1099     ?      8.2679+-0.1525        ? might be 1.0118x slower
   int-or-other-mod-then-get-by-val                8.0171+-0.0947     ?      8.0894+-0.1076        ?
   int-or-other-mul-then-get-by-val                7.2653+-0.0762            7.2572+-0.0857        
   int-or-other-neg-then-get-by-val                8.1943+-0.0896            8.1663+-0.0817        
   int-or-other-neg-zero-then-get-by-val          39.3124+-0.1390     ?     39.3500+-0.1781        ?
   int-or-other-sub-then-get-by-val               10.2972+-0.1346     ?     10.3359+-0.1465        ?
   int-or-other-sub                                8.3101+-0.1146            8.2243+-0.1100          might be 1.0104x faster
   int-overflow-local                             12.9788+-0.1302           12.9359+-0.1098        
   Int16Array-bubble-sort                         49.5639+-0.2098           49.5557+-0.1990        
   Int16Array-load-int-mul                         1.8801+-0.0072     ?      1.8815+-0.0053        ?
   Int8Array-load                                  4.8785+-0.0102            4.8676+-0.0322        
   integer-divide                                 15.2602+-0.1042           15.1530+-0.1075        
   integer-modulo                                  2.0556+-0.0131            2.0548+-0.0143        
   make-indexed-storage                            3.9138+-0.0425            3.8894+-0.0421        
   method-on-number                               23.6574+-0.3215     ?     24.1133+-0.6326        ? might be 1.0193x slower
   nested-function-parsing-random                380.5551+-13.0605         379.2467+-13.2662       
   nested-function-parsing                        48.2732+-1.0834           47.6943+-1.0879          might be 1.0121x faster
   new-array-buffer-dead                           3.6226+-0.0134            3.6192+-0.0119        
   new-array-buffer-push                          10.5234+-0.1967           10.4399+-0.1615        
   new-array-dead                                 28.5468+-0.1653           28.4054+-0.1117        
   new-array-push                                  6.9868+-0.0861            6.8935+-0.1080          might be 1.0135x faster
   number-test                                     4.3630+-0.0672            4.3253+-0.0554        
   object-closure-call                             8.3506+-0.1335            8.2878+-0.1085        
   object-test                                     4.8943+-0.0427     ?      4.9275+-0.0502        ?
   poly-stricteq                                  91.8564+-0.2131     ?     92.5235+-1.1846        ?
   polymorphic-structure                          20.1749+-0.1385           20.0595+-0.1167        
   polyvariant-monomorphic-get-by-id              12.5763+-0.1164     ?     12.6199+-0.1377        ?
   rare-osr-exit-on-local                         20.6867+-0.1060           20.4878+-0.1071        
   register-pressure-from-osr                     31.6845+-0.1471           31.6503+-0.1304        
   simple-activation-demo                         34.5888+-0.1389           34.4862+-0.1114        
   slow-array-profile-convergence                  4.3852+-0.0212            4.3621+-0.0403        
   slow-convergence                                3.7912+-0.0077     ^      3.7601+-0.0186        ^ definitely 1.0083x faster
   sparse-conditional                              1.3278+-0.0108            1.3152+-0.0170        
   splice-to-remove                               50.3924+-0.1113     !     51.1039+-0.3120        ! definitely 1.0141x slower
   string-concat-object                            2.7420+-0.0688            2.7257+-0.0255        
   string-concat-pair-object                       2.6576+-0.0352     ?      2.6676+-0.0400        ?
   string-concat-pair-simple                      17.1179+-0.2924           17.0252+-0.3582        
   string-concat-simple                           16.9530+-0.2678     ?     17.0634+-0.2269        ?
   string-cons-repeat                             10.2174+-0.1251           10.1296+-0.0227        
   string-cons-tower                              10.9399+-0.0408     ?     10.9665+-0.0356        ?
   string-hash                                     2.6463+-0.0106     ?      2.6571+-0.0124        ?
   string-repeat-arith                            46.6225+-0.6048           46.3527+-0.2917        
   string-sub                                     89.7833+-1.9256     ?     90.0412+-0.8695        ?
   string-test                                     4.2810+-0.0481     ?      4.3036+-0.0347        ?
   structure-hoist-over-transitions                3.2606+-0.0240     ?      3.2668+-0.0260        ?
   tear-off-arguments-simple                       1.7794+-0.0106            1.7566+-0.0135          might be 1.0130x faster
   tear-off-arguments                              3.3878+-0.0121            3.3692+-0.0104        
   temporal-structure                             20.9346+-0.0930     ?     21.2540+-0.3289        ? might be 1.0153x slower
   to-int32-boolean                               30.6093+-0.1185     ?     30.7007+-0.1133        ?
   undefined-test                                  4.5581+-0.0371            4.5442+-0.0444        

   <arithmetic>                                   19.9643+-0.1441           19.9574+-0.1572          might be 1.0003x faster
   <geometric> *                                   9.2034+-0.0211            9.1953+-0.0228          might be 1.0009x faster
   <harmonic>                                      5.1403+-0.0164            5.1338+-0.0184          might be 1.0013x faster

                                                     TipOfTree                  FixGBEMU                                     
All benchmarks:
   <arithmetic>                                   39.7236+-0.3304     ^     38.3043+-0.1036        ^ definitely 1.0371x faster
   <geometric>                                    11.1694+-0.0480     ^     11.0727+-0.0425        ^ definitely 1.0087x faster
   <harmonic>                                      3.6569+-0.0285            3.6515+-0.0309          might be 1.0015x faster

                                                     TipOfTree                  FixGBEMU                                     
Geomean of preferred means:
   <scaled-result>                                22.5728+-0.1188     ^     22.2982+-0.0941        ^ definitely 1.0123x faster
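
For readers checking the annotations in the report above, here is a small Python sketch that recomputes the speedup ratio from two "mean+-interval" entries. It assumes that "definitely" means the two 95% confidence intervals do not overlap and "might be" means they do; that reading is consistent with the rows shown here but is an assumption about the benchmark script, not something stated in the report. The function name compare is made up for this example.

```python
def compare(base_mean, base_ci, new_mean, new_ci):
    """Compare two timings in ms, each given as mean and 95% CI half-width.

    Returns (ratio, direction, certainty). Lower times are better, so the
    ratio is always slower_mean / faster_mean and is at least 1.0.
    """
    if new_mean <= base_mean:
        ratio, direction = base_mean / new_mean, "faster"
    else:
        ratio, direction = new_mean / base_mean, "slower"

    # Assumed rule: overlapping intervals are only a "might be" result.
    overlap = (base_mean - base_ci <= new_mean + new_ci
               and new_mean - new_ci <= base_mean + base_ci)
    certainty = "might be" if overlap else "definitely"
    return ratio, direction, certainty


# gbemu row from the Octane table: TipOfTree first, FixGBEMU second.
ratio, direction, certainty = compare(251.60661, 16.89947, 138.40108, 2.11660)
print(f"{certainty} {ratio:.4f}x {direction}")
```

Feeding in the access-fannkuch row (7.8821+-0.1174 against 7.8035+-0.1064) gives overlapping intervals, which lines up with the report labelling that row "might be" rather than "definitely".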
Comment 2 Filip Pizlo 2013-03-21 23:57:06 PDT
Created attachment 194458 [details]
the patch
Comment 3 Filip Pizlo 2013-03-22 00:55:16 PDT
More numbers:


Benchmark report for SunSpider, V8Spider, Octane, Kraken, JSBench, JSRegress, and DSP on oldmac (MacPro4,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/DumpRenderTree (r146548)
"FixGBEMU" at /Volumes/Data/fromMiniMe/tertiary/OpenSource/WebKitBuild/Release/DumpRenderTree (r146548)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get
microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                     TipOfTree                  FixGBEMU                                     
SunSpider:
   3d-cube                                        10.7897+-0.3666           10.7454+-0.3549        
   3d-morph                                        9.0504+-0.1720            8.9410+-0.0992          might be 1.0122x faster
   3d-raytrace                                    11.9426+-0.2016     ?     12.0273+-0.1447        ?
   access-binary-trees                             2.9210+-0.3263     ?      2.9284+-0.3248        ?
   access-fannkuch                                 8.0796+-0.1310            7.9860+-0.1339          might be 1.0117x faster
   access-nbody                                    4.7972+-0.0570            4.7566+-0.0628        
   access-nsieve                                   5.0343+-0.0394            5.0164+-0.0532        
   bitops-3bit-bits-in-byte                        1.8313+-0.0186            1.8305+-0.0210        
   bitops-bits-in-byte                             6.7964+-0.0857     ?      6.8145+-0.0687        ?
   bitops-bitwise-and                              2.4740+-0.0701            2.4574+-0.0717        
   bitops-nsieve-bits                              4.0948+-0.0201     ?      4.1001+-0.0299        ?
   controlflow-recursive                           3.0646+-0.0416            3.0373+-0.0194        
   crypto-aes                                      8.8459+-0.3088     ?      8.9888+-0.3004        ? might be 1.0162x slower
   crypto-md5                                      4.4452+-0.0564     ?      4.5259+-0.0921        ? might be 1.0182x slower
   crypto-sha1                                     3.5295+-0.0607     ?      3.5391+-0.0640        ?
   date-format-tofte                              17.3838+-1.1967           16.9716+-1.0278          might be 1.0243x faster
   date-format-xparb                              11.3335+-0.7899     ?     11.3986+-0.7475        ?
   math-cordic                                     4.0014+-0.0189            3.9958+-0.0120        
   math-partial-sums                              12.6863+-0.1598     ^     12.1912+-0.0327        ^ definitely 1.0406x faster
   math-spectral-norm                              3.2339+-0.0316     ?      3.2519+-0.0335        ?
   regexp-dna                                     12.3715+-0.6610           12.2216+-0.5810          might be 1.0123x faster
   string-base64                                   5.9902+-0.5740     ?      6.0203+-0.5363        ?
   string-fasta                                   12.0530+-0.3293           11.9267+-0.2625          might be 1.0106x faster
   string-tagcloud                                14.9208+-0.2709     ?     14.9517+-0.2424        ?
   string-unpack-code                             30.3821+-0.5455           30.3009+-0.6076        
   string-validate-input                          10.1125+-0.1742     ?     10.1785+-0.2036        ?

   <arithmetic> *                                  8.5448+-0.1929            8.5040+-0.1856          might be 1.0048x faster
   <geometric>                                     6.8075+-0.1242            6.7902+-0.1286          might be 1.0026x faster
   <harmonic>                                      5.4400+-0.0820            5.4353+-0.0936          might be 1.0009x faster

                                                     TipOfTree                  FixGBEMU                                     
V8Spider:
   crypto                                         89.2575+-0.7032     !     90.6746+-0.6035        ! definitely 1.0159x slower
   deltablue                                     127.6513+-0.8557          127.0990+-0.7251        
   earley-boyer                                   86.0470+-1.0161           85.4573+-0.9201        
   raytrace                                       62.8535+-0.3913     ?     64.8205+-4.5758        ? might be 1.0313x slower
   regexp                                        102.5714+-0.4082     ?    103.2094+-0.4485        ?
   richards                                      121.9889+-0.9943     ?    122.0245+-1.5579        ?
   splay                                          61.0942+-3.0038           60.4840+-3.2299          might be 1.0101x faster

   <arithmetic>                                   93.0663+-0.7043     ?     93.3956+-0.8007        ? might be 1.0035x slower
   <geometric> *                                  89.7749+-0.9023     ?     90.1153+-1.0186        ? might be 1.0038x slower
   <harmonic>                                     86.4742+-1.1222     ?     86.7795+-1.2424        ? might be 1.0035x slower

                                                     TipOfTree                  FixGBEMU                                     
Octane and V8v7:
   encrypt                                        0.48665+-0.00104    ?     0.48782+-0.00098       ?
   decrypt                                        8.92121+-0.02077    ^     8.86220+-0.01854       ^ definitely 1.0067x faster
   deltablue                             x2       0.60198+-0.00943          0.59170+-0.00273         might be 1.0174x faster
   earley                                         0.92925+-0.00844    ?     0.93367+-0.02170       ?
   boyer                                         13.04892+-0.04272    ?    13.12345+-0.07389       ?
   raytrace                              x2       4.73267+-0.21763          4.65269+-0.04371         might be 1.0172x faster
   regexp                                x2      32.64399+-0.16665    ?    32.83455+-0.20863       ?
   richards                              x2       0.31431+-0.00325          0.31307+-0.00285       
   splay                                 x2       0.63788+-0.00588    ?     0.65420+-0.01940       ? might be 1.0256x slower
   navier-stokes                         x2      11.33809+-0.02498    ^    10.97695+-0.02051       ^ definitely 1.0329x faster
   closure                                        0.30445+-0.03994    ?     0.30639+-0.03971       ?
   jquery                                         4.43232+-0.52851          4.39625+-0.52494       
   gbemu                                 x2     264.54957+-3.34035    ^   141.11648+-2.18262       ^ definitely 1.8747x faster
   mandreel                              x2     186.12259+-0.66146    ?   186.14140+-1.00488       ?
   pdfjs                                 x2     127.73158+-0.79743    ^   123.58179+-0.94261       ^ definitely 1.0336x faster
   box2d                                 x2      35.14161+-0.22841         34.92223+-0.22585       

V8v7:
   <arithmetic>                                   7.74524+-0.03678          7.71584+-0.02487         might be 1.0038x faster
   <geometric> *                                  2.50094+-0.01826          2.48983+-0.00743         might be 1.0045x faster
   <harmonic>                                     0.95208+-0.00620          0.95148+-0.00522         might be 1.0006x faster

Octane including V8v7:
   <arithmetic>                                  52.14428+-0.29126    ^    42.29538+-0.22207       ^ definitely 1.2329x faster
   <geometric> *                                  7.78577+-0.06652    ^     7.37585+-0.07824       ^ definitely 1.0556x faster
   <harmonic>                                     1.26681+-0.02350    ?     1.26770+-0.02983       ? might be 1.0007x slower

                                                     TipOfTree                  FixGBEMU                                     
Kraken:
   ai-astar                                       493.361+-0.912      ?     493.407+-1.144         ?
   audio-beat-detection                           252.635+-1.442      ?     255.085+-2.106         ?
   audio-dft                                      315.655+-2.942      ?     317.918+-2.442         ?
   audio-fft                                      147.349+-0.487            147.047+-0.283         
   audio-oscillator                               253.036+-1.083      ?     256.056+-5.277         ? might be 1.0119x slower
   imaging-darkroom                               296.226+-0.894      ?     297.050+-1.315         ?
   imaging-desaturate                             161.731+-0.455      ?     161.863+-0.585         ?
   imaging-gaussian-blur                          502.951+-1.040            502.939+-0.775         
   json-parse-financial                            80.440+-0.509      ?      81.388+-0.531         ? might be 1.0118x slower
   json-stringify-tinderbox                       101.362+-0.336      ?     101.724+-0.351         ?
   stanford-crypto-aes                            123.857+-0.581      ^     122.047+-0.748         ^ definitely 1.0148x faster
   stanford-crypto-ccm                            143.735+-0.370      ^     117.421+-0.398         ^ definitely 1.2241x faster
   stanford-crypto-pbkdf2                         276.935+-1.833            276.910+-0.829         
   stanford-crypto-sha256-iterative               126.852+-0.400            126.208+-0.552         

   <arithmetic> *                                 234.009+-0.366      ^     232.647+-0.542         ^ definitely 1.0059x faster
   <geometric>                                    201.363+-0.267      ^     198.833+-0.316         ^ definitely 1.0127x faster
   <harmonic>                                     174.422+-0.230      ^     171.357+-0.146         ^ definitely 1.0179x faster

                                                     TipOfTree                  FixGBEMU                                     
JSBench:
   amazon                                          9.0833+-0.1834            9.0833+-0.3272        
   facebook                                       39.9167+-2.0536     ?     40.1667+-2.0062        ?
   google                                         79.4167+-2.2086     ?     79.6667+-2.1561        ?
   twitter                                        10.5833+-0.3272     ?     10.6667+-0.3128        ?
   yahoo                                           3.7500+-0.2874     ?      3.8333+-0.2473        ? might be 1.0222x slower

   <arithmetic> *                                 28.5500+-0.8722     ?     28.6833+-0.8728        ? might be 1.0047x slower
   <geometric>                                    16.2445+-0.3518     ?     16.3800+-0.4143        ? might be 1.0083x slower
   <harmonic>                                      9.7764+-0.4003     ?      9.9212+-0.3835        ? might be 1.0148x slower

                                                     TipOfTree                  FixGBEMU                                     
JSRegress:
   adapt-to-double-divide                         22.5048+-0.0811     ?     22.5360+-0.1406        ?
   aliased-arguments-getbyval                      0.9743+-0.0136     ?      0.9852+-0.0313        ? might be 1.0112x slower
   allocate-big-object                             4.2560+-1.3685            4.2214+-1.3582        
   arity-mismatch-inlining                         0.8120+-0.0181            0.8093+-0.0161        
   array-access-polymorphic-structure              8.1727+-1.6306     ?      8.7300+-1.9317        ? might be 1.0682x slower
   array-with-double-add                           5.7952+-0.0404     ?      5.8024+-0.0661        ?
   array-with-double-increment                     3.9543+-0.0436     ?      4.0659+-0.0955        ? might be 1.0282x slower
   array-with-double-mul-add                       7.9600+-0.1154     ?      7.9751+-0.0759        ?
   array-with-double-sum                           7.7727+-0.0333            7.7696+-0.0105        
   array-with-int32-add-sub                       10.4854+-0.0830           10.4331+-0.0367        
   array-with-int32-or-double-sum                  7.9282+-0.1084            7.8729+-0.0329        
   big-int-mul                                     4.8696+-0.0162     ^      4.8384+-0.0150        ^ definitely 1.0065x faster
   boolean-test                                    4.2317+-0.0212     ?      4.2561+-0.0357        ?
   cast-int-to-double                             13.9196+-0.0385           13.8891+-0.0699        
   cell-argument                                  14.6705+-0.1344     ^     14.3485+-0.0153        ^ definitely 1.0224x faster
   cfg-simplify                                    3.8386+-0.0513     ?      3.8764+-0.1042        ?
   cmpeq-obj-to-obj-other                         11.6028+-0.2816           11.5812+-0.0975        
   constant-test                                   8.4013+-0.0862     ?      8.4513+-0.1225        ?
   direct-arguments-getbyval                       0.8952+-0.0201            0.8825+-0.0087          might be 1.0144x faster
   double-pollution-getbyval                      10.7752+-0.1510           10.6384+-0.0190          might be 1.0129x faster
   double-pollution-putbyoffset                    5.9199+-0.6300            5.8540+-0.6576          might be 1.0113x faster
   empty-string-plus-int                          13.8883+-0.4336           13.7411+-0.5121          might be 1.0107x faster
   external-arguments-getbyval                     2.6346+-0.1931            2.6305+-0.1822        
   external-arguments-putbyval                     4.2299+-0.3328     ?      4.3357+-0.3328        ? might be 1.0250x slower
   Float32Array-matrix-mult                       16.0396+-0.8426     ?     16.4342+-0.8552        ? might be 1.0246x slower
   fold-double-to-int                             22.1074+-0.2382           21.9526+-0.1846        
   function-dot-apply                              3.1636+-0.0232            3.1613+-0.0302        
   function-test                                   4.8608+-0.0791     ?      4.8881+-0.0462        ?
   get-by-id-chain-from-try-block                  7.4826+-0.0560            7.4458+-0.0232        
   HashMap-put-get-iterate-keys                   90.0712+-1.6094           87.9114+-1.1638          might be 1.0246x faster
   HashMap-put-get-iterate                        91.6792+-0.8337           90.5578+-1.0320          might be 1.0124x faster
   HashMap-string-put-get-iterate                 87.1271+-1.2189           86.4839+-1.2945        
   indexed-properties-in-objects                   4.4467+-0.0246            4.4358+-0.0192        
   inline-arguments-access                         1.2901+-0.0187            1.2892+-0.0148        
   inline-arguments-local-escape                  25.8386+-0.2077           25.5691+-0.1547          might be 1.0105x faster
   inline-get-scoped-var                           6.5271+-0.2157     ?      6.5344+-0.2152        ?
   inlined-put-by-id-transition                   16.7272+-0.2392           16.4086+-0.2303          might be 1.0194x faster
   int-or-other-abs-then-get-by-val                8.8389+-0.0284     ?      8.8394+-0.0354        ?
   int-or-other-abs-zero-then-get-by-val          36.9764+-0.3272           36.8253+-0.1030        
   int-or-other-add-then-get-by-val               10.2306+-0.0660           10.1949+-0.0227        
   int-or-other-add                               10.4894+-0.0419     ?     10.5132+-0.0684        ?
   int-or-other-div-then-get-by-val                7.9837+-0.0277     ?      8.0184+-0.0730        ?
   int-or-other-max-then-get-by-val                9.9300+-0.2029     ?      9.9338+-0.2045        ?
   int-or-other-min-then-get-by-val                8.1339+-0.0164     ?      8.1760+-0.0869        ?
   int-or-other-mod-then-get-by-val                7.9519+-0.0254     ?      7.9540+-0.0310        ?
   int-or-other-mul-then-get-by-val                7.1413+-0.0332     ?      7.1630+-0.0357        ?
   int-or-other-neg-then-get-by-val                8.0872+-0.0719            8.0381+-0.0205        
   int-or-other-neg-zero-then-get-by-val          36.8848+-0.0746     ?     37.2869+-0.4763        ? might be 1.0109x slower
   int-or-other-sub-then-get-by-val               10.2264+-0.0424           10.2095+-0.0458        
   int-or-other-sub                                8.2056+-0.0546            8.1988+-0.0559        
   int-overflow-local                             12.9126+-0.0750           12.8353+-0.0312        
   Int16Array-bubble-sort                         78.1133+-2.7749           78.0594+-2.7210        
   Int16Array-load-int-mul                         1.9093+-0.0259            1.8878+-0.0172          might be 1.0114x faster
   Int8Array-load                                  5.4114+-0.0235     ?      5.4797+-0.1131        ? might be 1.0126x slower
   integer-divide                                 15.2338+-0.0796           15.1556+-0.0154        
   integer-modulo                                  2.2165+-0.0590     ?      2.2488+-0.0410        ? might be 1.0146x slower
   make-indexed-storage                            4.7428+-0.6588            4.7176+-0.6569        
   method-on-number                               23.2679+-0.2939     ?     23.7224+-0.4644        ? might be 1.0195x slower
   nested-function-parsing-random                390.4205+-12.6293         386.8288+-12.8896       
   nested-function-parsing                        57.0050+-3.4956           56.9407+-3.3263        
   new-array-buffer-dead                           3.7320+-0.1042     ?      3.7336+-0.0890        ?
   new-array-buffer-push                          14.8515+-2.3631     ?     14.9366+-2.3920        ?
   new-array-dead                                 28.4352+-0.1374           28.3795+-0.1218        
   new-array-push                                 12.4420+-1.9611           12.4024+-1.9676        
   number-test                                     4.1786+-0.0551            4.1673+-0.0202        
   object-closure-call                             8.8917+-0.2541            8.6764+-0.2206          might be 1.0248x faster
   object-test                                     4.7537+-0.0353            4.7226+-0.0386        
   poly-stricteq                                  92.8363+-1.1056           91.7895+-0.2155          might be 1.0114x faster
   polymorphic-structure                          20.1870+-0.1441     ?     20.2586+-0.2461        ?
   polyvariant-monomorphic-get-by-id              12.5483+-0.0872           12.5325+-0.0657        
   rare-osr-exit-on-local                         20.5841+-0.0469           20.5391+-0.0529        
   register-pressure-from-osr                     31.5935+-0.0788     ?     31.6128+-0.1270        ?
   simple-activation-demo                         34.8291+-0.2652     ?     34.9314+-0.2367        ?
   slow-array-profile-convergence                  4.9437+-0.2525            4.8851+-0.2449          might be 1.0120x faster
   slow-convergence                                3.8448+-0.0258            3.8078+-0.0402        
   sparse-conditional                              1.3406+-0.0091     ?      1.3648+-0.0302        ? might be 1.0181x slower
   splice-to-remove                               50.5563+-0.2296     !     51.0580+-0.1329        ! definitely 1.0099x slower
   string-concat-object                            5.0111+-1.4530     ?      5.0493+-1.4170        ?
   string-concat-pair-object                       4.9942+-1.4259     ?      5.0161+-1.4710        ?
   string-concat-pair-simple                      19.7468+-0.6906           19.5444+-0.7513          might be 1.0104x faster
   string-concat-simple                           19.6588+-0.7239           19.4935+-0.6697        
   string-cons-repeat                             14.8747+-0.9846           14.6714+-0.9395          might be 1.0139x faster
   string-cons-tower                              37.6062+-22.3803          37.4074+-22.2040       
   string-hash                                     2.6343+-0.0317     ?      2.6796+-0.0791        ? might be 1.0172x slower
   string-repeat-arith                            46.5152+-0.3677           45.8222+-0.3348          might be 1.0151x faster
   string-sub                                     88.6065+-1.1104     ?     88.8955+-1.1219        ?
   string-test                                     4.1639+-0.0504     ?      4.1869+-0.0527        ?
   structure-hoist-over-transitions                4.1713+-0.6483            4.1647+-0.6418        
   tear-off-arguments-simple                       1.8215+-0.0214            1.8124+-0.0298        
   tear-off-arguments                              3.3329+-0.0277            3.3134+-0.0252        
   temporal-structure                             20.9489+-0.0875     ?     21.0299+-0.0865        ?
   to-int32-boolean                               30.6409+-0.1294     ?     30.6438+-0.1009        ?
   undefined-test                                  4.4085+-0.0261     ?      4.4319+-0.0507        ?

   <arithmetic>                                   21.2887+-0.5083           21.1912+-0.4979          might be 1.0046x faster
   <geometric> *                                   9.8514+-0.2231            9.8460+-0.2321          might be 1.0006x faster
   <harmonic>                                      5.4852+-0.0942     ?      5.4895+-0.1022        ? might be 1.0008x slower

                                                     TipOfTree                  FixGBEMU                                     
DSP:
   filtrr-posterize-tint                          54.1873+-1.1158           54.0510+-1.1097        
   filtrr-tint-contrast-sat-bright                77.3939+-2.1065           76.1090+-1.0157          might be 1.0169x faster
   filtrr-tint-sat-adj-contr-mult                 91.0803+-3.5947     ?     91.3694+-3.6364        ?
   filtrr-blur-overlay-sat-contr                 241.7403+-8.8579          227.8193+-5.5459          might be 1.0611x faster
   filtrr-sat-blur-mult-sharpen-contr            281.0981+-6.4494          277.6984+-4.3982          might be 1.0122x faster
   filtrr-sepia-bias                              39.1825+-2.1779     ?     39.3602+-2.0515        ?
   route9-vp8                            x5     1150.5998+-6.3888     ?   1155.6031+-6.0505        ?
   starfield                             x5     1127.4799+-5.4693     ?   1128.1505+-3.4124        ?
   zynaps-quake3                         x5     1381.2514+-10.0607    ?   1384.7985+-11.2272       ?
   zynaps-mandelbrot                     x5     1115.0710+-6.8424     ?   1116.5846+-7.4386        ?

   <arithmetic>                                  948.3343+-4.1684     ?    949.6958+-3.6550        ? might be 1.0014x slower
   <geometric> *                                 671.6896+-2.1208          670.7384+-1.3604          might be 1.0014x faster
   <harmonic>                                    280.7781+-7.0549          279.7051+-6.2470          might be 1.0038x faster

                                                     TipOfTree                  FixGBEMU                                     
All benchmarks:
   <arithmetic>                                  163.8822+-0.6319     ^    162.6290+-0.5560        ^ definitely 1.0077x faster
   <geometric>                                    21.5527+-0.2727           21.3703+-0.2796          might be 1.0085x faster
   <harmonic>                                      4.5765+-0.0430     ?      4.5787+-0.0519        ? might be 1.0005x slower

                                                     TipOfTree                  FixGBEMU                                     
Geomean of preferred means:
   <scaled-result>                                42.8115+-0.4070           42.4560+-0.3880          might be 1.0084x faster
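
For anyone re-checking the verdict column by hand, here is a minimal sketch. It assumes the "+-" value is the half-width of the 95% confidence interval, and that "definitely" means the two intervals do not overlap. Both assumptions are inferred from the rows above, not taken from the benchmark harness itself. The function name compare is hypothetical.

```python
def compare(base_mean, base_ci, new_mean, new_ci):
    """Return (ratio, direction, confident) for one benchmark row.

    Assumption: a row is "definitely" faster/slower when the two
    confidence intervals do not overlap. This is inferred from the
    report, not from the harness source.
    """
    faster = new_mean < base_mean
    # Report the ratio as >= 1, matching the "N.NNNNx faster/slower" style.
    ratio = base_mean / new_mean if faster else new_mean / base_mean
    overlap = (base_mean - base_ci <= new_mean + new_ci
               and new_mean - new_ci <= base_mean + base_ci)
    return ratio, "faster" if faster else "slower", not overlap


# Rows copied from the report above (milliseconds):
# (TipOfTree mean, TipOfTree +-, FixGBEMU mean, FixGBEMU +-)
rows = {
    "stanford-crypto-ccm": (143.735, 0.370, 117.421, 0.398),
    "splice-to-remove": (50.5563, 0.2296, 51.0580, 0.1329),
    "direct-arguments-getbyval": (0.8952, 0.0201, 0.8825, 0.0087),
}

for name, row in rows.items():
    ratio, direction, confident = compare(*row)
    label = "definitely" if confident else "might be"
    print(f"{name}: {label} {ratio:.4f}x {direction}")
```

Rows whose intervals only just touch after rounding (big-int-mul, for example) may come out differently under this rule, so treat it as a sanity check rather than a reimplementation of the harness.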
Comment 4 Build Bot 2013-03-22 01:50:45 PDT
Comment on attachment 194458 [details]
the patch

Attachment 194458 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-commit-queue.appspot.com/results/17230347

New failing tests:
fast/js/method-check.html
Comment 5 Build Bot 2013-03-22 01:50:47 PDT
Created attachment 194472 [details]
Archive of layout-test-results from webkit-ews-10

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-10  Port: <class 'webkitpy.common.config.ports.MacWK2Port'>  Platform: Mac OS X 10.8.2
Comment 6 Build Bot 2013-03-22 03:00:23 PDT
Comment on attachment 194458 [details]
the patch

Attachment 194458 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-commit-queue.appspot.com/results/17257235

New failing tests:
fast/js/method-check.html
Comment 7 Build Bot 2013-03-22 03:00:26 PDT
Created attachment 194492 [details]
Archive of layout-test-results from webkit-ews-11

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-11  Port: <class 'webkitpy.common.config.ports.MacWK2Port'>  Platform: Mac OS X 10.8.2
Comment 8 Geoffrey Garen 2013-03-22 08:48:00 PDT
> fast/js/method-check.html

This looks like a real failure.
Comment 9 Geoffrey Garen 2013-03-22 08:49:14 PDT
Comment on attachment 194458 [details]
the patch

r=me if you fix the test failure
Comment 10 Filip Pizlo 2013-03-22 10:24:49 PDT
(In reply to comment #8)
> > fast/js/method-check.html
> 
> This looks like a real failure.

It is.

My patch appears to reveal a long-standing DFG bug for put_by_id transitions.  With this patch, the baseline JIT will do put_by_id transition caching on the first run, not the second run.  This has the quirky effect of sometimes increasing, and sometimes decreasing, the amount of put_by_id specialization that happens.

And it appears that in this program, we end up with more put_by_id specialization, and in particular, we specialize a specific function put transition.  That's bad; we should never do that.

I'm still hunting for the place where we do that, and I'll put it up as a separate patch when I find it.  I'll hold off committing this until then.
Comment 11 Filip Pizlo 2013-03-22 16:00:52 PDT
(In reply to comment #10)
> (In reply to comment #8)
> > > fast/js/method-check.html
> > 
> > This looks like a real failure.
> 
> It is.
> 
> My patch appears to reveal a long-standing DFG bug for put_by_id transitions.  With this patch, the baseline JIT will do put_by_id transition caching on the first run, not the second run.  This has the quirky effect of sometimes increasing, and sometimes decreasing, the amount of put_by_id specialization that happens.
> 
> And it appears that in this program, we end up with more put_by_id specialization, and in particular, we specialize a specific function put transition.  That's bad; we should never do that.
> 
> I'm still hunting for the place where we do that, and I'll put it up as a separate patch when I find it.  I'll hold off committing this until then.

Yup, this is fixed now that https://bugs.webkit.org/show_bug.cgi?id=113093 landed.
Comment 12 Filip Pizlo 2013-03-22 16:03:38 PDT
Landed in http://trac.webkit.org/changeset/146669