Bug 148664 - [ES6] Implement tail calls in the FTL
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Michael Saboff
URL:
Keywords:
Duplicates: 146851
Depends on: 148663 149619 149621 149647
Blocks: 146477
Reported: 2015-08-31 17:54 PDT by Basile Clement
Modified: 2015-10-01 02:50 PDT (History)

Attachments
Patch (63.12 KB, patch)
2015-09-04 15:19 PDT, Basile Clement
no flags
Rebased patch (79.25 KB, patch)
2015-09-24 16:00 PDT, Michael Saboff
fpizlo: review+
Patch for Landing (80.49 KB, patch)
2015-09-28 15:36 PDT, Michael Saboff
no flags

Description Basile Clement 2015-08-31 17:54:23 PDT
...
Comment 1 Basile Clement 2015-08-31 17:56:28 PDT
*** Bug 146851 has been marked as a duplicate of this bug. ***
Comment 2 Basile Clement 2015-09-04 15:19:20 PDT
Created attachment 260631 [details]
Patch
Comment 3 Basile Clement 2015-09-10 10:46:23 PDT
This patch doesn't handle OSR exit properly and will incorrectly leave the tail-calling frames on the stack.

A speculative test failure would be:

````
function bar() {
    if (isFinalTier()) OSRExit();
    if (bar.caller !== null)
        throw new Error("Caller should be null.");
}

function foo() {
    "use strict";
    return bar();
}

for (var i = 0; i < 10000; ++i)
    foo();
````

The patch needs to be updated to handle OSR exit.
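
For context, the property the test above guards is that a strict-mode call in tail position reuses the caller's frame instead of pushing a new one. Below is a minimal sketch of one way to observe that without the jsc-only helpers `isFinalTier()` and `OSRExit()`. The names `countDown` and `supportsProperTailCalls` and the depth of 1,000,000 are illustrative assumptions, not part of the patch. The sketch only checks whether deep tail recursion survives, so it does not exercise the OSR exit path itself.

```javascript
// Hypothetical helper: recurses with the call in tail position.
function countDown(n) {
    "use strict";
    if (n === 0)
        return 0;
    return countDown(n - 1); // tail call: no work remains after it returns
}

// Returns true if the deep tail recursion completes, false if the engine
// runs out of stack. Assumes `depth` is larger than the stack could hold
// if every call kept its own frame.
function supportsProperTailCalls(depth) {
    try {
        countDown(depth);
        return true;
    } catch (e) {
        if (e instanceof RangeError)
            return false; // stack overflow: frames were not reused
        throw e;
    }
}

const properTailCalls = supportsProperTailCalls(1000000);
console.log("proper tail calls:", properTailCalls);
```

An engine that keeps the tail-calling frames alive should report `false` here, while one that eliminates them should report `true`.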
Comment 4 Michael Saboff 2015-09-24 16:00:41 PDT
Created attachment 261901 [details]
Rebased patch

Performance looks neutral:

Baseline:/Users/msaboff/src/webkit.baseline/WebKitBuild/Release/JavaScriptCore.framework/Resources/jsc FTLTailCall:/Users/msaboff/src/webkit/WebKitBuild/Release/JavaScriptCore.framework/Resources/jsc
Warning: could not identify checkout location for Baseline
Warning: could not identify checkout location for FTLTailCall
Warning: refusing to run JSBench because not all VMs are DumpRenderTree or WebKitTestRunner.
Warning: refusing to run DSPJS because not all VMs are DumpRenderTree or WebKitTestRunner.
3940/3940                                                                                             
Generating benchmark report at /Volumes/Data/src/webkit/Baseline_FTLTailCall_SunSpiderLongSpiderV8SpiderOctaneKrakenJSRegressAsmBenchCompressionBench_msaboff-pro_20150924_1542_report.txt
And raw data at /Volumes/Data/src/webkit/Baseline_FTLTailCall_SunSpiderLongSpiderV8SpiderOctaneKrakenJSRegressAsmBenchCompressionBench_msaboff-pro_20150924_1542.json

Benchmark report for SunSpider, LongSpider, V8Spider, Octane, Kraken, JSRegress, AsmBench, and CompressionBench on msaboff-pro (MacPro5,1).

VMs tested:
"Baseline" at /Volumes/Data/src/webkit.baseline/WebKitBuild/Release/JavaScriptCore.framework/Versions/A/Resources/jsc
"FTLTailCall" at /Volumes/Data/src/webkit/WebKitBuild/Release/JavaScriptCore.framework/Versions/A/Resources/jsc

Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements.
Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level
timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.
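
As a reading aid for the tables that follow, here is a rough sketch of how a `mean+-halfWidth` entry and the "might be" / "definitely" annotations could be derived. It assumes a Student's t interval over the 4 samples and a labelling rule inferred from the rows in this report (non-overlapping intervals give "definitely", overlapping intervals with a ratio of at least 1.01 give "might be"). The names `T_95`, `summarize`, `overlaps`, and `compare` are hypothetical, and none of this is taken from the benchmark script's source.

```javascript
// Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
const T_95 = { 1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571 };

// Mean and 95% confidence half-width for a small set of timings (ms).
function summarize(samples) {
    const n = samples.length;
    const t = T_95[n - 1];
    if (t === undefined)
        throw new RangeError("unsupported sample count: " + n);
    const mean = samples.reduce((sum, x) => sum + x, 0) / n;
    const variance = samples.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1);
    return { mean, halfWidth: t * Math.sqrt(variance / n) };
}

// True if the two confidence intervals share at least one point.
function overlaps(a, b) {
    return a.mean - a.halfWidth <= b.mean + b.halfWidth &&
           b.mean - b.halfWidth <= a.mean + a.halfWidth;
}

// Assumed labelling rule (inferred from the report, lower time is better).
function compare(baseline, candidate) {
    const faster = candidate.mean < baseline.mean;
    const ratio = faster ? baseline.mean / candidate.mean
                         : candidate.mean / baseline.mean;
    const label = ratio.toFixed(4) + "x " + (faster ? "faster" : "slower");
    if (!overlaps(baseline, candidate))
        return "definitely " + label;
    return ratio >= 1.01 ? "might be " + label : "";
}

// JSRegress "gcse" row from this report: 7.6112+-0.0673 vs 6.7647+-0.0831.
console.log(compare({ mean: 7.6112, halfWidth: 0.0673 },
                    { mean: 6.7647, halfWidth: 0.0831 }));
```

Under these assumptions, a row gets no annotation when the intervals overlap and the means differ by less than about 1%, which is the case for most rows in this report.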

                                                         Baseline                FTLTailCall                                    
SunSpider:
   3d-cube                                            7.9792+-0.3960            7.9030+-0.5182        
   3d-morph                                           8.1843+-0.3378            8.0872+-0.1859          might be 1.0120x faster
   3d-raytrace                                        8.9017+-0.2968            8.8450+-0.2756        
   access-binary-trees                                3.3732+-0.2480            3.3213+-0.2054          might be 1.0156x faster
   access-fannkuch                                    8.7303+-0.4338            8.7216+-0.2674        
   access-nbody                                       4.3638+-0.1992     ?      4.4130+-0.1117        ? might be 1.0113x slower
   access-nsieve                                      4.5186+-0.1981     ?      4.5796+-0.1514        ? might be 1.0135x slower
   bitops-3bit-bits-in-byte                           1.7733+-0.0846     ?      1.7955+-0.1632        ? might be 1.0125x slower
   bitops-bits-in-byte                                5.5793+-0.0904     ?      5.6617+-0.1095        ? might be 1.0148x slower
   bitops-bitwise-and                                 2.8330+-0.1592     ?      2.8389+-0.2845        ?
   bitops-nsieve-bits                                 4.3277+-0.1660     ?      4.3874+-0.1611        ? might be 1.0138x slower
   controlflow-recursive                              3.5425+-0.0602            3.4766+-0.1143          might be 1.0190x faster
   crypto-aes                                         6.3738+-0.2913     ?      6.4257+-0.1987        ?
   crypto-md5                                         3.9291+-0.1659     ?      4.0664+-0.1865        ? might be 1.0350x slower
   crypto-sha1                                        3.4300+-0.0462     ?      3.4395+-0.1766        ?
   date-format-tofte                                 12.8163+-0.6279           12.6020+-0.8477          might be 1.0170x faster
   date-format-xparb                                  7.6727+-0.2657            7.4630+-0.1711          might be 1.0281x faster
   math-cordic                                        4.3720+-0.1547            4.2963+-0.1442          might be 1.0176x faster
   math-partial-sums                                  9.8143+-0.2748            9.7425+-0.1166        
   math-spectral-norm                                 3.0869+-0.1398     ?      3.0949+-0.1097        ?
   regexp-dna                                        10.1293+-0.5095     ?     10.2421+-0.6308        ? might be 1.0111x slower
   string-base64                                      6.7480+-0.5537            6.6589+-0.0748          might be 1.0134x faster
   string-fasta                                       9.1501+-0.1671            9.1280+-0.1924        
   string-tagcloud                                   12.8978+-0.3869           12.7969+-0.2681        
   string-unpack-code                                26.4536+-0.6404     ?     26.7227+-0.7055        ? might be 1.0102x slower
   string-validate-input                              6.9626+-0.2053     ?      7.1074+-0.4688        ? might be 1.0208x slower

   <arithmetic>                                       7.2286+-0.0433            7.2237+-0.0491          might be 1.0007x faster

                                                         Baseline                FTLTailCall                                    
LongSpider:
   3d-cube                                         1153.3079+-18.6644    ?   1163.3223+-20.4309       ?
   3d-morph                                        1901.7108+-26.1352        1893.9627+-8.5184        
   3d-raytrace                                      991.5331+-9.2850          990.9625+-5.5981        
   access-binary-trees                             1374.4435+-2.2196     ?   1380.4895+-8.1406        ?
   access-fannkuch                                  455.7527+-4.8470     ?    461.2219+-11.3162       ? might be 1.0120x slower
   access-nbody                                    1021.2827+-27.8837        1010.2141+-4.2885          might be 1.0110x faster
   access-nsieve                                    683.3531+-16.0561         674.7672+-9.2971          might be 1.0127x faster
   bitops-3bit-bits-in-byte                          44.7759+-1.1735     ?     44.8318+-0.2471        ?
   bitops-bits-in-byte                              343.4109+-5.1462     ?    345.6949+-4.3722        ?
   bitops-nsieve-bits                               619.5676+-1.4247     ?    627.8960+-27.1643       ? might be 1.0134x slower
   controlflow-recursive                            752.4019+-23.2001         745.3698+-2.2247        
   crypto-aes                                       889.0048+-8.2046     ?    900.2663+-6.4874        ? might be 1.0127x slower
   crypto-md5                                       789.5475+-12.8263    ?    794.0745+-12.4994       ?
   crypto-sha1                                     1070.9747+-17.8856        1062.4113+-8.2733        
   date-format-tofte                                991.0571+-34.2796         967.0193+-33.7797         might be 1.0249x faster
   date-format-xparb                               1062.8716+-10.4773        1045.2930+-32.8620         might be 1.0168x faster
   hash-map                                         231.8835+-2.3581          231.3430+-3.0570        
   math-cordic                                      642.5349+-7.2891     ?    648.5532+-12.7299       ?
   math-partial-sums                               1138.5790+-5.8434     ?   1140.9069+-11.5044       ?
   math-spectral-norm                              1075.9895+-6.5989         1072.6901+-10.5186       
   string-base64                                    532.2181+-2.3690     ?    543.5682+-18.9630       ? might be 1.0213x slower
   string-fasta                                     583.0449+-1.3105     ?    586.9727+-20.9157       ?
   string-tagcloud                                  285.8955+-3.7030     ?    286.7098+-4.5746        ?

   <geometric>                                      666.0745+-1.1628     ?    666.4751+-1.6011        ? might be 1.0006x slower

                                                         Baseline                FTLTailCall                                    
V8Spider:
   crypto                                            70.7248+-1.4062     ?     71.7533+-1.0392        ? might be 1.0145x slower
   deltablue                                         88.5544+-3.5528           86.6682+-2.9355          might be 1.0218x faster
   earley-boyer                                      65.8374+-2.0879           64.9132+-0.9326          might be 1.0142x faster
   raytrace                                          41.0757+-3.6438           39.6375+-0.8211          might be 1.0363x faster
   regexp                                           103.8710+-1.9320     ?    104.4360+-3.1611        ?
   richards                                          77.5117+-2.7887           76.2626+-0.7336          might be 1.0164x faster
   splay                                             53.2029+-2.3600           52.7885+-0.6731        

   <geometric>                                       68.7266+-1.2562           68.0019+-0.2576          might be 1.0107x faster

                                                         Baseline                FTLTailCall                                    
Octane:
   encrypt                                           0.30090+-0.00589    ?     0.30240+-0.00363       ?
   decrypt                                           5.64773+-0.02163    ?     5.67141+-0.11087       ?
   deltablue                                x2       0.23364+-0.01160          0.22964+-0.00423         might be 1.0174x faster
   earley                                            0.52974+-0.01149    ?     0.53146+-0.00558       ?
   boyer                                             8.91079+-0.19548          8.88555+-0.05988       
   navier-stokes                            x2       6.41802+-0.08549    ?     6.44301+-0.12366       ?
   raytrace                                 x2       1.55703+-0.04475          1.55108+-0.07685       
   richards                                 x2       0.15414+-0.00381          0.15272+-0.00155       
   splay                                    x2       0.53648+-0.00032    ^     0.53351+-0.00237       ^ definitely 1.0056x faster
   regexp                                   x2      38.71862+-0.73486    ?    39.43810+-1.20940       ? might be 1.0186x slower
   pdfjs                                    x2      60.97598+-0.72315         60.48355+-0.24387       
   mandreel                                 x2      68.64978+-0.73693         68.45893+-1.79874       
   gbemu                                    x2      58.75793+-2.36535    ?    59.78192+-6.56690       ? might be 1.0174x slower
   closure                                           0.94955+-0.02963          0.93248+-0.00883         might be 1.0183x faster
   jquery                                           12.27260+-0.12351         12.13883+-0.10089         might be 1.0110x faster
   box2d                                    x2      16.60395+-0.16649         16.51912+-0.23395       
   zlib                                     x2     566.52854+-43.81670   ?   576.13479+-5.54413       ? might be 1.0170x slower
   typescript                               x2    1111.06866+-23.95040      1103.00403+-6.05634       

   <geometric>                                       8.93742+-0.07228          8.92973+-0.09585         might be 1.0009x faster

                                                         Baseline                FTLTailCall                                    
Kraken:
   ai-astar                                          198.740+-2.625            197.021+-2.409         
   audio-beat-detection                               78.512+-1.141      ?      79.105+-1.822         ?
   audio-dft                                         129.321+-2.470      ?     131.041+-4.485         ? might be 1.0133x slower
   audio-fft                                          59.147+-1.003             58.518+-0.192           might be 1.0107x faster
   audio-oscillator                                   98.501+-0.921      ?      98.839+-0.647         ?
   imaging-darkroom                                   97.097+-0.739      ?      98.013+-2.011         ?
   imaging-desaturate                                 89.269+-1.492             89.145+-0.933         
   imaging-gaussian-blur                             147.482+-3.753            145.544+-2.407           might be 1.0133x faster
   json-parse-financial                               67.207+-1.053             66.562+-0.732         
   json-stringify-tinderbox                           41.031+-0.375      ?      41.186+-0.289         ?
   stanford-crypto-aes                                64.983+-2.651             64.913+-1.683         
   stanford-crypto-ccm                                57.850+-1.990      ?      59.626+-1.643         ? might be 1.0307x slower
   stanford-crypto-pbkdf2                            146.216+-5.438            144.537+-1.084           might be 1.0116x faster
   stanford-crypto-sha256-iterative                   57.432+-0.913             57.009+-0.739         

   <arithmetic>                                       95.199+-0.580             95.076+-0.276           might be 1.0013x faster

                                                         Baseline                FTLTailCall                                    
JSRegress:
   abc-forward-loop-equal                            56.1079+-2.0238           55.3519+-0.9890          might be 1.0137x faster
   abc-postfix-backward-loop                         54.8818+-0.6272           54.5860+-0.3172        
   abc-simple-backward-loop                          55.8820+-1.8755           54.7155+-0.3558          might be 1.0213x faster
   abc-simple-forward-loop                           54.5515+-0.5771     ?     55.2635+-1.7936        ? might be 1.0131x slower
   abc-skippy-loop                                   38.0599+-0.7643           37.5444+-0.1659          might be 1.0137x faster
   abs-boolean                                        3.8250+-0.1728            3.7800+-0.0946          might be 1.0119x faster
   adapt-to-double-divide                            17.4385+-0.8104     ?     17.4789+-0.2782        ?
   aliased-arguments-getbyval                         1.5881+-0.1449     ?      1.6648+-0.2171        ? might be 1.0483x slower
   allocate-big-object                                3.8117+-0.3979            3.7202+-0.3371          might be 1.0246x faster
   arguments-named-and-reflective                    14.7466+-0.2621           14.3640+-0.1372          might be 1.0266x faster
   arguments-out-of-bounds                           14.4697+-0.6622     ?     14.5793+-0.7407        ?
   arguments-strict-mode                             12.3905+-0.6816     ?     12.5939+-0.9702        ? might be 1.0164x slower
   arguments                                         11.1553+-0.1382           11.1257+-0.4649        
   arity-mismatch-inlining                            1.2812+-0.1014     ?      1.3132+-0.0886        ? might be 1.0250x slower
   array-access-polymorphic-structure                11.3450+-0.3581           11.3032+-0.4938        
   array-nonarray-polymorhpic-access                 38.2127+-1.6015     ?     38.3690+-2.1178        ?
   array-prototype-every                            119.5709+-2.9514     ?    123.8366+-3.3707        ? might be 1.0357x slower
   array-prototype-forEach                          118.1006+-3.0104     ?    120.0157+-0.8872        ? might be 1.0162x slower
   array-prototype-map                              141.5541+-16.4492         130.5815+-3.7306          might be 1.0840x faster
   array-prototype-reduce                           113.5812+-4.0231     ?    115.0142+-1.7671        ? might be 1.0126x slower
   array-prototype-reduceRight                      113.2582+-2.4726     ?    115.6287+-1.9895        ? might be 1.0209x slower
   array-prototype-some                             121.6501+-1.8487     ?    123.0714+-3.4648        ? might be 1.0117x slower
   array-splice-contiguous                           32.9707+-0.6041           32.7333+-0.5835        
   array-with-double-add                              5.7347+-0.2265            5.6733+-0.1101          might be 1.0108x faster
   array-with-double-increment                        4.2075+-0.1760            4.1696+-0.1208        
   array-with-double-mul-add                          7.2661+-0.1205            7.2405+-0.1542        
   array-with-double-sum                              4.3787+-0.1376     ?      4.4370+-0.1408        ? might be 1.0133x slower
   array-with-int32-add-sub                           9.4976+-0.3640            9.4887+-0.3462        
   array-with-int32-or-double-sum                     4.5926+-0.3008     ?      4.6045+-0.2457        ?
   ArrayBuffer-DataView-alloc-large-long-lived   
                                                     46.1448+-1.6808           46.0717+-0.9285        
   ArrayBuffer-DataView-alloc-long-lived             18.2743+-0.4495     ?     18.2822+-0.4714        ?
   ArrayBuffer-Int32Array-byteOffset                  5.5942+-0.2213            5.3975+-0.1357          might be 1.0365x faster
   ArrayBuffer-Int8Array-alloc-large-long-lived   
                                                     46.3503+-0.8594     ?     47.2176+-0.8937        ? might be 1.0187x slower
   ArrayBuffer-Int8Array-alloc-long-lived-buffer   
                                                     31.0935+-0.7923     ^     29.6496+-0.3658        ^ definitely 1.0487x faster
   ArrayBuffer-Int8Array-alloc-long-lived            17.6442+-1.1007           17.3124+-0.7952          might be 1.0192x faster
   ArrayBuffer-Int8Array-alloc                       14.5004+-0.6442           14.4240+-0.2454        
   arrowfunction-call                                14.9610+-0.1032     ?     15.0068+-0.2972        ?
   asmjs_bool_bug                                     9.7924+-0.3397            9.4665+-0.4584          might be 1.0344x faster
   assign-custom-setter-polymorphic                   4.6380+-0.1238            4.4985+-0.0717          might be 1.0310x faster
   assign-custom-setter                               6.0778+-0.1613            5.9904+-0.2324          might be 1.0146x faster
   basic-set                                         11.4849+-0.8086           11.2978+-0.4473          might be 1.0166x faster
   big-int-mul                                        5.9150+-0.1163     ?      5.9738+-0.0736        ?
   boolean-test                                       4.5935+-0.2903            4.4417+-0.0542          might be 1.0342x faster
   branch-fold                                        4.8069+-0.1187            4.7440+-0.1256          might be 1.0133x faster
   branch-on-string-as-boolean                       22.4595+-1.5856     ?     22.6550+-0.7806        ?
   by-val-generic                                     3.7142+-0.1749     ?      3.9424+-0.1296        ? might be 1.0614x slower
   call-spread-apply                                 40.8485+-2.3197           38.8124+-1.0474          might be 1.0525x faster
   call-spread-call                                  31.4403+-0.2574           30.6766+-0.7199          might be 1.0249x faster
   captured-assignments                               0.8015+-0.1713            0.7890+-0.1276          might be 1.0158x faster
   cast-int-to-double                                 8.5768+-0.1040     ?      8.6008+-0.0783        ?
   cell-argument                                      7.3845+-0.2947     ?      7.4160+-0.4698        ?
   cfg-simplify                                       3.8511+-0.2025     ?      3.9024+-0.1489        ? might be 1.0133x slower
   chain-getter-access                               10.4308+-0.1752     ?     10.4772+-0.2975        ?
   cmpeq-obj-to-obj-other                            15.2235+-1.1702           15.1502+-0.2910        
   constant-test                                      8.1270+-0.0875            8.0787+-0.1399        
   create-lots-of-functions                          16.4334+-0.1096     ?     16.6403+-0.4923        ? might be 1.0126x slower
   cse-new-array-buffer                               3.3898+-0.2044            3.3794+-0.2678        
   cse-new-array                                      3.5527+-0.2860            3.5415+-0.2181        
   DataView-custom-properties                        53.4545+-1.1086     ?     53.9829+-3.2101        ?
   delay-tear-off-arguments-strictmode               18.8756+-0.2538           18.6810+-0.5141          might be 1.0104x faster
   deltablue-varargs                                287.4162+-13.0160         286.5977+-11.4086       
   destructuring-arguments                          244.2088+-13.8651    ?    245.3911+-2.1662        ?
   destructuring-parameters-overridden-by-function   
                                                      0.7581+-0.1274     ?      0.7960+-0.1641        ? might be 1.0500x slower
   destructuring-swap                                 7.9628+-0.1985            7.8906+-0.0727        
   direct-arguments-getbyval                          1.6536+-0.3289            1.6393+-0.1957        
   div-boolean-double                                 5.6390+-0.0090            5.5994+-0.0738        
   div-boolean                                       10.0390+-0.1484           10.0269+-0.1646        
   double-get-by-val-out-of-bounds                    6.1074+-0.1588     ?      6.4080+-0.6187        ? might be 1.0492x slower
   double-pollution-getbyval                          9.8387+-0.1018     ?      9.9341+-0.0857        ?
   double-pollution-putbyoffset                       5.6676+-0.1815            5.4958+-0.2499          might be 1.0313x faster
   double-real-use                                   39.3749+-0.2265           39.3669+-0.3931        
   double-to-int32-typed-array-no-inline              3.2102+-0.1990     ?      3.2733+-0.1622        ? might be 1.0196x slower
   double-to-int32-typed-array                        3.0036+-0.2544            2.9097+-0.0850          might be 1.0323x faster
   double-to-uint32-typed-array-no-inline             3.1236+-0.1218     ?      3.2539+-0.2015        ? might be 1.0417x slower
   double-to-uint32-typed-array                       2.9881+-0.0588     ?      3.0050+-0.1467        ?
   elidable-new-object-dag                           56.0861+-1.9409           54.6929+-0.6695          might be 1.0255x faster
   elidable-new-object-roflcopter                    53.6485+-0.5427     ?     53.9496+-1.8775        ?
   elidable-new-object-then-call                     49.6780+-1.2513     ?     50.2726+-1.1090        ? might be 1.0120x slower
   elidable-new-object-tree                          64.2993+-2.2158     ?     64.9947+-0.9152        ? might be 1.0108x slower
   empty-string-plus-int                              7.5973+-0.2775            7.5781+-0.1180        
   emscripten-cube2hash                              45.5381+-0.4457           45.2473+-0.1809        
   exit-length-on-plain-object                       22.7953+-1.6044           22.7063+-0.7591        
   external-arguments-getbyval                        1.7989+-0.2427     ?      1.8281+-0.1810        ? might be 1.0163x slower
   external-arguments-putbyval                        3.2172+-0.2089     ?      3.2654+-0.1826        ? might be 1.0150x slower
   fixed-typed-array-storage-var-index                1.6917+-0.1035     ?      1.7012+-0.1908        ?
   fixed-typed-array-storage                          1.3140+-0.1162     ?      1.3848+-0.1272        ? might be 1.0539x slower
   Float32Array-matrix-mult                           6.4377+-0.2589            6.3139+-0.2209          might be 1.0196x faster
   Float32Array-to-Float64Array-set                  77.5938+-1.6882     ?     78.4612+-1.2402        ? might be 1.0112x slower
   Float64Array-alloc-long-lived                     93.5413+-0.6554     ?     94.0788+-1.0496        ?
   Float64Array-to-Int16Array-set                    91.5964+-0.7516     ?     92.9632+-2.5438        ? might be 1.0149x slower
   fold-double-to-int                                19.0319+-0.0574     ?     19.1622+-0.1135        ?
   fold-get-by-id-to-multi-get-by-offset-rare-int   
                                                     12.5089+-0.2640     ?     12.5880+-0.4629        ?
   fold-get-by-id-to-multi-get-by-offset             10.7925+-0.5102           10.5137+-0.1873          might be 1.0265x faster
   fold-multi-get-by-offset-to-get-by-offset   
                                                      9.2238+-1.0817            8.9297+-0.7670          might be 1.0329x faster
   fold-multi-get-by-offset-to-poly-get-by-offset   
                                                      9.7716+-0.6867     ?     10.0508+-0.0956        ? might be 1.0286x slower
   fold-multi-put-by-offset-to-poly-put-by-offset   
                                                     11.3400+-0.0687     ?     11.4734+-0.0748        ? might be 1.0118x slower
   fold-multi-put-by-offset-to-put-by-offset   
                                                     10.6783+-0.7598           10.5313+-0.7659          might be 1.0140x faster
   fold-multi-put-by-offset-to-replace-or-transition-put-by-offset   
                                                     16.1390+-0.3576           15.7775+-0.7569          might be 1.0229x faster
   fold-put-by-id-to-multi-put-by-offset             12.9133+-0.9014           12.8920+-1.1119        
   fold-put-by-val-with-string-to-multi-put-by-offset   
                                                     12.8086+-0.9931     ?     13.1598+-0.8232        ? might be 1.0274x slower
   fold-put-by-val-with-symbol-to-multi-put-by-offset   
                                                     12.8102+-1.1397           12.3907+-0.2123          might be 1.0339x faster
   fold-put-structure                                 8.7095+-0.1331     ?      8.7428+-0.0747        ?
   for-of-iterate-array-entries                      16.2662+-0.8943           16.1808+-0.7769        
   for-of-iterate-array-keys                          5.1653+-0.2126     ?      5.2953+-0.4148        ? might be 1.0252x slower
   for-of-iterate-array-values                        5.1898+-0.4913            5.1837+-0.3238        
   fround                                            19.4658+-0.4991           19.3803+-0.6576        
   ftl-library-inlining-dataview                     93.5699+-0.6435     !     99.9172+-2.0479        ! definitely 1.0678x slower
   ftl-library-inlining                             132.9858+-20.9047    ?    136.4564+-15.4222       ? might be 1.0261x slower
   function-call                                     15.0975+-0.0684     ^     14.6705+-0.2955        ^ definitely 1.0291x faster
   function-dot-apply                                 3.2520+-0.1171     ^      3.0933+-0.0365        ^ definitely 1.0513x faster
   function-test                                      4.5065+-0.0936            4.4127+-0.1183          might be 1.0213x faster
   function-with-eval                               148.3520+-1.1957          148.3283+-3.3879        
   gcse-poly-get-less-obvious                        30.1830+-4.5853     ?     30.3105+-3.3762        ?
   gcse-poly-get                                     33.6250+-1.2514           33.3990+-1.7991        
   gcse                                               7.6112+-0.0673     ^      6.7647+-0.0831        ^ definitely 1.1251x faster
   get-by-id-bimorphic-check-structure-elimination-simple   
                                                      3.3865+-0.1056            3.3036+-0.1764          might be 1.0251x faster
   get-by-id-bimorphic-check-structure-elimination   
                                                      8.1777+-0.1412     ?      8.1890+-0.1692        ?
   get-by-id-chain-from-try-block                     3.5455+-0.1708            3.5003+-0.2465          might be 1.0129x faster
   get-by-id-check-structure-elimination              7.7162+-0.1533     ?      7.7258+-0.0793        ?
   get-by-id-proto-or-self                           20.8983+-3.3721     ?     20.9605+-2.9853        ?
   get-by-id-quadmorphic-check-structure-elimination-simple   
                                                      4.2279+-0.1097            4.0405+-0.0898          might be 1.0464x faster
   get-by-id-self-or-proto                           21.4673+-2.3679           19.8605+-0.9187          might be 1.0809x faster
   get-by-val-out-of-bounds                           6.0390+-0.1656     ?      6.2372+-0.1322        ? might be 1.0328x slower
   get-by-val-with-string-bimorphic-check-structure-elimination-simple   
                                                      3.8754+-0.0549     ?      3.9128+-0.1071        ?
   get-by-val-with-string-bimorphic-check-structure-elimination   
                                                     10.0098+-0.2855     ?     10.1169+-0.0685        ? might be 1.0107x slower
   get-by-val-with-string-chain-from-try-block   
                                                      3.6303+-0.2521            3.6014+-0.2136        
   get-by-val-with-string-check-structure-elimination   
                                                      9.1703+-0.0429     ?      9.1931+-0.2579        ?
   get-by-val-with-string-proto-or-self              21.5203+-2.9297     ?     22.5953+-3.4416        ? might be 1.0500x slower
   get-by-val-with-string-quadmorphic-check-structure-elimination-simple   
                                                      5.0241+-0.1166            4.9762+-0.1064        
   get-by-val-with-string-self-or-proto              22.6219+-3.6611           22.4606+-3.1689        
   get-by-val-with-symbol-bimorphic-check-structure-elimination-simple   
                                                      4.6028+-0.1902     ?      4.6266+-0.1620        ?
   get-by-val-with-symbol-bimorphic-check-structure-elimination   
                                                     19.4675+-0.3440           19.2994+-0.4242        
   get-by-val-with-symbol-chain-from-try-block   
                                                      3.5770+-0.0873     ?      3.6016+-0.2328        ?
   get-by-val-with-symbol-check-structure-elimination   
                                                     18.2565+-0.5452     ?     18.5060+-0.3592        ? might be 1.0137x slower
   get-by-val-with-symbol-proto-or-self              23.3569+-2.5780           20.7347+-0.6088          might be 1.1265x faster
   get-by-val-with-symbol-quadmorphic-check-structure-elimination-simple   
                                                      6.0485+-0.1463            6.0469+-0.0937        
   get-by-val-with-symbol-self-or-proto              21.5950+-2.6428     ?     22.2852+-4.0537        ? might be 1.0320x slower
   get_callee_monomorphic                             3.7780+-0.1770            3.7578+-0.3154        
   get_callee_polymorphic                             5.0004+-0.1409            4.8792+-0.2847          might be 1.0248x faster
   getter-no-activation                               5.9208+-0.1053            5.9161+-0.1117        
   getter-prototype                                  11.1676+-0.4595     ?     11.2338+-0.4775        ?
   getter-richards                                  128.0605+-5.6354          125.5574+-12.8744         might be 1.0199x faster
   getter                                             7.9800+-0.0942     ^      7.3513+-0.2324        ^ definitely 1.0855x faster
   global-object-access-with-mutating-structure   
                                                      7.6500+-0.0946            7.5335+-0.1641          might be 1.0155x faster
   global-var-const-infer-fire-from-opt               1.1740+-0.2492     ?      1.1842+-0.1683        ?
   global-var-const-infer                             1.0084+-0.1572     ?      1.1443+-0.1670        ? might be 1.1347x slower
   hard-overflow-check-equal                         44.8525+-0.9454     ?     44.9805+-1.9131        ?
   hard-overflow-check                               44.4554+-0.3787     ?     44.7370+-0.5934        ?
   HashMap-put-get-iterate-keys                      33.8235+-0.9135     ?     34.0250+-1.1932        ?
   HashMap-put-get-iterate                           34.0887+-0.7274           33.9380+-0.8909        
   HashMap-string-put-get-iterate                    36.9832+-3.6363           35.5679+-1.1296          might be 1.0398x faster
   hoist-make-rope                                   12.7125+-1.4558     ?     12.9760+-1.7734        ? might be 1.0207x slower
   hoist-poly-check-structure-effectful-loop   
                                                      6.6788+-0.0916     ?      6.7131+-0.1997        ?
   hoist-poly-check-structure                         4.4827+-0.1005            4.4706+-0.0856        
   imul-double-only                                   8.6847+-0.4372     ?      8.8731+-1.3619        ? might be 1.0217x slower
   imul-int-only                                     10.8917+-0.2596           10.7063+-0.2688          might be 1.0173x faster
   imul-mixed                                         8.4587+-1.1011     ?      8.6865+-1.0281        ? might be 1.0269x slower
   in-four-cases                                     25.8458+-0.7387           25.8093+-0.8041        
   in-one-case-false                                 14.8284+-0.2529           14.8161+-0.0724        
   in-one-case-true                                  14.7388+-0.0650     ?     14.9081+-0.2432        ? might be 1.0115x slower
   in-two-cases                                      15.1138+-0.1590           14.9469+-0.1740          might be 1.0112x faster
   indexed-properties-in-objects                      4.2024+-0.0998     ^      3.8806+-0.1272        ^ definitely 1.0829x faster
   infer-closure-const-then-mov-no-inline             4.8937+-0.2048     ?      4.9107+-0.1417        ?
   infer-closure-const-then-mov                      21.4066+-1.1340           21.0269+-0.7422          might be 1.0181x faster
   infer-closure-const-then-put-to-scope-no-inline   
                                                     15.7961+-0.3555     !     17.0670+-0.5402        ! definitely 1.0805x slower
   infer-closure-const-then-put-to-scope             27.7354+-0.8894     !     32.4135+-0.3610        ! definitely 1.1687x slower
   infer-closure-const-then-reenter-no-inline   
                                                     70.9622+-0.6659     !     74.3788+-0.3754        ! definitely 1.0481x slower
   infer-closure-const-then-reenter                  33.1869+-1.1217           33.1837+-0.6130        
   infer-constant-global-property                     4.8211+-0.0223            4.7550+-0.0996          might be 1.0139x faster
   infer-constant-property                            3.4120+-0.1338     ?      3.4589+-0.0905        ? might be 1.0138x slower
   infer-one-time-closure-ten-vars                   11.2770+-0.5883     ?     11.5590+-0.2116        ? might be 1.0250x slower
   infer-one-time-closure-two-vars                   10.7819+-0.4222     ?     10.8403+-0.1283        ?
   infer-one-time-closure                            10.6577+-0.4531           10.5937+-0.3642        
   infer-one-time-deep-closure                       17.8967+-0.8140     ?     18.0213+-0.1585        ?
   inline-arguments-access                            6.1150+-0.3242            6.0554+-0.2238        
   inline-arguments-aliased-access                    6.2040+-0.2420            6.1586+-0.1301        
   inline-arguments-local-escape                      6.2360+-0.0556            6.1161+-0.1635          might be 1.0196x faster
   inline-get-scoped-var                              5.7583+-0.0938     ?      5.8250+-0.0463        ? might be 1.0116x slower
   inlined-put-by-id-transition                      15.7344+-0.5429           15.4290+-0.3129          might be 1.0198x faster
   inlined-put-by-val-with-string-transition   
                                                     67.8935+-2.0964           66.9481+-1.1203          might be 1.0141x faster
   inlined-put-by-val-with-symbol-transition   
                                                     68.5538+-2.8800           67.9240+-1.2082        
   int-or-other-abs-then-get-by-val                   6.5981+-0.2488     ?      6.6240+-0.0639        ?
   int-or-other-abs-zero-then-get-by-val             28.1147+-0.6954           27.3970+-0.3707          might be 1.0262x faster
   int-or-other-add-then-get-by-val                   6.1070+-0.1119     ?      6.1550+-0.1768        ?
   int-or-other-add                                   8.1768+-0.2148     ?      8.2601+-0.1192        ? might be 1.0102x slower
   int-or-other-div-then-get-by-val                   5.0574+-0.1467     ?      5.0577+-0.1296        ?
   int-or-other-max-then-get-by-val                   6.4609+-0.1676            6.1133+-0.5342          might be 1.0569x faster
   int-or-other-min-then-get-by-val                   5.1910+-0.3035            5.1769+-0.1217        
   int-or-other-mod-then-get-by-val                   4.9344+-0.0595            4.9273+-0.1220        
   int-or-other-mul-then-get-by-val                   5.0533+-0.1255     ?      5.1467+-0.1990        ? might be 1.0185x slower
   int-or-other-neg-then-get-by-val                   5.7080+-0.1326     ?      6.0906+-0.9210        ? might be 1.0670x slower
   int-or-other-neg-zero-then-get-by-val             27.6542+-0.2006     ?     28.0616+-0.2964        ? might be 1.0147x slower
   int-or-other-sub-then-get-by-val                   6.0562+-0.1128     ?      6.1453+-0.1064        ? might be 1.0147x slower
   int-or-other-sub                                   5.3602+-0.0716     ?      5.3615+-0.0795        ?
   int-overflow-local                                 6.1800+-0.0713            6.1533+-0.1910        
   Int16Array-alloc-long-lived                       65.2850+-0.7081     ?     65.9570+-1.0616        ? might be 1.0103x slower
   Int16Array-bubble-sort-with-byteLength            35.5035+-1.1122     ?     35.8108+-0.2979        ?
   Int16Array-bubble-sort                            35.7055+-1.1577     ?     36.0655+-1.0111        ? might be 1.0101x slower
   Int16Array-load-int-mul                            2.1409+-0.0882            2.0737+-0.1380          might be 1.0324x faster
   Int16Array-to-Int32Array-set                      74.5519+-1.2694           73.0711+-1.5188          might be 1.0203x faster
   Int32Array-alloc-large                            33.3762+-0.8468           32.3179+-0.5883          might be 1.0327x faster
   Int32Array-alloc-long-lived                       74.4218+-1.3953           74.1349+-1.3517        
   Int32Array-alloc                                   4.7007+-0.2190            4.6031+-0.2056          might be 1.0212x faster
   Int32Array-Int8Array-view-alloc                    9.3167+-0.2781            9.2792+-0.1682        
   int52-spill                                        7.2535+-0.2614            7.1418+-0.1124          might be 1.0156x faster
   Int8Array-alloc-long-lived                        58.0889+-1.7560     ?     59.3250+-3.1403        ? might be 1.0213x slower
   Int8Array-load-with-byteLength                     4.7575+-0.1077     ?      4.7910+-0.1788        ?
   Int8Array-load                                     4.8945+-0.2222            4.7897+-0.1775          might be 1.0219x faster
   integer-divide                                    14.0390+-0.1103           14.0131+-0.3927        
   integer-modulo                                     2.7477+-0.0945            2.7191+-0.1750          might be 1.0105x faster
   is-boolean-fold-tricky                             5.6200+-0.1007     ?      5.6677+-0.1884        ?
   is-boolean-fold                                    3.9811+-0.0986     ?      4.0537+-0.1473        ? might be 1.0182x slower
   is-function-fold-tricky-internal-function   
                                                     15.3417+-1.7957           14.8641+-0.2150          might be 1.0321x faster
   is-function-fold-tricky                            5.8341+-0.0489            5.7852+-0.1549        
   is-function-fold                                   4.0696+-0.1521            4.0389+-0.1069        
   is-number-fold-tricky                              5.6976+-0.1896     ?      5.7151+-0.1282        ?
   is-number-fold                                     4.0784+-0.1103            4.0176+-0.1272          might be 1.0151x faster
   is-object-or-null-fold-functions                   4.2069+-0.0737            4.1028+-0.1065          might be 1.0254x faster
   is-object-or-null-fold-less-tricky                 5.7341+-0.1237     ?      5.8157+-0.0629        ? might be 1.0142x slower
   is-object-or-null-fold-tricky                      7.3820+-0.1532     ?      7.5287+-0.1895        ? might be 1.0199x slower
   is-object-or-null-fold                             4.1586+-0.2867            4.0350+-0.1159          might be 1.0306x faster
   is-object-or-null-trickier-function                5.8661+-0.1883            5.8597+-0.1303        
   is-object-or-null-trickier-internal-function   
                                                     15.6541+-0.9657     ?     15.7238+-0.1667        ?
   is-object-or-null-tricky-function                  5.8820+-0.1252     ?      5.8980+-0.1244        ?
   is-object-or-null-tricky-internal-function   
                                                     11.5880+-0.1293           11.5132+-0.1656        
   is-string-fold-tricky                              5.6694+-0.1304            5.6433+-0.1075        
   is-string-fold                                     3.9922+-0.0870     ?      4.0190+-0.1312        ?
   is-undefined-fold-tricky                           4.7663+-0.1354     ?      4.8107+-0.0939        ?
   is-undefined-fold                                  3.9630+-0.0785     ?      4.0605+-0.1977        ? might be 1.0246x slower
   JSONP-negative-0                                   0.4478+-0.0807            0.4293+-0.0948          might be 1.0429x faster
   large-int-captured                                 6.5096+-0.2530     ?      6.5200+-0.5736        ?
   large-int-neg                                     19.7649+-0.4612     ?     20.0004+-0.7356        ? might be 1.0119x slower
   large-int                                         17.6397+-0.1672     ?     17.6456+-0.1621        ?
   load-varargs-elimination                          29.3098+-1.3961     ?     29.4713+-0.7212        ?
   logical-not-weird-types                            5.1075+-0.1313            5.0070+-0.2540          might be 1.0201x faster
   logical-not                                        6.6058+-0.5652            6.4423+-0.3225          might be 1.0254x faster
   lots-of-fields                                    17.9940+-0.2587     ?     18.0244+-0.6866        ?
   make-indexed-storage                               4.1447+-0.6869     ?      4.3683+-0.2348        ? might be 1.0539x slower
   make-rope-cse                                      6.3652+-0.2668     ?      6.5412+-0.2035        ? might be 1.0276x slower
   marsaglia-larger-ints                             53.9771+-0.7136     ?     54.1827+-1.6095        ?
   marsaglia-osr-entry                               26.4467+-0.5981     ?     26.6700+-0.6757        ?
   math-with-out-of-bounds-array-values              36.0089+-0.3902     ^     33.9117+-0.8869        ^ definitely 1.0618x faster
   max-boolean                                        3.3587+-0.1713            3.3204+-0.1277          might be 1.0115x faster
   method-on-number                                  22.5718+-0.3544     !     23.5757+-0.0968        ! definitely 1.0445x slower
   min-boolean                                        3.4186+-0.1114            3.3876+-0.1375        
   minus-boolean-double                               4.2108+-0.0525     ?      4.2678+-0.1506        ? might be 1.0135x slower
   minus-boolean                                      3.4182+-0.0759            3.4051+-0.1026        
   misc-strict-eq                                    47.4189+-3.1595           46.5891+-0.7866          might be 1.0178x faster
   mod-boolean-double                                11.8081+-0.1662     ?     11.9675+-0.4156        ? might be 1.0135x slower
   mod-boolean                                        9.0740+-0.3033     ?      9.1332+-0.1296        ?
   mul-boolean-double                                 4.8530+-0.0367     !      4.9747+-0.0578        ! definitely 1.0251x slower
   mul-boolean                                        3.5893+-0.0901     ?      3.6499+-0.1425        ? might be 1.0169x slower
   neg-boolean                                        4.2935+-0.1108     ?      4.3110+-0.0877        ?
   negative-zero-divide                               0.5772+-0.0968     ?      0.6281+-0.1177        ? might be 1.0881x slower
   negative-zero-modulo                               0.5228+-0.0076     ?      0.5752+-0.1199        ? might be 1.1003x slower
   negative-zero-negate                               0.5270+-0.0491            0.5269+-0.0949        
   nested-function-parsing                           72.6717+-1.1864     ?     73.2660+-2.8106        ?
   new-array-buffer-dead                            143.8330+-3.3165     ?    144.5037+-4.4408        ?
   new-array-buffer-push                              9.9576+-0.6558     ?     10.0914+-0.2086        ? might be 1.0134x slower
   new-array-dead                                    18.4188+-0.7024     ?     18.6478+-0.6738        ? might be 1.0124x slower
   new-array-push                                     5.4870+-0.1924     ?      5.5934+-0.1874        ? might be 1.0194x slower
   no-inline-constructor                             49.5439+-2.3933           49.3301+-0.8852        
   number-test                                        4.3897+-0.1455     ?      4.4055+-0.1001        ?
   object-closure-call                                7.6103+-0.1844     ?      7.7611+-0.2203        ? might be 1.0198x slower
   object-get-own-property-symbols-on-large-array   
                                                      5.1368+-0.4152     ?      5.3012+-0.1059        ? might be 1.0320x slower
   object-test                                        4.3110+-0.1324            4.2224+-0.1512          might be 1.0210x faster
   obvious-sink-pathology-taken                     171.0161+-9.2398          169.6381+-9.9475        
   obvious-sink-pathology                           162.1172+-13.8819         158.8401+-2.8638          might be 1.0206x faster
   obviously-elidable-new-object                     44.4477+-1.3473           43.6290+-0.9261          might be 1.0188x faster
   plus-boolean-arith                                 3.3977+-0.0814            3.3969+-0.0564        
   plus-boolean-double                                4.3062+-0.1092     ?      4.3135+-0.1116        ?
   plus-boolean                                       3.3080+-0.1275            3.3051+-0.1488        
   poly-chain-access-different-prototypes-simple   
                                                      5.2510+-0.1351     ?      5.2705+-0.0377        ?
   poly-chain-access-different-prototypes             5.3908+-0.0366            5.3434+-0.1580        
   poly-chain-access-simpler                          5.2597+-0.1188     ?      5.2658+-0.0995        ?
   poly-chain-access                                  5.1183+-0.0813            5.0793+-0.1360        
   poly-stricteq                                     81.1038+-2.9152           79.2272+-0.3466          might be 1.0237x faster
   polymorphic-array-call                             1.9453+-0.3169            1.9337+-0.1789        
   polymorphic-get-by-id                              5.0206+-0.3715     ?      5.0833+-0.3490        ? might be 1.0125x slower
   polymorphic-put-by-id                             43.5688+-1.6434           41.5898+-1.3682          might be 1.0476x faster
   polymorphic-put-by-val-with-string                43.7271+-1.4155           43.4907+-0.7532        
   polymorphic-put-by-val-with-symbol                43.2488+-0.9160           43.1714+-0.8983        
   polymorphic-structure                             25.3026+-0.3580     ?     25.3724+-0.7225        ?
   polyvariant-monomorphic-get-by-id                 11.6013+-0.2095     ?     11.6252+-0.2222        ?
   proto-getter-access                               10.4666+-0.3047     ?     10.5521+-0.1522        ?
   prototype-access-with-mutating-prototype           7.1492+-0.2884     ?      7.2142+-0.2717        ?
   put-by-id-replace-and-transition                  12.7933+-0.1233     ?     12.9305+-0.4289        ? might be 1.0107x slower
   put-by-id-slightly-polymorphic                     3.6184+-0.0687     ?      3.6496+-0.0787        ?
   put-by-id                                         18.3819+-0.3142     ?     18.6152+-0.2509        ? might be 1.0127x slower
   put-by-val-direct                                  0.5914+-0.0986            0.5450+-0.0193          might be 1.0852x faster
   put-by-val-large-index-blank-indexing-type   
                                                      8.2003+-0.1768     ?      8.3987+-0.3672        ? might be 1.0242x slower
   put-by-val-machine-int                             3.7003+-0.1212     ?      3.7275+-0.3588        ?
   put-by-val-with-string-replace-and-transition   
                                                     19.2853+-0.8601           19.1003+-0.6969        
   put-by-val-with-string-slightly-polymorphic   
                                                      4.8229+-0.1910            4.8038+-0.2678        
   put-by-val-with-string                            19.5538+-0.3147           19.4990+-1.0591        
   put-by-val-with-symbol-replace-and-transition   
                                                     20.7418+-0.4819     ?     21.1187+-0.1544        ? might be 1.0182x slower
   put-by-val-with-symbol-slightly-polymorphic   
                                                      5.0228+-0.1633     ?      5.0521+-0.0731        ?
   put-by-val-with-symbol                            19.1353+-0.7257     ?     19.3765+-1.3047        ? might be 1.0126x slower
   rare-osr-exit-on-local                            16.5665+-0.0591     ?     16.7027+-0.3544        ?
   raytrace-with-empty-try-catch                      9.0857+-0.1128            8.9972+-0.2870        
   raytrace-with-try-catch                           16.2667+-0.5950     ?     16.3167+-0.3680        ?
   register-pressure-from-osr                        25.3452+-0.7226           25.0065+-0.2039          might be 1.0135x faster
   repeat-multi-get-by-offset                        28.2213+-1.1313     ?     28.8240+-0.2503        ? might be 1.0214x slower
   richards-empty-try-catch                         123.5026+-3.4299     ?    128.5753+-3.8824        ? might be 1.0411x slower
   richards-try-catch                               408.5924+-8.4474          405.1485+-7.8659        
   setter-prototype                                  11.4163+-0.5593           11.3247+-0.2554        
   setter                                             6.8278+-0.0657     !      7.1150+-0.2085        ! definitely 1.0421x slower
   simple-activation-demo                            33.8445+-0.9276           33.5181+-0.3692        
   simple-getter-access                              14.3268+-0.3833           14.0475+-0.3152          might be 1.0199x faster
   simple-poly-call-nested                            9.2784+-0.1034     ?      9.4417+-0.0817        ? might be 1.0176x slower
   simple-poly-call                                   1.9418+-0.1889            1.9374+-0.1046        
   sin-boolean                                       21.0912+-1.2114     ?     21.1145+-2.0203        ?
   singleton-scope                                   84.8734+-2.7347     ?     87.4733+-0.7323        ? might be 1.0306x slower
   sink-function                                     13.4752+-0.5140     ?     13.5058+-0.2381        ?
   sink-huge-activation                              20.5024+-0.7718     ?     21.1139+-0.5292        ? might be 1.0298x slower
   sinkable-new-object-dag                           83.8180+-2.6742     ?     89.7119+-6.5385        ? might be 1.0703x slower
   sinkable-new-object-taken                         66.7482+-1.0161     ?     66.9510+-1.4509        ?
   sinkable-new-object                               45.9813+-0.2317     ?     46.7982+-0.9802        ? might be 1.0178x slower
   slow-array-profile-convergence                     3.8118+-0.0709     ?      3.9625+-0.2382        ? might be 1.0396x slower
   slow-convergence                                   3.8649+-0.2256            3.7678+-0.1030          might be 1.0258x faster
   slow-ternaries                                    33.3363+-1.1995           33.3230+-1.1725        
   sorting-benchmark                                 25.4230+-1.2080     ?     25.8743+-0.9213        ? might be 1.0178x slower
   sparse-conditional                                 1.7410+-0.0590            1.6705+-0.0686          might be 1.0422x faster
   splice-to-remove                                  19.5941+-0.1964     ?     19.6570+-0.7977        ?
   string-char-code-at                               20.7964+-0.5131     ?     20.8229+-0.6571        ?
   string-concat-object                               3.4554+-0.4172            3.3484+-0.2247          might be 1.0320x faster
   string-concat-pair-object                          3.2863+-0.3384            3.1685+-0.1570          might be 1.0372x faster
   string-concat-pair-simple                         17.1970+-0.2549           17.1310+-0.5919        
   string-concat-simple                              17.3218+-0.2495     ?     17.6580+-0.3826        ? might be 1.0194x slower
   string-cons-repeat                                11.2892+-0.2501     ?     11.6085+-0.6134        ? might be 1.0283x slower
   string-cons-tower                                 11.1109+-0.4314           11.0529+-0.8141        
   string-equality                                   22.8563+-0.5577           22.7903+-0.5692        
   string-get-by-val-big-char                        10.9804+-0.2807           10.8962+-0.4150        
   string-get-by-val-out-of-bounds-insane             5.8965+-1.7405            5.0690+-0.1841          might be 1.1632x faster
   string-get-by-val-out-of-bounds                    6.8870+-0.1482     ?      7.0023+-0.2305        ? might be 1.0167x slower
   string-get-by-val                                  4.9035+-0.0607            4.8630+-0.0543        
   string-hash                                        2.8685+-0.0856            2.8002+-0.1800          might be 1.0244x faster
   string-long-ident-equality                        18.9660+-0.3869           18.8985+-0.5381        
   string-out-of-bounds                              15.2198+-0.4117     ?     15.2742+-0.2462        ?
   string-repeat-arith                               42.4420+-0.7784     ?     43.2211+-0.4118        ? might be 1.0184x slower
   string-sub                                        83.6598+-0.9450     ?     84.9185+-2.9205        ? might be 1.0150x slower
   string-test                                        4.4833+-0.1799            4.4041+-0.0885          might be 1.0180x faster
   string-var-equality                               59.5074+-1.6351     ?     59.6920+-2.7466        ?
   structure-hoist-over-transitions                   3.3961+-0.1244     ?      3.4680+-0.1639        ? might be 1.0212x slower
   substring-concat-weird                            57.9296+-0.3837           57.2679+-0.7046          might be 1.0116x faster
   substring-concat                                  63.7455+-2.2029           62.7404+-0.6526          might be 1.0160x faster
   substring                                         69.9964+-1.3205     ?     70.5173+-3.1273        ?
   switch-char-constant                               3.4128+-0.1496     ?      3.4382+-0.1330        ?
   switch-char                                        7.9395+-0.0959     ?      7.9933+-0.1241        ?
   switch-constant                                   13.8322+-2.6930           12.8125+-0.5405          might be 1.0796x faster
   switch-string-basic-big-var                       31.1395+-0.5022     ?     31.7504+-0.9673        ? might be 1.0196x slower
   switch-string-basic-big                           30.5966+-2.6535           28.8702+-2.5982          might be 1.0598x faster
   switch-string-basic-var                           32.1591+-1.1458           31.7173+-1.0470          might be 1.0139x faster
   switch-string-basic                               20.2922+-0.5237     ?     22.0007+-1.4344        ? might be 1.0842x slower
   switch-string-big-length-tower-var                27.6143+-1.3907     ?     28.0028+-0.8601        ? might be 1.0141x slower
   switch-string-length-tower-var                    20.9316+-0.7926     ?     21.2938+-0.8673        ? might be 1.0173x slower
   switch-string-length-tower                        15.0159+-0.2165     ?     15.4891+-0.3251        ? might be 1.0315x slower
   switch-string-short                               15.0556+-0.3663     ?     15.4705+-0.4013        ? might be 1.0276x slower
   switch                                            18.6821+-2.2473     ?     19.0468+-3.6711        ? might be 1.0195x slower
   tear-off-arguments-simple                          4.4592+-0.3902     ?      4.4793+-0.2930        ?
   tear-off-arguments                                 6.4133+-0.3101     ?      6.4637+-0.5340        ?
   temporal-structure                                16.6542+-0.2854     ?     16.6702+-0.1786        ?
   to-int32-boolean                                  21.2107+-0.6007     ?     21.2349+-0.1941        ?
   try-catch-get-by-val-cloned-arguments             14.5635+-0.3829     ?     14.6710+-0.2166        ?
   try-catch-get-by-val-direct-arguments              3.1721+-0.2412     ?      3.2209+-0.1111        ? might be 1.0154x slower
   try-catch-get-by-val-scoped-arguments              7.2185+-0.2494     ?      7.3502+-0.1872        ? might be 1.0182x slower
   typed-array-get-set-by-val-profiling              32.6567+-0.5778     ?     33.0248+-1.7292        ? might be 1.0113x slower
   undefined-property-access                        454.0237+-2.8673     ?    457.4219+-11.9366       ?
   undefined-test                                     4.6183+-0.0941            4.5880+-0.1480        
   unprofiled-licm                                   15.7719+-0.2770           15.2503+-0.5003          might be 1.0342x faster
   v8-raytrace-with-empty-try-catch                  86.3595+-1.3847           84.5534+-2.1997          might be 1.0214x faster
   v8-raytrace-with-try-catch                       111.9131+-1.9967     ?    113.3912+-6.4468        ? might be 1.0132x slower
   varargs-call                                      18.3911+-0.3014           18.0943+-0.1232          might be 1.0164x faster
   varargs-construct-inline                          35.4605+-0.7706     ?     36.1693+-1.2386        ? might be 1.0200x slower
   varargs-construct                                 29.8461+-0.4795     ?     30.1227+-1.0262        ?
   varargs-inline                                    12.3726+-0.0572     ?     12.6510+-0.3138        ? might be 1.0225x slower
   varargs-strict-mode                               14.4158+-0.1046     !     15.3515+-0.1917        ! definitely 1.0649x slower
   varargs                                           14.7092+-0.3661     ?     15.1565+-0.1614        ? might be 1.0304x slower
   weird-inlining-const-prop                          3.1445+-0.1744     ?      3.1488+-0.1516        ?

   <geometric>                                       12.0266+-0.0266     ?     12.0331+-0.0216        ? might be 1.0005x slower

                                                         Baseline                FTLTailCall                                    
AsmBench:
   bigfib.cpp                                       686.6130+-12.6805         681.1686+-2.8260        
   cray.c                                           627.5635+-14.1646    ?    627.8507+-12.3962       ?
   dry.c                                            622.0730+-79.3959    ?    639.7935+-90.4039       ? might be 1.0285x slower
   FloatMM.c                                        923.3546+-6.2904          920.9418+-1.0889        
   gcc-loops.cpp                                   5999.5257+-39.6127        5991.2282+-54.0201       
   n-body.c                                        1675.0771+-5.3706     ?   1675.3594+-8.6330        ?
   Quicksort.c                                      574.2011+-1.9090     ?    577.0942+-16.3350       ?
   stepanov_container.cpp                          4929.2533+-57.9747        4917.8504+-136.1726      
   Towers.c                                         403.0887+-5.7450          395.5480+-3.3673          might be 1.0191x faster

   <geometric>                                     1122.5891+-14.7933        1122.5798+-18.2758         might be 1.0000x faster

                                                         Baseline                FTLTailCall                                    
CompressionBench:
   huffman                                           81.4788+-1.2614     ?     81.7714+-1.0110        ?
   arithmetic-simple                                436.5280+-5.5896     ?    437.3597+-3.1873        ?
   arithmetic-precise                               365.8832+-10.6441    ?    371.6338+-8.8219        ? might be 1.0157x slower
   arithmetic-complex-precise                       363.1974+-7.8005     ?    366.7683+-6.5026        ?
   arithmetic-precise-order-0                       445.6658+-6.3451     ?    448.8375+-12.9706       ?
   arithmetic-precise-order-1                       420.9778+-3.6185     ?    426.6824+-3.9093        ? might be 1.0136x slower
   arithmetic-precise-order-2                       487.9990+-3.4694     ?    490.9768+-9.8150        ?
   arithmetic-simple-order-1                        500.0128+-6.0787     ?    504.2504+-9.9207        ?
   arithmetic-simple-order-2                        558.6084+-5.7195     ?    563.6854+-3.9748        ?
   lz-string                                        426.6593+-5.4276     ?    430.0168+-17.4548       ?

   <geometric>                                      372.5848+-1.4683     ?    375.6743+-2.5672        ? might be 1.0083x slower

                                                         Baseline                FTLTailCall                                    
Geomean of preferred means:
   <scaled-result>                                   78.3226+-0.1925           78.2832+-0.1550          might be 1.0005x faster
Comment 5 WebKit Commit Bot 2015-09-24 16:03:37 PDT
Attachment 261901 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/CallFrameShuffler.h:576:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 1 in 39 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 6 Filip Pizlo 2015-09-26 10:44:30 PDT
Comment on attachment 261901 [details]
Rebased patch

View in context: https://bugs.webkit.org/attachment.cgi?id=261901&action=review

R=me with comments.

> Source/JavaScriptCore/dfg/DFGNode.h:1133
> +    bool isFunctionTerminal()
> +    {
> +        switch (op()) {
> +        case Return:
> +        case TailCall:
> +        case TailCallVarargs:
> +        case TailCallForwardVarargs:
> +            return true;
> +        default:
> +            return false;
> +        }
> +    }
> +

The way that I'm interpreting this function name is that it should return true for nodes that terminate the function, not just the block.

Three things:

1) It's weird that Unreachable isn't here.  Most compiler phases don't want to care about whether a piece of code is reachable.  The point of Unreachable is that it's like a Return for those phases that don't want to care about reachability, but do want to care about whether a basic block has any further successors.  A property of Unreachable that I'd like to preserve is that it should be possible to write a phase that replaces every Unreachable with Return(Undefined), and doing so should not cause any behavior change.  In fact, Unreachable is like a Return(Undefined) that is preceded by the DFG IR equivalent of ASSERT_NOT_REACHED.  So, if Unreachable isn't handled here, then we probably need to either change the function name or add a comment, since we're failing to obey the "like a Return" property of Unreachable.  Luckily, it looks like the only user of this function would be OK with Unreachable returning true.  It's fine for TierUpCheckInjectionPhase to insert a tier-up check before Unreachable, and we can safely trust that this will not affect performance on anything we care about, since Unreachable is usually used after some piece of code that the compiler will already prove to always exit, like a Throw or ForceOSRExit.  So, if we insert a tier-up check before Unreachable, the compiler won't compile the tier-up check anyway.

2) If Unreachable is indeed a "function terminal", then this function could be written as "bool isFunctionTerminal() { return isTerminal() && !numSuccessors(); }".

3) If we think that doing (1) and (2) is weird, then maybe the less weird approach would be to just delete this function, and have TierUpCheckInjectionPhase inline this switch.  FWIW, that's probably how I would have done it.  But, now that I see this isFunctionTerminal() function, I tend to think that it would be a generally useful function, provided that (1) it returns true for Unreachable since that's the less surprising behavior and (2) it's written in terms of isTerminal() and numSuccessors() since "a terminal that has no successors" is a very nice way of saying "a terminal that terminates the function".
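
For illustration, a minimal sketch of point (2), assuming a heavily simplified node type. The `Op` enum, the `Node` struct and its successor counts are hypothetical stand-ins, not the real DFG definitions; the only thing taken from the review is the shape of `isFunctionTerminal()`.

```cpp
#include <cassert>

// Hypothetical, simplified opcodes; the real DFG has many more.
enum class Op { Jump, Branch, Switch, Return, TailCall, Unreachable, Add };

struct Node {
    Op op;
    unsigned switchCases; // only meaningful for Op::Switch in this toy model

    explicit Node(Op op, unsigned switchCases = 0)
        : op(op)
        , switchCases(switchCases)
    {
    }

    // A terminal is a node that ends its basic block.
    bool isTerminal() const
    {
        switch (op) {
        case Op::Jump:
        case Op::Branch:
        case Op::Switch:
        case Op::Return:
        case Op::TailCall:
        case Op::Unreachable:
            return true;
        default:
            return false;
        }
    }

    // Number of successor blocks that control can flow to from this node.
    unsigned numSuccessors() const
    {
        switch (op) {
        case Op::Jump:
            return 1;
        case Op::Branch:
            return 2;
        case Op::Switch:
            return switchCases + 1; // the cases plus the fall-through target
        default:
            return 0;
        }
    }

    // "A terminal that has no successors" terminates the function.
    bool isFunctionTerminal() const
    {
        return isTerminal() && !numSuccessors();
    }
};

// Small self-check: Unreachable is treated like Return, Jump is not.
inline void checkFunctionTerminals()
{
    assert(Node(Op::Unreachable).isFunctionTerminal());
    assert(!Node(Op::Jump).isFunctionTerminal());
}
```

Written this way, Unreachable gets the "like a Return" behavior without being listed explicitly, and any terminal added later with zero successors is covered automatically.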

> Source/JavaScriptCore/ftl/FTLJSTailCall.cpp:43
> +static FTL::Location getRegisterWithAddend(const ExitValue& value, StackMaps::Record& record, StackMaps& stackmaps)

I slightly prefer using anonymous namespaces over static, especially since you have multiple static methods.

> Source/JavaScriptCore/ftl/FTLJSTailCall.cpp:58
> +static ValueRecovery recoveryFor(const ExitValue& value, StackMaps::Record& record, StackMaps& stackmaps)

Ditto.

> Source/JavaScriptCore/ftl/FTLJSTailCall.cpp:108
> +static uint32_t sizeFor(DataFormat format)

Ditto.

> Source/JavaScriptCore/jit/CallFrameShuffler.cpp:312
> +    for (; firstRead < VirtualRegister { 0 }; firstRead += 1) {

Is there a better way of saying VirtualRegister{0}?  We don't usually use that syntax for VirtualRegister.  Wouldn't it be better to say "firstRead <= virtualRegisterForLocal(0)"?
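
A rough sketch of why the two loop conditions would agree, assuming locals map to negative operands starting at -1 (so local 0 sits at offset -1). `ToyVirtualRegister` and `toyVirtualRegisterForLocal()` are hypothetical stand-ins written for this example, not the JSC types, and the mapping is an assumption that should be checked against VirtualRegister.h.

```cpp
#include <cassert>

// Toy stand-in for a virtual register: just a signed operand offset.
struct ToyVirtualRegister {
    int offset;
    explicit ToyVirtualRegister(int offset)
        : offset(offset)
    {
    }
};

inline bool operator<(ToyVirtualRegister a, ToyVirtualRegister b) { return a.offset < b.offset; }
inline bool operator<=(ToyVirtualRegister a, ToyVirtualRegister b) { return a.offset <= b.offset; }

// ASSUMPTION: local N lives at operand -1 - N, so local 0 is operand -1.
inline ToyVirtualRegister toyVirtualRegisterForLocal(int local)
{
    return ToyVirtualRegister(-1 - local);
}

// The condition as written in the patch.
inline bool patchCondition(ToyVirtualRegister firstRead)
{
    return firstRead < ToyVirtualRegister { 0 };
}

// The condition suggested in the review.
inline bool suggestedCondition(ToyVirtualRegister firstRead)
{
    return firstRead <= toyVirtualRegisterForLocal(0);
}

// Self-check at the boundary: offset -1 passes both, offset 0 fails both.
inline void checkBoundary()
{
    assert(patchCondition(ToyVirtualRegister(-1)) && suggestedCondition(ToyVirtualRegister(-1)));
    assert(!patchCondition(ToyVirtualRegister(0)) && !suggestedCondition(ToyVirtualRegister(0)));
}
```

Under that assumption both forms accept exactly the negative offsets, so the suggested spelling changes readability rather than behavior.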

> Source/JavaScriptCore/jit/CallFrameShuffler.h:411
> +#if USE(JSVALUE64)
> +    mutable RegisterSet m_lockedRegisters;
> +#else
>      RegisterSet m_lockedRegisters;
> +#endif

Would be cleaner to just make it mutable unconditionally.

> Source/JavaScriptCore/jit/Reg.h:197
> +template<typename T> struct HashTraits;
> +template<> struct HashTraits<JSC::Reg> : SimpleClassHashTraits<JSC::Reg> { };

I found a bug!  I believe that this needs "static const bool emptyValueIsZero = false;", like these traits, for example:

template<typename T> struct HashTraits;
template<> struct HashTraits<JSC::DFG::PromotedLocationDescriptor> : SimpleClassHashTraits<JSC::DFG::PromotedLocationDescriptor> {
    static const bool emptyValueIsZero = false;
};

The reason is that the empty Reg (i.e. "Reg()") will have the value 0xff (see Reg::invalid()), which is not zero.
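
To make the failure mode concrete, here is a toy sketch. It assumes a hash table that zero-fills its buckets when the traits claim `emptyValueIsZero` and default-constructs them otherwise. `ToyReg`, the two traits structs and `initializeBuckets()` are all hypothetical; none of this is the WTF implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Toy register type whose default ("empty") value is 0xff, as described for Reg.
struct ToyReg {
    uint8_t index;
    ToyReg()
        : index(0xff)
    {
    }
    explicit ToyReg(uint8_t index)
        : index(index)
    {
    }
    bool operator==(const ToyReg& other) const { return index == other.index; }
};

// Hypothetical traits: one claims the empty value is all-zero bits, one does not.
struct ZeroFillTraits {
    static const bool emptyValueIsZero = true;
};
struct ConstructTraits {
    static const bool emptyValueIsZero = false;
};

// Initialize buckets the way a table might: memset when allowed, else construct.
template<typename Traits>
void initializeBuckets(ToyReg* buckets, unsigned count)
{
    if (Traits::emptyValueIsZero) {
        std::memset(static_cast<void*>(buckets), 0, count * sizeof(ToyReg));
        return;
    }
    for (unsigned i = 0; i < count; ++i)
        buckets[i] = ToyReg();
}

inline bool isEmptyBucket(const ToyReg& bucket)
{
    return bucket == ToyReg();
}

// Self-check: zero-filled buckets look occupied by register 0, not empty.
inline void checkEmptyBuckets()
{
    ToyReg zeroed[2];
    initializeBuckets<ZeroFillTraits>(zeroed, 2);
    assert(!isEmptyBucket(zeroed[0]));

    ToyReg constructed[2];
    initializeBuckets<ConstructTraits>(constructed, 2);
    assert(isEmptyBucket(constructed[0]));
}
```

With the wrong trait, every fresh bucket in this toy table appears to hold register 0 rather than the empty value, which is the kind of mismatch the `emptyValueIsZero = false` override is meant to prevent.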
Comment 7 Michael Saboff 2015-09-28 15:35:04 PDT
(In reply to comment #6)
> Comment on attachment 261901 [details]
> Rebased patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=261901&action=review
> 
> R=me with comments.
> 
> > Source/JavaScriptCore/dfg/DFGNode.h:1133
> > +    bool isFunctionTerminal()
> > +    {
> > +        switch (op()) {
> > +        case Return:
> > +        case TailCall:
> > +        case TailCallVarargs:
> > +        case TailCallForwardVarargs:
> > +            return true;
> > +        default:
> > +            return false;
> > +        }
> > +    }
> > +
> 
> The way that I'm interpreting this function name is that it should return
> true for nodes that terminate the function, not just the block.
> 
> Three things:
> 
> 1) It's weird that Unreachable isn't here.  Most compiler phases don't want
> to care about whether a piece of code is reachable.  The point of
> Unreachable is that it's like a Return for those phases that don't want to
> care about reachability, but do want to care about whether a basic block has
> any further successors.  A property of Unreachable that I'd like to preserve
> is that it should be possible to write a phase that replaces every
> Unreachable with Return(Undefined), and doing so should not cause any
> behavior change.  In fact, Unreachable is like a Return(Undefined) that is
> preceded by the DFG IR equivalent of ASSERT_NOT_REACHED.  So, if Unreachable
> isn't handled here, then we probably need to either change the function name
> or add a comment, since we're failing to obey the "like a Return" property
> of Unreachable.  Luckily, it looks like the only user of this function would
> be OK with Unreachable returning true.  It's fine for
> TierUpCheckInjectionPhase to insert a tier-up check before Unreachable, and
> we can safely trust that this will not affect performance on anything we
> care about, since Unreachable is usually used after some piece of code that
> the compiler will already prove to always exit, like a Throw or
> ForceOSRExit.  So, if we insert a tier-up check before Unreachable, the
> compiler won't compile the tier-up check anyway.
> 
> 2) If Unreachable is indeed a "function terminal", then this function could
> be written as "bool isFunctionTerminal() { return isTerminal() &&
> !numSuccessors(); }".
> 
> 3) If we think that doing (1) and (2) is weird, then maybe the less weird
> approach would be to just delete this function, and have
> TierUpCheckInjectionPhase inline this switch.  FWIW, that's probably how I
> would have done it.  But, now that I see this isFunctionTerminal() function,
> I tend to think that it would be a generally useful function, provided that
> (1) it returns true for Unreachable since that's the less surprising
> behavior and (2) it's written in terms of isTerminal() and numSuccessors()
> since "a terminal that has no successors" is a very nice way of saying "a
> terminal that terminates the function".

I changed isFunctionTerminal() to be written in terms of isTerminal() and numSuccessors().  I also had to fix the ThrowReferenceError case in DFGClobberize.cpp::clobberize() to satisfy the validation phase. 

> > Source/JavaScriptCore/ftl/FTLJSTailCall.cpp:43
> > +static FTL::Location getRegisterWithAddend(const ExitValue& value, StackMaps::Record& record, StackMaps& stackmaps)
> 
> I slightly prefer using anonymous namespaces over static, especially since
> you have multiple static methods.

Done.

> > Source/JavaScriptCore/ftl/FTLJSTailCall.cpp:58
> > +static ValueRecovery recoveryFor(const ExitValue& value, StackMaps::Record& record, StackMaps& stackmaps)
> 
> Ditto.

Done.

> > Source/JavaScriptCore/ftl/FTLJSTailCall.cpp:108
> > +static uint32_t sizeFor(DataFormat format)
> 
> Ditto.

Done.

> > Source/JavaScriptCore/jit/CallFrameShuffler.cpp:312
> > +    for (; firstRead < VirtualRegister { 0 }; firstRead += 1) {
> 
> Is there a better way of saying VirtualRegister{0}?  We don't usually use
> that syntax for VirtualRegister.  Wouldn't it be better to say "firstRead <=
> virtualRegisterForLocal(0)"?
> 
> > Source/JavaScriptCore/jit/CallFrameShuffler.h:411
> > +#if USE(JSVALUE64)
> > +    mutable RegisterSet m_lockedRegisters;
> > +#else
> >      RegisterSet m_lockedRegisters;
> > +#endif
> 
> Would be cleaner to just make it mutable unconditionally.

Done.

> > Source/JavaScriptCore/jit/Reg.h:197
> > +template<typename T> struct HashTraits;
> > +template<> struct HashTraits<JSC::Reg> : SimpleClassHashTraits<JSC::Reg> { };
> 
> I found a bug!  I believe that this needs "static const bool
> emptyValueIsZero = false;", like these traits, for example:
> 
> template<typename T> struct HashTraits;
> template<> struct HashTraits<JSC::DFG::PromotedLocationDescriptor> :
> SimpleClassHashTraits<JSC::DFG::PromotedLocationDescriptor> {
>     static const bool emptyValueIsZero = false;
> };
> 
> The reason is that the empty Reg (i.e. "Reg()") will have the value 0xff
> (see Reg::invalid()), which is not zero.

Done.
Comment 8 Michael Saboff 2015-09-28 15:36:41 PDT
Created attachment 262024 [details]
Patch for Landing
Comment 9 Michael Saboff 2015-09-28 15:37:57 PDT
Committed r190289: <http://trac.webkit.org/changeset/190289>
Comment 10 Csaba Osztrogonác 2015-09-29 03:30:02 PDT
(In reply to comment #9)
> Committed r190289: <http://trac.webkit.org/changeset/190289>

It caused two different regressions:
- bug149619
- bug149621
Comment 11 Chris Dumez 2015-09-29 09:50:18 PDT
It looks like this change may have caused a ~4% progression on Speedometer.
Comment 12 Filip Pizlo 2015-09-29 09:52:54 PDT
(In reply to comment #11)
> It looks like this change may have caused a ~4% progression on Speedometer.

Are you also seeing the crashes that Ossy reports?
Comment 13 Michael Saboff 2015-09-29 09:53:34 PDT
(In reply to comment #10)
> (In reply to comment #9)
> > Committed r190289: <http://trac.webkit.org/changeset/190289>
> 
> It caused two different regressions:
> - bug149619
> - bug149621

I'm working on these.
Comment 14 Chris Dumez 2015-09-29 09:59:10 PDT
(In reply to comment #12)
> (In reply to comment #11)
> > It looks like this change may have caused a ~4% progression on Speedometer.
> 
> Are you also seeing the crashes that Ossy reports?

One is 32-bit ARM, and I don't think we have coverage for that. The other one is the open source Speedometer, which is indeed failing on the open source Apple bots:
https://build.webkit.org/builders/Apple%20Yosemite%20Release%20WK2%20%28Perf%29/builds/3026
Comment 15 WebKit Commit Bot 2015-09-29 14:21:28 PDT
Re-opened since this is blocked by bug 149647
Comment 16 Alexey Proskuryakov 2015-09-29 14:41:26 PDT
This also caused a crash on regress/script-tests/call-spread-apply.js.ftl-no-cjit-no-inline-validate on Mac Debug, which happened every time on the bots.
Comment 17 Michael Saboff 2015-09-30 15:24:55 PDT
(In reply to comment #14)
> (In reply to comment #12)
> > (In reply to comment #11)
> > > It looks like this change may have caused a ~4% progression on Speedometer.
> > 
> > Are you also seeing the crashes that Ossy reports?
> 
> One is 32-bit ARM, and I don't think we have coverage for that. The other
> one is the open source Speedometer, which is indeed failing on the open
> source Apple bots:
> https://build.webkit.org/builders/Apple%20Yosemite%20Release%20WK2%20%28Perf%29/builds/3026

The ARM32 problems showed up on our iOS testers.
Comment 18 Michael Saboff 2015-09-30 15:28:32 PDT
Committed r190370: <http://trac.webkit.org/changeset/190370>
Comment 19 Csaba Osztrogonác 2015-10-01 02:50:48 PDT
(In reply to comment #18)
> Committed r190370: <http://trac.webkit.org/changeset/190370>

bug149621 is still valid