Bug 143489

Summary: Make it possible to enable LLVM FastISel
Product: WebKit
Reporter: Filip Pizlo <fpizlo>
Component: New Bugs
Assignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED
Severity: Normal
CC: barraclough, benjamin, ggaren, juergen, mark.lam, mhahnenb, mmirman, msaboff, nrotem, oliver, ossy, saam, sam
Priority: P2
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Attachments:
Description    Flags
Patch          msaboff: review+

Description Filip Pizlo 2015-04-07 11:21:51 PDT
Make it possible to enable LLVM FastISel
Comment 1 Filip Pizlo 2015-04-07 11:23:33 PDT
Created attachment 250279
Patch
Comment 2 Filip Pizlo 2015-04-07 11:23:54 PDT
*** Bug 138523 has been marked as a duplicate of this bug. ***
Comment 3 Filip Pizlo 2015-04-07 11:23:59 PDT
*** Bug 138625 has been marked as a duplicate of this bug. ***
Comment 4 Filip Pizlo 2015-04-07 11:24:45 PDT
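We don't want to enable it on x86: in the report below, FastISel is definitely slower overall on LongSpider (1.0946x), Octane (1.0783x), and Kraken (1.0537x). Testing ARM64 next.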


Benchmark report for SunSpider, LongSpider, V8Spider, Octane, Kraken, JSRegress, AsmBench, and CompressionBench on bigmac (MacPro5,1).

VMs tested:
"SelectionDAG" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/jsc (r182200)
    export JSC_enableLLVMFastISel=false
"FastISel" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/jsc (r182200)
    export JSC_enableLLVMFastISel=true

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements.
Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level
timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.
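
For readers who want to reproduce the "+-" columns, here is a minimal sketch of how a mean and a 95% confidence half-width could be derived from six samples. It assumes a two-sided Student's t interval, which the report does not confirm. The function name `mean_with_ci` and the sample values are hypothetical and are not taken from the benchmark harness.

```python
import math
import statistics

# Two-sided 95% Student's t critical values, keyed by degrees of freedom.
# Only small sample counts are covered; 6 samples means 5 degrees of freedom.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447}

def mean_with_ci(samples):
    """Return (mean, half_width) for a 95% confidence interval (assumed t-based)."""
    n = len(samples)
    mean = statistics.mean(samples)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    sem = statistics.stdev(samples) / math.sqrt(n)
    return mean, T_95[n - 1] * sem

# Hypothetical timings in milliseconds, one per VM invocation.
samples_ms = [8.52, 8.61, 8.58, 8.66, 8.55, 8.68]
mean, half = mean_with_ci(samples_ms)
print(f"{mean:.4f}+-{half:.4f}")
```

With only six samples the t critical value (2.571) is noticeably larger than the normal-distribution value (1.96), so the intervals are wider than a normal approximation would give.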

                                                       SelectionDAG                FastISel                                     
SunSpider:
   3d-cube                                            8.5997+-0.1195            8.5895+-0.1445        
   3d-morph                                          12.2763+-0.1741     ?     12.3106+-0.2126        ?
   3d-raytrace                                       11.5169+-0.3377           11.3405+-0.2460          might be 1.0156x faster
   access-binary-trees                                3.7450+-0.1578     ?      3.7637+-0.0961        ?
   access-fannkuch                                   10.4689+-0.2421           10.4601+-0.2007        
   access-nbody                                       5.5993+-0.1232            5.5437+-0.1534          might be 1.0100x faster
   access-nsieve                                      5.4593+-0.1355     ?      5.5930+-0.1514        ? might be 1.0245x slower
   bitops-3bit-bits-in-byte                           2.3904+-0.0906            2.3597+-0.0927          might be 1.0130x faster
   bitops-bits-in-byte                                6.5479+-0.1260     ?      6.5771+-0.0715        ?
   bitops-bitwise-and                                 3.1172+-0.0945     ?      3.1272+-0.0485        ?
   bitops-nsieve-bits                                 6.8454+-0.2556     ?      6.8559+-0.1393        ?
   controlflow-recursive                              3.6220+-0.1463            3.5868+-0.1118        
   crypto-aes                                         7.3588+-0.2070            7.3105+-0.2618        
   crypto-md5                                         4.5729+-0.1534     ?      4.5764+-0.0761        ?
   crypto-sha1                                        4.3040+-0.1828     ?      4.3196+-0.1671        ?
   date-format-tofte                                 17.3472+-0.6758     ?     17.5007+-0.6940        ?
   date-format-xparb                                 10.1877+-0.3588           10.1772+-0.2082        
   math-cordic                                        5.6974+-0.0890            5.6133+-0.0586          might be 1.0150x faster
   math-partial-sums                                 11.7185+-0.1635     ?     11.8821+-0.1650        ? might be 1.0140x slower
   math-spectral-norm                                 3.6212+-0.1360     ?      3.6368+-0.1478        ?
   regexp-dna                                        11.3943+-0.2059     ?     11.5802+-0.3477        ? might be 1.0163x slower
   string-base64                                      7.8936+-0.1848            7.8191+-0.0877        
   string-fasta                                      12.9583+-0.3164     ?     13.1183+-0.2116        ? might be 1.0123x slower
   string-tagcloud                                   18.3916+-0.1808     ?     18.7413+-0.4576        ? might be 1.0190x slower
   string-unpack-code                                38.0065+-0.8955     ?     38.3133+-0.9858        ?
   string-validate-input                              9.0782+-0.2968            9.0472+-0.2819        

   <arithmetic>                                       9.3353+-0.0584     ?      9.3748+-0.0652        ? might be 1.0042x slower
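
The markers and notes in the right-hand column appear to follow a simple rule. The sketch below is inferred only from the rows shown in this report, not from the harness source: "definitely" when the two confidence intervals do not overlap, "might be" when they overlap but the ratio of means is at least 1.01, and a bare "?" when FastISel's mean is larger but the difference is within the noise. The function name `classify` and the `threshold` parameter are hypothetical.

```python
def classify(base_mean, base_ci, new_mean, new_ci, threshold=1.01):
    """Compare the second VM against the first; times are in ms, lower is better.

    Returns (marker, note). The rule is an assumption inferred from the report.
    """
    slower = new_mean > base_mean
    ratio = new_mean / base_mean if slower else base_mean / new_mean
    direction = "slower" if slower else "faster"
    # Two intervals mean+-ci overlap when the means differ by no more than
    # the sum of the half-widths.
    overlap = abs(new_mean - base_mean) <= base_ci + new_ci
    if not overlap:
        return ("!" if slower else "^", f"definitely {ratio:.4f}x {direction}")
    marker = "?" if slower else ""
    note = f"might be {ratio:.4f}x {direction}" if ratio >= threshold else ""
    return marker, note

# SunSpider access-nbody row: intervals overlap, ratio just above 1.01.
print(classify(5.5993, 0.1232, 5.5437, 0.1534))
# LongSpider 3d-cube row: intervals are far apart.
print(classify(1865.0443, 17.8014, 3507.5431, 14.3836))
```

In the report the marker is printed next to both columns; the sketch returns it once.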

                                                       SelectionDAG                FastISel                                     
LongSpider:
   3d-cube                                         1865.0443+-17.8014    !   3507.5431+-14.3836       ! definitely 1.8807x slower
   3d-morph                                        2007.0619+-17.2373        1995.2124+-14.1488       
   3d-raytrace                                     1542.9008+-16.4015    !   1707.3601+-18.2749       ! definitely 1.1066x slower
   access-binary-trees                             1915.8303+-5.4544     ?   1920.4260+-28.0326       ?
   access-fannkuch                                  597.0572+-19.2605    ?    620.9979+-11.7437       ? might be 1.0401x slower
   access-nbody                                    1495.5085+-12.0595    !   1516.1370+-7.5472        ! definitely 1.0138x slower
   access-nsieve                                   1882.1476+-24.7522    ?   1892.9358+-7.6270        ?
   bitops-3bit-bits-in-byte                          62.9661+-0.5505     ^     60.7209+-0.5645        ^ definitely 1.0370x faster
   bitops-bits-in-byte                              376.5118+-2.2567     !    404.2630+-3.3823        ! definitely 1.0737x slower
   bitops-nsieve-bits                              1464.5593+-11.0922    ?   1470.9776+-14.3860       ?
   controlflow-recursive                           1051.7330+-11.5337    !   1125.2125+-6.3360        ! definitely 1.0699x slower
   crypto-aes                                      1336.9211+-7.1115     !   1469.7300+-10.5137       ! definitely 1.0993x slower
   crypto-md5                                       986.2363+-12.4707    !   1111.2268+-6.2500        ! definitely 1.1267x slower
   crypto-sha1                                     1347.7334+-33.7064    !   1521.0999+-17.3160       ! definitely 1.1286x slower
   date-format-tofte                               1527.1635+-26.0328    ?   1543.5298+-10.7866       ? might be 1.0107x slower
   date-format-xparb                               1403.5752+-10.1330    ?   1416.6774+-17.2525       ?
   math-cordic                                     1066.8608+-5.4891     !   1324.9700+-14.1726       ! definitely 1.2419x slower
   math-partial-sums                               1316.3061+-7.1068     !   1410.8193+-8.6652        ! definitely 1.0718x slower
   math-spectral-norm                              1227.9100+-12.3159    !   1876.5701+-8.5793        ! definitely 1.5283x slower
   string-base64                                    651.0180+-3.9562     ?    656.2692+-4.8383        ?
   string-fasta                                     921.2173+-4.5239     !    951.3451+-25.4720       ! definitely 1.0327x slower
   string-tagcloud                                  441.7861+-7.8164          436.7946+-6.1052          might be 1.0114x faster

   <geometric>                                     1007.5604+-2.8817     !   1102.9033+-1.7548        ! definitely 1.0946x slower

                                                       SelectionDAG                FastISel                                     
V8Spider:
   crypto                                           101.9584+-1.2954     ^     96.1765+-0.5681        ^ definitely 1.0601x faster
   deltablue                                        123.3252+-2.6895          122.9685+-2.4337        
   earley-boyer                                      82.4085+-0.9499           82.1640+-0.7160        
   raytrace                                          51.5650+-1.2099           50.9358+-1.0471          might be 1.0124x faster
   regexp                                           129.0633+-1.4601          128.9525+-2.3959        
   richards                                         129.0986+-1.4271     ?    130.7869+-1.3377        ? might be 1.0131x slower
   splay                                             65.2642+-1.9563     ?     67.4703+-5.1693        ? might be 1.0338x slower

   <geometric>                                       92.5256+-0.5210           92.0957+-1.1748          might be 1.0047x faster

                                                       SelectionDAG                FastISel                                     
Octane:
   encrypt                                           0.42148+-0.00427    !     0.45383+-0.00398       ! definitely 1.0767x slower
   decrypt                                           7.24277+-0.08315    !     8.24161+-0.11025       ! definitely 1.1379x slower
   deltablue                                x2       0.39034+-0.01019    !     0.43766+-0.00331       ! definitely 1.1212x slower
   earley                                            1.10494+-0.03167    ?     1.15498+-0.03285       ? might be 1.0453x slower
   boyer                                            14.22936+-0.06135         14.22274+-0.11674       
   navier-stokes                            x2       7.61319+-0.06497    !     9.22806+-0.11794       ! definitely 1.2121x slower
   raytrace                                 x2       2.43427+-0.09611    !     3.00558+-0.15251       ! definitely 1.2347x slower
   richards                                 x2       0.26589+-0.00362    !     0.29973+-0.00216       ! definitely 1.1273x slower
   splay                                    x2       0.67113+-0.01137    ?     0.68897+-0.00727       ? might be 1.0266x slower
   regexp                                   x2      64.99142+-0.63005         64.82059+-0.91023       
   pdfjs                                    x2      89.93591+-0.70553    ?    89.95927+-2.13850       ?
   mandreel                                 x2     103.11619+-0.67479    !   124.36555+-2.69942       ! definitely 1.2061x slower
   gbemu                                    x2      93.50565+-2.94178    ?    96.67965+-5.55520       ? might be 1.0339x slower
   closure                                           0.99198+-0.01036    ?     1.00270+-0.00929       ? might be 1.0108x slower
   jquery                                           12.43881+-0.18042    ?    12.56416+-0.17672       ? might be 1.0101x slower
   box2d                                    x2      26.71514+-0.32637    !    27.46626+-0.28792       ! definitely 1.0281x slower
   zlib                                     x2     722.86532+-7.93585    !   776.68334+-5.32801       ! definitely 1.0745x slower
   typescript                               x2    1436.81160+-20.80254   ?  1460.83370+-14.59704      ? might be 1.0167x slower

   <geometric>                                      12.96284+-0.05750    !    13.97779+-0.05568       ! definitely 1.0783x slower

                                                       SelectionDAG                FastISel                                     
Kraken:
   ai-astar                                          830.397+-4.090            827.543+-9.930         
   audio-beat-detection                              211.438+-2.595      !     254.782+-1.764         ! definitely 1.2050x slower
   audio-dft                                         290.277+-13.465     ?     322.441+-31.892        ? might be 1.1108x slower
   audio-fft                                         158.454+-2.326      !     182.535+-3.608         ! definitely 1.1520x slower
   audio-oscillator                                  376.803+-3.511      !     385.914+-2.721         ! definitely 1.0242x slower
   imaging-darkroom                                  177.480+-2.060      !     195.769+-2.804         ! definitely 1.1031x slower
   imaging-desaturate                                147.688+-1.339      !     159.703+-0.667         ! definitely 1.0814x slower
   imaging-gaussian-blur                             225.622+-2.266      !     252.286+-2.925         ! definitely 1.1182x slower
   json-parse-financial                               97.533+-1.838             96.247+-2.062           might be 1.0134x faster
   json-stringify-tinderbox                          108.371+-0.734      ?     109.008+-1.757         ?
   stanford-crypto-aes                               123.952+-2.862            123.632+-1.813         
   stanford-crypto-ccm                                99.472+-11.188     ?     108.617+-12.698        ? might be 1.0919x slower
   stanford-crypto-pbkdf2                            311.025+-4.970      ?     312.564+-4.749         ?
   stanford-crypto-sha256-iterative                   95.663+-1.213      ?      97.736+-3.373         ? might be 1.0217x slower

   <arithmetic>                                      232.441+-1.180      !     244.913+-2.327         ! definitely 1.0537x slower

                                                       SelectionDAG                FastISel                                     
JSRegress:
   abs-boolean                                        4.6894+-0.1020     ?      4.7036+-0.0833        ?
   adapt-to-double-divide                            20.2448+-0.4708           19.7487+-0.5439          might be 1.0251x faster
   aliased-arguments-getbyval                         1.8922+-0.2338     ?      2.0666+-0.1937        ? might be 1.0922x slower
   allocate-big-object                                4.7758+-0.4891            4.5614+-0.0379          might be 1.0470x faster
   arguments-named-and-reflective                    21.3042+-0.2437     ^     20.0065+-0.3491        ^ definitely 1.0649x faster
   arguments-out-of-bounds                           22.3727+-0.2905     ^     21.3520+-0.2289        ^ definitely 1.0478x faster
   arguments-strict-mode                             19.1051+-0.7753           18.7394+-0.5312          might be 1.0195x faster
   arguments                                         17.1198+-0.3580           16.4279+-0.3454          might be 1.0421x faster
   arity-mismatch-inlining                            1.3113+-0.0678     ?      1.3475+-0.1414        ? might be 1.0276x slower
   array-access-polymorphic-structure                14.5345+-0.1525     ?     14.6381+-0.2875        ?
   array-nonarray-polymorhpic-access                 66.6090+-3.9166     ?     68.4310+-3.0869        ? might be 1.0274x slower
   array-prototype-every                            190.6930+-2.1282          189.0092+-4.1653        
   array-prototype-forEach                          186.8407+-0.8638          185.9291+-1.5337        
   array-prototype-map                              204.8980+-1.3233     ?    210.5504+-7.9014        ? might be 1.0276x slower
   array-prototype-some                             189.7626+-1.8481          189.0440+-2.3175        
   array-splice-contiguous                           89.2187+-2.0189           87.6331+-1.1110          might be 1.0181x faster
   array-with-double-add                              7.6195+-0.1441     ?      7.6444+-0.1527        ?
   array-with-double-increment                        5.6081+-0.1280     ?      5.6687+-0.0907        ? might be 1.0108x slower
   array-with-double-mul-add                         11.8791+-0.1340     ?     11.9427+-0.1200        ?
   array-with-double-sum                              5.5305+-0.1448     ?      5.5979+-0.1219        ? might be 1.0122x slower
   array-with-int32-add-sub                          13.4411+-0.1726           13.3357+-0.1621        
   array-with-int32-or-double-sum                     5.4254+-0.0556     ?      5.5384+-0.1262        ? might be 1.0208x slower
   ArrayBuffer-DataView-alloc-large-long-lived   
                                                     74.0944+-3.5205           72.7228+-2.5277          might be 1.0189x faster
   ArrayBuffer-DataView-alloc-long-lived             33.3169+-1.3638           33.0926+-1.7570        
   ArrayBuffer-Int32Array-byteOffset                  5.2727+-0.0950     ?      5.2962+-0.0907        ?
   ArrayBuffer-Int8Array-alloc-large-long-lived   
                                                     71.1624+-1.6337     ?     72.3822+-2.6428        ? might be 1.0171x slower
   ArrayBuffer-Int8Array-alloc-long-lived-buffer   
                                                     55.3926+-1.5904     ?     57.4735+-2.6402        ? might be 1.0376x slower
   ArrayBuffer-Int8Array-alloc-long-lived            28.9157+-1.3897           28.2915+-1.7676          might be 1.0221x faster
   ArrayBuffer-Int8Array-alloc                       26.8741+-2.2263           24.6380+-2.4678          might be 1.0908x faster
   asmjs_bool_bug                                    13.5028+-0.2613           13.3617+-0.1222          might be 1.0106x faster
   assign-custom-setter-polymorphic                   6.2513+-0.2151     ?      6.2782+-0.0857        ?
   assign-custom-setter                               8.4838+-0.1920            8.2903+-0.0717          might be 1.0233x faster
   basic-set                                         16.1321+-0.3143     ?     16.3488+-0.1665        ? might be 1.0134x slower
   big-int-mul                                        7.8496+-0.1435     ?      7.8939+-0.1602        ?
   boolean-test                                       4.9461+-0.1076     ?      5.0277+-0.1087        ? might be 1.0165x slower
   branch-fold                                        5.3185+-0.1288     ?      5.3487+-0.1252        ?
   by-val-generic                                    15.7998+-0.9126     ?     15.8211+-0.7306        ?
   call-spread-apply                                 62.7080+-1.8914     ?     63.1357+-1.5990        ?
   call-spread-call                                  49.0560+-0.9004           48.8023+-0.6716        
   captured-assignments                               0.8283+-0.0941     ?      0.8475+-0.0911        ? might be 1.0231x slower
   cast-int-to-double                                 8.8052+-0.0954            8.8002+-0.1694        
   cell-argument                                     13.0802+-0.2886     ^     12.4643+-0.2344        ^ definitely 1.0494x faster
   cfg-simplify                                       4.0194+-0.1285     ?      4.0521+-0.1350        ?
   chain-getter-access                               17.5015+-0.1375     ^     16.8215+-0.0950        ^ definitely 1.0404x faster
   cmpeq-obj-to-obj-other                            17.3470+-0.2989           17.3003+-0.2790        
   constant-test                                      8.2842+-0.1503            8.2405+-0.0809        
   create-lots-of-functions                          42.0282+-0.6981     ?     42.1368+-0.6535        ?
   DataView-custom-properties                        81.4009+-1.4057     ?     83.4624+-3.2204        ? might be 1.0253x slower
   deconstructing-parameters-overridden-by-function   
                                                      0.9823+-0.0918     ^      0.8286+-0.0469        ^ definitely 1.1856x faster
   delay-tear-off-arguments-strictmode               27.5980+-0.3574     ?     27.8925+-0.3613        ? might be 1.0107x slower
   deltablue-varargs                                367.8245+-3.7122          362.6208+-2.4074          might be 1.0144x faster
   destructuring-arguments                           28.1412+-0.2704     ?     28.2227+-0.4707        ?
   destructuring-swap                                 8.5175+-0.1123            8.4937+-0.0791        
   direct-arguments-getbyval                          1.7555+-0.0468     ?      1.9189+-0.1735        ? might be 1.0931x slower
   div-boolean-double                                 5.9800+-0.1507            5.9572+-0.1543        
   div-boolean                                       10.3669+-0.1949     ?     10.4093+-0.1528        ?
   double-get-by-val-out-of-bounds                    8.4616+-0.3658            8.3654+-0.1671          might be 1.0115x faster
   double-pollution-getbyval                         10.2407+-0.1459           10.1699+-0.1249        
   double-pollution-putbyoffset                       8.3648+-0.2522     ?      8.4335+-0.2016        ?
   double-to-int32-typed-array-no-inline              3.7060+-0.0990            3.6746+-0.1320        
   double-to-int32-typed-array                        3.2898+-0.1397            3.2819+-0.0496        
   double-to-uint32-typed-array-no-inline             3.8522+-0.1382     ?      3.8795+-0.1561        ?
   double-to-uint32-typed-array                       3.3427+-0.1091     ?      3.4029+-0.1288        ? might be 1.0180x slower
   elidable-new-object-dag                           67.2061+-0.9196     !     69.4036+-0.9192        ! definitely 1.0327x slower
   elidable-new-object-roflcopter                    73.4537+-0.6774           72.4845+-0.8684          might be 1.0134x faster
   elidable-new-object-then-call                     70.5998+-3.4176     ^     62.2038+-1.1127        ^ definitely 1.1350x faster
   elidable-new-object-tree                          74.9192+-0.5778     ^     73.0699+-0.5764        ^ definitely 1.0253x faster
   empty-string-plus-int                             10.1955+-0.2286     ?     10.2614+-0.1194        ?
   emscripten-cube2hash                              58.1526+-0.6014           57.1345+-0.8926          might be 1.0178x faster
   exit-length-on-plain-object                       32.0852+-1.0174     ?     32.1013+-1.4017        ?
   external-arguments-getbyval                        2.0430+-0.1689     ?      2.2134+-0.1246        ? might be 1.0834x slower
   external-arguments-putbyval                        3.9580+-0.1141     ?      4.0547+-0.0786        ? might be 1.0244x slower
   fixed-typed-array-storage-var-index                1.9548+-0.0962            1.9411+-0.0821        
   fixed-typed-array-storage                          1.4196+-0.0847     ?      1.4597+-0.1049        ? might be 1.0282x slower
   Float32Array-matrix-mult                           8.1877+-0.1977     ?      8.2923+-0.2134        ? might be 1.0128x slower
   Float32Array-to-Float64Array-set                 120.3352+-2.8371     ?    121.0543+-1.5252        ?
   Float64Array-alloc-long-lived                    116.3851+-1.2076     ?    116.8838+-1.1062        ?
   Float64Array-to-Int16Array-set                   145.8976+-1.4129          145.6190+-1.6514        
   fold-double-to-int                                28.2958+-0.2904     ^     27.2797+-0.4792        ^ definitely 1.0372x faster
   fold-get-by-id-to-multi-get-by-offset-rare-int   
                                                     14.6296+-0.4682           14.5572+-0.3576        
   fold-get-by-id-to-multi-get-by-offset             12.6135+-0.4332     ?     12.7499+-0.3171        ? might be 1.0108x slower
   fold-multi-get-by-offset-to-get-by-offset   
                                                     11.9594+-0.2257           11.8476+-0.7804        
   fold-multi-get-by-offset-to-poly-get-by-offset   
                                                     12.1622+-0.1302           11.9609+-0.4659          might be 1.0168x faster
   fold-multi-put-by-offset-to-poly-put-by-offset   
                                                     10.8218+-0.6512           10.7195+-0.7266        
   fold-multi-put-by-offset-to-put-by-offset   
                                                      9.1284+-0.6866            8.6495+-0.8552          might be 1.0554x faster
   fold-multi-put-by-offset-to-replace-or-transition-put-by-offset   
                                                     17.6037+-0.5242           17.5820+-0.2417        
   fold-put-by-id-to-multi-put-by-offset             11.7725+-0.5786           11.4432+-0.4677          might be 1.0288x faster
   fold-put-structure                                 8.7777+-0.7611            8.5015+-0.5916          might be 1.0325x faster
   for-of-iterate-array-entries                       8.7133+-0.0709     ?      8.7412+-0.1949        ?
   for-of-iterate-array-keys                          7.4320+-0.1207            7.3796+-0.1987        
   for-of-iterate-array-values                        6.8846+-0.1477     ?      7.1204+-0.1382        ? might be 1.0342x slower
   fround                                            27.4575+-0.3182     ^     25.5948+-0.5197        ^ definitely 1.0728x faster
   ftl-library-inlining-dataview                    147.3132+-0.4472          147.1323+-2.0459        
   ftl-library-inlining                             149.7279+-1.4417     ?    151.7170+-0.8484        ? might be 1.0133x slower
   function-dot-apply                                 3.7060+-0.1414     ?      3.7896+-0.1066        ? might be 1.0226x slower
   function-test                                      5.4785+-0.0698     ?      5.5520+-0.1121        ? might be 1.0134x slower
   function-with-eval                               178.6155+-9.1548          173.7661+-0.8569          might be 1.0279x faster
   gcse-poly-get-less-obvious                        36.9332+-0.2872           36.5610+-1.0432          might be 1.0102x faster
   gcse-poly-get                                     37.5935+-0.2065           37.3721+-0.5449        
   gcse                                               9.3281+-0.1099            9.3197+-0.1727        
   get-by-id-bimorphic-check-structure-elimination-simple   
                                                      4.1269+-0.1155            4.1267+-0.1270        
   get-by-id-bimorphic-check-structure-elimination   
                                                     12.3902+-0.1096     ?     12.5017+-0.2098        ?
   get-by-id-chain-from-try-block                    14.0478+-0.1678           13.9646+-0.1584        
   get-by-id-check-structure-elimination             11.1002+-0.1011           11.0998+-0.0637        
   get-by-id-proto-or-self                           34.9756+-3.0700           31.8979+-1.8802          might be 1.0965x faster
   get-by-id-quadmorphic-check-structure-elimination-simple   
                                                      5.3534+-0.0499     ?      5.3537+-0.0844        ?
   get-by-id-self-or-proto                           34.4309+-2.5821           33.1685+-2.3244          might be 1.0381x faster
   get-by-val-out-of-bounds                           8.0806+-0.1625     ?      8.1885+-0.1784        ? might be 1.0133x slower
   get_callee_monomorphic                             6.3323+-0.2880     ?      6.4347+-0.2403        ? might be 1.0162x slower
   get_callee_polymorphic                             5.6858+-0.1022            5.6506+-0.1697        
   getter-no-activation                               6.4304+-0.1613     ?      6.4935+-0.1432        ?
   getter-richards                                  178.0033+-3.2880     ?    181.4618+-4.4789        ? might be 1.0194x slower
   getter                                            10.0794+-0.0987     ?     10.1272+-0.0822        ?
   global-var-const-infer-fire-from-opt               1.8322+-0.1369            1.6978+-0.0880          might be 1.0792x faster
   global-var-const-infer                             1.5197+-0.1066     ?      1.5834+-0.1721        ? might be 1.0419x slower
   HashMap-put-get-iterate-keys                      50.3003+-0.9066           49.6560+-0.3964          might be 1.0130x faster
   HashMap-put-get-iterate                           49.0870+-0.5173     ?     49.3398+-0.6486        ?
   HashMap-string-put-get-iterate                    51.1516+-0.8312           50.6290+-0.5082          might be 1.0103x faster
   hoist-make-rope                                   19.5199+-1.2816     ^     16.5275+-1.3628        ^ definitely 1.1811x faster
   hoist-poly-check-structure-effectful-loop   
                                                      9.8179+-0.0921            9.7946+-0.1395        
   hoist-poly-check-structure                         6.4805+-0.0654     ?      6.5111+-0.1663        ?
   imul-double-only                                  14.7276+-0.3495     ^     12.5468+-0.8325        ^ definitely 1.1738x faster
   imul-int-only                                     14.8087+-0.1429           14.4719+-0.4361          might be 1.0233x faster
   imul-mixed                                        13.8275+-0.7548     ^     12.0864+-0.4622        ^ definitely 1.1441x faster
   in-four-cases                                     35.8693+-0.4690           35.1101+-0.3447          might be 1.0216x faster
   in-one-case-false                                 18.8662+-0.2755     ?     18.9657+-0.2189        ?
   in-one-case-true                                  18.7217+-0.2440     ?     18.9977+-0.2512        ? might be 1.0147x slower
   in-two-cases                                      19.3696+-0.2470     ?     19.4438+-0.2785        ?
   indexed-properties-in-objects                      4.7735+-0.0540            4.7137+-0.0892          might be 1.0127x faster
   infer-closure-const-then-mov-no-inline             6.1501+-0.0687     ?      6.1858+-0.1273        ?
   infer-closure-const-then-mov                      27.7511+-0.5943     ^     22.8807+-0.6616        ^ definitely 1.2129x faster
   infer-closure-const-then-put-to-scope-no-inline   
                                                     19.5283+-0.2381           19.3764+-0.2582        
   infer-closure-const-then-put-to-scope             31.5698+-0.2451           31.1210+-0.4440          might be 1.0144x faster
   infer-closure-const-then-reenter-no-inline   
                                                     93.0436+-0.7589     !     95.4198+-0.8972        ! definitely 1.0255x slower
   infer-closure-const-then-reenter                 231.1912+-3.4959     ?    233.6228+-0.5816        ? might be 1.0105x slower
   infer-constant-global-property                    39.4675+-0.4140           39.4182+-0.2149        
   infer-constant-property                            3.7176+-0.1036            3.6945+-0.0803        
   infer-one-time-closure-ten-vars                   18.3456+-0.2120     ^     17.6279+-0.3604        ^ definitely 1.0407x faster
   infer-one-time-closure-two-vars                   17.6565+-0.1556     ^     17.0547+-0.3666        ^ definitely 1.0353x faster
   infer-one-time-closure                            17.5772+-0.1623     ^     17.0557+-0.2671        ^ definitely 1.0306x faster
   infer-one-time-deep-closure                       31.0248+-0.2441     ^     30.2749+-0.1668        ^ definitely 1.0248x faster
   inline-arguments-access                            7.0294+-0.1914            7.0182+-0.1282        
   inline-arguments-aliased-access                    7.1417+-0.1101            7.0667+-0.0838          might be 1.0106x faster
   inline-arguments-local-escape                      7.3002+-0.2146            7.2275+-0.2263          might be 1.0101x faster
   inline-get-scoped-var                              6.0265+-0.0786     ?      6.0536+-0.1425        ?
   inlined-put-by-id-transition                      17.8819+-0.2240     ?     18.2325+-0.2201        ? might be 1.0196x slower
   int-or-other-abs-then-get-by-val                   9.6504+-0.0782     ?      9.6875+-0.1583        ?
   int-or-other-abs-zero-then-get-by-val             34.2129+-0.2819           34.1840+-0.5952        
   int-or-other-add-then-get-by-val                   8.6727+-0.1049     ?      8.7348+-0.1442        ?
   int-or-other-add                                   9.2454+-0.0683            9.2063+-0.1664        
   int-or-other-div-then-get-by-val                   6.6267+-0.1768     ?      6.6848+-0.1199        ?
   int-or-other-max-then-get-by-val                   7.3953+-0.1760     ?      7.4728+-0.1138        ? might be 1.0105x slower
   int-or-other-min-then-get-by-val                   7.4847+-0.1070            7.3523+-0.0983          might be 1.0180x faster
   int-or-other-mod-then-get-by-val                   6.2106+-0.1448     ?      6.2990+-0.1011        ? might be 1.0142x slower
   int-or-other-mul-then-get-by-val                   6.7827+-0.1100     ?      6.8452+-0.1446        ?
   int-or-other-neg-then-get-by-val                   8.2593+-0.1484     ?      8.4526+-0.1202        ? might be 1.0234x slower
   int-or-other-neg-zero-then-get-by-val             34.2902+-0.3664     ?     34.3126+-0.3569        ?
   int-or-other-sub-then-get-by-val                   9.2069+-0.1346            9.1265+-0.1468        
   int-or-other-sub                                   5.7812+-0.0712            5.7740+-0.0870        
   int-overflow-local                                 8.2546+-0.1608     ?      8.2580+-0.0828        ?
   Int16Array-alloc-long-lived                       82.5020+-0.9122     ?     82.6247+-0.6534        ?
   Int16Array-bubble-sort-with-byteLength            52.0377+-0.6363           51.9418+-0.5616        
   Int16Array-bubble-sort                            50.2659+-0.4678           50.1385+-1.2946        
   Int16Array-load-int-mul                            2.3125+-0.1125            2.2966+-0.0408        
   Int16Array-to-Int32Array-set                     113.6869+-0.9490     ?    114.7040+-0.7833        ?
   Int32Array-alloc-large                            38.0277+-0.8765     ?     38.2511+-0.8850        ?
   Int32Array-alloc-long-lived                       91.1698+-0.6881     ?     91.6479+-0.9127        ?
   Int32Array-alloc                                   5.5246+-0.1722     ?      5.5702+-0.1681        ?
   Int32Array-Int8Array-view-alloc                   15.8805+-0.4929           15.3928+-0.2661          might be 1.0317x faster
   int52-spill                                       12.2472+-0.2581     ?     12.2904+-0.2831        ?
   Int8Array-alloc-long-lived                        75.6414+-1.0315           75.5951+-0.8525        
   Int8Array-load-with-byteLength                     5.2231+-0.1188            5.2195+-0.1305        
   Int8Array-load                                     5.2445+-0.0815            5.1987+-0.0987        
   integer-divide                                    19.5997+-0.1575     ?     19.7310+-0.2281        ?
   integer-modulo                                     3.6654+-0.2039            3.5684+-0.1223          might be 1.0272x faster
   large-int-captured                                10.2194+-0.3193           10.0739+-0.1696          might be 1.0145x faster
   large-int-neg                                     31.7052+-0.3500     ?     32.1844+-0.5373        ? might be 1.0151x slower
   large-int                                         28.2292+-0.3419     ?     28.3281+-0.2415        ?
   logical-not                                        8.2285+-0.1733            8.2257+-0.0871        
   lots-of-fields                                    23.7387+-0.1998     ?     23.9305+-0.2467        ?
   make-indexed-storage                               5.6999+-0.2224            5.5838+-0.4229          might be 1.0208x faster
   make-rope-cse                                      7.4395+-0.1881     ?      7.4797+-0.1587        ?
   marsaglia-larger-ints                             64.6115+-0.5563     !     67.8317+-0.5070        ! definitely 1.0498x slower
   marsaglia-osr-entry                               36.2752+-0.8264     !     38.1157+-0.5739        ! definitely 1.0507x slower
   max-boolean                                        4.0587+-0.1348     ?      4.2760+-0.5059        ? might be 1.0535x slower
   method-on-number                                  34.2976+-1.3779     ?     34.5450+-0.5705        ?
   min-boolean                                        3.8831+-0.0867            3.7958+-0.0924          might be 1.0230x faster
   minus-boolean-double                               4.4181+-0.0886     ?      4.4392+-0.0750        ?
   minus-boolean                                      3.6857+-0.0990     ?      3.8219+-0.1021        ? might be 1.0370x slower
   misc-strict-eq                                    76.9371+-5.7151           69.6841+-3.3935          might be 1.1041x faster
   mod-boolean-double                                14.5115+-0.2081     ?     14.5239+-0.1009        ?
   mod-boolean                                        9.3303+-0.0428     ?      9.3874+-0.0565        ?
   mul-boolean-double                                 5.0826+-0.0742            5.0790+-0.0507        
   mul-boolean                                        4.0833+-0.0463            4.0574+-0.0961        
   neg-boolean                                        4.7247+-0.1031            4.6320+-0.0828          might be 1.0200x faster
   negative-zero-divide                               0.6345+-0.0875     ?      0.6430+-0.0890        ? might be 1.0133x slower
   negative-zero-modulo                               0.6419+-0.0852            0.6416+-0.0856        
   negative-zero-negate                               0.5625+-0.0787            0.5168+-0.0166          might be 1.0884x faster
   nested-function-parsing                           72.5534+-0.7649           72.4578+-0.7705        
   new-array-buffer-dead                              4.1733+-0.1565            4.1412+-0.1059        
   new-array-buffer-push                             10.4566+-0.2856     ?     10.6333+-0.2344        ? might be 1.0169x slower
   new-array-dead                                    18.5350+-0.5539           17.5635+-0.4374          might be 1.0553x faster
   new-array-push                                     6.7718+-0.1904            6.7287+-0.1598        
   no-inline-constructor                            185.8405+-1.1437     ?    186.2167+-1.6442        ?
   number-test                                        4.9538+-0.1234     ?      4.9563+-0.0772        ?
   object-closure-call                               10.6423+-0.1428           10.5911+-0.1815        
   object-test                                        5.3088+-0.0746     ?      5.3118+-0.0490        ?
   obvious-sink-pathology-taken                     224.5919+-1.9574     !    239.9729+-1.2429        ! definitely 1.0685x slower
   obvious-sink-pathology                           207.1306+-2.1107     ?    209.7955+-1.8423        ? might be 1.0129x slower
   obviously-elidable-new-object                     59.7957+-0.8936           57.6697+-2.6974          might be 1.0369x faster
   plus-boolean-arith                                 4.1835+-0.1298            4.0886+-0.0991          might be 1.0232x faster
   plus-boolean-double                                4.5261+-0.0773            4.4631+-0.0527          might be 1.0141x faster
   plus-boolean                                       3.6762+-0.0699     ?      3.6972+-0.1180        ?
   poly-chain-access-different-prototypes-simple   
                                                      5.4100+-0.2087            5.3170+-0.0694          might be 1.0175x faster
   poly-chain-access-different-prototypes             3.9245+-0.1244     ?      3.9889+-0.0744        ? might be 1.0164x slower
   poly-chain-access-simpler                          5.2791+-0.0767     ?      5.3349+-0.0920        ? might be 1.0106x slower
   poly-chain-access                                  3.9854+-0.1055            3.9816+-0.0847        
   poly-stricteq                                    101.2163+-0.9053     ?    102.2290+-0.3011        ? might be 1.0100x slower
   polymorphic-array-call                             2.6107+-0.2761            2.5710+-0.2081          might be 1.0154x faster
   polymorphic-get-by-id                              5.3105+-0.0866     ?      5.3501+-0.1189        ?
   polymorphic-put-by-id                             46.8473+-2.0091     ?     47.1553+-2.5377        ?
   polymorphic-structure                             30.6224+-0.9654     ?     31.1726+-0.6363        ? might be 1.0180x slower
   polyvariant-monomorphic-get-by-id                 17.7112+-0.1965           17.5722+-0.5159        
   proto-getter-access                               17.1846+-0.1480     ^     16.7629+-0.2553        ^ definitely 1.0252x faster
   put-by-id-replace-and-transition                  14.8277+-0.2224     ?     15.0968+-0.4135        ? might be 1.0181x slower
   put-by-id-slightly-polymorphic                     4.0701+-0.0575     ?      4.1465+-0.0979        ? might be 1.0188x slower
   put-by-id                                         20.8864+-0.4660     ?     21.0407+-0.3754        ?
   put-by-val-direct                                  1.1365+-0.0813            1.1037+-0.0826          might be 1.0297x faster
   put-by-val-large-index-blank-indexing-type   
                                                     11.7750+-0.6287           11.6176+-0.6512          might be 1.0135x faster
   put-by-val-machine-int                             4.4420+-0.1052     ?      4.5068+-0.1145        ? might be 1.0146x slower
   rare-osr-exit-on-local                            25.9587+-0.2790           25.9373+-0.1749        
   register-pressure-from-osr                        38.6895+-0.4197     ?     38.8131+-0.3164        ?
   setter                                             7.7931+-0.0714     ?      7.7939+-0.1163        ?
   simple-activation-demo                            41.3110+-0.5326           41.1925+-0.3382        
   simple-getter-access                              21.9587+-0.5853     !     23.4349+-0.3350        ! definitely 1.0672x slower
   simple-poly-call-nested                           13.4834+-0.3536     ?     13.6738+-0.2535        ? might be 1.0141x slower
   simple-poly-call                                   2.2957+-0.2115     ?      2.3490+-0.1711        ? might be 1.0232x slower
   sin-boolean                                       30.7915+-0.4586           29.4427+-1.0075          might be 1.0458x faster
   singleton-scope                                  130.6710+-1.8339          129.4368+-1.2777        
   sinkable-new-object-dag                          116.7161+-0.6658     !    131.4027+-1.3962        ! definitely 1.1258x slower
   sinkable-new-object-taken                         89.3175+-3.8291     !    103.1487+-0.7717        ! definitely 1.1549x slower
   sinkable-new-object                               63.9858+-0.5780     ^     62.0129+-0.5843        ^ definitely 1.0318x faster
   slow-array-profile-convergence                     5.2005+-0.1081            5.0825+-0.2109          might be 1.0232x faster
   slow-convergence                                   5.8536+-0.1389     ?      5.9589+-0.1895        ? might be 1.0180x slower
   sparse-conditional                                 1.8931+-0.1123     ?      1.9236+-0.0993        ? might be 1.0161x slower
   splice-to-remove                                  34.0475+-1.5103     ?     34.6749+-1.3049        ? might be 1.0184x slower
   string-char-code-at                               29.9552+-0.2873           29.7500+-0.2287        
   string-concat-object                               4.0517+-0.2457            4.0291+-0.3002        
   string-concat-pair-object                          3.7623+-0.2735     ?      3.9700+-0.1561        ? might be 1.0552x slower
   string-concat-pair-simple                         20.3565+-0.4312           20.1445+-0.5254          might be 1.0105x faster
   string-concat-simple                              20.7802+-0.1630     ?     21.0522+-0.3823        ? might be 1.0131x slower
   string-cons-repeat                                13.0412+-0.1751           12.8580+-0.1634          might be 1.0142x faster
   string-cons-tower                                 13.1147+-0.2616     ?     13.1979+-0.3825        ?
   string-equality                                   33.1228+-0.4960           32.7658+-0.3679          might be 1.0109x faster
   string-get-by-val-big-char                        13.8846+-0.0974     ?     14.0727+-0.3712        ? might be 1.0135x slower
   string-get-by-val-out-of-bounds-insane             8.0847+-0.1462            8.0760+-0.0878        
   string-get-by-val-out-of-bounds                    9.6835+-0.1373            9.5571+-0.1409          might be 1.0132x faster
   string-get-by-val                                  6.7431+-0.1197     ?      6.8752+-0.0953        ? might be 1.0196x slower
   string-hash                                        3.5725+-0.0895     ?      3.5977+-0.0805        ?
   string-long-ident-equality                        26.2828+-0.3446     ?     26.4396+-0.3625        ?
   string-out-of-bounds                              22.6351+-0.3147     ^     21.2159+-0.2664        ^ definitely 1.0669x faster
   string-repeat-arith                               62.6082+-0.3941     ?     62.6716+-0.4003        ?
   string-sub                                       119.4828+-0.6511     ?    119.7570+-0.9638        ?
   string-test                                        4.9519+-0.1880            4.9075+-0.1531        
   string-var-equality                               61.6142+-0.7637           61.3372+-0.4645        
   structure-hoist-over-transitions                   4.2059+-0.2620            4.1911+-0.1554        
   substring-concat-weird                            74.9517+-0.3501     ?     75.2542+-1.2201        ?
   substring-concat                                  79.3645+-1.0033     ?     79.6346+-0.8790        ?
   substring                                         90.4904+-0.7933     ?     90.8223+-0.9722        ?
   switch-char-constant                               3.8668+-0.0792            3.7979+-0.0752          might be 1.0181x faster
   switch-char                                       11.3082+-0.1620     ?     11.5399+-0.4236        ? might be 1.0205x slower
   switch-constant                                   14.8455+-0.1959           14.6940+-0.5171          might be 1.0103x faster
   switch-string-basic-big-var                       31.3233+-1.6646           30.3767+-1.1267          might be 1.0312x faster
   switch-string-basic-big                           24.7890+-0.7978           24.3688+-1.0018          might be 1.0172x faster
   switch-string-basic-var                           33.3917+-1.4025           32.6048+-0.8702          might be 1.0241x faster
   switch-string-basic                               24.9836+-2.3076           24.4477+-1.6871          might be 1.0219x faster
   switch-string-big-length-tower-var                30.7263+-0.3658           30.5112+-0.0930        
   switch-string-length-tower-var                    26.5222+-0.2905           26.3319+-0.2350        
   switch-string-length-tower                        21.9843+-1.4445     ?     22.3167+-1.4779        ? might be 1.0151x slower
   switch-string-short                               20.6147+-0.2382     ?     20.8482+-0.3281        ? might be 1.0113x slower
   switch                                            21.0638+-0.3814     ?     21.3770+-0.3676        ? might be 1.0149x slower
   tear-off-arguments-simple                          5.6965+-0.2921            5.5304+-0.2056          might be 1.0300x faster
   tear-off-arguments                                 7.9628+-0.1787            7.8462+-0.1498          might be 1.0149x faster
   temporal-structure                                19.9055+-0.1879           19.8280+-0.2108        
   to-int32-boolean                                  25.4525+-0.3122     ?     25.5382+-0.2572        ?
   try-catch-get-by-val-cloned-arguments             28.6984+-0.5083           28.4295+-0.3145        
   try-catch-get-by-val-direct-arguments             11.6765+-1.3606           11.5554+-1.1812          might be 1.0105x faster
   try-catch-get-by-val-scoped-arguments             13.7115+-0.1727     ?     13.7313+-0.2693        ?
   undefined-property-access                        597.6775+-5.0579     ^    589.1895+-2.1319        ^ definitely 1.0144x faster
   undefined-test                                     5.0271+-0.0844     ?      5.1060+-0.1872        ? might be 1.0157x slower
   unprofiled-licm                                   36.6673+-0.2296     ^     34.4048+-0.8722        ^ definitely 1.0658x faster
   varargs-call                                      24.8457+-0.2608     ?     24.9020+-0.2522        ?
   varargs-construct-inline                          38.5220+-0.3151           38.3725+-0.3708        
   varargs-construct                                 57.5459+-0.9338     ?     59.3946+-0.9615        ? might be 1.0321x slower
   varargs-inline                                    15.6607+-0.2578     ^     14.7590+-0.1916        ^ definitely 1.0611x faster
   varargs-strict-mode                               18.0523+-0.3413     ?     18.1833+-0.3542        ?
   varargs                                           17.7978+-0.2294     ?     18.1201+-0.1917        ? might be 1.0181x slower
   weird-inlining-const-prop                          3.7823+-0.1819     ?      3.8913+-0.2240        ? might be 1.0288x slower

   <geometric>                                       14.9237+-0.0240     ^     14.8598+-0.0340        ^ definitely 1.0043x faster

                                                       SelectionDAG                FastISel                                     
AsmBench:
   bigfib.cpp                                       887.0988+-8.6434     !   1099.3190+-12.8987       ! definitely 1.2392x slower
   cray.c                                           875.6460+-6.4732     !    948.3042+-9.0611        ! definitely 1.0830x slower
   dry.c                                           1038.3807+-11.1271    !   1174.2063+-10.1666       ! definitely 1.1308x slower
   FloatMM.c                                       1296.5059+-5.7447     !   1424.7773+-11.2949       ! definitely 1.0989x slower
   gcc-loops.cpp                                   8230.2213+-59.3281    !  10292.2275+-28.3838       ! definitely 1.2505x slower
   n-body.c                                        2467.5656+-12.5657    !   2647.4965+-10.3973       ! definitely 1.0729x slower
   Quicksort.c                                      714.0405+-13.2013    !    764.2223+-6.4279        ! definitely 1.0703x slower
   stepanov_container.cpp                          6364.0815+-151.5082   !   6774.4578+-178.0046      ! definitely 1.0645x slower
   Towers.c                                         610.3955+-4.0709     !    629.9341+-5.0971        ! definitely 1.0320x slower

   <geometric>                                     1572.7714+-6.5098     !   1751.2524+-9.2044        ! definitely 1.1135x slower

                                                       SelectionDAG                FastISel                                     
CompressionBench:
   huffman                                          783.6010+-7.8806          780.8865+-12.5274       
   arithmetic-simple                                723.0556+-3.1654     !    750.0397+-3.4141        ! definitely 1.0373x slower
   arithmetic-precise                               556.8846+-5.2662     !    571.9472+-5.1301        ! definitely 1.0270x slower
   arithmetic-complex-precise                       556.9940+-4.9284     !    573.0284+-3.7072        ! definitely 1.0288x slower
   arithmetic-precise-order-0                       789.6442+-15.5094    ?    798.1486+-8.8324        ? might be 1.0108x slower
   arithmetic-precise-order-1                       586.8455+-5.0994     !    624.7945+-18.9340       ! definitely 1.0647x slower
   arithmetic-precise-order-2                       661.7862+-11.5918    !    687.3401+-12.8577       ! definitely 1.0386x slower
   arithmetic-simple-order-1                        714.2471+-5.7002     !    742.1235+-7.0381        ! definitely 1.0390x slower
   arithmetic-simple-order-2                        793.6716+-12.7723    !    827.5933+-12.2754       ! definitely 1.0427x slower
   lz-string                                        636.9012+-29.5865         618.1506+-38.2922         might be 1.0303x faster

   <geometric>                                      674.3033+-2.5072     !    691.3106+-5.5432        ! definitely 1.0252x slower

                                                       SelectionDAG                FastISel                                     
Geomean of preferred means:
   <scaled-result>                                  119.4692+-0.1742     !    124.7443+-0.3773        ! definitely 1.0442x slower
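
For anyone checking the rows by hand, the labels in this report appear to follow a simple pattern. The ratio is the larger mean divided by the smaller one. A row says "definitely" when the two 95% intervals do not overlap, and "might be" when they overlap but the ratio is at least about 1.01. This rule is inferred from the rows in this report, not taken from the benchmark script; the `compare` helper and its 1.01 cutoff are assumptions. A minimal Python sketch:

```python
def compare(base_mean, base_ci, new_mean, new_ci, threshold=1.01):
    """Return (label, ratio) the way the report's rows appear to be labelled.

    threshold=1.01 is a guess inferred from the tables, not a documented value.
    """
    slower = new_mean > base_mean
    ratio = new_mean / base_mean if slower else base_mean / new_mean
    direction = "slower" if slower else "faster"
    # Each interval is mean +- ci; two intervals overlap unless one lies
    # wholly above the other.
    overlap = (base_mean - base_ci <= new_mean + new_ci
               and new_mean - new_ci <= base_mean + base_ci)
    if not overlap:
        return f"definitely {ratio:.4f}x {direction}", ratio
    if ratio >= threshold:
        return f"might be {ratio:.4f}x {direction}", ratio
    return "", ratio


# Rows taken from the X86 tables in this comment.
print(compare(887.0988, 8.6434, 1099.3190, 12.8987)[0])  # bigfib.cpp
print(compare(636.9012, 29.5865, 618.1506, 38.2922)[0])  # lz-string
print(compare(783.6010, 7.8806, 780.8865, 12.5274)[0])   # huffman (no label)
```

Under this reading, a "definitely" row can have a very small ratio (the JSRegress geomean is 1.0043x) as long as the intervals are tight enough not to touch.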
Comment 5 Filip Pizlo 2015-04-07 11:25:49 PDT
Note that this patch currently conflates enabling FastISel with using the basic register allocator (RA).  Maybe that's not optimal for X86.  But anyway, my next move is to test this on ARM64 and see what happens.
Comment 6 Juergen Ributzka 2015-04-07 11:37:20 PDT
FastISel for X86 hasn't been optimized for JavaScript yet.

Testing basic RA and/or disabling the Machine Instruction Scheduler is worth a try on X86.
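
One way to separate the effects that comment 5 says are conflated is to enumerate every combination and benchmark each one. A rough Python sketch follows. The flag spellings are the ones `llc` accepts (`-fast-isel`, `-regalloc=`, `-enable-misched=`). Whether JSC can forward them to its LLVM instance in this form is an assumption, as are the `configurations` helper, the configuration names, and the choice of `greedy` as the baseline allocator.

```python
from itertools import product


def configurations():
    """Yield (name, flags) for each FastISel x regalloc x misched combination."""
    for fast_isel, regalloc, misched in product(
            (False, True), ("greedy", "basic"), (True, False)):
        flags = [
            f"-fast-isel={'true' if fast_isel else 'false'}",
            f"-regalloc={regalloc}",
            f"-enable-misched={'true' if misched else 'false'}",
        ]
        name = "+".join([
            "FastISel" if fast_isel else "SelectionDAG",
            f"{regalloc}RA",
            "misched" if misched else "nomisched",
        ])
        yield name, flags


for name, flags in configurations():
    print(name, " ".join(flags))
```

This gives eight configurations, so the cost of FastISel itself can be read off separately from the cost of the simpler allocator or of turning the scheduler off.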
Comment 7 Filip Pizlo 2015-04-07 12:22:59 PDT
And it's profitable for ARM64.


Benchmark report for SunSpider, Octane, Kraken, and AsmBench.

VMs tested:
"SelectionDAG" at /Volumes/Data/pizlo/secondary/OpenSource/JscArmBuild/Release-iphoneos/JavaScriptCore.framework/Resources/jsc
    export JSC_enableLLVMFastISel=false
"FastISel" at /Volumes/Data/pizlo/secondary/OpenSource/JscArmBuild/Release-iphoneos/JavaScriptCore.framework/Resources/jsc
    export JSC_enableLLVMFastISel=true

Collected 3 samples per benchmark/VM, with 3 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.
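
The "+-" figures are described as 95% confidence intervals over the collected samples. The report does not say how they are computed; a common choice for so few samples is a Student's t interval, sketched below in Python under that assumption. The sample timings are made up for illustration, `mean_with_ci` is a hypothetical helper, and the critical values are the standard two-sided 95% table entries.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% Student's t critical values, keyed by degrees of freedom (n - 1).
# Only the sample counts used in this bug are listed: 3 (ARM64 run) and 6 (X86 run).
T_95 = {2: 4.303, 5: 2.571}


def mean_with_ci(samples):
    """Return (mean, half_width) of a 95% t-interval for the sample mean."""
    n = len(samples)
    half_width = T_95[n - 1] * stdev(samples) / sqrt(n)
    return mean(samples), half_width


# Hypothetical timings in ms, not taken from the report.
m, hw = mean_with_ci([19.0, 18.6, 19.9])
print(f"{m:.4f}+-{hw:.4f}")
```

If the script does work this way, dropping from 6 samples to 3 raises the multiplier from 2.571 to 4.303, which may be part of why this run has wider intervals and more "might be" rows than the X86 run.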

                                               SelectionDAG                FastISel                                     
SunSpider:
   3d-cube                                   19.1584+-1.5976           19.0782+-1.5476        
   3d-morph                                  18.4647+-0.4147     ?     18.5917+-0.5864        ?
   3d-raytrace                               19.8375+-0.9101     ?     21.0723+-3.9045        ? might be 1.0622x slower
   access-binary-trees                        6.2598+-0.6093            6.2017+-0.4173        
   access-fannkuch                           15.0500+-1.9685     ?     15.7166+-4.2463        ? might be 1.0443x slower
   access-nbody                               8.7649+-0.0676     ?      8.9951+-0.8798        ? might be 1.0263x slower
   access-nsieve                              6.5923+-0.2791     ?      6.6240+-0.3910        ?
   bitops-3bit-bits-in-byte                   3.0800+-0.3071     ?      3.1207+-0.2709        ? might be 1.0132x slower
   bitops-bits-in-byte                       10.9430+-0.5485     ?     10.9932+-1.7605        ?
   bitops-bitwise-and                         6.0168+-0.2334            5.9670+-0.1393        
   bitops-nsieve-bits                         8.4460+-0.0802     ?      8.5463+-0.2661        ? might be 1.0119x slower
   controlflow-recursive                      5.6418+-0.1320     ?      6.3622+-2.9473        ? might be 1.1277x slower
   crypto-aes                                13.1012+-0.7841     ?     13.2612+-1.5332        ? might be 1.0122x slower
   crypto-md5                                 8.8907+-7.9512     ?      9.6057+-8.0888        ? might be 1.0804x slower
   crypto-sha1                                7.0814+-0.3844     ?      7.2605+-0.1854        ? might be 1.0253x slower
   date-format-tofte                         26.3088+-2.7523           25.8010+-0.7791          might be 1.0197x faster
   date-format-xparb                         16.2147+-0.9355     ?     17.3440+-4.6970        ? might be 1.0696x slower
   math-cordic                                9.5390+-0.1404     ?      9.7357+-0.6370        ? might be 1.0206x slower
   math-partial-sums                         15.1840+-0.7492     ?     15.2391+-0.6273        ?
   math-spectral-norm                         5.1752+-0.6829     ?      8.9093+-3.1804        ? might be 1.7215x slower
   regexp-dna                                18.8178+-0.6276           18.5535+-0.9504          might be 1.0142x faster
   string-base64                             11.7364+-1.0907           11.4598+-0.3804          might be 1.0241x faster
   string-fasta                              16.9639+-0.8497           16.5520+-0.1265          might be 1.0249x faster
   string-tagcloud                           30.1510+-1.6947     ?     30.1854+-5.0598        ?
   string-unpack-code                        69.5977+-2.1304           67.7461+-1.9471          might be 1.0273x faster
   string-validate-input                     13.4931+-0.4232     ?     14.2054+-2.0545        ? might be 1.0528x slower

   <arithmetic>                              15.0196+-0.2969     ?     15.2742+-0.3394        ? might be 1.0169x slower

                                               SelectionDAG                FastISel                                     
Octane:
   encrypt                                   0.53884+-0.00167    !     0.56236+-0.00388       ! definitely 1.0437x slower
   decrypt                                  10.09786+-0.08573    ^     9.73540+-0.06686       ^ definitely 1.0372x faster
   deltablue                        x2       0.47122+-0.09774          0.44781+-0.03957         might be 1.0523x faster
   earley                                    1.60465+-0.13677          1.57417+-0.04395         might be 1.0194x faster
   boyer                                    16.84935+-0.30842    ?    17.00140+-0.23350       ?
   navier-stokes                    x2      21.06951+-0.59193         20.95475+-0.10441       
   raytrace                         x2       4.24897+-0.21862    ?     4.28543+-4.21308       ?
   richards                         x2       0.23855+-0.00981    !     0.25425+-0.00457       ! definitely 1.0658x slower
   splay                            x2       1.76973+-0.03330          1.76386+-0.00131       
   regexp                           x2     108.57046+-2.92004        108.21400+-3.05217       
   pdfjs                            x2     133.58375+-1.69675    ?   133.71515+-3.14950       ?
   mandreel                         x2     221.21532+-9.45257        212.11890+-8.92153         might be 1.0429x faster
   gbemu                            x2     187.78388+-32.53771       185.41156+-33.68148        might be 1.0128x faster
   closure                                   1.40434+-0.01542    ?     1.42800+-0.07583       ? might be 1.0169x slower
   jquery                                   19.49726+-0.19485    ?    19.54411+-0.21783       ?
   box2d                            x2      72.95358+-10.28726        69.47765+-4.32754         might be 1.0500x faster
   zlib                             x2    1250.08369+-392.99709     1199.90525+-342.64089       might be 1.0418x faster
   typescript                       x2    3081.99731+-30.42784      3008.99984+-58.76044        might be 1.0243x faster

   <geometric>                              22.48137+-0.20623         22.19825+-1.64422         might be 1.0128x faster

                                               SelectionDAG                FastISel                                     
Kraken:
   ai-astar                                  861.540+-191.400          853.498+-253.910       
   audio-beat-detection                      280.905+-28.173     ?     306.554+-30.146        ? might be 1.0913x slower
   audio-dft                                 757.085+-14.864     ?     766.236+-4.890         ? might be 1.0121x slower
   audio-fft                                 190.076+-5.799            178.430+-11.966          might be 1.0653x faster
   audio-oscillator                          497.191+-26.893           469.066+-9.328           might be 1.0600x faster
   imaging-darkroom                          309.485+-10.596           308.856+-8.454         
   imaging-desaturate                        163.043+-3.425      ^     152.891+-1.342         ^ definitely 1.0664x faster
   imaging-gaussian-blur                     230.257+-10.870           217.556+-10.133          might be 1.0584x faster
   json-parse-financial                      119.880+-2.326      ?     120.091+-0.324         ?
   json-stringify-tinderbox                  126.949+-3.036      ?     127.377+-2.736         ?
   stanford-crypto-aes                       170.821+-15.293           162.193+-13.150          might be 1.0532x faster
   stanford-crypto-ccm                       115.358+-6.060      ?     122.424+-47.992        ? might be 1.0612x slower
   stanford-crypto-pbkdf2                    438.124+-5.427      ?     438.525+-5.339         ?
   stanford-crypto-sha256-iterative          144.810+-1.107      ?     146.090+-1.826         ?

   <arithmetic>                              314.680+-19.245           312.128+-20.173          might be 1.0082x faster

                                               SelectionDAG                FastISel                                     
AsmBench:
   bigfib.cpp                              1614.4840+-87.2817        1478.4010+-247.9110        might be 1.0920x faster
   cray.c                                  1196.2056+-48.0512    ^   1109.1813+-35.6899       ^ definitely 1.0785x faster
   dry.c                                    848.3123+-20.4077    ?    849.0481+-18.3524       ?
   FloatMM.c                               1411.4184+-5.6955     ^   1398.0672+-1.5206        ^ definitely 1.0095x faster
   gcc-loops.cpp                           9130.2301+-126.0490   ^   8407.3656+-87.8810       ^ definitely 1.0860x faster
   n-body.c                                2686.2397+-6.8112         2674.9181+-7.1344        
   Quicksort.c                             1064.3299+-3.9816         1058.7861+-20.2545       
   stepanov_container.cpp                 12733.0667+-286.4764      12402.7603+-45.8994         might be 1.0266x faster
   Towers.c                                 560.9787+-8.8784          545.6500+-18.0831         might be 1.0281x faster

   <geometric>                             1962.1021+-8.6012     ^   1893.7075+-33.6412       ^ definitely 1.0361x faster

                                               SelectionDAG                FastISel                                     
Geomean of preferred means:
   <scaled-result>                          120.1578+-2.3493          118.9667+-1.9308          might be 1.0100x faster
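
The `<geometric>` rows summarize a suite by the geometric mean of its benchmarks, so a 10% swing counts the same on a 500 ms test as on a 9000 ms one. A short Python sketch using the SelectionDAG column of the ARM64 AsmBench table above; the script may average per sample rather than per benchmark mean, so agreement with the reported 1962.1021 is only expected to be approximate.

```python
from math import exp, log


def geometric_mean(values):
    """Geometric mean computed via logs, which avoids overflow on long products."""
    return exp(sum(log(v) for v in values) / len(values))


# SelectionDAG means (ms) from the ARM64 AsmBench table.
asmbench_selectiondag = [
    1614.4840, 1196.2056, 848.3123, 1411.4184, 9130.2301,
    2686.2397, 1064.3299, 12733.0667, 560.9787,
]
print(f"{geometric_mean(asmbench_selectiondag):.1f}")
```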
Comment 8 Filip Pizlo 2015-04-07 12:27:50 PDT
Ossy: this patch enables FastISel on iOS/ARM64.  I believe that it is broadly profitable on any ARM64 because:

- LLVM's FastISel backend generates code more quickly than the SelectionDAG backend.
- ARM64 is an easy-enough target to generate code for that SelectionDAG's throughput benefits don't materialize; the code from FastISel is good enough.
- This also switches to a simpler register allocator on ARM64, because ARM64 has so many registers.

The only downside of enabling it is that it requires a new-enough LLVM.  I believe that 3.6 is new enough.

I suspect it would be better to just enable this on all ARM64 and not have any PLATFORM(IOS) condition.

Are you OK with this?
Comment 9 Michael Saboff 2015-04-07 12:37:51 PDT
Comment on attachment 250279 [details]
Patch

r=me
Comment 10 Filip Pizlo 2015-04-07 12:42:34 PDT
Landed in http://trac.webkit.org/changeset/182483
Comment 11 Csaba Osztrogonác 2015-04-10 08:10:15 PDT
(In reply to comment #8)
> Ossy: this patch enables FastISel on iOS/ARM64.  I believe that it is
> broadly profitable on any ARM64 because:
> 
> - LLVM's FastISel backend generates code more quickly than the SelectionDAG
> backend.
> - ARM64 is an easy-enough target to generate code for that SelectionDAG's
> throughput benefits don't materialize; the code from FastISel is good enough.
> - This also switches to a simpler register allocator on ARM64, because ARM64
> has so many registers.
> 
> The only downside of enabling it is that it requires a new-enough LLVM.  I
> believe that 3.6 is new enough.
> 
> I suspect it would be better to just enable this on all ARM64 and not have
> any PLATFORM(IOS) condition.
> 
> Are you OK with this?

Thanks for the heads-up. I filed a new bug report for it: bug 143606.
(Unfortunately, we still have an issue blocking the bump to LLVM 3.6;
we should fix that first.)