Bug 145033

Summary: [JSC] Replace CheckTierUpInLoop by a simple loop counter increment
Product: WebKit Reporter: Benjamin Poulain <benjamin>
Component: New Bugs Assignee: Benjamin Poulain <benjamin>
Status: RESOLVED WONTFIX    
Severity: Normal CC: fpizlo
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch none

Description Benjamin Poulain 2015-05-14 18:47:10 PDT
[JSC] Replace CheckTierUpInLoop by a simple loop counter increment
Comment 1 Benjamin Poulain 2015-05-14 18:57:24 PDT
Created attachment 253166
Patch
Comment 2 Benjamin Poulain 2015-05-14 19:20:35 PDT
                                                          Conf#1                    Conf#2                                      
SunSpider:
   3d-cube                                            7.4707+-0.4057            7.4615+-0.3710        
   3d-morph                                           7.8777+-0.0660     ?      7.9604+-0.1467        ? might be 1.0105x slower
   3d-raytrace                                        8.6840+-0.1431     ?      8.8187+-0.2511        ? might be 1.0155x slower
   access-binary-trees                                2.9951+-0.1373            2.9417+-0.2488          might be 1.0182x faster
   access-fannkuch                                    8.3361+-0.3083     ?      8.4088+-0.2321        ?
   access-nbody                                       4.1312+-0.1774            4.1223+-0.2054        
   access-nsieve                                      4.5938+-0.3353     ?      4.6199+-0.2793        ?
   bitops-3bit-bits-in-byte                           1.9629+-0.1076            1.9415+-0.1058          might be 1.0110x faster
   bitops-bits-in-byte                                5.6274+-0.0605     ?      5.6606+-0.1273        ?
   bitops-bitwise-and                                 2.6995+-0.1815            2.6174+-0.2308          might be 1.0314x faster
   bitops-nsieve-bits                                 4.2758+-0.1182     ?      4.3358+-0.0522        ? might be 1.0140x slower
   controlflow-recursive                              2.9628+-0.1228            2.9005+-0.1873          might be 1.0215x faster
   crypto-aes                                         5.9450+-0.2088     ?      6.0028+-0.3464        ?
   crypto-md5                                         3.7909+-0.1596            3.6649+-0.1790          might be 1.0344x faster
   crypto-sha1                                        3.3301+-0.0992            3.2568+-0.1158          might be 1.0225x faster
   date-format-tofte                                 12.1987+-0.3280           12.1503+-0.3859        
   date-format-xparb                                  7.5950+-0.2782     ?      7.9877+-0.7983        ? might be 1.0517x slower
   math-cordic                                        4.2595+-0.0916     ?      4.2905+-0.1752        ?
   math-partial-sums                                  9.2545+-0.1239     ?      9.3163+-0.5185        ?
   math-spectral-norm                                 2.9158+-0.0423            2.8813+-0.1724          might be 1.0120x faster
   regexp-dna                                         9.7119+-0.4281     ?      9.7368+-0.2951        ?
   string-base64                                      6.3848+-0.3070            6.2654+-0.2394          might be 1.0191x faster
   string-fasta                                       9.2384+-0.3365            9.1397+-0.2983          might be 1.0108x faster
   string-tagcloud                                   12.7599+-0.4104           12.7151+-0.8563        
   string-unpack-code                                26.7627+-0.4047     ?     27.3229+-0.5700        ? might be 1.0209x slower
   string-validate-input                              6.9058+-0.2286            6.7432+-0.3538          might be 1.0241x faster

   <arithmetic>                                       7.0258+-0.0592     ?      7.0486+-0.0747        ? might be 1.0032x slower

                                                          Conf#1                    Conf#2                                      
LongSpider:
   3d-cube                                         1264.9675+-14.5417    ?   1270.1407+-11.5393       ?
   3d-morph                                        1874.7108+-8.3848     ?   1875.1567+-3.3507        ?
   3d-raytrace                                     1043.4266+-6.9934     ?   1047.0209+-13.2064       ?
   access-binary-trees                             1379.3623+-8.9715         1378.7266+-9.3519        
   access-fannkuch                                  467.4772+-32.4366         440.8781+-33.6757         might be 1.0603x faster
   access-nbody                                    1037.9371+-1.6282         1034.7320+-1.6044        
   access-nsieve                                    678.3118+-2.9731          675.6826+-6.0944        
   bitops-3bit-bits-in-byte                          50.8896+-0.8966     ?     51.0807+-0.4196        ?
   bitops-bits-in-byte                              344.3956+-2.9800          343.8072+-2.7501        
   bitops-nsieve-bits                               629.0178+-5.7104          627.8666+-4.4334        
   controlflow-recursive                            709.3497+-0.8398     ^    703.3580+-0.6315        ^ definitely 1.0085x faster
   crypto-aes                                       933.3588+-5.9685     ?    934.6479+-5.7318        ?
   crypto-md5                                       683.9178+-3.3079          682.2672+-8.9723        
   crypto-sha1                                      915.5109+-4.0541     ^    910.3347+-0.8222        ^ definitely 1.0057x faster
   date-format-tofte                                993.1617+-43.1704         984.2944+-2.6621        
   date-format-xparb                                989.0282+-4.2928     !   1047.1752+-32.4707       ! definitely 1.0588x slower
   math-cordic                                      672.9670+-1.6293          672.7302+-1.1759        
   math-partial-sums                               1068.0108+-7.1082         1066.5082+-4.1469        
   math-spectral-norm                              1072.4953+-0.9029     ^   1064.2513+-1.7631        ^ definitely 1.0077x faster
   string-base64                                    488.1353+-4.0599          486.6469+-3.2855        
   string-fasta                                     604.4690+-7.8138          600.0031+-5.9961        
   string-tagcloud                                  276.4984+-1.6085     ?    277.2037+-4.9739        ?

   <geometric>                                      693.3342+-1.1845          691.9971+-2.2918          might be 1.0019x faster
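
Reading aid for the annotation column: a minimal sketch of how the "might be" / "definitely" remarks in these tables can be reproduced from the two mean+-error pairs on a row. It assumes (this is not stated in the output itself) that the ratio is the larger mean divided by the smaller, that lower times are better, and that "definitely" corresponds to the two intervals not overlapping. The function name compare is hypothetical. Rows whose means are very close carry no remark in the output above; the sketch does not model that cutoff.

```python
def compare(mean1, err1, mean2, err2):
    """Describe Conf#2 relative to Conf#1 for timings given as mean +- error."""
    # Ratio is reported as >= 1: the larger mean divided by the smaller one.
    ratio = max(mean1, mean2) / min(mean1, mean2)
    # Lower time is better, so a smaller Conf#2 mean reads as "faster".
    direction = "faster" if mean2 < mean1 else "slower"
    # Assumption: "definitely" means the intervals [mean-err, mean+err] are disjoint.
    overlap = (mean1 - err1) <= (mean2 + err2) and (mean2 - err2) <= (mean1 + err1)
    certainty = "might be" if overlap else "definitely"
    return f"{certainty} {ratio:.4f}x {direction}"


# Values copied from rows in this comment.
print(compare(709.3497, 0.8398, 703.3580, 0.6315))    # LongSpider controlflow-recursive
print(compare(989.0282, 4.2928, 1047.1752, 32.4707))  # LongSpider date-format-xparb
print(compare(7.8777, 0.0660, 7.9604, 0.1467))        # SunSpider 3d-morph
```

With the three rows above, the sketch yields the same remarks as the table, which suggests the stated assumptions match the tool's behaviour for these cases.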

                                                          Conf#1                    Conf#2                                      
V8Spider:
   crypto                                            72.8729+-0.9843     ?     72.9085+-1.9642        ?
   deltablue                                         90.6683+-2.2873           89.4022+-3.5802          might be 1.0142x faster
   earley-boyer                                      63.6345+-0.6856           62.7613+-0.9145          might be 1.0139x faster
   raytrace                                          41.5044+-0.7309           40.7682+-2.4604          might be 1.0181x faster
   regexp                                           110.4010+-1.4171     ?    110.6057+-0.7042        ?
   richards                                          99.0670+-2.1324     ?    100.8824+-0.9996        ? might be 1.0183x slower
   splay                                             53.2233+-3.1016     ?     53.5613+-2.9946        ?

   <geometric>                                       72.1240+-0.9937           71.9214+-0.5743          might be 1.0028x faster

                                                          Conf#1                    Conf#2                                      
Octane:
   encrypt                                           0.30771+-0.00039    ?     0.30791+-0.00141       ?
   decrypt                                           5.65878+-0.01212    ?     5.67509+-0.03465       ?
   deltablue                                x2       0.26576+-0.00961    ?     0.27599+-0.01137       ? might be 1.0385x slower
   earley                                            0.64607+-0.00428          0.64506+-0.00307       
   boyer                                            10.44214+-0.11440         10.42770+-0.01663       
   navier-stokes                            x2       6.39080+-0.01134          6.38477+-0.00263       
   raytrace                                 x2       1.80358+-0.02352          1.72946+-0.07059         might be 1.0429x faster
   richards                                 x2       0.16658+-0.00238          0.16651+-0.00465       
   splay                                    x2       0.54419+-0.00945    ?     0.54690+-0.00357       ?
   regexp                                   x2      42.31243+-0.32199    ?    42.37793+-0.22416       ?
   pdfjs                                    x2      57.94230+-0.27754    ?    58.67918+-0.61763       ? might be 1.0127x slower
   mandreel                                 x2      69.69720+-0.39394         69.69618+-0.57666       
   gbemu                                    x2      59.71787+-0.58793    ?    59.81490+-0.35447       ?
   closure                                           0.75903+-0.00247          0.75559+-0.00792       
   jquery                                            9.45824+-0.04574          9.44228+-0.04492       
   box2d                                    x2      17.38628+-0.08540    ?    17.40287+-0.04921       ?
   zlib                                     x2     541.84515+-2.49419    ?   541.91532+-10.15354      ?
   typescript                               x2    1060.73999+-14.89385   ?  1074.58350+-36.70791      ? might be 1.0131x slower

   <geometric>                                       9.14094+-0.02379    ?     9.15677+-0.04845       ? might be 1.0017x slower

                                                          Conf#1                    Conf#2                                      
Kraken:
   ai-astar                                          480.865+-13.059     ^     346.508+-1.128         ^ definitely 1.3877x faster
   audio-beat-detection                              148.723+-2.109            147.436+-1.729         
   audio-dft                                         144.299+-4.111            142.467+-4.217           might be 1.0129x faster
   audio-fft                                         111.296+-1.604            111.105+-0.477         
   audio-oscillator                                   96.926+-0.081             96.841+-0.434         
   imaging-darkroom                                  139.427+-0.128      ?     139.544+-0.266         ?
   imaging-desaturate                                 94.574+-0.574             94.463+-0.179         
   imaging-gaussian-blur                             151.210+-13.586           147.299+-0.354           might be 1.0266x faster
   json-parse-financial                               66.614+-1.797             65.487+-0.142           might be 1.0172x faster
   json-stringify-tinderbox                           43.918+-0.723             43.573+-0.503         
   stanford-crypto-aes                                82.129+-1.979             82.088+-0.670         
   stanford-crypto-ccm                                67.121+-3.681      ?      68.066+-1.594         ? might be 1.0141x slower
   stanford-crypto-pbkdf2                            159.249+-2.410      ?     159.904+-1.132         ?
   stanford-crypto-sha256-iterative                   64.624+-1.677             64.119+-1.667         

   <arithmetic>                                      132.212+-0.703      ^     122.064+-0.573         ^ definitely 1.0831x faster

                                                          Conf#1                    Conf#2                                      
JSRegress:
   abs-boolean                                        3.7354+-0.1124            3.7336+-0.1152        
   adapt-to-double-divide                            17.9088+-0.4614     ?     18.0990+-0.4811        ? might be 1.0106x slower
   aliased-arguments-getbyval                         1.6483+-0.2354            1.6237+-0.2036          might be 1.0151x faster
   allocate-big-object                                3.8594+-0.5019            3.6978+-0.2010          might be 1.0437x faster
   arguments-named-and-reflective                    13.9742+-0.3046     ?     13.9869+-0.2023        ?
   arguments-out-of-bounds                           17.9718+-0.5186           17.9347+-0.5703        
   arguments-strict-mode                             11.9040+-0.7339     ?     12.0513+-1.0123        ? might be 1.0124x slower
   arguments                                         10.8785+-0.3452           10.7875+-0.1701        
   arity-mismatch-inlining                            1.1722+-0.1698            1.1370+-0.1150          might be 1.0310x faster
   array-access-polymorphic-structure                 9.4415+-0.5891            9.3060+-0.5798          might be 1.0146x faster
   array-nonarray-polymorhpic-access                 49.1620+-5.6189     ^     42.4359+-0.6589        ^ definitely 1.1585x faster
   array-prototype-every                            117.9456+-1.9301     ?    120.2440+-3.9798        ? might be 1.0195x slower
   array-prototype-forEach                          115.9567+-3.9929     ?    116.9478+-1.2063        ?
   array-prototype-map                              128.7217+-3.2365          128.6006+-4.2198        
   array-prototype-some                             116.1681+-3.1899     ?    118.4347+-4.9776        ? might be 1.0195x slower
   array-splice-contiguous                           57.2451+-0.5375     ?     58.0010+-1.2550        ? might be 1.0132x slower
   array-with-double-add                              5.6252+-0.0750     ?      5.6437+-0.2313        ?
   array-with-double-increment                        4.1425+-0.1745     ?      4.1430+-0.0818        ?
   array-with-double-mul-add                          8.0113+-0.2433     ^      7.1430+-0.1444        ^ definitely 1.1216x faster
   array-with-double-sum                              4.3774+-0.0644     ?      4.4006+-0.0939        ?
   array-with-int32-add-sub                           9.5967+-0.3844            9.5799+-0.2351        
   array-with-int32-or-double-sum                     4.4252+-0.1817     ?      4.4863+-0.1670        ? might be 1.0138x slower
   ArrayBuffer-DataView-alloc-large-long-lived   
                                                     47.2195+-2.3805     ?     47.2685+-1.2503        ?
   ArrayBuffer-DataView-alloc-long-lived             18.7047+-0.5289           18.5272+-0.4265        
   ArrayBuffer-Int32Array-byteOffset                  5.3105+-0.1693            5.2238+-0.1623          might be 1.0166x faster
   ArrayBuffer-Int8Array-alloc-large-long-lived   
                                                     50.8057+-1.4485     ?     51.2613+-4.5309        ?
   ArrayBuffer-Int8Array-alloc-long-lived-buffer   
                                                     31.6698+-1.1597           31.4547+-0.9442        
   ArrayBuffer-Int8Array-alloc-long-lived            17.7678+-0.2917     ?     17.8988+-0.5499        ?
   ArrayBuffer-Int8Array-alloc                       15.0102+-0.4117           14.9760+-0.4665        
   asmjs_bool_bug                                     8.5575+-0.2780            8.4832+-0.1136        
   assign-custom-setter-polymorphic                   4.3235+-0.1596            4.2392+-0.2457          might be 1.0199x faster
   assign-custom-setter                               5.8701+-0.4519            5.8267+-0.2777        
   basic-set                                         11.3942+-0.7292     ?     12.2148+-0.7618        ? might be 1.0720x slower
   big-int-mul                                        5.8883+-0.4044     ?      5.9630+-0.3664        ? might be 1.0127x slower
   boolean-test                                       4.2051+-0.1700     ?      4.2776+-0.0992        ? might be 1.0172x slower
   branch-fold                                        4.7303+-0.1274     ?      4.7495+-0.0645        ?
   branch-on-string-as-boolean                       22.8486+-1.1200     ?     23.4976+-0.5926        ? might be 1.0284x slower
   by-val-generic                                    10.9012+-0.7251           10.5931+-0.1982          might be 1.0291x faster
   call-spread-apply                                 39.1586+-1.0481           38.2603+-1.0697          might be 1.0235x faster
   call-spread-call                                  32.6278+-2.0645           31.8426+-1.1482          might be 1.0247x faster
   captured-assignments                               0.6555+-0.0550     ?      0.6631+-0.1214        ? might be 1.0115x slower
   cast-int-to-double                                 8.2314+-0.3471            8.1517+-0.3530        
   cell-argument                                      9.7772+-0.1374     ?     10.0825+-0.4013        ? might be 1.0312x slower
   cfg-simplify                                       3.7693+-0.1229     ?      3.8120+-0.4283        ? might be 1.0113x slower
   chain-getter-access                               11.4783+-0.3865           11.3087+-0.2906          might be 1.0150x faster
   cmpeq-obj-to-obj-other                            12.4398+-0.3456           12.2455+-0.3966          might be 1.0159x faster
   constant-test                                      7.8868+-0.3202     ?      8.1558+-0.8399        ? might be 1.0341x slower
   create-lots-of-functions                          16.8372+-0.3507           16.6919+-0.6568        
   DataView-custom-properties                        53.3099+-1.0154           53.1384+-1.6919        
   deconstructing-parameters-overridden-by-function   
                                                      0.7389+-0.1572     ?      0.7487+-0.1475        ? might be 1.0132x slower
   delay-tear-off-arguments-strictmode               18.6036+-0.2330           18.5721+-0.2620        
   deltablue-varargs                                264.9066+-1.4691     ?    265.7395+-0.9195        ?
   destructuring-arguments                           22.9165+-0.4592           22.7300+-0.4893        
   destructuring-swap                                 7.8647+-0.1994            7.8244+-0.1384        
   direct-arguments-getbyval                          1.7355+-0.1801            1.6542+-0.2355          might be 1.0491x faster
   div-boolean-double                                 5.5276+-0.1048            5.5096+-0.0837        
   div-boolean                                        9.8295+-0.2676            9.7826+-0.2618        
   double-get-by-val-out-of-bounds                    5.9681+-0.0690     !      6.2098+-0.0437        ! definitely 1.0405x slower
   double-pollution-getbyval                          9.7162+-0.2979            9.7067+-0.2830        
   double-pollution-putbyoffset                       5.7421+-0.2066     ?      5.7944+-0.1793        ?
   double-to-int32-typed-array-no-inline              2.9047+-0.0911     ?      2.9411+-0.0890        ? might be 1.0125x slower
   double-to-int32-typed-array                        2.5522+-0.0992            2.5363+-0.1876        
   double-to-uint32-typed-array-no-inline             2.9568+-0.2207     ?      2.9908+-0.2095        ? might be 1.0115x slower
   double-to-uint32-typed-array                       2.7245+-0.2211            2.6095+-0.1889          might be 1.0441x faster
   elidable-new-object-dag                           55.2192+-0.6424     ?     55.9133+-2.3547        ? might be 1.0126x slower
   elidable-new-object-roflcopter                    58.2345+-1.7137           57.3688+-1.4217          might be 1.0151x faster
   elidable-new-object-then-call                     53.2241+-4.1455           51.3990+-1.6429          might be 1.0355x faster
   elidable-new-object-tree                          65.1043+-1.5141           65.0966+-0.7834        
   empty-string-plus-int                              7.4400+-0.4983            7.4388+-0.1947        
   emscripten-cube2hash                              44.6530+-0.9146           44.6135+-0.9454        
   exit-length-on-plain-object                       18.6757+-0.2625           18.6025+-0.6874        
   external-arguments-getbyval                        1.7012+-0.0751     ?      1.7025+-0.3491        ?
   external-arguments-putbyval                        3.2766+-0.0866            3.1896+-0.1435          might be 1.0273x faster
   fixed-typed-array-storage-var-index                1.6125+-0.0744            1.5928+-0.1187          might be 1.0124x faster
   fixed-typed-array-storage                          1.2548+-0.0921            1.2421+-0.0671          might be 1.0102x faster
   Float32Array-matrix-mult                           6.1805+-0.4285            6.0610+-0.7692          might be 1.0197x faster
   Float32Array-to-Float64Array-set                  71.4550+-2.8414     ?     73.2407+-4.2405        ? might be 1.0250x slower
   Float64Array-alloc-long-lived                     91.5484+-0.4592           91.5453+-0.5677        
   Float64Array-to-Int16Array-set                    97.3175+-2.5804     ^     92.0966+-2.1509        ^ definitely 1.0567x faster
   fold-double-to-int                                19.3442+-0.3550           19.2087+-0.0555        
   fold-get-by-id-to-multi-get-by-offset-rare-int   
                                                     10.9012+-0.4192     ?     11.3011+-0.3894        ? might be 1.0367x slower
   fold-get-by-id-to-multi-get-by-offset              9.1507+-0.5909     ?      9.2895+-0.4115        ? might be 1.0152x slower
   fold-multi-get-by-offset-to-get-by-offset   
                                                      8.6567+-0.0755            7.8522+-0.9913          might be 1.1024x faster
   fold-multi-get-by-offset-to-poly-get-by-offset   
                                                      8.5142+-0.1735            8.3389+-0.6098          might be 1.0210x faster
   fold-multi-put-by-offset-to-poly-put-by-offset   
                                                      7.8245+-0.6984            7.7305+-0.6295          might be 1.0122x faster
   fold-multi-put-by-offset-to-put-by-offset   
                                                      6.3921+-0.7609     ?      6.4286+-0.7042        ?
   fold-multi-put-by-offset-to-replace-or-transition-put-by-offset   
                                                     13.1448+-0.6545           12.9755+-1.3580          might be 1.0130x faster
   fold-put-by-id-to-multi-put-by-offset              8.6461+-0.7468     ?      9.1091+-0.9770        ? might be 1.0536x slower
   fold-put-structure                                 6.2933+-0.6352     ?      6.6877+-0.8031        ? might be 1.0627x slower
   for-of-iterate-array-entries                       6.4566+-0.5069            6.3012+-0.0970          might be 1.0247x faster
   for-of-iterate-array-keys                          5.0457+-0.3531            4.9709+-0.1012          might be 1.0150x faster
   for-of-iterate-array-values                        4.7958+-0.2162     ?      4.8367+-0.0546        ?
   fround                                            23.4052+-1.6956     ?     23.5544+-0.7597        ?
   ftl-library-inlining-dataview                     89.0478+-0.4226     !     98.8783+-5.7649        ! definitely 1.1104x slower
   ftl-library-inlining                              87.7807+-9.6760           87.6375+-10.4341       
   function-dot-apply                                 2.6345+-0.2214     ?      2.7529+-0.0918        ? might be 1.0449x slower
   function-test                                      4.3090+-0.1759     ?      4.3618+-0.1059        ? might be 1.0123x slower
   function-with-eval                               134.2858+-0.6240          133.8984+-1.9662        
   gcse-poly-get-less-obvious                        24.7490+-0.1915           24.6497+-0.5340        
   gcse-poly-get                                     24.7110+-0.3604     ?     24.7255+-0.3128        ?
   gcse                                               6.4962+-0.1383     ?      6.5367+-0.1382        ?
   get-by-id-bimorphic-check-structure-elimination-simple   
                                                      3.3018+-0.1129            3.2977+-0.1104        
   get-by-id-bimorphic-check-structure-elimination   
                                                      8.1526+-0.2777     ?      8.3668+-0.3082        ? might be 1.0263x slower
   get-by-id-chain-from-try-block                    11.4468+-0.0817     ?     11.4684+-0.4147        ?
   get-by-id-check-structure-elimination              7.4720+-0.2873            7.4125+-0.1387        
   get-by-id-proto-or-self                           24.1395+-0.7583           23.1938+-1.9313          might be 1.0408x faster
   get-by-id-quadmorphic-check-structure-elimination-simple   
                                                      3.9338+-0.0810            3.9240+-0.1309        
   get-by-id-self-or-proto                           24.2550+-0.2918           23.3732+-2.4077          might be 1.0377x faster
   get-by-val-out-of-bounds                           6.1700+-0.4698            6.0308+-0.1994          might be 1.0231x faster
   get_callee_monomorphic                             3.7901+-0.1907     ?      4.1425+-0.8026        ? might be 1.0930x slower
   get_callee_polymorphic                             4.8707+-0.3796            4.6165+-0.3908          might be 1.0551x faster
   getter-no-activation                               5.7663+-0.0724     ?      5.8325+-0.0473        ? might be 1.0115x slower
   getter-richards                                  131.9205+-2.8582     ?    133.1222+-2.6337        ?
   getter                                             7.1388+-0.0775     ?      7.1577+-0.1050        ?
   global-var-const-infer-fire-from-opt               1.3614+-0.1702            1.1968+-0.2915          might be 1.1375x faster
   global-var-const-infer                             1.1533+-0.1505     ?      1.1868+-0.1829        ? might be 1.0291x slower
   HashMap-put-get-iterate-keys                      33.3495+-0.7736     ?     33.5013+-0.9209        ?
   HashMap-put-get-iterate                           32.4583+-0.2898     ?     32.8713+-0.6172        ? might be 1.0127x slower
   HashMap-string-put-get-iterate                    36.3760+-2.9205           35.7398+-1.7104          might be 1.0178x faster
   hoist-make-rope                                   13.7410+-1.3871     ?     13.9276+-1.6943        ? might be 1.0136x slower
   hoist-poly-check-structure-effectful-loop   
                                                      6.8183+-0.1637     ?      6.9527+-0.0592        ? might be 1.0197x slower
   hoist-poly-check-structure                         4.8292+-0.1497     ?      4.8901+-0.0976        ? might be 1.0126x slower
   imul-double-only                                  10.0575+-0.5030     ?     10.4946+-0.4827        ? might be 1.0435x slower
   imul-int-only                                     11.7420+-1.1632     ?     12.0264+-0.8468        ? might be 1.0242x slower
   imul-mixed                                         9.4063+-0.7483     ?      9.8138+-0.1856        ? might be 1.0433x slower
   in-four-cases                                     25.4580+-0.4509     ^     24.6800+-0.0682        ^ definitely 1.0315x faster
   in-one-case-false                                 13.2842+-0.3122     ^     12.3500+-0.2523        ^ definitely 1.0756x faster
   in-one-case-true                                  13.5540+-0.8814     ^     12.3245+-0.1821        ^ definitely 1.0998x faster
   in-two-cases                                      13.5930+-0.2615     ^     13.0262+-0.1460        ^ definitely 1.0435x faster
   indexed-properties-in-objects                      3.7938+-0.1305            3.7905+-0.1832        
   infer-closure-const-then-mov-no-inline             4.6672+-0.1272            4.6385+-0.1480        
   infer-closure-const-then-mov                      24.1671+-0.6298     ?     24.9845+-1.0177        ? might be 1.0338x slower
   infer-closure-const-then-put-to-scope-no-inline   
                                                     14.6125+-0.4296           14.5165+-0.3437        
   infer-closure-const-then-put-to-scope             27.8470+-0.3234     ?     28.2328+-0.6418        ? might be 1.0139x slower
   infer-closure-const-then-reenter-no-inline   
                                                     61.6763+-1.2928           61.5002+-0.5825        
   infer-closure-const-then-reenter                  29.5238+-0.1966     ?     29.6221+-0.7504        ?
   infer-constant-global-property                     4.7191+-0.1404     ?      4.7219+-0.1943        ?
   infer-constant-property                            3.2610+-0.1254     ?      3.2627+-0.0736        ?
   infer-one-time-closure-ten-vars                   14.6371+-0.1324           14.4320+-0.1978          might be 1.0142x faster
   infer-one-time-closure-two-vars                   14.4709+-0.1812           14.4537+-0.7159        
   infer-one-time-closure                            14.0451+-0.3671           14.0165+-0.4598        
   infer-one-time-deep-closure                       24.4062+-0.7046     ?     24.9091+-1.0716        ? might be 1.0206x slower
   inline-arguments-access                            6.1610+-0.2448            6.0185+-0.1270          might be 1.0237x faster
   inline-arguments-aliased-access                    6.1627+-0.3095     ?      6.2526+-0.2820        ? might be 1.0146x slower
   inline-arguments-local-escape                      6.3468+-0.3907            6.1014+-0.1353          might be 1.0402x faster
   inline-get-scoped-var                              5.7005+-0.0642            5.6773+-0.0167        
   inlined-put-by-id-transition                      14.8317+-0.4942     ?     16.0319+-2.0715        ? might be 1.0809x slower
   int-or-other-abs-then-get-by-val                   6.7510+-0.2141     ?      6.7677+-0.1003        ?
   int-or-other-abs-zero-then-get-by-val             28.6926+-0.9126           27.2861+-1.5747          might be 1.0515x faster
   int-or-other-add-then-get-by-val                   5.3757+-0.0448     !      5.5343+-0.0605        ! definitely 1.0295x slower
   int-or-other-add                                   7.9206+-0.2507     ?      7.9795+-0.0917        ?
   int-or-other-div-then-get-by-val                   5.2422+-0.1052            5.1876+-0.1171          might be 1.0105x faster
   int-or-other-max-then-get-by-val                   5.1267+-0.1420     ?      5.1340+-0.1042        ?
   int-or-other-min-then-get-by-val                   5.0787+-0.1248            5.0181+-0.1896          might be 1.0121x faster
   int-or-other-mod-then-get-by-val                   4.9977+-0.0850            4.9056+-0.1972          might be 1.0188x faster
   int-or-other-mul-then-get-by-val                   4.9427+-0.1544            4.8622+-0.1140          might be 1.0166x faster
   int-or-other-neg-then-get-by-val                   6.0675+-0.1224            6.0459+-0.1362        
   int-or-other-neg-zero-then-get-by-val             28.1220+-0.4628     ^     26.9724+-0.1718        ^ definitely 1.0426x faster
   int-or-other-sub-then-get-by-val                   5.8755+-0.1251     ?      6.0551+-0.3752        ? might be 1.0306x slower
   int-or-other-sub                                   4.2156+-0.1210     ?      4.2291+-0.1669        ?
   int-overflow-local                                 5.6610+-0.1946     ?      5.6998+-0.0919        ?
   Int16Array-alloc-long-lived                       62.9570+-1.0537           62.5540+-0.5452        
   Int16Array-bubble-sort-with-byteLength            37.7992+-0.5534           37.4625+-0.5083        
   Int16Array-bubble-sort                            36.8207+-0.2083     ?     37.4307+-0.7676        ? might be 1.0166x slower
   Int16Array-load-int-mul                            1.9893+-0.1213     ?      1.9919+-0.0596        ?
   Int16Array-to-Int32Array-set                      72.5517+-1.7964           70.4373+-2.6397          might be 1.0300x faster
   Int32Array-alloc-large                            32.9401+-1.2831     ?     33.3011+-1.2011        ? might be 1.0110x slower
   Int32Array-alloc-long-lived                       70.0492+-1.6979     ?     70.1092+-1.3616        ?
   Int32Array-alloc                                   4.5004+-0.2771     ?      4.5442+-0.0933        ?
   Int32Array-Int8Array-view-alloc                    9.8594+-0.3520            9.6715+-0.2971          might be 1.0194x faster
   int52-spill                                        7.9307+-0.1874     ?      8.0980+-0.4436        ? might be 1.0211x slower
   Int8Array-alloc-long-lived                        56.1036+-0.7580     ?     57.0907+-1.4535        ? might be 1.0176x slower
   Int8Array-load-with-byteLength                     4.7528+-0.0529            4.7130+-0.1295        
   Int8Array-load                                     4.6738+-0.0737     ?      4.6995+-0.0559        ?
   integer-divide                                    13.3618+-0.5228     ?     13.4872+-0.2106        ?
   integer-modulo                                     2.7648+-0.1156     ?      2.8134+-0.2588        ? might be 1.0176x slower
   is-boolean-fold-tricky                             5.7588+-0.0865            5.7588+-0.1018        
   is-boolean-fold                                    3.9911+-0.0909     ?      3.9943+-0.0914        ?
   is-function-fold-tricky-internal-function   
                                                     14.6224+-0.3116     ?     14.8000+-1.1129        ? might be 1.0122x slower
   is-function-fold-tricky                            5.8203+-0.2405     ?      5.9265+-0.1826        ? might be 1.0183x slower
   is-function-fold                                   4.0134+-0.0888     ?      4.0175+-0.1103        ?
   is-number-fold-tricky                              5.7798+-0.1598     ?      5.7875+-0.1161        ?
   is-number-fold                                     3.9615+-0.1693     ?      3.9907+-0.1603        ?
   is-object-or-null-fold-functions                   3.9996+-0.1044     ?      4.0066+-0.1377        ?
   is-object-or-null-fold-less-tricky                 5.8615+-0.1381     ?      5.9565+-0.0341        ? might be 1.0162x slower
   is-object-or-null-fold-tricky                      7.4613+-0.2196     ?      7.5546+-0.2615        ? might be 1.0125x slower
   is-object-or-null-fold                             4.1080+-0.1057            3.9886+-0.1016          might be 1.0299x faster
   is-object-or-null-trickier-function                5.9871+-0.1939            5.8862+-0.2492          might be 1.0172x faster
   is-object-or-null-trickier-internal-function   
                                                     15.9462+-1.6058           15.4556+-0.9629          might be 1.0317x faster
   is-object-or-null-tricky-function                  5.8688+-0.1256     ?      5.9645+-0.0827        ? might be 1.0163x slower
   is-object-or-null-tricky-internal-function   
                                                     11.7362+-1.0414           11.3077+-0.0767          might be 1.0379x faster
   is-string-fold-tricky                              5.7340+-0.1490            5.6766+-0.2483          might be 1.0101x faster
   is-string-fold                                     4.0060+-0.2547            3.9756+-0.0985        
   is-undefined-fold-tricky                           4.6666+-0.4316            4.5132+-0.1887          might be 1.0340x faster
   is-undefined-fold                                  3.8989+-0.0775     ?      3.9800+-0.1111        ? might be 1.0208x slower
   large-int-captured                                 6.7093+-0.2458            6.6517+-0.1017        
   large-int-neg                                     20.7574+-0.5073           20.6547+-0.2463        
   large-int                                         18.6873+-0.2688     ?     18.8534+-0.1886        ?
   logical-not                                        6.0782+-0.2403            6.0688+-0.1620        
   lots-of-fields                                    19.4328+-0.0876           19.3887+-0.5395        
   make-indexed-storage                               4.2186+-0.2934     ?      4.3051+-0.5265        ? might be 1.0205x slower
   make-rope-cse                                      6.6907+-0.2630            6.6185+-0.2298          might be 1.0109x faster
   marsaglia-larger-ints                             55.1337+-0.9828           54.9319+-0.5453        
   marsaglia-osr-entry                               27.6518+-0.7756     ?     28.2576+-0.8517        ? might be 1.0219x slower
   max-boolean                                        3.4627+-0.2215            3.3973+-0.2011          might be 1.0193x faster
   method-on-number                                  24.6814+-1.4586           23.4730+-1.0759          might be 1.0515x faster
   min-boolean                                        3.3266+-0.2590     ?      3.3333+-0.2826        ?
   minus-boolean-double                               4.1248+-0.0632     !      4.2524+-0.0433        ! definitely 1.0310x slower
   minus-boolean                                      3.2783+-0.0802            3.2256+-0.0689          might be 1.0163x faster
   misc-strict-eq                                    57.1432+-1.5033           57.0510+-2.0406        
   mod-boolean-double                                11.7264+-0.2751           11.6060+-0.3073          might be 1.0104x faster
   mod-boolean                                        8.6642+-0.2044     ?      8.7730+-0.1388        ? might be 1.0126x slower
   mul-boolean-double                                 4.9161+-0.1126            4.8929+-0.1839        
   mul-boolean                                        3.5718+-0.1086            3.4843+-0.0846          might be 1.0251x faster
   neg-boolean                                        4.3804+-0.1368            4.3528+-0.1369        
   negative-zero-divide                               0.4441+-0.0167     ?      0.5352+-0.1663        ? might be 1.2052x slower
   negative-zero-modulo                               0.5046+-0.1239     ?      0.5212+-0.0886        ? might be 1.0330x slower
   negative-zero-negate                               0.4233+-0.0158     ?      0.4294+-0.0227        ? might be 1.0146x slower
   nested-function-parsing                           58.7930+-0.4188     ?     58.9720+-0.4573        ?
   new-array-buffer-dead                            139.8837+-1.8754     ?    140.7181+-0.9710        ?
   new-array-buffer-push                              9.7011+-0.4574     ?      9.7664+-0.6850        ?
   new-array-dead                                    22.0250+-2.5260     ?     22.2811+-2.4747        ? might be 1.0116x slower
   new-array-push                                     5.2936+-0.2830     ?      5.3692+-0.1768        ? might be 1.0143x slower
   no-inline-constructor                            162.4305+-1.2846          162.2382+-0.2116        
   number-test                                        4.1615+-0.1387     ?      4.2358+-0.0958        ? might be 1.0179x slower
   object-closure-call                                7.3557+-0.0774            7.2430+-0.2931          might be 1.0156x faster
   object-test                                        4.2936+-0.4330            4.2451+-0.0755          might be 1.0114x faster
   obvious-sink-pathology-taken                     174.1981+-2.3322     ?    174.2998+-1.5224        ?
   obvious-sink-pathology                           158.3652+-2.0818     ?    161.2597+-1.8590        ? might be 1.0183x slower
   obviously-elidable-new-object                     48.6956+-5.5261           46.2335+-4.6435          might be 1.0533x faster
   plus-boolean-arith                                 3.3117+-0.1108            3.2820+-0.0839        
   plus-boolean-double                                4.2551+-0.1092            4.1733+-0.1107          might be 1.0196x faster
   plus-boolean                                       3.2573+-0.0954            3.1736+-0.1297          might be 1.0264x faster
   poly-chain-access-different-prototypes-simple   
                                                      3.8975+-0.1685     ?      3.9359+-0.1681        ?
   poly-chain-access-different-prototypes             3.1844+-0.0872            3.1344+-0.0857          might be 1.0159x faster
   poly-chain-access-simpler                          3.8538+-0.2004            3.8352+-0.1121        
   poly-chain-access                                  3.0751+-0.1496     ?      3.1647+-0.2780        ? might be 1.0291x slower
   poly-stricteq                                     70.6735+-0.7706           70.0264+-0.9420        
   polymorphic-array-call                             1.9133+-0.1339            1.6375+-0.2691          might be 1.1685x faster
   polymorphic-get-by-id                              4.1060+-0.1083     ?      4.1573+-0.2088        ? might be 1.0125x slower
   polymorphic-put-by-id                             39.9557+-5.3225     ?     40.2709+-5.6240        ?
   polymorphic-structure                             17.6393+-0.2389     ?     17.8649+-0.3210        ? might be 1.0128x slower
   polyvariant-monomorphic-get-by-id                 11.5646+-0.1480           11.5195+-0.1883        
   proto-getter-access                               11.4307+-0.7228     ?     11.5670+-0.6722        ? might be 1.0119x slower
   put-by-id-replace-and-transition                  12.6222+-0.4676           12.5017+-0.2799        
   put-by-id-slightly-polymorphic                     3.4604+-0.1185     ?      3.5508+-0.1130        ? might be 1.0261x slower
   put-by-id                                         18.4367+-0.5186     ?     19.3102+-1.2108        ? might be 1.0474x slower
   put-by-val-direct                                  0.5443+-0.0972            0.5408+-0.1641        
   put-by-val-large-index-blank-indexing-type   
                                                      8.6073+-0.3964     ?      8.7965+-0.4133        ? might be 1.0220x slower
   put-by-val-machine-int                             3.5677+-0.2940     ?      3.5931+-0.1475        ?
   rare-osr-exit-on-local                            18.5121+-0.3894           18.4775+-0.3708        
   register-pressure-from-osr                        26.0280+-0.7566           25.6618+-0.2595          might be 1.0143x faster
   setter                                             6.4290+-0.0868     ?      6.5334+-0.0949        ? might be 1.0163x slower
   simple-activation-demo                            31.6697+-0.5580           30.6970+-0.5398          might be 1.0317x faster
   simple-getter-access                              15.4708+-1.3174           15.3482+-0.4950        
   simple-poly-call-nested                           10.0801+-0.1844           10.0253+-0.1489        
   simple-poly-call                                   1.6707+-0.0372     ?      1.7593+-0.1497        ? might be 1.0530x slower
   sin-boolean                                       24.2405+-4.0506           22.1197+-0.7820          might be 1.0959x faster
   singleton-scope                                   80.6666+-0.5353           80.4058+-1.2139        
   sink-function                                     14.3201+-1.6553     ?     15.4960+-1.8281        ? might be 1.0821x slower
   sinkable-new-object-dag                           95.6910+-1.3128     ?     95.8181+-0.7689        ?
   sinkable-new-object-taken                         69.7025+-2.4355     ?     71.1367+-4.5821        ? might be 1.0206x slower
   sinkable-new-object                               52.4555+-5.0395     ?     53.6763+-1.6764        ? might be 1.0233x slower
   slow-array-profile-convergence                     4.2880+-0.5439            3.9835+-0.2412          might be 1.0764x faster
   slow-convergence                                   3.8136+-0.1367     ?      3.8790+-0.2788        ? might be 1.0171x slower
   sorting-benchmark                                 26.4373+-0.8953     ?     26.7104+-1.3317        ? might be 1.0103x slower
   sparse-conditional                                 1.6038+-0.0730            1.5632+-0.1722          might be 1.0260x faster
   splice-to-remove                                  20.8183+-0.7855     ?     21.0294+-1.0120        ? might be 1.0101x slower
   string-char-code-at                               19.6927+-1.0323           19.3468+-0.0820          might be 1.0179x faster
   string-concat-object                               3.3577+-0.3182     ?      3.4168+-0.2870        ? might be 1.0176x slower
   string-concat-pair-object                          3.3210+-0.3337            3.2933+-0.4373        
   string-concat-pair-simple                         17.2225+-0.4418           16.7405+-0.5364          might be 1.0288x faster
   string-concat-simple                              17.0902+-0.4441     ?     17.1182+-0.0911        ?
   string-cons-repeat                                11.5198+-1.0908           11.1112+-0.1850          might be 1.0368x faster
   string-cons-tower                                 10.6085+-0.1604           10.5740+-0.3335        
   string-equality                                   23.2667+-0.1578           23.2639+-0.2592        
   string-get-by-val-big-char                         9.8333+-0.3229     ?      9.9057+-0.4226        ?
   string-get-by-val-out-of-bounds-insane             5.0988+-0.2449     ?      5.2474+-0.1338        ? might be 1.0291x slower
   string-get-by-val-out-of-bounds                    6.7997+-0.1608     ?      6.9597+-0.3269        ? might be 1.0235x slower
   string-get-by-val                                  4.6818+-0.2388     ?      4.7716+-0.1136        ? might be 1.0192x slower
   string-hash                                        2.8141+-0.0920            2.7394+-0.1858          might be 1.0272x faster
   string-long-ident-equality                        18.6293+-0.3671     ?     18.6553+-0.4450        ?
   string-out-of-bounds                              17.4517+-0.1716           17.4380+-0.1404        
   string-repeat-arith                               40.3918+-1.1263           39.5048+-0.7392          might be 1.0225x faster
   string-sub                                        78.1627+-1.2681           77.6304+-1.6138        
   string-test                                        4.2141+-0.0516     ?      4.2692+-0.1431        ? might be 1.0131x slower
   string-var-equality                               45.1135+-1.2380           44.3307+-0.2628          might be 1.0177x faster
   structure-hoist-over-transitions                   3.6113+-0.1318            3.5843+-0.1140        
   substring-concat-weird                            54.8107+-1.2000     ?     55.1127+-1.5804        ?
   substring-concat                                  58.1064+-0.6506     ?     58.5209+-0.5806        ?
   substring                                         63.4324+-1.5833     ?     63.8988+-0.5295        ?
   switch-char-constant                               3.3991+-0.1048            3.3470+-0.0598          might be 1.0156x faster
   switch-char                                        8.3154+-0.1741     ^      7.6473+-0.0726        ^ definitely 1.0874x faster
   switch-constant                                   11.4772+-0.8718           11.3917+-0.4410        
   switch-string-basic-big-var                       27.7782+-0.2517           27.5092+-0.6650        
   switch-string-basic-big                           21.5125+-5.4240     ?     22.6737+-4.6780        ? might be 1.0540x slower
   switch-string-basic-var                           27.6625+-0.7824           27.2134+-1.0483          might be 1.0165x faster
   switch-string-basic                               16.1723+-0.4351     ?     17.3353+-2.5906        ? might be 1.0719x slower
   switch-string-big-length-tower-var                25.9818+-0.1814           25.9594+-0.3046        
   switch-string-length-tower-var                    19.7414+-0.3839           19.5394+-0.4463          might be 1.0103x faster
   switch-string-length-tower                        16.8587+-1.5632           16.6362+-0.7025          might be 1.0134x faster
   switch-string-short                               14.1404+-0.3280           13.7953+-0.3156          might be 1.0250x faster
   switch                                            18.3950+-0.3891           16.1973+-2.2240          might be 1.1357x faster
   tear-off-arguments-simple                          4.5735+-0.1899            4.5621+-0.3900        
   tear-off-arguments                                 6.4413+-0.3780            6.2729+-0.3452          might be 1.0268x faster
   temporal-structure                                15.2394+-0.2913     ?     15.3918+-0.2073        ?
   to-int32-boolean                                  20.5565+-0.5646     ?     20.9765+-0.2636        ? might be 1.0204x slower
   try-catch-get-by-val-cloned-arguments             19.8458+-1.2370           19.3103+-0.3214          might be 1.0277x faster
   try-catch-get-by-val-direct-arguments              8.6525+-0.4387     ?      8.7627+-0.3711        ? might be 1.0127x slower
   try-catch-get-by-val-scoped-arguments             10.5080+-0.7655           10.2823+-0.4349          might be 1.0219x faster
   typed-array-get-set-by-val-profiling              43.3593+-3.1999     ?     43.6428+-0.6773        ?
   undefined-property-access                        421.6635+-4.6915     ?    425.0971+-4.3603        ?
   undefined-test                                     4.4355+-0.1566     ?      4.4938+-0.0462        ? might be 1.0131x slower
   unprofiled-licm                                   27.7910+-0.8899           27.3430+-0.6653          might be 1.0164x faster
   varargs-call                                      18.1255+-1.3965           17.8550+-0.3522          might be 1.0151x faster
   varargs-construct-inline                          30.1406+-0.5577     ^     28.8843+-0.5513        ^ definitely 1.0435x faster
   varargs-construct                                 44.0071+-1.2223           43.2736+-0.9487          might be 1.0169x faster
   varargs-inline                                    10.6042+-0.1786           10.3942+-0.1764          might be 1.0202x faster
   varargs-strict-mode                               12.7635+-0.3467           12.5423+-0.2022          might be 1.0176x faster
   varargs                                           12.7338+-0.2028     ?     12.7427+-0.1788        ?
   weird-inlining-const-prop                          2.9715+-0.2141     ?      3.0852+-0.4993        ? might be 1.0383x slower

   <geometric>                                       10.9177+-0.0294           10.8902+-0.0402          might be 1.0025x faster

                                                          Conf#1                    Conf#2                                      
AsmBench:
   bigfib.cpp                                       671.2872+-3.4993          668.2188+-5.7185        
   cray.c                                           623.5895+-4.1691          618.1917+-3.0196        
   dry.c                                            651.3746+-14.5903    ?    653.9422+-1.4227        ?
   FloatMM.c                                        954.6768+-0.4261     ?    955.1008+-1.1031        ?
   gcc-loops.cpp                                   5877.6520+-10.5185        5875.0744+-8.0721        
   n-body.c                                        1668.5110+-1.3811         1667.5019+-0.2854        
   Quicksort.c                                      571.2950+-11.1789         570.0117+-9.3460        
   stepanov_container.cpp                          4883.7885+-38.0455        4881.4542+-30.1892       
   Towers.c                                         375.8582+-1.3565     ^    371.0027+-2.0792        ^ definitely 1.0131x faster

   <geometric>                                     1115.6080+-2.5751     ^   1112.4407+-0.5724        ^ definitely 1.0028x faster

                                                          Conf#1                    Conf#2                                      
CompressionBench:
   huffman                                          502.3149+-2.9301          501.2273+-7.7351        
   arithmetic-simple                                527.4620+-1.2347     ^    463.9194+-2.7719        ^ definitely 1.1370x faster
   arithmetic-precise                               402.6512+-2.5964     ^    387.0590+-1.9937        ^ definitely 1.0403x faster
   arithmetic-complex-precise                       403.2155+-3.0490          400.7200+-3.4195        
   arithmetic-precise-order-0                       566.5947+-8.0850     ^    486.5091+-14.5647       ^ definitely 1.1646x faster
   arithmetic-precise-order-1                       427.9297+-22.6108         420.6318+-3.1192          might be 1.0174x faster
   arithmetic-precise-order-2                       470.7887+-1.6870          470.0422+-1.7822        
   arithmetic-simple-order-1                        528.2590+-3.3715     ?    528.3013+-0.8666        ?
   arithmetic-simple-order-2                        583.4188+-2.9492     ?    585.6379+-1.7343        ?
   lz-string                                        417.4472+-2.7028     ?    418.1752+-7.4047        ?

   <geometric>                                      478.6687+-2.8539     ^    462.5936+-1.6204        ^ definitely 1.0348x faster

                                                          Conf#1                    Conf#2                                      
Geomean of preferred means:
   <scaled-result>                                   83.9856+-0.1956     ^     82.7440+-0.1814        ^ definitely 1.0150x faster
Comment 3 Filip Pizlo 2015-05-14 19:27:22 PDT
Comment on attachment 253166 [details]
Patch

I don't like this change since it doesn't guard against overflow of the counter. Can't you just lift the "soon" threshold or just use a different threshold?
Comment 4 Filip Pizlo 2015-05-14 19:54:38 PDT
(In reply to comment #3)
> Comment on attachment 253166 [details]
> Patch
> 
> I don't like this change since it doesn't guard against overflow of the
> counter. Can't you just lift the "soon" threshold or just use a different
> threshold?

I get it now - you're trying to guarantee that the outer loop's check hits the slow path, and then does OSR entry.  I agree that we want this.  I also agree that your approach achieves this goal, but it does so at some cost: it means we risk wrap-around on the counter for inner loops.

One approach would be to say "screw it" and risk the wrap-around.  I don't like this approach.  It feels somewhat dangerous.  It also risks triggering the original compilation later than needed.

Another approach is to just have a saturation check for inner loops, rather than the current become-not-signed check.  This will fix the wrap-around, but still has the problem of not triggering compilation soon enough.
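
To make the difference between the two checks concrete, here is a minimal sketch. It assumes a counter that starts negative and counts up, with "zero or positive" meaning "take the slow path". `LoopCounter`, `incrementWrapping` and `incrementSaturating` are hypothetical names for illustration, not JSC's actual counter code.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical counter for illustration only (not JSC's real execution counter).
// Negative means "keep running"; zero or positive means "take the slow path".
struct LoopCounter {
    int32_t value;
};

// Unguarded increment. The addition is done in uint32_t so the wrap-around is
// well defined; the cast back to int32_t is modular on mainstream compilers
// (and guaranteed to be since C++20).
inline void incrementWrapping(LoopCounter& c)
{
    c.value = static_cast<int32_t>(static_cast<uint32_t>(c.value) + 1u);
}

// Saturating increment: once the counter reaches INT32_MAX it stays there,
// so it can never flip back to a negative value.
inline void incrementSaturating(LoopCounter& c)
{
    if (c.value < std::numeric_limits<int32_t>::max())
        ++c.value;
}

// The "become-not-signed" test: the slow path is taken when the sign bit is clear.
inline bool shouldTakeSlowPath(const LoopCounter& c)
{
    return c.value >= 0;
}
```

With the wrapping version, a counter that is never reset eventually flips from INT32_MAX back to a negative value, and the check stops firing until it counts up again. The saturating version costs one extra compare but keeps the check firing.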

But I believe you want something else: you want those CheckTierUpAndOSREnter that have a nested CheckTierUpInLoop to actually do one additional check: check some bit that says "I've got a compilation ready, please call slow path now".  So this means adding a third variant: CheckTierUpAndNestedTriggerAndOSREnter, or something. This bit would be set by the slow path of CheckTierUpInLoop.  CheckTierUpInLoop would use thresholdForFTLOptimizeAfterWarmUp(), though CheckTierUpAndNestedTriggerAndOSREnter and CheckTierUpAndOSREnter will still use thresholdForFTLOptimizeSoon() (since there, it makes total sense to fall back into the trigger soon to check if the compilation is done).

This will make CheckTierUpAndNestedTriggerAndOSREnter more expensive than either CheckTierUpAndOSREnter or CheckTierUpInLoop.  But I don't believe that the actual branch to test whether we should call slow path is what is killing you.  It's just that the outer loop never realizes that it should take slow path.  The extra bit could help you with this.
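
A rough sketch of the fast-path cost difference, under stated assumptions: `TierUpState`, `counterCheck` and `nestedTriggerCheck` are hypothetical names, and the starting counter value is a placeholder rather than a real JSC threshold. The point is only that the proposed variant adds one bit test in front of the usual add-and-test.

```cpp
#include <cstdint>

// Hypothetical tier-up state; field names and the start value are illustrative.
struct TierUpState {
    int32_t counter = -1000;          // negative: keep running (placeholder value)
    bool nestedTriggerIsSet = false;  // "compilation ready, please call slow path now"
};

// Shape of the existing checks as described above: one add, one sign test.
// Returns true when the slow path should be called.
inline bool counterCheck(TierUpState& s, int32_t increment)
{
    s.counter += increment;
    return s.counter >= 0;
}

// Shape of the proposed CheckTierUpAndNestedTriggerAndOSREnter: test the bit
// first, then fall back to the ordinary counter check.
inline bool nestedTriggerCheck(TierUpState& s, int32_t increment)
{
    if (s.nestedTriggerIsSet)
        return true;
    return counterCheck(s, increment);
}
```

Whether the bit is tested before or after the counter update is a design choice; this sketch tests it first, so the counter is left untouched once the bit is set.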

Note that this also means that we should have CheckTierUpInLoop do one of two things for OSR entry compiles:

1) Never trigger OSR entry compiles, and instead set the bit that causes the outer loop to trigger.

2) Have CheckTierUpInLoop take a parameter that is something like "OSR entrypoint for closest outer loop that can OSR enter".  This is what it will use for the OSR entry compile rather than using its own bytecode index.  You'll probably have to also play some tricks with the OSR entry values (the mustHandleValues), where it's the outer loop that records them and the inner loop trigger just uses them - since we need to collect mustHandleValues for the loop that we will be OSR entering into.

I prefer (1) for now because it's easier to implement.

This means that in Kraken/ai-astar, the inner loop will kick off a replacement compilation.  This will happen sooner than in your solution.  Then the inner loop will realize when the compilation is done, and realize that what we really need is an OSR entry compile.  It will set the bit and return with thresholdForFTLOptimizeAfterWarmUp().  Then when we pop out of the inner loop, the outer loop will *immediately* realize that we already crossed the counter threshold (because of the bit being set) and it will do the OSR entry logic.
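
The sequence described above can be modelled with a small toy state machine. This is only a sketch of option (1): the type and function names are hypothetical, `kAfterWarmUp` is a placeholder standing in for thresholdForFTLOptimizeAfterWarmUp(), and the real slow paths would talk to the compiler rather than flip an enum.

```cpp
#include <cstdint>

// Toy model only; none of these names exist in JSC.
enum class Replacement { NotStarted, Compiling, Ready };

constexpr int32_t kAfterWarmUp = 100; // placeholder, not the real option value

struct ToyTierUp {
    int32_t counter = -kAfterWarmUp;
    Replacement replacement = Replacement::NotStarted;
    bool outerShouldOSREnter = false; // the bit set by the inner loop's slow path
};

// Slow path of the inner loop's check under option (1): it may start a
// replacement compile, but it never starts an OSR entry compile itself.
inline void innerLoopSlowPath(ToyTierUp& s)
{
    switch (s.replacement) {
    case Replacement::NotStarted:
        s.replacement = Replacement::Compiling; // kick off the replacement compile
        break;
    case Replacement::Compiling:
        break; // still waiting; in this toy the caller flips the state to Ready
    case Replacement::Ready:
        s.outerShouldOSREnter = true; // ask the outer loop to do the OSR entry logic
        break;
    }
    s.counter = -kAfterWarmUp; // come back only after a long warm-up interval
}

// Outer loop's check: fires immediately if the bit is set, otherwise it
// behaves like a plain counter check.
inline bool outerLoopCheck(ToyTierUp& s)
{
    return s.outerShouldOSREnter || ++s.counter >= 0;
}
```

In this model the outer loop's check fires on its very next execution after the inner loop sets the bit, even though the counter has just been reset to the long warm-up interval.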
Comment 5 Benjamin Poulain 2015-05-15 18:03:50 PDT
I'll upload an alternative patch elsewhere once it is fully tested.