Bug 143582 - REGRESSION (182567): regress/script-tests/sorting-benchmark.js fails on 32 bit dfg-eager tests
Summary: REGRESSION (182567): regress/script-tests/sorting-benchmark.js fails on 32 bi...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Michael Saboff
URL:
Keywords: InRadar
Depends on:
Blocks: 143535
  Show dependency treegraph
 
Reported: 2015-04-09 15:06 PDT by Michael Saboff
Modified: 2019-06-19 01:02 PDT (History)
3 users (show)

See Also:


Attachments
Patch (2.26 KB, patch)
2015-04-09 15:48 PDT, Michael Saboff
ggaren: review-
Details | Formatted Diff | Diff
Patch. (1.69 KB, patch)
2015-04-09 16:11 PDT, Michael Saboff
msaboff: review-
Details | Formatted Diff | Diff
Updated Patch (2.65 KB, patch)
2015-04-10 16:31 PDT, Michael Saboff
mark.lam: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Saboff 2015-04-09 15:06:27 PDT
From https://bugs.webkit.org/show_bug.cgi?id=143535#c6

regress/script-tests/sorting-benchmark.js.ftl-eager-no-cjit: ASSERTION FAILED: (spillFormat & DataFormatJS) || spillFormat == DataFormatInt32
regress/script-tests/sorting-benchmark.js.ftl-eager-no-cjit: /Volumes/Data/slave/mavericks-32bitJSC-debug/build/Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp(871) : GPRReg JSC::DFG::SpeculativeJIT::fillSpeculateInt32Internal(JSC::DFG::Edge, JSC::DataFormat &) [strict = false]
Comment 1 Michael Saboff 2015-04-09 15:48:29 PDT
Created attachment 250476 [details]
Patch
Comment 2 Michael Saboff 2015-04-09 15:50:07 PDT
Will change the title in the ChangeLog to not reference "dog-eager", but "dfg-eager".  Copied from defect title that was autocorrected.
Comment 3 Michael Saboff 2015-04-09 15:57:37 PDT
Performance results.  The JSRegress results for ftp-library-inlining reproduce without this change.

Noticeable speedup on int-or-other-sub-then-get-by-val (3x), int-or-other-sub (1.6X) as well as a few other minor speedups.

Benchmark report for SunSpider, Octane, Kraken, JSRegress, AsmBench, and CompressionBench on msaboff-pro (MacPro5,1).

VMs tested:
"143582" at /Volumes/Data/src/webkit/WebKitBuild/Release/jsc (r182575)
"Baseline" at /Volumes/Data/src/webkit.baseline/WebKitBuild/Release/jsc (r182575)

Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements.
Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level
timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                          143582                   Baseline                                     
SunSpider:
   3d-cube                                            6.5350+-0.2188     ?      6.6574+-0.3250        ? might be 1.0187x slower
   3d-morph                                           7.7028+-0.2522     ?      7.7205+-0.0656        ?
   3d-raytrace                                        8.4152+-0.2220     !      8.7635+-0.1156        ! definitely 1.0414x slower
   access-binary-trees                                2.9922+-0.1003     ?      2.9929+-0.2092        ?
   access-fannkuch                                    7.9595+-0.1593            7.9565+-0.1602        
   access-nbody                                       4.1694+-0.1227     ?      4.1800+-0.2302        ?
   access-nsieve                                      4.8140+-0.1928            4.7791+-0.1227        
   bitops-3bit-bits-in-byte                           1.8235+-0.2254     ?      1.8417+-0.0524        ?
   bitops-bits-in-byte                                5.6285+-0.3058     ?      5.6616+-0.0949        ?
   bitops-bitwise-and                                 2.7898+-0.1551     ?      2.7948+-0.1288        ?
   bitops-nsieve-bits                                 5.0851+-0.2947            5.0226+-0.2515          might be 1.0124x faster
   controlflow-recursive                              2.7150+-0.1932            2.6541+-0.1696          might be 1.0230x faster
   crypto-aes                                         5.4198+-0.2763     ?      5.5536+-0.5419        ? might be 1.0247x slower
   crypto-md5                                         3.2798+-0.0880     ?      3.4495+-0.0852        ? might be 1.0518x slower
   crypto-sha1                                        3.2521+-0.1132            3.0940+-0.1680          might be 1.0511x faster
   date-format-tofte                                 12.3700+-1.0184           11.9872+-0.3906          might be 1.0319x faster
   date-format-xparb                                  7.4514+-0.2305            7.2963+-0.5263          might be 1.0213x faster
   math-cordic                                        4.1965+-0.1536            4.1381+-0.1941          might be 1.0141x faster
   math-partial-sums                                  9.2725+-0.2119            9.1503+-0.3530          might be 1.0133x faster
   math-spectral-norm                                 2.8047+-0.1374     ?      2.8428+-0.1432        ? might be 1.0136x slower
   regexp-dna                                         9.7850+-0.1980     ?      9.8354+-0.2751        ?
   string-base64                                      5.9279+-0.2983     ?      6.1296+-0.1878        ? might be 1.0340x slower
   string-fasta                                       9.3875+-0.3217            9.3469+-0.5166        
   string-tagcloud                                   13.5993+-0.2744           13.2272+-0.5325          might be 1.0281x faster
   string-unpack-code                                27.6871+-0.9498           26.7062+-0.9327          might be 1.0367x faster
   string-validate-input                              6.8041+-0.2951            6.6071+-0.4277          might be 1.0298x faster

   <arithmetic>                                       6.9949+-0.0713            6.9380+-0.0267          might be 1.0082x faster

                                                          143582                   Baseline                                     
Octane:
   encrypt                                           0.31314+-0.00350    ?     0.31936+-0.00903       ? might be 1.0199x slower
   decrypt                                           5.65591+-0.03127    ?     5.65898+-0.03935       ?
   deltablue                                x2       0.26536+-0.00161          0.26462+-0.00208       
   earley                                            0.81672+-0.00659          0.81667+-0.00498       
   boyer                                            10.33243+-0.05197    ?    10.33519+-0.02275       ?
   navier-stokes                            x2       6.40153+-0.01442    ?     6.42470+-0.05597       ?
   raytrace                                 x2       1.76171+-0.11064          1.74335+-0.10441         might be 1.0105x faster
   richards                                 x2       0.16693+-0.00266          0.16634+-0.00157       
   splay                                    x2       0.52584+-0.01300    ?     0.53122+-0.01527       ? might be 1.0102x slower
   regexp                                   x2      44.44885+-0.30484    ?    44.52554+-0.48588       ?
   pdfjs                                    x2      62.62840+-0.59291    ?    62.98508+-0.85120       ?
   mandreel                                 x2      71.44585+-1.44914         71.05507+-1.21279       
   gbemu                                    x2      55.42048+-0.83489         55.00026+-1.40659       
   closure                                           0.76713+-0.00629          0.76465+-0.00996       
   jquery                                            9.58397+-0.08117          9.53560+-0.05228       
   box2d                                    x2      17.38529+-0.05459    ^    16.92042+-0.17522       ^ definitely 1.0275x faster
   zlib                                     x2     545.84456+-10.65242   ?   546.63963+-11.55289      ?
   typescript                               x2    1088.27521+-19.37655      1075.45746+-25.59206        might be 1.0119x faster

   <geometric>                                       9.25520+-0.04820          9.23035+-0.04982         might be 1.0027x faster

                                                          143582                   Baseline                                     
Kraken:
   ai-astar                                          466.430+-14.132     ?     470.402+-4.918         ?
   audio-beat-detection                              155.049+-1.356            152.016+-3.627           might be 1.0200x faster
   audio-dft                                         205.749+-4.879            202.900+-2.290           might be 1.0140x faster
   audio-fft                                         114.171+-8.894            110.164+-1.048           might be 1.0364x faster
   audio-oscillator                                  263.264+-1.064      ?     264.864+-4.747         ?
   imaging-darkroom                                  139.874+-0.178            139.700+-0.426         
   imaging-desaturate                                 95.166+-2.011             95.080+-1.698         
   imaging-gaussian-blur                             156.354+-1.097      ?     156.911+-2.933         ?
   json-parse-financial                               65.534+-1.679      ?      66.213+-1.256         ? might be 1.0104x slower
   json-stringify-tinderbox                           76.693+-1.002             76.322+-0.946         
   stanford-crypto-aes                                84.731+-1.525      ?      87.577+-4.556         ? might be 1.0336x slower
   stanford-crypto-ccm                                67.663+-1.634             66.334+-1.537           might be 1.0200x faster
   stanford-crypto-pbkdf2                            225.407+-1.895      ?     227.869+-1.301         ? might be 1.0109x slower
   stanford-crypto-sha256-iterative                   72.801+-1.207      ?      72.951+-2.020         ?

   <arithmetic>                                      156.349+-1.143      ?     156.379+-0.602         ? might be 1.0002x slower

                                                          143582                   Baseline                                     
JSRegress:
   abs-boolean                                        3.7931+-0.0775            3.7032+-0.1528          might be 1.0243x faster
   adapt-to-double-divide                            17.8929+-0.2884           17.8331+-0.6910        
   aliased-arguments-getbyval                         1.7003+-0.2369            1.5065+-0.4025          might be 1.1287x faster
   allocate-big-object                                3.5246+-0.2034     ?      3.5546+-0.3899        ?
   arguments-named-and-reflective                    15.1204+-0.5459           14.7701+-0.6868          might be 1.0237x faster
   arguments-out-of-bounds                           17.7606+-0.3132           17.5652+-0.3930          might be 1.0111x faster
   arguments-strict-mode                             13.3256+-0.2662           13.3026+-0.8817        
   arguments                                         11.8800+-0.1624     ?     12.0222+-0.4356        ? might be 1.0120x slower
   arity-mismatch-inlining                            1.1857+-0.1330     ?      1.2533+-0.1150        ? might be 1.0570x slower
   array-access-polymorphic-structure                 8.7840+-0.2542     ?      8.9843+-0.4462        ? might be 1.0228x slower
   array-nonarray-polymorhpic-access                 47.6774+-0.4932     ?     47.8170+-0.6964        ?
   array-prototype-every                            118.3965+-3.9617          117.5836+-0.5021        
   array-prototype-forEach                          118.1490+-3.3574     ?    118.2635+-5.3654        ?
   array-prototype-map                              131.8705+-2.8274          131.6506+-3.7808        
   array-prototype-some                             118.5522+-5.6653          116.9913+-1.1638          might be 1.0133x faster
   array-splice-contiguous                           62.7807+-1.5157     ?     63.0951+-1.6442        ?
   array-with-double-add                              5.5964+-0.0607            5.4927+-0.2722          might be 1.0189x faster
   array-with-double-increment                        4.1738+-0.2404            4.0253+-0.1037          might be 1.0369x faster
   array-with-double-mul-add                          7.0762+-0.1725     ?      7.1755+-0.1206        ? might be 1.0140x slower
   array-with-double-sum                              4.3668+-0.1232            4.3171+-0.1405          might be 1.0115x faster
   array-with-int32-add-sub                           9.8210+-0.6754            9.5815+-0.2055          might be 1.0250x faster
   array-with-int32-or-double-sum                     4.4953+-0.0306     ^      4.3271+-0.0535        ^ definitely 1.0389x faster
   ArrayBuffer-DataView-alloc-large-long-lived   
                                                     45.9068+-1.0134           45.5982+-1.1389        
   ArrayBuffer-DataView-alloc-long-lived             18.4930+-0.2175     ?     18.7759+-0.2249        ? might be 1.0153x slower
   ArrayBuffer-Int32Array-byteOffset                  5.1554+-0.3744            5.0115+-0.2338          might be 1.0287x faster
   ArrayBuffer-Int8Array-alloc-large-long-lived   
                                                     49.6611+-2.1750           48.1920+-1.7021          might be 1.0305x faster
   ArrayBuffer-Int8Array-alloc-long-lived-buffer   
                                                     31.4641+-1.6711           30.8785+-1.0116          might be 1.0190x faster
   ArrayBuffer-Int8Array-alloc-long-lived            18.1965+-0.6399           17.6185+-0.3524          might be 1.0328x faster
   ArrayBuffer-Int8Array-alloc                       14.7727+-0.6764     ?     15.2650+-0.6896        ? might be 1.0333x slower
   asmjs_bool_bug                                     8.6519+-0.2542     ?      8.8760+-0.0704        ? might be 1.0259x slower
   assign-custom-setter-polymorphic                   4.1858+-0.2676     ?      4.3130+-0.2572        ? might be 1.0304x slower
   assign-custom-setter                               5.5167+-0.1192     ?      5.7490+-0.3685        ? might be 1.0421x slower
   basic-set                                         12.0677+-0.7982     ?     12.1724+-1.1298        ?
   big-int-mul                                        5.8854+-0.1857            5.8082+-0.0705          might be 1.0133x faster
   boolean-test                                       4.2779+-0.1599     ?      4.2858+-0.1392        ?
   branch-fold                                        4.5418+-0.1031            4.5170+-0.1101        
   by-val-generic                                    10.5391+-0.3707           10.3704+-0.3631          might be 1.0163x faster
   call-spread-apply                                 38.1520+-0.3590     ?     38.3762+-0.4559        ?
   call-spread-call                                  31.2108+-0.5891     ?     32.4387+-1.0109        ? might be 1.0393x slower
   captured-assignments                               0.6692+-0.1344            0.5956+-0.0174          might be 1.1236x faster
   cast-int-to-double                                 7.9827+-0.1051     ?      8.0383+-0.1423        ?
   cell-argument                                      9.9896+-0.2085            9.6338+-0.2406          might be 1.0369x faster
   cfg-simplify                                       3.8142+-0.1592            3.7432+-0.1689          might be 1.0190x faster
   chain-getter-access                               12.8196+-0.3107           12.5627+-0.3381          might be 1.0205x faster
   cmpeq-obj-to-obj-other                            12.8207+-0.3511           12.3666+-0.1346          might be 1.0367x faster
   constant-test                                      7.7104+-0.0303            7.7099+-0.2069        
   DataView-custom-properties                        52.5458+-1.6199           52.1561+-1.1635        
   deconstructing-parameters-overridden-by-function   
                                                      0.6597+-0.0540     ?      0.7322+-0.1204        ? might be 1.1098x slower
   delay-tear-off-arguments-strictmode               18.2257+-0.4120           17.9365+-0.6645          might be 1.0161x faster
   deltablue-varargs                                254.4942+-0.8587          253.9927+-1.6571        
   destructuring-arguments                           21.9846+-0.9608           21.5901+-0.4653          might be 1.0183x faster
   destructuring-swap                                 7.7175+-0.1992            7.5710+-0.2007          might be 1.0193x faster
   direct-arguments-getbyval                          1.6735+-0.3017            1.6089+-0.2210          might be 1.0402x faster
   div-boolean-double                                 5.4830+-0.0380            5.4534+-0.0657        
   div-boolean                                        9.7524+-0.3141     ?     10.0162+-0.7273        ? might be 1.0270x slower
   double-get-by-val-out-of-bounds                    5.9034+-0.2713            5.8295+-0.3233          might be 1.0127x faster
   double-pollution-getbyval                          9.6308+-0.0852     ?      9.6587+-0.0869        ?
   double-pollution-putbyoffset                       5.6658+-0.2050            5.4334+-0.0823          might be 1.0428x faster
   double-to-int32-typed-array-no-inline              2.8679+-0.1750            2.8409+-0.0569        
   double-to-int32-typed-array                        2.5750+-0.1209            2.5121+-0.2444          might be 1.0250x faster
   double-to-uint32-typed-array-no-inline             2.9506+-0.1768            2.8598+-0.1137          might be 1.0318x faster
   double-to-uint32-typed-array                       2.6248+-0.2014            2.5562+-0.2764          might be 1.0268x faster
   elidable-new-object-dag                           53.5914+-2.1758     ?     54.7012+-0.5511        ? might be 1.0207x slower
   elidable-new-object-roflcopter                    60.0373+-1.0048     ?     61.2685+-3.8601        ? might be 1.0205x slower
   elidable-new-object-then-call                     49.9273+-0.6477     ?     51.3052+-4.6629        ? might be 1.0276x slower
   elidable-new-object-tree                          62.7932+-2.2728     ?     63.3803+-1.2700        ?
   empty-string-plus-int                              7.8111+-0.8130            7.5113+-0.1331          might be 1.0399x faster
   emscripten-cube2hash                              45.8065+-0.6495           45.5851+-0.3545        
   exit-length-on-plain-object                       18.6224+-0.9350           18.2084+-0.1710          might be 1.0227x faster
   external-arguments-getbyval                        1.6069+-0.2165     ?      1.6377+-0.2322        ? might be 1.0191x slower
   external-arguments-putbyval                        3.1582+-0.0960            3.1353+-0.1402        
   fixed-typed-array-storage-var-index                1.6036+-0.0683     ?      1.6140+-0.0984        ?
   fixed-typed-array-storage                          1.2448+-0.0686     ?      1.2580+-0.1126        ? might be 1.0106x slower
   Float32Array-matrix-mult                           5.5953+-0.4266     ?      5.8602+-0.1288        ? might be 1.0473x slower
   Float32Array-to-Float64Array-set                  76.1929+-2.0861     ^     73.2643+-0.7217        ^ definitely 1.0400x faster
   Float64Array-alloc-long-lived                     94.8882+-0.6911     ?     94.9725+-0.8642        ?
   Float64Array-to-Int16Array-set                    93.0118+-1.7230           90.8748+-0.5644          might be 1.0235x faster
   fold-double-to-int                                21.1389+-0.2439           20.6805+-0.4525          might be 1.0222x faster
   fold-get-by-id-to-multi-get-by-offset-rare-int   
                                                     10.2410+-0.4324     ?     10.2935+-0.4249        ?
   fold-get-by-id-to-multi-get-by-offset              8.8654+-0.3321            8.7839+-0.0885        
   fold-multi-get-by-offset-to-get-by-offset   
                                                      8.3119+-0.1984            7.9435+-0.6802          might be 1.0464x faster
   fold-multi-get-by-offset-to-poly-get-by-offset   
                                                      8.1984+-0.9096     ?      8.5048+-0.7894        ? might be 1.0374x slower
   fold-multi-put-by-offset-to-poly-put-by-offset   
                                                      7.6902+-0.5176            7.4689+-0.6514          might be 1.0296x faster
   fold-multi-put-by-offset-to-put-by-offset   
                                                      6.0045+-0.6261            5.7327+-0.0566          might be 1.0474x faster
   fold-multi-put-by-offset-to-replace-or-transition-put-by-offset   
                                                     13.0771+-0.6161           12.4858+-1.1034          might be 1.0474x faster
   fold-put-by-id-to-multi-put-by-offset              8.3672+-0.7281     ?      8.6194+-0.9868        ? might be 1.0301x slower
   fold-put-structure                                 6.6268+-0.7340            6.2125+-0.5052          might be 1.0667x faster
   for-of-iterate-array-entries                       5.9680+-0.2351     ?      6.1136+-0.3016        ? might be 1.0244x slower
   for-of-iterate-array-keys                          5.2305+-0.2964            5.0605+-0.0676          might be 1.0336x faster
   for-of-iterate-array-values                        4.9684+-0.4375            4.8618+-0.1844          might be 1.0219x faster
   fround                                            22.4473+-0.6490           22.0413+-0.7953          might be 1.0184x faster
   ftl-library-inlining-dataview                     93.8133+-8.6699           88.9852+-1.0090          might be 1.0543x faster
   ftl-library-inlining                              88.0909+-0.5338     !    116.3621+-3.3117        ! definitely 1.3209x slower
   function-dot-apply                                 2.7095+-0.1413     ?      2.7184+-0.1057        ?
   function-test                                      4.6163+-0.0905     ?      4.7555+-0.1810        ? might be 1.0302x slower
   function-with-eval                               123.2970+-0.9951     ^    118.8639+-1.5411        ^ definitely 1.0373x faster
   gcse-poly-get-less-obvious                        24.7757+-0.2264           24.6990+-0.3516        
   gcse-poly-get                                     24.6593+-0.3261     ?     24.7603+-0.6875        ?
   gcse                                               6.4606+-0.0756     ?      6.4918+-0.2826        ?
   get-by-id-bimorphic-check-structure-elimination-simple   
                                                      3.3533+-0.1081     ?      3.3822+-0.1221        ?
   get-by-id-bimorphic-check-structure-elimination   
                                                      8.3935+-0.1395            8.3004+-0.3267          might be 1.0112x faster
   get-by-id-chain-from-try-block                    11.9385+-0.9309           11.5757+-0.3415          might be 1.0313x faster
   get-by-id-check-structure-elimination              7.5258+-0.1166            7.5239+-0.3075        
   get-by-id-proto-or-self                           22.7253+-1.2004     ?     23.1209+-0.9905        ? might be 1.0174x slower
   get-by-id-quadmorphic-check-structure-elimination-simple   
                                                      4.1005+-0.3572            4.0055+-0.1036          might be 1.0237x faster
   get-by-id-self-or-proto                           23.3521+-0.8623           22.3182+-1.0162          might be 1.0463x faster
   get-by-val-out-of-bounds                           5.5530+-0.1447     ?      5.5720+-0.2159        ?
   get_callee_monomorphic                             4.7485+-0.1824     ?      4.9684+-0.3483        ? might be 1.0463x slower
   get_callee_polymorphic                             4.4400+-0.3194     ?      4.5151+-0.2141        ? might be 1.0169x slower
   getter-no-activation                               5.6628+-0.1651     ?      5.8127+-0.1378        ? might be 1.0265x slower
   getter-richards                                  127.1050+-6.3940          125.7343+-2.9919          might be 1.0109x faster
   getter                                             7.4930+-0.1442            7.4644+-0.1581        
   global-var-const-infer-fire-from-opt               1.2510+-0.2971     ?      1.4047+-0.2983        ? might be 1.1229x slower
   global-var-const-infer                             1.1983+-0.2472            1.1628+-0.1425          might be 1.0305x faster
   HashMap-put-get-iterate-keys                      35.0887+-0.4482           34.4258+-0.6579          might be 1.0193x faster
   HashMap-put-get-iterate                           34.8667+-0.7375           34.1568+-0.7734          might be 1.0208x faster
   HashMap-string-put-get-iterate                    35.0436+-1.7596     ?     35.4054+-1.8732        ? might be 1.0103x slower
   hoist-make-rope                                   14.7690+-2.3793     ?     14.8574+-0.8360        ?
   hoist-poly-check-structure-effectful-loop   
                                                      6.7942+-0.1990     ?      6.9172+-0.1839        ? might be 1.0181x slower
   hoist-poly-check-structure                         4.9361+-0.1496     ?      4.9918+-0.2323        ? might be 1.0113x slower
   imul-double-only                                  10.3059+-0.9879            9.9948+-0.1305          might be 1.0311x faster
   imul-int-only                                     12.4103+-1.1218           12.3260+-0.9415        
   imul-mixed                                         9.8901+-0.5659     ?     10.2966+-0.8740        ? might be 1.0411x slower
   in-four-cases                                     23.8143+-0.6293     ?     23.8658+-0.1648        ?
   in-one-case-false                                 12.4968+-0.1195           12.3926+-0.0858        
   in-one-case-true                                  12.4893+-0.2553     ?     12.5048+-0.2592        ?
   in-two-cases                                      12.9935+-0.1397           12.9724+-0.1209        
   indexed-properties-in-objects                      3.7130+-0.0873     ?      3.7607+-0.1883        ? might be 1.0128x slower
   infer-closure-const-then-mov-no-inline             4.2790+-0.1037            4.2761+-0.1461        
   infer-closure-const-then-mov                      20.7541+-0.5123     ?     21.3935+-1.6055        ? might be 1.0308x slower
   infer-closure-const-then-put-to-scope-no-inline   
                                                     14.4138+-0.2711           14.3137+-0.0935        
   infer-closure-const-then-put-to-scope             28.1136+-0.9032           27.9778+-0.2671        
   infer-closure-const-then-reenter-no-inline   
                                                     67.5102+-0.5659     ?     67.5370+-0.9558        ?
   infer-closure-const-then-reenter                 149.7248+-0.8179          149.4867+-0.5578        
   infer-constant-global-property                    37.2645+-0.2518     ?     37.4317+-0.7846        ?
   infer-constant-property                            3.2803+-0.1010     ?      3.2847+-0.1904        ?
   infer-one-time-closure-ten-vars                   15.1195+-0.2200           14.7386+-0.4190          might be 1.0258x faster
   infer-one-time-closure-two-vars                   14.1717+-0.5029           14.1415+-0.7512        
   infer-one-time-closure                            14.2391+-0.4259     ?     14.3198+-0.3783        ?
   infer-one-time-deep-closure                       24.5686+-0.3975     ?     24.7173+-0.8607        ?
   inline-arguments-access                            5.4758+-0.2113            5.3972+-0.1098          might be 1.0146x faster
   inline-arguments-aliased-access                    5.4705+-0.3731            5.3597+-0.1792          might be 1.0207x faster
   inline-arguments-local-escape                      5.7104+-0.2117            5.6591+-0.1959        
   inline-get-scoped-var                              5.6605+-0.1330     ?      5.7246+-0.0675        ? might be 1.0113x slower
   inlined-put-by-id-transition                      14.3009+-0.4855     ?     14.4254+-0.0910        ?
   int-or-other-abs-then-get-by-val                   6.8181+-0.3051     ?      6.9564+-0.2427        ? might be 1.0203x slower
   int-or-other-abs-zero-then-get-by-val             27.4595+-0.4265           27.3370+-1.0519        
   int-or-other-add-then-get-by-val                   5.9669+-0.1815            5.8682+-0.3437          might be 1.0168x faster
   int-or-other-add                                   7.8913+-0.2085     ?      7.8965+-0.1797        ?
   int-or-other-div-then-get-by-val                   5.1285+-0.1979     ?      5.1613+-0.2242        ?
   int-or-other-max-then-get-by-val                   5.2114+-0.1345            5.1863+-0.1663        
   int-or-other-min-then-get-by-val                   5.1469+-0.1182     ?      5.1963+-0.2686        ?
   int-or-other-mod-then-get-by-val                   5.0132+-0.2768     ?      5.0950+-0.1925        ? might be 1.0163x slower
   int-or-other-mul-then-get-by-val                   5.0471+-0.0994     ?      5.0905+-0.2438        ?
   int-or-other-neg-then-get-by-val                   6.3703+-0.3817     ?      6.4604+-0.2417        ? might be 1.0142x slower
   int-or-other-neg-zero-then-get-by-val             27.8058+-0.6001           27.2402+-0.8259          might be 1.0208x faster
   int-or-other-sub-then-get-by-val                  18.8475+-0.6523     ^      6.0399+-0.1927        ^ definitely 3.1205x faster
   int-or-other-sub                                   7.6725+-0.1986     ^      4.6757+-0.1486        ^ definitely 1.6409x faster
   int-overflow-local                                 5.9067+-0.3719            5.6765+-0.3678          might be 1.0405x faster
   Int16Array-alloc-long-lived                       64.1351+-0.5376     ?     64.4781+-1.1637        ?
   Int16Array-bubble-sort-with-byteLength            37.4631+-0.4379     ?     37.6722+-0.8407        ?
   Int16Array-bubble-sort                            37.0704+-0.2780     ?     37.1110+-0.4503        ?
   Int16Array-load-int-mul                            1.8980+-0.0564     ?      2.0156+-0.1415        ? might be 1.0619x slower
   Int16Array-to-Int32Array-set                      75.1518+-3.3682           73.3477+-1.1094          might be 1.0246x faster
   Int32Array-alloc-large                            33.8190+-0.7424     ?     34.1107+-0.8631        ?
   Int32Array-alloc-long-lived                       73.0798+-1.5685     ?     73.5154+-0.6814        ?
   Int32Array-alloc                                   4.4828+-0.2276            4.4022+-0.1515          might be 1.0183x faster
   Int32Array-Int8Array-view-alloc                    9.3644+-0.2065            9.2097+-0.2420          might be 1.0168x faster
   int52-spill                                        8.1042+-0.3437     ?      8.2597+-0.2315        ? might be 1.0192x slower
   Int8Array-alloc-long-lived                        58.6620+-1.0601     ?     58.9169+-1.7988        ?
   Int8Array-load-with-byteLength                     4.7340+-0.1170            4.6935+-0.0990        
   Int8Array-load                                     4.7839+-0.1333            4.7475+-0.0551        
   integer-divide                                    14.0746+-0.3540     ?     14.1302+-0.1040        ?
   integer-modulo                                     2.8842+-0.0659     ^      2.7377+-0.0278        ^ definitely 1.0535x faster
   large-int-captured                                 7.9919+-0.4517     ?      8.0541+-0.2873        ?
   large-int-neg                                     20.3979+-0.3046     ?     20.4384+-0.3659        ?
   large-int                                         18.4594+-0.4631           18.0730+-0.3964          might be 1.0214x faster
   logical-not                                        6.3566+-0.2756     ?      6.3818+-0.6244        ?
   lots-of-fields                                    18.8129+-0.7254     ?     19.3470+-0.7800        ? might be 1.0284x slower
   make-indexed-storage                               4.2089+-0.5249            4.0932+-0.5043          might be 1.0283x faster
   make-rope-cse                                      6.0540+-0.0626     ?      6.1915+-0.1327        ? might be 1.0227x slower
   marsaglia-larger-ints                             55.3675+-0.6090     ?     55.6903+-0.7850        ?
   marsaglia-osr-entry                               28.0812+-0.5274           27.6150+-0.2404          might be 1.0169x faster
   max-boolean                                        3.3295+-0.1532            3.2437+-0.1216          might be 1.0265x faster
   method-on-number                                  23.9301+-1.2571           22.8285+-0.2832          might be 1.0483x faster
   min-boolean                                        3.2474+-0.1017            3.1565+-0.0742          might be 1.0288x faster
   minus-boolean-double                               4.2600+-0.5628            4.1343+-0.1595          might be 1.0304x faster
   minus-boolean                                      3.2106+-0.1140            3.2090+-0.1236        
   misc-strict-eq                                    51.4467+-1.4982     ?     55.2245+-6.5439        ? might be 1.0734x slower
   mod-boolean-double                                11.6617+-0.1987     ?     11.6903+-0.2406        ?
   mod-boolean                                        8.8420+-0.1435     ?      9.0066+-0.1951        ? might be 1.0186x slower
   mul-boolean-double                                 4.8557+-0.0796            4.8205+-0.0869        
   mul-boolean                                        3.5005+-0.1472            3.4625+-0.0808          might be 1.0110x faster
   neg-boolean                                        4.3387+-0.1174     ?      4.3752+-0.2245        ?
   negative-zero-divide                               0.5305+-0.1387            0.5135+-0.1195          might be 1.0330x faster
   negative-zero-modulo                               0.5573+-0.1167            0.5022+-0.0948          might be 1.1097x faster
   negative-zero-negate                               0.4926+-0.1053            0.4684+-0.0665          might be 1.0515x faster
   nested-function-parsing                           55.1633+-0.7064     ?     55.1648+-0.9367        ?
   new-array-buffer-dead                              3.7546+-0.1840     ?      3.9039+-0.1974        ? might be 1.0398x slower
   new-array-buffer-push                              9.1025+-0.9154     ?      9.1210+-0.3576        ?
   new-array-dead                                    14.2250+-1.3434     ?     14.5283+-0.9260        ? might be 1.0213x slower
   new-array-push                                     5.4955+-0.1382     ^      5.1141+-0.1686        ^ definitely 1.0746x faster
   number-test                                        4.3293+-0.2425            4.2377+-0.1684          might be 1.0216x faster
   object-closure-call                                7.6861+-0.3225     ?      7.7908+-0.1708        ? might be 1.0136x slower
   object-test                                        4.5288+-0.1318            4.5063+-0.1994        
   obvious-sink-pathology-taken                     171.6553+-1.7369          170.8647+-1.9718        
   obvious-sink-pathology                           161.8836+-2.0776          161.7612+-2.4071        
   obviously-elidable-new-object                     48.7735+-0.4591           47.2975+-4.5013          might be 1.0312x faster
   plus-boolean-arith                                 3.3005+-0.1207            3.2618+-0.1643          might be 1.0119x faster
   plus-boolean-double                                4.2078+-0.1095            4.1800+-0.1123        
   plus-boolean                                       3.1017+-0.0977     ?      3.1623+-0.1242        ? might be 1.0195x slower
   poly-chain-access-different-prototypes-simple   
                                                      3.8408+-0.0528     ?      3.8747+-0.0563        ?
   poly-chain-access-different-prototypes             3.0356+-0.1390     ?      3.0720+-0.0446        ? might be 1.0120x slower
   poly-chain-access-simpler                          3.8655+-0.1029     ?      3.8673+-0.0686        ?
   poly-chain-access                                  3.1157+-0.1243     ?      3.1852+-0.0707        ? might be 1.0223x slower
   poly-stricteq                                     68.2395+-0.8655     ?     68.3802+-1.7879        ?
   polymorphic-array-call                             1.9282+-0.3314     ?      2.1507+-0.6913        ? might be 1.1154x slower
   polymorphic-get-by-id                              4.1381+-0.2315            4.1018+-0.0365        
   polymorphic-put-by-id                             39.7392+-5.1480           38.6070+-1.5256          might be 1.0293x faster
   polymorphic-structure                             21.5836+-0.5070     ?     21.8079+-0.2730        ? might be 1.0104x slower
   polyvariant-monomorphic-get-by-id                 12.0316+-0.1796     ?     12.0905+-0.4013        ?
   proto-getter-access                               12.7004+-0.5697     ?     12.9565+-0.2990        ? might be 1.0202x slower
   put-by-id-replace-and-transition                  11.6949+-0.6485           11.2933+-0.4281          might be 1.0356x faster
   put-by-id-slightly-polymorphic                     3.5920+-0.0628            3.5890+-0.1330        
   put-by-id                                         17.8708+-0.7970     ?     18.0135+-0.4763        ?
   put-by-val-direct                                  0.8674+-0.1296     ?      0.9542+-0.1297        ? might be 1.1001x slower
   put-by-val-large-index-blank-indexing-type   
                                                      8.2989+-0.4746            8.0515+-0.2998          might be 1.0307x faster
   put-by-val-machine-int                             3.3589+-0.2847     ?      3.4321+-0.1898        ? might be 1.0218x slower
   rare-osr-exit-on-local                            18.3589+-0.4516     ?     18.4135+-0.5239        ?
   register-pressure-from-osr                        26.3456+-0.3462     ?     26.6462+-0.2505        ? might be 1.0114x slower
   setter                                             6.5180+-0.1202            6.3528+-0.1910          might be 1.0260x faster
   simple-activation-demo                            30.4738+-0.4146           30.4636+-0.2942        
   simple-getter-access                              17.9427+-0.3504           17.8651+-0.3494        
   simple-poly-call-nested                           10.2398+-0.1446           10.1982+-0.1371        
   simple-poly-call                                   1.6949+-0.1380     ?      1.7061+-0.1586        ?
   sin-boolean                                       25.6973+-3.4941     ?     26.7427+-0.7091        ? might be 1.0407x slower
   sinkable-new-object-dag                           93.8912+-0.9466           93.8262+-1.1859        
   sinkable-new-object-taken                         69.4092+-4.0019     ?     71.3421+-3.1705        ? might be 1.0278x slower
   sinkable-new-object                               52.0214+-1.3884     ?     52.5345+-2.4065        ?
   slow-array-profile-convergence                     3.7615+-0.2001            3.7219+-0.3554          might be 1.0106x faster
   slow-convergence                                   4.5020+-0.2110     ?      4.5172+-0.3806        ?
   sorting-benchmark                                 27.5936+-0.4104           27.4783+-0.1325        
   sparse-conditional                                 1.6474+-0.0508     ?      1.6595+-0.1026        ?
   splice-to-remove                                  21.5546+-0.5115     ?     22.0516+-0.3950        ? might be 1.0231x slower
   string-char-code-at                               19.7233+-0.7865           19.5958+-0.4246        
   string-concat-object                               3.0151+-0.2714            2.9204+-0.1451          might be 1.0324x faster
   string-concat-pair-object                          2.9481+-0.1991     ?      3.0450+-0.2730        ? might be 1.0329x slower
   string-concat-pair-simple                         15.7651+-0.6478     ?     15.9643+-0.3087        ? might be 1.0126x slower
   string-concat-simple                              16.6523+-0.8233           16.1683+-0.6011          might be 1.0299x faster
   string-cons-repeat                                10.5820+-0.1111           10.4332+-0.2480          might be 1.0143x faster
   string-cons-tower                                 10.4177+-0.2758     ?     10.5896+-0.5630        ? might be 1.0165x slower
   string-equality                                   20.8058+-0.2457     ?     21.1415+-0.5101        ? might be 1.0161x slower
   string-get-by-val-big-char                         9.6390+-0.1830     ?      9.7580+-0.3748        ? might be 1.0123x slower
   string-get-by-val-out-of-bounds-insane             5.4898+-0.7633            5.2135+-0.1751          might be 1.0530x faster
   string-get-by-val-out-of-bounds                    6.7663+-0.2174     ?      6.7776+-0.2243        ?
   string-get-by-val                                  4.7082+-0.0323     ?      4.7654+-0.0613        ? might be 1.0121x slower
   string-hash                                        2.7150+-0.1256     ?      2.7318+-0.1723        ?
   string-long-ident-equality                        17.2205+-0.3060     ?     17.4904+-0.3058        ? might be 1.0157x slower
   string-out-of-bounds                              17.6560+-0.8377           17.4520+-0.3288          might be 1.0117x faster
   string-repeat-arith                               41.5208+-0.4424     ?     42.8428+-5.6094        ? might be 1.0318x slower
   string-sub                                        79.8868+-0.6080           79.1179+-0.9155        
   string-test                                        4.2327+-0.2434            4.2130+-0.1163        
   string-var-equality                               44.7887+-0.5200     ?     44.8480+-0.3547        ?
   structure-hoist-over-transitions                   3.3317+-0.1682     ?      3.3561+-0.1026        ?
   substring-concat-weird                            53.6677+-0.7359           53.3647+-0.3814        
   substring-concat                                  56.4335+-0.5045           55.6010+-0.3505          might be 1.0150x faster
   substring                                         63.0771+-0.5481     ^     61.2958+-0.5115        ^ definitely 1.0291x faster
   switch-char-constant                               3.3964+-0.1214     ?      3.4205+-0.1579        ?
   switch-char                                        7.9070+-0.1129     ?      7.9178+-0.0767        ?
   switch-constant                                   10.8323+-0.6342     ?     10.9756+-0.6284        ? might be 1.0132x slower
   switch-string-basic-big-var                       25.6546+-2.3879     ?     25.6681+-1.2623        ?
   switch-string-basic-big                           18.7395+-1.2329           18.1890+-0.5983          might be 1.0303x faster
   switch-string-basic-var                           26.6420+-3.8777     ?     29.2478+-2.1044        ? might be 1.0978x slower
   switch-string-basic                               19.7865+-0.5653     ?     20.0808+-1.0570        ? might be 1.0149x slower
   switch-string-big-length-tower-var                25.0721+-0.2097     ?     25.1670+-0.4176        ?
   switch-string-length-tower-var                    20.0525+-0.4586     ?     20.0748+-0.1524        ?
   switch-string-length-tower                        17.2461+-4.7688     ?     17.3765+-4.4080        ?
   switch-string-short                               14.9811+-0.7628           14.5292+-0.2552          might be 1.0311x faster
   switch                                            14.3185+-0.4458           14.2857+-0.1213        
   tear-off-arguments-simple                          4.2345+-0.0786     ?      4.3016+-0.2498        ? might be 1.0158x slower
   tear-off-arguments                                 6.1191+-0.2002            6.0267+-0.1156          might be 1.0153x faster
   temporal-structure                                17.1265+-1.0390           16.8425+-0.1970          might be 1.0169x faster
   to-int32-boolean                                  20.9338+-0.1894     ?     21.0963+-0.0602        ?
   try-catch-get-by-val-cloned-arguments             18.5468+-0.3712           18.4890+-0.7229        
   try-catch-get-by-val-direct-arguments              7.9125+-0.4297            7.6967+-0.3327          might be 1.0280x faster
   try-catch-get-by-val-scoped-arguments              9.8092+-0.3226     ?     10.0986+-1.0971        ? might be 1.0295x slower
   undefined-property-access                        471.5936+-6.9891          469.7192+-2.9550        
   undefined-test                                     4.4830+-0.2091            4.4700+-0.1649        
   unprofiled-licm                                   27.2985+-0.2025     ?     27.3394+-0.7271        ?
   varargs-call                                      18.1284+-0.3566           17.8185+-0.3277          might be 1.0174x faster
   varargs-construct-inline                          29.7478+-0.4445           29.2712+-0.3339          might be 1.0163x faster
   varargs-construct                                 42.1095+-1.0884     ?     42.4069+-0.4350        ?
   varargs-inline                                    11.1242+-0.2998     ?     11.3067+-0.1578        ? might be 1.0164x slower
   varargs-strict-mode                               12.9892+-0.1725           12.9866+-0.5872        
   varargs                                           13.1182+-0.2032           13.0960+-0.2089        
   weird-inlining-const-prop                          2.9144+-0.0554            2.9091+-0.2504        

   <geometric>                                       11.1086+-0.0598           11.0317+-0.0297          might be 1.0070x faster

                                                          143582                   Baseline                                     
AsmBench:
   bigfib.cpp                                       667.3652+-6.7131     ?    669.4995+-15.2721       ?
   cray.c                                           625.1473+-2.1646          623.4245+-4.4124        
   dry.c                                            645.6097+-15.3710    ?    651.7583+-16.1842       ?
   FloatMM.c                                        949.4996+-0.7359     ?    949.9029+-1.9270        ?
   gcc-loops.cpp                                   5829.1793+-17.0083        5819.0223+-15.1316       
   n-body.c                                        1670.8822+-2.0515         1669.2583+-4.9243        
   Quicksort.c                                      586.3422+-3.0808          585.5504+-0.5038        
   stepanov_container.cpp                          4847.9328+-15.9948        4842.5245+-21.9571       
   Towers.c                                         383.0115+-1.6648     ^    377.7148+-1.3582        ^ definitely 1.0140x faster

   <geometric>                                     1117.2167+-1.7729         1116.1209+-4.0759          might be 1.0010x faster

                                                          143582                   Baseline                                     
CompressionBench:
   huffman                                          505.6462+-5.2758     ?    509.1373+-0.9154        ?
   arithmetic-simple                                526.3203+-3.0629          525.8661+-4.1828        
   arithmetic-precise                               397.4643+-2.9002     ?    399.1771+-5.7526        ?
   arithmetic-complex-precise                       393.1271+-2.3906     ?    397.1610+-4.5703        ? might be 1.0103x slower
   arithmetic-precise-order-0                       576.5404+-11.9738    ?    577.5720+-11.5825       ?
   arithmetic-precise-order-1                       417.4902+-4.2838          416.9048+-4.7199        
   arithmetic-precise-order-2                       470.8320+-3.8804     ?    477.1902+-10.6764       ? might be 1.0135x slower
   arithmetic-simple-order-1                        526.4678+-1.6065          525.2267+-3.9598        
   arithmetic-simple-order-2                        588.8906+-1.9394          588.8398+-2.8140        
   lz-string                                        417.9329+-3.0647     ?    422.1351+-19.7194       ? might be 1.0101x slower

   <geometric>                                      477.0594+-0.3531     ?    479.0459+-3.5216        ? might be 1.0042x slower

                                                          143582                   Baseline                                     

                                                          143582                   Baseline                                     
Geomean of preferred means:
   <scaled-result>                                   62.5561+-0.0766           62.4059+-0.0914          might be 1.0024x faster
Comment 4 Geoffrey Garen 2015-04-09 16:08:12 PDT
Comment on attachment 250476 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=250476&action=review

> Source/JavaScriptCore/dfg/DFGGraph.h:246
> +#if 0
> +            add->child1()->shouldSpeculateInt32OrBoolean(),
> +            add->child2()->shouldSpeculateInt32OrBoolean(),
> +#else
>              add->child1()->shouldSpeculateInt32OrBooleanForArithmetic(),
>              add->child2()->shouldSpeculateInt32OrBooleanForArithmetic(),
> +#endif

Are you sure this is the right patch?
Comment 5 Michael Saboff 2015-04-09 16:11:09 PDT
Created attachment 250482 [details]
Patch.

Posted a debug version.  This is the real patch.
Comment 6 Geoffrey Garen 2015-04-09 16:20:37 PDT
Comment on attachment 250482 [details]
Patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=250482&action=review

> Source/JavaScriptCore/ChangeLog:9
> +        The fixup phase handling of ArithAdd and ArithSub nodes was setting them to SpeculateInt32 even
> +        when the inputs weren't numbers.  In this case SpecObjectOther.  Changed to use

Is it appropriate to have an ArithAdd or ArithSub in the graph with a SpecObjectOther input? My understanding was that ArithAdd was limited to adds that were known to do arithmetic. But an object might produce a string, causing non-arithmetic. It seems like we should not have produced an ArithAdd or not have produced a SpecObjectOther in the first place.

If you read AbstractInterpreter::executeEffects(), you'll see that it ASSERTs if the input to ArithAdd or ArithSub is not int32, int52, or double. Why does that ASSERT not fire in this case (with or without your patch)?
Comment 7 Michael Saboff 2015-04-09 17:16:42 PDT
(In reply to comment #6)
> Comment on attachment 250482 [details]
> Patch.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=250482&action=review
> 
> > Source/JavaScriptCore/ChangeLog:9
> > +        The fixup phase handling of ArithAdd and ArithSub nodes was setting them to SpeculateInt32 even
> > +        when the inputs weren't numbers.  In this case SpecObjectOther.  Changed to use
> 
> Is it appropriate to have an ArithAdd or ArithSub in the graph with a
> SpecObjectOther input? My understanding was that ArithAdd was limited to
> adds that were known to do arithmetic. But an object might produce a string,
> causing non-arithmetic. It seems like we should not have produced an
> ArithAdd or not have produced a SpecObjectOther in the first place.
> 
> If you read AbstractInterpreter::executeEffects(), you'll see that it
> ASSERTs if the input to ArithAdd or ArithSub is not int32, int52, or double.
> Why does that ASSERT not fire in this case (with or without your patch)?

Without my patch, the Fixup Phase set the use kids for both children to Int32.  Here is the node as it appears after Mixup (which is before any phase that uses the abstract interpreter):
  82:< 1:->     ArithSub(Check:Int32:@78, Check:Int32:@81, Number|UseAsOther|IsFlushed, Int32, CheckOverflow, bc#108)

After my patch, the Fixup Phase set the use kind for both children to DoubleRep and inserted DoubleRep nodes:
 169:< 1:->     DoubleRep(Check:Number:@78, Double|PureInt, Bytecodedouble, bc#108)
 170:< 1:->     DoubleRep(Check:Number:@81, Double|PureInt, Bytecodedouble, bc#108)
  82:< 1:->     ArithSub(Check:DoubleRep:@169<Double>, Check:DoubleRep:@170<Double>, Double|UseAsOther|IsFlushed, Bytecodedouble

In both cases we ORS exit due to a bad type.  Before the fix, the ArithSub node's check did the OSR exit.  After the fix, the first DoubleRep node did the OSR exit.
Comment 8 Geoffrey Garen 2015-04-09 17:49:57 PDT
It seems valid for an arithmetic node to speculate int32 for an object. The point is that the node can't handle the object, so it might as well OSR exit with an int32 check -- that's just as good as OSR exiting with a double check, or any other check.

This bug seems more low-level to me: Once the compiler decides the speculate int32, why does the back-end ASSERT? And why does it only ASSERT in the 32_64 back-end? I think the answer must lie there.
Comment 9 Geoffrey Garen 2015-04-09 17:52:00 PDT
In general, the compiler should be able to speculate whatever it wants, even if suboptimal, without getting incorrect codegen. The whole point of speculation is that you're allowed to be wrong without crashing.
Comment 10 Michael Saboff 2015-04-09 18:16:24 PDT
Comment on attachment 250482 [details]
Patch.

This patch "fixes" the issue because we convert the Object to a Double, thus silencing the ASSERT.  We really need to see why we are spilling DataFormatCell on 32 bit and DataFormatJS on 64 bit.  Then go from there to address the spill / fill mismatch.
Comment 11 Michael Saboff 2015-04-10 15:06:19 PDT
I have a reduced test case:
function benchmark(f, array)
{
    var start = new Date;
    f.call(array);
    var end = new Date;

    return end - start;
}

f is Array.prototype.sort.  Replacing the f.call with array.sort() and the problem doesn't reproduce.

The relevant parts of graph at byte code parsing is:
Block #0 (bc#0):
   ...
  26:<!0:->     Construct(@21, @21, JS|MustGen|VarArgs|PureInt, R:World, W:Heap, bc#21)  predicting Otherobj
  27:< 1:->     MovHint(@26, loc1, W:SideState, bc#21)
  28:< 1:->     SetLocal(@26, loc1(K~/FlushedJSValue), W:Stack(-2), bc#21, exit: bc#30)  predicting None
   ...

Block #3 (bc#75):
   ...
  69:<!0:->     Construct(@64, @64, JS|MustGen|VarArgs|PureInt, R:World, W:Heap, bc#93)  predicting Otherobj
   ...
  72:< 1:->     GetLocal(JS|PureInt, loc1(BB~/FlushedJSValue), R:Stack(-2), bc#102)  predicting None
  73:< 1:->     ArithSub(@69, @72, Number|PureInt, NotSet, bc#102)
   ...

It appears that the CFG needs to modify the graph for the problem to reproduce.  After control flow simplification:

Block #0 (bc#0):
   ...
  26:<!0:->     Construct(@21, @21, JS|MustGen|VarArgs|UseAsOther, Otherobj, R:World, W:Heap, bc#21)  predicting Otherobj
  27:< 1:->     MovHint(@26, loc1, W:SideState, bc#21)
  87:<!0:->     InvalidationPoint(MustGen, W:SideState, bc#21, exit: bc#30)
  28:< 1:->     SetLocal(Check:Cell:@26, loc1(K<Object>/FlushedCell), W:Stack(-2), bc#21, exit: bc#30)  predicting Otherobj
   ...
  69:<!0:->     Construct(@64, @64, JS|MustGen|VarArgs|UseAsOther, Otherobj, R:World, W:Heap, bc#93)  predicting Otherobj
   ...
exit: bc#102)  predicting Otherobj
  72:< 1:->     GetLocal(@79, JS|UseAsOther, Otherobj, loc1(K<Object>/FlushedCell), R:Stack(-2), bc#102)  predicting Otherobj
  73:< 1:->     ArithSub(Check:Int32:@69, Check:Int32:@72, Number|UseAsOther, Int32, CheckOverflow, bc#102)

The subsequent local CSE phase removes the 72:GetLocal and replaced with 26.  The flush format though is established by the SetLocal for node 26 which is node 28.  

During the mixup phase, the flush format is determined for SetLocals.  For 32 bit builds, the flush format for node 28 is:
  28:< 1:->     SetLocal(Check:Cell:@26, loc1(K<Object>/FlushedCell), W:Stack(-2), bc#21, exit: bc#30)  predicting Otherobj
For 64 bit builds, the same node's flush format is:
  28:< 1:->     SetLocal(@26, loc1(K~<Object>/FlushedJSValue), W:Stack(-2), bc#21, exit: bc#30)  predicting Otherobj

Later, the SetLocal is eliminated, but the flush format is recorded in the GenerationInfo for the corresponding VirtualRegister.

More debugging.
Comment 12 Michael Saboff 2015-04-10 15:23:51 PDT
The bug is due to the Fixup phase.  Specifically the code:

    bool alwaysUnboxSimplePrimitives()
    {
#if USE(JSVALUE64)
        return false;
#else
        // Any boolean, int, or cell value is profitable to unbox on 32-bit because it
        // reduces traffic.
        return true;
#endif
    }

The function observeUseKindOnNode() in that class uses alwaysUnboxSimplePrimitives() to decide whether or not it is profitable to unbox.  For 32 bit we'll unbox and 64 bit we don't (because we don't need to).
Comment 13 Michael Saboff 2015-04-10 16:31:03 PDT
Created attachment 250539 [details]
Updated Patch
Comment 14 Mark Lam 2015-04-10 16:47:10 PDT
Comment on attachment 250539 [details]
Updated Patch

r=me
Comment 15 Michael Saboff 2015-04-10 17:01:35 PDT
Committed r182643: <http://trac.webkit.org/changeset/182643>
Comment 16 Michael Saboff 2015-04-10 18:17:57 PDT
Discussed this with Geoff and he questioned the validity of adding this hard OSR exit for the case where we have a DataFormatCell spill format.  He requested that I research the intent of the ASSERT_UNUSED(spillFormat, (spillFormat & DataFormatJS) || spillFormat == DataFormatInt32);

That ASSERT was added in change set <http://trac.webkit.org/changeset/103792> when changes were made to spill unboxed values in DFG 32_64.  That change was to make the 32_64 patch similar to the DFG 64 path.  In addition to spilling unboxed integers and booleans just like the DFG 64 path, it also could spill unboxed Cells.

The DFG 64 change set that allowed for the spilling of unboxed integers and doubles was <http://trac.webkit.org/changeset/97118>.  In that change set the related ASSERT firing for this bug was ASSERT(spillFormat & DataFormatJS) and an additional check allowing for the format to also be DataFormatInteger was added.  That ASSERT appears to have been copied to change set 103792 without any consideration for unboxed Cell values, which are unique to 32_64.

Prior to both of those changes, both DFG 64 and DFG 32_64 the fillSpeculateInteger code only ASSERTed that the spill format was DataFormatJS and did not check for other formats.
Comment 17 Geoffrey Garen 2015-04-13 13:46:53 PDT
Comment on attachment 250539 [details]
Updated Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=250539&action=review

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:872
> +        if (spillFormat == DataFormatCell) {

Why don't we need to check DataFormatBoolean here?

Why doesn't SpeculativeJIT::fillSpeculateCell need to check DataFormatBoolean?
Comment 18 Geoffrey Garen 2015-04-13 14:01:17 PDT
When was this bug introduced? r182567 is when this test was introduced, not this mismatch between spill and fill type.

If your theory is correct -- that this bug was a simple copy-paste error in r103792 -- then you should be able to reproduce this ASSERT as far back as r103792. Please try that experiment and report your result.

In bug 143645, we have a similar ASSERT in the 64bit JIT because something spilled in an unexpected format. Bug 143645 contradicts your theory that this problem is 32-bit only -- unless somehow the mismatch in spill and fill type in bug 143645 is a distinct problem, despite its superficial similarity.

We have ASSERTs throughout the compiler that say, "If I'm trying to fill as X, then ASSERT the spilled type is not Y". These ASSERTs seem to conflict with the behavior you noticed, in which the optimizer eliminated a SetLocal that was the only thing ensuring "not type Y".

If "not type Y" is an invariant in the compiler, can you explain how SetLocal elimination avoids violating the "not type Y" invariant when it eliminates a SetLocal that changes type? If you can't explain this, then we haven't solved this bug.
Comment 19 Michael Saboff 2015-04-13 18:50:33 PDT
(In reply to comment #18)
> When was this bug introduced? r182567 is when this test was introduced, not
> this mismatch between spill and fill type.
> 
> If your theory is correct -- that this bug was a simple copy-paste error in
> r103792 -- then you should be able to reproduce this ASSERT as far back as
> r103792. Please try that experiment and report your result.

The farthest that I have gone back is r160706 (12/17/13), the oldest archive build I know of, and the bug still happens with my reduced test case:
DYLD_FRAMEWORK_PATH=. arch -i386 ./jsc --thresholdForJITAfterWarmUp\=10 --thresholdForJITSoon\=10 --thresholdForOptimizeAfterWarmUp\=20 --thresholdForOptimizeAfterLongWarmUp\=20 --thresholdForOptimizeSoon\=20 --thresholdForFTLOptimizeAfterWarmUp\=20 --thresholdForFTLOptimizeSoon\=20 ../reduced-143582.js
ASSERTION FAILED: (spillFormat & DataFormatJS) || spillFormat == DataFormatInt32
/Volumes/Data/slave/syrah-debug-archive/build/OpenSource/Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp(729) : GPRReg JSC::DFG::SpeculativeJIT::fillSpeculateInt32Internal(JSC::DFG::Edge, JSC::DataFormat &) [strict = false]
1   0x78742d WTFCrash
2   0x365961 JSC::X86Registers::RegisterID JSC::DFG::SpeculativeJIT::fillSpeculateInt32Internal<false>(JSC::DFG::Edge, JSC::DataFormat&)
3   0x342b62 JSC::DFG::SpeculativeJIT::fillSpeculateInt32(JSC::DFG::Edge, JSC::DataFormat&)
4   0x3157a4 JSC::DFG::SpeculateInt32Operand::gpr()
5   0x303ca5 JSC::DFG::SpeculativeJIT::compileArithSub(JSC::DFG::Node*)
6   0x34cf2f JSC::DFG::SpeculativeJIT::compile(JSC::DFG::Node*)
7   0x2f98d4 JSC::DFG::SpeculativeJIT::compileCurrentBlock()
8   0x2fa1e2 JSC::DFG::SpeculativeJIT::compile()
9   0x27ee40 JSC::DFG::JITCompiler::compileBody()
10  0x280f3c JSC::DFG::JITCompiler::compileFunction()
11  0x2e9aa2 JSC::DFG::Plan::compileInThreadImpl(JSC::DFG::LongLivedState&)
12  0x2e90f5 JSC::DFG::Plan::compileInThread(JSC::DFG::LongLivedState&)
13  0x23c114 JSC::DFG::compileImpl(JSC::VM&, JSC::CodeBlock*, JSC::DFG::CompilationMode, unsigned int, JSC::Operands<JSC::JSValue, JSC::OperandValueTraits<JSC::JSValue> > const&, WTF::PassRefPtr<JSC::DeferredCompilationCallback>, JSC::DFG::Worklist*)
14  0x23badc JSC::DFG::compile(JSC::VM&, JSC::CodeBlock*, JSC::DFG::CompilationMode, unsigned int, JSC::Operands<JSC::JSValue, JSC::OperandValueTraits<JSC::JSValue> > const&, WTF::PassRefPtr<JSC::DeferredCompilationCallback>, JSC::DFG::Worklist*)
15  0x430f2c operationOptimize
16  0x1bf9038
Segmentation fault: 11

Looking to see if we have older archived builds.
 
> In bug 143645, we have a similar ASSERT in the 64bit JIT because something
> spilled in an unexpected format. Bug 143645 contradicts your theory that
> this problem is 32-bit only -- unless somehow the mismatch in spill and fill
> type in bug 143645 is a distinct problem, despite its superficial similarity.
> 
> We have ASSERTs throughout the compiler that say, "If I'm trying to fill as
> X, then ASSERT the spilled type is not Y". These ASSERTs seem to conflict
> with the behavior you noticed, in which the optimizer eliminated a SetLocal
> that was the only thing ensuring "not type Y".
> 
> If "not type Y" is an invariant in the compiler, can you explain how
> SetLocal elimination avoids violating the "not type Y" invariant when it
> eliminates a SetLocal that changes type? If you can't explain this, then we
> haven't solved this bug.
Comment 20 Michael Saboff 2015-04-14 07:53:37 PDT
(In reply to comment #19)
> (In reply to comment #18)
> > When was this bug introduced? r182567 is when this test was introduced, not
> > this mismatch between spill and fill type.
> > 
> > If your theory is correct -- that this bug was a simple copy-paste error in
> > r103792 -- then you should be able to reproduce this ASSERT as far back as
> > r103792. Please try that experiment and report your result.
> 
> The farthest that I have gone back is r160706 (12/17/13), the oldest archive
> build I know of, and the bug still happens with my reduced test case:
> DYLD_FRAMEWORK_PATH=. arch -i386 ./jsc --thresholdForJITAfterWarmUp\=10
> --thresholdForJITSoon\=10 --thresholdForOptimizeAfterWarmUp\=20
> --thresholdForOptimizeAfterLongWarmUp\=20 --thresholdForOptimizeSoon\=20
> --thresholdForFTLOptimizeAfterWarmUp\=20 --thresholdForFTLOptimizeSoon\=20
> ../reduced-143582.js
> ASSERTION FAILED: (spillFormat & DataFormatJS) || spillFormat ==
> DataFormatInt32
> /Volumes/Data/slave/syrah-debug-archive/build/OpenSource/Source/
> JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp(729) : GPRReg
> JSC::DFG::SpeculativeJIT::fillSpeculateInt32Internal(JSC::DFG::Edge,
> JSC::DataFormat &) [strict = false]
> 1   0x78742d WTFCrash
> 2   0x365961 JSC::X86Registers::RegisterID
> JSC::DFG::SpeculativeJIT::fillSpeculateInt32Internal<false>(JSC::DFG::Edge,
> JSC::DataFormat&)
> 3   0x342b62 JSC::DFG::SpeculativeJIT::fillSpeculateInt32(JSC::DFG::Edge,
> JSC::DataFormat&)
> 4   0x3157a4 JSC::DFG::SpeculateInt32Operand::gpr()
> 5   0x303ca5 JSC::DFG::SpeculativeJIT::compileArithSub(JSC::DFG::Node*)
> 6   0x34cf2f JSC::DFG::SpeculativeJIT::compile(JSC::DFG::Node*)
> 7   0x2f98d4 JSC::DFG::SpeculativeJIT::compileCurrentBlock()
> 8   0x2fa1e2 JSC::DFG::SpeculativeJIT::compile()
> 9   0x27ee40 JSC::DFG::JITCompiler::compileBody()
> 10  0x280f3c JSC::DFG::JITCompiler::compileFunction()
> 11  0x2e9aa2 JSC::DFG::Plan::compileInThreadImpl(JSC::DFG::LongLivedState&)
> 12  0x2e90f5 JSC::DFG::Plan::compileInThread(JSC::DFG::LongLivedState&)
> 13  0x23c114 JSC::DFG::compileImpl(JSC::VM&, JSC::CodeBlock*,
> JSC::DFG::CompilationMode, unsigned int, JSC::Operands<JSC::JSValue,
> JSC::OperandValueTraits<JSC::JSValue> > const&,
> WTF::PassRefPtr<JSC::DeferredCompilationCallback>, JSC::DFG::Worklist*)
> 14  0x23badc JSC::DFG::compile(JSC::VM&, JSC::CodeBlock*,
> JSC::DFG::CompilationMode, unsigned int, JSC::Operands<JSC::JSValue,
> JSC::OperandValueTraits<JSC::JSValue> > const&,
> WTF::PassRefPtr<JSC::DeferredCompilationCallback>, JSC::DFG::Worklist*)
> 15  0x430f2c operationOptimize
> 16  0x1bf9038
> Segmentation fault: 11
> 
> Looking to see if we have older archived builds.

Found archives for older Mac OS release and was able to find the change that was responsible for this bug: http://trac.webkit.org/changeset/144862 - from <https://bugs.webkit.org/show_bug.cgi?id=109389> - "DFG DCE might eliminate checks unsoundly"

> > In bug 143645, we have a similar ASSERT in the 64bit JIT because something
> > spilled in an unexpected format. Bug 143645 contradicts your theory that
> > this problem is 32-bit only -- unless somehow the mismatch in spill and fill
> > type in bug 143645 is a distinct problem, despite its superficial similarity.
> > 
> > We have ASSERTs throughout the compiler that say, "If I'm trying to fill as
> > X, then ASSERT the spilled type is not Y". These ASSERTs seem to conflict
> > with the behavior you noticed, in which the optimizer eliminated a SetLocal
> > that was the only thing ensuring "not type Y".
> > 
> > If "not type Y" is an invariant in the compiler, can you explain how
> > SetLocal elimination avoids violating the "not type Y" invariant when it
> > eliminates a SetLocal that changes type? If you can't explain this, then we
> > haven't solved this bug.

The change responsible for this issues changed the way that dead SetLocals are handled.
Comment 21 Michael Saboff 2015-04-14 14:11:05 PDT
rdar://problem/20507178
Comment 22 Michael Saboff 2015-04-14 14:20:31 PDT
Geoff, Phil and I discussed this bug and believe it is an appropriate fix to verify the spill format is consistent with the requested fill format.  This patch however only addresses one case.  Filed <https://bugs.webkit.org/show_bug.cgi?id=143727> - "DFG register fillSpeculate*() functions should validate that the incoming spill format is compatible with the requested fill format".