Bug 113374 - DFG should use CheckStructure for typed array checks whenever possible
Summary: DFG should use CheckStructure for typed array checks whenever possible
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-03-27 01:51 PDT by Filip Pizlo
Modified: 2013-03-27 11:00 PDT
CC List: 7 users

See Also:


Attachments
Patch (3.89 KB, patch)
2013-03-27 01:53 PDT, Filip Pizlo
no flags
Patch (3.95 KB, patch)
2013-03-27 01:55 PDT, Filip Pizlo
ggaren: review+

Description Filip Pizlo 2013-03-27 01:51:23 PDT
DFG should use CheckStructure for typed array checks whenever possible
Comment 1 Filip Pizlo 2013-03-27 01:53:31 PDT
Created attachment 195247 [details]
Patch
Comment 2 Filip Pizlo 2013-03-27 01:55:40 PDT
Created attachment 195248 [details]
Patch
Comment 3 Geoffrey Garen 2013-03-27 09:20:44 PDT
Comment on attachment 195248 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=195248&action=review

r=me

Is this a speedup?

> Source/JavaScriptCore/dfg/DFGArrayMode.h:353
> +            // It might benefit from structure checks! If it ends up not benefiting, we can just
> +            // remove it.

Would be nice to clarify that FixupPhase will remove it.
Comment 4 Filip Pizlo 2013-03-27 10:00:34 PDT
(In reply to comment #3)
> (From update of attachment 195248 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=195248&action=review
> 
> r=me
> 
> Is this a speedup?

Ever so slight.  3% on Mandreel I think.

The fact that it's not more of a speedup is a pretty good hint that our typed array support sucks.  Normally, removing a dependent load from a speculation branch is a huge deal, so this tells me that we have a large bottleneck somewhere else in those Octane programs.
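
To make the "dependent load" point concrete, here is a minimal sketch of the two kinds of check. It is a simplified model, not JavaScriptCore's code: the types ClassInfo, Structure and Cell and the functions checkByClassInfo and checkByStructure are hypothetical stand-ins invented for illustration. The first check has to load the structure and then load a field out of it before it can compare; the second compares the structure pointer directly.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; these are NOT JavaScriptCore's real types.
struct ClassInfo { const char* name; };
struct Structure { const ClassInfo* classInfo; };
struct Cell { const Structure* structure; };

// Class-info style check (assumed shape of the older check):
// the second load cannot start until the first one has produced its result.
inline bool checkByClassInfo(const Cell* cell, const ClassInfo* expected)
{
    assert(cell && cell->structure);
    const Structure* structure = cell->structure; // load 1
    return structure->classInfo == expected;      // load 2, dependent on load 1
}

// Structure style check: a single load followed by a pointer comparison.
inline bool checkByStructure(const Cell* cell, const Structure* expected)
{
    assert(cell);
    return cell->structure == expected;           // load 1 only
}
```

In this model the structure comparison is the stricter of the two: two distinct Structure objects may share one ClassInfo, so a cell can pass checkByClassInfo and still fail checkByStructure. That is one plausible reading of "whenever possible" in the bug title, stated here as an assumption rather than as the patch's actual criterion.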

> 
> > Source/JavaScriptCore/dfg/DFGArrayMode.h:353
> > +            // It might benefit from structure checks! If it ends up not benefiting, we can just
> > +            // remove it.
> 
> Would be nice to clarify that FixupPhase will remove it.

Will do.
Comment 5 Filip Pizlo 2013-03-27 10:01:30 PDT
The performance:

Benchmark report for SunSpider, V8Spider, Octane, Kraken, JSBench, JSRegress, and DSP on bigmac (MacPro5,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quartary/OpenSource/WebKitBuild/Release/DumpRenderTree (r146946)
"FixCheckArray" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/DumpRenderTree (r146946)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get
microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.
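
For readers unfamiliar with this report format, the following is a rough sketch of how a "mean+-interval" cell and a "might be"/"definitely" annotation could be derived. The report does not say how the harness computes either, so everything here is an assumption: the sketch uses a Student's t interval (2.201 is the two-sided 95% value for 12 samples), treats "definitely" as "the two intervals do not overlap", and the names Measurement, summarize, intervalsOverlap, speedRatio and verdict are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical summary of one benchmark/VM cell: mean and 95% half-width, in ms.
struct Measurement {
    double mean;
    double halfWidth;
};

// Summarize samples as mean +- t * s / sqrt(n), with s the sample standard deviation.
// tCritical defaults to 2.201: two-sided 95% Student's t for 11 degrees of freedom.
inline Measurement summarize(const std::vector<double>& samples, double tCritical = 2.201)
{
    assert(samples.size() >= 2);
    const double n = static_cast<double>(samples.size());
    double sum = 0;
    for (double sample : samples)
        sum += sample;
    const double mean = sum / n;
    double squaredDeviations = 0;
    for (double sample : samples)
        squaredDeviations += (sample - mean) * (sample - mean);
    const double standardDeviation = std::sqrt(squaredDeviations / (n - 1));
    return { mean, tCritical * standardDeviation / std::sqrt(n) };
}

// True when [mean - halfWidth, mean + halfWidth] of a and b share any point.
inline bool intervalsOverlap(const Measurement& a, const Measurement& b)
{
    return a.mean - a.halfWidth <= b.mean + b.halfWidth
        && b.mean - b.halfWidth <= a.mean + a.halfWidth;
}

// Ratio of the larger mean to the smaller one, so the result is always >= 1.
inline double speedRatio(const Measurement& baseline, const Measurement& changed)
{
    assert(baseline.mean > 0 && changed.mean > 0);
    return baseline.mean >= changed.mean ? baseline.mean / changed.mean
                                         : changed.mean / baseline.mean;
}

// Assumed rule: overlapping intervals give "might be", disjoint ones "definitely".
inline std::string verdict(const Measurement& baseline, const Measurement& changed)
{
    const std::string direction = changed.mean < baseline.mean ? "faster" : "slower";
    return (intervalsOverlap(baseline, changed) ? "might be " : "definitely ") + direction;
}
```

Applied to the mandreel row below (153.68588+-0.44083 versus 149.09780+-0.53932), the intervals are disjoint and the ratio rounds to 1.0308, which agrees with the "definitely 1.0308x faster" annotation. Under the same rule the V8Spider raytrace row has overlapping intervals and comes out as "might be". The report only prints a "might be" ratio on some overlapping rows, and the sketch does not try to model that cutoff.
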

                                                     TipOfTree               FixCheckArray                                   
SunSpider:
   3d-cube                                         8.8935+-0.2851            8.8270+-0.2959        
   3d-morph                                        7.4010+-0.0326     ?      7.4507+-0.0574        ?
   3d-raytrace                                     9.7987+-0.1107     ?      9.8020+-0.1477        ?
   access-binary-trees                             2.2594+-0.3235     ?      2.3364+-0.3003        ? might be 1.0341x slower
   access-fannkuch                                 6.5606+-0.1040     ?      6.5868+-0.1090        ?
   access-nbody                                    3.9516+-0.0574            3.9206+-0.0446        
   access-nsieve                                   4.2525+-0.0365            4.1985+-0.0541          might be 1.0129x faster
   bitops-3bit-bits-in-byte                        1.5219+-0.0140     ?      1.5294+-0.0256        ?
   bitops-bits-in-byte                             5.6379+-0.0506            5.6170+-0.0409        
   bitops-bitwise-and                              2.0445+-0.0668            2.0378+-0.0809        
   bitops-nsieve-bits                              3.3617+-0.0231     ?      3.3649+-0.0231        ?
   controlflow-recursive                           2.4874+-0.0158            2.4767+-0.0154        
   crypto-aes                                      7.2108+-0.3071     ?      7.3705+-0.2444        ? might be 1.0221x slower
   crypto-md5                                      3.6250+-0.0613            3.5911+-0.0849        
   crypto-sha1                                     2.8992+-0.0194            2.8944+-0.0214        
   date-format-tofte                              13.9241+-0.9335           13.9184+-0.9043        
   date-format-xparb                               9.0812+-0.6642     ?      9.2028+-0.6406        ? might be 1.0134x slower
   math-cordic                                     3.3064+-0.0146     ?      3.3106+-0.0222        ?
   math-partial-sums                              10.0759+-0.0396     ?     10.0760+-0.0565        ?
   math-spectral-norm                              2.6628+-0.0093            2.6593+-0.0232        
   regexp-dna                                      9.9260+-0.4735     ?      9.9614+-0.4756        ?
   string-base64                                   4.8051+-0.4941            4.7716+-0.4773        
   string-fasta                                    9.6569+-0.1867     ?      9.6863+-0.2252        ?
   string-tagcloud                                12.2864+-0.3021           12.1361+-0.1950          might be 1.0124x faster
   string-unpack-code                             24.7499+-0.5160     ?     24.7795+-0.4921        ?
   string-validate-input                           8.1722+-0.2510            8.1406+-0.2744        

   <arithmetic> *                                  6.9443+-0.1535     ?      6.9479+-0.1541        ? might be 1.0005x slower
   <geometric>                                     5.5444+-0.1117     ?      5.5491+-0.1010        ? might be 1.0008x slower
   <harmonic>                                      4.4374+-0.0906     ?      4.4447+-0.0678        ? might be 1.0017x slower

                                                     TipOfTree               FixCheckArray                                   
V8Spider:
   crypto                                         76.0600+-0.4779     ?     76.1200+-0.5103        ?
   deltablue                                     105.5310+-0.6358          105.4635+-0.4398        
   earley-boyer                                   70.9853+-0.7728     ?     71.4675+-0.6405        ?
   raytrace                                       54.7600+-3.9454           52.6467+-2.9982          might be 1.0401x faster
   regexp                                         84.1135+-0.3307     ?     84.1466+-0.3309        ?
   richards                                      100.1076+-1.3108           99.9817+-1.3541        
   splay                                          52.6312+-2.7335           52.2827+-2.5782        

   <arithmetic>                                   77.7412+-0.6585           77.4441+-0.6387          might be 1.0038x faster
   <geometric> *                                  75.2952+-0.8352           74.8926+-0.7881          might be 1.0054x faster
   <harmonic>                                     72.8292+-1.0248           72.3118+-0.9492          might be 1.0072x faster

                                                     TipOfTree               FixCheckArray                                   
Octane and V8v7:
   encrypt                                        0.40878+-0.00120    ?     0.40909+-0.00117       ?
   decrypt                                        7.39074+-0.00736    ?     7.40187+-0.00860       ?
   deltablue                             x2       0.50263+-0.00378    ?     0.50427+-0.00515       ?
   earley                                         0.77283+-0.00785          0.76406+-0.00258         might be 1.0115x faster
   boyer                                         10.77485+-0.03677    ?    10.77693+-0.03666       ?
   raytrace                              x2       3.87482+-0.02736          3.86303+-0.03085       
   regexp                                x2      26.25312+-0.06062    ?    26.25993+-0.12607       ?
   richards                              x2       0.25894+-0.00231    ?     0.25900+-0.00227       ?
   splay                                 x2       0.55872+-0.01695          0.55272+-0.00468         might be 1.0109x faster
   navier-stokes                         x2       9.04093+-0.00735    ?     9.04797+-0.00802       ?
   closure                                        0.25928+-0.03253    ?     0.25938+-0.03323       ?
   jquery                                         3.72531+-0.44150    ?     3.73189+-0.44281       ?
   gbemu                                 x2     120.32896+-5.62006    ?   122.16335+-7.34906       ? might be 1.0152x slower
   mandreel                              x2     153.68588+-0.44083    ^   149.09780+-0.53932       ^ definitely 1.0308x faster
   pdfjs                                 x2      93.86436+-0.29899         93.39331+-0.21015       
   box2d                                 x2      30.68463+-0.09211    ?    30.69007+-0.18309       ?

V8v7:
   <arithmetic>                                   6.27035+-0.00967    ?     6.27036+-0.01644       ? might be 1.0000x slower
   <geometric> *                                  2.06844+-0.00876          2.06507+-0.00473         might be 1.0016x faster
   <harmonic>                                     0.79716+-0.00580          0.79583+-0.00359         might be 1.0017x faster

Octane including V8v7:
   <arithmetic>                                  34.67068+-0.41751         34.42331+-0.54537         might be 1.0072x faster
   <geometric> *                                  6.12919+-0.05100          6.11341+-0.05997         might be 1.0026x faster
   <harmonic>                                     1.06399+-0.02116          1.06251+-0.02276         might be 1.0014x faster

                                                     TipOfTree               FixCheckArray                                   
Kraken:
   ai-astar                                       438.090+-2.918      ?     438.115+-2.797         ?
   audio-beat-detection                           208.695+-1.171      ?     209.030+-1.482         ?
   audio-dft                                      259.583+-2.198      ?     263.919+-5.981         ? might be 1.0167x slower
   audio-fft                                      121.738+-0.170            121.725+-0.179         
   audio-oscillator                               210.933+-0.297      !     211.693+-0.377         ! definitely 1.0036x slower
   imaging-darkroom                               244.030+-1.152            242.926+-0.831         
   imaging-desaturate                             133.527+-0.197            133.414+-0.075         
   imaging-gaussian-blur                          414.379+-0.255            414.339+-0.184         
   json-parse-financial                            69.654+-0.183      ^      67.657+-0.293         ^ definitely 1.0295x faster
   json-stringify-tinderbox                        83.420+-0.215      ?      83.537+-0.221         ?
   stanford-crypto-aes                            100.388+-0.359      ?     100.597+-1.038         ?
   stanford-crypto-ccm                             97.019+-0.222      ?      98.013+-0.839         ? might be 1.0102x slower
   stanford-crypto-pbkdf2                         228.156+-1.828      ?     229.235+-1.069         ?
   stanford-crypto-sha256-iterative               104.600+-0.688            104.284+-0.192         

   <arithmetic> *                                 193.872+-0.300      ?     194.177+-0.566         ? might be 1.0016x slower
   <geometric>                                    165.025+-0.203      ?     165.049+-0.430         ? might be 1.0001x slower
   <harmonic>                                     142.216+-0.136            141.880+-0.349           might be 1.0024x faster

                                                     TipOfTree               FixCheckArray                                   
JSBench:
   amazon                                          7.0833+-0.1834            7.0833+-0.1834        
   facebook                                       33.8333+-1.7312           33.6667+-1.5875        
   google                                         67.2500+-1.5356           67.0833+-1.6581        
   twitter                                         8.6667+-0.3128     ?      8.7500+-0.2874        ?
   yahoo                                           3.1667+-0.3668     ?      3.2500+-0.2874        ? might be 1.0263x slower

   <arithmetic> *                                 24.0000+-0.6658           23.9667+-0.6268          might be 1.0014x faster
   <geometric>                                    13.4208+-0.4344     ?     13.5075+-0.2332        ? might be 1.0065x slower
   <harmonic>                                      8.0400+-0.5160     ?      8.1802+-0.3203        ? might be 1.0174x slower

                                                     TipOfTree               FixCheckArray                                   
JSRegress:
   adapt-to-double-divide                         18.5269+-0.0168     ?     18.5381+-0.0240        ?
   aliased-arguments-getbyval                      0.7946+-0.0099     ?      0.8005+-0.0155        ?
   allocate-big-object                             3.4880+-1.1945     ?      3.5560+-1.2005        ? might be 1.0195x slower
   arity-mismatch-inlining                         0.6659+-0.0037     ?      0.6668+-0.0059        ?
   array-access-polymorphic-structure              7.7405+-1.8248     ?      7.8569+-1.8194        ? might be 1.0150x slower
   array-with-double-add                           4.7655+-0.0104     ?      4.8023+-0.0393        ?
   array-with-double-increment                     3.3107+-0.0693            3.2598+-0.0228          might be 1.0156x faster
   array-with-double-mul-add                       6.4914+-0.0469            6.4901+-0.0338        
   array-with-double-sum                           6.4108+-0.0150     ?      6.4254+-0.0269        ?
   array-with-int32-add-sub                        8.5685+-0.0440     ?      8.6337+-0.0525        ?
   array-with-int32-or-double-sum                  6.5172+-0.0558            6.5085+-0.0261        
   big-int-mul                                     4.0612+-0.0805            4.0328+-0.0375        
   boolean-test                                    3.5306+-0.0328            3.5099+-0.0225        
   cast-int-to-double                             11.3864+-0.0790     ?     11.4124+-0.0941        ?
   cell-argument                                  11.8642+-0.0233           11.8565+-0.0116        
   cfg-simplify                                    3.1394+-0.0485     ?      3.1677+-0.0351        ?
   cmpeq-obj-to-obj-other                          9.0941+-0.1769     !      9.3403+-0.0584        ! definitely 1.0271x slower
   constant-test                                   7.0125+-0.0708            6.9930+-0.0629        
   direct-arguments-getbyval                       0.7431+-0.0105            0.7302+-0.0041          might be 1.0177x faster
   double-pollution-getbyval                       8.8049+-0.0363     ?      8.8060+-0.0449        ?
   double-pollution-putbyoffset                    4.8856+-0.5742            4.7551+-0.5578          might be 1.0274x faster
   empty-string-plus-int                          11.4245+-0.4981           11.3935+-0.4992        
   external-arguments-getbyval                     2.2195+-0.1851     ?      2.2243+-0.1790        ?
   external-arguments-putbyval                     3.4085+-0.3114            3.2462+-0.3064          might be 1.0500x faster
   Float32Array-matrix-mult                       13.9726+-0.9940           13.7838+-0.9240          might be 1.0137x faster
   fold-double-to-int                             18.2216+-0.1566           18.1625+-0.2114        
   function-dot-apply                              2.6072+-0.0177            2.6011+-0.0097        
   function-test                                   4.0251+-0.0337            4.0090+-0.0340        
   get-by-id-chain-from-try-block                  6.1532+-0.0153            6.1517+-0.0582        
   HashMap-put-get-iterate-keys                   72.0772+-0.9429           71.9648+-0.8525        
   HashMap-put-get-iterate                        74.2234+-0.7040           74.2198+-0.6123        
   HashMap-string-put-get-iterate                 66.2650+-0.8843           66.0407+-1.0295        
   indexed-properties-in-objects                   3.6560+-0.0101     ?      3.6609+-0.0163        ?
   inline-arguments-access                         1.0762+-0.0141     ?      1.0801+-0.0246        ?
   inline-arguments-local-escape                  21.2695+-0.1514     ?     22.3062+-1.8114        ? might be 1.0487x slower
   inline-get-scoped-var                           5.3016+-0.0075     ?      5.3437+-0.0496        ?
   inlined-put-by-id-transition                   13.9025+-0.2032           13.7933+-0.1636        
   int-or-other-abs-then-get-by-val                7.2785+-0.0304            7.2568+-0.0108        
   int-or-other-abs-zero-then-get-by-val          30.2694+-0.1391     ?     30.2997+-0.2067        ?
   int-or-other-add-then-get-by-val                8.4317+-0.0171     ?      8.4382+-0.0236        ?
   int-or-other-add                                8.7041+-0.0465            8.6604+-0.0385        
   int-or-other-div-then-get-by-val                6.6087+-0.0559            6.5722+-0.0196        
   int-or-other-max-then-get-by-val                8.0664+-0.1888     ?      8.1049+-0.1858        ?
   int-or-other-min-then-get-by-val                6.8044+-0.0779            6.7301+-0.0080          might be 1.0110x faster
   int-or-other-mod-then-get-by-val                6.5688+-0.0536            6.5404+-0.0216        
   int-or-other-mul-then-get-by-val                5.8900+-0.0176     ?      5.9139+-0.0477        ?
   int-or-other-neg-then-get-by-val                6.5324+-0.0138     ?      6.5648+-0.0327        ?
   int-or-other-neg-zero-then-get-by-val          30.2816+-0.2019           30.2357+-0.0324        
   int-or-other-sub-then-get-by-val                8.4200+-0.0133     ?      8.4254+-0.0168        ?
   int-or-other-sub                                6.7292+-0.0066            6.7282+-0.0166        
   int-overflow-local                             10.6538+-0.0541     ?     10.6613+-0.0840        ?
   Int16Array-bubble-sort                         63.5054+-1.4341           63.2847+-1.7178        
   Int16Array-load-int-mul                         1.5630+-0.0121            1.5599+-0.0083        
   Int8Array-load                                  4.5079+-0.0934            4.4773+-0.0098        
   integer-divide                                 12.6186+-0.0301           12.6097+-0.0237        
   integer-modulo                                  1.8475+-0.0219     ?      1.8510+-0.0192        ?
   make-indexed-storage                            3.8254+-0.5653     ?      3.8277+-0.5827        ?
   method-on-number                               19.6643+-0.2958     ?     19.7580+-0.3366        ?
   nested-function-parsing-random                321.2225+-10.5227    ?    321.4234+-10.5548       ?
   nested-function-parsing                        48.1503+-2.9427           47.8688+-2.9841        
   new-array-buffer-dead                           3.0738+-0.1234            3.0485+-0.1129        
   new-array-buffer-push                          12.4847+-2.1404     ?     12.7949+-2.0198        ? might be 1.0248x slower
   new-array-dead                                 23.4258+-0.1147     ?     23.4630+-0.1129        ?
   new-array-push                                 10.1454+-1.6437     ?     10.2759+-1.6638        ? might be 1.0129x slower
   number-test                                     3.4384+-0.0279            3.4215+-0.0193        
   object-closure-call                             7.2214+-0.1914            7.1605+-0.2103        
   object-test                                     4.0161+-0.0963            3.9146+-0.0251          might be 1.0259x faster
   poly-stricteq                                  75.6239+-0.2050           75.4939+-0.1057        
   polymorphic-structure                          16.5408+-0.0142           16.5362+-0.0156        
   polyvariant-monomorphic-get-by-id              10.3087+-0.0290     ?     10.3352+-0.0320        ?
   rare-osr-exit-on-local                         16.9675+-0.0200     ?     16.9739+-0.0318        ?
   register-pressure-from-osr                     26.0581+-0.0624           26.0370+-0.0409        
   simple-activation-demo                         28.7263+-0.2303           28.6991+-0.2268        
   slow-array-profile-convergence                  3.9406+-0.2199            3.8369+-0.2422          might be 1.0270x faster
   slow-convergence                                3.1627+-0.0214            3.1477+-0.0090        
   sparse-conditional                              1.1125+-0.0058            1.1049+-0.0111        
   splice-to-remove                               41.1385+-0.1038     !     41.5818+-0.2962        ! definitely 1.0108x slower
   string-concat-object                            4.1765+-1.2831            4.1555+-1.2447        
   string-concat-pair-object                       4.2333+-1.2654            4.1170+-1.2603          might be 1.0282x faster
   string-concat-pair-simple                      17.0748+-0.6209           16.8810+-0.5550          might be 1.0115x faster
   string-concat-simple                           16.6888+-0.6370     ?     17.1253+-0.5896        ? might be 1.0262x slower
   string-cons-repeat                             12.3493+-0.8003           12.2000+-0.8026          might be 1.0122x faster
   string-cons-tower                              31.0190+-18.2250    ?     31.1902+-18.3573       ?
   string-hash                                     2.1702+-0.0177            2.1688+-0.0114        
   string-repeat-arith                            37.2816+-0.6524           36.9392+-0.2494        
   string-sub                                     72.9891+-0.9757     ?     73.0070+-0.6598        ?
   string-test                                     3.4153+-0.0151            3.4136+-0.0077        
   structure-hoist-over-transitions                3.3734+-0.5786            3.3412+-0.5625        
   tear-off-arguments-simple                       1.4945+-0.0141     ?      1.5071+-0.0240        ?
   tear-off-arguments                              2.7709+-0.0229            2.7528+-0.0112        
   temporal-structure                             17.2814+-0.0204           17.2569+-0.0264        
   to-int32-boolean                               25.3976+-0.0907           25.3365+-0.0408        
   undefined-test                                  3.6302+-0.0200            3.6296+-0.0140        

   <arithmetic>                                   17.4460+-0.3407     ?     17.4491+-0.3875        ? might be 1.0002x slower
   <geometric> *                                   8.1168+-0.1783            8.1074+-0.1780          might be 1.0012x faster
   <harmonic>                                      4.5200+-0.0765            4.5082+-0.0727          might be 1.0026x faster

                                                     TipOfTree               FixCheckArray                                   
DSP:
   filtrr-posterize-tint                          45.1752+-1.0134           45.0346+-1.0212        
   filtrr-tint-contrast-sat-bright                63.3960+-1.2619           62.6011+-1.1812          might be 1.0127x faster
   filtrr-tint-sat-adj-contr-mult                 74.9996+-1.9155           74.4293+-1.7453        
   filtrr-blur-overlay-sat-contr                 187.4295+-5.7822          186.2442+-5.1466        
   filtrr-sat-blur-mult-sharpen-contr            234.5227+-5.6520          233.9418+-4.8177        
   filtrr-sepia-bias                              32.4222+-1.7334           32.1414+-1.6880        
   route9-vp8                            x5     1050.6229+-6.2234     ?   1051.6386+-6.7868        ?
   starfield                             x5     1139.9473+-2.7052         1139.7498+-2.5549        
   bellard-jslinux                       x5     2777.1667+-9.3727         2773.6667+-12.6389       
   zynaps-quake3                         x5     1199.8414+-34.2733    ?   1206.8224+-22.1225       ?
   zynaps-mandelbrot                     x5     1001.6099+-3.7171         1000.8694+-4.7030        

   <arithmetic>                                 1176.8995+-5.5850     ?   1177.3589+-4.3175        ? might be 1.0004x slower
   <geometric> *                                 770.5118+-5.1445          770.1203+-2.4870          might be 1.0005x faster
   <harmonic>                                    276.9330+-6.5093          275.3090+-6.0916          might be 1.0059x faster

                                                     TipOfTree               FixCheckArray                                   
All benchmarks:
   <arithmetic>                                  210.7265+-0.8774     ?    210.7771+-0.7247        ? might be 1.0002x slower
   <geometric>                                    20.1997+-0.2401           20.1833+-0.2441          might be 1.0008x faster
   <harmonic>                                      3.8967+-0.0315            3.8921+-0.0363          might be 1.0012x faster

                                                     TipOfTree               FixCheckArray                                   
Geomean of preferred means:
   <scaled-result>                                36.8970+-0.2961           36.8515+-0.3373          might be 1.0012x faster
Comment 6 Filip Pizlo 2013-03-27 11:00:05 PDT
Landed in http://trac.webkit.org/changeset/146996