DFG should use CheckStructure for typed array checks whenever possible
Created attachment 195247 [details] Patch
Created attachment 195248 [details] Patch
Comment on attachment 195248 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=195248&action=review

r=me

Is this a speedup?

> Source/JavaScriptCore/dfg/DFGArrayMode.h:353
> + // It might benefit from structure checks! If it ends up not benefiting, we can just
> + // remove it.

Would be nice to clarify that FixupPhase will remove it.
(In reply to comment #3)
> (From update of attachment 195248 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=195248&action=review
>
> r=me
>
> Is this a speedup?

Ever so slight. 3% on Mandreel I think.

The fact that it's not more of a speed-up is a pretty good hint that our typed array support sucks. Normally, removing a dependent load from a speculation branch is a huge deal. So this tells me that we have a large bottleneck somewhere else, in those Octane programs.

> > Source/JavaScriptCore/dfg/DFGArrayMode.h:353
> > + // It might benefit from structure checks! If it ends up not benefiting, we can just
> > + // remove it.
>
> Would be nice to clarify that FixupPhase will remove it.

Will do.
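For readers following along, here is a rough sketch of what "removing a dependent load from a speculation branch" can mean. The types below (`Cell`, `Structure`, `ClassInfo`) and both helper functions are hypothetical, heavily simplified stand-ins, not JavaScriptCore's real object model or the code in the patch. The assumption is that a class-based array check has to chase cell -> structure -> class info before it can compare, while a structure check compares the structure pointer directly.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not JSC's real layout.
struct ClassInfo { const char* name; };
struct Structure { const ClassInfo* classInfo; };
struct Cell { const Structure* structure; };

// Class-based check: load the structure, then load its class info,
// then compare. The second load depends on the result of the first.
inline bool checkArrayByClassInfo(const Cell& cell, const ClassInfo* expected)
{
    return cell.structure->classInfo == expected;
}

// Structure check: a single load followed by a pointer compare.
// It only applies when the expected structure is known up front.
inline bool checkStructure(const Cell& cell, const Structure* expected)
{
    return cell.structure == expected;
}
```

Note that in this sketch the structure check is stricter: two cells with the same class but different structures pass the class-based check, and only one of them passes the structure check. That is one way to read the "whenever possible" in the bug title.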
The performance:

Benchmark report for SunSpider, V8Spider, Octane, Kraken, JSBench, JSRegress, and DSP on bigmac (MacPro5,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quartary/OpenSource/WebKitBuild/Release/DumpRenderTree (r146946)
"FixCheckArray" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/DumpRenderTree (r146946)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                TipOfTree    FixCheckArray
SunSpider:
   3d-cube    8.8935+-0.2851    8.8270+-0.2959
   3d-morph    7.4010+-0.0326    ?    7.4507+-0.0574    ?
   3d-raytrace    9.7987+-0.1107    ?    9.8020+-0.1477    ?
   access-binary-trees    2.2594+-0.3235    ?    2.3364+-0.3003    ?    might be 1.0341x slower
   access-fannkuch    6.5606+-0.1040    ?    6.5868+-0.1090    ?
   access-nbody    3.9516+-0.0574    3.9206+-0.0446
   access-nsieve    4.2525+-0.0365    4.1985+-0.0541    might be 1.0129x faster
   bitops-3bit-bits-in-byte    1.5219+-0.0140    ?    1.5294+-0.0256    ?
   bitops-bits-in-byte    5.6379+-0.0506    5.6170+-0.0409
   bitops-bitwise-and    2.0445+-0.0668    2.0378+-0.0809
   bitops-nsieve-bits    3.3617+-0.0231    ?    3.3649+-0.0231    ?
   controlflow-recursive    2.4874+-0.0158    2.4767+-0.0154
   crypto-aes    7.2108+-0.3071    ?    7.3705+-0.2444    ?    might be 1.0221x slower
   crypto-md5    3.6250+-0.0613    3.5911+-0.0849
   crypto-sha1    2.8992+-0.0194    2.8944+-0.0214
   date-format-tofte    13.9241+-0.9335    13.9184+-0.9043
   date-format-xparb    9.0812+-0.6642    ?    9.2028+-0.6406    ?    might be 1.0134x slower
   math-cordic    3.3064+-0.0146    ?    3.3106+-0.0222    ?
   math-partial-sums    10.0759+-0.0396    ?    10.0760+-0.0565    ?
   math-spectral-norm    2.6628+-0.0093    2.6593+-0.0232
   regexp-dna    9.9260+-0.4735    ?    9.9614+-0.4756    ?
   string-base64    4.8051+-0.4941    4.7716+-0.4773
   string-fasta    9.6569+-0.1867    ?    9.6863+-0.2252    ?
   string-tagcloud    12.2864+-0.3021    12.1361+-0.1950    might be 1.0124x faster
   string-unpack-code    24.7499+-0.5160    ?    24.7795+-0.4921    ?
   string-validate-input    8.1722+-0.2510    8.1406+-0.2744

   <arithmetic> *    6.9443+-0.1535    ?    6.9479+-0.1541    ?    might be 1.0005x slower
   <geometric>    5.5444+-0.1117    ?    5.5491+-0.1010    ?    might be 1.0008x slower
   <harmonic>    4.4374+-0.0906    ?    4.4447+-0.0678    ?    might be 1.0017x slower

                                TipOfTree    FixCheckArray
V8Spider:
   crypto    76.0600+-0.4779    ?    76.1200+-0.5103    ?
   deltablue    105.5310+-0.6358    105.4635+-0.4398
   earley-boyer    70.9853+-0.7728    ?    71.4675+-0.6405    ?
   raytrace    54.7600+-3.9454    52.6467+-2.9982    might be 1.0401x faster
   regexp    84.1135+-0.3307    ?    84.1466+-0.3309    ?
   richards    100.1076+-1.3108    99.9817+-1.3541
   splay    52.6312+-2.7335    52.2827+-2.5782

   <arithmetic>    77.7412+-0.6585    77.4441+-0.6387    might be 1.0038x faster
   <geometric> *    75.2952+-0.8352    74.8926+-0.7881    might be 1.0054x faster
   <harmonic>    72.8292+-1.0248    72.3118+-0.9492    might be 1.0072x faster

                                TipOfTree    FixCheckArray
Octane and V8v7:
   encrypt    0.40878+-0.00120    ?    0.40909+-0.00117    ?
   decrypt    7.39074+-0.00736    ?    7.40187+-0.00860    ?
   deltablue x2    0.50263+-0.00378    ?    0.50427+-0.00515    ?
   earley    0.77283+-0.00785    0.76406+-0.00258    might be 1.0115x faster
   boyer    10.77485+-0.03677    ?    10.77693+-0.03666    ?
   raytrace x2    3.87482+-0.02736    3.86303+-0.03085
   regexp x2    26.25312+-0.06062    ?    26.25993+-0.12607    ?
   richards x2    0.25894+-0.00231    ?    0.25900+-0.00227    ?
   splay x2    0.55872+-0.01695    0.55272+-0.00468    might be 1.0109x faster
   navier-stokes x2    9.04093+-0.00735    ?    9.04797+-0.00802    ?
   closure    0.25928+-0.03253    ?    0.25938+-0.03323    ?
   jquery    3.72531+-0.44150    ?    3.73189+-0.44281    ?
   gbemu x2    120.32896+-5.62006    ?    122.16335+-7.34906    ?    might be 1.0152x slower
   mandreel x2    153.68588+-0.44083    ^    149.09780+-0.53932    ^    definitely 1.0308x faster
   pdfjs x2    93.86436+-0.29899    93.39331+-0.21015
   box2d x2    30.68463+-0.09211    ?    30.69007+-0.18309    ?

V8v7:
   <arithmetic>    6.27035+-0.00967    ?    6.27036+-0.01644    ?    might be 1.0000x slower
   <geometric> *    2.06844+-0.00876    2.06507+-0.00473    might be 1.0016x faster
   <harmonic>    0.79716+-0.00580    0.79583+-0.00359    might be 1.0017x faster

Octane including V8v7:
   <arithmetic>    34.67068+-0.41751    34.42331+-0.54537    might be 1.0072x faster
   <geometric> *    6.12919+-0.05100    6.11341+-0.05997    might be 1.0026x faster
   <harmonic>    1.06399+-0.02116    1.06251+-0.02276    might be 1.0014x faster

                                TipOfTree    FixCheckArray
Kraken:
   ai-astar    438.090+-2.918    ?    438.115+-2.797    ?
   audio-beat-detection    208.695+-1.171    ?    209.030+-1.482    ?
   audio-dft    259.583+-2.198    ?    263.919+-5.981    ?    might be 1.0167x slower
   audio-fft    121.738+-0.170    121.725+-0.179
   audio-oscillator    210.933+-0.297    !    211.693+-0.377    !    definitely 1.0036x slower
   imaging-darkroom    244.030+-1.152    242.926+-0.831
   imaging-desaturate    133.527+-0.197    133.414+-0.075
   imaging-gaussian-blur    414.379+-0.255    414.339+-0.184
   json-parse-financial    69.654+-0.183    ^    67.657+-0.293    ^    definitely 1.0295x faster
   json-stringify-tinderbox    83.420+-0.215    ?    83.537+-0.221    ?
   stanford-crypto-aes    100.388+-0.359    ?    100.597+-1.038    ?
   stanford-crypto-ccm    97.019+-0.222    ?    98.013+-0.839    ?    might be 1.0102x slower
   stanford-crypto-pbkdf2    228.156+-1.828    ?    229.235+-1.069    ?
   stanford-crypto-sha256-iterative    104.600+-0.688    104.284+-0.192

   <arithmetic> *    193.872+-0.300    ?    194.177+-0.566    ?    might be 1.0016x slower
   <geometric>    165.025+-0.203    ?    165.049+-0.430    ?    might be 1.0001x slower
   <harmonic>    142.216+-0.136    141.880+-0.349    might be 1.0024x faster

                                TipOfTree    FixCheckArray
JSBench:
   amazon    7.0833+-0.1834    7.0833+-0.1834
   facebook    33.8333+-1.7312    33.6667+-1.5875
   google    67.2500+-1.5356    67.0833+-1.6581
   twitter    8.6667+-0.3128    ?    8.7500+-0.2874    ?
   yahoo    3.1667+-0.3668    ?    3.2500+-0.2874    ?    might be 1.0263x slower

   <arithmetic> *    24.0000+-0.6658    23.9667+-0.6268    might be 1.0014x faster
   <geometric>    13.4208+-0.4344    ?    13.5075+-0.2332    ?    might be 1.0065x slower
   <harmonic>    8.0400+-0.5160    ?    8.1802+-0.3203    ?    might be 1.0174x slower

                                TipOfTree    FixCheckArray
JSRegress:
   adapt-to-double-divide    18.5269+-0.0168    ?    18.5381+-0.0240    ?
   aliased-arguments-getbyval    0.7946+-0.0099    ?    0.8005+-0.0155    ?
   allocate-big-object    3.4880+-1.1945    ?    3.5560+-1.2005    ?    might be 1.0195x slower
   arity-mismatch-inlining    0.6659+-0.0037    ?    0.6668+-0.0059    ?
   array-access-polymorphic-structure    7.7405+-1.8248    ?    7.8569+-1.8194    ?    might be 1.0150x slower
   array-with-double-add    4.7655+-0.0104    ?    4.8023+-0.0393    ?
   array-with-double-increment    3.3107+-0.0693    3.2598+-0.0228    might be 1.0156x faster
   array-with-double-mul-add    6.4914+-0.0469    6.4901+-0.0338
   array-with-double-sum    6.4108+-0.0150    ?    6.4254+-0.0269    ?
   array-with-int32-add-sub    8.5685+-0.0440    ?    8.6337+-0.0525    ?
   array-with-int32-or-double-sum    6.5172+-0.0558    6.5085+-0.0261
   big-int-mul    4.0612+-0.0805    4.0328+-0.0375
   boolean-test    3.5306+-0.0328    3.5099+-0.0225
   cast-int-to-double    11.3864+-0.0790    ?    11.4124+-0.0941    ?
   cell-argument    11.8642+-0.0233    11.8565+-0.0116
   cfg-simplify    3.1394+-0.0485    ?    3.1677+-0.0351    ?
   cmpeq-obj-to-obj-other    9.0941+-0.1769    !    9.3403+-0.0584    !    definitely 1.0271x slower
   constant-test    7.0125+-0.0708    6.9930+-0.0629
   direct-arguments-getbyval    0.7431+-0.0105    0.7302+-0.0041    might be 1.0177x faster
   double-pollution-getbyval    8.8049+-0.0363    ?    8.8060+-0.0449    ?
   double-pollution-putbyoffset    4.8856+-0.5742    4.7551+-0.5578    might be 1.0274x faster
   empty-string-plus-int    11.4245+-0.4981    11.3935+-0.4992
   external-arguments-getbyval    2.2195+-0.1851    ?    2.2243+-0.1790    ?
   external-arguments-putbyval    3.4085+-0.3114    3.2462+-0.3064    might be 1.0500x faster
   Float32Array-matrix-mult    13.9726+-0.9940    13.7838+-0.9240    might be 1.0137x faster
   fold-double-to-int    18.2216+-0.1566    18.1625+-0.2114
   function-dot-apply    2.6072+-0.0177    2.6011+-0.0097
   function-test    4.0251+-0.0337    4.0090+-0.0340
   get-by-id-chain-from-try-block    6.1532+-0.0153    6.1517+-0.0582
   HashMap-put-get-iterate-keys    72.0772+-0.9429    71.9648+-0.8525
   HashMap-put-get-iterate    74.2234+-0.7040    74.2198+-0.6123
   HashMap-string-put-get-iterate    66.2650+-0.8843    66.0407+-1.0295
   indexed-properties-in-objects    3.6560+-0.0101    ?    3.6609+-0.0163    ?
   inline-arguments-access    1.0762+-0.0141    ?    1.0801+-0.0246    ?
   inline-arguments-local-escape    21.2695+-0.1514    ?    22.3062+-1.8114    ?    might be 1.0487x slower
   inline-get-scoped-var    5.3016+-0.0075    ?    5.3437+-0.0496    ?
   inlined-put-by-id-transition    13.9025+-0.2032    13.7933+-0.1636
   int-or-other-abs-then-get-by-val    7.2785+-0.0304    7.2568+-0.0108
   int-or-other-abs-zero-then-get-by-val    30.2694+-0.1391    ?    30.2997+-0.2067    ?
   int-or-other-add-then-get-by-val    8.4317+-0.0171    ?    8.4382+-0.0236    ?
   int-or-other-add    8.7041+-0.0465    8.6604+-0.0385
   int-or-other-div-then-get-by-val    6.6087+-0.0559    6.5722+-0.0196
   int-or-other-max-then-get-by-val    8.0664+-0.1888    ?    8.1049+-0.1858    ?
   int-or-other-min-then-get-by-val    6.8044+-0.0779    6.7301+-0.0080    might be 1.0110x faster
   int-or-other-mod-then-get-by-val    6.5688+-0.0536    6.5404+-0.0216
   int-or-other-mul-then-get-by-val    5.8900+-0.0176    ?    5.9139+-0.0477    ?
   int-or-other-neg-then-get-by-val    6.5324+-0.0138    ?    6.5648+-0.0327    ?
   int-or-other-neg-zero-then-get-by-val    30.2816+-0.2019    30.2357+-0.0324
   int-or-other-sub-then-get-by-val    8.4200+-0.0133    ?    8.4254+-0.0168    ?
   int-or-other-sub    6.7292+-0.0066    6.7282+-0.0166
   int-overflow-local    10.6538+-0.0541    ?    10.6613+-0.0840    ?
   Int16Array-bubble-sort    63.5054+-1.4341    63.2847+-1.7178
   Int16Array-load-int-mul    1.5630+-0.0121    1.5599+-0.0083
   Int8Array-load    4.5079+-0.0934    4.4773+-0.0098
   integer-divide    12.6186+-0.0301    12.6097+-0.0237
   integer-modulo    1.8475+-0.0219    ?    1.8510+-0.0192    ?
   make-indexed-storage    3.8254+-0.5653    ?    3.8277+-0.5827    ?
   method-on-number    19.6643+-0.2958    ?    19.7580+-0.3366    ?
   nested-function-parsing-random    321.2225+-10.5227    ?    321.4234+-10.5548    ?
   nested-function-parsing    48.1503+-2.9427    47.8688+-2.9841
   new-array-buffer-dead    3.0738+-0.1234    3.0485+-0.1129
   new-array-buffer-push    12.4847+-2.1404    ?    12.7949+-2.0198    ?    might be 1.0248x slower
   new-array-dead    23.4258+-0.1147    ?    23.4630+-0.1129    ?
   new-array-push    10.1454+-1.6437    ?    10.2759+-1.6638    ?    might be 1.0129x slower
   number-test    3.4384+-0.0279    3.4215+-0.0193
   object-closure-call    7.2214+-0.1914    7.1605+-0.2103
   object-test    4.0161+-0.0963    3.9146+-0.0251    might be 1.0259x faster
   poly-stricteq    75.6239+-0.2050    75.4939+-0.1057
   polymorphic-structure    16.5408+-0.0142    16.5362+-0.0156
   polyvariant-monomorphic-get-by-id    10.3087+-0.0290    ?    10.3352+-0.0320    ?
   rare-osr-exit-on-local    16.9675+-0.0200    ?    16.9739+-0.0318    ?
   register-pressure-from-osr    26.0581+-0.0624    26.0370+-0.0409
   simple-activation-demo    28.7263+-0.2303    28.6991+-0.2268
   slow-array-profile-convergence    3.9406+-0.2199    3.8369+-0.2422    might be 1.0270x faster
   slow-convergence    3.1627+-0.0214    3.1477+-0.0090
   sparse-conditional    1.1125+-0.0058    1.1049+-0.0111
   splice-to-remove    41.1385+-0.1038    !    41.5818+-0.2962    !    definitely 1.0108x slower
   string-concat-object    4.1765+-1.2831    4.1555+-1.2447
   string-concat-pair-object    4.2333+-1.2654    4.1170+-1.2603    might be 1.0282x faster
   string-concat-pair-simple    17.0748+-0.6209    16.8810+-0.5550    might be 1.0115x faster
   string-concat-simple    16.6888+-0.6370    ?    17.1253+-0.5896    ?    might be 1.0262x slower
   string-cons-repeat    12.3493+-0.8003    12.2000+-0.8026    might be 1.0122x faster
   string-cons-tower    31.0190+-18.2250    ?    31.1902+-18.3573    ?
   string-hash    2.1702+-0.0177    2.1688+-0.0114
   string-repeat-arith    37.2816+-0.6524    36.9392+-0.2494
   string-sub    72.9891+-0.9757    ?    73.0070+-0.6598    ?
   string-test    3.4153+-0.0151    3.4136+-0.0077
   structure-hoist-over-transitions    3.3734+-0.5786    3.3412+-0.5625
   tear-off-arguments-simple    1.4945+-0.0141    ?    1.5071+-0.0240    ?
   tear-off-arguments    2.7709+-0.0229    2.7528+-0.0112
   temporal-structure    17.2814+-0.0204    17.2569+-0.0264
   to-int32-boolean    25.3976+-0.0907    25.3365+-0.0408
   undefined-test    3.6302+-0.0200    3.6296+-0.0140

   <arithmetic>    17.4460+-0.3407    ?    17.4491+-0.3875    ?    might be 1.0002x slower
   <geometric> *    8.1168+-0.1783    8.1074+-0.1780    might be 1.0012x faster
   <harmonic>    4.5200+-0.0765    4.5082+-0.0727    might be 1.0026x faster

                                TipOfTree    FixCheckArray
DSP:
   filtrr-posterize-tint    45.1752+-1.0134    45.0346+-1.0212
   filtrr-tint-contrast-sat-bright    63.3960+-1.2619    62.6011+-1.1812    might be 1.0127x faster
   filtrr-tint-sat-adj-contr-mult    74.9996+-1.9155    74.4293+-1.7453
   filtrr-blur-overlay-sat-contr    187.4295+-5.7822    186.2442+-5.1466
   filtrr-sat-blur-mult-sharpen-contr    234.5227+-5.6520    233.9418+-4.8177
   filtrr-sepia-bias    32.4222+-1.7334    32.1414+-1.6880
   route9-vp8 x5    1050.6229+-6.2234    ?    1051.6386+-6.7868    ?
   starfield x5    1139.9473+-2.7052    1139.7498+-2.5549
   bellard-jslinux x5    2777.1667+-9.3727    2773.6667+-12.6389
   zynaps-quake3 x5    1199.8414+-34.2733    ?    1206.8224+-22.1225    ?
   zynaps-mandelbrot x5    1001.6099+-3.7171    1000.8694+-4.7030

   <arithmetic>    1176.8995+-5.5850    ?    1177.3589+-4.3175    ?    might be 1.0004x slower
   <geometric> *    770.5118+-5.1445    770.1203+-2.4870    might be 1.0005x faster
   <harmonic>    276.9330+-6.5093    275.3090+-6.0916    might be 1.0059x faster

                                TipOfTree    FixCheckArray
All benchmarks:
   <arithmetic>    210.7265+-0.8774    ?    210.7771+-0.7247    ?    might be 1.0002x slower
   <geometric>    20.1997+-0.2401    20.1833+-0.2441    might be 1.0008x faster
   <harmonic>    3.8967+-0.0315    3.8921+-0.0363    might be 1.0012x faster

                                TipOfTree    FixCheckArray
Geomean of preferred means:
   <scaled-result>    36.8970+-0.2961    36.8515+-0.3373    might be 1.0012x faster
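A note on reading the "might be" and "definitely" annotations in the report. The sketch below assumes that "definitely" corresponds to the two 95% confidence intervals not overlapping, and that the printed factor is the ratio of the slower mean to the faster one. Both assumptions are consistent with the rows shown (for example mandreel and audio-oscillator) but are not confirmed anywhere in this thread. The `Sample` type and the function names are made up for illustration and are not taken from the benchmark harness.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// One reported cell: mean +- half-width of the 95% confidence interval (ms).
struct Sample {
    double mean;
    double halfWidth;
};

// True when the two confidence intervals share no point.
inline bool intervalsDisjoint(const Sample& a, const Sample& b)
{
    return a.mean + a.halfWidth < b.mean - b.halfWidth
        || b.mean + b.halfWidth < a.mean - a.halfWidth;
}

// Ratio of the slower mean to the faster mean, so the result is >= 1.
inline double speedRatio(const Sample& a, const Sample& b)
{
    return a.mean > b.mean ? a.mean / b.mean : b.mean / a.mean;
}

// Assumed reading of the annotation: certainty from interval overlap,
// direction from which mean is lower (times are in ms, lower is faster).
inline std::string verdict(const Sample& baseline, const Sample& patched)
{
    const char* certainty = intervalsDisjoint(baseline, patched) ? "definitely" : "might be";
    const char* direction = patched.mean < baseline.mean ? "faster" : "slower";
    return std::string(certainty) + " " + direction;
}
```

Plugging in the mandreel row, `verdict({153.68588, 0.44083}, {149.09780, 0.53932})` gives "definitely faster" with a ratio of about 1.0308, which matches the report. The gbemu row has overlapping intervals, so it only rates a "might be".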
Landed in http://trac.webkit.org/changeset/146996