| Summary: | JIT and DFG should NaN-check loads from Float32 arrays |
|---|---|
| Product: | WebKit |
| Component: | JavaScriptCore |
| Status: | RESOLVED FIXED |
| Severity: | Normal |
| Priority: | P2 |
| Version: | 528+ (Nightly build) |
| Hardware: | All |
| OS: | All |
| Reporter: | Filip Pizlo <fpizlo> |
| Assignee: | Filip Pizlo <fpizlo> |
| CC: | barraclough, ggaren, mark.lam, mhahnenberg, msaboff, oliver, sam |
| Keywords: | InRadar |

Description
Filip Pizlo
2013-03-27 18:00:01 PDT
Created attachment 195449
the patch
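
The bug title asks for a NaN check on values loaded from Float32 arrays. As a rough illustration of why such a check can matter (a sketch under stated assumptions, not a description of the actual patch): a Float32 array can hold any of the many NaN bit patterns, and widening such a value to a double keeps its sign and payload bits. An engine that reserves non-canonical NaN encodings for its own value tagging (assumed here, not stated in this bug) would want to replace every loaded NaN with one canonical NaN. All helper names below are hypothetical, and the widening is modeled on raw bit patterns so the result does not depend on the host CPU.

```python
import math
import struct

# Canonical quiet NaN: sign clear, quiet bit set, zero payload.
PURE_NAN_BITS = 0x7FF8000000000000


def widen_float32_nan_bits(bits32: int) -> int:
    """Model widening a float32 NaN to a double: sign and payload are kept,
    the exponent becomes all ones, and the quiet bit is set."""
    sign = (bits32 >> 31) & 1
    exponent = (bits32 >> 23) & 0xFF
    mantissa = bits32 & 0x7FFFFF
    if exponent != 0xFF or mantissa == 0:
        raise ValueError("not a float32 NaN bit pattern")
    return (sign << 63) | (0x7FF << 52) | (1 << 51) | (mantissa << 29)


def bits_to_double(bits64: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits64))[0]


def double_to_bits(value: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def purify_nan_bits(bits64: int) -> int:
    """Hypothetical NaN check: any NaN becomes the canonical NaN,
    every other value passes through unchanged."""
    if math.isnan(bits_to_double(bits64)):  # same test as value != value
        return PURE_NAN_BITS
    return bits64


impure = widen_float32_nan_bits(0xFFC01234)  # negative NaN with a payload
print(f"widened:  {impure:#018x}")
print(f"purified: {purify_nan_bits(impure):#018x}")
```

The check costs one comparison per load, which is the overhead the benchmark comparison in this bug is measuring.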
Comment on attachment 195449
the patch
r=me
This isn't enough of a slow-down for us to care.

Benchmark report for SunSpider, V8Spider, Octane, Kraken, JSBench, JSRegress, and DSP on bigmac (MacPro5,1).

VMs tested:

"TipOfTree" at /Volumes/Data/pizlo/quartary/OpenSource/WebKitBuild/Release/DumpRenderTree (r147012)

"FixFloat32" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/DumpRenderTree (r147012)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

SunSpider:

| Benchmark | TipOfTree | FixFloat32 | Comparison |
|---|---|---|---|
| 3d-cube | 8.9210+-0.3206 | 9.0063+-0.2617 | ? |
| 3d-morph | 7.4437+-0.0709 | 7.5217+-0.0612 | ? might be 1.0105x slower |
| 3d-raytrace | 9.8159+-0.2455 | 9.8335+-0.2222 | ? |
| access-binary-trees | 2.2653+-0.3076 | 2.3541+-0.3046 | ? might be 1.0392x slower |
| access-fannkuch | 6.6303+-0.0946 | 6.6539+-0.1166 | ? |
| access-nbody | 3.9617+-0.0590 | 3.9596+-0.0520 | |
| access-nsieve | 4.3191+-0.0481 | 4.2357+-0.0629 | might be 1.0197x faster |
| bitops-3bit-bits-in-byte | 1.5426+-0.0261 | 1.5321+-0.0255 | |
| bitops-bits-in-byte | 5.6319+-0.0463 | 5.6772+-0.0759 | ? |
| bitops-bitwise-and | 2.0914+-0.0497 | 2.0747+-0.0741 | |
| bitops-nsieve-bits | 3.3702+-0.0084 | 3.3979+-0.0290 | ? |
| controlflow-recursive | 2.4819+-0.0209 | 2.4888+-0.0203 | ? |
| crypto-aes | 7.2849+-0.3080 | 7.5208+-0.2676 | ? might be 1.0324x slower |
| crypto-md5 | 3.6109+-0.0855 | 3.6184+-0.0726 | ? |
| crypto-sha1 | 2.9073+-0.0563 | 2.8992+-0.0426 | |
| date-format-tofte | 14.0093+-0.9144 | 14.0252+-1.0001 | ? |
| date-format-xparb | 9.1960+-0.5965 | 9.2519+-0.6148 | ? |
| math-cordic | 3.3265+-0.0324 | 3.3274+-0.0222 | ? |
| math-partial-sums | 10.1317+-0.0807 | 10.1278+-0.0683 | |
| math-spectral-norm | 2.6751+-0.0221 | 2.6982+-0.0195 | ? |
| regexp-dna | 10.0047+-0.4666 | 10.0390+-0.4525 | ? |
| string-base64 | 4.8176+-0.4960 | 4.8417+-0.4947 | ? |
| string-fasta | 9.7266+-0.2202 | 9.7408+-0.1856 | ? |
| string-tagcloud | 12.2484+-0.2146 | 12.1461+-0.2382 | |
| string-unpack-code | 24.9933+-0.5367 | 24.9517+-0.5161 | |
| string-validate-input | 8.2188+-0.2119 | 8.2470+-0.2599 | ? |
| <arithmetic> * | 6.9856+-0.1549 | 7.0066+-0.1482 | ? might be 1.0030x slower |
| <geometric> | 5.5771+-0.1043 | 5.5980+-0.0988 | ? might be 1.0037x slower |
| <harmonic> | 4.4669+-0.0741 | 4.4832+-0.0696 | ? might be 1.0037x slower |

V8Spider:

| Benchmark | TipOfTree | FixFloat32 | Comparison |
|---|---|---|---|
| crypto | 76.4436+-0.5328 | 76.4480+-0.5475 | ? |
| deltablue | 106.4238+-0.5406 | 105.8757+-0.4164 | |
| earley-boyer | 71.6931+-0.7290 | 71.4327+-0.7473 | |
| raytrace | 54.1809+-4.0743 | 52.7361+-2.9790 | might be 1.0274x faster |
| regexp | 84.4150+-0.3854 | 84.2139+-0.3209 | |
| richards | 100.3756+-1.2997 | 100.3977+-1.2587 | ? |
| splay | 53.0150+-2.5955 | 52.5163+-2.9083 | |
| <arithmetic> | 78.0782+-0.6094 | 77.6601+-0.7159 | might be 1.0054x faster |
| <geometric> * | 75.5769+-0.8001 | 75.0897+-0.8883 | might be 1.0065x faster |
| <harmonic> | 73.0516+-0.9978 | 72.4918+-1.0701 | might be 1.0077x faster |

Octane and V8v7:

| Benchmark | TipOfTree | FixFloat32 | Comparison |
|---|---|---|---|
| encrypt | 0.40957+-0.00100 | 0.40935+-0.00099 | |
| decrypt | 7.41818+-0.00952 | 7.41867+-0.00698 | ? |
| deltablue x2 | 0.49180+-0.00504 | 0.48992+-0.00485 | |
| earley | 0.76386+-0.00659 | 0.77066+-0.01352 | ? |
| boyer | 10.80313+-0.01884 | 10.81146+-0.03513 | ? |
| raytrace x2 | 3.95038+-0.04845 | 3.93395+-0.07377 | |
| regexp x2 | 26.29313+-0.06119 | 26.25907+-0.06921 | |
| richards x2 | 0.25832+-0.00245 | 0.25909+-0.00361 | ? |
| splay x2 | 0.55109+-0.01165 | 0.54750+-0.00825 | |
| navier-stokes x2 | 9.14513+-0.09462 | 9.05378+-0.00704 | might be 1.0101x faster |
| closure | 0.25930+-0.03320 | 0.25905+-0.03332 | |
| jquery | 3.75395+-0.45104 | 3.73058+-0.44088 | |
| gbemu x2 | 119.36194+-5.41325 | 127.10538+-7.67356 | ? might be 1.0649x slower |
| mandreel x2 | 149.69987+-0.53947 | 152.64847+-0.93646 | ! definitely 1.0197x slower |
| pdfjs x2 | 93.69051+-0.28408 | 94.02524+-0.23860 | ? |
| box2d x2 | 31.04716+-0.14735 | 30.91979+-0.18786 | |
| V8v7: <arithmetic> | 6.29840+-0.01509 | 6.28105+-0.01552 | might be 1.0028x faster |
| V8v7: <geometric> * | 2.06658+-0.00749 | 2.06183+-0.00754 | might be 1.0023x faster |
| V8v7: <harmonic> | 0.79120+-0.00442 | 0.79076+-0.00466 | might be 1.0005x faster |
| Octane including V8v7: <arithmetic> | 34.32256+-0.43934 | 35.14939+-0.58520 | ? might be 1.0241x slower |
| Octane including V8v7: <geometric> * | 6.11592+-0.05184 | 6.14345+-0.05740 | ? might be 1.0045x slower |
| Octane including V8v7: <harmonic> | 1.05731+-0.02084 | 1.05656+-0.02046 | might be 1.0007x faster |

Kraken:

| Benchmark | TipOfTree | FixFloat32 | Comparison |
|---|---|---|---|
| ai-astar | 438.593+-2.494 | 438.806+-2.807 | ? |
| audio-beat-detection | 210.448+-2.306 | 207.945+-0.693 | might be 1.0120x faster |
| audio-dft | 261.910+-1.976 | 259.024+-1.434 | might be 1.0111x faster |
| audio-fft | 122.083+-0.285 | 121.793+-0.184 | |
| audio-oscillator | 212.260+-0.487 | 211.789+-0.287 | |
| imaging-darkroom | 244.255+-0.990 | 243.551+-0.956 | |
| imaging-desaturate | 133.716+-0.121 | 133.737+-0.159 | ? |
| imaging-gaussian-blur | 415.293+-0.349 | 416.187+-0.942 | ? |
| json-parse-financial | 67.819+-0.139 | 68.405+-0.881 | ? |
| json-stringify-tinderbox | 83.960+-0.329 | 83.731+-0.187 | |
| stanford-crypto-aes | 101.275+-0.741 | 101.170+-0.488 | |
| stanford-crypto-ccm | 97.559+-0.472 | 97.702+-0.429 | ? |
| stanford-crypto-pbkdf2 | 231.513+-1.800 | 229.923+-1.515 | |
| stanford-crypto-sha256-iterative | 104.596+-0.317 | 104.596+-0.442 | |
| <arithmetic> * | 194.663+-0.459 | 194.169+-0.213 | might be 1.0025x faster |
| <geometric> | 165.505+-0.292 | 165.171+-0.184 | might be 1.0020x faster |
| <harmonic> | 142.280+-0.163 | 142.182+-0.284 | might be 1.0007x faster |

JSBench:

| Benchmark | TipOfTree | FixFloat32 | Comparison |
|---|---|---|---|
| amazon | 7.1667+-0.2473 | 7.3333+-0.3128 | ? might be 1.0233x slower |
| facebook | 33.9167+-1.6359 | 33.7500+-1.7791 | |
| google | 67.5000+-1.7027 | 67.4167+-1.7233 | |
| twitter | 8.5833+-0.3272 | 8.7500+-0.3949 | ? might be 1.0194x slower |
| yahoo | 2.9167+-0.4248 | 3.1667+-0.3668 | ? might be 1.0857x slower |
| <arithmetic> * | 24.0167+-0.7136 | 24.0833+-0.7319 | ? might be 1.0028x slower |
| <geometric> | 13.2026+-0.4882 | 13.5325+-0.4190 | ? might be 1.0250x slower |
| <harmonic> | 7.6571+-0.5848 | 8.0983+-0.4704 | ? might be 1.0576x slower |

JSRegress:

| Benchmark | TipOfTree | FixFloat32 | Comparison |
|---|---|---|---|
| adapt-to-double-divide | 18.5733+-0.0751 | 18.5706+-0.0472 | |
| aliased-arguments-getbyval | 0.8060+-0.0132 | 0.7997+-0.0061 | |
| allocate-big-object | 3.5236+-1.1873 | 3.5291+-1.2089 | ? |
| arity-mismatch-inlining | 0.6884+-0.0175 | 0.6701+-0.0054 | might be 1.0274x faster |
| array-access-polymorphic-structure | 7.3217+-1.6776 | 6.8005+-1.4175 | might be 1.0766x faster |
| array-with-double-add | 4.7875+-0.0374 | 4.7685+-0.0161 | |
| array-with-double-increment | 3.2750+-0.0260 | 3.2831+-0.0204 | ? |
| array-with-double-mul-add | 6.6465+-0.1164 | 6.5011+-0.0618 | might be 1.0224x faster |
| array-with-double-sum | 6.4356+-0.0303 | 6.4379+-0.0239 | ? |
| array-with-int32-add-sub | 8.6604+-0.0328 | 8.6466+-0.0185 | |
| array-with-int32-or-double-sum | 6.5002+-0.0324 | 6.5329+-0.0485 | ? |
| big-int-mul | 4.0285+-0.0117 | 4.0480+-0.0381 | ? |
| boolean-test | 3.6082+-0.1003 | 3.5208+-0.0157 | might be 1.0248x faster |
| cast-int-to-double | 11.4141+-0.0831 | 11.3766+-0.0885 | |
| cell-argument | 11.8764+-0.0128 | 11.8774+-0.0146 | ? |
| cfg-simplify | 3.2203+-0.0679 | 3.1921+-0.0433 | |
| cmpeq-obj-to-obj-other | 9.3500+-0.0974 | 9.4729+-0.0909 | ? might be 1.0131x slower |
| constant-test | 7.0564+-0.0859 | 6.9609+-0.0729 | might be 1.0137x faster |
| direct-arguments-getbyval | 0.7419+-0.0136 | 0.7423+-0.0069 | ? |
| double-pollution-getbyval | 8.8398+-0.0497 | 8.8240+-0.0240 | |
| double-pollution-putbyoffset | 4.7581+-0.5607 | 4.7868+-0.5496 | ? |
| empty-string-plus-int | 11.5215+-0.4654 | 11.4624+-0.4745 | |
| external-arguments-getbyval | 2.1474+-0.1731 | 2.2690+-0.1965 | ? might be 1.0566x slower |
| external-arguments-putbyval | 3.3007+-0.2871 | 3.3323+-0.3004 | ? |
| Float32Array-matrix-mult | 12.8316+-0.6926 | 13.0303+-0.7520 | ? might be 1.0155x slower |
| fold-double-to-int | 18.2735+-0.1629 | 18.2722+-0.1860 | |
| function-dot-apply | 2.6172+-0.0154 | 2.6400+-0.0548 | ? |
| function-test | 4.0872+-0.0483 | 4.0378+-0.0406 | might be 1.0122x faster |
| get-by-id-chain-from-try-block | 6.1536+-0.0766 | 6.1259+-0.0334 | |
| HashMap-put-get-iterate-keys | 72.3307+-0.8827 | 72.7020+-1.0130 | ? |
| HashMap-put-get-iterate | 73.9127+-0.7201 | 73.9872+-0.8708 | ? |
| HashMap-string-put-get-iterate | 67.1371+-1.2249 | 66.8672+-1.0170 | |
| indexed-properties-in-objects | 3.7103+-0.0435 | 3.6689+-0.0139 | might be 1.0113x faster |
| inline-arguments-access | 1.0779+-0.0128 | 1.0901+-0.0244 | ? might be 1.0113x slower |
| inline-arguments-local-escape | 21.4651+-0.1051 | 21.5256+-0.1670 | ? |
| inline-get-scoped-var | 5.3498+-0.0385 | 5.3218+-0.0143 | |
| inlined-put-by-id-transition | 13.9193+-0.1986 | 13.8737+-0.2217 | |
| int-or-other-abs-then-get-by-val | 7.2827+-0.0311 | 7.2807+-0.0267 | |
| int-or-other-abs-zero-then-get-by-val | 30.3430+-0.1592 | 30.3139+-0.1402 | |
| int-or-other-add-then-get-by-val | 8.4768+-0.0513 | 8.4367+-0.0168 | |
| int-or-other-add | 8.7178+-0.0409 | 8.7207+-0.0615 | ? |
| int-or-other-div-then-get-by-val | 6.5903+-0.0262 | 6.6646+-0.0707 | ? might be 1.0113x slower |
| int-or-other-max-then-get-by-val | 8.1960+-0.2179 | 8.2950+-0.1983 | ? might be 1.0121x slower |
| int-or-other-min-then-get-by-val | 6.7563+-0.0195 | 6.7566+-0.0219 | ? |
| int-or-other-mod-then-get-by-val | 6.5729+-0.0258 | 6.5864+-0.0596 | ? |
| int-or-other-mul-then-get-by-val | 5.8892+-0.0394 | 5.9025+-0.0306 | ? |
| int-or-other-neg-then-get-by-val | 6.5939+-0.0412 | 6.5658+-0.0338 | |
| int-or-other-neg-zero-then-get-by-val | 30.2630+-0.0678 | 30.2171+-0.0773 | |
| int-or-other-sub-then-get-by-val | 8.4588+-0.0528 | 8.4760+-0.0718 | ? |
| int-or-other-sub | 6.7448+-0.0164 | 6.7734+-0.0405 | ? |
| int-overflow-local | 10.6757+-0.0425 | 10.6362+-0.0555 | |
| Int16Array-bubble-sort | 67.5525+-2.8459 | 74.8295+-21.5064 | ? might be 1.1077x slower |
| Int16Array-load-int-mul | 1.5779+-0.0232 | 1.5706+-0.0108 | |
| Int8Array-load | 4.6478+-0.1401 | 4.5711+-0.0868 | might be 1.0168x faster |
| integer-divide | 12.8176+-0.2071 | 12.6705+-0.0309 | might be 1.0116x faster |
| integer-modulo | 1.8640+-0.0187 | 1.8572+-0.0589 | |
| make-indexed-storage | 3.8738+-0.5523 | 3.8846+-0.5789 | ? |
| method-on-number | 19.4652+-0.3545 | 19.6622+-0.4129 | ? might be 1.0101x slower |
| nested-function-parsing-random | 323.4666+-11.9693 | 322.0079+-10.6137 | |
| nested-function-parsing | 48.0927+-3.0055 | 48.5746+-3.1521 | ? might be 1.0100x slower |
| new-array-buffer-dead | 3.1104+-0.1121 | 3.0707+-0.1108 | might be 1.0129x faster |
| new-array-buffer-push | 12.7455+-2.0634 | 12.8664+-2.0594 | ? |
| new-array-dead | 23.4288+-0.0723 | 23.4614+-0.0564 | ? |
| new-array-push | 10.2926+-1.6250 | 10.2286+-1.6129 | |
| number-test | 3.4521+-0.0380 | 3.4439+-0.0269 | |
| object-closure-call | 7.1848+-0.2035 | 7.1992+-0.1862 | ? |
| object-test | 3.9351+-0.0338 | 3.9507+-0.0311 | ? |
| poly-stricteq | 76.4296+-0.8444 | 76.2550+-0.8107 | |
| polymorphic-structure | 16.6998+-0.1822 | 16.5929+-0.0388 | |
| polyvariant-monomorphic-get-by-id | 10.3368+-0.0431 | 10.3518+-0.0487 | ? |
| rare-osr-exit-on-local | 17.0050+-0.0481 | 17.0210+-0.0404 | ? |
| register-pressure-from-osr | 26.0911+-0.0405 | 26.3013+-0.2249 | ? |
| simple-activation-demo | 28.7682+-0.2249 | 28.7620+-0.2242 | |
| slow-array-profile-convergence | 3.9795+-0.2066 | 3.9088+-0.2280 | might be 1.0181x faster |
| slow-convergence | 3.1895+-0.0688 | 3.1772+-0.0240 | |
| sparse-conditional | 1.1164+-0.0111 | 1.1217+-0.0109 | ? |
| splice-to-remove | 41.3113+-0.1283 | 41.7906+-0.3009 | ! definitely 1.0116x slower |
| string-concat-object | 3.8786+-1.1705 | 4.2467+-1.2774 | ? might be 1.0949x slower |
| string-concat-pair-object | 4.2080+-1.2601 | 4.1743+-1.2758 | |
| string-concat-pair-simple | 17.1067+-0.4984 | 17.3749+-0.5046 | ? might be 1.0157x slower |
| string-concat-simple | 16.9203+-0.4657 | 17.0031+-0.4794 | ? |
| string-cons-repeat | 12.3395+-0.7783 | 12.4141+-0.8664 | ? |
| string-cons-tower | 30.8612+-18.1041 | 30.8386+-18.0645 | |
| string-hash | 2.1724+-0.0114 | 2.2091+-0.0319 | ? might be 1.0169x slower |
| string-repeat-arith | 37.2928+-0.3484 | 37.0894+-0.3120 | |
| string-sub | 73.2475+-0.9414 | 73.3145+-0.5919 | ? |
| string-test | 3.4426+-0.0295 | 3.4439+-0.0256 | ? |
| structure-hoist-over-transitions | 3.4060+-0.5736 | 3.3916+-0.5480 | |
| tear-off-arguments-simple | 1.5005+-0.0098 | 1.4968+-0.0084 | |
| tear-off-arguments | 2.7613+-0.0147 | 2.7640+-0.0209 | ? |
| temporal-structure | 17.3238+-0.0547 | 17.3025+-0.0360 | |
| to-int32-boolean | 25.3500+-0.0169 | 25.4299+-0.1043 | ? |
| undefined-test | 3.6424+-0.0261 | 3.6611+-0.0318 | ? |
| <arithmetic> | 17.5419+-0.3934 | 17.6178+-0.5476 | ? might be 1.0043x slower |
| <geometric> * | 8.1383+-0.1617 | 8.1457+-0.1917 | ? might be 1.0009x slower |
| <harmonic> | 4.5405+-0.0646 | 4.5371+-0.0763 | might be 1.0007x faster |

DSP:

| Benchmark | TipOfTree | FixFloat32 | Comparison |
|---|---|---|---|
| filtrr-posterize-tint | 44.8839+-0.9748 | 44.8740+-0.9559 | |
| filtrr-tint-contrast-sat-bright | 64.6584+-2.3953 | 63.2487+-1.6434 | might be 1.0223x faster |
| filtrr-tint-sat-adj-contr-mult | 74.7810+-1.9508 | 74.4355+-2.0009 | |
| filtrr-blur-overlay-sat-contr | 193.9332+-5.8263 | 185.8268+-5.1224 | might be 1.0436x faster |
| filtrr-sat-blur-mult-sharpen-contr | 233.8335+-4.6154 | 233.4521+-4.9055 | |
| filtrr-sepia-bias | 32.3349+-1.6908 | 32.5578+-1.7999 | ? |
| route9-vp8 x5 | 1038.2862+-25.6108 | 1050.8866+-16.9175 | ? might be 1.0121x slower |
| starfield x5 | 1176.6958+-6.0084 | 1165.5170+-6.7290 | |
| bellard-jslinux x5 | 2764.9167+-10.5502 | 2769.8333+-13.7874 | ? |
| zynaps-quake3 x5 | 1166.8512+-30.1726 | 1155.6982+-33.0065 | |
| zynaps-mandelbrot x5 | 1001.4150+-5.9211 | 1000.8416+-5.4550 | |
| <arithmetic> | 1173.7177+-7.1366 | 1172.5251+-6.3965 | might be 1.0010x faster |
| <geometric> * | 769.8201+-4.5473 | 767.5457+-5.0842 | might be 1.0030x faster |
| <harmonic> | 277.2825+-6.7005 | 276.1702+-6.6491 | might be 1.0040x faster |

All benchmarks:

| Mean | TipOfTree | FixFloat32 | Comparison |
|---|---|---|---|
| <arithmetic> | 210.3097+-1.1193 | 210.2237+-1.0757 | might be 1.0004x faster |
| <geometric> | 20.2304+-0.2329 | 20.2558+-0.2689 | ? might be 1.0013x slower |
| <harmonic> | 3.8932+-0.0389 | 3.8949+-0.0359 | ? might be 1.0004x slower |

Geomean of preferred means:

| Mean | TipOfTree | FixFloat32 | Comparison |
|---|---|---|---|
| <scaled-result> | 36.9715+-0.3314 | 36.9668+-0.3676 | might be 1.0001x faster |

Landed in http://trac.webkit.org/changeset/147047
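
Each suite in the report ends with arithmetic, geometric, and harmonic means plus a "faster/slower" ratio between the two VMs. The following is a minimal sketch of how such summary figures can be recomputed from the per-benchmark means, using the five JSBench rows as input. The names `JSBENCH`, `summarize`, and `compare` are illustrative and not part of the benchmark harness. The report does not say whether the harness averages row means or individual samples, so only the arithmetic figures should be expected to line up closely; the "might be" and "definitely" qualifiers depend on the confidence intervals and are not modeled here.

```python
from statistics import fmean, geometric_mean, harmonic_mean

# Per-benchmark mean times in ms (TipOfTree, FixFloat32),
# copied from the JSBench rows of the report.
JSBENCH = {
    "amazon": (7.1667, 7.3333),
    "facebook": (33.9167, 33.7500),
    "google": (67.5000, 67.4167),
    "twitter": (8.5833, 8.7500),
    "yahoo": (2.9167, 3.1667),
}


def summarize(times):
    """Return the three summary means for a list of positive timings."""
    return {
        "arithmetic": fmean(times),
        "geometric": geometric_mean(times),
        "harmonic": harmonic_mean(times),
    }


def compare(baseline, candidate):
    """Express candidate relative to baseline as a ratio of at least 1."""
    if candidate >= baseline:
        return f"{candidate / baseline:.4f}x slower"
    return f"{baseline / candidate:.4f}x faster"


tip = summarize([tip_ms for tip_ms, _ in JSBENCH.values()])
fix = summarize([fix_ms for _, fix_ms in JSBENCH.values()])

for name in tip:
    print(f"<{name}> {tip[name]:.4f} {fix[name]:.4f} {compare(tip[name], fix[name])}")
```

For positive inputs the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, which is why the three summary rows always appear in decreasing order in the report.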