RESOLVED FIXED 72845
DFG 32_64 should directly store double virtual registers on SetLocal
https://bugs.webkit.org/show_bug.cgi?id=72845
Filip Pizlo
Reported 2011-11-20 19:05:56 PST
The 32_64 DFG performs a complex shuffle to move an FPR into a pair of GPRs when a SetLocal's source is in an FPR. It should just store the FPR directly to memory instead.
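A minimal sketch of why the direct store is safe, assuming a little-endian host and assuming that a boxed double in the 32_64 value representation occupies one 8-byte slot holding the raw double bits (payload in the low word, tag in the high word). The names `Slot`, `storeViaGPRPair`, `storeDirectly` and `sameSlotContents` are hypothetical and only model the two strategies; this is not the DFG code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for one 8-byte virtual-register slot in the frame.
struct Slot {
    unsigned char bytes[8];
};

// Models the old strategy: split the double's bits into two 32-bit values
// (as if moved from an FPR into a GPR pair), then store each half.
inline Slot storeViaGPRPair(double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    std::uint32_t payload = static_cast<std::uint32_t>(bits);    // low 32 bits
    std::uint32_t tag = static_cast<std::uint32_t>(bits >> 32);  // high 32 bits
    Slot slot;
    std::memcpy(slot.bytes, &payload, sizeof payload);           // offset 0
    std::memcpy(slot.bytes + 4, &tag, sizeof tag);               // offset 4
    return slot;
}

// Models the proposed strategy: one 8-byte store straight from the "FPR".
inline Slot storeDirectly(double value)
{
    Slot slot;
    std::memcpy(slot.bytes, &value, sizeof value);
    return slot;
}

// True when both strategies leave identical bytes in the slot
// (holds on a little-endian host, which this sketch assumes).
inline bool sameSlotContents(double value)
{
    Slot viaPair = storeViaGPRPair(value);
    Slot direct = storeDirectly(value);
    return std::memcmp(viaPair.bytes, direct.bytes, sizeof direct.bytes) == 0;
}
```

Under those assumptions both paths produce the same slot contents, so the GPR shuffle adds register moves without changing what ends up in memory.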
Attachments
the patch (1.71 KB, patch)
2011-11-20 19:11 PST, Filip Pizlo
oliver: review+
Filip Pizlo
Comment 1 2011-11-20 19:11:35 PST
Created attachment 116021 [details]
the patch

Benchmark report for SunSpider, V8, and Kraken on nitroflex.local (MacBookPro8,2).

VMs tested:
"TipOfTree32" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc (r100877)
"DoubleSetLocal" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc (r100877)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                         TipOfTree32       DoubleSetLocal
SunSpider:
   3d-cube                            8.4549+-0.1343  ^    8.0452+-0.2251  ^ definitely 1.0509x faster
   3d-morph                          10.7193+-0.1773      10.4906+-0.1972    might be 1.0218x faster
   3d-raytrace                        9.3581+-0.1933       9.1155+-0.2413    might be 1.0266x faster
   access-binary-trees                1.8295+-0.0483       1.7555+-0.0480    might be 1.0422x faster
   access-fannkuch                    6.9373+-0.1184  ?    6.9615+-0.0946  ?
   access-nbody                       5.1551+-0.1148       5.0928+-0.0389    might be 1.0122x faster
   access-nsieve                      2.5978+-0.0531       2.5769+-0.0547
   bitops-3bit-bits-in-byte           1.2894+-0.0129       1.2718+-0.0216    might be 1.0139x faster
   bitops-bits-in-byte                2.4475+-0.0977  ?    2.4613+-0.0988  ?
   bitops-bitwise-and                 2.9713+-0.1023       2.8787+-0.0825    might be 1.0322x faster
   bitops-nsieve-bits                 6.2704+-0.0946       6.1502+-0.0575    might be 1.0195x faster
   controlflow-recursive              2.6202+-0.0480  ?    2.6276+-0.0594  ?
   crypto-aes                         8.6539+-0.2166       8.6224+-0.1586
   crypto-md5                         3.0539+-0.0660  ?    3.1494+-0.1330  ? might be 1.0313x slower
   crypto-sha1                        2.4836+-0.0682       2.4560+-0.0529    might be 1.0113x faster
   date-format-tofte                 10.5511+-0.2150  ?   10.7371+-0.2162  ? might be 1.0176x slower
   date-format-xparb                 10.4321+-0.1995  ?   10.5713+-0.2092  ? might be 1.0133x slower
   math-cordic                        8.0759+-0.0995  ?    8.1393+-0.1471  ?
   math-partial-sums                  9.5702+-0.1078  ?    9.5913+-0.1801  ?
   math-spectral-norm                 2.5604+-0.0429       2.5581+-0.0905
   regexp-dna                        10.5702+-0.2713      10.4854+-0.1979
   string-base64                      4.3580+-0.1558       4.3331+-0.0987
   string-fasta                       8.7332+-0.1923       8.6863+-0.1381
   string-tagcloud                   13.0627+-0.2919      13.0404+-0.2198
   string-unpack-code                21.5516+-0.4510      21.4704+-0.4070
   string-validate-input              5.9273+-0.1235  ?    5.9752+-0.1225  ?

   <arithmetic> *                     6.9321+-0.0299       6.8940+-0.0226    might be 1.0055x faster
   <geometric>                        5.5357+-0.0271       5.4989+-0.0271    might be 1.0067x faster
   <harmonic>                         4.3034+-0.0384       4.2653+-0.0455    might be 1.0089x faster

                                         TipOfTree32       DoubleSetLocal
V8:
   crypto                            90.0416+-0.8016  ?   90.6770+-0.6636  ?
   deltablue                        154.9745+-0.6468     154.6830+-2.4361
   earley-boyer                     143.3423+-1.0763     143.0581+-1.2902
   raytrace                          61.0435+-0.3575      60.8018+-0.2768
   regexp                           108.9734+-1.1136     107.7358+-0.5199    might be 1.0115x faster
   richards                         167.4317+-1.4102     165.2445+-0.9657    might be 1.0132x faster
   splay                             78.8951+-1.1458      78.2908+-0.8136

   <arithmetic>                     114.9575+-0.3612     114.3559+-0.4613    might be 1.0053x faster
   <geometric> *                    108.3815+-0.3745     107.8706+-0.3324    might be 1.0047x faster
   <harmonic>                       101.8154+-0.4095     101.3808+-0.2502    might be 1.0043x faster

                                         TipOfTree32       DoubleSetLocal
Kraken:
   ai-astar                         521.7866+-4.5492     517.8003+-1.6303
   audio-beat-detection             373.5188+-2.9094  ^  368.5723+-1.1785  ^ definitely 1.0134x faster
   audio-dft                        384.2349+-4.8247  ^  367.0053+-2.2540  ^ definitely 1.0469x faster
   audio-fft                        249.2332+-1.9537     249.0107+-0.6502
   audio-oscillator                 465.0110+-2.5229     461.8881+-3.2125
   imaging-darkroom                 397.2592+-3.4557     390.9891+-3.9149    might be 1.0160x faster
   imaging-desaturate               920.9864+-0.9105     917.8887+-3.3876
   imaging-gaussian-blur            764.3026+-2.0463  ^  698.2806+-2.7788  ^ definitely 1.0945x faster
   json-parse-financial              58.8100+-0.2344      58.4174+-0.1946
   json-stringify-tinderbox          98.7484+-0.3706      98.5522+-0.5464
   stanford-crypto-aes              107.9467+-0.6432  ?  108.5069+-0.2682  ?
   stanford-crypto-ccm              110.9496+-0.5222     109.9775+-0.5927
   stanford-crypto-pbkdf2           216.4592+-1.3129  ^  213.6526+-1.0766  ^ definitely 1.0131x faster
   stanford-crypto-sha256-iterative  91.4980+-0.2973      91.2890+-0.2927

   <arithmetic> *                   340.0532+-0.5716  ^  332.2736+-0.4668  ^ definitely 1.0234x faster
   <geometric>                      248.0638+-0.3670  ^  244.3442+-0.3037  ^ definitely 1.0152x faster
   <harmonic>                       176.6955+-0.2389  ^  175.3484+-0.2843  ^ definitely 1.0077x faster

                                         TipOfTree32       DoubleSetLocal
All benchmarks:
   <arithmetic>                     122.2485+-0.1486  ^  119.8205+-0.1493  ^ definitely 1.0203x faster
   <geometric>                       26.7588+-0.0696  ^   26.5219+-0.0802  ^ definitely 1.0089x faster
   <harmonic>                         7.5933+-0.0661       7.5265+-0.0785    might be 1.0089x faster

                                         TipOfTree32       DoubleSetLocal
Geomean of preferred means:
   <scaled-result>                   63.4529+-0.0942  ^   62.7510+-0.1008  ^ definitely 1.0112x faster
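As a reading aid for the report, the figures are consistent with two simple rules: the reported ratio is the quotient of the two means, and rows marked "definitely" are the ones I checked whose 95% confidence intervals do not overlap, while the "might be" rows I checked do overlap. This is inferred from the numbers only, not taken from the benchmark harness, and the names `Measurement`, `intervalsOverlap` and `speedup` are hypothetical. A small sketch under those assumptions:

```cpp
#include <cassert>
#include <cmath>

// One reported cell: mean time and the half-width of its 95% confidence
// interval, both in milliseconds (the "mean+-halfWidth" form in the report).
struct Measurement {
    double mean;
    double halfWidth;
};

// True when the two confidence intervals share at least one point.
inline bool intervalsOverlap(const Measurement& a, const Measurement& b)
{
    return a.mean - a.halfWidth <= b.mean + b.halfWidth
        && b.mean - b.halfWidth <= a.mean + a.halfWidth;
}

// Ratio of baseline time to candidate time; values above 1 mean the
// candidate ran faster.
inline double speedup(const Measurement& baseline, const Measurement& candidate)
{
    return baseline.mean / candidate.mean;
}
```

For example, feeding in the 3d-cube row (8.4549+-0.1343 against 8.0452+-0.2251) gives non-overlapping intervals, whereas the 3d-morph row (10.7193+-0.1773 against 10.4906+-0.1972) gives overlapping ones.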
Filip Pizlo
Comment 2 2011-11-20 19:32:45 PST