RESOLVED FIXED 156486
[JSC] B3 can use undefined bits or not defined required bits when spilling
https://bugs.webkit.org/show_bug.cgi?id=156486
Summary [JSC] B3 can use undefined bits or not defined required bits when spilling
Benjamin Poulain
Reported 2016-04-11 18:53:30 PDT
[JSC] B3 can use undefined bits or not defined required bits when spilling
Attachments
Patch (11.31 KB, patch)
2016-04-11 19:02 PDT, Benjamin Poulain
fpizlo: review+
Patch (12.41 KB, patch)
2016-04-11 19:29 PDT, Benjamin Poulain
no flags
Benjamin Poulain
Comment 1 2016-04-11 19:02:48 PDT
WebKit Commit Bot
Comment 2 2016-04-11 19:04:22 PDT
Attachment 276197 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/b3/testb3.cpp:11430: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11440: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11440: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11481: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11494: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11494: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11495: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11498: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11498: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1465: Place brace on its own line for function definitions. [whitespace/braces] [4] Total errors found: 10 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 3 2016-04-11 19:05:26 PDT
Comment on attachment 276197 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=276197&action=review > Source/JavaScriptCore/ChangeLog:17 > + Op64 %tmp1, %tmp2, %tmp3 I believe that the width analysis should report that we could read 64 bits from %tmp1, so then %tmp1's spill slot would be 64-bit.
Benjamin Poulain
Comment 4 2016-04-11 19:29:10 PDT
WebKit Commit Bot
Comment 5 2016-04-11 19:30:42 PDT
Attachment 276200 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/b3/testb3.cpp:11430: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11440: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11440: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11481: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11494: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11494: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11495: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11498: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:11498: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1465: Place brace on its own line for function definitions. [whitespace/braces] [4] Total errors found: 10 in 4 files If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 6 2016-04-11 21:20:02 PDT
Comment on attachment 276200 [details] Patch How does performance look with this change?
Benjamin Poulain
Comment 7 2016-04-11 21:49:37 PDT
Looks like ASM.js is consistently ~1% slower. Conf#1 Conf#2 SunSpider: 3d-cube 8.4700+-0.2026 ? 8.7783+-0.3943 ? might be 1.0364x slower 3d-morph 8.0162+-0.1140 7.8483+-0.1866 might be 1.0214x faster 3d-raytrace 8.9434+-0.0881 ? 8.9725+-0.4521 ? access-binary-trees 3.1238+-0.1945 ? 3.1438+-0.1957 ? access-fannkuch 9.1057+-0.1984 ? 9.2297+-0.2061 ? might be 1.0136x slower access-nbody 4.2905+-0.2136 ? 4.3686+-0.1271 ? might be 1.0182x slower access-nsieve 4.6182+-0.1408 ? 4.7003+-0.1357 ? might be 1.0178x slower bitops-3bit-bits-in-byte 1.6906+-0.1234 1.6691+-0.1486 might be 1.0129x faster bitops-bits-in-byte 5.0170+-0.1201 ? 5.0589+-0.1450 ? bitops-bitwise-and 2.7761+-0.1008 ? 2.9255+-0.0535 ? might be 1.0538x slower bitops-nsieve-bits 4.6110+-0.1628 ? 4.6262+-0.1005 ? controlflow-recursive 3.4317+-0.1120 ? 3.5907+-0.1227 ? might be 1.0463x slower crypto-aes 6.1561+-0.4940 ? 6.1772+-0.3405 ? crypto-md5 3.8763+-0.1746 ? 3.9257+-0.1325 ? might be 1.0128x slower crypto-sha1 3.5192+-0.1567 ? 3.5223+-0.2197 ? date-format-tofte 11.9573+-0.3182 11.8615+-0.1484 date-format-xparb 7.2903+-0.2723 7.2382+-0.2115 math-cordic 4.4269+-0.1337 4.4053+-0.1568 math-partial-sums 9.7850+-0.1278 9.4720+-0.3954 might be 1.0330x faster math-spectral-norm 3.3010+-0.0402 3.2201+-0.1869 might be 1.0251x faster regexp-dna 10.3401+-0.5237 10.0945+-0.3628 might be 1.0243x faster string-base64 6.5437+-0.0766 ? 6.5683+-0.2092 ? string-fasta 9.0342+-0.4598 ? 9.0490+-0.3726 ? string-tagcloud 12.9733+-0.7505 12.6915+-0.3714 might be 1.0222x faster string-unpack-code 26.7853+-1.0230 ? 27.3695+-1.8284 ? might be 1.0218x slower string-validate-input 6.3016+-0.2336 6.2375+-0.1245 might be 1.0103x faster <arithmetic> 7.1686+-0.0851 ? 7.1825+-0.1089 ? might be 1.0019x slower Conf#1 Conf#2 Octane: encrypt 0.27507+-0.00936 0.27294+-0.00566 decrypt 4.92433+-0.01932 ! 5.00699+-0.04442 ! definitely 1.0168x slower deltablue x2 0.24382+-0.00785 0.24216+-0.00365 earley 0.51026+-0.00446 ? 0.51219+-0.00350 ? boyer 8.59965+-0.01781 ! 8.75121+-0.05111 ! definitely 1.0176x slower navier-stokes x2 6.45040+-0.01609 6.43984+-0.03206 raytrace x2 1.39355+-0.03777 ? 1.41191+-0.02327 ? might be 1.0132x slower richards x2 0.14166+-0.00191 ? 0.14179+-0.00215 ? splay x2 0.53410+-0.00409 0.53074+-0.01105 regexp x2 26.70413+-0.48799 26.62181+-0.37138 pdfjs x2 60.75209+-1.63618 60.37284+-0.46014 mandreel x2 67.10087+-0.27170 ! 67.86370+-0.30657 ! definitely 1.0114x slower gbemu x2 44.55959+-2.38735 42.65322+-0.40096 might be 1.0447x faster closure 0.83894+-0.00408 ! 0.84956+-0.00392 ! definitely 1.0127x slower jquery 10.92553+-0.25972 ? 10.99053+-0.10177 ? box2d x2 16.01959+-0.14411 ? 16.03096+-0.27536 ? zlib x2 550.03451+-13.48751 548.60665+-5.01457 typescript x2 1069.22711+-7.02785 ? 1072.60455+-11.38620 ? <geometric> 8.25054+-0.05727 8.24164+-0.04554 might be 1.0011x faster Conf#1 Conf#2 Kraken: ai-astar 132.773+-1.363 132.524+-0.396 audio-beat-detection 68.271+-0.251 ? 68.936+-1.227 ? audio-dft 127.768+-1.648 ? 129.038+-1.590 ? audio-fft 54.887+-0.406 ? 54.990+-0.319 ? audio-oscillator 78.777+-0.603 78.288+-0.740 imaging-darkroom 96.976+-2.958 95.803+-0.294 might be 1.0122x faster imaging-desaturate 92.572+-0.285 ? 92.892+-0.557 ? imaging-gaussian-blur 121.156+-1.000 110.865+-15.946 might be 1.0928x faster json-parse-financial 67.831+-0.424 ? 68.075+-0.309 ? json-stringify-tinderbox 40.422+-0.764 40.129+-0.196 stanford-crypto-aes 61.270+-0.715 61.013+-1.304 stanford-crypto-ccm 59.892+-4.987 59.022+-5.200 might be 1.0147x faster stanford-crypto-pbkdf2 149.264+-2.635 ? 149.866+-3.190 ? stanford-crypto-sha256-iterative 58.026+-0.545 ? 59.032+-0.555 ? might be 1.0173x slower <arithmetic> 86.420+-0.555 85.748+-0.567 might be 1.0078x faster Conf#1 Conf#2 AsmBench: bigfib.cpp 656.9565+-9.9714 ? 658.5748+-14.1574 ? cray.c 565.3174+-5.8253 ? 566.1335+-10.5589 ? dry.c 651.1944+-27.5699 ? 674.1650+-0.6050 ? might be 1.0353x slower FloatMM.c 889.3350+-2.3675 ? 891.9225+-3.2651 ? gcc-loops.cpp 6420.1935+-30.6182 ? 6434.8116+-22.1752 ? n-body.c 1522.2892+-79.2739 ? 1544.3279+-85.1951 ? might be 1.0145x slower Quicksort.c 574.9522+-3.7392 ! 590.7017+-2.7602 ! definitely 1.0274x slower stepanov_container.cpp 4460.5098+-27.9670 ? 4487.1917+-49.2854 ? Towers.c 377.2852+-1.7200 ! 385.9769+-0.6723 ! definitely 1.0230x slower <geometric> 1081.9978+-5.5919 ! 1095.7885+-2.4173 ! definitely 1.0127x slower Conf#1 Conf#2 Geomean of preferred means: <scaled-result> 48.4940+-0.2464 ? 48.5633+-0.3385 ? might be 1.0014x slower Second run of ASM.js: Conf#1 Conf#2 bigfib.cpp 654.1125+-6.8197 ? 655.4327+-5.5464 ? cray.c 565.2639+-5.8350 ? 571.0610+-8.0384 ? might be 1.0103x slower dry.c 657.9476+-30.5735 ? 658.9620+-28.6640 ? FloatMM.c 890.0372+-1.6299 ? 891.0305+-3.7049 ? gcc-loops.cpp 6419.4900+-21.4621 6414.3735+-32.6108 n-body.c 1548.6223+-91.1361 1542.5115+-81.2546 Quicksort.c 572.1219+-3.2370 ! 589.4455+-8.9777 ! definitely 1.0303x slower stepanov_container.cpp 4463.2445+-60.1400 ? 4490.3182+-59.7972 ? Towers.c 376.7356+-0.4260 ! 385.5025+-0.7840 ! definitely 1.0233x slower <geometric> 1084.1483+-11.3733 ? 1092.5004+-9.8962 ? might be 1.0077x slower
Benjamin Poulain
Comment 8 2016-04-11 22:14:06 PDT
Baseline SmallStackslot LargeStackslot LargeStackslot v. Baseline bigfib.cpp 655.0576+-11.3429 652.7036+-12.3709 ? 656.0836+-5.9068 ? cray.c 565.4435+-8.1634 ? 571.1257+-2.0889 565.9597+-5.1407 ? dry.c 650.7029+-25.8607 ? 674.8217+-2.0914 658.0632+-29.7491 ? might be 1.0113x slower FloatMM.c 889.3929+-1.2567 889.1890+-1.4387 ? 890.1675+-3.5484 ? gcc-loops.cpp 6413.8161+-21.2019 ? 6414.8786+-38.3145 6408.4991+-19.4484 n-body.c 1574.0956+-69.2270 1550.5068+-42.5719 ? 1569.4171+-78.0803 Quicksort.c 574.8629+-6.2179 ? 575.6371+-4.8284 ! 594.1012+-1.9979 ! definitely 1.0335x slower stepanov_container.cpp 4437.3698+-20.8219 ? 4464.8043+-26.3570 ? 4476.6780+-44.9211 ? Towers.c 376.1720+-1.6275 ? 377.3234+-2.8622 ! 386.1508+-1.9629 ! definitely 1.0265x slower <geometric> 1084.5183+-8.1783 ? 1089.1745+-3.3137 ? 1094.0357+-9.8083 ? might be 1.0088x slower
Benjamin Poulain
Comment 9 2016-04-11 22:15:09 PDT
Baseline against first patch again: Conf#1 Conf#2 bigfib.cpp 653.0945+-10.4978 ? 669.9697+-30.3487 ? might be 1.0258x slower cray.c 564.0093+-6.7668 ? 574.0212+-5.0392 ? might be 1.0178x slower dry.c 667.3754+-24.8306 650.4228+-25.3341 might be 1.0261x faster FloatMM.c 890.0898+-2.8910 ? 890.6833+-1.4781 ? gcc-loops.cpp 6434.5238+-43.4264 6414.8804+-17.1803 n-body.c 1548.2090+-85.1345 ? 1551.4907+-41.5082 ? Quicksort.c 576.8409+-2.9926 575.0517+-2.3609 stepanov_container.cpp 4438.1675+-30.6399 ? 4491.6426+-88.1739 ? might be 1.0120x slower Towers.c 376.4673+-0.9480 ? 377.4253+-2.2056 ? <geometric> 1085.8985+-6.9108 ? 1089.3462+-5.8747 ? might be 1.0032x slower
Filip Pizlo
Comment 10 2016-04-11 22:23:19 PDT
I think both approaches are sound, and they both appear to deliver good performance. I like the second one a little better because I think it puts the burden of work on defs (extra store on some defs) instead of uses (extra zext on some uses).
WebKit Commit Bot
Comment 11 2016-04-11 23:16:16 PDT
Comment on attachment 276200 [details] Patch Clearing flags on attachment: 276200 Committed r199337: <http://trac.webkit.org/changeset/199337>
Note You need to log in before you can comment on or make changes to this bug.