WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
154151
[JSC] On x86, improve the selection of which value are selected for the UseDef part of commutative operations
https://bugs.webkit.org/show_bug.cgi?id=154151
Summary
[JSC] On x86, improve the selection of which value are selected for the UseDe...
Benjamin Poulain
Reported
2016-02-11 20:19:41 PST
[JSC] On x86, improve the selection of which value are selected for the UseDef part of commutative operations
Attachments
Patch
(63.51 KB, patch)
2016-02-11 20:49 PST
,
Benjamin Poulain
no flags
Details
Formatted Diff
Diff
Patch
(63.31 KB, patch)
2016-02-11 20:55 PST
,
Benjamin Poulain
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Benjamin Poulain
Comment 1
2016-02-11 20:49:43 PST
Created
attachment 271129
[details]
Patch
Benjamin Poulain
Comment 2
2016-02-11 20:55:19 PST
Created
attachment 271130
[details]
Patch
WebKit Commit Bot
Comment 3
2016-02-11 20:57:18 PST
Attachment 271130
[details]
did not pass style-queue: ERROR: Source/JavaScriptCore/b3/testb3.cpp:7512: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7516: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7525: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7529: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7555: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7559: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7568: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7572: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7583: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7599: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7599: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7600: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7614: Place brace on its own line for function definitions. [whitespace/braces] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7630: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7630: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/JavaScriptCore/b3/testb3.cpp:7631: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:8103: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:8107: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:8116: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:8120: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] ERROR: Source/JavaScriptCore/b3/testb3.cpp:8146: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:8150: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:8159: More than one command on the same line [whitespace/newline] [4] ERROR: Source/JavaScriptCore/b3/testb3.cpp:8163: Consider using CHECK_EQ instead of CHECK(a == b) [readability/check] [2] Total errors found: 24 in 18 files If any of these errors are false positives, please file a bug against check-webkit-style.
Benjamin Poulain
Comment 4
2016-02-11 21:34:16 PST
Conf#1 Conf#2 SunSpider: 3d-cube 4.9807+-0.3910 ? 5.1310+-0.5237 ? might be 1.0302x slower 3d-morph 5.6145+-0.7113 5.5343+-0.3501 might be 1.0145x faster 3d-raytrace 5.5520+-0.1096 ? 5.6379+-0.0959 ? might be 1.0155x slower access-binary-trees 2.1312+-0.0934 ? 2.1907+-0.0996 ? might be 1.0279x slower access-fannkuch 6.0688+-0.1713 5.9970+-0.0450 might be 1.0120x faster access-nbody 2.8165+-0.1189 2.6761+-0.0542 might be 1.0525x faster access-nsieve 3.3240+-0.3803 ? 3.4362+-0.6464 ? might be 1.0337x slower bitops-3bit-bits-in-byte 1.1750+-0.0191 ? 1.2067+-0.0734 ? might be 1.0270x slower bitops-bits-in-byte 3.2972+-0.2676 3.2783+-0.1430 bitops-bitwise-and 2.1758+-0.3842 2.0399+-0.0323 might be 1.0666x faster bitops-nsieve-bits 3.0468+-0.0648 ? 3.1505+-0.4272 ? might be 1.0340x slower controlflow-recursive 2.3895+-0.0360 2.3827+-0.0854 crypto-aes 4.1085+-0.0824 4.1005+-0.0300 crypto-md5 2.6155+-0.1184 ? 2.6743+-0.0902 ? might be 1.0225x slower crypto-sha1 2.3057+-0.0525 ? 2.3200+-0.0430 ? date-format-tofte 7.1667+-0.3967 7.0256+-0.5552 might be 1.0201x faster date-format-xparb 4.7774+-0.3503 ? 4.9822+-0.6569 ? might be 1.0429x slower math-cordic 3.0025+-0.0117 ? 3.0117+-0.0760 ? math-partial-sums 5.2362+-0.4526 4.9336+-0.0487 might be 1.0613x faster math-spectral-norm 2.1485+-0.1867 2.0172+-0.0545 might be 1.0651x faster regexp-dna 6.4863+-0.6913 6.3231+-0.5293 might be 1.0258x faster string-base64 4.6309+-0.1672 ? 4.9260+-0.7500 ? might be 1.0637x slower string-fasta 6.1742+-0.5841 5.9218+-0.1762 might be 1.0426x faster string-tagcloud 8.2081+-0.1182 ? 8.3020+-0.1040 ? might be 1.0114x slower string-unpack-code 19.2466+-0.2708 ? 20.2353+-2.4593 ? might be 1.0514x slower string-validate-input 4.4363+-0.2171 4.3779+-0.1600 might be 1.0133x faster <arithmetic> 4.7352+-0.0616 ? 4.7620+-0.1431 ? might be 1.0057x slower Conf#1 Conf#2 LongSpider: 3d-cube 859.9249+-23.7331 834.9955+-13.4516 might be 1.0299x faster 3d-morph 608.1065+-0.6311 ? 612.3118+-7.5851 ? 3d-raytrace 642.3957+-5.6464 ? 643.7234+-14.1999 ? access-binary-trees 859.9940+-9.6951 854.5524+-4.3282 access-fannkuch 297.1360+-46.3517 ? 306.9773+-43.2358 ? might be 1.0331x slower access-nbody 704.2305+-3.0817 ^ 529.6432+-5.2786 ^ definitely 1.3296x faster access-nsieve 364.4804+-23.8780 352.2651+-2.9053 might be 1.0347x faster bitops-3bit-bits-in-byte 31.3505+-0.6331 ? 31.7724+-0.5952 ? might be 1.0135x slower bitops-bits-in-byte 100.0665+-7.6215 94.0323+-5.4902 might be 1.0642x faster bitops-nsieve-bits 411.5748+-6.0209 ^ 401.3955+-4.0279 ^ definitely 1.0254x faster controlflow-recursive 463.1362+-7.8059 ? 466.9354+-2.2785 ? crypto-aes 675.9106+-5.7451 671.1015+-5.2273 crypto-md5 606.9241+-17.4303 601.5291+-3.4836 crypto-sha1 760.4535+-8.1912 756.0695+-5.5658 date-format-tofte 586.4337+-21.9596 581.2687+-4.3826 date-format-xparb 677.7083+-3.9509 ? 677.9825+-11.3665 ? hash-map 158.2751+-2.7577 154.9199+-3.3136 might be 1.0217x faster math-cordic 469.9202+-7.0700 ! 537.7718+-10.7689 ! definitely 1.1444x slower math-partial-sums 443.0925+-6.8684 ^ 424.0950+-3.0523 ^ definitely 1.0448x faster math-spectral-norm 781.2426+-11.9736 ^ 568.1845+-1.3451 ^ definitely 1.3750x faster string-base64 375.2461+-3.2254 372.9837+-1.7681 string-fasta 366.4617+-11.3853 359.8982+-3.3985 might be 1.0182x faster string-tagcloud 179.5907+-1.7535 179.3361+-1.7787 <geometric> 406.7255+-4.6155 ^ 394.8478+-1.7041 ^ definitely 1.0301x faster Conf#1 Conf#2 V8Spider: crypto 38.3080+-0.5236 ? 38.7747+-0.0985 ? might be 1.0122x slower deltablue 53.5134+-1.7187 52.1210+-3.5557 might be 1.0267x faster earley-boyer 41.3729+-0.6301 41.2130+-0.4564 raytrace 21.1843+-0.3867 ? 21.5524+-0.5742 ? might be 1.0174x slower regexp 64.9052+-2.4526 63.5829+-0.8795 might be 1.0208x faster richards 41.8108+-0.1672 ? 42.7329+-2.3884 ? might be 1.0221x slower splay 33.8207+-2.9171 32.7829+-0.5352 might be 1.0317x faster <geometric> 40.0291+-0.5147 39.8546+-0.5622 might be 1.0044x faster Conf#1 Conf#2 Octane: encrypt 0.16353+-0.00551 ? 0.16587+-0.00119 ? might be 1.0143x slower decrypt 2.90342+-0.00983 ? 2.90354+-0.01409 ? deltablue x2 0.14188+-0.00191 0.13994+-0.00292 might be 1.0139x faster earley 0.29170+-0.00404 0.29037+-0.00332 boyer 4.83234+-0.23306 4.81827+-0.28100 navier-stokes x2 5.06262+-0.00369 ^ 5.02861+-0.02814 ^ definitely 1.0068x faster raytrace x2 0.92376+-0.00730 ^ 0.90786+-0.00469 ^ definitely 1.0175x faster richards x2 0.08408+-0.00176 0.08312+-0.00158 might be 1.0115x faster splay x2 0.34874+-0.00444 ? 0.34985+-0.00467 ? regexp x2 25.61157+-0.30929 ? 25.88974+-0.74449 ? might be 1.0109x slower pdfjs x2 38.42483+-0.27337 ? 38.47841+-0.27644 ? mandreel x2 45.11808+-0.40494 44.53033+-0.51351 might be 1.0132x faster gbemu x2 25.69402+-0.91761 25.39295+-0.31535 might be 1.0119x faster closure 0.57951+-0.00352 0.57719+-0.00627 jquery 7.60857+-0.08635 7.51081+-0.07659 might be 1.0130x faster box2d x2 9.59004+-0.36357 9.40430+-0.17040 might be 1.0198x faster zlib x2 398.45152+-7.95789 387.39514+-10.17527 might be 1.0285x faster typescript x2 670.43469+-6.56461 ? 674.14691+-16.07428 ? <geometric> 5.40296+-0.03868 5.36487+-0.02317 might be 1.0071x faster Conf#1 Conf#2 Kraken: ai-astar 96.560+-1.149 ? 97.470+-4.044 ? audio-beat-detection 52.785+-1.172 ^ 46.641+-2.125 ^ definitely 1.1317x faster audio-dft 98.918+-1.551 ? 98.919+-0.889 ? audio-fft 43.882+-2.536 ^ 35.610+-0.068 ^ definitely 1.2323x faster audio-oscillator 50.038+-3.190 ? 50.964+-3.232 ? might be 1.0185x slower imaging-darkroom 63.140+-0.175 ^ 61.273+-0.961 ^ definitely 1.0305x faster imaging-desaturate 45.268+-0.095 ? 46.139+-1.038 ? might be 1.0192x slower imaging-gaussian-blur 71.131+-1.371 ? 71.547+-0.658 ? json-parse-financial 38.203+-3.073 37.659+-1.343 might be 1.0144x faster json-stringify-tinderbox 23.943+-2.363 23.566+-0.478 might be 1.0160x faster stanford-crypto-aes 41.649+-0.501 ? 42.465+-4.439 ? might be 1.0196x slower stanford-crypto-ccm 37.482+-1.169 ? 39.173+-2.410 ? might be 1.0451x slower stanford-crypto-pbkdf2 102.549+-0.583 ? 103.831+-0.866 ? might be 1.0125x slower stanford-crypto-sha256-iterative 39.140+-0.400 ? 39.473+-0.622 ? <arithmetic> 57.478+-0.368 56.767+-0.454 might be 1.0125x faster Conf#1 Conf#2 AsmBench: bigfib.cpp 455.7532+-2.8219 446.5265+-14.8294 might be 1.0207x faster cray.c 391.1835+-5.9553 ^ 376.1931+-3.6907 ^ definitely 1.0398x faster dry.c 431.1630+-12.6541 ? 484.8128+-111.1190 ? might be 1.1244x slower FloatMM.c 737.8479+-6.6793 722.5731+-15.2301 might be 1.0211x faster gcc-loops.cpp 3678.4009+-10.9651 ? 3685.7400+-8.2060 ? n-body.c 895.2535+-6.3906 ^ 829.7021+-3.5961 ^ definitely 1.0790x faster Quicksort.c 422.0996+-4.7730 412.1368+-15.3854 might be 1.0242x faster stepanov_container.cpp 3392.2217+-50.0440 3363.9493+-25.7228 Towers.c 270.5571+-3.4450 ? 277.9387+-6.7250 ? might be 1.0273x slower <geometric> 746.9036+-3.5820 742.7479+-14.9086 might be 1.0056x faster Conf#1 Conf#2 CompressionBench: huffman 33.8490+-0.9527 33.6009+-0.6666 arithmetic-simple 270.5825+-4.7837 269.4107+-2.3910 arithmetic-precise 249.8232+-4.0015 ? 252.3221+-3.8757 ? might be 1.0100x slower arithmetic-complex-precise 249.8817+-1.8363 ? 252.7509+-3.6428 ? might be 1.0115x slower arithmetic-precise-order-0 273.3307+-3.2085 ? 274.0517+-4.4966 ? arithmetic-precise-order-1 310.1912+-8.6248 308.2789+-1.8559 arithmetic-precise-order-2 352.7737+-8.1991 ? 354.0013+-3.9566 ? arithmetic-simple-order-1 317.4145+-1.9266 ? 318.5526+-2.1949 ? arithmetic-simple-order-2 366.1655+-3.2764 ? 367.9578+-4.4014 ? lz-string 333.1664+-8.6698 ? 340.1069+-23.7490 ? might be 1.0208x slower <geometric> 241.0336+-1.2623 ? 241.9587+-1.6997 ? might be 1.0038x slower Conf#1 Conf#2 Geomean of preferred means: <scaled-result> 49.4459+-0.0983 49.1259+-0.2833 might be 1.0065x faster
Filip Pizlo
Comment 5
2016-02-11 21:58:09 PST
Comment on
attachment 271130
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=271130&action=review
> Source/JavaScriptCore/b3/B3CheckSpecial.cpp:133 > +bool CheckSpecial::shouldTryAliasingDef(Inst& inst, unsigned& defIndex)
Nit: this could return Option<unsigned> instead of returning bool and taking an unsigned ref.
> Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1364 > + if (range.last - range.first <= 1 && range.count > range.admitStackCount)
Is this sound for the case that you have a self loop and a use *just before* a def? That's not super likely and the Tmp would have to be live all the way to the root block I guess, but it seems like that might be a bit weak.
Benjamin Poulain
Comment 6
2016-02-11 23:01:58 PST
(In reply to
comment #5
)
> Comment on
attachment 271130
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=271130&action=review
> > > Source/JavaScriptCore/b3/B3CheckSpecial.cpp:133 > > +bool CheckSpecial::shouldTryAliasingDef(Inst& inst, unsigned& defIndex) > > Nit: this could return Option<unsigned> instead of returning bool and taking > an unsigned ref.
Good point, that's gonna be cleaner.
> > Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1364 > > + if (range.last - range.first <= 1 && range.count > range.admitStackCount) > > Is this sound for the case that you have a self loop and a use *just before* > a def? That's not super likely and the Tmp would have to be live all the > way to the root block I guess, but it seems like that might be a bit weak.
Do you have a suggestion? The reason for this counter was that it is super cheap and only run once. I could have code to clone the HashSet and prune it based on the Liveness when we get there but that seems like extra complication for handling undefined behavior. I would say that is still unspillable. The Tmp would have to be live on entry. Since there are no Def on entry, there are no point to spill from. Or am I misunderstanding the case?
Filip Pizlo
Comment 7
2016-02-12 06:05:37 PST
(In reply to
comment #6
)
> (In reply to
comment #5
) > > Comment on
attachment 271130
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=271130&action=review
> > > > > Source/JavaScriptCore/b3/B3CheckSpecial.cpp:133 > > > +bool CheckSpecial::shouldTryAliasingDef(Inst& inst, unsigned& defIndex) > > > > Nit: this could return Option<unsigned> instead of returning bool and taking > > an unsigned ref. > > Good point, that's gonna be cleaner. > > > > Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:1364 > > > + if (range.last - range.first <= 1 && range.count > range.admitStackCount) > > > > Is this sound for the case that you have a self loop and a use *just before* > > a def? That's not super likely and the Tmp would have to be live all the > > way to the root block I guess, but it seems like that might be a bit weak. > > Do you have a suggestion? The reason for this counter was that it is super > cheap and only run once. I could have code to clone the HashSet and prune it > based on the Liveness when we get there but that seems like extra > complication for handling undefined behavior. > > I would say that is still unspillable. > The Tmp would have to be live on entry. Since there are no Def on entry, > there are no point to spill from. > > Or am I misunderstanding the case?
Would you want to make Var unspillable? For (;;) { ... Lots of stuff Tmp1 = Var Var = Tmp2 ... More stuff } I may be misreading the alto but it seems like it would make Var unspillable.
Benjamin Poulain
Comment 8
2016-02-12 13:41:01 PST
Comment on
attachment 271130
[details]
Patch Filip is okay with me improving this in follow ups. I'll try to do that this weekend.
WebKit Commit Bot
Comment 9
2016-02-12 14:32:45 PST
Comment on
attachment 271130
[details]
Patch Clearing flags on attachment: 271130 Committed
r196513
: <
http://trac.webkit.org/changeset/196513
>
WebKit Commit Bot
Comment 10
2016-02-12 14:32:48 PST
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug