RESOLVED FIXED Bug 68430
DFG JIT performs too many negative zero checks, and too many overflow checks
https://bugs.webkit.org/show_bug.cgi?id=68430
Summary DFG JIT performs too many negative zero checks, and too many overflow checks
Filip Pizlo
Reported 2011-09-20 01:10:49 PDT
In the following constructs, the DFG JIT could perform some obvious optimizations. Currently it doesn't. All of these optimizations require one mechanism: backward propagation of the worst-case use kind of a value. The DFG currently does have this mechanism, but it should. a[b * c] b * c does not need to check for negative zero, since a[-0] is the same as a[0]. (a + b) | c a + b does not need to check for overflow. (a * b) * c a * b does not need to check for negative zero, if (a * b) * c does not need to. Finally, in the case where all of these optimizations fail and the DFG produces a double result because it believes that an overflow needs to be noted, the DFG JIT should not subsequently give up on speculation just because it sees a ValueToInt32. It's better to just perform a conversion to integer (even if it's not cheap) then it is to OSR exit.
Attachments
the patch (59.50 KB, patch)
2011-09-20 01:19 PDT, Filip Pizlo
no flags
the patch - fix style (59.47 KB, patch)
2011-09-20 01:30 PDT, Filip Pizlo
no flags
the patch - fix bugs (62.15 KB, patch)
2011-09-20 02:41 PDT, Filip Pizlo
oliver: review+
Filip Pizlo
Comment 1 2011-09-20 01:19:03 PDT
Created attachment 107974 [details] the patch This is a nice win. Benchmark report for SunSpider, V8, and Kraken. VMs tested: "TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc "BetterNumberConv" at /Volumes/Data/pizlo/senary/OpenSource/WebKitBuild/Release/jsc Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree BetterNumberConv SunSpider: 3d-cube 7.8791+-0.2642 ? 8.0094+-0.2106 ? might be 1.0165x slower 3d-morph 8.4468+-1.3923 7.5334+-0.1758 might be 1.1212x faster 3d-raytrace 7.6742+-0.2178 ? 7.7203+-0.2013 ? access-binary-trees 2.3274+-0.0733 ? 2.4202+-0.1349 ? might be 1.0399x slower access-fannkuch 11.5724+-0.2467 ? 11.6143+-0.2553 ? access-nbody 3.9548+-0.0935 3.9532+-0.1059 access-nsieve 2.6385+-0.0683 ? 2.6655+-0.0824 ? might be 1.0102x slower bitops-3bit-bits-in-byte 1.6935+-0.0437 ? 1.7216+-0.0415 ? might be 1.0166x slower bitops-bits-in-byte 2.8113+-0.0587 ? 2.8335+-0.0522 ? bitops-bitwise-and 3.8596+-0.2919 3.6143+-0.0739 might be 1.0679x faster bitops-nsieve-bits 6.0068+-0.7020 5.4780+-0.1282 might be 1.0965x faster controlflow-recursive 2.0247+-0.0389 ? 2.0689+-0.0863 ? might be 1.0218x slower crypto-aes 7.0917+-0.3823 6.9152+-0.2227 might be 1.0255x faster crypto-md5 2.8758+-0.0891 ? 3.4291+-0.7067 ? might be 1.1924x slower crypto-sha1 2.2817+-0.0575 2.2302+-0.0499 might be 1.0231x faster date-format-tofte 10.4814+-0.2338 10.0192+-0.2975 might be 1.0461x faster date-format-xparb 9.0143+-0.2438 8.9260+-0.2320 math-cordic 6.1861+-0.0999 ? 6.4139+-0.1781 ? might be 1.0368x slower math-partial-sums 7.5917+-0.1730 7.5370+-0.3952 math-spectral-norm 2.6855+-0.0724 ? 2.7436+-0.0586 ? might be 1.0216x slower regexp-dna 11.1408+-0.1574 ^ 10.7841+-0.1643 ^ definitely 1.0331x faster string-base64 5.9814+-0.2405 ? 6.6745+-0.8607 ? might be 1.1159x slower string-fasta 6.9674+-0.2165 ? 7.0385+-0.1657 ? might be 1.0102x slower string-tagcloud 12.3302+-0.2751 12.1328+-0.4081 might be 1.0163x faster string-unpack-code 19.4090+-0.7800 18.4574+-0.5016 might be 1.0516x faster string-validate-input 6.8779+-0.2438 6.7539+-0.3112 might be 1.0184x faster <arithmetic> 6.6078+-0.0965 6.5265+-0.0744 might be 1.0125x faster <geometric> 5.4311+-0.0597 5.4225+-0.0752 <harmonic> 4.4166+-0.0361 ? 4.4481+-0.0708 ? TipOfTree BetterNumberConv V8: crypto 79.0186+-2.6448 ^ 72.7767+-0.4733 ^ definitely 1.0858x faster deltablue 248.5009+-1.6062 248.1938+-3.3008 earley-boyer 98.2168+-1.0922 97.2901+-1.1711 raytrace 70.8401+-0.7682 69.8204+-0.6046 might be 1.0146x faster regexp 106.8140+-0.3316 ? 107.0127+-0.6624 ? richards 220.1409+-1.9328 218.8661+-1.5256 splay 99.3371+-0.8613 ? 100.6593+-0.9449 ? might be 1.0133x slower <arithmetic> 131.8383+-0.4309 ^ 130.6599+-0.6989 ^ definitely 1.0090x faster <geometric> 118.0035+-0.4232 ^ 116.3769+-0.5123 ^ definitely 1.0140x faster <harmonic> 107.7174+-0.5083 ^ 105.6953+-0.4600 ^ definitely 1.0191x faster TipOfTree BetterNumberConv Kraken: ai-astar 632.9047+-6.0172 ? 640.0044+-8.9559 ? might be 1.0112x slower audio-beat-detection 477.4563+-3.0349 ^ 466.4551+-1.2944 ^ definitely 1.0236x faster audio-dft 440.9264+-11.6726 ? 442.5051+-9.4084 ? audio-fft 370.5988+-1.4298 370.5765+-2.1601 audio-oscillator 322.8622+-2.8701 ^ 294.8488+-4.3636 ^ definitely 1.0950x faster imaging-darkroom 420.7992+-1.5369 ? 421.2799+-3.3969 ? imaging-desaturate 227.0660+-12.3934 219.7338+-1.5312 might be 1.0334x faster imaging-gaussian-blur 587.2874+-5.2972 ? 589.8138+-3.2451 ? json-parse-financial 48.8475+-0.8010 48.7679+-0.5812 json-stringify-tinderbox 69.5481+-0.8916 69.4690+-0.3177 stanford-crypto-aes 145.9134+-1.5163 142.6462+-1.7818 might be 1.0229x faster stanford-crypto-ccm 112.4743+-1.3139 110.8871+-1.5005 might be 1.0143x faster stanford-crypto-pbkdf2 403.8542+-7.7887 ^ 379.5667+-5.4956 ^ definitely 1.0640x faster stanford-crypto-sha256-iterative 149.8354+-1.2315 ^ 86.8713+-2.7902 ^ definitely 1.7248x faster <arithmetic> 315.0267+-1.3871 ^ 305.9590+-0.9802 ^ definitely 1.0296x faster <geometric> 246.1830+-1.3091 ^ 232.9846+-0.9188 ^ definitely 1.0566x faster <harmonic> 174.9720+-1.1707 ^ 163.1923+-0.9627 ^ definitely 1.0722x faster TipOfTree BetterNumberConv All benchmarks: <arithmetic> 117.1286+-0.4081 ^ 114.2071+-0.3557 ^ definitely 1.0256x faster <geometric> 26.7537+-0.1693 ^ 26.2405+-0.2139 ^ definitely 1.0196x faster <harmonic> 7.7919+-0.0622 ? 7.8369+-0.1218 ?
WebKit Review Bot
Comment 2 2011-09-20 01:21:52 PDT
Attachment 107974 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGPropagator.cpp:146: Missing spaces around / [whitespace/operators] [3] Source/JavaScriptCore/dfg/DFGPropagator.cpp:146: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Source/JavaScriptCore/dfg/DFGPropagator.cpp:153: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Source/JavaScriptCore/dfg/DFGPropagator.cpp:253: One line control clauses should not use braces. [whitespace/braces] [4] Total errors found: 4 in 7 files If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 3 2011-09-20 01:30:24 PDT
Created attachment 107976 [details] the patch - fix style
Filip Pizlo
Comment 4 2011-09-20 02:41:59 PDT
Created attachment 107980 [details] the patch - fix bugs
Oliver Hunt
Comment 5 2011-09-20 08:48:37 PDT
Comment on attachment 107980 [details] the patch - fix bugs View in context: https://bugs.webkit.org/attachment.cgi?id=107980&action=review r=me, but remove the silly whitespace change :D > Source/JavaScriptCore/dfg/DFGNode.h:201 > -#define NodeIdMask 0xFFF > -#define NodeResultMask 0xF000 > -#define NodeMustGenerate 0x10000 // set on nodes that have side effects, and may not trivially be removed by DCE. > -#define NodeIsConstant 0x20000 > -#define NodeIsJump 0x40000 > -#define NodeIsBranch 0x80000 > -#define NodeIsTerminal 0x100000 > -#define NodeHasVarArgs 0x200000 > -#define NodeClobbersWorld 0x400000 > -#define NodeMightClobber 0x800000 > +#define NodeIdMask 0xFFF > +#define NodeResultMask 0xF000 > +#define NodeMustGenerate 0x10000 // set on nodes that have side effects, and may not trivially be removed by DCE. > +#define NodeIsConstant 0x20000 > +#define NodeIsJump 0x40000 > +#define NodeIsBranch 0x80000 > +#define NodeIsTerminal 0x100000 > +#define NodeHasVarArgs 0x200000 > +#define NodeClobbersWorld 0x400000 > +#define NodeMightClobber 0x800000 Die whitespace die!
Filip Pizlo
Comment 6 2011-09-20 13:10:31 PDT
Landed in r95563
Note You need to log in before you can comment on or make changes to this bug.