Bug 68430

Summary: DFG JIT performs too many negative zero checks, and too many overflow checks
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
the patch
none
the patch - fix style
none
the patch - fix bugs oliver: review+

Description Filip Pizlo 2011-09-20 01:10:49 PDT
In the following constructs, the DFG JIT could perform some obvious optimizations.  Currently it doesn't.  All of these optimizations require one mechanism: backward propagation of the worst-case use kind of a value.  The DFG currently does have this mechanism, but it should.

a[b * c]
       b * c does not need to check for negative zero, since a[-0] is the same as a[0].

(a + b) | c
       a + b does not need to check for overflow.

(a * b) * c
       a * b does not need to check for negative zero, if (a * b) * c does not need to.

Finally, in the case where all of these optimizations fail and the DFG produces a double result because it believes that an overflow needs to be noted, the DFG JIT should not subsequently give up on speculation just because it sees a ValueToInt32.  It's better to just perform a conversion to integer (even if it's not cheap) then it is to OSR exit.
Comment 1 Filip Pizlo 2011-09-20 01:19:03 PDT
Created attachment 107974 [details]
the patch

This is a nice win.



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"BetterNumberConv" at /Volumes/Data/pizlo/senary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.

                                            TipOfTree            BetterNumberConv                                
SunSpider:
   3d-cube                                7.8791+-0.2642    ?     8.0094+-0.2106       ? might be 1.0165x slower
   3d-morph                               8.4468+-1.3923          7.5334+-0.1758         might be 1.1212x faster
   3d-raytrace                            7.6742+-0.2178    ?     7.7203+-0.2013       ?
   access-binary-trees                    2.3274+-0.0733    ?     2.4202+-0.1349       ? might be 1.0399x slower
   access-fannkuch                       11.5724+-0.2467    ?    11.6143+-0.2553       ?
   access-nbody                           3.9548+-0.0935          3.9532+-0.1059       
   access-nsieve                          2.6385+-0.0683    ?     2.6655+-0.0824       ? might be 1.0102x slower
   bitops-3bit-bits-in-byte               1.6935+-0.0437    ?     1.7216+-0.0415       ? might be 1.0166x slower
   bitops-bits-in-byte                    2.8113+-0.0587    ?     2.8335+-0.0522       ?
   bitops-bitwise-and                     3.8596+-0.2919          3.6143+-0.0739         might be 1.0679x faster
   bitops-nsieve-bits                     6.0068+-0.7020          5.4780+-0.1282         might be 1.0965x faster
   controlflow-recursive                  2.0247+-0.0389    ?     2.0689+-0.0863       ? might be 1.0218x slower
   crypto-aes                             7.0917+-0.3823          6.9152+-0.2227         might be 1.0255x faster
   crypto-md5                             2.8758+-0.0891    ?     3.4291+-0.7067       ? might be 1.1924x slower
   crypto-sha1                            2.2817+-0.0575          2.2302+-0.0499         might be 1.0231x faster
   date-format-tofte                     10.4814+-0.2338         10.0192+-0.2975         might be 1.0461x faster
   date-format-xparb                      9.0143+-0.2438          8.9260+-0.2320       
   math-cordic                            6.1861+-0.0999    ?     6.4139+-0.1781       ? might be 1.0368x slower
   math-partial-sums                      7.5917+-0.1730          7.5370+-0.3952       
   math-spectral-norm                     2.6855+-0.0724    ?     2.7436+-0.0586       ? might be 1.0216x slower
   regexp-dna                            11.1408+-0.1574    ^    10.7841+-0.1643       ^ definitely 1.0331x faster
   string-base64                          5.9814+-0.2405    ?     6.6745+-0.8607       ? might be 1.1159x slower
   string-fasta                           6.9674+-0.2165    ?     7.0385+-0.1657       ? might be 1.0102x slower
   string-tagcloud                       12.3302+-0.2751         12.1328+-0.4081         might be 1.0163x faster
   string-unpack-code                    19.4090+-0.7800         18.4574+-0.5016         might be 1.0516x faster
   string-validate-input                  6.8779+-0.2438          6.7539+-0.3112         might be 1.0184x faster

   <arithmetic>                           6.6078+-0.0965          6.5265+-0.0744         might be 1.0125x faster
   <geometric>                            5.4311+-0.0597          5.4225+-0.0752       
   <harmonic>                             4.4166+-0.0361    ?     4.4481+-0.0708       ?

                                            TipOfTree            BetterNumberConv                                
V8:
   crypto                                79.0186+-2.6448    ^    72.7767+-0.4733       ^ definitely 1.0858x faster
   deltablue                            248.5009+-1.6062        248.1938+-3.3008       
   earley-boyer                          98.2168+-1.0922         97.2901+-1.1711       
   raytrace                              70.8401+-0.7682         69.8204+-0.6046         might be 1.0146x faster
   regexp                               106.8140+-0.3316    ?   107.0127+-0.6624       ?
   richards                             220.1409+-1.9328        218.8661+-1.5256       
   splay                                 99.3371+-0.8613    ?   100.6593+-0.9449       ? might be 1.0133x slower

   <arithmetic>                         131.8383+-0.4309    ^   130.6599+-0.6989       ^ definitely 1.0090x faster
   <geometric>                          118.0035+-0.4232    ^   116.3769+-0.5123       ^ definitely 1.0140x faster
   <harmonic>                           107.7174+-0.5083    ^   105.6953+-0.4600       ^ definitely 1.0191x faster

                                            TipOfTree            BetterNumberConv                                
Kraken:
   ai-astar                             632.9047+-6.0172    ?   640.0044+-8.9559       ? might be 1.0112x slower
   audio-beat-detection                 477.4563+-3.0349    ^   466.4551+-1.2944       ^ definitely 1.0236x faster
   audio-dft                            440.9264+-11.6726   ?   442.5051+-9.4084       ?
   audio-fft                            370.5988+-1.4298        370.5765+-2.1601       
   audio-oscillator                     322.8622+-2.8701    ^   294.8488+-4.3636       ^ definitely 1.0950x faster
   imaging-darkroom                     420.7992+-1.5369    ?   421.2799+-3.3969       ?
   imaging-desaturate                   227.0660+-12.3934       219.7338+-1.5312         might be 1.0334x faster
   imaging-gaussian-blur                587.2874+-5.2972    ?   589.8138+-3.2451       ?
   json-parse-financial                  48.8475+-0.8010         48.7679+-0.5812       
   json-stringify-tinderbox              69.5481+-0.8916         69.4690+-0.3177       
   stanford-crypto-aes                  145.9134+-1.5163        142.6462+-1.7818         might be 1.0229x faster
   stanford-crypto-ccm                  112.4743+-1.3139        110.8871+-1.5005         might be 1.0143x faster
   stanford-crypto-pbkdf2               403.8542+-7.7887    ^   379.5667+-5.4956       ^ definitely 1.0640x faster
   stanford-crypto-sha256-iterative     149.8354+-1.2315    ^    86.8713+-2.7902       ^ definitely 1.7248x faster

   <arithmetic>                         315.0267+-1.3871    ^   305.9590+-0.9802       ^ definitely 1.0296x faster
   <geometric>                          246.1830+-1.3091    ^   232.9846+-0.9188       ^ definitely 1.0566x faster
   <harmonic>                           174.9720+-1.1707    ^   163.1923+-0.9627       ^ definitely 1.0722x faster

                                            TipOfTree            BetterNumberConv                                
All benchmarks:
   <arithmetic>                         117.1286+-0.4081    ^   114.2071+-0.3557       ^ definitely 1.0256x faster
   <geometric>                           26.7537+-0.1693    ^    26.2405+-0.2139       ^ definitely 1.0196x faster
   <harmonic>                             7.7919+-0.0622    ?     7.8369+-0.1218       ?
Comment 2 WebKit Review Bot 2011-09-20 01:21:52 PDT
Attachment 107974 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/dfg/DFGPropagator.cpp:146:  Missing spaces around /  [whitespace/operators] [3]
Source/JavaScriptCore/dfg/DFGPropagator.cpp:146:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
Source/JavaScriptCore/dfg/DFGPropagator.cpp:153:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
Source/JavaScriptCore/dfg/DFGPropagator.cpp:253:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 4 in 7 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Filip Pizlo 2011-09-20 01:30:24 PDT
Created attachment 107976 [details]
the patch - fix style
Comment 4 Filip Pizlo 2011-09-20 02:41:59 PDT
Created attachment 107980 [details]
the patch - fix bugs
Comment 5 Oliver Hunt 2011-09-20 08:48:37 PDT
Comment on attachment 107980 [details]
the patch - fix bugs

View in context: https://bugs.webkit.org/attachment.cgi?id=107980&action=review

r=me, but remove the silly whitespace change :D

> Source/JavaScriptCore/dfg/DFGNode.h:201
> -#define NodeIdMask           0xFFF
> -#define NodeResultMask      0xF000
> -#define NodeMustGenerate   0x10000 // set on nodes that have side effects, and may not trivially be removed by DCE.
> -#define NodeIsConstant     0x20000
> -#define NodeIsJump         0x40000
> -#define NodeIsBranch       0x80000
> -#define NodeIsTerminal    0x100000
> -#define NodeHasVarArgs    0x200000
> -#define NodeClobbersWorld 0x400000
> -#define NodeMightClobber  0x800000
> +#define NodeIdMask            0xFFF
> +#define NodeResultMask       0xF000
> +#define NodeMustGenerate    0x10000 // set on nodes that have side effects, and may not trivially be removed by DCE.
> +#define NodeIsConstant      0x20000
> +#define NodeIsJump          0x40000
> +#define NodeIsBranch        0x80000
> +#define NodeIsTerminal     0x100000
> +#define NodeHasVarArgs     0x200000
> +#define NodeClobbersWorld  0x400000
> +#define NodeMightClobber   0x800000

Die whitespace die!
Comment 6 Filip Pizlo 2011-09-20 13:10:31 PDT
Landed in r95563