Bug 68227

Summary: DFG JIT should inline Math.abs
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
the patch
gyuyoung.kim: commit-queue-
the patch - fix a lot of stuff
none
the patch - refactored oliver: review+

Description Filip Pizlo 2011-09-16 00:03:26 PDT
Math.abs is a simple function to inline.  We know exactly what it does.  DFG already has the ability to speculate that it is definitely calling a particular function, thanks to CheckMethod.  Implementing this feature will be a great first step for supporting inlining in general, and other sorts of intrinsic functions in particular.
Comment 1 Filip Pizlo 2011-09-16 00:05:10 PDT
This is a 13% win on Kraken, due to a 61% win on imaging-gaussian-blur.


Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"CheckMethod" at /Volumes/Data/pizlo/quartary/OpenSource/WebKitBuild/Release/jsc
"MathDotAbs" at /Volumes/Data/pizlo/septenary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up.
Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence
intervals in milliseconds.

                                            TipOfTree              CheckMethod              MathDotAbs          MathDotAbs v. TipOfTree  
SunSpider:
   3d-cube                                7.8771+-0.2118    ?     7.9524+-0.1532          7.7974+-0.2673         might be 1.0102x faster
   3d-morph                               7.4824+-0.1647          7.3816+-0.1877    ?     7.3948+-0.1308         might be 1.0118x faster
   3d-raytrace                            7.9141+-0.1938    ?     7.9445+-0.1934    ^     7.5801+-0.1588         might be 1.0441x faster
   access-binary-trees                    2.3728+-0.0569          2.2900+-0.0530    ?     2.2988+-0.0655         might be 1.0322x faster
   access-fannkuch                       11.7252+-0.2695    ?    11.7583+-0.1997         11.5732+-0.2085         might be 1.0131x faster
   access-nbody                           4.1582+-0.0611    ?     4.3343+-0.1218    ?     4.8783+-0.8866       ? might be 1.1732x slower
   access-nsieve                          2.6476+-0.0877          2.6017+-0.0657          2.5885+-0.0507         might be 1.0228x faster
   bitops-3bit-bits-in-byte               1.6404+-0.0357    ?     1.6442+-0.0437    ?     1.7019+-0.0660       ? might be 1.0375x slower
   bitops-bits-in-byte                    2.7977+-0.0822          2.6993+-0.0352    ?     2.7583+-0.0610         might be 1.0143x faster
   bitops-bitwise-and                     3.5206+-0.1040    ?     3.6103+-0.0862    ?     3.7339+-0.1119       ? might be 1.0606x slower
   bitops-nsieve-bits                     5.2149+-0.0636    ?     5.3302+-0.1189    ?     5.3482+-0.0815       ? might be 1.0256x slower
   controlflow-recursive                  1.9950+-0.0440    ?     2.0208+-0.0498    ?     2.0828+-0.0918       ? might be 1.0440x slower
   crypto-aes                             7.2947+-0.3756          6.9862+-0.3805          6.9653+-0.3419         might be 1.0473x faster
   crypto-md5                             2.8767+-0.1047          2.8361+-0.1007          2.7773+-0.0539         might be 1.0358x faster
   crypto-sha1                            2.2561+-0.0575          2.1765+-0.0356    ?     2.2091+-0.0720         might be 1.0213x faster
   date-format-tofte                     10.5105+-0.2760         10.3554+-0.1631    ?    10.3874+-0.2410         might be 1.0118x faster
   date-format-xparb                      8.6902+-0.2079    ?     9.0536+-0.2347          8.8852+-0.2032       ? might be 1.0224x slower
   math-cordic                            6.2988+-0.1205          6.1587+-0.1171    ?     6.2109+-0.1638         might be 1.0141x faster
   math-partial-sums                      7.2658+-0.1211    ?     7.3612+-0.1526          7.3564+-0.1448       ? might be 1.0125x slower
   math-spectral-norm                     2.8941+-0.4114    ?     3.1316+-0.6769          2.7383+-0.0906         might be 1.0569x faster
   regexp-dna                            10.7131+-0.2434    ?    10.9977+-0.1918         10.9197+-0.2360       ? might be 1.0193x slower
   string-base64                          5.8277+-0.1519          5.7318+-0.1803    ?     6.0018+-0.2913       ? might be 1.0299x slower
   string-fasta                           7.0203+-0.1499    ?     7.0551+-0.2153          6.9808+-0.1341       
   string-tagcloud                       11.9234+-0.3495    ?    12.2052+-0.4943         11.9608+-0.3251       ?
   string-unpack-code                    19.0595+-0.4566         18.9328+-0.3451         18.3859+-0.2872         might be 1.0366x faster
   string-validate-input                  6.6001+-0.2032          6.4331+-0.1183    ?     6.5165+-0.1550         might be 1.0128x faster

   <arithmetic>                           6.4837+-0.0365    ?     6.4993+-0.0494          6.4627+-0.0509       
   <geometric>                            5.3517+-0.0358    ?     5.3518+-0.0542    ?     5.3537+-0.0428       ?
   <harmonic>                             4.3701+-0.0436          4.3537+-0.0536    ?     4.3803+-0.0349       ?

                                            TipOfTree              CheckMethod              MathDotAbs          MathDotAbs v. TipOfTree  
V8:
   crypto                                83.3726+-0.6100    ?    83.3745+-0.5430         82.8878+-0.3526       
   deltablue                            246.6145+-3.1631    ^   240.5071+-1.2818    ?   243.8680+-2.7147         might be 1.0113x faster
   earley-boyer                          95.5398+-0.6002         94.6500+-0.4064    ?    95.3739+-0.3684       
   raytrace                              75.6773+-6.1967         72.7885+-0.4288         72.7452+-0.5102         might be 1.0403x faster
   regexp                               108.8515+-0.3325        108.4031+-0.5479    ^   106.5182+-0.3822       ^ definitely 1.0219x faster
   richards                             219.0579+-0.7349    ?   219.3573+-2.8656        218.7951+-1.8159       
   splay                                100.3839+-0.9403         99.6274+-0.6926    ^    98.2193+-0.3959       ^ definitely 1.0220x faster

   <arithmetic>                         132.7853+-1.1595        131.2440+-0.5493        131.2011+-0.6375         might be 1.0121x faster
   <geometric>                          119.7848+-1.3477        118.4509+-0.3559        118.1263+-0.3993         might be 1.0140x faster
   <harmonic>                           110.1116+-1.5381        108.8954+-0.2899        108.4592+-0.2939         might be 1.0152x faster

                                            TipOfTree              CheckMethod              MathDotAbs          MathDotAbs v. TipOfTree  
Kraken:
   ai-astar                             633.9435+-4.1433        632.6228+-4.4504    ?   638.0119+-2.3449       ?
   audio-beat-detection                 472.9599+-1.8222    ^   468.2434+-1.5094    !   477.2912+-7.3407       ?
   audio-dft                            424.4090+-1.8641    ?   430.5855+-4.3855    ?   458.9972+-42.7620      ? might be 1.0815x slower
   audio-fft                            369.4709+-1.7994    ^   363.4016+-1.2242    ?   365.5753+-3.3093         might be 1.0107x faster
   audio-oscillator                     319.4655+-1.6668    ^   316.7981+-0.6281    ^   314.1501+-1.7790       ^ definitely 1.0169x faster
   imaging-darkroom                     418.6947+-2.0732    ^   415.0992+-1.2111        414.6453+-0.6410       ^ definitely 1.0098x faster
   imaging-desaturate                   209.1284+-0.7906    ^   207.6594+-0.6454    ?   209.4528+-2.0826       ?
   imaging-gaussian-blur               1736.4665+-7.2265    ^  1574.9463+-7.2155    ^  1077.3633+-4.2862       ^ definitely 1.6118x faster
   json-parse-financial                  48.9103+-0.2949    !    50.7933+-1.0150         50.1767+-0.5739       ! definitely 1.0259x slower
   json-stringify-tinderbox              69.6547+-0.4141         69.3233+-0.1353         69.0610+-0.3808       
   stanford-crypto-aes                  145.1025+-0.9297    ?   145.1324+-0.8121        144.1862+-2.0463       
   stanford-crypto-ccm                  112.5759+-0.5658    ^   111.2905+-0.4928    ?   112.6970+-1.1181       ?
   stanford-crypto-pbkdf2               398.3721+-2.9153        395.1204+-2.0538    ?   400.3701+-5.5710       ?
   stanford-crypto-sha256-iterative     149.4793+-1.5592    ?   149.6702+-0.6328    ?   149.7595+-0.2238       ?

   <arithmetic>                         393.4738+-0.6612    ^   380.7633+-0.9488    ^   348.6955+-2.5650       ^ definitely 1.1284x faster
   <geometric>                          262.9734+-0.4518    ^   260.7038+-0.4816    ^   255.3935+-1.2807       ^ definitely 1.0297x faster
   <harmonic>                           176.2926+-0.4149    ?   177.1632+-0.8505        176.5071+-0.6388       ?

                                            TipOfTree              CheckMethod              MathDotAbs          MathDotAbs v. TipOfTree  
All benchmarks:
   <arithmetic>                         140.5682+-0.2888    ^   136.5612+-0.2688    ^   126.9825+-0.7800       ^ definitely 1.1070x faster
   <geometric>                           27.1244+-0.1400         27.0094+-0.1561         26.8389+-0.1312       ^ definitely 1.0106x faster
   <harmonic>                             7.7144+-0.0761          7.6856+-0.0928    ?     7.7308+-0.0601       ?
Comment 2 Filip Pizlo 2011-09-16 00:09:35 PDT
Created attachment 107613 [details]
the patch

This is still a work in progress, as it needs some build magic for other platforms.  I also plan on refactoring the intrinsic handling in DFGByteCodeParser, since I anticipate adding a lot more intrinsics.
Comment 3 WebKit Review Bot 2011-09-16 00:11:31 PDT
Attachment 107613 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/runtime/JSGlobalData.h:34:  Alphabetical sorting problem.  [build/include_order] [4]
Source/JavaScriptCore/runtime/JSGlobalData.h:223:  The parameter name "intrinsic" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 2 in 19 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Gyuyoung Kim 2011-09-16 00:23:26 PDT
Comment on attachment 107613 [details]
the patch

Attachment 107613 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/9711076
Comment 5 Filip Pizlo 2011-09-16 01:08:08 PDT
Created attachment 107617 [details]
the patch - fix a lot of stuff

This is still a bit of a work in progress because the changes are pretty extensive and I'm not sure I've gotten it right.
Comment 6 Filip Pizlo 2011-09-16 01:54:05 PDT
Created attachment 107625 [details]
the patch - refactored

Refactored the intrinsic handling in DFG::ByteCodeParser to be more reasonable, particularly for when we add more intrinsics.
Comment 7 Filip Pizlo 2011-09-16 11:49:41 PDT
Landed in r95310.