Bug 69996 - DFG should have inlining
Summary: DFG should have inlining
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: 528+ (Nightly build)
Hardware: All All
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 69995 70068 70157 70278 70466 70467 70468 70578
Blocks:
 
Reported: 2011-10-12 21:12 PDT by Filip Pizlo
Modified: 2011-10-24 01:54 PDT
CC List: 10 users

See Also:


Attachments
work in progress (43.78 KB, patch)
2011-10-17 22:20 PDT, Filip Pizlo
no flags
work in progress (almost done) (65.13 KB, patch)
2011-10-18 02:40 PDT, Filip Pizlo
no flags
work in progress - almost done (79.08 KB, patch)
2011-10-18 16:48 PDT, Filip Pizlo
no flags
more work in progress (83.57 KB, patch)
2011-10-18 19:20 PDT, Filip Pizlo
no flags
more work in progress (96.59 KB, patch)
2011-10-19 21:51 PDT, Filip Pizlo
no flags
more work in progress (101.69 KB, patch)
2011-10-20 15:46 PDT, Filip Pizlo
no flags
fix style (100.96 KB, patch)
2011-10-20 18:26 PDT, Filip Pizlo
no flags
it works (101.33 KB, patch)
2011-10-21 01:53 PDT, Filip Pizlo
gyuyoung.kim: commit-queue-
the patch (100.29 KB, patch)
2011-10-21 14:56 PDT, Filip Pizlo
fpizlo: review-
the patch (112.17 KB, patch)
2011-10-21 16:23 PDT, Filip Pizlo
fpizlo: review-
webkit-ews: commit-queue-
the patch (113.07 KB, patch)
2011-10-21 17:34 PDT, Filip Pizlo
oliver: review+

Description Filip Pizlo 2011-10-12 21:12:02 PDT
For now, this is an umbrella bug.
Comment 1 Filip Pizlo 2011-10-17 22:20:46 PDT
Created attachment 111381 [details]
work in progress

If this even compiles, then clang is broken.

Still working on this.  Items that are (hopefully) done:

- Flushing state related to the call that can be reflectively accessed, like the arguments and the callee.

- Determining when it is profitable to inline based on the classic heuristics (inlinee size, inline stack depth, recursion detection, rejection of fancy operations that the inliner won't handle correctly).

- Actually inlining code, with the following special things:

  - Inlining calls to single-basic-block functions does not introduce any control flow, and so the only "evidence" that there was a call is the flushing of arguments and the callee.

  - Inlining calls to functions with multiple basic blocks does the right thing.

  - Inlining calls to functions that don't terminate does the right thing.

  - Inlining calls to functions that have multiple return statements does the right thing.

Things that aren't done:

- OSR support.

- ???

- Make it compile, pass tests, and produce speed-ups.
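
For readers unfamiliar with the "classic heuristics" named in the list of done items, here is a minimal sketch of what such a profitability check might look like. It is not taken from the patch: `FunctionInfo`, `shouldInline`, and both thresholds are hypothetical names and values chosen for illustration, and the real DFG code may structure this quite differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a callee; not a JavaScriptCore type.
struct FunctionInfo {
    int id;                    // identity of the function
    unsigned instructionCount; // proxy for "inlinee size"
    bool usesUnsupportedOps;   // "fancy operations" the inliner won't handle
};

// Illustrative limits only; the patch's real limits are not stated in this bug.
constexpr unsigned kMaxInlineeSize = 100;
constexpr unsigned kMaxInlineDepth = 5;

// inlineStack holds the ids of the functions currently being inlined,
// outermost first.
bool shouldInline(const FunctionInfo& callee, const std::vector<int>& inlineStack)
{
    if (callee.usesUnsupportedOps)
        return false; // rejection of operations the inliner cannot handle
    if (callee.instructionCount > kMaxInlineeSize)
        return false; // inlinee size limit
    if (inlineStack.size() >= kMaxInlineDepth)
        return false; // inline stack depth limit
    for (int id : inlineStack) {
        if (id == callee.id)
            return false; // recursion detection
    }
    return true;
}
```

Checking the cheap, always-disqualifying conditions first keeps the recursion scan off the common rejection path; that ordering is a design choice of this sketch, not something the bug describes.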
Comment 2 Filip Pizlo 2011-10-18 02:40:17 PDT
Created attachment 111413 [details]
work in progress (almost done)

Added support for DFG graph dumping that prints useful stuff when inlining happens.  Implemented OSR support for inlining.

Still haven't tried to compile it.  Putting it up here for backup.
Comment 3 Filip Pizlo 2011-10-18 16:48:30 PDT
Created attachment 111528 [details]
work in progress - almost done

It already works for simple programs.  Still more debugging to do though.
Comment 4 WebKit Review Bot 2011-10-18 16:50:51 PDT
Attachment 111528 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/dfg/DFGJITCompiler.h:444:  The parameter name "codeBlock" adds no information, so it should be removed.  [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGGraph.cpp:82:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Source/JavaScriptCore/dfg/DFGGraph.cpp:92:  Declaration has space between type name and & in Node &currentNode  [whitespace/declaration] [3]
Source/JavaScriptCore/dfg/DFGGraph.cpp:93:  Declaration has space between type name and & in Node &previousNode  [whitespace/declaration] [3]
Source/JavaScriptCore/runtime/Executable.h:577:  The parameter name "kind" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 5 in 25 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Filip Pizlo 2011-10-18 19:20:38 PDT
Created attachment 111552 [details]
more work in progress

It sort of works.
Comment 6 Oliver Hunt 2011-10-18 19:30:29 PDT
> It sort of works.

Always promising :D
Comment 7 Filip Pizlo 2011-10-19 21:51:12 PDT
Created attachment 111725 [details]
more work in progress

Removed some really bad bugs.  Note that this patch includes some of the patches that are listed as dependencies of this bug.  I'll fix that when those patches land.
Comment 8 Filip Pizlo 2011-10-20 15:46:50 PDT
Created attachment 111861 [details]
more work in progress

This isn't fully tested yet but it has a good chunk of functionality for landing once all bugs are addressed.  It's not yet complete though, so more patches will come after this one to flesh out the functionality.  Notably, we don't inline constructors (even though we really should) and we don't have a good inlining heuristics story yet.

On the V8 harness, it's an 8.7% speed-up.  Here's the performance using my harness:



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"Inlining" at /Volumes/Data/pizlo/septenary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.

                                            TipOfTree                Inlining                                    
SunSpider:
   3d-cube                                7.2294+-0.1743    ?     7.3492+-0.1425       ? might be 1.0166x slower
   3d-morph                               7.6382+-0.1203    ?     7.6582+-0.1256       ?
   3d-raytrace                            7.5246+-0.1949    ?     7.6317+-0.1888       ? might be 1.0142x slower
   access-binary-trees                    1.7486+-0.0494          1.7437+-0.0460       
   access-fannkuch                        6.5008+-0.1280          6.4137+-0.1062         might be 1.0136x faster
   access-nbody                           3.3089+-0.0763    !     3.5599+-0.0831       ! definitely 1.0759x slower
   access-nsieve                          2.6140+-0.0645    ?     2.6633+-0.0950       ? might be 1.0188x slower
   bitops-3bit-bits-in-byte               1.7503+-0.0280    ^     1.2948+-0.0372       ^ definitely 1.3518x faster
   bitops-bits-in-byte                    2.7918+-0.0953    ^     2.3645+-0.0767       ^ definitely 1.1807x faster
   bitops-bitwise-and                     3.4221+-0.1173          3.3934+-0.0957       
   bitops-nsieve-bits                     5.4132+-0.1308          5.3306+-0.0769         might be 1.0155x faster
   controlflow-recursive                  2.1608+-0.0553          2.1481+-0.0479       
   crypto-aes                             6.7897+-0.1909    !     7.2303+-0.2243       ! definitely 1.0649x slower
   crypto-md5                             2.7882+-0.0802          2.7425+-0.0615         might be 1.0167x faster
   crypto-sha1                            2.5315+-0.0735          2.4292+-0.0628         might be 1.0421x faster
   date-format-tofte                     10.1474+-0.2384          9.9917+-0.2174         might be 1.0156x faster
   date-format-xparb                      8.8026+-0.1675    ?     9.2311+-0.3072       ? might be 1.0487x slower
   math-cordic                            6.5753+-0.2417          6.5217+-0.1291       
   math-partial-sums                      7.6060+-0.1360    ?     7.6328+-0.1490       ?
   math-spectral-norm                     2.8472+-0.0610    ^     2.5817+-0.0614       ^ definitely 1.1028x faster
   regexp-dna                            11.6238+-0.1617         11.4674+-0.1856         might be 1.0136x faster
   string-base64                          4.4783+-0.1092          4.3667+-0.1042         might be 1.0256x faster
   string-fasta                           6.3774+-0.1054          6.3234+-0.1221       
   string-tagcloud                       11.4621+-0.1364         11.4097+-0.2304       
   string-unpack-code                    20.2390+-0.2392         20.1693+-0.2645       
   string-validate-input                  5.2160+-0.1046    ?     5.2871+-0.1256       ? might be 1.0136x slower

   <arithmetic> *                         6.1380+-0.0387          6.1129+-0.0295       
   <geometric>                            5.0496+-0.0307    ^     4.9542+-0.0300       ^ definitely 1.0192x faster
   <harmonic>                             4.1716+-0.0276    ^     3.9812+-0.0323       ^ definitely 1.0478x faster

                                            TipOfTree                Inlining                                    
V8:
   crypto                                73.4235+-0.4433    !    75.0053+-0.5804       ! definitely 1.0215x slower
   deltablue                            228.1755+-1.0590    ^   171.0148+-0.7583       ^ definitely 1.3342x faster
   earley-boyer                          91.4364+-1.6587    ?    94.3709+-1.4916       ? might be 1.0321x slower
   raytrace                              58.3178+-0.2884    !    59.9449+-0.4125       ! definitely 1.0279x slower
   regexp                               106.6983+-0.6395        106.3563+-0.5184       
   richards                             183.5438+-0.5371    ^   142.4662+-0.4241       ^ definitely 1.2883x faster
   splay                                 97.2676+-0.5820    ^    95.9481+-0.6546       ^ definitely 1.0138x faster

   <arithmetic>                         119.8376+-0.3437    ^   106.4438+-0.3814       ^ definitely 1.1258x faster
   <geometric> *                        107.8833+-0.3250    ^   100.7615+-0.4141       ^ definitely 1.0707x faster
   <harmonic>                            98.3173+-0.3068    ^    95.4845+-0.4341       ^ definitely 1.0297x faster

                                            TipOfTree                Inlining                                    
Kraken:
   ai-astar                             506.3080+-8.7997        504.8021+-3.0190       
   audio-beat-detection                 195.2658+-1.2351        194.5032+-1.0999       
   audio-dft                            272.5609+-2.2965        270.2281+-1.5971       
   audio-fft                            126.5316+-1.1957    ?   126.5381+-1.1408       ?
   audio-oscillator                     255.5309+-2.0254        254.8838+-1.3599       
   imaging-darkroom                     422.1903+-1.7198    ^   409.7418+-3.5790       ^ definitely 1.0304x faster
   imaging-desaturate                   222.6997+-0.4788    ?   223.1201+-1.2055       ?
   imaging-gaussian-blur                562.7176+-2.0577    ?   563.6714+-1.7784       ?
   json-parse-financial                  57.2528+-0.3100    ^    56.3006+-0.2807       ^ definitely 1.0169x faster
   json-stringify-tinderbox              68.5532+-0.5595    !    70.0214+-0.2980       ! definitely 1.0214x slower
   stanford-crypto-aes                  133.5707+-1.7238        132.0632+-1.7421         might be 1.0114x faster
   stanford-crypto-ccm                  102.6632+-1.2818    ?   102.7728+-0.8918       ?
   stanford-crypto-pbkdf2               196.2656+-1.0252    ?   198.3048+-2.5661       ? might be 1.0104x slower
   stanford-crypto-sha256-iterative      71.2168+-0.2476    !    72.1844+-0.4220       ! definitely 1.0136x slower

   <arithmetic> *                       228.0948+-0.7858        227.0811+-0.5272       
   <geometric>                          178.9893+-0.5003        178.6557+-0.4535       
   <harmonic>                           140.4928+-0.3341    ?   140.5655+-0.4151       ?

                                            TipOfTree                Inlining                                    
All benchmarks:
   <arithmetic>                          89.1867+-0.2757    ^    86.8761+-0.1709       ^ definitely 1.0266x faster
   <geometric>                           23.0604+-0.0983    ^    22.5749+-0.0838       ^ definitely 1.0215x faster
   <harmonic>                             7.3397+-0.0477    ^     7.0112+-0.0556       ^ definitely 1.0469x faster

                                            TipOfTree                Inlining                                    
Geomean of preferred means:
   <scaled-result>                       53.2550+-0.1847    ^    51.9082+-0.1028       ^ definitely 1.0259x faster
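
The verdict column in the report above can be read as follows: the ratio is the slower mean divided by the faster mean, and "definitely" appears where the two 95% confidence intervals do not overlap. The sketch below encodes that reading. It is an interpretation of the printed numbers, not the harness's actual code, and `Measurement`, `intervalsDisjoint`, `speedRatio`, and `verdict` are hypothetical names. The harness also seems to omit the ratio for very small "might be" differences, which this sketch does not model.

```cpp
#include <cassert>
#include <string>

struct Measurement {
    double mean;      // mean execution time in milliseconds
    double halfWidth; // 95% confidence half-width (the number after "+-")
};

// True when the two confidence intervals share no point.
bool intervalsDisjoint(const Measurement& a, const Measurement& b)
{
    return a.mean + a.halfWidth < b.mean - b.halfWidth
        || b.mean + b.halfWidth < a.mean - a.halfWidth;
}

// Slower time divided by faster time, so the result is always >= 1.
double speedRatio(const Measurement& a, const Measurement& b)
{
    assert(a.mean > 0 && b.mean > 0);
    return a.mean > b.mean ? a.mean / b.mean : b.mean / a.mean;
}

// Assumed rule: "definitely" for disjoint intervals, "might be" otherwise.
std::string verdict(const Measurement& baseline, const Measurement& candidate)
{
    const char* certainty = intervalsDisjoint(baseline, candidate) ? "definitely" : "might be";
    const char* direction = candidate.mean < baseline.mean ? "faster" : "slower";
    return std::string(certainty) + " " + direction;
}
```

For access-nbody (3.3089+-0.0763 against 3.5599+-0.0831) the intervals are [3.2326, 3.3852] and [3.4768, 3.6430], which do not overlap, and 3.5599 / 3.3089 is about 1.0759, matching the reported "definitely 1.0759x slower".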
Comment 9 Filip Pizlo 2011-10-20 15:47:08 PDT
Comment on attachment 111861 [details]
more work in progress

Oops, didn't mean to set the r? flag.
Comment 10 WebKit Review Bot 2011-10-20 15:49:54 PDT
Attachment 111861 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/bytecode/CodeOrigin.h:65:  The parameter name "inlineCallFrame" adds no information, so it should be removed.  [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGDriver.cpp:40:  Should have a space between // and comment  [whitespace/comments] [4]
Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:857:  Should have a space between // and comment  [whitespace/comments] [4]
Total errors found: 3 in 30 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 11 Filip Pizlo 2011-10-20 18:26:39 PDT
Created attachment 111886 [details]
fix style
Comment 12 Filip Pizlo 2011-10-21 01:49:13 PDT
Updated performance after merging.



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"Inlining" at /Volumes/Data/pizlo/septenary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime()
function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in
milliseconds.

                                            TipOfTree                Inlining                                    
SunSpider:
   3d-cube                                7.5802+-0.1700          7.5607+-0.1572       
   3d-morph                               7.7902+-0.1274    ?     7.8295+-0.1048       ?
   3d-raytrace                            7.6805+-0.1737    ?     7.7723+-0.2040       ? might be 1.0120x slower
   access-binary-trees                    1.7349+-0.0446    ?     1.7760+-0.0487       ? might be 1.0237x slower
   access-fannkuch                        6.4702+-0.1248    ?     6.6150+-0.1171       ? might be 1.0224x slower
   access-nbody                           3.3459+-0.0766    !     3.6672+-0.0759       ! definitely 1.0960x slower
   access-nsieve                          2.5824+-0.0628    ?     2.6535+-0.0556       ? might be 1.0275x slower
   bitops-3bit-bits-in-byte               1.7588+-0.0389    ^     1.2956+-0.0250       ^ definitely 1.3575x faster
   bitops-bits-in-byte                    2.8443+-0.0406    ^     2.3885+-0.0751       ^ definitely 1.1909x faster
   bitops-bitwise-and                     3.4283+-0.0824          3.3971+-0.0965       
   bitops-nsieve-bits                     5.5956+-0.2869          5.4540+-0.0905         might be 1.0260x faster
   controlflow-recursive                  2.1149+-0.0375    ?     2.1161+-0.0353       ?
   crypto-aes                             6.9257+-0.1076    !     7.4491+-0.2030       ! definitely 1.0756x slower
   crypto-md5                             2.8842+-0.0867          2.7452+-0.0804         might be 1.0506x faster
   crypto-sha1                            2.5600+-0.0690          2.4965+-0.0755         might be 1.0254x faster
   date-format-tofte                      9.9971+-0.1856    ?    10.0634+-0.1550       ?
   date-format-xparb                      9.3292+-0.1708    ?     9.4672+-0.2583       ? might be 1.0148x slower
   math-cordic                            6.4908+-0.1038    ?     6.6374+-0.1601       ? might be 1.0226x slower
   math-partial-sums                      7.7452+-0.1433          7.6759+-0.1205       
   math-spectral-norm                     2.9435+-0.0506    ^     2.6500+-0.0545       ^ definitely 1.1108x faster
   regexp-dna                            11.8225+-0.1472         11.5870+-0.1209         might be 1.0203x faster
   string-base64                          4.4506+-0.1049    ?     4.4925+-0.1559       ?
   string-fasta                           6.2615+-0.1023    ?     6.5080+-0.1471       ? might be 1.0394x slower
   string-tagcloud                       11.4536+-0.1568    !    11.8683+-0.1448       ! definitely 1.0362x slower
   string-unpack-code                    20.3437+-0.2670    ?    20.7484+-0.2756       ? might be 1.0199x slower
   string-validate-input                  5.2792+-0.1040    ?     5.3195+-0.1574       ?

   <arithmetic> *                         6.2082+-0.0304    ?     6.2398+-0.0199       ?
   <geometric>                            5.1020+-0.0250    ^     5.0399+-0.0270       ^ definitely 1.0123x faster
   <harmonic>                             4.2040+-0.0250    ^     4.0334+-0.0337       ^ definitely 1.0423x faster

                                            TipOfTree                Inlining                                    
V8:
   crypto                                75.0317+-0.6962    ?    76.1193+-0.6357       ? might be 1.0145x slower
   deltablue                            229.5168+-2.0350    ^   171.9966+-1.3813       ^ definitely 1.3344x faster
   earley-boyer                          94.3379+-2.0813    ?    96.7467+-1.9250       ? might be 1.0255x slower
   raytrace                              59.5699+-0.2731    !    61.1444+-1.2674       ! definitely 1.0264x slower
   regexp                               106.6368+-0.7844    ?   107.0270+-0.9189       ?
   richards                             185.1246+-1.0312    ^   144.3955+-0.9598       ^ definitely 1.2821x faster
   splay                                 98.7830+-0.5944    ^    96.1199+-0.7264       ^ definitely 1.0277x faster

   <arithmetic>                         121.2858+-0.5272    ^   107.6499+-0.4626       ^ definitely 1.1267x faster
   <geometric> *                        109.4864+-0.5406    ^   102.0054+-0.3858       ^ definitely 1.0733x faster
   <harmonic>                           100.0171+-0.5284    ^    96.7603+-0.3744       ^ definitely 1.0337x faster

                                            TipOfTree                Inlining                                    
Kraken:
   ai-astar                             504.0532+-2.4731    ?   512.0117+-6.1466       ? might be 1.0158x slower
   audio-beat-detection                 195.2423+-1.1973    ?   198.4151+-2.1492       ? might be 1.0163x slower
   audio-dft                            286.9930+-9.2743        280.3339+-7.3088         might be 1.0238x faster
   audio-fft                            128.6663+-1.6103        126.4186+-0.7910         might be 1.0178x faster
   audio-oscillator                     258.8092+-4.0336        258.7320+-2.5276       
   imaging-darkroom                     431.3622+-2.6677    ^   412.0987+-4.9051       ^ definitely 1.0467x faster
   imaging-desaturate                   224.0498+-2.1918    ?   224.5701+-1.6935       ?
   imaging-gaussian-blur                569.6371+-6.1933    ?   571.6295+-4.3435       ?
   json-parse-financial                  57.5612+-0.3672    ?    57.8921+-0.5814       ?
   json-stringify-tinderbox              69.6784+-0.7403    !    71.6162+-0.5837       ! definitely 1.0278x slower
   stanford-crypto-aes                  136.1775+-1.8543        134.4457+-2.1320         might be 1.0129x faster
   stanford-crypto-ccm                  103.4945+-1.0784    ?   104.4574+-0.8185       ?
   stanford-crypto-pbkdf2               203.6624+-1.5077        202.5547+-5.2252       
   stanford-crypto-sha256-iterative      72.7979+-0.7534    ?    73.4482+-0.6291       ?

   <arithmetic> *                       231.5846+-0.5794        230.6160+-0.6849       
   <geometric>                          181.8590+-0.3005        181.6454+-0.6541       
   <harmonic>                           142.6110+-0.2599    ?   143.1602+-0.5793       ?

                                            TipOfTree                Inlining                                    
All benchmarks:
   <arithmetic>                          90.4808+-0.2130    ^    88.1789+-0.2279       ^ definitely 1.0261x faster
   <geometric>                           23.3539+-0.0772    ^    22.9450+-0.0851       ^ definitely 1.0178x faster
   <harmonic>                             7.3984+-0.0431    ^     7.1037+-0.0580       ^ definitely 1.0415x faster

                                            TipOfTree                Inlining                                    
Geomean of preferred means:
   <scaled-result>                       53.9934+-0.1639    ^    52.7502+-0.1200       ^ definitely 1.0236x faster
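
Each suite in the report above is summarized by three means. As a rough illustration, the sketch below computes them from per-benchmark means; the function names are hypothetical and this is not the harness's code. Applied to the seven V8 TipOfTree values in this report, the arithmetic mean reproduces the printed 121.2858. The geometric and harmonic results land close to the printed figures but may differ in the later digits, presumably because the harness aggregates per sample rather than per benchmark mean (that explanation is a guess).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

double arithmeticMean(const std::vector<double>& xs)
{
    assert(!xs.empty());
    double sum = 0;
    for (double x : xs)
        sum += x;
    return sum / xs.size();
}

// Computed in log space to avoid overflow in the running product.
double geometricMean(const std::vector<double>& xs)
{
    assert(!xs.empty());
    double logSum = 0;
    for (double x : xs)
        logSum += std::log(x);
    return std::exp(logSum / xs.size());
}

double harmonicMean(const std::vector<double>& xs)
{
    assert(!xs.empty());
    double reciprocalSum = 0;
    for (double x : xs)
        reciprocalSum += 1.0 / x;
    return xs.size() / reciprocalSum;
}
```

The harmonic mean weights fast benchmarks most heavily and the arithmetic mean weights slow ones most heavily, which is one way to see why a large win on deltablue and richards moves the V8 arithmetic mean (1.1267x) more than the harmonic mean (1.0337x).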
Comment 13 Filip Pizlo 2011-10-21 01:53:24 PDT
Created attachment 111924 [details]
it works

Except on 32_64, where it'll either fail to compile or crash in awesome ways.  I haven't copy-pasted some code yet.
Comment 14 Gyuyoung Kim 2011-10-21 01:58:06 PDT
Comment on attachment 111924 [details]
it works

Attachment 111924 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/10181797
Comment 15 Early Warning System Bot 2011-10-21 02:03:22 PDT
Comment on attachment 111924 [details]
it works

Attachment 111924 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/10176842
Comment 16 Filip Pizlo 2011-10-21 02:04:56 PDT
Looks like LayoutTests induce botched OSR failures that lead to some awesome crashes, so the claim that "it works" is probably premature.  Will investigate.
Comment 17 Gustavo Noronha (kov) 2011-10-21 02:58:51 PDT
Comment on attachment 111924 [details]
it works

Attachment 111924 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/10180843
Comment 18 Filip Pizlo 2011-10-21 14:56:39 PDT
Created attachment 112025 [details]
the patch

It passes tests.  It makes things faster.  Ready for review.
Comment 19 Filip Pizlo 2011-10-21 14:58:11 PDT
Comment on attachment 112025 [details]
the patch

Aaahhhh!  Never mind.  Still need to do 32_64.
Comment 20 Filip Pizlo 2011-10-21 16:23:49 PDT
Created attachment 112041 [details]
the patch

Passes tests, works on 32_64.  Still need to get gmail to load.
Comment 21 Early Warning System Bot 2011-10-21 16:37:19 PDT
Comment on attachment 112041 [details]
the patch

Attachment 112041 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/10198091
Comment 22 Gyuyoung Kim 2011-10-21 16:50:28 PDT
Comment on attachment 112041 [details]
the patch

Attachment 112041 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/10197101
Comment 23 Filip Pizlo 2011-10-21 17:04:50 PDT
Comment on attachment 112041 [details]
the patch

Looks like the gmail bug requires a slight rearchitecting of the block linking to make it more rugged.  There's probably a simple way to sidestep it, but I'm going to use a sledgehammer to reduce the likelihood that I ever see assertion failures like this again.
Comment 24 Filip Pizlo 2011-10-21 17:34:46 PDT
Created attachment 112050 [details]
the patch

Ruggedized the block linker.  Will set r? once I'm happy that websites work.  Still testing that now.
Comment 25 Filip Pizlo 2011-10-21 17:40:41 PDT
Comment on attachment 112050 [details]
the patch

Ready for review.  I can browse gmail, facebook, google plus, bing, cnn, bankrate, and tests pass.  32_64 seems to work as well.
Comment 26 Filip Pizlo 2011-10-21 17:42:00 PDT
Latest perf numbers.



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"Inlining" at /Volumes/Data/pizlo/septenary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime()
function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in
milliseconds.

                                            TipOfTree                Inlining                                    
SunSpider:
   3d-cube                                7.5292+-0.1551          7.5090+-0.1494       
   3d-morph                               7.7238+-0.1017    ?     7.7504+-0.1686       ?
   3d-raytrace                            7.7175+-0.1658          7.6361+-0.1587         might be 1.0107x faster
   access-binary-trees                    1.7710+-0.0353          1.6970+-0.0480         might be 1.0436x faster
   access-fannkuch                        6.5448+-0.1462          6.4973+-0.1362       
   access-nbody                           3.2920+-0.0581    !     3.6436+-0.0649       ! definitely 1.1068x slower
   access-nsieve                          2.6189+-0.0491          2.5701+-0.0533         might be 1.0190x faster
   bitops-3bit-bits-in-byte               1.7719+-0.0463    ^     1.3203+-0.0365       ^ definitely 1.3421x faster
   bitops-bits-in-byte                    2.8426+-0.0491    ^     2.3600+-0.0661       ^ definitely 1.2045x faster
   bitops-bitwise-and                     3.3914+-0.1234    ?     3.4515+-0.0830       ? might be 1.0177x slower
   bitops-nsieve-bits                     5.6335+-0.1041    ^     5.4495+-0.0794       ^ definitely 1.0338x faster
   controlflow-recursive                  2.1004+-0.0485    ?     2.1078+-0.0435       ?
   crypto-aes                             6.9378+-0.1358    !     7.6149+-0.1823       ! definitely 1.0976x slower
   crypto-md5                             2.8627+-0.1010          2.8080+-0.0770         might be 1.0195x faster
   crypto-sha1                            2.5662+-0.0679          2.5086+-0.0403         might be 1.0230x faster
   date-format-tofte                      9.9458+-0.2107    ?    10.2073+-0.3872       ? might be 1.0263x slower
   date-format-xparb                      9.4583+-0.1809    ^     8.9732+-0.0994       ^ definitely 1.0541x faster
   math-cordic                            6.6702+-0.1924          6.4718+-0.1315         might be 1.0307x faster
   math-partial-sums                      7.7084+-0.1235    ?     8.1012+-0.4149       ? might be 1.0510x slower
   math-spectral-norm                     2.8734+-0.0662    ^     2.6389+-0.0422       ^ definitely 1.0889x faster
   regexp-dna                            11.6119+-0.1368    ?    11.6933+-0.1866       ?
   string-base64                          4.4173+-0.1344    ?     4.4714+-0.1723       ? might be 1.0123x slower
   string-fasta                           6.4880+-0.1368          6.4316+-0.1918       
   string-tagcloud                       11.5306+-0.1765    ?    11.5918+-0.2346       ?
   string-unpack-code                    20.3630+-0.2860    ?    20.8189+-0.2604       ? might be 1.0224x slower
   string-validate-input                  5.2883+-0.1414    ?     5.4789+-0.1554       ? might be 1.0360x slower

   <arithmetic> *                         6.2177+-0.0345    ?     6.2232+-0.0225       ?
   <geometric>                            5.1094+-0.0268    ^     5.0234+-0.0242       ^ definitely 1.0171x faster
   <harmonic>                             4.2102+-0.0271    ^     4.0180+-0.0279       ^ definitely 1.0478x faster

                                            TipOfTree                Inlining                                    
V8:
   crypto                                74.2652+-0.3573    !    76.2611+-0.6886       ! definitely 1.0269x slower
   deltablue                            228.7236+-1.8105    ^   169.4730+-1.4956       ^ definitely 1.3496x faster
   earley-boyer                          93.0936+-1.7728    ?    95.7037+-1.5892       ? might be 1.0280x slower
   raytrace                              58.6868+-0.3134    !    61.5333+-0.9121       ! definitely 1.0485x slower
   regexp                               107.2100+-1.0549        106.2229+-0.4367       
   richards                             184.7886+-0.4967    ^   145.4120+-1.6208       ^ definitely 1.2708x faster
   splay                                 98.0963+-0.7513    ^    95.0552+-0.4904       ^ definitely 1.0320x faster

   <arithmetic>                         120.6949+-0.4253    ^   107.0944+-0.4520       ^ definitely 1.1270x faster
   <geometric> *                        108.7804+-0.3992    ^   101.5881+-0.4463       ^ definitely 1.0708x faster
   <harmonic>                            99.2003+-0.3728    ^    96.4955+-0.4677       ^ definitely 1.0280x faster

                                            TipOfTree                Inlining                                    
Kraken:
   ai-astar                             508.6602+-3.7149    ?   513.4338+-5.7509       ?
   audio-beat-detection                 194.7253+-1.9536        194.2500+-1.3624       
   audio-dft                            278.9230+-8.0107    ?   285.5524+-6.2409       ? might be 1.0238x slower
   audio-fft                            126.8889+-1.0945        125.9327+-0.9391       
   audio-oscillator                     254.7716+-2.0469    ?   255.3800+-1.1305       ?
   imaging-darkroom                     421.9319+-1.6725    ^   410.0498+-3.3201       ^ definitely 1.0290x faster
   imaging-desaturate                   224.2292+-1.6659    ?   225.1973+-2.4855       ?
   imaging-gaussian-blur                571.7987+-7.9711        564.7698+-2.5282         might be 1.0124x faster
   json-parse-financial                  57.3133+-0.6812         56.4367+-0.2013         might be 1.0155x faster
   json-stringify-tinderbox              69.9055+-1.0370    ?    70.0927+-0.9709       ?
   stanford-crypto-aes                  136.4939+-2.3346        133.3180+-1.3957         might be 1.0238x faster
   stanford-crypto-ccm                  103.6385+-1.1249    ?   104.3401+-1.2851       ?
   stanford-crypto-pbkdf2               198.7528+-3.1946        197.6987+-0.8001       
   stanford-crypto-sha256-iterative      72.7265+-0.6502    ?    73.2132+-0.5826       ?

   <arithmetic> *                       230.0542+-0.7879        229.2618+-0.7745       
   <geometric>                          180.6700+-0.7627        180.1717+-0.4641       
   <harmonic>                           141.9692+-0.6425        141.5408+-0.4094       

                                            TipOfTree                Inlining                                    
All benchmarks:
   <arithmetic>                          89.9422+-0.2815    ^    87.6836+-0.2670       ^ definitely 1.0258x faster
   <geometric>                           23.3046+-0.0910    ^    22.8341+-0.0758       ^ definitely 1.0206x faster
   <harmonic>                             7.4078+-0.0467    ^     7.0758+-0.0480       ^ definitely 1.0469x faster

                                            TipOfTree                Inlining                                    
Geomean of preferred means:
   <scaled-result>                       53.7855+-0.1745    ^    52.5281+-0.1441       ^ definitely 1.0239x faster
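
The final "Geomean of preferred means" row appears to be the geometric mean of the three starred suite means: SunSpider arithmetic, V8 geometric, and Kraken arithmetic. A minimal sketch under that assumption follows; `scaledResult` is a hypothetical name, not a function from the harness.

```cpp
#include <cassert>
#include <cmath>

// Geometric mean of the three starred ("preferred") suite means.
// Assumption: this is how <scaled-result> is derived; the harness may
// aggregate per sample, so the last digits can differ slightly.
double scaledResult(double sunSpiderArithmetic, double v8Geometric, double krakenArithmetic)
{
    assert(sunSpiderArithmetic > 0 && v8Geometric > 0 && krakenArithmetic > 0);
    return std::cbrt(sunSpiderArithmetic * v8Geometric * krakenArithmetic);
}
```

With the TipOfTree values from this report (6.2177, 108.7804, 230.0542) the result is about 53.786, against the printed 53.7855; with the Inlining values (6.2232, 101.5881, 229.2618) it is about 52.529, against the printed 52.5281.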
Comment 27 Oliver Hunt 2011-10-21 18:10:13 PDT
Comment on attachment 112050 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=112050&action=review

r=me

> Source/JavaScriptCore/dfg/DFGJITCompiler32_64.cpp:482
> +        store32(Imm32(JSValue::CellTag), tagFor((VirtualRegister)(inlineCallFrame->stackOffset + RegisterFile::ScopeChain)));

So much sadness :-/

> Source/JavaScriptCore/runtime/Heuristics.cpp:32
> -#define ENABLE_RUN_TIME_HEURISTICS 0
> +#define ENABLE_RUN_TIME_HEURISTICS 1

Do we want these on by default?
Comment 28 Filip Pizlo 2011-10-21 18:13:13 PDT
Here's some more performance data, from a different machine.


Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc
"Inlining" at /Volumes/Data/fromMiniMe/septenary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.
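
For readers unfamiliar with the "mean+-interval" notation used below, here is a minimal sketch of how such a figure could be derived from 12 samples. It assumes a Student's t interval; the report does not say which method the harness actually uses, and the sample values and the `mean_with_ci` helper are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 11 degrees of freedom
# (12 samples). Hard-coded because the standard library has no t-distribution.
T_CRIT_95_DF11 = 2.201

def mean_with_ci(samples):
    """Return (mean, half_width) of a 95% confidence interval (assumed t-based)."""
    if len(samples) != 12:
        raise ValueError("the critical value above is only valid for 12 samples")
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, T_CRIT_95_DF11 * std_err

# Hypothetical timings in milliseconds, not taken from the report.
samples = [7.91, 7.88, 7.95, 7.90, 7.86, 7.93, 7.92, 7.89, 7.94, 7.87, 7.90, 7.91]
mean, half_width = mean_with_ci(samples)
print(f"{mean:.4f}+-{half_width:.4f}")
```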

                                            TipOfTree                Inlining                                    
SunSpider:
   3d-cube                                7.9057+-0.0355    ?     7.9108+-0.0333       ?
   3d-morph                               8.6263+-0.1314    ^     8.4058+-0.0298       ^ definitely 1.0262x faster
   3d-raytrace                            8.1104+-0.0722          8.0728+-0.0743       
   access-binary-trees                    1.7884+-0.0047    !     1.8039+-0.0046       ! definitely 1.0086x slower
   access-fannkuch                        7.9673+-0.0243    ^     7.8511+-0.0598       ^ definitely 1.0148x faster
   access-nbody                           4.0548+-0.0328    !     4.2363+-0.0073       ! definitely 1.0448x slower
   access-nsieve                          3.1608+-0.0130    ?     3.1832+-0.0130       ?
   bitops-3bit-bits-in-byte               1.7805+-0.0034    ^     1.3165+-0.0147       ^ definitely 1.3525x faster
   bitops-bits-in-byte                    5.3079+-0.0117    ^     5.2709+-0.0227       ^ definitely 1.0070x faster
   bitops-bitwise-and                     3.4332+-0.0600    ?     3.4381+-0.0600       ?
   bitops-nsieve-bits                     5.6791+-0.0396          5.6473+-0.0367       
   controlflow-recursive                  2.3211+-0.0036    ?     2.3280+-0.0053       ?
   crypto-aes                             6.8789+-0.0498    !     7.6776+-0.0596       ! definitely 1.1161x slower
   crypto-md5                             3.0041+-0.0363    ^     2.8722+-0.0314       ^ definitely 1.0459x faster
   crypto-sha1                            2.7722+-0.0271    ^     2.6334+-0.0164       ^ definitely 1.0527x faster
   date-format-tofte                     10.5648+-0.0599    !    10.7473+-0.0880       ! definitely 1.0173x slower
   date-format-xparb                     10.9002+-0.1451    ^     9.5093+-0.1691       ^ definitely 1.1463x faster
   math-cordic                            7.2169+-0.0227    !     7.5781+-0.2770       ! definitely 1.0500x slower
   math-partial-sums                     10.5438+-0.0239    !    10.6200+-0.0396       ! definitely 1.0072x slower
   math-spectral-norm                     3.2655+-0.0115    ^     2.8810+-0.0056       ^ definitely 1.1335x faster
   regexp-dna                            13.3356+-0.1807    ?    13.3923+-0.2041       ?
   string-base64                          4.4239+-0.0169    ?     4.4245+-0.0157       ?
   string-fasta                           7.1008+-0.0334    ?     7.1265+-0.0376       ?
   string-tagcloud                       13.2980+-0.1306         13.2206+-0.1446       
   string-unpack-code                    22.6685+-0.1202         22.6000+-0.1731       
   string-validate-input                  5.6601+-0.0754          5.6096+-0.0405       

   <arithmetic> *                         6.9911+-0.0237          6.9368+-0.0332       
   <geometric>                            5.7257+-0.0154    ^     5.6206+-0.0229       ^ definitely 1.0187x faster
   <harmonic>                             4.6719+-0.0103    ^     4.4690+-0.0174       ^ definitely 1.0454x faster

                                            TipOfTree                Inlining                                    
V8:
   crypto                                80.1479+-0.0824    !    81.5153+-0.1235       ! definitely 1.0171x slower
   deltablue                            253.7627+-1.6497    ^   188.5828+-0.5628       ^ definitely 1.3456x faster
   earley-boyer                         111.5673+-2.2941    ?   113.8488+-1.4306       ? might be 1.0204x slower
   raytrace                              63.8901+-0.3075    !    66.7756+-0.6179       ! definitely 1.0452x slower
   regexp                               124.2609+-0.3933    ?   125.0151+-0.7319       ?
   richards                             212.5178+-0.6103    ^   165.2101+-0.2793       ^ definitely 1.2863x faster
   splay                                126.0742+-0.4379    ^   124.1464+-0.7412       ^ definitely 1.0155x faster

   <arithmetic>                         138.8887+-0.4821    ^   123.5849+-0.3212       ^ definitely 1.1238x faster
   <geometric> *                        125.2085+-0.4398    ^   116.9687+-0.3409       ^ definitely 1.0704x faster
   <harmonic>                           113.4041+-0.3858    ^   110.3364+-0.3652       ^ definitely 1.0278x faster
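
The "definitely" / "might be" annotations in the rows above look consistent with a simple interval-overlap test: if the two confidence intervals do not overlap the difference is called definite, otherwise it is only a possibility. The sketch below is a guess at that rule, not the harness's code; `compare` is a hypothetical helper. It also does not model the report's habit of omitting the ratio for small inconclusive differences (for example the regexp row).

```python
def compare(base, base_ci, new, new_ci):
    """Classify one row given mean and interval half-width per VM (lower is better)."""
    # Two intervals overlap when each one starts before the other one ends.
    overlap = (base - base_ci) <= (new + new_ci) and (new - new_ci) <= (base + base_ci)
    if new <= base:
        ratio, direction = base / new, "faster"
    else:
        ratio, direction = new / base, "slower"
    certainty = "might be" if overlap else "definitely"
    return f"{certainty} {ratio:.4f}x {direction}"

# Values copied from the V8 rows above (TipOfTree first, Inlining second).
print(compare(253.7627, 1.6497, 188.5828, 0.5628))  # deltablue
print(compare(80.1479, 0.0824, 81.5153, 0.1235))    # crypto
print(compare(111.5673, 2.2941, 113.8488, 1.4306))  # earley-boyer
```

Under this assumed rule the three calls reproduce the annotations printed for deltablue, crypto, and earley-boyer.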

                                            TipOfTree                Inlining                                    
Kraken:
   ai-astar                             804.8272+-11.5509   ?   825.5071+-11.4066      ? might be 1.0257x slower
   audio-beat-detection                 210.3146+-1.2071    ?   210.9226+-1.8756       ?
   audio-dft                            263.5513+-8.3917        262.6012+-2.5779       
   audio-fft                            135.0298+-0.0938    ?   135.4095+-0.5561       ?
   audio-oscillator                     291.4857+-2.0305    ?   292.7020+-1.4143       ?
   imaging-darkroom                     480.7360+-3.4757    ^   445.8657+-2.5090       ^ definitely 1.0782x faster
   imaging-desaturate                   238.0163+-0.1114    ?   238.1223+-0.1226       ?
   imaging-gaussian-blur                621.0191+-0.4272    ?   621.0583+-0.3606       ?
   json-parse-financial                  70.8058+-0.2058    ^    69.6470+-0.2175       ^ definitely 1.0166x faster
   json-stringify-tinderbox              79.7848+-0.3356    ^    78.4771+-0.2201       ^ definitely 1.0167x faster
   stanford-crypto-aes                  154.2835+-1.7173    ^   151.3350+-1.0898       ^ definitely 1.0195x faster
   stanford-crypto-ccm                  116.3939+-0.6686    ?   117.5918+-0.6777       ? might be 1.0103x slower
   stanford-crypto-pbkdf2               236.0109+-1.7652    ?   236.9104+-2.0966       ?
   stanford-crypto-sha256-iterative      85.4621+-0.2368    ?    85.6366+-0.2715       ?

   <arithmetic> *                       270.5515+-1.0418        269.4133+-0.8671       
   <geometric>                          206.4148+-0.6826    ^   205.2601+-0.3124       ^ definitely 1.0056x faster
   <harmonic>                           162.4133+-0.4534    ^   161.4193+-0.2400       ^ definitely 1.0062x faster

                                            TipOfTree                Inlining                                    
All benchmarks:
   <arithmetic>                         105.1428+-0.3565    ^   102.4944+-0.3063       ^ definitely 1.0258x faster
   <geometric>                           26.3714+-0.0708    ^    25.7962+-0.0736       ^ definitely 1.0223x faster
   <harmonic>                             8.2267+-0.0181    ^     7.8753+-0.0300       ^ definitely 1.0446x faster

                                            TipOfTree                Inlining                                    
Geomean of preferred means:
   <scaled-result>                       61.8693+-0.1933    ^    60.2394+-0.1885       ^ definitely 1.0271x faster
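
As a sanity check on the final line, the "Geomean of preferred means" appears to be the geometric mean of the starred ("*") mean from each suite: SunSpider <arithmetic>, V8 <geometric>, and Kraken <arithmetic>. The sketch below recomputes it from the rounded values printed above. That interpretation is an assumption, and the last digits can differ slightly from the report, which presumably works from unrounded per-sample data.

```python
import math

def geomean(values):
    """Geometric mean via the mean of logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Preferred ("*") means copied from the tables above:
# SunSpider <arithmetic>, V8 <geometric>, Kraken <arithmetic>.
tip_of_tree = [6.9911, 125.2085, 270.5515]
inlining = [6.9368, 116.9687, 269.4133]

base = geomean(tip_of_tree)
new = geomean(inlining)
print(f"TipOfTree {base:.2f}  Inlining {new:.2f}  ratio {base / new:.3f}")
```

The recomputed values land within rounding distance of the reported 61.8693, 60.2394, and 1.0271x.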
Comment 29 Filip Pizlo 2011-10-21 18:13:36 PDT
(In reply to comment #27)
> (From update of attachment 112050 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=112050&action=review
> 
> r=me
> 
> > Source/JavaScriptCore/dfg/DFGJITCompiler32_64.cpp:482
> > +        store32(Imm32(JSValue::CellTag), tagFor((VirtualRegister)(inlineCallFrame->stackOffset + RegisterFile::ScopeChain)));
> 
> So much sadness :-/

:-(

> 
> > Source/JavaScriptCore/runtime/Heuristics.cpp:32
> > -#define ENABLE_RUN_TIME_HEURISTICS 0
> > +#define ENABLE_RUN_TIME_HEURISTICS 1
> 
> Do we want these on by default?

Oops! Thanks for catching that!
Comment 30 Filip Pizlo 2011-10-21 18:22:28 PDT
Landed in http://trac.webkit.org/changeset/98179
Comment 31 Patrick R. Gansterer 2011-10-24 01:54:48 PDT
Committed interpreter build fix r98220: <http://trac.webkit.org/changeset/98220>