Bug 69996

Summary: DFG should have inlining
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCore Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, fpizlo, ggaren, gustavo, mrowe, oliver, paroga, sam, webkit.review.bot, xan.lopez
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 69995, 70068, 70157, 70278, 70466, 70467, 70468, 70578    
Bug Blocks:    

Filip Pizlo
Reported 2011-10-12 21:12:02 PDT
For now, this is an umbrella bug.
Attachments
work in progress (43.78 KB, patch)
2011-10-17 22:20 PDT, Filip Pizlo
no flags
work in progress (almost done) (65.13 KB, patch)
2011-10-18 02:40 PDT, Filip Pizlo
no flags
work in progress - almost done (79.08 KB, patch)
2011-10-18 16:48 PDT, Filip Pizlo
no flags
more work in progress (83.57 KB, patch)
2011-10-18 19:20 PDT, Filip Pizlo
no flags
more work in progress (96.59 KB, patch)
2011-10-19 21:51 PDT, Filip Pizlo
no flags
more work in progress (101.69 KB, patch)
2011-10-20 15:46 PDT, Filip Pizlo
no flags
fix style (100.96 KB, patch)
2011-10-20 18:26 PDT, Filip Pizlo
no flags
it works (101.33 KB, patch)
2011-10-21 01:53 PDT, Filip Pizlo
gyuyoung.kim: commit-queue-
the patch (100.29 KB, patch)
2011-10-21 14:56 PDT, Filip Pizlo
fpizlo: review-
the patch (112.17 KB, patch)
2011-10-21 16:23 PDT, Filip Pizlo
fpizlo: review-
webkit-ews: commit-queue-
the patch (113.07 KB, patch)
2011-10-21 17:34 PDT, Filip Pizlo
oliver: review+
Filip Pizlo
Comment 1 2011-10-17 22:20:46 PDT
Created attachment 111381 [details]
work in progress

If this even compiles, then clang is broken. Still working on this.

Items that are (hopefully) done:

- Flushing state related to the call that can be reflectively accessed, like the arguments, and the callee.
- Determining when it is profitable to inline based on the classic heuristics (inlinee size, inline stack depth, recursion detection, rejection of fancy operations that the inliner won't handle correctly).
- Actually inlining code, with the following special things:
  - Inlining calls to single-basic-block functions does not introduce any control flow, and so the only "evidence" that there was a call is the flushing of arguments and the callee.
  - Inlining calls to functions with multiple basic blocks does the right thing.
  - Inlining calls to functions that don't terminate does the right thing.
  - Inlining calls to functions that have multiple return statements does the right thing.

Things that aren't done:

- OSR support.
- ???
- Make it compile, pass tests, and produce speed-ups.
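
For readers unfamiliar with the "classic heuristics" listed in this comment, here is a minimal sketch of how such a profitability check could be combined. It is not code from the patch: the type `Callee`, the function `shouldInline`, and both limits are hypothetical, and the thread never states which thresholds the DFG uses.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical limits. The thread does not say which values the patch uses.
constexpr std::size_t kMaxInlineeSize = 100; // assumed unit: bytecode instructions
constexpr std::size_t kMaxInlineDepth = 5;   // assumed maximum inline stack depth

// Hypothetical summary of a call target.
struct Callee {
    int id;                       // identity of the function being called
    std::size_t instructionCount; // size of the inlinee
    bool usesUnsupportedOps;      // "fancy operations" the inliner cannot handle
};

// inlineStack holds the ids of the functions already being inlined at this
// call site, outermost first.
bool shouldInline(const Callee& callee, const std::vector<int>& inlineStack)
{
    if (callee.usesUnsupportedOps)
        return false; // rejection of operations the inliner won't handle
    if (callee.instructionCount > kMaxInlineeSize)
        return false; // inlinee size
    if (inlineStack.size() >= kMaxInlineDepth)
        return false; // inline stack depth
    for (int id : inlineStack) {
        if (id == callee.id)
            return false; // recursion detection
    }
    return true;
}
```

The checks are ordered cheapest first, so a call site that fails the size or depth test never pays for the walk over the inline stack.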
Filip Pizlo
Comment 2 2011-10-18 02:40:17 PDT
Created attachment 111413 [details]
work in progress (almost done)

Added support for DFG graph dumping that prints useful stuff when inlining happens. Implemented OSR support for inlining. Still haven't tried to compile it. Putting it up here for backup.
Filip Pizlo
Comment 3 2011-10-18 16:48:30 PDT
Created attachment 111528 [details]
work in progress - almost done

It already works for simple programs. Still more debugging to do though.
WebKit Review Bot
Comment 4 2011-10-18 16:50:51 PDT
Attachment 111528 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/dfg/DFGJITCompiler.h:444: The parameter name "codeBlock" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGGraph.cpp:82: Place brace on its own line for function definitions. [whitespace/braces] [4]
Source/JavaScriptCore/dfg/DFGGraph.cpp:92: Declaration has space between type name and & in Node &currentNode [whitespace/declaration] [3]
Source/JavaScriptCore/dfg/DFGGraph.cpp:93: Declaration has space between type name and & in Node &previousNode [whitespace/declaration] [3]
Source/JavaScriptCore/runtime/Executable.h:577: The parameter name "kind" adds no information, so it should be removed. [readability/parameter_name] [5]
Total errors found: 5 in 25 files

If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 5 2011-10-18 19:20:38 PDT
Created attachment 111552 [details]
more work in progress

It sort of works.
Oliver Hunt
Comment 6 2011-10-18 19:30:29 PDT
> It sort of works.

Always promising :D
Filip Pizlo
Comment 7 2011-10-19 21:51:12 PDT
Created attachment 111725 [details]
more work in progress

Removed some really bad bugs. Note that this patch includes some of the patches that are listed as dependencies of this bug. I'll fix that when those patches land.
Filip Pizlo
Comment 8 2011-10-20 15:46:50 PDT
Created attachment 111861 [details]
more work in progress

This isn't fully tested yet but it has a good chunk of functionality for landing once all bugs are addressed. It's not yet complete though, so more patches will come after this one to flesh out the functionality. Notably, we don't inline constructors (even though we really should) and we don't have a good inlining heuristics story yet.

On the V8 harness, it's an 8.7% speed-up. Here's the performance using my harness:

Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"Inlining" at /Volumes/Data/pizlo/septenary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                    TipOfTree             Inlining

SunSpider:
   3d-cube                          7.2294+-0.1743    ?   7.3492+-0.1425    ?  might be 1.0166x slower
   3d-morph                         7.6382+-0.1203    ?   7.6582+-0.1256    ?
   3d-raytrace                      7.5246+-0.1949    ?   7.6317+-0.1888    ?  might be 1.0142x slower
   access-binary-trees              1.7486+-0.0494        1.7437+-0.0460
   access-fannkuch                  6.5008+-0.1280        6.4137+-0.1062       might be 1.0136x faster
   access-nbody                     3.3089+-0.0763    !   3.5599+-0.0831    !  definitely 1.0759x slower
   access-nsieve                    2.6140+-0.0645    ?   2.6633+-0.0950    ?  might be 1.0188x slower
   bitops-3bit-bits-in-byte         1.7503+-0.0280    ^   1.2948+-0.0372    ^  definitely 1.3518x faster
   bitops-bits-in-byte              2.7918+-0.0953    ^   2.3645+-0.0767    ^  definitely 1.1807x faster
   bitops-bitwise-and               3.4221+-0.1173        3.3934+-0.0957
   bitops-nsieve-bits               5.4132+-0.1308        5.3306+-0.0769       might be 1.0155x faster
   controlflow-recursive            2.1608+-0.0553        2.1481+-0.0479
   crypto-aes                       6.7897+-0.1909    !   7.2303+-0.2243    !  definitely 1.0649x slower
   crypto-md5                       2.7882+-0.0802        2.7425+-0.0615       might be 1.0167x faster
   crypto-sha1                      2.5315+-0.0735        2.4292+-0.0628       might be 1.0421x faster
   date-format-tofte                10.1474+-0.2384       9.9917+-0.2174       might be 1.0156x faster
   date-format-xparb                8.8026+-0.1675    ?   9.2311+-0.3072    ?  might be 1.0487x slower
   math-cordic                      6.5753+-0.2417        6.5217+-0.1291
   math-partial-sums                7.6060+-0.1360    ?   7.6328+-0.1490    ?
   math-spectral-norm               2.8472+-0.0610    ^   2.5817+-0.0614    ^  definitely 1.1028x faster
   regexp-dna                       11.6238+-0.1617       11.4674+-0.1856      might be 1.0136x faster
   string-base64                    4.4783+-0.1092        4.3667+-0.1042       might be 1.0256x faster
   string-fasta                     6.3774+-0.1054        6.3234+-0.1221
   string-tagcloud                  11.4621+-0.1364       11.4097+-0.2304
   string-unpack-code               20.2390+-0.2392       20.1693+-0.2645
   string-validate-input            5.2160+-0.1046    ?   5.2871+-0.1256    ?  might be 1.0136x slower

   <arithmetic> *                   6.1380+-0.0387        6.1129+-0.0295
   <geometric>                      5.0496+-0.0307    ^   4.9542+-0.0300    ^  definitely 1.0192x faster
   <harmonic>                       4.1716+-0.0276    ^   3.9812+-0.0323    ^  definitely 1.0478x faster

                                    TipOfTree             Inlining

V8:
   crypto                           73.4235+-0.4433   !   75.0053+-0.5804   !  definitely 1.0215x slower
   deltablue                        228.1755+-1.0590  ^   171.0148+-0.7583  ^  definitely 1.3342x faster
   earley-boyer                     91.4364+-1.6587   ?   94.3709+-1.4916   ?  might be 1.0321x slower
   raytrace                         58.3178+-0.2884   !   59.9449+-0.4125   !  definitely 1.0279x slower
   regexp                           106.6983+-0.6395      106.3563+-0.5184
   richards                         183.5438+-0.5371  ^   142.4662+-0.4241  ^  definitely 1.2883x faster
   splay                            97.2676+-0.5820   ^   95.9481+-0.6546   ^  definitely 1.0138x faster

   <arithmetic>                     119.8376+-0.3437  ^   106.4438+-0.3814  ^  definitely 1.1258x faster
   <geometric> *                    107.8833+-0.3250  ^   100.7615+-0.4141  ^  definitely 1.0707x faster
   <harmonic>                       98.3173+-0.3068   ^   95.4845+-0.4341   ^  definitely 1.0297x faster

                                    TipOfTree             Inlining

Kraken:
   ai-astar                         506.3080+-8.7997      504.8021+-3.0190
   audio-beat-detection             195.2658+-1.2351      194.5032+-1.0999
   audio-dft                        272.5609+-2.2965      270.2281+-1.5971
   audio-fft                        126.5316+-1.1957  ?   126.5381+-1.1408  ?
   audio-oscillator                 255.5309+-2.0254      254.8838+-1.3599
   imaging-darkroom                 422.1903+-1.7198  ^   409.7418+-3.5790  ^  definitely 1.0304x faster
   imaging-desaturate               222.6997+-0.4788  ?   223.1201+-1.2055  ?
   imaging-gaussian-blur            562.7176+-2.0577  ?   563.6714+-1.7784  ?
   json-parse-financial             57.2528+-0.3100   ^   56.3006+-0.2807   ^  definitely 1.0169x faster
   json-stringify-tinderbox         68.5532+-0.5595   !   70.0214+-0.2980   !  definitely 1.0214x slower
   stanford-crypto-aes              133.5707+-1.7238      132.0632+-1.7421     might be 1.0114x faster
   stanford-crypto-ccm              102.6632+-1.2818  ?   102.7728+-0.8918  ?
   stanford-crypto-pbkdf2           196.2656+-1.0252  ?   198.3048+-2.5661  ?  might be 1.0104x slower
   stanford-crypto-sha256-iterative 71.2168+-0.2476   !   72.1844+-0.4220   !  definitely 1.0136x slower

   <arithmetic> *                   228.0948+-0.7858      227.0811+-0.5272
   <geometric>                      178.9893+-0.5003      178.6557+-0.4535
   <harmonic>                       140.4928+-0.3341  ?   140.5655+-0.4151  ?

                                    TipOfTree             Inlining

All benchmarks:
   <arithmetic>                     89.1867+-0.2757   ^   86.8761+-0.1709   ^  definitely 1.0266x faster
   <geometric>                      23.0604+-0.0983   ^   22.5749+-0.0838   ^  definitely 1.0215x faster
   <harmonic>                       7.3397+-0.0477    ^   7.0112+-0.0556    ^  definitely 1.0469x faster

                                    TipOfTree             Inlining

Geomean of preferred means:
   <scaled-result>                  53.2550+-0.1847   ^   51.9082+-0.1028   ^  definitely 1.0259x faster
Filip Pizlo
Comment 9 2011-10-20 15:47:08 PDT
Comment on attachment 111861 [details] more work in progress Ooops, didn't mean to set the r? flag.
WebKit Review Bot
Comment 10 2011-10-20 15:49:54 PDT
Attachment 111861 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/bytecode/CodeOrigin.h:65: The parameter name "inlineCallFrame" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGDriver.cpp:40: Should have a space between // and comment [whitespace/comments] [4]
Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:857: Should have a space between // and comment [whitespace/comments] [4]
Total errors found: 3 in 30 files

If any of these errors are false positives, please file a bug against check-webkit-style.
Filip Pizlo
Comment 11 2011-10-20 18:26:39 PDT
Created attachment 111886 [details] fix style
Filip Pizlo
Comment 12 2011-10-21 01:49:13 PDT
Updated performance after merging.

Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"Inlining" at /Volumes/Data/pizlo/septenary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                    TipOfTree             Inlining

SunSpider:
   3d-cube                          7.5802+-0.1700        7.5607+-0.1572
   3d-morph                         7.7902+-0.1274    ?   7.8295+-0.1048    ?
   3d-raytrace                      7.6805+-0.1737    ?   7.7723+-0.2040    ?  might be 1.0120x slower
   access-binary-trees              1.7349+-0.0446    ?   1.7760+-0.0487    ?  might be 1.0237x slower
   access-fannkuch                  6.4702+-0.1248    ?   6.6150+-0.1171    ?  might be 1.0224x slower
   access-nbody                     3.3459+-0.0766    !   3.6672+-0.0759    !  definitely 1.0960x slower
   access-nsieve                    2.5824+-0.0628    ?   2.6535+-0.0556    ?  might be 1.0275x slower
   bitops-3bit-bits-in-byte         1.7588+-0.0389    ^   1.2956+-0.0250    ^  definitely 1.3575x faster
   bitops-bits-in-byte              2.8443+-0.0406    ^   2.3885+-0.0751    ^  definitely 1.1909x faster
   bitops-bitwise-and               3.4283+-0.0824        3.3971+-0.0965
   bitops-nsieve-bits               5.5956+-0.2869        5.4540+-0.0905       might be 1.0260x faster
   controlflow-recursive            2.1149+-0.0375    ?   2.1161+-0.0353    ?
   crypto-aes                       6.9257+-0.1076    !   7.4491+-0.2030    !  definitely 1.0756x slower
   crypto-md5                       2.8842+-0.0867        2.7452+-0.0804       might be 1.0506x faster
   crypto-sha1                      2.5600+-0.0690        2.4965+-0.0755       might be 1.0254x faster
   date-format-tofte                9.9971+-0.1856    ?   10.0634+-0.1550   ?
   date-format-xparb                9.3292+-0.1708    ?   9.4672+-0.2583    ?  might be 1.0148x slower
   math-cordic                      6.4908+-0.1038    ?   6.6374+-0.1601    ?  might be 1.0226x slower
   math-partial-sums                7.7452+-0.1433        7.6759+-0.1205
   math-spectral-norm               2.9435+-0.0506    ^   2.6500+-0.0545    ^  definitely 1.1108x faster
   regexp-dna                       11.8225+-0.1472       11.5870+-0.1209      might be 1.0203x faster
   string-base64                    4.4506+-0.1049    ?   4.4925+-0.1559    ?
   string-fasta                     6.2615+-0.1023    ?   6.5080+-0.1471    ?  might be 1.0394x slower
   string-tagcloud                  11.4536+-0.1568   !   11.8683+-0.1448   !  definitely 1.0362x slower
   string-unpack-code               20.3437+-0.2670   ?   20.7484+-0.2756   ?  might be 1.0199x slower
   string-validate-input            5.2792+-0.1040    ?   5.3195+-0.1574    ?

   <arithmetic> *                   6.2082+-0.0304    ?   6.2398+-0.0199    ?
   <geometric>                      5.1020+-0.0250    ^   5.0399+-0.0270    ^  definitely 1.0123x faster
   <harmonic>                       4.2040+-0.0250    ^   4.0334+-0.0337    ^  definitely 1.0423x faster

                                    TipOfTree             Inlining

V8:
   crypto                           75.0317+-0.6962   ?   76.1193+-0.6357   ?  might be 1.0145x slower
   deltablue                        229.5168+-2.0350  ^   171.9966+-1.3813  ^  definitely 1.3344x faster
   earley-boyer                     94.3379+-2.0813   ?   96.7467+-1.9250   ?  might be 1.0255x slower
   raytrace                         59.5699+-0.2731   !   61.1444+-1.2674   !  definitely 1.0264x slower
   regexp                           106.6368+-0.7844  ?   107.0270+-0.9189  ?
   richards                         185.1246+-1.0312  ^   144.3955+-0.9598  ^  definitely 1.2821x faster
   splay                            98.7830+-0.5944   ^   96.1199+-0.7264   ^  definitely 1.0277x faster

   <arithmetic>                     121.2858+-0.5272  ^   107.6499+-0.4626  ^  definitely 1.1267x faster
   <geometric> *                    109.4864+-0.5406  ^   102.0054+-0.3858  ^  definitely 1.0733x faster
   <harmonic>                       100.0171+-0.5284  ^   96.7603+-0.3744   ^  definitely 1.0337x faster

                                    TipOfTree             Inlining

Kraken:
   ai-astar                         504.0532+-2.4731  ?   512.0117+-6.1466  ?  might be 1.0158x slower
   audio-beat-detection             195.2423+-1.1973  ?   198.4151+-2.1492  ?  might be 1.0163x slower
   audio-dft                        286.9930+-9.2743      280.3339+-7.3088     might be 1.0238x faster
   audio-fft                        128.6663+-1.6103      126.4186+-0.7910     might be 1.0178x faster
   audio-oscillator                 258.8092+-4.0336      258.7320+-2.5276
   imaging-darkroom                 431.3622+-2.6677  ^   412.0987+-4.9051  ^  definitely 1.0467x faster
   imaging-desaturate               224.0498+-2.1918  ?   224.5701+-1.6935  ?
   imaging-gaussian-blur            569.6371+-6.1933  ?   571.6295+-4.3435  ?
   json-parse-financial             57.5612+-0.3672   ?   57.8921+-0.5814   ?
   json-stringify-tinderbox         69.6784+-0.7403   !   71.6162+-0.5837   !  definitely 1.0278x slower
   stanford-crypto-aes              136.1775+-1.8543      134.4457+-2.1320     might be 1.0129x faster
   stanford-crypto-ccm              103.4945+-1.0784  ?   104.4574+-0.8185  ?
   stanford-crypto-pbkdf2           203.6624+-1.5077      202.5547+-5.2252
   stanford-crypto-sha256-iterative 72.7979+-0.7534   ?   73.4482+-0.6291   ?

   <arithmetic> *                   231.5846+-0.5794      230.6160+-0.6849
   <geometric>                      181.8590+-0.3005      181.6454+-0.6541
   <harmonic>                       142.6110+-0.2599  ?   143.1602+-0.5793  ?

                                    TipOfTree             Inlining

All benchmarks:
   <arithmetic>                     90.4808+-0.2130   ^   88.1789+-0.2279   ^  definitely 1.0261x faster
   <geometric>                      23.3539+-0.0772   ^   22.9450+-0.0851   ^  definitely 1.0178x faster
   <harmonic>                       7.3984+-0.0431    ^   7.1037+-0.0580    ^  definitely 1.0415x faster

                                    TipOfTree             Inlining

Geomean of preferred means:
   <scaled-result>                  53.9934+-0.1639   ^   52.7502+-0.1200   ^  definitely 1.0236x faster
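
Each suite in this report ends with `<arithmetic>`, `<geometric>`, and `<harmonic>` summary rows. As a rough illustration of how the three means differ, the sketch below defines them in the textbook way; the function names are hypothetical and this is not the harness's own code. Feeding in the seven V8 per-benchmark means from the TipOfTree column reproduces the arithmetic row (121.2858) almost exactly and lands close to the other two. Small differences are expected if the harness averages per sample rather than per benchmark, which is an assumption here.

```cpp
#include <cmath>
#include <vector>

// Sum of the values divided by their count. Assumes xs is non-empty.
double arithmeticMean(const std::vector<double>& xs)
{
    double sum = 0;
    for (double x : xs)
        sum += x;
    return sum / xs.size();
}

// exp of the mean of the logs, which avoids overflow from multiplying many
// values together. Assumes xs is non-empty and all values are positive.
double geometricMean(const std::vector<double>& xs)
{
    double logSum = 0;
    for (double x : xs)
        logSum += std::log(x);
    return std::exp(logSum / xs.size());
}

// Count divided by the sum of reciprocals. Assumes xs is non-empty and all
// values are positive.
double harmonicMean(const std::vector<double>& xs)
{
    double reciprocalSum = 0;
    for (double x : xs)
        reciprocalSum += 1.0 / x;
    return xs.size() / reciprocalSum;
}
```

For positive inputs the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean. That ordering is visible in every suite summary in the thread, and it explains why one large outlier such as deltablue moves the arithmetic row the most.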
Filip Pizlo
Comment 13 2011-10-21 01:53:24 PDT
Created attachment 111924 [details]
it works

Except on 32_64, where it'll either fail to compile or crash in awesome ways. I haven't copy-pasted some code yet.
Gyuyoung Kim
Comment 14 2011-10-21 01:58:06 PDT
Early Warning System Bot
Comment 15 2011-10-21 02:03:22 PDT
Filip Pizlo
Comment 16 2011-10-21 02:04:56 PDT
Looks like there are some awesome crashes induced by botched OSR failures induced by LayoutTests, so the claim that "it works" is probably premature. Will investigate.
Gustavo Noronha (kov)
Comment 17 2011-10-21 02:58:51 PDT
Filip Pizlo
Comment 18 2011-10-21 14:56:39 PDT
Created attachment 112025 [details]
the patch

It passes tests. It makes things faster. Ready for review.
Filip Pizlo
Comment 19 2011-10-21 14:58:11 PDT
Comment on attachment 112025 [details]
the patch

Aaahhhh! Never mind. Still need to do 32_64.
Filip Pizlo
Comment 20 2011-10-21 16:23:49 PDT
Created attachment 112041 [details]
the patch

Passes tests, works on 32_64. Still need to get gmail to load.
Early Warning System Bot
Comment 21 2011-10-21 16:37:19 PDT
Gyuyoung Kim
Comment 22 2011-10-21 16:50:28 PDT
Filip Pizlo
Comment 23 2011-10-21 17:04:50 PDT
Comment on attachment 112041 [details]
the patch

Looks like the gmail bug requires a slight rearchitecting of the block linking to make it more rugged. There's probably a simple side-stepping but I'm going to use a sledge hammer to reduce the likelihood that I ever see assertion failures like this again.
Filip Pizlo
Comment 24 2011-10-21 17:34:46 PDT
Created attachment 112050 [details]
the patch

Ruggedized the block linker. Will set r? once I'm happy that websites work. Still testing that now.
Filip Pizlo
Comment 25 2011-10-21 17:40:41 PDT
Comment on attachment 112050 [details]
the patch

Ready for review. I can browse gmail, facebook, google plus, bing, cnn, bankrate, and tests pass. 32_64 seems to work as well.
Filip Pizlo
Comment 26 2011-10-21 17:42:00 PDT
Latest perf numbers.

Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"Inlining" at /Volumes/Data/pizlo/septenary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                    TipOfTree             Inlining

SunSpider:
   3d-cube                          7.5292+-0.1551        7.5090+-0.1494
   3d-morph                         7.7238+-0.1017    ?   7.7504+-0.1686    ?
   3d-raytrace                      7.7175+-0.1658        7.6361+-0.1587       might be 1.0107x faster
   access-binary-trees              1.7710+-0.0353        1.6970+-0.0480       might be 1.0436x faster
   access-fannkuch                  6.5448+-0.1462        6.4973+-0.1362
   access-nbody                     3.2920+-0.0581    !   3.6436+-0.0649    !  definitely 1.1068x slower
   access-nsieve                    2.6189+-0.0491        2.5701+-0.0533       might be 1.0190x faster
   bitops-3bit-bits-in-byte         1.7719+-0.0463    ^   1.3203+-0.0365    ^  definitely 1.3421x faster
   bitops-bits-in-byte              2.8426+-0.0491    ^   2.3600+-0.0661    ^  definitely 1.2045x faster
   bitops-bitwise-and               3.3914+-0.1234    ?   3.4515+-0.0830    ?  might be 1.0177x slower
   bitops-nsieve-bits               5.6335+-0.1041    ^   5.4495+-0.0794    ^  definitely 1.0338x faster
   controlflow-recursive            2.1004+-0.0485    ?   2.1078+-0.0435    ?
   crypto-aes                       6.9378+-0.1358    !   7.6149+-0.1823    !  definitely 1.0976x slower
   crypto-md5                       2.8627+-0.1010        2.8080+-0.0770       might be 1.0195x faster
   crypto-sha1                      2.5662+-0.0679        2.5086+-0.0403       might be 1.0230x faster
   date-format-tofte                9.9458+-0.2107    ?   10.2073+-0.3872   ?  might be 1.0263x slower
   date-format-xparb                9.4583+-0.1809    ^   8.9732+-0.0994    ^  definitely 1.0541x faster
   math-cordic                      6.6702+-0.1924        6.4718+-0.1315       might be 1.0307x faster
   math-partial-sums                7.7084+-0.1235    ?   8.1012+-0.4149    ?  might be 1.0510x slower
   math-spectral-norm               2.8734+-0.0662    ^   2.6389+-0.0422    ^  definitely 1.0889x faster
   regexp-dna                       11.6119+-0.1368   ?   11.6933+-0.1866   ?
   string-base64                    4.4173+-0.1344    ?   4.4714+-0.1723    ?  might be 1.0123x slower
   string-fasta                     6.4880+-0.1368        6.4316+-0.1918
   string-tagcloud                  11.5306+-0.1765   ?   11.5918+-0.2346   ?
   string-unpack-code               20.3630+-0.2860   ?   20.8189+-0.2604   ?  might be 1.0224x slower
   string-validate-input            5.2883+-0.1414    ?   5.4789+-0.1554    ?  might be 1.0360x slower

   <arithmetic> *                   6.2177+-0.0345    ?   6.2232+-0.0225    ?
   <geometric>                      5.1094+-0.0268    ^   5.0234+-0.0242    ^  definitely 1.0171x faster
   <harmonic>                       4.2102+-0.0271    ^   4.0180+-0.0279    ^  definitely 1.0478x faster

                                    TipOfTree             Inlining

V8:
   crypto                           74.2652+-0.3573   !   76.2611+-0.6886   !  definitely 1.0269x slower
   deltablue                        228.7236+-1.8105  ^   169.4730+-1.4956  ^  definitely 1.3496x faster
   earley-boyer                     93.0936+-1.7728   ?   95.7037+-1.5892   ?  might be 1.0280x slower
   raytrace                         58.6868+-0.3134   !   61.5333+-0.9121   !  definitely 1.0485x slower
   regexp                           107.2100+-1.0549      106.2229+-0.4367
   richards                         184.7886+-0.4967  ^   145.4120+-1.6208  ^  definitely 1.2708x faster
   splay                            98.0963+-0.7513   ^   95.0552+-0.4904   ^  definitely 1.0320x faster

   <arithmetic>                     120.6949+-0.4253  ^   107.0944+-0.4520  ^  definitely 1.1270x faster
   <geometric> *                    108.7804+-0.3992  ^   101.5881+-0.4463  ^  definitely 1.0708x faster
   <harmonic>                       99.2003+-0.3728   ^   96.4955+-0.4677   ^  definitely 1.0280x faster

                                    TipOfTree             Inlining

Kraken:
   ai-astar                         508.6602+-3.7149  ?   513.4338+-5.7509  ?
   audio-beat-detection             194.7253+-1.9536      194.2500+-1.3624
   audio-dft                        278.9230+-8.0107  ?   285.5524+-6.2409  ?  might be 1.0238x slower
   audio-fft                        126.8889+-1.0945      125.9327+-0.9391
   audio-oscillator                 254.7716+-2.0469  ?   255.3800+-1.1305  ?
   imaging-darkroom                 421.9319+-1.6725  ^   410.0498+-3.3201  ^  definitely 1.0290x faster
   imaging-desaturate               224.2292+-1.6659  ?   225.1973+-2.4855  ?
   imaging-gaussian-blur            571.7987+-7.9711      564.7698+-2.5282     might be 1.0124x faster
   json-parse-financial             57.3133+-0.6812       56.4367+-0.2013      might be 1.0155x faster
   json-stringify-tinderbox         69.9055+-1.0370   ?   70.0927+-0.9709   ?
   stanford-crypto-aes              136.4939+-2.3346      133.3180+-1.3957     might be 1.0238x faster
   stanford-crypto-ccm              103.6385+-1.1249  ?   104.3401+-1.2851  ?
   stanford-crypto-pbkdf2           198.7528+-3.1946      197.6987+-0.8001
   stanford-crypto-sha256-iterative 72.7265+-0.6502   ?   73.2132+-0.5826   ?

   <arithmetic> *                   230.0542+-0.7879      229.2618+-0.7745
   <geometric>                      180.6700+-0.7627      180.1717+-0.4641
   <harmonic>                       141.9692+-0.6425      141.5408+-0.4094

                                    TipOfTree             Inlining

All benchmarks:
   <arithmetic>                     89.9422+-0.2815   ^   87.6836+-0.2670   ^  definitely 1.0258x faster
   <geometric>                      23.3046+-0.0910   ^   22.8341+-0.0758   ^  definitely 1.0206x faster
   <harmonic>                       7.4078+-0.0467    ^   7.0758+-0.0480    ^  definitely 1.0469x faster

                                    TipOfTree             Inlining

Geomean of preferred means:
   <scaled-result>                  53.7855+-0.1745   ^   52.5281+-0.1441   ^  definitely 1.0239x faster
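
The final `<scaled-result>` row appears to be the geometric mean of the three summary rows marked with `*`: SunSpider arithmetic, V8 geometric, and Kraken arithmetic. That reading is inferred from the numbers in this report, not from the harness source, and the names `PreferredMeans`, `scaledResult`, and `speedup` below are hypothetical.

```cpp
#include <cmath>

// Hypothetical container for the three starred summary rows of one VM.
struct PreferredMeans {
    double sunSpiderArithmetic; // "<arithmetic> *" under SunSpider
    double v8Geometric;         // "<geometric> *" under V8
    double krakenArithmetic;    // "<arithmetic> *" under Kraken
};

// Geometric mean of the three preferred means: the cube root of their product.
double scaledResult(const PreferredMeans& m)
{
    return std::cbrt(m.sunSpiderArithmetic * m.v8Geometric * m.krakenArithmetic);
}

// Speed-up factor in the style of the report: baseline time over candidate time.
double speedup(double baseline, double candidate)
{
    return baseline / candidate;
}
```

With the TipOfTree values from this comment (6.2177, 108.7804, 230.0542) the sketch gives roughly 53.79, and with the Inlining values (6.2232, 101.5881, 229.2618) roughly 52.53. Both are close to the reported 53.7855 and 52.5281, and their quotient is about 1.024, in line with the "1.0239x faster" note.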
Oliver Hunt
Comment 27 2011-10-21 18:10:13 PDT
Comment on attachment 112050 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=112050&action=review

r=me

> Source/JavaScriptCore/dfg/DFGJITCompiler32_64.cpp:482
> + store32(Imm32(JSValue::CellTag), tagFor((VirtualRegister)(inlineCallFrame->stackOffset + RegisterFile::ScopeChain)));

So much sadness :-/

> Source/JavaScriptCore/runtime/Heuristics.cpp:32
> -#define ENABLE_RUN_TIME_HEURISTICS 0
> +#define ENABLE_RUN_TIME_HEURISTICS 1

Do we want these on by default?
Filip Pizlo
Comment 28 2011-10-21 18:13:13 PDT
Here's some more performance data, from a different machine.

Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc
"Inlining" at /Volumes/Data/fromMiniMe/septenary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                    TipOfTree             Inlining

SunSpider:
   3d-cube                          7.9057+-0.0355    ?   7.9108+-0.0333    ?
   3d-morph                         8.6263+-0.1314    ^   8.4058+-0.0298    ^  definitely 1.0262x faster
   3d-raytrace                      8.1104+-0.0722        8.0728+-0.0743
   access-binary-trees              1.7884+-0.0047    !   1.8039+-0.0046    !  definitely 1.0086x slower
   access-fannkuch                  7.9673+-0.0243    ^   7.8511+-0.0598    ^  definitely 1.0148x faster
   access-nbody                     4.0548+-0.0328    !   4.2363+-0.0073    !  definitely 1.0448x slower
   access-nsieve                    3.1608+-0.0130    ?   3.1832+-0.0130    ?
   bitops-3bit-bits-in-byte         1.7805+-0.0034    ^   1.3165+-0.0147    ^  definitely 1.3525x faster
   bitops-bits-in-byte              5.3079+-0.0117    ^   5.2709+-0.0227    ^  definitely 1.0070x faster
   bitops-bitwise-and               3.4332+-0.0600    ?   3.4381+-0.0600    ?
   bitops-nsieve-bits               5.6791+-0.0396        5.6473+-0.0367
   controlflow-recursive            2.3211+-0.0036    ?   2.3280+-0.0053    ?
   crypto-aes                       6.8789+-0.0498    !   7.6776+-0.0596    !  definitely 1.1161x slower
   crypto-md5                       3.0041+-0.0363    ^   2.8722+-0.0314    ^  definitely 1.0459x faster
   crypto-sha1                      2.7722+-0.0271    ^   2.6334+-0.0164    ^  definitely 1.0527x faster
   date-format-tofte                10.5648+-0.0599   !   10.7473+-0.0880   !  definitely 1.0173x slower
   date-format-xparb                10.9002+-0.1451   ^   9.5093+-0.1691    ^  definitely 1.1463x faster
   math-cordic                      7.2169+-0.0227    !   7.5781+-0.2770    !  definitely 1.0500x slower
   math-partial-sums                10.5438+-0.0239   !   10.6200+-0.0396   !  definitely 1.0072x slower
   math-spectral-norm               3.2655+-0.0115    ^   2.8810+-0.0056    ^  definitely 1.1335x faster
   regexp-dna                       13.3356+-0.1807   ?   13.3923+-0.2041   ?
   string-base64                    4.4239+-0.0169    ?   4.4245+-0.0157    ?
   string-fasta                     7.1008+-0.0334    ?   7.1265+-0.0376    ?
   string-tagcloud                  13.2980+-0.1306       13.2206+-0.1446
   string-unpack-code               22.6685+-0.1202       22.6000+-0.1731
   string-validate-input            5.6601+-0.0754        5.6096+-0.0405

   <arithmetic> *                   6.9911+-0.0237        6.9368+-0.0332
   <geometric>                      5.7257+-0.0154    ^   5.6206+-0.0229    ^  definitely 1.0187x faster
   <harmonic>                       4.6719+-0.0103    ^   4.4690+-0.0174    ^  definitely 1.0454x faster

                                    TipOfTree             Inlining

V8:
   crypto                           80.1479+-0.0824   !   81.5153+-0.1235   !  definitely 1.0171x slower
   deltablue                        253.7627+-1.6497  ^   188.5828+-0.5628  ^  definitely 1.3456x faster
   earley-boyer                     111.5673+-2.2941  ?   113.8488+-1.4306  ?  might be 1.0204x slower
   raytrace                         63.8901+-0.3075   !   66.7756+-0.6179   !  definitely 1.0452x slower
   regexp                           124.2609+-0.3933  ?   125.0151+-0.7319  ?
   richards                         212.5178+-0.6103  ^   165.2101+-0.2793  ^  definitely 1.2863x faster
   splay                            126.0742+-0.4379  ^   124.1464+-0.7412  ^  definitely 1.0155x faster

   <arithmetic>                     138.8887+-0.4821  ^   123.5849+-0.3212  ^  definitely 1.1238x faster
   <geometric> *                    125.2085+-0.4398  ^   116.9687+-0.3409  ^  definitely 1.0704x faster
   <harmonic>                       113.4041+-0.3858  ^   110.3364+-0.3652  ^  definitely 1.0278x faster

                                    TipOfTree             Inlining

Kraken:
   ai-astar                         804.8272+-11.5509 ?   825.5071+-11.4066 ?  might be 1.0257x slower
   audio-beat-detection             210.3146+-1.2071  ?   210.9226+-1.8756  ?
   audio-dft                        263.5513+-8.3917      262.6012+-2.5779
   audio-fft                        135.0298+-0.0938  ?   135.4095+-0.5561  ?
   audio-oscillator                 291.4857+-2.0305  ?   292.7020+-1.4143  ?
   imaging-darkroom                 480.7360+-3.4757  ^   445.8657+-2.5090  ^  definitely 1.0782x faster
   imaging-desaturate               238.0163+-0.1114  ?   238.1223+-0.1226  ?
   imaging-gaussian-blur            621.0191+-0.4272  ?   621.0583+-0.3606  ?
   json-parse-financial             70.8058+-0.2058   ^   69.6470+-0.2175   ^  definitely 1.0166x faster
   json-stringify-tinderbox         79.7848+-0.3356   ^   78.4771+-0.2201   ^  definitely 1.0167x faster
   stanford-crypto-aes              154.2835+-1.7173  ^   151.3350+-1.0898  ^  definitely 1.0195x faster
   stanford-crypto-ccm              116.3939+-0.6686  ?   117.5918+-0.6777  ?  might be 1.0103x slower
   stanford-crypto-pbkdf2           236.0109+-1.7652  ?   236.9104+-2.0966  ?
   stanford-crypto-sha256-iterative 85.4621+-0.2368   ?   85.6366+-0.2715   ?

   <arithmetic> *                   270.5515+-1.0418      269.4133+-0.8671
   <geometric>                      206.4148+-0.6826  ^   205.2601+-0.3124  ^  definitely 1.0056x faster
   <harmonic>                       162.4133+-0.4534  ^   161.4193+-0.2400  ^  definitely 1.0062x faster

                                    TipOfTree             Inlining

All benchmarks:
   <arithmetic>                     105.1428+-0.3565  ^   102.4944+-0.3063  ^  definitely 1.0258x faster
   <geometric>                      26.3714+-0.0708   ^   25.7962+-0.0736   ^  definitely 1.0223x faster
   <harmonic>                       8.2267+-0.0181    ^   7.8753+-0.0300    ^  definitely 1.0446x faster

                                    TipOfTree             Inlining

Geomean of preferred means:
   <scaled-result>                  61.8693+-0.1933   ^   60.2394+-0.1885   ^  definitely 1.0271x faster
Filip Pizlo
Comment 29 2011-10-21 18:13:36 PDT
(In reply to comment #27)
> (From update of attachment 112050 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=112050&action=review
>
> r=me
>
> > Source/JavaScriptCore/dfg/DFGJITCompiler32_64.cpp:482
> > + store32(Imm32(JSValue::CellTag), tagFor((VirtualRegister)(inlineCallFrame->stackOffset + RegisterFile::ScopeChain)));
>
> So much sadness :-/

:-(

> > Source/JavaScriptCore/runtime/Heuristics.cpp:32
> > -#define ENABLE_RUN_TIME_HEURISTICS 0
> > +#define ENABLE_RUN_TIME_HEURISTICS 1
>
> Do we want these on by default?

Ooops! Thanks for catching that!
Filip Pizlo
Comment 30 2011-10-21 18:22:28 PDT
Patrick R. Gansterer
Comment 31 2011-10-24 01:54:48 PDT
Committed interpreter build fix r98220: <http://trac.webkit.org/changeset/98220>