69690 – DFG does not have flow-sensitive intraprocedural control flow analysis

Filip Pizlo

Reported 2011-10-07 20:36:48 PDT

Currently the DFG will reperform certain type and safety checks because it forgot that it had already performed them in a previous basic block. It also sometimes forgets that it had performed checks in the same basic block, if those checks are not captured by either CSE or DataFormat. One way to ensure that checks are precisely eliminated if all possible paths up to that check already contain that same check (or a stricter check) followed by operations that cannot change the property being checked. Ascertaining this can be done with a flow-sensitive intraprocedural control flow analysis (CFA), which the DFG currently does not have.

Filip Pizlo

Comment 1 2011-10-07 20:44:35 PDT

Created attachment 110248 [details] work in progress Still working on it. Haven't measured performance yet. Haven't implemented 32_64 support yet.

WebKit Review Bot

Comment 2 2011-10-07 20:47:56 PDT

Attachment 110248 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGAbstractValue.h:119: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Source/JavaScriptCore/dfg/DFGAbstractValue.h:204: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Source/JavaScriptCore/dfg/DFGAbstractValue.h:253: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Source/JavaScriptCore/dfg/DFGAbstractState.h:95: The parameter name "codeBlock" adds no information, so it should be removed. [readability/parameter_name] [5] Source/JavaScriptCore/dfg/DFGAbstractState.h:95: The parameter name "graph" adds no information, so it should be removed. [readability/parameter_name] [5] Source/JavaScriptCore/dfg/DFGAbstractState.h:106: The parameter name "graph" adds no information, so it should be removed. [readability/parameter_name] [5] Source/JavaScriptCore/dfg/DFGAbstractState.h:111: The parameter name "basicBlock" adds no information, so it should be removed. [readability/parameter_name] [5] Source/JavaScriptCore/dfg/DFGAbstractState.h:134: The parameter name "mergeMode" adds no information, so it should be removed. [readability/parameter_name] [5] Source/JavaScriptCore/dfg/DFGAbstractState.h:147: The parameter name "nodeIndex" adds no information, so it should be removed. [readability/parameter_name] [5] Source/JavaScriptCore/dfg/DFGAbstractState.h:162: The parameter name "graph" adds no information, so it should be removed. [readability/parameter_name] [5] Source/JavaScriptCore/dfg/DFGAbstractState.h:162: The parameter name "basicBlock" adds no information, so it should be removed. [readability/parameter_name] [5] Source/JavaScriptCore/dfg/DFGAbstractState.h:171: The parameter name "nodeIndex" adds no information, so it should be removed. [readability/parameter_name] [5] Source/JavaScriptCore/dfg/DFGOSREntry.cpp:89: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Total errors found: 14 in 22 files If any of these errors are false positives, please file a bug against check-webkit-style.

Filip Pizlo

Comment 3 2011-10-07 20:52:12 PDT

This looks like it still needs work, but even now, it's a speed-up on V8. Benchmark report for V8. VMs tested: "TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc "CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree CFA crypto 71.8876+-0.2979 ! 73.8087+-0.2691 ! definitely 1.0267x slower deltablue 226.0413+-2.3269 ? 226.1569+-2.0996 ? earley-boyer 90.5814+-0.4154 ! 91.8870+-0.3324 ! definitely 1.0144x slower raytrace 58.4061+-0.3131 ! 59.3055+-0.3795 ! definitely 1.0154x slower regexp 102.7642+-0.2884 ? 103.4826+-0.4524 ? richards 204.4922+-0.8884 ^ 175.1322+-0.3993 ^ definitely 1.1676x faster splay 95.6563+-0.7585 ? 96.4660+-0.2593 ? <arithmetic> 121.4042+-0.3672 ^ 118.0341+-0.2739 ^ definitely 1.0286x faster <geometric> * 108.1240+-0.2137 ^ 106.8489+-0.1572 ^ definitely 1.0119x faster <harmonic> 97.7987+-0.1951 ? 97.9595+-0.1685 ?

Filip Pizlo

Comment 4 2011-10-07 20:54:11 PDT

Created attachment 110250 [details] work in progress - fix style, mostly

WebKit Review Bot

Comment 5 2011-10-07 20:56:48 PDT

Attachment 110250 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Total errors found: 1 in 22 files If any of these errors are false positives, please file a bug against check-webkit-style.

Filip Pizlo

Comment 6 2011-10-07 22:24:29 PDT

> crypto 71.8876+-0.2979 ! 73.8087+-0.2691 ! definitely 1.0267x slower This 2% appears to be entirely due to the CFA itself! The CFA code needs to be tuned up. I'll play around with making it faster.

Filip Pizlo

Comment 7 2011-10-07 23:00:21 PDT

Created attachment 110259 [details] work in progress - better performance I did two things: - Fixed OSR entry. If a variable is dead then the abstract value should be TOP. Internally, the abstract state machine treats dead variables as BOTTOM. That's necessary for making the analysis precise. - Changed the abstract structure value representation. Previously it was basically Vector<Structure*>. Now it's just Structure*, where sets larger than 1 are turned to TOP. This works great, since most structure sets only contain one element anyway. Benchmark report for SunSpider, V8, and Kraken. VMs tested: "TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc "CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree CFA SunSpider: 3d-cube 7.3400+-0.2059 7.2108+-0.1398 might be 1.0179x faster 3d-morph 7.5851+-0.1400 ? 7.6501+-0.1387 ? 3d-raytrace 7.5752+-0.1830 7.5150+-0.1734 access-binary-trees 1.7129+-0.0337 ? 1.7169+-0.0476 ? access-fannkuch 6.2979+-0.0958 ? 6.5286+-0.1364 ? might be 1.0366x slower access-nbody 3.4297+-0.0839 ? 3.5317+-0.0830 ? might be 1.0297x slower access-nsieve 2.5318+-0.0516 ? 2.6022+-0.0786 ? might be 1.0278x slower bitops-3bit-bits-in-byte 1.6817+-0.0191 ! 1.7452+-0.0250 ! definitely 1.0378x slower bitops-bits-in-byte 2.7081+-0.0537 ! 2.9258+-0.1245 ! definitely 1.0804x slower bitops-bitwise-and 3.4372+-0.2532 3.3606+-0.0894 might be 1.0228x faster bitops-nsieve-bits 5.4391+-0.0644 5.3109+-0.1139 might be 1.0241x faster controlflow-recursive 2.0508+-0.0375 2.0316+-0.0499 crypto-aes 6.5082+-0.1513 ? 6.7630+-0.1750 ? might be 1.0392x slower crypto-md5 2.7830+-0.0851 ? 2.8322+-0.0706 ? might be 1.0177x slower crypto-sha1 2.4498+-0.0714 ? 2.5556+-0.0867 ? might be 1.0432x slower date-format-tofte 9.8098+-0.2208 ? 9.9895+-0.1587 ? might be 1.0183x slower date-format-xparb 9.4492+-0.1745 ! 9.8804+-0.2001 ! definitely 1.0456x slower math-cordic 6.3737+-0.1416 6.3125+-0.0656 math-partial-sums 7.5554+-0.1169 7.5150+-0.1276 math-spectral-norm 2.7905+-0.0655 ? 2.8601+-0.0784 ? might be 1.0250x slower regexp-dna 10.6186+-0.1482 10.5608+-0.1658 string-base64 5.1063+-0.0828 ? 5.1827+-0.0692 ? might be 1.0150x slower string-fasta 6.3348+-0.1140 ? 6.4154+-0.1134 ? might be 1.0127x slower string-tagcloud 11.1755+-0.2087 ? 11.2236+-0.2715 ? string-unpack-code 21.0133+-0.2257 ! 21.6342+-0.2337 ! definitely 1.0295x slower string-validate-input 6.2811+-0.1612 6.2791+-0.1612 <arithmetic> * 6.1553+-0.0194 ! 6.2359+-0.0193 ! definitely 1.0131x slower <geometric> 5.0469+-0.0176 ! 5.1180+-0.0158 ! definitely 1.0141x slower <harmonic> 4.1371+-0.0257 ! 4.2074+-0.0236 ! definitely 1.0170x slower TipOfTree CFA V8: crypto 71.6115+-0.3106 ! 72.6151+-0.5762 ! definitely 1.0140x slower deltablue 227.0336+-3.5653 223.0970+-2.1287 might be 1.0176x faster earley-boyer 90.4309+-0.3425 ? 90.9446+-0.2832 ? raytrace 57.5184+-0.4610 57.4722+-0.2783 regexp 102.7896+-0.3338 ? 102.8251+-0.5877 ? richards 205.1041+-1.1592 ^ 175.0092+-0.6119 ^ definitely 1.1720x faster splay 95.5634+-0.6248 ? 95.6634+-0.7130 ? <arithmetic> 121.4359+-0.5050 ^ 116.8038+-0.3613 ^ definitely 1.0397x faster <geometric> * 107.9023+-0.2225 ^ 105.5267+-0.2425 ^ definitely 1.0225x faster <harmonic> 97.3732+-0.1490 ^ 96.4890+-0.1988 ^ definitely 1.0092x faster TipOfTree CFA Kraken: ai-astar 493.1064+-3.9254 492.3566+-3.0941 audio-beat-detection 191.3132+-0.6433 ? 191.4514+-0.7850 ? audio-dft 265.4996+-2.7456 ^ 260.4941+-1.8597 ^ definitely 1.0192x faster audio-fft 125.3307+-0.5836 ? 126.1825+-1.9849 ? audio-oscillator 251.3514+-1.9030 249.7558+-2.0144 imaging-darkroom 410.4260+-0.8666 ! 413.6212+-1.1919 ! definitely 1.0078x slower imaging-desaturate 230.1417+-0.5040 ^ 216.3676+-0.4729 ^ definitely 1.0637x faster imaging-gaussian-blur 578.7784+-2.2247 ^ 568.6758+-2.3888 ^ definitely 1.0178x faster json-parse-financial 54.4313+-0.2379 ? 54.5051+-0.2106 ? json-stringify-tinderbox 69.6600+-0.8972 ^ 68.1390+-0.2398 ^ definitely 1.0223x faster stanford-crypto-aes 131.3154+-2.2640 130.2981+-1.6271 stanford-crypto-ccm 99.8329+-0.9814 ! 101.7389+-0.8432 ! definitely 1.0191x slower stanford-crypto-pbkdf2 187.8078+-0.7526 ! 191.5040+-1.2973 ! definitely 1.0197x slower stanford-crypto-sha256-iterative 69.6199+-0.2190 ? 69.9535+-0.3764 ? <arithmetic> * 225.6153+-0.5117 ^ 223.9317+-0.4806 ^ definitely 1.0075x faster <geometric> 176.2934+-0.4303 ^ 175.3365+-0.4004 ^ definitely 1.0055x faster <harmonic> 137.8627+-0.3810 137.4240+-0.3734 TipOfTree CFA All benchmarks: <arithmetic> 88.6958+-0.1882 ^ 87.5490+-0.1577 ^ definitely 1.0131x faster <geometric> 22.9502+-0.0535 ? 23.0146+-0.0492 ? <harmonic> 7.2778+-0.0442 ! 7.3970+-0.0406 ! definitely 1.0164x slower TipOfTree CFA Geomean of preferred means: <scaled-result> 53.1147+-0.0841 ^ 52.8190+-0.0781 ^ definitely 1.0056x faster

WebKit Review Bot

Comment 8 2011-10-07 23:03:52 PDT

Attachment 110259 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGAbstractValue.h:96: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Source/JavaScriptCore/dfg/DFGAbstractState.cpp:48: Should have a space between // and comment [whitespace/comments] [4] Source/JavaScriptCore/dfg/DFGAbstractState.cpp:65: Should have a space between // and comment [whitespace/comments] [4] Source/JavaScriptCore/dfg/DFGAbstractState.cpp:82: Should have a space between // and comment [whitespace/comments] [4] Source/JavaScriptCore/dfg/DFGAbstractState.cpp:123: Should have a space between // and comment [whitespace/comments] [4] Total errors found: 6 in 22 files If any of these errors are false positives, please file a bug against check-webkit-style.

Filip Pizlo

Comment 9 2011-10-08 00:33:34 PDT

Created attachment 110263 [details] work in progress - it's now pure win This is now pure win. This required a number of changes: - CFA merging requires interrogating the successors of basic blocks. This was previously a binary search. Now it's not. - The CFA is now pseudo-worklist based. It visits basic blocks in bytecode order (which is almost program order, and almost topological order), leading to a high probability that a block will not be visited unless its predecessors have been visited. But it does not visit a block if since the last time it was visited nothing has happened that would change the block's outcome. - Bunch of little tweaks to reduce the costs of various CFA operations. Benchmark report for SunSpider, V8, and Kraken. VMs tested: "TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc "CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree CFA SunSpider: 3d-cube 7.1877+-0.1640 ? 7.2292+-0.1642 ? 3d-morph 7.5879+-0.1785 7.5121+-0.1356 might be 1.0101x faster 3d-raytrace 7.4148+-0.2026 7.3754+-0.1707 access-binary-trees 1.7784+-0.0808 1.6926+-0.0562 might be 1.0507x faster access-fannkuch 6.3277+-0.1237 6.3013+-0.1268 access-nbody 3.5172+-0.0884 3.4572+-0.0941 might be 1.0174x faster access-nsieve 2.5999+-0.0791 2.5825+-0.0877 bitops-3bit-bits-in-byte 1.7346+-0.0357 1.7115+-0.0327 might be 1.0135x faster bitops-bits-in-byte 2.7539+-0.0720 2.7514+-0.0510 bitops-bitwise-and 3.3173+-0.0725 ? 3.3754+-0.1000 ? might be 1.0175x slower bitops-nsieve-bits 5.3589+-0.1009 ? 5.3927+-0.1290 ? controlflow-recursive 2.0357+-0.0392 ? 2.0723+-0.0671 ? might be 1.0180x slower crypto-aes 6.5654+-0.1839 ? 6.6289+-0.1744 ? crypto-md5 2.7721+-0.0633 ? 2.8039+-0.0589 ? might be 1.0114x slower crypto-sha1 2.4790+-0.1072 ? 2.4911+-0.0745 ? date-format-tofte 9.8535+-0.2432 ? 9.9580+-0.2362 ? might be 1.0106x slower date-format-xparb 9.4830+-0.1867 ! 10.0138+-0.2298 ! definitely 1.0560x slower math-cordic 6.3866+-0.1093 ? 6.5029+-0.1173 ? might be 1.0182x slower math-partial-sums 7.6234+-0.1297 7.5099+-0.1406 might be 1.0151x faster math-spectral-norm 2.8589+-0.0701 2.8198+-0.0660 might be 1.0139x faster regexp-dna 10.5651+-0.2565 10.4409+-0.1349 might be 1.0119x faster string-base64 5.0891+-0.1039 ? 5.2070+-0.1078 ? might be 1.0232x slower string-fasta 6.4453+-0.1413 6.3143+-0.0977 might be 1.0207x faster string-tagcloud 11.0358+-0.2692 ? 11.1644+-0.1814 ? might be 1.0117x slower string-unpack-code 21.0071+-0.2760 ? 21.4615+-0.5199 ? might be 1.0216x slower string-validate-input 6.2340+-0.1813 ? 6.3609+-0.1603 ? might be 1.0204x slower <arithmetic> * 6.1543+-0.0155 ? 6.1973+-0.0322 ? <geometric> 5.0642+-0.0192 ? 5.0771+-0.0246 ? <harmonic> 4.1736+-0.0342 4.1643+-0.0360 TipOfTree CFA V8: crypto 71.6260+-0.3327 ? 71.8717+-0.3132 ? deltablue 226.0467+-1.9042 ^ 221.1798+-1.0192 ^ definitely 1.0220x faster earley-boyer 89.8473+-0.2783 ! 90.9236+-0.3194 ! definitely 1.0120x slower raytrace 57.7508+-0.2079 ? 58.0618+-0.2847 ? regexp 102.4067+-0.2936 ? 103.1329+-0.4744 ? richards 204.6559+-1.0139 ^ 175.9648+-1.3114 ^ definitely 1.1631x faster splay 96.0001+-0.5870 95.7168+-0.2982 <arithmetic> 121.1905+-0.2260 ^ 116.6931+-0.2717 ^ definitely 1.0385x faster <geometric> * 107.7836+-0.1350 ^ 105.5294+-0.1963 ^ definitely 1.0214x faster <harmonic> 97.3547+-0.1511 ^ 96.5707+-0.1726 ^ definitely 1.0081x faster TipOfTree CFA Kraken: ai-astar 491.8601+-2.3376 489.5730+-2.2160 audio-beat-detection 190.6516+-0.7819 ! 193.5275+-1.1763 ! definitely 1.0151x slower audio-dft 268.0732+-3.6325 ^ 259.3428+-1.9135 ^ definitely 1.0337x faster audio-fft 124.9162+-0.4877 124.7628+-0.2686 audio-oscillator 252.5326+-2.3092 249.5812+-1.6099 might be 1.0118x faster imaging-darkroom 411.7627+-1.5812 ? 412.7726+-0.8839 ? imaging-desaturate 229.9889+-0.7966 ^ 216.8373+-0.8930 ^ definitely 1.0607x faster imaging-gaussian-blur 580.7792+-2.4572 ^ 565.6115+-1.0252 ^ definitely 1.0268x faster json-parse-financial 55.2379+-0.3316 55.1255+-0.5643 json-stringify-tinderbox 68.1584+-0.4261 67.6238+-0.1915 stanford-crypto-aes 130.3048+-1.2905 ? 130.4318+-1.5252 ? stanford-crypto-ccm 99.8189+-0.8939 ! 102.0182+-0.9102 ! definitely 1.0220x slower stanford-crypto-pbkdf2 187.7075+-0.8696 ! 190.3954+-1.0340 ! definitely 1.0143x slower stanford-crypto-sha256-iterative 70.1452+-0.4680 ? 70.3815+-0.4656 ? <arithmetic> * 225.8527+-0.5194 ^ 223.4275+-0.4581 ^ definitely 1.0109x faster <geometric> 176.3357+-0.4920 ^ 175.2288+-0.4434 ^ definitely 1.0063x faster <harmonic> 137.8869+-0.4240 137.5821+-0.4486 TipOfTree CFA All benchmarks: <arithmetic> 88.7294+-0.1670 ^ 87.3610+-0.1625 ^ definitely 1.0157x faster <geometric> 22.9916+-0.0565 22.9087+-0.0767 <harmonic> 7.3403+-0.0588 7.3234+-0.0619 TipOfTree CFA Geomean of preferred means: <scaled-result> 53.1110+-0.0676 ^ 52.6706+-0.1069 ^ definitely 1.0084x faster

WebKit Review Bot

Comment 10 2011-10-08 00:36:03 PDT

Attachment 110263 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGAbstractValue.h:96: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Total errors found: 2 in 26 files If any of these errors are false positives, please file a bug against check-webkit-style.

Filip Pizlo

Comment 11 2011-10-08 00:45:28 PDT

Created attachment 110264 [details] the patch Got it to work with 32_64.

WebKit Review Bot

Comment 12 2011-10-08 00:47:36 PDT

Attachment 110264 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/ChangeLog:3: Line contains tab character. [whitespace/tab] [5] Source/JavaScriptCore/ChangeLog:4: Line contains tab character. [whitespace/tab] [5] Source/JavaScriptCore/ChangeLog:8: Line contains tab character. [whitespace/tab] [5] Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Total errors found: 4 in 26 files If any of these errors are false positives, please file a bug against check-webkit-style.

Filip Pizlo

Comment 13 2011-10-08 00:49:30 PDT

Created attachment 110265 [details] the patch Oh, emacs, you and your tabs, which you like to put into ChangeLogs regardless of how many times I tell you not to.

WebKit Review Bot

Comment 14 2011-10-08 00:53:18 PDT

Attachment 110265 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Total errors found: 1 in 26 files If any of these errors are false positives, please file a bug against check-webkit-style.

Oliver Hunt

Comment 15 2011-10-08 11:28:40 PDT

Comment on attachment 110265 [details] the patch View in context: https://bugs.webkit.org/attachment.cgi?id=110265&action=review Should use a typedef for bool (*predictionCheck)(PredictedType), getBytecodeBegin shouldn't be taking an OwnPtr*, but it looks like that has just been moved from one place to another. Other than those style quirks I can't see any obvious problems. I am slightly concerned with the potential maintenance problems caused by separating type flow logic from codegen, especially given the vast amount of code duplication we currently have :-( I do think Gavin should also look at this though. > Source/JavaScriptCore/dfg/DFGBasicBlock.h:59 > + static inline BlockIndex getBytecodeBegin(OwnPtr<BasicBlock>* block) > + { > + return (*block)->bytecodeBegin; > + } This seems wrong, there's no transfer of ownership intended here, you should just be passing BasicBlock* > Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:350 > +void SpeculativeJIT::compileObjectEquality(Node& node, void* vptr, bool (*predictionCheck)(PredictedType)) Would be nice to have a typedef for bool (*predictionCheck)(PredictedType)

Filip Pizlo

Comment 16 2011-10-08 12:32:44 PDT

(In reply to comment #15) > (From update of attachment 110265 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=110265&action=review > > Should use a typedef for bool (*predictionCheck)(PredictedType), getBytecodeBegin shouldn't be taking an OwnPtr*, but it looks like that has just been moved from one place to another. > > Other than those style quirks I can't see any obvious problems. > > I am slightly concerned with the potential maintenance problems caused by separating type flow logic from codegen, especially given the vast amount of code duplication we currently have :-( I know. I don't think I want to do this in this patch, but I think I've got a solution for both this and register allocation: have named speculations (so just as we have an enum of node types, have an enum of speculation types), have attributes on nodes that describe what speculatiuons should be performed, have one phase in Propagator that deduces what those speculatins ought to be, and then while we're at it, have node attributes that specify what additional registers the codegen for a node will required. And viola, we'll be able to do global register allocation and get rid of GenerationInfo entirely. > > I do think Gavin should also look at this though. > > > Source/JavaScriptCore/dfg/DFGBasicBlock.h:59 > > + static inline BlockIndex getBytecodeBegin(OwnPtr<BasicBlock>* block) > > + { > > + return (*block)->bytecodeBegin; > > + } > > This seems wrong, there's no transfer of ownership intended here, you should just be passing BasicBlock* I think that this code is exactly as it should be. This is one of those gross stubs that we need for doing binary searches over Vector<OwnPtr<BasicBlock>>. The WTF::binarySearch() function takes a function pointer for getting the search key; for a vector of the form Vector<T> it takes a function of the form U (*)(T*), where U is the search key. For Vector<OwnPtr<BasicBlock>> we have T = OwnPtr<BasicBlock>, so the function must be of the form U (*)(OwnPtr<BasicBlock>*). The only way to change this is to make Vector<OwnPtr<BasicBlock>> be something else, but I quite like the idea of a vector of pointers for basic blocks, since there are few basic blocks and the basic block data structure is rather large. So while this looks yucky, and you'd probably never want to call it directly, it's necessary to support binary searches of basic blocks by their bytecode index. > > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:350 > > +void SpeculativeJIT::compileObjectEquality(Node& node, void* vptr, bool (*predictionCheck)(PredictedType)) > > Would be nice to have a typedef for bool (*predictionCheck)(PredictedType) Agree, will add.

Filip Pizlo

Comment 17 2011-10-08 13:00:39 PDT

Created attachment 110277 [details] the patch Added a typedef for bool (*)(PredictedType). Removed bytecode/ActionablePrediction.h because it's unused. Updated some build files (though most of them make no reference to either DFG, or to any headers, so did not need to change).

WebKit Review Bot

Comment 18 2011-10-08 13:03:44 PDT

Attachment 110277 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Total errors found: 1 in 28 files If any of these errors are false positives, please file a bug against check-webkit-style.

Filip Pizlo

Comment 19 2011-10-08 13:59:35 PDT

Created attachment 110281 [details] the patch - merge with 32_64 changes

WebKit Review Bot

Comment 20 2011-10-08 14:01:50 PDT

Attachment 110281 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Total errors found: 1 in 28 files If any of these errors are false positives, please file a bug against check-webkit-style.

Filip Pizlo

Comment 21 2011-10-08 15:30:24 PDT

Created attachment 110283 [details] the patch - more perf improvements - Did more tuning to the CFA itself. It now calls clobberStructures() less frequently. clobberStructures() is an expensive, but necessary, method. Likely any future work on improving CFA efficiency will focus on that method - how it's called, and what it does internally. - Added better debugging support. Every graph dump will now print the CFA state in a concise and clear way. - Fixed a bug where clobberStructures() was not being called for possibly-side-effecting operations like LogicalNot on questionable objects, and CompareXYZ on questionable objects. - Did some investigation into what is going on in SunSpider. It turns out that date-format-xparb, among others, are still failing speculation like crazy. A lot of its is due to GetByVal not having a string property case. The exponential backoff in recompilation is doing its best, but since the benchmark is so short-running, any increase in DFG runtime will affect such short-running benchmarks that fail speculation systematically. - Ran performance numbers on a separate machine, to find that the speed-up is smaller on my Mac Pro than it is on my MacBook Pro. But it's a speed-up on both machines. Current performance situation: - Overall 0.5% speed-up in geomean of preferred means. - Slight reproducible sub-1% slow-down on SunSpider - Either a 2.4% or a 0.5% win on V8 depending on machine. - 1% win on Kraken. Recommendation: we should land this, since: it's already a net win, the losses on SunSpider are due to preexisting conditions for which we are currently uninsured, and this should only become a bigger win as we add more machinery to take advantage of it. (Like, we're not doing anything to take advantage of the always-double propagation that this thing is already doing, and I haven't yet implemented the is-constant propagation, which should be a win on function checks among other things.) Not to mention that classically, CFA is most profitable when combined with inlining. Latest numbers on MacBook Pro: Benchmark report for SunSpider, V8, and Kraken. VMs tested: "TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc "CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc Collected 30 samples per benchmark/VM, with 10 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree CFA SunSpider: 3d-cube 7.2512+-0.1027 7.2506+-0.1160 3d-morph 7.6656+-0.0813 ? 7.7124+-0.0978 ? 3d-raytrace 7.3843+-0.0984 ? 7.4021+-0.0967 ? access-binary-trees 1.7302+-0.0287 1.6781+-0.0239 might be 1.0310x faster access-fannkuch 6.3504+-0.0672 6.3159+-0.0640 access-nbody 3.4521+-0.0469 ? 3.4932+-0.0531 ? might be 1.0119x slower access-nsieve 2.5600+-0.0356 ? 2.5885+-0.0470 ? might be 1.0111x slower bitops-3bit-bits-in-byte 1.7154+-0.0262 ? 1.7434+-0.0319 ? might be 1.0163x slower bitops-bits-in-byte 2.7388+-0.0391 ? 2.7447+-0.0359 ? bitops-bitwise-and 3.3050+-0.0546 ? 3.3606+-0.0589 ? might be 1.0168x slower bitops-nsieve-bits 5.4385+-0.0653 5.4370+-0.0711 controlflow-recursive 2.0667+-0.0212 ? 2.0812+-0.0208 ? crypto-aes 6.6382+-0.1104 ? 6.6709+-0.1017 ? crypto-md5 2.8355+-0.0466 2.8343+-0.0453 crypto-sha1 2.4700+-0.0373 ? 2.5155+-0.0443 ? might be 1.0184x slower date-format-tofte 9.9355+-0.0996 ? 9.9652+-0.1371 ? date-format-xparb 9.5529+-0.1560 ! 9.9465+-0.1131 ! definitely 1.0412x slower math-cordic 6.2958+-0.0680 ? 6.3863+-0.0523 ? might be 1.0144x slower math-partial-sums 7.5041+-0.0695 ? 7.5846+-0.0878 ? might be 1.0107x slower math-spectral-norm 2.8198+-0.0398 ? 2.8726+-0.0465 ? might be 1.0187x slower regexp-dna 10.6842+-0.0893 10.6180+-0.1092 string-base64 5.2352+-0.0858 ? 5.2951+-0.0554 ? might be 1.0114x slower string-fasta 6.3476+-0.0765 ? 6.4334+-0.0947 ? might be 1.0135x slower string-tagcloud 11.0721+-0.1259 ? 11.1451+-0.1129 ? string-unpack-code 20.9296+-0.2131 ? 21.3736+-0.3461 ? might be 1.0212x slower string-validate-input 6.3714+-0.1157 6.3281+-0.0869 <arithmetic> * 6.1673+-0.0147 ! 6.2222+-0.0173 ! definitely 1.0089x slower <geometric> 5.0674+-0.0157 ! 5.1036+-0.0164 ! definitely 1.0072x slower <harmonic> 4.1647+-0.0213 ? 4.1880+-0.0207 ? TipOfTree CFA V8: crypto 71.9306+-0.2217 71.6923+-0.2342 deltablue 224.0349+-1.1196 ^ 220.8591+-0.8745 ^ definitely 1.0144x faster earley-boyer 90.8330+-0.2471 ? 91.0550+-0.6016 ? raytrace 58.6492+-0.1942 ^ 57.5860+-0.1636 ^ definitely 1.0185x faster regexp 103.2842+-0.5738 ? 103.6217+-0.3030 ? richards 204.7647+-0.4575 ^ 178.8069+-0.4249 ^ definitely 1.1452x faster splay 93.9206+-0.3154 93.8991+-0.3847 <arithmetic> 121.0596+-0.1853 ^ 116.7886+-0.1891 ^ definitely 1.0366x faster <geometric> * 107.9179+-0.1498 ^ 105.3890+-0.1611 ^ definitely 1.0240x faster <harmonic> 97.7053+-0.1498 ^ 96.2554+-0.1560 ^ definitely 1.0151x faster TipOfTree CFA Kraken: ai-astar 491.8722+-1.8657 491.6102+-2.1346 audio-beat-detection 191.3896+-0.6444 ! 193.1956+-0.5294 ! definitely 1.0094x slower audio-dft 265.5784+-1.5956 ^ 261.3193+-2.1896 ^ definitely 1.0163x faster audio-fft 124.9334+-0.1853 ? 125.4338+-0.4450 ? audio-oscillator 251.5153+-1.0876 250.4084+-1.0322 imaging-darkroom 412.2875+-0.8770 ? 412.7251+-0.8786 ? imaging-desaturate 229.8157+-0.4044 ^ 216.2189+-0.3764 ^ definitely 1.0629x faster imaging-gaussian-blur 579.9277+-0.8575 ^ 567.2564+-0.9921 ^ definitely 1.0223x faster json-parse-financial 53.8428+-0.2035 ? 54.0464+-0.2093 ? json-stringify-tinderbox 67.6931+-0.2539 ? 67.7358+-0.2734 ? stanford-crypto-aes 130.2188+-0.9518 129.4420+-0.8691 stanford-crypto-ccm 99.4847+-0.3260 ! 101.1548+-0.4164 ! definitely 1.0168x slower stanford-crypto-pbkdf2 187.6499+-0.8004 ? 188.8759+-1.0071 ? stanford-crypto-sha256-iterative 69.9878+-0.1863 69.6543+-0.1668 <arithmetic> * 225.4426+-0.3118 ^ 223.5055+-0.3132 ^ definitely 1.0087x faster <geometric> 175.7162+-0.2682 ^ 174.8010+-0.2755 ^ definitely 1.0052x faster <harmonic> 136.9772+-0.2368 136.7825+-0.2282 TipOfTree CFA All benchmarks: <arithmetic> 88.5950+-0.1068 ^ 87.4122+-0.1004 ^ definitely 1.0135x faster <geometric> 22.9797+-0.0474 22.9534+-0.0477 <harmonic> 7.3245+-0.0367 ? 7.3629+-0.0354 ? TipOfTree CFA Geomean of preferred means: <scaled-result> 53.1381+-0.0625 ^ 52.7237+-0.0651 ^ definitely 1.0079x faster Latest numbers on Mac Pro: Benchmark report for SunSpider, V8, and Kraken. VMs tested: "TipOfTree" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/jsc "CFA" at /Volumes/Data/fromMiniMe/OpenSource/WebKitBuild/Release/jsc Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree CFA SunSpider: 3d-cube 7.9004+-0.0265 ! 7.9902+-0.0335 ! definitely 1.0114x slower 3d-morph 8.4615+-0.1076 ? 8.4887+-0.1116 ? 3d-raytrace 8.0821+-0.1095 8.0724+-0.0618 access-binary-trees 1.7927+-0.0033 1.7806+-0.0161 access-fannkuch 7.6813+-0.0148 ! 7.9859+-0.0254 ! definitely 1.0397x slower access-nbody 4.1641+-0.0068 ! 4.1778+-0.0037 ! definitely 1.0033x slower access-nsieve 3.1422+-0.0032 ! 3.1584+-0.0066 ! definitely 1.0052x slower bitops-3bit-bits-in-byte 1.7320+-0.0103 ? 1.7441+-0.0089 ? bitops-bits-in-byte 5.2152+-0.0453 ? 5.2206+-0.0533 ? bitops-bitwise-and 3.4118+-0.0640 3.3993+-0.0644 bitops-nsieve-bits 5.6763+-0.0392 5.6438+-0.0351 controlflow-recursive 2.2564+-0.0244 2.2509+-0.0028 crypto-aes 6.9132+-0.0689 ! 7.0664+-0.0630 ! definitely 1.0222x slower crypto-md5 3.0181+-0.0343 ? 3.0585+-0.0331 ? might be 1.0134x slower crypto-sha1 2.7090+-0.0109 ! 2.7557+-0.0108 ! definitely 1.0172x slower date-format-tofte 10.6874+-0.0850 ? 10.7886+-0.1017 ? date-format-xparb 10.7804+-0.2653 ? 11.2082+-0.1793 ? might be 1.0397x slower math-cordic 7.2695+-0.0506 7.2238+-0.0219 math-partial-sums 10.4781+-0.0501 10.4767+-0.0339 math-spectral-norm 3.2288+-0.0168 ! 3.2505+-0.0036 ! definitely 1.0067x slower regexp-dna 12.4863+-0.1218 ? 12.5624+-0.1253 ? string-base64 5.4727+-0.0648 ? 5.5304+-0.0697 ? might be 1.0105x slower string-fasta 7.3632+-0.0576 7.3506+-0.0232 string-tagcloud 12.8958+-0.0586 12.8081+-0.0688 string-unpack-code 23.9726+-0.1714 ? 24.2456+-0.1831 ? might be 1.0114x slower string-validate-input 6.8011+-0.1062 6.7479+-0.0643 <arithmetic> * 7.0612+-0.0244 ! 7.1149+-0.0199 ! definitely 1.0076x slower <geometric> 5.7812+-0.0155 ! 5.8176+-0.0134 ! definitely 1.0063x slower <harmonic> 4.6961+-0.0111 ? 4.7189+-0.0120 ? TipOfTree CFA V8: crypto 80.3328+-0.2637 79.9557+-0.1580 deltablue 252.4219+-1.5655 ? 253.0487+-1.3899 ? earley-boyer 110.3867+-0.3100 ! 111.4466+-0.2599 ! definitely 1.0096x slower raytrace 66.3575+-0.3105 ^ 64.6593+-0.1899 ^ definitely 1.0263x faster regexp 123.6966+-0.5882 123.6552+-0.5325 richards 221.5944+-0.9748 ^ 215.4157+-1.0238 ^ definitely 1.0287x faster splay 122.9378+-0.7662 ? 124.5312+-0.8843 ? might be 1.0130x slower <arithmetic> 139.6754+-0.4149 138.9589+-0.3738 <geometric> * 125.8663+-0.2769 ^ 125.2510+-0.2812 ^ definitely 1.0049x faster <harmonic> 114.2437+-0.1976 ^ 113.5287+-0.2205 ^ definitely 1.0063x faster TipOfTree CFA Kraken: ai-astar 832.8773+-0.4054 825.4636+-11.7787 audio-beat-detection 214.1596+-1.7669 213.3784+-0.9437 audio-dft 271.3054+-4.2497 ^ 260.8641+-3.0615 ^ definitely 1.0400x faster audio-fft 135.4391+-0.1603 ! 139.2097+-1.1166 ! definitely 1.0278x slower audio-oscillator 293.9338+-3.0579 292.1754+-1.9142 imaging-darkroom 486.8431+-4.6328 484.2544+-3.7257 imaging-desaturate 244.8629+-0.2524 ^ 237.8543+-0.0918 ^ definitely 1.0295x faster imaging-gaussian-blur 641.8301+-0.2648 ^ 611.2323+-0.2040 ^ definitely 1.0501x faster json-parse-financial 68.4385+-0.1365 ? 68.6702+-0.2375 ? json-stringify-tinderbox 79.7739+-0.3035 ? 80.1401+-0.2522 ? stanford-crypto-aes 152.6437+-1.2898 151.5830+-1.4340 stanford-crypto-ccm 116.1246+-0.7775 ? 117.1616+-0.5738 ? stanford-crypto-pbkdf2 235.4855+-1.7872 232.6827+-1.6888 might be 1.0120x faster stanford-crypto-sha256-iterative 85.6205+-0.2548 ? 86.0403+-0.3846 ? <arithmetic> * 275.6670+-0.6405 ^ 271.4793+-0.9585 ^ definitely 1.0154x faster <geometric> 208.1917+-0.4363 ^ 206.5505+-0.5262 ^ definitely 1.0079x faster <harmonic> 162.2625+-0.2764 162.1470+-0.4291 TipOfTree CFA All benchmarks: <arithmetic> 106.8225+-0.2224 ^ 105.4981+-0.2833 ^ definitely 1.0126x faster <geometric> 26.6011+-0.0546 ? 26.6112+-0.0438 ? <harmonic> 8.2687+-0.0192 ? 8.3073+-0.0206 ? TipOfTree CFA Geomean of preferred means: <scaled-result> 62.5735+-0.1251 ^ 62.3100+-0.0910 ^ definitely 1.0042x faster

WebKit Review Bot

Comment 22 2011-10-08 15:31:50 PDT

Attachment 110283 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Total errors found: 1 in 28 files If any of these errors are false positives, please file a bug against check-webkit-style.

Filip Pizlo

Comment 23 2011-10-09 13:40:49 PDT

Created attachment 110310 [details] the patch - merged with ByVal changes

WebKit Review Bot

Comment 24 2011-10-09 13:43:09 PDT

Attachment 110310 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Total errors found: 1 in 28 files If any of these errors are false positives, please file a bug against check-webkit-style.

Gavin Barraclough

Comment 25 2011-10-10 10:03:04 PDT

Comment on attachment 110310 [details] the patch - merged with ByVal changes View in context: https://bugs.webkit.org/attachment.cgi?id=110310&action=review 1/2 way through the review! > Source/JavaScriptCore/bytecode/PredictedType.cpp:47 > + bool isTop = true; Could we not just test a mask & early return here? > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:34 > +namespace JSC { namespace DFG { We normally only use ENABLE_* definitions in Platform.h (because this is included everywhere, and the way the macro works means a missed include won't result in a compile error). It would be better to make this a regular define & #if FOO check. > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:63 > + PROFILE(17); Not a big issue, but it would probably be better to define these values rather than use magic numbers through the code.

Gavin Barraclough

Comment 26 2011-10-10 10:03:17 PDT

View in context: https://bugs.webkit.org/attachment.cgi?id=110310&action=review 1/2 way through the review! > Source/JavaScriptCore/bytecode/PredictedType.cpp:47 > + bool isTop = true; Could we not just test a mask & early return here? > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:34 > +namespace JSC { namespace DFG { We normally only use ENABLE_* definitions in Platform.h (because this is included everywhere, and the way the macro works means a missed include won't result in a compile error). It would be better to make this a regular define & #if FOO check. > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:63 > + PROFILE(17); Not a big issue, but it would probably be better to define these values rather than use magic numbers through the code.

Filip Pizlo

Comment 27 2011-10-10 11:39:01 PDT

(In reply to comment #26) > View in context: https://bugs.webkit.org/attachment.cgi?id=110310&action=review > > 1/2 way through the review! > > > Source/JavaScriptCore/bytecode/PredictedType.cpp:47 > > + bool isTop = true; > > Could we not just test a mask & early return here? That mask would be ever-changing and we'd have to separately maintain it, as we'd have to construct it manually by saying something like: PredictFinalObject | PredictArray | PredictObjectOther | PredictString | PredictCellOther | PredictInt32 | PredictDouble | PredictBoolean | PredictOther The reason why PredictTop doesn't do the job is that currently a top value will not have reserved bits set because nobody sets them, unless they use PredictTop directly (which happens, but rarely). For example, we leave 0x0004 and 0x0008, and 0x0020 for additional object types. Yet more bits are reserved if we ever wanted to get more specific about PredictOther (currently it means null-or-undefined but there is no separate null prediction or undefined prediction) or numbers (like int-represented-as-double). What I'm trying to do is to avoid a situation where the PredictTop mask would have to be changed to include newly used, previously reserved, bits every time we add support for predicting some new type. Fortunately, it's almost never the case that someone wants to know if a PredictedType is top. It only comes into play in the ToString function. That function already has to have an "if (foo & PredictBar)" statement for every type that it's capable of printing. So it was natural to have an else statement that clears isTop, which gives a more easy to maintain approach to the top madness. > > > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:34 > > +namespace JSC { namespace DFG { > > We normally only use ENABLE_* definitions in Platform.h (because this is included everywhere, and the way the macro works means a missed include won't result in a compile error). > It would be better to make this a regular define & #if FOO check. OK. > > > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:63 > > + PROFILE(17); > > Not a big issue, but it would probably be better to define these values rather than use magic numbers through the code. Will do.

Geoffrey Garen

Comment 28 2011-10-10 12:26:20 PDT

> I know. I don't think I want to do this in this patch, but I think I've got a solution for both this and register allocation: have named speculations (so just as we have an enum of node types, have an enum of speculation types), have attributes on nodes that describe what speculatiuons should be performed, have one phase in Propagator that deduces what those speculatins ought to be, and then while we're at it, have node attributes that specify what additional registers the codegen for a node will required. And viola, we'll be able to do global register allocation and get rid of GenerationInfo entirely. Not sure if this helps, but another possible option is to add explicit nodes for each speculation. We didn't do this originally because it made the graph not reusable for the non-speculative path, but now there is no non-speculative path. > > > + static inline BlockIndex getBytecodeBegin(OwnPtr<BasicBlock>* block) > > > + { > > > + return (*block)->bytecodeBegin; > > > + } RE this vs binarySearch, perhaps the binarySearch template could be modified to pass T& instead of T* to child functions.

Filip Pizlo

Comment 29 2011-10-10 12:31:43 PDT

(In reply to comment #28) > > I know. I don't think I want to do this in this patch, but I think I've got a solution for both this and register allocation: have named speculations (so just as we have an enum of node types, have an enum of speculation types), have attributes on nodes that describe what speculatiuons should be performed, have one phase in Propagator that deduces what those speculatins ought to be, and then while we're at it, have node attributes that specify what additional registers the codegen for a node will required. And viola, we'll be able to do global register allocation and get rid of GenerationInfo entirely. > > Not sure if this helps, but another possible option is to add explicit nodes for each speculation. We didn't do this originally because it made the graph not reusable for the non-speculative path, but now there is no non-speculative path. This would be a performance regression. Our boolean speculations are done as part of testing the boolean, and often without using SpeculateBooleanOperand. And then we have tests for object-or-undefined-or-null, where the speculation checks are embedded in the control flow of the node itself. As the DFG gets more powerful, our nodes will have more sophisticated execution semantics, and our optimizations will have to have more sophisticated modeling of those execution semantics. Better to just live with it. I've seen compiler hackers go down the path of trying to modularize this aspect of compilation and the results are invariably uglier than what we have. You either have to introduce new opcodes for each micro operation, but then you get performance regressions unless you also introduce more basic blocks (we could model the object-or-undefined speculation by blowing each node that does that into a control flow diamond). Or you can start to create a formal specification for your nodes. But that fails because the formal specification becomes more hairy and complicated than the compiler you were going to write in the first place. > > > > > + static inline BlockIndex getBytecodeBegin(OwnPtr<BasicBlock>* block) > > > > + { > > > > + return (*block)->bytecodeBegin; > > > > + } > > RE this vs binarySearch, perhaps the binarySearch template could be modified to pass T& instead of T* to child functions. That would still make the style bot unhappy...

Gavin Barraclough

Comment 30 2011-10-10 15:06:01 PDT

Comment on attachment 110310 [details] the patch - merged with ByVal changes View in context: https://bugs.webkit.org/attachment.cgi?id=110310&action=review couple more comments > Source/JavaScriptCore/dfg/DFGAbstractValue.h:63 > + m_structure = reinterpret_cast<Structure*>(1); The magic value '1' should be named, either behind a static const or an ifdef. > Source/JavaScriptCore/dfg/DFGAbstractValue.h:235 > + if (reinterpret_cast<uintptr_t>(m_structure) <= 1) This should use 'isNeitherClearNorTop'. > Source/JavaScriptCore/dfg/DFGAbstractValue.h:249 > + bool isNeitherClearNorTop() const { return reinterpret_cast<uintptr_t>(m_structure) > 1; } It would probably make the code more understandable to negate this - make this method isClearOrTop(), negate uses.

Filip Pizlo

Comment 31 2011-10-10 15:48:11 PDT

Created attachment 110424 [details] the patch - address Gavin's most recent comments

Filip Pizlo

Comment 32 2011-10-10 16:21:55 PDT

Created attachment 110430 [details] the patch - merged up to ToT

WebKit Review Bot

Comment 33 2011-10-10 16:25:55 PDT

Attachment 110430 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1 Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5] Total errors found: 1 in 28 files If any of these errors are false positives, please file a bug against check-webkit-style.

Gavin Barraclough

Comment 34 2011-10-11 16:45:44 PDT

Comment on attachment 110430 [details] the patch - merged up to ToT View in context: https://bugs.webkit.org/attachment.cgi?id=110430&action=review r+. I still don't think AbstractValue is really descriptive enough, but I don't have a better name to suggest. > Source/JavaScriptCore/dfg/DFGAbstractValue.h:312 > +struct AbstractValue { AbstractValue doesn't seem to be the right name for this class. This struct can tell you a type, not a value - e.g. "int", or "object with properties x, y", not '3' or '"Hello, world"'. This naming is confusing in it's uses, e.g. in GetLocal. > Source/JavaScriptCore/dfg/DFGNode.h:715 > + unsigned takenBytecodeOffsetDuringParsing() This change could probably have been made as a separate patch, breaking up this review a bit would have helped! - probably no point doing so now though. It's a bit unfortunate that the meaning of the branch's opInfos changes over time, it would be better if we could guard against error here. We could consider extra state in debug builds that we can ASSERT on, or we could change reflect this in the node types (have a BranchDuringParse/JumpDuringParse that are updated as the successors are set, and never reach generation).

Filip Pizlo

Comment 35 2011-10-11 19:05:36 PDT

Landed in r97218.

Attachments
work in progress (118.13 KB, patch) 2011-10-07 20:44 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
work in progress - fix style, mostly (117.58 KB, patch) 2011-10-07 20:54 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
work in progress - better performance (120.10 KB, patch) 2011-10-07 23:00 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
work in progress - it's now pure win (133.94 KB, patch) 2011-10-08 00:33 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch (150.64 KB, patch) 2011-10-08 00:45 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch (150.66 KB, patch) 2011-10-08 00:49 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch (162.27 KB, patch) 2011-10-08 13:00 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch - merge with 32_64 changes (162.40 KB, patch) 2011-10-08 13:59 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch - more perf improvements (164.39 KB, patch) 2011-10-08 15:30 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch - merged with ByVal changes (164.91 KB, patch) 2011-10-09 13:40 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch - address Gavin's most recent comments (165.31 KB, patch) 2011-10-10 15:48 PDT, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch - merged up to ToT (165.99 KB, patch) 2011-10-10 16:21 PDT, Filip Pizlo	barraclough: review+	Details Formatted Diff Diff
Show Obsolete (11) View All Add attachment proposed patch, testcase, etc.