Summary: | DFG does not have flow-sensitive intraprocedural control flow analysis | ||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Filip Pizlo <fpizlo> | ||||||||||||||||||||||||||
Component: | JavaScriptCore | Assignee: | Nobody <webkit-unassigned> | ||||||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||||||
Severity: | Normal | CC: | barraclough, fpizlo, ggaren, oliver, webkit.review.bot | ||||||||||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||||||||||
Hardware: | All | ||||||||||||||||||||||||||||
OS: | All | ||||||||||||||||||||||||||||
Attachments: |
|
Description
Filip Pizlo
2011-10-07 20:36:48 PDT
Created attachment 110248 [details]
work in progress
Still working on it. Haven't measured performance yet. Haven't implemented 32_64 support yet.
Attachment 110248 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGAbstractValue.h:119: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5]
Source/JavaScriptCore/dfg/DFGAbstractValue.h:204: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5]
Source/JavaScriptCore/dfg/DFGAbstractValue.h:253: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5]
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.h:95: The parameter name "codeBlock" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.h:95: The parameter name "graph" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.h:106: The parameter name "graph" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.h:111: The parameter name "basicBlock" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.h:134: The parameter name "mergeMode" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.h:147: The parameter name "nodeIndex" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.h:162: The parameter name "graph" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.h:162: The parameter name "basicBlock" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.h:171: The parameter name "nodeIndex" adds no information, so it should be removed. [readability/parameter_name] [5]
Source/JavaScriptCore/dfg/DFGOSREntry.cpp:89: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5]
Total errors found: 14 in 22 files
If any of these errors are false positives, please file a bug against check-webkit-style.
This looks like it still needs work, but even now, it's a speed-up on V8. Benchmark report for V8. VMs tested: "TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc "CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree CFA crypto 71.8876+-0.2979 ! 73.8087+-0.2691 ! definitely 1.0267x slower deltablue 226.0413+-2.3269 ? 226.1569+-2.0996 ? earley-boyer 90.5814+-0.4154 ! 91.8870+-0.3324 ! definitely 1.0144x slower raytrace 58.4061+-0.3131 ! 59.3055+-0.3795 ! definitely 1.0154x slower regexp 102.7642+-0.2884 ? 103.4826+-0.4524 ? richards 204.4922+-0.8884 ^ 175.1322+-0.3993 ^ definitely 1.1676x faster splay 95.6563+-0.7585 ? 96.4660+-0.2593 ? <arithmetic> 121.4042+-0.3672 ^ 118.0341+-0.2739 ^ definitely 1.0286x faster <geometric> * 108.1240+-0.2137 ^ 106.8489+-0.1572 ^ definitely 1.0119x faster <harmonic> 97.7987+-0.1951 ? 97.9595+-0.1685 ? Created attachment 110250 [details]
work in progress - fix style, mostly
Attachment 110250 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Total errors found: 1 in 22 files
If any of these errors are false positives, please file a bug against check-webkit-style.
> crypto 71.8876+-0.2979 ! 73.8087+-0.2691 ! definitely 1.0267x slower
This 2% appears to be entirely due to the CFA itself! The CFA code needs to be tuned up. I'll play around with making it faster.
Created attachment 110259 [details]
work in progress - better performance
I did two things:
- Fixed OSR entry. If a variable is dead then the abstract value should be TOP. Internally, the abstract state machine treats dead variables as BOTTOM. That's necessary for making the analysis precise.
- Changed the abstract structure value representation. Previously it was basically Vector<Structure*>. Now it's just Structure*, where sets larger than 1 are turned to TOP. This works great, since most structure sets only contain one element anyway.
Benchmark report for SunSpider, V8, and Kraken.
VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc
Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.
TipOfTree CFA
SunSpider:
3d-cube 7.3400+-0.2059 7.2108+-0.1398 might be 1.0179x faster
3d-morph 7.5851+-0.1400 ? 7.6501+-0.1387 ?
3d-raytrace 7.5752+-0.1830 7.5150+-0.1734
access-binary-trees 1.7129+-0.0337 ? 1.7169+-0.0476 ?
access-fannkuch 6.2979+-0.0958 ? 6.5286+-0.1364 ? might be 1.0366x slower
access-nbody 3.4297+-0.0839 ? 3.5317+-0.0830 ? might be 1.0297x slower
access-nsieve 2.5318+-0.0516 ? 2.6022+-0.0786 ? might be 1.0278x slower
bitops-3bit-bits-in-byte 1.6817+-0.0191 ! 1.7452+-0.0250 ! definitely 1.0378x slower
bitops-bits-in-byte 2.7081+-0.0537 ! 2.9258+-0.1245 ! definitely 1.0804x slower
bitops-bitwise-and 3.4372+-0.2532 3.3606+-0.0894 might be 1.0228x faster
bitops-nsieve-bits 5.4391+-0.0644 5.3109+-0.1139 might be 1.0241x faster
controlflow-recursive 2.0508+-0.0375 2.0316+-0.0499
crypto-aes 6.5082+-0.1513 ? 6.7630+-0.1750 ? might be 1.0392x slower
crypto-md5 2.7830+-0.0851 ? 2.8322+-0.0706 ? might be 1.0177x slower
crypto-sha1 2.4498+-0.0714 ? 2.5556+-0.0867 ? might be 1.0432x slower
date-format-tofte 9.8098+-0.2208 ? 9.9895+-0.1587 ? might be 1.0183x slower
date-format-xparb 9.4492+-0.1745 ! 9.8804+-0.2001 ! definitely 1.0456x slower
math-cordic 6.3737+-0.1416 6.3125+-0.0656
math-partial-sums 7.5554+-0.1169 7.5150+-0.1276
math-spectral-norm 2.7905+-0.0655 ? 2.8601+-0.0784 ? might be 1.0250x slower
regexp-dna 10.6186+-0.1482 10.5608+-0.1658
string-base64 5.1063+-0.0828 ? 5.1827+-0.0692 ? might be 1.0150x slower
string-fasta 6.3348+-0.1140 ? 6.4154+-0.1134 ? might be 1.0127x slower
string-tagcloud 11.1755+-0.2087 ? 11.2236+-0.2715 ?
string-unpack-code 21.0133+-0.2257 ! 21.6342+-0.2337 ! definitely 1.0295x slower
string-validate-input 6.2811+-0.1612 6.2791+-0.1612
<arithmetic> * 6.1553+-0.0194 ! 6.2359+-0.0193 ! definitely 1.0131x slower
<geometric> 5.0469+-0.0176 ! 5.1180+-0.0158 ! definitely 1.0141x slower
<harmonic> 4.1371+-0.0257 ! 4.2074+-0.0236 ! definitely 1.0170x slower
TipOfTree CFA
V8:
crypto 71.6115+-0.3106 ! 72.6151+-0.5762 ! definitely 1.0140x slower
deltablue 227.0336+-3.5653 223.0970+-2.1287 might be 1.0176x faster
earley-boyer 90.4309+-0.3425 ? 90.9446+-0.2832 ?
raytrace 57.5184+-0.4610 57.4722+-0.2783
regexp 102.7896+-0.3338 ? 102.8251+-0.5877 ?
richards 205.1041+-1.1592 ^ 175.0092+-0.6119 ^ definitely 1.1720x faster
splay 95.5634+-0.6248 ? 95.6634+-0.7130 ?
<arithmetic> 121.4359+-0.5050 ^ 116.8038+-0.3613 ^ definitely 1.0397x faster
<geometric> * 107.9023+-0.2225 ^ 105.5267+-0.2425 ^ definitely 1.0225x faster
<harmonic> 97.3732+-0.1490 ^ 96.4890+-0.1988 ^ definitely 1.0092x faster
TipOfTree CFA
Kraken:
ai-astar 493.1064+-3.9254 492.3566+-3.0941
audio-beat-detection 191.3132+-0.6433 ? 191.4514+-0.7850 ?
audio-dft 265.4996+-2.7456 ^ 260.4941+-1.8597 ^ definitely 1.0192x faster
audio-fft 125.3307+-0.5836 ? 126.1825+-1.9849 ?
audio-oscillator 251.3514+-1.9030 249.7558+-2.0144
imaging-darkroom 410.4260+-0.8666 ! 413.6212+-1.1919 ! definitely 1.0078x slower
imaging-desaturate 230.1417+-0.5040 ^ 216.3676+-0.4729 ^ definitely 1.0637x faster
imaging-gaussian-blur 578.7784+-2.2247 ^ 568.6758+-2.3888 ^ definitely 1.0178x faster
json-parse-financial 54.4313+-0.2379 ? 54.5051+-0.2106 ?
json-stringify-tinderbox 69.6600+-0.8972 ^ 68.1390+-0.2398 ^ definitely 1.0223x faster
stanford-crypto-aes 131.3154+-2.2640 130.2981+-1.6271
stanford-crypto-ccm 99.8329+-0.9814 ! 101.7389+-0.8432 ! definitely 1.0191x slower
stanford-crypto-pbkdf2 187.8078+-0.7526 ! 191.5040+-1.2973 ! definitely 1.0197x slower
stanford-crypto-sha256-iterative 69.6199+-0.2190 ? 69.9535+-0.3764 ?
<arithmetic> * 225.6153+-0.5117 ^ 223.9317+-0.4806 ^ definitely 1.0075x faster
<geometric> 176.2934+-0.4303 ^ 175.3365+-0.4004 ^ definitely 1.0055x faster
<harmonic> 137.8627+-0.3810 137.4240+-0.3734
TipOfTree CFA
All benchmarks:
<arithmetic> 88.6958+-0.1882 ^ 87.5490+-0.1577 ^ definitely 1.0131x faster
<geometric> 22.9502+-0.0535 ? 23.0146+-0.0492 ?
<harmonic> 7.2778+-0.0442 ! 7.3970+-0.0406 ! definitely 1.0164x slower
TipOfTree CFA
Geomean of preferred means:
<scaled-result> 53.1147+-0.0841 ^ 52.8190+-0.0781 ^ definitely 1.0056x faster
Attachment 110259 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGAbstractValue.h:96: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5]
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Source/JavaScriptCore/dfg/DFGAbstractState.cpp:48: Should have a space between // and comment [whitespace/comments] [4]
Source/JavaScriptCore/dfg/DFGAbstractState.cpp:65: Should have a space between // and comment [whitespace/comments] [4]
Source/JavaScriptCore/dfg/DFGAbstractState.cpp:82: Should have a space between // and comment [whitespace/comments] [4]
Source/JavaScriptCore/dfg/DFGAbstractState.cpp:123: Should have a space between // and comment [whitespace/comments] [4]
Total errors found: 6 in 22 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 110263 [details]
work in progress - it's now pure win
This is now pure win. This required a number of changes:
- CFA merging requires interrogating the successors of basic blocks. This was previously a binary search. Now it's not.
- The CFA is now pseudo-worklist based. It visits basic blocks in bytecode order (which is almost program order, and almost topological order), leading to a high probability that a block will not be visited unless its predecessors have been visited. But it does not visit a block if since the last time it was visited nothing has happened that would change the block's outcome.
- Bunch of little tweaks to reduce the costs of various CFA operations.
Benchmark report for SunSpider, V8, and Kraken.
VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc
Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.
TipOfTree CFA
SunSpider:
3d-cube 7.1877+-0.1640 ? 7.2292+-0.1642 ?
3d-morph 7.5879+-0.1785 7.5121+-0.1356 might be 1.0101x faster
3d-raytrace 7.4148+-0.2026 7.3754+-0.1707
access-binary-trees 1.7784+-0.0808 1.6926+-0.0562 might be 1.0507x faster
access-fannkuch 6.3277+-0.1237 6.3013+-0.1268
access-nbody 3.5172+-0.0884 3.4572+-0.0941 might be 1.0174x faster
access-nsieve 2.5999+-0.0791 2.5825+-0.0877
bitops-3bit-bits-in-byte 1.7346+-0.0357 1.7115+-0.0327 might be 1.0135x faster
bitops-bits-in-byte 2.7539+-0.0720 2.7514+-0.0510
bitops-bitwise-and 3.3173+-0.0725 ? 3.3754+-0.1000 ? might be 1.0175x slower
bitops-nsieve-bits 5.3589+-0.1009 ? 5.3927+-0.1290 ?
controlflow-recursive 2.0357+-0.0392 ? 2.0723+-0.0671 ? might be 1.0180x slower
crypto-aes 6.5654+-0.1839 ? 6.6289+-0.1744 ?
crypto-md5 2.7721+-0.0633 ? 2.8039+-0.0589 ? might be 1.0114x slower
crypto-sha1 2.4790+-0.1072 ? 2.4911+-0.0745 ?
date-format-tofte 9.8535+-0.2432 ? 9.9580+-0.2362 ? might be 1.0106x slower
date-format-xparb 9.4830+-0.1867 ! 10.0138+-0.2298 ! definitely 1.0560x slower
math-cordic 6.3866+-0.1093 ? 6.5029+-0.1173 ? might be 1.0182x slower
math-partial-sums 7.6234+-0.1297 7.5099+-0.1406 might be 1.0151x faster
math-spectral-norm 2.8589+-0.0701 2.8198+-0.0660 might be 1.0139x faster
regexp-dna 10.5651+-0.2565 10.4409+-0.1349 might be 1.0119x faster
string-base64 5.0891+-0.1039 ? 5.2070+-0.1078 ? might be 1.0232x slower
string-fasta 6.4453+-0.1413 6.3143+-0.0977 might be 1.0207x faster
string-tagcloud 11.0358+-0.2692 ? 11.1644+-0.1814 ? might be 1.0117x slower
string-unpack-code 21.0071+-0.2760 ? 21.4615+-0.5199 ? might be 1.0216x slower
string-validate-input 6.2340+-0.1813 ? 6.3609+-0.1603 ? might be 1.0204x slower
<arithmetic> * 6.1543+-0.0155 ? 6.1973+-0.0322 ?
<geometric> 5.0642+-0.0192 ? 5.0771+-0.0246 ?
<harmonic> 4.1736+-0.0342 4.1643+-0.0360
TipOfTree CFA
V8:
crypto 71.6260+-0.3327 ? 71.8717+-0.3132 ?
deltablue 226.0467+-1.9042 ^ 221.1798+-1.0192 ^ definitely 1.0220x faster
earley-boyer 89.8473+-0.2783 ! 90.9236+-0.3194 ! definitely 1.0120x slower
raytrace 57.7508+-0.2079 ? 58.0618+-0.2847 ?
regexp 102.4067+-0.2936 ? 103.1329+-0.4744 ?
richards 204.6559+-1.0139 ^ 175.9648+-1.3114 ^ definitely 1.1631x faster
splay 96.0001+-0.5870 95.7168+-0.2982
<arithmetic> 121.1905+-0.2260 ^ 116.6931+-0.2717 ^ definitely 1.0385x faster
<geometric> * 107.7836+-0.1350 ^ 105.5294+-0.1963 ^ definitely 1.0214x faster
<harmonic> 97.3547+-0.1511 ^ 96.5707+-0.1726 ^ definitely 1.0081x faster
TipOfTree CFA
Kraken:
ai-astar 491.8601+-2.3376 489.5730+-2.2160
audio-beat-detection 190.6516+-0.7819 ! 193.5275+-1.1763 ! definitely 1.0151x slower
audio-dft 268.0732+-3.6325 ^ 259.3428+-1.9135 ^ definitely 1.0337x faster
audio-fft 124.9162+-0.4877 124.7628+-0.2686
audio-oscillator 252.5326+-2.3092 249.5812+-1.6099 might be 1.0118x faster
imaging-darkroom 411.7627+-1.5812 ? 412.7726+-0.8839 ?
imaging-desaturate 229.9889+-0.7966 ^ 216.8373+-0.8930 ^ definitely 1.0607x faster
imaging-gaussian-blur 580.7792+-2.4572 ^ 565.6115+-1.0252 ^ definitely 1.0268x faster
json-parse-financial 55.2379+-0.3316 55.1255+-0.5643
json-stringify-tinderbox 68.1584+-0.4261 67.6238+-0.1915
stanford-crypto-aes 130.3048+-1.2905 ? 130.4318+-1.5252 ?
stanford-crypto-ccm 99.8189+-0.8939 ! 102.0182+-0.9102 ! definitely 1.0220x slower
stanford-crypto-pbkdf2 187.7075+-0.8696 ! 190.3954+-1.0340 ! definitely 1.0143x slower
stanford-crypto-sha256-iterative 70.1452+-0.4680 ? 70.3815+-0.4656 ?
<arithmetic> * 225.8527+-0.5194 ^ 223.4275+-0.4581 ^ definitely 1.0109x faster
<geometric> 176.3357+-0.4920 ^ 175.2288+-0.4434 ^ definitely 1.0063x faster
<harmonic> 137.8869+-0.4240 137.5821+-0.4486
TipOfTree CFA
All benchmarks:
<arithmetic> 88.7294+-0.1670 ^ 87.3610+-0.1625 ^ definitely 1.0157x faster
<geometric> 22.9916+-0.0565 22.9087+-0.0767
<harmonic> 7.3403+-0.0588 7.3234+-0.0619
TipOfTree CFA
Geomean of preferred means:
<scaled-result> 53.1110+-0.0676 ^ 52.6706+-0.1069 ^ definitely 1.0084x faster
Attachment 110263 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGAbstractValue.h:96: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5]
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Total errors found: 2 in 26 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 110264 [details]
the patch
Got it to work with 32_64.
Attachment 110264 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/ChangeLog:3: Line contains tab character. [whitespace/tab] [5]
Source/JavaScriptCore/ChangeLog:4: Line contains tab character. [whitespace/tab] [5]
Source/JavaScriptCore/ChangeLog:8: Line contains tab character. [whitespace/tab] [5]
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Total errors found: 4 in 26 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 110265 [details]
the patch
Oh, emacs, you and your tabs, which you like to put into ChangeLogs regardless of how many times I tell you not to.
Attachment 110265 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Total errors found: 1 in 26 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 110265 [details] the patch View in context: https://bugs.webkit.org/attachment.cgi?id=110265&action=review Should use a typedef for bool (*predictionCheck)(PredictedType), getBytecodeBegin shouldn't be taking an OwnPtr*, but it looks like that has just been moved from one place to another. Other than those style quirks I can't see any obvious problems. I am slightly concerned with the potential maintenance problems caused by separating type flow logic from codegen, especially given the vast amount of code duplication we currently have :-( I do think Gavin should also look at this though. > Source/JavaScriptCore/dfg/DFGBasicBlock.h:59 > + static inline BlockIndex getBytecodeBegin(OwnPtr<BasicBlock>* block) > + { > + return (*block)->bytecodeBegin; > + } This seems wrong, there's no transfer of ownership intended here, you should just be passing BasicBlock* > Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:350 > +void SpeculativeJIT::compileObjectEquality(Node& node, void* vptr, bool (*predictionCheck)(PredictedType)) Would be nice to have a typedef for bool (*predictionCheck)(PredictedType) (In reply to comment #15) > (From update of attachment 110265 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=110265&action=review > > Should use a typedef for bool (*predictionCheck)(PredictedType), getBytecodeBegin shouldn't be taking an OwnPtr*, but it looks like that has just been moved from one place to another. > > Other than those style quirks I can't see any obvious problems. > > I am slightly concerned with the potential maintenance problems caused by separating type flow logic from codegen, especially given the vast amount of code duplication we currently have :-( I know. I don't think I want to do this in this patch, but I think I've got a solution for both this and register allocation: have named speculations (so just as we have an enum of node types, have an enum of speculation types), have attributes on nodes that describe what speculatiuons should be performed, have one phase in Propagator that deduces what those speculatins ought to be, and then while we're at it, have node attributes that specify what additional registers the codegen for a node will required. And viola, we'll be able to do global register allocation and get rid of GenerationInfo entirely. > > I do think Gavin should also look at this though. > > > Source/JavaScriptCore/dfg/DFGBasicBlock.h:59 > > + static inline BlockIndex getBytecodeBegin(OwnPtr<BasicBlock>* block) > > + { > > + return (*block)->bytecodeBegin; > > + } > > This seems wrong, there's no transfer of ownership intended here, you should just be passing BasicBlock* I think that this code is exactly as it should be. This is one of those gross stubs that we need for doing binary searches over Vector<OwnPtr<BasicBlock>>. The WTF::binarySearch() function takes a function pointer for getting the search key; for a vector of the form Vector<T> it takes a function of the form U (*)(T*), where U is the search key. For Vector<OwnPtr<BasicBlock>> we have T = OwnPtr<BasicBlock>, so the function must be of the form U (*)(OwnPtr<BasicBlock>*). The only way to change this is to make Vector<OwnPtr<BasicBlock>> be something else, but I quite like the idea of a vector of pointers for basic blocks, since there are few basic blocks and the basic block data structure is rather large. So while this looks yucky, and you'd probably never want to call it directly, it's necessary to support binary searches of basic blocks by their bytecode index. > > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:350 > > +void SpeculativeJIT::compileObjectEquality(Node& node, void* vptr, bool (*predictionCheck)(PredictedType)) > > Would be nice to have a typedef for bool (*predictionCheck)(PredictedType) Agree, will add. Created attachment 110277 [details]
the patch
Added a typedef for bool (*)(PredictedType). Removed bytecode/ActionablePrediction.h because it's unused. Updated some build files (though most of them make no reference to either DFG, or to any headers, so did not need to change).
Attachment 110277 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Total errors found: 1 in 28 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 110281 [details]
the patch - merge with 32_64 changes
Attachment 110281 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Total errors found: 1 in 28 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 110283 [details]
the patch - more perf improvements
- Did more tuning to the CFA itself. It now calls clobberStructures() less frequently. clobberStructures() is an expensive, but necessary, method. Likely any future work on improving CFA efficiency will focus on that method - how it's called, and what it does internally.
- Added better debugging support. Every graph dump will now print the CFA state in a concise and clear way.
- Fixed a bug where clobberStructures() was not being called for possibly-side-effecting operations like LogicalNot on questionable objects, and CompareXYZ on questionable objects.
- Did some investigation into what is going on in SunSpider. It turns out that date-format-xparb, among others, are still failing speculation like crazy. A lot of its is due to GetByVal not having a string property case. The exponential backoff in recompilation is doing its best, but since the benchmark is so short-running, any increase in DFG runtime will affect such short-running benchmarks that fail speculation systematically.
- Ran performance numbers on a separate machine, to find that the speed-up is smaller on my Mac Pro than it is on my MacBook Pro. But it's a speed-up on both machines.
Current performance situation:
- Overall 0.5% speed-up in geomean of preferred means.
- Slight reproducible sub-1% slow-down on SunSpider
- Either a 2.4% or a 0.5% win on V8 depending on machine.
- 1% win on Kraken.
Recommendation: we should land this, since: it's already a net win, the losses on SunSpider are due to preexisting conditions for which we are currently uninsured, and this should only become a bigger win as we add more machinery to take advantage of it. (Like, we're not doing anything to take advantage of the always-double propagation that this thing is already doing, and I haven't yet implemented the is-constant propagation, which should be a win on function checks among other things.) Not to mention that classically, CFA is most profitable when combined with inlining.
Latest numbers on MacBook Pro:
Benchmark report for SunSpider, V8, and Kraken.
VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc
Collected 30 samples per benchmark/VM, with 10 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.
TipOfTree CFA
SunSpider:
3d-cube 7.2512+-0.1027 7.2506+-0.1160
3d-morph 7.6656+-0.0813 ? 7.7124+-0.0978 ?
3d-raytrace 7.3843+-0.0984 ? 7.4021+-0.0967 ?
access-binary-trees 1.7302+-0.0287 1.6781+-0.0239 might be 1.0310x faster
access-fannkuch 6.3504+-0.0672 6.3159+-0.0640
access-nbody 3.4521+-0.0469 ? 3.4932+-0.0531 ? might be 1.0119x slower
access-nsieve 2.5600+-0.0356 ? 2.5885+-0.0470 ? might be 1.0111x slower
bitops-3bit-bits-in-byte 1.7154+-0.0262 ? 1.7434+-0.0319 ? might be 1.0163x slower
bitops-bits-in-byte 2.7388+-0.0391 ? 2.7447+-0.0359 ?
bitops-bitwise-and 3.3050+-0.0546 ? 3.3606+-0.0589 ? might be 1.0168x slower
bitops-nsieve-bits 5.4385+-0.0653 5.4370+-0.0711
controlflow-recursive 2.0667+-0.0212 ? 2.0812+-0.0208 ?
crypto-aes 6.6382+-0.1104 ? 6.6709+-0.1017 ?
crypto-md5 2.8355+-0.0466 2.8343+-0.0453
crypto-sha1 2.4700+-0.0373 ? 2.5155+-0.0443 ? might be 1.0184x slower
date-format-tofte 9.9355+-0.0996 ? 9.9652+-0.1371 ?
date-format-xparb 9.5529+-0.1560 ! 9.9465+-0.1131 ! definitely 1.0412x slower
math-cordic 6.2958+-0.0680 ? 6.3863+-0.0523 ? might be 1.0144x slower
math-partial-sums 7.5041+-0.0695 ? 7.5846+-0.0878 ? might be 1.0107x slower
math-spectral-norm 2.8198+-0.0398 ? 2.8726+-0.0465 ? might be 1.0187x slower
regexp-dna 10.6842+-0.0893 10.6180+-0.1092
string-base64 5.2352+-0.0858 ? 5.2951+-0.0554 ? might be 1.0114x slower
string-fasta 6.3476+-0.0765 ? 6.4334+-0.0947 ? might be 1.0135x slower
string-tagcloud 11.0721+-0.1259 ? 11.1451+-0.1129 ?
string-unpack-code 20.9296+-0.2131 ? 21.3736+-0.3461 ? might be 1.0212x slower
string-validate-input 6.3714+-0.1157 6.3281+-0.0869
<arithmetic> * 6.1673+-0.0147 ! 6.2222+-0.0173 ! definitely 1.0089x slower
<geometric> 5.0674+-0.0157 ! 5.1036+-0.0164 ! definitely 1.0072x slower
<harmonic> 4.1647+-0.0213 ? 4.1880+-0.0207 ?
TipOfTree CFA
V8:
crypto 71.9306+-0.2217 71.6923+-0.2342
deltablue 224.0349+-1.1196 ^ 220.8591+-0.8745 ^ definitely 1.0144x faster
earley-boyer 90.8330+-0.2471 ? 91.0550+-0.6016 ?
raytrace 58.6492+-0.1942 ^ 57.5860+-0.1636 ^ definitely 1.0185x faster
regexp 103.2842+-0.5738 ? 103.6217+-0.3030 ?
richards 204.7647+-0.4575 ^ 178.8069+-0.4249 ^ definitely 1.1452x faster
splay 93.9206+-0.3154 93.8991+-0.3847
<arithmetic> 121.0596+-0.1853 ^ 116.7886+-0.1891 ^ definitely 1.0366x faster
<geometric> * 107.9179+-0.1498 ^ 105.3890+-0.1611 ^ definitely 1.0240x faster
<harmonic> 97.7053+-0.1498 ^ 96.2554+-0.1560 ^ definitely 1.0151x faster
TipOfTree CFA
Kraken:
ai-astar 491.8722+-1.8657 491.6102+-2.1346
audio-beat-detection 191.3896+-0.6444 ! 193.1956+-0.5294 ! definitely 1.0094x slower
audio-dft 265.5784+-1.5956 ^ 261.3193+-2.1896 ^ definitely 1.0163x faster
audio-fft 124.9334+-0.1853 ? 125.4338+-0.4450 ?
audio-oscillator 251.5153+-1.0876 250.4084+-1.0322
imaging-darkroom 412.2875+-0.8770 ? 412.7251+-0.8786 ?
imaging-desaturate 229.8157+-0.4044 ^ 216.2189+-0.3764 ^ definitely 1.0629x faster
imaging-gaussian-blur 579.9277+-0.8575 ^ 567.2564+-0.9921 ^ definitely 1.0223x faster
json-parse-financial 53.8428+-0.2035 ? 54.0464+-0.2093 ?
json-stringify-tinderbox 67.6931+-0.2539 ? 67.7358+-0.2734 ?
stanford-crypto-aes 130.2188+-0.9518 129.4420+-0.8691
stanford-crypto-ccm 99.4847+-0.3260 ! 101.1548+-0.4164 ! definitely 1.0168x slower
stanford-crypto-pbkdf2 187.6499+-0.8004 ? 188.8759+-1.0071 ?
stanford-crypto-sha256-iterative 69.9878+-0.1863 69.6543+-0.1668
<arithmetic> * 225.4426+-0.3118 ^ 223.5055+-0.3132 ^ definitely 1.0087x faster
<geometric> 175.7162+-0.2682 ^ 174.8010+-0.2755 ^ definitely 1.0052x faster
<harmonic> 136.9772+-0.2368 136.7825+-0.2282
TipOfTree CFA
All benchmarks:
<arithmetic> 88.5950+-0.1068 ^ 87.4122+-0.1004 ^ definitely 1.0135x faster
<geometric> 22.9797+-0.0474 22.9534+-0.0477
<harmonic> 7.3245+-0.0367 ? 7.3629+-0.0354 ?
TipOfTree CFA
Geomean of preferred means:
<scaled-result> 53.1381+-0.0625 ^ 52.7237+-0.0651 ^ definitely 1.0079x faster
Latest numbers on Mac Pro:
Benchmark report for SunSpider, V8, and Kraken.
VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/jsc
"CFA" at /Volumes/Data/fromMiniMe/OpenSource/WebKitBuild/Release/jsc
Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.
TipOfTree CFA
SunSpider:
3d-cube 7.9004+-0.0265 ! 7.9902+-0.0335 ! definitely 1.0114x slower
3d-morph 8.4615+-0.1076 ? 8.4887+-0.1116 ?
3d-raytrace 8.0821+-0.1095 8.0724+-0.0618
access-binary-trees 1.7927+-0.0033 1.7806+-0.0161
access-fannkuch 7.6813+-0.0148 ! 7.9859+-0.0254 ! definitely 1.0397x slower
access-nbody 4.1641+-0.0068 ! 4.1778+-0.0037 ! definitely 1.0033x slower
access-nsieve 3.1422+-0.0032 ! 3.1584+-0.0066 ! definitely 1.0052x slower
bitops-3bit-bits-in-byte 1.7320+-0.0103 ? 1.7441+-0.0089 ?
bitops-bits-in-byte 5.2152+-0.0453 ? 5.2206+-0.0533 ?
bitops-bitwise-and 3.4118+-0.0640 3.3993+-0.0644
bitops-nsieve-bits 5.6763+-0.0392 5.6438+-0.0351
controlflow-recursive 2.2564+-0.0244 2.2509+-0.0028
crypto-aes 6.9132+-0.0689 ! 7.0664+-0.0630 ! definitely 1.0222x slower
crypto-md5 3.0181+-0.0343 ? 3.0585+-0.0331 ? might be 1.0134x slower
crypto-sha1 2.7090+-0.0109 ! 2.7557+-0.0108 ! definitely 1.0172x slower
date-format-tofte 10.6874+-0.0850 ? 10.7886+-0.1017 ?
date-format-xparb 10.7804+-0.2653 ? 11.2082+-0.1793 ? might be 1.0397x slower
math-cordic 7.2695+-0.0506 7.2238+-0.0219
math-partial-sums 10.4781+-0.0501 10.4767+-0.0339
math-spectral-norm 3.2288+-0.0168 ! 3.2505+-0.0036 ! definitely 1.0067x slower
regexp-dna 12.4863+-0.1218 ? 12.5624+-0.1253 ?
string-base64 5.4727+-0.0648 ? 5.5304+-0.0697 ? might be 1.0105x slower
string-fasta 7.3632+-0.0576 7.3506+-0.0232
string-tagcloud 12.8958+-0.0586 12.8081+-0.0688
string-unpack-code 23.9726+-0.1714 ? 24.2456+-0.1831 ? might be 1.0114x slower
string-validate-input 6.8011+-0.1062 6.7479+-0.0643
<arithmetic> * 7.0612+-0.0244 ! 7.1149+-0.0199 ! definitely 1.0076x slower
<geometric> 5.7812+-0.0155 ! 5.8176+-0.0134 ! definitely 1.0063x slower
<harmonic> 4.6961+-0.0111 ? 4.7189+-0.0120 ?
TipOfTree CFA
V8:
crypto 80.3328+-0.2637 79.9557+-0.1580
deltablue 252.4219+-1.5655 ? 253.0487+-1.3899 ?
earley-boyer 110.3867+-0.3100 ! 111.4466+-0.2599 ! definitely 1.0096x slower
raytrace 66.3575+-0.3105 ^ 64.6593+-0.1899 ^ definitely 1.0263x faster
regexp 123.6966+-0.5882 123.6552+-0.5325
richards 221.5944+-0.9748 ^ 215.4157+-1.0238 ^ definitely 1.0287x faster
splay 122.9378+-0.7662 ? 124.5312+-0.8843 ? might be 1.0130x slower
<arithmetic> 139.6754+-0.4149 138.9589+-0.3738
<geometric> * 125.8663+-0.2769 ^ 125.2510+-0.2812 ^ definitely 1.0049x faster
<harmonic> 114.2437+-0.1976 ^ 113.5287+-0.2205 ^ definitely 1.0063x faster
TipOfTree CFA
Kraken:
ai-astar 832.8773+-0.4054 825.4636+-11.7787
audio-beat-detection 214.1596+-1.7669 213.3784+-0.9437
audio-dft 271.3054+-4.2497 ^ 260.8641+-3.0615 ^ definitely 1.0400x faster
audio-fft 135.4391+-0.1603 ! 139.2097+-1.1166 ! definitely 1.0278x slower
audio-oscillator 293.9338+-3.0579 292.1754+-1.9142
imaging-darkroom 486.8431+-4.6328 484.2544+-3.7257
imaging-desaturate 244.8629+-0.2524 ^ 237.8543+-0.0918 ^ definitely 1.0295x faster
imaging-gaussian-blur 641.8301+-0.2648 ^ 611.2323+-0.2040 ^ definitely 1.0501x faster
json-parse-financial 68.4385+-0.1365 ? 68.6702+-0.2375 ?
json-stringify-tinderbox 79.7739+-0.3035 ? 80.1401+-0.2522 ?
stanford-crypto-aes 152.6437+-1.2898 151.5830+-1.4340
stanford-crypto-ccm 116.1246+-0.7775 ? 117.1616+-0.5738 ?
stanford-crypto-pbkdf2 235.4855+-1.7872 232.6827+-1.6888 might be 1.0120x faster
stanford-crypto-sha256-iterative 85.6205+-0.2548 ? 86.0403+-0.3846 ?
<arithmetic> * 275.6670+-0.6405 ^ 271.4793+-0.9585 ^ definitely 1.0154x faster
<geometric> 208.1917+-0.4363 ^ 206.5505+-0.5262 ^ definitely 1.0079x faster
<harmonic> 162.2625+-0.2764 162.1470+-0.4291
TipOfTree CFA
All benchmarks:
<arithmetic> 106.8225+-0.2224 ^ 105.4981+-0.2833 ^ definitely 1.0126x faster
<geometric> 26.6011+-0.0546 ? 26.6112+-0.0438 ?
<harmonic> 8.2687+-0.0192 ? 8.3073+-0.0206 ?
TipOfTree CFA
Geomean of preferred means:
<scaled-result> 62.5735+-0.1251 ^ 62.3100+-0.0910 ^ definitely 1.0042x faster
Attachment 110283 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Total errors found: 1 in 28 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 110310 [details]
the patch - merged with ByVal changes
Attachment 110310 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Total errors found: 1 in 28 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 110310 [details] the patch - merged with ByVal changes View in context: https://bugs.webkit.org/attachment.cgi?id=110310&action=review 1/2 way through the review! > Source/JavaScriptCore/bytecode/PredictedType.cpp:47 > + bool isTop = true; Could we not just test a mask & early return here? > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:34 > +namespace JSC { namespace DFG { We normally only use ENABLE_* definitions in Platform.h (because this is included everywhere, and the way the macro works means a missed include won't result in a compile error). It would be better to make this a regular define & #if FOO check. > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:63 > + PROFILE(17); Not a big issue, but it would probably be better to define these values rather than use magic numbers through the code. View in context: https://bugs.webkit.org/attachment.cgi?id=110310&action=review 1/2 way through the review! > Source/JavaScriptCore/bytecode/PredictedType.cpp:47 > + bool isTop = true; Could we not just test a mask & early return here? > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:34 > +namespace JSC { namespace DFG { We normally only use ENABLE_* definitions in Platform.h (because this is included everywhere, and the way the macro works means a missed include won't result in a compile error). It would be better to make this a regular define & #if FOO check. > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:63 > + PROFILE(17); Not a big issue, but it would probably be better to define these values rather than use magic numbers through the code. (In reply to comment #26) > View in context: https://bugs.webkit.org/attachment.cgi?id=110310&action=review > > 1/2 way through the review! > > > Source/JavaScriptCore/bytecode/PredictedType.cpp:47 > > + bool isTop = true; > > Could we not just test a mask & early return here? That mask would be ever-changing and we'd have to separately maintain it, as we'd have to construct it manually by saying something like: PredictFinalObject | PredictArray | PredictObjectOther | PredictString | PredictCellOther | PredictInt32 | PredictDouble | PredictBoolean | PredictOther The reason why PredictTop doesn't do the job is that currently a top value will not have reserved bits set because nobody sets them, unless they use PredictTop directly (which happens, but rarely). For example, we leave 0x0004 and 0x0008, and 0x0020 for additional object types. Yet more bits are reserved if we ever wanted to get more specific about PredictOther (currently it means null-or-undefined but there is no separate null prediction or undefined prediction) or numbers (like int-represented-as-double). What I'm trying to do is to avoid a situation where the PredictTop mask would have to be changed to include newly used, previously reserved, bits every time we add support for predicting some new type. Fortunately, it's almost never the case that someone wants to know if a PredictedType is top. It only comes into play in the ToString function. That function already has to have an "if (foo & PredictBar)" statement for every type that it's capable of printing. So it was natural to have an else statement that clears isTop, which gives a more easy to maintain approach to the top madness. > > > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:34 > > +namespace JSC { namespace DFG { > > We normally only use ENABLE_* definitions in Platform.h (because this is included everywhere, and the way the macro works means a missed include won't result in a compile error). > It would be better to make this a regular define & #if FOO check. OK. > > > Source/JavaScriptCore/dfg/DFGAbstractState.cpp:63 > > + PROFILE(17); > > Not a big issue, but it would probably be better to define these values rather than use magic numbers through the code. Will do. > I know. I don't think I want to do this in this patch, but I think I've got a solution for both this and register allocation: have named speculations (so just as we have an enum of node types, have an enum of speculation types), have attributes on nodes that describe what speculatiuons should be performed, have one phase in Propagator that deduces what those speculatins ought to be, and then while we're at it, have node attributes that specify what additional registers the codegen for a node will required. And viola, we'll be able to do global register allocation and get rid of GenerationInfo entirely. Not sure if this helps, but another possible option is to add explicit nodes for each speculation. We didn't do this originally because it made the graph not reusable for the non-speculative path, but now there is no non-speculative path. > > > + static inline BlockIndex getBytecodeBegin(OwnPtr<BasicBlock>* block) > > > + { > > > + return (*block)->bytecodeBegin; > > > + } RE this vs binarySearch, perhaps the binarySearch template could be modified to pass T& instead of T* to child functions. (In reply to comment #28) > > I know. I don't think I want to do this in this patch, but I think I've got a solution for both this and register allocation: have named speculations (so just as we have an enum of node types, have an enum of speculation types), have attributes on nodes that describe what speculatiuons should be performed, have one phase in Propagator that deduces what those speculatins ought to be, and then while we're at it, have node attributes that specify what additional registers the codegen for a node will required. And viola, we'll be able to do global register allocation and get rid of GenerationInfo entirely. > > Not sure if this helps, but another possible option is to add explicit nodes for each speculation. We didn't do this originally because it made the graph not reusable for the non-speculative path, but now there is no non-speculative path. This would be a performance regression. Our boolean speculations are done as part of testing the boolean, and often without using SpeculateBooleanOperand. And then we have tests for object-or-undefined-or-null, where the speculation checks are embedded in the control flow of the node itself. As the DFG gets more powerful, our nodes will have more sophisticated execution semantics, and our optimizations will have to have more sophisticated modeling of those execution semantics. Better to just live with it. I've seen compiler hackers go down the path of trying to modularize this aspect of compilation and the results are invariably uglier than what we have. You either have to introduce new opcodes for each micro operation, but then you get performance regressions unless you also introduce more basic blocks (we could model the object-or-undefined speculation by blowing each node that does that into a control flow diamond). Or you can start to create a formal specification for your nodes. But that fails because the formal specification becomes more hairy and complicated than the compiler you were going to write in the first place. > > > > > + static inline BlockIndex getBytecodeBegin(OwnPtr<BasicBlock>* block) > > > > + { > > > > + return (*block)->bytecodeBegin; > > > > + } > > RE this vs binarySearch, perhaps the binarySearch template could be modified to pass T& instead of T* to child functions. That would still make the style bot unhappy... Comment on attachment 110310 [details] the patch - merged with ByVal changes View in context: https://bugs.webkit.org/attachment.cgi?id=110310&action=review couple more comments > Source/JavaScriptCore/dfg/DFGAbstractValue.h:63 > + m_structure = reinterpret_cast<Structure*>(1); The magic value '1' should be named, either behind a static const or an ifdef. > Source/JavaScriptCore/dfg/DFGAbstractValue.h:235 > + if (reinterpret_cast<uintptr_t>(m_structure) <= 1) This should use 'isNeitherClearNorTop'. > Source/JavaScriptCore/dfg/DFGAbstractValue.h:249 > + bool isNeitherClearNorTop() const { return reinterpret_cast<uintptr_t>(m_structure) > 1; } It would probably make the code more understandable to negate this - make this method isClearOrTop(), negate uses. Created attachment 110424 [details]
the patch - address Gavin's most recent comments
Created attachment 110430 [details]
the patch - merged up to ToT
Attachment 110430 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/dfg/DFGBasicBlock.h:56: The parameter type should use PassOwnPtr instead of OwnPtr. [readability/pass_ptr] [5]
Total errors found: 1 in 28 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 110430 [details] the patch - merged up to ToT View in context: https://bugs.webkit.org/attachment.cgi?id=110430&action=review r+. I still don't think AbstractValue is really descriptive enough, but I don't have a better name to suggest. > Source/JavaScriptCore/dfg/DFGAbstractValue.h:312 > +struct AbstractValue { AbstractValue doesn't seem to be the right name for this class. This struct can tell you a type, not a value - e.g. "int", or "object with properties x, y", not '3' or '"Hello, world"'. This naming is confusing in it's uses, e.g. in GetLocal. > Source/JavaScriptCore/dfg/DFGNode.h:715 > + unsigned takenBytecodeOffsetDuringParsing() This change could probably have been made as a separate patch, breaking up this review a bit would have helped! - probably no point doing so now though. It's a bit unfortunate that the meaning of the branch's opInfos changes over time, it would be better if we could guard against error here. We could consider extra state in debug builds that we can ASSERT on, or we could change reflect this in the node types (have a BranchDuringParse/JumpDuringParse that are updated as the successors are set, and never reach generation). Landed in r97218. |