RESOLVED FIXED 19408
Constant folding in parser
https://bugs.webkit.org/show_bug.cgi?id=19408
Summary Constant folding in parser
Gabor Loki
Reported 2008-06-06 08:35:56 PDT
Hi all, I have implemented an experimental constant folding algorithm into the parser. It is not finished yet, but I would like to share it, in order to get feedbacks about it. Any comments and suggestions are welcome! Note: No new JavaScript regression. SunSpider says '1.007x as fast'. If I remove the comment signs around String operations SunSpider says '1.011x as slow'
Attachments
Const.Folding.v1 (46.85 KB, patch)
2008-06-06 08:38 PDT, Gabor Loki
darin: review-
Const.Folding.v2 (53.61 KB, patch)
2008-06-12 03:46 PDT, Gabor Loki
darin: review-
Fold only *, +, <<, >>, ~ operators (40.12 KB, patch)
2008-07-16 03:49 PDT, Gabor Loki
no flags
Fold only *, +, <<, >>, ~ operators (v2) (39.96 KB, patch)
2008-07-16 03:57 PDT, Gabor Loki
no flags
Fold only *, +, <<, >>, ~ operators (v3) (40.02 KB, patch)
2008-07-22 02:02 PDT, Gabor Loki
zwarich: review-
Fold only *, +, <<, >>, ~ operators (v4) (41.56 KB, patch)
2008-08-26 03:36 PDT, Gabor Loki
no flags
SunSpider results for last patch against r35965 (3.56 KB, text/plain)
2008-08-28 21:24 PDT, Cameron Zwarich (cpst)
no flags
Fold *, /, +, <<, >>, ~ operators (v5) (39.29 KB, patch)
2008-09-01 03:26 PDT, Gabor Loki
darin: review+
Gabor Loki
Comment 1 2008-06-06 08:38:22 PDT
Created attachment 21527 [details] Const.Folding.v1 The patch is against rev 34372.
Gabor Loki
Comment 2 2008-06-09 01:13:01 PDT
Comment on attachment 21527 [details] Const.Folding.v1 I forgot to mark the patch with '?'. Note: This patch is not prepared to apply to trunk. I only ask for feedback.
Darin Adler
Comment 3 2008-06-09 06:48:22 PDT
Comment on attachment 21527 [details] Const.Folding.v1 The general approach looks OK. Here's a quick pass of review. + OP_UN_PLUS, /* '+' */ For an enum like this, there's no reason to use the ALL_CAPS style. That's usually reserved for macros, where you're concerned about conflict with other identifiers, ignoring namespaces. You should also abbreviate less. I think: operatorUnaryPlus is just fine, and OP_UN_PLUS is not really better. +static ExpressionNode* foldConst(char, ExpressionNode*, ExpressionNode*); "char" is not the right type to use here. You should give the enum type a name and use that type. Even if you did want to use a type in the "char" family, it would probably be unsigned char. Since "char" is signed or unsigned depending on the compiler. + if ((en = foldConst(OP_UN_LNOT, $2.m_node, NULL))) We use "0" in JavaScriptCore, not NULL. + if (typeid(*m1) == typeid(StringNode) && typeid(*m2) == typeid(StringNode)) { There's no RTTI in WebCore so you can't use typeid. If you want to find the type of a node, you'll need to add an isStringNode() function. But that's a good thing, because a virtual function call is faster than a typeid check on most compilers. + UString s1 = ((StringNode*)m1)->value(); We use C++ style casts, such as static_cast, not C style casts like this. +static ExpressionNode* foldConst(char op, ExpressionNode* m1, ExpressionNode* m2) I think it would be best to put these functions in a different file rather than at the bottom of the grammar. It's critical to use SunSpider and see how this constant folding affects speed on the benchmark.
Gabor Loki
Comment 4 2008-06-12 03:46:49 PDT
Created attachment 21658 [details] Const.Folding.v2 This patch adds a lightweight constant folding algorithm. The algorithm handles simple folding cases (like the left and the right side of a binary operator are Number: CONST * CONST). It does not do expensive tree rotation or operand swapping to optimize constant folding. So such as expression: CONST * VAR * CONST will never be folded. This patch is against rev. 34508. The update of MSVC and XCode project files are missing. No new regression found (with run-javascriptcore-tests). SunSpider says: 1.019x as fast, 3273.1ms +/- 0.6%, 3212.9ms +/- 0.2%, significant
Darin Adler
Comment 5 2008-06-12 07:21:41 PDT
Comment on attachment 21658 [details] Const.Folding.v2 This looks great! The patch to the ChangeLog file has tabs in it; we can't check in files with tabs so it's better for the committer if you author the file without tab characters. For new files in JavaScriptCore we're using naming more like the ones in WebCore. So it should be ConstantFolding.h rather than const_folding.h. + ExpressionNode* foldConst(enum ConstFoldOperType op, ExpressionNode* m1, ExpressionNode* m2) No need for "enum" here. + UString s1 = (static_cast<StringNode*>(m1))->value(); + UString s2 = (static_cast<StringNode*>(m2))->value(); + d1 = (static_cast<NumberNode*>(m1))->value(); + d1 = (static_cast<BooleanNode*>(m1))->value(); These expressions have extra unneeded parentheses. + int n1 = 0, n2 = 0; These variables seem to be booleans, but you are using int instead of bool. Also, these declarations should be just before the values are used. + if (op == operatorUnaryLogicalNot) { + if (sn->value().isEmpty()) + return new TrueNode; + else + return new FalseNode; + } else if (op != operatorBinaryAdd) { + d1 = sn->value().toDouble(); + if (!isnan(d1)) + n1 = 1; + } We normally do not put an else after a return (done here twice). Also, the TrueNode/FalseNode idiom happens so many times I think we should add a helper function that does that to avoid all those different if statements. + return new NumberNode(JSValue::toInt32(d1) << ((uint32_t)JSValue::toUInt32(d2) & 0x1f)); The typecast of the result of toUInt32 seems unneeded. + if (n2) + switch(op) { There should be braces after "if (n2)". We only omit the braces if the body is one line -- one statement is not enough. There should be a space after "switch". + #ifndef CONST_FOLDING_H_ Style guide says this should be the same case as the filename. If you use ConstantFolding.h, then it should be ConstantFolding_h. 447 | '+' UnaryExpr { ExpressionNode *en; 448 if ((en = foldConst(operatorUnaryPlus, $2.m_node, 0))) 449 $$ = createNodeFeatureInfo<ExpressionNode*>(en, $2.m_featureInfo); 450 else 451 $$ = createNodeFeatureInfo<ExpressionNode*>(new UnaryPlusNode($2.m_node), $2.m_featureInfo); } These would read better if you declared the ExpressionNode* inside the if statement. The * goes next to the type name, so it should be "ExpressionNode*", not "ExpressionNode *". I don't think it's great to add isBoolean() and BooleanNode. This just makes too many classes. There could be separate isTrue and isFalse. But the reason true and false have separate nodes in the first place is from when the tree was used for execution. Nowadays it would be fine to eliminate TrueNode and FalseNode altogether and just have a BooleanNode. 356 virtual bool isString() const KJS_FAST_CALL { return true; } This might sound surprising, but the override should be private. The reason for that is that it's a programming mistake to call isString() if you know it's a string node, and by making the function private you get a compiler error if you make that error. We need to add a test along with this code change. It can go in LayoutTests/fast/js. We need a test case for each constant folding case, to make sure we're actually exercising the folded constant code. I should be able to make the test fail by changing anything in the constant folding function. I'm tempted to say review+ because all the comments are minor style issues. But for now I'll say review- because we really need a test case. This really is great work. I'm looking forward to getting it in.
Maciej Stachowiak
Comment 6 2008-06-15 00:05:45 PDT
Some comments: 1) Instead of introducing new OP_WHATEVER enums for some of the operations, why not just use the existing op_ constants from the bytecode? 2) Is it really helpful to have a single foldConstant function which takes an opcode enum and then switches on it? Why not just a function for each type of fold? 3) Constant folds should be doable even if operand types aren't exactly what is expected, boolean string and number constants should be safely convertible. 4) We've discussed the possibility of doing constant folding at codegen time. I know it is a traditional parse-time optimization, but it fits very naturally with the way we want to have specialized versions of the opcodes with constant operands. If you have to handle the register-register, register-constant and constant-register cases, it makes sense to handle the constant-constant case right there. It has to be checked for anyway, because the parse-time constant folding here won't handle strings or various edge cases and folding seems like it is probably better than emitting excessive constant loads.
Maciej Stachowiak
Comment 7 2008-06-15 00:07:33 PDT
I'm not sure consideration #4 means that we shouldn't do parse-time constant folding for now, but it is something worth discussing.
Gabor Loki
Comment 8 2008-06-17 01:24:16 PDT
> 1) Instead of introducing new OP_WHATEVER enums for some of the operations, why > not just use the existing op_ constants from the bytecode? OpcodeID enum has so many enumerators. If I use those GCC will generate a bigger and less effective code for switch statement. The GCC generated tablejumps are very effective construction if the cases in a switch check for continuous (or almost continuous) numbers (like: 0,1,2,...,n). > 2) Is it really helpful to have a single foldConstant function which takes an > opcode enum and then switches on it? Why not just a function for each type of > fold? I am not sure which variant is better. With functions for each type of fold the tablejump can be saved. I will check it. > 3) Constant folds should be doable even if operand types aren't exactly what is > expected, boolean string and number constants should be safely convertible. What did I miss? > 4) We've discussed the possibility of doing constant folding at codegen time. You are right about not every constant can be handled quick in parse-time, but IMHO this should not be the only priority. If the AST is smaller the codegen will be also quicker. In addition if you like have a lightweight constant propagation, it is better to have constant folding in AST instead of in codegen. time. Of course, codegen. can have its own separate constant folding.
Gabor Loki
Comment 9 2008-07-16 03:49:12 PDT
Created attachment 22302 [details] Fold only *, +, <<, >>, ~ operators I have tried several implementations of constant folding, but I couldn't achieve such as big speed-up as the first time. So I tried to determine why the first patch was so good. The first thing what I have noticed if an unused member (or non-member) function left in the code it will easily change the result (in any direction). For example: after I have placed an isString() function into ExpressionNode and StringNode (and do nothing else). The result was about 1% slow down. In spite of this, after an isBoolean() function was placed into ExpressionNode and BooleanNode an 0.5% speed up was measured. I guess the reason of this that the pointers of the objects and the target of jumps have been changed in the generated object files. The second thing: it is pointless to implement all constant folding cases in the parser. In general the JavaScripts (or any other language) contain very few constant folding opportunity at parser-time. In the most frequent cases the following operators were used: *, +, <<, >>, ~. SunSpider contains very few const. folding opportunity (only for * and + operators) So the attached patch does constant folding for *, + (only for numbers), <<, >>, ~ operators. I have also added a new test case for LayoutTests/fast/js which covers all constant folding cases. I have not measured any significant performance speed up in SunSpider, but the end result is not worse than reference: ** TOTAL **: FROM: 3201.4ms +/- 0.7% TO: 3193.4ms +/- 0.7% The patch is against rev. 35197.
Gabor Loki
Comment 10 2008-07-16 03:57:43 PDT
Created attachment 22303 [details] Fold only *, +, <<, >>, ~ operators (v2) Opps, I forgot to remove an unnecessary include header from the patch. Sorry.
Oliver Hunt
Comment 11 2008-07-20 14:05:45 PDT
Comment on attachment 22303 [details] Fold only *, +, <<, >>, ~ operators (v2) This basically looks good. Unfortunately iev made some parser changes which *may* conflict with your changes (not in any real way though). If you could bring your patch up to ToT and include commandline performance numbers that would be awesome
Gabor Loki
Comment 12 2008-07-22 02:02:25 PDT
Created attachment 22422 [details] Fold only *, +, <<, >>, ~ operators (v3) Here is the requested patch (rev. 35284). The SunSpider result: ** TOTAL **: - 2817.0ms +/- 0.9% 2816.7ms +/- 0.7% Measured on qt-linux with the patch of https://bugs.webkit.org/show_bug.cgi?id=20097 bug.
Cameron Zwarich (cpst)
Comment 13 2008-07-22 02:10:12 PDT
Do you know where this actually gets triggered in SunSpider?
Gabor Loki
Comment 14 2008-07-22 03:37:57 PDT
The patch does constant folding on the following JS lines. '*' : - 3d-cube.js:311 - 3d-raytrace.js:316 '<<' : - crypto-aes.js:369 '+' : - crypto-md5:175 - crypto-sha1:121
Geoffrey Garen
Comment 15 2008-07-29 17:33:17 PDT
If you added division to the pass, you would also catch this important line from math-partial-sums.js: var twothirds = 2.0/3.0;
Cameron Zwarich (cpst)
Comment 16 2008-08-01 16:23:24 PDT
Comment on attachment 22422 [details] Fold only *, +, <<, >>, ~ operators (v3) As Geoff mentions, you should probably see what happens when you do constant folding for division. Why don't you fold the other bitwise operations. +static ExpressionNode* makeMultNode(void*, ExpressionNode*, ExpressionNode*, bool); +static ExpressionNode* makeAddNode(void*, ExpressionNode*, ExpressionNode*, bool); +static ExpressionNode* makeLeftShiftNode(void*, ExpressionNode*, ExpressionNode*, bool); +static ExpressionNode* makeRightShiftNode(void*, ExpressionNode*, ExpressionNode*, bool); The last bool argument should probably be named here, because it isn't clear what it means. It should also be given a descriptive name in the functions themselves, at least something better than "ra". + if (m1->isNumber() && m2->isNumber()) + return makeNumberNode(globalPtr, static_cast<NumberNode*>(m1)->value() * static_cast<NumberNode*>(m2)->value()); + else + return new MultNode(GLOBAL_DATA, m1, m2, ra); In WebKit, we don't return from an else like this, we simply remove the else and put the return statement after the if statement. +description( +"This test checks constant folding algorithm." +); This might be better put as "This test checks the correctness of constant folding." We tend to prefer 4 space indentation in JavaScript tests, but why did you make your own shouldBe() function? I don't see any reason why you can't use the standard one. Other than that, it seems like a good patch. I apologize on behalf of the JavaScript team for not reviewing your patch earlier. Most of the development takes place on IRC, so sometimes we are not as responsive to one-off patches posted on the Bugzilla as we should be. I will try to ensure that this doesn't happen in the future.
Gabor Loki
Comment 17 2008-08-26 03:36:04 PDT
Created attachment 22997 [details] Fold only *, +, <<, >>, ~ operators (v4) (In reply to comment #16) > As Geoff mentions, you should probably see what happens when you do constant > folding for division. Why don't you fold the other bitwise operations. When I have tried to fold divisions I had worse result than without it. The same happened for other operators. If you like to see those cases and its results also I will send it separately. I have just rechecked division, and now it performs better. So I have included it into the patch. > your own shouldBe() function? I don't see any reason why you can't use the > standard one. Because of rounding. When I used 'shouldBe' I have for example the following fail: FAIL 10.3 - 2.1 should be 8.2. Was 8.200000000000001. > Other than that, it seems like a good patch. I apologize on behalf of the > JavaScript team for not reviewing your patch earlier. No problem. I have also out of office. So the attached patch is against rev 35930. The SunSpider result is: ** TOTAL **: - From: 2890.3ms +/- 0.5% To: 2884.8ms +/- 0.5% for "*" operator: 3d-cube: 118.9ms +/- 2.2% 118.4ms +/- 0.6% 3d-raytrace: 120.4ms +/- 1.1% 121.4ms +/- 1.0% for "<<" operator: crypto-aes: 53.6ms +/- 0.7% 53.7ms +/- 0.6% for "+" operator: crypto-md5: 53.0ms +/- 0.6% 53.1ms +/- 0.4% crypto-sha1: 51.7ms +/- 0.9% 51.7ms +/- 0.7% for "/" operator: math-partial-sums: 159.4ms +/- 2.0% 156.7ms +/- 1.2%
Cameron Zwarich (cpst)
Comment 18 2008-08-28 13:26:31 PDT
I will review this soon.
Cameron Zwarich (cpst)
Comment 19 2008-08-28 21:24:34 PDT
Created attachment 23063 [details] SunSpider results for last patch against r35965 Your SunSpider results had a very wide variance, so I ran SunSpider myself. There is no noticeable change on SunSpider, which is what I suspect, but I still think that this patch can be landed. It was a slight regression on Oliver's machine, but that might be entirely random. If it is a problem later, we can always back it out.
Cameron Zwarich (cpst)
Comment 20 2008-08-28 21:51:02 PDT
Comment on attachment 22997 [details] Fold only *, +, <<, >>, ~ operators (v4) I think it would be best to make m1 and m2 more descriptive, like expr1 and expr2. You don't want them too long, because some of this code is already quite lengthy, but those would be better than m1 and m2. Also, locHasAssignments doesn't really make the most sense here. In makeAssignNode(), locHasAssignments refers to the location of the assignment. For these binary operations, it should be rightHasAssignments, because we are concerned about the right expression having assignments. I can make both of these changes when I land the patch, if you'd like. I still find the shouldBeRound() thing sort of odd, because the value it prints is not actually the result of the operation. You could have used different values that would work out exactly. However, I won't complain too much. More tests are always welcome, and I can always change it if I want.
Gabor Loki
Comment 21 2008-08-29 00:04:10 PDT
(In reply to comment #20) > I can make both of these changes when I land the patch, if you'd like. I agree with you about renaming. If you could do both changes during landing I would appreciate that. > I still find the shouldBeRound() thing sort of odd, because the value it prints > is not actually the result of the operation. You could have used different > values that would work out exactly. I will look into it again, read what ecma says about rounding, and try to check it on other platform.
Cameron Zwarich (cpst)
Comment 22 2008-08-31 11:47:13 PDT
(In reply to comment #21) > I will look into it again, read what ecma says about rounding, and try to check > it on other platform. ECMA states that all arithmetic is done with double precision IEEE floating-point, and the result of the calculation you use in your tests can not be represented exactly. You could just write the full value of the result in the tests or pick different calculations whose can be represented exactly. I'll even do it while landing, if you want. For now, I'll r- the patch so no one lands it.
Cameron Zwarich (cpst)
Comment 23 2008-08-31 11:47:52 PDT
Comment on attachment 22997 [details] Fold only *, +, <<, >>, ~ operators (v4) Clearing review flag while waiting for a response from Gabor.
Gabor Loki
Comment 24 2008-09-01 03:26:20 PDT
Created attachment 23095 [details] Fold *, /, +, <<, >>, ~ operators (v5) I have added the requested changes: - rename m1, m2 to expr1, expr2 - rename locHasAssignments to rightHasAssignments - remove shouldBeRound function and use double-precision values in the test cases
Cameron Zwarich (cpst)
Comment 25 2008-09-06 18:34:15 PDT
I'll review this tonight. I just want to get some new SunSpider numbers. We have been having some weird problems with frontend changes pissing off GCC.
Darin Adler
Comment 26 2008-09-21 11:40:40 PDT
Comment on attachment 23095 [details] Fold *, /, +, <<, >>, ~ operators (v5) + if (v2 != 0) + return makeNumberNode(globalPtr, v1 / v2); Why can't this be done for 0 too? Sure, it would make a NaN node, but is there a problem with that? Great test case, but I don't really find the test case easier to read with all that space padding in it. What effect does this have on performance? Maybe I'll test it.
Darin Adler
Comment 27 2008-09-21 12:15:29 PDT
Comment on attachment 23095 [details] Fold *, /, +, <<, >>, ~ operators (v5) r=me
Darin Adler
Comment 28 2008-09-21 12:19:48 PDT
Others say they can't measure consistent SunSpider results on this, but I do see a consistent speedup. It's in: http://trac.webkit.org/changeset/36739 The code in negate is strange -- I don't see any reason we have to modify the number node in place rather than making a new one, and I don't see any reason to skip the optimization for 0 and negative numbers. I'm going to close this as fixed.
Note You need to log in before you can comment on or make changes to this bug.