33860 – potential speed up of parser by recoding expression rules

RESOLVED WONTFIX 33860

potential speed up of parser by recoding expression rules

https://bugs.webkit.org/show_bug.cgi?id=33860

Summary potential speed up of parser by recoding expression rules

Patrick Mueller

Reported 2010-01-19 13:07:33 PST

While playing with the bison parser for JavaScriptCore, I did a little googling research on bison optimization. Stumbled upon this: http://compilers.iecc.com/comparch/article/97-03-086 which claims you can reduce the number of reductions by changing the way the rules are defined; basically flattening the recursive rules that define precedence, and adding the precedence back via %prec directives. Just to see what was going on, I traced through the following expression (turned on the debug flags) x = { a: 1 }; The parser walks the following steps when it hits the "1" token, consuming a "}" during the reduction: NUMBER Literal PrimaryExprNoBrace PrimaryExpr MemberExpr NewExpr LeftHandSideExpr PostfixExpr UnaryExpr MultiplicativeExpr AdditiveExpr ShiftExpr RelationalExpr EqualityExpr BitwiseANDExpr BitwiseXORExpr BitwiseORExpr LogicalANDExpr LogicalORExpr ConditionalExpr AssignmentExpr It seems to me that at least the rules from PostfixExpr down through ConditionalExpr would be good candidates to try this on.

Attachments
Add attachment proposed patch, testcase, etc.

Patrick Mueller

Comment 1 2010-01-19 13:09:34 PST

In case the link in the summary goes away, here is the relevant text: One more hint: If the language has compicated expression syntax (like C) it is more efficient to express operators precedence explicitly by %prec directiove, instead of expression syntax tricks like it is done in the ISO C and C++ Draft Standatd (where operators precedence is expressed implicitly). It is significanly reduces number of reduces (by mean of LALR algorithm) while parsing. For example: [bad style] AdditiveExpression: AdditiveExpression AddOperator MultiplicativeExpression | MultiplicativeExpression; AddOperator: lxmPlus | lxmMinus; MultiplicativeExpression: MultiplicativeExpression MulOperator HighPrecExpression | HighPrecExpression; MulOperator: lxmMul | lxmDiv; [better style] Expression: HighPrecExpression | lxmLeftPar Expression lxmRightPar | Expression lxmPlus Expression %prec pseudoAdditive | Expression lxmMinus Expression %prec pseudoAdditive | Expression lxmMul Expression %prec pseudoMultiplicative | Expression lxmDiv Expression %prec pseudoMultiplicative; Regards, Alexander. -- Alexander N. Krotoff krotoff@such.srcc.msu.su Research Computer Center tel: +7(095)939-2638 Moscow State University fax: +7(095)939-4430

Zach Carter

Comment 2 2010-01-21 10:06:37 PST

Right, by using ambiguous rules and manually resolving the ambiguities using precedence directives, it should reduce the number of reductions needed. You can also set the precedence of an operator token directly: %left '+' '-' %left '*' '/' etc... E.g. any rule with a '+' token will have the associated precedence, although using %prec gives you direct control over which rule to apply the precedence to.

Patrick Mueller

Comment 3 2010-01-21 13:19:55 PST

Some stats from using YYDEBUG and then running jsc on the file SunSpider/tests/parse-only/jquery-1.3.2.js, looking at output lines of the form: -> $$ = nterm [whatever} which appear to be giving interim parse states. 240,083 total states 224,882 states containing "NoNode" 16,074 states containing "NoBF" 174,595 states containing "Expr_NoNode" 10,008 states containing "PostfixExpr_NoNode" 7,821 states containing "ConditionalExpr_NoNode" so it looks like concentrating on just the "expressions", and the NoNode variety in particular, will hit a large swath of the reductions. The PostFixExpr and ConditionalExpr counts are interesting, because of the number of reductions from the first to the last. There are 13 states between them, so if you ball-park that 7.8K of them went through all 13 states, that's about 100K states. 40% of the states hit in total.

Patrick Mueller

Comment 4 2010-01-22 05:15:16 PST

In Bug 32722, comment 28, it's mentioned that Zoltan may be working on a replacement, non-Bison generated parser. In which case this bug can be RESOLVED WONTFIX or something.

Ariya Hidayat

Comment 5 2011-11-01 11:44:06 PDT

JSC does not use bison-based parser anymore.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution WONTFIX

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component JavaScriptCore

Assignee

Nobody

Reported

2010-01-19 13:07 PST

Modified

2011-11-01 11:44 PDT History

CC List

4 users Show

URL

Keywords

Depends on

Blocks