Bug 33860 - potential speed up of parser by recoding expression rules
Summary: potential speed up of parser by recoding expression rules
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-01-19 13:07 PST by Patrick Mueller
Modified: 2011-11-01 11:44 PDT (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Patrick Mueller 2010-01-19 13:07:33 PST
While playing with the bison parser for JavaScriptCore, I did a little googling research on bison optimization.  Stumbled upon this:

   http://compilers.iecc.com/comparch/article/97-03-086

which claims you can reduce the number of reductions by changing the way the rules are defined; basically flattening the recursive rules that define precedence, and adding the precedence back via %prec directives.

Just to see what was going on, I traced through the following expression (turned on the debug flags)

x  = { a: 1 };

The parser walks the following steps when it hits the "1" token, consuming a "}" during the reduction:

    NUMBER
    Literal
    PrimaryExprNoBrace
    PrimaryExpr
    MemberExpr
    NewExpr
    LeftHandSideExpr
    PostfixExpr
    UnaryExpr
    MultiplicativeExpr
    AdditiveExpr
    ShiftExpr
    RelationalExpr
    EqualityExpr
    BitwiseANDExpr
    BitwiseXORExpr
    BitwiseORExpr
    LogicalANDExpr
    LogicalORExpr
    ConditionalExpr
    AssignmentExpr

It seems to me that at least the rules from PostfixExpr down through ConditionalExpr would be good candidates to try this on.
Comment 1 Patrick Mueller 2010-01-19 13:09:34 PST
In case the link in the summary goes away, here is the relevant text:

    One more hint:

    If the language has compicated expression syntax (like C)
    it is more efficient to express operators precedence explicitly
    by %prec directiove, instead of expression syntax tricks like
    it is done in the ISO C and C++ Draft Standatd (where operators
    precedence is expressed implicitly).

    It is significanly reduces number of reduces (by mean of LALR
    algorithm) while parsing.

    For example:
    [bad style]
    AdditiveExpression:
    AdditiveExpression AddOperator MultiplicativeExpression |
    MultiplicativeExpression;

    AddOperator:
    lxmPlus |
    lxmMinus;

    MultiplicativeExpression:
    MultiplicativeExpression MulOperator HighPrecExpression |
    HighPrecExpression;

    MulOperator:
    lxmMul |
    lxmDiv;

    [better style]
    Expression:
    HighPrecExpression |
    lxmLeftPar Expression lxmRightPar |
    Expression lxmPlus Expression	%prec pseudoAdditive |
    Expression lxmMinus Expression	%prec pseudoAdditive |
    Expression lxmMul Expression	%prec pseudoMultiplicative |
    Expression lxmDiv Expression	%prec pseudoMultiplicative;

    Regards,
    Alexander.
    --
    Alexander N. Krotoff	 krotoff@such.srcc.msu.su
    Research Computer Center	tel: +7(095)939-2638
    Moscow State University	 fax: +7(095)939-4430
Comment 2 Zach Carter 2010-01-21 10:06:37 PST
Right, by using ambiguous rules and manually resolving the ambiguities using precedence directives, it should reduce the number of reductions needed. You can also set the precedence of an operator token directly:

%left '+' '-'
%left '*' '/'

etc...

E.g. any rule with a '+' token will have the associated precedence, although using %prec gives you direct control over which rule to apply the precedence to.
Comment 3 Patrick Mueller 2010-01-21 13:19:55 PST
Some stats from using YYDEBUG and then running jsc on the file SunSpider/tests/parse-only/jquery-1.3.2.js, looking at output lines of the form:

   -> $$ = nterm [whatever}

which appear to be giving interim parse states.

 240,083 total states 
 224,882 states containing "NoNode"
  16,074 states containing "NoBF"
 174,595 states containing "Expr_NoNode"
  10,008 states containing "PostfixExpr_NoNode"
   7,821 states containing "ConditionalExpr_NoNode"

so it looks like concentrating on just the "expressions", and the NoNode variety in particular, will hit a large swath of the reductions.   The PostFixExpr and ConditionalExpr counts are interesting, because of the number of reductions from the first to the last.  There are 13 states between them, so if you ball-park that 7.8K of them went through all 13 states, that's about 100K states.  40% of the states hit in total.
Comment 4 Patrick Mueller 2010-01-22 05:15:16 PST
In Bug 32722, comment 28, it's mentioned that Zoltan may be working on a replacement, non-Bison generated parser.  In which case this bug can be RESOLVED WONTFIX or something.
Comment 5 Ariya Hidayat 2011-11-01 11:44:06 PDT
JSC does not use bison-based parser anymore.