Bug 34019 - Custom-written JavaScript parser
Summary: Custom-written JavaScript parser
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Enhancement
Assignee: Zoltan Herczeg
URL:
Keywords:
Depends on:
Blocks: 10701
  Show dependency treegraph
 
Reported: 2010-01-22 13:41 PST by Zoltan Herczeg
Modified: 2010-11-03 12:42 PDT (History)
25 users (show)

See Also:


Attachments
Dump AST (54.65 KB, patch)
2010-01-22 13:41 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
A new JavaScript Grammar (draft) (286.47 KB, patch)
2010-01-22 13:42 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
Improved Dump AST (55.76 KB, patch)
2010-01-25 06:33 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
A new JavaScript Grammar, which passes all jscore regression tests. (300.35 KB, patch)
2010-01-25 06:38 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
Selectable JavaScript grammar (178.63 KB, patch)
2010-01-26 03:19 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
Selectable JavaScript grammar (+mac build support) (184.80 KB, patch)
2010-01-26 06:40 PST, Zoltan Herczeg
oliver: review-
Details | Formatted Diff | Diff
Next attempt (191.46 KB, patch)
2010-01-27 06:12 PST, Zoltan Herczeg
oliver: review-
Details | Formatted Diff | Diff
Layout test for parser syntax checking (36.03 KB, patch)
2010-01-28 06:29 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
Another attempt (192.62 KB, patch)
2010-01-29 06:14 PST, Zoltan Herczeg
oliver: review-
Details | Formatted Diff | Diff
Layout test for parser syntax checking (patch 2) (37.82 KB, patch)
2010-01-29 06:15 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
Next draft (211.49 KB, patch)
2010-02-03 06:35 PST, Zoltan Herczeg
oliver: review-
Details | Formatted Diff | Diff
Merging syntax checker and grammar (177.49 KB, patch)
2010-02-05 05:20 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
template based approach (204.85 KB, patch)
2010-02-09 06:40 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
jquery-yacc.zip (6.74 MB, application/zip)
2010-02-09 15:16 PST, Geoffrey Garen
no flags Details
Updated patch (199.42 KB, patch)
2010-02-12 01:05 PST, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
profiling data (152.00 KB, application/octet-stream)
2010-02-12 01:08 PST, Zoltan Herczeg
no flags Details
profiling data (maxFunctionDepth = 32) (151.04 KB, application/octet-stream)
2010-02-12 05:02 PST, Zoltan Herczeg
no flags Details
Version 4.1 (199.41 KB, patch)
2010-03-23 03:43 PDT, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
dump AST current version (55.76 KB, patch)
2010-05-20 23:34 PDT, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
latest version of my recursive descent parser (204.85 KB, patch)
2010-05-21 06:20 PDT, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
Recursive descent parser (114.85 KB, patch)
2010-05-24 02:11 PDT, Oliver Hunt
no flags Details | Formatted Diff | Diff
latest patch (199.74 KB, patch)
2010-05-24 02:24 PDT, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
More complete parser (117.89 KB, patch)
2010-05-28 23:25 PDT, Oliver Hunt
no flags Details | Formatted Diff | Diff
Updated parser (116.93 KB, patch)
2010-06-06 13:27 PDT, Oliver Hunt
no flags Details | Formatted Diff | Diff
faster parser (198.59 KB, patch)
2010-06-17 05:00 PDT, Zoltan Herczeg
no flags Details | Formatted Diff | Diff
Patch (132.06 KB, patch)
2010-06-23 18:17 PDT, Oliver Hunt
no flags Details | Formatted Diff | Diff
Patch (130.70 KB, patch)
2010-06-23 19:11 PDT, Oliver Hunt
barraclough: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Zoltan Herczeg 2010-01-22 13:41:11 PST
A bug entry for a custom written JavaScript parser.
FAQ:
  - Why?
    I felt I could write a faster, specialized parser than bison
  - When the work has started?
    Start of January 2010 (3 weeks ago)
  - Main components?
    - grammar : parsing and generating nodes
    - syntax checker : just parsing (used for checking function bodies to find the terminator '}')
  - Can grammar and syntax checker be merged? It would help maintaining the code.
    Possible. I was thinking about that when actually started to separate the grammar from the syntax checker. However, there are some key
    differences between them (like syntax checker should check nested function bodies). And syntax checker has a much simpler (and faster)
    code. When you open a simple page (like www.google.com), you also get a huge JS with many functions. Most of them are never executed
    (browser specific for a different browser), but parsed (syntax checked). I would not like to give up the speed advantage, and the
    code of the grammar is ugly anyway :(
  - Advantages compared to bison?
    Speed. According to my measurement the grammar is 3 and the syntax checker is 4 times faster than bison.
  - Disadvantages?
    Readability. Although the code in grammar.y is ugly, it is repetative, and easy to understand.
  - Focus of the work?
    Compatibility with the existing bison parser (and the standard as well). Especially to detect parser errors in the same way.
  - But you skip possible optimizations in this way...
    I know. But it is only a first step on a long road. When this parser is tested and goes mainline, we could think about improving it.
    Actually there is a lot of room for such improvements. But not for now.
  - Testing?
    See dump-AST.patch . You can dump the AST, and compare the old and new ones. Now 127 fail on javascriptcore tests.
    Missing features: object literal, switch, arguments feature (no idea)
  - Supported build systems?
    Qt now. But easy for others, since you only have to add JavaScriptCore/parser/Grammar.cpp, .h, SyntaxChecker.cpp, .h, GrammarConsts.h
  - Any help?
    See the comments in GrammarConsts.h
  - Anything else?
    See LEGACY comments in the code. They could be improved soon, but break the AST compatibility. Anyway, we should do something
    with the divot points, since every node use them in a different way. For now, the compatibility is the first priority, but later
    we should define general rules for them, and all node should follow these rules.
Comment 1 Zoltan Herczeg 2010-01-22 13:41:45 PST
Created attachment 47222 [details]
Dump AST
Comment 2 Zoltan Herczeg 2010-01-22 13:42:40 PST
Created attachment 47223 [details]
A new JavaScript Grammar (draft)
Comment 3 Oliver Hunt 2010-01-22 14:35:50 PST
I'm unsure why the syntax checker is separate from the parser.  Syntax checking is basically answering the question "can i parse this".

There are semantic checks that we will want to do while parsing:
   the various "features" we track
   variable and function accumulation
   partial AST generation

New things we _must_ be able to do in the new parser:
   strict mode identification needs to be done during the parse stage and alters the state of the parser to reject constructs that lax mode would allow, and produces parse time errors instead).  For efficiency reasons this basically forces us into a recursive-descent parser :-/
Comment 4 Zoltan Herczeg 2010-01-22 14:57:39 PST
(In reply to comment #3)
> I'm unsure why the syntax checker is separate from the parser.  Syntax checking
> is basically answering the question "can i parse this".

The grammar do both syntax checking and generating nodes. Syntax checker only checks syntax, and used only for skipping nested function bodies. That is what I tried to explain. They have lot in common, but they have differences as well.

> There are semantic checks that we will want to do while parsing:
>    the various "features" we track
>    variable and function accumulation
>    partial AST generation

That is what my grammar do :) It is "binary compatible" with the bison parser. The only thing which is not tracked is the "arguments feature" because I don't really understand its purpose yet :(

> New things we _must_ be able to do in the new parser:
>    strict mode identification needs to be done during the parse stage and
> alters the state of the parser to reject constructs that lax mode would allow,
> and produces parse time errors instead).  For efficiency reasons this basically
> forces us into a recursive-descent parser :-/

I saw the new ECMA standard, and I think I can hanle the strict mode restrictions, at least those which I saw (local identifiers cannot be used before a var declaration seems a more complex one, since you have to store all the primary identifiers during parsing). I checked wikipedia, and turned out my parser is a predictive parser (uses a stack, but never backtracks). The predictive parser belongs to the family of recursive descent parsers.
Comment 5 Oliver Hunt 2010-01-22 15:05:31 PST
(In reply to comment #4)
> (In reply to comment #3)
> > I'm unsure why the syntax checker is separate from the parser.  Syntax checking
> > is basically answering the question "can i parse this".
> 
> The grammar do both syntax checking and generating nodes. Syntax checker only
> checks syntax, and used only for skipping nested function bodies. That is what
> I tried to explain. They have lot in common, but they have differences as well.

Basically the parser should be set up with a flag along the lines of
parse(int maxFunctionASTDepth) {
...
}

then the various functions to create nodes should basically be
if (currentFunctionDepth < maxFunctionDepth)
    newNode = new ...;

Having the ability to dynamically vary how deep we generate the tree will be beneficial for certain real world cases.

> 
> > There are semantic checks that we will want to do while parsing:
> >    the various "features" we track
> >    variable and function accumulation
> >    partial AST generation
> 
> That is what my grammar do :) It is "binary compatible" with the bison parser.
> The only thing which is not tracked is the "arguments feature" because I don't
> really understand its purpose yet :(

Whether a function ever references a variable with the name "arguments" -- we need to know whether the arguments object may have been used.

> 
> > New things we _must_ be able to do in the new parser:
> >    strict mode identification needs to be done during the parse stage and
> > alters the state of the parser to reject constructs that lax mode would allow,
> > and produces parse time errors instead).  For efficiency reasons this basically
> > forces us into a recursive-descent parser :-/
> 
> I saw the new ECMA standard, and I think I can hanle the strict mode
> restrictions, at least those which I saw (local identifiers cannot be used
> before a var declaration seems a more complex one, since you have to store all
> the primary identifiers during parsing). I checked wikipedia, and turned out my
> parser is a predictive parser (uses a stack, but never backtracks). The
> predictive parser belongs to the family of recursive descent parsers.

Everything should be trivial i think, the parser is gnarly enough to make me sad though.
Comment 6 Zoltan Herczeg 2010-01-22 15:20:03 PST
> Basically the parser should be set up with a flag along the lines of
> parse(int maxFunctionASTDepth) {
> ...
> }
> 
> then the various functions to create nodes should basically be
> if (currentFunctionDepth < maxFunctionDepth)
>     newNode = new ...;
> 
> Having the ability to dynamically vary how deep we generate the tree will be
> beneficial for certain real world cases.

No problem. I could extend my grammar to go generate nodes for nested functions as well (by default, the function depth counter will be 1, but you can increase it).

Ok, just tell me your requirements, I hope my parser-beast can handle anything :)
Comment 7 Oliver Hunt 2010-01-22 15:23:24 PST
(In reply to comment #6)
> > Basically the parser should be set up with a flag along the lines of
> > parse(int maxFunctionASTDepth) {
> > ...
> > }
> > 
> > then the various functions to create nodes should basically be
> > if (currentFunctionDepth < maxFunctionDepth)
> >     newNode = new ...;
> > 
> > Having the ability to dynamically vary how deep we generate the tree will be
> > beneficial for certain real world cases.
> 
> No problem. I could extend my grammar to go generate nodes for nested functions
> as well (by default, the function depth counter will be 1, but you can increase
> it).
> 
> Ok, just tell me your requirements, I hope my parser-beast can handle anything
> :)

Any idea how easy/hard your model would make free variable analysis?
Comment 8 Zoltan Herczeg 2010-01-22 22:12:09 PST
> Any idea how easy/hard your model would make free variable analysis?

Agree, it would be good to put some effort analysis here.

Features, the new parser should do:
   - generating nodes for nested functions
     effort: easy. Actually this feature was in the code, but I removed it the gain some speed.
   - variable analysis
     effort: easy. We need some kind of 'set' where we can put the variables (I assume we talk about primary literals here). Because the tokens returned by the lexer analysed in order, we could easly check any dependency requirements between primary literals.
   - some other strict mode requirements:
     - with is not supported
       effort: easy
     - eval is a reserved word
       effort: easy
     - arguments is a reserved word
       nooo arguments again... anyway, probably easy
Comment 9 Zoltan Herczeg 2010-01-25 06:33:10 PST
Created attachment 47344 [details]
Improved Dump AST
Comment 10 Zoltan Herczeg 2010-01-25 06:38:48 PST
Created attachment 47345 [details]
A new JavaScript Grammar, which passes all jscore regression tests.

Compared to the bison parser on SunSpider parser tests, namely jquery, mootools, prototype. The AST is the same. There are some divot differences because regular expression literals are not correctly handled by the bison parser. Line infos for non-statements are useless for bison parser.
Comment 11 Zoltan Herczeg 2010-01-25 06:44:34 PST
mootools parser test shown the following interesting behaviour of the bison parser on the following expression (simplified example):

for (var i in x) for (var j in y) ;

According to the bison parser, j is declared earlier than i. My predictive parser handles the declaration order correctly.
Comment 12 Zoltan Herczeg 2010-01-25 14:06:14 PST
I am thinking about the future of this work. I will do layout and performance testings tommorrow, but still meany other tasks ahead. Would be good to broke this patch to smaller pieces, but it doesn't seem possible, since everything is needed. I cannot cut out things, like 'switch', 'for' or 'function' parsing. Since this is a huge piece of code, probably many things will be needed to improve, like changing variable names, restructure code. This will need a lot of reviewer effort, which I thank for in advance.

I planned this parser to be robust and future proof. It seems these advantages are put in good use sooner than I expected. Such things are strict JS parsing or generating AST for nested function bodies. However, it would be better to do them in separate patches, than doing everything in an all in-one package. Probably would be better to start these works after I got some feedback about the code.

I have one question about function nesting. We use ParserArenaData<DeclarationStacks::...> variables in the code, but the FunctionBodyNode expects DeclarationStacks::... variables. What would be a good solution for this difference?

Anyway, although the code is big, the concept behind it is very simple. A step-by-step debugging on simple JS sources can reaveal easly how it works.
Comment 13 Zoltan Herczeg 2010-01-26 03:19:08 PST
Created attachment 47396 [details]
Selectable JavaScript grammar

The patch is reworked. Now, it does not remove the original bison parser. The new parser is compiler selectable (ENABLE_PREDICTIVE_PARSER). It is disabled by default.
Comment 14 Zoltan Herczeg 2010-01-26 03:22:29 PST
New parser runtime results
--suite parse-only:

TEST                           COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:                   2.19x as fast     69.5ms +/- 1.1%   31.8ms +/- 0.9%     significant

=============================================================================

  jquery:                      2.20x as fast     10.8ms +/- 2.8%    4.9ms +/- 4.6%     significant
    1.3.2:                     2.20x as fast     10.8ms +/- 2.8%    4.9ms +/- 4.6%     significant

  mootools:                    2.36x as fast     11.8ms +/- 2.6%    5.0ms +/- 0.0%     significant
    1.2.2-core-nc:             2.36x as fast     11.8ms +/- 2.6%    5.0ms +/- 0.0%     significant

  prototype:                   2.12x as fast     12.5ms +/- 3.0%    5.9ms +/- 3.8%     significant
    1.6.0.3:                   2.12x as fast     12.5ms +/- 3.0%    5.9ms +/- 3.8%     significant

  concat:                      2.15x as fast     34.4ms +/- 1.1%   16.0ms +/- 0.0%     significant
    jquery-mootools-prototype: 2.15x as fast     34.4ms +/- 1.1%   16.0ms +/- 0.0%     significant

No observable differences on other benchmarks (SunSpider, V8, WindScorpion)
Comment 15 WebKit Review Bot 2010-01-26 03:27:14 PST
Attachment 47396 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
JavaScriptCore/parser/PredictiveGrammar.h:39:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/PredictiveGrammarConsts.h:90:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/SyntaxChecker.h:36:  Alphabetical sorting problem.  [build/include_order] [4]
JavaScriptCore/parser/SyntaxChecker.h:46:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/Lexer.h:36:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/Lexer.h:103:  This { should be at the end of the previous line  [whitespace/braces] [4]
JavaScriptCore/parser/Lexer.h:109:  This { should be at the end of the previous line  [whitespace/braces] [4]
Total errors found: 7


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 16 Zoltan Herczeg 2010-01-26 06:40:30 PST
Created attachment 47407 [details]
Selectable JavaScript grammar (+mac build support)
Comment 17 WebKit Review Bot 2010-01-26 06:47:17 PST
Attachment 47407 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
JavaScriptCore/parser/PredictiveGrammar.h:39:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/PredictiveGrammarConsts.h:90:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/SyntaxChecker.h:46:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/Lexer.h:36:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Total errors found: 4


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 18 Oliver Hunt 2010-01-26 14:35:51 PST
Comment on attachment 47407 [details]
Selectable JavaScript grammar (+mac build support)

I don't think there's any reason to still use cumulative "feature" flags.

What you should do is have a function stack (very much pseudo code):
struct FunctionInformation {
    Features features;
    Vector<VariableDeclaration> variables;
    ... etc ...
}
Vector<FunctionInformation> functions;

Then when the parser starts a new function it should simply do:
functions.append(FunctionInformation);

And all the various constructs that add features become:
functions.last().features |= NewFeatures;

case VarDeclaration:
    ...
    functions.last().variables.add(theNewVariableDeclaration);
    ...

etc

(so we lose all the various array and feature unification rules)

The only reason we use cumulative feature flags currently is because yacc/bison generate bottom up parsers which by definition can't do this (the parser doesn't "know" it's producing a function until it has already parsed the child statemetns)

In terms of style rather than the current top(-n) = new Node(top(), top(-1), ..., top(-n)) i would prefer
push(new Node(pop(), pop(), ...)

Which i think is much nicer.
Comment 19 Zoltan Herczeg 2010-01-26 15:21:54 PST
(In reply to comment #18)
> (From update of attachment 47407 [details])
> I don't think there's any reason to still use cumulative "feature" flags.

I would prefer a scope flag better myself, but many nodes actually uses these features as arguments, like BinaryOpNode (and its derived classes) has rightHasAssignments (example: a + (b * c = d) -> AddNode rightHasAssignments must be true), or TryNode catchHasEval, BracketAccessorNode (and similar nodes) subscriptHasAssignments, etc. It would be really hard to decide these flags without local features. Actually I don't know how these flags modify the behaviour of emitByteCode, but if they are unnecessary, I could remove all of them :)

> What you should do is have a function stack (very much pseudo code):
> struct FunctionInformation {
>     Features features;
>     Vector<VariableDeclaration> variables;
>     ... etc ...
> }
> Vector<FunctionInformation> functions;

I have exactly the same idea about function stack, except a minor tweak. The topmost FunctionInformation is stored by the Grammar itself, and pushed to the stack when a nested function is parsed. It is a little faster than an array access.

> case VarDeclaration:
>     ...
>     functions.last().variables.add(theNewVariableDeclaration);
>     ...
> 
> etc
> 
> (so we lose all the various array and feature unification rules)
> 

> In terms of style rather than the current top(-n) = new Node(top(), top(-1),
> ..., top(-n)) i would prefer
> push(new Node(pop(), pop(), ...)
> 
> Which i think is much nicer.

Wow, you are right. I didn't think about it before. Actually this is possible, since the items are not freed on the top of the stack, and the values can be accessed before a push reuse the entry. I hope the compiler won't mess up multiple pop()-s.
Comment 20 Guillaume Chereau 2010-01-26 23:39:43 PST
Hello all,

I tried the new parser on the javascript parsing test from Holger's benchmarks (http://gitorious.org/qtwebkit/performance), and I get the following results on my laptop :

bison parser : 192 ms
new parser   : 115 ms

(it is ~ 1.6 time better)

The test consist of parsing a set of javascript codes taken from a list of popular websites.


However I couldn't test the whole set because of an error when parsing lines of this form :
  
  "new f(x++)"

I get :

ASSERTION FAILED: m_unaryMode == PrimaryMode || m_unaryMode == PrimaryDotMode || m_unaryMode == PrimaryArgumentMode
(../../../JavaScriptCore/parser/PredictiveGrammar.cpp:273 void JSC::Grammar::primaryExpressionToNode())
Segmentation fault
Comment 21 Zoltan Herczeg 2010-01-27 06:12:29 PST
Created attachment 47525 [details]
Next attempt
Comment 22 WebKit Review Bot 2010-01-27 06:13:49 PST
Attachment 47525 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
JavaScriptCore/parser/PredictiveGrammar.h:39:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/PredictiveGrammarConsts.h:90:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/SyntaxChecker.h:46:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/Lexer.h:36:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/PredictiveGrammar.cpp:2699:  Should have a space between // and comment  [whitespace/comments] [4]
Total errors found: 5


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 23 Zoltan Herczeg 2010-01-27 06:27:34 PST
Thank you for the feedback!

> (it is ~ 1.6 time better)

great! I hope that was in release mode :)

>   "new f(x++)"

Thanks. I also found some bugs myself, and the fast/js layout tests also shown me some others, like: (1++).x; a(x++).b; for (new a in c) { }, and "a + b = c" should fail on the syntax checker as well. All of them are fixed in the newest patch.

The constructor of the parser extended to:
Grammar(JSGlobalData* globalData, int maxFunctionDepth = 1, bool strict = false)

I started to work on maxFunctionDepth first, and it is supported now. Tested both on small examples and all JSCore tests are passed with maxFunctionDepth = 3 . The 'strict' is just added for future use. I really want to implement the strict mode checks in a different patch :)

There are only two fast/js tests which fail:
fast/js/reparsing-semicolon-insertion.html - the comment says:
// According to the ECMA spec, these should all be syntax errors. However, the
// pre-existing behaviour of JavaScriptCore has always been to accept them.

fast/js/function-declaration-statement.html - crashes. No idea what is happening here (yet), since the JS code snippets are parsed correctly by the command line tool.
Comment 24 Oliver Hunt 2010-01-27 15:54:36 PST
Comment on attachment 47525 [details]
Next attempt

I don't like the
case '-': ....
case '{': ...
etc

idiom. You should stick to the symbolic names like OPENBRACE -- this will also reduce the lexer diffs, and reduce the risk of accidentally introducing a symbolic name that overloads a character.

We also need to get the full set of tests to parse correctly.  Then we need to fuzz this (we don't have many -- if any -- invalid script tests alas)

--Oliver
Comment 25 Zoltan Herczeg 2010-01-28 06:29:29 PST
Created attachment 47616 [details]
Layout test for parser syntax checking
Comment 26 Zoltan Herczeg 2010-01-28 06:41:33 PST
(In reply to comment #25)
> Created an attachment (id=47616) [details]
> Layout test for parser syntax checking

This test contains syntax checking examples. Both parsers can handle these test cases. However, I noticed something for the bison parser:

function f() { return 6 + }

is accepted, altough

return 6 +

is not.
Comment 27 Zoltan Herczeg 2010-01-29 06:14:04 PST
Created attachment 47711 [details]
Another attempt
Comment 28 Zoltan Herczeg 2010-01-29 06:15:11 PST
Created attachment 47712 [details]
Layout test for parser syntax checking (patch 2)

Adding more tests.
Comment 29 WebKit Review Bot 2010-01-29 06:17:37 PST
Attachment 47711 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
JavaScriptCore/parser/PredictiveGrammar.h:39:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/PredictiveGrammarConsts.h:90:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/SyntaxChecker.h:46:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
JavaScriptCore/parser/Lexer.h:36:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Total errors found: 4


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 30 Zoltan Herczeg 2010-01-29 06:28:48 PST
(In reply to comment #24)
> (From update of attachment 47525 [details])
> I don't like the
> case '-': ....
> case '{': ...
> etc

Done. I also found another two errors in my parser: "for (a,b in c ; ;) {}" accepted, and "if (a) function f() {}" caused segfault. Both of them fixed (these were minor bugs, only 1 line change was required). Now all jscore and layout tests are passed successfully. Finally :)

> We also need to get the full set of tests to parse correctly.  Then we need to
> fuzz this (we don't have many -- if any -- invalid script tests alas)

I have attached a new patch here. It contains about 250 strings, half of them valid JavaScript source code, and the other half is not. The strings were tested as plain stings, and a "function f() {}" function declaration is added around them. This is useful, since for example "return 6+" fails with syntax error in the bison test, while "function f() { return 6+ }" is not.

I hope the patch is not far from final. Hard to update this amount of code again and again :) When it reaches nearly finished state, I will add a ChangeLog entry and update the mac project rules.
Comment 31 Zoltan Herczeg 2010-01-29 06:51:54 PST
TEST                           COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:                   2.16x as fast     69.5ms +/- 1.3%   32.2ms +/- 0.9%     significant

=============================================================================

  jquery:                      2.14x as fast     10.7ms +/- 3.2%    5.0ms +/- 0.0%     significant
    1.3.2:                     2.14x as fast     10.7ms +/- 3.2%    5.0ms +/- 0.0%     significant

  mootools:                    2.26x as fast     11.3ms +/- 3.1%    5.0ms +/- 0.0%     significant
    1.2.2-core-nc:             2.26x as fast     11.3ms +/- 3.1%    5.0ms +/- 0.0%     significant

  prototype:                   2.08x as fast     12.7ms +/- 2.7%    6.1ms +/- 3.7%     significant
    1.6.0.3:                   2.08x as fast     12.7ms +/- 2.7%    6.1ms +/- 3.7%     significant

  concat:                      2.16x as fast     34.8ms +/- 0.9%   16.1ms +/- 1.4%     significant
    jquery-mootools-prototype: 2.16x as fast     34.8ms +/- 0.9%   16.1ms +/- 1.4%     significant

Since the code is near final, I have measured the gain again. No surprise here. We have tried it on real web-pages, but unfortunately we experinced about 100% deviation, so the comparison is useless.

(I got a message now, that the regressions are passed on mac as well! Excellent news for me)

charlie137, would you repeat your measurements again?
Comment 32 Guillaume Chereau 2010-01-29 10:05:31 PST
My previous test was in debug mode, that is why the difference was not as big between the old an new parser.

Here are the results (in ms) using the full data set, in debug and release mode for the old and new parser :

| mode    | old | new |
|---------+-----+-----|
| debug   | 836 | 689 |
| release | 204 |  75 |

Strangely, with this dataset the difference is not that big in debug mode, but huge in release mode (compiled with -O3).
Comment 33 Oliver Hunt 2010-02-01 12:26:48 PST
Comment on attachment 47711 [details]
Another attempt

I really don't like the top(-n) stuff.  I would much rather all were replaced with pop(), etc.

I think if there isn't a performance hit it would be _much_ clearer to have all code be of the form

GrammarStackItem rhs = pop();
GrammarStackItem lhs = pop();
push(new AddNode(lhs, rhs));

Or whatever, rather than 
top(-2) = new AddNode(top(-1), top(-2));
pop();
Comment 34 Zoltan Herczeg 2010-02-03 06:35:37 PST
Created attachment 48022 [details]
Next draft

Top-less version :D
(+25% .cpp code size +0% functionality)
JS core and layout tests passed
Fortunately, the parsing speed is not affected (based on SS --parse-only tests)

This was a more difficult work than I expected. Finally it is finished.
Comment 35 Zoltan Herczeg 2010-02-03 07:04:13 PST
Oliver, what is your opinion about the syntax checker patch?

Tasks ahead:

* I havent't broken the node generator mechanism, yet. Partial object generation would only be helpful for few, complex nodes, like try, switch, for (and they need to do some consistency check before emitByteCode). Unfortunately, we still have to store the parsing status, which means perhaps we can decrease the required number of stack items to 2 down from 3.

* line info, and divot points. It is a mess now, every node do the things differently (depending on how the bison parsing them). We should introduce clear rules and follow them. We can do that now using our predictive parser.

* generating parser rules. I have no idea how to do this, since we have so many different language constructions. I couldn't find any real node grouping rules. The two major groups are expressions and statements. Statements are more difficult to implement than expressions.

Is there anything else?
Comment 36 Oliver Hunt 2010-02-03 16:17:10 PST
Comment on attachment 48022 [details]
Next draft

Okay, I'm going to say that the separate syntax checker should go -- there's too much risk of different behaviour between the syntax generator and real parser.  If we want pure syntax checking to be faster we just need a call to the parser that says that it shouldn't build an ast at any level
Comment 37 Zoltan Herczeg 2010-02-05 05:20:12 PST
Created attachment 48223 [details]
Merging syntax checker and grammar

The merge is done, although I am not sure this is a step ahead, since it is slower.

*1.028x as slow*  32.6ms +/- 1.1%   33.5ms +/- 2.1%     significant

I feel the code can be improved (and fast/js/function-dot-arguments.html fails), but first I would like to hear your opinion whether I should continue this work or step back to the previous patch.

Besides, is the parser layout test OK?
Comment 38 Zoltan Herczeg 2010-02-05 05:52:04 PST
> *1.028x as slow*  32.6ms +/- 1.1%   33.5ms +/- 2.1%     significant

Just for the sake of clarity, it is compared to the previous version with a separated checker. Compared to the bison parser, it is still more than 2 times faster.

2.07x as fast     69.5ms +/- 1.3%   33.5ms +/- 2.1%
Comment 39 Darin Adler 2010-02-05 06:53:01 PST
A way to avoid having two copies of the code and yet still have the non-syntax-check version be fast, is to use C++ templates. You can have a boolean template argument decide whether functions are for syntax checking or for compiling, and then avoid adding any branches.

I considered mentioning this right when Oliver did his review, but for some reason assumed that performance would not be a problem.

If this technique works then we get the best of both worlds. Only one set of source code to maintain, but the full speed we'd otherwise get by having the syntax checker entirely separate.
Comment 40 Zoltan Herczeg 2010-02-05 08:15:31 PST
> A way to avoid having two copies of the code and yet still have the
> non-syntax-check version be fast, is to use C++ templates. 

I like this idea. A difference between the syntax checker and he grammar is not only the simplified code, but also the simplified data structures. Furtunately, the data structures can also be template arguments.

My question is which is the best way to do it? What about the following:

template<bool syntaxChecker, class StackItem, class MemberVariables>::class Grammar {};

struct GrammarStackItem { ... };
struct SyntaxCheckerStackItem { ... };

struct SyntaxCheckerMemberVariables { ... };
struct GrammarMemberVariables {
     ...
     Grammar<true, SyntaxCheckerStackItem,
         SyntaxCheckerMemberVariables> checker;
};

I hope the declaration order will be ok in this way.
Comment 41 Zoltan Herczeg 2010-02-05 21:48:57 PST
Hm, I am not sure all compiler will like to access non-existing members even if the a branch is known to be false. Instead, I could use ALWAYS_INLINE get/set methods, and put ASSERT_NOT_REACHED to those getters/setter which are not exist for the current structure.
Comment 42 Darin Adler 2010-02-07 13:07:12 PST
(In reply to comment #41)
> Hm, I am not sure all compiler will like to access non-existing members even if
> the a branch is known to be false.

It won't.

> Instead, I could use ALWAYS_INLINE get/set
> methods, and put ASSERT_NOT_REACHED to those getters/setter which are not exist
> for the current structure.

That might work. Generally the technique is to have functions in a class passed to the template that perform all access to the members that are conditional and then empty functions with the same name in the other class passed to the template.
Comment 43 Oliver Hunt 2010-02-08 12:15:55 PST
Comment on attachment 47712 [details]
Layout test for parser syntax checking (patch 2)

r=me on the testcase -- there's no reason not to have the test in the repository
Comment 44 Eric Seidel 2010-02-08 15:18:08 PST
Attachment 47712 [details] was posted by a committer and has review+, assigning to Zoltan Herczeg for commit.
Comment 45 Zoltan Herczeg 2010-02-09 01:35:06 PST
Comment on attachment 47712 [details]
Layout test for parser syntax checking (patch 2)

Landed in r54530
Comment 46 Zoltan Herczeg 2010-02-09 06:40:22 PST
Created attachment 48409 [details]
template based approach

Finally finished. Actually, this was the hardest work so far. Darin's prediction was right, the speed of this approach is nearly as fast as the separate grammar syntax checker approach (32.6ms up from 32.2ms). Besides, the code is nearly as big :) Regardless, I think this is the best choice between speed and maintainability.

I have a small design question. I know C++ inlines functions inside a class definition, but still, I like to put an ALWAYS_INLINE macro before the definition to stress that I want it to be an inline function. This information can be useful for those, who want to change the code. What is WebKit's policy here? (I couldn't find anything about it in the coding guidelines)
Comment 47 Alexey Proskuryakov 2010-02-09 08:17:29 PST
We usually add ALWAYS_INLINE when profiling suggests measurable benefit from it.
Comment 48 Darin Adler 2010-02-09 10:46:25 PST
Comment on attachment 48409 [details]
template based approach

The pbxproj part of this patch is failing to apply, so the trybots are unable to evaluate it.
Comment 49 Darin Adler 2010-02-09 10:47:33 PST
(In reply to comment #47)
> We usually add ALWAYS_INLINE when profiling suggests measurable benefit from
> it.

Yes, that's the guideline. Don't use it unless it's needed to improve performance, and feel free to use it if it does improve performance. Same for NEVER_INLINE.
Comment 50 Zoltan Herczeg 2010-02-09 11:47:14 PST
(In reply to comment #48)
> (From update of attachment 48409 [details])
> The pbxproj part of this patch is failing to apply, so the trybots are unable
> to evaluate it.

I haven't updated my project for more than a week, and I don't intend to do it until major changes required, since I cannot compare the performance gains or losses anymore. When you say the general direction is ok, and only smaller reworkings are required, I will update the project of course. Each of the last 3 patch changed at least 50% of the codebase, and I wanted to make sure I don't mess up something :)

Thank you for your advices about ALWAYS_INLINE. I prefer all 1 line getter setters to be inline functions.
Comment 51 Darin Adler 2010-02-09 11:59:36 PST
(In reply to comment #50)
> Thank you for your advices about ALWAYS_INLINE. I prefer all 1 line getter
> setters to be inline functions.

Yes.

But please not that we only use ALWAYS_INLINE if there’s some reason they are not getting inlined. We don’t use that macro to express preferences.
Comment 52 Darin Adler 2010-02-09 12:01:18 PST
The major problem remaining here is that this new parser may be too difficult to modify when we make changes to the language or fix bugs.

I believe Oliver is investigating generating this parser from some sort of grammar, which seems like a way to get a parser that works just like this, but remain flexible.
Comment 53 Geoffrey Garen 2010-02-09 14:52:26 PST
These comments probably would have been more useful earlier in your development process. Sorry about that.

I wish we knew exactly what was slow about our YACC parser. I hope we're not just falling prey to Second System Syndrome here, and throwing out working code that could be optimized. Do you have any data on where the YACC parser spends its time?

If I had to guess, I would guess that the main optimization in this rewrite, as measured by SunSpider, was to do less work in the syntax checking case. But there seems to be low-hanging fruit in the existing YACC grammar for doing those same optimizations. Did you consider that option? If so, are there significant barriers to it?

I found the code in this patch a bit hard to follow, and I'm worried that it might be difficult to modify this parser in the future. Here are a few things that could make this patch clearer:

* I don't think there's any need for the ENABLE(PREDICTIVE_PARSER) flag. Having two parsers just makes the code harder to follow. I think this patch can just remove the old parser -- that's its intent, after all.

* Why does the parser keep its own m_stackPointer instead of just using Vector::size and Vector::last?

* It's a surprising idiom to grow the stack, and then modify the top-most item. I would recommend creating a stack entry and pushing onto the top of the stack instead.

* The two different copies of the *StackItem and * LocalMembers classes make the structure of the stack hard to follow. Did you provide two copies of these classes just to support ASSERT_NOT_REACHED calls? If so, is there another way to accomplish what those assertions try to accomplish?
Comment 54 Geoffrey Garen 2010-02-09 15:15:57 PST
I'm attaching an example of the kind of analysis you could do of YACC (jquery-yacc.zip).

I set YYDEBUG to 1 and ran the jquery parsing test.

It looks like the most common state the parser was in was 607, which is a choice between shifting '.', '(', or '[', and reducing MemberExpr_NoNode or NewExpr_NoNode.
Comment 55 Geoffrey Garen 2010-02-09 15:16:52 PST
Created attachment 48444 [details]
jquery-yacc.zip
Comment 56 Zoltan Herczeg 2010-02-10 00:57:23 PST
Thank you for your comments

> I wish we knew exactly what was slow about our YACC parser. I hope we're not
> just falling prey to Second System Syndrome here, and throwing out working code
> that could be optimized. Do you have any data on where the YACC parser spends
> its time?

This is only half of the story. The real issue here is not the present, but the future of JavaScript. The strict mode defined by ECMA requires an LL parser. This example was mentioned before:

for (var a in b) for (var c in a) { }

According to the bison parser, 'c' is defined first, and 'a' later, which is false. This is a nature of a bottom-up parser (right recursive LALR grammar), and probably cannot be helped. In strict mode, the "var id" definition must precede the first use of "id". The above statement is valid in strict mode, since "var a" is defined before it is used. However, this check cannot be done during the parsing in a bison parser. Post parsing checks takes extra time, and it would be really hard for nested function bodies, since no AST is generated for them.

That means we need an LL (left recursive) grammar to do these strict mode checks in parsing time. Unfortunately, JavaScript is not left recursive which makes things difficult. Oliver tries to find a solution, which would be basically an automatically generated LL parser, which can enter to right recursive mode, when it is necessary. The task is really challenging, but if it was successful (the resulting parser is much faster than bison) it would be worth some nice research papers on some big conferences (or even more).

Anyway, your idea is good, profiling can help for further parser optimizations.

> If I had to guess, I would guess that the main optimization in this rewrite, as
> measured by SunSpider, was to do less work in the syntax checking case. But
> there seems to be low-hanging fruit in the existing YACC grammar for doing
> those same optimizations. Did you consider that option? If so, are there
> significant barriers to it?

Actually, no. The node generator is 3 times faster (in pure speed, excluding everything else, including the lexer). The syntax checking is a little better, it is 4 times faster. Besides, do you think it would be worth to cut the bison grammar into two files, one for NoNode rules, and one for everything else?

> I found the code in this patch a bit hard to follow, and I'm worried that it
> might be difficult to modify this parser in the future. Here are a few things
> that could make this patch clearer:
> 
> * I don't think there's any need for the ENABLE(PREDICTIVE_PARSER) flag. Having
> two parsers just makes the code harder to follow. I think this patch can just
> remove the old parser -- that's its intent, after all.

This is the long term plan, but we should give some time for all ports to prepare for the change. When it is enabled on all ports, we can simply remove the grammar.y In my first patch, I removed the grammar.y, but others conviced me not to do it now.

> * Why does the parser keep its own m_stackPointer instead of just using
> Vector::size and Vector::last?

Because the parser stack is ever growing during parsing. Actually the members of a stack item must be accessible after it is popped from the stack. An automated shrink could cause a segmentation fault.

> * It's a surprising idiom to grow the stack, and then modify the top-most item.
> I would recommend creating a stack entry and pushing onto the top of the stack
> instead.

The stack item is really complex, and usually only some members are actually used. We can gain some speed here by not copying unnecessary blocks. Anyway, I think I could change this behaviour if you wish, I don't think the gain would be that significant.

> * The two different copies of the *StackItem and * LocalMembers classes make
> the structure of the stack hard to follow. Did you provide two copies of these
> classes just to support ASSERT_NOT_REACHED calls? If so, is there another way
> to accomplish what those assertions try to accomplish?

StackItem can be SyntaxCheckerStackItem or GrammarStackItem, and LocalMembers can be SyntaxCheckerLocalMembers or GrammarLocalMembers. Naturally, at least one (sometimes both) of these implementations do not contain an ASSERT_NOT_REACHED for a getter or setter. We need these inline function calls to support templates.
Comment 57 Zoltan Herczeg 2010-02-12 01:05:51 PST
Created attachment 48631 [details]
Updated patch

Updated patch. Actually ALWAYS_INLINE really improves the performance for getter/setters, so I decided to keep them.
Comment 58 Zoltan Herczeg 2010-02-12 01:08:45 PST
Created attachment 48632 [details]
profiling data

profiling - grpof "cat concat-jquery-mootools-prototype.js" 16 times
coverage - gcov on concat-jquery-mootools-prototype.js
interesting. really
Comment 59 Zoltan Herczeg 2010-02-12 05:02:07 PST
Created attachment 48638 [details]
profiling data (maxFunctionDepth = 32)

The previous profile data was mainly syntax checking, since the nested functions are never called (nodes are not generated for them). However, our shiny custom parser supports deeper AST generation. I set maxFunctionDepth to 32, which depth is probably enough for the parsing tests, and repeated the measurements.
Comment 60 Eric Seidel 2010-02-17 15:29:17 PST
I'm confused.  Is this bug intentionally marked as a security bug?
Comment 61 Oliver Hunt 2010-02-17 15:36:30 PST
(In reply to comment #60)
> I'm confused.  Is this bug intentionally marked as a security bug?

I don't believe so, if i don't see anything to the contrary in the next few hours i'll drop the flag.
Comment 62 Zoltan Herczeg 2010-02-19 13:30:50 PST
I was informed this work has no future, so I think this bug should be closed. I plan to write a blog post about summarizing the results, and would be good to link the patches here. Could you drop the security flag? (Could someone give me the right to set the WONTFIX flag in the bugzilla?)
Comment 63 Eric Seidel 2010-02-19 13:34:16 PST
I've fixed your bug permissions.

Dropping the security flag per Oliver's comment above.
Comment 64 Oliver Hunt 2010-02-19 13:38:14 PST
(In reply to comment #62)
> I was informed this work has no future, so I think this bug should be closed. I
> plan to write a blog post about summarizing the results, and would be good to
> link the patches here. Could you drop the security flag? (Could someone give me
> the right to set the WONTFIX flag in the bugzilla?)

I'm not sure we want to close this bug -- maybe we want to retitle this bug, something like "make a decent parser in the model of zoltan's awesome one" ?
Comment 65 Darin Adler 2010-02-20 09:37:17 PST
(In reply to comment #62)
> I was informed this work has no future

I'm surprised to hear that!

I thought there was a big feature for this. My hope is we can use this work to create a fast parser, but enhance it further, restoring the feature the bison one has, where you can change the syntax without having to be an expert on the parser internals.
Comment 66 Darin Adler 2010-02-20 09:37:31 PST
(In reply to comment #65)
> I thought there was a big feature for this.

future
Comment 67 Oliver Hunt 2010-02-20 18:34:53 PST
(In reply to comment #65)
> (In reply to comment #62)
> > I was informed this work has no future
> 
> I'm surprised to hear that!
> 
In its current form it's simply not maintainable -- either we need to somehow make it something that can be understood, or we cannot reasonably use it -- ES6 is likely to pick up new syntactic forms and it is hard to see how someone could add support for any new syntax in this grammar.  The problem is that it is effectively a single large piece of spaghetti and nothing i can think of really resolves this issue.  The alternatives (to me) seem to be either writing something that is actually a recursive descent parser, or automatically generating a parser of this form.
Comment 68 Geoffrey Garen 2010-02-22 12:01:47 PST
> The alternatives (to me) seem to be either writing
> something that is actually a recursive descent parser, or automatically
> generating a parser of this form.

Agreed.
Comment 69 Geoffrey Garen 2010-02-22 12:03:59 PST
> This is only half of the story. The real issue here is not the present, but the
> future of JavaScript. The strict mode defined by ECMA requires an LL parser.
> This example was mentioned before:
> 
> for (var a in b) for (var c in a) { }
> 
> According to the bison parser, 'c' is defined first, and 'a' later, which is
> false. This is a nature of a bottom-up parser (right recursive LALR grammar),
> and probably cannot be helped. In strict mode, the "var id" definition must
> precede the first use of "id". The above statement is valid in strict mode,
> since "var a" is defined before it is used. However, this check cannot be done
> during the parsing in a bison parser. Post parsing checks takes extra time, and
> it would be really hard for nested function bodies, since no AST is generated
> for them.

Interesting. Thanks for pointing that out. I don't see an obvious good way to solve that problem with a bottom-up parser.
Comment 70 Zoltan Herczeg 2010-03-23 03:43:00 PDT
Created attachment 51410 [details]
Version 4.1
Comment 71 Zoltan Herczeg 2010-05-20 23:34:51 PDT
Created attachment 56676 [details]
dump AST current version

Latest dump AST patch
Comment 72 Zoltan Herczeg 2010-05-21 06:20:54 PDT
Created attachment 56704 [details]
latest version of my recursive descent parser

Here comes an updated version of our recursive descent JS parser. It's twice as fast as the bison-generated one. Please, have a look.

TEST                           COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:                   2.17x as fast     75.4ms +/- 0.9%   34.7ms +/- 1.4%     significant

=============================================================================

  jquery:                      2.29x as fast     11.9ms +/- 1.9%    5.2ms +/- 5.8%     significant
    1.3.2:                     2.29x as fast     11.9ms +/- 1.9%    5.2ms +/- 5.8%     significant

  mootools:                    2.10x as fast     12.4ms +/- 3.0%    5.9ms +/- 3.8%     significant
    1.2.2-core-nc:             2.10x as fast     12.4ms +/- 3.0%    5.9ms +/- 3.8%     significant

  prototype:                   2.12x as fast     13.8ms +/- 2.2%    6.5ms +/- 5.8%     significant
    1.6.0.3:                   2.12x as fast     13.8ms +/- 2.2%    6.5ms +/- 5.8%     significant

  concat:                      2.18x as fast     37.3ms +/- 0.9%   17.1ms +/- 1.3%     significant
    jquery-mootools-prototype: 2.18x as fast     37.3ms +/- 0.9%   17.1ms +/- 1.3%     significant

(Oliver mentioned on IRC that there are plans for writing a recursive descent parser. Why to implement another one if here is an already working solution?)
Comment 73 Oliver Hunt 2010-05-21 08:05:46 PDT
(In reply to comment #72)
> Created an attachment (id=56704) [details]
> latest version of my recursive descent parser
> 
> Here comes an updated version of our recursive descent JS parser. It's twice as fast as the bison-generated one. Please, have a look.
> 
> TEST                           COMPARISON            FROM                 TO             DETAILS
> 
> =============================================================================
> 
> ** TOTAL **:                   2.17x as fast     75.4ms +/- 0.9%   34.7ms +/- 1.4%     significant
> 
> =============================================================================
> 
>   jquery:                      2.29x as fast     11.9ms +/- 1.9%    5.2ms +/- 5.8%     significant
>     1.3.2:                     2.29x as fast     11.9ms +/- 1.9%    5.2ms +/- 5.8%     significant
> 
>   mootools:                    2.10x as fast     12.4ms +/- 3.0%    5.9ms +/- 3.8%     significant
>     1.2.2-core-nc:             2.10x as fast     12.4ms +/- 3.0%    5.9ms +/- 3.8%     significant
> 
>   prototype:                   2.12x as fast     13.8ms +/- 2.2%    6.5ms +/- 5.8%     significant
>     1.6.0.3:                   2.12x as fast     13.8ms +/- 2.2%    6.5ms +/- 5.8%     significant
> 
>   concat:                      2.18x as fast     37.3ms +/- 0.9%   17.1ms +/- 1.3%     significant
>     jquery-mootools-prototype: 2.18x as fast     37.3ms +/- 0.9%   17.1ms +/- 1.3%     significant
> 
> (Oliver mentioned on IRC that there are plans for writing a recursive descent parser. Why to implement another one if here is an already working solution?)

I just find this parser very difficult to follow -- it's possible my parser is insufficiently complete to show the complexity present in this one (my parser is still very very early in its development) but I honestly have difficulty following the code here.  Once i have more of the grammar hooked up in my parser (i have yet to get a complete parser for the expression tree written) i'll look at the complexity
Comment 74 Akos Kiss 2010-05-21 10:11:03 PDT
(In reply to comment #73)
> > (Oliver mentioned on IRC that there are plans for writing a recursive descent parser. Why to implement another one if here is an already working solution?)
> 
> I just find this parser very difficult to follow -- it's possible my parser is insufficiently complete to show the complexity present in this one (my parser is still very very early in its development) but I honestly have difficulty following the code here.  Once i have more of the grammar hooked up in my parser (i have yet to get a complete parser for the expression tree written) i'll look at the complexity

Ohh, is this serious? A whole parser will be reimplemented (how long will it take?), just because the current one (which is ready for use, and which is fast!) is complex?

I have a deja vu... It seems to me that we played the same game before. We've been working on the ARM JIT and got our solutions rejected again and again, just to find out later on that someone at Apple is already working on something similar.

Long live the open source mentality... :(

P.S.: We are happy to maintain the code we write. We have been around for 2 years, but we are still strugling to get us accepted as serious contributors...
Comment 75 Oliver Hunt 2010-05-21 10:25:38 PDT
(In reply to comment #74)
> (In reply to comment #73)
> > > (Oliver mentioned on IRC that there are plans for writing a recursive descent parser. Why to implement another one if here is an already working solution?)
> > 
> > I just find this parser very difficult to follow -- it's possible my parser is insufficiently complete to show the complexity present in this one (my parser is still very very early in its development) but I honestly have difficulty following the code here.  Once i have more of the grammar hooked up in my parser (i have yet to get a complete parser for the expression tree written) i'll look at the complexity
> 
> Ohh, is this serious? A whole parser will be reimplemented (how long will it take?), just because the current one (which is ready for use, and which is fast!) is complex?

No idea, i just need to get enough implemented to get an idea as to whether the complexity of this parser is reasonable.  We always attempt to avoid unnecessarily complex code, and this parser seems very strangely constructed.

> 
> I have a deja vu... It seems to me that we played the same game before. We've been working on the ARM JIT and got our solutions rejected again and again, just to find out later on that someone at Apple is already working on something similar.

In the case of ARM we were working on a thumb2 version and i believe the discussions between you guys and gavin led to what was a better design overall.

Anyhoo, this situation is entirely different -- I started my recursive descent parser only recently, i've only spent around 3 hours on it so far so it can't even parse an entire expression tree yet.  I also told zoltan the first time i saw him on irc again.

> 
> Long live the open source mentality... :(
> 
> P.S.: We are happy to maintain the code we write. We have been around for 2 years, but we are still strugling to get us accepted as serious contributors...

We do consider you guys to be serious contributors, we frequently talk to zoltan and ossy on irc, however we really don't like big complex code chunks landing unless we're sure that the complexity is necessary -- this applies to everyone, including other people from Apple.
Comment 76 Akos Kiss 2010-05-22 02:11:31 PDT
(In reply to comment #75)
> > I have a deja vu... It seems to me that we played the same game before. We've been working on the ARM JIT and got our solutions rejected again and again, just to find out later on that someone at Apple is already working on something similar.
> 
> In the case of ARM we were working on a thumb2 version and i believe the discussions between you guys and gavin led to what was a better design overall.

Let's say, it's a different design. (We are still investigating what we might have lost - in terms of performance - by conforming to the macro assembler.)

> > P.S.: We are happy to maintain the code we write. We have been around for 2 years, but we are still strugling to get us accepted as serious contributors...
> 
> We do consider you guys to be serious contributors, we frequently talk to zoltan and ossy on irc, however we really don't like big complex code chunks landing unless we're sure that the complexity is necessary -- this applies to everyone, including other people from Apple.

Well, I've never seen before a feature implemented twice just to check it's complexity. (Although, I can imagine other reasons for the reimplementation...)

Anyway, this is not the right forum to discuss this. And I don't want to start a flame war either. Perhaps we can have a talk about this next week in person...
Comment 77 Oliver Hunt 2010-05-22 03:07:26 PDT
(In reply to comment #76)
> (In reply to comment #75)
> > > I have a deja vu... It seems to me that we played the same game before. We've been working on the ARM JIT and got our solutions rejected again and again, just to find out later on that someone at Apple is already working on something similar.
> > 
> > In the case of ARM we were working on a thumb2 version and i believe the discussions between you guys and gavin led to what was a better design overall.
> 
> Let's say, it's a different design. (We are still investigating what we might have lost - in terms of performance - by conforming to the macro assembler.)

When accounting for performance you need to consider all the optimisations we have made that would need to be duplicated if you weren't using the macro assembler.

> 
> > > P.S.: We are happy to maintain the code we write. We have been around for 2 years, but we are still strugling to get us accepted as serious contributors...
> > 
> > We do consider you guys to be serious contributors, we frequently talk to zoltan and ossy on irc, however we really don't like big complex code chunks landing unless we're sure that the complexity is necessary -- this applies to everyone, including other people from Apple.
> 
> Well, I've never seen before a feature implemented twice just to check it's complexity. (Although, I can imagine other reasons for the reimplementation...)

I feel that the implementation attached to this patch is unnecessarily complicated, and as a project we try not to take code that is more complex than necessary.

The problem with complex patches is that they are difficult understand, and therefore difficult to improve.  This may not seem like a problem, but it means for all we know there could be substantial performance improvements to be made that will be hidden behind the complexity.

This is why we always strive for simple code, especially on the initial commit (alas complexity always creeps in over time, but starting off with "complex" just makes it worse).


> 
> Anyway, this is not the right forum to discuss this. And I don't want to start a flame war either. Perhaps we can have a talk about this next week in person...

In all honesty, the way to avoid a flame war is to come on irc and actually discuss issues as it allows comments to be elaborated upon and so reduces the likelihood of comments being misconstrued as insulting or passive aggressive, which in is all honesty what your initial comment came across as.
Comment 78 Zoltan Herczeg 2010-05-22 05:06:23 PDT
> I feel that the implementation attached to this patch is unnecessarily complicated, and as a project we try not to take code that is more complex than necessary.

This is my fault. As you may guessed, this is not my first parser, and I just throw heavily optimizations which was proven useful before without thinking about others. Probably the same thing would happen, if I have to reimplement JIT for a different project, I would just use experinces gained here to make an effective implementation, and others would not understand since they don't know the steps I skipped. The parser is fast, has small memory footprint (including machine stack), but it is true, not easy to understand. A good documentation could be useful here (and not just here, newcomers always find difficult to understand WebKit)

Anyway, the parser is not a time critical project, we should evaluate options before we decide. And share the experiences on irc :)
Comment 79 Zoltan Horvath 2010-05-22 09:14:37 PDT
Hey,

this patch only adds code to WebKit - and what it adds is disabled by default - doesn't remove anything. 
For the present, it doesn't claim additional work from other contributors.

I think it doesn't claim such a big thing to add it to the trunk. We can hack on the code complexity, but there are lots of codes in WebKit, which are understood only by a couple of people. I can’t see difference here. If somebody wants to improve this section, then he'll understand it sooner or later. With a good documentation in the trac, this can be easy enough.

If it will be in the trunk, people can try it out easily and make their decision about using it. If we will see that the community doesn't like it, there is still a chance to remove it. 

I know that WebKit is not an experimental project and we don't want to see much experimental features in WebKit, but neither is this parser. I will be glad to see this in the trunk, because I don't see any counter-arguments.

Zoltan
Comment 80 Oliver Hunt 2010-05-24 02:11:59 PDT
Created attachment 56863 [details]
Recursive descent parser

Here's my weekends work -- it's a simple recursive descent parser for JS.

It is not complete -- it's missing const, debugger, accessors in object literals, and with.  Additionally i haven't yet had a chance to run zoltan's ast dumper which will be necessary as i'm am fairly sure i've screwed up some of the line and character positions for exceptions, etc

It also doesn't have any real stack guards -- there's a stack checker there at the moment but i'm basically just using that to judge stack usage so it simply crashes out if the arbitrary limit is reached.

So with those caveats, i figure i should point out the goodness: i believe this is much more understandable and maintainable than the iterative parser -- it is also much less code, but not by as much as the patch sizes implies -- the iterative parser includes a significant amount of comments.  It is currently ~5% slower than the iterative parser, but is still much faster than bison -- there may be some trivial improvements that can be made but i'm hesistant to make claims like that as I don't believe Zoltan has done any real perf optimisations to the iterative parser either (is this correct?).

Anyhoo, I just figured i'd put this up in response to the "how long will it take?" question. Answer: 1 weekend :-D

--Oliver
Comment 81 Zoltan Herczeg 2010-05-24 02:24:07 PDT
Created attachment 56864 [details]
latest patch

Unfortunately I submitted a wrong patch into bugzilla (version 3 instead of version 4.3).
Comment 82 Zoltan Herczeg 2010-05-24 02:30:27 PDT
> I don't believe Zoltan has done any real perf optimisations to the iterative parser either (is this correct?).

True. My parser even mimics bugs and other things bison have. The focus of my patch was compatibility, and make it possible to use heavy optimizations after the switch.
Comment 83 Oliver Hunt 2010-05-24 03:00:50 PDT
(In reply to comment #82)
> > I don't believe Zoltan has done any real perf optimisations to the iterative parser either (is this correct?).
> 
> True. My parser even mimics bugs and other things bison have. The focus of my patch was compatibility, and make it possible to use heavy optimizations after the switch.

I agree -- there's so much that can be made much better both, but whichever parser we end up using we probably want to basically leave alone for a few weeks to settle in the nightlies before we start making changes to the AST.

I'm trapped in meetings from hell for the next two days so i'll probably do some more tidying of my parsing code during them and will post any significant updates.

--Oliver
Comment 84 Akos Kiss 2010-05-24 09:33:59 PDT
(In reply to comment #80)
> Anyhoo, I just figured i'd put this up in response to the "how long will it take?" question. Answer: 1 weekend :-D

Aye, but that's not fair... it's incomplete... ;)
-Akos
Comment 85 Oliver Hunt 2010-05-28 23:25:54 PDT
Created attachment 57405 [details]
More complete parser

Okay this now passes everything except one test in fast/js that is actually incorrect (the automatic semicolon insertion in yacc is incorrect -- i'm unsure what zoltan's patch does for that test).

Still has issues (ignoring perf) it uses more stack space than i would like, though it's stack usage seems comparable to spidermonkey + v8 (quite a bit better than ffx for the most part, on par with v8's).  The stack guard is not thread safe for no reason other than me being a slacker.  I've also realised that the parsing of for-loops is unnecessarily obtuse.

Once done with those i'm going to go through and fix up the rest of the squirrelly code and then see what can be done to make it nicer.
Comment 86 Darin Adler 2010-05-29 21:04:47 PDT
Comment on attachment 57405 [details]
More complete parser

> +#define parseError() do { m_error = true; return 0; } while (0)
> +#define failIfFalse(cond) do { if (!(cond)) parseError(); } while(0)
> +#define failIfTrue(cond) do { if ((cond)) parseError(); } while(0)
> +#define assertMatch(tok) ASSERT(match(tok))
> +#define checkStackLimit() do { failIfFalse(allowRecursion()); } while(0)

Why can't these all be inline member functions instead of macros?

> +// Macros to make the more common TreeBuilder types a little less verbose
> +#define TreeStatement typename TreeBuilder::Statement
> +#define TreeExpression typename TreeBuilder::Expression
> +#define TreeFormalParameterList typename TreeBuilder::FormalParameterList
> +#define TreeSourceElements typename TreeBuilder::SourceElements

Why can't these be typedefs instead of macros?
Comment 87 Oliver Hunt 2010-05-29 23:44:06 PDT
(In reply to comment #86)
> (From update of attachment 57405 [details])
> > +#define parseError() do { m_error = true; return 0; } while (0)
> > +#define failIfFalse(cond) do { if (!(cond)) parseError(); } while(0)
> > +#define failIfTrue(cond) do { if ((cond)) parseError(); } while(0)
> > +#define assertMatch(tok) ASSERT(match(tok))
> > +#define checkStackLimit() do { failIfFalse(allowRecursion()); } while(0)
> 
> Why can't these all be inline member functions instead of macros?
Because they return from the containing function, if they were functions they couldn't do this.
 
> > +// Macros to make the more common TreeBuilder types a little less verbose
> > +#define TreeStatement typename TreeBuilder::Statement
> > +#define TreeExpression typename TreeBuilder::Expression
> > +#define TreeFormalParameterList typename TreeBuilder::FormalParameterList
> > +#define TreeSourceElements typename TreeBuilder::SourceElements
> 
> Why can't these be typedefs instead of macros?
Because they don't have a single context in which they could be typedef'd -- TreeBuilder is a parameter passed to the parsing functions, not the parser itself.
Comment 88 Oliver Hunt 2010-06-06 13:27:46 PDT
Created attachment 57981 [details]
Updated parser

This parser is both speedy and complete -- it passes all tests, with the exception of the incorrect semicolon insertion test, and can run in very little stack space -- it currently uses stack guards that match the limit used by spider monkey and v8.

That said it has a real problem currently in that lazy and non-lazy parsing use different amounts of stack space, so it's possible to construct examples where a function body can be parsed lazily, but not in the actual "i'd like an AST" pass.  I'm unsure how to deal with this.

The obvious solution (making it iterative) seem highly likely to lead to the kind of convoluted code I was aiming to avoid in the first case with this patch.  Alternatively we could restrict the depth of the lazy parsing more than the full parse, but i am fairly sure that it would still be possible to induce badness even in those cases.  Another alternative would be to handle reparsing failures and produce a stack overflow, but we also need to reparse in response to an exception, so if the stack is exhausted when we go to reparse once again we could end up with badness.

WWDC week is upon us once more so i may be able to spend some time looking into solutions over the course of the week.  I'm wondering if it's possible to separate statement and expression parsing, and have two iterative parsers for less complexity than having a single merged parser.  You'd still recurse, but you'd have bound recursion -- basically statement can call the expression parser, but can't recurse into the statement parser, and the expression parser would have to be completely iterative.

I'm really not sure.

/me shrugs
Comment 89 Oliver Hunt 2010-06-06 13:33:24 PDT
(In reply to comment #88)
> Created an attachment (id=57981) [details]
> Updated parser

Basic changes in this version of the parser is to make assignment chains iterative, and remove the large number of arguments to most of the functions in order to bring a measure of sanity things.  It can now get through run-jsc-tests with 4k of stack space in 64bit i think it was.
Comment 90 Zoltan Herczeg 2010-06-17 05:00:08 PDT
Created attachment 58980 [details]
faster parser

After a long silence, I decided to improve my parser a little because the Qt guys are interested in it:
https://lists.webkit.org/pipermail/webkit-qt/2010-June/000632.html

This is a great news for me, and would like to thank them again.

Practically I reordered some parts of the parser, and improved the switches a little. Seems "prototype" liked it.

TEST                           COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:                   1.033x as fast    34.4ms +/- 2.0%   33.3ms +/- 1.8%     significant

=============================================================================

  jquery:                      -                  5.3ms +/- 6.5%    5.2ms +/- 5.8%
    1.3.2:                     -                  5.3ms +/- 6.5%    5.2ms +/- 5.8%

  mootools:                    ??                 5.3ms +/- 6.5%    5.4ms +/- 6.8%     not conclusive: might be *1.019x as slow*
    1.2.2-core-nc:             ??                 5.3ms +/- 6.5%    5.4ms +/- 6.8%     not conclusive: might be *1.019x as slow*

  prototype:                   1.097x as fast     6.8ms +/- 4.4%    6.2ms +/- 4.9%     significant
    1.6.0.3:                   1.097x as fast     6.8ms +/- 4.4%    6.2ms +/- 4.9%     significant

  concat:                      1.030x as fast    17.0ms +/- 0.0%   16.5ms +/- 2.3%     significant
    jquery-mootools-prototype: 1.030x as fast    17.0ms +/- 0.0%   16.5ms +/- 2.3%     significant
Comment 91 Oliver Hunt 2010-06-17 09:50:33 PDT
(In reply to comment #90)
> Created an attachment (id=58980) [details]
> faster parser
> 
> After a long silence, I decided to improve my parser a little because the Qt guys are interested in it:
> https://lists.webkit.org/pipermail/webkit-qt/2010-June/000632.html
> 
> This is a great news for me, and would like to thank them again.
> 
> Practically I reordered some parts of the parser, and improved the switches a little. Seems "prototype" liked it.

The issue with the iterative parser isn't speed -- it was already very fast, the issue is maintainability, the code is very difficult to follow and it's hard to see how that can be improved.

I'll attach a slightly improved version of my parser in a bit, but i have been distracted by other stuff recently :-/
Comment 92 Zoltan Herczeg 2010-06-17 23:44:32 PDT
> The issue with the iterative parser isn't speed -- it was already very fast, the issue is maintainability, the code is very difficult to follow and it's hard to see how that can be improved.

I think my code is not that bad as the number of "maintability" words imply. It is a state machine, kind of a vertical solution if your code is the horizontal.

Anyway, although we have a lot of guys here who is an expert on code quality, I never heard that there is a high correlation between code quality and successful projects.
Comment 93 Geoffrey Garen 2010-06-18 12:30:01 PDT
(In reply to comment #92)
> I never heard that there is a high correlation between code quality and successful projects.

Code quality is an explicitly state goal of the WebKit project. (See http://webkit.org/projects/goals.html.) If you believe that the WebKit project should change its goals to exclude code quality, please bring the issue up on webkit-dev.

For the purposes of this bug, though, I propose working within the WebKit project goals as they currently stand.
Comment 94 Akos Kiss 2010-06-18 12:48:00 PDT
(In reply to comment #93)
> (In reply to comment #92)
> > I never heard that there is a high correlation between code quality and successful projects.
> 
> Code quality is an explicitly state goal of the WebKit project. (See http://webkit.org/projects/goals.html.) If you believe that the WebKit project should change its goals to exclude code quality, please bring the issue up on webkit-dev.
> 
> For the purposes of this bug, though, I propose working within the WebKit project goals as they currently stand.

I do think that it's been an unfortunate comment only. Certainly, we don't want to lower but to increase the quality of the code of WebKit.

Where we not necessarily agree is whether Zoltan's version is more _complex_ and/or less _maintainable_ than Oliver's implementation. But hopefully both pathes are of high _quality_. Thus, should any of those be selected for landing, WebKit can only benefit from it.
Comment 95 Zoltan Herczeg 2010-06-18 14:16:30 PDT
> Code quality is an explicitly state goal of the WebKit project. (See http://webkit.org/projects/goals.html.) If you believe that the WebKit project should change its goals to exclude code quality, please bring the issue up on webkit-dev.
> 
> For the purposes of this bug, though, I propose working within the WebKit project goals as they currently stand.

Ok, that comment was too much, I know, but keep hearing this "unmaintainable" mantra makes me feel that I have no clue of software developing. I still think that my work is not that bad, and you could help me to improve it.
Comment 96 Oliver Hunt 2010-06-23 18:17:12 PDT
Created attachment 59594 [details]
Patch
Comment 97 WebKit Review Bot 2010-06-23 18:23:01 PDT
Attachment 59594 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1
Last 3072 characters of output:
itespace/indent] [4]
JavaScriptCore/parser/JSParser.cpp:1054:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
JavaScriptCore/parser/JSParser.cpp:1056:  A case label should not be indented, but line up with its switch statement.  [whitespace/indent] [4]
JavaScriptCore/parser/JSParser.cpp:1157:  A case label should not be indented, but line up with its switch statement.  [whitespace/indent] [4]
JavaScriptCore/parser/JSParser.cpp:1165:  More than one command on the same line  [whitespace/newline] [4]
JavaScriptCore/parser/JSParser.cpp:1208:  One space before end of line comments  [whitespace/comments] [5]
JavaScriptCore/parser/JSParser.cpp:1208:  Should have a space between // and comment  [whitespace/comments] [4]
JavaScriptCore/parser/JSParser.cpp:1267:  A case label should not be indented, but line up with its switch statement.  [whitespace/indent] [4]
JavaScriptCore/parser/JSParser.cpp:1319:  One line control clauses should not use braces.  [whitespace/braces] [4]
JavaScriptCore/parser/JSParser.cpp:1378:  A case label should not be indented, but line up with its switch statement.  [whitespace/indent] [4]
JavaScriptCore/parser/JSParser.cpp:1442:  A case label should not be indented, but line up with its switch statement.  [whitespace/indent] [4]
JavaScriptCore/parser/JSParser.cpp:1463:  A case label should not be indented, but line up with its switch statement.  [whitespace/indent] [4]
JavaScriptCore/parser/ASTBuilder.h:37:  Missing space before {  [whitespace/braces] [5]
JavaScriptCore/parser/ASTBuilder.h:60:  Missing space before {  [whitespace/braces] [5]
JavaScriptCore/parser/ASTBuilder.h:134:  More than one command on the same line  [whitespace/newline] [4]
JavaScriptCore/parser/ASTBuilder.h:140:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
JavaScriptCore/parser/ASTBuilder.h:759:  A case label should not be indented, but line up with its switch statement.  [whitespace/indent] [4]
JavaScriptCore/parser/ASTBuilder.h:846:  An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
JavaScriptCore/parser/ASTBuilder.h:855:  An else statement can be removed when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
JavaScriptCore/parser/Lexer.cpp:29:  Alphabetical sorting problem.  [build/include_order] [4]
JavaScriptCore/parser/JSParser.h:110:  first_line is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
JavaScriptCore/parser/JSParser.h:111:  last_line is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
JavaScriptCore/parser/JSParser.h:112:  first_column is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
JavaScriptCore/parser/JSParser.h:113:  last_column is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
Total errors found: 38 in 17 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 98 Oliver Hunt 2010-06-23 19:11:43 PDT
Created attachment 59603 [details]
Patch
Comment 99 WebKit Review Bot 2010-06-23 19:15:14 PDT
Attachment 59603 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1
JavaScriptCore/parser/SyntaxChecker.h:148:  More than one command on the same line  [whitespace/newline] [4]
JavaScriptCore/parser/SyntaxChecker.h:149:  More than one command on the same line  [whitespace/newline] [4]
JavaScriptCore/parser/SyntaxChecker.h:153:  More than one command on the same line  [whitespace/newline] [4]
JavaScriptCore/parser/JSParser.cpp:35:  Alphabetical sorting problem.  [build/include_order] [4]
JavaScriptCore/parser/JSParser.cpp:36:  Alphabetical sorting problem.  [build/include_order] [4]
JavaScriptCore/parser/Lexer.cpp:29:  Alphabetical sorting problem.  [build/include_order] [4]
JavaScriptCore/parser/JSParser.h:110:  first_line is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
JavaScriptCore/parser/JSParser.h:111:  last_line is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
JavaScriptCore/parser/JSParser.h:112:  first_column is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
JavaScriptCore/parser/JSParser.h:113:  last_column is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
Total errors found: 10 in 17 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 100 Gavin Barraclough 2010-06-23 21:12:19 PDT
Comment on attachment 59603 [details]
Patch

comparing Identifer == char* in createGetterOrSetterProperty makes me a little sad.

bool consume(int expected)

This looks a little screwy?

if (!result) {
    m_error = result;

is this what you really mean here? - if so, it's probably clearer written as:


if (!result) {
    m_error = false;

or is this inverted? - if so, I think you want


if (!result)
    fail();

?

 202 };
 203 int jsParse(JSGlobalData* globalData)

pls to be add a newline here.

 229 }
 230 
 231 
 232 bool JSParser::parseProgram()


pls to be remove a newline here. ;-D

 597     consumeOrFail(CLOSEBRACE);
 598     return context.createSwitchStatement(expr, firstClauses, defaultClause, secondClauses, startLine, endLine);
 599 
 600 }

and again.

#define ENABLE_RECURSIVE_PARSE 1

will this go away completely in your next patch? - I strongly hope so.

r+, please please please to be removing the old parser asap.
Comment 101 Oliver Hunt 2010-06-23 21:19:48 PDT
Committed r61732: <http://trac.webkit.org/changeset/61732>
Comment 102 WebKit Review Bot 2010-06-23 21:44:28 PDT
http://trac.webkit.org/changeset/61732 might have broken Qt Linux Release
Comment 103 Eric Seidel 2010-06-23 23:08:11 PDT
Seems to have borked lots of bots. :(
Comment 104 Csaba Osztrogonác 2010-06-24 06:25:33 PDT
(In reply to comment #103)
> Seems to have borked lots of bots. :(

Tiger is still dead :(( 
http://build.webkit.org/builders/Tiger%20Intel%20Release/builds/13588