Bug 32722 - Allow Reserved Words in Property Accessors
Summary: Allow Reserved Words in Property Accessors
Status: RESOLVED DUPLICATE of bug 42471
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords: ES5
Depends on:
Blocks:
 
Reported: 2009-12-18 10:28 PST by Joseph Pecoraro
Modified: 2010-12-07 08:11 PST (History)
11 users (show)

See Also:


Attachments
attempted patch (7.52 KB, patch)
2010-01-06 10:25 PST, Patrick Mueller
no flags Details | Formatted Diff | Diff
attempted patch (12.28 KB, patch)
2010-01-06 11:45 PST, Patrick Mueller
no flags Details | Formatted Diff | Diff
attempted patch with ChangeLog entries this time (13.52 KB, patch)
2010-01-06 11:56 PST, Patrick Mueller
darin: review-
Details | Formatted Diff | Diff
patch so far (14.51 KB, patch)
2010-01-08 13:25 PST, Patrick Mueller
no flags Details | Formatted Diff | Diff
refactored parser changes (14.89 KB, patch)
2010-01-13 14:13 PST, Patrick Mueller
no flags Details | Formatted Diff | Diff
2010-01-21-a patch (14.98 KB, patch)
2010-01-21 09:24 PST, Patrick Mueller
abarth: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Joseph Pecoraro 2009-12-18 10:28:54 PST
Currently JavaScriptCore does not allow using reserved words as member properties:

  jsc> var x = { 'import': 1 }
  jsc> x.import
  Exception: SyntaxError: Parse error

Other browsers seem to allow this right now. The above works in Chrome 4, FF 3.5, and Opera 10.

The Grammar rules are different. Notice that the last step is different (IdentifierName instead of Identifier).

ES3:

  MemberExpression
    → MemberExpression . Identifier
  
  CallExpression
    → CallExpression . Identifier

ES5:

  MemberExpression
    → MemberExpression . IdentifierName
  
  CallExpression
    → CallExpression . IdentifierName


Workaround: (this is what JSON ended up doing)

  jsc> var x = { 'import': 1 };
  jsc> x['import']; // 1
Comment 1 Joseph Pecoraro 2009-12-18 10:34:44 PST
There is specifically a  "Property Accessors" section of each specification. It is "11.2.1 Property Accessors" in both.
Comment 2 Patrick Mueller 2010-01-06 10:25:50 PST
Created attachment 45967 [details]
attempted patch 

It's been forever since I played with bison, and I never played with it much to begin with.  But thought I'd look into this bug anyway.  

I've patched the grammar to add the required support - it passes the JSC tests, and worked for a simple command-line JSC session using the built jsc:

   >  x = { true: 2 }
   [object Object]
   > x.true
   2

My guess is that this will inflate the parser runtime a bit, since presumably the string data for the existing keywords / reserved words now have to track their values (presumably didn't before).  Thought I'd go ahead and post the patch to see how kosher it looks though.

I'll go ahead and create some actual tests for this as well.
Comment 3 Patrick Mueller 2010-01-06 11:45:10 PST
Created attachment 45975 [details]
attempted patch

added a LayoutTest test case
Comment 4 Patrick Mueller 2010-01-06 11:47:04 PST
Woops, forgot the ChangeLog entry in that patch.

Also wanted to note that the current patch also addresses bug 32721.
Comment 5 WebKit Review Bot 2010-01-06 11:54:30 PST
style-queue ran check-webkit-style on attachment 45975 [details] without any errors.
Comment 6 Patrick Mueller 2010-01-06 11:56:50 PST
Created attachment 45976 [details]
attempted patch with ChangeLog entries this time

added ChangeLog entries
Comment 7 WebKit Review Bot 2010-01-06 12:00:05 PST
style-queue ran check-webkit-style on attachment 45976 [details] without any errors.
Comment 8 Darin Adler 2010-01-06 12:18:00 PST
Comment on attachment 45976 [details]
attempted patch with ChangeLog entries this time

> +%type <ident>           IdentifierNamesvn-diff

This can't be right!

I don't think the name IdentifierName expresses the "all identifiers including reserved words" concept.

review- because of the "svn-diff" thing
Comment 9 Patrick Mueller 2010-01-06 12:25:22 PST
freaking editor focus!  The svn-diff must have snuck in at the last minute somehow.  Drat!

I chose "IdentifierName" per the name Joe pasted in from the ES5 spec in the bug description.  Of course, I don't have a problem with a different name.  Suggestions?

IdentifierOrReserved perhaps?
Comment 10 Maciej Stachowiak 2010-01-06 13:48:25 PST
(In reply to comment #9)
> freaking editor focus!  The svn-diff must have snuck in at the last minute
> somehow.  Drat!
> 
> I chose "IdentifierName" per the name Joe pasted in from the ES5 spec in the
> bug description.  Of course, I don't have a problem with a different name. 
> Suggestions?
> 
> IdentifierOrReserved perhaps?

IdentifierOrReserved sounds good.
Comment 11 Patrick Mueller 2010-01-08 08:31:49 PST
Thought I'd note I'm trying to do some benchmarking to see if the change impacts performance (speed and memory usage).

So far, I've done some runs of SunSpider, as well as a home-built bench of taking the text of uncompressed jquery.js, turning it into an assignment of the outer wrapper function to a variable, and copying that bit in the same file till the file reached 5M in size.  Benched time using "time".

In both cases, I didn't see any significant differences in speed - the numbers were fairly variable to begin with.  Would like something more precise though.  Ideas?
Comment 12 Patrick Mueller 2010-01-08 13:25:59 PST
Created attachment 46153 [details]
patch so far

now using IdentifierOrReserved instead of IdentifierName.

Still looking to bench time/space differences with and without the patch.  Or I'll give up and put in review mode next week if I can't find anything, since it looks generally kosher.  And waiting for olliej to look at this also.
Comment 13 Patrick Mueller 2010-01-11 11:50:17 PST
Did some more performance testing today.  Creating two WebKit directories, built with and without the grammar update.  Then constructed a JS file many-jquerys.js which consisted of many copies of the statement:

   x = function() {
      [contents of jquery.js]
   }

Validated that this script file was parseable.  Then created another js file called wait.js with the contents:
   
   print("press enter to continue")
   readline()

then ran jsc with both builds as 

   ./WebKitTools/Scripts/run-jsc .many-jquerys.js wait.js

I then ran "heap" on the resulting processes, which would have parsed the large file, and then hung waiting at the readline.

The diff between the two outputs of heap resulted in a very small memory diff - < 100 bytes - between the two runs.

I also brought up Activity Monitor and did an "inspect" of both processes.  Likewise, no significant differences.

The size difference between the JavaScriptCore files was:

   2358888 after
   2357552 before

The difference in sizes between Grammar.o are comparable.

Concluding that there is an insignificant impact on memory use with the new grammar.
Comment 14 Patrick Mueller 2010-01-11 12:26:39 PST
I wasn't able to figure out how to use any of the standard Mac perf tools to micro-bench the new parser to see how it compared, so I just did a test with "time" instead.  Took the big many-jquerys.js file I used per the comment above, and passed it as an argument 5 times to jsc and prefixed the whole thing with "time".  The "real" time was on the order of 1.5 seconds, and there didn't seem to be any signficant different between multiple runs of both - both generated numbers +/- 2% different on back-to-back runs.

At this point, I'm confident to the best of my abilities there aren't any significant time/space differences with the new grammar.
Comment 15 Patrick Mueller 2010-01-11 12:30:08 PST
Comment on attachment 46153 [details]
patch so far

Marking current patch for review.  Olliej was a requested reviewer from someone on IRC.
Comment 16 Darin Adler 2010-01-11 13:55:45 PST
Comment on attachment 46153 [details]
patch so far

Everything looks fine to me. I think Oliver may want to double check your results on performance testing, but otherwise seems good to go.

I would like the reserved words to be in some canonical order. The order in %token is not the same as the order in the IdentifierOrReserved production. I suggest using alphabetical order in IdentifierOrReserved (leaving IDENT at the top).
Comment 17 Oliver Hunt 2010-01-11 14:25:24 PST
SunSpider actually has a parsing test -- run-sunspider --suite="parse-only" i
just fixed it in r53099 can you do your tests with that?
Comment 18 Patrick Mueller 2010-01-12 09:10:56 PST
(In reply to comment #17)
> SunSpider actually has a parsing test -- run-sunspider --suite="parse-only" i
> just fixed it in r53099 can you do your tests with that?

I did clean rebuids with and without, and then ran the parse-only sunspider suite.  the individual runs resulted in numbers that varied enough that the results didn't appear significant (sometimes "new" was better, but ~usually~ "old" was better).  So I ran the parse tests 20 times in a row, read the results, and calculated the mean and stdev the same way sunspider does (throw out first run).  Here are the results:

before grammar change:

                  mootools-1.2.2-core-nc  mean:   7.9  stdev:   1.9
                            jquery-1.3.2  mean:   7.7  stdev:   1.6
                       prototype-1.6.0.3  mean:   9.1  stdev:   2.6
        concat-jquery-mootools-prototype  mean:  24.8  stdev:   2.4


after grammar change:

                  mootools-1.2.2-core-nc  mean:   7.9  stdev:   1.8
                            jquery-1.3.2  mean:   7.8  stdev:   2.1
                       prototype-1.6.0.3  mean:   9.7  stdev:   3.4
        concat-jquery-mootools-prototype  mean:  26.4  stdev:   4.3

the grammar change seems slower, though the stdev is larger for the slower results as well.

No chance we can get a higher resolution timer, is there?
Comment 19 Oliver Hunt 2010-01-12 10:11:55 PST
(In reply to comment #18)
> (In reply to comment #17)
> > SunSpider actually has a parsing test -- run-sunspider --suite="parse-only" i
> > just fixed it in r53099 can you do your tests with that?
> 
> I did clean rebuids with and without, and then ran the parse-only sunspider
> suite.  the individual runs resulted in numbers that varied enough that the
> results didn't appear significant (sometimes "new" was better, but ~usually~
> "old" was better).  So I ran the parse tests 20 times in a row, read the
> results, and calculated the mean and stdev the same way sunspider does (throw
> out first run).  Here are the results:
> 
> before grammar change:
> 
>                   mootools-1.2.2-core-nc  mean:   7.9  stdev:   1.9
>                             jquery-1.3.2  mean:   7.7  stdev:   1.6
>                        prototype-1.6.0.3  mean:   9.1  stdev:   2.6
>         concat-jquery-mootools-prototype  mean:  24.8  stdev:   2.4
> 
> 
> after grammar change:
> 
>                   mootools-1.2.2-core-nc  mean:   7.9  stdev:   1.8
>                             jquery-1.3.2  mean:   7.8  stdev:   2.1
>                        prototype-1.6.0.3  mean:   9.7  stdev:   3.4
>         concat-jquery-mootools-prototype  mean:  26.4  stdev:   4.3
> 
> the grammar change seems slower, though the stdev is larger for the slower
> results as well.
> 
> No chance we can get a higher resolution timer, is there?

put --runs 30 into the commandline, make sure you're not running anything else, and use sunspider-compare-results to do the actual comparison
Comment 20 Patrick Mueller 2010-01-12 10:56:05 PST
(In reply to comment #19)

> put --runs 30 into the commandline, make sure you're not running anything else,
> and use sunspider-compare-results to do the actual comparison

results per those instructions (actually ran 50 times instead of 30):

TEST                           COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:                   *1.025x as slow*  48.3ms +/- 1.3%   49.5ms +/- 1.6%     significant

=============================================================================

  jquery:                      ??                 7.4ms +/- 2.1%    7.5ms +/- 2.1%     not conclusive: might be *1.011x as slow*
    1.3.2:                     ??                 7.4ms +/- 2.1%    7.5ms +/- 2.1%     not conclusive: might be *1.011x as slow*

  mootools:                    ??                 7.8ms +/- 3.1%    8.0ms +/- 7.1%     not conclusive: might be *1.033x as slow*
    1.2.2-core-nc:             ??                 7.8ms +/- 3.1%    8.0ms +/- 7.1%     not conclusive: might be *1.033x as slow*

  prototype:                   ??                 8.7ms +/- 5.1%    8.8ms +/- 2.4%     not conclusive: might be *1.007x as slow*
    1.6.0.3:                   ??                 8.7ms +/- 5.1%    8.8ms +/- 2.4%     not conclusive: might be *1.007x as slow*

  concat:                      *1.034x as slow*  24.4ms +/- 1.7%   25.2ms +/- 1.8%     significant
    jquery-mootools-prototype: *1.034x as slow*  24.4ms +/- 1.7%   25.2ms +/- 1.8%     significant

All these recent tests indicate: the new grammar is slower.  I assume I should look for alternatives at this point?
Comment 21 Oliver Hunt 2010-01-12 10:58:22 PST
> All these recent tests indicate: the new grammar is slower.  I assume I should
> look for alternatives at this point?

You could try making the tests larger, then using the --profiling and  --shark or --shark20 flags on run-sunspider to see if shows any unreasonably hot points?
Comment 22 Patrick Mueller 2010-01-12 13:26:51 PST
(In reply to comment #21)
> > All these recent tests indicate: the new grammar is slower.  I assume I should
> > look for alternatives at this point?
> 
> You could try making the tests larger, then using the --profiling and  --shark
> or --shark20 flags on run-sunspider to see if shows any unreasonably hot
> points?

Thanks for the pointers.  A little worried though - I really would need to diff such a profile run against the current parser, because I'm looking for hot-spots in the diffs.  And then I'm worried I may not be able to track such a hot-spot back to the Grammar.y file.

I think what I'm going to do first is actually figure out more about how the grammar actually works, and see if it can be systematically fixed.  For example, I'm assuming now that there are more look ups for these "property" accesses to see if the property is actually a reserved word, when in fact ... it doesn't matter.  If it's an identifier, then it's legal, doesn't actually matter if it's a reserved word or not.  Even worse, using reserved words as property values is the exception rather than the rule, and we haven't even benched using reserved words as properties yet!  Presumably, cooking thoughts like into the grammar would result in a faster parser - less work to do at the property parsing points.  Time to crack open "Bison for Dummies!"
Comment 23 Patrick Mueller 2010-01-13 14:13:20 PST
Created attachment 46505 [details]
refactored parser changes

I played with the grammar some more, after reading the Bison doc.  Main goal was to split out the IdentifierOrReserved patterns into two, in hopes it would require less work to match the common cases.  Also, keeping the semantic type of the keywords undefined, except in the case of the new Reserved non-terminal.  I think it was a little successful.  SunSpider runs were generally faster than with the previous patch, but also didn't quite get as fast as the original parser.  At this point, I think I'll do some profiling of the parser to see if anything shows up.
Comment 24 Patrick Mueller 2010-01-19 13:36:54 PST
I finally got around to profiling the parser, but didn't see anything obvious going on.

I created a new parser-only test which was the big concat sample, only 10x bigger (copied the text 9 times).  Saw the same sort of perf differences.  About 1% slower.

I started looking into the generated Grammar.cpp file, comparing the sizes of the tables; I'm left thinking that "patch so far", which created a non-terminal named IdentOrReserved, is a better approach than the split/duplicate way it was done in "refactored parser changes".  The parse table were bigger, the speed difference seemed to be insigificant ("refactored parser changes" seemed to run a bit faster, but it's hard to tell), there's by definition more code, and there are duplicate rules to keep in sync in the .y file.

In particular, I was worried about assigning a type of <ident> to the keywords, but this had absolutely no effect on the generated code - it's a form of type-checking for bison, it appears.

I'm left with assuming there's no way that I can get the speed back and add the new capability.

On the plus side, during my research, I dug up this: Bug 33860, which seems like something interesting to try.
Comment 25 Patrick Mueller 2010-01-21 09:24:13 PST
Created attachment 47127 [details]
2010-01-21-a patch

attempt at a final patch:

- seems like a general 1% increase in parse time across the board

- reverrted back to "patch so far" styled change for the property rules - causes an additional reduce step by the compiler but doesn't require the code duplication for having the duplicate IDENT/IdentifierOrReserved rules

- sorted the token lists in the %token specs and IdentifierOrReserved definition.

- added LayoutTests/fast/js/reserved-words-as-properties-expected.txt to test using all keywords/reserved words as properties
Comment 26 Darin Adler 2010-01-21 11:42:32 PST
I don’t think an increase in parse time is acceptable. Can we find a way to avoid that?
Comment 27 Patrick Mueller 2010-01-21 13:21:24 PST
(In reply to comment #26)
> I don’t think an increase in parse time is acceptable. Can we find a way to
> avoid that?

Unrelated to this bug, I opened the following while googling for optimizing yacc grammars: Bug 33860.  I was hoping that perhaps we could pay for this parser cost with savings from that bug.

It's not clear to me that we can always get away with not increasing the parse time for grammar changes.  Not a happy story, but maybe just the way it is.

I should also note that I am in no way an expert on yacc-style parser building.  I've exhausted the exploration paths that I was aware of though.
Comment 28 Darin Adler 2010-01-21 14:08:11 PST
Recently Zoltan said he was working on a custom-written parser to replace the Bison one.
Comment 29 Patrick Mueller 2010-01-22 05:11:47 PST
(In reply to comment #28)
> Recently Zoltan said he was working on a custom-written parser to replace the
> Bison one.

That would obviously make this patch moot.  But the test case can be reused.
Comment 30 Zoltan Herczeg 2010-01-22 08:16:12 PST
(In reply to comment #29)
> (In reply to comment #28)
> > Recently Zoltan said he was working on a custom-written parser to replace the
> > Bison one.
> 
> That would obviously make this patch moot.  But the test case can be reused.

Yeah, I am working on a new parser. The patch is 270k now, 110k to remove grammar.y, but the remaining is new code. Will be a reviewers nightmare :D There is 271 regressions so far, because there are still some complicated JS statement is not yet finished (namely: try, switch, object-literal). And still I am not sure what the ArgumentsFeature means. There is another 55k patch which dumps the AST to a text file which is used for regression testing (although could be interesting for the others to actually see the tree). The hand written parser is ~ 3x faster for regular code, and ~ 4x faster for syntax checking (was NoNode rules in grammar.y). Would be also good to have syntax error regression tests (like: 3 = 2 is ok, while -3 = 2 is parse error, 'do ; while(0) ++i is' ok, but 'for (var a, b in c) {}' is a parse error.). Actually the 'for' was the worst to parse, since it has so many valid and invalid forms.

I didn't know that keywords can be used after a dot, and in the object literal. I will probably need a keyword to ident converter, and append all of these tokens to CommonIdentifiers.
Comment 31 Oliver Hunt 2010-01-22 11:16:57 PST
(In reply to comment #30)
> (In reply to comment #29)
> > (In reply to comment #28)
> > > Recently Zoltan said he was working on a custom-written parser to replace the
> > > Bison one.
> > 
> > That would obviously make this patch moot.  But the test case can be reused.
> 
> Yeah, I am working on a new parser. The patch is 270k now, 110k to remove
> grammar.y, but the remaining is new code. Will be a reviewers nightmare :D
> There is 271 regressions so far, because there are still some complicated JS
> statement is not yet finished (namely: try, switch, object-literal). And still
> I am not sure what the ArgumentsFeature means. There is another 55k patch which
> dumps the AST to a text file which is used for regression testing (although
> could be interesting for the others to actually see the tree). The hand written
> parser is ~ 3x faster for regular code, and ~ 4x faster for syntax checking
> (was NoNode rules in grammar.y). Would be also good to have syntax error
> regression tests (like: 3 = 2 is ok, while -3 = 2 is parse error, 'do ;
> while(0) ++i is' ok, but 'for (var a, b in c) {}' is a parse error.). Actually
> the 'for' was the worst to parse, since it has so many valid and invalid forms.
> 
> I didn't know that keywords can be used after a dot, and in the object literal.
> I will probably need a keyword to ident converter, and append all of these
> tokens to CommonIdentifiers.
It would be good if you could put your patch up for feedback, rather than writing the entire thing in a vacuum - I had been planning on rewriting the parser in a recursive descent form as that's necessary for strict mode support so it would be good to see what your hand rolled parser is doing.
Comment 32 Zoltan Herczeg 2010-01-22 13:45:25 PST
See: https://bugs.webkit.org/show_bug.cgi?id=34019
Comment 33 Adam Barth 2010-03-08 10:53:42 PST
Comment on attachment 47127 [details]
2010-01-21-a patch

According to Darin, we don't want this patch because it's a 1% perf regression.  Also, we're re-writing the JavaScript parser.  There's no need for this patch to be sitting in pending-review.
Comment 34 Zoltan Herczeg 2010-07-16 11:35:39 PDT

*** This bug has been marked as a duplicate of bug 42471 ***
Comment 35 Darin Adler 2010-12-07 08:11:56 PST
Someone recently pointed to this bug. Note that it’s a duplicate. See bug 42471 for how this was ultimately resolved.