Bug 20031 - Implement ES 3.1 JSON object
: Implement ES 3.1 JSON object
Status: RESOLVED FIXED
: WebKit
JavaScriptCore
: 528+ (Nightly build)
: All All
: P2 Normal
Assigned To:
: http://wiki.ecmascript.org/doku.php?i...
: ES5
: 26249 26587 26592
: 25419
  Show dependency treegraph
 
Reported: 2008-07-13 21:29 PST by
Modified: 2011-04-27 03:24 PST (History)


Attachments
Work-in-progress: Step 1: Add skeleton JSON object (18.69 KB, patch)
2008-07-14 02:53 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff
Work-in-progress: Step 2: Beginnings of JSON parser (12.50 KB, patch)
2008-07-14 02:55 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff
Work-in-progress: Step 2: Beginnings of JSON parser (12.79 KB, patch)
2008-07-14 03:10 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff
Work-in-progress: Step 2: JSON parser front-end (14.48 KB, patch)
2008-07-14 15:54 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff
Work-in-progress: Step 3: Parser frond-end tests (19.44 KB, patch)
2008-07-14 15:57 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff
Work-in-progress: Step 4: Parser back-end with tests (34.19 KB, patch)
2008-07-15 01:01 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff
Patch to Add JSON Object (101.96 KB, patch)
2009-03-21 19:02 PST, Husam Senussi
eric: review-
Review Patch | Details | Formatted Diff | Diff
JSON test cases pacth (40.67 KB, patch)
2009-05-13 16:01 PST, Husam Senussi
oliver: review-
Review Patch | Details | Formatted Diff | Diff
Patch to add JSON object - remove tabs (61.57 KB, patch)
2009-05-13 16:28 PST, Husam Senussi
oliver: review-
Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2008-07-13 21:29:43 PST
ECMAScript 3.1 provides support for native parsing and stringifying of JSON objects.  It looks like IE8 and Firefox 3.1 are going to support this as well.  Maciej wrote on es4-discuss that we would gladly accept patches for ES 3.1 features.  Let's get to it!
------- Comment #1 From 2008-07-13 21:31:15 PST -------
This will likely take more that one patch, so it might make sense to use this as a meta bug.  I'm interested in making this happen, so I thought I'd start with the easy bits.  :)
------- Comment #2 From 2008-07-14 02:53:05 PST -------
Created an attachment (id=22263) [details]
Work-in-progress: Step 1: Add skeleton JSON object

I'm not sure if we want to start landing these patches until we get some level of basic functionality.
------- Comment #3 From 2008-07-14 02:55:11 PST -------
Created an attachment (id=22264) [details]
Work-in-progress: Step 2: Beginnings of JSON parser

This parser knows how to eat JSON and how not to eat non-JSON.  The next step is to actually build a JSObject.
------- Comment #4 From 2008-07-14 03:10:27 PST -------
Created an attachment (id=22265) [details]
Work-in-progress: Step 2: Beginnings of JSON parser

Fix a few bugs.
------- Comment #5 From 2008-07-14 15:54:44 PST -------
Created an attachment (id=22271) [details]
Work-in-progress: Step 2: JSON parser front-end

Here is a reasonable front-end for the JSON parser.  We're still missing the back end.
------- Comment #6 From 2008-07-14 15:57:32 PST -------
Created an attachment (id=22272) [details]
Work-in-progress: Step 3: Parser frond-end tests

Tests for the parser front end.  These test cases come from Firefox.  Should these tests be in the JavaScriptCore/tests directory?  I wasn't sure (1) how to add tests there or (2) how to get these exact characters to throw at the parser without XMLHttpRequest.
------- Comment #7 From 2008-07-14 17:16:02 PST -------
(From update of attachment 22263 [details])
We usually don't use a static property map in JSC for things like this, and instead put the functions directly in the Object it self, see DateConstructor for an example of how do this. 
------- Comment #8 From 2008-07-15 01:01:51 PST -------
Created an attachment (id=22277) [details]
Work-in-progress: Step 4: Parser back-end with tests

Here is a back-end for the parser.  Sam, I'll address your comment next time I rebase the whole patch stream.  I'm going to ask Doug if he has more test cases.

Here is the Firefox native JSON bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=408838
------- Comment #9 From 2008-07-15 02:46:51 PST -------
(From update of attachment 22271 [details])
I didn't attempt to really review this (and the patch is not up for review anyway :) ), but here are some comments I had while skimming over.

+static inline void ws(State* state)

This definitely needs a better name, and ideally, should share code with other isWhitespace/skipWhitespace functions we have (e.g. Lexer::isWhiteSpace, isASCIIWhitespace).

+    return paddedLiteral(state, L'[');

On the Mac, wchar_t is 32-bit, so we cannot use this type and its associated compiler features.

+    static const UChar falseLiteral[] = { 'f', 'a', 'l', 's', 'e', '\0' };

I'm not 100% sure, but I suspect that this array may be initialized at runtime, not compile time. If so, this initialization is not thread safe.

+        while(digit(state))

There should be a space after while.

A general question: what is the reason for writing a custom parser, and not using bison?
------- Comment #10 From 2008-07-15 03:27:38 PST -------
I discussed this with Sam, Geoff and Oliver at the office today. Most of us agreed that using the existing JavaScriptCore lexer along with a bison grammar would be a better approach than an ad-hoc custom parser. Geoff also argued that it would be better to use the full JS parser and then study the syntax tree to ensure that it has no JSON violations, but the rest of us thought the existing lexer plus a custom grammar would work better. (In theory though, I guess it could create the same kinds of AST nodes as the full parser and thus the back end could just consist of compiling and executing the resulting syntax tree. Or alternately the grammar could directly create JS values and combine them, but there might be GC problems with this approach).
------- Comment #11 From 2008-07-17 03:59:29 PST -------
> I discussed this with Sam, Geoff and Oliver at the office today. Most of us
> agreed that using the existing JavaScriptCore lexer along with a bison grammar
> would be a better approach than an ad-hoc custom parser.

Unfortunately, we can't reuse the existing lexer because JSON has different lexing rules.  For example, JSON only supports some of JavaScript string escape sequences.

> Geoff also argued that
> it would be better to use the full JS parser and then study the syntax tree to
> ensure that it has no JSON violations, but the rest of us thought the existing
> lexer plus a custom grammar would work better.

Just looking at the syntax tree is far too late.  There are many requirements of the JSON RFC that are lost at that point.

I've made some amount of progress with a bison grammar for a new lexer.  I'll post a patch once I get sufficiently far to be able to handle simple examples.  It is unclear whether this approach will yield a simpler parser, but the result may be more maintainable as you folks are already used to maintaining a bison parser.
------- Comment #12 From 2008-07-24 00:06:00 PST -------
I've gotten distracted with other things.  It will probably be a bit before I can work on this more.
------- Comment #13 From 2008-10-15 07:28:52 PST -------
It might be good to have a common parser that is used both from script and C++ side, for things like Inspector JSON transport (http://trac.webkit.org/wiki/ProposedWebInspectorRearchitecting).

I am interested in helping out with this patch, but may need guidance :)
------- Comment #14 From 2008-12-11 00:51:12 PST -------
Is there any help that is required in getting this to trunk ?
Please do let me know.
------- Comment #15 From 2008-12-11 06:26:19 PST -------
(In reply to comment #14)
> Is there any help that is required in getting this to trunk ?
> Please do let me know.

The existing patches need to be fixed and attached for review.  See this web page for more details about the process (except in this case, you have some patches to start from):

http://webkit.org/coding/contributing.html
------- Comment #16 From 2008-12-11 11:01:43 PST -------
Per above comments, the consensus seems to be that the existing patches go in the wrong direction, and we'd prefer not to have a custom parser.
------- Comment #17 From 2008-12-11 11:09:55 PST -------
> Per above comments, the consensus seems to be that the existing patches go in
> the wrong direction, and we'd prefer not to have a custom parser.

Yes, I was working on a Yacc-based parser (which is what we use for JavaScript), but I got distracted by other things.  Please feel free to steal this bug.  :)
------- Comment #18 From 2009-03-21 19:02:33 PST -------
Created an attachment (id=28827) [details]
Patch to Add JSON Object

The patch include  the changes for adding JSON object and test cases, There are few thing on the code could be changed.

1. Build the object while parsing instead of the syntax tree.
2. Move some of the code in JSONLexer to common object so it can be used by both JS lexer and JSON.
------- Comment #19 From 2009-03-29 09:50:54 PST -------
I'll try to review this soon.
------- Comment #20 From 2009-04-30 09:00:27 PST -------
It would be good to review/land this soon, as we will need a JSON parser for bug 25419.
------- Comment #21 From 2009-04-30 09:54:05 PST -------
Native JSON support was recently added to V8.  It might be worthwhile cross-testing.  Here are some tests from V8:

http://code.google.com/p/v8/source/browse/trunk/test/mjsunit/json.js
------- Comment #22 From 2009-05-11 05:58:49 PST -------
(From update of attachment 28827 [details])
This patch uses tabs, which is against the style guidelines (and will cause the commit to fail).

Please remove the tabs and re-post.  Also, I think we could review and land the tests first and then review the code.  Or at least I think they should be reviewed separately to make the review smaller.

r- for the tabs.
------- Comment #23 From 2009-05-11 15:18:14 PST -------
I will remove tabs and post separate  patch for the test cases.

But it looks like  my changes not compatible with latest version of the source code, so I need to fix them first.

------- Comment #24 From 2009-05-13 16:01:15 PST -------
Created an attachment (id=30293) [details]
JSON test cases pacth

Test cases for JSON object, I will try and include google test case
------- Comment #25 From 2009-05-13 16:28:19 PST -------
Created an attachment (id=30295) [details]
Patch to add JSON object - remove tabs

Tab is now removed from the source files, Makefiles still have the tabs because it's required
------- Comment #26 From 2009-05-19 23:13:08 PST -------
I have been looking at the JSON lexer grammar recently and have realised the the specified JSON lexer is much more restrictive than the standard lexer used by JS.  This means that we really need to consider whether we use a strict json lexer or the much more forgiving JS lexer in JSC, using the strict JSON lexer seems "nice", but the standard json2.js library just uses eval which could mean that the strict lexer will result in failures that are not present in when using the non-native eval based implementation.
------- Comment #27 From 2009-05-19 23:44:47 PST -------
(In reply to comment #26)
> the standard json2.js library just uses eval which could mean
> that the strict lexer will result in failures that are not present in when
> using the non-native eval based implementation.

I recommend testing Firefox 3.5 and IE8 to see how their lexer behaves.  We should match them as closely as possible.
------- Comment #28 From 2009-05-20 02:20:09 PST -------
> I recommend testing Firefox 3.5 and IE8 to see how their lexer behaves.  We
> should match them as closely as possible.
> 

I did test firefox 3.5 and they are using strict lexer, but I'm not sure about IE8 I don't realy use windows.

I had look at json2.js it looks like the code use regular expression on top of  eval for restriction but one thing is not clear wither it allows use of single quote with members name (i.e  {'x':1}).
------- Comment #29 From 2009-05-20 09:16:03 PST -------
> I had look at json2.js it looks like the code use regular expression on top of 
> eval for restriction but one thing is not clear wither it allows use of single
> quote with members name (i.e  {'x':1}).

json2 is much more lax than we want to be.  We should be as strict as IE8 and Firefox 3.5.
------- Comment #30 From 2009-05-22 19:42:03 PST -------
(From update of attachment 30293 [details])
Okay, while in general i like the tests this patch does i'd like the testcases to use our "standard" js testcase model then tryBackend becomes
shouldBe("JSON2.stringify(JSON2.parse(text))", "JSON.stringify(JSON.parse(text))"), however this is optional (for now ;) )

I'd also like a few tests to cover large inputs, as well as use of the replacement/reviver functions, specifically to test
 * which reviver functions get called; and
 * the order in which reviver functions are called

I'll r- this patch due to the additional testing that i'd like, but by and large i like what's currently here, it just needs to be more complete.
------- Comment #31 From 2009-05-22 19:50:36 PST -------
(From update of attachment 30295 [details])
I don't like this patch, while the JSON object itself is fine, i don't believe it is necessary to use a yacc parser -- given the simplicity of the JSON grammar i believe a hand coded recursive descent parser is likely to be faster, with a little care i believe this will also make it easier to have the parser avoid ever creating an AST in the case where we do not have custom reviver functions.

(Addtionally this particular grammar tracks a lot of information that we track in the normal JS grammar which is of no value to the JSON parser) -- I'm also not entirely sure i like the lexer, but certainly i prefer it over the grammar implementation.  We can in fact probably just use the literal parser we preflight eval with to achieve this.

Sorry for the delay in reviewing.

If you want to discuss more directly i tend to be on #webkit as olliej fairly continuously.
------- Comment #32 From 2009-05-23 10:00:41 PST -------
I agree with your comments but yhe use of bison parser instead of custom parser was based on the Comment #10 From Maciej Stachowiak and Comment #16 From Alexey Proskuryakov.

(In reply to comment #31)
> (From update of attachment 30295 [details] [review])
> I don't like this patch, while the JSON object itself is fine, i don't believe
> it is necessary to use a yacc parser -- given the simplicity of the JSON
> grammar i believe a hand coded recursive descent parser is likely to be faster,
> with a little care i believe this will also make it easier to have the parser
> avoid ever creating an AST in the case where we do not have custom reviver
> functions.
> 
> (Addtionally this particular grammar tracks a lot of information that we track
> in the normal JS grammar which is of no value to the JSON parser) -- I'm also
> not entirely sure i like the lexer, but certainly i prefer it over the grammar
> implementation.  We can in fact probably just use the literal parser we
> preflight eval with to achieve this.
> 
> Sorry for the delay in reviewing.
> 
> If you want to discuss more directly i tend to be on #webkit as olliej fairly
> continuously.
> 
------- Comment #33 From 2009-05-24 04:30:41 PST -------
It's true that Alexey and I originally suggested that. But seeing the patch, and in light of Oliver's recent fast literal parser, I think Oliver is correct.
------- Comment #34 From 2009-06-03 16:48:07 PST -------
I've locally got the literal parser to the point where it is conformant with JSON spec behaviour, and i'm working on getting Husam's JSON object hooked up to it.

Will put initial patch up in a few hours 
------- Comment #35 From 2009-06-22 22:49:50 PST -------
JSON is now complete, made up with
r44550
r44813
r44923
r44924
r44929
r44931
r44968
r44974