Bug 63339 - JSON errors should be informative
Summary: JSON errors should be informative
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Juan C. Montemayor
URL:
Keywords:
Depends on: 65144
Blocks:
  Show dependency treegraph
 
Reported: 2011-06-24 12:09 PDT by Oliver Hunt
Modified: 2011-07-26 10:54 PDT (History)
6 users (show)

See Also:


Attachments
Work in progress (30.83 KB, patch)
2011-07-21 20:14 PDT, Juan C. Montemayor
no flags Details | Formatted Diff | Diff
Proposed patch (32.57 KB, patch)
2011-07-22 16:45 PDT, Juan C. Montemayor
no flags Details | Formatted Diff | Diff
proposed patch (33.60 KB, patch)
2011-07-22 17:57 PDT, Juan C. Montemayor
no flags Details | Formatted Diff | Diff
proposed patch (33.32 KB, patch)
2011-07-22 22:46 PDT, Juan C. Montemayor
no flags Details | Formatted Diff | Diff
updated patch for cr-bot (57.29 KB, patch)
2011-07-25 10:59 PDT, Juan C. Montemayor
no flags Details | Formatted Diff | Diff
patch (no speed regression this time) (104.46 KB, patch)
2011-07-25 21:16 PDT, Juan C. Montemayor
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Oliver Hunt 2011-06-24 12:09:07 PDT
Currently the JSON parser just gives a generic "invalid json" style error for all errors, which isn't exactly helpful for debugging.
Comment 1 Juan C. Montemayor 2011-07-21 20:14:18 PDT
Created attachment 101695 [details]
Work in progress

Some error messages don't make sense right now, but they will be improved on soon.
Comment 2 Oliver Hunt 2011-07-22 10:35:37 PDT
Comment on attachment 101695 [details]
Work in progress

View in context: https://bugs.webkit.org/attachment.cgi?id=101695&action=review

r-, due to some code style issues and error message clarity.   Would be good to also store the value of m_ptr for error feedback, then we can give line and column information.

> Source/JavaScriptCore/runtime/LiteralParser.cpp:240
> +                m_lexErrorMessage = "JSON Lexer error: Single quotes are not allowed in JSON";

Call out the specific character as well just to make it clear

> Source/JavaScriptCore/runtime/LiteralParser.cpp:281
> +                m_lexErrorMessage = "JSON Lexer error: Unexpected end of string";

Unterminated string

> Source/JavaScriptCore/runtime/LiteralParser.cpp:325
> +                            m_lexErrorMessage = "JSON Lexer error: Escaped unicode is not a ASCII hexadecimal digit";

I would say "\\u..." is not a valid unicode escape

> Source/JavaScriptCore/runtime/LiteralParser.cpp:436
> +            m_lexErrorMessage = "JSON Lexer error: Exponent symbols should be followed by '+' or '-' and least one number";

+/- are optional

> Source/JavaScriptCore/runtime/LiteralParser.cpp:608
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse value expression";

At this point you either have a bogus token -- so the lexer should have an error message, or you should be able to say what token you got and why it is wrong.

> Source/JavaScriptCore/runtime/LiteralParser.cpp:638
> +                m_parseErrorMessage = "JSON Parse error: Expected end of JSON but file continued";

Unexpected content at end of JSON literal.

> Source/JavaScriptCore/runtime/LiteralParser.h:97
> +            UString m_lexErrorMessage;

make it private

> Source/JavaScriptCore/runtime/LiteralParser.h:112
> +                : m_lexErrorMessage()

UString has a default constructor so this isn't necessary
Comment 3 Oliver Hunt 2011-07-22 10:39:20 PDT
Comment on attachment 101695 [details]
Work in progress

View in context: https://bugs.webkit.org/attachment.cgi?id=101695&action=review

> Source/JavaScriptCore/runtime/LiteralParser.cpp:541
> +                    m_parseErrorMessage = "JSON Parse error: Expected string in object expression";

Property name must be a string literal

> Source/JavaScriptCore/runtime/LiteralParser.cpp:626
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse name in expression";

name?  JSON statement must begin with a [, number, string, or (

That said no one will ever see this error message as you only hit this path when we're trying speculative parsing, not with true JSON.

> Source/JavaScriptCore/runtime/LiteralParser.cpp:633
> +                if (m_lexer.currentToken().type != TokRParen) {
> +                    m_parseErrorMessage = "JSON Parse error: Expected right parenthesis after name in expression";

name again?
Comment 4 Juan C. Montemayor 2011-07-22 10:45:47 PDT
(In reply to comment #3)
> (From update of attachment 101695 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=101695&action=review
> 
> > Source/JavaScriptCore/runtime/LiteralParser.cpp:541
> > +                    m_parseErrorMessage = "JSON Parse error: Expected string in object expression";
> 
> Property name must be a string literal
> 
> > Source/JavaScriptCore/runtime/LiteralParser.cpp:626
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse name in expression";
> 
> name?  JSON statement must begin with a [, number, string, or (
> 
> That said no one will ever see this error message as you only hit this path when we're trying speculative parsing, not with true JSON.
> 
> > Source/JavaScriptCore/runtime/LiteralParser.cpp:633
> > +                if (m_lexer.currentToken().type != TokRParen) {
> > +                    m_parseErrorMessage = "JSON Parse error: Expected right parenthesis after name in expression";
> 
> name again?

I'll fix these changes. I wrote down name since json.org said that "an object is an unordered set of name/value pairs" but I like property name much better!
Comment 5 Juan C. Montemayor 2011-07-22 16:45:15 PDT
Created attachment 101791 [details]
Proposed patch

Updated code with Oliver's comments.
Comment 6 Oliver Hunt 2011-07-22 16:55:56 PDT
Comment on attachment 101791 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=101791&action=review

> Source/JavaScriptCore/runtime/LiteralParser.cpp:339
> +                    m_lexErrorMessage = "JSON Lexer error: Unable to read string";

Invalid escape character %c

> Source/JavaScriptCore/runtime/LiteralParser.cpp:346
> +        m_lexErrorMessage = "JSON Lexer error: Unterminated string";

No lexer, only parse errors.

"JSON Parse error: "  is repeated everywhere, i think you should just drop that prefix, and add it at the end when you're making the error message.

> Source/JavaScriptCore/runtime/LiteralParser.cpp:436
> +            m_lexErrorMessage = "JSON Lexer error: Exponent symbols should be followed by an optional '+' or '-' and then by at least one number";

Should just be JSON Parse error -- the lexer vs. parser difference isn't really relevant to develoeprs

> Source/JavaScriptCore/runtime/LiteralParser.cpp:634
> +                    case TokRBracket:
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse right bracket";
> +                        return JSValue();
> +                    case TokRBrace:
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse right brace";
> +                        return JSValue();
> +                    case TokIdentifier:
> +                        m_parseErrorMessage = String::format("JSON Parse error: Could not parse \"%s\"", UString(m_lexer.currentToken().stringToken).ascii().data()).impl();
> +                        return JSValue();
> +                    case TokColon:
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse colon";
> +                        return JSValue();
> +                    case TokLParen:
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse left parenthesis";
> +                        return JSValue();
> +                    case TokRParen:
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse right parenthesis";
> +                        return JSValue();
> +                    case TokComma:
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse comma";
> +                        return JSValue();
> +                    case TokDot:
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse dot";
> +                        return JSValue();
> +                    case TokAssign:
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse assignment";
> +                        return JSValue();
> +                    case TokSemi:
> +                        m_parseErrorMessage = "JSON Parse error: Could not parse semicolon";
> +                        return JSValue();

These error messages are icky, it should be something along the lines of "Unexpected token <blah>"

> Source/JavaScriptCore/runtime/LiteralParser.cpp:635
> +                    case TokEnd:

This should be unexpected EOF or some such.

> Source/JavaScriptCore/runtime/LiteralParser.cpp:657
> +                        m_parseErrorMessage = "JSON statement must begin with a [, number, string, or (";

I'd rather go with unexpected token <blah>

> Source/JavaScriptCore/runtime/LiteralParser.cpp:664
> +                    m_parseErrorMessage = "JSON Parse error: Expected right parenthesis after property name in expression";

This error is bogus.  It's testing to ensure that you have balanced the parenthesis.  However you can't actually test for it as this path is only taken when preparsing content in the hope it's json.
Comment 7 Juan C. Montemayor 2011-07-22 17:15:43 PDT
> > Source/JavaScriptCore/runtime/LiteralParser.cpp:634
> > +                    case TokRBracket:
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse right bracket";
> > +                        return JSValue();
> > +                    case TokRBrace:
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse right brace";
> > +                        return JSValue();
> > +                    case TokIdentifier:
> > +                        m_parseErrorMessage = String::format("JSON Parse error: Could not parse \"%s\"", UString(m_lexer.currentToken().stringToken).ascii().data()).impl();
> > +                        return JSValue();
> > +                    case TokColon:
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse colon";
> > +                        return JSValue();
> > +                    case TokLParen:
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse left parenthesis";
> > +                        return JSValue();
> > +                    case TokRParen:
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse right parenthesis";
> > +                        return JSValue();
> > +                    case TokComma:
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse comma";
> > +                        return JSValue();
> > +                    case TokDot:
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse dot";
> > +                        return JSValue();
> > +                    case TokAssign:
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse assignment";
> > +                        return JSValue();
> > +                    case TokSemi:
> > +                        m_parseErrorMessage = "JSON Parse error: Could not parse semicolon";
> > +                        return JSValue();
> 
> These error messages are icky, it should be something along the lines of "Unexpected token <blah>"
> 
> 
> > Source/JavaScriptCore/runtime/LiteralParser.cpp:657
> > +                        m_parseErrorMessage = "JSON statement must begin with a [, number, string, or (";
> 
> I'd rather go with unexpected token <blah>

the problem with doing this is that doing something like:

m_parseErrorMessage = String::format("Unexpected token \'%s\'", UString(m_lexer.currentToken().stringToken).ascii().data()).impl();

causes some regressiosn, since m_lexer.currentToken().stringToken may be null. Do you want me to write a special case for that? or write a const error message?

> 
> > Source/JavaScriptCore/runtime/LiteralParser.cpp:664
> > +                    m_parseErrorMessage = "JSON Parse error: Expected right parenthesis after property name in expression";
> 
> This error is bogus.  It's testing to ensure that you have balanced the parenthesis.  However you can't actually test for it as this path is only taken when preparsing content in the hope it's json.

So I should just leave this one out?
Comment 8 Oliver Hunt 2011-07-22 17:21:38 PDT
> m_parseErrorMessage = String::format("Unexpected token \'%s\'", UString(m_lexer.currentToken().stringToken).ascii().data()).impl();
> 
> causes some regressiosn, since m_lexer.currentToken().stringToken may be null. Do you want me to write a special case for that? or write a const error message?

You'll want to special case it for each token type.

> 
> > 
> > > Source/JavaScriptCore/runtime/LiteralParser.cpp:664
> > > +                    m_parseErrorMessage = "JSON Parse error: Expected right parenthesis after property name in expression";
> > 
> > This error is bogus.  It's testing to ensure that you have balanced the parenthesis.  However you can't actually test for it as this path is only taken when preparsing content in the hope it's json.
> 
> So I should just leave this one out?

Yeah
Comment 9 Juan C. Montemayor 2011-07-22 17:57:49 PDT
Created attachment 101797 [details]
proposed patch

patch, take two
Comment 10 Oliver Hunt 2011-07-22 18:04:41 PDT
Comment on attachment 101797 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=101797&action=review

Seems like one more round is necessary

> Source/JavaScriptCore/runtime/LiteralParser.cpp:499
> +                    m_parseErrorMessage = "Expected right bracket at end of array expression";

expected ]

> Source/JavaScriptCore/runtime/LiteralParser.cpp:519
> +                        m_parseErrorMessage = "Expected colon before value in object definition";

Expected ':' before value in object property definition

> Source/JavaScriptCore/runtime/LiteralParser.cpp:529
> +                    m_parseErrorMessage = "Expected right brace at the end of object definition";

'}'

> Source/JavaScriptCore/runtime/LiteralParser.cpp:548
> +                    m_parseErrorMessage = "Expected colon before value in object definition";

':'

> Source/JavaScriptCore/runtime/LiteralParser.cpp:564
> +                    m_parseErrorMessage = "Expected right brace at end of object expression";

'}'

> Source/JavaScriptCore/runtime/LiteralParser.cpp:668
> +                        m_parseErrorMessage = String::format("Could not parse \"%s\"", UString(m_lexer.currentToken().stringToken).ascii().data()).impl();

Unexpected identifier "%s"
Comment 11 Juan C. Montemayor 2011-07-22 22:46:14 PDT
Created attachment 101805 [details]
proposed patch

patch, take three
Comment 12 WebKit Review Bot 2011-07-22 23:08:48 PDT
Comment on attachment 101805 [details]
proposed patch

Attachment 101805 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9232276

New failing tests:
fast/canvas/webgl/framebuffer-test.html
Comment 13 Juan C. Montemayor 2011-07-22 23:15:29 PDT
Oliver, do you know why the chrome-bot failed?
Comment 14 Juan C. Montemayor 2011-07-25 10:59:18 PDT
Created attachment 101885 [details]
updated patch for cr-bot

hopefully the bot will like this one
Comment 15 Juan C. Montemayor 2011-07-25 11:25:30 PDT
So, the cr-linux bot liked the last patch... but it failed 3 times before it passed. Is that significant?
Comment 16 Adam Barth 2011-07-25 11:29:37 PDT
> So, the cr-linux bot liked the last patch... but it failed 3 times before it passed. Is that significant?

Nope.  That just means there were some existing failures unrelated to your patch.
Comment 17 WebKit Review Bot 2011-07-25 11:41:53 PDT
Comment on attachment 101885 [details]
updated patch for cr-bot

Clearing flags on attachment: 101885

Committed r91686: <http://trac.webkit.org/changeset/91686>
Comment 18 WebKit Review Bot 2011-07-25 11:41:59 PDT
All reviewed patches have been landed.  Closing bug.
Comment 19 Oliver Hunt 2011-07-25 16:12:37 PDT
rolled out patch as it introduced 1.5% regression
Comment 20 Juan C. Montemayor 2011-07-25 21:16:58 PDT
Created attachment 101968 [details]
patch (no speed regression this time)

This patch does not cause a speed regression in the sunspider tests. The cause for the regression in the previous patch was the line that created an error message for an unexpected identifier using String::format(). Removing the String::format from that line will still create an adequate error message (Unexpected identifier) and it will not cause a speed regression.
Comment 21 WebKit Review Bot 2011-07-26 10:53:54 PDT
Comment on attachment 101968 [details]
patch (no speed regression this time)

Clearing flags on attachment: 101968

Committed r91765: <http://trac.webkit.org/changeset/91765>
Comment 22 WebKit Review Bot 2011-07-26 10:54:01 PDT
All reviewed patches have been landed.  Closing bug.