Summary: | Lexer error messages are not very informative | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Oliver Hunt <oliver> | ||||||||||||||||||
Component: | JavaScriptCore | Assignee: | Juan C. Montemayor <j.mont> | ||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||
Severity: | Normal | CC: | dglazkov, ggaren, oliver, webkit.review.bot | ||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||
Attachments: |
|
Description
Oliver Hunt
2011-06-24 12:12:46 PDT
Created attachment 98602 [details]
Patch
(In reply to comment #1) > Created an attachment (id=98602) [details] > Patch For the javascritptcore-tests this patch causes no regressions in debug mode but causes two regressions in release mode. I am currently re-checking my work to find out why, but if anyone has any ideas... (In reply to comment #2) > (In reply to comment #1) > > Created an attachment (id=98602) [details] [details] > > Patch > > For the javascritptcore-tests this patch causes no regressions in debug mode but causes two regressions in release mode. I am currently re-checking my work to find out why, but if anyone has any ideas... Also, there are still some junk test outputs in the patch... I'll remove those once I fix the regression. Comment on attachment 98602 [details] Patch Attachment 98602 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8932894 New failing tests: transitions/transition-end-event-transform.html http/tests/media/reload-after-dialog.html http/tests/media/pdf-served-as-pdf.html Created attachment 98607 [details]
Archive of layout-test-results from ec2-cr-linux-03
The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-03 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment on attachment 98602 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98602&action=review > Source/JavaScriptCore/parser/Lexer.cpp:531 > + m_lexErrorMessage = "Only quote characters are allowed after \\u"; \u can only be followed by a unicode character number, or a closing quote > Source/JavaScriptCore/parser/Lexer.cpp:582 > + m_lexErrorMessage = "New-line or end of input is not allowed here"; Unexpected EOF > Source/JavaScriptCore/parser/Lexer.cpp:1119 > + m_lexErrorMessage = "Invalid character"; Should say what the character is -- and should possibly escape it if it's a control character. > LayoutTests/fast/js/basic-strict-mode-expected.txt:184 > +PASS 'use strict';5.f threw exception SyntaxError: Identifiers cannot start with a number. This error is bogus -- "Only digits may follow a decimal point" might be ebtter > LayoutTests/fast/js/cyclic-proto-expected.txt:1 > +posix_spawn fatal error: 9! This change is bogus > > Source/JavaScriptCore/parser/Lexer.cpp:1119
> > + m_lexErrorMessage = "Invalid character";
>
> Should say what the character is -- and should possibly escape it if it's a control character.
So, I'm getting weird character values for this case... I'm getting (as int) 8205 for the removing-Cf-characters test, 247 for the parse test, and 827 for basic-strict-mode. These aren't ascii, amirite?
In any case, I get these errors -
'use strict'Ì» threw exception SyntaxError: Invalid character ;.
(function(){'use strict'Ì»}) threw exception SyntaxError: Invalid character ;.
'use strict';Ì» threw exception SyntaxError: Invalid character ;.
(function(){'use strict';Ì»}) threw exception SyntaxError: Invalid character ;.
In the first two, there aren't any semicolons in the code, and in the second ones, I'm pretty sure what fails isn't the semicolon.
I tried not outputting any characters if they weren't ascii, and null terminating those that needed null terminating like so:
case CharacterInvalid:
if (m_current < 33)
m_lexErrorMessage = String::format("Invalid character \%i",m_current).impl();
else if (m_current > 127)
m_lexErrorMessage = "Invalid character";
else
m_lexErrorMessage = String::format("Invalid character %i",m_current).impl();
goto returnError;
but then the tests just showed "Invalid character" as output. Any recommendations?
Also, what should I do about the regressions in release mode?
(In reply to comment #7) > > > Source/JavaScriptCore/parser/Lexer.cpp:1119 > > > + m_lexErrorMessage = "Invalid character"; > > > > Should say what the character is -- and should possibly escape it if it's a control character. > > So, I'm getting weird character values for this case... I'm getting (as int) 8205 for the removing-Cf-characters test, 247 for the parse test, and 827 for basic-strict-mode. These aren't ascii, amirite? > > In any case, I get these errors - > 'use strict'Ì» threw exception SyntaxError: Invalid character ;. > (function(){'use strict'Ì»}) threw exception SyntaxError: Invalid character ;. > 'use strict';Ì» threw exception SyntaxError: Invalid character ;. > (function(){'use strict';Ì»}) threw exception SyntaxError: Invalid character ;. > > In the first two, there aren't any semicolons in the code, and in the second ones, I'm pretty sure what fails isn't the semicolon. > > I tried not outputting any characters if they weren't ascii, and null terminating those that needed null terminating like so: > > case CharacterInvalid: > if (m_current < 33) > m_lexErrorMessage = String::format("Invalid character \%i",m_current).impl(); > else if (m_current > 127) > m_lexErrorMessage = "Invalid character"; > else > m_lexErrorMessage = String::format("Invalid character %i",m_current).impl(); > goto returnError; > > but then the tests just showed "Invalid character" as output. Any recommendations? In the invlaid character case i'd just do escaped unicode (%04u i think should do it), specialising for \t, \r, \n, \v, etc > > Also, what should I do about the regressions in release mode? Work out what's causing them -- at a guess you're probably relying on an uninitialised value > > but then the tests just showed "Invalid character" as output. Any recommendations? > In the invlaid character case i'd just do escaped unicode (%04u i think should do it), specialising for \t, \r, \n, \v, etc Using %04u I get this output: PASS var ZWJ_Iâ€nside; threw exception SyntaxError: Invalid character 8205. Is that fine? > > > > Also, what should I do about the regressions in release mode? > > Work out what's causing them -- at a guess you're probably relying on an uninitialised value Brilliant! Fixed. (In reply to comment #9) > > > but then the tests just showed "Invalid character" as output. Any recommendations? > > In the invlaid character case i'd just do escaped unicode (%04u i think should do it), specialising for \t, \r, \n, \v, etc > > Using %04u I get this output: > > PASS var ZWJ_Iâ€nside; threw exception SyntaxError: Invalid character 8205. > > Is that fine? Having thought about it maybe we want to do printf("Invalid character %C (\\u%04u)", m_current, m_current); But i'm not sure if Stirng::format supports %C > > > > > > > Also, what should I do about the regressions in release mode? > > > > Work out what's causing them -- at a guess you're probably relying on an uninitialised value > > Brilliant! Fixed. \o/ > > Using %04u I get this output:
> >
> > PASS var ZWJ_Iâ€nside; threw exception SyntaxError: Invalid character 8205.
> >
> > Is that fine?
>
> Having thought about it maybe we want to do
>
> printf("Invalid character %C (\\u%04u)", m_current, m_current);
>
> But i'm not sure if Stirng::format supports %C
>
I checked, and it doesn't... I'll add the \\u before the unicode, and figure out how I can implement the %C
(In reply to comment #11) > > > Using %04u I get this output: > > > > > > PASS var ZWJ_Iâ€nside; threw exception SyntaxError: Invalid character 8205. > > > > > > Is that fine? > > > > Having thought about it maybe we want to do > > > > printf("Invalid character %C (\\u%04u)", m_current, m_current); > > > > But i'm not sure if Stirng::format supports %C > > > > I checked, and it doesn't... I'll add the \\u before the unicode, and figure out how I can implement the %C Brainfart, just use %04d :d Created attachment 98769 [details]
Patch
Should I run any other tests other than webkit-tests and javascriptcore-tests in both debug and release mode?
Comment on attachment 98769 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98769&action=review Other than the above comments this looks good. Once this is done we can move to JSON error messages or getting line a character information :D > Source/JavaScriptCore/parser/Lexer.cpp:531 > + m_lexErrorMessage = "\\u can only be followed by a unicode character number, or a closing quote"; s/number/sequence i think would be a better wording > Source/JavaScriptCore/parser/Lexer.cpp:569 > + m_lexErrorMessage = "Problem reading character sequence"; I believe this should be "Unterminated string constant" > Source/JavaScriptCore/parser/Lexer.cpp:1119 > + m_lexErrorMessage = String::format("Invalid character (\\u%04u)", m_current).impl(); I think this should be a function for creating the error message so that simple characters that are invalid are displayed in a user friendly way, eg: `, @, #, \n, \r, \v, \0 The rest can be shown as a unicode escape Comment on attachment 98769 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98769&action=review Other than the above comments this looks good. Once this is done we can move to JSON error messages or getting line a character information :D >> Source/JavaScriptCore/parser/Lexer.cpp:531 >> + m_lexErrorMessage = "\\u can only be followed by a unicode character number, or a closing quote"; > > s/number/sequence i think would be a better wording s/number/sequence i think would be a better wording >> Source/JavaScriptCore/parser/Lexer.cpp:569 >> + m_lexErrorMessage = "Problem reading character sequence"; > > I believe this should be "Unterminated string constant" I believe this should be "Unterminated string constant" >> Source/JavaScriptCore/parser/Lexer.cpp:1119 >> + m_lexErrorMessage = String::format("Invalid character (\\u%04u)", m_current).impl(); > > I think this should be a function for creating the error message so that simple characters that are invalid are displayed in a user friendly way, eg: > `, @, #, \n, \r, \v, \0 > > The rest can be shown as a unicode escape I think this should be a function for creating the error message so that simple characters that are invalid are displayed in a user friendly way, eg: `, @, #, \n, \r, \v, \0 The rest can be shown as a unicode escape Sorry for the noise, review screen seemed to go nuts on me :( Created attachment 99208 [details]
Proposed patch
fixed wording of some errors and added function for creating the error message
Created attachment 99212 [details]
fixed patch
Fixed tab and also added file that will hopefully not break cr-linux :)
Comment on attachment 99212 [details]
fixed patch
my bad, accidentally uploaded this patch to this bug. I'll try to find out why the win bot is failing on my other patch
Created attachment 99335 [details]
proposed patch
fixed the issue that made the win bot fail
Comment on attachment 99335 [details] proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=99335&action=review Looking good -- the new error messages are a huge improvement. But I think this patch could be even better. > Source/JavaScriptCore/parser/Lexer.cpp:237 > +ALWAYS_INLINE UString Lexer::getInvalidCharMessage() I don't think you want to inline this -- it's the error case, so it should be out of line, for smaller code generation and less register pressure and instruction cache pollution in the non-error path. > Source/JavaScriptCore/parser/Lexer.cpp:241 > + return UString("Invalid character '\\0'").impl(); Since your return type is UString, you can just return a C string ("Invalid..."), and the UString constructor will do the rest. This construct you've used will create two UStrings for every call, instead of one. > Source/JavaScriptCore/parser/Lexer.cpp:243 > + return UString("Invalid character '\\n'").impl(); I think these messages would be clearer with a colon: "Invalid character: ..."). > Source/JavaScriptCore/parser/Lexer.cpp:554 > + m_lexErrorMessage = "\\u can only be followed by a unicode character sequence"; Unicode is a proper noun, so please capitalize it. > Source/JavaScriptCore/parser/Lexer.cpp:562 > + m_lexErrorMessage = "Only valid numeric escape in strict mode is '\\0'"; To make this a complete sentence, say "The only valid..." > Source/JavaScriptCore/parser/Lexer.cpp:1296 > + m_lexErrorMessage = UString("Lexer error"); No need for "UString" here -- the constructor will do that for you. Created attachment 99397 [details]
revised patch
Fixed details mentioned by Geoff
Comment on attachment 99397 [details] revised patch View in context: https://bugs.webkit.org/attachment.cgi?id=99397&action=review > Source/JavaScriptCore/parser/Lexer.cpp:253 > + return "Invalid character: '\\#'"; > + case 64: > + return "Invalid character: '\\@'"; > + case 96: > + return "Invalid character: '\\`'"; These characters don't need to be escaped, so shouldn't be :-( > Source/JavaScriptCore/parser/Lexer.cpp:1108 > + m_lexErrorMessage = "Only digits may follow a decimal point"; I think this would be better as "At least one digit must occur after a decimal point" > Source/JavaScriptCore/parser/Lexer.cpp:1296 > + m_lexErrorMessage = "Lexer error"; Why have a string here at all in the clean state? > Source/JavaScriptCore/parser/Lexer.h:78 > + m_lexErrorMessage = "Lexer error"; Default error message should be empty. By doing this assignment I believe you're adding at least two memory allocations and a memcpy to something we expect to be very fast. > Source/JavaScriptCore/parser/Parser.cpp:54 > + UString lexErrorMessage = lexError ? lexer.getErrorMessage() : UString(); I would do lexErrorMessage = lexer.getErrorMessage(); ASSERT(lexErrorMessage.isNull() != lexError); // If there's been a lexer error there should be a message, if there hasn't been, there should not be a message. Created attachment 99487 [details]
patch
updated patch based on Oliver's comments
Attachment 99487 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/update-webkit', '--chromium']" exit_code: 2
Updating OpenSource
Current branch master is up to date.
Updating chromium port dependencies using gclient...
________ running '/usr/bin/python gyp_webkit' in '/mnt/git/webkit-style-queue/Source/WebKit/chromium'
Updating webkit projects from gyp files...
TError: /usr/bin/python gyp_webkit in /mnt/git/webkit-style-queue/Source/WebKit/chromium returned 1
raceback (most recent call last):
File "gyp_webkit", line 117, in <module>
sys.exit(gyp.main(args))
File "/mnt/git/webkit-style-queue/Source/WebKit/chromium/tools/gyp/pylib/gyp/__init__.py", line 448, in main
options.circular_check)
File "/mnt/git/webkit-style-queue/Source/WebKit/chromium/tools/gyp/pylib/gyp/__init__.py", line 87, in Load
depth, generator_input_info, check, circular_check)
File "/mnt/git/webkit-style-queue/Source/WebKit/chromium/tools/gyp/pylib/gyp/input.py", line 2210, in Load
depth, check)
File "/mnt/git/webkit-style-queue/Source/WebKit/chromium/tools/gyp/pylib/gyp/input.py", line 421, in LoadTargetBuildFile
includes, depth, check)
File "/mnt/git/webkit-style-queue/Source/WebKit/chromium/tools/gyp/pylib/gyp/input.py", line 421, in LoadTargetBuildFile
includes, depth, check)
File "/mnt/git/webkit-style-queue/Source/WebKit/chromium/tools/gyp/pylib/gyp/input.py", line 355, in LoadTargetBuildFile
includes, True, check)
File "/mnt/git/webkit-style-queue/Source/WebKit/chromium/tools/gyp/pylib/gyp/input.py", line 210, in LoadOneBuildFile
raise Exception("%s not found (cwd: %s)" % (build_file_path, os.getcwd()))
Exception: ../../WebKit/chromium/sql/sql.gyp not found (cwd: /mnt/git/webkit-style-queue/Source/WebKit/chromium) while loading dependencies of ../../WebKit/chromium/webkit/support/webkit_support.gyp while loading dependencies of WebKit.gyp while trying to load WebKit.gyp
Died at Tools/Scripts/update-webkit-chromium line 69.
No such file or directory at Tools/Scripts/update-webkit line 100.
If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 99487 [details] patch Clearing flags on attachment: 99487 Committed r90265: <http://trac.webkit.org/changeset/90265> All reviewed patches have been landed. Closing bug. |