140420 – JavaScript identifier incorrectly parsed if the prefix before an escape sequence is a keyword

RESOLVED FIXED 140420

https://bugs.webkit.org/show_bug.cgi?id=140420

Alan Tam

Reported 2015-01-13 17:49:23 PST

STEPS:
1. Run "in\u00e9dit = 1" in the JavaScript console.

EXPECTED: It sets the variable inédit to 1 and returns 1.

ACTUAL: SyntaxError: Unexpected keyword 'in'

This came up because a minifier converts all code to ASCII. Chrome 39 and Firefox 34 both work fine.

Attachments
Patch (6.25 KB, patch) 2015-01-14 10:26 PST, Michael Saboff, oliver: review+
Performance results of the patch (50.09 KB, text/plain) 2015-01-14 10:29 PST, Michael Saboff, no flags

Michael Saboff

Comment 1 2015-01-13 20:47:24 PST

This is probably related to adding for..in iteration to the parser.

Michael Saboff

Comment 2 2015-01-13 20:57:33 PST

Using ToT r178251.

For "in\u00e9dit = 1;" I get:
SyntaxError: Unexpected keyword 'in'

For "var in\u00e9dit = 1;" I get:
SyntaxError: Cannot use the keyword 'in' as a variable name.

Alan Tam

Comment 3 2015-01-14 02:45:00 PST

It is not limited to for..in; it affects all keywords. An object literal is another way to trigger the bug:

> ({while\u00e9dit:1})
SyntaxError: Unexpected identifier '\u00e9dit'. Expected a ':' following the property name 'while'.

Again, this works in Chrome and Firefox, returning this object: {"whileédit":1}
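The cases reported so far can be collected into a small check script. This is only a sketch: `tryEval` and `cases` are hypothetical names that do not come from the report, and the script assumes it runs in an engine where indirect `eval` is available. On an affected build each case would be expected to report a SyntaxError instead of a value.

```javascript
// Hypothetical helper (not part of the report): evaluate a snippet as global
// code and return either its value or the error the engine raised.
function tryEval(source) {
  try {
    // (0, eval) is an indirect eval, so the snippet runs as sloppy global code.
    return { ok: true, value: (0, eval)(source) };
  } catch (e) {
    return { ok: false, error: String(e) };
  }
}

// The doubled backslash keeps the escape sequence literally in the evaluated
// source, so the engine's lexer sees a keyword prefix followed by \u00e9.
const cases = [
  "in\\u00e9dit = 1",
  "var in\\u00e9dit = 1; in\\u00e9dit",
  "({while\\u00e9dit: 1})",
];

for (const source of cases) {
  const result = tryEval(source);
  console.log(source, "=>", result.ok ? result.value : result.error);
}
```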

Michael Saboff

Comment 4 2015-01-14 08:09:39 PST

Yes, it affects all keywords. I am testing the performance of a patch now.

Michael Saboff

Comment 5 2015-01-14 08:58:09 PST

The problem is due to parseKeyword() matching "in" or any other keyword. It then calls isIdentPart() on the next character, the \ that starts the unicode escape. isIdentPart() only looks for characters with the types CharacterIdentifierStart, CharacterZero and CharacterNumber. The \ character is CharacterBackSlash, so the keyword match is accepted. The character that results from the unicode escape \u00e9 is é, which has the character class CharacterIdentifierStart.

parseKeyword() is generated from KeywordLookupGenerator.py. It looks like it needs to be taught about escaped characters.

I am adding a new isIdentPartOrEscape() function that calls isIdentPart(). If that fails, it looks for a '\' and a valid unicode escape. If it finds one, it checks the resulting unicode character with isIdentPart().
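The check described in this comment can be sketched as follows. This is a JavaScript illustration of the logic only, not the actual patch, which lives in JavaScriptCore's lexer. The signatures, the regex-based `isIdentPart` stand-in, and the `isKeywordAt` helper are all assumptions made for the example, and only four-digit \uXXXX escapes are handled.

```javascript
// Simplified stand-in for the lexer's isIdentPart(). The real function uses a
// character-type table; here the Unicode ID_Continue property plus '$'
// serves as an approximation.
function isIdentPart(ch) {
  return /^[$\p{ID_Continue}]$/u.test(ch);
}

// Sketch of the new check: accept an ordinary identifier character, or a
// backslash that starts a valid \uXXXX escape whose decoded character is
// itself an identifier character.
function isIdentPartOrEscape(source, index) {
  const ch = source[index];
  if (ch === undefined) return false; // end of input
  if (isIdentPart(ch)) return true;
  if (ch !== "\\") return false;
  const match = /^\\u([0-9a-fA-F]{4})/.exec(source.slice(index));
  if (!match) return false;
  return isIdentPart(String.fromCharCode(parseInt(match[1], 16)));
}

// Hypothetical helper: a keyword match only counts when the keyword is not
// followed by more identifier text, escaped or not.
function isKeywordAt(source, keyword) {
  return source.startsWith(keyword) &&
    !isIdentPartOrEscape(source, keyword.length);
}

console.log(isKeywordAt("in\\u00e9dit = 1", "in")); // identifier, not a keyword
console.log(isKeywordAt("in obj", "in"));           // a real keyword
```

With a plain isIdentPart() check in the last helper, the backslash would end the identifier and "in" would be reported as a keyword, which is the behavior described in this bug.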

Michael Saboff

Comment 6 2015-01-14 10:26:22 PST

Created attachment 244611: Patch

Michael Saboff

Comment 7 2015-01-14 10:29:43 PST

Created attachment 244612: Performance results of the patch

Seems to be neutral.

Michael Saboff

Comment 8 2015-01-14 10:48:54 PST

Committed r178427: <http://trac.webkit.org/changeset/178427>

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component JavaScriptCore

Assignee

Michael Saboff

Reported

2015-01-13 17:49 PST

Modified

2015-01-14 10:48 PST

CC List

2 users