Bug 16695 - JSC allows non-identifier codepoints in identifiers (affects Acid3)
Summary: JSC allows non-identifier codepoints in identifiers (affects Acid3)
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Darin Adler
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-01-01 01:43 PST by Eric Seidel (no email)
Modified: 2008-01-01 22:44 PST (History)
2 users (show)

See Also:


Attachments
patch (7.93 KB, patch)
2008-01-01 11:11 PST, Darin Adler
eric: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2008-01-01 01:43:33 PST
JSC allows non-identifier codepoints in identifiers (Acid3 bug)

We don't throw a parse error exception here like we should:

    function () {
      // test 85: ES3 section 7.3 (unicode escapes can't be used to put non-identifier characters into identifiers)
      // and there's no other place for them in the syntax (other than strings, of course)
      var ok = false;
      try {
        eval("var test = { };\ntest.i= 0;\ntest.i\\u002b= 1;\ntest.i;\n");
      } catch (e) {
        ok = true;
      }
      assert(ok, "\\u002b was not considered a parse error in script");
      return 6;
    },
Comment 1 Alexey Proskuryakov 2008-01-01 03:33:38 PST
FWIW, Firefox fails this test, too.

Strangely, in my copy of ECMA-262 3rd edition, it's section 7.6, not 7.3.
Comment 2 Darin Adler 2008-01-01 10:13:23 PST
Yes, it's section 7.6 -- I think that's just a mistake in Hixie's comment.
Comment 3 Darin Adler 2008-01-01 11:11:52 PST
Created attachment 18227 [details]
patch
Comment 4 Eric Seidel (no email) 2008-01-01 12:55:31 PST
Comment on attachment 18227 [details]
patch

1.  I'm confused by your testing of \u vs \\u at the end
shouldBe("var test = { }; test.i= 0; test.i\u002b= 1; test.i;", "1");
The fact that that is supposed to work and \\u isn't, confuses me, and calls into question all the other \\u tests above
I guess I'm just not sure what '\u' would be interpreted as, and why it's a valid identifier char.
2.  I don't see a test for valid identifier part unicodes used at a identifier start
3.  it looks like the logic for checking \u could be squashed into the two "call sites" instead of making a new state
(which would be faster, and fewer states)

Assuming you address those comments via IRC or here, the patch looks great. :)  Marking r+ since I don't need to see another patch.
Comment 5 Ian 'Hixie' Hickson 2008-01-01 14:26:54 PST
This:
   shouldBe("var test = { }; test.i= 0; test.i\u002b= 1; test.i;", "1");
...is exactly equivalen to:
   shouldBe("var test = { }; test.i= 0; test.i+= 1; test.i;", "1");
...because the backslash escape sequence is interpreted when the original script is parsed, not when the eval() script is parsed.
Comment 6 Darin Adler 2008-01-01 14:35:13 PST
(In reply to comment #5)
> This:
>    shouldBe("var test = { }; test.i= 0; test.i\u002b= 1; test.i;", "1");
> ...is exactly equivalen to:
>    shouldBe("var test = { }; test.i= 0; test.i+= 1; test.i;", "1");
> ...because the backslash escape sequence is interpreted when the original
> script is parsed, not when the eval() script is parsed.

Yes. Eric was quoting one of two different test lines. The line above has two backslashes and expects an exception rather than the value 1.
Comment 7 Darin Adler 2008-01-01 22:44:36 PST
r29075