WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 151597
Some tests fail with ES6 `u` (Unicode) flag for regular expressions
https://bugs.webkit.org/show_bug.cgi?id=151597
Summary
Some tests fail with ES6 `u` (Unicode) flag for regular expressions
Mathias Bynens
Reported
2015-11-25 02:26:11 PST
Spec:
https://tc39.github.io/ecma262/#sec-pattern-semantics
More background:
https://mathiasbynens.be/notes/javascript-unicode
Attachments
Patch addressing \w and \W with "iu" flags
(13.02 KB, patch)
2016-04-13 16:54 PDT
,
Michael Saboff
no flags
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Mathias Bynens
Comment 1
2015-11-25 02:29:29 PST
More background on the `u` flag for regular expressions:
https://mathiasbynens.be/notes/es6-unicode-regex
Mathias Bynens
Comment 2
2016-03-30 12:44:07 PDT
Seems like this has been implemented in Safari Technology Preview v9.1.1 (11601.6.10, 11602.1.25). However, the implementation is buggy:
http://mathias.html5.org/tests/javascript/regexp/
The following tests fail: assert_equals(/𝌆{2}/u.test('𝌆𝌆'), true); assert_equals(/\uD834\uDF06{2}/u.test('\uD834\uDF06\uD834\uDF06'), true); assert_equals(/\W/iu.test('S'), true); assert_equals(/\W/iu.test('K'), true); Please fix this before shipping this in a stable release to avoid compatibility problems.
Radar WebKit Bug Importer
Comment 3
2016-03-30 12:48:38 PDT
<
rdar://problem/25447036
>
Timothy Hatcher
Comment 4
2016-03-30 12:53:08 PDT
This was implemented with
bug 154842
.
Michael Saboff
Comment 5
2016-03-30 15:59:00 PDT
(In reply to
comment #2
)
> Seems like this has been implemented in Safari Technology Preview v9.1.1 > (11601.6.10, 11602.1.25). > > However, the implementation is buggy: >
http://mathias.html5.org/tests/javascript/regexp/
> > The following tests fail: > > assert_equals(/𝌆{2}/u.test('𝌆𝌆'), true); > assert_equals(/\uD834\uDF06{2}/u.test('\uD834\uDF06\uD834\uDF06'), true);
These two tests point out bug in quantified unicode regular expression processing.
> assert_equals(/\W/iu.test('S'), true); > assert_equals(/\W/iu.test('K'), true);
According the CharacterClassEscape pattern semantic rules specified in the ES6 spec section 21.2.2.12 (
https://tc39.github.io/ecma262/2016/#sec-characterclassescape
) along with the canonicalization rules found at 21.2.2.8.2 (
https://tc39.github.io/ecma262/2016/#sec-runtime-semantics-canonicalize-ch
), upper case ASCII 'S' and 'K' ARE word characters and therefore should fail with the non-word, \W, character class. This also holds true for when the ignore case flag is provided. Note that the Chrome team believes that the current Chrome canary (51.0.2692.0 canary) incorrectly handles these two test cases. This Chrome issue is tracked in
https://bugs.chromium.org/p/v8/issues/detail?id=4879
.
> Please fix this before shipping this in a stable release to avoid > compatibility problems.
I created a new bug (
https://bugs.webkit.org/show_bug.cgi?id=156044
) to track just the quantified unicode RegExp test failures.
Mathias Bynens
Comment 6
2016-03-30 23:27:40 PDT
(In reply to
comment #5
)
> (In reply to
comment #2
) > > Seems like this has been implemented in Safari Technology Preview v9.1.1 > > (11601.6.10, 11602.1.25). > > > > However, the implementation is buggy: > >
http://mathias.html5.org/tests/javascript/regexp/
> > > > The following tests fail: > > > > assert_equals(/𝌆{2}/u.test('𝌆𝌆'), true); > > assert_equals(/\uD834\uDF06{2}/u.test('\uD834\uDF06\uD834\uDF06'), true); > > These two tests point out bug in quantified unicode regular expression > processing. > > > assert_equals(/\W/iu.test('S'), true); > > assert_equals(/\W/iu.test('K'), true); > > According the CharacterClassEscape pattern semantic rules specified in the > ES6 spec section 21.2.2.12 > (
https://tc39.github.io/ecma262/2016/#sec-characterclassescape
) along with > the canonicalization rules found at 21.2.2.8.2 > (
https://tc39.github.io/ecma262/2016/#sec-runtime-semantics-canonicalize-ch
), > upper case ASCII 'S' and 'K' ARE word characters and therefore should fail > with the non-word, \W, character class.
Without the `u` and `i` flags enabled, this statements is entirely correct.
> This also holds true for when the ignore case flag is provided.
This is incorrect, though. Did you read the explanation at
https://mathiasbynens.be/notes/es6-unicode-regex#impact-i
?
> Note that the Chrome team believes that the current Chrome canary > (51.0.2692.0 canary) incorrectly handles these two test cases. This Chrome > issue is tracked in
https://bugs.chromium.org/p/v8/issues/detail?id=4879
.
No, they got it right:
https://bugs.chromium.org/p/v8/issues/detail?id=4879#c3
> I created a new bug (
https://bugs.webkit.org/show_bug.cgi?id=156044
) to > track just the quantified unicode RegExp test failures.
Thanks.
Michael Saboff
Comment 7
2016-04-13 14:59:10 PDT
As the standard is currently written, /\W/iu should match 's', 'k', 'S' and 'K'. I disagree with the standard and have created a pull request to change the standard. That request can be found at
https://github.com/tc39/ecma262/pull/525
. In the mean time, I will fix the implementation.
Michael Saboff
Comment 8
2016-04-13 16:54:23 PDT
Created
attachment 276366
[details]
Patch addressing \w and \W with "iu" flags
Geoffrey Garen
Comment 9
2016-04-13 16:56:08 PDT
Comment on
attachment 276366
[details]
Patch addressing \w and \W with "iu" flags r=me
WebKit Commit Bot
Comment 10
2016-04-13 17:47:32 PDT
Comment on
attachment 276366
[details]
Patch addressing \w and \W with "iu" flags Clearing flags on attachment: 276366 Committed
r199523
: <
http://trac.webkit.org/changeset/199523
>
WebKit Commit Bot
Comment 11
2016-04-13 17:47:35 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug