RESOLVED FIXED Bug 151597
Some tests fail with ES6 `u` (Unicode) flag for regular expressions
https://bugs.webkit.org/show_bug.cgi?id=151597
Attachments
Patch addressing \w and \W with "iu" flags (13.02 KB, patch)
2016-04-13 16:54 PDT, Michael Saboff
no flags
Mathias Bynens
Comment 1 2015-11-25 02:29:29 PST
More background on the `u` flag for regular expressions: https://mathiasbynens.be/notes/es6-unicode-regex
Mathias Bynens
Comment 2 2016-03-30 12:44:07 PDT
Seems like this has been implemented in Safari Technology Preview v9.1.1 (11601.6.10, 11602.1.25).

However, the implementation is buggy: http://mathias.html5.org/tests/javascript/regexp/

The following tests fail:

assert_equals(/𝌆{2}/u.test('𝌆𝌆'), true);
assert_equals(/\uD834\uDF06{2}/u.test('\uD834\uDF06\uD834\uDF06'), true);
assert_equals(/\W/iu.test('S'), true);
assert_equals(/\W/iu.test('K'), true);

Please fix this before shipping this in a stable release to avoid compatibility problems.
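The first two assertions depend on how the `u` flag scopes a quantifier over an astral symbol. The following is a minimal sketch of that difference, not part of the original test suite; the variable names are illustrative, and the comments describe what a spec-conforming engine is expected to do.

```javascript
// U+1D306 (𝌆) is stored in a JavaScript string as the surrogate pair \uD834\uDF06.
const tetragram = '\uD834\uDF06';
const doubled = tetragram + tetragram;

// With `u`, the atom before {2} is the whole code point, so the pattern
// means "𝌆 repeated twice".
const withUnicodeFlag = new RegExp(tetragram + '{2}', 'u').test(doubled);

// Without `u`, {2} binds only to the trailing surrogate \uDF06, so the
// pattern means \uD834\uDF06\uDF06, which does not occur in `doubled`.
const withoutUnicodeFlag = new RegExp(tetragram + '{2}').test(doubled);

console.log(withUnicodeFlag, withoutUnicodeFlag);
```

An engine that reports `false` for `withUnicodeFlag` is showing the quantifier bug described in this comment.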
Radar WebKit Bug Importer
Comment 3 2016-03-30 12:48:38 PDT
Timothy Hatcher
Comment 4 2016-03-30 12:53:08 PDT
This was implemented with bug 154842.
Michael Saboff
Comment 5 2016-03-30 15:59:00 PDT
(In reply to comment #2)
> Seems like this has been implemented in Safari Technology Preview v9.1.1
> (11601.6.10, 11602.1.25).
>
> However, the implementation is buggy:
> http://mathias.html5.org/tests/javascript/regexp/
>
> The following tests fail:
>
> assert_equals(/𝌆{2}/u.test('𝌆𝌆'), true);
> assert_equals(/\uD834\uDF06{2}/u.test('\uD834\uDF06\uD834\uDF06'), true);

These two tests point out a bug in quantified Unicode regular expression processing.

> assert_equals(/\W/iu.test('S'), true);
> assert_equals(/\W/iu.test('K'), true);

According to the CharacterClassEscape pattern semantic rules specified in the ES6 spec section 21.2.2.12 (https://tc39.github.io/ecma262/2016/#sec-characterclassescape), along with the canonicalization rules found at 21.2.2.8.2 (https://tc39.github.io/ecma262/2016/#sec-runtime-semantics-canonicalize-ch), upper case ASCII 'S' and 'K' ARE word characters and therefore should fail to match the non-word, \W, character class. This also holds true when the ignore case flag is provided.

Note that the Chrome team believes that the current Chrome canary (51.0.2692.0 canary) incorrectly handles these two test cases. This Chrome issue is tracked in https://bugs.chromium.org/p/v8/issues/detail?id=4879.

> Please fix this before shipping this in a stable release to avoid
> compatibility problems.

I created a new bug (https://bugs.webkit.org/show_bug.cgi?id=156044) to track just the quantified Unicode RegExp test failures.
Mathias Bynens
Comment 6 2016-03-30 23:27:40 PDT
(In reply to comment #5)
> (In reply to comment #2)
> > Seems like this has been implemented in Safari Technology Preview v9.1.1
> > (11601.6.10, 11602.1.25).
> >
> > However, the implementation is buggy:
> > http://mathias.html5.org/tests/javascript/regexp/
> >
> > The following tests fail:
> >
> > assert_equals(/𝌆{2}/u.test('𝌆𝌆'), true);
> > assert_equals(/\uD834\uDF06{2}/u.test('\uD834\uDF06\uD834\uDF06'), true);
>
> These two tests point out a bug in quantified Unicode regular expression
> processing.
>
> > assert_equals(/\W/iu.test('S'), true);
> > assert_equals(/\W/iu.test('K'), true);
>
> According to the CharacterClassEscape pattern semantic rules specified in the
> ES6 spec section 21.2.2.12
> (https://tc39.github.io/ecma262/2016/#sec-characterclassescape) along with
> the canonicalization rules found at 21.2.2.8.2
> (https://tc39.github.io/ecma262/2016/#sec-runtime-semantics-canonicalize-ch),
> upper case ASCII 'S' and 'K' ARE word characters and therefore should fail
> with the non-word, \W, character class.

Without the `u` and `i` flags enabled, this statement is entirely correct.

> This also holds true for when the ignore case flag is provided.

This is incorrect, though. Did you read the explanation at https://mathiasbynens.be/notes/es6-unicode-regex#impact-i?

> Note that the Chrome team believes that the current Chrome canary
> (51.0.2692.0 canary) incorrectly handles these two test cases. This Chrome
> issue is tracked in https://bugs.chromium.org/p/v8/issues/detail?id=4879.

No, they got it right: https://bugs.chromium.org/p/v8/issues/detail?id=4879#c3

> I created a new bug (https://bugs.webkit.org/show_bug.cgi?id=156044) to
> track just the quantified unicode RegExp test failures.

Thanks.
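The disagreement turns on the fact that, with both `i` and `u` set, canonicalization uses Unicode simple case folding, under which two non-ASCII characters fold to ASCII letters: U+017F (LATIN SMALL LETTER LONG S) folds to 's' and U+212A (KELVIN SIGN) folds to 'k'. A minimal sketch of that effect follows; the helper `equalIgnoringCase` is a hypothetical name introduced here for illustration only.

```javascript
const LONG_S = '\u017F'; // LATIN SMALL LETTER LONG S
const KELVIN = '\u212A'; // KELVIN SIGN

// Hypothetical helper: does a one-character pattern match `text`
// case-insensitively, with or without the `u` flag?
function equalIgnoringCase(patternChar, text, unicode) {
  const flags = unicode ? 'iu' : 'i';
  return new RegExp('^' + patternChar + '$', flags).test(text);
}

// With `iu`, simple case folding maps both characters onto ASCII letters.
const longSFoldsWithU = equalIgnoringCase(LONG_S, 's', true);
const kelvinFoldsWithU = equalIgnoringCase(KELVIN, 'k', true);

// With `i` alone, canonicalization never maps a non-ASCII character
// onto an ASCII one, so neither pair is considered equal.
const longSFoldsWithoutU = equalIgnoringCase(LONG_S, 's', false);
const kelvinFoldsWithoutU = equalIgnoringCase(KELVIN, 'k', false);

console.log(longSFoldsWithU, kelvinFoldsWithU, longSFoldsWithoutU, kelvinFoldsWithoutU);
```

Neither U+017F nor U+212A is in [A-Za-z0-9_], which is why they fall inside the \W set and become relevant to the /\W/iu assertions.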
Michael Saboff
Comment 7 2016-04-13 14:59:10 PDT
As the standard is currently written, /\W/iu should match 's', 'k', 'S' and 'K'. I disagree with the standard and have created a pull request to change it. That request can be found at https://github.com/tc39/ecma262/pull/525. In the meantime, I will fix the implementation.
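One way to see why the standard, as written at the time, yields this result is to model the matching rule by hand: the \W set is built as the complement of [A-Za-z0-9_] first, and only afterwards are the input character and each set member canonicalized and compared. The sketch below is a simplified model under stated assumptions and is not WebKit's implementation. `SIMPLE_FOLD` is a two-entry excerpt of the case folding data, `NON_WORD_SAMPLE` stands in for the full \W set, and every function name is hypothetical.

```javascript
// Tiny excerpt of Unicode simple case folding (assumption: only the two
// non-ASCII characters that fold to ASCII letters matter for this demo).
const SIMPLE_FOLD = new Map([
  ['\u017F', 's'], // LATIN SMALL LETTER LONG S
  ['\u212A', 'k'], // KELVIN SIGN
]);

// Simplified stand-in for Canonicalize(ch) under the `iu` flags.
function canonicalize(ch) {
  if (SIMPLE_FOLD.has(ch)) return SIMPLE_FOLD.get(ch);
  return /^[A-Z]$/.test(ch) ? ch.toLowerCase() : ch;
}

const isBasicWordChar = (ch) => /^[A-Za-z0-9_]$/.test(ch);

// A few characters standing in for the full complement of [A-Za-z0-9_].
const NON_WORD_SAMPLE = ['\u017F', '\u212A', '!', ' '];

// Model of the rule: the input matches \W if some member of the \W set
// canonicalizes to the same character as the input does.
function matchesNonWordIU(ch) {
  const target = canonicalize(ch);
  return NON_WORD_SAMPLE
    .filter((member) => !isBasicWordChar(member))
    .some((member) => canonicalize(member) === target);
}

const results = ['s', 'k', 'S', 'K', 'a'].map((ch) => [ch, matchesNonWordIU(ch)]);
console.log(results);
```

In this model 's', 'k', 'S' and 'K' all match because U+017F and U+212A sit in the \W set and fold to 's' and 'k', while 'a' does not match because no non-word character folds to it.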
Michael Saboff
Comment 8 2016-04-13 16:54:23 PDT
Created attachment 276366 [details] Patch addressing \w and \W with "iu" flags
Geoffrey Garen
Comment 9 2016-04-13 16:56:08 PDT
Comment on attachment 276366 [details] Patch addressing \w and \W with "iu" flags

r=me
WebKit Commit Bot
Comment 10 2016-04-13 17:47:32 PDT
Comment on attachment 276366 [details] Patch addressing \w and \W with "iu" flags

Clearing flags on attachment: 276366

Committed r199523: <http://trac.webkit.org/changeset/199523>
WebKit Commit Bot
Comment 11 2016-04-13 17:47:35 PDT
All reviewed patches have been landed. Closing bug.