RESOLVED FIXED 187248
YARR: . doesn't match non-BMP Unicode characters in some cases
https://bugs.webkit.org/show_bug.cgi?id=187248
Michael Saboff
Reported 2018-07-02 08:05:05 PDT
The expression /^.-clef/u.test("\u{1D11E}-clef") should be true, but evaluates to false. The issue is only present when the RegExp JIT is enabled (the default). The YARR interpreter evaluates the expression properly.
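A minimal reproduction sketch of the report, for running in any JavaScript shell. The variable names (clef, results) are illustrative only and do not come from the bug. U+1D11E occupies two UTF-16 code units, so with the u flag the '.' has to consume the whole surrogate pair for the match to succeed.

```javascript
// U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP, so it is stored
// as a surrogate pair (two UTF-16 code units) in a JavaScript string.
const clef = "\u{1D11E}";

const results = {
  utf16Length: clef.length,          // number of UTF-16 code units
  codePoints: [...clef].length,      // number of Unicode code points
  // With the u flag, '.' must match one whole code point.
  anchored: /^.-clef/u.test(clef + "-clef"),
  embedded: /c.lef/u.test("c" + clef + "lef"),
  // Without the u flag, '.' matches a single code unit, so the low
  // surrogate is left over and the following literal fails to match.
  withoutUnicodeFlag: /^.-clef/.test(clef + "-clef"),
};

console.log(results);
```

On a conforming engine, anchored and embedded are both true and withoutUnicodeFlag is false. An engine affected by this bug reports false for the two u-flag cases when the RegExp JIT is enabled.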
Attachments
Patch (3.22 KB, patch)
2018-07-10 09:43 PDT, Michael Saboff
no flags
Michael Saboff
Comment 1 2018-07-04 22:33:53 PDT
This bug was raised in this GitHub issue thread: https://github.com/kangax/compat-table/issues/1322#issuecomment-401969005
This regular expression also fails, but should pass: /c.lef/u.test("c𝄞lef")
Radar WebKit Bug Importer
Comment 2 2018-07-10 09:18:24 PDT
Michael Saboff
Comment 3 2018-07-10 09:24:06 PDT
The bug is that the alternative optimizer moved the '.' character class term to after the fixed character terms. The safety check for moving the term did not take into account that the character class is inverted (specifically, "not a newline"), so the character class's m_hasNonBMPCharacters flag has an inverted sense. The fix is to check both that the character class doesn't have non-BMP characters and that the term isn't an inverted check of that character class.
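The reasoning in this comment can be modeled with a small sketch. This is not the YARR source. The object shape and the function names (hasNonBMPCharacters, safeToMoveBeforeFix, safeToMoveAfterFix) are hypothetical and only mirror the wording above. The '.' term is modeled as an inverted class whose listed members are the line terminators.

```javascript
// Hypothetical model of a character-class term: the characters listed
// in the class, plus whether the term matches the complement of that list.
const dot = { characters: ["\n", "\r", "\u2028", "\u2029"], inverted: true };
const abc = { characters: ["a", "b", "c"], inverted: false };
const clefClass = { characters: ["\u{1D11E}"], inverted: false };

// Stand-in for the m_hasNonBMPCharacters flag: true when any listed
// member is above U+FFFF.
function hasNonBMPCharacters(cls) {
  return cls.characters.some((ch) => ch.codePointAt(0) > 0xffff);
}

// Check as described before the fix: it only looks at the listed members,
// so the inverted newline class is wrongly treated as BMP-only.
function safeToMoveBeforeFix(cls) {
  return !hasNonBMPCharacters(cls);
}

// Check as described after the fix: an inverted class is never treated
// as safe, because its complement can include non-BMP characters.
function safeToMoveAfterFix(cls) {
  return !hasNonBMPCharacters(cls) && !cls.inverted;
}

console.log(safeToMoveBeforeFix(dot), safeToMoveAfterFix(dot));
console.log(safeToMoveBeforeFix(abc), safeToMoveAfterFix(abc));
console.log(safeToMoveBeforeFix(clefClass), safeToMoveAfterFix(clefClass));
```

Under this model only the '.' case changes: the old check accepts it and the new check rejects it, while the plain BMP class stays movable and the non-BMP class stays rejected. The new check is conservative, since it rejects every inverted class, including one whose complement happens to contain only BMP characters.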
Michael Saboff
Comment 4 2018-07-10 09:43:44 PDT
Geoffrey Garen
Comment 5 2018-07-10 09:51:32 PDT
Comment on attachment 344706: Patch
r=me
"Does not not have BMP characters and is not notted". OK.
Keith Miller
Comment 6 2018-07-10 09:53:17 PDT
r=me too.
WebKit Commit Bot
Comment 7 2018-07-10 10:34:41 PDT
Comment on attachment 344706: Patch
Clearing flags on attachment: 344706
Committed r233690: <https://trac.webkit.org/changeset/233690>
WebKit Commit Bot
Comment 8 2018-07-10 10:34:42 PDT
All reviewed patches have been landed. Closing bug.