Bug 187248 - YARR: . doesn't match non-BMP Unicode characters in some cases
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Michael Saboff
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2018-07-02 08:05 PDT by Michael Saboff
Modified: 2018-07-10 10:34 PDT

Attachments
Patch (3.22 KB, patch)
2018-07-10 09:43 PDT, Michael Saboff

Description Michael Saboff 2018-07-02 08:05:05 PDT
The expression /^.-clef/u.test("\u{1D11E}-clef") should be true, but evaluates to false.  The issue is only present when the RegExp JIT is enabled (the default).  The YARR interpreter evaluates the expression properly.
Comment 1 Michael Saboff 2018-07-04 22:33:53 PDT
This bug was raised in this GitHub issue thread - https://github.com/kangax/compat-table/issues/1322#issuecomment-401969005

This RE also fails, but should pass:
  /c.lef/u.test( "c𝄞lef" )
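
For reference, a minimal sketch of the two failing expressions as a standalone check. The variable names are illustrative only and are not taken from the WebKit test suite. U+1D11E (MUSICAL SYMBOL G CLEF) is outside the BMP, so it occupies two UTF-16 code units, and '.' should consume both of them only when the u flag is set:

```javascript
// U+1D11E is stored as a surrogate pair, so the string length is 2 code units.
const clef = "\u{1D11E}";

// The two expressions from this bug. On an engine with correct behavior,
// both should evaluate to true.
const cases = [
  { re: /^.-clef/u, input: clef + "-clef" },
  { re: /c.lef/u, input: "c" + clef + "lef" },
];
const withUnicodeFlag = cases.map(({ re, input }) => re.test(input));

// Without the u flag, '.' matches a single code unit (one half of the
// surrogate pair), so the same patterns are expected not to match.
const withoutUnicodeFlag = [
  /^.-clef/.test(clef + "-clef"),
  /c.lef/.test("c" + clef + "lef"),
];

console.log(clef.length, withUnicodeFlag, withoutUnicodeFlag);
```

On an affected build, the entries in withUnicodeFlag would come out false while the RegExp JIT is enabled.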
Comment 2 Radar WebKit Bug Importer 2018-07-10 09:18:24 PDT
<rdar://problem/42026714>
Comment 3 Michael Saboff 2018-07-10 09:24:06 PDT
The bug is that the alternative optimizer moved the '.' character class term to after the fixed character terms.  The safety check for moving the term did not take into account that the character class is inverted (specifically, "not a newline"), and therefore the character class's m_hasNonBMPCharacters flag has an inverted sense.

The fix is to check that the character class doesn't have non-BMP characters AND that the term isn't an inverted match of that character class.
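
The logic of the check can be modeled roughly as follows. This is a sketch in JavaScript, not the actual YARR C++ code: the object shape and the function names (hasNonBMPCharacters, inverted, isSafeToMoveBeforeFix, isSafeToMoveAfterFix) are hypothetical stand-ins for the real term and character class fields.

```javascript
// Hypothetical model of a character class term. In this sketch,
// hasNonBMPCharacters describes the characters listed in the class,
// and inverted says whether the term matches everything NOT listed.
const dot = {
  description: "'.' (not a newline)",
  hasNonBMPCharacters: false, // the listed newline characters are all BMP
  inverted: true,             // ...but the term matches everything else
};

const digits = {
  description: "[0-9]",
  hasNonBMPCharacters: false,
  inverted: false,
};

// Before the fix (as described above): only the flag is consulted, so an
// inverted class whose listed characters are all BMP looks safe to move.
function isSafeToMoveBeforeFix(term) {
  return !term.hasNonBMPCharacters;
}

// After the fix: an inverted class can match non-BMP characters even when
// none are listed, so it is also treated as unsafe to move.
function isSafeToMoveAfterFix(term) {
  return !term.hasNonBMPCharacters && !term.inverted;
}

for (const term of [dot, digits]) {
  console.log(term.description, isSafeToMoveBeforeFix(term), isSafeToMoveAfterFix(term));
}
```

Under this model, '.' passes the old check but fails the new one, while a plain BMP-only class such as [0-9] passes both.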
Comment 4 Michael Saboff 2018-07-10 09:43:44 PDT
Created attachment 344706
Patch
Comment 5 Geoffrey Garen 2018-07-10 09:51:32 PDT
Comment on attachment 344706
Patch

r=me

"Does not not have BMP characters and is not notted". OK.
Comment 6 Keith Miller 2018-07-10 09:53:17 PDT
r=me too.
Comment 7 WebKit Commit Bot 2018-07-10 10:34:41 PDT
Comment on attachment 344706
Patch

Clearing flags on attachment: 344706

Committed r233690: <https://trac.webkit.org/changeset/233690>
Comment 8 WebKit Commit Bot 2018-07-10 10:34:42 PDT
All reviewed patches have been landed.  Closing bug.