Bug 223498

Summary: [YARR] Interpreter incorrectly matches non-BMP characters with multiple .
Product: WebKit
Component: JavaScriptCore
Status: RESOLVED FIXED
Severity: Normal
Priority: P2
Version: WebKit Nightly Build
Hardware: Unspecified
OS: Unspecified
Reporter: Michael Saboff <msaboff>
Assignee: Michael Saboff <msaboff>
CC: ews-watchlist, keith_miller, mark.lam, saam, tzagallo, webkit-bug-importer, ysuzuki
Keywords: InRadar
Attachments:
Description | Flags
Patch | mark.lam: review+, ews-feeder: commit-queue-
Updated patch to fix layout test | ysuzuki: review+, ews-feeder: commit-queue-

Description Michael Saboff 2021-03-18 21:03:09 PDT
Consider the expression:
  let m = String.fromCodePoint(0x10000).match(/../u);
It should not match.  With the u flag, the first . atom (any character except a line terminator) should consume the entire non-BMP character U+10000, leaving nothing for the second . to match, so the whole RegExp should fail.

The Yarr JIT properly processes the RegExp, but the Yarr interpreter erroneously matches.
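
The expected behavior can be sketched as a small standalone check. This is an illustration only, not the test from the patch; the variable names are made up for the example, and it assumes a conforming engine in which U+10000 occupies two UTF-16 code units but counts as a single character under the u flag.

```javascript
// U+10000 is the first code point outside the BMP; in UTF-16 it is stored
// as a surrogate pair (0xD800 0xDC00).
const s = String.fromCodePoint(0x10000);

const codeUnits = s.length;        // number of UTF-16 code units
const codePoints = [...s].length;  // number of code points (string iteration is by code point)

// With the u flag, each . consumes one whole code point.
const unicodeOne = s.match(/./u);  // expected to match the whole character
const unicodeTwo = s.match(/../u); // expected to be null: nothing left for the second .

// Without the u flag, each . consumes one code unit, so two dots match the pair.
const legacyTwo = s.match(/../);

console.log({ codeUnits, codePoints, unicodeTwo, legacyTwo: legacyTwo !== null });
```

The contrast with the non-u pattern shows why an engine that steps through the input one code unit at a time in Unicode mode would report a match for /../u.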
Comment 1 Michael Saboff 2021-03-18 21:03:45 PDT
<rdar://74698760>
Comment 2 Michael Saboff 2021-03-19 09:36:39 PDT
Created attachment 423737
Patch
Comment 3 Michael Saboff 2021-03-19 12:59:52 PDT
Created attachment 423764
Updated patch to fix layout test
Comment 4 Yusuke Suzuki 2021-03-19 13:59:06 PDT
Comment on attachment 423764
Updated patch to fix layout test

r=me
Comment 5 Michael Saboff 2021-03-22 15:08:45 PDT
Committed r274806 (235606@main): <https://commits.webkit.org/235606@main>