Bug 196296

Summary: [YARR] Precompute BMP / non-BMP status when constructing character classes
Product: WebKit Reporter: Michael Saboff <msaboff>
Component: JavaScriptCoreAssignee: Michael Saboff <msaboff>
Status: RESOLVED FIXED    
Severity: Normal CC: ews-watchlist, guijemont, guijemont+jsc-armv7-ews, keith_miller, mark.lam, saam, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
keith_miller: review+, ews-watchlist: commit-queue-
Archive of layout-test-results from ews206 for win-future
none
Patch fo landing
none
Updated Patch for landing - fixed MIPS build none

Michael Saboff
Reported 2019-03-27 08:15:56 PDT
The range of Unicode code points extends to 0x10ffff. This codespace is broken up into 17 "planes" of 64K code points. The first plane, 0..0xffff, is called the Basic Multilingual Plane (BMP). In JavaScript, we use surrogate pairs to represent non-BMP code points. The RegExp processing differs slightly for BMP and non-BMP matching. Currently, character classes have a single flag, m_hasNonBMPCharacters, to indicate if the class has any non-BMP characters or ranges. By adding another flag that indicates if a character class has BMP code points, the processing of character classes in the regular expressions can be optimized when the character class has only characters from either the BMP or non-BMP codespace.
Attachments
Patch (35.49 KB, patch)
2019-03-27 09:02 PDT, Michael Saboff
keith_miller: review+
ews-watchlist: commit-queue-
Archive of layout-test-results from ews206 for win-future (12.88 MB, application/zip)
2019-03-27 11:09 PDT, EWS Watchlist
no flags
Patch fo landing (37.46 KB, patch)
2019-03-28 10:45 PDT, Michael Saboff
no flags
Updated Patch for landing - fixed MIPS build (36.47 KB, patch)
2019-03-28 11:08 PDT, Michael Saboff
no flags
Radar WebKit Bug Importer
Comment 1 2019-03-27 08:16:28 PDT
Michael Saboff
Comment 2 2019-03-27 09:02:08 PDT
EWS Watchlist
Comment 3 2019-03-27 09:03:35 PDT
Attachment 366074 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/yarr/YarrPattern.h:62: Missing spaces around | [whitespace/operators] [3] Total errors found: 1 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
Keith Miller
Comment 4 2019-03-27 10:28:10 PDT
Comment on attachment 366074 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366074&action=review r=me. > Source/JavaScriptCore/yarr/YarrJIT.cpp:1813 > - add32(TrustedImm32(1), countRegister); > #ifdef JIT_UNICODE_EXPRESSIONS > if (m_decodeSurrogatePairs) { > - Jump isBMPChar = branch32(LessThan, character, supplementaryPlanesBase); > - op.m_jumps.append(atEndOfInput()); > - add32(TrustedImm32(1), countRegister); > - add32(TrustedImm32(1), index); > - isBMPChar.link(this); > - } > + if (term->characterClass->hasOneCharacterSize() && !term->invert()) > + add32(TrustedImm32(term->characterClass->hasNonBMPCharacters() ? 2 : 1), countRegister); > + else { > + add32(TrustedImm32(1), countRegister); > + Jump isBMPChar = branch32(LessThan, character, supplementaryPlanesBase); > + op.m_jumps.append(atEndOfInput()); > + add32(TrustedImm32(1), countRegister); > + add32(TrustedImm32(1), index); > + isBMPChar.link(this); > + } > + } else > #endif > + add32(TrustedImm32(1), countRegister); Seems like this could be a helper method. > Source/JavaScriptCore/yarr/YarrJIT.cpp:1977 > + add32(TrustedImm32(term->characterClass->hasNonBMPCharacters() ? 2 : 1), index); Nit: it might be nice to have a helper method to get the width for one class like: Optional<size_t> CharacterClass::knownCharacterClass() { if (!hasOneCharacterSize()) return nullopt; return hasNonBMPCharacters() ? 2 : 1; }
Michael Saboff
Comment 5 2019-03-27 10:29:14 PDT
Comment on attachment 366074 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366074&action=review > Source/JavaScriptCore/yarr/YarrJIT.cpp:1900 > + I'll eliminate this extra whitespace. > Source/JavaScriptCore/yarr/YarrJIT.cpp:1916 > + I'll eliminate this extra whitespace.
EWS Watchlist
Comment 6 2019-03-27 11:09:09 PDT
Comment on attachment 366074 [details] Patch Attachment 366074 [details] did not pass win-ews (win): Output: https://webkit-queues.webkit.org/results/11685313 New failing tests: js/regexp-unicode.html
EWS Watchlist
Comment 7 2019-03-27 11:09:21 PDT
Created attachment 366085 [details] Archive of layout-test-results from ews206 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews206 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Michael Saboff
Comment 8 2019-03-27 13:50:36 PDT
Comment on attachment 366074 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366074&action=review >> Source/JavaScriptCore/yarr/YarrJIT.cpp:1813 >> + add32(TrustedImm32(1), countRegister); > > Seems like this could be a helper method. Done.
Michael Saboff
Comment 9 2019-03-28 10:30:09 PDT
(In reply to Build Bot from comment #7) > Created attachment 366085 [details] > Archive of layout-test-results from ews206 for win-future > > The attached test failures were seen while running run-webkit-tests on the > win-ews. > Bot: ews206 Port: win-future Platform: > CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit Fixed this issue locally in the YarrInterpreter.
Michael Saboff
Comment 10 2019-03-28 10:32:24 PDT
(In reply to Michael Saboff from comment #9) > (In reply to Build Bot from comment #7) > > Created attachment 366085 [details] > > Archive of layout-test-results from ews206 for win-future > > > > The attached test failures were seen while running run-webkit-tests on the > > win-ews. > > Bot: ews206 Port: win-future Platform: > > CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit > > Fixed this issue locally in the YarrInterpreter. The jsc-armv7 test failure is the same fixed issue as the win test failure.
Michael Saboff
Comment 11 2019-03-28 10:45:24 PDT
Created attachment 366183 [details] Patch fo landing
EWS Watchlist
Comment 12 2019-03-28 10:48:26 PDT
Attachment 366183 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/yarr/YarrPattern.h:62: Missing spaces around | [whitespace/operators] [3] Total errors found: 1 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
Michael Saboff
Comment 13 2019-03-28 11:08:20 PDT
Created attachment 366185 [details] Updated Patch for landing - fixed MIPS build
EWS Watchlist
Comment 14 2019-03-28 11:10:53 PDT
Attachment 366185 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/yarr/YarrPattern.h:62: Missing spaces around | [whitespace/operators] [3] Total errors found: 1 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
Michael Saboff
Comment 15 2019-03-28 23:05:58 PDT
Note You need to log in before you can comment on or make changes to this bug.