The range of Unicode code points extends to 0x10ffff. This codespace is broken up into 17 "planes" of 64K code points. The first plane, 0..0xffff, is called the Basic Multilingual Plane (BMP). In JavaScript, we use surrogate pairs to represent non-BMP code points. The RegExp processing differs slightly for BMP and non-BMP matching. Currently, character classes have a single flag, m_hasNonBMPCharacters, to indicate if the class has any non-BMP characters or ranges. By adding another flag that indicates if a character class has BMP code points, the processing of character classes in the regular expressions can be optimized when the character class has only characters from either the BMP or non-BMP codespace.
<rdar://problem/49335655>
Created attachment 366074 [details] Patch
Attachment 366074 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/yarr/YarrPattern.h:62: Missing spaces around | [whitespace/operators] [3] Total errors found: 1 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 366074 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366074&action=review r=me. > Source/JavaScriptCore/yarr/YarrJIT.cpp:1813 > - add32(TrustedImm32(1), countRegister); > #ifdef JIT_UNICODE_EXPRESSIONS > if (m_decodeSurrogatePairs) { > - Jump isBMPChar = branch32(LessThan, character, supplementaryPlanesBase); > - op.m_jumps.append(atEndOfInput()); > - add32(TrustedImm32(1), countRegister); > - add32(TrustedImm32(1), index); > - isBMPChar.link(this); > - } > + if (term->characterClass->hasOneCharacterSize() && !term->invert()) > + add32(TrustedImm32(term->characterClass->hasNonBMPCharacters() ? 2 : 1), countRegister); > + else { > + add32(TrustedImm32(1), countRegister); > + Jump isBMPChar = branch32(LessThan, character, supplementaryPlanesBase); > + op.m_jumps.append(atEndOfInput()); > + add32(TrustedImm32(1), countRegister); > + add32(TrustedImm32(1), index); > + isBMPChar.link(this); > + } > + } else > #endif > + add32(TrustedImm32(1), countRegister); Seems like this could be a helper method. > Source/JavaScriptCore/yarr/YarrJIT.cpp:1977 > + add32(TrustedImm32(term->characterClass->hasNonBMPCharacters() ? 2 : 1), index); Nit: it might be nice to have a helper method to get the width for one class like: Optional<size_t> CharacterClass::knownCharacterClass() { if (!hasOneCharacterSize()) return nullopt; return hasNonBMPCharacters() ? 2 : 1; }
Comment on attachment 366074 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366074&action=review > Source/JavaScriptCore/yarr/YarrJIT.cpp:1900 > + I'll eliminate this extra whitespace. > Source/JavaScriptCore/yarr/YarrJIT.cpp:1916 > + I'll eliminate this extra whitespace.
Comment on attachment 366074 [details] Patch Attachment 366074 [details] did not pass win-ews (win): Output: https://webkit-queues.webkit.org/results/11685313 New failing tests: js/regexp-unicode.html
Created attachment 366085 [details] Archive of layout-test-results from ews206 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews206 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Comment on attachment 366074 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366074&action=review >> Source/JavaScriptCore/yarr/YarrJIT.cpp:1813 >> + add32(TrustedImm32(1), countRegister); > > Seems like this could be a helper method. Done.
(In reply to Build Bot from comment #7) > Created attachment 366085 [details] > Archive of layout-test-results from ews206 for win-future > > The attached test failures were seen while running run-webkit-tests on the > win-ews. > Bot: ews206 Port: win-future Platform: > CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit Fixed this issue locally in the YarrInterpreter.
(In reply to Michael Saboff from comment #9) > (In reply to Build Bot from comment #7) > > Created attachment 366085 [details] > > Archive of layout-test-results from ews206 for win-future > > > > The attached test failures were seen while running run-webkit-tests on the > > win-ews. > > Bot: ews206 Port: win-future Platform: > > CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit > > Fixed this issue locally in the YarrInterpreter. The jsc-armv7 test failure is the same fixed issue as the win test failure.
Created attachment 366183 [details] Patch fo landing
Attachment 366183 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/yarr/YarrPattern.h:62: Missing spaces around | [whitespace/operators] [3] Total errors found: 1 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 366185 [details] Updated Patch for landing - fixed MIPS build
Attachment 366185 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/yarr/YarrPattern.h:62: Missing spaces around | [whitespace/operators] [3] Total errors found: 1 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
Committed r243642: <https://trac.webkit.org/changeset/243642>