RESOLVED FIXED 180538
YARR: JIT RegExps with greedy parenthesized sub patterns
https://bugs.webkit.org/show_bug.cgi?id=180538
Summary YARR: JIT RegExps with greedy parenthesized sub patterns
Michael Saboff
Reported 2017-12-07 11:38:21 PST
We currently do not JIT code for patterns like /(a)*(b)*/. Instead we handle those in the YARR interpreter which is much slower.
Attachments
Patch (79.93 KB, patch)
2017-12-07 16:54 PST, Michael Saboff
ews-watchlist: commit-queue-
Archive of layout-test-results from ews114 for mac-elcapitan (3.06 MB, application/zip)
2017-12-07 18:19 PST, EWS Watchlist
no flags
Fixed test failure. (80.00 KB, patch)
2017-12-07 19:37 PST, Michael Saboff
jfbastien: review+
Radar WebKit Bug Importer
Comment 1 2017-12-07 11:39:33 PST
Michael Saboff
Comment 2 2017-12-07 16:54:24 PST
EWS Watchlist
Comment 3 2017-12-07 18:19:39 PST
Comment on attachment 328755 [details] Patch Attachment 328755 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/5536664 New failing tests: js/regexp-zero-length-alternatives.html
EWS Watchlist
Comment 4 2017-12-07 18:19:40 PST
Created attachment 328769 [details] Archive of layout-test-results from ews114 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Michael Saboff
Comment 5 2017-12-07 19:37:16 PST
Created attachment 328782 [details] Fixed test failure. The test failure was due a JIT'ed ASSERT emitted for Debug builds. It caught the case where we didn't restore the parenthesis start index when backtracking.
JF Bastien
Comment 6 2017-12-08 10:08:32 PST
Comment on attachment 328782 [details] Fixed test failure. View in context: https://bugs.webkit.org/attachment.cgi?id=328782&action=review r=me > Source/JavaScriptCore/testRegExp.cpp:438 > + const char* regexpError; Seems less error-prone to set to nullptr here, instead of in parseRegExpLine. > Source/JavaScriptCore/yarr/YarrJIT.cpp:187 > + return sizeof(ParenContext) + sizeof(Subpatterns) * parenContextSizes.numSubpatterns() + sizeof(uintptr_t) * parenContextSizes.frameSlots(); Do we cap these enough so that they can't overflow? We deal with 32-bit values above, so even if this is size_t (because of sizeof) I can't convince myself we can't overflow. > Source/JavaScriptCore/yarr/YarrJIT.cpp:263 > + for (unsigned subpattern = firstSubpattern; subpattern <= lastSubpattern; subpattern++) { subpattern <= lastSubpattern range seems odd! I guess it's correct since it's used below as well? > Source/JavaScriptCore/yarr/YarrJIT.cpp:283 > + for (unsigned subpattern = firstSubpattern; subpattern <= lastSubpattern; subpattern++) { (here) > Source/JavaScriptCore/yarr/YarrJIT.cpp:289 > + for (unsigned frameLocation = subpatternBaseFrameLocation; frameLocation < m_parenContextSizes.frameSlots(); frameLocation++) { Versus closed range here :-) > Source/JavaScriptCore/yarr/YarrJIT.cpp:2218 > + ASSERT_NOT_REACHED(); RELEASE_ ? > Source/JavaScriptCore/yarr/YarrJIT.cpp:2276 > + ASSERT_NOT_REACHED(); RELEASE_ ? > Source/JavaScriptCore/yarr/YarrJIT.cpp:3421 > + move(TrustedImm32(1000000), remainingMatchCount); matchLimit from Yarr.h instead? > Source/JavaScriptCore/yarr/YarrJIT.h:43 > +const size_t patternContextBufferSize = 8192; // Space caller allocates to save nested parenthesis context I would make this constexpr so it clearly doesn't change, since you use it in a variable-sized array.
Michael Saboff
Comment 7 2017-12-08 12:26:14 PST
Comment on attachment 328782 [details] Fixed test failure. View in context: https://bugs.webkit.org/attachment.cgi?id=328782&action=review >> Source/JavaScriptCore/testRegExp.cpp:438 >> + const char* regexpError; > > Seems less error-prone to set to nullptr here, instead of in parseRegExpLine. Fixed. >> Source/JavaScriptCore/yarr/YarrJIT.cpp:187 >> + return sizeof(ParenContext) + sizeof(Subpatterns) * parenContextSizes.numSubpatterns() + sizeof(uintptr_t) * parenContextSizes.frameSlots(); > > Do we cap these enough so that they can't overflow? We deal with 32-bit values above, so even if this is size_t (because of sizeof) I can't convince myself we can't overflow. I'll add a check where we create the free list from sizeOf() and fall back to the interpreter on unreasonable values. >> Source/JavaScriptCore/yarr/YarrJIT.cpp:263 >> + for (unsigned subpattern = firstSubpattern; subpattern <= lastSubpattern; subpattern++) { > > subpattern <= lastSubpattern range seems odd! I guess it's correct since it's used below as well? lastSubpattern is an inclusive value. It is the last allocated subpattern. >> Source/JavaScriptCore/yarr/YarrJIT.cpp:289 >> + for (unsigned frameLocation = subpatternBaseFrameLocation; frameLocation < m_parenContextSizes.frameSlots(); frameLocation++) { > > Versus closed range here :-) frameSlots() is the number of frame slots, not the last one. >> Source/JavaScriptCore/yarr/YarrJIT.cpp:2218 >> + ASSERT_NOT_REACHED(); > > RELEASE_ ? Changed. This is a defensive check, since these node types are not emitted when JIT_ALL_PARENS_EXPRESSIONS is not defined. >> Source/JavaScriptCore/yarr/YarrJIT.cpp:2276 >> + ASSERT_NOT_REACHED(); > > RELEASE_ ? Ditto. >> Source/JavaScriptCore/yarr/YarrJIT.cpp:3421 >> + move(TrustedImm32(1000000), remainingMatchCount); > > matchLimit from Yarr.h instead? Done >> Source/JavaScriptCore/yarr/YarrJIT.h:43 >> +const size_t patternContextBufferSize = 8192; // Space caller allocates to save nested parenthesis context > > I would make this constexpr so it clearly doesn't change, since you use it in a variable-sized array. Done
Michael Saboff
Comment 8 2017-12-08 12:32:44 PST
Yusuke Suzuki
Comment 9 2017-12-09 11:43:13 PST
I think recent Yarr changes improve Speedometer/Ember! Last time I checked Ember, significant time is consumed in Yarr Interpreter. It meant Ember.js includes some RegExps which are not compiled in Yarr JIT.
Matt Lewis
Comment 10 2017-12-11 14:46:42 PST
This caused a break in the High Sierra 32-bit JSC build. https://build.webkit.org/builders/Apple%20High%20Sierra%2032-bit%20JSC%20%28BuildAndTest%29/builds/671 https://build.webkit.org/builders/Apple%20High%20Sierra%2032-bit%20JSC%20%28BuildAndTest%29/builds/671/steps/webkit-32bit-jsc-test/logs/stdio failure: ChakraCore.yaml/ChakraCore/test/es6/regex-unicode.js.default ChakraCore.yaml/ChakraCore/test/es6/regex-unicode.js.default: Detected failures: ChakraCore.yaml/ChakraCore/test/es6/regex-unicode.js.default: *** Running test #4 (3): A character set containing a range spanning multiple planes should match characters from all those planes ChakraCore.yaml/ChakraCore/test/es6/regex-unicode.js.default: Test threw exception: assert.isTrue failed: expected: true actual: false: First character in plane #1 ChakraCore.yaml/ChakraCore/test/es6/regex-unicode.js.default: FAILED ChakraCore.yaml/ChakraCore/test/es6/regex-unicode.js.default: Summary of tests: total executed: 7; passed: 6; failed: 1 FAIL: ChakraCore.yaml/ChakraCore/test/es6/regex-unicode.js.default
Note You need to log in before you can comment on or make changes to this bug.