Summary: | [JSC] Yarr should perform BoyerMoore search | ||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Yusuke Suzuki <ysuzuki> | ||||||||||||||||||||||||||||||||||||||
Component: | JavaScriptCore | Assignee: | Yusuke Suzuki <ysuzuki> | ||||||||||||||||||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||||||||||||||||||
Severity: | Normal | CC: | benjamin, cdumez, cmarcelo, ews-watchlist, keith_miller, mark.lam, msaboff, saam, sam, tzagallo, webkit-bug-importer | ||||||||||||||||||||||||||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||||||||||||||||||||||||||
Version: | WebKit Nightly Build | ||||||||||||||||||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||||||||||||||||||
Attachments: |
|
Description
Yusuke Suzuki
2021-07-26 14:08:46 PDT
Created attachment 434238 [details]
Patch
Created attachment 434239 [details]
Patch
Created attachment 434319 [details]
Patch
Created attachment 434320 [details]
Patch
Created attachment 434343 [details]
Patch
Created attachment 434384 [details]
Patch
Created attachment 434394 [details]
Patch
Created attachment 434397 [details]
Patch
Created attachment 434398 [details]
Patch
Created attachment 434399 [details]
Patch
Created attachment 434403 [details]
Patch
Created attachment 434474 [details]
Patch
Fun! I would replace "BM" with "BoyerMoore" to make it a bit clearer. (In reply to Sam Weinig from comment #13) > Fun! I would replace "BM" with "BoyerMoore" to make it a bit clearer. Sounds good, relaced. Created attachment 434480 [details]
Patch
Created attachment 434481 [details]
Patch
Created attachment 434482 [details]
Patch
Created attachment 434484 [details]
Patch
Created attachment 434486 [details]
Patch
Created attachment 434488 [details]
Patch
Comment on attachment 434488 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434488&action=review r=me I also suggest filing/linking bugs for your FIXMEs > Source/JavaScriptCore/yarr/YarrJIT.cpp:944 > + BoyerMooreInfo* m_bmInfo; can we just default this to nullptr so we don't need to set it at the call sites > Source/JavaScriptCore/yarr/YarrJIT.cpp:2415 > + auto vector = makeUniqueRef<BoyerMooreByteVector>(map); > + dataLogLnIf(Options::verboseRegExpCompilation(), "Found bitmap lookahead count:(", mapCount, "),range:[", beginIndex, ", ", endIndex, ")"); > + const uint8_t* pointer = vector->data(); > + if (const auto* existingPointer = m_codeBlock.findSameVector(op.m_bmInfo->index(), vector.get())) > + pointer = existingPointer; > + else > + m_bmVector.append(WTFMove(vector)); small nit: I think it's nicer to encapsulate this logic in its own function, so we don't mistakenly use a stale pointer somehow. Like, "getBMByteVector" or something, and it take in BoyerMooreBitmap::Map& > Source/JavaScriptCore/yarr/YarrJIT.h:127 > + void set8BitCode(MacroAssemblerCodeRef<Yarr8BitPtrTag> ref, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) this should take Vector& not Vector&& unless it actually takes ownership over "maps". Alternatively, it can take "Vector", and take ownership > Source/JavaScriptCore/yarr/YarrJIT.h:134 > + void set16BitCode(MacroAssemblerCodeRef<Yarr16BitPtrTag> ref, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) ditto > Source/JavaScriptCore/yarr/YarrJIT.h:144 > + void set8BitCodeMatchOnly(MacroAssemblerCodeRef<YarrMatchOnly8BitPtrTag> matchOnly, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) ditto > Source/JavaScriptCore/yarr/YarrJIT.h:151 > + void set16BitCodeMatchOnly(MacroAssemblerCodeRef<YarrMatchOnly16BitPtrTag> matchOnly, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) ditto > Source/JavaScriptCore/yarr/YarrJIT.h:243 > + // Compile and Clear are guarded by RegExp's cell-lock. So these operations work atomically. nit: would be nicer if they just took a const Locker& instead of a comment > Source/JavaScriptCore/yarr/YarrJIT.h:254 > + const uint8_t* findSameVector(unsigned index, BoyerMooreByteVector& vector) const tryFindSameBoyerMooreVector? or tryFindSameBMVector? or tryReuseBMVector? findSameVector is a weirdly generic name Comment on attachment 434488 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434488&action=review Thanks! >> Source/JavaScriptCore/yarr/YarrJIT.cpp:944 >> + BoyerMooreInfo* m_bmInfo; > > can we just default this to nullptr so we don't need to set it at the call sites Fixed. >> Source/JavaScriptCore/yarr/YarrJIT.cpp:2415 >> + m_bmVector.append(WTFMove(vector)); > > small nit: I think it's nicer to encapsulate this logic in its own function, so we don't mistakenly use a stale pointer somehow. Like, "getBMByteVector" or something, and it take in BoyerMooreBitmap::Map& Fixed. >> Source/JavaScriptCore/yarr/YarrJIT.h:127 >> + void set8BitCode(MacroAssemblerCodeRef<Yarr8BitPtrTag> ref, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) > > this should take Vector& not Vector&& unless it actually takes ownership over "maps". Alternatively, it can take "Vector", and take ownership Fixed. >> Source/JavaScriptCore/yarr/YarrJIT.h:134 >> + void set16BitCode(MacroAssemblerCodeRef<Yarr16BitPtrTag> ref, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) > > ditto Fixed. >> Source/JavaScriptCore/yarr/YarrJIT.h:144 >> + void set8BitCodeMatchOnly(MacroAssemblerCodeRef<YarrMatchOnly8BitPtrTag> matchOnly, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) > > ditto Fixed. >> Source/JavaScriptCore/yarr/YarrJIT.h:151 >> + void set16BitCodeMatchOnly(MacroAssemblerCodeRef<YarrMatchOnly16BitPtrTag> matchOnly, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) > > ditto Fixed. >> Source/JavaScriptCore/yarr/YarrJIT.h:243 >> + // Compile and Clear are guarded by RegExp's cell-lock. So these operations work atomically. > > nit: would be nicer if they just took a const Locker& instead of a comment Fixed. >> Source/JavaScriptCore/yarr/YarrJIT.h:254 >> + const uint8_t* findSameVector(unsigned index, BoyerMooreByteVector& vector) const > > tryFindSameBoyerMooreVector? or tryFindSameBMVector? or tryReuseBMVector? > > findSameVector is a weirdly generic name Fixed. Committed r280452 (240087@main): <https://commits.webkit.org/240087@main> Committed r280496 (240128@main): <https://commits.webkit.org/240128@main> |