WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
228301
[JSC] Yarr should perform BoyerMoore search
https://bugs.webkit.org/show_bug.cgi?id=228301
Summary
[JSC] Yarr should perform BoyerMoore search
Yusuke Suzuki
Reported
2021-07-26 14:08:46 PDT
We would like to start the simple one. Let's try to deploy BM search for the top-level disjunction first. Then, let's extend the coverage for the other disjunctions.
Attachments
Patch
(7.58 KB, patch)
2021-07-26 14:18 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(8.26 KB, patch)
2021-07-26 14:24 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(23.67 KB, patch)
2021-07-27 14:54 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(23.59 KB, patch)
2021-07-27 14:58 PDT
,
Yusuke Suzuki
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(24.00 KB, patch)
2021-07-27 15:13 PDT
,
Yusuke Suzuki
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(24.01 KB, patch)
2021-07-27 16:35 PDT
,
Yusuke Suzuki
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(26.73 KB, patch)
2021-07-27 19:43 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(28.76 KB, patch)
2021-07-27 20:21 PDT
,
Yusuke Suzuki
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(28.87 KB, patch)
2021-07-27 20:37 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(29.03 KB, patch)
2021-07-27 21:03 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(29.12 KB, patch)
2021-07-27 23:01 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(46.44 KB, patch)
2021-07-28 16:42 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(46.91 KB, patch)
2021-07-28 17:55 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(46.98 KB, patch)
2021-07-28 17:57 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(47.27 KB, patch)
2021-07-28 18:00 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(47.29 KB, patch)
2021-07-28 18:07 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(47.78 KB, patch)
2021-07-28 18:51 PDT
,
Yusuke Suzuki
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(47.83 KB, patch)
2021-07-28 18:54 PDT
,
Yusuke Suzuki
saam
: review+
Details
Formatted Diff
Diff
Show Obsolete
(17)
View All
Add attachment
proposed patch, testcase, etc.
Yusuke Suzuki
Comment 1
2021-07-26 14:18:47 PDT
Created
attachment 434238
[details]
Patch
Yusuke Suzuki
Comment 2
2021-07-26 14:24:52 PDT
Created
attachment 434239
[details]
Patch
Yusuke Suzuki
Comment 3
2021-07-27 14:54:43 PDT
Created
attachment 434319
[details]
Patch
Yusuke Suzuki
Comment 4
2021-07-27 14:58:32 PDT
Created
attachment 434320
[details]
Patch
Yusuke Suzuki
Comment 5
2021-07-27 15:13:55 PDT
Created
attachment 434343
[details]
Patch
Yusuke Suzuki
Comment 6
2021-07-27 16:35:06 PDT
Created
attachment 434384
[details]
Patch
Yusuke Suzuki
Comment 7
2021-07-27 19:43:09 PDT
Created
attachment 434394
[details]
Patch
Yusuke Suzuki
Comment 8
2021-07-27 20:21:25 PDT
Created
attachment 434397
[details]
Patch
Yusuke Suzuki
Comment 9
2021-07-27 20:37:09 PDT
Created
attachment 434398
[details]
Patch
Yusuke Suzuki
Comment 10
2021-07-27 21:03:41 PDT
Created
attachment 434399
[details]
Patch
Yusuke Suzuki
Comment 11
2021-07-27 23:01:13 PDT
Created
attachment 434403
[details]
Patch
Yusuke Suzuki
Comment 12
2021-07-28 16:42:14 PDT
Created
attachment 434474
[details]
Patch
Sam Weinig
Comment 13
2021-07-28 17:46:49 PDT
Fun! I would replace "BM" with "BoyerMoore" to make it a bit clearer.
Yusuke Suzuki
Comment 14
2021-07-28 17:52:36 PDT
(In reply to Sam Weinig from
comment #13
)
> Fun! I would replace "BM" with "BoyerMoore" to make it a bit clearer.
Sounds good, relaced.
Yusuke Suzuki
Comment 15
2021-07-28 17:55:13 PDT
Created
attachment 434480
[details]
Patch
Yusuke Suzuki
Comment 16
2021-07-28 17:57:39 PDT
Created
attachment 434481
[details]
Patch
Yusuke Suzuki
Comment 17
2021-07-28 18:00:25 PDT
Created
attachment 434482
[details]
Patch
Yusuke Suzuki
Comment 18
2021-07-28 18:07:05 PDT
Created
attachment 434484
[details]
Patch
Yusuke Suzuki
Comment 19
2021-07-28 18:51:22 PDT
Created
attachment 434486
[details]
Patch
Yusuke Suzuki
Comment 20
2021-07-28 18:54:06 PDT
Created
attachment 434488
[details]
Patch
Saam Barati
Comment 21
2021-07-29 13:56:24 PDT
Comment on
attachment 434488
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=434488&action=review
r=me I also suggest filing/linking bugs for your FIXMEs
> Source/JavaScriptCore/yarr/YarrJIT.cpp:944 > + BoyerMooreInfo* m_bmInfo;
can we just default this to nullptr so we don't need to set it at the call sites
> Source/JavaScriptCore/yarr/YarrJIT.cpp:2415 > + auto vector = makeUniqueRef<BoyerMooreByteVector>(map); > + dataLogLnIf(Options::verboseRegExpCompilation(), "Found bitmap lookahead count:(", mapCount, "),range:[", beginIndex, ", ", endIndex, ")"); > + const uint8_t* pointer = vector->data(); > + if (const auto* existingPointer = m_codeBlock.findSameVector(op.m_bmInfo->index(), vector.get())) > + pointer = existingPointer; > + else > + m_bmVector.append(WTFMove(vector));
small nit: I think it's nicer to encapsulate this logic in its own function, so we don't mistakenly use a stale pointer somehow. Like, "getBMByteVector" or something, and it take in BoyerMooreBitmap::Map&
> Source/JavaScriptCore/yarr/YarrJIT.h:127 > + void set8BitCode(MacroAssemblerCodeRef<Yarr8BitPtrTag> ref, Vector<UniqueRef<BoyerMooreByteVector>>&& maps)
this should take Vector& not Vector&& unless it actually takes ownership over "maps". Alternatively, it can take "Vector", and take ownership
> Source/JavaScriptCore/yarr/YarrJIT.h:134 > + void set16BitCode(MacroAssemblerCodeRef<Yarr16BitPtrTag> ref, Vector<UniqueRef<BoyerMooreByteVector>>&& maps)
ditto
> Source/JavaScriptCore/yarr/YarrJIT.h:144 > + void set8BitCodeMatchOnly(MacroAssemblerCodeRef<YarrMatchOnly8BitPtrTag> matchOnly, Vector<UniqueRef<BoyerMooreByteVector>>&& maps)
ditto
> Source/JavaScriptCore/yarr/YarrJIT.h:151 > + void set16BitCodeMatchOnly(MacroAssemblerCodeRef<YarrMatchOnly16BitPtrTag> matchOnly, Vector<UniqueRef<BoyerMooreByteVector>>&& maps)
ditto
> Source/JavaScriptCore/yarr/YarrJIT.h:243 > + // Compile and Clear are guarded by RegExp's cell-lock. So these operations work atomically.
nit: would be nicer if they just took a const Locker& instead of a comment
> Source/JavaScriptCore/yarr/YarrJIT.h:254 > + const uint8_t* findSameVector(unsigned index, BoyerMooreByteVector& vector) const
tryFindSameBoyerMooreVector? or tryFindSameBMVector? or tryReuseBMVector? findSameVector is a weirdly generic name
Yusuke Suzuki
Comment 22
2021-07-29 15:05:40 PDT
Comment on
attachment 434488
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=434488&action=review
Thanks!
>> Source/JavaScriptCore/yarr/YarrJIT.cpp:944 >> + BoyerMooreInfo* m_bmInfo; > > can we just default this to nullptr so we don't need to set it at the call sites
Fixed.
>> Source/JavaScriptCore/yarr/YarrJIT.cpp:2415 >> + m_bmVector.append(WTFMove(vector)); > > small nit: I think it's nicer to encapsulate this logic in its own function, so we don't mistakenly use a stale pointer somehow. Like, "getBMByteVector" or something, and it take in BoyerMooreBitmap::Map&
Fixed.
>> Source/JavaScriptCore/yarr/YarrJIT.h:127 >> + void set8BitCode(MacroAssemblerCodeRef<Yarr8BitPtrTag> ref, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) > > this should take Vector& not Vector&& unless it actually takes ownership over "maps". Alternatively, it can take "Vector", and take ownership
Fixed.
>> Source/JavaScriptCore/yarr/YarrJIT.h:134 >> + void set16BitCode(MacroAssemblerCodeRef<Yarr16BitPtrTag> ref, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) > > ditto
Fixed.
>> Source/JavaScriptCore/yarr/YarrJIT.h:144 >> + void set8BitCodeMatchOnly(MacroAssemblerCodeRef<YarrMatchOnly8BitPtrTag> matchOnly, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) > > ditto
Fixed.
>> Source/JavaScriptCore/yarr/YarrJIT.h:151 >> + void set16BitCodeMatchOnly(MacroAssemblerCodeRef<YarrMatchOnly16BitPtrTag> matchOnly, Vector<UniqueRef<BoyerMooreByteVector>>&& maps) > > ditto
Fixed.
>> Source/JavaScriptCore/yarr/YarrJIT.h:243 >> + // Compile and Clear are guarded by RegExp's cell-lock. So these operations work atomically. > > nit: would be nicer if they just took a const Locker& instead of a comment
Fixed.
>> Source/JavaScriptCore/yarr/YarrJIT.h:254 >> + const uint8_t* findSameVector(unsigned index, BoyerMooreByteVector& vector) const > > tryFindSameBoyerMooreVector? or tryFindSameBMVector? or tryReuseBMVector? > > findSameVector is a weirdly generic name
Fixed.
Yusuke Suzuki
Comment 23
2021-07-29 15:26:22 PDT
Committed
r280452
(
240087@main
): <
https://commits.webkit.org/240087@main
>
Radar WebKit Bug Importer
Comment 24
2021-07-29 15:27:19 PDT
<
rdar://problem/81293620
>
Yusuke Suzuki
Comment 25
2021-07-30 14:16:53 PDT
Committed
r280496
(
240128@main
): <
https://commits.webkit.org/240128@main
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug