WebKit Bugzilla
RESOLVED FIXED
285121
[Yarr] Improve processing of \s* and related simple Character Class atoms
https://bugs.webkit.org/show_bug.cgi?id=285121
Michael Saboff
Reported
2024-12-23 17:13:12 PST
Currently, the JIT code generated for matching \s* against an 8-bit string is:

    1:Term PatternCharacterClass checked-offset:(0) <whitespace> {0,...} greedy
     <68> 0x11aabce04: movz w7, #0x0
     <72> 0x11aabce08: cmp w1, w2
     <76> 0x11aabce0c: b.eq 0x11aabce44 -> <132>
     <80> 0x11aabce10: ldrb w6, [x0, x1]
     <84> 0x11aabce14: cmp w6, #9
     <88> 0x11aabce18: b.lt 0x11aabce34 -> <116>
     <92> 0x11aabce1c: cmp w6, #13
     <96> 0x11aabce20: b.le 0x11aabce38 -> <120>
    <100> 0x11aabce24: cmp w6, #32
    <104> 0x11aabce28: b.eq 0x11aabce38 -> <120>
    <108> 0x11aabce2c: cmp w6, #160
    <112> 0x11aabce30: b.eq 0x11aabce38 -> <120>
    <116> 0x11aabce34: b 0x11aabce44 -> <132>
    <120> 0x11aabce38: add w1, w1, #1
    <124> 0x11aabce3c: add w7, w7, #1
    <128> 0x11aabce40: b 0x11aabce08 -> <72>
    <132> 0x11aabce44: stur x7, [sp, #8]

The JIT code generated for matching the same atom against 16-bit strings is slightly better:

    1:Term PatternCharacterClass checked-offset:(0) <whitespace> {0,...} greedy
     <68> 0x11aabcf44: movz w7, #0x0
     <72> 0x11aabcf48: cmp w1, w2
     <76> 0x11aabcf4c: b.eq 0x11aabcf78 -> <120>
     <80> 0x11aabcf50: ldrh w6, [x0, x1, lsl #1]
     <84> 0x11aabcf54: movz x17, #0xb501
     <88> 0x11aabcf58: movk x17, #0xe5d, lsl #16
     <92> 0x11aabcf5c: movk x17, #0x1, lsl #32 -> 0x10e5db501
     <96> 0x11aabcf60: ldrb w17, [x6, x17]
    <100> 0x11aabcf64: cbnz w17, 0x11aabcf6c -> <108>
    <104> 0x11aabcf68: b 0x11aabcf78 -> <120>
    <108> 0x11aabcf6c: add w1, w1, #1
    <112> 0x11aabcf70: add w7, w7, #1
    <116> 0x11aabcf74: b 0x11aabcf48 -> <72>
    <120> 0x11aabcf78: stur x7, [sp, #8]

There are two issues with the 8-bit matching. First, it isn't using the character table from the builtin spaces character class. Second, we branch over a branch (instructions at offsets 112 & 116). The 16-bit matching code only has the branch-over-a-branch issue (see instructions at offsets 100 & 104).
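For readers less familiar with the table approach, here is a rough C++ sketch of what a table-driven greedy \s* loop over 8-bit input could look like. It is an illustration only, not the Yarr implementation or the code from the pull request: the names `spacesTable`, `makeSpacesTable`, and `matchSpacesGreedy` are hypothetical, and the whitespace set (9 through 13, 32, and 160) is taken from the compare chain in the 8-bit listing above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 256-entry table: nonzero means "this Latin-1 code unit is whitespace".
// The set mirrors the compares in the 8-bit listing: 9..13, 32 and 160.
static std::array<std::uint8_t, 256> makeSpacesTable()
{
    std::array<std::uint8_t, 256> table{};
    for (unsigned c = 9; c <= 13; ++c)
        table[c] = 1;
    table[32] = 1;
    table[160] = 1;
    return table;
}

static const std::array<std::uint8_t, 256> spacesTable = makeSpacesTable();

// Greedy \s* over 8-bit input starting at `index`; returns the number of
// characters consumed. One table load and one conditional branch per
// character replace the chain of compares, and the loop exit is a single
// fall-through rather than a branch over a branch.
std::size_t matchSpacesGreedy(const std::uint8_t* input, std::size_t index, std::size_t length)
{
    assert(index <= length);
    std::size_t matched = 0;
    while (index < length && spacesTable[input[index]]) {
        ++index;
        ++matched;
    }
    return matched;
}
```

Calling `matchSpacesGreedy` on the bytes `' ', '\t', '\n', 0xA0, 'a'` from index 0 consumes the first four characters and stops at `'a'`.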
Radar WebKit Bug Importer
Comment 1
2024-12-23 17:14:25 PST
<rdar://problem/141967884>
Michael Saboff
Comment 2
2024-12-23 17:43:06 PST
Pull request:
https://github.com/WebKit/WebKit/pull/38353
EWS
Comment 3
2024-12-25 04:32:41 PST
Committed 288284@main (33a069473f47): <https://commits.webkit.org/288284@main>
Reviewed commits have been landed. Closing PR #38353 and removing active labels.