RESOLVED FIXED 285121
[Yarr] Improve processing of \s* and related simple Character Class atoms
https://bugs.webkit.org/show_bug.cgi?id=285121
Michael Saboff
Reported 2024-12-23 17:13:12 PST
Currently, the JIT code generated for matching \s* against an 8-bit string is:

    1:Term PatternCharacterClass checked-offset:(0) <whitespace> {0,...} greedy
    <68> 0x11aabce04: movz w7, #0x0
    <72> 0x11aabce08: cmp w1, w2
    <76> 0x11aabce0c: b.eq 0x11aabce44 -> <132>
    <80> 0x11aabce10: ldrb w6, [x0, x1]
    <84> 0x11aabce14: cmp w6, #9
    <88> 0x11aabce18: b.lt 0x11aabce34 -> <116>
    <92> 0x11aabce1c: cmp w6, #13
    <96> 0x11aabce20: b.le 0x11aabce38 -> <120>
    <100> 0x11aabce24: cmp w6, #32
    <104> 0x11aabce28: b.eq 0x11aabce38 -> <120>
    <108> 0x11aabce2c: cmp w6, #160
    <112> 0x11aabce30: b.eq 0x11aabce38 -> <120>
    <116> 0x11aabce34: b 0x11aabce44 -> <132>
    <120> 0x11aabce38: add w1, w1, #1
    <124> 0x11aabce3c: add w7, w7, #1
    <128> 0x11aabce40: b 0x11aabce08 -> <72>
    <132> 0x11aabce44: stur x7, [sp, #8]

The JIT code generated for matching the same atom against 16-bit strings is slightly better:

    1:Term PatternCharacterClass checked-offset:(0) <whitespace> {0,...} greedy
    <68> 0x11aabcf44: movz w7, #0x0
    <72> 0x11aabcf48: cmp w1, w2
    <76> 0x11aabcf4c: b.eq 0x11aabcf78 -> <120>
    <80> 0x11aabcf50: ldrh w6, [x0, x1, lsl #1]
    <84> 0x11aabcf54: movz x17, #0xb501
    <88> 0x11aabcf58: movk x17, #0xe5d, lsl #16
    <92> 0x11aabcf5c: movk x17, #0x1, lsl #32 -> 0x10e5db501
    <96> 0x11aabcf60: ldrb w17, [x6, x17]
    <100> 0x11aabcf64: cbnz w17, 0x11aabcf6c -> <108>
    <104> 0x11aabcf68: b 0x11aabcf78 -> <120>
    <108> 0x11aabcf6c: add w1, w1, #1
    <112> 0x11aabcf70: add w7, w7, #1
    <116> 0x11aabcf74: b 0x11aabcf48 -> <72>
    <120> 0x11aabcf78: stur x7, [sp, #8]

There are two issues with the 8-bit matching. First, it isn't using the character table from the builtin spaces character class. Second, we branch over a branch (instructions at offsets 112 & 116). The 16-bit matching code has only the branch-over-a-branch issue (see instructions at offsets 100 & 104).
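To make the two issues concrete, here is a rough C++ sketch of the logic the listings implement. It is not Yarr code: the names spacesTable, isSpaceByCompares and matchSpacesGreedy are hypothetical, and the 256-entry table size is an assumption chosen for Latin-1 input. The whitespace values (9 through 13, 32, 160) are taken from the compares in the 8-bit listing. The loop exits on a single inverted test, which is the logical equivalent of not branching over a branch; how a C++ compiler actually lays out the branches is up to the compiler.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 256-entry lookup table for Latin-1 whitespace. The entries
// mirror the compares in the 8-bit listing: 9..13, 32 and 160.
const std::array<std::uint8_t, 256>& spacesTable()
{
    static const std::array<std::uint8_t, 256> table = [] {
        std::array<std::uint8_t, 256> t {};
        for (unsigned c = 9; c <= 13; ++c)
            t[c] = 1;
        t[32] = 1;
        t[160] = 1;
        return t;
    }();
    return table;
}

// Same test expressed as the compare chain seen in the 8-bit listing.
bool isSpaceByCompares(std::uint8_t c)
{
    if (c < 9)
        return false;
    if (c <= 13)
        return true;
    return c == 32 || c == 160;
}

// Greedy \s* over an 8-bit string using the table: one load and one test
// per character. Returns the number of characters consumed from index.
std::size_t matchSpacesGreedy(const std::uint8_t* input, std::size_t index, std::size_t length)
{
    assert(index <= length);
    const auto& table = spacesTable();
    std::size_t count = 0;
    // Single exit condition: stop at end of input or at the first
    // character whose table entry is zero.
    while (index != length && table[input[index]]) {
        ++index;
        ++count;
    }
    return count;
}
```

The table version replaces up to four compare-and-branch pairs per character with one indexed byte load, which is what the 16-bit listing already does at offsets 96 and 100.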
Radar WebKit Bug Importer
Comment 1 2024-12-23 17:14:25 PST
Michael Saboff
Comment 2 2024-12-23 17:43:06 PST
EWS
Comment 3 2024-12-25 04:32:41 PST
Committed 288284@main (33a069473f47): <https://commits.webkit.org/288284@main> Reviewed commits have been landed. Closing PR #38353 and removing active labels.