WebKit Bugzilla
RESOLVED FIXED
312688
YARR Interpreter Omits Named Group from `indices.groups` via Unconditional Tracking-Slot Reset on Backtrack
https://bugs.webkit.org/show_bug.cgi?id=312688
parkjuny
Reported
2026-04-18 12:24:56 PDT
## Summary

The YARR interpreter resets the duplicate named-group tracking slot to zero on every backtrack. `RegExpMatchesArray.h` gates the `indicesGroups` property write on that slot being non-zero, so the interpreter silently omits the named-group property from `indices.groups`. The JIT retains a stale (non-zero) slot after the same backtrack and accidentally produces the correct result. The property should always be present.

## Bug

### Summary

For a duplicate named capture group that partially matches and then backtracks, `RegExpMatchesArray.h:228` writes the `"x"` property into `indicesGroups` only when `captureIndex > 0`. The interpreter zeros the tracking slot on backtrack, producing `captureIndex = 0`, so the property is never written and `"x" in m.indices.groups` returns `false`. The JIT does not zero the slot on backtrack, so `captureIndex` remains non-zero and the property is written with value `undefined`, which matches the correct behavior. The JIT output is the correct answer; the interpreter output is wrong.

### Detail

**Root-cause site (`RegExpMatchesArray.h:228`):**

```cpp
groups->putDirect(vm, Identifier::fromString(vm, groupName), value); // always written (correct)
if (createIndices && captureIndex > 0) // BUG: skips write when slot is 0
    indicesGroups->putDirect(vm, Identifier::fromString(vm, groupName), indicesArray->getIndexQuickly(captureIndex));
```

`groups` always receives the property (even when `captureIndex == 0`, value is `jsUndefined()`). `indicesGroups` should mirror `groups`, but the `captureIndex > 0` guard prevents it.

**Interpreter backtrack (`YarrInterpreter.cpp:173-176`, `1134-1137`) restores slot to 0:**

For a `{N}` group, the interpreter uses `matchParentheses`/`backtrackParentheses` with a `ParenthesesDisjunctionContext` save/restore mechanism.
Each context saves the current tracking slot value on allocation and zeros it:

```cpp
// YarrInterpreter.cpp:173-176: ParenthesesDisjunctionContext constructor
for (unsigned duplicateNamedGroupId : m_duplicateNamedGroups) {
    subpatternAndGroupIdBackup[...] = output[pattern->offsetForDuplicateNamedGroupId(duplicateNamedGroupId)]; // saves current value
    output[pattern->offsetForDuplicateNamedGroupId(duplicateNamedGroupId)] = 0;
}
```

On backtrack, `resetMatches` → `restoreOutput` puts back the saved value:

```cpp
// YarrInterpreter.cpp:1134-1138
void resetMatches(ByteTerm& term, ParenthesesDisjunctionContext* context)
{
    unsigned firstSubpatternId = term.subpatternId();
    context->restoreOutput(output, firstSubpatternId);
}
```

Because `recordParenthesesMatch`, which writes the subpatternId to the slot, is called only after all N iterations succeed (`YarrInterpreter.cpp:1450`), the per-iteration contexts always save a slot value of 0. Backtracking therefore restores the slot to 0. After this, `subpatternIdForGroupName` returns 0 → `captureIndex = 0` → the `indicesGroups` write is skipped.

**JIT backtrack (`YarrJIT.cpp:4885-4888`) does not reset slot:**

```cpp
if (shouldRecordSubpatterns() && term->containsAnyCaptures()) {
    for (unsigned subpattern = term->parentheses.subpatternId; subpattern <= term->parentheses.lastSubpatternId; subpattern++)
        clearSubpattern(subpattern); // clears capture start/end; tracking slot left stale
}
```

The stale slot keeps `captureIndex > 0`, so the `indicesGroups` write proceeds and the property is present with value `undefined`.

**`subpatternIdForGroupName` (`RegExp.h:119-130`) reads the tracking slot:**

```cpp
return ovector[offsetVectorBaseForNamedCaptures() + it->value[0] - 1]; // tracking slot: 0 or subpatternId
```

### Trigger Conditions

1. Regex has the **`/d` flag**.
2. Pattern contains **duplicate named capture groups** across alternatives.
3. At least one duplicate group is **quantified `{N}` with N ≥ 2** (required for the JIT/interpreter discrepancy; with `{1}` both engines return `false`).
4. That group **partially matches then fails** (FixedCount path is entered and backtracked).
5. The overall match succeeds via an **alternative that does not define the duplicate group**.

## Version

### Reproduced Version

- `main` branch latest commit (2026/04/19): `a4390137a4039d12b4a0843e4f2b37e9ce2b6e6c`

## Reproduction Case

### Release Build

```bash
jsc poc.js                        # JIT (default): true  ← correct
jsc --useRegExpJIT=false poc.js   # Interpreter:   false ← wrong
```

Debug build produces identical output; no assertion fires because the stale slot is not validated by any ASSERT.

### PoC Code

```js
let m = /(?<x>a){2}z|(?<x>b){2}y|c/d.exec("aac");
print("x" in m.indices.groups);
```

## Suggested Patch

### File: `Source/JavaScriptCore/runtime/RegExpMatchesArray.h`

#### Diff

```diff
--- a/Source/JavaScriptCore/runtime/RegExpMatchesArray.h
+++ b/Source/JavaScriptCore/runtime/RegExpMatchesArray.h
@@ -225,8 +225,10 @@ ALWAYS_INLINE JSArray* createRegExpMatchesArray(
                         value = jsUndefined();
                     groups->putDirect(vm, Identifier::fromString(vm, groupName), value);
-                    if (createIndices && captureIndex > 0)
-                        indicesGroups->putDirect(vm, Identifier::fromString(vm, groupName), indicesArray->getIndexQuickly(captureIndex));
+                    if (createIndices) {
+                        JSValue indicesValue = captureIndex > 0 ? indicesArray->getIndexQuickly(captureIndex) : jsUndefined();
+                        indicesGroups->putDirect(vm, Identifier::fromString(vm, groupName), indicesValue);
+                    }
                 }
             }
         }
```

This mirrors how `groups` is built (lines 221–226) and ensures `indicesGroups` always contains a property for every named capture group, with value `undefined` when the group did not participate.

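The invariant the patch restores (every named group is a key of both `groups` and `indices.groups`, with `undefined` for a group that did not participate) could be checked with a script along the following lines. This is only a sketch: `groupPresence` and `log` are hypothetical helper names, not part of WebKit or its test harness, and the script assumes the shell can parse the `/d` flag in a regex literal. The reported pattern is built with `new RegExp` so that an engine without duplicate-named-group support throws at run time instead of rejecting the whole file.

```javascript
// Sketch of a regression check. `groupPresence` and `log` are hypothetical
// helper names, not WebKit APIs.
const log = typeof print === "function" ? print : console.log;

// Reports whether `name` is a key of `groups` and of `indices.groups`,
// and whether the value stored in `indices.groups` is `undefined`.
function groupPresence(match, name) {
  return {
    inGroups: name in match.groups,
    inIndicesGroups: name in match.indices.groups,
    indicesValueIsUndefined: match.indices.groups[name] === undefined,
  };
}

// Baseline without duplicate names: the match succeeds through the
// alternative that does not define `x`, so `x` did not participate.
const baseline = groupPresence(/(?<x>a)|c/d.exec("c"), "x");

// The pattern from the PoC. An engine that lacks duplicate named groups
// throws a SyntaxError here, which leaves `reported` as null.
let reported = null;
try {
  const re = new RegExp("(?<x>a){2}z|(?<x>b){2}y|c", "d");
  reported = groupPresence(re.exec("aac"), "x");
} catch (error) {
  if (!(error instanceof SyntaxError)) throw error;
}

log(JSON.stringify({ baseline, reported }));
```

Run once with the default settings and once with `--useRegExpJIT=false`; according to the analysis above, `reported.inIndicesGroups` would differ between the two runs before the patch and agree afterwards.
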
Note: `YarrJIT.cpp:4885-4888` should separately add `storeDuplicateNamedGroupSubpatternId(duplicateNamedGroupId, 0)` in the FixedCount backtrack loop (matching the `ParenthesesSubpatternOnceBegin` backtrack at lines 4772–4778) to eliminate the stale slot. That is an independent state-hygiene fix, however, and does not affect observable behavior once `RegExpMatchesArray.h` is corrected.

### Credit Information

Reporter credit: Junyoung Park (@candymate) of KAIST Hacking Lab
Radar WebKit Bug Importer
Comment 1
2026-04-19 13:22:15 PDT
<rdar://problem/175122294>
Kai Tamkun
Comment 2
2026-04-30 10:52:52 PDT
Pull request: https://github.com/WebKit/WebKit/pull/63982
EWS
Comment 3
2026-05-22 14:30:35 PDT
Committed 313762@main (e1cdfab158f3): <https://commits.webkit.org/313762@main>

Reviewed commits have been landed. Closing PR #63982 and removing active labels.