Bug 289330: RESOLVED FIXED
REGRESSION (288911@main-290821@main): RegExp for large input (800KB) incorrectly returns null
https://bugs.webkit.org/show_bug.cgi?id=289330
Jarred Sumner
Reported 2025-03-07 06:00:30 PST
Created attachment 474482: it should not print null

This base64 validation RegExp `/^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$/` fails to match valid input when the input size is around 800KB. This began happening in upstream WebKit between commits ff24546a6c0762f37c9d67c10f91aee76a868b99 and da73ec3618fad6f43bab1cc20570898b121c9e0e. Previously this RegExp correctly matched input of this size; now it returns null.

The same pattern continues to work correctly with inputs of this size in Node.js/V8 and in Bun v1.2.2. It regressed in Bun v1.2.3.

The most likely cause is commit 424c8d883269 "Prevent Yarr::Interpreter's evaluation stack from growing unboundedly", which introduced a 4MB limit on the RegExp stack size. While this is a reasonable safety measure, it appears to cause valid RegExp evaluations to return different results than in Chrome/V8/Node.js. Increasing JSC::Options::maxRegExpStackSize does work around the issue.

A reproduction that should run in the jsc shell is attached.

Suspect Commits:
1. 424c8d883269 - "Prevent Yarr::Interpreter's evaluation stack from growing unboundedly" (Most likely cause) - Added a 4MB limit to the RegExp stack

Other relevant commits in this range:
2. d30962803be5 - "Increase Yarr matchLimit" - Changed match limit from 1M to 100M
3. 12c34ef5e305 - "[Yarr] Improve processing of an alternation of strings" - Added string list optimization (later reverted)
4. 5e5c7a9d5e4f - "Unreviewed, reverting 290791@main" - Reverted the string list optimization
5. 87e70c49cbfa - "Re-enable RegExp Modifiers"
6. 868302168e9b - "Implement RegExp Modifiers"
7. 80442335b668 - "[Yarr] Update YarrJIT disassembly to happen after other link tasks"
9. 99be77af7cd2 - "[JSC] YARR: Update UCS canonicalization tables"
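For readers without the attachment, the failure can probably be approximated with a generated input. The following is a minimal sketch, not the attached file: the helper name `makeBase64Input`, the constant `BASE64_RE`, and the exact size of 800 * 1024 characters are assumptions chosen to match the "around 800KB" description above. It builds an unpadded string from the base64 alphabet and runs the RegExp from the report against it.

```javascript
// Sketch of a reproduction, assuming an input of 800 * 1024 characters
// (the report only says "around 800KB"). Names here are illustrative.

// The RegExp from the report, unchanged.
const BASE64_RE =
  /^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$/;

// Build a string of `length` characters drawn from the base64 alphabet.
// When `length` is a multiple of 4, the result is valid unpadded base64.
function makeBase64Input(length) {
  const alphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  return alphabet.repeat(Math.ceil(length / alphabet.length)).slice(0, length);
}

// Use console.log where available, otherwise fall back to the shell's print.
const log = typeof console !== "undefined" ? console.log : print;

const input = makeBase64Input(800 * 1024);
const result = BASE64_RE.exec(input);

// On an affected build the match is reported as null even though the
// input is valid; elsewhere the whole input should match.
log(result === null
  ? "null (unexpected for valid input)"
  : "matched " + result[0].length + " characters");
```

Whether this generated input hits the stack limit on a given build is not guaranteed; the attached file remains the authoritative reproduction.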
Attachments
it should not print null (818.62 KB, text/javascript)
2025-03-07 06:00 PST, Jarred Sumner
no flags
Radar WebKit Bug Importer
Comment 1 2025-03-07 11:12:11 PST
Alexey Proskuryakov
Comment 2 2025-03-09 18:47:31 PDT
Thank you for the report! Could you please use canonical commit IDs rather than git hashes in the future? If you are willing to positively identify the culprits, we store historical builds and have tools such as `Tools/Scripts/bisect-builds` and `Tools/CISupport/download-built-product`.
Yusuke Suzuki
Comment 3 2025-03-09 19:34:45 PDT
EWS
Comment 4 2025-03-09 23:00:07 PDT
Committed 291877@main (75a5cc813c16): <https://commits.webkit.org/291877@main> Reviewed commits have been landed. Closing PR #42175 and removing active labels.
EWS
Comment 5 2025-03-11 15:47:19 PDT
Committed 289651.254@safari-7621-branch (dbe27d9378cc): <https://commits.webkit.org/289651.254@safari-7621-branch> Reviewed commits have been landed. Closing PR #2775 and removing active labels.