RESOLVED FIXED 304181
Incorrect handling of invalid UTF-8 in streaming decoder
https://bugs.webkit.org/show_bug.cgi?id=304181
Nikita Skovoroda
Reported 2025-12-15 07:32:38 PST
```js
> const x = Uint8Array.of(0xf0, 0xc3, 0x80, 42)
> new TextDecoder().decode(x) // valid
'�À*'
> const d = new TextDecoder();
> [d.decode(x.subarray(0, 1), { stream: true }), d.decode(x.subarray(1), { stream: true }), d.decode()].join('') // invalid
'�À�'
```

See https://issues.chromium.org/issues/468458744; WebKit is also affected.

This is already public but has security implications.

The UTF-8 decoder is affected by the structure of the underlying memory chunks. Anything checking signatures, computing hashes, etc. is not affected by that, since it operates on the bytes rather than on the decoded text.

Responses with the exact same bytes are decoded differently depending on network timing and chunking, and a MitM could potentially exploit this to trigger decoding to different data without affecting TLS.

See a live demo at https://tmp-demo.rray.org/utf-8
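The same comparison can be run over every chunk boundary rather than one. The sketch below is not part of the original report: `findChunkingMismatches` is a hypothetical helper name, and it assumes a runtime with a global `TextDecoder` (a browser or a recent Node.js). It decodes the input in one call, decodes it again as two streamed chunks for each split offset, and collects the offsets where the two results differ.

```js
// Hypothetical helper (not from the bug report): compare a one-shot decode
// against a two-chunk streaming decode for every possible split offset.
function findChunkingMismatches(bytes) {
  const expected = new TextDecoder().decode(bytes);
  const mismatches = [];
  for (let split = 1; split < bytes.length; split++) {
    const d = new TextDecoder();
    const actual =
      d.decode(bytes.subarray(0, split), { stream: true }) +
      d.decode(bytes.subarray(split), { stream: true }) +
      d.decode(); // final call without `stream` flushes any pending bytes
    if (actual !== expected) mismatches.push({ split, expected, actual });
  }
  return mismatches;
}

// Input from the report: 0xf0 opens a 4-byte sequence, but 0xc3 is not a
// valid continuation byte, so the sequence is malformed.
const x = Uint8Array.of(0xf0, 0xc3, 0x80, 42);
console.log(findChunkingMismatches(x));
```

A decoder whose output does not depend on chunk boundaries yields an empty array here. Any entry in the result identifies a split offset at which the streamed output diverges from the one-shot output.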
Radar WebKit Bug Importer
Comment 1 2025-12-15 18:51:37 PST
Darin Adler
Comment 2 2025-12-15 18:55:32 PST
Nikita Skovoroda
Comment 3 2025-12-15 19:10:45 PST
To clarify, in addition to the title change: this does not only affect `TextDecoder`. It also affects `await res.text()` in fetch, as well as page/resource loads (as seen in the live demo, which is plain HTML decoding differently).
EWS
Comment 4 2025-12-15 21:36:12 PST
Committed 304496@main (ab37a057cd38): <https://commits.webkit.org/304496@main> Reviewed commits have been landed. Closing PR #55452 and removing active labels.