234030 – TextCodecUTF8 can skip characters after an invalid sequence near EOF

RESOLVED DUPLICATE of bug 233921

https://bugs.webkit.org/show_bug.cgi?id=234030

Summary TextCodecUTF8 can skip characters after an invalid sequence near EOF

Andreu Botella

Reported 2021-12-08 12:55:22 PST

Created attachment 446414
Sample to show that this bug affects page loading.

WPT tests: https://wpt.fyi/results/encoding/textdecoder-eof.any.html?label=experimental&label=master&aligned (also tests for bug 233921).

When the TextCodecUTF8 decoder finds a non-ASCII lead byte, it waits until enough bytes have been consumed to make a valid sequence starting at that position before it starts processing them. If the stream is flushed before that point, the decoder assumes that the remaining bytes are part of a truncated partial sequence, so it discards them and emits a single replacement character. That assumption doesn't necessarily hold, and it can result in non-replacement characters being skipped:

// "�A" in Firefox and Chromium 98, and according to the spec.
// "��A" in earlier versions of Chromium.
// "�" in WebKit.
new TextDecoder().decode(new Uint8Array([0xF0, 0x9F, 0x41]));

It can also result in fewer replacement characters being emitted than should be the case:

// "��A" in Firefox, Chrome, and according to the spec.
// "�" in WebKit.
new TextDecoder().decode(new Uint8Array([0xF0, 0x80, 0x41]));

This bug also affects page loading, as the attached sample shows.
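For reference, the spec-mandated results quoted in this report can be reproduced with a small byte-at-a-time decoder. What follows is only a sketch of the UTF-8 decoder as I read it in the WHATWG Encoding Standard (non-fatal mode, whole input given at once), not WebKit's TextCodecUTF8 code; the function name decodeUtf8Sketch is hypothetical. The relevant point is that a byte which cannot continue the pending sequence ends that sequence with one U+FFFD and is then decoded again on its own, rather than being discarded.

```javascript
// Hypothetical helper, not part of WebKit: a sketch of the Encoding
// Standard's UTF-8 decoding steps with replacement error handling.
function decodeUtf8Sketch(bytes) {
  const REPLACEMENT = "\uFFFD";
  let out = "";
  let codePoint = 0;
  let bytesSeen = 0;
  let bytesNeeded = 0;
  let lower = 0x80; // allowed range for the next continuation byte
  let upper = 0xBF;

  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];

    if (bytesNeeded === 0) {
      if (b <= 0x7F) {
        out += String.fromCharCode(b);
      } else if (b >= 0xC2 && b <= 0xDF) {
        bytesNeeded = 1;
        codePoint = b & 0x1F;
      } else if (b >= 0xE0 && b <= 0xEF) {
        if (b === 0xE0) lower = 0xA0; // reject overlong forms
        if (b === 0xED) upper = 0x9F; // reject surrogates
        bytesNeeded = 2;
        codePoint = b & 0x0F;
      } else if (b >= 0xF0 && b <= 0xF4) {
        if (b === 0xF0) lower = 0x90; // reject overlong forms
        if (b === 0xF4) upper = 0x8F; // reject values above U+10FFFF
        bytesNeeded = 3;
        codePoint = b & 0x07;
      } else {
        out += REPLACEMENT; // stray continuation byte or invalid lead byte
      }
      continue;
    }

    if (b < lower || b > upper) {
      // The pending sequence is broken: emit one U+FFFD for it, reset the
      // state, and decode this same byte again as the start of new input.
      codePoint = 0;
      bytesNeeded = 0;
      bytesSeen = 0;
      lower = 0x80;
      upper = 0xBF;
      out += REPLACEMENT;
      i--;
      continue;
    }

    lower = 0x80;
    upper = 0xBF;
    codePoint = (codePoint << 6) | (b & 0x3F);
    bytesSeen++;
    if (bytesSeen === bytesNeeded) {
      out += String.fromCodePoint(codePoint);
      codePoint = 0;
      bytesNeeded = 0;
      bytesSeen = 0;
    }
  }

  // End of input in the middle of a sequence: one U+FFFD for the truncated tail.
  if (bytesNeeded !== 0) out += REPLACEMENT;
  return out;
}

console.log(JSON.stringify(decodeUtf8Sketch(new Uint8Array([0xF0, 0x9F, 0x41]))));
console.log(JSON.stringify(decodeUtf8Sketch(new Uint8Array([0xF0, 0x80, 0x41]))));
```

With the two inputs from this report, the sketch yields U+FFFD followed by "A" for the first and two U+FFFD followed by "A" for the second, matching the results attributed to the spec above. It does not model chunked (streaming) input or the flush step itself.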

Attachments
Sample to show that this bug affects page loading. (50 bytes, text/html) 2021-12-08 12:55 PST, Andreu Botella	no flags	Details

Alex Christensen

Comment 1 2021-12-09 09:50:17 PST

*** This bug has been marked as a duplicate of bug 233921 ***

Alex Christensen

Comment 2 2021-12-09 09:50:33 PST

This will be fixed with the same fix as bug 233921

Status RESOLVED

Resolution DUPLICATE

of bug 233921

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component Page Loading

Assignee

Nobody

Reported

2021-12-08 12:55 PST

Modified

2021-12-09 09:50 PST

CC List

4 users
