WebKit Bugzilla
RESOLVED DUPLICATE of bug 233921
Bug 234030 - TextCodecUTF8 can skip characters after an invalid sequence near EOF
https://bugs.webkit.org/show_bug.cgi?id=234030
Andreu Botella
Reported
2021-12-08 12:55:22 PST
Created attachment 446414
Sample to show that this bug affects page loading.

WPT tests: https://wpt.fyi/results/encoding/textdecoder-eof.any.html?label=experimental&label=master&aligned (also tests for bug 233921).

When the TextCodecUTF8 decoder finds a non-ASCII lead byte, it waits until enough bytes are consumed to make a valid sequence starting at that position, before starting to process the bytes. But if the stream is flushed before that, the decoder assumes that the remaining bytes are part of a truncated partial sequence, and so discards them while emitting a single replacement character.

But this assumption doesn't necessarily hold, and it can result in non-replacement characters being skipped:

    // "�A" in Firefox and Chromium 98, and according to the spec.
    // "��A" in earlier versions of Chromium.
    // "�" in WebKit.
    new TextDecoder().decode(new Uint8Array([0xF0, 0x9F, 0x41]));

This can also result in fewer replacement characters being emitted than should be the case:

    // "��A" in Firefox, Chrome, and according to the spec.
    // "�" in WebKit.
    new TextDecoder().decode(new Uint8Array([0xF0, 0x80, 0x41]));

This bug also affects page loading, as with the attached sample.
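The expected results can be checked without relying on any particular browser's TextDecoder. What follows is a minimal sketch of the UTF-8 decoding steps as described in the WHATWG Encoding Standard, in non-streaming form. The function name decodeUtf8 is hypothetical, and this is not WebKit's TextCodecUTF8 code; it only illustrates why a byte that breaks a partial sequence has to be reprocessed rather than discarded.

```javascript
// Sketch of a spec-style UTF-8 decoder (whole input at once, no streaming).
// decodeUtf8 is a hypothetical name used for illustration only.
function decodeUtf8(bytes) {
  let out = "";
  let codePoint = 0; // code point being assembled
  let seen = 0;      // continuation bytes consumed so far
  let needed = 0;    // continuation bytes required by the lead byte
  let lower = 0x80;  // allowed range for the next continuation byte
  let upper = 0xBF;
  let i = 0;

  while (i < bytes.length) {
    const b = bytes[i];

    if (needed === 0) {
      if (b <= 0x7F) {
        out += String.fromCharCode(b);
      } else if (b >= 0xC2 && b <= 0xDF) {
        needed = 1;
        codePoint = b & 0x1F;
      } else if (b >= 0xE0 && b <= 0xEF) {
        if (b === 0xE0) lower = 0xA0; // reject overlong forms
        if (b === 0xED) upper = 0x9F; // reject surrogates
        needed = 2;
        codePoint = b & 0x0F;
      } else if (b >= 0xF0 && b <= 0xF4) {
        if (b === 0xF0) lower = 0x90; // reject overlong forms
        if (b === 0xF4) upper = 0x8F; // reject values above U+10FFFF
        needed = 3;
        codePoint = b & 0x07;
      } else {
        out += "\uFFFD"; // byte can never start a sequence
      }
      i++;
      continue;
    }

    if (b < lower || b > upper) {
      // The partial sequence is invalid: emit one U+FFFD for it, reset,
      // and do NOT advance i, so this byte is decoded again on its own.
      codePoint = 0;
      seen = 0;
      needed = 0;
      lower = 0x80;
      upper = 0xBF;
      out += "\uFFFD";
      continue;
    }

    lower = 0x80;
    upper = 0xBF;
    codePoint = (codePoint << 6) | (b & 0x3F);
    seen++;
    if (seen === needed) {
      out += String.fromCodePoint(codePoint);
      codePoint = 0;
      seen = 0;
      needed = 0;
    }
    i++;
  }

  // End of input inside a sequence: one U+FFFD for the truncated tail.
  if (needed !== 0) out += "\uFFFD";
  return out;
}

console.log(decodeUtf8([0xF0, 0x9F, 0x41])); // "\uFFFDA"
console.log(decodeUtf8([0xF0, 0x80, 0x41])); // "\uFFFD\uFFFDA"
```

In the first input, 0x9F is a valid continuation of 0xF0 but 0x41 is not, so the partial sequence yields one replacement character and 0x41 is then decoded as "A". In the second, 0x80 is below the lower bound that 0xF0 sets (0x90), so it ends the sequence immediately and is itself replaced when reprocessed.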
Attachments
Sample to show that this bug affects page loading. (50 bytes, text/html)
2021-12-08 12:55 PST, Andreu Botella
no flags
Alex Christensen
Comment 1
2021-12-09 09:50:17 PST
*** This bug has been marked as a duplicate of bug 233921 ***
Alex Christensen
Comment 2
2021-12-09 09:50:33 PST
This will be fixed with the same fix as bug 233921.