RESOLVED CONFIGURATION CHANGED 122795
Make UTF-8 encoding of unpaired surrogates match Encoding standard
https://bugs.webkit.org/show_bug.cgi?id=122795
Ryosuke Niwa
Reported 2013-10-14 17:51:49 PDT
Consider merging https://chromium.googlesource.com/chromium/blink/+/109e9896a406aa3e76350a733bd030e8eeacc4c4
The Encoding standard says that unpaired UTF-16 surrogates in JS strings should be converted into U+FFFD (the replacement character) during encode operations. This is already done (optionally) in WTFString::utf8(), but it is not handled in TextCodecUTF8.
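For illustration only, the behavior described above can be sketched as a standalone function. This is not the WebKit or Blink implementation; the names appendUTF8 and encodeUTF8ReplacingLoneSurrogates are hypothetical, and the sketch assumes the input is a plain std::u16string rather than a WTF string type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not WebKit code): append one Unicode scalar value as UTF-8.
static void appendUTF8(std::string& out, char32_t c)
{
    // Callers must pass a scalar value: in range and not a surrogate.
    assert(c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF));
    if (c < 0x80) {
        out += static_cast<char>(c);
    } else if (c < 0x800) {
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
}

// Hypothetical sketch: encode UTF-16 to UTF-8, turning every unpaired
// surrogate into U+FFFD instead of emitting an invalid byte sequence.
std::string encodeUTF8ReplacingLoneSurrogates(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t c = in[i];
        const bool isLead = c >= 0xD800 && c <= 0xDBFF;
        const bool isTrail = c >= 0xDC00 && c <= 0xDFFF;
        if (isLead && i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid pair: combine lead and trail into one supplementary code point.
            c = 0x10000 + ((c - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (isLead || isTrail) {
            // Unpaired surrogate: substitute the replacement character.
            c = 0xFFFD;
        }
        appendUTF8(out, c);
    }
    return out;
}
```

With this sketch, a lone lead surrogate such as 0xD800 encodes to the three bytes EF BF BD, while a valid pair such as 0xD83D 0xDE00 still encodes to the four-byte sequence for U+1F600.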
Ahmad Saleem
Comment 1 2022-08-21 00:35:23 PDT
I didn't find a good test in the Chromium patch, but this is the place where the patch would need to be applied: https://github.com/WebKit/WebKit/blob/4ddaf4f8c28e7795d0dae5f39fad1873a566067e/Source/WebCore/PAL/pal/text/TextCodecUTF8.cpp#L466
I don't know if this is still needed or not. I'd appreciate it if someone else could comment. Thanks!
Alexey Proskuryakov
Comment 2 2022-08-21 13:28:01 PDT
WebKit passes all tests that were added with this Chromium commit.