RESOLVED FIXED 216202
TextDecoder should properly handle streams
https://bugs.webkit.org/show_bug.cgi?id=216202
Alex Christensen
Reported 2020-09-04 17:01:50 PDT
Allow TextCodec::decode to properly handle streams
Attachments
Patch (57.36 KB, patch), 2020-09-04 17:05 PDT, Alex Christensen, no flags
Patch (17.10 KB, patch), 2020-09-05 00:23 PDT, Alex Christensen, darin: review+
Alex Christensen
Comment 1 2020-09-04 17:05:36 PDT
Darin Adler
Comment 2 2020-09-04 17:17:35 PDT
Comment on attachment 408046 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408046&action=review

> Source/WebCore/platform/text/DecodeFailure.h:35
> +struct DecodeFailure {
> +    String stringBeforeError;
> +    size_t bytesConsumed { 0 };
> +};

Looking at the call sites, Expected isn’t really doing its job to make the code clean. Most of the call sites seem to want the string whether there was an error or not, but it seems they only need to know how many bytes were consumed if it was an error. Maybe the return value should just be a simple structure:

    struct DecodeResult {
        String string;
        Optional<size_t> bytesConsumedBeforeError;
    };

Or if you want to be even more straightforward:

    struct DecodeResult {
        String string;
        size_t bytesConsumed { 0 };
        bool success { false };
    };

I think those might be better than the Expected for how this is actually used.
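As an illustration of the first suggested shape, here is a minimal standalone sketch. It is not WebKit code: std::string and std::optional stand in for WTF's String and Optional, and decodeASCII is a hypothetical toy decoder invented only to show how a call site can read the string unconditionally and check the byte count only on failure.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for the suggested result type. std::string and
// std::optional replace WTF::String and WTF::Optional (an assumption made so
// the sketch compiles on its own).
struct DecodeResult {
    std::string string;                                   // decoded text, valid on success and on failure
    std::optional<std::size_t> bytesConsumedBeforeError;  // engaged only if decoding hit an error
};

// Toy decoder (hypothetical): accepts ASCII and stops at the first byte >= 0x80.
DecodeResult decodeASCII(std::string_view bytes)
{
    DecodeResult result;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        auto byte = static_cast<unsigned char>(bytes[i]);
        if (byte >= 0x80) {
            // Record how far we got; the string decoded so far is still returned.
            result.bytesConsumedBeforeError = i;
            return result;
        }
        result.string.push_back(static_cast<char>(byte));
    }
    return result;
}
```

A caller can use result.string directly in both cases and only look at bytesConsumedBeforeError when it needs to report or recover from an error, which is the cleanup the comment is asking for.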
Alexey Proskuryakov
Comment 3 2020-09-04 19:22:42 PDT
Comment on attachment 408046 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408046&action=review

> Source/WebCore/ChangeLog:8
> +        In order to properly handle cases like a stream breaking in the middle of a surrogate pair

Doesn't text decoding already properly handle such cases when decoding content as it comes from the network?
Alex Christensen
Comment 4 2020-09-04 19:55:16 PDT
Comment on attachment 408046 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408046&action=review

>> Source/WebCore/ChangeLog:8
>> +        In order to properly handle cases like a stream breaking in the middle of a surrogate pair
>
> Doesn't text decoding already properly handle such cases when decoding content as it comes from the network?

That uses TextResourceDecoder, which stores a std::unique_ptr<TextCodec> that keeps state, instead of keeping a buffer like TextDecoder currently does. While the buffer approach passes all existing web platform tests, it is incorrect. I need to keep a std::unique_ptr<TextCodec> instead of a buffer, and I should probably add a test that fails with this implementation and passes with a correct implementation. The only decoding failures in the existing tests are at the end of a stream block.
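To make the "codec that keeps state" idea concrete, here is a minimal sketch of a decoder that remembers a partial sequence between calls. It is not WebKit's TextCodec: StreamingUTF8Decoder is a hypothetical class, it uses UTF-8 multi-byte sequences rather than surrogate pairs as the split case, and it deliberately skips overlong and out-of-range checks that a conforming decoder needs.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

constexpr char32_t replacementCharacter = 0xFFFD;

// Hypothetical stateful decoder: a sequence split across two chunks is
// completed on the next call instead of being treated as an error.
// Simplified on purpose: no overlong, surrogate, or range validation.
class StreamingUTF8Decoder {
public:
    // With flush == false, an incomplete trailing sequence is kept as state.
    // With flush == true (end of stream), it is replaced by U+FFFD.
    std::u32string decode(std::string_view chunk, bool flush = false)
    {
        std::u32string output;
        for (char c : chunk) {
            auto byte = static_cast<std::uint8_t>(c);
            if (m_bytesNeeded) {
                if ((byte & 0xC0) == 0x80) {
                    // Continuation byte: append its low six bits.
                    m_codePoint = (m_codePoint << 6) | (byte & 0x3F);
                    if (!--m_bytesNeeded)
                        output.push_back(m_codePoint);
                    continue;
                }
                // Sequence interrupted: emit U+FFFD, then treat this byte as a new lead.
                output.push_back(replacementCharacter);
                m_bytesNeeded = 0;
            }
            if (byte < 0x80)
                output.push_back(byte);
            else if ((byte & 0xE0) == 0xC0) {
                m_codePoint = byte & 0x1F;
                m_bytesNeeded = 1;
            } else if ((byte & 0xF0) == 0xE0) {
                m_codePoint = byte & 0x0F;
                m_bytesNeeded = 2;
            } else if ((byte & 0xF8) == 0xF0) {
                m_codePoint = byte & 0x07;
                m_bytesNeeded = 3;
            } else
                output.push_back(replacementCharacter);
        }
        if (flush && m_bytesNeeded) {
            output.push_back(replacementCharacter);
            m_bytesNeeded = 0;
        }
        return output;
    }

private:
    char32_t m_codePoint { 0 };   // bits accumulated for the sequence in progress
    unsigned m_bytesNeeded { 0 }; // continuation bytes still expected
};
```

Feeding the four bytes of U+1F600 (F0 9F 98 80) as two chunks of two bytes yields nothing from the first call and the single code point from the second. The state lives in the decoder object, so the caller does not need to hold back undecoded bytes in its own buffer.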
Alex Christensen
Comment 5 2020-09-05 00:23:11 PDT
EWS Watchlist
Comment 6 2020-09-05 00:24:23 PDT
This patch modifies the imported WPT tests. Please ensure that any changes on the tests (not coming from a WPT import) are exported to WPT. Please see https://trac.webkit.org/wiki/WPTExportProcess
Darin Adler
Comment 7 2020-09-05 08:38:08 PDT
Comment on attachment 408067 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408067&action=review

Excellent. This is exactly how the TextCodec class was designed to be used.

> Source/WebCore/dom/TextDecoder.h:54
> +    ~TextDecoder();

We typically put this before other member functions. Arbitrary, but it’s atypical to put it at the end of the public section.

> Source/WebCore/dom/TextDecoder.h:64
> +    std::unique_ptr<TextCodec> m_textCodec;

I would have named this just m_codec.
Alex Christensen
Comment 8 2020-09-05 12:48:30 PDT
Radar WebKit Bug Importer
Comment 9 2020-09-05 12:49:22 PDT
Darin Adler
Comment 10 2020-09-05 12:52:23 PDT
Comment on attachment 408067 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=408067&action=review

> Source/WebCore/dom/TextDecoder.cpp:153
> +    auto oldBuffer = std::exchange(m_buffer, { });

Can we also come back here and delete this code?

> Source/WebCore/dom/TextDecoder.h:66
>      Vector<uint8_t> m_buffer;

And delete this?
Alex Christensen
Comment 11 2020-10-05 10:20:37 PDT
Those suggested improvements were done in https://trac.webkit.org/changeset/266681