WebKit Bugzilla
RESOLVED FIXED
216202
TextDecoder should properly handle streams
https://bugs.webkit.org/show_bug.cgi?id=216202
Alex Christensen
Reported
2020-09-04 17:01:50 PDT
Allow TextCodec::decode to properly handle streams
Attachments
Patch (57.36 KB, patch), 2020-09-04 17:05 PDT, Alex Christensen, no flags
Patch (17.10 KB, patch), 2020-09-05 00:23 PDT, Alex Christensen, darin: review+
Alex Christensen
Comment 1
2020-09-04 17:05:36 PDT
Created attachment 408046: Patch
Darin Adler
Comment 2
2020-09-04 17:17:35 PDT
Comment on attachment 408046: Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=408046&action=review
> Source/WebCore/platform/text/DecodeFailure.h:35
> +struct DecodeFailure {
> +    String stringBeforeError;
> +    size_t bytesConsumed { 0 };
> +};
Looking at the call sites, Expected isn’t really doing its job to make the code clean. Most of the call sites seem to want the string whether there was an error or not, but it seems they only need to know how many bytes were consumed if it was an error. Maybe the return value should just be a simple structure:
    struct DecodeResult { String string; Optional<size_t> bytesConsumedBeforeError; };
Or if you want to be even more straightforward:
    struct DecodeResult { String string; size_t bytesConsumed { 0 }; bool success { false }; };
I think those might be better than the Expected for how this is actually used.
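For illustration, here is a minimal sketch of how the second, more straightforward DecodeResult shape could look at a call site. It is not code from the patch: std::string stands in for WTF::String, and decodeASCII is a hypothetical toy decoder invented only to show that the caller gets the decoded prefix whether or not an error occurred.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical result type modeled on the second suggestion in the comment;
// std::string is an assumption standing in for WTF::String.
struct DecodeResult {
    std::string string;
    size_t bytesConsumed { 0 };
    bool success { false };
};

// Toy decoder (an assumption for illustration only): accepts 7-bit ASCII
// and stops at the first byte that has the high bit set.
DecodeResult decodeASCII(const uint8_t* bytes, size_t length)
{
    DecodeResult result;
    for (size_t i = 0; i < length; ++i) {
        if (bytes[i] & 0x80)
            return result; // success stays false; string holds the valid prefix
        result.string.push_back(static_cast<char>(bytes[i]));
        ++result.bytesConsumed;
    }
    result.success = true;
    return result;
}
```

With a plain struct like this, a caller can read result.string unconditionally and consult bytesConsumed only when success is false, which matches the usage pattern described in the comment.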
Alexey Proskuryakov
Comment 3
2020-09-04 19:22:42 PDT
Comment on attachment 408046: Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=408046&action=review
> Source/WebCore/ChangeLog:8
> +        In order to properly handle cases like a stream breaking in the middle of a surrogate pair
Doesn't text decoding already properly handle such cases when decoding content as it comes from the network?
Alex Christensen
Comment 4
2020-09-04 19:55:16 PDT
Comment on attachment 408046: Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=408046&action=review
>> Source/WebCore/ChangeLog:8
>> +        In order to properly handle cases like a stream breaking in the middle of a surrogate pair
>
> Doesn't text decoding already properly handle such cases when decoding content as it comes from the network?
That uses TextResourceDecoder, which stores a std::unique_ptr<TextCodec> that keeps state, instead of keeping a buffer like TextDecoder currently does. While the buffer approach passes all existing web platform tests, it is incorrect. I need to keep a std::unique_ptr<TextCodec> instead of a buffer, and I should probably add a test that fails with this implementation and passes with a correct implementation. The only decoding failures in the existing tests are at the end of a stream block.
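As a rough sketch of what "keeping state" can mean for a stream that breaks in the middle of a surrogate pair, the following standalone decoder remembers a pending byte and a pending high surrogate between calls. It is not WebKit's TextCodec: the class name, the decode signature, and the choice of UTF-16LE input with std::u32string output are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for a stateful codec; not WebKit's TextCodec.
class StreamingUTF16LEDecoder {
public:
    // Decodes one chunk. Pass flush = true only for the final chunk.
    std::u32string decode(const uint8_t* bytes, size_t length, bool flush)
    {
        std::u32string out;
        for (size_t i = 0; i < length; ++i) {
            if (!m_pendingByte) {
                m_pendingByte = bytes[i]; // low byte of the next code unit
                continue;
            }
            auto unit = static_cast<char16_t>(*m_pendingByte | (bytes[i] << 8));
            m_pendingByte.reset();
            appendUnit(unit, out);
        }
        if (flush) {
            // Anything still pending at end of stream is incomplete input.
            if (m_pendingByte || m_pendingHighSurrogate)
                out.push_back(replacementCharacter);
            m_pendingByte.reset();
            m_pendingHighSurrogate.reset();
        }
        return out;
    }

private:
    static constexpr char32_t replacementCharacter = 0xFFFD;

    void appendUnit(char16_t unit, std::u32string& out)
    {
        bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        bool isLow = unit >= 0xDC00 && unit <= 0xDFFF;
        if (m_pendingHighSurrogate) {
            char16_t high = *m_pendingHighSurrogate;
            m_pendingHighSurrogate.reset();
            if (isLow) {
                out.push_back(0x10000 + ((high - 0xD800) << 10) + (unit - 0xDC00));
                return;
            }
            out.push_back(replacementCharacter); // unpaired high surrogate
        }
        if (isHigh)
            m_pendingHighSurrogate = unit; // wait for the next unit, possibly in a later chunk
        else if (isLow)
            out.push_back(replacementCharacter); // unpaired low surrogate
        else
            out.push_back(unit);
    }

    // State carried across decode() calls.
    std::optional<uint8_t> m_pendingByte;
    std::optional<char16_t> m_pendingHighSurrogate;
};
```

Because the partial code unit and the high surrogate live in the decoder object, feeding the bytes 3D D8 00 and then DE in two separate calls yields the single code point U+1F600 on the second call, and nothing on the first.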
Alex Christensen
Comment 5
2020-09-05 00:23:11 PDT
Created attachment 408067: Patch
EWS Watchlist
Comment 6
2020-09-05 00:24:23 PDT
This patch modifies the imported WPT tests. Please ensure that any changes on the tests (not coming from a WPT import) are exported to WPT. Please see
https://trac.webkit.org/wiki/WPTExportProcess
Darin Adler
Comment 7
2020-09-05 08:38:08 PDT
Comment on attachment 408067: Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=408067&action=review
Excellent. This is exactly how the TextCodec class was designed to be used.
> Source/WebCore/dom/TextDecoder.h:54
> +    ~TextDecoder();
We typically put this before other member functions. Arbitrary, but it’s atypical to put it at the end of the public section.
> Source/WebCore/dom/TextDecoder.h:64
> +    std::unique_ptr<TextCodec> m_textCodec;
I would have named this just m_codec.
Alex Christensen
Comment 8
2020-09-05 12:48:30 PDT
http://trac.webkit.org/r266668
Radar WebKit Bug Importer
Comment 9
2020-09-05 12:49:22 PDT
<rdar://problem/68402719>
Darin Adler
Comment 10
2020-09-05 12:52:23 PDT
Comment on attachment 408067: Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=408067&action=review
> Source/WebCore/dom/TextDecoder.cpp:153
> +    auto oldBuffer = std::exchange(m_buffer, { });
Can we also come back here and delete this code?
> Source/WebCore/dom/TextDecoder.h:66
>      Vector<uint8_t> m_buffer;
And delete this?
Alex Christensen
Comment 11
2020-10-05 10:20:37 PDT
Those suggested improvements were done in
https://trac.webkit.org/changeset/266681