RESOLVED FIXED Bug 163771
Implement TextDecoder and TextEncoder
https://bugs.webkit.org/show_bug.cgi?id=163771
Summary Implement TextDecoder and TextEncoder
Alex Christensen
Reported 2016-10-20 18:30:13 PDT
Implement TextDecoder and TextEncoder
Attachments
Patch (2.39 MB, patch)
2016-10-20 18:37 PDT, Alex Christensen
no flags
Patch (2.39 MB, patch)
2016-10-20 18:43 PDT, Alex Christensen
no flags
Archive of layout-test-results from ews102 for mac-yosemite (1.11 MB, application/zip)
2016-10-20 19:42 PDT, Build Bot
no flags
Archive of layout-test-results from ews107 for mac-yosemite-wk2 (1016.81 KB, application/zip)
2016-10-20 19:50 PDT, Build Bot
no flags
Archive of layout-test-results from ews115 for mac-yosemite (1.99 MB, application/zip)
2016-10-20 19:53 PDT, Build Bot
no flags
Patch (2.40 MB, patch)
2016-10-20 20:05 PDT, Alex Christensen
no flags
Patch (2.40 MB, patch)
2016-10-20 20:59 PDT, Alex Christensen
no flags
Patch (2.40 MB, patch)
2016-10-20 23:34 PDT, Alex Christensen
no flags
Patch (2.40 MB, patch)
2016-11-16 17:33 PST, Alex Christensen
no flags
Patch (2.40 MB, patch)
2016-11-17 10:29 PST, Alex Christensen
no flags
Patch (2.40 MB, patch)
2016-11-17 12:06 PST, Alex Christensen
sam: review+
source_changes (27.98 KB, patch)
2016-11-17 15:05 PST, Alex Christensen
sam: review+
buildbot: commit-queue-
Archive of layout-test-results from ews103 for mac-yosemite (1.71 MB, application/zip)
2016-11-17 16:04 PST, Build Bot
no flags
Archive of layout-test-results from ews104 for mac-yosemite-wk2 (2.13 MB, application/zip)
2016-11-17 16:07 PST, Build Bot
no flags
Archive of layout-test-results from ews115 for mac-yosemite (2.29 MB, application/zip)
2016-11-17 16:20 PST, Build Bot
no flags
Archive of layout-test-results from ews125 for ios-simulator-wk2 (16.26 MB, application/zip)
2016-11-17 16:27 PST, Build Bot
no flags
Alex Christensen
Comment 1 2016-10-20 18:37:58 PDT
Alex Christensen
Comment 2 2016-10-20 18:43:23 PDT
Build Bot
Comment 3 2016-10-20 19:42:01 PDT
Comment on attachment 292306 [details] Patch Attachment 292306 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/2335040 New failing tests: js/dom/global-constructors-attributes-dedicated-worker.html
Build Bot
Comment 4 2016-10-20 19:42:03 PDT
Created attachment 292313 [details] Archive of layout-test-results from ews102 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews102 Port: mac-yosemite Platform: Mac OS X 10.10.5
Build Bot
Comment 5 2016-10-20 19:50:11 PDT
Comment on attachment 292306 [details] Patch Attachment 292306 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/2335083 New failing tests: js/dom/global-constructors-attributes.html js/dom/global-constructors-attributes-dedicated-worker.html
Build Bot
Comment 6 2016-10-20 19:50:13 PDT
Created attachment 292314 [details] Archive of layout-test-results from ews107 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5
Build Bot
Comment 7 2016-10-20 19:53:00 PDT
Comment on attachment 292306 [details] Patch Attachment 292306 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/2335060 New failing tests: js/dom/global-constructors-attributes-dedicated-worker.html
Build Bot
Comment 8 2016-10-20 19:53:02 PDT
Created attachment 292315 [details] Archive of layout-test-results from ews115 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews115 Port: mac-yosemite Platform: Mac OS X 10.10.5
Alex Christensen
Comment 9 2016-10-20 20:05:03 PDT
Alex Christensen
Comment 10 2016-10-20 20:31:46 PDT
Comment on attachment 292318 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=292318&action=review > LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-labels-expected.txt:3170 > +FAIL "cseuckr" => "euc-kr" assert_equals: label for encoding should match expected "euc-kr" but got "windows-949" These failures can only be fixed by changing TextCodecICU.cpp which I don't want to touch in this patch. > LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-utf16-surrogates-expected.txt:5 > +FAIL utf-16le - lone surrogate trail assert_equals: expected "�" but got "í°Â" These failures are related to the FIXME comment in TextCodecUTF16::decode which I don't want to touch in this patch.
Alex Christensen
Comment 11 2016-10-20 20:59:33 PDT
Alex Christensen
Comment 12 2016-10-20 23:34:44 PDT
Alex Christensen
Comment 13 2016-10-21 00:45:18 PDT
Could somebody test the efl build? I don't know why it fails.
Alex Christensen
Comment 14 2016-10-21 09:48:48 PDT
Why are there linker failures with typed arrays? This is super confusing
Sam Weinig
Comment 15 2016-10-21 10:42:59 PDT
Comment on attachment 292323 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=292323&action=review > Source/WebCore/dom/TextDecoder.idl:39 > +[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options), > +Exposed=(Window,Worker), > +ImplementationLacksVTable, > +ConstructorMayThrowException] > +interface TextDecoder { This is weird formatting for extended attributes. We usually do something like: [ Foo, Bar ] interface TextDecoder { > Source/WebCore/dom/TextEncoder.idl:29 > +[Constructor, > +Exposed=(Window,Worker), > +ImplementationLacksVTable] > +interface TextEncoder { Formatting.
Darin Adler
Comment 16 2016-10-21 11:59:04 PDT
Comment on attachment 292323 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=292323&action=review > Source/WebCore/dom/TextDecoder.cpp:39 > +static bool isEncodingWhitespace(UChar c) > +{ > + return c == 0x09 > + || c == 0x0A > + || c == 0x0C > + || c == 0x0D > + || c == 0x20; > +} This function does the same thing as isHTMLSpace, but less efficiently. I suggest either using isHTMLSpace directly, or having this function be an inline that calls isHTMLSpace. > Source/WebCore/dom/TextDecoder.cpp:41 > +ExceptionOr<Ref<TextDecoder>> TextDecoder::create(String label, Options options) This function should take a const String&, not a String. > Source/WebCore/dom/TextDecoder.cpp:43 > + String strippedLabel = label.stripWhiteSpace(isEncodingWhitespace); This should use the much more efficient stripLeadingAndTrailingHTMLSpaces function rather than the one where we pass a function in. > Source/WebCore/dom/TextDecoder.cpp:46 > + if (strippedLabel != utf8Label.data()) > + return Exception { RangeError }; What is the point of this? I think this is just a slow way to check wither the label has any non-ASCII characters in it. We have faster ways to check that. > Source/WebCore/dom/TextDecoder.cpp:48 > + Ref<TextDecoder> decoder = adoptRef(*new TextDecoder(utf8Label, options)); Should use auto here, I think. > Source/WebCore/dom/TextDecoder.cpp:54 > +TextDecoder::TextDecoder(CString label, Options options) This should take const char*, not CString. > Source/WebCore/dom/TextDecoder.cpp:93 > + const UChar utf16BEBOM[2] = {0xFEFF, '\0'}; > + return makeString(utf16BEBOM, decoded); This is so expensive, requires copying the entire string to prepend this single character. Is there another way to accomplish this? Can we put this into the buffer beforehand or something? > Source/WebCore/dom/TextDecoder.cpp:136 > + m_hasDecoded = true; > + return WTFMove(result); Should just remove these two lines of code. The function will do the same thing without them. > Source/WebCore/dom/TextEncoder.cpp:38 > +RefPtr<Uint8Array> TextEncoder::encode(String&& input) const This should return Ref, not RefPtr. > Source/WebCore/dom/TextEncoder.cpp:41 > + return Uint8Array::create(input.characters8(), input.length()); This is wrong if any characters are non-ASCII. The Latin-1 characters won’t be converted to UTF-8 and so the encoding will be done incorrectly. Need test cases to cover this. > Source/WebCore/dom/TextEncoder.cpp:43 > + CString utf8 = input.utf8(); Using this function means copying the UTF-8 bytes, rather than decoding them directly into a Uint8Array buffer. We have the functions that could do this more efficiently or we can refactor to make sure we don’t need to alllocate an extra buffer and I would like to consider doing that instead. The utf8() function can fail, returning an empty string, when the input text is not valid UTF-16. Is it OK that we create an empty array in that case? Do we have test cases covering that? > Source/WebCore/dom/TextEncoder.h:35 > +class TextEncoder : public RefCounted<TextEncoder> { The class is named TextEncoder, but it always encodes as UTF-8?
Darin Adler
Comment 17 2016-10-21 11:59:48 PDT
Comment on attachment 292323 [details] Patch review- because of the Latin-1/UTF-8 mistake, which is a showstopper Also, there seems to be insufficient test coverage in the tests we imported for this.
Alex Christensen
Comment 18 2016-11-16 17:33:08 PST
Sam Weinig
Comment 19 2016-11-16 23:04:57 PST
Please make sure to update the features.json file with this change.
Sam Weinig
Comment 20 2016-11-16 23:05:36 PST
Also, might be worth splitting out the patch from the tests for review, as 2.4 MB is a bit big.
Alex Christensen
Comment 21 2016-11-17 10:29:56 PST
Alex Christensen
Comment 22 2016-11-17 12:06:32 PST
Alex Christensen
Comment 23 2016-11-17 15:05:25 PST
Created attachment 295092 [details] source_changes
Sam Weinig
Comment 24 2016-11-17 15:24:21 PST
Comment on attachment 295092 [details] source_changes View in context: https://bugs.webkit.org/attachment.cgi?id=295092&action=review > Source/WebCore/dom/TextEncoder.cpp:43 > + // FIXME: We should not need to allocate a CString to encode into a Uint8Array. > + CString utf8 = input.utf8(); > + return Uint8Array::create(reinterpret_cast<const uint8_t*>(utf8.data()), utf8.length()); While I agree with your comment, is there no way to have Uint8Array adopt the internal buffer of the c-string, which can relinquish control? Copying here seems really unfortunate. > Source/WebCore/dom/TextEncoder.h:39 > + RefPtr<Uint8Array> encode(String&&) const; This can return a Ref. > Source/WebCore/dom/TextEncoder.h:41 > + TextEncoder() { }; Can this be = default? > Source/WebCore/dom/TextEncoder.idl:34 > + > + // FIXME: This should have [NewObject] > + Uint8Array encode(optional USVString input = ""); Why not add it? What happens?
Build Bot
Comment 25 2016-11-17 16:04:19 PST
Comment on attachment 295092 [details] source_changes Attachment 295092 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/2533704 New failing tests: imported/w3c/web-platform-tests/encoding/textencoder-constructor-non-utf.html imported/w3c/web-platform-tests/encoding/textdecoder-streaming.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html imported/w3c/web-platform-tests/encoding/api-replacement-encodings.html imported/w3c/web-platform-tests/encoding/idlharness.html imported/w3c/web-platform-tests/encoding/textencoder-utf16-surrogates.html imported/w3c/web-platform-tests/encoding/api-basics.html imported/w3c/web-platform-tests/encoding/iso-2022-jp-decoder.html js/dom/global-constructors-attributes-dedicated-worker.html imported/w3c/web-platform-tests/encoding/api-surrogates-utf8.html imported/w3c/web-platform-tests/encoding/textdecoder-utf16-surrogates.html imported/w3c/web-platform-tests/encoding/textdecoder-ignorebom.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-streaming.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal.html imported/w3c/web-platform-tests/encoding/textdecoder-labels.html imported/w3c/web-platform-tests/encoding/api-invalid-label.html imported/w3c/web-platform-tests/encoding/single-byte-decoder.html imported/w3c/web-platform-tests/encoding/textdecoder-byte-order-marks.html
Build Bot
Comment 26 2016-11-17 16:04:24 PST
Created attachment 295102 [details] Archive of layout-test-results from ews103 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-yosemite Platform: Mac OS X 10.10.5
Build Bot
Comment 27 2016-11-17 16:07:54 PST
Comment on attachment 295092 [details] source_changes Attachment 295092 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/2533712 New failing tests: imported/w3c/web-platform-tests/encoding/textencoder-constructor-non-utf.html imported/w3c/web-platform-tests/encoding/textdecoder-streaming.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html imported/w3c/web-platform-tests/encoding/api-replacement-encodings.html imported/w3c/web-platform-tests/encoding/idlharness.html imported/w3c/web-platform-tests/encoding/textencoder-utf16-surrogates.html imported/w3c/web-platform-tests/encoding/api-basics.html imported/w3c/web-platform-tests/encoding/iso-2022-jp-decoder.html js/dom/global-constructors-attributes-dedicated-worker.html imported/w3c/web-platform-tests/encoding/api-surrogates-utf8.html imported/w3c/web-platform-tests/encoding/textdecoder-utf16-surrogates.html imported/w3c/web-platform-tests/encoding/textdecoder-ignorebom.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-streaming.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal.html imported/w3c/web-platform-tests/encoding/textdecoder-labels.html imported/w3c/web-platform-tests/encoding/api-invalid-label.html imported/w3c/web-platform-tests/encoding/single-byte-decoder.html imported/w3c/web-platform-tests/encoding/textdecoder-byte-order-marks.html js/dom/global-constructors-attributes.html
Build Bot
Comment 28 2016-11-17 16:07:59 PST
Created attachment 295103 [details] Archive of layout-test-results from ews104 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5
Build Bot
Comment 29 2016-11-17 16:20:29 PST
Comment on attachment 295092 [details] source_changes Attachment 295092 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/2533764 New failing tests: imported/w3c/web-platform-tests/encoding/textencoder-constructor-non-utf.html imported/w3c/web-platform-tests/encoding/textdecoder-streaming.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html imported/w3c/web-platform-tests/encoding/api-replacement-encodings.html imported/w3c/web-platform-tests/encoding/idlharness.html imported/w3c/web-platform-tests/encoding/textencoder-utf16-surrogates.html imported/w3c/web-platform-tests/encoding/api-basics.html imported/w3c/web-platform-tests/encoding/iso-2022-jp-decoder.html js/dom/global-constructors-attributes-dedicated-worker.html imported/w3c/web-platform-tests/encoding/api-surrogates-utf8.html imported/w3c/web-platform-tests/encoding/textdecoder-utf16-surrogates.html imported/w3c/web-platform-tests/encoding/textdecoder-ignorebom.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-streaming.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal.html imported/w3c/web-platform-tests/encoding/textdecoder-labels.html imported/w3c/web-platform-tests/encoding/api-invalid-label.html imported/w3c/web-platform-tests/encoding/single-byte-decoder.html imported/w3c/web-platform-tests/encoding/textdecoder-byte-order-marks.html
Build Bot
Comment 30 2016-11-17 16:20:34 PST
Created attachment 295104 [details] Archive of layout-test-results from ews115 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews115 Port: mac-yosemite Platform: Mac OS X 10.10.5
Build Bot
Comment 31 2016-11-17 16:27:07 PST
Comment on attachment 295092 [details] source_changes Attachment 295092 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/2533771 New failing tests: imported/w3c/web-platform-tests/encoding/textencoder-constructor-non-utf.html imported/w3c/web-platform-tests/encoding/textdecoder-streaming.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html imported/w3c/web-platform-tests/encoding/api-replacement-encodings.html imported/w3c/web-platform-tests/encoding/idlharness.html imported/w3c/web-platform-tests/encoding/textencoder-utf16-surrogates.html imported/w3c/web-platform-tests/encoding/api-basics.html imported/w3c/web-platform-tests/encoding/iso-2022-jp-decoder.html js/dom/global-constructors-attributes-dedicated-worker.html imported/w3c/web-platform-tests/encoding/api-surrogates-utf8.html imported/w3c/web-platform-tests/encoding/textdecoder-utf16-surrogates.html imported/w3c/web-platform-tests/encoding/textdecoder-ignorebom.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal-streaming.html imported/w3c/web-platform-tests/encoding/textdecoder-fatal.html imported/w3c/web-platform-tests/encoding/textdecoder-labels.html imported/w3c/web-platform-tests/encoding/api-invalid-label.html imported/w3c/web-platform-tests/encoding/single-byte-decoder.html imported/w3c/web-platform-tests/encoding/textdecoder-byte-order-marks.html
Build Bot
Comment 32 2016-11-17 16:27:12 PST
Created attachment 295106 [details] Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.6
Alex Christensen
Comment 33 2016-11-17 18:11:25 PST
Chris Rebert
Comment 34 2016-12-06 22:14:14 PST
*** Bug 160653 has been marked as a duplicate of this bug. ***
Note You need to log in before you can comment on or make changes to this bug.