Bug 193274

Summary: Implement TextEncoder's encodeInto()
Product: WebKit Reporter: Anne van Kesteren <annevk>
Component: DOM Assignee: Alex Christensen <achristensen>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, ap, cdumez, darin, denelxan, esprehn+autocc, ews-watchlist, kangil.han, kondapallykalyan, mmaxfield, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch darin: review+

Description Anne van Kesteren 2019-01-08 23:56:55 PST
This API allows for encoding a string into UTF-8 bytes using a preexisting target buffer.

See https://github.com/whatwg/encoding/pull/166 for the standard change and https://github.com/web-platform-tests/wpt/pull/14505 for tests.
Comment 1 Alex Christensen 2020-09-02 20:05:39 PDT
Created attachment 407853 [details]
Patch
Comment 2 Darin Adler 2020-09-02 20:51:50 PDT
Comment on attachment 407853 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=407853&action=review

> Source/WebCore/dom/TextEncoder.cpp:64
> +        uint8_t buffer[U8_MAX_LENGTH];
> +        unsigned offset = 0;
> +        U8_APPEND(buffer, offset, sizeof(buffer), token, sawError);
> +        if (sawError)
> +            break;
> +        if (written + offset > capacity)
> +            break;
> +        memcpy(destinationBytes + written, buffer, offset);
> +        written += offset;

Since U8_APPEND has bounds checking built in we don’t need to keep using a buffer every time. We could do something more like this:

    auto offset = written;
    U8_APPEND(destinationBytes, offset, capacity, token, sawError);
    if (sawError)
        break;
    written = offset;

> Source/WebCore/dom/TextEncoder.cpp:68
> +        if (U_IS_BMP(token))
> +            read++;
> +        else
> +            read += 2;

This could be:

    read += U16_LENGTH(token);
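
As a rough illustration of why the two forms advance `read` identically: ICU defines U16_LENGTH(c) as 1 for code points up to U+FFFF and 2 otherwise, which is the same split U_IS_BMP makes. The sketch below uses hypothetical stand-in functions (`isBmp`, `utf16Length`, `advanceBranching`, `advanceCompact`) instead of the ICU macros so that it builds without ICU; it is not the patch code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ICU's U_IS_BMP: true for code points up to U+FFFF.
inline bool isBmp(char32_t c) { return static_cast<std::uint32_t>(c) <= 0xFFFF; }

// Hypothetical stand-in for ICU's U16_LENGTH: UTF-16 code units needed for c.
inline std::size_t utf16Length(char32_t c) { return static_cast<std::uint32_t>(c) <= 0xFFFF ? 1 : 2; }

// The if/else form quoted from the patch.
inline std::size_t advanceBranching(std::size_t read, char32_t token)
{
    if (isBmp(token))
        read++;
    else
        read += 2;
    return read;
}

// The one-line form suggested in the review.
inline std::size_t advanceCompact(std::size_t read, char32_t token)
{
    return read + utf16Length(token);
}
```

Both helpers return the same value for any code point, so the shorter form should be a drop-in replacement.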
Comment 3 Darin Adler 2020-09-02 20:52:51 PDT
Comment on attachment 407853 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=407853&action=review

>> Source/WebCore/dom/TextEncoder.cpp:64
>> +        written += offset;
> 
> Since U8_APPEND has bounds checking built in we don’t need to keep using a buffer every time. We could do something more like this:
> 
>     auto offset = written;
>     U8_APPEND(destinationBytes, offset, capacity, token, sawError);
>     if (sawError)
>         break;
>     written = offset;

Reviewed the U8_APPEND implementation and we can do better than that:

    U8_APPEND(destinationBytes, written, capacity, token, sawError);
    if (sawError)
        break;
Comment 4 youenn fablet 2020-09-03 01:33:54 PDT
Comment on attachment 407853 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=407853&action=review

> Source/WebCore/dom/TextEncoder.idl:31
> +};

There seems to be an issue with the binding generator here: it fails to identify that TextEncoder should be exposed to workers.

To work around the issue, move the dictionary declaration into another IDL file or to the end of this file.
We should probably file a bug for this issue.
Comment 5 Alex Christensen 2020-09-03 10:10:42 PDT
Adding Exposed=(Window,Worker) to the dictionary works, too.
Comment 6 youenn fablet 2020-09-03 10:36:47 PDT
(In reply to Alex Christensen from comment #5)
> Adding Exposed=(Window,Worker) to the dictionary works, too.

Sure, but I do not think dictionaries have Exposed in WebIDL.
Comment 7 Alex Christensen 2020-09-03 10:51:07 PDT
(In reply to Darin Adler from comment #3)
>     U8_APPEND(destinationBytes, written, capacity, token, sawError);
>     if (sawError)
>         break;

This works, but I first have to check if written == capacity and break.  It is a precondition of U8_APPEND that written is strictly less than capacity.
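
For illustration only, the overall loop shape can be sketched without ICU. Everything named here (`EncodeIntoResult`, `utf8Length`, `encodeIntoSketch`) is hypothetical and is not the code that landed; the sketch hand-rolls the UTF-8 encoding instead of calling U8_APPEND, and it assumes lone surrogates are replaced with U+FFFD, as they would be after conversion to a USVString. It keeps the explicit `written == capacity` check to mirror the precondition described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical result type mirroring the { read, written } pair reported by encodeInto().
struct EncodeIntoResult {
    std::size_t read = 0;    // UTF-16 code units consumed from the source
    std::size_t written = 0; // UTF-8 bytes stored in the destination
};

// Hypothetical helper: number of UTF-8 bytes needed for a Unicode scalar value.
inline std::size_t utf8Length(char32_t c)
{
    if (c < 0x80) return 1;
    if (c < 0x800) return 2;
    if (c < 0x10000) return 3;
    return 4;
}

inline EncodeIntoResult encodeIntoSketch(const std::u16string& source, std::uint8_t* destination, std::size_t capacity)
{
    EncodeIntoResult result;
    while (result.read < source.size()) {
        // Mirrors the explicit full-buffer check described above. In this
        // sketch the room check further down would also catch this case.
        if (result.written == capacity)
            break;

        // Decode one code point; unpaired surrogates become U+FFFD (assumption).
        char16_t unit = source[result.read];
        char32_t token = unit;
        std::size_t unitsConsumed = 1;
        if (unit >= 0xD800 && unit <= 0xDBFF) {
            bool hasTrail = result.read + 1 < source.size()
                && source[result.read + 1] >= 0xDC00 && source[result.read + 1] <= 0xDFFF;
            if (hasTrail) {
                token = 0x10000 + ((static_cast<char32_t>(unit) - 0xD800) << 10)
                    + (static_cast<char32_t>(source[result.read + 1]) - 0xDC00);
                unitsConsumed = 2;
            } else
                token = 0xFFFD;
        } else if (unit >= 0xDC00 && unit <= 0xDFFF)
            token = 0xFFFD;

        // Stop before writing a partial sequence if the code point does not fit.
        std::size_t needed = utf8Length(token);
        if (needed > capacity - result.written)
            break;

        std::uint8_t* out = destination + result.written;
        if (needed == 1)
            out[0] = static_cast<std::uint8_t>(token);
        else if (needed == 2) {
            out[0] = static_cast<std::uint8_t>(0xC0 | (token >> 6));
            out[1] = static_cast<std::uint8_t>(0x80 | (token & 0x3F));
        } else if (needed == 3) {
            out[0] = static_cast<std::uint8_t>(0xE0 | (token >> 12));
            out[1] = static_cast<std::uint8_t>(0x80 | ((token >> 6) & 0x3F));
            out[2] = static_cast<std::uint8_t>(0x80 | (token & 0x3F));
        } else {
            out[0] = static_cast<std::uint8_t>(0xF0 | (token >> 18));
            out[1] = static_cast<std::uint8_t>(0x80 | ((token >> 12) & 0x3F));
            out[2] = static_cast<std::uint8_t>(0x80 | ((token >> 6) & 0x3F));
            out[3] = static_cast<std::uint8_t>(0x80 | (token & 0x3F));
        }
        result.written += needed;
        result.read += unitsConsumed;
    }
    return result;
}
```

With a 3-byte destination and a source holding one non-BMP character, the sketch stops without writing anything, so `read` and `written` both stay 0 and no partial sequence reaches the buffer.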
Comment 8 Alex Christensen 2020-09-03 10:53:13 PDT
http://trac.webkit.org/r266533
Comment 9 Radar WebKit Bug Importer 2020-09-03 10:54:24 PDT
<rdar://problem/68288806>
Comment 10 Alex Christensen 2020-09-04 11:06:37 PDT
http://trac.webkit.org/r266621