Bug 227244 - Replace builder.append(String::fromUTF8(...)) uses with a more efficient new StringTypeAdapter
Summary: Replace builder.append(String::fromUTF8(...)) uses with a more efficient new ...
Status: ASSIGNED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Sam Weinig
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-06-21 16:53 PDT by Sam Weinig
Modified: 2021-07-05 09:08 PDT (History)
13 users (show)

See Also:


Attachments
Patch (22.17 KB, patch)
2021-07-05 08:55 PDT, Sam Weinig
ews-feeder: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sam Weinig 2021-06-21 16:53:31 PDT
A few places in WebCore use builder.append(String::fromUTF8(...)) to add UTF8 content to a StringBuilder. This makes an extra String allocation per call and can be improved by adding a StringTypeAdapter for this use case to directly convert into the StringBuilder's buffer. Something like:

struct FromUTF8Converter {
    Span<const std::byte> buffer;
};

inline FromUTF8Converter fromUTF8(Span<const std::byte> buffer)
{
    return { buffer };
}

template<> class StringTypeAdapter<FromUTF8Converter, void> {
public:
    StringTypeAdapter(const FromUTF8Converter& converter)
        : m_converter { converter }
    {
        auto [utf16StringLength, isAllASCII] = computeUTF16StringLengthAreAllASCII(m_converter.buffer.data(), m_converter.buffer.size()) }

        m_utf16StringLength = utf16StringLength;
        m_isAllASCII = isAllASCII;
    }

    unsigned length() const
    { 
        if (m_isAllASCII)
            return m_converter.buffer.size();
        return m_utf16StringLength;
    }

    bool is8Bit() const { return m_isAllASCII; }
    
    void writeTo(LChar* destination) const
    {
        ASSERT(m_isAllASCII);
        memcpy(destination, m_converter.buffer.data(), m_converter.buffer.data() + m_converter.buffer.size(), );
    }

    void writeTo(UChar* destination) const
    {
        convertUTF8ToUTF16(m_converter.buffer.data(), m_converter.buffer.size(), destination, destination + m_utf16StringLength);
    }

private:
    const FromUTF8Converter& m_converter;
    unsigned m_utf16StringLength;
    bool m_isAllASCII;
};


...


builder.append(fromUTF8(data));
Comment 1 Darin Adler 2021-06-23 09:35:31 PDT
Great idea.
Comment 2 Darin Adler 2021-06-23 09:36:31 PDT
Once we do this we could use this with makeString too, so we don’t really need String::fromUTF8 any more (although of course we can keep it); using it with makeString is more flexible.
Comment 3 Radar WebKit Bug Importer 2021-06-28 16:54:16 PDT
<rdar://problem/79888871>
Comment 4 Sam Weinig 2021-07-05 08:55:52 PDT
Created attachment 432888 [details]
Patch