Bug 227244

Summary: Replace builder.append(String::fromUTF8(...)) uses with a more efficient new StringTypeAdapter
Product: WebKit Reporter: Sam Weinig <sam>
Component: Web Template FrameworkAssignee: Sam Weinig <sam>
Status: ASSIGNED ---    
Severity: Normal CC: benjamin, cdumez, cmarcelo, darin, ews-watchlist, hi, joepeck, keith_miller, mark.lam, msaboff, saam, tzagallo, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: Other   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch ews-feeder: commit-queue-

Description Sam Weinig 2021-06-21 16:53:31 PDT
A few places in WebCore use builder.append(String::fromUTF8(...)) to add UTF8 content to a StringBuilder. This makes an extra String allocation per call and can be improved by adding a StringTypeAdapter for this use case to directly convert into the StringBuilder's buffer. Something like:

struct FromUTF8Converter {
    Span<const std::byte> buffer;
};

inline FromUTF8Converter fromUTF8(Span<const std::byte> buffer)
{
    return { buffer };
}

template<> class StringTypeAdapter<FromUTF8Converter, void> {
public:
    StringTypeAdapter(const FromUTF8Converter& converter)
        : m_converter { converter }
    {
        auto [utf16StringLength, isAllASCII] = computeUTF16StringLengthAreAllASCII(m_converter.buffer.data(), m_converter.buffer.size()) }

        m_utf16StringLength = utf16StringLength;
        m_isAllASCII = isAllASCII;
    }

    unsigned length() const
    { 
        if (m_isAllASCII)
            return m_converter.buffer.size();
        return m_utf16StringLength;
    }

    bool is8Bit() const { return m_isAllASCII; }
    
    void writeTo(LChar* destination) const
    {
        ASSERT(m_isAllASCII);
        memcpy(destination, m_converter.buffer.data(), m_converter.buffer.data() + m_converter.buffer.size(), );
    }

    void writeTo(UChar* destination) const
    {
        convertUTF8ToUTF16(m_converter.buffer.data(), m_converter.buffer.size(), destination, destination + m_utf16StringLength);
    }

private:
    const FromUTF8Converter& m_converter;
    unsigned m_utf16StringLength;
    bool m_isAllASCII;
};


...


builder.append(fromUTF8(data));
Comment 1 Darin Adler 2021-06-23 09:35:31 PDT
Great idea.
Comment 2 Darin Adler 2021-06-23 09:36:31 PDT
Once we do this we could use this with makeString too, so we don’t really need String::fromUTF8 any more (although of course we can keep it); using it with makeString is more flexible.
Comment 3 Radar WebKit Bug Importer 2021-06-28 16:54:16 PDT
<rdar://problem/79888871>
Comment 4 Sam Weinig 2021-07-05 08:55:52 PDT
Created attachment 432888 [details]
Patch