See: the discussion in https://bugs.webkit.org/show_bug.cgi?id=211498.
Created attachment 398798 [details] Patch
Created attachment 398804 [details] Patch
Comment on attachment 398804 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=398804&action=review

> Source/WebCore/editing/markup.cpp:921
> +#if PLATFORM(COCOA)

I think this kind of issue might exist on other platforms as well. Would be nice to call people’s attention to this in case they want to take advantage of it.

> Source/WebCore/editing/markup.cpp:926
> +    if (!accumulatedMarkup.isAllASCII()) {
> +        // On Cocoa platforms, this markup is eventually persisted to the pasteboard and read back as UTF-8 data,
> +        // so this meta tag is needed for clients that read this data in the future from the pasteboard and load it.
> +        return makeString("<meta charset=\"UTF-8\">", WTFMove(accumulatedMarkup));
> +    }

Could avoid making a second copy of the entire string by adding an isAllASCII function to StyledMarkupAccumulator and adding another function you can use to add this to m_reversedPrecedingMarkup before calling takeResults. Less economical in code complexity, but more efficient.
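A minimal sketch of the review suggestion above, using plain `std::string` rather than WTF types. `ToyAccumulator`, its members, and `prependMetaCharsetUTF8IfNeeded` are hypothetical names for illustration only; they are not the actual `StyledMarkupAccumulator` API. The idea is to push the meta tag onto a list of preceding fragments before the result is assembled, so the accumulated markup is not copied a second time.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a markup accumulator; NOT WebKit's StyledMarkupAccumulator.
struct ToyAccumulator {
    // Fragments that precede the main markup, stored last-to-first
    // (assumed to mirror the role of m_reversedPrecedingMarkup).
    std::vector<std::string> reversedPrecedingMarkup;
    std::string markup;

    // True if every byte of the main markup is in the ASCII range.
    bool isAllASCII() const
    {
        for (unsigned char c : markup) {
            if (c > 0x7F)
                return false;
        }
        return true;
    }

    // Queues the meta tag as a preceding fragment only when it is needed.
    void prependMetaCharsetUTF8IfNeeded()
    {
        if (!isAllASCII())
            reversedPrecedingMarkup.push_back("<meta charset=\"UTF-8\">");
    }

    // Assembles the final string once: preceding fragments in reverse
    // storage order, then the main markup.
    std::string takeResults()
    {
        std::string result;
        for (auto it = reversedPrecedingMarkup.rbegin(); it != reversedPrecedingMarkup.rend(); ++it)
            result += *it;
        result += markup;
        return result;
    }
};
```

Note that this sketch only inspects the main markup for non-ASCII bytes; whether preceding fragments also need checking is left open here.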
Comment on attachment 398804 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=398804&action=review

>> Source/WebCore/editing/markup.cpp:926
>> +    }
>
> Could avoid making a second copy of the entire string by adding an isAllASCII function to StyledMarkupAccumulator and adding another function you can use to add this to m_reversedPrecedingMarkup before calling takeResults. Less economical in code complexity, but more efficient.

Sounds good to me! I’ll update the patch to do this.
(In reply to Wenson Hsieh from comment #4)
> Comment on attachment 398804 [details]
> Patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=398804&action=review
>
> >> Source/WebCore/editing/markup.cpp:926
> >> +    }
> >
> > Could avoid making a second copy of the entire string by adding an isAllASCII function to StyledMarkupAccumulator and adding another function you can use to add this to m_reversedPrecedingMarkup before calling takeResults. Less economical in code complexity, but more efficient.
>
> Sounds good to me! I’ll update the patch to do this.

So adding an `isAllASCII()` method to `StyledMarkupAccumulator` would require us to also add an `isAllASCII()` method to StringBuilder, which, I think, seems fine? I imagine it would just be like WTF::String’s. Something like:

    bool isAllASCII() const { return !m_buffer || m_buffer->isAllASCII(); }
(In reply to Wenson Hsieh from comment #5)
> (In reply to Wenson Hsieh from comment #4)
> > Comment on attachment 398804 [details]
> > Patch
> >
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=398804&action=review
> >
> > >> Source/WebCore/editing/markup.cpp:926
> > >> +    }
> > >
> > > Could avoid making a second copy of the entire string by adding an isAllASCII function to StyledMarkupAccumulator and adding another function you can use to add this to m_reversedPrecedingMarkup before calling takeResults. Less economical in code complexity, but more efficient.
> >
> > Sounds good to me! I’ll update the patch to do this.
>
> So adding an `isAllASCII()` method to `StyledMarkupAccumulator` would
> require us to also add an `isAllASCII()` method to StringBuilder, which, I
> think, seems fine? I imagine it would just be like WTF::String’s. Something
> like:
>
>     bool isAllASCII() const { return !m_buffer || m_buffer->isAllASCII(); }

…upon further testing, this isn’t correct, because a StringBuilder can be resized (but keep the same m_buffer) :/
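One plausible reading of the problem described above, sketched with a toy builder rather than WTF::StringBuilder: if the logical length shrinks while the backing buffer is kept, a check over the whole buffer can see stale non-ASCII bytes that are no longer part of the string. `ToyBuilder` and all of its members are hypothetical and exist only to illustrate the pitfall.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for a string builder; NOT WTF::StringBuilder.
struct ToyBuilder {
    std::string buffer;     // backing storage; may hold stale bytes past `length`
    std::size_t length = 0; // logical length of the built string

    void append(const std::string& s)
    {
        buffer.resize(length); // drop any stale tail before appending
        buffer += s;
        length = buffer.size();
    }

    // Shrinks the logical length but keeps the same backing storage.
    void shrink(std::size_t newLength)
    {
        assert(newLength <= length);
        length = newLength;
    }

    // True if the first `count` bytes of `s` are all in the ASCII range.
    static bool isASCII(const std::string& s, std::size_t count)
    {
        for (std::size_t i = 0; i < count; ++i) {
            if (static_cast<unsigned char>(s[i]) > 0x7F)
                return false;
        }
        return true;
    }

    // Naive: inspects the whole buffer, including stale bytes.
    bool isAllASCIINaive() const { return isASCII(buffer, buffer.size()); }

    // Length-aware: inspects only the first `length` bytes.
    bool isAllASCII() const { return isASCII(buffer, length); }
};
```

After appending "abc" followed by a non-ASCII sequence and then shrinking back to three characters, the naive check still reports non-ASCII content while the length-aware check does not.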
Created attachment 398819 [details] Address feedback
Comment on attachment 398819 [details]
Address feedback

View in context: https://bugs.webkit.org/attachment.cgi?id=398819&action=review

> Source/WebCore/editing/MarkupAccumulator.h:72
> +    bool isAllASCII() const { return m_markup.toStringPreserveCapacity().isAllASCII(); }

We can follow up and make a much more efficient version of this. Fine to land like this I suppose.

> Source/WebCore/editing/markup.cpp:248
> +        m_reversedPrecedingMarkup.append("<meta charset=\"UTF-8\">");

Should add a _s here, I think. It’s more efficient to create a String from an ASCIILiteral, since it doesn’t copy the characters.
Comment on attachment 398819 [details]
Address feedback

View in context: https://bugs.webkit.org/attachment.cgi?id=398819&action=review

>> Source/WebCore/editing/MarkupAccumulator.h:72
>> +    bool isAllASCII() const { return m_markup.toStringPreserveCapacity().isAllASCII(); }
>
> We can follow up and make a much more efficient version of this. Fine to land like this I suppose.

I’m happy to do this optimization after this lands.
Comment on attachment 398819 [details]
Address feedback

View in context: https://bugs.webkit.org/attachment.cgi?id=398819&action=review

>>> Source/WebCore/editing/MarkupAccumulator.h:72
>>> +    bool isAllASCII() const { return m_markup.toStringPreserveCapacity().isAllASCII(); }
>>
>> We can follow up and make a much more efficient version of this. Fine to land like this I suppose.
>
> I’m happy to do this optimization after this lands.

\o/

>> Source/WebCore/editing/markup.cpp:248
>> +        m_reversedPrecedingMarkup.append("<meta charset=\"UTF-8\">");
>
> Should add a _s here, I think. It’s more efficient to create a String from an ASCIILiteral, since it doesn’t copy the characters.

Done!
Created attachment 398871 [details] Patch for landing
Committed r261395: <https://trac.webkit.org/changeset/261395>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 398871 [details].
<rdar://problem/63027006>
Comment on attachment 398819 [details]
Address feedback

View in context: https://bugs.webkit.org/attachment.cgi?id=398819&action=review

>>> Source/WebCore/editing/markup.cpp:248
>>> +        m_reversedPrecedingMarkup.append("<meta charset=\"UTF-8\">");
>>
>> Should add a _s here, I think. It’s more efficient to create a String from an ASCIILiteral, since it doesn’t copy the characters.
>
> Done!

This patch broke WebKit's ability to save XHTML documents. This element will cause a parsing error in an XHTML document.