WebKit Bugzilla
RESOLVED FIXED
211524
Preserve character set information when writing to the pasteboard when copying rich text
https://bugs.webkit.org/show_bug.cgi?id=211524
Wenson Hsieh, Reported 2020-05-06 12:35:31 PDT
See the discussion in https://bugs.webkit.org/show_bug.cgi?id=211498.
Attachments
Patch (14.76 KB, patch), 2020-05-07 14:39 PDT, Wenson Hsieh, no flags
Patch (17.63 KB, patch), 2020-05-07 15:29 PDT, Wenson Hsieh, no flags
Address feedback (19.68 KB, patch), 2020-05-07 17:33 PDT, Wenson Hsieh, darin: review+
Patch for landing (19.65 KB, patch), 2020-05-08 10:13 PDT, Wenson Hsieh, no flags
Wenson Hsieh, Comment 1, 2020-05-07 14:39:56 PDT
Comment hidden (obsolete)
Created attachment 398798: Patch
Wenson Hsieh, Comment 2, 2020-05-07 15:29:16 PDT
Created attachment 398804: Patch
Darin Adler, Comment 3, 2020-05-07 15:35:15 PDT
Comment on attachment 398804: Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=398804&action=review

> Source/WebCore/editing/markup.cpp:921
> +#if PLATFORM(COCOA)

I think this kind of issue might exist on other platforms as well. Would be nice to call people’s attention to this in case they want to take advantage of it.

> Source/WebCore/editing/markup.cpp:926
> +    if (!accumulatedMarkup.isAllASCII()) {
> +        // On Cocoa platforms, this markup is eventually persisted to the pasteboard and read back as UTF-8 data,
> +        // so this meta tag is needed for clients that read this data in the future from the pasteboard and load it.
> +        return makeString("<meta charset=\"UTF-8\">", WTFMove(accumulatedMarkup));
> +    }

Could avoid making a second copy of the entire string by adding an isAllASCII function to StyledMarkupAccumulator and adding another function you can use to add this to m_reversedPrecedingMarkup before calling takeResults. Less economical in code complexity, but more efficient.
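The suggestion above (prepend the charset declaration to the pieces that precede the body, instead of concatenating a second full string afterwards) might look roughly like the sketch below. It is an illustration under stated assumptions, not WebKit's implementation: `SketchAccumulator`, `prepend`, and `serializeForPasteboard` are hypothetical names, and `std::string` stands in for WTF's string types. Only `isAllASCII`, `m_reversedPrecedingMarkup`, and `takeResults` are names taken from the review.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for StyledMarkupAccumulator; uses std::string, not WTF types.
class SketchAccumulator {
public:
    void append(const std::string& markup) { m_markup += markup; }

    // Preceding pieces are pushed innermost-first, so they are stored in reverse order.
    void prepend(const std::string& markup) { m_reversedPrecedingMarkup.push_back(markup); }

    // Checks only the accumulated body, which is all this sketch needs.
    bool isAllASCII() const
    {
        return std::all_of(m_markup.begin(), m_markup.end(),
            [](unsigned char c) { return c < 0x80; });
    }

    // Builds the final string once: preceding pieces (walked backwards), then the body.
    std::string takeResults()
    {
        std::size_t total = m_markup.size();
        for (const auto& piece : m_reversedPrecedingMarkup)
            total += piece.size();

        std::string result;
        result.reserve(total);
        for (auto it = m_reversedPrecedingMarkup.rbegin(); it != m_reversedPrecedingMarkup.rend(); ++it)
            result += *it;
        result += m_markup;
        return result;
    }

private:
    std::string m_markup;
    std::vector<std::string> m_reversedPrecedingMarkup;
};

// Adds the charset declaration before the single final concatenation, only when needed.
std::string serializeForPasteboard(SketchAccumulator& accumulator)
{
    if (!accumulator.isAllASCII())
        accumulator.prepend("<meta charset=\"UTF-8\">");
    return accumulator.takeResults();
}
```

With this shape, the body is copied once into the result; the extra cost is a scan for non-ASCII bytes and one more entry in the list of preceding pieces.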
Wenson Hsieh, Comment 4, 2020-05-07 16:09:29 PDT
Comment on attachment 398804: Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=398804&action=review

>> Source/WebCore/editing/markup.cpp:926
>> +    }
>
> Could avoid making a second copy of the entire string by adding an isAllASCII function to StyledMarkupAccumulator and adding another function you can use to add this to m_reversedPrecedingMarkup before calling takeResults. Less economical in code complexity, but more efficient.

Sounds good to me! I’ll update the patch to do this.
Wenson Hsieh, Comment 5, 2020-05-07 16:37:30 PDT
(In reply to Wenson Hsieh from comment #4)
> Comment on attachment 398804: Patch
>
> View in context: https://bugs.webkit.org/attachment.cgi?id=398804&action=review
>
> >> Source/WebCore/editing/markup.cpp:926
> >> +    }
> >
> > Could avoid making a second copy of the entire string by adding an isAllASCII function to StyledMarkupAccumulator and adding another function you can use to add this to m_reversedPrecedingMarkup before calling takeResults. Less economical in code complexity, but more efficient.
>
> Sounds good to me! I’ll update the patch to do this.

So adding an `isAllASCII()` method to `StyledMarkupAccumulator` would require us to also add an `isAllASCII()` method to StringBuilder, which, I think, seems fine? I imagine it would just be like WTF::String’s. Something like:

    bool isAllASCII() const { return !m_buffer || m_buffer->isAllASCII(); }
Wenson Hsieh, Comment 6, 2020-05-07 17:12:42 PDT
(In reply to Wenson Hsieh from comment #5)
> (In reply to Wenson Hsieh from comment #4)
> > Comment on attachment 398804: Patch
> >
> > View in context: https://bugs.webkit.org/attachment.cgi?id=398804&action=review
> >
> > >> Source/WebCore/editing/markup.cpp:926
> > >> +    }
> > >
> > > Could avoid making a second copy of the entire string by adding an isAllASCII function to StyledMarkupAccumulator and adding another function you can use to add this to m_reversedPrecedingMarkup before calling takeResults. Less economical in code complexity, but more efficient.
> >
> > Sounds good to me! I’ll update the patch to do this.
>
> So adding an `isAllASCII()` method to `StyledMarkupAccumulator` would require us to also add an `isAllASCII()` method to StringBuilder, which, I think, seems fine? I imagine it would just be like WTF::String’s. Something like:
>
>     bool isAllASCII() const { return !m_buffer || m_buffer->isAllASCII(); }

…upon further testing, this isn’t correct, because a StringBuilder can be resized (but keep the same m_buffer) :/
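One way to read the problem described above: if a builder can be shrunk while keeping its old buffer, the buffer may still hold bytes that are no longer part of the logical contents, so scanning the whole buffer can give the wrong answer. The sketch below illustrates that reading with a deliberately simplified, hypothetical `SketchBuilder`; it does not reproduce the internals of WTF::StringBuilder, and `shrink`, `bufferIsAllASCII`, and `toString` are made-up names.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical, simplified builder. It keeps a buffer plus a logical length,
// and shrinking only changes the length. This is NOT WTF::StringBuilder.
class SketchBuilder {
public:
    void append(const std::string& text)
    {
        m_buffer.erase(m_length); // discard any stale tail before writing
        m_buffer += text;
        m_length = m_buffer.size();
    }

    // Shrinks the logical contents but leaves the buffer untouched.
    void shrink(std::size_t newLength) { m_length = std::min(newLength, m_length); }

    // Mirrors the first idea: ask the whole buffer. Stale bytes can make this wrong.
    bool bufferIsAllASCII() const { return isASCII(m_buffer.data(), m_buffer.size()); }

    // Scans only the bytes that belong to the current contents.
    bool isAllASCII() const { return isASCII(m_buffer.data(), m_length); }

    std::string toString() const { return m_buffer.substr(0, m_length); }

private:
    static bool isASCII(const char* data, std::size_t length)
    {
        return std::all_of(data, data + length,
            [](unsigned char c) { return c < 0x80; });
    }

    std::string m_buffer;
    std::size_t m_length { 0 };
};
```

After appending text that ends in a non-ASCII character and then shrinking past it, `bufferIsAllASCII()` still reports non-ASCII content while `isAllASCII()` reflects the contents a caller would actually get back.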
Wenson Hsieh, Comment 7, 2020-05-07 17:33:47 PDT
Created attachment 398819: Address feedback
Darin Adler, Comment 8, 2020-05-08 09:04:27 PDT
Comment on attachment 398819: Address feedback
View in context: https://bugs.webkit.org/attachment.cgi?id=398819&action=review

> Source/WebCore/editing/MarkupAccumulator.h:72
> +    bool isAllASCII() const { return m_markup.toStringPreserveCapacity().isAllASCII(); }

We can follow up and make a much more efficient version of this. Fine to land like this I suppose.

> Source/WebCore/editing/markup.cpp:248
> +    m_reversedPrecedingMarkup.append("<meta charset=\"UTF-8\">");

Should add a _s here, I think. It’s more efficient to create a String from an ASCIILiteral, since it doesn’t copy the characters.
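The general idea behind the `_s` remark (a literal wrapper can hand over a pointer to static storage, so nothing has to be copied) can be illustrated in standard C++. The sketch below is an analogy only: `LiteralView`, the `_lit` suffix, `toStringView`, and `toOwnedString` are hypothetical names chosen for this example, and none of it is WTF's `ASCIILiteral` or `String` API.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical "literal" type: a pointer and length referring to static storage.
// This is not WTF::ASCIILiteral.
struct LiteralView {
    const char* chars;
    std::size_t length;
};

// Hypothetical suffix for this sketch; the review refers to WebKit's own `_s`.
constexpr LiteralView operator""_lit(const char* chars, std::size_t length)
{
    return { chars, length };
}

// No copy: the view refers to the literal's own storage.
constexpr std::string_view toStringView(LiteralView literal)
{
    return { literal.chars, literal.length };
}

// Copy: the characters are duplicated into a newly owned buffer.
inline std::string toOwnedString(LiteralView literal)
{
    return std::string(literal.chars, literal.length);
}
```

A view built with `toStringView("<meta charset=\"UTF-8\">"_lit)` shares the literal's storage, whereas `toOwnedString` duplicates the characters, which is the cost the review comment suggests avoiding.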
Darin Adler, Comment 9, 2020-05-08 09:35:03 PDT
Comment on attachment 398819: Address feedback
View in context: https://bugs.webkit.org/attachment.cgi?id=398819&action=review

>> Source/WebCore/editing/MarkupAccumulator.h:72
>> +    bool isAllASCII() const { return m_markup.toStringPreserveCapacity().isAllASCII(); }
>
> We can follow up and make a much more efficient version of this. Fine to land like this I suppose.

I’m happy to do this optimization after this lands.
Wenson Hsieh, Comment 10, 2020-05-08 10:06:35 PDT
Comment on attachment 398819: Address feedback
View in context: https://bugs.webkit.org/attachment.cgi?id=398819&action=review

>>> Source/WebCore/editing/MarkupAccumulator.h:72
>>> +    bool isAllASCII() const { return m_markup.toStringPreserveCapacity().isAllASCII(); }
>>
>> We can follow up and make a much more efficient version of this. Fine to land like this I suppose.
>
> I’m happy to do this optimization after this lands.

\o/

>> Source/WebCore/editing/markup.cpp:248
>> +    m_reversedPrecedingMarkup.append("<meta charset=\"UTF-8\">");
>
> Should add a _s here, I think. It’s more efficient to create a String from an ASCIILiteral, since it doesn’t copy the characters.

Done!
Wenson Hsieh, Comment 11, 2020-05-08 10:13:16 PDT
Created attachment 398871: Patch for landing
EWS, Comment 12, 2020-05-08 10:35:51 PDT
Committed r261395: <https://trac.webkit.org/changeset/261395>
All reviewed patches have been landed. Closing bug and clearing flags on attachment 398871.
Radar WebKit Bug Importer, Comment 13, 2020-05-08 10:41:59 PDT
<rdar://problem/63027006>
Ryosuke Niwa, Comment 14, 2022-08-02 21:24:31 PDT
Comment on attachment 398819: Address feedback
View in context: https://bugs.webkit.org/attachment.cgi?id=398819&action=review

>>> Source/WebCore/editing/markup.cpp:248
>>> +    m_reversedPrecedingMarkup.append("<meta charset=\"UTF-8\">");
>>
>> Should add a _s here, I think. It’s more efficient to create a String from an ASCIILiteral, since it doesn’t copy the characters.
>
> Done!

This patch broke WebKit's ability to save XHTML documents. This element will cause a parsing error in an XHTML document.