I've run into a bug where WebKit based browsers are throwing away whitespace during a copy operation (For CF_HTML based copying but not for text copying).
The issue occurs when wrapping between text and certain tags e.g. "The <b>fat</b> cat <i>sat</i> on the mat" will be copied as "The <b>fat</b>cat <i>sat</i> on the mat" if word wrapping occurs just before cat and "The <b>fat</b> cat<i>sat</i> on the mat" if word wrapping occurs just before sat, etc.
Problem verified on:
Chrome - latest stable as well as an older version
Safari for windows
I've tracked this issue down to line 545 in TextIterator.cpp:
unsigned runEnd = min(textBoxEnd, end);
textBoxEnd is one less then the actual end of the text, so this causes the last character (the wrapped space) not to be considered by createMarkup.
As an experiment I have changed the line to:
unsigned runEnd = max(textBoxEnd, end);
which resolves the problem.
This is of course probably not the actual way to fix the problem, unfortunately I do not have the knowledge of this code required to produce a proper fix/patch but it should at least give a good starting point of where to look.
I can't reproduce this bug. Could you give us exact reproduction steps?
Created attachment 98462 [details]
Basic html sample with which bug can be reproduced
1) Open minimal sample "wrapbug.html"
2) resize browser window so that line wrapping occurs between one of the bolded words and a normal word
3) highlight all the text
5) paste the text into any application that supports the CF_HTML clipboard format instead of the plain text format (bug does not occur in plain text as that uses different code to generate) - for most obvious results ensure that the target application has a different window size so that wrapping is not the same.
6) observe that in the pasted text the bolded word and the normal word are now next to each other with no spacing
8) View clipboard contents directly using a raw clipboard viewer - e.g."ClipView.exe"
9) Observe that there is no spacing in the raw CF_HTML on the clipboard either, so it is definitely a bug in generating the CF_HTML and not a bug in e.g. the receiving application reading the CF_HTML wrong.
The plain text generation does use TextIterator, so it’s strange that a TextIterator bug would affect CF_HTML but not plain text.
I think the problem comes from the fact createMarkup uses renderer to serialize text nodes and we don't render trailing whitespace (collapse them). We'd either have to special-case it in StylizedMarkupAccumulator or create inline boxes for unrendered trailing whitespace.
This bug is related to https://bugs.webkit.org/show_bug.cgi?id=54598.
The test case below failed on latest webkit.
The problem is StyledMarkupAccumulator uses renderedText and the space after the text has been stripped when the following tag was wrapped.
The following test case failed on latest webkit.
The problem is StyledMarkupAccumulator uses renderedText and the space at the end of the text has been stripped when the tag after the text was wrapped.
Created attachment 223249 [details]
Comment on attachment 223249 [details]
View in context: https://bugs.webkit.org/attachment.cgi?id=223249&action=review
This output doesn't tell us anything about the nature of the test.
> +expected = 'Copy this text <a href="http://www.google.com">AVeryLongWordThatWillWrap</a>';
> +actual = paste.innerHTML;
> +if (window.testRunner && actual == expected)
> + document.body.innerText = "Success";
> + document.body.innerText = "Failure. Found: " + actual + ", but expected: " + expected;
Please use dump-as-markup, etc... to output the serialized markup
so that expected result is self explanatory.
> + The problem is StyledMarkupAccumulator uses renderedText and the space at
> + the end of the text has been stripped when the tag after the text was wrapped.
Please add inline function description as to how you're fixing this bug.
> + Reviewed by NOBODY (OOPS!).
This line should appear after the bug URL but before the long description.
> + if (!nextTextBox && m_isNotLastNode && (str.length() == runEnd + 1)) // Restore space if followed by a non-text box
> + subrunEnd++;
It's not clear how incrementing subrunEnd++ ends up restoring space.
Also, it's better to define a local boolean like followedByNonTextBox instead of adding a comment like this.
> + behavior = TextIteratorIsNotLastNode;
I don't think this behavior is descriptive. TextIterator isn't a node. It iterates over nodes.
Looks like we need a behavior where text iterator behaves as if there are more nodes??
Created attachment 223968 [details]
fix patch 2
new patch to address the review comments.
Comment on attachment 223968 [details]
fix patch 2
Clearing flags on attachment: 223968
Committed r164047: <http://trac.webkit.org/changeset/164047>
All reviewed patches have been landed. Closing bug.