Bug 34564 - Copying can result in span around block elements on the clipboard
Summary: Copying can result in span around block elements on the clipboard
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: HTML Editing (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Ryosuke Niwa
URL:
Keywords:
Depends on: 45449 46545 60988 61454 61466 61595 61597 61599 61602 61605 61607 61608 61610 61614 61616 61617 61618 61680 61681 61682 61683 61887 65824 65833
Blocks: 42850 62041 12248 12319 24009 26483
  Show dependency treegraph
 
Reported: 2010-02-04 00:02 PST by Tony Chang
Modified: 2011-12-28 16:45 PST (History)
10 users (show)

See Also:


Attachments
work in progress (12.44 KB, patch)
2011-05-24 19:22 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
work in progress 2 (88.70 KB, patch)
2011-05-26 15:59 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
work in progress 3 (42.24 KB, patch)
2011-05-27 14:12 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
work in progress 4 (39.91 KB, patch)
2011-05-27 17:30 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
work in progress 5 (46.19 KB, patch)
2011-06-02 23:41 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
alpha version (53.11 KB, patch)
2011-06-03 17:05 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
beta version (51.06 KB, patch)
2011-06-03 23:06 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
Patch (65.48 KB, patch)
2011-06-04 00:01 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
updated (59.54 KB, patch)
2011-08-05 20:37 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
rebaselined more tests for chromium (67.49 KB, patch)
2011-08-06 17:43 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
updated after r92620 (64.06 KB, patch)
2011-08-08 15:10 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
updated after r92695 (41.40 KB, patch)
2011-08-09 16:07 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
reverted 2 bad rebaselines (39.83 KB, patch)
2011-08-09 16:20 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
Updated per comment #38 (40.18 KB, patch)
2011-08-10 18:45 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
Renamed enum values (40.13 KB, patch)
2011-08-10 18:46 PDT, Ryosuke Niwa
tony: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tony Chang 2010-02-04 00:02:22 PST
Bug 26937 is related in that the pasted html is invalid (with spans around divs), but even if we fix bug 26937 at paste time, we shouldn't be putting invalid HTML in the clipboard.

Comments 5+ on bug 26937 describe the problem in more detail.  Comment 7 has an example that will generate invalid HTML on the clipboard.
Comment 1 Ryosuke Niwa 2010-09-09 02:58:49 PDT
This bug seems to be still causing a lot of problems in pasting side.  I have a plan to fix this issue in several iterations:
1. Isolate the code to find the special ancestors into a function, and call this function before calling serializeNodes.
2. Modify ReplaceSelectionCommand so that it'll remove duplicated style from all top-level nodes in the fragment.
3. Add an ability for serializeNodes to annotate document + editing style on each top-level nodes in the fragment.  We'll use this only if there is no special common ancestor.
4. Stop wrapping styles in createMarkup
Comment 2 Ryosuke Niwa 2011-05-24 19:22:32 PDT
Created attachment 94727 [details]
work in progress
Comment 3 Annie Sullivan 2011-05-24 19:39:05 PDT
Can you give an example of how styles will get applied to top-level nodes? For example, if I have the following HTML:

<ul>
  <li>Line One</li>
  <li>Line Two</li>
</ul>

And the text should be red, would it look like this:

<ul>
  <li><font color="ff0000">Line One</font></li>
  <li><font color="ff0000">Line Two</font></li>
</ul>

Or like this:

<ul style="color:red">
  <li>Line One</li>
  <li>Line Two</li>
</ul>

Or something else? Do I understand correctly that we currently wrap the <font> or <span> tag around the entire <ul>?
Comment 4 Ryosuke Niwa 2011-05-24 19:41:41 PDT
The latter.  We'd have to include all computed style in inline declaration when we copy and remove them all when we paste.
Comment 5 Annie Sullivan 2011-05-25 09:55:44 PDT
(In reply to comment #4)
> The latter.  We'd have to include all computed style in inline declaration when we copy and remove them all when we paste.

The latter is complicated for a few reasons:

1. Some styles apply differently to block elements than to inline ones. For example, setting background-color on the ul will look very different than wrapping the list items in background-color styled spans.

2. The former behavior is how WebKit currently behaves if you select the list in a contenteditable region and use execCommand to apply a style. It's confusing for web developers to have the two different ways styles can be applied (of course, the current span-wrapping is worse).
Comment 6 Ryosuke Niwa 2011-05-25 10:04:08 PDT
(In reply to comment #5)
> 1. Some styles apply differently to block elements than to inline ones. For example, setting background-color on the ul will look very different than wrapping the list items in background-color styled spans.

Sure but the style is coming from the ancestors, and we apply the style on children when we paste anyways.

> 2. The former behavior is how WebKit currently behaves if you select the list in a contenteditable region and use execCommand to apply a style. It's confusing for web developers to have the two different ways styles can be applied (of course, the current span-wrapping is worse).

This is irrelevant because we'll strip all redundant styles in paste side.  And it's impossible to get the same behavior between serialization code and ApplyStyleCommand because we can't modify DOM when we serialize DOM.

New approach will only be visible to websites if they manually interrupt paste event and grab the clipboard content directly but then they need to deal with different kinds of markup anyways now.
Comment 7 Ryosuke Niwa 2011-05-26 15:59:38 PDT
Created attachment 95063 [details]
work in progress 2

Smart replace is broken and there's something wrong with XML copy & paste right now but this patch is a good representative of where I'm heading.
Comment 8 Ryosuke Niwa 2011-05-26 18:03:52 PDT
This patch requires the following tests be rebaselined.  I'll covert them to text-based tests first to make the correctness clear.
editing/pasteboard/4700297.html
editing/pasteboard/5065605.html
editing/pasteboard/5089327.html
editing/pasteboard/5483567.html
editing/pasteboard/emacs-ctrl-k-y-001.html
editing/pasteboard/interchange-newline-2.html
editing/pasteboard/merge-end-3.html
editing/pasteboard/merge-end-5.html
editing/pasteboard/merge-end-blockquote.html
editing/pasteboard/merge-end-borders.html
editing/pasteboard/merge-end-list.html
editing/pasteboard/merge-end-table.html
editing/pasteboard/paste-line-endings-007.html
editing/pasteboard/paste-line-endings-008.html
editing/pasteboard/paste-line-endings-009.html
editing/pasteboard/paste-text-004.html
editing/pasteboard/paste-text-005.html
editing/pasteboard/paste-text-008.html
editing/pasteboard/paste-text-009.html
editing/pasteboard/paste-text-010.html
editing/pasteboard/smart-paste-001.html
editing/pasteboard/smart-paste-003.html
editing/pasteboard/smart-paste-006.html
editing/pasteboard/smart-paste-007.html
editing/pasteboard/smart-paste-008.html
Comment 9 Ryosuke Niwa 2011-05-26 21:07:11 PDT
More tests to convert:

editing/pasteboard/5027857.html
editing/pasteboard/interchange-newline-1.html
editing/pasteboard/merge-end-1.html
editing/pasteboard/merge-end-2.html
editing/pasteboard/paste-blockquote-3.html
editing/pasteboard/paste-noscript-xhtml.xhtml
editing/pasteboard/paste-noscript.html
editing/pasteboard/paste-pre-001.html
editing/pasteboard/paste-pre-002.html
editing/pasteboard/paste-table-001.html
editing/pasteboard/paste-text-001.html
editing/pasteboard/paste-text-002.html
editing/pasteboard/paste-text-003.html
editing/pasteboard/paste-unrendered-select.html
editing/pasteboard/smart-paste-002.html
editing/pasteboard/smart-paste-004.html
editing/pasteboard/smart-paste-005.html

These are actually more important because the correctness of rebaselines for these tests is less obvious.
Comment 10 Ryosuke Niwa 2011-05-27 14:12:02 PDT
Created attachment 95214 [details]
work in progress 3

This approach is flawed as it turns out.  I need to remove UA default styles when we serialize markup instead of removing them in the paste side.
Comment 11 Ryosuke Niwa 2011-05-27 17:30:24 PDT
Created attachment 95235 [details]
work in progress 4

Okay, now only 34 tests fail and 2 tests crash. I'll have to split changes and see which change is causing which failure now.  Also, we'd have to be more clever about avoiding presentational elements when we're pasting contents.  e.g. when we copy & paste <b><i>hello</i></b> at the same place, we'll end up nesting them like <b><i><b style="font-weight: normal; font-style: normal;"><i>hello</i></b></i></b>.  Maybe I'll file a blocker and fix it separately.
Comment 12 Ryosuke Niwa 2011-06-02 23:41:03 PDT
Created attachment 95857 [details]
work in progress 5

There are 3-4 regressions and 8 layout tests fail but there are no crashes so it's getting there.  I'm hoping to get this ready for a review by the end of next week.
Comment 13 Ryosuke Niwa 2011-06-03 17:05:36 PDT
Created attachment 95991 [details]
alpha version

This patch works in the sense that there are no obvious regressions.  I'll try cleaning up the code and submit for a review on Monday.
Comment 14 WebKit Review Bot 2011-06-03 17:10:07 PDT
Attachment 95991 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/editing/deleting/deleting-line..." exit_code: 1

Source/WebCore/editing/EditingStyle.h:126:  The parameter name "element" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 1 in 19 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 15 Ryosuke Niwa 2011-06-03 23:06:27 PDT
Created attachment 96010 [details]
beta version

Cleaned up the patch.  This is ready for a review once I write the change log entries.
Comment 16 Ryosuke Niwa 2011-06-04 00:01:27 PDT
Created attachment 96013 [details]
Patch
Comment 17 Ryosuke Niwa 2011-06-12 15:08:13 PDT
Can someone review my patch?
Comment 18 Eric Seidel (no email) 2011-06-14 08:35:39 PDT
Comment on attachment 96013 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=96013&action=review

> LayoutTests/editing/execCommand/insert-list-with-noneditable-content-expected.txt:7
> -|       style="background-color: LightGray;"
> +|       style="background-color: rgb(211, 211, 211); "

Is this intentional?

> LayoutTests/editing/pasteboard/5065605-expected.txt:47
>  |   <font>
>  |     class="Apple-style-span"
>  |     color="#ff0000"
> -|     <span>
> +|     <font>
>  |       class="Apple-style-span"
> +|       color="#ff0000"
> +|       "This text should be red."
> +|     <div>

The fonts are now nested?  Yet neither of them provides any font information?

> LayoutTests/editing/pasteboard/display-block-on-spans-expected.txt:33
>  |   <b>
> -|     "This<#selection-caret>"
> -|   <b>
> +|     <b>
> +|       "This<#selection-caret>"

The bolds are now nested?

> LayoutTests/editing/pasteboard/paste-pre-001-expected.txt:10
> -execPasteCommand: <div id="test" class="editing"> <pre><span class="Apple-style-span" style="font-family: Times; white-space: normal; "><pre>foo bar</pre></span></pre> </div>
> +execPasteCommand: <div id="test" class="editing"> <pre><pre>foo bar</pre></pre> </div>

Is this change intentional?
Comment 19 Ryosuke Niwa 2011-06-14 09:26:50 PDT
Comment on attachment 96013 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=96013&action=review

>> LayoutTests/editing/execCommand/insert-list-with-noneditable-content-expected.txt:7
>> +|       style="background-color: rgb(211, 211, 211); "
> 
> Is this intentional?

As I mentioned in the change log, this is visually identical.  It's just that we're preserving background color in different code path.

>> LayoutTests/editing/pasteboard/5065605-expected.txt:47
>> +|     <div>
> 
> The fonts are now nested?  Yet neither of them provides any font information?

Not quite.  We used to nest font, span, font, div, font, div font.  Now we only have font, font, div, font.

>> LayoutTests/editing/pasteboard/display-block-on-spans-expected.txt:33
>> +|       "This<#selection-caret>"
> 
> The bolds are now nested?

Yeah, sad but we should be able to improve it in follow ups.

>> LayoutTests/editing/pasteboard/paste-pre-001-expected.txt:10
>> +execPasteCommand: <div id="test" class="editing"> <pre><pre>foo bar</pre></pre> </div>
> 
> Is this change intentional?

Yes, this span wasn't doing anything useful because pre overrode both font-family and white-space.
Comment 20 Ryosuke Niwa 2011-06-15 10:46:55 PDT
Ping reviewers.
Comment 21 Ryosuke Niwa 2011-06-21 13:51:31 PDT
Any reviewers?
Comment 22 Ryosuke Niwa 2011-08-02 17:35:28 PDT
Ping reviewers again.
Comment 23 Tony Chang 2011-08-05 15:23:10 PDT
Comment on attachment 96013 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=96013&action=review

I would like to review this patch, but it is too large for me to comprehend.  Is it possible to break it into smaller patches?  Here's one suggestion on how to do that:

a) a patch for adding inline style declarations to top level elements (we would still wrap in an apple style span)
b) a patch for wraping top level text nodes with a style span (maybe in addition to the apple style span)
c) a patch to remove apple style span since it is now redundant

I admit that it would make the resulting html worse temporarily (since we're duplicating styles), but hopefully that wouldn't be more than a few weeks.

> LayoutTests/editing/pasteboard/paste-text-with-style-expected.txt:40
> +|     <div>
> +|       style="font-style: normal; font-weight: normal; "
> +|       <i>
> +|         "hello italic"

In this test, why did the div end up inside the <i> further up the DOM?  This test seems to have gone from valid html to invalid html.
Comment 24 Ryosuke Niwa 2011-08-05 16:52:00 PDT
I talked with Tony about this and I'm going to split this patch into two pieces:
1. Add style removal logic to RelaceSelectionCommand.  We can even add tests to ensure redundant styles are not added here.
2. Update serialization code not to use wrapping span.

I'll first update the existing patch for ToT.
Comment 25 Ryosuke Niwa 2011-08-05 17:59:47 PDT
It appears that this patch is now broken due to various changes in ReplaceSelectionCommand.
Comment 26 Ryosuke Niwa 2011-08-05 19:09:09 PDT
Apparently, this was caused by r92330 and I found various regressions caused by r92330 while investigating the issue.  It turned out that this bug really blocks the bug 26483.
Comment 27 Ryosuke Niwa 2011-08-05 20:37:48 PDT
Created attachment 103144 [details]
updated
Comment 28 WebKit Review Bot 2011-08-05 21:15:05 PDT
Comment on attachment 103144 [details]
updated

Attachment 103144 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9328055

New failing tests:
fast/events/ondrop-text-html.html
editing/pasteboard/paste-pre-002.html
editing/pasteboard/paste-code-in-pre.html
editing/pasteboard/onpaste-text-html.html
editing/pasteboard/paste-text-011.html
editing/pasteboard/paste-pre-001.html
editing/pasteboard/data-transfer-items.html
Comment 29 Ryosuke Niwa 2011-08-06 17:43:25 PDT
Created attachment 103171 [details]
rebaselined more tests for chromium
Comment 30 WebKit Review Bot 2011-08-07 06:41:01 PDT
Comment on attachment 103171 [details]
rebaselined more tests for chromium

Attachment 103171 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9329470

New failing tests:
fast/events/ondrop-text-html.html
editing/pasteboard/paste-code-in-pre.html
Comment 31 Ryosuke Niwa 2011-08-08 15:10:42 PDT
Created attachment 103299 [details]
updated after r92620
Comment 32 Ryosuke Niwa 2011-08-08 15:12:06 PDT
Comment on attachment 103299 [details]
updated after r92620

Most of rebaselines in this patch are due to the bug 65824 so the final patch should be even smaller than this.
Comment 33 WebKit Review Bot 2011-08-08 15:50:04 PDT
Comment on attachment 103299 [details]
updated after r92620

Attachment 103299 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9333178

New failing tests:
fast/events/ondrop-text-html.html
editing/pasteboard/paste-code-in-pre.html
Comment 34 Ryosuke Niwa 2011-08-09 16:07:40 PDT
Created attachment 103412 [details]
updated after r92695
Comment 35 Ryosuke Niwa 2011-08-09 16:16:46 PDT
Comment on attachment 103412 [details]
updated after r92695

Oops, I included 2 bad rebaselines.  Uploading a new patch in a minute.
Comment 36 Ryosuke Niwa 2011-08-09 16:20:26 PDT
Created attachment 103414 [details]
reverted 2 bad rebaselines
Comment 37 Ryosuke Niwa 2011-08-09 16:25:06 PDT
I've decided to keep code that deals with style spans in ReplaceSelectionCommand to paste contents copied by old versions of WebKit.  We can probably kill it in the future once this version become dominant.
Comment 38 Tony Chang 2011-08-10 16:59:24 PDT
Comment on attachment 103412 [details]
updated after r92695

View in context: https://bugs.webkit.org/attachment.cgi?id=103412&action=review

Sorry, comments on the previous patch, but I think they still apply.

> Source/WebCore/editing/markup.cpp:139
> +    Node* traverseNodesForSerialization(Node* startNode, Node* pastEnd, bool shouldEmit);

It would be nice if shouldEmit was an enum.

> Source/WebCore/editing/markup.cpp:223
> +        // <rdar://problem/5371536>

Nit: Maybe add FIXME: ?

> Source/WebCore/editing/markup.cpp:224
> +        // Make sure spans are inline style in paste side e.g. span { display: block }

Nit: Period at the end of the comment.

> Source/WebCore/editing/markup.cpp:226
> +        // FIXME: should this be included in forceInline?

Nit: Capitalize |should|.

> Source/WebCore/editing/markup.cpp:339
> +        Node* lastClosed = traverseNodesForSerialization(startNode, pastEnd, false);

It looks like we traverse the nodes for serialization twice now (once before) so we can determine the highest node to be serialized.  Do you think that will have a perf impact?
Comment 39 Ryosuke Niwa 2011-08-10 17:23:33 PDT
Comment on attachment 103412 [details]
updated after r92695

View in context: https://bugs.webkit.org/attachment.cgi?id=103412&action=review

>> Source/WebCore/editing/markup.cpp:139
>> +    Node* traverseNodesForSerialization(Node* startNode, Node* pastEnd, bool shouldEmit);
> 
> It would be nice if shouldEmit was an enum.

That's a good idea.

>> Source/WebCore/editing/markup.cpp:223
>> +        // <rdar://problem/5371536>
> 
> Nit: Maybe add FIXME: ?

Yeah, it should be
// FIXME: <rdar://problem/5371536> Style rules that match pasted content can change it's appearance
instead

>> Source/WebCore/editing/markup.cpp:224

> 
> Nit: Period at the end of the comment.

Oops, will fix.

>> Source/WebCore/editing/markup.cpp:226
>> +        // FIXME: should this be included in forceInline?
> 
> Nit: Capitalize |should|.

Will fix.

>> Source/WebCore/editing/markup.cpp:339
>> +        Node* lastClosed = traverseNodesForSerialization(startNode, pastEnd, false);
> 
> It looks like we traverse the nodes for serialization twice now (once before) so we can determine the highest node to be serialized.  Do you think that will have a perf impact?

I don't think it will.  Sure it'll double the traversal time but I suspect that we spend much more time generating string and concatenating them.
Comment 40 Ryosuke Niwa 2011-08-10 18:45:16 PDT
Created attachment 103567 [details]
Updated per comment #38
Comment 41 Ryosuke Niwa 2011-08-10 18:46:58 PDT
Created attachment 103568 [details]
Renamed enum values
Comment 42 Tony Chang 2011-08-10 19:04:57 PDT
Comment on attachment 103568 [details]
Renamed enum values

View in context: https://bugs.webkit.org/attachment.cgi?id=103568&action=review

> Source/WebCore/ChangeLog:34
> +        with shouldEmit set false first to compute the highest node to be serialized. The second call to

Nit: Use enum name.

> Source/WebCore/ChangeLog:37
> +        Outputs string only if shouldEmit is true.

Nit: Ditto.

> Source/WebCore/editing/markup.cpp:221
> +    bool parentIsTextarea = text->parentElement() && text->parentElement()->tagQName() == textareaTag;
> +    bool wrappingSpan = shouldApplyWrappingStyle(text) && !parentIsTextarea;

FWIW, the usage of const bool for local variables seems kind of arbitrary.  I don't think there's a style guide rule one way or another (and I don't have a strong preference, I find them not that useful but not that noisy either), but it would be nice if we were at least consistent within a file.  E.g., you could make these const.

> Source/WebCore/editing/markup.cpp:291
> +    bool shouldAnnotateOrForceInline = element->isHTMLElement() && (shouldAnnotate() || addDisplayInline);
> +    bool shouldOverrideStyleAttr = shouldAnnotateOrForceInline || shouldApplyWrappingStyle(element);

...these could be const too.
Comment 43 Ryosuke Niwa 2011-08-10 22:45:12 PDT
Thanks for the review!  I'm very excited to land this patch.

(In reply to comment #42)
> > Source/WebCore/ChangeLog:34
> > +        with shouldEmit set false first to compute the highest node to be serialized. The second call to
> 
> Nit: Use enum name.

Fixed.

> > Source/WebCore/ChangeLog:37
> > +        Outputs string only if shouldEmit is true.
> 
> Nit: Ditto.

Fixed.

> > Source/WebCore/editing/markup.cpp:221
> > +    bool parentIsTextarea = text->parentElement() && text->parentElement()->tagQName() == textareaTag;
> > +    bool wrappingSpan = shouldApplyWrappingStyle(text) && !parentIsTextarea;
> 
> FWIW, the usage of const bool for local variables seems kind of arbitrary.  I don't think there's a style guide rule one way or another (and I don't have a strong preference, I find them not that useful but not that noisy either), but it would be nice if we were at least consistent within a file.  E.g., you could make these const.

Fixed.
Comment 44 Ryosuke Niwa 2011-08-10 23:29:27 PDT
Committed r92823: <http://trac.webkit.org/changeset/92823>
Comment 45 mitz 2011-12-28 16:45:10 PST
(In reply to comment #44)
> Committed r92823: <http://trac.webkit.org/changeset/92823>

This change caused bug 75330.