Bug 74514 - Need a way to produce leaner markup when pasting a fragment containing verbose markup
Summary: Need a way to produce leaner markup when pasting a fragment containing verbos...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: HTML Editing (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Enrica Casucci
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2011-12-14 10:20 PST by Enrica Casucci
Modified: 2011-12-14 16:34 PST (History)
3 users (show)

See Also:


Attachments
Patch (16.51 KB, patch)
2011-12-14 10:53 PST, Enrica Casucci
rniwa: review-
Details | Formatted Diff | Diff
Patch 2 (19.90 KB, patch)
2011-12-14 14:33 PST, Enrica Casucci
rniwa: review+
webkit.review.bot: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Enrica Casucci 2011-12-14 10:20:30 PST
When performing a copy/paste operation, the fragment copied is taken as is and only minimal changes are performed during pasting to meet the stylistic requirements.
If the original fragment contains verbose markup, the result of the paste operation will contain the same verbose markup.

<rdar://problem/10208653>
Comment 1 Enrica Casucci 2011-12-14 10:53:26 PST
Created attachment 119248 [details]
Patch
Comment 2 Ryosuke Niwa 2011-12-14 11:02:33 PST
Comment on attachment 119248 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=119248&action=review

> Source/WebCore/editing/ReplaceSelectionCommand.cpp:376
> +    , m_sanitizeFragment(options & SanitizeFragment)

Why can't we use this option always?

> Source/WebCore/editing/ReplaceSelectionCommand.cpp:572
> +            if (currentNode && currentNode->parentNode() != rootNode && currentNode->firstChild() == currentNode->lastChild() && currentNode->hasTagName(divTag)) {
> +                const NamedNodeMap* attributeMap = currentNode->attributes();
> +                if (!attributeMap || attributeMap->isEmpty())
> +                    nodesToRemove.append(currentNode);
> +            }
> +
> +            currentNode = currentNode->parentNode();
> +            if (!currentNode->renderer() || !currentNode->renderer()->isRenderInline() || toRenderInline(currentNode->renderer())->alwaysCreateLineBoxes())
> +                continue;

I think you want to reuse the logic in isStyleSpanOrSpanWithOnlyStyleAttribute.

> Source/WebCore/editing/ReplaceSelectionCommand.cpp:580
> +            if (currentNode->renderStyle()->diff(startingStyle, context) == StyleDifferenceEqual)

You should be able to reuse the logic in EditingStyle::removeStyleAddedByNode.
Comment 3 Ryosuke Niwa 2011-12-14 12:05:47 PST
Comment on attachment 119248 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=119248&action=review

> Source/WebCore/editing/ReplaceSelectionCommand.cpp:574
> +            if (currentNode && currentNode->firstChild() != currentNode->lastChild()) {

This condition is going to be true when there are elements with display: none or collapsed whitespace.
By the way, currentNode here can't be null, right? If that were the case, the previous statement would have crashed.
Comment 4 Enrica Casucci 2011-12-14 12:18:27 PST
(In reply to comment #3)
> 
> Why can't we use this option always?
> 
ReplaceSelectionCommand is used extensively in the editing code and we don't want to always modify
the fragment, since we don't know at this level where it originated.
For example it could be coming from insertHTML and we don't want to change that.

> I think you want to reuse the logic in isStyleSpanOrSpanWithOnlyStyleAttribute.

It does not implement the same logic. Here I wan't to consider only inline elements that don't always require a line box.
I was to be able to handle correctly the case of the markup below:

<i style="border: solid red"><i style="border: solid red">hello</i></i>

The renderstyle is the same, but we cannot remove one of the elements without affecting the visual appearance.

> > Source/WebCore/editing/ReplaceSelectionCommand.cpp:580
> > +            if (currentNode->renderStyle()->diff(startingStyle, context) == StyleDifferenceEqual)
> 
> You should be able to reuse the logic in EditingStyle::removeStyleAddedByNode.
Comparing render styles is more efficient than what removeStyleAddedByNode does.
Comment 5 Enrica Casucci 2011-12-14 12:23:03 PST
(In reply to comment #3)
> (From update of attachment 119248 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=119248&action=review
> 
> > Source/WebCore/editing/ReplaceSelectionCommand.cpp:574
> > +            if (currentNode && currentNode->firstChild() != currentNode->lastChild()) {
> 
> This condition is going to be true when there are elements with display: none or collapsed whitespace.
> By the way, currentNode here can't be null, right? If that were the case, the previous statement would have crashed.
It should not be null. Yes, in that case we will not trim the markup but it is a trade-off I did for performance to avoid using VisiblePosition or looking at renderers and do a more complex analysis of the markup.
Comment 6 Enrica Casucci 2011-12-14 14:33:41 PST
Created attachment 119299 [details]
Patch 2
Comment 7 Ryosuke Niwa 2011-12-14 14:48:31 PST
Comment on attachment 119299 [details]
Patch 2

View in context: https://bugs.webkit.org/attachment.cgi?id=119299&action=review

> Source/WebCore/editing/ReplaceSelectionCommand.cpp:555
> +        if (node->firstChild())
> +            continue;
> +        if (node->isTextNode() && node->nextSibling())
> +            continue;

We should combine these ifs.

> Source/WebCore/editing/ReplaceSelectionCommand.cpp:563
> +        while (currentNode != rootNode) {

Can we just check currentNode->parentNode() != rootNode here instead?
(of course, add if (currentNode == rootNode) continue; outside of the loop)

> Source/WebCore/editing/ReplaceSelectionCommand.cpp:578
> +                topNodeWithStartingStyle = currentNode;

Why can't we just add currentNode to nodesToRemove here?
Comment 8 Enrica Casucci 2011-12-14 15:13:06 PST
(In reply to comment #7)
> (From update of attachment 119299 [details])
> We should combine these ifs.
> 
Sure.
> > Source/WebCore/editing/ReplaceSelectionCommand.cpp:563
> > +        while (currentNode != rootNode) {
> 
> Can we just check currentNode->parentNode() != rootNode here instead?
> (of course, add if (currentNode == rootNode) continue; outside of the loop)
> 
No, because I want to exclude the top element if it is a block, I still want to apply the logic below if it is an inline.

> > Source/WebCore/editing/ReplaceSelectionCommand.cpp:578
> > +                topNodeWithStartingStyle = currentNode;
> 
> Why can't we just add currentNode to nodesToRemove here?
Because if I encounter an element with more than one child up the chain and I don't want to perform any cleanup and leave everything as is. Otherwise I would have to go back and remove all the node I've previously added to the nodesToRemove list.
Comment 9 WebKit Review Bot 2011-12-14 15:16:03 PST
Comment on attachment 119299 [details]
Patch 2

Attachment 119299 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/10875454

New failing tests:
editing/pasteboard/paste-text-013.html
editing/pasteboard/paste-text-014.html
Comment 10 Ryosuke Niwa 2011-12-14 15:22:36 PST
Comment on attachment 119299 [details]
Patch 2

View in context: https://bugs.webkit.org/attachment.cgi?id=119299&action=review

>>> Source/WebCore/editing/ReplaceSelectionCommand.cpp:563
>>> +        while (currentNode != rootNode) {
>> 
>> Can we just check currentNode->parentNode() != rootNode here instead?
>> (of course, add if (currentNode == rootNode) continue; outside of the loop)
> 
> No, because I want to exclude the top element if it is a block, I still want to apply the logic below if it is an inline.

Ah, I see. Got it.

>>> Source/WebCore/editing/ReplaceSelectionCommand.cpp:578
>>> +                topNodeWithStartingStyle = currentNode;
>> 
>> Why can't we just add currentNode to nodesToRemove here?
> 
> Because if I encounter an element with more than one child up the chain and I don't want to perform any cleanup and leave everything as is. Otherwise I would have to go back and remove all the node I've previously added to the nodesToRemove list.

Isn't still valuable to remove all empty descendants in that case?
Comment 11 Ryosuke Niwa 2011-12-14 15:23:10 PST
(In reply to comment #9)
> (From update of attachment 119299 [details])
> Attachment 119299 [details] did not pass chromium-ews (chromium-xvfb):
> Output: http://queues.webkit.org/results/10875454
> 
> New failing tests:
> editing/pasteboard/paste-text-013.html
> editing/pasteboard/paste-text-014.html

These failures may reproduce on Windows (probably fails on non-Mac).
Comment 12 Enrica Casucci 2011-12-14 15:28:59 PST
(In reply to comment #11)
> (In reply to comment #9)
> > (From update of attachment 119299 [details] [details])
> > Attachment 119299 [details] [details] did not pass chromium-ews (chromium-xvfb):
> > Output: http://queues.webkit.org/results/10875454
> > 
> > New failing tests:
> > editing/pasteboard/paste-text-013.html
> > editing/pasteboard/paste-text-014.html
> 
> These failures may reproduce on Windows (probably fails on non-Mac).

I'll put in the new test results for these platform.
Comment 13 Ryosuke Niwa 2011-12-14 15:35:10 PST
Comment on attachment 119299 [details]
Patch 2

View in context: https://bugs.webkit.org/attachment.cgi?id=119299&action=review

Okay, I think we can land as it, assuming you fix the address the index vs. iterator issue below. Thanks for the clarifications! Please ping me if you needed any help in rebaselining chromium tests.

> Source/WebCore/editing/ReplaceSelectionCommand.cpp:587
> +    for (Vector<Node*>::const_iterator it = nodesToRemove.begin(); it != nodesToRemove.end(); ++it)

Please use integral index instead of iterator as discussed on https://lists.webkit.org/pipermail/webkit-dev/2011-October/018274.html.

> LayoutTests/editing/pasteboard/paste-and-sanitize.html:39
> +testPaste("div", "<div><b><i><span style=\"font-weight: normal\"><b><i>Hello</i></b></span></i></b></div>", "<b><i>Hello</i></b>");

You may want to use single quotation to avoid having to cast ".

> LayoutTests/editing/pasteboard/paste-and-sanitize.html:47
> +          "<i style=\"margin-top: 10px; margin-right: 10px; margin-bottom: 10px; margin-left: 10px; \"><b><i style=\"margin-top: 10px; margin-right: 10px; margin-bottom: 10px; margin-left: 10px; \">hello</i></b></i>");

We really need to make cssText() use shorthand notations :(
Comment 14 Enrica Casucci 2011-12-14 15:58:56 PST
> > Source/WebCore/editing/ReplaceSelectionCommand.cpp:587
> > +    for (Vector<Node*>::const_iterator it = nodesToRemove.begin(); it != nodesToRemove.end(); ++it)
> 
will do.

> Please use integral index instead of iterator as discussed on https://lists.webkit.org/pipermail/webkit-dev/2011-October/018274.html.
> 
> > LayoutTests/editing/pasteboard/paste-and-sanitize.html:39
> > +testPaste("div", "<div><b><i><span style=\"font-weight: normal\"><b><i>Hello</i></b></span></i></b></div>", "<b><i>Hello</i></b>");
> 
> You may want to use single quotation to avoid having to cast ".
>
ok. 
> > LayoutTests/editing/pasteboard/paste-and-sanitize.html:47
> > +          "<i style=\"margin-top: 10px; margin-right: 10px; margin-bottom: 10px; margin-left: 10px; \"><b><i style=\"margin-top: 10px; margin-right: 10px; margin-bottom: 10px; margin-left: 10px; \">hello</i></b></i>");
> 
> We really need to make cssText() use shorthand notations :(
Yes, I've already looked at it and it seems pretty straightforward. I added this test line also for the purpose of showing the problem and having to fix the test result when we change cssText.
Comment 15 Enrica Casucci 2011-12-14 16:14:19 PST
> > You may want to use single quotation to avoid having to cast ".
> >
> ok. 
I cannot, since it breaks this line

shouldBe("confirmedMarkup", "'" + expected + "'")

that uses single quote.
Comment 16 Enrica Casucci 2011-12-14 16:34:05 PST
http://trac.webkit.org/changeset/102846