Bug 112240 - execCommand("RemoveFormat") might remove format after the selection
Summary: execCommand("RemoveFormat") might remove format after the selection
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: HTML Editing (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Claudio Saavedra
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-03-13 02:42 PDT by Claudio Saavedra
Modified: 2013-04-17 03:28 PDT (History)
5 users (show)

See Also:


Attachments
Patch (6.94 KB, patch)
2013-04-16 11:26 PDT, Claudio Saavedra
no flags Details | Formatted Diff | Diff
Patch (7.77 KB, patch)
2013-04-16 11:51 PDT, Claudio Saavedra
no flags Details | Formatted Diff | Diff
Patch for landing (8.34 KB, patch)
2013-04-17 03:00 PDT, Claudio Saavedra
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Claudio Saavedra 2013-03-13 02:42:20 PDT
This looks to me like a bug. If we have the following HTML:

    "<q><b><div>hello world</div></b>WebKit</q>"

and we select everything from "world" until right before "WebKit", execCommand("RemoveFormat") will yield the following HTML:

    "<div><q><b>hello </b></q>world</div>WebKit"

This is inconsistent with the behavior of "RemoveFormat" otherwise. Take the following too cases:

1) If we select only "world" before applying "RemoveFormat", the <q> tag is placed also around "WebKit":

    "<div><q><b>hello </b></q>world</div><q>WebKit</q>"

2) If we select "world</div><q>W", then the <q> tag is placed around "ebKit" too:

    "<div><q><b>hello </b></q>world</div>W<q>ebKit</q>"
Comment 1 Claudio Saavedra 2013-03-13 03:18:34 PDT
This seems to happen always when selection goes until right after the closing </div>.
Comment 2 Claudio Saavedra 2013-03-19 01:21:07 PDT
The following test seems to be affected by what could be the same, though I am not entirely sure: editing/style/5046875-1.html

In that test, we have:

<div id="div" contenteditable="true">foo<div>bar<br>baz</div>

Selecting past the <br> but before "baz" and then applying execCommand("JustifyCenter") applies centering to "baz" as well, even when that text is unselected.
Comment 3 Claudio Saavedra 2013-03-19 01:24:06 PDT
(In reply to comment #2)
> The following test seems to be affected by what could be the same, though I am not entirely sure: editing/style/5046875-1.html
> 
> In that test, we have:
> 
> <div id="div" contenteditable="true">foo<div>bar<br>baz</div>
> 
> Selecting past the <br> but before "baz" and then applying execCommand("JustifyCenter") applies centering to "baz" as well, even when that text is unselected.

Well, I tried with LibreOffice and this is the outcome there, so perhaps this is desired and unrelated to the bug as described in comment 0.
Comment 4 Ryosuke Niwa 2013-03-19 01:34:32 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > The following test seems to be affected by what could be the same, though I am not entirely sure: editing/style/5046875-1.html
> > 
> > In that test, we have:
> > 
> > <div id="div" contenteditable="true">foo<div>bar<br>baz</div>
> > 
> > Selecting past the <br> but before "baz" and then applying execCommand("JustifyCenter") applies centering to "baz" as well, even when that text is unselected.
> 
> Well, I tried with LibreOffice and this is the outcome there, so perhaps this is desired and unrelated to the bug as described in comment 0.

That indeed is an intended behavior.
Comment 5 Claudio Saavedra 2013-04-11 08:30:52 PDT
I've spent some time learning my way through the editing code and I've found a couple of things that might be leading to this bug.

In order to explain this let's use a simpler HTML fragment

  <b><div>Foo</div>Bar</b>

for which, selection starts right before "Foo" and ends right before "Bar" (using square brackets, that would look like this):

  <b><div>[Foo</div>]Bar</b>

All of the following happens in WebCore::ApplyStyleCommand::removeInlineStyle(). The first thing I've noticed is that in that method, pushDownInlineStyleAroundNode() is called for both the start and the end node. The reason why this is done is to ensure that we preserve the style to be removed around the selection. However, for the start node, it is checked whether the current position is in the end of the containing node, in which case what is used to push down the style is the next visually distinct candidate:

    // If the pushDownStart is at the end of a text node, then this node is not fully selected.
    // Move it to the next deep quivalent position to avoid removing the style from this node.
    // e.g. if pushDownStart was at Position("hello", 5) in <b>hello<div>world</div></b>, we want Position("world", 0) instead.

In the test case we have, there is an analogous situation but for the end node: pushDownEnd is at the beginning of a text node, therefore this node is not really selected. I think, in this case, it makes sense to use the previous visually distinct candidate to push down the style (I have tested this single change locally and all editing tests pass).

After doing that change, and after running pushDownInlineStyleAroundNode() for both the pushDownStart and pushDownNode nodes, the tree is changed to the following:

  <div>Foo</div><b>Bar</b>

This is already looking good (without these changes, we would instead already have "<div>Foo</div>Bar"). Now comes the next oddity in the code. The block of code that follows takes care of traversing the tree from the start until the end nodes, and removing the format for all nodes in between, as long as they are fully selected. This is ensured by a call to

  nodeFullySelected(node, start, end)

for each HTML element node in the range, including the <B> one that has been pushed down. In particular, for the <B> node, one would expect nodeFullySelected() to return false, as the position where the selection ends is _before_ the last position in the node. Contrarily to this, the function returns true and the <B> node is removed, yielding

  <div>Foo</div>Bar

as the resulting HTML.

I have yet to find why nodeFullySelected() is returning true here, but I wanted to share this before continuing.
Comment 6 Claudio Saavedra 2013-04-12 08:01:40 PDT
> I have yet to find why nodeFullySelected() is returning true here, but I wanted to share this before continuing.

I think I've found an answer to this question. This method makes two comparisons to find whether the given node is fully selected. For nodeFullySelected(node, start, end), we are interested in the second comparison, which in the code has the following form:

  comparePositions(lastPositionInOrAfterNode(node).upstream(), end) <= 0

lastPositionInOrAfterNode(node), where node is <B>, as in our example, will return a position that is anchored after all children, however, calling upstream() in this position ends up losing the AnchorType. I suspect this is related to the following comment in WebCore::Position::upstream():

    // FIXME: PositionIterator should respect Before and After positions.

When a Position p has an AnchorType of PositionIsAfterChildren, the position p.upstream() will just have an anchor type of PositionIsOffsetInAnchor but with offset 0. That's why later Range::compareBoundaryPoints() will return the wrong value comparePositions() will return -1.

Just removing the call to upstream() and using the Position returned by lastPositionInOrAfterNode() works around the issue, but I haven't investigated yet whether this will break something.
Comment 7 Claudio Saavedra 2013-04-12 09:57:08 PDT
So, to make this clearer, this is p = lastPositionInOrAfterNode(node):

  {m_anchorNode = {m_ptr = 0xaa6b00}, m_offset = 0, m_anchorType = 4, m_isLegacyEditingPosition = false}

and this is p.upstream():

  {m_anchorNode = {m_ptr = 0xaa6b00}, m_offset = 0, m_anchorType = 0, m_isLegacyEditingPosition = false}

node is <B> with a text child of length != 0. p is after the child (m_anchorType = 4 means PositionIsAfterChildren) while p.upstream() is *before* the child (m_anchorType = 0 means PositionIsOffsetInAnchor with m_offset = 0).
Comment 8 Claudio Saavedra 2013-04-15 08:53:26 PDT
(In reply to comment #7)
> So, to make this clearer, this is p = lastPositionInOrAfterNode(node):
> 
>   {m_anchorNode = {m_ptr = 0xaa6b00}, m_offset = 0, m_anchorType = 4, m_isLegacyEditingPosition = false}
> 
> and this is p.upstream():
> 
>   {m_anchorNode = {m_ptr = 0xaa6b00}, m_offset = 0, m_anchorType = 0, m_isLegacyEditingPosition = false}
> 
> node is <B> with a text child of length != 0. p is after the child (m_anchorType = 4 means PositionIsAfterChildren) while p.upstream() is *before* the child (m_anchorType = 0 means PositionIsOffsetInAnchor with m_offset = 0).

Apparently this happens because upstream() is iterating through all of the text node instead of returning the last position in the text node, as it should.

First iteration:

{m_anchorNode = 0x787ae0,	m_nodeAfterPositionInAnchor = 0x0, m_offsetInAnchor = 1}

with anchor node the node with the B tag and the offset in anchor indicating that we are after the child text node.

Second iteration, after the iterator is decremented:

 {m_anchorNode = 0xaad7e0,	m_nodeAfterPositionInAnchor = 0x0, m_offsetInAnchor = 3}

with node now pointing to the text node which is a child of the B tag, and given offset points to the position after the "Foo" text. Unless I am understanding the semantics of WebCore::Position::upstream() totally wrong, *this* should be the position returned by this method. However, this is not the case, and the iteration continues until the iterator moves to a position that is visually distinct and the last visible position is returned instead.
Comment 9 Claudio Saavedra 2013-04-16 07:01:12 PDT
So the only thing that seems to make sense to me is that in Position::upstream(),

          if (renderer->isText() && toRenderText(renderer)->firstTextBox()) {

firstTextBox() should be returning something else than 0 but for whatever reason there are no text boxes in the renderer (yet?). Should the render tree be updated somehow after the DOM is changed and this is why now this is not working?
Comment 10 Claudio Saavedra 2013-04-16 08:31:26 PDT
Running  node->document()->updateLayoutIgnorePendingStylesheets(); updates the TextRender. I'll attach a patch in brief.
Comment 11 Claudio Saavedra 2013-04-16 11:26:36 PDT
Created attachment 198353 [details]
Patch
Comment 12 Ryosuke Niwa 2013-04-16 11:32:05 PDT
Comment on attachment 198353 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198353&action=review

Thanks for the fix.

> Source/WebCore/ChangeLog:7
> +

Please explain why the bug was caused and how you fixed it akin to what you've posted on the bug.

> Source/WebCore/editing/ApplyStyleCommand.cpp:1093
> +    if (pushDownEndContainer && pushDownEndContainer->isTextNode()
> +        && !pushDownEnd.computeOffsetInContainerNode())

We can probably fit all of this in one line.

> Source/WebCore/editing/ApplyStyleCommand.cpp:1108
> +    // The tree is changed and the code below, in particular calls to Position::upstream/downstream(),

Nit: The tree may have changed.
Where are we using upstream/downstream?

> Source/WebCore/editing/ApplyStyleCommand.cpp:1109
> +    // rely in an up-to-date layout.

Nit: rely on.
Comment 13 Claudio Saavedra 2013-04-16 11:51:41 PDT
Created attachment 198354 [details]
Patch
Comment 14 Ryosuke Niwa 2013-04-16 11:55:20 PDT
Comment on attachment 198354 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198354&action=review

> Source/WebCore/editing/ApplyStyleCommand.cpp:1094
>  

We should probably assert that start & end are not in a middle of text node either.
If they were, then we failed to split the text nodes properly.

> Source/WebCore/editing/ApplyStyleCommand.cpp:1109
> +    // The tree may have changed and Position::upstream(), which is used in nodeFullySelected(), relies
> +    // on an up-to-date layout.
> +    node->document()->updateLayoutIgnorePendingStylesheets();

On my second thought, we probably need to call this inside nodeFullySelected.
Comment 15 Claudio Saavedra 2013-04-17 02:39:36 PDT
Comment on attachment 198354 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198354&action=review

>> Source/WebCore/editing/ApplyStyleCommand.cpp:1094
>>  
> 
> We should probably assert that start & end are not in a middle of text node either.
> If they were, then we failed to split the text nodes properly.

I'm adding a FIXME for the time being since I am not entirely sure on how to safely check this. I am inclined to do something like this:

    ASSERT(!start.containerNode()->isTextNode() || start.offsetInContainerNode() == start.containerNode()->maxCharacterOffset());
    ASSERT(!end.containerNode()->isTextNode() || end.offsetInContainerNode() == end.containerNode()->maxCharacterOffset());

If you can confirm this is right I'll add it later. Perhaps a helper method would make this cleaner?
Comment 16 Claudio Saavedra 2013-04-17 03:00:34 PDT
Created attachment 198491 [details]
Patch for landing
Comment 17 WebKit Commit Bot 2013-04-17 03:28:06 PDT
Comment on attachment 198491 [details]
Patch for landing

Clearing flags on attachment: 198491

Committed r148597: <http://trac.webkit.org/changeset/148597>
Comment 18 WebKit Commit Bot 2013-04-17 03:28:08 PDT
All reviewed patches have been landed.  Closing bug.