Bug 19524 - Able to move nodes across documents
: Able to move nodes across documents
Status: RESOLVED FIXED
: WebKit
XML DOM
: 528+ (Nightly build)
: PC Mac OS X 10.5
: P2 Normal
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2008-06-12 15:43 PST by
Modified: 2011-03-10 00:33 PST (History)


Attachments
Patch (17.46 KB, patch)
2010-08-27 18:59 PST, Ojan Vafai
darin: review+
Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2008-06-12 15:43:56 PST
Webkit currently enables a compat hack that allows you to move nodes between documents.  Here is the situation overall:

Node::checkAddChild special cases a node that is not inDocument() to be compatible with Mozilla and Mac IE.  The compatibility is because window.Option internally creates a element using document.createElement(), meaning the node's m_document is window.document.  Apparently some people would use the Options() constructor, and then try to add this node to another document (say an iframe).  Mozilla allows you to move nodes between documents, so this works, but I argue this is completely broken.  I have no idea about Mac IE.  Windows IE does not allow you to move nodes between documents, including the Options case.

Also, the hack is incorrect.  The logic really means more of "wasNeverInADocument" and not "inDocument".  If you add a node to a document, you can remove it, and then add it to a different document, which is clearly incorrect.  Adding a wasEverInADocument flag is also not correct, since this allows any elements created via document.createElement, etc, to be attached to a different document.

Also, checkAddChild will walk the nodes and setDocument it and all the children, which is completely not the correct way to adopt a child.  checkAddChild should simply do a check, like the name implies, and no modification should happen here! (really this should be a const method).

There are other bugs involving adopting nodes, but I would like to clean up the code so there is only a single place where node adoption can occur (document.adoptNode) so that the code will be more clear and we can fix the adoption problems through a single path.  I believe this is correct in terms of the spec and the web.  I don't think there are many cases of this new Options problem, specially seeing that it doesn't work in IE.
------- Comment #1 From 2008-06-12 21:45:37 PST -------
Could you please attach a test case?

I suspect that this behavior is by design (as there doesn't seem to be a compelling reason to refuse moving nodes between documents), but having a test case would make the discussion more focused.
------- Comment #2 From 2008-06-13 10:45:41 PST -------
http://danceoffwithyourpantsoff.googlepages.com/webkit-adopt.html

In Mozilla and Opera, you won't get an alert, because it allows you to move nodes between documents.

In IE, you will get an alert, and then a script error, because it never allows you to move nodes between documents.

In Webkit, you will get an alert first, but then because of the compat hack, you can still move the node across documents as long as you remove it first, so you will see that div moves inside the iframe.

I would argue that you shouldn't never be able to move nodes across documents, unless you adoptNode().  This is the point of adoptNode().

If we decide that we should be able to move nodes across documents, then

1) The case of moving a connected node (which now exceptions) should be made to work, like it does in Mozilla and Opera.

2) We should not do the "adopting" in checkAddChild, this is clearly the wrong place to do it, and the adopting is clearly implemented incorrectly.  These paths should share the same code paths as adoptChild().  Adopting needs to be handled carefully, and it should only be done in one place so we can make sure the implementation handles this consistently.
------- Comment #3 From 2008-09-17 14:46:32 PST -------
(In reply to comment #2)
> I would argue that you shouldn't never be able to move nodes across documents,
> unless you adoptNode(). This is the point of adoptNode().

Yes, that's the point of adoptNode, but, adoptNode was designed by a DOM working group that was not thinking much about the web.

It's a mistake to drive our design based on the existence of adoptNode. I think that changing the code to not allow moving nodes between documents creates unnecessary differences between WebKit and Gecko, so I'd only want to do it if there was a real compelling reason other than the abstract argument.

> 1) The case of moving a connected node (which now exceptions) should be made to
> work, like it does in Mozilla and Opera.

OK.

> 2) We should not do the "adopting" in checkAddChild, this is clearly the wrong
> place to do it, and the adopting is clearly implemented incorrectly. These
> paths should share the same code paths as adoptChild().

Sounds good.

> Adopting needs to be handled carefully, 

"Carefully" doesn't really mean anything.

> it should only be done in one place so we can make sure
> the implementation handles this consistently.

What we need most are enough test cases to show whether we have this right.

How much the code is shared is driven by multiple factors; generally more sharing is better, unless certain code paths are much hotter than others.
------- Comment #4 From 2009-02-16 02:48:04 PST -------
This is mostly covered by bug 4569 and bug 12339.
------- Comment #5 From 2010-08-27 18:34:37 PST -------
*** Bug 44639 has been marked as a duplicate of this bug. ***
------- Comment #6 From 2010-08-27 18:59:35 PST -------
Created an attachment (id=65803) [details]
Patch
------- Comment #7 From 2010-08-28 19:41:59 PST -------
(From update of attachment 65803 [details])
Change seems worthwhile. But we ought to do better than calling this a "willful violation of the spec". We should bring this up in the appropriate forum and get the specifications fixed. Unless the specification in question is moribund and can't be practically revised.

> +    // FIXME: Remove this code. Given the above check that the element owns this attribute.
> +    // It should not be possible that it is in a different document.
>      if (document() != attr->document()) {
> -        ec = WRONG_DOCUMENT_ERR;
> -        return 0;
> +        ASSERT_NOT_REACHED();
> +        document()->adoptNode(attr, ec);
> +        if (ec)
> +            return 0;
>      }

I don't understand why you are checking in new unreachable code. There is no reason to change the unreachable code! You can remove it if you like, or leave it in if you don't want to deal with that right now, but please don't check in new untestable code!

> +        if (newChild->inDocument())
> +            document()->adoptNode(newChild, ec);
> +        else {
> +            for (Node* node = newChild; node; node = node->traverseNextNode(newChild))
> +                node->setDocument(document());
> +        }

It's kind of irritating that adoptNode has an out argument of an exception. Is there really any possibility of an exception here? I'd prefer to see this refactored so we aren't calling our external entry point here. adoptNode would only be the external entry point.

I also don't understand exactly why, if newChild returns false to inDocument, adoptNode can't be called. Can't a node with inDocument returning false nonetheless be visible to JavaScript? If so, then doesn't it have to work if the JavaScript calls adoptNode on it?

> +    if (newChild->document() != document()) {
> +        if (newChild->inDocument())
> +            document()->adoptNode(newChild, ec);
> +        else {
> +            for (Node* node = newChild; node; node = node->traverseNextNode(newChild))
> +                node->setDocument(document());
> +        }
> +    }

Quite inelegant to have this code repeated twice. The right way to handle that is to have an inline function called in both places.

I don’t see the pattern of where you could just call adoptNode and where you needed to do more work.

The test cases don't seem to cover all the code paths. You tested each function, but there's the inDocument and non-inDocument cases, so we'd need at least two tests for each of those call sites.

I'm going to say review+, but I think there should be more test cases, and better factoring of the code to have a truly great fix here.
------- Comment #8 From 2010-08-30 12:25:03 PST -------
(In reply to comment #7)
> Change seems worthwhile. But we ought to do better than calling this a "willful violation of the spec". We should bring this up in the appropriate forum and get the specifications fixed. Unless the specification in question is moribund and can't be practically revised.

I've sent an email to www-dom, but I'm skeptical we could get this changed without putting a significant effort into this. This spec has been in the W3C recommendation state since 2004 and, as far as I know, does not currently have an active editor. http://www.w3.org/TR/DOM-Level-3-Core/

> > +    // FIXME: Remove this code. Given the above check that the element owns this attribute.
> > +    // It should not be possible that it is in a different document.
> >      if (document() != attr->document()) {
> > -        ec = WRONG_DOCUMENT_ERR;
> > -        return 0;
> > +        ASSERT_NOT_REACHED();
> > +        document()->adoptNode(attr, ec);
> > +        if (ec)
> > +            return 0;
> >      }
> 
> I don't understand why you are checking in new unreachable code. There is no reason to change the unreachable code! You can remove it if you like, or leave it in if you don't want to deal with that right now, but please don't check in new untestable code!

I was thinking that leaving in the assert would catch if we somehow did hit this case, but I'm comfortable with just turning that whole block into an assert.

> > +        if (newChild->inDocument())
> > +            document()->adoptNode(newChild, ec);
> > +        else {
> > +            for (Node* node = newChild; node; node = node->traverseNextNode(newChild))
> > +                node->setDocument(document());
> > +        }
> 
> It's kind of irritating that adoptNode has an out argument of an exception. Is there really any possibility of an exception here? I'd prefer to see this refactored so we aren't calling our external entry point here. adoptNode would only be the external entry point.
> 
> I also don't understand exactly why, if newChild returns false to inDocument, adoptNode can't be called. Can't a node with inDocument returning false nonetheless be visible to JavaScript? If so, then doesn't it have to work if the JavaScript calls adoptNode on it?

> I don’t see the pattern of where you could just call adoptNode and where you needed to do more work.

adoptNode does more than just setDocument. Most notably it calls setRemainsAliveOnRemovalFromTree on iframes. I was hesitant to make that change in behavior, but now that I think about it, this should be safe since currently the appendChild/insertBefore call would throw an error. I'll change these two cases to just call adoptNode.

> The test cases don't seem to cover all the code paths. You tested each function, but there's the inDocument and non-inDocument cases, so we'd need at least two tests for each of those call sites.

Will add tests.
------- Comment #9 From 2010-08-30 12:25:28 PST -------
www-dom discussion: http://lists.w3.org/Archives/Public/www-dom/2010JulSep/0107.html
------- Comment #10 From 2010-08-30 12:45:44 PST -------
Missed a comment:

> > +        if (newChild->inDocument())
> > +            document()->adoptNode(newChild, ec);
> > +        else {
> > +            for (Node* node = newChild; node; node = node->traverseNextNode(newChild))
> > +                node->setDocument(document());
> > +        }
> 
> It's kind of irritating that adoptNode has an out argument of an exception. Is there really any possibility of an exception here? I'd prefer to see this refactored so we aren't calling our external entry point here. adoptNode would only be the external entry point.

From inspecting the code, it looks like an exception could only happen if trying to move a DocumentType node across documents. I'm not sure what the right behavior is there. Gecko throws an error in that case, so I was matching them for now. But maybe it makes sense to move that check higher up? I'll post a patch with that.
------- Comment #11 From 2010-10-13 12:24:52 PST -------
Attachment 65803 [details] was posted by a committer and has review+, assigning to Ojan Vafai for commit.
------- Comment #12 From 2010-10-13 12:44:59 PST -------
Ojan, could you please update bug 4569 and bug 12339 with current status once you land this?
------- Comment #13 From 2010-10-13 12:55:41 PST -------
You can drop "This is a violation of the DOM spec", by the way; the new spec (<http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html>) is more web-compatible.
------- Comment #14 From 2010-12-20 22:40:51 PST -------
@ojan?
------- Comment #15 From 2010-12-28 19:29:45 PST -------
Sorry, this needs a new patch. I realized that this patch doesn't deal well with mutation events. There are crashes with mutation events currently that this patch gives more code paths to. Now that we can scope mutation events though it should be straightforward to address this. I'll try to post a new patch next week.
------- Comment #16 From 2011-03-10 00:03:50 PST -------
Due to Dimitri's refactoring of setDocument --> setDocumentRecursively, we can now call setDocumentRecursively instead of adoptNode (addressing Darin's factoring concerns) and mutation events are properly handled. I've added a bunch more more tests. Will commit shortly.
------- Comment #17 From 2011-03-10 00:33:33 PST -------
Committed r80696: <http://trac.webkit.org/changeset/80696>