Bug 19524

Summary: Able to move nodes across documents
Product: WebKit Reporter: danceoffwithyourpantsoff <danceoffwithyourpantsoff>
Component: DOMAssignee: Ojan Vafai <ojan>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, arv, cdumez, emacemac7, eric, Ms2ger, ojan
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Attachments:
Description Flags
Patch darin: review+

Description danceoffwithyourpantsoff 2008-06-12 15:43:56 PDT
Webkit currently enables a compat hack that allows you to move nodes between documents.  Here is the situation overall:

Node::checkAddChild special cases a node that is not inDocument() to be compatible with Mozilla and Mac IE.  The compatibility is because window.Option internally creates a element using document.createElement(), meaning the node's m_document is window.document.  Apparently some people would use the Options() constructor, and then try to add this node to another document (say an iframe).  Mozilla allows you to move nodes between documents, so this works, but I argue this is completely broken.  I have no idea about Mac IE.  Windows IE does not allow you to move nodes between documents, including the Options case.

Also, the hack is incorrect.  The logic really means more of "wasNeverInADocument" and not "inDocument".  If you add a node to a document, you can remove it, and then add it to a different document, which is clearly incorrect.  Adding a wasEverInADocument flag is also not correct, since this allows any elements created via document.createElement, etc, to be attached to a different document.

Also, checkAddChild will walk the nodes and setDocument it and all the children, which is completely not the correct way to adopt a child.  checkAddChild should simply do a check, like the name implies, and no modification should happen here! (really this should be a const method).

There are other bugs involving adopting nodes, but I would like to clean up the code so there is only a single place where node adoption can occur (document.adoptNode) so that the code will be more clear and we can fix the adoption problems through a single path.  I believe this is correct in terms of the spec and the web.  I don't think there are many cases of this new Options problem, specially seeing that it doesn't work in IE.
Comment 1 Alexey Proskuryakov 2008-06-12 21:45:37 PDT
Could you please attach a test case?

I suspect that this behavior is by design (as there doesn't seem to be a compelling reason to refuse moving nodes between documents), but having a test case would make the discussion more focused.
Comment 2 danceoffwithyourpantsoff 2008-06-13 10:45:41 PDT
http://danceoffwithyourpantsoff.googlepages.com/webkit-adopt.html

In Mozilla and Opera, you won't get an alert, because it allows you to move nodes between documents.

In IE, you will get an alert, and then a script error, because it never allows you to move nodes between documents.

In Webkit, you will get an alert first, but then because of the compat hack, you can still move the node across documents as long as you remove it first, so you will see that div moves inside the iframe.

I would argue that you shouldn't never be able to move nodes across documents, unless you adoptNode().  This is the point of adoptNode().

If we decide that we should be able to move nodes across documents, then

1) The case of moving a connected node (which now exceptions) should be made to work, like it does in Mozilla and Opera.

2) We should not do the "adopting" in checkAddChild, this is clearly the wrong place to do it, and the adopting is clearly implemented incorrectly.  These paths should share the same code paths as adoptChild().  Adopting needs to be handled carefully, and it should only be done in one place so we can make sure the implementation handles this consistently.
Comment 3 Darin Adler 2008-09-17 14:46:32 PDT
(In reply to comment #2)
> I would argue that you shouldn't never be able to move nodes across documents,
> unless you adoptNode(). This is the point of adoptNode().

Yes, that's the point of adoptNode, but, adoptNode was designed by a DOM working group that was not thinking much about the web.

It's a mistake to drive our design based on the existence of adoptNode. I think that changing the code to not allow moving nodes between documents creates unnecessary differences between WebKit and Gecko, so I'd only want to do it if there was a real compelling reason other than the abstract argument.

> 1) The case of moving a connected node (which now exceptions) should be made to
> work, like it does in Mozilla and Opera.

OK.

> 2) We should not do the "adopting" in checkAddChild, this is clearly the wrong
> place to do it, and the adopting is clearly implemented incorrectly. These
> paths should share the same code paths as adoptChild().

Sounds good.

> Adopting needs to be handled carefully, 

"Carefully" doesn't really mean anything.

> it should only be done in one place so we can make sure
> the implementation handles this consistently.

What we need most are enough test cases to show whether we have this right.

How much the code is shared is driven by multiple factors; generally more sharing is better, unless certain code paths are much hotter than others.
Comment 4 Alexey Proskuryakov 2009-02-16 02:48:04 PST
This is mostly covered by bug 4569 and bug 12339.
Comment 5 Ojan Vafai 2010-08-27 18:34:37 PDT
*** Bug 44639 has been marked as a duplicate of this bug. ***
Comment 6 Ojan Vafai 2010-08-27 18:59:35 PDT
Created attachment 65803 [details]
Patch
Comment 7 Darin Adler 2010-08-28 19:41:59 PDT
Comment on attachment 65803 [details]
Patch

Change seems worthwhile. But we ought to do better than calling this a "willful violation of the spec". We should bring this up in the appropriate forum and get the specifications fixed. Unless the specification in question is moribund and can't be practically revised.

> +    // FIXME: Remove this code. Given the above check that the element owns this attribute.
> +    // It should not be possible that it is in a different document.
>      if (document() != attr->document()) {
> -        ec = WRONG_DOCUMENT_ERR;
> -        return 0;
> +        ASSERT_NOT_REACHED();
> +        document()->adoptNode(attr, ec);
> +        if (ec)
> +            return 0;
>      }

I don't understand why you are checking in new unreachable code. There is no reason to change the unreachable code! You can remove it if you like, or leave it in if you don't want to deal with that right now, but please don't check in new untestable code!

> +        if (newChild->inDocument())
> +            document()->adoptNode(newChild, ec);
> +        else {
> +            for (Node* node = newChild; node; node = node->traverseNextNode(newChild))
> +                node->setDocument(document());
> +        }

It's kind of irritating that adoptNode has an out argument of an exception. Is there really any possibility of an exception here? I'd prefer to see this refactored so we aren't calling our external entry point here. adoptNode would only be the external entry point.

I also don't understand exactly why, if newChild returns false to inDocument, adoptNode can't be called. Can't a node with inDocument returning false nonetheless be visible to JavaScript? If so, then doesn't it have to work if the JavaScript calls adoptNode on it?

> +    if (newChild->document() != document()) {
> +        if (newChild->inDocument())
> +            document()->adoptNode(newChild, ec);
> +        else {
> +            for (Node* node = newChild; node; node = node->traverseNextNode(newChild))
> +                node->setDocument(document());
> +        }
> +    }

Quite inelegant to have this code repeated twice. The right way to handle that is to have an inline function called in both places.

I don’t see the pattern of where you could just call adoptNode and where you needed to do more work.

The test cases don't seem to cover all the code paths. You tested each function, but there's the inDocument and non-inDocument cases, so we'd need at least two tests for each of those call sites.

I'm going to say review+, but I think there should be more test cases, and better factoring of the code to have a truly great fix here.
Comment 8 Ojan Vafai 2010-08-30 12:25:03 PDT
(In reply to comment #7)
> Change seems worthwhile. But we ought to do better than calling this a "willful violation of the spec". We should bring this up in the appropriate forum and get the specifications fixed. Unless the specification in question is moribund and can't be practically revised.

I've sent an email to www-dom, but I'm skeptical we could get this changed without putting a significant effort into this. This spec has been in the W3C recommendation state since 2004 and, as far as I know, does not currently have an active editor. http://www.w3.org/TR/DOM-Level-3-Core/

> > +    // FIXME: Remove this code. Given the above check that the element owns this attribute.
> > +    // It should not be possible that it is in a different document.
> >      if (document() != attr->document()) {
> > -        ec = WRONG_DOCUMENT_ERR;
> > -        return 0;
> > +        ASSERT_NOT_REACHED();
> > +        document()->adoptNode(attr, ec);
> > +        if (ec)
> > +            return 0;
> >      }
> 
> I don't understand why you are checking in new unreachable code. There is no reason to change the unreachable code! You can remove it if you like, or leave it in if you don't want to deal with that right now, but please don't check in new untestable code!

I was thinking that leaving in the assert would catch if we somehow did hit this case, but I'm comfortable with just turning that whole block into an assert.

> > +        if (newChild->inDocument())
> > +            document()->adoptNode(newChild, ec);
> > +        else {
> > +            for (Node* node = newChild; node; node = node->traverseNextNode(newChild))
> > +                node->setDocument(document());
> > +        }
> 
> It's kind of irritating that adoptNode has an out argument of an exception. Is there really any possibility of an exception here? I'd prefer to see this refactored so we aren't calling our external entry point here. adoptNode would only be the external entry point.
> 
> I also don't understand exactly why, if newChild returns false to inDocument, adoptNode can't be called. Can't a node with inDocument returning false nonetheless be visible to JavaScript? If so, then doesn't it have to work if the JavaScript calls adoptNode on it?

> I don’t see the pattern of where you could just call adoptNode and where you needed to do more work.

adoptNode does more than just setDocument. Most notably it calls setRemainsAliveOnRemovalFromTree on iframes. I was hesitant to make that change in behavior, but now that I think about it, this should be safe since currently the appendChild/insertBefore call would throw an error. I'll change these two cases to just call adoptNode.

> The test cases don't seem to cover all the code paths. You tested each function, but there's the inDocument and non-inDocument cases, so we'd need at least two tests for each of those call sites.

Will add tests.
Comment 9 Ojan Vafai 2010-08-30 12:25:28 PDT
www-dom discussion: http://lists.w3.org/Archives/Public/www-dom/2010JulSep/0107.html
Comment 10 Ojan Vafai 2010-08-30 12:45:44 PDT
Missed a comment:

> > +        if (newChild->inDocument())
> > +            document()->adoptNode(newChild, ec);
> > +        else {
> > +            for (Node* node = newChild; node; node = node->traverseNextNode(newChild))
> > +                node->setDocument(document());
> > +        }
> 
> It's kind of irritating that adoptNode has an out argument of an exception. Is there really any possibility of an exception here? I'd prefer to see this refactored so we aren't calling our external entry point here. adoptNode would only be the external entry point.

From inspecting the code, it looks like an exception could only happen if trying to move a DocumentType node across documents. I'm not sure what the right behavior is there. Gecko throws an error in that case, so I was matching them for now. But maybe it makes sense to move that check higher up? I'll post a patch with that.
Comment 11 Eric Seidel (no email) 2010-10-13 12:24:52 PDT
Attachment 65803 [details] was posted by a committer and has review+, assigning to Ojan Vafai for commit.
Comment 12 Alexey Proskuryakov 2010-10-13 12:44:59 PDT
Ojan, could you please update bug 4569 and bug 12339 with current status once you land this?
Comment 13 Ms2ger (he/him; ⌚ UTC+1/+2) 2010-10-13 12:55:41 PDT
You can drop "This is a violation of the DOM spec", by the way; the new spec (<http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html>) is more web-compatible.
Comment 14 Eric Seidel (no email) 2010-12-20 22:40:51 PST
@ojan?
Comment 15 Ojan Vafai 2010-12-28 19:29:45 PST
Sorry, this needs a new patch. I realized that this patch doesn't deal well with mutation events. There are crashes with mutation events currently that this patch gives more code paths to. Now that we can scope mutation events though it should be straightforward to address this. I'll try to post a new patch next week.
Comment 16 Ojan Vafai 2011-03-10 00:03:50 PST
Due to Dimitri's refactoring of setDocument --> setDocumentRecursively, we can now call setDocumentRecursively instead of adoptNode (addressing Darin's factoring concerns) and mutation events are properly handled. I've added a bunch more more tests. Will commit shortly.
Comment 17 Ojan Vafai 2011-03-10 00:33:33 PST
Committed r80696: <http://trac.webkit.org/changeset/80696>
Comment 18 Lucas Forschler 2019-02-06 09:02:45 PST
Mass moving XML DOM bugs to the "DOM" Component.