This bug was detected on Chromium: http://code.google.com/p/chromium/issues/detail?id=17023 The chromium crash reporting tool identified a few different crash stacks that fail on the same method. Here is one such stack trace: 0x6bc0fc59 [chrome.dll- node.cpp:781]WebCore::Node::nodeIndex() 0x6bc00f1e [chrome.dll- rangeboundarypoint.h:100] WebCore::RangeBoundaryPoint::offset() 0x6bc0360f [chrome.dll- range.cpp:1762] WebCore::boundaryTextNodesMerged 0x6bbe863e [chrome.dll- document.cpp:2718] WebCore::Document::textNodesMerged(WebCore::Text *,unsigned int) 0x6bc0f9d2 [chrome.dll - node.cpp:622] WebCore::Node::normalize() 0x6bc838cf [chrome.dll- v8node.cpp:236] WebCore::NodeInternal::normalizeCallback 0x6c25d0a5 [chrome.dll- builtins.cc:381] v8::internal::Builtin_HandleApiCall 0x6c25a1cf [chrome.dll + 0x006ba1cf] 0x030e6d9c I haven't been able to reproduce this crash, but I suspect this had to do with simultaneous Range Selection and Node editing.
Created attachment 38516 [details] more crash reports Crash reports suggest that user interaction involving range selection and dom change trigger the crashes.
Digging into the bug, I suspect that the crashes happen when m_childBeforeBoundary is a deallocated node, or a sibling of a deallocated node. I believe that there are code paths that deallocate nodes without notifying the range objects. One thing that caught my eye is that RangeBoundaryPoint::m_childBeforeBoundary is a Node *, while m_containerNode is a RefPtr<Node>. Shouldn't m_childBeforeBoundary be RefPtr<Node> too?
Might be related to: https://bugs.webkit.org/show_bug.cgi?id=20471
Also crashes on mac: 0x029731e0 Node.h:124] WebCore::Node::nodeIndex() const 0x029892ff RangeBoundaryPoint.h:89] WebCore::Range::firstNode() const 0x02e4cfed webframe_impl.cc:1663] WebFrameImpl::SetFindEndstateFocusAndSelection() 0x02e4d237 webframe_impl.cc:1191] WebFrameImpl::stopFinding(bool) 0x02117a31 render_view.cc:759] RenderView::OnStopFinding(bool) 0x021305b0 ../base/tuple.h:422] bool IPC::MessageWithTuple<Tuple1<bool> >::Dispatch<RenderView, void (RenderView::*)(bool)>(IPC::Message const*, RenderView*, void (RenderView::*)(bool)) 0x0212adbd render_view.cc:376] RenderView::OnMessageReceived(IPC::Message const&) 0x01d95c90 message_router.cc:41] MessageRouter::RouteMessage(IPC::Message const&) 0x025820aa ../base/tuple.h:422] RunnableMethod<IPC::ChannelProxy::Context, void (IPC::ChannelProxy::Context::*)(IPC::Message const&), Tuple1<IPC::Message> >::Run() 0x0218ec4a message_loop.cc:314] MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) 0x0218effa message_loop.cc:429] MessageLoop::DoWork() 0x0219276b message_pump_mac.mm:217] base::MessagePumpCFRunLoopBase::RunWorkSource(void*)
Since mac & windows, changes platform to all.
I don't think it's very useful to aggregate crashes in nodeIndex() - as you write in comment 2, these crashes occur when it's called incorrectly, which can happen in entirely different scenarios.
Not being called correctly is just a hypothesis that has not been proven. I collected the crash reports since they are the only clues for reproducing this (and the last one proved its also on mac).
Created attachment 42577 [details] Crash seen in Mac Safari
Reproducible crasher with reduction. Marking P1.
The reduction is in the chromium bug. I am attempting to reduce it further and attach it here.
Created attachment 42594 [details] reduction. Not a layout test yet.
Created attachment 42596 [details] Slightly better reduction
Created attachment 42598 [details] Even better reduction with more source comments. I think there are multiple bugs at play here. I expect that the range crasher could be reproduced w/o needing any copy event, but I've not been able to do so yet. I think the cloneContents() is cloneing the <script> tag as well. I suspect the document.write() From the script tag is executing during the appendChild(), and possibly causing the rest of the document to "fail to insert". Thus the Nodes may be being "removed from the document" in a way that the Range does not expect and thus the range is never getting updated. I'm not sure yet. Darin might have a theory. I'll CC him.
Created attachment 42601 [details] Even better reduction which highlights that the document.write is executed twice.
OK, so the Range definitely contains a pointer to a deleted Node. My current theory is that Document::removeChildren() (which is actually Container::removeChildren()) called from Document::implicitOpen() is removing all the root children, but that the root's grandchildren are not having willRemoveChild called, and thus the Document never learns that they're being removed and thus ranges which point to anything deep in the tree end up with invalid Node pointers. I'll work to validate this theory tomorrow.
Ok, I found the bug. In removeChildren(): // Do any prep work needed before actually starting to detach // and remove... e.g. stop loading frames, fire unload events. // FIXME: Adding new children from event handlers can cause an infinite loop here. for (RefPtr<Node> n = m_firstChild; n; n = n->nextSibling()) willRemoveChild(n.get()); willRemoveChild() is correctly calling void Range::nodeWillBeRemoved(Node* node) However, since it's removing all the nodes at once, when the matching node is found: boundary.setToBeforeChild(nodeToBeRemoved); is called, which does: m_childBeforeBoundary = child->previousSibling(); however, in this case, nodeToBeRemoved->previousSibling() is also being removed (and in fact, just had willRemoveChild() called for it) but since removeChildren is iterating from first to last node, Range will never notice, and happily set m_childBeforeBoundary to a node which is about to be removed! I'm not yet sure what the proper fix is.
document->nodeWillBeRemoved(child); is just not designed to be called repeatedly like this. NodeIterator looks like it could have the same problem depending on if NodePointer isPointerBeforeNode or not. One solution would be to move the document->nodeWillBeRemoved(child); call out of dispatchChildRemovalEvents and put it next to where the node will actually be removed.
Created attachment 43893 [details] A LayoutTest ready reduction (will crash Safari)
The "easy" solution would be to get rid of removeChildren() all together. Since the model of removing lots of children at once does not fit with the: document->nodeWillBeRemoved(child); model. nodeWillBeRemoved assumes that the node in question is the *only* node about to be removed, and that all siblings and parents are going to stay around. We could have a separate callback in this case which is nodesChildrenWillBeRemoved? That would fix the Range case. Really not sure. Going to sleep on it. I'm still curious what Darin's opinion here is.
So this crash happens anytime we have a Range which points into a subtree which is removed by removeChildren(). Which is actually a rather common call, so although I don't think we've seen a lot of these crashers, it could be rather common. I'll see if I can come up with a fix this evening.
Created attachment 44006 [details] First attempt at a fix.
Comment on attachment 44006 [details] First attempt at a fix. I don't really like this fix because of all the duplicated code. I also think that this needs a NodeIterator test case before it can be landed, but I'm posting it for review now in case those playing along at home have fundamental architectural complaints that I need to address. We could land this as-is, but I also think it could definitely be better. I'm most interested in hearing from Darin Adler, since he has worked in this code extensively.
style-queue ran check-webkit-style on attachment 44006 [details] without any errors.
Comment on attachment 44006 [details] First attempt at a fix. > -static ExceptionCode willRemoveChild(Node *child) > +static void willRemoveChild(const RefPtr<Node>& child) There's no reason to make this a const RefPtr<Node>& instead of a Node*. I suggest you change it back. > +static void willRemoveChildrenFromNode(const RefPtr<ContainerNode> node) I think this should be called willRemoveChildren. I think the argument should be a ContainerNode* and be named container. > + // FIXME: Adding new children from event handlers can cause an infinite loop here. > + for (RefPtr<Node> child = node->firstChild(); child; child = child->nextSibling()) { > + // fire removed from document mutation events. > + dispatchChildRemovalEvents(child.get()); > + > + if (child->attached()) > + child->willRemove(); > + } Instead of keeping this possible bug alive, maybe we should just gather all the children into a local stack buffer. You could do this: Vector<RefPtr<Node>, 32> children; for (Node* child = node->firstChild(); child; child = child->nextSibling()) children.append(child); And then use the children array for the events and such. But you could do that in another patch I guess. > + ASSERT(!eventDispatchForbidden()); > + > RefPtr<Node> c = child; > RefPtr<Document> document = child->document(); > > - // update auxiliary doc info (e.g. iterators) to note that node is being removed > - document->nodeWillBeRemoved(child); > - > - document->incDOMTreeVersion(); > + // document->incDOMTreeVersion() and document->nodeWillBeRemoved() should be > + // handled by the caller as this function may be called in a loop. This comment would make sense in a change log entry, but does not make sense in the code that remains. If you want to add a comment about calling nodeWillBeRemoved and incDOMTreeVersion, that should probably go before the function rather than in the middle of the implementation. > + for (HashSet<NodeIterator*>::const_iterator it = m_nodeIterators.begin(); it != nodeIteratorsEnd; ++it) > + for (Node* n = container->firstChild(); n; n = n->nextSibling()) > + (*it)->nodeWillBeRemoved(n); This needs braces. review- because I think you should do at least one tweak
<rdar://problem/7946578>
Eric, do you intend to finish this fix in the near future?
I'm not currently working on this, nor do I plan to return to it in the near future.
Created attachment 55438 [details] a more reliable test (will crash) The test from proposed patch didn't always crash for me, but this one does.
Created attachment 55600 [details] proposed fix Addressed Darin comments, fixed the code to avoid entering an infinite loop.
Thank you ap.
Committed <http://trac.webkit.org/changeset/59098>. Pelase file separate bugs for any remaining crashes in Node::nodeIndex().