RESOLVED FIXED Bug 28697
WebKit crash on WebCore::Node::nodeIndex()
https://bugs.webkit.org/show_bug.cgi?id=28697
Summary WebKit crash on WebCore::Node::nodeIndex()
Yaar Schnitman
Reported 2009-08-24 17:27:42 PDT
This bug was detected on Chromium: http://code.google.com/p/chromium/issues/detail?id=17023 The chromium crash reporting tool identified a few different crash stacks that fail on the same method. Here is one such stack trace: 0x6bc0fc59 [chrome.dll- node.cpp:781]WebCore::Node::nodeIndex() 0x6bc00f1e [chrome.dll- rangeboundarypoint.h:100] WebCore::RangeBoundaryPoint::offset() 0x6bc0360f [chrome.dll- range.cpp:1762] WebCore::boundaryTextNodesMerged 0x6bbe863e [chrome.dll- document.cpp:2718] WebCore::Document::textNodesMerged(WebCore::Text *,unsigned int) 0x6bc0f9d2 [chrome.dll - node.cpp:622] WebCore::Node::normalize() 0x6bc838cf [chrome.dll- v8node.cpp:236] WebCore::NodeInternal::normalizeCallback 0x6c25d0a5 [chrome.dll- builtins.cc:381] v8::internal::Builtin_HandleApiCall 0x6c25a1cf [chrome.dll + 0x006ba1cf] 0x030e6d9c I haven't been able to reproduce this crash, but I suspect this had to do with simultaneous Range Selection and Node editing.
Attachments
more crash reports (38.62 KB, text/html)
2009-08-24 17:35 PDT, Yaar Schnitman
no flags
Crash seen in Mac Safari (31.45 KB, text/plain)
2009-11-05 10:35 PST, Eric Seidel (no email)
no flags
reduction. Not a layout test yet. (1020 bytes, text/html)
2009-11-05 14:11 PST, Eric Seidel (no email)
no flags
Slightly better reduction (1.41 KB, text/html)
2009-11-05 14:24 PST, Eric Seidel (no email)
no flags
Even better reduction with more source comments. (1.47 KB, text/html)
2009-11-05 14:40 PST, Eric Seidel (no email)
no flags
Even better reduction which highlights that the document.write is executed twice. (1.38 KB, text/html)
2009-11-05 15:07 PST, Eric Seidel (no email)
no flags
A LayoutTest ready reduction (will crash Safari) (899 bytes, text/plain)
2009-11-25 21:30 PST, Eric Seidel (no email)
no flags
First attempt at a fix. (10.96 KB, patch)
2009-11-29 21:28 PST, Eric Seidel (no email)
darin: review-
a more reliable test (will crash) (1.01 KB, text/html)
2010-05-07 17:13 PDT, Alexey Proskuryakov
no flags
proposed fix (11.08 KB, patch)
2010-05-10 14:05 PDT, Alexey Proskuryakov
darin: review+
Yaar Schnitman
Comment 1 2009-08-24 17:35:54 PDT
Created attachment 38516 [details] more crash reports Crash reports suggest that user interaction involving range selection and dom change trigger the crashes.
Yaar Schnitman
Comment 2 2009-08-24 17:39:52 PDT
Digging into the bug, I suspect that the crashes happen when m_childBeforeBoundary is a deallocated node, or a sibling of a deallocated node. I believe that there are code paths that deallocate nodes without notifying the range objects. One thing that caught my eye is that RangeBoundaryPoint::m_childBeforeBoundary is a Node *, while m_containerNode is a RefPtr<Node>. Shouldn't m_childBeforeBoundary be RefPtr<Node> too?
Yaar Schnitman
Comment 3 2009-08-24 17:40:32 PDT
Yaar Schnitman
Comment 4 2009-09-17 11:47:21 PDT
Also crashes on mac: 0x029731e0 Node.h:124] WebCore::Node::nodeIndex() const 0x029892ff RangeBoundaryPoint.h:89] WebCore::Range::firstNode() const 0x02e4cfed webframe_impl.cc:1663] WebFrameImpl::SetFindEndstateFocusAndSelection() 0x02e4d237 webframe_impl.cc:1191] WebFrameImpl::stopFinding(bool) 0x02117a31 render_view.cc:759] RenderView::OnStopFinding(bool) 0x021305b0 ../base/tuple.h:422] bool IPC::MessageWithTuple<Tuple1<bool> >::Dispatch<RenderView, void (RenderView::*)(bool)>(IPC::Message const*, RenderView*, void (RenderView::*)(bool)) 0x0212adbd render_view.cc:376] RenderView::OnMessageReceived(IPC::Message const&) 0x01d95c90 message_router.cc:41] MessageRouter::RouteMessage(IPC::Message const&) 0x025820aa ../base/tuple.h:422] RunnableMethod<IPC::ChannelProxy::Context, void (IPC::ChannelProxy::Context::*)(IPC::Message const&), Tuple1<IPC::Message> >::Run() 0x0218ec4a message_loop.cc:314] MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) 0x0218effa message_loop.cc:429] MessageLoop::DoWork() 0x0219276b message_pump_mac.mm:217] base::MessagePumpCFRunLoopBase::RunWorkSource(void*)
Yaar Schnitman
Comment 5 2009-09-17 11:47:55 PDT
Since mac & windows, changes platform to all.
Alexey Proskuryakov
Comment 6 2009-09-18 08:21:19 PDT
I don't think it's very useful to aggregate crashes in nodeIndex() - as you write in comment 2, these crashes occur when it's called incorrectly, which can happen in entirely different scenarios.
Yaar Schnitman
Comment 7 2009-10-05 14:45:07 PDT
Not being called correctly is just a hypothesis that has not been proven. I collected the crash reports since they are the only clues for reproducing this (and the last one proved its also on mac).
Eric Seidel (no email)
Comment 8 2009-11-05 10:35:38 PST
Created attachment 42577 [details] Crash seen in Mac Safari
Eric Seidel (no email)
Comment 9 2009-11-05 10:36:05 PST
Reproducible crasher with reduction. Marking P1.
Eric Seidel (no email)
Comment 10 2009-11-05 11:02:13 PST
The reduction is in the chromium bug. I am attempting to reduce it further and attach it here.
Eric Seidel (no email)
Comment 11 2009-11-05 14:11:57 PST
Created attachment 42594 [details] reduction. Not a layout test yet.
Eric Seidel (no email)
Comment 12 2009-11-05 14:24:24 PST
Created attachment 42596 [details] Slightly better reduction
Eric Seidel (no email)
Comment 13 2009-11-05 14:40:24 PST
Created attachment 42598 [details] Even better reduction with more source comments. I think there are multiple bugs at play here. I expect that the range crasher could be reproduced w/o needing any copy event, but I've not been able to do so yet. I think the cloneContents() is cloneing the <script> tag as well. I suspect the document.write() From the script tag is executing during the appendChild(), and possibly causing the rest of the document to "fail to insert". Thus the Nodes may be being "removed from the document" in a way that the Range does not expect and thus the range is never getting updated. I'm not sure yet. Darin might have a theory. I'll CC him.
Eric Seidel (no email)
Comment 14 2009-11-05 15:07:27 PST
Created attachment 42601 [details] Even better reduction which highlights that the document.write is executed twice.
Eric Seidel (no email)
Comment 15 2009-11-25 19:58:03 PST
OK, so the Range definitely contains a pointer to a deleted Node. My current theory is that Document::removeChildren() (which is actually Container::removeChildren()) called from Document::implicitOpen() is removing all the root children, but that the root's grandchildren are not having willRemoveChild called, and thus the Document never learns that they're being removed and thus ranges which point to anything deep in the tree end up with invalid Node pointers. I'll work to validate this theory tomorrow.
Eric Seidel (no email)
Comment 16 2009-11-25 20:18:14 PST
Ok, I found the bug. In removeChildren(): // Do any prep work needed before actually starting to detach // and remove... e.g. stop loading frames, fire unload events. // FIXME: Adding new children from event handlers can cause an infinite loop here. for (RefPtr<Node> n = m_firstChild; n; n = n->nextSibling()) willRemoveChild(n.get()); willRemoveChild() is correctly calling void Range::nodeWillBeRemoved(Node* node) However, since it's removing all the nodes at once, when the matching node is found: boundary.setToBeforeChild(nodeToBeRemoved); is called, which does: m_childBeforeBoundary = child->previousSibling(); however, in this case, nodeToBeRemoved->previousSibling() is also being removed (and in fact, just had willRemoveChild() called for it) but since removeChildren is iterating from first to last node, Range will never notice, and happily set m_childBeforeBoundary to a node which is about to be removed! I'm not yet sure what the proper fix is.
Eric Seidel (no email)
Comment 17 2009-11-25 20:29:59 PST
document->nodeWillBeRemoved(child); is just not designed to be called repeatedly like this. NodeIterator looks like it could have the same problem depending on if NodePointer isPointerBeforeNode or not. One solution would be to move the document->nodeWillBeRemoved(child); call out of dispatchChildRemovalEvents and put it next to where the node will actually be removed.
Eric Seidel (no email)
Comment 18 2009-11-25 21:30:32 PST
Created attachment 43893 [details] A LayoutTest ready reduction (will crash Safari)
Eric Seidel (no email)
Comment 19 2009-11-25 21:40:16 PST
The "easy" solution would be to get rid of removeChildren() all together. Since the model of removing lots of children at once does not fit with the: document->nodeWillBeRemoved(child); model. nodeWillBeRemoved assumes that the node in question is the *only* node about to be removed, and that all siblings and parents are going to stay around. We could have a separate callback in this case which is nodesChildrenWillBeRemoved? That would fix the Range case. Really not sure. Going to sleep on it. I'm still curious what Darin's opinion here is.
Eric Seidel (no email)
Comment 20 2009-11-28 15:01:43 PST
So this crash happens anytime we have a Range which points into a subtree which is removed by removeChildren(). Which is actually a rather common call, so although I don't think we've seen a lot of these crashers, it could be rather common. I'll see if I can come up with a fix this evening.
Eric Seidel (no email)
Comment 21 2009-11-29 21:28:23 PST
Created attachment 44006 [details] First attempt at a fix.
Eric Seidel (no email)
Comment 22 2009-11-29 21:31:16 PST
Comment on attachment 44006 [details] First attempt at a fix. I don't really like this fix because of all the duplicated code. I also think that this needs a NodeIterator test case before it can be landed, but I'm posting it for review now in case those playing along at home have fundamental architectural complaints that I need to address. We could land this as-is, but I also think it could definitely be better. I'm most interested in hearing from Darin Adler, since he has worked in this code extensively.
Adam Barth
Comment 23 2009-11-30 12:49:20 PST
style-queue ran check-webkit-style on attachment 44006 [details] without any errors.
Darin Adler
Comment 24 2009-11-30 16:06:35 PST
Comment on attachment 44006 [details] First attempt at a fix. > -static ExceptionCode willRemoveChild(Node *child) > +static void willRemoveChild(const RefPtr<Node>& child) There's no reason to make this a const RefPtr<Node>& instead of a Node*. I suggest you change it back. > +static void willRemoveChildrenFromNode(const RefPtr<ContainerNode> node) I think this should be called willRemoveChildren. I think the argument should be a ContainerNode* and be named container. > + // FIXME: Adding new children from event handlers can cause an infinite loop here. > + for (RefPtr<Node> child = node->firstChild(); child; child = child->nextSibling()) { > + // fire removed from document mutation events. > + dispatchChildRemovalEvents(child.get()); > + > + if (child->attached()) > + child->willRemove(); > + } Instead of keeping this possible bug alive, maybe we should just gather all the children into a local stack buffer. You could do this: Vector<RefPtr<Node>, 32> children; for (Node* child = node->firstChild(); child; child = child->nextSibling()) children.append(child); And then use the children array for the events and such. But you could do that in another patch I guess. > + ASSERT(!eventDispatchForbidden()); > + > RefPtr<Node> c = child; > RefPtr<Document> document = child->document(); > > - // update auxiliary doc info (e.g. iterators) to note that node is being removed > - document->nodeWillBeRemoved(child); > - > - document->incDOMTreeVersion(); > + // document->incDOMTreeVersion() and document->nodeWillBeRemoved() should be > + // handled by the caller as this function may be called in a loop. This comment would make sense in a change log entry, but does not make sense in the code that remains. If you want to add a comment about calling nodeWillBeRemoved and incDOMTreeVersion, that should probably go before the function rather than in the middle of the implementation. > + for (HashSet<NodeIterator*>::const_iterator it = m_nodeIterators.begin(); it != nodeIteratorsEnd; ++it) > + for (Node* n = container->firstChild(); n; n = n->nextSibling()) > + (*it)->nodeWillBeRemoved(n); This needs braces. review- because I think you should do at least one tweak
Alexey Proskuryakov
Comment 25 2010-05-05 12:02:25 PDT
Alexey Proskuryakov
Comment 26 2010-05-05 12:05:29 PDT
Eric, do you intend to finish this fix in the near future?
Eric Seidel (no email)
Comment 27 2010-05-05 13:35:52 PDT
I'm not currently working on this, nor do I plan to return to it in the near future.
Alexey Proskuryakov
Comment 28 2010-05-07 17:13:58 PDT
Created attachment 55438 [details] a more reliable test (will crash) The test from proposed patch didn't always crash for me, but this one does.
Alexey Proskuryakov
Comment 29 2010-05-10 14:05:10 PDT
Created attachment 55600 [details] proposed fix Addressed Darin comments, fixed the code to avoid entering an infinite loop.
Eric Seidel (no email)
Comment 30 2010-05-10 15:09:32 PDT
Thank you ap.
Alexey Proskuryakov
Comment 31 2010-05-10 15:14:50 PDT
Committed <http://trac.webkit.org/changeset/59098>. Pelase file separate bugs for any remaining crashes in Node::nodeIndex().
Note You need to log in before you can comment on or make changes to this bug.