Bug 114677 - Reimplement ContainerNode::removeChildren
Summary: Reimplement ContainerNode::removeChildren
Status: RESOLVED INVALID
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Andrei Bucur
URL:
Keywords:
: 113517 (view as bug list)
Depends on:
Blocks:
 
Reported: 2013-04-16 03:29 PDT by Andrei Bucur
Modified: 2023-05-28 09:28 PDT (History)
8 users (show)

See Also:


Attachments
Patch (17.37 KB, patch)
2013-04-16 08:36 PDT, Andrei Bucur
no flags Details | Formatted Diff | Diff
Patch (23.06 KB, patch)
2013-04-16 14:07 PDT, Andrei Bucur
no flags Details | Formatted Diff | Diff
Patch (25.38 KB, patch)
2013-04-17 02:17 PDT, Andrei Bucur
no flags Details | Formatted Diff | Diff
Dromaeo-dom (25.18 KB, text/html)
2013-04-17 12:46 PDT, Andrei Bucur
no flags Details
Dromaeo All Test results (620.36 KB, application/x-webarchive)
2013-04-18 06:49 PDT, Andrei Bucur
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Andrei Bucur 2013-04-16 03:29:50 PDT
ContainerNode::removeChildren tries to process all the nodes in bulk. This leads to a lot of issues because synchronous scripts can modify the DOM tree while the nodes are still processed.
Reimplement removeChildren to remove the nodes one by one. Evaluate the performance impact and possible optimizations.
Comment 1 Andrei Bucur 2013-04-16 04:14:04 PDT
*** Bug 113517 has been marked as a duplicate of this bug. ***
Comment 2 Andrei Bucur 2013-04-16 04:14:43 PDT
*** Bug 113433 has been marked as a duplicate of this bug. ***
Comment 3 Andrei Bucur 2013-04-16 08:36:15 PDT
Created attachment 198331 [details]
Patch
Comment 4 Ryosuke Niwa 2013-04-16 10:38:56 PDT
Comment on attachment 198331 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198331&action=review

What is the performance impact this change?  Do you see any change when you run Parser/html5-full-render.html?

> Source/WebCore/dom/ContainerNode.cpp:457
> +void ContainerNode::disconnectDescendantFrames()

Could you add these function elsewhere so that the diff looks saner?
Comment 5 Andrei Bucur 2013-04-16 14:07:41 PDT
Created attachment 198427 [details]
Patch
Comment 6 Andrei Bucur 2013-04-17 02:17:21 PDT
Created attachment 198485 [details]
Patch
Comment 7 Ryosuke Niwa 2013-04-17 10:08:07 PDT
Comment on attachment 198485 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198485&action=review

The code looks much better!  Would you also mind running Dromaeo and see if we have any perf. regressions there?

> Source/WebCore/dom/ContainerNode.cpp:362
> +// Returns false if the the node was moved and it can't be removed any more.
> +bool ContainerNode::removeChildInternal(Node* child)

I think we can rename the function to something like removeChildInternalIfPossible and remove the comment.
Also, this function should take PassRefPtr<Node> to be safe.
If not, you should at least store it locally in a RefPtr.

> Source/WebCore/dom/ContainerNode.cpp:385
> +    // Notify the mutation observers the node will be removed.

This comment repeats what the code says. Please remove it.

> Source/WebCore/dom/ContainerNode.cpp:395
> +    WidgetHierarchyUpdatesSuspensionScope suspendWidgetHierarchyUpdates;
> +
> +    Node* prev = child->previousSibling();
> +    Node* next = child->nextSibling();
> +    removeBetween(prev, next, child);
> +    childrenChanged(false, prev, next, -1);
> +    ChildNodeRemovalNotifier(this).notify(child);

I would prefer adding curly brackets around here explicitly to signify the order.

> Source/WebCore/dom/ContainerNode.cpp:586
> +        RefPtr<Node> child = m_firstChild;
> +        removeChildInternal(child.get());

It seems like we can just do removeChildInternal(m_firstChild);
Comment 8 Andrei Bucur 2013-04-17 10:42:14 PDT
Comment on attachment 198485 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198485&action=review

I'd also like add a new test for mutation observers... something like set a DOMNodeRemoved mutation event listener on a node, remove one of its children with removeChild and in the handler put the Node somewhere else with appendChild. With the current code, I think you get the node removed mutation record twice in the end.

>> Source/WebCore/dom/ContainerNode.cpp:362
>> +bool ContainerNode::removeChildInternal(Node* child)
> 
> I think we can rename the function to something like removeChildInternalIfPossible and remove the comment.
> Also, this function should take PassRefPtr<Node> to be safe.
> If not, you should at least store it locally in a RefPtr.

The renaming sounds good. I though about the RefPtr stuff, I wanted to be an obligation of the caller to protect the node. But you're right, it's safer and more decoupled to just have the removeChildInternal function guard the node. Who knows where it will be used later on.

>> Source/WebCore/dom/ContainerNode.cpp:385
>> +    // Notify the mutation observers the node will be removed.
> 
> This comment repeats what the code says. Please remove it.

I imagined you would say this :).

>> Source/WebCore/dom/ContainerNode.cpp:395
>> +    ChildNodeRemovalNotifier(this).notify(child);
> 
> I would prefer adding curly brackets around here explicitly to signify the order.

I've intentionally removed the brackets because they are no longer necessary. You want me to add them back to suggest "the widget tree can't be changed while the DOM tree is mutated", right?

>> Source/WebCore/dom/ContainerNode.cpp:586
>> +        removeChildInternal(child.get());
> 
> It seems like we can just do removeChildInternal(m_firstChild);

True but then I'm afraid someone might get hasty and just use the freed up Node somehow. But that's probably a silly worry to have.
Comment 9 Ryosuke Niwa 2013-04-17 10:45:59 PDT
Comment on attachment 198485 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198485&action=review

>>> Source/WebCore/dom/ContainerNode.cpp:586
>>> +        removeChildInternal(child.get());
>> 
>> It seems like we can just do removeChildInternal(m_firstChild);
> 
> True but then I'm afraid someone might get hasty and just use the freed up Node somehow. But that's probably a silly worry to have.

That's why I would like removeChildInternal to take a PassRefPtr.
Comment 10 Elliott Sprehn 2013-04-17 11:15:49 PDT
Comment on attachment 198485 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198485&action=review

This is going to execute childrenChanged() repeatedly which makes a couple things n^2 (attrNode.innerHTML which is likely very rare, and fieldset.innerHTML which likely real) and looks like it might cause other perf regressions. You'll want to really test this.

ex. Now if you have <div><span/>#text #text #text<span/></div> and div:first-child {} we'll walk through those #text nodes 4 times.

These cases might all be rare enough not to matter, I'm not confident we have coverage of using all the fancy selectors checked for in checkForSiblingStyleChanges that also go through innerHTML though.

>>> Source/WebCore/dom/ContainerNode.cpp:362
>>> +bool ContainerNode::removeChildInternal(Node* child)
>> 
>> I think we can rename the function to something like removeChildInternalIfPossible and remove the comment.
>> Also, this function should take PassRefPtr<Node> to be safe.
>> If not, you should at least store it locally in a RefPtr.
> 
> The renaming sounds good. I though about the RefPtr stuff, I wanted to be an obligation of the caller to protect the node. But you're right, it's safer and more decoupled to just have the removeChildInternal function guard the node. Who knows where it will be used later on.

You want a PassRefPtr<>, this thing takes ownership and then disposes of it.

> Source/WebCore/dom/ContainerNode.cpp:394
> +    childrenChanged(false, prev, next, -1);

This looks like a perf regression. You're executing childrenChanged N times now instead of once. For example you go through checkForSiblingStyleChanges, Document::updateRangesAfterChildrenChanged, and Node::invalidateNodeListCachesInAncestors for every child when before we only did this once.

This also makes attrNode.innerHTML and fieldset.innerHTML O(n^2) since they walk every child.

> Source/WebCore/dom/ContainerNode.cpp:585
> +        RefPtr<Node> child = m_firstChild;

You can make this whole loop one line as:

while (RefPtr<Node> child = m_firstChild)
  removeChildInternal(child.release());
Comment 11 Ryosuke Niwa 2013-04-17 11:45:26 PDT
Comment on attachment 198485 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198485&action=review

>> Source/WebCore/dom/ContainerNode.cpp:394
>> +    childrenChanged(false, prev, next, -1);
> 
> This looks like a perf regression. You're executing childrenChanged N times now instead of once. For example you go through checkForSiblingStyleChanges, Document::updateRangesAfterChildrenChanged, and Node::invalidateNodeListCachesInAncestors for every child when before we only did this once.
> 
> This also makes attrNode.innerHTML and fieldset.innerHTML O(n^2) since they walk every child.

Unfortunately that's exactly what we need to fix. invalidateNodeListCachesInAncestors needs to be called between each script invocation.
Performance isn't a great excuse for having a security vulnerability.
Comment 12 Andrei Bucur 2013-04-17 12:46:19 PDT
Created attachment 198596 [details]
Dromaeo-dom

Test results w/o (left) and w/ the patch (right).
Comment 13 Andrei Bucur 2013-04-17 12:48:38 PDT
Comment on attachment 198596 [details]
Dromaeo-dom

Grrr... didn't see the abs path in the file.
/Dromaeo/dom-attr:Runs	runs/s	123.94	± 0.62%	124.29	± 0.48%	
/Dromaeo/dom-modify:Runs	runs/s	56.35	± 1.09%	55.51	± 1.05%	1.49% Worse 
/Dromaeo/dom-query:Runs	runs/s	203.28	± 1.17%	211.93	± 3.04%	4.25% Better
/Dromaeo/dom-traverse:Runs	runs/s	181.33	± 0.45%	178.99	± 0.75%	1.29% Worse
Comment 14 Andrei Bucur 2013-04-18 06:49:17 PDT
Created attachment 198733 [details]
Dromaeo All Test results
Comment 15 Andrei Bucur 2013-04-18 06:57:55 PDT
(In reply to comment #14)
> Created an attachment (id=198733) [details]
> Dromaeo All Test results

/Dromaeo/jslib-attr-jquery:Runs seems to be worst affected, a few percents.
Comment 16 Ryosuke Niwa 2013-04-18 09:56:28 PDT
(In reply to comment #14)
> Created an attachment (id=198733) [details]
> Dromaeo All Test results

You don't need to upload an archive. Just upload the html file and everything will magically work. It's designed that way.
Comment 17 Ryosuke Niwa 2013-04-18 09:57:36 PDT
(In reply to comment #15)
> (In reply to comment #14)
> > Created an attachment (id=198733) [details] [details]
> > Dromaeo All Test results
> 
> /Dromaeo/jslib-attr-jquery:Runs seems to be worst affected, a few percents.

It seems like we should investigate why the score of this benchmark got worse.
Comment 18 Darin Adler 2016-03-09 09:43:31 PST
Comment on attachment 198485 [details]
Patch

If we still want to do this we need a new patch.
Comment 19 Anne van Kesteren 2023-05-28 09:28:48 PDT
No new patch in many years, let's close.