Bug 107074 - Remove NodeListsNodeData when it's no longer needed
Summary: Remove NodeListsNodeData when it's no longer needed
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Ryosuke Niwa
URL:
Keywords:
Depends on:
Blocks: 106586
  Show dependency treegraph
 
Reported: 2013-01-16 17:48 PST by Ryosuke Niwa
Modified: 2013-01-18 01:53 PST (History)
12 users (show)

See Also:


Attachments
Fixes the bug (12.04 KB, patch)
2013-01-16 18:14 PST, Ryosuke Niwa
darin: review+
Details | Formatted Diff | Diff
Perf. test results (open Memory tab) and look at Malloc values (27.08 KB, text/html)
2013-01-16 18:15 PST, Ryosuke Niwa
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ryosuke Niwa 2013-01-16 17:48:41 PST
Don't keep NodeListsNodeData forever.
Comment 1 Ryosuke Niwa 2013-01-16 18:14:08 PST
Created attachment 183081 [details]
Fixes the bug
Comment 2 Ryosuke Niwa 2013-01-16 18:15:20 PST
Created attachment 183082 [details]
Perf. test results (open Memory tab) and look at Malloc values
Comment 3 Ryosuke Niwa 2013-01-16 18:16:25 PST
This simple patch reduces the memory usage of the micro benchmark by 3%. Given that NodeListsNodeData is much smaller than NodeRareData, removing NodeRareData may reduce the overall memory consumption by as much as 6%-8%.
Comment 4 Darin Adler 2013-01-17 09:22:11 PST
Comment on attachment 183081 [details]
Fixes the bug

View in context: https://bugs.webkit.org/attachment.cgi?id=183081&action=review

Seems OK, although I have a few substantive thoughts below.

Somewhere there should be a comment about how we decided to delete these when no longer needed rather than keeping them around to optimize re-using them later. It’s the performance test, but something in the code should talk about how we made the tradeoff, if possible.

> Source/WebCore/ChangeLog:10
> +        If we detect that we have only one node lists left in the data structure,

Should be “node list” instead of “node lists”.

> Source/WebCore/ChangeLog:20
> +        (Node):

Bogus line should be removed. Or fix the script so it doesn’t add these.

> Source/WebCore/ChangeLog:26
> +        (NodeListsNodeData):

Bogus line should be removed. Or fix the script so it doesn’t add these.

> Source/WebCore/dom/NodeRareData.h:74
> +        if (!removeThisIfOnlyOneListIsLeft(list->ownerNode()))
> +            m_childNodeList = 0;

I think I’d like these better if they used an early return statement instead of nesting the remaining code. The logic is “if I can do this just by self destructing, then do that and return, otherwise do it the normal way”.

> Source/WebCore/dom/NodeRareData.h:225
> +    bool removeThisIfOnlyOneListIsLeft(Node*);

This function is confusing for a few reasons. The idea of this function is that we call it just before removing a list, which is why it checks for one remaining list rather than zero. That was completely non-obvious to me and I had to read the patch over and over again to figure that out. So I think that either the function name or a comment needs to explain that.

The trickiness and danger of using this function without looking at the boolean result would be clearer if it used the word “delete” instead of “remove”. On the other hand, “delete” does make it sound like this just deletes the object rather than cleanly removing it from the node.

Maybe a better name is deleteThisIfAboutToRemoveLastList? Or maybe selfDestructIfAboutToRemoveLastList? Maybe you can come up with an even better name.

I’m also not sure the factoring is quite right. It seems awkward to call this class to remove a list, then have this class recover the node pointer so it can do work on the node’s rare data. It’s kind of a messy relationship between the node, rare data, and lists classes. I understand that this is an easy place to correctly make the change, so I see the attraction of doing it this way, but the need to recover the owner node pointer is a factoring danger sign.

> Tools/Scripts/webkitpy/performance_tests/perftest.py:209
>          # Following is for Parser/html-parser.html
> -        re.compile(re.escape("""CONSOLE MESSAGE: Blocked script execution in 'html-parser.html' because the document's frame is sandboxed and the 'allow-scripts' permission is not set.""")),
> +        re.compile(r"CONSOLE MESSAGE: Blocked script execution in '[A-Za-z0-9\-\.]+' because the document's frame is sandboxed and the 'allow-scripts' permission is not set."),

Comment now seems wrong.
Comment 5 Ryosuke Niwa 2013-01-17 12:28:19 PST
(In reply to comment #4)
> (From update of attachment 183081 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=183081&action=review
> 
> Seems OK, although I have a few substantive thoughts below.
> 
> Somewhere there should be a comment about how we decided to delete these when no longer needed rather than keeping them around to optimize re-using them later. It’s the performance test, but something in the code should talk about how we made the tradeoff, if possible.

I forgot to mention this but this was the intended design back in the days although it never worked. I removed the code in http://trac.webkit.org/changeset/120367/trunk/Source/WebCore/dom/Node.cpp but the intent had always been to remove this data structure when we no longer need them. I'll clarify it in the change log.

> > Source/WebCore/ChangeLog:10
> > +        If we detect that we have only one node lists left in the data structure,
> 
> Should be “node list” instead of “node lists”.
> 
> > Source/WebCore/ChangeLog:20
> > +        (Node):
> 
> Bogus line should be removed. Or fix the script so it doesn’t add these.
> 
> > Source/WebCore/ChangeLog:26
> > +        (NodeListsNodeData):
> 
> Bogus line should be removed. Or fix the script so it doesn’t add these.

Fixed.

> > Source/WebCore/dom/NodeRareData.h:74
> > +        if (!removeThisIfOnlyOneListIsLeft(list->ownerNode()))
> > +            m_childNodeList = 0;
> 
> I think I’d like these better if they used an early return statement instead of nesting the remaining code. The logic is “if I can do this just by self destructing, then do that and return, otherwise do it the normal way”.
> 
> > Source/WebCore/dom/NodeRareData.h:225
> > +    bool removeThisIfOnlyOneListIsLeft(Node*);
> 
> This function is confusing for a few reasons. The idea of this function is that we call it just before removing a list, which is why it checks for one remaining list rather than zero. That was completely non-obvious to me and I had to read the patch over and over again to figure that out. So I think that either the function name or a comment needs to explain that.

The original design was to let each function delete the last time, and then removeThisIfPossible to check whether all hash maps & m_childNodes has been cleared or not That's definitely a much simpler design but incurs a runtime cost when we're deleting the last time.

> The trickiness and danger of using this function without looking at the boolean result would be clearer if it used the word “delete” instead of “remove”. On the other hand, “delete” does make it sound like this just deletes the object rather than cleanly removing it from the node.
> 
> Maybe a better name is deleteThisIfAboutToRemoveLastList? Or maybe selfDestructIfAboutToRemoveLastList? Maybe you can come up with an even better name.

I'm gonna go extra verbose and call it deleteThisAndUpdateNodeRareDataIfAboutToRemoveLastList.

> I’m also not sure the factoring is quite right. It seems awkward to call this class to remove a list, then have this class recover the node pointer so it can do work on the node’s rare data. It’s kind of a messy relationship between the node, rare data, and lists classes. I understand that this is an easy place to correctly make the change, so I see the attraction of doing it this way, but the need to recover the owner node pointer is a factoring danger sign.

Yeah, it's quite messy. On the other hand, making each node list calling clearNodeLists looked fragile. It's also important that we assert the list we're removing to be in the node lists because we've had quite few bugs where we had a mismatch in the past.

> 
> > Tools/Scripts/webkitpy/performance_tests/perftest.py:209
> >          # Following is for Parser/html-parser.html
> > -        re.compile(re.escape("""CONSOLE MESSAGE: Blocked script execution in 'html-parser.html' because the document's frame is sandboxed and the 'allow-scripts' permission is not set.""")),
> > +        re.compile(r"CONSOLE MESSAGE: Blocked script execution in '[A-Za-z0-9\-\.]+' because the document's frame is sandboxed and the 'allow-scripts' permission is not set."),
> 
> Comment now seems wrong.

Oops, you're right. Fixed.
Comment 6 Ryosuke Niwa 2013-01-17 16:37:13 PST
Committed r140070: <http://trac.webkit.org/changeset/140070>
Comment 7 Ryosuke Niwa 2013-01-18 01:53:31 PST
The bug 107239 tracks an unrelated crash.