Bug 53659

Summary: Web Inspector: Better support for finding "leaked" DOM
Product: WebKit Reporter: Tony Gentilcore <tonyg>
Component: Web Inspector (Deprecated)Assignee: Mikhail Naganov <mnaganov>
Status: RESOLVED FIXED    
Severity: Enhancement CC: abarth, antonm, apavlov, ap, bweinstein, dglazkov, eae, eric, gregsimon, jamesr, joepeck, keishi, loislo, mihai, mnaganov, ngm, peter, pfeldman, pmuellr, rik, timothy, vitalyr, webkit.review.bot, yurys
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
Example showing how JS can cause native growth that can not currently be measured from web inspector or heap profiler
none
Enable reporting of object groups and DOM-related entities to the new heap profiler
mnaganov: commit-queue-
Comments addressed, merged with antonm's changes, added new files that I forgot to add
mnaganov: commit-queue-
Again merged with new Anton's change, addressed Adam's comments pfeldman: review+, mnaganov: commit-queue-

Description Tony Gentilcore 2011-02-02 21:01:14 PST
James and I have been working with some folks on the Gmail team to track down a memory growth bug. They have a caching system which caches large numbers of DOM nodes that aren't currently attached to the document. However, there appear to be leaks -- perhaps caused by forgetting to detach event handlers? But we have no good way to track down what is holding the references alive.

Our goal is to find DOM objects which are unattached from the document but being held alive by JS alone. We don't have an idea for a silver bullet solution, but had a few ideas for additional information that the heap profile could expose to help us troubleshoot.

Is it possible to add any of the following information for each JS object which references a node in a DOM tree which is held alive by JS references alone?
1. A count of the number of DOM nodes in DOM trees which are held alive by JS references alone (with the understanding that this JS object may not be the only JS object referencing it). 
2. The id of the element referenced or at least the id of the element at the root of the referenced tree (if it has an id).
3. Allow sorting by number of DOM nodes referenced. Bonus points if we can somehow surface the largest trees with fewest number of references.
Comment 1 Mikhail Naganov 2011-02-03 04:28:51 PST
Can you please provide more info about the caching system (or provide me with source code pointers privately): what the caching system does? If a node is either in cache or in page's DOM tree, it should be never collected, right? So apparently there is some scenario for nodes disposal.

For the new heap profiler (see https://bugs.webkit.org/show_bug.cgi?id=50510) I maintain a map of unique object IDs for heap contents that help to track objects lifetime. Will this help you?

Some questions:
 - by 'id of the element' do you mean DOM nodes implementation (C++) object addresses, or what?
 - how can I know that a DOM node is held only from VM -- by verifying that refcount == 1?
Comment 2 Tony Gentilcore 2011-02-03 15:31:09 PST
(In reply to comment #1)
> Can you please provide more info about the caching system (or provide me with source code pointers privately): what the caching system does? If a node is either in cache or in page's DOM tree, it should be never collected, right? So apparently there is some scenario for nodes disposal.

I've added eae who might be able to explain a little better. But as a simplistic example, imagine a fixed size array where each item points to a DOM node that is no longer attached to the document. As new nodes are added to the front of the array, old nodes are pushed out the end. The app expects that the ones pushed out the end are collected. However, there is some other piece of code (other than the caching array) which has a reference to the node and is preventing its collection. But given the size of the codebase it is difficult to pinpoint what other piece of code is holding the reference alive.

> 
> For the new heap profiler (see https://bugs.webkit.org/show_bug.cgi?id=50510) I maintain a map of unique object IDs for heap contents that help to track objects lifetime. Will this help you?

It sounds like the same problem, but I from the description in the bug, I don't fully understand the new feature.

> 
> Some questions:
>  - by 'id of the element' do you mean DOM nodes implementation (C++) object addresses, or what?

The DOM node's id.
<div id=foo></div>
I know this may seem awkward because it isn't always available, but when it is available it is the key piece of information that could help a web app developer understand which node this is.

>  - how can I know that a DOM node is held only from VM -- by verifying that refcount == 1?

I'm not very familiar with how GC works, but Jamesr give me a quick overview. Perhaps we can consider it help only from the VM if the object groups's refcount is equal to the number of JS references.
Comment 3 anton muhin 2011-02-04 03:24:40 PST
(In reply to comment #2)
> (In reply to comment #1)
> > Can you please provide more info about the caching system (or provide me with source code pointers privately): what the caching system does? If a node is either in cache or in page's DOM tree, it should be never collected, right? So apparently there is some scenario for nodes disposal.
> 
> I've added eae who might be able to explain a little better. But as a simplistic example, imagine a fixed size array where each item points to a DOM node that is no longer attached to the document. As new nodes are added to the front of the array, old nodes are pushed out the end. The app expects that the ones pushed out the end are collected. However, there is some other piece of code (other than the caching array) which has a reference to the node and is preventing its collection. But given the size of the codebase it is difficult to pinpoint what other piece of code is holding the reference alive.

Just in case, if detached nodes form tree of its own (traversable via parent, children, sibling), they are treated as an object group and therefore should be eligible to GC in the same time, so retainers could be objects in the cache itself.  If it's not impossible, I'd suggest you to trace object grouping as well.

BTW, Mikhail should be able to explain it in more details, but I think he has this feature---list all the retainers of the object---in the heap profiler, at least in experimental variant.  BTW, Mikhail, do your retainer thing supports tracing object groups?

> 
> > 
> > For the new heap profiler (see https://bugs.webkit.org/show_bug.cgi?id=50510) I maintain a map of unique object IDs for heap contents that help to track objects lifetime. Will this help you?
> 
> It sounds like the same problem, but I from the description in the bug, I don't fully understand the new feature.

There was a confusion over term 'id'.  You meant HTML element's id while we rather thought of JS object identity.  Mikhail's ids are unique ids for the whole life of JS objects (that's trickier than plain pointer as GC can move objects).  So if you'd like to trace the life of the object, your best bet is Mikhail's ids.

> 
> > 
> > Some questions:
> >  - by 'id of the element' do you mean DOM nodes implementation (C++) object addresses, or what?
> 
> The DOM node's id.
> <div id=foo></div>
> I know this may seem awkward because it isn't always available, but when it is available it is the key piece of information that could help a web app developer understand which node this is.
> 
> >  - how can I know that a DOM node is held only from VM -- by verifying that refcount == 1?
> 
> I'm not very familiar with how GC works, but Jamesr give me a quick overview. Perhaps we can consider it help only from the VM if the object groups's refcount is equal to the number of JS references.

Object groups have no refcounts at all (and nothing in V8 is ref count).  There are several complications.  For example, JS objects can be retained not only by pure JS references, but by various types of C++ handles (these are counterparts of JS references in C++ land).

It might be a good experiment to check how many JS wrapped objects have refcount 1 (when you wrap an object, you ref it), it might indeed reveal some leaks.
Comment 4 Mikhail Naganov 2011-02-04 10:03:52 PST
(In reply to comment #3)
> (In reply to comment #2)
> > (In reply to comment #1)
> > > Can you please provide more info about the caching system (or provide me with source code pointers privately): what the caching system does? If a node is either in cache or in page's DOM tree, it should be never collected, right? So apparently there is some scenario for nodes disposal.
> > 
> > I've added eae who might be able to explain a little better. But as a simplistic example, imagine a fixed size array where each item points to a DOM node that is no longer attached to the document. As new nodes are added to the front of the array, old nodes are pushed out the end. The app expects that the ones pushed out the end are collected. However, there is some other piece of code (other than the caching array) which has a reference to the node and is preventing its collection. But given the size of the codebase it is difficult to pinpoint what other piece of code is holding the reference alive.
> 
> Just in case, if detached nodes form tree of its own (traversable via parent, children, sibling), they are treated as an object group and therefore should be eligible to GC in the same time, so retainers could be objects in the cache itself.  If it's not impossible, I'd suggest you to trace object grouping as well.
> 
> BTW, Mikhail should be able to explain it in more details, but I think he has this feature---list all the retainers of the object---in the heap profiler, at least in experimental variant.  BTW, Mikhail, do your retainer thing supports tracing object groups?
> 

I see that I need to include object groups on par with VM GC roots, as they can also prevent objects from deletion.

> > 
> > > 
> > > For the new heap profiler (see https://bugs.webkit.org/show_bug.cgi?id=50510) I maintain a map of unique object IDs for heap contents that help to track objects lifetime. Will this help you?
> > 
> > It sounds like the same problem, but I from the description in the bug, I don't fully understand the new feature.
> 
> There was a confusion over term 'id'.  You meant HTML element's id while we rather thought of JS object identity.  Mikhail's ids are unique ids for the whole life of JS objects (that's trickier than plain pointer as GC can move objects).  So if you'd like to trace the life of the object, your best bet is Mikhail's ids.
>
> > 
> > > 
> > > Some questions:
> > >  - by 'id of the element' do you mean DOM nodes implementation (C++) object addresses, or what?
> > 
> > The DOM node's id.
> > <div id=foo></div>
> > I know this may seem awkward because it isn't always available, but when it is available it is the key piece of information that could help a web app developer understand which node this is.
> >

OK, got it. Although, no data from DOM nodes is stored in VM's heap, I think it is possible to retrieve it during heap sampling.
 
> > >  - how can I know that a DOM node is held only from VM -- by verifying that refcount == 1?
> > 
> > I'm not very familiar with how GC works, but Jamesr give me a quick overview. Perhaps we can consider it help only from the VM if the object groups's refcount is equal to the number of JS references.
> 
> Object groups have no refcounts at all (and nothing in V8 is ref count).  There are several complications.  For example, JS objects can be retained not only by pure JS references, but by various types of C++ handles (these are counterparts of JS references in C++ land).
> 
> It might be a good experiment to check how many JS wrapped objects have refcount 1 (when you wrap an object, you ref it), it might indeed reveal some leaks.
Comment 5 James Robinson 2011-02-04 13:49:10 PST
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > (In reply to comment #1)
> > > > Can you please provide more info about the caching system (or provide me with source code pointers privately): what the caching system does? If a node is either in cache or in page's DOM tree, it should be never collected, right? So apparently there is some scenario for nodes disposal.
> > > 
> > > I've added eae who might be able to explain a little better. But as a simplistic example, imagine a fixed size array where each item points to a DOM node that is no longer attached to the document. As new nodes are added to the front of the array, old nodes are pushed out the end. The app expects that the ones pushed out the end are collected. However, there is some other piece of code (other than the caching array) which has a reference to the node and is preventing its collection. But given the size of the codebase it is difficult to pinpoint what other piece of code is holding the reference alive.
> > 
> > Just in case, if detached nodes form tree of its own (traversable via parent, children, sibling), they are treated as an object group and therefore should be eligible to GC in the same time, so retainers could be objects in the cache itself.  If it's not impossible, I'd suggest you to trace object grouping as well.
> > 
> > BTW, Mikhail should be able to explain it in more details, but I think he has this feature---list all the retainers of the object---in the heap profiler, at least in experimental variant.  BTW, Mikhail, do your retainer thing supports tracing object groups?
> > 
> 
> I see that I need to include object groups on par with VM GC roots, as they can also prevent objects from deletion.
> 
> > > 
> > > > 
> > > > For the new heap profiler (see https://bugs.webkit.org/show_bug.cgi?id=50510) I maintain a map of unique object IDs for heap contents that help to track objects lifetime. Will this help you?
> > > 
> > > It sounds like the same problem, but I from the description in the bug, I don't fully understand the new feature.
> > 
> > There was a confusion over term 'id'.  You meant HTML element's id while we rather thought of JS object identity.  Mikhail's ids are unique ids for the whole life of JS objects (that's trickier than plain pointer as GC can move objects).  So if you'd like to trace the life of the object, your best bet is Mikhail's ids.
> >
> > > 
> > > > 
> > > > Some questions:
> > > >  - by 'id of the element' do you mean DOM nodes implementation (C++) object addresses, or what?
> > > 
> > > The DOM node's id.
> > > <div id=foo></div>
> > > I know this may seem awkward because it isn't always available, but when it is available it is the key piece of information that could help a web app developer understand which node this is.
> > >
> 
> OK, got it. Although, no data from DOM nodes is stored in VM's heap, I think it is possible to retrieve it during heap sampling.

My thought was that during the object grouping phase in V8GCController the NodeGrouperVisitor could annotate the object group with additional information like "this object group represents a DOM subtree with 25 nodes and the root node has the DOM id attribute 'foo'".  Then references to this object group could also be blamed for that 25 node chunk, even though the actual memory for those DOM nodes is external to the V8 heap.  Does that make sense?
> 
> > > >  - how can I know that a DOM node is held only from VM -- by verifying that refcount == 1?
> > > 
> > > I'm not very familiar with how GC works, but Jamesr give me a quick overview. Perhaps we can consider it help only from the VM if the object groups's refcount is equal to the number of JS references.
> > 
> > Object groups have no refcounts at all (and nothing in V8 is ref count).  There are several complications.  For example, JS objects can be retained not only by pure JS references, but by various types of C++ handles (these are counterparts of JS references in C++ land).
> > 
> > It might be a good experiment to check how many JS wrapped objects have refcount 1 (when you wrap an object, you ref it), it might indeed reveal some leaks.
Comment 6 Patrick Mueller 2011-02-04 17:26:44 PST
I'm not completely following the discussion here, but have a meta-comment.

I suspect the scenario you're wanting to track may be something that not everyone needs.  And on the flip-side, there are other similarly-scoped scenarios that other people have, that they'd like to have specialized debugging support for.

What I'm wondering is: is there some way that we can give you some low-level tools (JS APIs, at this point, I guess) so you could build the analyzer you want, without having to wait for it to show up in Web Inspector?

Turn a default/command-line option on, to enable the additional JS APIs available.
Comment 7 anton muhin 2011-02-07 03:51:49 PST
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #3)
> > > (In reply to comment #2)
> > > > (In reply to comment #1)
> > > > > Can you please provide more info about the caching system (or provide me with source code pointers privately): what the caching system does? If a node is either in cache or in page's DOM tree, it should be never collected, right? So apparently there is some scenario for nodes disposal.
> > > > 
> > > > I've added eae who might be able to explain a little better. But as a simplistic example, imagine a fixed size array where each item points to a DOM node that is no longer attached to the document. As new nodes are added to the front of the array, old nodes are pushed out the end. The app expects that the ones pushed out the end are collected. However, there is some other piece of code (other than the caching array) which has a reference to the node and is preventing its collection. But given the size of the codebase it is difficult to pinpoint what other piece of code is holding the reference alive.
> > > 
> > > Just in case, if detached nodes form tree of its own (traversable via parent, children, sibling), they are treated as an object group and therefore should be eligible to GC in the same time, so retainers could be objects in the cache itself.  If it's not impossible, I'd suggest you to trace object grouping as well.
> > > 
> > > BTW, Mikhail should be able to explain it in more details, but I think he has this feature---list all the retainers of the object---in the heap profiler, at least in experimental variant.  BTW, Mikhail, do your retainer thing supports tracing object groups?
> > > 
> > 
> > I see that I need to include object groups on par with VM GC roots, as they can also prevent objects from deletion.
> > 
> > > > 
> > > > > 
> > > > > For the new heap profiler (see https://bugs.webkit.org/show_bug.cgi?id=50510) I maintain a map of unique object IDs for heap contents that help to track objects lifetime. Will this help you?
> > > > 
> > > > It sounds like the same problem, but I from the description in the bug, I don't fully understand the new feature.
> > > 
> > > There was a confusion over term 'id'.  You meant HTML element's id while we rather thought of JS object identity.  Mikhail's ids are unique ids for the whole life of JS objects (that's trickier than plain pointer as GC can move objects).  So if you'd like to trace the life of the object, your best bet is Mikhail's ids.
> > >
> > > > 
> > > > > 
> > > > > Some questions:
> > > > >  - by 'id of the element' do you mean DOM nodes implementation (C++) object addresses, or what?
> > > > 
> > > > The DOM node's id.
> > > > <div id=foo></div>
> > > > I know this may seem awkward because it isn't always available, but when it is available it is the key piece of information that could help a web app developer understand which node this is.
> > > >
> > 
> > OK, got it. Although, no data from DOM nodes is stored in VM's heap, I think it is possible to retrieve it during heap sampling.
> 
> My thought was that during the object grouping phase in V8GCController the NodeGrouperVisitor could annotate the object group with additional information like "this object group represents a DOM subtree with 25 nodes and the root node has the DOM id attribute 'foo'".  Then references to this object group could also be blamed for that 25 node chunk, even though the actual memory for those DOM nodes is external to the V8 heap.  Does that make sense?

James, I think it makes a lot of sense.  And if I understand Mikhail right, he was going to implement something like that.  I am not sure he was going to collect additional DOM info, but at least make object groups understood by his retainer stuff.
> > 
> > > > >  - how can I know that a DOM node is held only from VM -- by verifying that refcount == 1?
> > > > 
> > > > I'm not very familiar with how GC works, but Jamesr give me a quick overview. Perhaps we can consider it help only from the VM if the object groups's refcount is equal to the number of JS references.
> > > 
> > > Object groups have no refcounts at all (and nothing in V8 is ref count).  There are several complications.  For example, JS objects can be retained not only by pure JS references, but by various types of C++ handles (these are counterparts of JS references in C++ land).
> > > 
> > > It might be a good experiment to check how many JS wrapped objects have refcount 1 (when you wrap an object, you ref it), it might indeed reveal some leaks.
Comment 8 anton muhin 2011-02-07 03:54:14 PST
(In reply to comment #6)
> I'm not completely following the discussion here, but have a meta-comment.
> 
> I suspect the scenario you're wanting to track may be something that not everyone needs.  And on the flip-side, there are other similarly-scoped scenarios that other people have, that they'd like to have specialized debugging support for.
> 
> What I'm wondering is: is there some way that we can give you some low-level tools (JS APIs, at this point, I guess) so you could build the analyzer you want, without having to wait for it to show up in Web Inspector?
> 
> Turn a default/command-line option on, to enable the additional JS APIs available.

Patrik, the problem is what exact kind of API you want?  Plus those low-level APIs sometimes can expose security stuff.  But we can always run a private version of Chromium with some additional stuff exposed and somehow (e.g. C++ defines, or something else) disable it for prod builds.

But main problem is what kind of API people want and who have enough time and interest to implement them :)
Comment 9 Patrick Mueller 2011-02-07 05:26:14 PST
(In reply to comment #8)
> Patrik, the problem is what exact kind of API you want?  Plus those low-level APIs sometimes can expose security stuff.  But we can always run a private version of Chromium with some additional stuff exposed and somehow (e.g. C++ defines, or something else) disable it for prod builds.
> 
> But main problem is what kind of API people want and who have enough time and interest to implement them :)

Again, keep in mind this is a meta-comment; it seems like the original problem description is pretty narrow - it's easy to imagine all sorts of "queries" people might want to have like this - to chase down various types of problems.

In terms of an API, the thing I'm most familiar with, publicly, is the Java JVMTI interface:

   http://download.oracle.com/javase/6/docs/platform/jvmti/jvmti.html

here's the API for walking the object space:

   http://download.oracle.com/javase/6/docs/platform/jvmti/jvmti.html#Heap

Yes, YOU can always run a private version of Chrome.  Not many people have that luxury :-)
Comment 10 Mikhail Naganov 2011-02-07 12:19:13 PST
 
> > My thought was that during the object grouping phase in V8GCController the NodeGrouperVisitor could annotate the object group with additional information like "this object group represents a DOM subtree with 25 nodes and the root node has the DOM id attribute 'foo'".  Then references to this object group could also be blamed for that 25 node chunk, even though the actual memory for those DOM nodes is external to the V8 heap.  Does that make sense?
> 
> James, I think it makes a lot of sense.  And if I understand Mikhail right, he was going to implement something like that.  I am not sure he was going to collect additional DOM info, but at least make object groups understood by his retainer stuff.
> > >

Yes, the new heap profiler is coming, and I'll enhance it with support for object groups.
 
> 
> Again, keep in mind this is a meta-comment; it seems like the original problem description is pretty narrow - it's easy to imagine all sorts of "queries" people might want to have like this - to chase down various types of problems.
> 
> In terms of an API, the thing I'm most familiar with, publicly, is the Java JVMTI interface:
> 
>    http://download.oracle.com/javase/6/docs/platform/jvmti/jvmti.html
> 
> here's the API for walking the object space:
> 
>    http://download.oracle.com/javase/6/docs/platform/jvmti/jvmti.html#Heap
> 
> Yes, YOU can always run a private version of Chrome.  Not many people have that luxury :-)

Having such an API is nice. It's indispensable when a VM runs on the server side. For client side it may be easier to allow dumping performance data, so people can write analyzers. There is already an ability to send out CPU profiles using 'console.profiles' (although it looks weird that application performance data is exposed to the application itself). And it's also possible to dump heap contents as JSON in V8. Hopefully, we will establish some "standard" on how perf data should be exported for JavaScript, so people will start writing 3rd party analyzers.
Comment 11 Greg Simon 2011-02-22 14:24:38 PST
Created attachment 83390 [details]
Example showing how JS can cause native growth that can not currently be measured from web inspector or heap profiler

Here is a simple example that demonstrates how to "leak" DOM -- here the allocation shows up in the native heap, not the JS heap (meaning you can't use any js heap profilers to figure it out). Also, because the nodes are out of the document, web inspector does not report anything amiss either.

What is keeping them alive is a JS binding on the parent of this DOM subtree.
Comment 12 Mikhail Naganov 2011-03-10 07:17:51 PST
Created attachment 85320 [details]
Enable reporting of object groups and DOM-related entities to the new heap profiler

I will need to wait until V8 support will be rolled into Chromium. Until that, the patch will fail to compile.
Comment 13 anton muhin 2011-03-10 07:27:25 PST
Comment on attachment 85320 [details]
Enable reporting of object groups and DOM-related entities to the new heap profiler

View in context: https://bugs.webkit.org/attachment.cgi?id=85320&action=review

> Source/WebCore/bindings/v8/ScriptProfiler.cpp:101
> +    Node* node = reinterpret_cast<Node*>(object->GetPointerFromInternalField(v8DOMWrapperObjectIndex));

I think it should be V8Node::toNative
Comment 14 WebKit Review Bot 2011-03-10 07:35:12 PST
Attachment 85320 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8117860
Comment 15 Vitaly Repeshko 2011-03-10 08:37:10 PST
(In reply to comment #12)
> Created an attachment (id=85320) [details]
> Enable reporting of object groups and DOM-related entities to the new heap profiler
> 
> I will need to wait until V8 support will be rolled into Chromium. Until that, the patch will fail to compile.

Looks good. Given that we can extract the root from any GrouperItem I don't think we need stable sort or "rootIndex".
Comment 16 Mikhail Naganov 2011-03-11 05:17:54 PST
(In reply to comment #13)
> (From update of attachment 85320 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=85320&action=review
> 
> > Source/WebCore/bindings/v8/ScriptProfiler.cpp:101
> > +    Node* node = reinterpret_cast<Node*>(object->GetPointerFromInternalField(v8DOMWrapperObjectIndex));
> 
> I think it should be V8Node::toNative

Thanks! You are right.

(In reply to comment #15)
> (In reply to comment #12)
> > Created an attachment (id=85320) [details] [details]
> > Enable reporting of object groups and DOM-related entities to the new heap profiler
> > 
> > I will need to wait until V8 support will be rolled into Chromium. Until that, the patch will fail to compile.
> 
> Looks good. Given that we can extract the root from any GrouperItem I don't think we need stable sort or "rootIndex".

Sure. That was a rudimentary change, really unneeded.
Comment 17 Mikhail Naganov 2011-03-11 05:40:12 PST
Created attachment 85463 [details]
Comments addressed, merged with antonm's changes, added new files that I forgot to add

Will still not compile on Chromium, because the new V8 hasn't been rolled yet.
Comment 18 WebKit Review Bot 2011-03-11 05:55:13 PST
Attachment 85463 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8132249
Comment 19 Eric Seidel (no email) 2011-03-11 19:33:52 PST
Attachment 85463 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8133564
Comment 20 Mikhail Naganov 2011-03-15 05:24:05 PDT
V8 version has been updated in Chrome -- it's now possible to land this patch. I'll update Chromium DEPS upstream for WebKit Chromium bots.

Anybody to review?
Comment 21 Adam Barth 2011-03-15 10:54:32 PDT
I'd be happy to look at it.  My only hesitation is that this patch is related to the inspector and I've been criticized in the past for reviewing inspector-related changed.
Comment 22 Mikhail Naganov 2011-03-15 11:00:46 PDT
(In reply to comment #21)
> I'd be happy to look at it.  My only hesitation is that this patch is related to the inspector and I've been criticized in the past for reviewing inspector-related changed.

I think, Pavel (pfeldman@) can look at Inspector's part. There are also bindings change you can look at.
Comment 23 Adam Barth 2011-03-15 12:27:25 PDT
Comment on attachment 85463 [details]
Comments addressed, merged with antonm's changes, added new files that I forgot to add

View in context: https://bugs.webkit.org/attachment.cgi?id=85463&action=review

I didn't look at the inspector parts of the patch.

> Source/WebCore/bindings/v8/RetainedDOMInfo.cpp:39
> +        : m_root(root)

Too much indent.  Should be four spaces.

> Source/WebCore/bindings/v8/RetainedDOMInfo.cpp:57
> +    return other == this
> +            || reinterpret_cast<WebCore::RetainedObjectInfo*>(other)->GetEquivalenceClass() == this->GetEquivalenceClass();

Should be one line.  Also, shouldn't reinterpret_cast be static_cast here?

> Source/WebCore/bindings/v8/RetainedDOMInfo.cpp:60
> +intptr_t RetainedDOMInfo::GetHash()

All these functions should start with a lower-case letter.  Also, WebKit frowns upon names that begin with "get", so this should probably just be "hashValue()" or just "hash()" if it's clear that it's a noun and not a verb.

> Source/WebCore/bindings/v8/RetainedDOMInfo.cpp:65
> +const char* RetainedDOMInfo::GetLabel()

This should definitely just be "label()"

> Source/WebCore/bindings/v8/RetainedDOMInfo.h:53
> +    // V8 guarantees to keep RetainedObjectInfos alive only during a GC or heap snapshotting round, when renderer
> +    // doesn't get control. This allows us to use raw pointers.

Can we assert that's the case?

> Source/WebCore/bindings/v8/ScriptProfiler.cpp:103
> +static v8::RetainedObjectInfo* getRetainedDOMInfo(uint16_t classId, v8::Handle<v8::Value> wrapper)

Another "get" function.

> Source/WebCore/bindings/v8/V8GCController.cpp:187
> +        {
> +        }

These should not be indented.

> Source/WebCore/bindings/v8/V8GCController.cpp:201
> -    uintptr_t m_groupId;
> +    void* m_root;

This looks like a hack.  Why not use separate types and avoid the reinterpret_casts above?

> Source/WebCore/bindings/v8/V8GCController.cpp:203
> +    bool m_isDom;

Seems like this be removed in favor of null checking a Node* pointer.

> Source/WebCore/bindings/v8/V8GCController.cpp:218
> +            : m_object(object)

too much indent.

> Source/WebCore/bindings/v8/V8GCController.cpp:232
> +    virtual intptr_t GetHash()

Ah, these are overriding v8 virtual methods?  That's why they have V8-style names and not WebKit-style names...

> Source/WebCore/bindings/v8/V8GCController.cpp:291
> +                v8::V8::AddObjectGroup(&group[0], group.size(), new RetainedDOMInfo(rootNode));

The memory management for these object is scary.  WebKit very rarely uses manual new/delete, but I understand that V8 is forcing our hands here.
Comment 24 Timothy Hatcher 2011-03-15 12:58:03 PDT
Where is the bug requesting parity with JSC bindings?
Comment 25 Mikhail Naganov 2011-03-15 14:56:05 PDT
(In reply to comment #24)
> Where is the bug requesting parity with JSC bindings?

My understanding is that JSC is more tightly integrated with WebCore -- as far as I can see in JSDOMBinding code, GC can access DOM objects directly. Thus, it doesn't need the kind of interaction we have implemented for WebKit+V8.
Comment 26 Pavel Feldman 2011-03-15 14:59:57 PDT
(In reply to comment #25)
> (In reply to comment #24)
> > Where is the bug requesting parity with JSC bindings?
> 
> My understanding is that JSC is more tightly integrated with WebCore -- as far as I can see in JSDOMBinding code, GC can access DOM objects directly. Thus, it doesn't need the kind of interaction we have implemented for WebKit+V8.

Mikhail, there is an agreement, that all the features of the inspector that are not yet implemented in JSC and are already implemented in V8 are tracked under bugs with xenon in the CC. There should be a clear way / interfaces for JSC bindings to support every feature front-end exposes. Please make sure corresponding bugs are filed and APIs are established.
Comment 27 Mikhail Naganov 2011-03-16 08:03:26 PDT
Comment on attachment 85463 [details]
Comments addressed, merged with antonm's changes, added new files that I forgot to add

View in context: https://bugs.webkit.org/attachment.cgi?id=85463&action=review

>> Source/WebCore/bindings/v8/RetainedDOMInfo.cpp:39
>> +        : m_root(root)
> 
> Too much indent.  Should be four spaces.

Fixed.

>> Source/WebCore/bindings/v8/RetainedDOMInfo.cpp:57
>> +            || reinterpret_cast<WebCore::RetainedObjectInfo*>(other)->GetEquivalenceClass() == this->GetEquivalenceClass();
> 
> Should be one line.  Also, shouldn't reinterpret_cast be static_cast here?

Fixed.

>> Source/WebCore/bindings/v8/RetainedDOMInfo.cpp:60
>> +intptr_t RetainedDOMInfo::GetHash()
> 
> All these functions should start with a lower-case letter.  Also, WebKit frowns upon names that begin with "get", so this should probably just be "hashValue()" or just "hash()" if it's clear that it's a noun and not a verb.

This class implements an interface from V8 API. That's why V8-style names are used. I've added a comment into RetainedDOMInfo header about this.

>> Source/WebCore/bindings/v8/RetainedDOMInfo.cpp:65
>> +const char* RetainedDOMInfo::GetLabel()
> 
> This should definitely just be "label()"

ditto

>> Source/WebCore/bindings/v8/RetainedDOMInfo.h:53
>> +    // doesn't get control. This allows us to use raw pointers.
> 
> Can we assert that's the case?

No. We can't ask V8 about it's current state.

>> Source/WebCore/bindings/v8/ScriptProfiler.cpp:103
>> +static v8::RetainedObjectInfo* getRetainedDOMInfo(uint16_t classId, v8::Handle<v8::Value> wrapper)
> 
> Another "get" function.

Fixed.

>> Source/WebCore/bindings/v8/V8GCController.cpp:187
>> +        }
> 
> These should not be indented.

Fixed.

>> Source/WebCore/bindings/v8/V8GCController.cpp:201
>> +    void* m_root;
> 
> This looks like a hack.  Why not use separate types and avoid the reinterpret_casts above?

To save memory, as these cases are mutually exclusive. Changed to union to avoid casts.

>> Source/WebCore/bindings/v8/V8GCController.cpp:203
>> +    bool m_isDom;
> 
> Seems like this be removed in favor of null checking a Node* pointer.

Using enum type for clarity.

>> Source/WebCore/bindings/v8/V8GCController.cpp:218
>> +            : m_object(object)
> 
> too much indent.

Fixed.

>> Source/WebCore/bindings/v8/V8GCController.cpp:232
>> +    virtual intptr_t GetHash()
> 
> Ah, these are overriding v8 virtual methods?  That's why they have V8-style names and not WebKit-style names...

Sure.

>> Source/WebCore/bindings/v8/V8GCController.cpp:291
>> +                v8::V8::AddObjectGroup(&group[0], group.size(), new RetainedDOMInfo(rootNode));
> 
> The memory management for these object is scary.  WebKit very rarely uses manual new/delete, but I understand that V8 is forcing our hands here.

We probably can use reference counting and do a deref in 'Dispose' method. But I don't think this will look less scary.
Comment 28 Mikhail Naganov 2011-03-16 08:24:15 PDT
Created attachment 85931 [details]
Again merged with new Anton's change, addressed Adam's comments
Comment 29 Mikhail Naganov 2011-03-16 09:00:36 PDT
(In reply to comment #26)
> (In reply to comment #25)
> > (In reply to comment #24)
> > > Where is the bug requesting parity with JSC bindings?
> > 
> > My understanding is that JSC is more tightly integrated with WebCore -- as far as I can see in JSDOMBinding code, GC can access DOM objects directly. Thus, it doesn't need the kind of interaction we have implemented for WebKit+V8.
> 
> Mikhail, there is an agreement, that all the features of the inspector that are not yet implemented in JSC and are already implemented in V8 are tracked under bugs with xenon in the CC. There should be a clear way / interfaces for JSC bindings to support every feature front-end exposes. Please make sure corresponding bugs are filed and APIs are established.

Filed https://bugs.webkit.org/show_bug.cgi?id=56460
Comment 30 Pavel Feldman 2011-03-16 11:06:43 PDT
Comment on attachment 85931 [details]
Again merged with new Anton's change, addressed Adam's comments

View in context: https://bugs.webkit.org/attachment.cgi?id=85931&action=review

> Source/WebCore/inspector/front-end/DetailedHeapshotView.js:296
> +                            this.appendChild(new WebInspector.DataGridNode({path:WebInspector.UIString("Can't find any paths."), len:""}, false));

Make sure localized string is added.

> Source/WebCore/inspector/front-end/DetailedHeapshotView.js:401
> +    retainingPathsTitle.textContent = WebInspector.UIString("Paths from the selected object");

ditto
Comment 31 anton muhin 2011-03-16 11:08:28 PDT
LGTM
Comment 32 Mikhail Naganov 2011-03-16 11:38:37 PDT
(In reply to comment #30)
> (From update of attachment 85931 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=85931&action=review
> 
> > Source/WebCore/inspector/front-end/DetailedHeapshotView.js:296
> > +                            this.appendChild(new WebInspector.DataGridNode({path:WebInspector.UIString("Can't find any paths."), len:""}, false));
> 
> Make sure localized string is added.
> 
> > Source/WebCore/inspector/front-end/DetailedHeapshotView.js:401
> > +    retainingPathsTitle.textContent = WebInspector.UIString("Paths from the selected object");
> 
> ditto

Good catch! I will add them prior to committing.
Comment 33 Mikhail Naganov 2011-03-17 03:16:10 PDT
Manually committed: http://trac.webkit.org/changeset/81265

2011-03-16 Mikhail Naganov < mnaganov@chromium.org>

        Reviewed by Pavel Feldman.

        [Chromium] Report object groups and single DOM-related objects
        to the new heap profiler.
         https://bugs.webkit.org/show_bug.cgi?id=53659

        * Android.v8bindings.mk:
        * WebCore.gypi:
        * WebCore.pro:
        * bindings/scripts/CodeGeneratorV8.pm:
        * bindings/v8/RetainedDOMInfo.cpp: Added.
          (WebCore::RetainedDOMInfo::RetainedDOMInfo):
        * bindings/v8/RetainedDOMInfo.h: Added.
        * bindings/v8/RetainedObjectInfo.h: Added.
        * bindings/v8/ScriptProfiler.cpp:
          (WebCore::retainedDOMInfo):
          (WebCore::ScriptProfiler::initialize):
        * bindings/v8/ScriptProfiler.h:
        * bindings/v8/V8DOMWindowShell.cpp:
          (WebCore::V8DOMWindowShell::initContextIfNeeded):
        * bindings/v8/V8GCController.cpp:
          (WebCore::GroupId::GrouperItem::GrouperItem):
          (WebCore::GroupId::GrouperItem::groupId):
          (WebCore::GroupId::GrouperItem::createRetainedObjectInfo):
          (WebCore::calculateGroupId):
          (WebCore::GrouperVisitor::visitDOMWrapper):
          (WebCore::GrouperVisitor::applyGrouping):
        * bindings/v8/WrapperTypeInfo.h:
        * inspector/front-end/DetailedHeapshotGridNodes.js:
          (WebInspector.HeapSnapshotConstructorNode):
          (WebInspector.HeapSnapshotConstructorNode.prototype._createNodesProvider):
          (WebInspector.HeapSnapshotDiffNode):
          (WebInspector.HeapSnapshotDiffNode.prototype._createNodesProvider.createProvider):
          (WebInspector.HeapSnapshotDiffNode.prototype._createNodesProvider):
        * inspector/front-end/DetailedHeapshotView.js:
          (WebInspector.HeapSnapshotRetainingPathsList.prototype.setDataSource):
          (WebInspector.HeapSnapshotRetainingPathsList.prototype.refresh):
          (WebInspector.HeapSnapshotRetainingPathsList.prototype.showNext.startSearching):
          (WebInspector.HeapSnapshotRetainingPathsList.prototype.showNext):
          (WebInspector.HeapSnapshotRetainingPathsList.prototype._setRootChildrenForFinder):
          (WebInspector.DetailedHeapshotView.prototype._changeRetainingPathsRoot):
          (WebInspector.DetailedHeapshotView.prototype.get isTracingToWindowObjects):
        * inspector/front-end/HeapSnapshot.js:
          (WebInspector.HeapSnapshotNode.prototype.get className):
          (WebInspector.HeapSnapshot.prototype._buildAggregates):
          (WebInspector.HeapSnapshotPathFinder.prototype.updateRoots):
          (WebInspector.HeapSnapshotPathFinder.prototype._fillRootChildren):
        * inspector/front-end/heapProfiler.css:
          (.detailed-heapshot-view .retaining-paths-view .title > span):
          (.detailed-heapshot-view .retaining-paths-to-windows):