Bug 154847

Summary: Add ability to generate a Heap Snapshot
Product: WebKit Reporter: Joseph Pecoraro <joepeck>
Component: JavaScriptCoreAssignee: Joseph Pecoraro <joepeck>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, bburg, commit-queue, fpizlo, ggaren, joepeck, kling, mark.lam, ossy, saam, timothy
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 10930    
Attachments:
Description Flags
[PATCH] Proposed Fix
none
[PATCH] Proposed Fix
mark.lam: review+, joepeck: commit-queue-
[PATCH] Add Edge Data - Snapshot Property Names / Indexes / Variable Names
none
[PATCH] For Landing none

Description Joseph Pecoraro 2016-03-01 00:00:32 PST
* SUMMARY
Add ability to generate a Heap Snapshot.

Requirements / desired characteristics of the snapshot:
  - serializable as JSON (useful for the Inspector protocol)
  - unique identification of JSCell instances across multiple snapshots
  - nodes and edges to be able to construct a graph of which nodes reference/retain which other nodes
  - cell information - identifier, type (className), size
  - edge information - from cell, to cell, type / data (to come in a future patch)

* NOTES
This patch adds the functionality to `jsc` for testing. However the same snapshot generation and json serialization will be used by Web Inspector.
Comment 1 Joseph Pecoraro 2016-03-01 00:49:37 PST
Created attachment 272553 [details]
[PATCH] Proposed Fix

The costs to normal GC are all in SlotVisitor:
  - two assignments per visit to a cell (setting/clearing m_currentCell)
  - a branch when marking a node (if builder -> addNode)
  - a branch when appending a new address to a cell (if builder -> addEdge)

Since this happens in parallel I would expect the performance cost to be negligible, but I want to measure performance anyways. A later patch will add a few more branches in various visitChildren methods to record extra data about edges (such as property names or property index).
Comment 2 WebKit Commit Bot 2016-03-01 00:50:26 PST
Attachment 272553 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:51:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:52:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:53:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:54:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:73:  The parameter name "profiler" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:83:  The parameter name "cell" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:89:  Extra space before ( in function call  [whitespace/parens] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:148:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:151:  Extra space before ( in function call  [whitespace/parens] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:165:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:197:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 11 in 21 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Joseph Pecoraro 2016-03-01 00:54:21 PST
Comment on attachment 272553 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=272553&action=review

> Source/JavaScriptCore/tests/heapProfiler/driver/driver.js:52
> +// NOTE: This creates a lot of objects that make it easy to walk a node graph.
> +// However when a test creates multiple snapshots with snapshots in scope
> +// this can quickly explode into a snapshot with a massive number of nodes/edges.

This is rather unfortunate and isn't an issue with the snapshot itself but just how I chose to interact with it. This doesn't happen when inspecting a page, because all of the heap snapshot nodes/edges aren't created in the JS Context of the page being inspected, however it hints that snapshotting a page with a large complex / pointer entangled structure can create a large snapshot (# of nodes and edge). I will look into moving away from this approach of creating an object per-Node/Edge and instead used a compact data structure with indexes into it or lazily created objects.
Comment 4 Joseph Pecoraro 2016-03-01 17:13:52 PST
Created attachment 272615 [details]
[PATCH] Proposed Fix

Version 2. Adds a cheaper snapshot representation for JavaScriptCore/tests/heapProfiler used by the existing basic tests. The next patch, which names edges will make use of the other type.
Comment 5 WebKit Commit Bot 2016-03-01 17:15:55 PST
Attachment 272615 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:51:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:52:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:53:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:54:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:89:  Extra space before ( in function call  [whitespace/parens] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:147:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:150:  Extra space before ( in function call  [whitespace/parens] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:164:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:196:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 9 in 20 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 6 Joseph Pecoraro 2016-03-01 17:26:16 PST
Created attachment 272616 [details]
[PATCH] Add Edge Data - Snapshot Property Names / Indexes / Variable Names

I plan on putting this on a different bug, but because it is highly related to the existing patch they may be useful to view together.

I need to performance test this. It adds another assignment and a few branches to JSObject/JSFinalObject::visitChildren and Scope types, but otherwise normal GC is unaffected.
Comment 7 Mark Lam 2016-03-02 10:32:00 PST
Comment on attachment 272615 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=272615&action=review

I'm still the looking over the code, but thought I'd post these comments first for your consideration.

> Source/JavaScriptCore/ChangeLog:30
> +        time a we give the cell a unique identifier and add it to the the

typo: "time a we" ==> "time, we"

> Source/JavaScriptCore/ChangeLog:46
> +        * CMakeLists.txt:
> +        * JavaScriptCore.xcodeproj/project.pbxproj:
> +        Add new files to the build.

You also need to update the VC++ project files.

> Source/JavaScriptCore/ChangeLog:84
> +        (JSC::HeapSnapshot::sweepCell):
> +        (JSC::HeapSnapshot::shrinkToFit):
> +        Collect a list of cells for sweeping and then remove them all at once
> +        in shrinkToFit. This is done to avoid thrashing of individual removes
> +        that could cause many overlapping moves within the Vector.

You don't need to do this.  Here's a way that you can keep the "swept" status without allocating extra memory (and is faster too):

1. When sweeping cells in the snapshot, set its low bit to 1.  Since cell pointers always have 0 as low bits, this will distinguish the swept ones from the unswept ones while preserving sort order for binary searches.
2. In shrinkToFit, compact the snapshot by purging the cells with the low bit set.

> Source/JavaScriptCore/ChangeLog:89
> +        Sort the list, and also cache the bounding start/stop identifiers.
> +        No other snapshot can contain an identifier in this range, so it will
> +        improve lookup of a node from an identifier.

Why do you have to cache the start and stop identifiers?  If the list is sorted, can't you just peek at the first and last nodes every time for their IDs?
Comment 8 Joseph Pecoraro 2016-03-02 11:14:41 PST
> You don't need to do this.  Here's a way that you can keep the "swept"
> status without allocating extra memory (and is faster too):
> 
> 1. When sweeping cells in the snapshot, set its low bit to 1.  Since cell
> pointers always have 0 as low bits, this will distinguish the swept ones
> from the unswept ones while preserving sort order for binary searches.
> 2. In shrinkToFit, compact the snapshot by purging the cells with the low
> bit set.

I'll give it a shot!


> > Source/JavaScriptCore/ChangeLog:89
> > +        Sort the list, and also cache the bounding start/stop identifiers.
> > +        No other snapshot can contain an identifier in this range, so it will
> > +        improve lookup of a node from an identifier.
> 
> Why do you have to cache the start and stop identifiers?  If the list is
> sorted, can't you just peek at the first and last nodes every time for their
> IDs?

The list is sorted by cell address, which do not align with identifiers. That means the first and last identifiers may be attached to a node somewhere in the middle of the list.
Comment 9 Mark Lam 2016-03-02 11:16:37 PST
Comment on attachment 272615 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=272615&action=review

More comments for consideration.  And reminder: you might want to update the copyright year in the files you're modifying.

> Source/JavaScriptCore/heap/Heap.cpp:766
> +    if (HeapProfiler* heapProfiler = m_vm->heapProfiler())
> +        return !!heapProfiler->activeSnapshotBuilder();

nit: use UNLIKELY() around the heapProfiler condition.  And the "!!" is not necessary.

> Source/JavaScriptCore/heap/Heap.cpp:788
> +    if (HeapProfiler* heapProfiler = m_vm->heapProfiler()) {

nit: maybe remove this condition in favor of checking at the call site (see below)?

> Source/JavaScriptCore/heap/Heap.cpp:1164
> +    removeDeadHeapSnapshotNodes();

nit: Since the normal configuration is not to use the HeapProfiler, why not do the check for the profiler here?

    if (UNLIKELY(HeapProfiler* heapProfiler = m_vm->heapProfiler()))
        removeDeadHeapSnapshotNodes(heapProfiler);

> Source/JavaScriptCore/heap/HeapSnapshot.h:55
> +    Vector<HeapSnapshotNode> m_nodes;

I see that HeapSnapshotNode is a struct with a cell pointer and an unsigned id.  The current approach consumes 16 bytes per node, and hence, you'll end up consuming 16 bytes per live object in the heap.  Have you considered using 2 parallel vectors instead: 1 for the cell pointer and 1 for the ids.  Or do you have a use for the extra 4 bytes later?
Comment 10 Mark Lam 2016-03-02 13:15:18 PST
Comment on attachment 272615 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=272615&action=review

LGTM though I think the snapshot node sweeping implementation can be more optimized.

> Source/JavaScriptCore/heap/HeapSnapshot.cpp:99
> +    if (!isEmpty()) {
> +        m_firstObjectIdentifier = m_nodes.first().identifier;
> +        m_lastObjectIdentifier = m_nodes.last().identifier;
> +    }

Maybe add a comment here that nodes are always appended during snapshot capture in identifier order, and that finalize() is only called while the entries are still sorted in identifier order.

> Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:71
> +    bool hasExistingNode = hasExistingNodeForCell(cell);
> +    if (hasExistingNode)

No need to cache the hasExistingNode bool here since it is used only once.

> Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:88
> +    std::lock_guard<Lock> lock(m_appendingMutex);

You can use a different mutex for the edges since there is never any conflict between the edges and the snapshot.  That will allow for less potential contention, and WTF locks are cheap.

> Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:122
> +//   - eliminate duplicate edge extra data strings, have an index into a de-duplicated like edgeTypes / nodeClassNames.

Looks like you're already applying this above (with the use of nodeClassNameIndex and edgeTypeIndex).  Hence, this comment is not needed.

> Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:55
> +    // FIXME: Add "Weak" edges.

Do we have weak edges that survives GCs?  If not, do we need this?  Else, include a bug url here?

> Source/JavaScriptCore/jsc.cpp:491
> +    void finishCreation(VM& vm)
> +    {
> +        Base::finishCreation(vm);
> +    }

You don't need this.  Not having implies that we use the base's finishCreation().
Comment 11 Joseph Pecoraro 2016-03-02 15:42:15 PST
> > Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:122
> > +//   - eliminate duplicate edge extra data strings, have an index into a de-duplicated like edgeTypes / nodeClassNames.
> 
> Looks like you're already applying this above (with the use of
> nodeClassNameIndex and edgeTypeIndex).  Hence, this comment is not needed.

This is for edge data, which isn't yet implemented.


> > Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:55
> > +    // FIXME: Add "Weak" edges.
> 
> Do we have weak edges that survives GCs?  If not, do we need this?  Else,
> include a bug url here?

I'll include a bug here. A Weak edge would be useful to show that a WeakSet/WeakMap references another cell, but not strongly. This could be useful in a visualization.
Comment 12 Joseph Pecoraro 2016-03-02 16:13:12 PST
Created attachment 272701 [details]
[PATCH] For Landing
Comment 13 WebKit Commit Bot 2016-03-02 20:28:53 PST
Attachment 272701 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:146:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:149:  Extra space before ( in function call  [whitespace/parens] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:163:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.cpp:195:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:51:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:52:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:53:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:54:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/JavaScriptCore/heap/HeapSnapshotBuilder.h:89:  Extra space before ( in function call  [whitespace/parens] [4]
Total errors found: 9 in 22 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 14 WebKit Commit Bot 2016-03-02 21:15:50 PST
Comment on attachment 272701 [details]
[PATCH] For Landing

Clearing flags on attachment: 272701

Committed r197489: <http://trac.webkit.org/changeset/197489>
Comment 15 WebKit Commit Bot 2016-03-02 21:15:54 PST
All reviewed patches have been landed.  Closing bug.
Comment 16 Csaba Osztrogonác 2016-03-03 03:43:12 PST
(In reply to comment #14)
> Comment on attachment 272701 [details]
> [PATCH] For Landing
> 
> Clearing flags on attachment: 272701
> 
> Committed r197489: <http://trac.webkit.org/changeset/197489>

Just to document, Windows buildfix landed in
http://trac.webkit.org/changeset/197500
Comment 17 Joseph Pecoraro 2016-03-03 15:01:59 PST
> Just to document, Windows buildfix landed in
> http://trac.webkit.org/changeset/197500

Doh, thanks!