Bug 71956

Summary: Cache and reuse HTMLCollections exposed by Document.
Product: WebKit Reporter: Andreas Kling <kling>
Component: DOMAssignee: Andreas Kling <kling>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, darin, fishd, ggaren, koivisto, ojan, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 72157    
Bug Blocks:    
Attachments:
Description Flags
Proposed patch
none
Proposed patch
koivisto: review+
run before regression stdio
none
run after regression stdio
none
Proposed patch v2
none
Proposed patch v3 none

Description Andreas Kling 2011-11-09 14:25:21 PST
The following HTMLCollection objects are exposed to the web by Document (and HTMLDocument):
* all
* images
* embeds
* applets
* links
* forms
* anchors
* scripts

We currently create new HTMLCollection objects on demand when these getters are called, meaning you can end up with quite a few of these objects.
As an example, the full HTML5 spec page creates 34001 HTMLCollection objects by calling document.links repeatedly.

Both Firefox and Opera cache and reuse the same HTMLCollection objects (as evidenced by the returned collections containing properties that were set on collections previously returned by the same getters.)

We could match this behavior and make very good memory savings in some cases (~800 kB on full HTML5 spec.)

I believe this change is safe since the HTMLCollection API we expose to the web is stateless.
Comment 1 Andreas Kling 2011-11-10 08:11:41 PST
Created attachment 114503 [details]
Proposed patch

First stab at this. I didn't touch the document.all collection here, since it comes with some emotional baggage in the JS bindings.
Comment 2 Andreas Kling 2011-11-10 08:13:03 PST
Created attachment 114504 [details]
Proposed patch

Oops, forgot to include fast/dom/gc-9.html rebaseline.
Comment 3 Darin Adler 2011-11-10 08:18:56 PST
Comment on attachment 114504 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=114504&action=review

> Source/WebCore/dom/Document.h:1429
> +    Vector<RefPtr<HTMLCollection>, NumCollectionTypes> m_collections;

Since this is a fixed size array, then you should use an array, not a Vector.

    RefPtr<HTMLCollection> m_collections[NumCollectionTypes];
Comment 4 Andreas Kling 2011-11-10 08:36:34 PST
(In reply to comment #3)
> (From update of attachment 114504 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=114504&action=review
> 
> > Source/WebCore/dom/Document.h:1429
> > +    Vector<RefPtr<HTMLCollection>, NumCollectionTypes> m_collections;
> 
> Since this is a fixed size array, then you should use an array, not a Vector.
> 
>     RefPtr<HTMLCollection> m_collections[NumCollectionTypes];

Sure, I can do that instead. I just liked that Vector would handle initialization and deletion automatically. :)
Comment 5 Andreas Kling 2011-11-10 08:39:37 PST
(In reply to comment #4)
> Sure, I can do that instead. I just liked that Vector would handle initialization and deletion automatically. :)

Um, disregard that comment, I'm being silly.
Comment 6 Antti Koivisto 2011-11-10 08:40:14 PST
Comment on attachment 114504 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=114504&action=review

r=me

It would be good if we had more tests in this area. The fact that this affects one test only is slightly discorncerning (though it does document the new behavior).

> Source/WebCore/dom/Document.h:1429
> +    Vector<RefPtr<HTMLCollection>, NumCollectionTypes> m_collections;

I would just use plain array of RefPtrs here as the length is fixed.
Comment 7 Andreas Kling 2011-11-10 09:09:50 PST
Committed r99869: <http://trac.webkit.org/changeset/99869>
Comment 8 Alexey Proskuryakov 2011-11-10 11:05:01 PST
> RefPtr<HTMLCollection> m_collections[NumCollectionTypes]; 

While it's reassuring that gc-9 test passes, I don't understand what protects wrappers of these objects. Is there some magic?
Comment 9 Andreas Kling 2011-11-10 17:11:52 PST
(In reply to comment #8)
> > RefPtr<HTMLCollection> m_collections[NumCollectionTypes]; 
> 
> While it's reassuring that gc-9 test passes, I don't understand what protects wrappers of these objects. Is there some magic?

Good question. I don't know how these are protected from GC, I assumed the "default" DOM wrapper protection would be enough here. CC'ing some people that might be able to shed some light on how this mechanism works.
Comment 10 Ryosuke Niwa 2011-11-11 11:43:00 PST
Reopen the bug since it was rolled out in http://trac.webkit.org/changeset/99995.
Comment 11 Ojan Vafai 2011-11-11 12:09:37 PST
This was the cause of the perf regression: http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/intl1/report.html?history=150&rev=109653.

kling, ping me offline and I can try to get you a way to run this test locally.
Comment 12 Ojan Vafai 2011-11-11 20:35:02 PST
FYI, the tracking chromium bug for the perf regression crbug.com/103868.
Comment 13 Andreas Kling 2011-11-12 03:46:54 PST
(In reply to comment #12)
> FYI, the tracking chromium bug for the perf regression crbug.com/103868.

Just to be clear, is this a speed or VM size regression? Or both? I'm a bit unfamiliar with the Chromium cycler lingo. :/
Comment 14 Ojan Vafai 2011-11-12 10:10:13 PST
(In reply to comment #13)
> (In reply to comment #12)
> > FYI, the tracking chromium bug for the perf regression crbug.com/103868.
> 
> Just to be clear, is this a speed or VM size regression? Or both? I'm a bit unfamiliar with the Chromium cycler lingo. :/

Speed. Those graphs default to time. This does seem to have had an effect on memory as well.

~5% improvement: http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/intl1/report.html?history=150&rev=-1&graph=total_byte_b

~80% regression: http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/intl1/report.html?history=150&rev=-1&graph=ws_peak_r

I'm not too familiar with the memory numbers myself. I'll try and find someone more familiar with this stuff to explain what total_byte_b adn ws_peak_r refer to.
Comment 15 Ojan Vafai 2011-11-12 10:21:14 PST
TL;DR version: On the chromium Vista bot
~365% memory regression: http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/intl1/report.html?history=150&rev=-1&graph=ws_peak_r

~7% runtime regresson: http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/intl1/report.html?history=150&rev=-1&graph=times

The regression on the mac bot has different percentages but shows roughly the same result.

DETAILS:
http://www.chromium.org/developers/testing/chromium-build-infrastructure/performance-test-plots

_b refers to browser process memory use. _r refers to renderer (i.e. webkit) memory use. So, ws_peak_r refers to the peak working set size of the renderer while running the test. 

Looks like the memory improvement in the browser process comes from read_byte_b, which is "I/O ops or bytes read": http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/intl1/report.html?history=150&rev=-1&graph=read_byte_b

But there's also a regression in writing I/O ops in the browser process: http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/intl1/report.html?history=150&rev=-1&graph=write_byte_b

I'm not really sure how this change affects chromium's browser process at all, but in the renderer process it's more simple (i.e. it's mainly just WebKit).
Comment 17 Ojan Vafai 2011-11-12 10:28:43 PST
What these tests do is they load static copies of websites of local disk. Once they hit the window's onload event, they'll load the next page. And it cycles 10 times through all the pages. So, the time is essentially the sum of the load of all the pages.
Comment 18 Andreas Kling 2011-11-14 02:05:44 PST
Thanks a lot for the informative write-up, Ojan! I'll return to this patch when I'm back from vacation.
Comment 19 Ojan Vafai 2011-11-14 14:03:45 PST
Specific pages from the intl1 suite where the mean runtime is >7 ms slower after the patch than before. When you get around to trying to fix this, I can get your appropriate reductions of our copies of these sites.

allegro.pl
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   48.75   67.50   94.55  126.20  265.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  11.00   53.75   75.50  102.60  123.00  355.00 
blog.skyrock.com
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   47.00   63.50   92.29  116.00  503.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   10.0    51.0    70.0    99.5   119.5   557.0 
elmundo.es
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   48.50   67.00   94.24  126.50  265.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   11.0    53.5    75.0   103.0   127.0   355.0 
fc2.com
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   47.50   64.00   93.18  119.00  503.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   10.0    51.0    70.0   100.5   122.0   557.0 
free.fr
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   9.00   48.00   67.00   91.41  111.80  313.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   51.25   71.00   98.59  113.00  331.00 
hatena.ne.jp
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   48.25   67.00   94.69  126.80  265.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  11.00   53.25   75.50  103.70  131.00  355.00 
kakaku.com
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   47.25   63.50   91.59  113.00  503.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   51.00   70.00   98.93  116.50  557.00 
naftemporiki.gr
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   9.00   48.00   67.00   91.77  112.00  313.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   51.00   71.00   99.11  115.00  331.00 
pchome.com.tw
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   48.00   67.00   93.87  126.00  265.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   11.0    53.0    75.0   103.4   135.0   355.0 
ricardo.ch
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   47.00   63.00   91.45  113.00  503.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   51.00   70.00   99.02  117.00  557.00 
ruten.com.tw
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   9.00   47.75   67.00   89.88  111.20  313.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   51.00   70.50   97.52  109.00  331.00 
uwants.com
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   47.75   67.00   91.79  123.80  265.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  11.00   52.75   73.00  101.40  121.50  355.00 
www.auction.co.kr
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   47.75   63.50   92.48  116.00  503.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   10.0    51.0    70.0   100.3   119.5   557.0 
www.chosun.com
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   9.00   47.50   67.00   89.75  111.50  313.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   51.00   70.00   97.86  111.00  331.00 
www.eastmoney.com
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   47.50   67.00   89.04  117.50  265.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  11.00   52.50   71.00   98.86  116.00  355.00
Comment 20 Ojan Vafai 2011-11-14 15:10:37 PST
Ack. I totally screwed up that analysis. I'll upload the raw stdio data so that the analysis can be done properly as per https://sites.google.com/a/chromium.org/dev/developers/how-tos/using-r-to-reduce-page-cycler-regressions should I screw it up again.
Comment 21 Ojan Vafai 2011-11-14 15:11:25 PST
Created attachment 115039 [details]
run before regression stdio
Comment 22 Ojan Vafai 2011-11-14 15:12:06 PST
Created attachment 115040 [details]
run after regression stdio

These are the stdio results from the Chromium Vista Perf webkit canary bot.
Comment 23 Ojan Vafai 2011-11-14 15:13:48 PST
Now with the proper analysis, these are the sites that regressed.

[1] "haberturk.com"
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  130.0   130.2   131.0   137.2   132.8   189.0 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  145.0   147.0   148.5   151.1   149.8   176.0 
[1] "pchome.com.tw"
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   47.0    48.0    48.0    63.8    49.0   202.0 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   69.0    70.0    71.0    85.4    73.0   206.0 
[1] "sport24.gr"
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   74.0    74.0    74.0    76.5    75.5    95.0 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  81.00   81.25   83.50   92.40   98.00  125.00 
[1] "udn.com"
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  107.0   108.2   110.0   130.1   111.5   309.0 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  134.0   136.5   138.5   156.4   149.0   291.0 
[1] "www.alice.it"
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  55.00   56.00   57.00   64.60   70.75  100.00 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   59.0    67.0    86.0    82.0    95.5   100.0 
[1] "www.auction.co.kr"
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  261.0   265.0   265.5   290.0   267.5   503.0 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  317.0   327.2   329.0   353.0   331.8   557.0 
[1] "www.danawa.com"
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  344.0   347.0   348.0   358.2   351.2   442.0 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  364.0   369.0   371.5   380.7   386.8   430.0 
[1] "www.dcinside.com"
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  204.0   205.0   206.0   220.6   208.8   318.0 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  221.0   230.2   232.0   249.3   232.8   396.0 
[1] "www.eastmoney.com"
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  112.0   112.2   114.0   119.4   115.0   172.0 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  115.0   115.2   116.5   133.2   119.2   261.0
Comment 24 Andreas Kling 2011-11-14 15:56:36 PST
(In reply to comment #8)
> > RefPtr<HTMLCollection> m_collections[NumCollectionTypes]; 
> 
> While it's reassuring that gc-9 test passes, I don't understand what protects wrappers of these objects. Is there some magic?

Looking at CodeGeneratorJS.pm, around line 2179, it reads to me like we tie the HTMLCollection to the base node (the Document in this case) via root(Node*).
Comment 25 Andreas Kling 2011-12-16 13:04:57 PST
Created attachment 119657 [details]
Proposed patch v2

Okay, finally came back to this. The problem was that this patch introduced a reference cycle between Document and its cached HTMLCollections.
This updated version avoids that by having the cached collections not retain their base node pointers (the Document.)
Comment 26 Andreas Kling 2011-12-16 13:12:40 PST
Created attachment 119658 [details]
Proposed patch v3

Just recalled Alexey requesting that I use === in the test rather than ==. Same patch otherwise.
Comment 27 Antti Koivisto 2011-12-16 13:51:36 PST
Comment on attachment 119658 [details]
Proposed patch v3

r=me
Comment 28 Andreas Kling 2011-12-16 15:11:19 PST
Comment on attachment 119658 [details]
Proposed patch v3

Clearing flags on attachment: 119658

Committed r103115: <http://trac.webkit.org/changeset/103115>
Comment 29 Andreas Kling 2011-12-16 15:11:33 PST
All reviewed patches have been landed.  Closing bug.
Comment 30 Darin Adler 2011-12-16 15:28:00 PST
Comment on attachment 119658 [details]
Proposed patch v3

View in context: https://bugs.webkit.org/attachment.cgi?id=119658&action=review

> Source/WebCore/ChangeLog:25
> +        collections that are cached on Document do not retained their base

“do not retain”.

> Source/WebCore/dom/Document.h:1188
> +    const RefPtr<HTMLCollection>& cachedCollection(CollectionType);

Why not just return HTMLCollection*?

> Source/WebCore/html/HTMLCollection.cpp:73
> +    if (m_baseIsRetained)
> +        m_base->deref();

A little confusing to use retain to mean ref/deref but I can’t think of obviously better wording.
Comment 31 Andreas Kling 2011-12-16 15:33:28 PST
(In reply to comment #30)
> (From update of attachment 119658 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=119658&action=review
> 
> > Source/WebCore/dom/Document.h:1188
> > +    const RefPtr<HTMLCollection>& cachedCollection(CollectionType);
> 
> Why not just return HTMLCollection*?

Good point.
 
> > Source/WebCore/html/HTMLCollection.cpp:73
> > +    if (m_baseIsRetained)
> > +        m_base->deref();
> 
> A little confusing to use retain to mean ref/deref but I can’t think of obviously better wording.

Yeah, I agree. This bool will go away in a subsequent patch that makes document.all cached. With that, we can use m_base->isDocumentNode() to know if the node is retained/reffed or not.