Bug 59449 - Investigate template bloat in HashTable
Summary: Investigate template bloat in HashTable
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-04-26 05:30 PDT by Tony Gentilcore
Modified: 2012-11-02 12:47 PDT (History)
5 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tony Gentilcore 2011-04-26 05:30:28 PDT
A treemap of the chromium binary size by compilation unit is available here: http://neugierig.org/software/chromium/bloat/

Zoom in on WebKit, then JavaScriptCore, then wtf. Notice that all of wtf is 520k and of that ~70% (370k) is HashTable+HashMap+related classes. It looks to me like the template structure is forcing the entire class to be recompiled for every combination of key/value type used (and it is used quite a bit in WebCore). Obviously this is a very performance sensitive class, but maybe there are some ways to preserve the perf and still cut down on some space.

Some ideas:
- Hoist out parts of the implementation that aren't type dependent into a base class (although they look scarce). This was very successful for RefPtr in http://trac.webkit.org/changeset/36663.
- Specialize a version for pointers so that they can all share one, instead of having a different one for each pointer.
- There are 4 nearly identical lookup* implementations. They were expanded in http://trac.webkit.org/changeset/27176 for a perf improvement, but perhaps there is an alternative to keep the perf without so much space.
Comment 1 Torne (Richard Coles) 2011-04-28 09:02:51 PDT
For reference, a build I have contains around 300 instantiations of the HashTable template. Most have a value type which is a std::pair<int, some_type*> or similar.

There's about 575 instantiations of the various lookup* methods, the linker is throwing the extra ones away because they are unused, so the multiple lookup methods is only part of the problem.
Comment 2 Tony Gentilcore 2011-04-28 10:17:24 PDT
Sam/AP, I think Richard and I are both stumped here. There's almost no code that can easily be hoisted to a base class, and specialization doesn't really work because of the pair. Nevertheless, I'll leave this bug open because it is a pretty significant source of binary bloat. Perhaps you or someone else might have a better idea of how to approach this.
Comment 3 Julien Chaffraix 2011-05-03 07:47:06 PDT
Just FYI, this article http://www.lubomir.org/academic/MinimizingCodeBloat.pdf was what prompted me to do some template bloat removal. I could not apply the full technique on HashMap / HashTable as I did not find a smart way of reducing the "templated space". From what Richard mentioned, having a way of to map most of our std::pair<int, some_pointer*> to a common class would likely help.

I also wonder if the hoisting optimization is still relevant as the compilers have gotten smarter and may do it automatically now (I have no proof to back any of these claims though).
Comment 4 Satish Sampath 2011-05-10 07:53:10 PDT
I tried some basic techniques including template hoisting, template specialization for pointer types (similar to STLPort's _STLP_USE_PTR_SPECIALIZATIONS), moving common code around etc. but I could only save about a dozen kb at most with these. Given the heavy use of these templates in WebCore, it would be a highly useful project if more people could take a deeper look at reducing the code bloat.