Bug 31217 - Optimize code size of V8 DOM bindings
Summary: Optimize code size of V8 DOM bindings
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P3 Enhancement
Assignee: Jens Alfke
Depends on: 31362 31368 31383 31420 31443
  Show dependency treegraph
Reported: 2009-11-06 16:53 PST by Jens Alfke
Modified: 2009-11-12 16:28 PST (History)
2 users (show)

See Also:

Description of changes to CodeGeneratorV8 (6.92 KB, text/plain)
2009-11-06 16:53 PST, Jens Alfke
no flags Details
patch (57.50 KB, patch)
2009-11-09 11:03 PST, Jens Alfke
abarth: review-
Details | Formatted Diff | Diff
patch 2 (61.42 KB, patch)
2009-11-09 14:04 PST, Jens Alfke
dglazkov: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jens Alfke 2009-11-06 16:53:08 PST
Created attachment 42675 [details]
Description of changes to CodeGeneratorV8

The V8 DOM bindings (generated by CodeGeneratorV8.pm) are a surprisingly large amount of code. I haven't measured their total size (that would require some tricky parsing of the link map) but as a data point, compiling just those bindings with -Os instead of the usual -O3 (i.e. optimize for size, not speed) reduces the overall size of the Chrome binary for Mac by 1MB. That implies there's a very large amount of stuff being inlined.

1MB isn't huge in the grand scheme of things, but larger bindings reduce locality of reference and get in the way of the CPU caches. The code size is also significant on Android, where the ROM is of limited capacity.

Just changing the compiler flag reduces Dromaeo benchmark performance, though. So instead I looked at what constructs are generating the most machine code and tried to fix those, while also adding a few speed optimizations I found along the way. I made non-inlined equivalents of many of the utility methods the bindings call, and created some new functions for them to call instead of inlining common sequences of calls. I also made more of the class initialization table-driven.

The overall result is that I've shrunk the code size by about 600k, with no significant effect on the Dromaeo benchmarks. (3 of the 4 core DOM scores went up by 1-4%, and one went down 2%.)

I've attached a text document that describes the types of changes I made to the code generator.
Comment 1 Jens Alfke 2009-11-09 11:03:08 PST
Created attachment 42770 [details]

Here's the patch. Please read the text document (attached) first, as it describes at a high level what I changed.

I think it might be best to do an in-person walkthrough review of at least CodeGeneratorV8.pm, since the changes are quite numerous and sometimes complex.
Comment 2 Jens Alfke 2009-11-09 11:26:05 PST
I've put copies of the generated bindings, both with and without my
code-generator change, in my home dir on the Google corp network, at
/home/snej/V8Bindings/. Looking at the differences should help clear up what
the code-generator changes are doing.
Comment 3 Adam Barth 2009-11-09 12:24:52 PST
Comment on attachment 42770 [details]

I haven't looked at the code yet, but there's no changelog.
Comment 4 Jens Alfke 2009-11-09 13:48:14 PST
Dimitri is already reviewing this.
Comment 5 Adam Barth 2009-11-09 13:51:59 PST
Still needs a ChangeLog ;)
Comment 6 Jens Alfke 2009-11-09 14:04:03 PST
Created attachment 42800 [details]
patch 2

A few helper functions renamed to remove "v8" prefix at Dimitri's suggestion.
Added Changelog.

Note: dglazkov is reviewing this.
Comment 7 Dimitri Glazkov (Google) 2009-11-11 09:33:10 PST
Comment on attachment 42800 [details]
patch 2

Ok, this is epic-sized. I see a bunch of smaller patches in it:

1) Converting v8::Undefied to empty -- awesome!
    I would be very careful about this. Returning empty vs. Undefined actually does matter in cases where the callback an interceptor.
    We even have a helper called notHandledByInterceptor to distinguish that case.

2) Introducing transferHiddenDependency -- great!

3) Optimizing exception case -- super-duper!

4) Optimizing "Reflect" case -- even better!

5) Deinlining

6) ...

Let's try to break them up into incremental chunks. This way, if something goes awry, we won't have to revert the whole thing, and we can observe perf/size changes more granularly.
Comment 8 Jens Alfke 2009-11-12 16:24:44 PST
Comment 9 Jens Alfke 2009-11-12 16:28:09 PST
Oops, sorry that got cut off.
This is now all checked in -- it was broken up into five patches/bugs at Dimitri's request. See the "Depends on" list above.