WebKit Bugzilla
Bug 59348: Find large header files
Status: NEW
https://bugs.webkit.org/show_bug.cgi?id=59348
Reported by Nico Weber, 2011-04-25 14:48:20 PDT
Approach:
0.) Get a list of all interesting header files in webkit
1.) Port http://codesearch.google.com/codesearch/p?hl=en#OAMlx_jo-ck/src/tools/include_tracer.py&q=include_tracer.py&exact_package=chromium&sa=N&cd=1&ct=rc to webkit (this gives the "effective" file size of a header file)
2.) Have some simple grep command that counts how often a given header file is included
3.) Run 1 & 2 for every file in 0, sort by file_size * num_includes
4.) Look at the files at the top of this list, make them smaller / cut dependencies
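Steps 3 and 4 boil down to a sort on the product of two numbers. The following is a minimal sketch of that ranking, not one of the scripts attached to this bug; the function name `rank_headers`, the dictionary inputs, and the sample numbers are all hypothetical.

```python
def rank_headers(sizes_mib, include_counts, top=5):
    """Rank headers by effective size times number of includes.

    sizes_mib:      {header path: effective size in MiB}   (hypothetical input)
    include_counts: {header path: number of includes}      (hypothetical input)
    """
    rows = []
    for path, size in sizes_mib.items():
        count = include_counts.get(path, 0)
        rows.append((path, size, count, size * count))
    # Largest file_size * num_includes first.
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows[:top]


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sizes = {"a/Big.h": 1.5, "a/Popular.h": 0.25, "a/Rare.h": 0.9}
    counts = {"a/Big.h": 10, "a/Popular.h": 400, "a/Rare.h": 2}
    for path, size, count, score in rank_headers(sizes, counts):
        print("%s: %.3f %d %.3f" % (path, size, count, score))
```

A small but widely included header can outrank a large but rarely included one, which is why the product is used rather than either number alone.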
Attachments
Script that estimates effective header size (6.45 KB, text/x-python-script), 2011-04-25 15:39 PDT, Nico Weber, no flags
slightly better script (6.51 KB, text/x-python-script), 2011-04-25 15:41 PDT, Nico Weber, no flags
h file sizes (276.04 KB, text/plain), 2011-04-25 15:57 PDT, Nico Weber, no flags
Patch (8.81 KB, patch), 2011-04-25 16:17 PDT, Eric Seidel (no email), no flags
find-includes.sh (141 bytes, application/octet-stream), 2011-04-25 16:18 PDT, Nico Weber, no flags
Patch (9.02 KB, patch), 2011-04-25 16:19 PDT, Eric Seidel (no email), ojan: review-, ojan: commit-queue-
join script (281 bytes, text/x-python-script), 2011-04-25 16:19 PDT, Nico Weber, no flags
include-tracer with -c flag (6.71 KB, text/x-python-script), 2011-04-25 20:10 PDT, Nico Weber, no flags
h include counts (360.55 KB, text/plain), 2011-04-25 20:10 PDT, Nico Weber, no flags
new join script (382 bytes, text/x-python-script), 2011-04-25 20:11 PDT, Nico Weber, no flags
header files with file size and number of translation units (311.62 KB, text/plain), 2011-04-25 20:12 PDT, Nico Weber, no flags
sizes of translation units (243.07 KB, text/plain), 2011-04-25 20:13 PDT, Nico Weber, no flags
Nico Weber
Comment 1
2011-04-25 14:55:30 PDT
Step 0: git ls-files --full-name *.h
Nico Weber
Comment 2
2011-04-25 15:29:16 PDT
Step 2:

#!/bin/bash
f=$(basename $1 | sed -e 's:\.:\\.:')
git grep -n -e "^#include \"${f}\"$" -- "*.cpp" "*.h" | wc -l

Takes ~6 seconds per .h file though :-/ We have 4832 .h files, so it'd take 8h to get the includes. There's probably some better way.
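One possible "better way" is to read every source file once and tally all quoted includes in a single pass, instead of running one grep per header. The sketch below is an illustration under assumptions, not the attached find-includes.sh: `count_includes` is a hypothetical name, it walks the directory tree rather than asking git for tracked files, and it keys counts by the basename of the included path.

```python
import collections
import os
import re

# Matches lines such as: #include "Foo.h"   (quoted includes only, like the grep above)
INCLUDE_RE = re.compile(r'^#include "([^"]+)"\s*$')


def count_includes(root, extensions=(".h", ".cpp")):
    """Walk `root` once and count quoted #include lines per header basename."""
    counts = collections.Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps odd encodings from aborting the scan.
            with open(path, errors="replace") as source:
                for line in source:
                    match = INCLUDE_RE.match(line)
                    if match:
                        counts[os.path.basename(match.group(1))] += 1
    return counts
```

Like the grep, this counts include lines rather than translation units, so a header included from many other headers is still counted once per including file.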
Nico Weber
Comment 3
2011-04-25 15:39:10 PDT
Created attachment 90962
Script that estimates effective header size

step 1
Nico Weber
Comment 4
2011-04-25 15:41:43 PDT
Created attachment 90965
slightly better script

First half of step 3:

for f in $(git ls-files --full-name *.h); do Tools/Scripts/include-tracer.py -q $f; done
Nico Weber
Comment 5
2011-04-25 15:57:02 PDT
Created attachment 90974
h file sizes

Output of

(for f in $(git ls-files --full-name *.h); do Tools/Scripts/include-tracer.py -q $f; done) | tee h-sizes.txt

Top 10:

thakis$ sort -r -k 2 -n h-sizes.txt | head
Source/WebCore/rendering/RenderView.h: 1.538 MiB
Source/WebKit/qt/WebCoreSupport/PageClientQt.h: 1.447 MiB
Source/WebCore/page/FrameView.h: 1.390 MiB
Source/WebKit/gtk/WebCoreSupport/EditorClientGtk.h: 1.300 MiB
Source/WebCore/loader/EmptyClients.h: 1.291 MiB
Source/WebCore/accessibility/AccessibilityMediaControls.h: 1.192 MiB
Source/WebCore/rendering/RenderLayerCompositor.h: 1.171 MiB
Source/WebCore/platform/efl/RenderThemeEfl.h: 1.153 MiB
Source/WebCore/rendering/RenderMediaControlsChromium.h: 1.122 MiB
Source/WebCore/rendering/RenderMediaControls.h: 1.122 MiB

Turns out there are 89 .h files that evaluate to more than 1MB!
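For readers without the attachment, here is a rough sketch of what "effective" size means: the bytes of a header plus the bytes of everything it includes, with each file counted once. This is an illustration only, not the attached include-tracer.py. `effective_size` and `include_dirs` are hypothetical names, and the sketch ignores angle-bracket includes and preprocessor conditionals.

```python
import os
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"')


def effective_size(path, include_dirs, seen=None):
    """Bytes of `path` plus every quoted header it pulls in, each counted once."""
    seen = set() if seen is None else seen
    real = os.path.realpath(path)
    if real in seen:
        return 0  # already counted, like a header guard
    seen.add(real)
    total = os.path.getsize(real)
    with open(real, errors="replace") as source:
        for line in source:
            match = INCLUDE_RE.match(line)
            if not match:
                continue
            # Look next to the including file first, then in include_dirs.
            for directory in [os.path.dirname(real)] + list(include_dirs):
                candidate = os.path.join(directory, match.group(1))
                if os.path.isfile(candidate):
                    total += effective_size(candidate, include_dirs, seen)
                    break
    return total
```

Includes that cannot be resolved are silently skipped here, so the result is a lower bound; a real tool would want the per-port include path list that the review below asks about.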
Eric Seidel (no email)
Comment 6
2011-04-25 16:17:22 PDT
Created attachment 90990
Patch
Nico Weber
Comment 7
2011-04-25 16:18:30 PDT
Created attachment 90991
find-includes.sh

I'm currently running

(for f in $(git ls-files --full-name *.h); do ./find-includes.sh $f; done) | tee h-counts.txt

which will take a few hours to complete. find-includes.sh is the 2-line script that counts how often a .h is included, I mentioned it somewhere above; also attached.
Eric Seidel (no email)
Comment 8
2011-04-25 16:19:00 PDT
Created attachment 90992
Patch
Nico Weber
Comment 9
2011-04-25 16:19:30 PDT
Created attachment 90994
join script

Once that other command is completed, this script can be used to combine the two outputs.
Mihai Parparita
Comment 10
2011-04-25 16:35:31 PDT
See also bug 52451, where Tony looked at header files included everywhere and the benefits of forward declaration.
Nico Weber
Comment 11
2011-04-25 20:09:43 PDT
The command completed, but I realized it's not very helpful as it double-counts header files. We really only want to know how many different translation units include a header, since including a .h twice in a translation unit costs the same as including it once.

To do this, I added a -c option to the include-tracer script that just prints all header files that the script finds and then used this to find how many translation units include each .h file like this:

time (rm cpp-include-counts.txt
time (for f in $(git ls-files --full-name *.cpp); do Tools/Scripts/include-tracer.py -c $f; done) | tee -a cpp-include-counts.txt)
sort cpp-include-counts.txt | uniq -c > cpp-include-counts-processed.txt

I then used a modified join script to combine this with the header file sizes file. The top files are:

thakis$ python join-cpp.py | sort -k 4 -r -n | head -20
Source/WebCore/rendering/RenderObject.h: 0.883 559 493.597000
Source/WebCore/page/Frame.h: 0.746 653 487.138000
Source/WebCore/bindings/js/ScriptValue.h: 0.578 829 479.162000
Source/WebCore/rendering/style/RenderStyle.h: 0.646 597 385.662000
Source/WebCore/page/FrameView.h: 1.390 237 329.430000
Source/WebCore/bindings/js/ScriptSourceCode.h: 0.378 842 318.276000
Source/WebCore/dom/Element.h: 0.292 1084 316.528000
Source/WebCore/bindings/v8/V8Proxy.h: 0.362 841 304.442000
Source/WebCore/bindings/v8/ScriptValue.h: 0.366 829 303.414000
Source/WebCore/rendering/RenderBoxModelObject.h: 0.894 331 295.914000
Source/WebCore/loader/FrameLoader.h: 0.443 659 291.937000
Source/WebCore/dom/Document.h: 0.220 1307 287.540000
Source/WebCore/rendering/RenderBox.h: 0.924 301 278.124000
Source/WebCore/bindings/v8/ScriptController.h: 0.377 663 249.951000
Source/WebCore/rendering/RenderText.h: 0.891 260 231.660000
Source/WebCore/dom/StyledElement.h: 0.296 772 228.512000
Source/WebCore/rendering/InlineBox.h: 0.917 247 226.499000
Source/WebCore/rendering/RenderBR.h: 0.894 249 222.606000
Source/WebCore/bindings/js/ScriptController.h: 0.327 663 216.801000
Source/WebCore/rendering/RenderBlock.h: 1.016 206 209.296000

(.h filename, size of the .h with all its includes resolved in MiB, number of translation units including that .h, product of the previous 2 numbers.)

I also ran the include tracer on all cpp files, to find the biggest cpp files:

thakis$ sort cpp-sizes.txt -k 2 -r -n | head -20
Source/WebCore/rendering/RenderingAllInOne.cpp: 4.851 MiB
Source/WebCore/dom/DOMAllInOne.cpp: 3.898 MiB
Source/WebCore/bindings/js/JSBindingsAllInOne.cpp: 3.624 MiB
Source/WebCore/html/HTMLElementsAllInOne.cpp: 3.396 MiB
Source/WebCore/svg/SVGAllInOne.cpp: 3.287 MiB
Source/WebKit/chromium/src/WebViewImpl.cpp: 2.868 MiB
Source/WebKit/chromium/src/WebFrameImpl.cpp: 2.687 MiB
Source/WebCore/rendering/svg/RenderSVGAllInOne.cpp: 2.616 MiB
Source/WebCore/dom/Document.cpp: 2.549 MiB
Source/WebCore/accessibility/AccessibilityAllInOne.cpp: 2.357 MiB
Source/WebKit/qt/Api/qwebpage.cpp: 2.331 MiB
Source/WebKit/chromium/src/FrameLoaderClientImpl.cpp: 2.299 MiB
Source/WebKit/chromium/src/ChromeClientImpl.cpp: 2.264 MiB
Source/WebKit/chromium/src/WebMediaPlayerClientImpl.cpp: 2.260 MiB
Source/WebKit/qt/WebCoreSupport/DumpRenderTreeSupportQt.cpp: 2.235 MiB
Source/WebCore/page/EventHandler.cpp: 2.227 MiB
Source/WebKit/chromium/src/PlatformBridge.cpp: 2.191 MiB
Source/WebKit/chromium/src/WebPluginContainerImpl.cpp: 2.186 MiB
Source/WebKit/chromium/src/ContextMenuClientImpl.cpp: 2.179 MiB
Source/WebCore/page/Frame.cpp: 2.167 MiB

I will attach all the scripts I used and all data files I produced.
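The counting and joining described in this comment can be sketched in a few functions. This is a hedged illustration, not the attached join script: the function names are hypothetical, `parse_sizes` assumes the "path: size MiB" line format shown in the listings, and `count_translation_units` assumes the per-.cpp header lists are already available in memory.

```python
import collections


def parse_sizes(lines):
    """Parse lines like 'Source/WebCore/page/Frame.h: 0.746 MiB' into {path: MiB}."""
    sizes = {}
    for line in lines:
        path, _, rest = line.strip().rpartition(": ")
        if path:
            sizes[path] = float(rest.split()[0])
    return sizes


def count_translation_units(headers_per_cpp):
    """headers_per_cpp: {cpp path: iterable of headers it reaches}.

    Each header counts at most once per .cpp file, however often it is included.
    """
    counts = collections.Counter()
    for headers in headers_per_cpp.values():
        counts.update(set(headers))
    return counts


def join(sizes, tu_counts):
    """Return (path, MiB, translation units, product) rows, largest product first."""
    rows = [(path, size, tu_counts.get(path, 0), size * tu_counts.get(path, 0))
            for path, size in sizes.items()]
    return sorted(rows, key=lambda row: row[3], reverse=True)
```

Converting each header list to a set is what prevents the double counting mentioned at the top of the comment.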
Nico Weber
Comment 12
2011-04-25 20:10:17 PDT
Created attachment 91040
include-tracer with -c flag
Nico Weber
Comment 13
2011-04-25 20:10:51 PDT
Created attachment 91042
h include counts
Nico Weber
Comment 14
2011-04-25 20:11:14 PDT
Created attachment 91043
new join script
Nico Weber
Comment 15
2011-04-25 20:12:38 PDT
Created attachment 91044
header files with file size and number of translation units
Nico Weber
Comment 16
2011-04-25 20:13:08 PDT
Created attachment 91045
sizes of translation units
Adam Barth
Comment 17
2011-04-26 13:51:53 PDT
Comment on attachment 90992
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=90992&action=review

> Tools/Scripts/include-tracer:30
> +# Based on an almost identical script by: jyrki@google.com (Jyrki Alakuijala)

Which is available under BSD? It would be nice to have a link to the license here.

> Tools/Scripts/include-tracer:42
> +# FIXME: This should be a per-port list. Right now this is Apple-Mac only.
> +INCLUDE_PATHS = [

This is pretty lame.

> Tools/Scripts/include-tracer:150
> +    return total_bytes

Otherwise known as 0?

> Tools/Scripts/include-tracer:159
> +    # Skip system includes.
> +    if filename[0] == '<':
> +        return total_bytes

This will also skip wtf, right?

> Tools/Scripts/include-tracer:170
> +    lines = open(resolved_filename).readlines()

We should use "with" with open to make sure we don't leak.

> Tools/Scripts/include-tracer:183
> +    if line.startswith('#include "'):
> +        total_bytes += self._walk(seen, line.split('"')[1], resolved_filename, indent + 2)
> +    elif line.startswith('#include '):
> +        include = '<' + line.split('<')[1].split('>')[0] + '>'
> +        total_bytes += self._walk(seen, include, resolved_filename, indent + 2)

Would this be clearer with a regular expression?
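As an illustration of the regular-expression suggestion, the sketch below returns the same two shapes the quoted code passes to `_walk`: the bare name for quoted includes and the `<...>` form for angle-bracket includes. It is not taken from any later version of the patch; `INCLUDE_RE` and `parse_include` are hypothetical names, and unlike the `startswith` checks it also tolerates whitespace before and after the `#`.

```python
import re

# One alternative per include style; exactly one named group matches.
INCLUDE_RE = re.compile(
    r'^\s*#\s*include\s*(?:"(?P<quoted>[^"]+)"|(?P<angled><[^>]+>))')


def parse_include(line):
    """Return 'Foo.h' for quoted includes, '<Foo.h>' for angled ones, else None."""
    match = INCLUDE_RE.match(line)
    if not match:
        return None
    return match.group("quoted") or match.group("angled")


if __name__ == "__main__":
    for sample in ('#include "Frame.h"', '#include <wtf/Vector.h>', 'int x;'):
        print(repr(sample), "->", repr(parse_include(sample)))
```

A single pattern also avoids the `IndexError` that chained `split` calls raise on a malformed line such as `#include FOO`.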
David Levin
Comment 18
2011-04-26 13:56:34 PDT
Comment on attachment 90992
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=90992&action=review

Looks like Adam got it but here's comment anyway :).

> Tools/Scripts/include-tracer:192
> +    self._be_quiet = True

Seems like quiet would be the default (and verbose would be an option).
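One way to read this suggestion is a command line that is quiet unless asked otherwise. The sketch below uses `argparse` for brevity; the patch under review may well use a different option parser, and the `-v`/`--verbose` flag and `parse_args` helper are assumptions made for illustration, not flags of the actual script.

```python
import argparse


def parse_args(argv=None):
    """Quiet by default; -v/--verbose opts in to the per-include trace."""
    parser = argparse.ArgumentParser(description="Estimate effective header size.")
    parser.add_argument("files", nargs="+", help="source or header files to trace")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print every include as it is visited")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print("verbose" if args.verbose else "quiet", args.files)
```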
Ojan Vafai
Comment 19
2011-04-26 17:05:22 PDT
Comment on attachment 90992
Patch

R- per abarth's comments
Eric Seidel (no email)
Comment 20
2011-04-26 18:08:17 PDT
(In reply to comment #17)
> (From update of attachment 90992)
> View in context: https://bugs.webkit.org/attachment.cgi?id=90992&action=review
>
> > Tools/Scripts/include-tracer:30
> > +# Based on an almost identical script by: jyrki@google.com (Jyrki Alakuijala)
>
> Which is available under BSD? It would be nice to have a link to the license here.

Which was previously google internal and covered under google copyright, thus as a google submitter licensed however we'd like here. :)
Nico Weber
Comment 21
2011-04-26 18:34:10 PDT
(In reply to comment #20)
> (In reply to comment #17)
> > (From update of attachment 90992)
> > View in context: https://bugs.webkit.org/attachment.cgi?id=90992&action=review
> >
> > > Tools/Scripts/include-tracer:30
> > > +# Based on an almost identical script by: jyrki@google.com (Jyrki Alakuijala)
> >
> > Which is available under BSD? It would be nice to have a link to the license here.
>
> Which was previously google internal and covered under google copyright, thus as a google submitter licensed however we'd like here. :)
This script is based on a similar script from the chromium tree that has chromium's license (i.e. bsd style):
http://codesearch.google.com/codesearch/p?hl=en#OAMlx_jo-ck/src/tools/include_tracer.py&q=include_tracer.py&exact_package=chromium&sa=N&cd=1&ct=rc