Bug 86279 - 99% of HashTables have fewer than 15 keys; reduce default table size?
Summary: 99% of HashTables have fewer than 15 keys; reduce default table size?
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-05-11 19:06 PDT by Simon Fraser (smfr)
Modified: 2012-12-30 17:07 PST (History)
14 users (show)

See Also:


Attachments
Patch to add logging (6.34 KB, patch)
2012-05-11 19:08 PDT, Simon Fraser (smfr)
buildbot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ec2-cr-linux-02 (539.54 KB, application/zip)
2012-05-13 18:48 PDT, WebKit Review Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Fraser (smfr) 2012-05-11 19:06:08 PDT
I added some instrumentation to see how full our HashTables were (I'll attach a patch later). I tracked the high water mark key count, and table size for each HashTable object.

I browsed a few sites (cnn.com, news.bbc.co.uk, nytimes.com, reddit.com), and printed out the data. These numbers show that 99% of HashTables we create have fewer than 15 keys at high water mark. 78% have zero keys (which we optimize for).85% have 0 or 1 keys.

Data:

Size    Freq    %       Cumul.  % of total
0       716577  78.25%  716577  78.25%
1       60120   6.57%   776697  84.82%
2       43757   4.78%   820454  89.60%
3       25432   2.78%   845886  92.38%
4       16472   1.80%   862358  94.17%
5       13641   1.49%   875999  95.66%
6       7534    0.82%   883533  96.49%
7       5570    0.61%   889103  97.10%
8       4002    0.44%   893105  97.53%
9       4262    0.47%   897367  98.00%
10      2585    0.28%   899952  98.28%
11      2309    0.25%   902261  98.53%
12      1941    0.21%   904202  98.74%
13      1450    0.16%   905652  98.90%
14      832     0.09%   906484  98.99%
15      849     0.09%   907333  99.09%
16      704     0.08%   908037  99.16%
17      752     0.08%   908789  99.25%
18      1120    0.12%   909909  99.37%
19      555     0.06%   910464  99.43%
20      358     0.04%   910822  99.47%
21      414     0.05%   911236  99.51%
22      378     0.04%   911614  99.55%
23      317     0.03%   911931  99.59%
24      244     0.03%   912175  99.62%
25      208     0.02%   912383  99.64%
26      145     0.02%   912528  99.65%
27      118     0.01%   912646  99.67%
28      174     0.02%   912820  99.69%
29      193     0.02%   913013  99.71%
30      196     0.02%   913209  99.73%
31      132     0.01%   913341  99.74%
32      138     0.02%   913479  99.76%
33      74      0.01%   913553  99.77%
34      151     0.02%   913704  99.78%
35      71      0.01%   913775  99.79%
36      62      0.01%   913837  99.80%
37      35      0.00%   913872  99.80%
38      37      0.00%   913909  99.80%
39      44      0.00%   913953  99.81%
40      41      0.00%   913994  99.81%
41      17      0.00%   914011  99.82%
42      53      0.01%   914064  99.82%
43      36      0.00%   914100  99.83%
44      40      0.00%   914140  99.83%
45      80      0.01%   914220  99.84%
46      35      0.00%   914255  99.84%
47      42      0.00%   914297  99.85%
48      27      0.00%   914324  99.85%
49      1376    0.15%   915700  100.00%
Comment 1 Simon Fraser (smfr) 2012-05-11 19:08:53 PDT
Created attachment 141542 [details]
Patch to add logging
Comment 2 Geoffrey Garen 2012-05-12 14:39:26 PDT
Currently, default max load is 1/2 of table size, and default minimum table size is 64, so default growth proceeds as follows:
0 keys
32 keys
64 keys
etc.

Based on this data, it seems reasonable to change the default minimum table size to 2 (!). That would give:
0 keys
2 keys
4 keys
etc.

To accommodate any potential regression in the few clients that create mega tables, perhaps we should also add HashTable::reserveCapacity. I believe reserveCapacity(capacity) could literally just forward to rehash(capacity), and everything would "just work".
Comment 3 Geoffrey Garen 2012-05-12 14:41:47 PDT
> change the default minimum table size to 2

Correction: 4.
Comment 4 WebKit Review Bot 2012-05-13 15:54:12 PDT
Attachment 141542 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WTF/wtf/HashTable.cpp', u'Source/WT..." exit_code: 1
Source/WTF/wtf/HashTable.h:1078:  tmp_maxTableSize is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
Source/WTF/wtf/HashTable.h:1082:  tmp_maxKeyCount is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
Source/WTF/wtf/HashTable.cpp:38:  Should have a space between // and comment  [whitespace/comments] [4]
Total errors found: 3 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Antti Koivisto 2012-05-13 16:29:06 PDT
The counts seem very high for four site visit. Wonder where all these tables are created. Are many of these temporaries?
Comment 6 Build Bot 2012-05-13 17:14:30 PDT
Comment on attachment 141542 [details]
Patch to add logging

Attachment 141542 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/12681290
Comment 7 WebKit Review Bot 2012-05-13 18:48:54 PDT
Comment on attachment 141542 [details]
Patch to add logging

Attachment 141542 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/12681302

New failing tests:
http/tests/appcache/different-origin-manifest.html
accessibility/aria-checkbox-checked.html
accessibility/aria-labelledby-on-input.html
animations/3d/matrix-transform-type-animation.html
http/tests/appcache/cyrillic-uri.html
accessibility/aria-presentational-role.html
animations/animation-add-events-in-handler.html
http/tests/appcache/deferred-events-delete-while-raising.html
animations/additive-transform-animations.html
animations/3d/replace-filling-transform.html
accessibility/anchor-linked-anonymous-block-crash.html
http/tests/appcache/deferred-events-delete-while-raising-timer.html
accessibility/aria-disabled.html
http/tests/appcache/crash-when-navigating-away-then-back.html
animations/animation-border-overflow.html
http/tests/appcache/credential-url.html
http/tests/appcache/access-via-redirect.php
animations/3d/change-transform-in-end-event.html
accessibility/aria-hidden.html
http/tests/appcache/destroyed-frame.html
animations/animation-controller-drt-api.html
accessibility/aria-label.html
accessibility/aria-hidden-updates-alldescendants.html
accessibility/aria-labelledby-overrides-aria-label.html
http/tests/appcache/deferred-events.html
accessibility/aria-labelledby-stay-within.html
animations/animation-css-rule-types.html
accessibility/adjacent-continuations-cause-assertion-failure.html
http/tests/appcache/detached-iframe.html
accessibility/aria-help.html
Comment 8 WebKit Review Bot 2012-05-13 18:48:59 PDT
Created attachment 141625 [details]
Archive of layout-test-results from ec2-cr-linux-02

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-02  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment 9 Simon Fraser (smfr) 2012-05-13 21:41:23 PDT
(In reply to comment #5)
> The counts seem very high for four site visit. Wonder where all these tables are created. Are many of these temporaries?

This includes all HashTables, so, yes, it will contain those on the stack.