WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
NEW
186747
Reduce HashTable capacity wastage
https://bugs.webkit.org/show_bug.cgi?id=186747
Summary
Reduce HashTable capacity wastage
Simon Fraser (smfr)
Reported
2018-06-16 17:33:22 PDT
Using the patch in
bug 186698
we know that there is significant wastage in HashTable capacity. Wastage on some pages: nytimes.com: 7MB cnn.com: 10MB theverge.com: 4MB HashMap<> doesn't have shrink() like Vector does. It looks like we could expose a rehash() which will halve the table size, but only if less than 50% of table capacity is used (?). Maybe the policy should be that we don't store large objects in HashMaps, and store pointers instead?
Attachments
Add attachment
proposed patch, testcase, etc.
Darin Adler
Comment 1
2018-06-16 23:30:36 PDT
I’m not 100% sure exactly what part of the memory we consider the "wasted" part, but our policy for hash tables is based on these constants: static const unsigned m_maxLoad = 2; static const unsigned m_minLoad = 6; The units here are fraction of table full. So we expand a hash table as soon as 50% (1/2) of the buckets are full and we shrink a hash table if only 17% (1/6) of the buckets are full. When we shrink, we always shrink to half the current size and when we expand, we typically expand to double the current size. I agree that we can make improvements by changing these policies.
Darin Adler
Comment 2
2018-06-17 00:04:30 PDT
This policy means that growing hash tables vary between 25% and 50% full, getting expanded each time they get up to 50%. I can imagine an explicit shrink-to-fit operation that would take us to 50% full once we knew we were doing growing and I don’t see much downside in implementing it.
Darin Adler
Comment 3
2018-06-17 10:57:03 PDT
(In reply to Simon Fraser (smfr) from
comment #0
)
> Maybe the policy should be that we don't store large objects in HashMaps, > and store pointers instead?
The tradeoff is likely the extra overhead for the separate memory blocks for each object vs. the extra overhead for unused hash table buckets. It would be good to have a principled way to do the math on that to figure out which costs more.
Radar WebKit Bug Importer
Comment 4
2018-06-18 11:31:18 PDT
<
rdar://problem/41217978
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug