Summary: [WTF] Try using 75% load factor for HashTable

Product: WebKit
Component: Web Template Framework
Reporter: Yusuke Suzuki <ysuzuki>
Assignee: Yusuke Suzuki <ysuzuki>
Status: RESOLVED FIXED
Severity: Normal
Priority: P2
Keywords: InRadar
Version: WebKit Nightly Build
Hardware: Unspecified
OS: Unspecified
CC: achristensen, benjamin, cdumez, cmarcelo, darin, dbates, ews-watchlist, fpizlo, mark.lam, tsavell, webkit-bug-importer
See Also: https://bugs.webkit.org/show_bug.cgi?id=207382
Description (Yusuke Suzuki, 2020-02-03 22:22:30 PST)

Created attachment 389630 [details]: Patch
Created attachment 389632 [details]: Patch
We need to keep a power-of-two table size since this is important for double hashing. Double hashing requires that the h2 hash value be relatively prime to the table size, and we achieve this by (1) choosing an odd number for h2 and (2) using a power-of-two table size.
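(Editorial aside: to make the relative-primality requirement concrete, here is a minimal self-contained sketch of a double-hashing probe loop under those two constraints. It is an illustration, not WTF::HashTable's actual code; `kEmpty`, `findSlot`, and the way h2 is derived from the high bits are all hypothetical.)

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of a double-hashing lookup over a power-of-two table.
// Because the step is forced odd and the capacity is a power of two,
// gcd(step, capacity) == 1, so the probe sequence visits every slot
// exactly once before repeating.
constexpr uint32_t kEmpty = 0; // hypothetical empty-slot marker

size_t findSlot(const std::vector<uint32_t>& table, uint32_t key, uint32_t hash)
{
    size_t mask = table.size() - 1;        // table.size() must be a power of two
    size_t index = hash & mask;            // h1: first probe position
    size_t step = ((hash >> 16) << 1) | 1; // h2: derived from high bits, forced odd
    while (table[index] != kEmpty && table[index] != key)
        index = (index + step) & mask;     // guaranteed to cycle through all slots
    return index;
}
```

For example, with capacity 8 any odd step (1, 3, 5, 7) walks all eight slots before repeating; that is exactly why the table size must stay a power of two when h2 is merely forced odd.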
Created attachment 389775 [details]: Patch
Created attachment 389777 [details]: Patch
Created attachment 389779 [details]: Patch
Created attachment 389781 [details]: Patch
Created attachment 389782 [details]: Patch
Created attachment 389839 [details]: Patch
(In reply to Yusuke Suzuki from comment #0)
> We are currently using 50% load-factor, which super aggressively allocates
> memory, and keeps them alive. I wonder if we can use 75% load-factor instead.
>
> 1. Abseil SwissTable is using 75% [1]. Since Rust's HashMap is derived from this, it is also using 75% [2].
> 2. Oracle JDK's HashMap's load-factor is 75% [3].
> 3. LLVM ADT DenseMap's load-factor is 75% [4].
> 4. Mozilla's HashTable's load-factor is 75% [5].
>
> [1]: https://github.com/abseil/abseil-cpp
> [2]: https://github.com/rust-lang/rust/blob/master/src/libstd/collections/hash/map.rs
> [3]: https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html
> [4]: https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/ADT/DenseMap.h#L539-L550
> [5]: https://github.com/mozilla/gecko-dev/blob/master/mfbt/HashTable.h#L1568-L1572

Note that v8's hashmap load-factor is 80%.

Created attachment 389898 [details]: Patch
Comment on attachment 389898 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=389898&action=review

r=me

> Source/WTF/wtf/HashTable.h:492
> +        return keyAndDeleteCount * maxLoadDenominator >= tableSize * maxLoadNumerator;

As we discussed offline, let's use 64-bit math here to ensure that there's no overflow issue.

> Source/WTF/wtf/HashTable.h:568
> +        constexpr unsigned maxLoadNumerator = 3;
> +        constexpr unsigned maxLoadDenominator = 4;

Can you put the HashTable's constexprs in a HashTableBase class, make them public, and just use those here?

Alternatively, instead of making those fields public, make this capacityForSize() function a private utility method in HashTableCapacityForSize, and make template<unsigned> struct HashTableCapacityForSize a friend of HashTableBase (since capacityForSize() is only used by HashTableCapacityForSize).

> Source/WTF/wtf/HashTable.h:569
> +        if (size == 0)

WebKit style: use !size instead of size == 0 (though I personally prefer size == 0).

> Source/WTF/wtf/HashTable.h:571
> +        unsigned candidate = roundUpToPowerOfTwo(size);

Let's call this capacity instead of candidate.

> Source/WTF/wtf/HashTable.h:572
> +        if (size * maxLoadDenominator >= candidate * maxLoadNumerator)

Alternatively, can you put the constexpr shouldExpand in HashTableBase and just use HashTableBase::shouldExpand(size, capacity) here?

> Source/WTF/wtf/HashTable.h:1231
> +        // If we are getting halfway between 11/24 and 3/4 (we pick 15/24 == 5/8), we double the size to avoid being too close to

I suggest rephrasing "(we pick 15/24 == 5/8)" as "(i.e. past 14.5/24, which we'll round up to 15/24 i.e. 5/8)".

> Source/WTF/wtf/HashTable.h:1232
> +        // loadMax and bring the ratio close to 11/24. This give us a load in the bounds [9/24, 15/24).

I haven't figured out how you arrived at the 9/24 number, but the reasoning for choosing 5/8 is sound to me.

> Source/WTF/wtf/HashTable.h:1233
> +        bool aboveThreeQuarterLoad = keyCount * 8 >= bestTableSize * 5;

The name aboveThreeQuarterLoad sounds wrong given the math. Can you use a better name?

> Tools/TestWebKitAPI/Tests/WTF/HashSet.cpp:58
>     // Adding items up to less than half the capacity should not change the capacity.

This comment is stale. Please update.
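(Editorial aside: to make the quantities under review concrete, here is a compiling sketch of a 75% shouldExpand done in 64-bit math plus the capacityForSize being discussed. The names mirror the quoted hunks, but this is an illustration, not the landed WebKit code; roundUpToPowerOfTwo here is a local stand-in for WTF's helper.)

```cpp
#include <cstdint>

// 75% max load expressed as a fraction, compared in 64-bit math so that
// size * maxLoadDenominator cannot overflow 32-bit unsigned arithmetic.
static constexpr unsigned maxLoadNumerator = 3;
static constexpr unsigned maxLoadDenominator = 4;

static constexpr bool shouldExpand(uint64_t size, uint64_t capacity)
{
    return size * maxLoadDenominator >= capacity * maxLoadNumerator;
}

// Local stand-in for WTF's roundUpToPowerOfTwo.
static constexpr unsigned roundUpToPowerOfTwo(unsigned v)
{
    v--;
    v |= v >> 1; v |= v >> 2; v |= v >> 4; v |= v >> 8; v |= v >> 16;
    return v + 1;
}

static constexpr unsigned capacityForSize(unsigned size)
{
    if (!size)
        return 0;
    unsigned capacity = roundUpToPowerOfTwo(size);
    if (shouldExpand(size, capacity)) // already at or past 75% load: double once more
        return capacity * 2;
    return capacity;
}

static_assert(capacityForSize(6) == 16); // 6/8 hits the 75% threshold, so double to 16
static_assert(capacityForSize(5) == 8);  // 5/8 < 75%, so 8 is enough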
Comment on attachment 389898 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=389898&action=review

>> Source/WTF/wtf/HashTable.h:492
>> +        return keyAndDeleteCount * maxLoadDenominator >= tableSize * maxLoadNumerator;
>
> As we discussed offline, let's use 64-bit math here to ensure that there's no overflow issue.

Fixed.

>> Source/WTF/wtf/HashTable.h:568
>> +        constexpr unsigned maxLoadDenominator = 4;
>
> Can you put the HashTable's constexprs in a HashTableBase class, make them public, and just use those here?
>
> Alternatively, instead of making those fields public, make this capacityForSize() function a private utility method in HashTableCapacityForSize, and make template<unsigned> struct HashTableCapacityForSize a friend of HashTableBase (since capacityForSize() is only used by HashTableCapacityForSize).

I don't think adding HashTableBase and moving them there is clean just to share these two variables: it makes HashTable complicated while the benefit is small. And we cannot use them from HashTableCapacityForSize since we do not have a way to instantiate HashTable's template. For now, I would like to simply put `capacityForSize` in HashTableCapacityForSize and duplicate maxLoadNumerator and maxLoadDenominator there. We should refresh HashTableCapacityForSize's implementation (e.g., by merging it into HashTable), or simply remove HashTableCapacityForSize in the future.

>> Source/WTF/wtf/HashTable.h:569
>> +        if (size == 0)
>
> WebKit style: use !size instead of size == 0 (though I personally prefer size == 0).

Fixed.

>> Source/WTF/wtf/HashTable.h:571
>> +        unsigned candidate = roundUpToPowerOfTwo(size);
>
> Let's call this capacity instead of candidate.

Fixed.

>> Source/WTF/wtf/HashTable.h:572
>> +        if (size * maxLoadDenominator >= candidate * maxLoadNumerator)
>
> Alternatively, can you put the constexpr shouldExpand in HashTableBase and just use HashTableBase::shouldExpand(size, capacity) here?

For now, I just put capacityForSize into HashTableCapacityForSize, with a duplicate implementation of shouldExpand there, instead of introducing HashTableBase, since that would be too complicated while the benefit is small. We should remove HashTableCapacityForSize in the future.

>> Source/WTF/wtf/HashTable.h:1231
>> +        // If we are getting halfway between 11/24 and 3/4 (we pick 15/24 == 5/8), we double the size to avoid being too close to
>
> I suggest rephrasing "(we pick 15/24 == 5/8)" as "(i.e. past 14.5/24, which we'll round up to 15/24 i.e. 5/8)".

Changed.

>> Source/WTF/wtf/HashTable.h:1232
>> +        // loadMax and bring the ratio close to 11/24. This give us a load in the bounds [9/24, 15/24).
>
> I haven't figured out how you arrived at the 9/24 number, but the reasoning for choosing 5/8 is sound to me.

9/24 is the ratio just after expansion. We expand when we reach 3/4, and after expansion we get 3/8 == 9/24. This is the same as the original 3/12 calculation (previously we expanded when we reached 1/2, which became 1/4 == 3/12 after expansion).

>> Source/WTF/wtf/HashTable.h:1233
>> +        bool aboveThreeQuarterLoad = keyCount * 8 >= bestTableSize * 5;
>
> The name aboveThreeQuarterLoad sounds wrong given the math. Can you use a better name?

Fixed.

>> Tools/TestWebKitAPI/Tests/WTF/HashSet.cpp:58
>>     // Adding items up to less than half the capacity should not change the capacity.
>
> This comment is stale. Please update.

Fixed.

Committed r255889: <https://trac.webkit.org/changeset/255889>

We should fix that aggregate-sorted-data-no-storage-access test to not depend on hash table ordering. The simplest way to do that would be to change ResourceLoadStatisticsDatabaseStore::dumpResourceLoadStatistics to make a vector of the domains and sort the vector before dumping them. Or we could do more research to find out exactly how hash table order ends up influencing what's inserted into the database, which might be a more complete fix, because it isn't great that hash table order influences what gets written into the database either.

(In reply to Darin Adler from comment #16)
> We should fix that aggregate-sorted-data-no-storage-access test to not
> depend on hash table ordering. The simplest way to do that would be to
> change ResourceLoadStatisticsDatabaseStore::dumpResourceLoadStatistics to
> make a vector of the domains and sort the vector before dumping them. Or we
> could do more research to find out exactly how hash table order ends up
> influencing what's inserted into the database, which might be a more
> complete fix because it isn't great that hash table order influences what
> gets written into the database either.

I've just filed it in https://bugs.webkit.org/show_bug.cgi?id=207348 and attached the patch :)

Committed r256011: <https://trac.webkit.org/changeset/256011>

Rolled out.
It offered a large improvement in Membuster, but it also brings a 6-10% JetStream2/string-unpack-code-SP regression. I'm now looking into this. Several notes:

1. Accelerating String comparison with hash code comparison does not help.
2. The regression does not happen when only running this test.

My (super) rough guess is that the probe length increases slightly. Comparison itself is cheap, so it does not matter. But multiple probes mean that we need to access several slots of the hash table, and if the table gets large, the cache miss rate increases. It's possible that the probe miss cost increases with table size. If this is true, we should pick a different load factor for larger table sizes.

The AtomStringTable size in JetStream2/string-unpack-code-SP is 30109 - 31703 when running it as part of the whole JetStream2 suite. On the other hand, it is 1731 - 3696 when we just run this test alone.

The revert in https://trac.webkit.org/changeset/256011/webkit broke http/tests/resourceLoadStatistics/aggregate-sorted-data-no-storage-access.html, tracked in https://bugs.webkit.org/show_bug.cgi?id=207382.

Yusuke, try this in combination with https://bugs.webkit.org/show_bug.cgi?id=207251

Created attachment 390161 [details]: Data
Attached the microbenchmark data. I think we should use a 75% load factor. I think using this load factor for tables whose capacity is <= 1024 is a good tradeoff.
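(Editorial aside: as background for why higher load factors hurt larger tables more, the textbook uniform-hashing approximation, an assumption rather than a measurement of WTF::HashTable, puts the expected probes per unsuccessful lookup at about 1/(1-α) for load factor α. Going from 50% to 75% roughly doubles that from 2 to 4; the extra probes are nearly free while the table fits in cache, but each one is a potential cache miss once the table outgrows it, matching the regression hypothesis above. A tiny sketch of the arithmetic:)

```cpp
#include <cstdio>
#include <initializer_list>

// Expected probes per unsuccessful lookup under uniform hashing: 1 / (1 - alpha).
// A textbook approximation, not a measurement of WTF::HashTable.
int main()
{
    for (double alpha : { 0.5, 0.75, 0.8 })
        std::printf("load %.2f -> ~%.1f probes per miss\n", alpha, 1.0 / (1.0 - alpha));
    // load 0.50 -> ~2.0 probes per miss
    // load 0.75 -> ~4.0 probes per miss
    // load 0.80 -> ~5.0 probes per miss
}
```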
Created attachment 390164 [details]: Patch
And for larger tables, I would like to make the load factor 75% too. But to make that possible, we need to decrease the collision rate. I'm currently trying to implement a SwissTable / F14 table for larger sizes; then we get a lower collision rate and could safely increase the load factor for large tables too. For small tables, I'm sticking with the current design for now, since SwissTable / F14 tables require additional control data.

(In reply to Alex Christensen from comment #21)
> Yusuke, try this in combination with
> https://bugs.webkit.org/show_bug.cgi?id=207251

That looks nice!

Comment on attachment 390164 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390164&action=review

> Source/WTF/wtf/HashTable.h:316
> +        static constexpr unsigned maxSmallTableSize = 1024;

Let's call this maxSmallTableCapacity.

> Source/WTF/wtf/HashTable.h:319
> +        static constexpr bool shouldExpand(uint64_t keyAndDeleteCount, uint64_t tableSize)

Let's rename these: keyAndDeleteCount => size, tableSize => capacity.

> Source/WTF/wtf/HashTable.h:338
> +        COMPILE_ASSERT(size > 0, HashTableNonZeroMinimumCapacity);
> +        COMPILE_ASSERT(!static_cast<unsigned>(value >> 31), HashTableNoCapacityOverflow);

We can switch to using static_assert now. COMPILE_ASSERT is just from the old days before we had static_assert.

> Source/WTF/wtf/HashTable.h:527
> +        bool shouldExpand() const { return HashTableSizePolicy::shouldExpand(keyCount() + deletedCount(), tableSize()); }

It's a pity that tableSize() actually means tableCapacity(). Maybe we should rename these in a refactoring patch later.

> Source/WTF/wtf/HashTable.h:1256
> +        double average = loadFactor + (1.0 / minLoad);

Let's call this averageLoad instead of just average. Let's also introduce another value: maxLoad = loadFactor. This makes it clear that we're using it as the upper bound, and not as some factor.

This equation does not match the comment above. average here should be (maxLoad + (1.0 / minLoad)) / 2. Otherwise, you won't get 11/24 for the 3/4 and 1/6.

> Source/WTF/wtf/HashTable.h:1257
> +        double halfway = (average + loadFactor) / 2;

Based on the comment, the "halfway" here should be halfway between the averageLoad and the maxLoad, i.e. it's really the 3/4 load. So let's rename "halfway" to threeQuarterLoad, and the equation becomes: threeQuarterLoad = (averageLoad + maxLoad) / 2;

> Source/WTF/wtf/HashTable.h:1258
> +        return keyCount >= tableSize * halfway;

And this becomes: keyCount >= tableSize * threeQuarterLoad;

Note: this means that the original algorithm (based on the comment) calls for eager expansion only when the size reaches 3/4 of the way to the loadFactor. Your algorithm above changed it to eagerly expand if the size reaches half of loadFactor. This might not be what you intended. Sounds like you might want to retest with the benchmarks ... unless you want to go with changing to eager expansion at half of loadFactor. If so, you should fix the comment.

> Source/WTF/wtf/HashTraits.h:56
> +        static constexpr unsigned maxLoadNumerator = 1;
> +        static constexpr unsigned maxLoadDenominator = 2;

This appears to be unused. Why were they added? Are they stale now that you have to handle small and large loads?

> Tools/TestWebKitAPI/Tests/WTF/HashSet.cpp:63
> +    // Adding items up to less than 3/4 of the capacity should not change the capacity.
> +    unsigned capacityLimit = initialCapacity * 3 / 4 - 1;

Is this still correct given that we have to account for maxSmallTableSize now?

> Tools/TestWebKitAPI/Tests/WTF/HashSet.cpp:531
> +struct HighLoadStringHash : public WTF::HashTraits<String> {
> +    static constexpr unsigned maxLoadNumerator = 3;
> +    static constexpr unsigned maxLoadDenominator = 4;
> +};
> +
> +struct HighLoadStringHash2 : public WTF::HashTraits<String> {
> +    static constexpr unsigned maxLoadNumerator = 3;
> +    static constexpr unsigned maxLoadDenominator = 5;
> +};
> +
> +struct HighLoadUnsignedHash : public WTF::HashTraits<unsigned> {
> +    static constexpr unsigned maxLoadNumerator = 3;
> +    static constexpr unsigned maxLoadDenominator = 4;
> +};
> +
> +struct HighLoadUnsignedHash2 : public WTF::HashTraits<unsigned> {
> +    static constexpr unsigned maxLoadNumerator = 3;
> +    static constexpr unsigned maxLoadDenominator = 5;
> +};

How are these used? Who uses them, or what are they testing? Also, are these stale now that you have handling of small and large loads?
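(Editorial aside: pulling the threshold math from this review into one place, here is a sketch of the corrected eager-expansion computation with the 3/4 and 1/6 numbers plugged in. Variable names follow the review suggestions; minLoad == 6, i.e. a 1/6 minimum load, is an assumption inferred from the 11/24 arithmetic, and this is not the landed code.)

```cpp
#include <cassert>

// Sketch of the eager-expansion threshold discussed above.
bool aboveThresholdForEagerExpansion(double maxLoadRatio, unsigned minLoad,
    unsigned keyCount, unsigned tableCapacity)
{
    double minLoadRatio = 1.0 / minLoad;                         // 1/6
    double averageLoadRatio = (maxLoadRatio + minLoadRatio) / 2; // (3/4 + 1/6) / 2 == 11/24
    double halfWayBetweenAverageAndMaxLoadRatio
        = (averageLoadRatio + maxLoadRatio) / 2;                 // 29/48, i.e. ~15/24
    return keyCount >= tableCapacity * halfWayBetweenAverageAndMaxLoadRatio;
}

int main()
{
    // With a 3/4 max load and capacity 48, the threshold is 48 * 29/48 == 29 keys.
    assert(!aboveThresholdForEagerExpansion(3.0 / 4.0, 6, 28, 48));
    assert(aboveThresholdForEagerExpansion(3.0 / 4.0, 6, 29, 48));
}
```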
Created attachment 390168 [details]: Patch
Comment on attachment 390168 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390168&action=review

r=me with suggestions.

> Source/WTF/ChangeLog:13
> +        small tables which capacity is <= 1024 based on collected data by micro-benchmarking.

Please add a comment about why you chose to do this.

> Source/WTF/ChangeLog:22
> +        * wtf/HashTraits.h:

Please remove.

> Source/WTF/wtf/HashTable.h:1260
> +        double maxLoadRate = loadFactor;
> +        double minLoadRate = 1.0 / minLoad;
> +        double averageLoadRate = (maxLoadRate + minLoadRate) / 2;
> +        double halfWayBetweenAverageAndMaxLoadRate = (averageLoadRate + maxLoadRate) / 2;
> +        return keyCount >= tableSize * halfWayBetweenAverageAndMaxLoadRate;

nit: I suggest using "Ratio" instead of "Rate". "Rate" makes me think of frequency, which doesn't apply here. Instead, what you're computing here is the ratio of a load threshold relative to the tableSize. So I think "Ratio" accurately describes the quantities you're computing here.

> Source/WTF/wtf/HashTraits.h:56
> +
> +        static constexpr unsigned maxLoadNumerator = 1;
> +        static constexpr unsigned maxLoadDenominator = 2;

As you mentioned offline, this will be removed.

Committed r256093: <https://trac.webkit.org/changeset/256093>

Comment on attachment 390168 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390168&action=review

> Source/WTF/wtf/HashTable.h:1271
> +        if (bestTableSize <= maxSmallTableCapacity) {
> +            constexpr double smallLoadFactor = static_cast<double>(smallMaxLoadNumerator) / smallMaxLoadDenominator;
> +            if (aboveThresholdForEagerExpansion(smallLoadFactor, keyCount, bestTableSize))
> +                bestTableSize *= 2;
> +        } else {
> +            constexpr double largeLoadFactor = static_cast<double>(largeMaxLoadNumerator) / largeMaxLoadDenominator;
> +            if (aboveThresholdForEagerExpansion(largeLoadFactor, keyCount, bestTableSize))
> +                bestTableSize *= 2;
> +        }

I'd find it more elegant to have less code inside the if statements here; the two sides differ only in the "load factor". Why not just compute the load factor in an if statement, and have the rest of the code be outside the if?

> Tools/TestWebKitAPI/Tests/WTF/HashSet.cpp:38
> +#include <wtf/DataLog.h>
>  #include <wtf/HashSet.h>
> +#include <wtf/MonotonicTime.h>
>  #include <wtf/RefPtr.h>
> +#include <wtf/text/StringHash.h>
> +#include <wtf/text/WTFString.h>

Why did all these new includes need to be added? I don't see any changes that would seem to motivate that.

Comment on attachment 390168 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390168&action=review

>> Source/WTF/wtf/HashTable.h:1271
>> +        }
>
> I'd find it more elegant to have less code inside the if statements here; the two sides differ only in the "load factor". Why not just compute the load factor in an if statement, and have the rest of the code be outside the if?

Currently, I'm planning to remove these two different load factors by introducing a swiss-table for larger sizes: https://bugs.webkit.org/show_bug.cgi?id=207426. So, after that, we can just remove the `bestTableSize <= maxSmallTableCapacity` check here.

>> Tools/TestWebKitAPI/Tests/WTF/HashSet.cpp:38
>> +#include <wtf/text/WTFString.h>
>
> Why did all these new includes need to be added? I don't see any changes that would seem to motivate that.

This was removed in the landed patch :)

Committed r256194: <https://trac.webkit.org/changeset/256194>

Committed r256203: <https://trac.webkit.org/changeset/256203>
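(Editorial aside: for what it's worth, here is a minimal sketch of the refactoring Darin suggests above, branching only on the load factor and sharing the expansion check. The constant names mirror the quoted hunk; the 1/2 large-table fraction is inferred from the thread rather than quoted, and aboveThresholdForEagerExpansion is a simplified stand-in for the real helper.)

```cpp
#include <cstdio>

static constexpr unsigned maxSmallTableCapacity = 1024;
static constexpr unsigned smallMaxLoadNumerator = 3, smallMaxLoadDenominator = 4;
static constexpr unsigned largeMaxLoadNumerator = 1, largeMaxLoadDenominator = 2; // inferred

// Simplified stand-in for the threshold helper in the patch.
static bool aboveThresholdForEagerExpansion(double loadFactor, unsigned keyCount, unsigned tableCapacity)
{
    return keyCount >= tableCapacity * loadFactor;
}

static unsigned expandIfNeeded(unsigned keyCount, unsigned bestTableSize)
{
    constexpr double smallLoadFactor = static_cast<double>(smallMaxLoadNumerator) / smallMaxLoadDenominator;
    constexpr double largeLoadFactor = static_cast<double>(largeMaxLoadNumerator) / largeMaxLoadDenominator;

    // The branch now selects only the load factor; the doubling logic is shared.
    double loadFactor = bestTableSize <= maxSmallTableCapacity ? smallLoadFactor : largeLoadFactor;
    if (aboveThresholdForEagerExpansion(loadFactor, keyCount, bestTableSize))
        bestTableSize *= 2;
    return bestTableSize;
}

int main()
{
    std::printf("%u\n", expandIfNeeded(768, 1024)); // 768 >= 1024 * 3/4, so prints 2048
}
```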