Bug 207426

Summary: [WTF] Use SwissTable algorithm for larger HashTable
Product: WebKit Reporter: Yusuke Suzuki <ysuzuki>
Component: Web Template FrameworkAssignee: Yusuke Suzuki <ysuzuki>
Status: NEW ---    
Severity: Normal CC: benjamin, cdumez, cmarcelo, darin, dbates, ews-watchlist, ggaren, keith_miller, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch keith_miller: review+

Description Yusuke Suzuki 2020-02-07 22:37:14 PST
And let's use 75% load-factor even in larger table.
Comment 1 Yusuke Suzuki 2020-02-08 22:26:24 PST
Created attachment 390197 [details]
Patch
Comment 2 Yusuke Suzuki 2020-02-08 22:40:25 PST
Created attachment 390198 [details]
Patch
Comment 3 Yusuke Suzuki 2020-02-08 23:04:31 PST
Created attachment 390199 [details]
Patch
Comment 4 Yusuke Suzuki 2020-02-09 01:26:45 PST
Created attachment 390200 [details]
Patch
Comment 5 Yusuke Suzuki 2020-02-09 15:04:33 PST
Created attachment 390214 [details]
Patch
Comment 6 Yusuke Suzuki 2020-02-09 17:47:28 PST
Created attachment 390220 [details]
Patch
Comment 7 Yusuke Suzuki 2020-02-09 19:19:31 PST
Created attachment 390222 [details]
Patch
Comment 8 Yusuke Suzuki 2020-02-09 19:23:25 PST
Created attachment 390223 [details]
Patch
Comment 9 Yusuke Suzuki 2020-02-09 22:11:27 PST
Created attachment 390233 [details]
Patch
Comment 10 Yusuke Suzuki 2020-02-10 19:54:39 PST
Created attachment 390339 [details]
Patch
Comment 11 Yusuke Suzuki 2020-02-10 21:57:16 PST
Some of membuster process's HashTable dump by using MALLOC_BREAKDOWN.
This is interesting.

Zone HashTable_0x110574000: 3867 nodes (2306608 bytes) 

    COUNT     BYTES       AVG   CLASS_NAME                                       TYPE    BINARY
    =====     =====       ===   ==========                                       ====    ======
        1        16      16.0   non-object                                                                 
      720     57600      80.0   non-object                                                                 
     1739    250416     144.0   non-object                                                                 
      714    148512     208.0   non-object                                                                 
      276     75072     272.0   non-object                                                                 
        4      1344     336.0   non-object                                                                 
       61     24400     400.0   non-object                                                                 
      133     70224     528.0   non-object                                                                 
        1       592     592.0   non-object                                                                 
       11      7216     656.0   non-object                                                                 
        1       720     720.0   non-object                                                                 
       22     17248     784.0   non-object                                                                 
       66    101376    1536.0   non-object                                                                 
        8     16384    2048.0   non-object                                                                 
       28     71680    2560.0   non-object                                                                 
        3      9216    3072.0   non-object                                                                 
       31    142848    4608.0   non-object                                                                 
        1      5120    5120.0   non-object                                                                 
        3     16896    5632.0   non-object                                                                 
        6     39936    6656.0   non-object                                                                 
        1      7680    7680.0   non-object                                                                 
        1      8192    8192.0   non-object                                                                 
        8     69632    8704.0   non-object                                                                 
        5     53760   10752.0   non-object                                                                 
        4     51200   12800.0   non-object                                                                 
        1     14848   14848.0   non-object                                                                 
        4     67584   16896.0   non-object                                                                 
        2     41984   20992.0   non-object                                                                 
        2     50176   25088.0   non-object                                                                 
        7    458752   65536.0   non-object                                                                 
        2    196608   98304.0   non-object                                                                 
        1    229376  229376.0   non-object

While # of HashTable has skew (144.0 is the most), accumulated memory of long-tail has fair-amount of memory.
For example, total memory of 144.0 is smaller than 65536.0 sized table.
We should optimize large-sized table too.
Comment 12 Yusuke Suzuki 2020-02-11 14:44:07 PST
Created attachment 390429 [details]
Patch
Comment 13 Yusuke Suzuki 2020-02-11 18:56:54 PST
Created attachment 390478 [details]
Patch
Comment 14 Yusuke Suzuki 2020-02-13 15:00:37 PST
Test failure comes from the fact that this test is relying on hashtable.
In some place, we put properties of JSGlobalObject into hashtable, iterating, and picking the first one to dump. Previously, it was,

"undefined": undefined

But now, it is,

"NaN": NaN

that's why we are seeing the failure.
Comment 15 Yusuke Suzuki 2020-02-14 00:13:11 PST
Created attachment 390734 [details]
Patch
Comment 16 Radar WebKit Bug Importer 2020-02-14 00:55:54 PST
<rdar://problem/59453007>
Comment 17 Keith Miller 2020-02-18 11:39:33 PST
Comment on attachment 390734 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390734&action=review

r=me if you fix the style errors and rename h1/h2. Can you make sure we have tests for SwissHashing in testWTF?

> Source/WTF/ChangeLog:13
> +        to select Group we will search. The key is that we maintains Control bytes as a side data,

Nit: maintains => maintain and "a side data" => "side data"

> Source/WTF/ChangeLog:16
> +        For each bucket, we store either of (1) 7 bit hash code, (2) empty marker, or (3) deleted marker.

Nit: either => one

> Source/WTF/ChangeLog:18
> +        operation to match 16 bucket hash codes in parallel, to filter out search candidate buckets in

Nit: operation => operations. no comma after parallel.

> Source/WTF/ChangeLog:19
> +        one Group. And we can see last empty bucket if any bucket of Group is empty. So, almost all cases,

Nit: "So, almost" => "So, in almost"

> Source/WTF/wtf/HashTable.h:622
> +        ALWAYS_INLINE unsigned h1(unsigned hash)

h1 doesn't really convey the meaning of what we are computing here. Can we call this groupHash?

> Source/WTF/wtf/HashTable.h:628
> +        inline Control h2(unsigned hash)

h2 doesn't really convey the meaning of what we are computing here. Can we call this controlHash?

> Source/WTF/wtf/HashTable.h:1033
> +    inline auto HashTable<Key, Value, Extractor, HashFunctions, Traits, KeyTraits>::lookupForWriting(const T& key, unsigned h) -> LookupType

Can we change h to hash? It's poor style in the old code and now that it's a parameter is even worse...
Comment 18 Geoffrey Garen 2020-02-18 20:58:29 PST
Comment on attachment 390734 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390734&action=review

> Source/WTF/ChangeLog:29
> +        While JetStream2 is neutral, we get sub-1% improvement in Membuster.

How big was the improvement in Membuster?
Comment 19 Yusuke Suzuki 2020-02-19 19:22:14 PST
Comment on attachment 390734 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=390734&action=review

>> Source/WTF/ChangeLog:29
>> +        While JetStream2 is neutral, we get sub-1% improvement in Membuster.
> 
> How big was the improvement in Membuster?

In my local testing, it is ~1%. We already got large improvement when increasing the load-factor from 50% to 75% for small HashMap. This patch allows this load-factor for larger tables.
But maybe, I first need to check the result of https://bugs.webkit.org/show_bug.cgi?id=207827 and https://trac.webkit.org/changeset/257025/webkit
Comment 20 Darin Adler 2020-04-13 20:20:16 PDT
Looks OK to me. The key is really the performance testing and making sure we also do enough correctness testing. Like maybe testing with CHECK_HASHTABLE_CONSISTENCY set to 1?