WebKit Bugzilla
RESOLVED FIXED
173552
bmalloc: Add a per-thread line cache
https://bugs.webkit.org/show_bug.cgi?id=173552
Geoffrey Garen
Reported
2017-06-19 10:34:32 PDT
bmalloc: Add a per-thread line cache
Attachments
Patch (14.19 KB, patch)
2017-06-19 20:13 PDT, Geoffrey Garen
darin: review+
Geoffrey Garen
Comment 1
2017-06-19 20:13:14 PDT
Created attachment 313353: Patch
Geoffrey Garen
Comment 2
2017-06-19 20:14:11 PDT
MacBook Air MallocBench results:

~/OpenSource/Source/bmalloc> ~/OpenSource/PerformanceTests/MallocBench/run-malloc-benchmarks Baseline:~/OpenSource/WebKitBuildBaseline/Release/ Patch:~/OpenSource/WebKitBuild/Release/

                                  Baseline      Patch         Δ
Execution Time:
    churn                         80ms          78ms          ^ 1.03x faster
    list_allocate                 70ms          73ms          ! 1.04x slower
    tree_allocate                 74ms          76ms          ! 1.03x slower
    tree_churn                    82ms          81ms          ^ 1.01x faster
    fragment                      70ms          70ms
    fragment_iterate              78ms          77ms          ^ 1.01x faster
    medium                        157ms         160ms         ! 1.02x slower
    big                           139ms         138ms         ^ 1.01x faster
    facebook                      221ms         219ms         ^ 1.01x faster
    reddit                        112ms         113ms         ! 1.01x slower
    flickr                        115ms         117ms         ! 1.02x slower
    theverge                      144ms         150ms         ! 1.04x slower
    nimlang                       119ms         117ms         ^ 1.02x faster
    message_one                   190ms         188ms         ^ 1.01x faster
    message_many                  122ms         119ms         ^ 1.03x faster
    churn --parallel              37ms          37ms
    list_allocate --parallel      68ms          69ms          ! 1.01x slower
    tree_allocate --parallel      84ms          80ms          ^ 1.05x faster
    tree_churn --parallel         83ms          73ms          ^ 1.14x faster
    fragment --parallel           53ms          50ms          ^ 1.06x faster
    fragment_iterate --parallel   33ms          33ms
    medium --parallel             155ms         153ms         ^ 1.01x faster
    big --parallel                148ms         140ms         ^ 1.06x faster
    facebook --parallel           635ms         628ms         ^ 1.01x faster
    reddit --parallel             316ms         277ms         ^ 1.14x faster
    flickr --parallel             312ms         282ms         ^ 1.11x faster
    theverge --parallel           412ms         368ms         ^ 1.12x faster

    <geometric mean>              118ms         115ms         ^ 1.02x faster
    <arithmetic mean>             152ms         147ms         ^ 1.04x faster
    <harmonic mean>               96ms          94ms          ^ 1.02x faster

Peak Memory:
    churn                         2,296kB       2,288kB       ^ 1.0x smaller
    list_allocate                 3,584kB       3,588kB       ! 1.0x bigger
    tree_allocate                 7,404kB       7,408kB       ! 1.0x bigger
    tree_churn                    6,224kB       6,224kB
    fragment                      9,452kB       9,460kB       ! 1.0x bigger
    fragment_iterate              27,124kB      27,120kB      ^ 1.0x smaller
    medium                        1,190,816kB   1,190,812kB   ^ 1.0x smaller
    big                           1,090,788kB   1,090,588kB   ^ 1.0x smaller
    facebook                      81,136kB      80,624kB      ^ 1.01x smaller
    reddit                        15,412kB      15,412kB
    flickr                        29,324kB      29,320kB      ^ 1.0x smaller
    theverge                      28,976kB      28,940kB      ^ 1.0x smaller
    nimlang                       166,900kB     166,344kB     ^ 1.0x smaller
    message_one                   6,612kB       6,632kB       ! 1.0x bigger
    message_many                  4,272kB       4,444kB       ! 1.04x bigger
    churn --parallel              2,420kB       2,428kB       ! 1.0x bigger
    list_allocate --parallel      3,684kB       3,684kB
    tree_allocate --parallel      4,764kB       4,752kB       ^ 1.0x smaller
    tree_churn --parallel         4,412kB       4,416kB       ! 1.0x bigger
    fragment --parallel           9,560kB       9,576kB       ! 1.0x bigger
    fragment_iterate --parallel   27,984kB      28,004kB      ! 1.0x bigger
    medium --parallel             1,191,240kB   1,193,328kB   ! 1.0x bigger
    big --parallel                1,087,676kB   1,089,688kB   ! 1.0x bigger
    facebook --parallel           286,476kB     284,320kB     ^ 1.01x smaller
    reddit --parallel             56,480kB      56,548kB      ! 1.0x bigger
    flickr --parallel             101,908kB     101,936kB     ! 1.0x bigger
    theverge --parallel           110,156kB     109,852kB     ^ 1.0x smaller

    <geometric mean>              29,976kB      30,008kB      ! 1.0x bigger
    <arithmetic mean>             205,818kB     205,842kB     ! 1.0x bigger
    <harmonic mean>               9,014kB       9,043kB       ! 1.0x bigger

Memory at End:
    churn                         464kB         456kB         ^ 1.02x smaller
    list_allocate                 464kB         468kB         ! 1.01x bigger
    tree_allocate                 464kB         468kB         ! 1.01x bigger
    tree_churn                    468kB         468kB
    fragment                      468kB         476kB         ! 1.02x bigger
    fragment_iterate              480kB         476kB         ^ 1.01x smaller
    medium                        544kB         540kB         ^ 1.01x smaller
    big                           536kB         536kB
    facebook                      2,444kB       2,444kB
    reddit                        1,684kB       1,684kB
    flickr                        2,600kB       2,596kB       ^ 1.0x smaller
    theverge                      2,644kB       2,608kB       ^ 1.01x smaller
    nimlang                       58,460kB      58,544kB      ! 1.0x bigger
    message_one                   740kB         748kB         ! 1.01x bigger
    message_many                  1,324kB       1,148kB       ^ 1.15x smaller
    churn --parallel              580kB         588kB         ! 1.01x bigger
    list_allocate --parallel      620kB         608kB         ^ 1.02x smaller
    tree_allocate --parallel      820kB         828kB         ! 1.01x bigger
    tree_churn --parallel         868kB         780kB         ^ 1.11x smaller
    fragment --parallel           716kB         1,008kB       ! 1.41x bigger
    fragment_iterate --parallel   652kB         644kB         ^ 1.01x smaller
    medium --parallel             5,752kB       6,744kB       ! 1.17x bigger
    big --parallel                38,308kB      29,200kB      ^ 1.31x smaller
    facebook --parallel           12,392kB      11,956kB      ^ 1.04x smaller
    reddit --parallel             6,972kB       6,808kB       ^ 1.02x smaller
    flickr --parallel             11,432kB      11,524kB      ! 1.01x bigger
    theverge --parallel           10,848kB      10,996kB      ! 1.01x bigger

    <geometric mean>              1,689kB       1,685kB       ^ 1.0x smaller
    <arithmetic mean>             6,065kB       5,753kB       ^ 1.05x smaller
    <harmonic mean>               910kB         916kB         ! 1.01x bigger
Geoffrey Garen
Comment 3
2017-06-19 20:15:24 PDT
Mac Pro MallocBench results:

~/OpenSource/Source/bmalloc> ~/OpenSource/PerformanceTests/MallocBench/run-malloc-benchmarks Baseline:~/OpenSource/WebKitBuildBaseline/Release/ Patch:~/OpenSource/WebKitBuild/Release/

                                  Baseline      Patch         Δ
Execution Time:
    churn                         71ms          71ms
    list_allocate                 63ms          65ms          ! 1.03x slower
    tree_allocate                 63ms          64ms          ! 1.02x slower
    tree_churn                    76ms          75ms          ^ 1.01x faster
    fragment                      61ms          61ms
    fragment_iterate              66ms          66ms
    medium                        138ms         138ms
    big                           119ms         121ms         ! 1.02x slower
    facebook                      184ms         184ms
    reddit                        100ms         100ms
    flickr                        104ms         105ms         ! 1.01x slower
    theverge                      132ms         133ms         ! 1.01x slower
    nimlang                       117ms         114ms         ^ 1.03x faster
    message_one                   176ms         174ms         ^ 1.01x faster
    message_many                  953ms         911ms         ^ 1.05x faster
    churn --parallel              33ms          32ms          ^ 1.03x faster
    list_allocate --parallel      146ms         116ms         ^ 1.26x faster
    tree_allocate --parallel      805ms         613ms         ^ 1.31x faster
    tree_churn --parallel         1,009ms       354ms         ^ 2.85x faster
    fragment --parallel           82ms          62ms          ^ 1.32x faster
    fragment_iterate --parallel   12ms          12ms
    medium --parallel             119ms         117ms         ^ 1.02x faster
    big --parallel                116ms         115ms         ^ 1.01x faster
    facebook --parallel           4,719ms       4,104ms       ^ 1.15x faster
    reddit --parallel             3,852ms       2,753ms       ^ 1.4x faster
    flickr --parallel             4,126ms       2,532ms       ^ 1.63x faster
    theverge --parallel           4,456ms       3,289ms       ^ 1.35x faster

    <geometric mean>              199ms         177ms         ^ 1.12x faster
    <arithmetic mean>             811ms         610ms         ^ 1.33x faster
    <harmonic mean>               88ms          85ms          ^ 1.03x faster

Peak Memory:
    churn                         1,024kB       1,036kB       ! 1.01x bigger
    list_allocate                 2,324kB       2,324kB
    tree_allocate                 6,132kB       6,144kB       ! 1.0x bigger
    tree_churn                    4,960kB       4,948kB       ^ 1.0x smaller
    fragment                      8,176kB       8,176kB
    fragment_iterate              25,840kB      25,852kB      ! 1.0x bigger
    medium                        1,189,528kB   1,189,528kB
    big                           1,089,316kB   1,089,548kB   ! 1.0x bigger
    facebook                      79,400kB      79,816kB      ! 1.01x bigger
    reddit                        14,092kB      14,080kB      ^ 1.0x smaller
    flickr                        28,044kB      27,976kB      ^ 1.0x smaller
    theverge                      27,608kB      27,608kB
    nimlang                       165,688kB     166,184kB     ! 1.0x bigger
    message_one                   5,484kB       5,460kB       ^ 1.0x smaller
    message_many                  2,916kB       2,924kB       ! 1.0x bigger
    churn --parallel              1,696kB       1,760kB       ! 1.04x bigger
    list_allocate --parallel      3,136kB       3,296kB       ! 1.05x bigger
    tree_allocate --parallel      12,972kB      12,800kB      ^ 1.01x smaller
    tree_churn --parallel         13,024kB      13,964kB      ! 1.07x bigger
    fragment --parallel           7,216kB       7,528kB       ! 1.04x bigger
    fragment_iterate --parallel   28,176kB      28,348kB      ! 1.01x bigger
    medium --parallel             1,134,232kB   1,159,656kB   ! 1.02x bigger
    big --parallel                1,038,960kB   1,024,420kB   ^ 1.01x smaller
    facebook --parallel           1,582,944kB   1,595,348kB   ! 1.01x bigger
    reddit --parallel             291,172kB     300,924kB     ! 1.03x bigger
    flickr --parallel             550,712kB     555,568kB     ! 1.01x bigger
    theverge --parallel           602,324kB     614,916kB     ! 1.02x bigger

    <geometric mean>              36,479kB      36,866kB      ! 1.01x bigger
    <arithmetic mean>             293,226kB     295,190kB     ! 1.01x bigger
    <harmonic mean>               6,982kB       7,089kB       ! 1.02x bigger

Memory at End:
    churn                         572kB         584kB         ! 1.02x bigger
    list_allocate                 584kB         584kB
    tree_allocate                 572kB         584kB         ! 1.02x bigger
    tree_churn                    584kB         572kB         ^ 1.02x smaller
    fragment                      572kB         572kB
    fragment_iterate              572kB         584kB         ! 1.02x bigger
    medium                        632kB         632kB
    big                           632kB         636kB         ! 1.01x bigger
    facebook                      2,560kB       2,508kB       ^ 1.02x smaller
    reddit                        1,748kB       1,736kB       ^ 1.01x smaller
    flickr                        2,704kB       2,636kB       ^ 1.03x smaller
    theverge                      2,660kB       2,660kB
    nimlang                       58,548kB      58,440kB      ^ 1.0x smaller
    message_one                   1,012kB       984kB         ^ 1.03x smaller
    message_many                  1,488kB       1,556kB       ! 1.05x bigger
    churn --parallel              1,260kB       1,324kB       ! 1.05x bigger
    list_allocate --parallel      1,596kB       1,652kB       ! 1.04x bigger
    tree_allocate --parallel      2,560kB       2,764kB       ! 1.08x bigger
    tree_churn --parallel         2,836kB       2,520kB       ^ 1.13x smaller
    fragment --parallel           2,608kB       2,828kB       ! 1.08x bigger
    fragment_iterate --parallel   1,716kB       2,128kB       ! 1.24x bigger
    medium --parallel             49,384kB      31,760kB      ^ 1.55x smaller
    big --parallel                79,960kB      77,192kB      ^ 1.04x smaller
    facebook --parallel           40,992kB      39,532kB      ^ 1.04x smaller
    reddit --parallel             30,316kB      29,752kB      ^ 1.02x smaller
    flickr --parallel             37,020kB      34,972kB      ^ 1.06x smaller
    theverge --parallel           32,012kB      32,288kB      ! 1.01x bigger

    <geometric mean>              3,081kB       3,055kB       ^ 1.01x smaller
    <arithmetic mean>             13,248kB      12,370kB      ^ 1.07x smaller
    <harmonic mean>               1,334kB       1,349kB       ! 1.01x bigger
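
For readers checking the Δ column: the printed factors appear to be the larger value divided by the smaller one, rounded to two decimals, with the direction (faster/slower, smaller/bigger) reported separately. This is inferred from the numbers in the results, not from the run-malloc-benchmarks source, and the helper names below (changeFactor, roundTo2) are hypothetical. A minimal sketch:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: ratio of the larger measurement to the smaller one.
// Assumption: this mirrors how the printed factors were derived; it is
// inferred from the reported numbers, not taken from the benchmark script.
inline double changeFactor(double baseline, double patch)
{
    assert(baseline > 0 && patch > 0);
    return baseline > patch ? baseline / patch : patch / baseline;
}

// Round to two decimal places, matching the precision shown in the results.
inline double roundTo2(double value)
{
    return std::round(value * 100.0) / 100.0;
}
```

With the Mac Pro numbers, roundTo2(changeFactor(1009, 354)) gives 2.85 (tree_churn --parallel), and with the MacBook Air numbers, roundTo2(changeFactor(70, 73)) gives 1.04 (list_allocate). Because the table shows rounded milliseconds, recomputed factors for other rows may differ slightly in the last digit.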
Darin Adler
Comment 4
2017-06-19 22:57:54 PDT
Comment on attachment 313353: Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=313353&action=review
> Source/bmalloc/bmalloc/List.h:118
> +    static void remove(ListNode<T>* node)
The insertAfter function could also be marked static; why not?
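
To illustrate the point, here is a minimal sketch of an intrusive, circular doubly linked list in which both remove and insertAfter touch only the nodes passed in, so neither needs a List instance. The layout and names (Page, m_root, staticListOpsDemo) are hypothetical and are not bmalloc's actual List.h:

```cpp
#include <cassert>

// Hypothetical stand-in for an intrusive list node; not bmalloc's real type.
template<typename T>
struct ListNode {
    ListNode<T>* prev { nullptr };
    ListNode<T>* next { nullptr };
};

template<typename T>
class List {
public:
    // The root is a sentinel that links to itself when the list is empty.
    List() { m_root.prev = &m_root; m_root.next = &m_root; }

    bool isEmpty() const { return m_root.next == &m_root; }

    // Needs m_root, so this one must stay a non-static member.
    void push(ListNode<T>* node) { insertAfter(&m_root, node); }

    // Only rewires the node and its neighbors: no `this` required.
    static void remove(ListNode<T>* node)
    {
        node->prev->next = node->next;
        node->next->prev = node->prev;
        node->prev = nullptr;
        node->next = nullptr;
    }

    // Likewise only touches `position`, its successor, and `node`.
    static void insertAfter(ListNode<T>* position, ListNode<T>* node)
    {
        node->prev = position;
        node->next = position->next;
        position->next->prev = node;
        position->next = node;
    }

private:
    ListNode<T> m_root;
};

// Hypothetical element type that embeds its own links.
struct Page : ListNode<Page> { };

// Exercises both static functions without going through a List object.
inline bool staticListOpsDemo()
{
    List<Page> list;
    Page a, b;
    list.push(&a);
    List<Page>::insertAfter(&a, &b);
    assert(!list.isEmpty());
    List<Page>::remove(&a);
    List<Page>::remove(&b);
    return list.isEmpty();
}
```

Calling staticListOpsDemo() returns true once both nodes have been unlinked, which shows that the static calls kept the sentinel's links consistent.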
> Source/bmalloc/bmalloc/SmallPage.h:71
> +typedef std::array<List<SmallPage>, sizeClassCount> LineCache;
In new code, we’ve been preferring using to typedef.
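
For reference, a minimal sketch of the two spellings side by side. SmallPage, List, and the value of sizeClassCount are hypothetical stand-ins so the snippet compiles on its own, and the alias names LineCacheTypedef and LineCacheUsing exist only to let both forms coexist:

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for the bmalloc types named in the patch.
struct SmallPage { };
template<typename T> struct List { T* head { nullptr }; };
constexpr std::size_t sizeClassCount = 8; // assumed value, for illustration only

// The form used in the patch under review.
typedef std::array<List<SmallPage>, sizeClassCount> LineCacheTypedef;

// The alias-declaration form suggested in the review.
using LineCacheUsing = std::array<List<SmallPage>, sizeClassCount>;

// Both declarations name exactly the same type.
static_assert(std::is_same<LineCacheTypedef, LineCacheUsing>::value,
    "typedef and using declare the same type");
```

The two declarations are interchangeable here; the alias form reads left to right and, unlike typedef, can also be templated.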
Geoffrey Garen
Comment 5
2017-06-24 13:14:35 PDT
Committed r218788: <http://trac.webkit.org/changeset/218788>
Saam Barati
Comment 6
2017-06-26 12:04:44 PDT
This is a 4% progression on wasm benchmarks too.