Bug 207882 - NetworkCache should use 4KB threshold for mmap-ed files instead of 16KB
Summary: NetworkCache should use 4KB threshold for mmap-ed files instead of 16KB
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Yusuke Suzuki
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-02-18 00:57 PST by Yusuke Suzuki
Modified: 2020-02-20 14:23 PST (History)
9 users (show)

See Also:


Attachments
Patch (4.25 KB, patch)
2020-02-19 15:23 PST, Yusuke Suzuki
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yusuke Suzuki 2020-02-18 00:57:28 PST
If the file size is smaller than the threshold (16KB), we put metadata, header, and body into one file, and disabling mmap feature for this.
Previously, we are using mmap path for 4KB~ data in macOS. This sounds like a memory regression.

Disabling mmap means that we have copy of content in memory in WebProcess, and this content cannot be discarded easily because it is not file-backed.
File-backed mmap-ed content can be potentially discarded efficiently by OS (just purging the content and wiring it to disk).

We should first try 4KB threshold in A/B test to see the effect.
If this is effective,

1. We should consider decreasing this threshold in macOS since page size of macOS is 4KB.
2. We should try mmap-ping inlined body too. mmapping a part of metadata/header too, and using this file handle w/ offset.
Comment 1 Antti Koivisto 2020-02-18 02:13:59 PST
Mmaps are not free. We have had problems with running out of file descriptors in the past and kernel people have complained about excessive use. It is anyway questionable if using them for small data chuncks is anything other than a memory benchmark hack.

Only blobs are designed to be mappable, with mmap you need to be very careful about file mutations. If we wanted smaller chunks to be mappable we'd just change the inline size threshold.

I don't think this is something we should ever do.
Comment 2 Yusuke Suzuki 2020-02-18 03:11:23 PST
(In reply to Antti Koivisto from comment #1)
> Mmaps are not free. We have had problems with running out of file
> descriptors in the past and kernel people have complained about excessive
> use. It is anyway questionable if using them for small data chuncks is
> anything other than a memory benchmark hack.
> 
> Only blobs are designed to be mappable, with mmap you need to be very
> careful about file mutations. If we wanted smaller chunks to be mappable
> we'd just change the inline size threshold.
> 
> I don't think this is something we should ever do.

I think we should try 4KB threshold in this case, because it was the previous configuration and it matches against the page size in macOS.
I found a lot of images are using Vector for backing store in Membuster. My guess is that this is because our 16KB threshold is too large, and ends up having many of memory backed resources.

Iā€™m currently running A/B test which just configures this threshold to 4KB. If this shows improvement, I think we should do. If we think this is only profitable in Membyster, we should ignore Membuster, because it no longer reflects the workloads we would like to optimize.
Comment 3 Antti Koivisto 2020-02-18 03:22:17 PST
I'm not a fan of creating a difference between iOS and Mac here either.
Comment 4 Yusuke Suzuki 2020-02-18 11:21:13 PST
(In reply to Antti Koivisto from comment #3)
> I'm not a fan of creating a difference between iOS and Mac here either.

I'm not a fan of this too. But if this offers large benefit, I think we should not miss a chance of optimization because of clean / consistent code / policy.

The A/B test is saying 4% improvement in Membuster, this is super large.
I think, this is because so many files in Membuster is <= 16KB.
At this point, I think there are two choices,

1. We should do that. 4% is super large.
2. We should not do that. But in that case, I think we should ignore Membuster from now. Ignoring 4% improvement means we are almost saying "this benchmark is useless".
Comment 5 Antti Koivisto 2020-02-18 12:00:47 PST
Considering there are known costs in terms of system resources which Membuster doesn't capture, I think some real world benefits would need to be proven.
Comment 6 Yusuke Suzuki 2020-02-18 13:33:48 PST
(In reply to Antti Koivisto from comment #5)
> Considering there are known costs in terms of system resources which
> Membuster doesn't capture, I think some real world benefits would need to be
> proven.

One of the benefit of using mmap-ed memory backed by file is that, applications can tell OS about the hint that this memory can be purgeable if OS wants since the content exists on the disk. This offers two benefits,

1. OS can purge it if it wants.
2. OS can avoid swapping / compressing this memory to get saved memory unnecessarily. Instead, it can purge it and recover it if it is accessed.

Currently, non-dirty mmap-ed memory region is not accounted as memory, I think this makes sense since OS can purge it if it want.

I have a question. If this does not look a real world benefit, why do we have mmap-ed file path?
Using mmap-ed file if larger than 16KB sounds strange if it does not offer any benefit. If it is true, then, we should just remove this blob mechanism entirely and always use inline body.
If it offers benefit, the same benefit can be achieved for 4KB too, and it is measured in Membuster.
Comment 7 Antti Koivisto 2020-02-19 07:26:48 PST
Hmm, I suppose there is no likely harm in dropping this back to the page size on Mac. It is sort of logical. The problems with running out of file descriptors were always on iOS.
Comment 8 Yusuke Suzuki 2020-02-19 13:29:46 PST
(In reply to Antti Koivisto from comment #7)
> Hmm, I suppose there is no likely harm in dropping this back to the page
> size on Mac. It is sort of logical. The problems with running out of file
> descriptors were always on iOS.

That sounds really nice!
The final result of A/B test (with 10 runs for each) shows 2.5% progression w/ 98% significance.
This means that I think this is really solid improvement in macOS.

I've discussed with Phil, and we agree that this reduces memory pressure because of named-pages (instead of anonymous pages).
Comment 9 Yusuke Suzuki 2020-02-19 15:19:35 PST
I discussed with Geoff about this.
We agree that using named-pages is improvement in terms of memory because OS can purge it and recover it cheaply.
We agree that just decreasing the threshold for macOS is the safest option and improvement.
For now, we should start using page-size, this is the safest.
Geoff also suggests an interesting idea: we should explore good threshold for iOS too. Currently, we are always using mmap-file-path if the size gets larger than 16KB. This means we could waste 16KB - 1 byte in the worst case (50% wasting). This also suggests that we should try 8KB (50% of 16KB) threshold for iOS. I think it is worth looking into it after fixing this issue.
Comment 10 Yusuke Suzuki 2020-02-19 15:23:06 PST
Created attachment 391210 [details]
Patch
Comment 11 Radar WebKit Bug Importer 2020-02-19 15:34:50 PST
<rdar://problem/59608926>
Comment 12 Yusuke Suzuki 2020-02-19 17:26:45 PST
Comment on attachment 391210 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=391210&action=review

> Source/WebKit/ChangeLog:21
> +        This patch reduces the threshold from 16KB to page size (4KB in macOS, 16KB in iOS). This offers 2.56% progression with 98% probability in Membuster.

I note that this 4KB threshold is the default one before 2015.
Comment 13 Alex Christensen 2020-02-19 17:29:35 PST
Comment on attachment 391210 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=391210&action=review

> Source/WebKit/NetworkProcess/cache/NetworkCacheStorage.cpp:297
>      auto headerSizes = recordCount * 4096;

Let's consider changing this to pageSize, too.
Comment 14 Alex Christensen 2020-02-19 17:32:34 PST
Actually, maybe not.
Comment 15 Yusuke Suzuki 2020-02-19 18:13:00 PST
Committed r257025: <https://trac.webkit.org/changeset/257025>
Comment 16 Jason Lawrence 2020-02-20 08:14:05 PST
Reverted r257025 for reason:

This commit broke a test on Mac wk2.

Committed r257061: <https://trac.webkit.org/changeset/257061>
Comment 17 Jason Lawrence 2020-02-20 08:16:37 PST
This commit caused a regression with the http/tests/inspector/network/resource-sizes-disk-cache.html test.
https://bugs.webkit.org/show_bug.cgi?id=208004
Comment 18 Yusuke Suzuki 2020-02-20 12:46:28 PST
(In reply to Jason Lawrence from comment #17)
> This commit caused a regression with the
> http/tests/inspector/network/resource-sizes-disk-cache.html test.
> https://bugs.webkit.org/show_bug.cgi?id=208004

The test looks broken, it is assuming that space100.png is always in non-mmap-file path, and this is wrong. I'll reland this with test update.
Comment 19 Yusuke Suzuki 2020-02-20 14:23:41 PST
Committed r257089: <https://trac.webkit.org/changeset/257089>