WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
223808
Service Worker scripts use too much memory in the network process
https://bugs.webkit.org/show_bug.cgi?id=223808
Summary
Service Worker scripts use too much memory in the network process
Chris Dumez
Reported
2021-03-26 11:32:14 PDT
Service Worker scripts use to much memory in the network process.
Attachments
WIP Patch
(48.99 KB, patch)
2021-03-26 11:33 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
WIP Patch
(50.57 KB, patch)
2021-03-26 11:52 PDT
,
Chris Dumez
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
WIP Patch
(59.60 KB, patch)
2021-03-26 13:09 PDT
,
Chris Dumez
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
WIP Patch
(59.55 KB, patch)
2021-03-26 13:20 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
WIP Patch
(61.04 KB, patch)
2021-03-26 14:03 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(72.72 KB, patch)
2021-03-26 15:51 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(75.18 KB, patch)
2021-03-26 17:09 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(91.31 KB, patch)
2021-03-26 19:14 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(98.69 KB, patch)
2021-03-26 21:13 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(103.43 KB, patch)
2021-03-29 14:18 PDT
,
Chris Dumez
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(104.47 KB, patch)
2021-03-29 15:07 PDT
,
Chris Dumez
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(104.54 KB, patch)
2021-03-29 15:51 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(104.80 KB, patch)
2021-03-29 17:05 PDT
,
Chris Dumez
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(104.84 KB, patch)
2021-03-29 17:27 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(106.86 KB, patch)
2021-03-30 11:19 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(114.22 KB, patch)
2021-03-30 14:09 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(116.60 KB, patch)
2021-03-30 15:50 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(116.60 KB, patch)
2021-03-30 15:53 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(115.27 KB, patch)
2021-03-30 19:53 PDT
,
Chris Dumez
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(115.15 KB, patch)
2021-03-30 19:58 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Show Obsolete
(19)
View All
Add attachment
proposed patch, testcase, etc.
Chris Dumez
Comment 1
2021-03-26 11:32:29 PDT
<
rdar://75637093
>
Chris Dumez
Comment 2
2021-03-26 11:33:24 PDT
Created
attachment 424382
[details]
WIP Patch
Chris Dumez
Comment 3
2021-03-26 11:52:48 PDT
Created
attachment 424387
[details]
WIP Patch
Chris Dumez
Comment 4
2021-03-26 13:09:36 PDT
Created
attachment 424395
[details]
WIP Patch
Chris Dumez
Comment 5
2021-03-26 13:20:38 PDT
Created
attachment 424396
[details]
WIP Patch
Chris Dumez
Comment 6
2021-03-26 14:03:03 PDT
Created
attachment 424402
[details]
WIP Patch
Chris Dumez
Comment 7
2021-03-26 15:51:22 PDT
Created
attachment 424411
[details]
Patch
Chris Dumez
Comment 8
2021-03-26 17:09:48 PDT
Created
attachment 424421
[details]
Patch
Chris Dumez
Comment 9
2021-03-26 19:14:00 PDT
Created
attachment 424431
[details]
Patch
Chris Dumez
Comment 10
2021-03-26 21:13:47 PDT
Created
attachment 424440
[details]
Patch
Chris Dumez
Comment 11
2021-03-29 14:18:29 PDT
Created
attachment 424577
[details]
Patch
Chris Dumez
Comment 12
2021-03-29 15:07:59 PDT
Created
attachment 424589
[details]
Patch
Chris Dumez
Comment 13
2021-03-29 15:51:26 PDT
Created
attachment 424597
[details]
Patch
Chris Dumez
Comment 14
2021-03-29 17:05:52 PDT
Created
attachment 424604
[details]
Patch
Chris Dumez
Comment 15
2021-03-29 17:27:22 PDT
Created
attachment 424605
[details]
Patch
Chris Dumez
Comment 16
2021-03-30 11:19:04 PDT
Created
attachment 424663
[details]
Patch
Chris Dumez
Comment 17
2021-03-30 11:20:15 PDT
I did manual memory testing with a very large service worker that had one ~100MB main script, importing another ~100MB script. Here are the memory results without my change: - Networking (Dirty): 238MB (cold) / 331MB (warm) - WebContent (Dirty): 441MB (cold) / 857MB (warm) Here are the memory results with my change: - Networking (Dirty): 22MB (cold) / 14MB (warm) - WebContent (Dirty): 440MB (cold) / 438MB (warm)
Geoffrey Garen
Comment 18
2021-03-30 12:29:12 PDT
Comment on
attachment 424663
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=424663&action=review
Looks good. A few comments. I'm not going to say r+ yet because I think the salt issue might change the on-disk representation a smidge, and I'd hate to have to bump the version again just for that.
> Source/WebCore/ChangeLog:15 > + and greatly reduce memory usage.
reduce [ dirty ] memory usage
> Source/WebCore/ChangeLog:36 > + - WebContent (Dirty): 440MB (cold) / 438MB (warm)
I wonder why the warm case used to use 1.5X - 2X more memory than the cold case. Maybe this won't matter once we're done optimizing; but it almost suggests that we have two copies of the script somewhere in both networking and WebContent.
> Source/WebCore/workers/service/server/RegistrationDatabase.cpp:53 > -static const uint64_t schemaVersion = 5; > +static const uint64_t schemaVersion = 6;
Is this schema bump sufficient to delete the old database, or do we need to do something explicit? (We want to avoid leaving old data behind, including in the case of a restore from a backup from a previous version of WebKit.)
> Source/WebCore/workers/service/server/SWScriptStorage.cpp:45 > + auto crypto = PAL::CryptoDigest::create(PAL::CryptoDigest::Algorithm::SHA_256); > + auto inputUtf8 = input.utf8(); > + crypto->addBytes(inputUtf8.data(), inputUtf8.length()); > + auto hash = crypto->computeHash();
Is there anything in this code that will add a salt to the hash? If not, I think we need to add a salt -- otherwise the hash does not confer much secrecy. (Specifically, anyone searching for a specific URL can just compute the hash for themselves and then compare. And perhaps there are other attacks; I'm not an expert, but I know that salt is best practice and also what we use in the HTTP cache.)
> Source/WebCore/workers/service/server/SWScriptStorage.cpp:94 > + // If mmap() fails, fall back to less efficient way. > + auto fileHandle = FileSystem::openFile(scriptPath, FileSystem::FileOpenMode::Write); > + if (!FileSystem::isHandleValid(fileHandle)) { > + RELEASE_LOG_ERROR(ServiceWorker, "SWScriptStorage::store: Failure to store %s, FileSystem::openFile() failed", scriptPath.utf8().data()); > + return nullptr; > + } > + for (auto it = script.begin(); it != script.end(); ++it) > + FileSystem::writeToFile(fileHandle, it->segment->data(), it->segment->size()); > + > + FileSystem::closeFile(fileHandle); > + return script.copy();
Do we expect mmap to fail in any reasonable case? If not, I think we should just fail the operation instead of falling back. I don't love having fallback behavior that we can't test.
Chris Dumez
Comment 19
2021-03-30 12:38:54 PDT
Comment on
attachment 424663
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=424663&action=review
>> Source/WebCore/ChangeLog:15 >> + and greatly reduce memory usage. > > reduce [ dirty ] memory usage
OK
>> Source/WebCore/ChangeLog:36 >> + - WebContent (Dirty): 440MB (cold) / 438MB (warm) > > I wonder why the warm case used to use 1.5X - 2X more memory than the cold case. Maybe this won't matter once we're done optimizing; but it almost suggests that we have two copies of the script somewhere in both networking and WebContent.
Yes, I think this deserves a separate investigation. I could not believe the 857MB at first so I took the measurement twice...
>> Source/WebCore/workers/service/server/RegistrationDatabase.cpp:53 >> +static const uint64_t schemaVersion = 6; > > Is this schema bump sufficient to delete the old database, or do we need to do something explicit? (We want to avoid leaving old data behind, including in the case of a restore from a backup from a previous version of WebKit.)
This is sufficient because the following will get called and remove old databases: ``` static inline void cleanOldDatabases(const String& databaseDirectory) { for (uint64_t version = 1; version < schemaVersion; ++version) SQLiteFileSystem::deleteDatabaseFile(FileSystem::pathByAppendingComponent(databaseDirectory, databaseFilenameFromVersion(version))); } ```
>> Source/WebCore/workers/service/server/SWScriptStorage.cpp:45 >> + auto hash = crypto->computeHash(); > > Is there anything in this code that will add a salt to the hash? If not, I think we need to add a salt -- otherwise the hash does not confer much secrecy. (Specifically, anyone searching for a specific URL can just compute the hash for themselves and then compare. And perhaps there are other attacks; I'm not an expert, but I know that salt is best practice and also what we use in the HTTP cache.)
Ok, I will look into adding a Salt. I did not realized we were concerned about this sort of attack.
>> Source/WebCore/workers/service/server/SWScriptStorage.cpp:94 >> + return script.copy(); > > Do we expect mmap to fail in any reasonable case? If not, I think we should just fail the operation instead of falling back. I don't love having fallback behavior that we can't test.
Ok, so there is definitely one case: 1. size is 0 (I found out because some of our API test use empty files). FileSystem::mapToFile() definitely fails with an empty file. This code path is definitely tested by API tests. Then there is a possible second case I thought of: 2. I am assuming the system has some sort of limits on how many files you can have mmap'd() at the same time. If so, once we reach that limit, it is nicer to fall back to basic writing.
Chris Dumez
Comment 20
2021-03-30 12:42:50 PDT
Comment on
attachment 424663
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=424663&action=review
>>> Source/WebCore/workers/service/server/SWScriptStorage.cpp:94 >>> + return script.copy(); >> >> Do we expect mmap to fail in any reasonable case? If not, I think we should just fail the operation instead of falling back. I don't love having fallback behavior that we can't test. > > Ok, so there is definitely one case: > 1. size is 0 (I found out because some of our API test use empty files). FileSystem::mapToFile() definitely fails with an empty file. > > This code path is definitely tested by API tests. > > Then there is a possible second case I thought of: > 2. I am assuming the system has some sort of limits on how many files you can have mmap'd() at the same time. If so, once we reach that limit, it is nicer to fall back to basic writing.
Also note that the existing SharedBuffer::createWithContentsOfFile() does: 1. Try to mmap() file and return that if successful 2. Read file into memory So my storing behavior is consistent with our retrieval behavior.
Geoffrey Garen
Comment 21
2021-03-30 12:53:29 PDT
> > Do we expect mmap to fail in any reasonable case? If not, I think we should just fail the operation instead of falling back. I don't love having fallback behavior that we can't test. > > Ok, so there is definitely one case: > 1. size is 0 (I found out because some of our API test use empty files). > FileSystem::mapToFile() definitely fails with an empty file.
This behavior in mapToFile() seems like a bug? Zero-byte files are definitely a valid thing: ~> touch test ~> ls -al test -rw-r--r-- 1 ggaren staff 0 Mar 30 12:49 test
> Then there is a possible second case I thought of: > 2. I am assuming the system has some sort of limits on how many files you > can have mmap'd() at the same time. If so, once we reach that limit, it is > nicer to fall back to basic writing.
There *is* a limit (per process, per user, and per device) on number of open files, but it applies equally to mmap and other file descriptors. So, if you hit that limit, mapToFile() will fail, but so will openFile()! So, like, if we had a test for this condition, it would be failing.
Chris Dumez
Comment 22
2021-03-30 14:09:08 PDT
Created
attachment 424693
[details]
Patch
Chris Dumez
Comment 23
2021-03-30 14:10:31 PDT
(In reply to Chris Dumez from
comment #22
)
> Created
attachment 424693
[details]
> Patch
Made the following suggested changes: 1. Added a Salt when computing the hashes. I moved the salt logic from NetworkCacheData to FileSystem so that I could reuse it. 2. Made mapToFile() return a valid mapped file, even if the file size is 0. 3. Drop logic to try writing manually to file when mapToFile() fails.
Darin Adler
Comment 24
2021-03-30 14:16:31 PDT
What do we do to prevent ourselves from mapping an arbitrarily large number of files? I assume we can’t do that safely without having a bad effect on the rest of the system.
Chris Dumez
Comment 25
2021-03-30 14:19:30 PDT
Comment on
attachment 424693
[details]
Patch Just noticed something that probably needs fixing.
Chris Dumez
Comment 26
2021-03-30 14:48:42 PDT
(In reply to Darin Adler from
comment #24
)
> What do we do to prevent ourselves from mapping an arbitrarily large number > of files? I assume we can’t do that safely without having a bad effect on > the rest of the system.
That is a good question. I am not sure how we deal with this on the network cache side (where we do the same mmap'ing). I will try and find out. I currently do not have a limit in my patch.
Chris Dumez
Comment 27
2021-03-30 15:07:19 PDT
(In reply to Chris Dumez from
comment #26
)
> (In reply to Darin Adler from
comment #24
) > > What do we do to prevent ourselves from mapping an arbitrarily large number > > of files? I assume we can’t do that safely without having a bad effect on > > the rest of the system. > > That is a good question. I am not sure how we deal with this on the network > cache side (where we do the same mmap'ing). I will try and find out. > > I currently do not have a limit in my patch.
I am having trouble finding out how we're dealing with this in the network cache code. I am open to suggestions here. It seems that we could avoid using mmap() if the size is smallish to avoid mmap() overuse. We may also want to have a limit on the total number of mmap'd files the SWScriptStorage can have (and fallback to reading file from disk)?
Chris Dumez
Comment 28
2021-03-30 15:50:26 PDT
Created
attachment 424708
[details]
Patch
Chris Dumez
Comment 29
2021-03-30 15:51:34 PDT
(In reply to Chris Dumez from
comment #28
)
> Created
attachment 424708
[details]
> Patch
1. Fixed regression in my patch where NetworkCacheData::mapToFile() was no longer calling NetworkCacheData::apply() to write the data. 2. Do not use mmap() for files < PAGE_SIZE, similarly to what the network cache is doing.
Chris Dumez
Comment 30
2021-03-30 15:53:02 PDT
Created
attachment 424709
[details]
Patch
Geoffrey Garen
Comment 31
2021-03-30 15:55:47 PDT
Comment on
attachment 424709
[details]
Patch r=me
Chris Dumez
Comment 32
2021-03-30 19:53:45 PDT
Created
attachment 424729
[details]
Patch
Chris Dumez
Comment 33
2021-03-30 19:58:07 PDT
Created
attachment 424730
[details]
Patch
Chris Dumez
Comment 34
2021-03-30 20:56:30 PDT
I will follow-up to restrict the number of open file descriptors in the network process since the solution likely needs to cover both the network cache and this service worker storage (and seems more likely to be an issue in the network cache case).
EWS
Comment 35
2021-03-30 21:25:05 PDT
Committed
r275267
: <
https://commits.webkit.org/r275267
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 424730
[details]
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug