Bug 207972 - Make support for bytecode caching more robust against file corruption.
Summary: Make support for bytecode caching more robust against file corruption.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Lam
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-02-19 15:40 PST by Mark Lam
Modified: 2020-02-21 08:21 PST (History)
9 users (show)

See Also:


Attachments
proposed patch. (22.88 KB, patch)
2020-02-19 16:16 PST, Mark Lam
mark.lam: review-
Details | Formatted Diff | Diff
proposed patch. (23.06 KB, patch)
2020-02-19 17:09 PST, Mark Lam
mark.lam: review-
Details | Formatted Diff | Diff
proposed patch. (15.24 KB, patch)
2020-02-20 15:46 PST, Mark Lam
mark.lam: review-
Details | Formatted Diff | Diff
proposed patch. (8.42 KB, patch)
2020-02-20 20:58 PST, Mark Lam
ysuzuki: review+
Details | Formatted Diff | Diff
patch for landing. (8.51 KB, patch)
2020-02-21 00:12 PST, Mark Lam
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Lam 2020-02-19 15:40:49 PST
Details to follow in the ChangeLog.

<rdar://problem/59260595>
Comment 1 Mark Lam 2020-02-19 16:16:54 PST
Created attachment 391216 [details]
proposed patch.
Comment 2 Mark Lam 2020-02-19 16:19:30 PST
Comment on attachment 391216 [details]
proposed patch.

Need build fixes after merge.
Comment 3 Mark Lam 2020-02-19 17:09:53 PST
Created attachment 391224 [details]
proposed patch.
Comment 4 Robin Morisset 2020-02-19 18:39:26 PST
Comment on attachment 391224 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=391224&action=review

Overall I am very happy with changes 1 and 2, but am a lot less convinced by change 3: either the cache can be freely corrupted by attackers and we need some more fancy checksum, or it cannot and then it is a bit of complexity for little benefit. I'd love to see this part of the patch separated from the parts 1 and 2, to at least make it easier to roll-out if we later find bugs in it.

> Source/JavaScriptCore/ChangeLog:33
> +           header current only contains a fileSize value.

nit: current -> currently

> Source/JavaScriptCore/API/JSScript.mm:153
> +    uint32_t unusedReservedForFutureUse[3];

Do we actually need this? And if we need this to be able to add things at the beginning of the cache file, why can we add things in this patch?
If it is purely for alignment purposes, I would maybe change the name/type to something like uint8_t paddingForAlignment[12];

> Source/JavaScriptCore/API/JSScript.mm:282
> +    int fd = open(tokenFileName, O_CREAT | O_NONBLOCK, 0666);

What is the minimum permission required for checking if a file exist and deleting it?
Allowing anyone to write to this file is probably not too bad (since no-one looks at it), but it feels unnecessary, and might maybe be useful for an attacker trying to persist information across reboots.

> Source/JavaScriptCore/API/JSScript.mm:337
> +    const char* tempFileName = [cachePathString stringByAppendingString:@".tmp"].UTF8String;

Nit: discrepancy between tempFileName and cacheFilename.
Suggested name change: cacheFilename => cacheFileName

> Source/JavaScriptCore/API/JSScript.mm:338
> +    int fd = open(cacheFilename, O_CREAT | O_RDWR | O_EXLOCK | O_NONBLOCK, 0666);

666 seems excessive, see above
Also, we seem not to ever use fd? Since we just rename tempFileName to cacheFilename

> Source/JavaScriptCore/API/JSScript.mm:344
> +    int tempFD = open(tempFileName, O_CREAT | O_RDWR | O_EXLOCK | O_NONBLOCK, 0666);

666 seems excessive, see above
Also, do we need O_RDWR, or just O_WRONLY ?

> Source/JavaScriptCore/runtime/CodeCache.h:158
> +        sourceProvider.notifyBeforeReadingCachedBytecode();

Should this not be before "sourceProvider.cachedBytecode()", to also catch crashes in that code?
Also I am a bit surprised that this pattern of notifyBeforeReadingCachedBytecode() -> do stuff -> notifyAfterReadingCachedBytecode() happens twice in this patch, I would have expected reading the cache to be done in a single place.
Comment 5 Mark Lam 2020-02-20 15:30:42 PST
Comment on attachment 391224 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=391224&action=review

Thanks for the feedback.  After thinking about this more, I decided to drop (3) since its benefit to risk ratio is low.

>> Source/JavaScriptCore/ChangeLog:33
>> +           header current only contains a fileSize value.
> 
> nit: current -> currently

Removed with (3).

>> Source/JavaScriptCore/API/JSScript.mm:153
>> +    uint32_t unusedReservedForFutureUse[3];
> 
> Do we actually need this? And if we need this to be able to add things at the beginning of the cache file, why can we add things in this patch?
> If it is purely for alignment purposes, I would maybe change the name/type to something like uint8_t paddingForAlignment[12];

Removed with (3).

>> Source/JavaScriptCore/API/JSScript.mm:282
>> +    int fd = open(tokenFileName, O_CREAT | O_NONBLOCK, 0666);
> 
> What is the minimum permission required for checking if a file exist and deleting it?
> Allowing anyone to write to this file is probably not too bad (since no-one looks at it), but it feels unnecessary, and might maybe be useful for an attacker trying to persist information across reboots.

I've change the 0666 to 0.

>> Source/JavaScriptCore/API/JSScript.mm:337
>> +    const char* tempFileName = [cachePathString stringByAppendingString:@".tmp"].UTF8String;
> 
> Nit: discrepancy between tempFileName and cacheFilename.
> Suggested name change: cacheFilename => cacheFileName

Fixed.

>> Source/JavaScriptCore/API/JSScript.mm:338
>> +    int fd = open(cacheFilename, O_CREAT | O_RDWR | O_EXLOCK | O_NONBLOCK, 0666);
> 
> 666 seems excessive, see above
> Also, we seem not to ever use fd? Since we just rename tempFileName to cacheFilename

Changed to O_WRONLY with 0600.

>> Source/JavaScriptCore/API/JSScript.mm:344
>> +    int tempFD = open(tempFileName, O_CREAT | O_RDWR | O_EXLOCK | O_NONBLOCK, 0666);
> 
> 666 seems excessive, see above
> Also, do we need O_RDWR, or just O_WRONLY ?

Changed to O_RDWR and 0600.  O_RDWR is actually needed.  The renaming of the temp file to cache file fails without it.

>> Source/JavaScriptCore/runtime/CodeCache.h:158
>> +        sourceProvider.notifyBeforeReadingCachedBytecode();
> 
> Should this not be before "sourceProvider.cachedBytecode()", to also catch crashes in that code?
> Also I am a bit surprised that this pattern of notifyBeforeReadingCachedBytecode() -> do stuff -> notifyAfterReadingCachedBytecode() happens twice in this patch, I would have expected reading the cache to be done in a single place.

Not really.  The cachedBytecode() method only returns a RefPtr<CachedBytecode> which is a container object that will point to the mapped file buffer.  It does not read anything from the file itself.
Comment 6 Mark Lam 2020-02-20 15:46:04 PST
Created attachment 391347 [details]
proposed patch.
Comment 7 Mark Lam 2020-02-20 17:33:12 PST
Comment on attachment 391347 [details]
proposed patch.

We're going to drop the "reading" token solution because it can be racy, and just validate the cache file with a hash at initialization instead.
Comment 8 Mark Lam 2020-02-20 20:58:49 PST
Created attachment 391377 [details]
proposed patch.
Comment 9 Yusuke Suzuki 2020-02-20 22:20:32 PST
Comment on attachment 391377 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=391377&action=review

r=me with sync fix.

> Source/JavaScriptCore/ChangeLog:28
> +        Manually tested with the following scenarios and ensuring that the client recovers
> +        with no crashes:

Can you also ensure that we are successfully using bytecode cache if it is correct?

> Source/JavaScriptCore/API/JSScript.mm:184
> +    SHA1::Digest fileHash;
> +    memcpy(&fileHash, fileData + fileDataSize, sizeof(SHA1::Digest));

Maybe, doing this after calculating the hash is better (at that time, page should be populated), but it is a small nit.

> Source/JavaScriptCore/API/JSScript.mm:312
> +        close(fd);

I think writing makeScopeExit twice is cleaner and non-error-prone.

auto closeFD = makeScopeExit([&] {
    close(fd);
});

int tempFD = ...
if (tempFD == -1) {
    ...
}

auto closeTempFD = makeScopeExit([&] {
    close(tempFD);
});

> Source/JavaScriptCore/API/JSScript.mm:346
> +    rename(tempFileName, cacheFileName);

Before renaming, let's do fsync to ensure the content is in disk.

> Source/JavaScriptCore/API/JSScript.mm:347
>      return YES;

To make this actually sync, we need to emit fsync onto the directory of this to make "renaming" in directory-entry sync in the disk. But since this is cache, we don't need to ensure it, I think.
If directory misses the rename operation, it is still fine.
Comment 10 Yusuke Suzuki 2020-02-20 22:46:23 PST
Comment on attachment 391377 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=391377&action=review

>> Source/JavaScriptCore/API/JSScript.mm:346
>> +    rename(tempFileName, cacheFileName);
> 
> Before renaming, let's do fsync to ensure the content is in disk.

Maybe, we should F_FULLFSYNC + fcntl on macOS. Let's add something like, FileSystem::sync and implement it for macOS by using fcntl + F_FULLFSYNC. (for other platforms using UNIX, fsync is OK I think.).
http://shaver.off.net/diary/2008/05/25/fsyncers-and-curveballs/
Comment 11 Yusuke Suzuki 2020-02-20 22:50:08 PST
Comment on attachment 391377 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=391377&action=review

>>> Source/JavaScriptCore/API/JSScript.mm:346
>>> +    rename(tempFileName, cacheFileName);
>> 
>> Before renaming, let's do fsync to ensure the content is in disk.
> 
> Maybe, we should F_FULLFSYNC + fcntl on macOS. Let's add something like, FileSystem::sync and implement it for macOS by using fcntl + F_FULLFSYNC. (for other platforms using UNIX, fsync is OK I think.).
> http://shaver.off.net/diary/2008/05/25/fsyncers-and-curveballs/

But because of SHA1 checksum, maybe, fsync is enough, because we do not require consistency of files strongly. If it is corrupted, then we just detect it via SHA1 and discard it.
Comment 12 Mark Lam 2020-02-20 22:55:55 PST
Comment on attachment 391377 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=391377&action=review

Thanks for the review.

>> Source/JavaScriptCore/ChangeLog:28
>> +        with no crashes:
> 
> Can you also ensure that we are successfully using bytecode cache if it is correct?

Yes, I did already.  That is the 5 case below: an uncorrupted cache file on disk.

Hmmm, the sentence below should appear below the 5 cases.  I'll fix the ChangeLog before landing.

>> Source/JavaScriptCore/API/JSScript.mm:184
>> +    memcpy(&fileHash, fileData + fileDataSize, sizeof(SHA1::Digest));
> 
> Maybe, doing this after calculating the hash is better (at that time, page should be populated), but it is a small nit.

Good point.

>> Source/JavaScriptCore/API/JSScript.mm:312
>> +        close(fd);
> 
> I think writing makeScopeExit twice is cleaner and non-error-prone.
> 
> auto closeFD = makeScopeExit([&] {
>     close(fd);
> });
> 
> int tempFD = ...
> if (tempFD == -1) {
>     ...
> }
> 
> auto closeTempFD = makeScopeExit([&] {
>     close(tempFD);
> });

Will apply.

>> Source/JavaScriptCore/API/JSScript.mm:347
>>      return YES;
> 
> To make this actually sync, we need to emit fsync onto the directory of this to make "renaming" in directory-entry sync in the disk. But since this is cache, we don't need to ensure it, I think.
> If directory misses the rename operation, it is still fine.

I'll add the fsync.
Comment 13 Mark Lam 2020-02-21 00:12:18 PST
Created attachment 391384 [details]
patch for landing.
Comment 14 Mark Lam 2020-02-21 08:21:57 PST
Landed in r257134: <http://trac.webkit.org/r257134>.