Bug 193420 - JSScript API should only take ascii files.
Summary: JSScript API should only take ascii files.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Keith Miller
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-01-14 17:13 PST by Keith Miller
Modified: 2019-01-22 10:42 PST (History)
7 users (show)

See Also:


Attachments
Patch (4.28 KB, patch)
2019-01-14 17:14 PST, Keith Miller
no flags Details | Formatted Diff | Diff
Patch for landing (5.52 KB, patch)
2019-01-18 18:57 PST, Keith Miller
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Keith Miller 2019-01-14 17:13:37 PST
JSScript API should only take ascii files.
Comment 1 Keith Miller 2019-01-14 17:14:03 PST
Created attachment 359101 [details]
Patch
Comment 2 Alexey Proskuryakov 2019-01-14 18:22:52 PST
Comment on attachment 359101 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=359101&action=review

> Source/JavaScriptCore/ChangeLog:3
> +        JSScript API should only take ascii files.

Why? ASCII is not cool.
Comment 3 Keith Miller 2019-01-14 18:42:36 PST
Comment on attachment 359101 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=359101&action=review

>> Source/JavaScriptCore/ChangeLog:3
>> +        JSScript API should only take ascii files.
> 
> Why? ASCII is not cool.

It's that or UTF16... WTF doesn't support UTF-8 strings. 

I guess it could be Latin1?
Comment 4 Alexey Proskuryakov 2019-01-14 19:28:53 PST
> WTF doesn't support UTF-8 strings.

Which part of WTF doesn't? The code was already using String::fromUTF8WithLatin1Fallback... I would say that we don't want latin-1 fallback in an API, but other than that, not sure what the issue is.
Comment 5 Keith Miller 2019-01-15 10:11:58 PST
(In reply to Alexey Proskuryakov from comment #4)
> > WTF doesn't support UTF-8 strings.
> 
> Which part of WTF doesn't? The code was already using
> String::fromUTF8WithLatin1Fallback... I would say that we don't want latin-1
> fallback in an API, but other than that, not sure what the issue is.

Sorry, I mean without modifying the source data. This API is intended to mmap a file so the source doesn't count against dirty memory. (which it doesn't do yet). StringImpls are either latin1 or UTF-16, AFAIK.
Comment 6 Alexey Proskuryakov 2019-01-15 11:01:54 PST
I see. It seems not great to shape an API based on current low level implementation details.

I would suggest making the API universal, and if it has an optimized path that certain clients can opt into, good for those clients. As you already iterate over all the characters in the source, you could keep it as scriptFromUTF8File, and use mmapped buffer if the check passes.

Alternatively, why not make mmapping client's responsibility? There is enough to worry about that JavaScriptCore is not prepared for in general, such as concurrent writing to the file, or unmounting the file system. JSC can't make a file based API work predictably in those cases, so it seems better leaving to the client to decide what to do there, or to guarantee that they don't care.
Comment 7 Saam Barati 2019-01-15 11:51:08 PST
Why can't we just have a constructor called JSASCIIScript or something similar so the name dictates this property?
Comment 8 Saam Barati 2019-01-15 11:52:24 PST
Comment on attachment 359101 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=359101&action=review

> Source/JavaScriptCore/API/JSScript.mm:98
> +    for (char c : buffer) {
> +        if (!isASCII(c))
> +            return nil;
> +    }

Do we really want to pull the entire file into memory? Why can't we just debug assert this? If we move to properly naming this something like JSASCIIScript...
Comment 9 Keith Miller 2019-01-15 11:56:16 PST
Comment on attachment 359101 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=359101&action=review

>> Source/JavaScriptCore/API/JSScript.mm:98
>> +    }
> 
> Do we really want to pull the entire file into memory? Why can't we just debug assert this? If we move to properly naming this something like JSASCIIScript...

Are you saying that we should rename the class or this constructor? I'm not sure that we only ever want to support ASCII scripts. We could also have a scriptFromUTF16 SPI constructor in the future.

As far as pulling the whole file into memory. There is a FIXME to mmap the contents of the file so it shouldn't count against us. Also, we are going to call a code signing function that is going to hash the entire file for signing anyway.
Comment 10 Saam Barati 2019-01-15 12:26:10 PST
Comment on attachment 359101 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=359101&action=review

>>> Source/JavaScriptCore/API/JSScript.mm:98
>>> +    }
>> 
>> Do we really want to pull the entire file into memory? Why can't we just debug assert this? If we move to properly naming this something like JSASCIIScript...
> 
> Are you saying that we should rename the class or this constructor? I'm not sure that we only ever want to support ASCII scripts. We could also have a scriptFromUTF16 SPI constructor in the future.
> 
> As far as pulling the whole file into memory. There is a FIXME to mmap the contents of the file so it shouldn't count against us. Also, we are going to call a code signing function that is going to hash the entire file for signing anyway.

Good point regarding pulling in all memory.

I think having the name just in this constructor function is fine.
Comment 11 Alexey Proskuryakov 2019-01-15 15:16:52 PST
Comment on attachment 359101 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=359101&action=review

> Source/JavaScriptCore/API/JSScript.h:58
> ++ (nullable instancetype)scriptFromASCIIFile:(NSURL *)filePath inVirtualMachine:(JSVirtualMachine *)vm withCodeSigning:(nullable NSURL *)codeSigningPath andBytecodeCache:(nullable NSURL *)cachePath;

Still looks really bad to me as an API, Saam's r+ notwithstanding.
Comment 12 Keith Miller 2019-01-18 18:57:56 PST
Created attachment 359574 [details]
Patch for landing
Comment 13 WebKit Commit Bot 2019-01-18 19:35:50 PST
Comment on attachment 359574 [details]
Patch for landing

Clearing flags on attachment: 359574

Committed r240194: <https://trac.webkit.org/changeset/240194>
Comment 14 WebKit Commit Bot 2019-01-18 19:35:51 PST
All reviewed patches have been landed.  Closing bug.
Comment 15 Radar WebKit Bug Importer 2019-01-18 19:36:29 PST
<rdar://problem/47404335>
Comment 16 Saam Barati 2019-01-20 12:24:24 PST
(In reply to Alexey Proskuryakov from comment #11)
> Comment on attachment 359101 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=359101&action=review
> 
> > Source/JavaScriptCore/API/JSScript.h:58
> > ++ (nullable instancetype)scriptFromASCIIFile:(NSURL *)filePath inVirtualMachine:(JSVirtualMachine *)vm withCodeSigning:(nullable NSURL *)codeSigningPath andBytecodeCache:(nullable NSURL *)cachePath;
> 
> Still looks really bad to me as an API, Saam's r+ notwithstanding.

Can you propose an alternative SPI?
Comment 17 Keith Miller 2019-01-20 14:17:25 PST
(In reply to Saam Barati from comment #16)
> (In reply to Alexey Proskuryakov from comment #11)
> > Comment on attachment 359101 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=359101&action=review
> > 
> > > Source/JavaScriptCore/API/JSScript.h:58
> > > ++ (nullable instancetype)scriptFromASCIIFile:(NSURL *)filePath inVirtualMachine:(JSVirtualMachine *)vm withCodeSigning:(nullable NSURL *)codeSigningPath andBytecodeCache:(nullable NSURL *)cachePath;
> > 
> > Still looks really bad to me as an API, Saam's r+ notwithstanding.
> 
> Can you propose an alternative SPI?

This SPI was decided upon because we were asked to manage the bytecode cache. Since, we  need to read and write files for that part of the SPI it seems reasonable that the whole SPI would handle files, IMO.
Comment 18 Alexey Proskuryakov 2019-01-20 18:04:19 PST
> Can you propose an alternative SPI?

I did already.

First of all, scriptFromUTF8File was perfectly fine as API, as you could choose to use a memory mapped buffer after checking the content.

Secondly, I recommend against dealing with files, especially for something long lived like mmap. The client that you have in mind today may not care, but future clients may have expectations with regards to wasting open file handles, or to how the process responds to file system unmounting or flakiness. The documentation doesn't even call it out that there will be side effects beyond creating the script.

> This SPI was decided upon because we were asked to manage the bytecode cache. Since, we  need to read and write files for that part of the SPI it seems reasonable that the whole SPI would handle files, IMO.

For the cache, you can have reasonable expectations for file system staying in place, as well as whether files will be deleted underneath you.
Comment 19 Saam Barati 2019-01-20 18:49:49 PST
(In reply to Alexey Proskuryakov from comment #18)
> > Can you propose an alternative SPI?
> 
> I did already.
> 
> First of all, scriptFromUTF8File was perfectly fine as API, as you could
> choose to use a memory mapped buffer after checking the content.

Since this is SPI, I think it makes sense for the SPI to fail instead of silently opt you into using more memory than you expected. When using a lower level SPI like this, we want clients to know when they mess up and opt themselves back into using more memory than they wanted to. The whole point of using this SPI is to use less memory. It feels weird doing something that might cost the client memory when they were not expecting it.

> 
> Secondly, I recommend against dealing with files, especially for something
> long lived like mmap. The client that you have in mind today may not care,
> but future clients may have expectations with regards to wasting open file
> handles, or to how the process responds to file system unmounting or
> flakiness. The documentation doesn't even call it out that there will be
> side effects beyond creating the script.
> 
> > This SPI was decided upon because we were asked to manage the bytecode cache. Since, we  need to read and write files for that part of the SPI it seems reasonable that the whole SPI would handle files, IMO.
> 
> For the cache, you can have reasonable expectations for file system staying
> in place, as well as whether files will be deleted underneath you.
Comment 20 Saam Barati 2019-01-20 18:53:47 PST
(In reply to Alexey Proskuryakov from comment #18)
> > Can you propose an alternative SPI?
> 
> I did already.
I see. Sorry I missed that comment...

> 
> First of all, scriptFromUTF8File was perfectly fine as API, as you could
> choose to use a memory mapped buffer after checking the content.
> 
> Secondly, I recommend against dealing with files, especially for something
> long lived like mmap. The client that you have in mind today may not care,
> but future clients may have expectations with regards to wasting open file
> handles, or to how the process responds to file system unmounting or
> flakiness. The documentation doesn't even call it out that there will be
> side effects beyond creating the script.
> 
> > This SPI was decided upon because we were asked to manage the bytecode cache. Since, we  need to read and write files for that part of the SPI it seems reasonable that the whole SPI would handle files, IMO.
> 
> For the cache, you can have reasonable expectations for file system staying
> in place, as well as whether files will be deleted underneath you.
Comment 21 Saam Barati 2019-01-20 18:55:28 PST
(In reply to Alexey Proskuryakov from comment #6)

> Alternatively, why not make mmapping client's responsibility? There is
> enough to worry about that JavaScriptCore is not prepared for in general,
> such as concurrent writing to the file, or unmounting the file system. JSC
> can't make a file based API work predictably in those cases, so it seems
> better leaving to the client to decide what to do there, or to guarantee
> that they don't care.

I mostly agree with this. I think it'd be nicer if it were the client's responsibility.

The only thing that gives me some pause is if we recommend this to N different clients, and they implement it N different ways, perhaps we should implement it ourselves. But maybe that's a bridge we should cross when we get there.
Comment 22 Alexey Proskuryakov 2019-01-22 10:42:04 PST
> The whole point of using this SPI is to use less memory.

I see. Not sure how huge the files in question are. In any case, it seems like setting up a memory use test would be a better long term way to make sure that memory use doesn't grow. There are certainly many things we can do that increase memory use, and having an SPI guarantee that there is no copy at the very edge doesn't do very much to defend the memory use goal.