Bug 200102 - Move FormData zip file generation to NetworkProcess and enable it for all WebKit clients for uploading directories
Summary: Move FormData zip file generation to NetworkProcess and enable it for all Web...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Alex Christensen
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-07-24 17:16 PDT by Alex Christensen
Modified: 2019-08-04 11:20 PDT (History)
11 users (show)

See Also:


Attachments
Patch (52.55 KB, patch)
2019-07-24 17:32 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (64.23 KB, patch)
2019-07-31 14:49 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (61.79 KB, patch)
2019-07-31 14:53 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff
Patch (69.49 KB, patch)
2019-07-31 22:43 PDT, Alex Christensen
commit-queue: commit-queue-
Details | Formatted Diff | Diff
patch without oops (69.49 KB, patch)
2019-08-01 07:15 PDT, Alex Christensen
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alex Christensen 2019-07-24 17:16:24 PDT
Move FormData zip file generation to NetworkProcess and enable it for all WebKit clients for uploading directories
Comment 1 Alex Christensen 2019-07-24 17:32:42 PDT
Created attachment 374838 [details]
Patch
Comment 2 Alex Christensen 2019-07-31 14:49:59 PDT
Created attachment 375251 [details]
Patch
Comment 3 Alex Christensen 2019-07-31 14:53:53 PDT
Created attachment 375253 [details]
Patch
Comment 4 Brent Fulgham 2019-07-31 15:22:39 PDT
Comment on attachment 375253 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=375253&action=review

> Source/WebCore/ChangeLog:17
> +        Since we no longer need to do these file operations in the WebProcess, I am also reverting r245322 and r246077 which tightens the sandbox.

So much wonderful code deletion!

> Source/WebCore/platform/network/mac/BlobDataFileReferenceMac.mm:53
>  namespace WebCore {

This file should probably be renamed BlobDataFileReferenceCocoa.mm (it's used by iOS, too, right?)

> Source/WebKit/WebProcess/com.apple.WebProcess.sb.in:-591
> -    (global-name "com.apple.FileCoordination"))

Good!

> Source/WebKit/WebProcess/com.apple.WebProcess.sb.in:-983
> -        (syscall-number SYS_getgroups)

Better!

> Source/WebKit/WebProcess/com.apple.WebProcess.sb.in:-992
> -        (syscall-number SYS_fstat64_extended)

Great!
Comment 5 Alex Christensen 2019-07-31 17:23:14 PDT
(In reply to Brent Fulgham from comment #4)
> Comment on attachment 375253 [details]
> Patch
> This file should probably be renamed BlobDataFileReferenceCocoa.mm (it's
> used by iOS, too, right?)
I propose doing this in a followup patch. There may come a day when this is needed to merge to a branch, and I'd like to minimize file renaming here.

r=you, Brent?
Comment 6 Darin Adler 2019-07-31 18:17:15 PDT
Comment on attachment 375253 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=375253&action=review

Looks great to me. Not sure how well we cover test cases; I have a few ideas about refinements.

> Source/WebCore/platform/network/FormData.cpp:368
> +            if (fileModificationTime->secondsSinceEpoch().secondsAs<time_t>() != fileData->expectedFileModificationTime->secondsSinceEpoch().secondsAs<time_t>())

I see that this code is similar (copied/pasted?) to code we already have. But I do have a question about it. Why is it correct to convert/round both of these values to time_t before comparing them? Is it important to ignore insignificant differences? I assume that on most Unix-like systems what’s actually stored in the file system is a time_t, so if we don’t convert to a time_t there is some risk that two matching times might compare as not equal, perhaps. But if that’s right, then is this platform-specific in some way? Seems like we might need to write a helper function so we can have a place to put a comment explaining the technique.

> Source/WebCore/platform/network/FormData.cpp:379
> +    return { *this, WTFMove(generatedFiles) };

Since the filenames are all stored in the EncodedFileData, kind of seems like we could just have a flag that says "this one was generated" and avoid having to have a separate vector of strings.

> Source/WebCore/platform/network/FormData.h:205
> +    WEBCORE_EXPORT std::pair<Ref<FormData>, Vector<String>> generateFilesForUpload();

We have three choices here: std::pair, std::tuple, custom struct with named elements. Is std::pair the best of the three?

Also, I like the idea of a GeneratedFile object which holds a path name and which deletes the file as a side effect when it’s deleted. We could keep moving it around from place to place, but then when it’s deleted, so is the file. Maybe too subtle? I think it makes it less likely we might forget to delete a file in some code path, but more likely that the delete could happen on some strange thread? If the only reason for the vector is to remember to delete the files, then we could even make the string private.

> Source/WebCore/platform/network/FormData.h:295
> +String generateFileForUpload(const String&);

This function is a bit mysterious. I think this takes a directory and turns it into a file. But that’s not clear from the name, nor is it clear what the argument and return value represent.

> Source/WebCore/platform/network/mac/BlobDataFileReferenceMac.mm:31
> +#include "FormData.h"

I don’t see any use of FormData in this file. What did I miss?

> Source/WebCore/platform/network/mac/BlobDataFileReferenceMac.mm:61
>          // The archive is put into a subdirectory of temporary directory for historic reasons. Changing this will require WebCore to change at the same time.

This comment is peculiar. The comment is in WebCore (probably was elsewhere at some point) but says "will require WebCore to change".

I wonder what these historic reasons are.

> Source/WebCore/platform/network/mac/BlobDataFileReferenceMac.mm:91
> +        m_replacementPath = WTFMove(generatedFile);

Not obvious to me where this file is deleted, but I suppose it’s nothing new, just not something could quickly see from looking over the code.
Comment 7 Alex Christensen 2019-07-31 21:31:36 PDT
Comment on attachment 375253 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=375253&action=review

>> Source/WebCore/platform/network/FormData.cpp:368
>> +            if (fileModificationTime->secondsSinceEpoch().secondsAs<time_t>() != fileData->expectedFileModificationTime->secondsSinceEpoch().secondsAs<time_t>())
> 
> I see that this code is similar (copied/pasted?) to code we already have. But I do have a question about it. Why is it correct to convert/round both of these values to time_t before comparing them? Is it important to ignore insignificant differences? I assume that on most Unix-like systems what’s actually stored in the file system is a time_t, so if we don’t convert to a time_t there is some risk that two matching times might compare as not equal, perhaps. But if that’s right, then is this platform-specific in some way? Seems like we might need to write a helper function so we can have a place to put a comment explaining the technique.

I'll move them to a shared helper function, but I'd rather match existing behavior in this patch. I think the intent is that if the file has been written to since we said to upload it, then don't upload it. I don't think accuracy of time systems will change that significantly, but I see no reason to change it.

>> Source/WebCore/platform/network/FormData.cpp:379
>> +    return { *this, WTFMove(generatedFiles) };
> 
> Since the filenames are all stored in the EncodedFileData, kind of seems like we could just have a flag that says "this one was generated" and avoid having to have a separate vector of strings.

That led to confusion about what to do when encoding and decoding the flag (which is now only created and used in the NetworkProcess, but the rest of the members are serialized) and when creating a FormDataElement (should it be generated or not? what does generated mean?). I intentionally removed the flag because it's something that's only used when uploading.

>> Source/WebCore/platform/network/FormData.h:205
>> +    WEBCORE_EXPORT std::pair<Ref<FormData>, Vector<String>> generateFilesForUpload();
> 
> We have three choices here: std::pair, std::tuple, custom struct with named elements. Is std::pair the best of the three?
> 
> Also, I like the idea of a GeneratedFile object which holds a path name and which deletes the file as a side effect when it’s deleted. We could keep moving it around from place to place, but then when it’s deleted, so is the file. Maybe too subtle? I think it makes it less likely we might forget to delete a file in some code path, but more likely that the delete could happen on some strange thread? If the only reason for the vector is to remember to delete the files, then we could even make the string private.

I'll make a struct. I like named things.

>> Source/WebCore/platform/network/FormData.h:295
>> +String generateFileForUpload(const String&);
> 
> This function is a bit mysterious. I think this takes a directory and turns it into a file. But that’s not clear from the name, nor is it clear what the argument and return value represent.

I'm planning to rename it to createZipArchive and put it in FileSystem.h.

>> Source/WebCore/platform/network/mac/BlobDataFileReferenceMac.mm:31
>> +#include "FormData.h"
> 
> I don’t see any use of FormData in this file. What did I miss?

The declaration for generateFileForUpload is currently in FormData.h. This is a bad place for that code, but I was trying to minimize changes. I'll move it to FileSystem.h instead.

>> Source/WebCore/platform/network/mac/BlobDataFileReferenceMac.mm:61
>>          // The archive is put into a subdirectory of temporary directory for historic reasons. Changing this will require WebCore to change at the same time.
> 
> This comment is peculiar. The comment is in WebCore (probably was elsewhere at some point) but says "will require WebCore to change".
> 
> I wonder what these historic reasons are.

This was copied from a function named createZipArchive in Safari.

>> Source/WebCore/platform/network/mac/BlobDataFileReferenceMac.mm:91
>> +        m_replacementPath = WTFMove(generatedFile);
> 
> Not obvious to me where this file is deleted, but I suppose it’s nothing new, just not something could quickly see from looking over the code.

This is existing behavior. It's deleted in ~BlobDataFileReference
Comment 8 Alex Christensen 2019-07-31 21:43:02 PDT
Comment on attachment 375253 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=375253&action=review

>>> Source/WebCore/platform/network/FormData.h:295
>>> +String generateFileForUpload(const String&);
>> 
>> This function is a bit mysterious. I think this takes a directory and turns it into a file. But that’s not clear from the name, nor is it clear what the argument and return value represent.
> 
> I'm planning to rename it to createZipArchive and put it in FileSystem.h.

createTemporaryZipArchive is more accurate.
Comment 9 Alex Christensen 2019-07-31 22:43:05 PDT
Created attachment 375293 [details]
Patch
Comment 10 WebKit Commit Bot 2019-08-01 06:37:31 PDT
Comment on attachment 375293 [details]
Patch

Rejecting attachment 375293 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-03', 'validate-changelog', '--check-oops', '--non-interactive', 375293, '--port=mac']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

ChangeLog entry in Source/WTF/ChangeLog contains OOPS!.

Full output: https://webkit-queues.webkit.org/results/12846584
Comment 11 Alex Christensen 2019-08-01 07:15:30 PDT
Created attachment 375308 [details]
patch without oops
Comment 12 WebKit Commit Bot 2019-08-01 15:29:59 PDT
Comment on attachment 375308 [details]
patch without oops

Clearing flags on attachment: 375308

Committed r248139: <https://trac.webkit.org/changeset/248139>
Comment 13 WebKit Commit Bot 2019-08-01 15:30:01 PDT
All reviewed patches have been landed.  Closing bug.
Comment 14 Radar WebKit Bug Importer 2019-08-01 15:33:09 PDT
<rdar://problem/53835700>
Comment 16 Alex Christensen 2019-08-02 11:24:41 PDT
That was supposed to be removed as part of reverting r245322 with this.  I'll fix it after verifying the fix.
Comment 17 Darin Adler 2019-08-04 11:20:52 PDT
Comment on attachment 375308 [details]
patch without oops

View in context: https://bugs.webkit.org/attachment.cgi?id=375308&action=review

> Source/WTF/wtf/cocoa/FileSystemCocoa.mm:88
> +            temporaryFile = String::fromUTF8(archivePath);

Just realized this should use stringFromFileSystemRepresentation instead of String::fromUTF8. This is probably a frequent error in our existing file system code. Or we could get rid of stringFromFileSystemRepresentation if it’s always just the same thing and isn’t needed?