Bug 176659 - Finish off the FormData implementation
Summary: Finish off the FormData implementation
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Sam Weinig
URL: https://xhr.spec.whatwg.org/#interfac...
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2017-09-09 11:19 PDT by Sam Weinig
Modified: 2017-09-27 12:48 PDT (History)
5 users (show)

See Also:


Attachments
Patch (54.21 KB, patch)
2017-09-09 14:53 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (54.71 KB, patch)
2017-09-09 17:11 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (55.45 KB, patch)
2017-09-09 18:19 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (55.35 KB, patch)
2017-09-10 14:43 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sam Weinig 2017-09-09 11:19:12 PDT
Looks like we have a partial implementation of FormData (https://xhr.spec.whatwg.org/#interface-formdata). Pretty easy spec, so let's finish it off.
Comment 1 Sam Weinig 2017-09-09 14:53:51 PDT Comment hidden (obsolete)
Comment 2 Sam Weinig 2017-09-09 17:11:11 PDT Comment hidden (obsolete)
Comment 3 Sam Weinig 2017-09-09 18:19:53 PDT
Created attachment 320370 [details]
Patch
Comment 4 Darin Adler 2017-09-10 12:34:29 PDT
Comment on attachment 320370 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=320370&action=review

Is it OK that all these functions are inefficient if there are a huge numbers of elements, because this is a vector and not an associative data structure?

> Source/WebCore/fileapi/Blob.cpp:82
> +Blob::Blob(const Blob& blob)

Where is this new function used? Given that it seems that a Blob is both reference counted and immutable, I don’t understand when we need to copy one. Nor can I spot it reading the code.

> Source/WebCore/html/DOMFormData.cpp:72
> +}
> +auto DOMFormData::get(const String& name) -> std::optional<FormDataEntryValue>

Missing blank line.

> Source/WebCore/html/DOMFormData.cpp:126
> +    return WTF::KeyValuePair<String, FormDataEntryValue> { item.key(), item.data() };

Should we add a makeKeyValuePair function so we don’t have to write out types in a case like this?

> Source/WebCore/html/DOMFormData.idl:33
> +// FIXME: Should be Exposed=(Window,Worker)

Would be nice to have the comment explain what this is waiting on.

> Source/WebCore/html/FormDataList.cpp:44
> +        auto file = File::create(blob.get(), filename.isNull() ? ASCIILiteral("blob") : filename);
> +        return { key, WTFMove(file) };

Do this as a one-liner? It would only be 4 characters longer than the File::create line.

> Source/WebCore/html/FormDataList.cpp:49
> +        auto file = File::create(downcast<File>(blob.get()), filename);
> +        return { key, WTFMove(file) };

Ditto.

> Source/WebCore/html/FormDataList.cpp:71
> +template<typename EntryMaker>
> +void FormDataList::set(const String& key, const EntryMaker& entryMaker)

Looking at the logic below, it seems that EntryMaker is always called, and there is no ordering issue due to side effects that I can see. Thus, I think this function could take an Item&& instead and would not need to be a template.

> Source/WebCore/html/FormDataList.cpp:73
> +    size_t initialMatchLocation = WTF::notFound;

I suggest we use std::optional for clarity instead of WTF::notFound.

> Source/WebCore/html/FormDataList.cpp:79
> +            m_items[i] = entryMaker();

This would be easier to read if the loop was kept as a pure "findKey" operation.

Could move this one line of code into the if statement below instead of putting it inside the loop. I can’t think of a good reason to have some of work inside the loop and some outside.

> Source/WebCore/html/FormDataList.h:32
>      class Item {

Seems like this should be a struct instead of a class now. Unless we think Variant is so inconvenient to work with that we want to hide it behind "private".

> Source/WebCore/html/FormDataList.h:66
> +    CString normalizeString(const String&) const;

Seems a little strange now that we hold a TextEncoding just so that clients can explicitly call normalizeString. Is m_encoding used for anything else? Is there a more elegant way to arrange things?

> Source/WebCore/html/HTMLKeygenElement.cpp:126
> +    encoded_values.appendData(name(), value);

Could save one little reference count churn if a version that took String&& existed. Probably not worth it.

Does this fix a bug because we use the proper TextEncoding now rather than always UTF-8? Or is this always going to be an ASCII string and so it doesn’t matter what encoding is used?

> Source/WebCore/platform/network/FormData.cpp:209
> +                        String generatedFileName;
> +                        shouldGenerateFile = page->chrome().client().shouldReplaceWithGeneratedFileForUpload(path, generatedFileName);

Should come back and change the ChromeClient function to use a return value instead of an out argument. A boolean plus a String out argument seems out of style in our code these days. Not sure what is ideal, though. Maybe std::optional<String>? Or even just String using null string or empty string to mean false, but we always could have done that.

> Source/WebCore/platform/network/FormData.cpp:231
> +                auto file = WTF::get<RefPtr<File>>(item.data());

I would have written "auto& file = *WTF::get ..." here and had the local variable be a File& instead of a guaranteed non-null RefPtr<File>.

> Source/WebCore/platform/network/FormData.cpp:233
> +                if (!file->path().isEmpty())
> +                    appendFile(file->path(), shouldGenerateFile);

It is a bit awkward that we have to check holds_alternative<RefPtr<File>> and then file->path().isEmpty() again here, and then rely on the fact that the side effect of calling shouldReplaceWithGeneratedFileForUpload is in sync with this code. I wonder if there’s any way to tie the code above to the code down here in a tighter way. Annoying to re-do all the same conditionals after just two lines of unconditional code.

It’s mainly because the append functions have to be in a certain order because they have side effects, I think.

Similar problem with the various isMultipartForm. I think there is some improvement possible here at some point with some additional refactoring.

> Source/WebCore/platform/network/FormData.cpp:237
> +                auto normalizedStingData = list.normalizeString(WTF::get<String>(item.data()));

Typo here: "Sting".

> Source/WebCore/platform/network/FormData.cpp:244
> +            auto normalizedStingData = list.normalizeString(WTF::get<String>(item.data()));

Typo here: "Sting".

> Source/WebCore/platform/network/FormData.cpp:260
> +    auto& e = m_elements.last();

Amazed that you resisted the impulse to rename this lastElement.

> Source/WebCore/platform/network/FormData.cpp:262
>      e.m_data.grow(oldSize + size);

This math seems like it needs an overflow check or an explanation of why it doesn’t need one.

> Source/WebCore/platform/network/FormData.cpp:275
>      }
> +    
> +    return data;

Seems a bit strange to have this one blank line in this function.

> Source/WebCore/platform/network/FormData.cpp:312
> +    for (const auto& element : m_elements) {

I would have just done "auto&"; I don’t think it adds much clarity to state const.

> Source/WebCore/platform/network/FormData.cpp:327
> +    for (const auto& element : m_elements) {

Ditto.

> Source/WebCore/platform/network/FormData.cpp:361
> +    for (const auto& element : m_elements) {

I would have just done auto& here; it will be const either way since this is a const member function, no need to state it explicitly.

> Source/WebCore/platform/network/FormData.cpp:370
> +    for (const auto& element : m_elements) {

Ditto.
Comment 5 Sam Weinig 2017-09-10 13:57:47 PDT
(In reply to Darin Adler from comment #4)
> Comment on attachment 320370 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=320370&action=review
> 
> Is it OK that all these functions are inefficient if there are a huge
> numbers of elements, because this is a vector and not an associative data
> structure?

That's an interesting question, and one I don't have a great answer for. I would guess the use case for this object would remain mostly append, and relatively small number of elements, but that often ends up being a bad assumption.

I'll think a bit more about it, and look at what converting this (and maybe also URLSearchParams which is very similar) to use a ListHashSet. 

> 
> > Source/WebCore/fileapi/Blob.cpp:82
> > +Blob::Blob(const Blob& blob)
> 
> Where is this new function used? Given that it seems that a Blob is both
> reference counted and immutable, I don’t understand when we need to copy
> one. Nor can I spot it reading the code.

This is used from the File constructor that takes a const Blob&. Blob and File are refcounted, these aren't really acting as copy constructors, but rather, this constructor creates a Blob that references the passed in Blob (or at least references it's URL, in the way Blobs do). 

> 
> > Source/WebCore/html/DOMFormData.cpp:72
> > +}
> > +auto DOMFormData::get(const String& name) -> std::optional<FormDataEntryValue>
> 
> Missing blank line.

Will fix.

> 
> > Source/WebCore/html/DOMFormData.cpp:126
> > +    return WTF::KeyValuePair<String, FormDataEntryValue> { item.key(), item.data() };
> 
> Should we add a makeKeyValuePair function so we don’t have to write out
> types in a case like this?

Yes. Will do in a follow up.

> 
> > Source/WebCore/html/DOMFormData.idl:33
> > +// FIXME: Should be Exposed=(Window,Worker)
> 
> Would be nice to have the comment explain what this is waiting on.

Indeed. I'll also file a bug and reference it.

> 
> > Source/WebCore/html/FormDataList.cpp:44
> > +        auto file = File::create(blob.get(), filename.isNull() ? ASCIILiteral("blob") : filename);
> > +        return { key, WTFMove(file) };
> 
> Do this as a one-liner? It would only be 4 characters longer than the
> File::create line.

Will do.

> 
> > Source/WebCore/html/FormDataList.cpp:49
> > +        auto file = File::create(downcast<File>(blob.get()), filename);
> > +        return { key, WTFMove(file) };
> 
> Ditto.


Will do.

> 
> > Source/WebCore/html/FormDataList.cpp:71
> > +template<typename EntryMaker>
> > +void FormDataList::set(const String& key, const EntryMaker& entryMaker)
> 
> Looking at the logic below, it seems that EntryMaker is always called, and
> there is no ordering issue due to side effects that I can see. Thus, I think
> this function could take an Item&& instead and would not need to be a
> template.

Nice. I wonder what I was thinking. 

> 
> > Source/WebCore/html/FormDataList.cpp:73
> > +    size_t initialMatchLocation = WTF::notFound;
> 
> I suggest we use std::optional for clarity instead of WTF::notFound.

Will do.

> 
> > Source/WebCore/html/FormDataList.cpp:79
> > +            m_items[i] = entryMaker();
> 
> This would be easier to read if the loop was kept as a pure "findKey"
> operation.
> 
> Could move this one line of code into the if statement below instead of
> putting it inside the loop. I can’t think of a good reason to have some of
> work inside the loop and some outside.
> 
> > Source/WebCore/html/FormDataList.h:32
> >      class Item {
> 
> Seems like this should be a struct instead of a class now. Unless we think
> Variant is so inconvenient to work with that we want to hide it behind
> "private".
> 
> > Source/WebCore/html/FormDataList.h:66
> > +    CString normalizeString(const String&) const;
> 
> Seems a little strange now that we hold a TextEncoding just so that clients
> can explicitly call normalizeString. Is m_encoding used for anything else?
> Is there a more elegant way to arrange things?

Not sure. I was hoping when I merged DOMFormData and FormDataList a solution would emerge. My guess is we can just pass the encoding to FormData with the DOMFormData, but I'll have to work that out.

> 
> > Source/WebCore/html/HTMLKeygenElement.cpp:126
> > +    encoded_values.appendData(name(), value);
> 
> Could save one little reference count churn if a version that took String&&
> existed. Probably not worth it.
> 
> Does this fix a bug because we use the proper TextEncoding now rather than
> always UTF-8? Or is this always going to be an ASCII string and so it
> doesn’t matter what encoding is used?

It was always ASCII (a base-64 encoded string), so it didn't matter.

> 
> > Source/WebCore/platform/network/FormData.cpp:209
> > +                        String generatedFileName;
> > +                        shouldGenerateFile = page->chrome().client().shouldReplaceWithGeneratedFileForUpload(path, generatedFileName);
> 
> Should come back and change the ChromeClient function to use a return value
> instead of an out argument. A boolean plus a String out argument seems out
> of style in our code these days. Not sure what is ideal, though. Maybe
> std::optional<String>? Or even just String using null string or empty string
> to mean false, but we always could have done that.

Totally. I need to change all this so it can work with Workers, so I tried not to touch it in this round.

> 
> > Source/WebCore/platform/network/FormData.cpp:231
> > +                auto file = WTF::get<RefPtr<File>>(item.data());
> 
> I would have written "auto& file = *WTF::get ..." here and had the local
> variable be a File& instead of a guaranteed non-null RefPtr<File>.

Good call.

> 
> > Source/WebCore/platform/network/FormData.cpp:233
> > +                if (!file->path().isEmpty())
> > +                    appendFile(file->path(), shouldGenerateFile);
> 
> It is a bit awkward that we have to check holds_alternative<RefPtr<File>>
> and then file->path().isEmpty() again here, and then rely on the fact that
> the side effect of calling shouldReplaceWithGeneratedFileForUpload is in
> sync with this code. I wonder if there’s any way to tie the code above to
> the code down here in a tighter way. Annoying to re-do all the same
> conditionals after just two lines of unconditional code.
> 
> It’s mainly because the append functions have to be in a certain order
> because they have side effects, I think.
> 
> Similar problem with the various isMultipartForm. I think there is some
> improvement possible here at some point with some additional refactoring.

Yeah, this code is asking for a bath :).

> 
> > Source/WebCore/platform/network/FormData.cpp:237
> > +                auto normalizedStingData = list.normalizeString(WTF::get<String>(item.data()));
> 
> Typo here: "Sting".

Will fix.

> 
> > Source/WebCore/platform/network/FormData.cpp:244
> > +            auto normalizedStingData = list.normalizeString(WTF::get<String>(item.data()));
> 
> Typo here: "Sting".
> 
> > Source/WebCore/platform/network/FormData.cpp:260
> > +    auto& e = m_elements.last();
> 
> Amazed that you resisted the impulse to rename this lastElement.

Me too. I renamed so many other single letter variables.

> 
> > Source/WebCore/platform/network/FormData.cpp:262
> >      e.m_data.grow(oldSize + size);
> 
> This math seems like it needs an overflow check or an explanation of why it
> doesn’t need one.

Hm. I will investigate it.

> 
> > Source/WebCore/platform/network/FormData.cpp:275
> >      }
> > +    
> > +    return data;
> 
> Seems a bit strange to have this one blank line in this function.

Will fix.

> 
> > Source/WebCore/platform/network/FormData.cpp:312
> > +    for (const auto& element : m_elements) {
> 
> I would have just done "auto&"; I don’t think it adds much clarity to state
> const.
> 
> > Source/WebCore/platform/network/FormData.cpp:327
> > +    for (const auto& element : m_elements) {
> 
> Ditto.
> 
> > Source/WebCore/platform/network/FormData.cpp:361
> > +    for (const auto& element : m_elements) {
> 
> I would have just done auto& here; it will be const either way since this is
> a const member function, no need to state it explicitly.
> 
> > Source/WebCore/platform/network/FormData.cpp:370
> > +    for (const auto& element : m_elements) {
> 
> Ditto.

Will fix.


Thanks so much for the review.
Comment 6 Darin Adler 2017-09-10 14:15:47 PDT
Comment on attachment 320370 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=320370&action=review

>>> Source/WebCore/fileapi/Blob.cpp:82
>>> +Blob::Blob(const Blob& blob)
>> 
>> Where is this new function used? Given that it seems that a Blob is both reference counted and immutable, I don’t understand when we need to copy one. Nor can I spot it reading the code.
> 
> This is used from the File constructor that takes a const Blob&. Blob and File are refcounted, these aren't really acting as copy constructors, but rather, this constructor creates a Blob that references the passed in Blob (or at least references it's URL, in the way Blobs do).

I still don’t understand. Why can’t we just use the original Blob in that case instead of making a new one? Do we need separate sets of custom properties when it’s web-exposed? Or some other reason?

>> Source/WebCore/html/FormDataList.h:32
>>      class Item {
> 
> Seems like this should be a struct instead of a class now. Unless we think Variant is so inconvenient to work with that we want to hide it behind "private".

I noticed you did not respond to this one in either the negative or the affirmative. I hope that’s because you decided to do it and thought it was so obvious it did not even need to be mentioned.
Comment 7 Darin Adler 2017-09-10 14:17:21 PDT
(In reply to Sam Weinig from comment #5)
> I'll think a bit more about it, and look at what converting this (and maybe
> also URLSearchParams which is very similar) to use a ListHashSet. 

Sadly it seems this requires support for duplicate keys, and so I suspect ListHashSet won’t be usable as-is.
Comment 8 Sam Weinig 2017-09-10 14:43:12 PDT
Created attachment 320402 [details]
Patch
Comment 9 Sam Weinig 2017-09-10 14:50:53 PDT
(In reply to Darin Adler from comment #7)
> (In reply to Sam Weinig from comment #5)
> > I'll think a bit more about it, and look at what converting this (and maybe
> > also URLSearchParams which is very similar) to use a ListHashSet. 
> 
> Sadly it seems this requires support for duplicate keys, and so I suspect
> ListHashSet won’t be usable as-is.

Oh darn, good point.
Comment 10 Sam Weinig 2017-09-10 15:07:59 PDT
(In reply to Darin Adler from comment #6)
> Comment on attachment 320370 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=320370&action=review
> 
> >>> Source/WebCore/fileapi/Blob.cpp:82
> >>> +Blob::Blob(const Blob& blob)
> >> 
> >> Where is this new function used? Given that it seems that a Blob is both reference counted and immutable, I don’t understand when we need to copy one. Nor can I spot it reading the code.
> > 
> > This is used from the File constructor that takes a const Blob&. Blob and File are refcounted, these aren't really acting as copy constructors, but rather, this constructor creates a Blob that references the passed in Blob (or at least references it's URL, in the way Blobs do).
> 
> I still don’t understand. Why can’t we just use the original Blob in that
> case instead of making a new one? Do we need separate sets of custom
> properties when it’s web-exposed? Or some other reason?

Oh. Yeah, we need to convert the Blob into a File per https://xhr.spec.whatwg.org/#create-an-entry (specifically step 3 "If value is a Blob object and not a File object, then set value to a new File object, representing the same bytes, whose name attribute value is "blob"."). This is web visible, since you can append a Blob, but then when you go an inspect what you just appended, it needs to be a file.  

> 
> >> Source/WebCore/html/FormDataList.h:32
> >>      class Item {
> > 
> > Seems like this should be a struct instead of a class now. Unless we think Variant is so inconvenient to work with that we want to hide it behind "private".
> 
> I noticed you did not respond to this one in either the negative or the
> affirmative. I hope that’s because you decided to do it and thought it was
> so obvious it did not even need to be mentioned.

Oops, sorry. Yeah, I made it a struct.
Comment 11 Sam Weinig 2017-09-10 15:57:01 PDT
Committed r221839: <http://trac.webkit.org/changeset/221839>
Comment 12 Michael Catanzaro 2017-09-11 14:47:11 PDT
Hey Sam:

[2201/4301] Building CXX object Source...keFiles/WebCore.dir/fileapi/Blob.cpp.o
../../Source/WebCore/fileapi/Blob.cpp: In copy constructor ‘WebCore::Blob::Blob(const WebCore::Blob&)’:
../../Source/WebCore/fileapi/Blob.cpp:82:1: warning: base class ‘class WTF::RefCounted<WebCore::Blob>’ should be explicitly initialized in the copy constructor [-Wextra]
 Blob::Blob(const Blob& blob)
 ^~~~
Comment 13 Sam Weinig 2017-09-11 15:07:25 PDT
(In reply to Michael Catanzaro from comment #12)
> Hey Sam:
> 
> [2201/4301] Building CXX object
> Source...keFiles/WebCore.dir/fileapi/Blob.cpp.o
> ../../Source/WebCore/fileapi/Blob.cpp: In copy constructor
> ‘WebCore::Blob::Blob(const WebCore::Blob&)’:
> ../../Source/WebCore/fileapi/Blob.cpp:82:1: warning: base class ‘class
> WTF::RefCounted<WebCore::Blob>’ should be explicitly initialized in the copy
> constructor [-Wextra]
>  Blob::Blob(const Blob& blob)
>  ^~~~

What compiler / port is that?
Comment 14 Michael Catanzaro 2017-09-11 16:02:16 PDT
GCC 7.1.1

It's WebKitGTK+, but that shouldn't matter....
Comment 15 Michael Catanzaro 2017-09-11 16:13:08 PDT
So I think the problem is that Blob and File are supposed to be noncopyable, because they are RefCounted, but you have defined copy constructors for them. I won't roll this out, but reopening for follow-up.
Comment 16 Sam Weinig 2017-09-12 09:48:37 PDT
(In reply to Michael Catanzaro from comment #15)
> So I think the problem is that Blob and File are supposed to be noncopyable,
> because they are RefCounted, but you have defined copy constructors for
> them. I won't roll this out, but reopening for follow-up.

Thanks. I'll post a fix shortly.

Separately, it would be good to get this warning enabled on one (or as many as support it) of the EWS bots.
Comment 17 Sam Weinig 2017-09-12 13:03:59 PDT
Fix landed in r221936.
Comment 18 Sam Weinig 2017-09-12 16:47:05 PDT
Let me know if its still an issue.
Comment 19 Radar WebKit Bug Importer 2017-09-27 12:48:24 PDT
<rdar://problem/34694059>