Bug 225299

Summary: Constructing a FormData from a form can lead to entries with lone surrogates
Product: WebKit Reporter: Andreu Botella <andreu>
Component: FormsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, cdumez, changseok, darin, esprehn+autocc, ews-watchlist, gyuyoung.kim, mehmetgelisin, mifenton, webkit-bug-importer, wenson_hsieh
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
ews-feeder: commit-queue-
Patch none

Description Andreu Botella 2021-05-03 05:20:26 PDT
WPT test: https://wpt.fyi/results/html/semantics/forms/form-submission-0/form-data-set-usv.html?label=master&label=experimental&aligned

According to the WebIDL definition for FormData, entry names should be scalar value strings, and so should entry values when they aren't files. However, when a FormData object is constructed from a form, lone surrogates in its controls' names and values will end up in the FormData object's entry list as is. While the IDL bindings restrict incoming values to be USVStrings, meaning that surrogate-containing entry names can't be observed from the API, it is possible to observe entry values with surrogates.

In the HTML spec, the conversion into scalar value strings of names and values coming from forms happens during the entry list construction, in the "append an entry" algorithm, at the same time as newlines are normalized to CRLF. Gecko defers those conversions and normalizations until the form payload is encoded, and so does WebKit, except that the USV conversion never seems to happen. The spec and Gecko's behaviors used to be indistinguishable, until FormData was changed to allow inspection of its entry list from JS, whose consequences apparently weren't realized at the time. (See also bug 219086.)

Now in https://github.com/whatwg/html/pull/6624 (together with https://github.com/whatwg/html/pull/6287) we're standardizing on Gecko's and WebKit's behavior of deferring the newline normalization, but we're leaving the USV conversion because it wouldn't make much sense to change FormData to work with DOMStrings.
Comment 1 Radar WebKit Bug Importer 2021-05-10 05:21:12 PDT
<rdar://problem/77740634>
Comment 3 Andreu Botella 2021-11-25 11:27:27 PST
With https://github.com/whatwg/html/pull/7371, the fact that entry names and values which are not File objects are scalar value strings will now become an invariant of form entries.
Comment 4 Andreu Botella 2021-11-29 14:56:57 PST
Created attachment 445352 [details]
Patch
Comment 5 Chris Dumez 2021-11-30 08:14:41 PST
Comment on attachment 445352 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=445352&action=review

r=me with comments.

> Source/WebCore/html/DOMFormData.cpp:81
> +    return { usvName, usvValue };

We could write this in a more concise way:

return {
    replaceUnpairedSurrogatesWithReplacementCharacter(String { usvName }),
    replaceUnpairedSurrogatesWithReplacementCharacter(String { usvValue })
};

> Source/WebCore/html/DOMFormData.cpp:88
> +    usvName = replaceUnpairedSurrogatesWithReplacementCharacter(WTFMove(usvName));

I think this would look a bit nicer:
auto usvName = replaceUnpairedSurrogatesWithReplacementCharacter(String { usvName });

> Source/WebCore/html/DOMFormData.h:83
> +    Item createStringEntry(const String& name, const String& value);

It seems this can be static. Also, since it is private, no need to declare it in the header.
Comment 6 Andreu Botella 2021-11-30 08:58:05 PST
Comment on attachment 445352 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=445352&action=review

>> Source/WebCore/html/DOMFormData.cpp:81
>> +    return { usvName, usvValue };
> 
> We could write this in a more concise way:
> 
> return {
>     replaceUnpairedSurrogatesWithReplacementCharacter(String { usvName }),
>     replaceUnpairedSurrogatesWithReplacementCharacter(String { usvValue })
> };

I still don't fully understand how && and std::move work, and I didn't realize you could write it that way.

>> Source/WebCore/html/DOMFormData.h:83
>> +    Item createStringEntry(const String& name, const String& value);
> 
> It seems this can be static. Also, since it is private, no need to declare it in the header.

I'll also be removing createFileEntry from the header then, for consistency.
Comment 7 Chris Dumez 2021-11-30 08:58:59 PST
(In reply to Andreu Botella from comment #6)
> Comment on attachment 445352 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=445352&action=review
> 
> >> Source/WebCore/html/DOMFormData.cpp:81
> >> +    return { usvName, usvValue };
> > 
> > We could write this in a more concise way:
> > 
> > return {
> >     replaceUnpairedSurrogatesWithReplacementCharacter(String { usvName }),
> >     replaceUnpairedSurrogatesWithReplacementCharacter(String { usvValue })
> > };
> 
> I still don't fully understand how && and std::move work, and I didn't
> realize you could write it that way.
> 
> >> Source/WebCore/html/DOMFormData.h:83
> >> +    Item createStringEntry(const String& name, const String& value);
> > 
> > It seems this can be static. Also, since it is private, no need to declare it in the header.
> 
> I'll also be removing createFileEntry from the header then, for consistency.

Yes, please.
Comment 8 Andreu Botella 2021-11-30 09:03:14 PST
Created attachment 445423 [details]
Patch
Comment 9 Chris Dumez 2021-11-30 09:05:40 PST
Comment on attachment 445423 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=445423&action=review

> Source/WebCore/html/DOMFormData.cpp:73
> +auto createStringEntry(const String& name, const String& value) -> DOMFormData::Item

Please mark as static

> Source/WebCore/html/DOMFormData.cpp:82
> +auto createFileEntry(const String& name, Blob& blob, const String& filename) -> DOMFormData::Item

ditto.
Comment 10 Andreu Botella 2021-11-30 09:18:36 PST
Created attachment 445426 [details]
Patch
Comment 11 EWS 2021-11-30 11:34:48 PST
Committed r286310 (244669@main): <https://commits.webkit.org/244669@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 445426 [details].
Comment 12 Darin Adler 2021-11-30 11:59:22 PST
Comment on attachment 445426 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=445426&action=review

> Source/WebCore/html/DOMFormData.cpp:87
> +        return { usvName, File::create(blob.scriptExecutionContext(), blob, filename.isNull() ? "blob"_s : filename) };

I believe we can slightly reduce reference count churn by using WTFMove(usvName) here.

> Source/WebCore/html/DOMFormData.cpp:90
> +        return { usvName, File::create(blob.scriptExecutionContext(), downcast<File>(blob), filename) };

Ditto.

> Source/WebCore/html/DOMFormData.cpp:92
> +    return { usvName, RefPtr<File> { &downcast<File>(blob) } };

Ditto.

One other thought: I think we can just write RefPtr instead of RefPtr<File>.