Bug 225299 - Constructing a FormData from a form can lead to entries with lone surrogates
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Forms
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-05-03 05:20 PDT by Andreu Botella
Modified: 2021-05-10 05:21 PDT

See Also:


Description Andreu Botella 2021-05-03 05:20:26 PDT
WPT test: https://wpt.fyi/results/html/semantics/forms/form-submission-0/form-data-set-usv.html?label=master&label=experimental&aligned

According to the WebIDL definition for FormData, entry names should be scalar value strings, and so should entry values when they aren't files. However, when a FormData object is constructed from a form, lone surrogates in its controls' names and values will end up in the FormData object's entry list as is. While the IDL bindings restrict incoming values to be USVStrings, meaning that surrogate-containing entry names can't be observed from the API, it is possible to observe entry values with surrogates.
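
To make "scalar value string" concrete, here is a minimal sketch of the conversion that the entry values are missing: every lone surrogate is replaced with U+FFFD, while well-formed surrogate pairs are kept. The helper names (toScalarValueString, hasLoneSurrogate) are hypothetical and are not part of any browser API or of WebKit's code.

```typescript
// Hypothetical helpers for illustration only; not a browser or WebKit API.

function isHighSurrogate(unit: number): boolean {
  return unit >= 0xd800 && unit <= 0xdbff;
}

function isLowSurrogate(unit: number): boolean {
  return unit >= 0xdc00 && unit <= 0xdfff;
}

// Replaces every lone surrogate code unit with U+FFFD and leaves
// well-formed surrogate pairs untouched.
function toScalarValueString(input: string): string {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    const next = i + 1 < input.length ? input.charCodeAt(i + 1) : -1;
    if (isHighSurrogate(unit) && next !== -1 && isLowSurrogate(next)) {
      // Well-formed pair: copy both code units and skip the second one.
      out += input[i] + input[i + 1];
      i++;
    } else if (isHighSurrogate(unit) || isLowSurrogate(unit)) {
      // Lone surrogate: substitute the replacement character.
      out += "\uFFFD";
    } else {
      out += input[i];
    }
  }
  return out;
}

// True when the string is not already a scalar value string.
function hasLoneSurrogate(input: string): boolean {
  return toScalarValueString(input) !== input;
}
```

In a page, a check along the lines of hasLoneSurrogate(value) on a string value read back from a FormData built from a form would be one way to observe the behavior described above; that usage is an assumption about how one might test it, not a description of the WPT test itself.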

In the HTML spec, names and values coming from forms are converted into scalar value strings during entry list construction, in the "append an entry" algorithm, at the same time as newlines are normalized to CRLF. Gecko defers those conversions and normalizations until the form payload is encoded, and so does WebKit, except that in WebKit the USV conversion never seems to happen. The spec's and Gecko's behaviors used to be indistinguishable, until FormData was changed to allow inspection of its entry list from JS, a change whose consequences apparently weren't realized at the time. (See also bug 219086.)

Now in https://github.com/whatwg/html/pull/6624 (together with https://github.com/whatwg/html/pull/6287) we're standardizing on Gecko's and WebKit's behavior of deferring the newline normalization, but we're leaving the USV conversion where it is, in entry list construction, because it wouldn't make much sense to change FormData to work with DOMStrings.
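
As a rough sketch of the split described above, assuming string-only entries (file values are left out): the USV conversion runs when an entry is appended, and the newline normalization runs only when the list is prepared for encoding. The type and function names here (Entry, appendEntry, entriesForEncoding) are made up for illustration and do not mirror the spec text or any engine's implementation.

```typescript
// Illustrative model only; names are hypothetical and File values are omitted.
type Entry = { name: string; value: string };

// Keeps well-formed surrogate pairs, replaces any other surrogate with U+FFFD.
function toUSV(s: string): string {
  return s.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDFFF]/g, (m) =>
    m.length === 2 ? m : "\uFFFD"
  );
}

// Converts lone CR and lone LF to CRLF; existing CRLF sequences are kept.
function normalizeNewlines(s: string): string {
  return s.replace(/\r\n|\r|\n/g, "\r\n");
}

// Entry list construction: USV conversion happens here, so script that
// inspects the list never sees lone surrogates. Newlines are left alone.
function appendEntry(list: Entry[], name: string, value: string): void {
  list.push({ name: toUSV(name), value: toUSV(value) });
}

// Payload encoding: newline normalization is deferred to this step.
function entriesForEncoding(list: readonly Entry[]): Entry[] {
  return list.map((e) => ({
    name: normalizeNewlines(e.name),
    value: normalizeNewlines(e.value),
  }));
}
```

With this split, a value such as "line1\nline2" keeps its bare LF while it sits in the entry list and only becomes "line1\r\nline2" in the copy produced for encoding.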
Comment 1 Radar WebKit Bug Importer 2021-05-10 05:21:12 PDT
<rdar://problem/77740634>