Bug 215671

Summary: Implement Request/Response consuming as FormData
Product: WebKit Reporter: Alex Christensen <achristensen>
Component: New BugsAssignee: Alex Christensen <achristensen>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, cdumez, cedric, cmarcelo, darin, ews-watchlist, mike, philip, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch none

Description Alex Christensen 2020-08-19 17:06:37 PDT
Implement Request/Response consuming as FormData
Comment 1 Alex Christensen 2020-08-19 17:14:45 PDT
Created attachment 406890 [details]
Patch
Comment 2 Alex Christensen 2020-08-19 17:15:54 PDT
Comment on attachment 406890 [details]
Patch

Uploading for EWS and feedback, but now is a bad time to land this change.  I'll hold off for a few weeks.
Comment 3 Darin Adler 2020-08-19 17:26:40 PDT
Comment on attachment 406890 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=406890&action=review

> Source/WTF/wtf/text/WTFString.cpp:865
> +    RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(utf16Length <= length);

Adding the "=" was neat. Checking this even in production builds is less to my taste, but maybe that’s just the way the wind is blowing.

> Source/WTF/wtf/unicode/UTF8Conversion.cpp:100
> +        U8_NEXT_OR_FFFD(reinterpret_cast<const uint8_t*>(source), sourceOffset, sourceEnd - source, character);

Is this correct for all callers? Let’s figure out if we have tests for all clients of this function that cover invalid characters. I bet there are some where failure would be better than converting to FFFD, but I am not sure without a test.

Clearly we don’t have test coverage for most uses of this, because the only test cases I see with new behavior are the form data ones.

A far less elegant, but more conservative, change would be to change only the code used for form data, since we don’t have test coverage for the other uses.

> Source/WebCore/Modules/fetch/FetchBody.cpp:242
> +        m_consumer.resolveWithData(WTFMove(promise), reinterpret_cast<const uint8_t*>(sharedBuffer->data()), sharedBuffer->size());

Avoid the reinterpret_cast here by using the dataAsUInt8Ptr function.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:48
> +    String string = String::fromUTF8(data, length);

I wouldn’t bother with a local variable here. Just put this inside the call to the URLParser.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:50
> +    for (auto& pair : WTF::URLParser::parseURLEncodedForm(string))

Surprised that we need the WTF prefix here.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:145
> +        promise->resolve<IDLInterface<DOMFormData>>(formDataFromData(buffer ? reinterpret_cast<const uint8_t*>(buffer->data()) : nullptr, buffer ? buffer->size() : 0).get());

Avoid the reinterpret_cast here by using the dataAsUInt8Ptr function.
Comment 4 Alex Christensen 2020-08-20 17:58:48 PDT
Created attachment 406990 [details]
Patch
Comment 5 Alex Christensen 2020-08-20 18:03:13 PDT
Created attachment 406991 [details]
Patch
Comment 6 Alex Christensen 2020-08-20 21:11:39 PDT
Created attachment 406999 [details]
Patch
Comment 7 Alex Christensen 2020-08-20 23:41:54 PDT
Created attachment 407003 [details]
Patch
Comment 8 Darin Adler 2020-08-21 09:31:35 PDT
Comment on attachment 407003 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=407003&action=review

> Source/WTF/wtf/text/WTFString.cpp:862
> +    constexpr auto function = replaceInvalidSequences ? convertUTF8ToUTF16ReplacingInvalidSequences : convertUTF8ToUTF16;

Not sure the constexpr here has any effect.

> Source/WTF/wtf/text/WTFString.cpp:863
> +    if (!function(stringCurrent, reinterpret_cast<const char *>(stringStart + length), &bufferCurrent, bufferCurrent + buffer.size(), nullptr))

Non-WebKit-code-style space was here in "char *".

> Source/WTF/wtf/unicode/UTF8Conversion.h:48
> +WTF_EXPORT_PRIVATE bool convertUTF8ToUTF16ReplacingInvalidSequences(const char* sourceStart, const char* sourceEnd, UChar** targetStart, UChar* targetEnd, bool* isSourceAllASCII);

What was behind the choice to not include the nullptr default for the last argument?

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:48
> +static bool containsOnlyHTTPQuotedStringTokenCodePoints(const String& string)

Seems like we’d want the argument to be StringView, not String.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:59
> +    for (size_t i = 0; i < string.length(); i++) {
> +        if (!isHTTPQuotedStringTokenCodePoint(string[i]))
> +            return false;
> +    }
> +    return true;

The isAllSpecialCharacters template (which could also be named containsOnly) was designed for use in cases like this so we would not write a loop.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:68
> +static HashMap<String, String> parseParameters(const String& input, size_t position)

Seems like we’d want the argument to be StringView, not String.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:109
> +static Optional<MimeType> parseMIMEType(const String& contentType)

Seems like we’d want the argument to be StringView, not String.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:140
> +#if PLATFORM(WIN)
> +const void* memmem(const void* haystack, size_t haystackLength, const void* needle, size_t needleLength)
> +{
> +    const char* pointer = static_cast<const char*>(haystack);
> +    while (haystackLength >= needleLength) {
> +        if (!memcmp(pointer, needle, needleLength))
> +            return pointer;
> +        pointer++;
> +        haystackLength--;
> +    }
> +    return nullptr;
> +}
> +#endif

In the past we put functions from <string.h> that are missing on Windows in <wtf/StringExtras.h>, and I suggest moving it there. If here, it should be marked static. If there, inline.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:162
> +        size_t contentDispsitionParametersBegin = header.find(';', contentDispositionBegin + strlen(contentDispositionCharacters));

"disposition" is misspelled here.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:166
> +        auto parameters = parseParameters(header.substring(contentDispsitionParametersBegin, contentDispositionEnd - contentDispsitionParametersBegin), 0);

Seems unnecessary to construct a WTF::String and allocate another copy in memory just to parse a substring. This is what StringView is for.
Comment 9 Alex Christensen 2020-08-24 12:17:26 PDT
(In reply to Darin Adler from comment #8)
> > Source/WTF/wtf/unicode/UTF8Conversion.h:48
> > +WTF_EXPORT_PRIVATE bool convertUTF8ToUTF16ReplacingInvalidSequences(const char* sourceStart, const char* sourceEnd, UChar** targetStart, UChar* targetEnd, bool* isSourceAllASCII);
> 
> What was behind the choice to not include the nullptr default for the last
> argument?
It wasn't necessary.  It doesn't hurt, though.

> > Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:109
> > +static Optional<MimeType> parseMIMEType(const String& contentType)
> 
> Seems like we’d want the argument to be StringView, not String.
In this case it's more convenient to have a const String& for stripLeadingAndTrailingHTTPSpaces, and it doesn't make a difference here because we always have a const String&.  parseParameters, though, is significantly better with a StringView.
Comment 10 Alex Christensen 2020-08-24 12:22:20 PDT
Created attachment 407121 [details]
Patch
Comment 11 Alex Christensen 2020-08-24 13:47:06 PDT
String::operator[] has bounds checks that StringView::operator[] does not.  I need to be more careful with StringView.
Comment 12 Alex Christensen 2020-08-24 13:50:52 PDT
Created attachment 407132 [details]
Patch
Comment 13 Alex Christensen 2020-08-24 14:21:15 PDT
Comment on attachment 407132 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=407132&action=review

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:71
> +        if (position >= input.length())

This check was needed.

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:79
> +        if (position >= input.length())

This check is redundant.  Will commit without it.
Comment 14 Alex Christensen 2020-08-24 15:12:35 PDT
(In reply to Alex Christensen from comment #13)
> > Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:79
> > +        if (position >= input.length())
> 
> This check is redundant.  Will commit without it.

Actually, I think it is necessary. cq+.
Comment 15 EWS 2020-08-24 15:36:07 PDT
Committed r266087: <https://trac.webkit.org/changeset/266087>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 407132 [details].
Comment 16 Radar WebKit Bug Importer 2020-08-24 15:37:21 PDT
<rdar://problem/67703327>
Comment 17 Alex Christensen 2020-08-24 18:46:15 PDT
Rebasing iOS test in a minute...
Comment 18 Alex Christensen 2020-08-24 19:04:14 PDT
https://trac.webkit.org/r266098
Comment 19 Alex Christensen 2020-08-25 12:07:58 PDT
Comment on attachment 407132 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=407132&action=review

> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:191
> +        const char* boundary = boundaryWithDashes.utf8().data();
> +        size_t boundaryLength = strlen(boundary);

This is not only using strlen to get the length of a string we just got from a CString and already know the length, but also a use-after-free because the CString returned by utf8() is a temporary that is deallocated after we get a pointer to its contents.  Fixing...
Comment 20 Alex Christensen 2020-08-25 12:09:46 PDT
http://trac.webkit.org/r266140 fixed rdar://problem/67738130
Comment 21 Darin Adler 2020-08-25 12:14:01 PDT
Comment on attachment 407132 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=407132&action=review

>> Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp:191
>> +        size_t boundaryLength = strlen(boundary);
> 
> This is not only using strlen to get the length of a string we just got from a CString and already know the length, but also a use-after-free because the CString returned by utf8() is a temporary that is deallocated after we get a pointer to its contents.  Fixing...

Oops! Should have caught that in review.
Comment 22 Philip Jägenstedt 2020-08-31 02:28:37 PDT
Is this the same as bug 212858? I found my way here from https://github.com/mdn/browser-compat-data/pull/6616.
Comment 23 Alex Christensen 2021-02-02 07:25:14 PST
*** Bug 161190 has been marked as a duplicate of this bug. ***
Comment 24 nyro 2021-11-08 07:19:31 PST
Is it also related to bug 187461 in some way?