Bug 20795

Summary: text/plain form encoding ignored and incorrectly specified in request header
Product: WebKit Reporter: Carsten Orthbandt <carsten.orthbandt>
Component: FormsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, code.vineet, darin, dglazkov, japhet, ossy, philip.tellis, rniwa, tkent, webkit.review.bot
Priority: P2    
Version: 525.x (Safari 3.1)   
Hardware: All   
OS: All   
URL: http://carsten.orthbandt.de/formtest.html
Attachments:
Description Flags
Patch that handles enctype="text/plain" correctly
none
Patch that handles enctype="text/plain" correctly
ap: review-
initial approach to fix this issue.
none
updated patch
darin: review-
updated_patch
darin: review+, webkit.review.bot: commit-queue-
updated_patch none

Description Carsten Orthbandt 2008-09-12 07:46:24 PDT
Set up a HTML form like this:

		<form action="" method="post" enctype="text/plain">
			test1:<input name="test1" size="10">
			test2:<input name="test2" size="10">
			<input type="submit">
		</form>

The request header will say:
Content-Type: text/plain; boundary=
but the actual content is encoded as "application/x-www-form-urlencoded".

It's bad enough that the form enctype parameter is more or less ignored, but specifying a different encoding in the header than is used in the body is a serious problem.

Yes, I can change my forms not to use "text/plain". But it's still a bug.
Comment 1 Alexey Proskuryakov 2008-09-13 08:04:57 PDT
Confirmed with r35843 on Mac OS X 10.4.11 as a difference with Firefox.

Note however that text/plain form serialization is not defined in HTML4 (http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4), and is not yet defined in HTML5 (http://www.w3.org/html/wg/html5/#textplain).
Comment 2 Carsten Orthbandt 2008-09-13 08:46:55 PDT
(In reply to comment #1)
> Confirmed with r35843 on Mac OS X 10.4.11 as a difference with Firefox.
> Note however that text/plain form serialization is not defined in HTML4
> (http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4), and is not yet
> defined in HTML5 (http://www.w3.org/html/wg/html5/#textplain).

Well, especially the HTML5 doc _does_ list text/plain as a valid form encoding (which is a change compared to HTML4). Calling it "not defined" looks like an excuse though, since none of the other encodings are defined there either (yet).
While surprisingly there seems to be no official standard on name/value encoding in text/plain, all browsers I've used back to stone-age Mosaic used the same format.
So I'd expect the upcoming definition to match exactly that.

Comment 3 Alexey Proskuryakov 2010-05-25 13:44:54 PDT
HTML5 now defines this, <http://dev.w3.org/html5/spec/Overview.html#text-plain-encoding-algorithm>.
Comment 4 Philip Tellis 2010-11-09 02:18:26 PST
I've written a patch to fix this bug, but I don't know how to write a regression test for it.  How do I go about testing what a form would POST to the server without actually having a server side script?
Comment 5 Philip Tellis 2010-11-09 02:45:47 PST
Created attachment 73350 [details]
Patch that handles enctype="text/plain" correctly

The attached patch changes the behaviour of form submissions when the enctype is set to text/plain and the action is a non mailto: URL
Comment 6 Alexey Proskuryakov 2010-11-09 08:30:58 PST
> How do I go about testing what a form would POST to the server without actually having a server side
> script?

Actually, you can have a server side script - tests in LayoutTests/http/tests are served by Apache.
Comment 7 Philip Tellis 2010-11-09 11:52:29 PST
Thanks, will work on the tests today then.
Comment 8 Alexey Proskuryakov 2010-11-09 17:14:43 PST
Thanks! 

Apache is launched automatically by run-webkit-tests, or you could use the run-webkit-httpd script for debugging in Safari.
Comment 9 Philip Tellis 2010-11-09 19:52:44 PST
Created attachment 73453 [details]
Patch that handles enctype="text/plain" correctly

Updated patch that also includes test cases, and handles GET requests correctly.
Comment 10 Alexey Proskuryakov 2010-11-10 11:37:43 PST
Comment on attachment 73453 [details]
Patch that handles enctype="text/plain" correctly

View in context: https://bugs.webkit.org/attachment.cgi?id=73453&action=review

Please clarify what exactly this patch implements, and add test coverage. Besides what is mentioned below, there should be a test with non-ASCII data.

> WebCore/ChangeLog:13
> +        Implementation algorithm defined here:
> +        http://dev.w3.org/html5/spec/Overview.html#text-plain-encoding-algorithm

You reference the HTML5 algorithm, but is it fully implemented? We don't implement accept-charset, and we don't implement _charset_ entries. There is no test for sending a file input as text/plain.

> WebCore/loader/FormSubmission.cpp:168
> +        formData = FormData::create(*(static_cast<FormDataList*>(domFormData.get())), domFormData->encoding(), isMailtoForm || attributes.method() == GetMethod ? "application/x-www-form-urlencoded" : encodingType);

This doesn't seem to match what HTML5 says. GET/mailto is always "application/x-www-form-urlencoded", but POST/mailto uses "appropriate form encoding algorithm".

> WebCore/platform/network/FormData.h:98
> +    static PassRefPtr<FormData> create(const FormDataList&, const TextEncoding&, const String encodingType = "application/x-www-form-urlencoded");

The encoding type definitely shouldn't be passed as string copy. Maybe String reference, maybe an AtomicString, or maybe an enum constant. The constant is probably most appropriate.

> WebCore/platform/network/FormDataBuilder.cpp:190
> +        // http://dev.w3.org/html5/spec/Overview.html#text-plain-encoding-algorithm

I don't think that a link to an interim version of HTML5 is useful. That's where one would look by default anyway.

> LayoutTests/http/tests/misc/form-post-textplain.html:13
> +    <input type="hidden" name="f1" value="This is field #1 &!@$%'<>">
> +    <input type="hidden" name="f2" value='This is field #2 ""'>

Interesting characters to test should include '=' and newlines.
Comment 11 Philip Tellis 2010-11-10 11:48:10 PST
Thanks for the comments, will fix what I can.  As far as I know, this has been a standard as far back as HTML 3.0, but I can't find a reference to the algorithm in the specifications.
Comment 12 Kent Tamura 2011-10-23 18:58:45 PDT
Comment on attachment 73453 [details]
Patch that handles enctype="text/plain" correctly

View in context: https://bugs.webkit.org/attachment.cgi?id=73453&action=review

>> WebCore/platform/network/FormData.h:98
>> +    static PassRefPtr<FormData> create(const FormDataList&, const TextEncoding&, const String encodingType = "application/x-www-form-urlencoded");
> 
> The encoding type definitely shouldn't be passed as string copy. Maybe String reference, maybe an AtomicString, or maybe an enum constant. The constant is probably most appropriate.

I agree with Alexey.  We had better introduce an enum set.
Comment 13 Vineet Chaudhary (vineetc) 2011-10-25 04:45:02 PDT
Created attachment 112317 [details]
initial approach to fix this issue.

This is just code patch incorporating above review comments.
Please let me know your review comments on this. 
I will update ChangeLog too once code review is done.
Comment 14 Kent Tamura 2011-10-25 07:19:24 PDT
Comment on attachment 112317 [details]
initial approach to fix this issue.

Please read the style guide: http://www.webkit.org/coding/coding-style.html
> Enum members should user InterCaps with an initial capital letter.

Also, I feel application_x_www_form_urlencoded is too long :-)  FormURLEncoded is enough.
Comment 15 Vineet Chaudhary (vineetc) 2011-10-26 10:24:28 PDT
Created attachment 112559 [details]
updated patch

Modified Enum members as per style guidelines.
Comment 16 Darin Adler 2011-10-26 10:28:15 PDT
Comment on attachment 112559 [details]
updated patch

View in context: https://bugs.webkit.org/attachment.cgi?id=112559&action=review

Is this correct behavior? Do other browsers do this? What websites use this feature?

> Source/WebCore/platform/network/FormData.cpp:102
> +PassRefPtr<FormData> FormData::create(const FormDataList& list, const TextEncoding& encoding, const EncodingType encodingType)

This should just be EncodingType, not const EncodingType.

> Source/WebCore/platform/network/FormData.cpp:183
> +void FormData::appendKeyValuePairItems(const FormDataList& list, const TextEncoding& encoding, bool isMultiPartForm, Document* document, const EncodingType encodingType)

No const here either.

> Source/WebCore/platform/network/FormData.h:104
> +    static PassRefPtr<FormData> create(const FormDataList&, const TextEncoding&, const EncodingType encodingType = FormURLEncoded);

This should be:

    EncodingType = FormURLEncoded

The const is not needed or helpful, and the argument name is not helpful.

> Source/WebCore/platform/network/FormData.h:138
> +    static EncodingType getEncodingType(const String& type)

Functions like this are not named with the word “get” in WebKit. This is covered in the coding style guide.

This should probably be named “parseEncodingType”.

> Source/WebCore/platform/network/FormData.h:143
> +             return MultipartFormData;

Extra space here.

> Source/WebCore/platform/network/FormData.h:151
> +    void appendKeyValuePairItems(const FormDataList&, const TextEncoding&, bool isMultiPartForm, Document*, const EncodingType encodingType = FormURLEncoded);

Same comment as above.

> Source/WebCore/platform/network/FormDataBuilder.cpp:184
> +void FormDataBuilder::addKeyValuePairAsFormData(Vector<char>& buffer, const CString& key, const CString& value, const FormData::EncodingType encodingType)

Same comment as above.

> Source/WebCore/platform/network/FormDataBuilder.h:47
> +    static void addKeyValuePairAsFormData(Vector<char>&, const CString& key, const CString& value, FormData::EncodingType encodingType = FormData::FormURLEncoded);

Should omit the argument name here.
Comment 17 Darin Adler 2011-10-26 10:28:59 PDT
(In reply to comment #4)
> I've written a patch to fix this bug, but I don't know how to write a regression test for it.  How do I go about testing what a form would POST to the server without actually having a server side script?

We do need a test for this patch. As Alexey pointed out, we have many tests that do involve server side scripts.
Comment 18 Vineet Chaudhary (vineetc) 2011-10-26 11:01:26 PDT
Thank you Darin for review comments,

Yes this is correct behavior because in webkit forms with post method and "text/plain" enctype are encoded as "application/x-www-form-urlencoded".

This can be verified with latest firefox.

(In reply to comment #17)
> We do need a test for this patch. As Alexey pointed out, we have many tests that do involve server side scripts.

I have added form-post-textplain.html and form-get-textplain.html to test this  behavior, do you want me to add more tests for this?

I will upload patch again incorporating all your comments, thanks!
Comment 19 WebKit Review Bot 2011-10-26 11:21:27 PDT
Attachment 112559 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1

Source/WebCore/platform/network/FormDataBuilder.h:47:  The parameter name "encodingType" adds no information, so it should be removed.  [readability/parameter_name] [5]
Source/WebCore/platform/network/FormData.h:104:  The parameter name "encodingType" adds no information, so it should be removed.  [readability/parameter_name] [5]
Source/WebCore/platform/network/FormData.h:151:  The parameter name "encodingType" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 3 in 12 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 20 Kent Tamura 2011-10-27 01:31:47 PDT
(In reply to comment #16)
> (From update of attachment 112559 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=112559&action=review
> 
> Is this correct behavior? Do other browsers do this? What websites use this feature?

According to http://crbug.com/73979, IE, Firefox, and Opera support text/plain.
Comment 21 Vineet Chaudhary (vineetc) 2011-10-27 09:44:45 PDT
Created attachment 112695 [details]
updated_patch

updated patch contains form-post-textplain.html and form-get-textplain.html to test this  behavior, 
please let me know if I need add more tests for this?
Comment 22 WebKit Review Bot 2011-10-27 11:07:45 PDT
Comment on attachment 112695 [details]
updated_patch

Attachment 112695 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/10226774

New failing tests:
fast/forms/mailto/formenctype-attribute-input-html.html
fast/forms/mailto/post-text-plain-with-accept-charset.html
fast/forms/mailto/formenctype-attribute-button-html.html
fast/forms/mailto/post-text-plain.html
fast/forms/mailto/post-multiple-items-text-plain.html
Comment 23 Vineet Chaudhary (vineetc) 2011-10-27 13:43:35 PDT
Created attachment 112740 [details]
updated_patch

(In reply to comment #22)
> (From update of attachment 112695 [details])
> Attachment 112695 [details] did not pass chromium-ews (chromium-xvfb):
> Output: http://queues.webkit.org/results/10226774
> 
> New failing tests:
> fast/forms/mailto/formenctype-attribute-input-html.html
> fast/forms/mailto/post-text-plain-with-accept-charset.html
> fast/forms/mailto/formenctype-attribute-button-html.html
> fast/forms/mailto/post-text-plain.html
> fast/forms/mailto/post-multiple-items-text-plain.html

Hello Darin/Alexey,
could you please review this patch once more, sorry for failing test cases :(.
As Alexey mentioned earlier
> GET/mailto is always "application/x-www-form-urlencoded", but POST/mailto uses "appropriate form encoding algorithm".
Above test cases uses post method and test/plain enctype so I have to modify the expected results for above test cases.
Comment 24 Darin Adler 2011-10-31 15:10:00 PDT
Comment on attachment 112740 [details]
updated_patch

View in context: https://bugs.webkit.org/attachment.cgi?id=112740&action=review

> Source/WebCore/platform/network/FormData.h:145
> +    static EncodingType parseEncodingType(const String& type)
> +    {
> +        if (equalIgnoringCase(type, "text/plain"))
> +            return TextPlain;
> +        if (equalIgnoringCase(type, "multipart/form-data"))
> +            return MultipartFormData;
> +        return FormURLEncoded;
> +    }

Seems a little ugly to include this entire function body here in the class definition.
Comment 25 WebKit Review Bot 2011-10-31 16:03:03 PDT
Comment on attachment 112740 [details]
updated_patch

Clearing flags on attachment: 112740

Committed r98896: <http://trac.webkit.org/changeset/98896>
Comment 26 WebKit Review Bot 2011-10-31 16:03:11 PDT
All reviewed patches have been landed.  Closing bug.
Comment 28 Vineet Chaudhary (vineetc) 2011-10-31 23:47:44 PDT
(In reply to comment #27)
> Two forms/mailto tests are failing after this patch on SL, Qt, Cr Win, & Cr Linux:
> http://build.webkit.org/results/SnowLeopard%20Intel%20Release%20(Tests)/r98926%20(34333)/results.html
> 
> http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=fast%2Fforms%2Fmailto%2Fformenctype-attribute-button-html.html&group=%40ToT%20-%20webkit.org

Failing tests were platform specific so I have added those to all test_expectations.txt except gtk so those could be rebaselined. Can you please correct me to fix this?
Comment 29 Ryosuke Niwa 2011-11-01 10:35:47 PDT
(In reply to comment #28)
> (In reply to comment #27)
> > Two forms/mailto tests are failing after this patch on SL, Qt, Cr Win, & Cr Linux:
> > http://build.webkit.org/results/SnowLeopard%20Intel%20Release%20(Tests)/r98926%20(34333)/results.html
> > 
> > http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=fast%2Fforms%2Fmailto%2Fformenctype-attribute-button-html.html&group=%40ToT%20-%20webkit.org
> 
> Failing tests were platform specific so I have added those to all test_expectations.txt except gtk so those could be rebaselined. Can you please correct me to fix this?

When did this happen (which revision)? Also, no port except Chromium port uses test_expectations.txt for this purpose. You typically grab the results out of bots (build.webkit.org) and check that in instead of skiping/adding failing expectations.
Comment 30 Vineet Chaudhary (vineetc) 2011-11-01 11:05:48 PDT
(In reply to comment #29)
> (In reply to comment #28)
> > (In reply to comment #27)
> > > Two forms/mailto tests are failing after this patch on SL, Qt, Cr Win, & Cr Linux:
> > > http://build.webkit.org/results/SnowLeopard%20Intel%20Release%20(Tests)/r98926%20(34333)/results.html
> > > 
> > > http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=fast%2Fforms%2Fmailto%2Fformenctype-attribute-button-html.html&group=%40ToT%20-%20webkit.org
> > 
> > Failing tests were platform specific so I have added those to all test_expectations.txt except gtk so those could be rebaselined. Can you please correct me to fix this?
> 
> When did this happen (which revision)? Also, no port except Chromium port uses test_expectations.txt for this purpose. You typically grab the results out of bots (build.webkit.org) and check that in instead of skiping/adding failing expectations.

In the revision http://trac.webkit.org/changeset/98979 itself I have added failing tests to test_expectations.txt and even build bot didn't give error on attached patch :(. Please let me know if you want me to rebaseline the results for the ports.
Comment 31 Ryosuke Niwa 2011-11-01 11:41:26 PDT
Rebaseline done in http://trac.webkit.org/changeset/98984 (fixed change log in http://trac.webkit.org/changeset/98986).