Bug 62107 - Filename in file upload should not URL-encode " to %22
Summary: Filename in file upload should not URL-encode " to %22
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Forms (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-06-05 00:30 PDT by Daniel Cheng
Modified: 2012-05-02 13:45 PDT (History)
5 users (show)

See Also:


Attachments
Patch (4.65 KB, patch)
2011-06-05 00:40 PDT, Daniel Cheng
no flags Details | Formatted Diff | Diff
Patch (4.84 KB, patch)
2011-06-05 19:43 PDT, Daniel Cheng
no flags Details | Formatted Diff | Diff
Patch (5.14 KB, patch)
2011-10-21 00:05 PDT, Daniel Cheng
ap: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Daniel Cheng 2011-06-05 00:30:57 PDT
Original report at http://crbug.com/81193

This is apparently permissible by the RFC, but since %22 is not encoded, there's no way to distinguish between %22 and a quote that was encoded to %22.
It's not clear to me if we need to escape backslashes as well, and I'm not familiar enough with the RFC. I'm uploading a preliminary patch anyway.
Comment 1 Daniel Cheng 2011-06-05 00:40:38 PDT
Created attachment 96049 [details]
Patch
Comment 2 Alexey Proskuryakov 2011-06-05 12:09:18 PDT
It's fine to change how we encode form data for better compatibility with other browsers, as long as it doesn't reintroduce the problem described at <http://kuza55.blogspot.com/2008/02/csrf-ing-file-upload-fields.html>.

For form data encoding, HTML5 spec currently references RFC 2388, which references a chain of older RFCs, some of which do talk about escaping content in quoted strings with a backslash in certain contexts, but it's not immediately obvious if that formally filters up to form-data in all code paths that are being changed here.

I think that to be really sure about this change, the following needs to ideally be done:
1. Can you find a Mozilla bug/revision for when they introduced this behavior? Ours is bug 30723.
2. Find out what IE9 does.
3. While at it, can you check how Mozilla encodes newlines now?
4. A regression test needs to be made (and verified with Firefox) for all code paths that are being changed now. The function is called from beginMultiPartHeader() and from addFilenameToMultiPartHeader(), so it affects more than file names.

My feeling is that the proposed change is probably right.
Comment 3 Daniel Cheng 2011-06-05 17:07:17 PDT
(In reply to comment #2)
> It's fine to change how we encode form data for better compatibility with other browsers, as long as it doesn't reintroduce the problem described at <http://kuza55.blogspot.com/2008/02/csrf-ing-file-upload-fields.html>.
> 
> For form data encoding, HTML5 spec currently references RFC 2388, which references a chain of older RFCs, some of which do talk about escaping content in quoted strings with a backslash in certain contexts, but it's not immediately obvious if that formally filters up to form-data in all code paths that are being changed here.
> 
> I think that to be really sure about this change, the following needs to ideally be done:
> 1. Can you find a Mozilla bug/revision for when they introduced this behavior? Ours is bug 30723.

I can't see that bug. Do you know who in Mozilla might be a good contact person to talk to about this issue?

> 2. Find out what IE9 does.

" is disallowed in Win32 filenames. It's also very hard to create a file with a newline in the name. I'll keep trying though.

> 3. While at it, can you check how Mozilla encodes newlines now?

They seem to just transform it to a space. Mozilla doesn't escape backslashes though. I'm guessing this is something we probably want to do as well, per http://blog.nullmethod.com/2011/05/csrf-file-uploads-in-firefox-4.html.

There are some other pages about potential CSRF opportunities in file uploads that seem to have to do with CORS: http://blog.kotowicz.net/2011/04/how-to-upload-arbitrary-file-contents.html and http://secureyes.net/nw/assets/Cross_Site_File_Upload_Forgery_in_Firefox4_A_Proof_of_Concept.pdf reference this flaw. I don't think it applies to this particular problem though.

> 4. A regression test needs to be made (and verified with Firefox) for all code paths that are being changed now. The function is called from beginMultiPartHeader() and from addFilenameToMultiPartHeader(), so it affects more than file names.

I'm not sure what this would involve. Is there someone I can talk to to get an idea of the scope of this work?

> 
> My feeling is that the proposed change is probably right.
Comment 4 Daniel Cheng 2011-06-05 19:43:43 PDT
Created attachment 96061 [details]
Patch
Comment 5 Alexey Proskuryakov 2011-06-05 20:43:54 PDT
> I can't see that bug.

The information is all in <http://kuza55.blogspot.com/2008/02/csrf-ing-file-upload-fields.html>.

> Do you know who in Mozilla might be a good contact person to talk to about this issue?

I do not. I'd start with source control blame.

> > 2. Find out what IE9 does.
> 
> " is disallowed in Win32 filenames. It's also very hard to create a file with a newline in the name. I'll keep trying though.

You do not need an actual file with a quote mark in the name, all you need is to have it in HTML.

> I'm not sure what this would involve. Is there someone I can talk to to get an idea of the scope of this work?

You need to have a look where the function you modify is called from. There are two callers - one is to encode a file name, and another is to encode a form input name.
Comment 6 Daniel Cheng 2011-10-21 00:03:57 PDT
Using a filename of abc\ndef".html using the .pl file in the patch:

===== WebKit (pre-patch);
WebKitFormBoundary

Content-Disposition: form-data; name="toUpload"; filename="abc%0Adef%22.html"

Content-Type: text/html

WebKitFormBoundary

Content-Disposition: form-data; name="abc\%22\%0D%0A %22"

def\"\ "

WebKitFormBoundary

===== WebKit (post-patch):
WebKitFormBoundary

Content-Disposition: form-data; name="toUpload"; filename="abc%0Adef\".html"

Content-Type: text/html

WebKitFormBoundary

Content-Disposition: form-data; name="abc\\\"\\%0D%0A \""

def\"\ "

WebKitFormBoundary

===== Firefox:
-----------------------------10102754414578508781458777923

Content-Disposition: form-data; name="toUpload"; filename="abc def\".html"

Content-Type: text/html

-----------------------------10102754414578508781458777923

Content-Disposition: form-data; name="abc\\"\ \""

def\"\ "

-----------------------------10102754414578508781458777923-- 

===== IE:
-----------------------------7db1d11140564 
Content-Disposition: form-data; name="toUpload"; filename="abcƒXdefƒp.html" 
Content-Type: text/html 


-----------------------------7db1d11140564 
Content-Disposition: form-data; name="" 

def\"\ " 
-----------------------------7db1d11140564--

Several things to note...
Filename:
I have no idea how IE chooses to handle it since both " and \n are illegal in Win32 filenames.
Firefox escapes quotes with a backslash and converts all whitespace to a regular space. Firefox does not appear to correctly escape backslashes (I tested this after writing up the initial comment).
We preserve whitespace and escape backslashes and double quotes with a single backslash.

Input names:
IE doesn't handle funky input names well. Firefox simply collapses all whitespace into a single regular space. Firefox does not appear to correctly escape quotes in input names. We preserve whitespace in input names.

Input values:
All browsers seem to handle input values the same (whitespace is collapsed into one space).
Comment 7 Daniel Cheng 2011-10-21 00:05:31 PDT
Created attachment 111915 [details]
Patch
Comment 8 Alexey Proskuryakov 2011-10-21 10:24:20 PDT
One note: we are not looking for compatibility with other browsers if those are affected by <http://kuza55.blogspot.com/2008/02/csrf-ing-file-upload-fields.html>. Could you please comment on that?

I don't have an opportunity to deeply understand the suggested change now, but looking at comment 6 and the ChangeLog, I'm worried that it seems to be about compatibility more than about safety. Given that this is not the first iteration, what you do is likely correct, but it's hard to see why from the explanation.
Comment 9 Daniel Cheng 2011-10-21 11:33:05 PDT
This patch has two aims:
- Change filename / input name encoding so it's reversible.
- Try to do the same thing other browsers when they handle things correctly (thus the reason I included the snippets from other browsers)

I believe this patch is correct and will not allow further CSRFing of file uploads (this is in contrast to Mozilla, where a single backslash can still allow CSRFing of file uploads)--we properly escape both a double quote and a backslash. There are further issues to consider, but probably out of scope of this patch (in particular, how input names are parsed).
Comment 10 Alexey Proskuryakov 2012-02-02 15:56:48 PST
Comment on attachment 111915 [details]
Patch

I don't think that this is safe. IE and Firefox don't escape backslashes because these are normal in Windows paths. We don't have much indication that servers actually treat backslashes as escape characters (see <http://java.net/jira/browse/JERSEY-759> about a server that does, and its users see it as broken as a result).

If we use backslash to escape, servers expecting IE behavior will treat those as path separators.

I agree that the current situation is not good. Unfortunately, there doesn't seem to be much that could be done in WebKit unilaterally. I suggest working with WHATWG or HTML WG to get something specified in HTML5, and getting browsers converge on that.
Comment 11 Daniel Cheng 2012-02-02 16:11:51 PST
I'm focused in some other areas at the moment, so I won't be able to drive spec changes. Unassigning myself... hopefully someone else will pick this up.
Comment 12 Evan Jones 2012-05-02 13:45:29 PDT
I've filed an HTML5 bug and started a mailing list thread to try to get the spec describe what browsers are actually supposed to do in this case. It would be great to get some webkit people to chime in about what is the right answer here.

Bug: https://www.w3.org/Bugs/Public/show_bug.cgi?id=16909
Mailing list: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-May/035610.html

(Aside: I ran into this as a server implementer; the existing state of affairs requires browser-specific parsing).