NEW 62107
Filename in file upload should not URL-encode " to %22
https://bugs.webkit.org/show_bug.cgi?id=62107
Summary Filename in file upload should not URL-encode " to %22
Daniel Cheng
Reported 2011-06-05 00:30:57 PDT
Original report at http://crbug.com/81193 This is apparently permissible by the RFC, but since %22 is not encoded, there's no way to distinguish between %22 and a quote that was encoded to %22. It's not clear to me if we need to escape backslashes as well, and I'm not familiar enough with the RFC. I'm uploading a preliminary patch anyway.
Attachments
Patch (4.65 KB, patch)
2011-06-05 00:40 PDT, Daniel Cheng
no flags
Patch (4.84 KB, patch)
2011-06-05 19:43 PDT, Daniel Cheng
no flags
Patch (5.14 KB, patch)
2011-10-21 00:05 PDT, Daniel Cheng
ap: review-
Daniel Cheng
Comment 1 2011-06-05 00:40:38 PDT
Alexey Proskuryakov
Comment 2 2011-06-05 12:09:18 PDT
It's fine to change how we encode form data for better compatibility with other browsers, as long as it doesn't reintroduce the problem described at <http://kuza55.blogspot.com/2008/02/csrf-ing-file-upload-fields.html>. For form data encoding, HTML5 spec currently references RFC 2388, which references a chain of older RFCs, some of which do talk about escaping content in quoted strings with a backslash in certain contexts, but it's not immediately obvious if that formally filters up to form-data in all code paths that are being changed here. I think that to be really sure about this change, the following needs to ideally be done: 1. Can you find a Mozilla bug/revision for when they introduced this behavior? Ours is bug 30723. 2. Find out what IE9 does. 3. While at it, can you check how Mozilla encodes newlines now? 4. A regression test needs to be made (and verified with Firefox) for all code paths that are being changed now. The function is called from beginMultiPartHeader() and from addFilenameToMultiPartHeader(), so it affects more than file names. My feeling is that the proposed change is probably right.
Daniel Cheng
Comment 3 2011-06-05 17:07:17 PDT
(In reply to comment #2) > It's fine to change how we encode form data for better compatibility with other browsers, as long as it doesn't reintroduce the problem described at <http://kuza55.blogspot.com/2008/02/csrf-ing-file-upload-fields.html>. > > For form data encoding, HTML5 spec currently references RFC 2388, which references a chain of older RFCs, some of which do talk about escaping content in quoted strings with a backslash in certain contexts, but it's not immediately obvious if that formally filters up to form-data in all code paths that are being changed here. > > I think that to be really sure about this change, the following needs to ideally be done: > 1. Can you find a Mozilla bug/revision for when they introduced this behavior? Ours is bug 30723. I can't see that bug. Do you know who in Mozilla might be a good contact person to talk to about this issue? > 2. Find out what IE9 does. " is disallowed in Win32 filenames. It's also very hard to create a file with a newline in the name. I'll keep trying though. > 3. While at it, can you check how Mozilla encodes newlines now? They seem to just transform it to a space. Mozilla doesn't escape backslashes though. I'm guessing this is something we probably want to do as well, per http://blog.nullmethod.com/2011/05/csrf-file-uploads-in-firefox-4.html. There are some other pages about potential CSRF opportunities in file uploads that seem to have to do with CORS: http://blog.kotowicz.net/2011/04/how-to-upload-arbitrary-file-contents.html and http://secureyes.net/nw/assets/Cross_Site_File_Upload_Forgery_in_Firefox4_A_Proof_of_Concept.pdf reference this flaw. I don't think it applies to this particular problem though. > 4. A regression test needs to be made (and verified with Firefox) for all code paths that are being changed now. The function is called from beginMultiPartHeader() and from addFilenameToMultiPartHeader(), so it affects more than file names. I'm not sure what this would involve. Is there someone I can talk to to get an idea of the scope of this work? > > My feeling is that the proposed change is probably right.
Daniel Cheng
Comment 4 2011-06-05 19:43:43 PDT
Alexey Proskuryakov
Comment 5 2011-06-05 20:43:54 PDT
> I can't see that bug. The information is all in <http://kuza55.blogspot.com/2008/02/csrf-ing-file-upload-fields.html>. > Do you know who in Mozilla might be a good contact person to talk to about this issue? I do not. I'd start with source control blame. > > 2. Find out what IE9 does. > > " is disallowed in Win32 filenames. It's also very hard to create a file with a newline in the name. I'll keep trying though. You do not need an actual file with a quote mark in the name, all you need is to have it in HTML. > I'm not sure what this would involve. Is there someone I can talk to to get an idea of the scope of this work? You need to have a look where the function you modify is called from. There are two callers - one is to encode a file name, and another is to encode a form input name.
Daniel Cheng
Comment 6 2011-10-21 00:03:57 PDT
Using a filename of abc\ndef".html using the .pl file in the patch: ===== WebKit (pre-patch); WebKitFormBoundary Content-Disposition: form-data; name="toUpload"; filename="abc%0Adef%22.html" Content-Type: text/html WebKitFormBoundary Content-Disposition: form-data; name="abc\%22\%0D%0A %22" def\"\ " WebKitFormBoundary ===== WebKit (post-patch): WebKitFormBoundary Content-Disposition: form-data; name="toUpload"; filename="abc%0Adef\".html" Content-Type: text/html WebKitFormBoundary Content-Disposition: form-data; name="abc\\\"\\%0D%0A \"" def\"\ " WebKitFormBoundary ===== Firefox: -----------------------------10102754414578508781458777923 Content-Disposition: form-data; name="toUpload"; filename="abc def\".html" Content-Type: text/html -----------------------------10102754414578508781458777923 Content-Disposition: form-data; name="abc\\"\ \"" def\"\ " -----------------------------10102754414578508781458777923-- ===== IE: -----------------------------7db1d11140564 Content-Disposition: form-data; name="toUpload"; filename="abcƒXdefƒp.html" Content-Type: text/html -----------------------------7db1d11140564 Content-Disposition: form-data; name="" def\"\ " -----------------------------7db1d11140564-- Several things to note... Filename: I have no idea how IE chooses to handle it since both " and \n are illegal in Win32 filenames. Firefox escapes quotes with a backslash and converts all whitespace to a regular space. Firefox does not appear to correctly escape backslashes (I tested this after writing up the initial comment). We preserve whitespace and escape backslashes and double quotes with a single backslash. Input names: IE doesn't handle funky input names well. Firefox simply collapses all whitespace into a single regular space. Firefox does not appear to correctly escape quotes in input names. We preserve whitespace in input names. Input values: All browsers seem to handle input values the same (whitespace is collapsed into one space).
Daniel Cheng
Comment 7 2011-10-21 00:05:31 PDT
Alexey Proskuryakov
Comment 8 2011-10-21 10:24:20 PDT
One note: we are not looking for compatibility with other browsers if those are affected by <http://kuza55.blogspot.com/2008/02/csrf-ing-file-upload-fields.html>. Could you please comment on that? I don't have an opportunity to deeply understand the suggested change now, but looking at comment 6 and the ChangeLog, I'm worried that it seems to be about compatibility more than about safety. Given that this is not the first iteration, what you do is likely correct, but it's hard to see why from the explanation.
Daniel Cheng
Comment 9 2011-10-21 11:33:05 PDT
This patch has two aims: - Change filename / input name encoding so it's reversible. - Try to do the same thing other browsers when they handle things correctly (thus the reason I included the snippets from other browsers) I believe this patch is correct and will not allow further CSRFing of file uploads (this is in contrast to Mozilla, where a single backslash can still allow CSRFing of file uploads)--we properly escape both a double quote and a backslash. There are further issues to consider, but probably out of scope of this patch (in particular, how input names are parsed).
Alexey Proskuryakov
Comment 10 2012-02-02 15:56:48 PST
Comment on attachment 111915 [details] Patch I don't think that this is safe. IE and Firefox don't escape backslashes because these are normal in Windows paths. We don't have much indication that servers actually treat backslashes as escape characters (see <http://java.net/jira/browse/JERSEY-759> about a server that does, and its users see it as broken as a result). If we use backslash to escape, servers expecting IE behavior will treat those as path separators. I agree that the current situation is not good. Unfortunately, there doesn't seem to be much that could be done in WebKit unilaterally. I suggest working with WHATWG or HTML WG to get something specified in HTML5, and getting browsers converge on that.
Daniel Cheng
Comment 11 2012-02-02 16:11:51 PST
I'm focused in some other areas at the moment, so I won't be able to drive spec changes. Unassigning myself... hopefully someone else will pick this up.
Evan Jones
Comment 12 2012-05-02 13:45:29 PDT
I've filed an HTML5 bug and started a mailing list thread to try to get the spec describe what browsers are actually supposed to do in this case. It would be great to get some webkit people to chime in about what is the right answer here. Bug: https://www.w3.org/Bugs/Public/show_bug.cgi?id=16909 Mailing list: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-May/035610.html (Aside: I ran into this as a server implementer; the existing state of affairs requires browser-specific parsing).
Note You need to log in before you can comment on or make changes to this bug.