With this site(webkit-buglizza), I cannot upload file from my Desktop. After moving file to somewhere like c:\__sendtemp\, uploading succeeded. 'filename' is full path, and contains UTF8(nonascii) chars. With some environment directory name "desktop" is localized into nonascii strings. I don't know who has responsiblility, or whether we should always send fullpath. ; =============== ; posts ; =============== POST http://bugs.webkit.org/attachment.cgi HTTP/1.1 User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ja) AppleWebKit/522.14.1+ (KHTML, like Gecko) Version/3.0.2 Safari/522.13.1 Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryAAsdAAJVAApEAA5j Transfer-Encoding: Chunked ..... ------WebKitFormBoundaryAAsdAAJVAApEAA5j Content-Disposition: form-data; name="data"; filename="C:\Documents and Settings\*****\ε㑦€»εβ―εγ°ε₯γεγΝ½test.html.png" Content-Type: image/png ; =============== ; response ; =============== HTTP/1.1 411 Length Required Server: Apache/1.3.33 (Darwin) PHP/4.4.1 mod_ssl/2.8.24 OpenSSL/0.9.7i DAV/1.0.3 Transfer-Encoding: chunked Content-Type: text/html; charset=iso-8859-1
Oh, HTMLFormElement::formData() converts path to current encoding. At the time above I certainly settled DefaultEncoding as UTF8. Now I retried with reloaded attachment.cgi while Encoding->Europian. Path is converted as below and still uploading failed. Content-Disposition: form-data; name="data"; filename="C:\Documents and Settings\********\デスクトップ\test.txt"
I might well be wrong, but I think that a full path is not really needed (and is sometimes harmful) in Content-Disposition, so just a file name can be sent. Would you be willing to submit such a patch for WebCore, as described in <http://webkit.org/coding/contributing.html>?
What does Firefox or MSIE (or Opera) when uploading this file? Do they encode the path differently?
(In reply to comment #2) > I might well be wrong, but I think that a full path is not really needed (and > is sometimes harmful) in Content-Disposition, so just a file name can be sent. But if the actual file contained UTF8 (non-ASCII) characters, this doesn't really fix the issue.
<rdar://problem/5340188>
(In reply to comment #4) > But if the actual file contained UTF8 (non-ASCII) characters, this doesn't > really fix the issue. That's true, but we don't have to fix everything at once - there isn't any 100% correct solution for non-ASCII characters in HTTP headers AFAIK, so it will take a lot of tweaking. Also, most experienced users know about the perils of non-ASCII file names, while it's really hard to guess that a problem with uploading is caused by a directory name somewhere in the path.
(In reply to comment #6) > (In reply to comment #4) > > But if the actual file contained UTF8 (non-ASCII) characters, this doesn't > > really fix the issue. > > That's true, but we don't have to fix everything at once - there isn't any 100% > correct solution for non-ASCII characters in HTTP headers AFAIK, so it will > take a lot of tweaking. > > Also, most experienced users know about the perils of non-ASCII file names, > while it's really hard to guess that a problem with uploading is caused by a > directory name somewhere in the path. My main concern is that this would make Safari on Windows behave differently from every other browser with respect to uploading files.
Why can't we use "encoded-word" encoding for the path as defined by RFC 2047? It's perfect for this application--encoding non-US-ASCII characters in a US-ASCII-only field! We could either encode the path on a per-directory basis (encode each set of characters between directory separators), or encode the whole string. (I would recommend encoding text between each directory separator since server-side software would have to be updated to understand how to decode such fields.) http://tools.ietf.org/html/rfc2047 Updates to RFC 2047: http://tools.ietf.org/html/rfc2184 http://tools.ietf.org/html/rfc2231 Confirming bug based on empirical evidence.
(In reply to comment #8) > Why can't we use "encoded-word" encoding for the path as defined by RFC 2047? Let's track this as a separate bug, please. The fact that a full path is sent from Windows is just a porting bug AFAICT - it's not sent from the Mac.
/* bug#14667 for the encoding of filename part discussion */
The problem is in HTMLFormElement::formData(): int start = path.reverseFind('/') + 1; We should be using PathFindFileName() API on Windows, and possibly basename(3) where available. I'm trying to think of a good place to put a helper function...
Created attachment 15933 [details] proposed fix Not tested, as I don't have a Windows build environment set up. AFAICT, path operations aren't needed anywhere else in WebCore, so a slightly buggy static helper looks like an fine solution :-)
Comment on attachment 15933 [details] proposed fix +static String pathGetFileName(const String& path) +{ +#if PLATFORM(WIN_OS) + return String(PathFindFileName(path.charactersWithNullTermination())); This won't compile, since charactersWithNullTermination is a non-const instance method.
Created attachment 15934 [details] updated patch Oops - I wonder why it is non-const if it doesn't really modify the string.
Comment on attachment 15934 [details] updated patch + return String(PathFindFileName(path.charactersWithNullTermination())); Do you need the explicit String() here? Doesn't the conversion happen automatically? Also, we often use "filename" as a single word; so I think the variables and functions could just be all lowercase. r=me
(In reply to comment #15) > Do you need the explicit String() here? Doesn't the conversion happen > automatically? I wondered the same when I saw such a construct in FileChooserWin.cpp, so I preserved it for safety (I don't have a Windows build environment set up at the moment).
Committed revision 25067.