Bug 12291 - A file downloaded via Safari that has Korean characters in its name shows "??"
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc.
Version: 420+
Hardware: Mac OS X 10.4
Importance: P2 Major
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2007-01-16 00:34 PST by Kim Hyoung Jin
Modified: 2009-09-01 11:08 PDT
Description Kim Hyoung Jin 2007-01-16 00:34:12 PST
Hi. 
I use Mac OS X in a Korean language environment, and I have a problem with file downloads in Safari.
I often download files that have Korean characters in their names. But always, in both the Downloads window and on the desktop, Safari fails to show the filename correctly. It only shows a name like "??.pdf" (the original is 한글.pdf).

"??" is where the Korean characters should appear.

I don't have this kind of problem in Mail. When I download a file that has Korean characters in its name in Mail, the Korean is shown correctly (including on the desktop).

This annoying problem appeared early in Mac OS X Safari and didn't be fixed until Today(16/1/2007 nightly webkit). I hope that this problem to be fixed really soon. Many Korean users are inconvenienced because this still isn't fixed. We always have to rename the file from "???.pdf" to "한글.pdf". Please find the cause and fix the bug.
Comment 1 David Kilzer (:ddkilzer) 2007-01-16 03:34:29 PST
(In reply to comment #0)
> This annoying problem appeared early in Mac OS X Safari and didn't be fixed
> until Today(16/1/2007 nightly webkit).

I want to make sure I understand what you said.  Is this issue fixed in the latest nightly WebKit?

> I hope that this problem to be fixed really soon.

If the issue is fixed in the nightly WebKit, it will eventually be fixed in a released version of Safari as well.  See Bug 12199 Comment #3 about when that might happen.

Also, it would help if you could provide an example URL so others may test this.

Thanks for taking the time to file this bug!

Comment 2 Alexey Proskuryakov 2007-01-16 03:51:38 PST
(In reply to comment #0)
> Please find the cause and fix the bug.

I'd be very interested in fixing this bug, but please tell us how to reproduce it. We need very detailed and precise steps to reproduce (what URL to open, what button to click, what result to expect, what your system's primary language is, and anything else you may find relevant).
Comment 3 Kim Hyoung Jin 2007-01-16 04:22:56 PST
This is NOT fixed yet in any nightly version of WebKit/Safari. Sorry for my poor English, but I'll do my best to give the developers all the information. (I live in Seoul, Korea.)

The following URL is a direct link to a file with a Korean name (.txt extension). Its original name is 한글.txt.
I use Korean as the primary language for OS X, so 한글.txt should be displayed correctly in Korean.


To reproduce the bug:
1. Click the link below.
http://www.albireo.net/powerbook/forum/showthread.php?t=4299
2. You'll see a post (with an attachment) written by gazrang (i.e. me). Download the attached file (한글.txt).
3. It will be downloaded to your download folder or desktop.
4. Check that the filename is an ugly jumble of characters (e.g. C¢¬N¢¦¡¾U^.txt) instead of proper Korean (한글.txt).

Tell me if you want more information or reproduction steps.
Comment 4 Alexey Proskuryakov 2007-01-16 09:37:04 PST
Confirmed with TOT. Works as expected in Firefox. Going to investigate whether this is a WebKit bug, or something deeper in the OS.

Looks like a tricky issue - the response doesn't have any hint to guess the encoding (Korean DOS/Windows), so the encoding of the main page will need to be taken into account:

$ curl -I "http://www.albireo.net/powerbook/forum/attachment.php?s=c0f8a7bb63cfaaacb6dac91fe7ebaa80&attachmentid=2531"
HTTP/1.1 200 OK
Date: Tue, 16 Jan 2007 17:25:15 GMT
Server: Apache/1.3.33 (Unix) mod_throttle/3.1.2 mod_perl/1.27 PHP/4.4.1
X-Powered-By: PHP/4.4.1
Set-Cookie: bblastvisit=1168968315; expires=Wed, 16 Jan 2008 17:25:15 GMT; path=/
Set-Cookie: bblastactivity=1168968306; expires=Wed, 16 Jan 2008 17:25:15 GMT; path=/
Cache-control: max-age=31536000
Expires: Wed, 16 Jan 2008 17:25:15 GMT
Last-Modified: Tue, 16 Jan 2007 12:00:09 GMT
Content-disposition: attachment; filename="???.txt"
Content-Length: 10
Connection: close
Content-Type: plain/text
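
A minimal Python sketch of why a filename with no charset hint is ambiguous. It assumes the server wrote the name as raw EUC-KR bytes and that the receiver read them as Latin-1; both are assumptions made for illustration, not something this report confirms about the framework. The name 한글.txt is the one from comment 3.

```python
# Sketch: the same header bytes read under two different charset guesses.
original = "한글.txt"

# Assumption: the server emits the filename as raw EUC-KR bytes.
raw = original.encode("euc-kr")

# A receiver that guesses Latin-1 gets mojibake rather than an error,
# because every byte value is a valid Latin-1 character.
misread = raw.decode("latin-1")

# A receiver that knows (or correctly guesses) EUC-KR recovers the name.
recovered = raw.decode("euc-kr")

print(raw.hex())
print(misread)
print(recovered)
```

Nothing in the bytes themselves says which reading is right, which is why the encoding of the referring page is the only available hint here.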
Comment 5 Kim Hyoung Jin 2007-01-16 13:35:15 PST
Thanks, developers.
I should mention that this bug is already fixed in Camino, because I reported it to the Camino developers about a year ago.

P.S. When I reproduce this bug, the Korean filename is corrupted in the Downloads window as well, not only on the desktop.
Comment 6 Alexey Proskuryakov 2007-01-19 11:59:04 PST
Unfortunately, I believe that this is an underlying OS framework bug, so it cannot be fixed in WebKit (unless we stop using this framework, of course). The file name is provided to us in [response suggestedFilename], and it's already broken at this time.

Could you please file this bug via <http://bugreport.apple.com> (free registration required) for Apple engineers to take a look? Please mention this bug in your report, and post the resulting Apple bug number here.
Comment 7 Kim Hyoung Jin 2007-01-19 13:40:15 PST
Although I didn't tell the developers here, I already reported this bug at Apple Bug Report (with the same subject).
But even though 10.4.8 has come out, nothing has changed. The bug still exists in 10.4.8, and I doubt 10.4.9 will fix it. So I came directly here to the WebKit developers. (That's because the Camino browser already fixed this bug on an earlier version of OS X.)

OK. Maybe all I can do is wait. :)
Comment 8 David Kilzer (:ddkilzer) 2007-01-19 13:49:22 PST
(In reply to comment #7)
> Although I didn't tell devolpers here, I already reported this bug at Apple Bug
> Report(with same subject).

Could you log into bugreport.apple.com quickly, find the Bug ID# and report it here?  I believe it's easier for Apple to find these bugs by ID than subject.  Thanks!
Comment 9 Kim Hyoung Jin 2007-01-19 13:54:59 PST
Problem ID: 4727992 at Apple Bug Reporting.
The engineers at Apple marked this bug as a duplicate of another bug, presumably one reported by someone else.
They must know about this bug. But it is a very annoying bug, and it hasn't been fixed for many years. :(
Comment 10 Per Lindberg 2007-09-07 07:05:28 PDT
The problem appears to be that WebKit/Safari does not recognize a correct Content-Disposition header, with a filename encoded as per RFC 2047.

The following link may also be enlightening:
http://java.sun.com/j2ee/1.4/docs/api/javax/servlet/http/HttpServletResponse.html#setHeader(java.lang.String,%20java.lang.String)

Comment 11 David Kilzer (:ddkilzer) 2007-09-07 07:09:27 PDT
(In reply to comment #10)
> The problem appears to be that WebKit/Safari does not recognize a correct
> Content-Disposition header, with a filename encoded as per RFC 2047.

http://www.ietf.org/rfc/rfc2047.txt
Comment 12 Federico Gandino 2007-11-21 09:31:51 PST
Hi, were you finally able to solve it?

Many thanks!

Fede
Comment 13 David Kilzer (:ddkilzer) 2007-11-21 11:06:34 PST
Wasn't there a Radar filed about this as well?  Or a duplicate Bugzilla bug?

Comment 14 David Kilzer (:ddkilzer) 2007-11-21 11:09:38 PST
(In reply to comment #13)
> Wasn't there a Radar filed about this as well?  Or a duplicate Bugzilla bug?

Or am I thinking of a different Windows-only issue?
Comment 15 Alexey Proskuryakov 2007-11-21 12:45:19 PST
See comment 9, it's <rdar://problem/4727992>
Comment 16 David Kilzer (:ddkilzer) 2007-11-21 13:09:57 PST
(In reply to comment #15)
> See comment 9, it's <rdar://problem/4727992>

Thanks, ap!  That was duped to <rdar://problem/4605374> FWIW.
Comment 17 Federico Gandino 2008-01-21 04:21:15 PST
Hi! I've solved the problem by wrapping the "filename" value in double quotes and setting the appropriate content type. As an example:

Response.AddHeader("content-disposition", "attachment;filename=\"" + filename + "\"")

The content types would be:

.xml -> "application/Text"
.doc -> "application/ms-word"
.xls -> "application/ms-excel"

             
Comment 18 Alexey Proskuryakov 2008-02-01 14:41:02 PST
(In reply to comment #10)
> The problem appears to be that WebKit/Safari does not recognize a correct
> Content-Disposition header, with a filename encoded as per RFC 2047.

Thank you for mentioning this!

I do not think that this bug as reported is actually covered by RFC 2047 (as you can see in comment 4, the header is not encoded as the RFC specifies, no "=?EUC-KR?" here). Also, the RFC itself seems to say that it doesn't apply to file names:

   + An 'encoded-word' MUST NOT be used in parameter of a MIME
     Content-Type or Content-Disposition field, or in any structured
     field body except within a 'comment' or 'phrase'.

If you are aware of any actual Web sites that do encode attachment names like this, I'd really appreciate it if you could file a new bug for those. Thanks!
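
For reference, a hedged Python sketch of what an RFC 2047 encoded-word looks like, so it can be compared with the plain `filename="???.txt"` header in comment 4. The encoded-word is built by hand with base64; the filename comes from comment 3 and EUC-KR is only the charset named above. This shows the syntax only and does not imply that encoded-words are valid in this parameter.

```python
import base64
from email.header import decode_header

filename = "한글.txt"

# An encoded-word has the shape =?charset?encoding?encoded-text?=
# "B" selects base64 as the encoding.
payload = base64.b64encode(filename.encode("euc-kr")).decode("ascii")
encoded_word = f"=?EUC-KR?B?{payload}?="

# decode_header returns (bytes, charset) pairs for encoded parts.
decoded_bytes, charset = decode_header(encoded_word)[0]
decoded = decoded_bytes.decode(charset)

print(encoded_word)
print(decoded)
```

Because the charset travels inside the token, a receiver does not have to guess it, which is the property the header in comment 4 lacks.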
Comment 19 Chris Burmester 2008-03-26 11:09:21 PDT
I believe that the core issue here is that the specification for the Content-disposition HTTP header does not allow for the specification of charset for the string that specifies the "filename" in:

Content-disposition: attachment; filename=THEFILENAME

So, if THEFILENAME contains unicode characters, there's no way to specify the charset - utf-8, for example. The applicable standard, RFC 1521, explicitly punts on this issue. See, for example:

http://www.imc.org/ietf-822/old-archive2/msg02165.html

Mozilla-based browsers deal with this issue by assuming that characters in the filename string are UTF-8 encoded. Thus, if you pass a UTF-8 encoded string in the filename parameter, then it works.

IE browsers handle this by accepting URL encoded strings in the filename parameter.

See, for example: http://www.motobit.com/help/scptutl/pa97.htm

As far as I can tell, Safari is the only browser that doesn't provide a workaround.

Until a standard is provided, I would recommend that Safari do what the Mozilla-based browsers have done and support UTF-8 encoding by default in the filename. Thus, any developer who wants to support downloading an attachment whose filename contains Unicode characters need only use UTF-8 encoding.
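
The two workarounds described in this comment can be sketched roughly as follows, in Python. This is an illustration under assumptions: the filename is the one from comment 3, and which variant a given browser version actually accepts is as described by the commenter, not verified here.

```python
from urllib.parse import quote, unquote

filename = "한글.txt"
prefix = 'Content-Disposition: attachment; filename="'

# Variant described for Mozilla-based browsers: raw UTF-8 bytes in the
# parameter. The header line is built as bytes because it is not ASCII.
utf8_style = prefix.encode("ascii") + filename.encode("utf-8") + b'"'

# Variant described for IE: percent-encode the UTF-8 bytes (URL encoding),
# which keeps the header line pure ASCII.
percent_style = prefix + quote(filename) + '"'

print(utf8_style)
print(percent_style)
print(unquote(quote(filename)))
```

A server using either variant would have to pick one per user agent, since neither form is understood by every browser mentioned in this thread.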
Comment 20 nickmay 2008-07-22 22:57:21 PDT
Is there likely to be any movement on this issue? (I have just downloaded the latest WebKit nightly and it still has problems with Japanese filenames.)

It is a serious issue for many users - it prevents use of Safari with web-applications that upload and download files to a server, unless those files have very restricted filenames. 

It is always a shame when one has to tell users explicitly NOT to use Safari with a web-app, and to use I.E. or Firefox instead.

I appreciate the point about the standard not being explicit, but if I.E. and Firefox can both code around it to provide this much-needed functionality, surely Safari should be able to as well.

Standards or no standards, to the user this is "badly broken". 

Comment 21 Alexey Proskuryakov 2008-07-23 02:35:38 PDT
This is being investigated by Apple. As soon as underlying support for this in closed source frameworks is provided, the necessary changes to WebKit will be made.
Comment 22 Stephen Booher 2008-07-29 15:31:31 PDT
(In reply to comment #18)
> (In reply to comment #10)
> > The problem appears to be that WebKit/Safari does not recognize a correct
> > Content-Disposition header, with a filename encoded as per RFC 2047.
> 
> Thank you for mentioning this!
> 
> I do not think that this bug as reported is actually covered by RFC 2047 (as
> you can see in comment 4, the header is not encoded as the RFC specifies, no
> "=?EUC-KR?" here).

I believe the applicable standard for encoding filenames in the Content-Disposition header is in RFC 2231 (http://tools.ietf.org/html/rfc2231). Firefox supports this standard as well.
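
A short Python sketch of the RFC 2231 extended-parameter form mentioned here, using `email.utils.encode_rfc2231` from the standard library. The filename is the one from comment 3, and UTF-8 is an assumed charset choice; the RFC allows others.

```python
from email.utils import encode_rfc2231
from urllib.parse import unquote

filename = "한글.txt"

# encode_rfc2231 yields charset'language'percent-encoded-value
# (the language part is left empty here).
value = encode_rfc2231(filename, charset="utf-8")

# The trailing * on the parameter name marks an RFC 2231 extended value.
header = "Content-Disposition: attachment; filename*=" + value

# Decode by hand: split off charset and language, then undo the
# percent-encoding using the declared charset.
charset, language, encoded = value.split("'", 2)
decoded = unquote(encoded, encoding=charset)

print(header)
print(decoded)
```

Unlike a bare `filename=` parameter, this form states its charset explicitly, so the receiver does not need to consult the referring page.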
Comment 23 Alexey Proskuryakov 2009-09-01 11:08:30 PDT
This works fine for me in Safari 4.0.3 on Mac OS X 10.5.8 or later (notably, this is NOT fixed on Tiger).