WebKit Bugzilla
RESOLVED INVALID
14475
REGRESSION: Korean (DOS) encoding doesn't work
https://bugs.webkit.org/show_bug.cgi?id=14475
Kyungdahm Yun
Reported
2007-06-30 07:10:33 PDT
Characters in various encodings are detected and rendered correctly when they are in a frame that properly specifies its text encoding. But when the frame is poorly structured, the encoding is not detected. Worse, the text encoding cannot be changed even with an explicit menu command.

I ran a small test with three cases. The frames are contained in a main page that specifies a default encoding in a META tag.

Frame 1: the frame contains encoded characters in raw form, without any HTML markup.
Frame 2: the frame has an HTML structure, but no encoding is specified.
Frame 3: the frame has an HTML structure with the encoding properly specified.

In Safari 3.0.2 (522.12) and the nightly build, Frames 1 and 2 show the problem. Attempts to change 'Text Encoding' in the View menu failed: choosing any encoding except UTF-8 did nothing, and choosing UTF-8 changed the rendered text into badly broken characters. Frame 3 renders correctly.

Firefox 2.0.0.3 and Camino 1.5 have no problem at all; they even automatically detected the proper encoding for Frames 1 and 2. Internet Explorer 7 on Windows also does a good job: it detected the proper encoding for all frames.
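For readers who want to reproduce the three cases, the setup can be sketched in Python. This is a hypothetical reconstruction: the actual attachment contents are not shown in this report, and the file names, the sample text, and the `build_frames` helper are assumptions made for illustration.

```python
# Hypothetical reconstruction of the three frame documents described above.
# The real attachment contents are not shown in this report.
SAMPLE_TEXT = "한글"  # assumed sample Korean text


def build_frames(text=SAMPLE_TEXT, encoding="euc_kr"):
    """Return a mapping of (assumed) file names to frame bytes."""
    raw = text.encode(encoding)
    return {
        # Frame 1: raw encoded characters, no HTML markup at all.
        "frame1.html": raw,
        # Frame 2: HTML structure, but no encoding declared.
        "frame2.html": b"<html><body>" + raw + b"</body></html>",
        # Frame 3: HTML structure with the encoding declared in a META tag.
        "frame3.html": (
            b'<html><head><meta http-equiv="Content-Type" '
            b'content="text/html; charset=euc-kr"></head><body>'
            + raw
            + b"</body></html>"
        ),
    }


frames = build_frames()

# Decoding EUC-KR bytes as UTF-8 produces replacement characters, which is
# one way to get "broken characters" like those seen after choosing UTF-8.
broken = frames["frame1.html"].decode("utf-8", errors="replace")
```

Only Frame 3 carries enough information for a browser to pick the encoding without guessing or inheriting it from the parent page.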
Attachments
3 frames with different encoding setup (1.67 KB, application/zip), 2007-06-30 07:13 PDT, Kyungdahm Yun, no flags
test case (347 bytes, text/html), 2007-06-30 09:12 PDT, Alexey Proskuryakov, no flags
Kyungdahm Yun
Comment 1
2007-06-30 07:13:06 PDT
Created attachment 15325: 3 frames with different encoding setup
Alexey Proskuryakov
Comment 2
2007-06-30 07:24:35 PDT
(In reply to comment #0)
> The worse is that one can't even
> change text encoding with an explicit menu command.
I have tried, and choosing Korean (Mac OS) from the menu works for me in the r23841 nightly (running with Safari 3.0.2 beta). I'm wondering what is different in your case. Do you have any Safari enhancers installed?
Kyungdahm Yun
Comment 3
2007-06-30 08:15:44 PDT
(In reply to comment #2)
> I have tried, and choosing Korean (Mac OS) from the menu works for me in r23841
> nightly (running with Safari 3.0.2 beta). I'm wondering what is different in
> your case. Do you have any Safari enhancers installed?
I missed that one. Actually, I (and probably many Korean users) usually use 'Korean (Windows, DOS)', not 'Korean (Mac OS)'. They are slightly different variants of the EUC-KR encoding, though I'm not sure exactly where they differ. Since Windows is prevalent in Korea, the former is more commonly found on the web. In any case, choosing 'Korean (Windows, DOS)' should show the same result as 'Korean (Mac OS)' in most cases. Web pages that rendered correctly in Safari 2 are broken in Safari 3. Also, automatic encoding detection in Safari 3 seems to be somewhat broken when the page does not specify an encoding.

PS: I don't have any enhancers installed. I once had SafariStand, but uninstalled it right after the Safari 3 beta came out.
Alexey Proskuryakov
Comment 4
2007-06-30 09:02:25 PDT
> Anyway, choosing 'Korean (Windows, DOS)' should show the same result as 'Korean
> (Mac OS)' in most cases
Yes, I also see this now. Confirming that 'Korean (Windows, DOS)' no longer works, and renaming the bug to make clear that it tracks this specific problem.

As for automatic detection, there are in fact two issues:

1) Firefox has true encoding auto-detection (it uses the actual page text to guess the correct encoding). WebKit only has it for Japanese at the moment, although other languages could also benefit from it. I suggest adding examples of sites that need auto-detection to bug 4120.

2) In your test case, the index document explicitly specifies an encoding, while its subframes do not. WebKit used to propagate the encoding from the main frame to subframes in such cases, but we stopped doing so because many sites were broken by this approach. If you have examples of real-life sites that are broken because of this change, please file a new bug; maybe we could find a safer solution.
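The two issues in this comment amount to different fallbacks for a frame that declares no encoding. A minimal Python sketch of that decision follows. It is not WebKit's implementation: the function names, the `inherit_from_parent` switch, the candidate list, and the `latin-1` default are all assumptions for illustration, and trial decoding is a much cruder technique than a real text-based detector.

```python
from typing import Optional, Sequence


def guess_by_trial_decoding(data: bytes, candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate codec that decodes data without errors."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


def choose_frame_encoding(
    declared: Optional[str],
    parent: Optional[str],
    data: bytes,
    *,
    inherit_from_parent: bool,
    candidates: Sequence[str] = (),
    default: str = "latin-1",  # assumed fallback, not WebKit's actual default
) -> str:
    # 1. An encoding declared by the frame itself always wins.
    if declared:
        return declared
    # 2. The older behavior described in the comment: inherit from the main frame.
    if inherit_from_parent and parent:
        return parent
    # 3. Otherwise guess from the page bytes, then fall back to the default.
    return guess_by_trial_decoding(data, candidates) or default


frame_bytes = "한글".encode("euc_kr")
old_behavior = choose_frame_encoding(None, "euc_kr", frame_bytes, inherit_from_parent=True)
new_behavior = choose_frame_encoding(None, "euc_kr", frame_bytes, inherit_from_parent=False)
with_guessing = choose_frame_encoding(
    None, "euc_kr", frame_bytes,
    inherit_from_parent=False, candidates=("utf-8", "euc_kr"),
)
```

With inheritance switched off and no detection candidates, an undeclared frame falls through to the default, which matches the symptom reported for Frames 1 and 2.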
Alexey Proskuryakov
Comment 5
2007-06-30 09:12:42 PDT
Created attachment 15327: test case
This is a test of the cp949 encoding, which Safari tries to use when the Korean (DOS, Windows) encoding is manually selected.
Alexey Proskuryakov
Comment 6
2007-06-30 09:58:06 PDT
The ICU version shipped with Tiger doesn't support the "cp949" encoding. Newer ICU versions do, but there the name refers to a different encoding! See <http://www.icu-project.org/icu-bin/convexp?conv=ibm-949_P110-1999&s=ALL>. Firefox and MSIE do not support cp949 either, and I think it's a Safari bug that it uses this name for what is actually "windows-949". Since Safari is not open source, this needs to be fixed by Apple engineers. I have filed this as <rdar://5304984>. As this is not a WebKit bug, closing as INVALID (please open new bugs for related issues, as discussed above).
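The naming problem can be illustrated outside of ICU and Safari. In Python, the codec named `cp949` is the Windows code page (an extension of EUC-KR), so the same label means something different from what the comment says about ICU. The sketch below only shows Python's behavior; the sample characters and the `decodes_with` helper are assumptions chosen for illustration and say nothing about how ICU or Safari map the name.

```python
def decodes_with(data: bytes, codec: str) -> bool:
    """Report whether data decodes cleanly with the given codec."""
    try:
        data.decode(codec)
    except UnicodeDecodeError:
        return False
    return True


# Text covered by plain EUC-KR is encoded identically by both codecs.
common = "한글"
same_bytes = common.encode("euc_kr") == common.encode("cp949")

# A syllable outside the basic EUC-KR repertoire (assumed example) is encoded
# by Python's cp949 in the extension area of the Windows code page.
extended_bytes = "똠".encode("cp949")

# Check which decoders accept those bytes.
ok_as_cp949 = decodes_with(extended_bytes, "cp949")
ok_as_euc_kr = decodes_with(extended_bytes, "euc_kr")
```

A page that relies on the extension area therefore needs the Windows variant to be selected, which is why a menu item mapped to the wrong converter name matters.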