RESOLVED FIXED 5932
Wrong encoding used for http://www.miel.ru
https://bugs.webkit.org/show_bug.cgi?id=5932
Alexey Proskuryakov
Reported 2005-12-04 06:25:16 PST
This site specifies the encoding "almost" correctly:

$ curl -I http://www.miel.ru/
<...>
Content-Type: text/html; charset=cp1251

(unknown alias for windows-1251)

<html>
<span />
<!-- 0 -->
<head>
<base href="http://www.miel.ru/" />
<title>Недвижимость Москвы и Подмосковья. Агентство недвижимости МИЭЛЬ</title>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">

(has a meta, but khtml::Decoder doesn't see it because of a <span> in the beginning).

Need to figure out which workaround would be more compatible...
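The two observations in this report can be illustrated with a minimal Python sketch. This is not WebKit code: Python's codec registry stands in for an alias table, `find_meta_charset` and `same_encoding` are hypothetical helpers, and the 1024-byte scan limit is an assumption chosen for the example.

```python
import codecs
import re

# Hypothetical helper pattern: find a charset declaration inside a <meta> tag,
# regardless of which tags precede it in the document.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset=["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def find_meta_charset(head: bytes, limit: int = 1024):
    """Return the charset label declared in a <meta> tag, or None."""
    match = META_CHARSET.search(head[:limit])
    return match.group(1).decode("ascii") if match else None


def same_encoding(a: str, b: str) -> bool:
    """True if both labels resolve to the same codec in Python's registry."""
    return codecs.lookup(a).name == codecs.lookup(b).name


# Reduced version of the markup quoted in the report: the <meta> tag
# comes after a stray <span /> and a comment.
PAGE = (
    b'<html> <span /> <!-- 0 --> <head> '
    b'<base href="http://www.miel.ru/" /> '
    b'<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">'
)

if __name__ == "__main__":
    print(find_meta_charset(PAGE))
    print(same_encoding("cp1251", "windows-1251"))
```

A scan that is not tied to document structure still finds the declaration here, and the header label and the meta label name the same encoding once aliases are resolved.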
Attachments
proposed fix (2.16 KB, patch)
2005-12-11 05:55 PST, Alexey Proskuryakov
no flags
proposed fix (2.40 KB, patch)
2005-12-11 06:40 PST, Alexey Proskuryakov
no flags
proposed fix (2.44 KB, patch)
2005-12-11 06:55 PST, Alexey Proskuryakov
darin: review+
Alexey Proskuryakov
Comment 1 2005-12-10 01:34:57 PST
Actually, ICU supports the "cp1251" alias, and it's WebCore that blocks its usage in KWQCFStringEncodingFromIANACharsetName().
Alexey Proskuryakov
Comment 2 2005-12-11 05:55:36 PST
Created attachment 5028 (proposed fix)

If a charset name is not known, try to normalize it using ICU. Admittedly, this is a band-aid fix, and the way to go is probably to get rid of CFStringEncoding-related functions throughout WebKit, so that KWQCFStringEncodingFromIANACharsetName() wouldn't be needed at all.
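The idea of the fallback can be sketched as follows. This is only an illustration under stated assumptions, not the patch itself: `KNOWN_CHARSETS` is a hypothetical stand-in for WebCore's built-in table, Python's `codecs.lookup` plays the role that ICU alias normalization plays in the fix, and `normalize_with_fallback` is an invented name.

```python
import codecs

# Hypothetical stand-in for a small built-in charset table
# (not the real WebCore table).
KNOWN_CHARSETS = {
    "windows-1251": "windows-1251",
    "koi8-r": "koi8-r",
    "utf-8": "utf-8",
}


def normalize_with_fallback(label: str):
    """Return a canonical charset name from the table, or None if unknown."""
    key = label.strip().lower()

    # First attempt: the built-in table. If it succeeds, stop here,
    # so the broader registry is never consulted for known names.
    name = KNOWN_CHARSETS.get(key)
    if name is not None:
        return name

    # Fallback: ask a broader alias registry for the canonical codec name.
    try:
        canonical = codecs.lookup(key).name
    except LookupError:
        return None

    # Map the registry's canonical name back to an entry in the table.
    for known, result in KNOWN_CHARSETS.items():
        if codecs.lookup(known).name == canonical:
            return result
    return None


if __name__ == "__main__":
    print(normalize_with_fallback("cp1251"))
    print(normalize_with_fallback("no-such-charset"))
```

Consulting the small table first keeps the common case cheap; the registry is only asked about labels the table does not recognize.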
Alexey Proskuryakov
Comment 3 2005-12-11 06:40:19 PST
Created attachment 5029 (proposed fix)

Oops, no need to do the lookup again if the first attempt was successful.
Alexey Proskuryakov
Comment 4 2005-12-11 06:55:04 PST
Created attachment 5030 (proposed fix)

Fixed paths for non-existing files (see bug 5846).
Darin Adler
Comment 5 2005-12-11 16:48:52 PST
Comment on attachment 5030 (proposed fix)

If we're going to use the ICU aliases, then I would like to see all the redundant entries in our encoding table removed.
Darin Adler
Comment 6 2005-12-11 17:06:54 PST
Comment on attachment 5030 (proposed fix)

Seems fine to make this change. Would have liked to have a comment explaining why the code is doing what it's doing.
Alexey Proskuryakov
Comment 7 2005-12-11 22:16:13 PST
Filed bug 6046 about getting rid of CFStringEncoding and tables cleanup.