WebKit Bugzilla
RESOLVED CONFIGURATION CHANGED
Bug 4120
Servers that need encoding sniffing to be rendered properly
https://bugs.webkit.org/show_bug.cgi?id=4120
Alexey Proskuryakov
Reported
2005-07-24 02:00:32 PDT
This server (running Microsoft-IIS/5.0) auto-guesses the encoding and sends Mac Cyrillic to Safari. For whatever reason, the charset it sends is quite broken: "mac" is ambiguous and thus unsupported by WebKit. Still, it should be possible to disambiguate "mac" by using the Mac encoding of the system's primary language.

% curl -I --header "User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; ru-ru) AppleWebKit/312.1 (KHTML, like Gecko) Safari/312" http://www.museum.ru/museum/Ostankino/5.htm
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Sun, 24 Jul 2005 08:58:33 GMT
Accept-Ranges: bytes
Last-Modified: Wed, 15 Jun 2005 12:58:15 GMT
ETag: "ed49fce4a971c51:804"
Content-Length: 7318
Set-Cookie: charset=mac; path=/; expires=Mon, 10 May 2032 23:12:40 GMT
Content-Type: text/html; charset=mac
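
The disambiguation idea in the report can be sketched roughly as follows. This is only an illustration in Python, not WebKit code: the function name `resolve_mac_charset`, the language table, and the fallback to MacRoman are all assumptions made for the example, and the codec names are Python's own.

```python
# Hypothetical table: primary-language code -> Python codec for that Mac encoding.
# The choice of languages here is an assumption for illustration only.
MAC_ENCODING_BY_LANGUAGE = {
    "ru": "mac_cyrillic",
    "el": "mac_greek",
    "tr": "mac_turkish",
    "is": "mac_iceland",
}

# Labels treated as ambiguous in this sketch.
AMBIGUOUS_MAC_LABELS = {"mac", "macintosh"}


def resolve_mac_charset(label, primary_language):
    """Map an ambiguous "mac" label to a concrete codec using the UI language.

    Any other label is returned unchanged (lowercased). Unknown languages
    fall back to MacRoman.
    """
    normalized = label.strip().lower()
    if normalized not in AMBIGUOUS_MAC_LABELS:
        return normalized
    # "ru-ru" -> "ru": only the primary language subtag is used.
    language = primary_language.split("-")[0].lower()
    return MAC_ENCODING_BY_LANGUAGE.get(language, "mac_roman")


if __name__ == "__main__":
    codec = resolve_mac_charset("mac", "ru-ru")
    # The same three bytes read as Cyrillic or as accented Latin letters,
    # depending on which Mac encoding is chosen.
    print(codec, b"\x8c\x88\x90".decode(codec))
    print("mac_roman", b"\x8c\x88\x90".decode("mac_roman"))
```

With the "ru-ru" language from the User-Agent above, the label "mac" would resolve to Mac Cyrillic; with any language missing from the table it stays MacRoman.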
Alexey Proskuryakov
Comment 1
2005-07-24 13:36:42 PDT
Oops, in fact "mac" and "macintosh" charsets are defined in RFC 1345 (as MacRoman), and WebKit explicitly supports them. So, the implementation is correct, and probably shouldn't be changed. However, this example may need to be considered in a future encoding sniffer - museum.ru is a rather important server.
Alexey Proskuryakov
Comment 2
2005-10-04 09:58:50 PDT
I propose to use this bug to track servers whose encoding cannot be determined via HTTP or HTML headers, so content sniffing is required. Two more:
http://stats.distributed.net/team/tmsummary.php?project_id=8&team=11269
http://www.mdf.ru
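
To make "content sniffing is required" concrete, here is a minimal sketch of a sniffer in Python. It is not WebKit's algorithm: the function names, the candidate list, and the scoring rule (count lowercase Russian letters after decoding) are assumptions chosen for the example, and the charset search in the body is deliberately loose.

```python
import re

# Loose match for "charset=..." in an HTTP Content-Type value or an HTML <meta>.
CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)

# Hypothetical candidate list for Russian-language pages (Python codec names).
LEGACY_CANDIDATES = ("cp1251", "koi8_r", "mac_cyrillic")


def declared_charset(content_type, body):
    """Return the charset declared via HTTP or HTML, or None if there is none."""
    if content_type:
        match = CHARSET_RE.search(content_type.encode("ascii", "ignore"))
        if match:
            return match.group(1).decode("ascii").lower()
    match = CHARSET_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return None


def lowercase_cyrillic_count(text):
    """Count characters in the basic lowercase Russian range (U+0430..U+044F)."""
    return sum(1 for ch in text if "\u0430" <= ch <= "\u044f")


def sniff_encoding(body, content_type=None, candidates=LEGACY_CANDIDATES):
    """Pick an encoding: declared charset, then strict UTF-8, then a heuristic."""
    declared = declared_charset(content_type, body)
    if declared:
        return declared
    try:
        body.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Running text is mostly lowercase, so the decoding that yields the most
    # lowercase Cyrillic letters wins; ties go to the earlier candidate.
    return max(
        candidates,
        key=lambda enc: lowercase_cyrillic_count(body.decode(enc, errors="replace")),
    )


if __name__ == "__main__":
    sample = "\u043f\u0440\u0438\u0432\u0435\u0442 \u043c\u0438\u0440".encode("koi8_r")
    print(sniff_encoding(sample))
```

Note that a declared charset is trusted as-is here, so a response labelled charset=mac would still be reported as "mac"; handling mislabelled servers would need the heuristic to override the declaration.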
Alexey Proskuryakov
Comment 3
2006-01-07 01:08:12 PST
http://www.zoo.ru (also sends charset=mac instead of x-mac-cyrillic).
Alexey Proskuryakov
Comment 4
2008-02-18 02:44:11 PST
Bug 17405: http://tianya.cn - no charset information; encoded as Simplified Chinese.
Karl Dubost
Comment 5
2024-01-22 20:56:33 PST
From the sites in this bug:
PASS http://www.museum.ru/museum/Ostankino/5.htm
PASS https://stats.distributed.net/team/tmsummary.php?project_id=8&team=11269
PASS http://www.mdf.ru after redirect to https://www.mamm-mdf.ru
ERR http://www.zoo.ru Domain is for sale.
ERR http://tianya.cn Domain not available anymore.

Let's close this bug, as Bug 245305 is about addressing the requirements of Content Sniffing.