Summary: Incorrect charset at http://star.vnet.cn
Product: WebKit
Reporter: Jungshik Shin <jshin>
Severity: Normal
CC: ap, darin, oliver, webkit
Version: 523.x (Safari 3)
Description Jungshik Shin 2007-07-19 22:46:53 PDT
The left-hand side of the above page is decoded using the default charset instead of gb2312. The right-hand side is correctly decoded as gb2312. The left-hand side is loaded into an iframe in the page. The document loaded into the iframe, http://star.vnet.cn/Comments/three.asp?ChannelID=116&ClassID=170, does not specify a charset anywhere (neither in an HTML meta element nor in an HTTP header). Its parent document, on the other hand, has a meta charset declaration. In this scenario, inheriting the charset from the parent usually helps.
Comment 1 Alexey Proskuryakov 2007-07-20 00:03:05 PDT
We do not inherit the charset from the parent frame, which makes Google Images (and Google Cache) work better; see bug 6118. This also fixes sites that use User-Agent sniffing to choose an encoding (e.g. the main frame's server may send x-mac-cyrillic to Safari, while a subframe may rely on the browser default being windows-1251, as that's the Windows Cyrillic encoding). We also match Firefox here. So, although it's a regression from shipping Safari/WebKit for this site, I think it's an evangelism issue.
Comment 2 Alexey Proskuryakov 2007-07-20 03:17:08 PDT
One way to fix this site without breaking the cases I mentioned above would be to inherit charsets specified in META, but not in an actual HTTP header. But it's not clear whether such a change would fix more sites than it would break.
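The rule proposed in this comment could be sketched roughly as follows. This is only an illustration of the idea, not WebKit code: the `Document` class, the `pick_charset` function, and the `windows-1252` default are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical browser default; the real default depends on locale and settings.
BROWSER_DEFAULT = "windows-1252"


@dataclass
class Document:
    """Hypothetical stand-in for a loaded document and its charset declarations."""
    http_charset: Optional[str] = None   # charset from the HTTP Content-Type header
    meta_charset: Optional[str] = None   # charset from an HTML META declaration
    parent: Optional["Document"] = None  # parent frame's document, if any


def pick_charset(doc: Document, default: str = BROWSER_DEFAULT) -> str:
    # A document's own declarations always win.
    if doc.http_charset:
        return doc.http_charset
    if doc.meta_charset:
        return doc.meta_charset
    # Inherit from the parent only when the parent's charset came from META,
    # not from an HTTP header (which may be the result of User-Agent sniffing).
    parent = doc.parent
    if parent is not None and parent.meta_charset and not parent.http_charset:
        return parent.meta_charset
    return default
```

Under this sketch, the iframe on the reported page would pick up gb2312 from its parent's META declaration, while a subframe whose parent was served with an HTTP-level x-mac-cyrillic charset would still fall back to the browser default.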
Comment 3 Darin Adler 2007-07-20 10:17:15 PDT
Matching other browsers is more important than matching older versions of Safari, so I think Alexey's right and we want the behavior we have currently.
Comment 4 Alexey Proskuryakov 2007-07-21 13:11:59 PDT
Moving to evangelism component.
Comment 5 Robert Blaut 2008-10-12 23:21:57 PDT
The URL is dead.
Comment 6 Alexey Proskuryakov 2008-10-14 03:55:50 PDT
Closing now. Note that we have since changed the behavior to inherit the charset as long as security considerations allow (other browsers have changed in the meantime, too).
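The comment does not spell out what the security considerations are. One plausible reading, assumed here purely for illustration, is that a subframe may inherit the parent's charset only when both documents share the same origin. The sketch below shows such a check; `origin` and `may_inherit_charset` are hypothetical helpers, not WebKit functions.

```python
from urllib.parse import urlsplit

# Default ports used to normalize origins when the URL omits the port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def may_inherit_charset(frame_url: str, parent_url: str) -> bool:
    # Assumption: inheritance is allowed only for same-origin frames.
    return origin(frame_url) == origin(parent_url)
```

With this assumption, the iframe from the original report (same host as its parent) would be allowed to inherit gb2312, whereas a frame loaded from a different host would keep using the browser default.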