Bug 14500 - need to be more generous about charset declaration with meta tag
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading
Version: 523.x (Safari 3)
Hardware: All All
Importance: P2 Normal
Assignee: Nobody
Reported: 2007-07-02 15:52 PDT by Jungshik Shin
Modified: 2007-12-27 00:38 PST

Yahoo! Mail example (4.91 KB, text/html)
2007-11-07 05:29 PST, David Kilzer (:ddkilzer)

Description Jungshik Shin 2007-07-02 15:52:22 PDT

has a strange structure. Note that the html tag appears twice, and so does the head tag. The charset declaration in the meta tag appears in the second head tag. WebKit does not honor it, while FF and IE do.

<script type="text/javascript"><!--
var ID="100099131";
var AD=0;
var FRAME=0;
// --></script>
<script src="http://j1.ax.xrea.com/l.j?id=100099131" type="text/javascript"></script>
<a href="http://w1.ax.xrea.com/c.f?id=100099131" target="_blank"><img src="http://w1.ax.xrea.com/l.f?id=100099131&url=X" alt="AX" border="0"></a>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="ja" xmlns="http://www.w3.org/1999/xhtml" xml:lang="ja">
<meta http-equiv="Content-Type" content="text/html; charset=EUC-JP" />
Comment 1 Alexey Proskuryakov 2007-07-02 22:27:21 PDT
WebKit currently stops looking for a charset declaration as soon as it reaches the document body (for performance reasons).

See also: bug 12526.
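
To make the behavior described above concrete, here is a minimal Python sketch of a scan that gives up at the first body-level tag. It is an illustration only, not WebKit's actual code: the function name `find_charset_strict`, the `HEAD_TAGS` set, and the regular expressions are all assumptions made for this example.

```python
import re

# Hypothetical set of tags that do not end the scan. This list is an
# assumption for illustration, not WebKit's actual list.
HEAD_TAGS = {"html", "head", "meta", "title", "link", "script", "style", "base"}

# Simplified tag matcher: DOCTYPE and comments are skipped because they do
# not start with a letter. Markup-like text inside scripts or comments is
# not handled specially.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>", re.S)
CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.I)


def find_charset_strict(markup):
    """Return the charset of the first <meta> seen before body content, or None."""
    for match in TAG_RE.finditer(markup):
        name = match.group(2).lower()
        if name == "meta":
            found = CHARSET_RE.search(match.group(3))
            if found:
                return found.group(1)
        elif name not in HEAD_TAGS:
            # First body-level tag (for example <a>, <form>, <input>): give up.
            return None
    return None


early = ('<html><head><meta http-equiv="Content-Type" '
         'content="text/html; charset=EUC-JP"></head><body><p>x</p></body></html>')
late = ('<a href="x">AX</a><html><head><meta http-equiv="Content-Type" '
        'content="text/html; charset=EUC-JP" /></head></html>')

print(find_charset_strict(early))  # declaration is found
print(find_charset_strict(late))   # scan stops at <a>, declaration is missed
```

With a page shaped like the one in the description, where an `<a>` element precedes the `<meta>` tag, a scan of this kind never reaches the declaration.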
Comment 2 Jungshik Shin 2007-07-16 14:38:44 PDT
is a variation on this. Its structure:

<script> very long .... </script><form> ...</form> <script ..></script>
<meta .... charset ...>
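
One possible way to be more generous with a page like this is to search a fixed-size prefix of the raw bytes for a `<meta>` charset, regardless of which tags come first. The Python sketch below is only an illustration of that idea; the function name `find_charset_generous` and the 4096-byte `limit` are hypothetical, and nothing here is taken from WebKit's code.

```python
import re

# Matches a <meta ...> tag that carries a charset value, on raw bytes.
META_RE = re.compile(rb"<meta\b[^>]*?charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.I)


def find_charset_generous(data, limit=4096):
    """Search the first `limit` bytes for a <meta> charset, ignoring earlier tags.

    `limit` is a hypothetical cap that keeps the scan cheap on large pages.
    """
    match = META_RE.search(data[:limit])
    return match.group(1).decode("ascii") if match else None


page = (b'<script>var x = 1;</script><form><input name="q"></form>'
        b'<script src="a.js"></script>\n'
        b'<meta http-equiv="Content-Type" content="text/html; charset=EUC-JP">')

print(find_charset_generous(page))            # late <meta> is still found
print(find_charset_generous(page, limit=20))  # prefix too short to reach it
```

The trade-off is between the cost of scanning further into the document and the chance of missing a late declaration; the cap bounds that cost.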
Comment 3 Jungshik Shin 2007-11-06 13:35:32 PST
Another variation:


It starts with an <input> tag. Later, it has the correct <meta> tag to indicate the charset.

Comment 4 David Kilzer (:ddkilzer) 2007-11-07 05:29:19 PST
Created attachment 17107
Yahoo! Mail example

This (partial) reduction is an example of an HTML-based mail message (about Sandvox) rendering with the wrong charset due to a "late" <meta> tag. It was originally displayed within Yahoo! Mail, although I stripped out almost all of the Y! Mail bits for the reduction.

Note the rendering of the apostrophes in the body of the message, and compare to Opera 9.22 and Firefox
Comment 5 Alexey Proskuryakov 2007-12-27 00:38:05 PST
Fixed in <http://trac.webkit.org/projects/webkit/changeset/28998>.