Bug 19216

Summary: Character set detection seems to happen too late
Product: WebKit
Reporter: David Carson <dacarson>
Component: Text
Assignee: Nobody <webkit-unassigned>
Status: NEW
Severity: Normal
CC: ap
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Mac   
OS: OS X 10.5   
URL: http://www.zoo.gov.tw/table.shtml

Description David Carson 2008-05-23 08:47:06 PDT
When I load this site, the text is garbled. When I load it in Firefox, it starts out garbled, but then the page switches over to the correct character set.
Comment 1 Alexey Proskuryakov 2009-01-11 14:54:22 PST
This document has a broken structure (part of the body content comes before the head). In such cases, WebKit only checks the first kilobyte for a meta charset declaration, whereas Firefox will restart decoding if it sees the meta anywhere in the document.
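
To illustrate the difference, here is a minimal Python sketch. It is not WebKit or Firefox code: the regular expression is a simplified stand-in for a real byte-level prescan, and `find_meta_charset`, `PRESCAN_LIMIT`, `late_meta_page`, and the `big5` value are hypothetical names and data chosen for the example.

```python
import re

# Assumption: "first kilobyte" is modelled as the first 1024 bytes.
PRESCAN_LIMIT = 1024

# Simplified pattern for a charset declared in a <meta> tag.
# A real browser prescan is considerably more careful than this.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)

def find_meta_charset(data, limit=None):
    """Return the declared charset found in data[:limit], or None.

    limit=None scans the whole document (the "any place" behavior);
    limit=PRESCAN_LIMIT scans only the first kilobyte.
    """
    window = data if limit is None else data[:limit]
    match = META_CHARSET.search(window)
    return match.group(1).decode("ascii").lower() if match else None

# Hypothetical page with body content ahead of the head, so the meta
# declaration sits well past the first kilobyte.
late_meta_page = (
    b"<p>" + b"x" * 2000 + b"</p>"
    + b'<head><meta http-equiv="Content-Type" '
    + b'content="text/html; charset=big5"></head>'
)

print(find_meta_charset(late_meta_page, PRESCAN_LIMIT))  # limited prescan
print(find_meta_charset(late_meta_page))                 # whole-document scan
```

With the declaration pushed past the limit, the limited prescan finds nothing, while the whole-document scan does find it.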

HTML5 specifies the Firefox behavior. Despite this, I have very strong doubts about it, because parsing can have side effects (such as script execution) that re-parsing would repeat.
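
The concern about repeated side effects can be shown with a toy Python model. This is only a sketch under stated assumptions: `parse`, the `TOKEN` pattern, and the `executed` list are hypothetical, "running" a script is modelled as appending its source to a list, and no real browser parser works this way.

```python
import re

# Toy tokenizer: recognizes only <script>...</script> and <meta charset=...>.
TOKEN = re.compile(r"<script>(.*?)</script>|<meta charset=([\w-]+)>", re.S)

def parse(doc, encoding, executed, allow_restart=True):
    """Parse doc with the given encoding, recording each script "run".

    If a meta charset that differs from the current encoding is seen,
    parsing restarts once from the beginning with the new encoding.
    Returns the encoding that was finally used.
    """
    text = doc.decode(encoding, errors="replace")
    for match in TOKEN.finditer(text):
        script, charset = match.group(1), match.group(2)
        if script is not None:
            executed.append(script)  # stand-in for executing the script
        elif charset.lower() != encoding and allow_restart:
            # Late meta: throw away progress and re-parse from the start.
            return parse(doc, charset.lower(), executed, allow_restart=False)
    return encoding

doc = b"<script>hit()</script><p>body text</p><meta charset=utf-8>"
executed = []
final_encoding = parse(doc, "latin-1", executed)
print(final_encoding, executed)
```

In this model the script that precedes the late meta is recorded twice, once for each pass, which is the kind of repeated side effect described above.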