Bug 19216

Summary:	Character set detection seems to happen too late
Product:	WebKit	Reporter:	David Carson <dacarson>
Component:	Text	Assignee:	Nobody <webkit-unassigned>
Status:	NEW ---
Severity:	Normal	CC:	ap
Priority:	P2
Version:	528+ (Nightly build)
Hardware:	Mac
OS:	OS X 10.5
URL:	http://www.zoo.gov.tw/table.shtml

Description David Carson 2008-05-23 08:47:06 PDT

When I load this site, the text appears garbled. When I load it in Firefox, the page starts out garbled but then switches to the correct character set.

Comment 1 Alexey Proskuryakov 2009-01-11 14:54:22 PST

This document has a broken structure (part of the body content comes before the head). In such cases, WebKit only checks the first kilobyte for a meta charset declaration, while Firefox will restart decoding if it sees the meta anywhere in the document.
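
To illustrate the difference described above, here is a rough Python sketch. It is not WebKit's or Firefox's actual code: the function names, the regular expression, and the sample markup are all hypothetical, and the "big5" label is only an assumed example encoding. It shows how a scan limited to the first 1024 bytes misses a meta declaration that body content has pushed further down, while a scan of the whole document finds it.

```python
import re

# Hypothetical pattern (an assumption for illustration): a charset value
# declared somewhere inside a <meta ...> tag.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def prescan_charset(data: bytes, limit: int = 1024):
    """Return the charset declared in the first `limit` bytes, or None."""
    match = META_CHARSET.search(data[:limit])
    return match.group(1).decode("ascii").lower() if match else None


def scan_whole_document(data: bytes):
    """Return the charset declared anywhere in the document, or None."""
    return prescan_charset(data, limit=len(data))


# Body content precedes the head, so the meta lands past the first kilobyte.
late_meta = (
    b"<p>" + b"x" * 2000 + b"</p>"
    + b'<meta http-equiv="Content-Type" content="text/html; charset=big5">'
)
# A well-formed page declares its charset near the top.
early_meta = b'<meta charset="utf-8"><p>hello</p>'

print(prescan_charset(late_meta))      # limited scan misses the declaration
print(scan_whole_document(late_meta))  # full scan finds it
print(prescan_charset(early_meta))
```

With the limited scan, the decoder would have to fall back to some default encoding for the first page, which would be consistent with the garbled text reported here.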

HTML5 specifies the Firefox behavior, but despite this I have very strong doubts about it, because parsing can have side effects (such as script execution) that re-parsing would repeat.
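
The concern about repeated side effects can be made concrete with a toy model. This is only a sketch under stated assumptions: the token list, the `parse` function, and the restart rule are hypothetical and do not reflect how any real browser parser is implemented. The model runs each script as soon as it is encountered and, when enabled, restarts from the beginning the first time it meets a meta token.

```python
def parse(tokens, restart_on_late_meta):
    """Toy parser (hypothetical): returns the scripts it executed, in order."""
    executed = []
    restarted = False
    position = 0
    while position < len(tokens):
        kind, value = tokens[position]
        if kind == "script":
            # Side effect: the script runs as soon as the parser reaches it.
            executed.append(value)
        elif kind == "meta" and restart_on_late_meta and not restarted:
            # A late charset declaration triggers one restart from the top.
            restarted = True
            position = 0
            continue
        position += 1
    return executed


# Assumed document shape: a script appears before the late meta declaration.
tokens = [
    ("text", "body content"),
    ("script", "trackVisit()"),
    ("meta", "big5"),
    ("text", "more content"),
]

print(parse(tokens, restart_on_late_meta=False))
print(parse(tokens, restart_on_late_meta=True))
```

In this model, the script that precedes the late meta runs once without a restart and twice with one, which is the kind of duplication the comment above is worried about.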