Bug 19216
| Summary: | Character set detection seems to happen too late | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | David Carson <dacarson> |
| Component: | Text | Assignee: | Nobody <webkit-unassigned> |
| Status: | NEW | ||
| Severity: | Normal | CC: | ap |
| Priority: | P2 | ||
| Version: | 528+ (Nightly build) | ||
| Hardware: | Mac | ||
| OS: | OS X 10.5 | ||
| URL: | http://www.zoo.gov.tw/table.shtml | ||
David Carson
When I load this site in WebKit, the text is garbled. When I load it in Firefox, it starts out garbled, but the page then switches over to the correct character set.
Alexey Proskuryakov
This document has a broken structure (part of the body content comes before the head). In such cases, WebKit only checks the first kilobyte for a meta charset declaration, while Firefox will restart decoding if it sees the meta anywhere in the document.
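To illustrate the difference, here is a minimal sketch of a size-limited prescan versus a whole-document scan. It is not WebKit's or Firefox's actual code: `sniff_charset`, `build_page`, the simplified regex, and the `big5` label are all assumptions made for the example, and real engines handle many more cases (quoting, `http-equiv`, comments, BOMs).

```python
import re

# Hypothetical, simplified pattern: finds "charset=NAME" inside a <meta ...> tag.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)

def sniff_charset(data, limit=1024):
    """Return the charset named by the first <meta> within `limit` bytes.

    limit=None scans the whole buffer. Returns None if nothing is found.
    """
    window = data if limit is None else data[:limit]
    match = META_CHARSET.search(window)
    return match.group(1).decode("ascii").lower() if match else None

def build_page(padding):
    """Build a malformed page where body content precedes the head."""
    body_first = b"<p>" + b"x" * padding + b"</p>"
    head = (b'<head><meta http-equiv="Content-Type" '
            b'content="text/html; charset=big5"></head>')
    return body_first + head

early = build_page(10)     # meta falls inside the first kilobyte
late = build_page(2000)    # meta is pushed past the first kilobyte

print(sniff_charset(early))             # limited prescan finds it
print(sniff_charset(late))              # limited prescan misses it
print(sniff_charset(late, limit=None))  # full scan finds it
```

With the declaration pushed past the 1024-byte window, the limited prescan reports nothing and the decoder would stay on its default encoding, which matches the garbled text described in the report.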
HTML5 specifies the Firefox behavior. Despite this, I have very strong doubts about it, because parsing can have side effects (such as script execution), which re-parsing would repeat.
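The concern about repeated side effects can be shown with a toy model. This is only a sketch under stated assumptions, not how any engine is implemented: the token format (`script:NAME`, `meta:ENC`, `text:...`), the `parse` and `load` functions, and the `latin-1` default are all hypothetical.

```python
def parse(tokens, encoding, log):
    """Toy parser pass.

    'script:NAME' tokens have a visible side effect (an entry in `log`).
    'meta:ENC' declares an encoding; if it differs from the one in use,
    the pass stops and returns ENC so the caller can restart.
    """
    for token in tokens:
        kind, _, value = token.partition(":")
        if kind == "script":
            log.append(f"ran {value} as {encoding}")
        elif kind == "meta" and value != encoding:
            return value  # late meta: ask the caller to re-parse
    return None

def load(tokens, default_encoding="latin-1"):
    """Parse, restarting from the beginning whenever a late meta is seen."""
    log = []
    encoding = default_encoding
    while True:
        requested = parse(tokens, encoding, log)
        if requested is None:
            return encoding, log
        encoding = requested

# A script appears before the (late) meta declaration.
page = ["text:a", "script:counter", "meta:big5", "text:b"]
encoding, log = load(page)
print(encoding)
print(log)
```

In this model the script ahead of the late meta runs once under the default encoding and again after the restart, so any side effect it has happens twice.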