Bug 16621
| Field | Value |
|---|---|
| Summary | WebKit ignores encoding description in invalid HTML if it's too far from the start |
| Product | WebKit |
| Component | Page Loading |
| Status | NEW |
| Severity | Normal |
| Priority | P2 |
| Version | 528+ (Nightly build) |
| Hardware | Mac |
| OS | OS X 10.4 |
| Reporter | Alexey Proskuryakov <ap> |
| Assignee | Nobody <webkit-unassigned> |
| CC | ahmad.saleem792, darin, ddkilzer, ian, jshin, mrowe |
Alexey Proskuryakov
From bug 12526 comment 3.
Our heuristic for <meta> charset declarations differs from what Firefox does and from what is documented in HTML5. Namely, we neither check for <meta> during normal parsing nor restart parsing if the charset changes late in the document. We only pre-parse the first 512 bytes of the document, or the whole <head>, whichever is larger. This is usually enough, but we know of pages that aren't decoded correctly because of this difference.
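For illustration only, here is a minimal Python sketch of the pre-scan heuristic as described in this comment. It is not WebKit's actual code: the names `prescan_charset`, `PRESCAN_MIN_BYTES`, `META_CHARSET` and `HEAD_END` are hypothetical, and the regular expressions are a deliberately simplified stand-in for real <meta> parsing.

```python
import re

# The 512-byte figure comes from the comment above; everything else is assumed.
PRESCAN_MIN_BYTES = 512

# Simplified, hypothetical pattern: finds "charset=..." inside a <meta ...> tag.
META_CHARSET = re.compile(
    rb"<meta\b[^>]*?charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)
HEAD_END = re.compile(rb"</head\s*>", re.IGNORECASE)


def prescan_charset(data: bytes):
    """Return the charset declared by a <meta> inside the pre-scan window, or None."""
    head_end = HEAD_END.search(data)
    # Window: the first 512 bytes or the whole <head>, whichever is larger.
    window_len = max(PRESCAN_MIN_BYTES, head_end.end() if head_end else 0)
    match = META_CHARSET.search(data[:window_len])
    return match.group(1).decode("ascii").lower() if match else None


# A declaration near the start is found.
early = b'<html><head><meta charset="gb2312"></head><body></body></html>'

# No </head>, and a ~10 kB script pushes the declaration past the 512-byte window.
late = (
    b"<script>" + b"x" * 10000 + b"</script>"
    b'<meta http-equiv="Content-Type" content="text/html; charset=gb2312">'
)

print(prescan_charset(early))
print(prescan_charset(late))
```

Under these assumptions, the second document's declaration is never seen, because nothing looks at <meta> again once the window has been scanned.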
The following two pages have a very long script (~10 kB) at the beginning, so the charset declaration in <meta> is not honored:
http://db66.vnet.cn/
http://www.ddm.com/event/event84.asp?code=-548
Restarting parsing at any point is a big can of worms though - e.g., some scripts with side effects may run twice because of that.
Mark Rowe (bdash)
Is the handling of scripts when reparsing discussed in the HTML5 specification? Is that something which should be documented in the spec?
Alexey Proskuryakov
See <http://www.whatwg.org/specs/web-apps/current-work/#change>.
Ian 'Hixie' Hickson
(basically, HTML5 requires that the scripts run twice.)
Alexey Proskuryakov
*** Bug 275017 has been marked as a duplicate of this bug. ***