NEW 16621
WebKit ignores encoding description in invalid HTML if it's too far from the start
https://bugs.webkit.org/show_bug.cgi?id=16621
Alexey Proskuryakov
Reported 2007-12-27 00:36:02 PST
From bug 12526 comment 3.

Our heuristic for <meta> charset declarations differs from what Firefox does and from what is documented in HTML5. Namely, we do not check for <meta> during normal parsing, so we never restart parsing when the charset changes late in the game. We only pre-parse the first 512 bytes of the document, or the whole <head>, whichever is larger.

This is usually enough, but we know of pages that aren't decoded correctly because of this difference. The following two pages have a very long script (~10 kB) at the beginning, and the charset declaration in <meta> is not honored.

http://db66.vnet.cn/
http://www.ddm.com/event/event84.asp?code=-548

Restarting parsing at any point is a big can of worms though - e.g., some scripts with side effects may run twice because of that.
Mark Rowe (bdash)
Comment 1 2007-12-27 01:58:44 PST
Is the handling of scripts when reparsing discussed in the HTML5 specification? Is that something which should be documented in the spec?
Alexey Proskuryakov
Comment 2 2007-12-27 02:28:56 PST
Ian 'Hixie' Hickson
Comment 3 2008-01-08 18:37:00 PST
(basically, HTML5 requires that the scripts run twice.)
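
A toy model may make the consequence concrete. It is only a sketch under stated assumptions: the token list, the parse and load functions, and the default encoding are hypothetical, and it implements neither the HTML5 algorithm nor WebKit's parser. It only shows why a restart triggered by a late <meta> re-runs scripts that were already executed.

```python
def parse(tokens, encoding, log):
    """Toy parser pass (hypothetical).

    Runs scripts as they are encountered. Returns a new encoding if a
    <meta> declaration disagrees with the current one, else None.
    """
    for kind, value in tokens:
        if kind == "script":
            log.append(value)  # stands in for a script's side effect
        elif kind == "meta" and value != encoding:
            return value  # signal: restart with this encoding
    return None


def load(tokens, default_encoding="windows-1252"):
    """Parse, restarting from the beginning whenever the encoding changes."""
    log = []
    encoding = default_encoding
    while True:
        new_encoding = parse(tokens, encoding, log)
        if new_encoding is None:
            return encoding, log
        encoding = new_encoding


# A script that precedes a late charset declaration.
page = [("script", "counter += 1"), ("meta", "gb2312")]
final_encoding, executed = load(page)
```

Here the script is recorded once during the first pass and again after the restart, so executed holds two entries even though the page contains a single script.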
Alexey Proskuryakov
Comment 4 2024-06-01 17:10:52 PDT
*** Bug 275017 has been marked as a duplicate of this bug. ***