Bug 16621
| Summary: | WebKit ignores encoding description in invalid HTML if it's too far from the start | | |
|---|---|---|---|
| Product: | WebKit | Reporter: | Alexey Proskuryakov <ap> |
| Component: | Page Loading | Assignee: | Nobody <webkit-unassigned> |
| Status: | NEW | | |
| Severity: | Normal | CC: | ahmad.saleem792, darin, ddkilzer, ian, jshin, mrowe |
| Priority: | P2 | | |
| Version: | 528+ (Nightly build) | | |
| Hardware: | Mac | | |
| OS: | OS X 10.4 | | |
Alexey Proskuryakov
From bug 12526 comment 3.
Our heuristic for <meta> charset declarations differs from what Firefox does and from what is documented in HTML5. Namely, we do not check for <meta> during normal parsing, and we do not restart parsing if the charset changes late in the document. We only pre-parse the first 512 bytes of the document, or the whole <head>, whichever is larger. This is usually enough, but we know of pages that aren't decoded correctly because of this difference.
The following two pages have a very long script (~10 kB) at the beginning, and the charset declaration in <meta> is not honored.
http://db66.vnet.cn/
http://www.ddm.com/event/event84.asp?code=-548
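For illustration only, here is a rough Python sketch of the pre-parse heuristic described above. It is not WebKit's code: the function name `prescan_charset`, the regular expressions, and the use of a literal `</head>` to find the end of the head are all simplifying assumptions. Only the 512-byte figure and the "whichever is larger" rule come from this report.

```python
import re

PRESCAN_MIN_BYTES = 512  # window size stated in this bug report

# Hypothetical, simplified patterns; real <meta> sniffing handles many more forms.
META_CHARSET_RE = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE
)
HEAD_END_RE = re.compile(rb"</head\s*>", re.IGNORECASE)


def prescan_charset(data: bytes):
    """Return the charset declared inside the prescan window, or None."""
    # Assumption: the head ends at a literal </head>; invalid pages may have none.
    head_end = HEAD_END_RE.search(data)
    head_len = head_end.end() if head_end else 0
    # Scan the first 512 bytes or the whole <head>, whichever is larger.
    window = data[: max(PRESCAN_MIN_BYTES, head_len)]
    match = META_CHARSET_RE.search(window)
    return match.group(1).decode("ascii").lower() if match else None


# A page with a ~10 kB script ahead of the declaration and no <head> wrapper.
late_meta = (
    b"<script>" + b"x" * 10000 + b"</script>"
    b'<meta http-equiv="Content-Type" content="text/html; charset=gb2312">'
)
# The same content, but wrapped in a well-formed <head>.
meta_in_head = b"<html><head>" + late_meta + b"</head><body></body></html>"
```

Under these assumptions, `prescan_charset(late_meta)` finds nothing because the declaration sits past the 512-byte window, while `prescan_charset(meta_in_head)` still sees it because the window grows to cover the whole head.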
Restarting parsing at any point is a big can of worms though - e.g., some scripts with side effects may run twice because of that.
Mark Rowe (bdash)
Is the handling of scripts when reparsing discussed in the HTML5 specification? Is that something which should be documented in the spec?
    
Alexey Proskuryakov
See <http://www.whatwg.org/specs/web-apps/current-work/#change>.
    
Ian 'Hixie' Hickson
(basically, HTML5 requires that the scripts run twice.)
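To make the "scripts run twice" concern concrete, here is a toy Python model of restarting a parse when a late <meta> disagrees with the assumed charset. It is only a sketch, not the HTML5 algorithm or any browser's implementation: the token tuples, the `parse` and `load` helpers, and the `windows-1252` default are hypothetical choices for illustration.

```python
def parse(tokens, assumed_charset, log):
    """Walk tokens in order; return a new charset if a <meta> disagrees, else None."""
    for kind, value in tokens:
        if kind == "script":
            log.append(value)  # stands in for a script's side effect
        elif kind == "meta" and value != assumed_charset:
            return value  # signal that parsing must restart with this charset
    return None


def load(tokens, default_charset="windows-1252"):
    """Parse once with a default charset and restart at most once on a mismatch."""
    log = []
    charset = default_charset
    new_charset = parse(tokens, charset, log)
    if new_charset is not None:
        charset = new_charset
        parse(tokens, charset, log)  # restart from the beginning of the document
    return charset, log


# Hypothetical document: a script, then a late charset declaration, then another script.
page = [("script", "counter += 1"), ("meta", "gb2312"), ("script", "init()")]
final_charset, executed = load(page)
```

In this model the script ahead of the late declaration is recorded twice in `executed`, while the script after it is recorded once, which is the side-effect hazard raised in the original report.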
    
Alexey Proskuryakov
*** Bug 275017 has been marked as a duplicate of this bug. ***