// derived from bugs#14601

With some broken meta tags like:
> <meta http-equiv="Content-Type" content="text/html; charset="utf-8">
detectJapaneseEncoding() seems not to be called.

For an incorrectly paired \x22, checkForHeadCharset() loses sync on the quotes and consumes the whole content, ending with return false (at the 'if(ptr == pEnd) return false;' line 588).

On almost all websites, tag content does not contain linefeeds. I think that bailing out successfully from the quote-pair scan when a linefeed occurs matches reality.

My experimental code:
-----
while (ptr != pEnd && *ptr != quoteMark) {
    if (*ptr == '\r' || *ptr == '\n') {
        // too long tag content: may have lost sync
        // successfully bail out
        m_checkedForHeadCharset = true;
        return true;
    }
    ++ptr;
}
-----
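For anyone who wants to try the idea outside the tokenizer, here is a minimal standalone sketch of the same bail-out. It is not the WebKit code: the names QuoteScan and scanQuotedValue are hypothetical, and it works on a std::string and an index rather than on the ptr/pEnd pair used in checkForHeadCharset().

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for illustration; not part of WebKit.
enum class QuoteScan { Closed, HitLineBreak, RanOut };

// Scans forward from `pos` (assumed to be just past an opening quote)
// looking for the matching `quoteMark`. Stops early at CR or LF, on the
// assumption that a quoted attribute value does not span lines.
// On return, `pos` indexes the closing quote, the line break,
// or text.size() if neither was found.
QuoteScan scanQuotedValue(const std::string& text, std::size_t& pos, char quoteMark)
{
    while (pos < text.size() && text[pos] != quoteMark) {
        if (text[pos] == '\r' || text[pos] == '\n')
            return QuoteScan::HitLineBreak; // probably lost sync: bail out
        ++pos;
    }
    return pos < text.size() ? QuoteScan::Closed : QuoteScan::RanOut;
}
```

With this shape the caller can tell "lost sync" apart from "ran out of buffered data" and decide separately whether to mark the head charset check as done.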
This is a regression from shipping WebKit, upgrading to P1. See <http://www.whatwg.org/specs/web-apps/current-work/#get-an> - if I'm reading it correctly, we are not supposed to honor such a META. Which might mean that we need to suggest a correction to the HTML5 algorithm. Also, I'm not sure why Firefox works - it's possible that it ignores the META, and auto-detects the encoding based on page text analysis.
*** Bug 14643 has been marked as a duplicate of this bug. ***
<rdar://problem/5340161>