WebKit Bugzilla
NEW
14636
REGRESSION: broken tags with unpaired quote prevents encode autodetection
https://bugs.webkit.org/show_bug.cgi?id=14636
Reported
2007-07-17 03:00:44 PDT
// derived from bugs#14601. With some broken meta tags like:
> <meta http-equiv="Content-Type" content="text/html; charset="utf-8">
detectJapaneseEncoding() seems not to be called. When a quote (\x22) is not correctly paired, checkForHeadCharset() loses sync on the quotes, consumes the whole content, and ends up returning false (at 'if(ptr == pEnd) return false;', line 588). On almost all websites, a tag and its content do not contain linefeeds. I think it is realistic to bail out successfully when a linefeed occurs while scanning for the closing quote. My experimental code:
-----
while (ptr != pEnd && *ptr != quoteMark) {
    if (*ptr == '\r' || *ptr == '\n') {
        // tag content too long: may have lost sync
        // successfully bail out
        m_checkedForHeadCharset = true;
        return true;
    }
    ++ptr;
}
-----
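To make the proposed behaviour easier to try outside WebKit, here is a minimal standalone sketch of the same quote scan. The names `QuoteScan` and `scanQuotedValue` are hypothetical and do not exist in WebKit; the sketch only assumes that the scan starts just after an opening quote and that hitting a linefeed should be reported separately from running off the end of the buffer.

```cpp
#include <cassert>

// Hypothetical outcome of scanning for a closing quote (not a WebKit type).
enum class QuoteScan { Closed, BailedOnLinefeed, RanOut };

// Hypothetical helper: `ptr` points just past an opening quote.
// Advances `ptr` until the closing quote, a linefeed, or the end of the buffer.
QuoteScan scanQuotedValue(const char*& ptr, const char* pEnd, char quoteMark)
{
    assert(ptr <= pEnd);
    while (ptr != pEnd && *ptr != quoteMark) {
        // A linefeed inside a quoted value suggests the quotes are out of sync,
        // so stop here instead of consuming the rest of the document.
        if (*ptr == '\r' || *ptr == '\n')
            return QuoteScan::BailedOnLinefeed;
        ++ptr;
    }
    return ptr == pEnd ? QuoteScan::RanOut : QuoteScan::Closed;
}
```

In the original code the `BailedOnLinefeed` case would correspond to setting m_checkedForHeadCharset and returning true, so that the caller can go on to encoding autodetection rather than giving up at pEnd.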
Alexey Proskuryakov
Comment 1
2007-07-17 04:00:18 PDT
This is a regression from shipping WebKit, upgrading to P1. See <http://www.whatwg.org/specs/web-apps/current-work/#get-an> - if I'm reading it correctly, we are not supposed to honor such a META. Which might mean that we need to suggest a correction to the HTML5 algorithm. Also, I'm not sure why Firefox works - it's possible that it ignores the META, and auto-detects the encoding based on page text analysis.
David Kilzer (:ddkilzer)
Comment 2
2007-07-17 08:27:30 PDT
*** Bug 14643 has been marked as a duplicate of this bug. ***
David Kilzer (:ddkilzer)
Comment 3
2007-07-17 08:29:01 PDT
<rdar://problem/5340161>