Bug 5140 - CachedObject loading ignores charset from HTTP headers
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering
Version: 420+
Hardware: Mac OS X 10.4
Importance: P2 Normal
Assignee: Dave Hyatt
URL: http://www.rambler.ru/db/news/msg.htm...
Duplicates: 4879
Depends on:
Reported: 2005-09-26 11:44 PDT by Alexey Proskuryakov
Modified: 2005-11-21 10:49 PST

See Also:

test document (363 bytes, text/html)
2005-09-26 11:45 PDT, Alexey Proskuryakov
no flags
proposed patch (5.03 KB, patch)
2005-09-26 11:46 PDT, Alexey Proskuryakov
mjs: review-
proposed patch (5.16 KB, patch)
2005-09-28 10:09 PDT, Alexey Proskuryakov
mjs: review+

Description Alexey Proskuryakov 2005-09-26 11:44:29 PDT
khtml::Loader doesn't use a charset specified in Content-Type HTTP header.

See the attached test case (accesses an external server)
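
For illustration only, here is a minimal sketch of pulling the charset parameter out of a Content-Type header value. The function name charsetFromContentType is hypothetical and is not taken from khtml::Loader or from the attached patch; the sketch only shows the kind of value the loader would need to pass along.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of WebKit: returns the value of the charset
// parameter in a Content-Type header value, or an empty string if absent.
std::string charsetFromContentType(const std::string& contentType)
{
    // Parameter names are case-insensitive, so search a lowercased copy.
    std::string lower(contentType);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    std::size_t pos = 0;
    while ((pos = lower.find(';', pos)) != std::string::npos) {
        ++pos;
        // Skip whitespace after the parameter separator.
        while (pos < lower.size() && std::isspace(static_cast<unsigned char>(lower[pos])))
            ++pos;
        if (lower.compare(pos, 8, "charset=") != 0)
            continue;
        pos += 8;

        std::size_t end;
        if (pos < lower.size() && lower[pos] == '"') {
            // Quoted value: take everything up to the closing quote.
            ++pos;
            end = lower.find('"', pos);
        } else {
            end = lower.find_first_of("; \t", pos);
        }
        if (end == std::string::npos)
            end = lower.size();
        // Return the value with its original letter case.
        return contentType.substr(pos, end - pos);
    }
    return std::string();
}
```

For a header value such as "text/html; charset=windows-1251" this returns "windows-1251"; a value with no charset parameter yields an empty string.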
Comment 1 Alexey Proskuryakov 2005-09-26 11:45:02 PDT
Created attachment 4047
test document
Comment 2 Alexey Proskuryakov 2005-09-26 11:46:28 PDT
Created attachment 4048
proposed patch
Comment 3 Alexey Proskuryakov 2005-09-26 11:48:42 PDT
Comment on attachment 4048
proposed patch

These files are formatted somewhat inconsistently; I tried to match what I saw
in the nearest parts. Also, there were tabs in some places; I used spaces, as
Comment 4 Maciej Stachowiak 2005-09-27 21:28:27 PDT
A setter method should be called "setCharset()", not just "charset()". Also, what will this code do if the
HTTP headers include an invalid or unknown charset? Will the code that deletes the codec and fetches a
new one leave you with no codec in this case?
Comment 5 Maciej Stachowiak 2005-09-27 21:29:05 PDT
Comment on attachment 4048
proposed patch

r- for setCharset() rename, and please make clear why blindly deleting the
codec is OK, or fix it if it isn't.
Comment 6 Alexey Proskuryakov 2005-09-28 10:09:28 PDT
Created attachment 4079
proposed patch

Blindly deleting the codec was safe, because we were getting iso8859-1 for
invalid charsets. But on second thought, it was probably not such a good
idea, because it didn't match what khtml::Decoder does for invalid charsets.
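
As a rough sketch of the concern raised in the review, the setter below replaces the codec only when the lookup succeeds. Everything here is hypothetical: Codec, codecForName, and ResourceSketch are stand-ins invented for this example, not the classes touched by the patch, and keeping the previous codec for an unknown name is an assumed policy, not a statement of what khtml::Decoder does.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a text codec; not the real khtml/Qt class.
struct Codec {
    std::string name;
};

// Hypothetical lookup: knows a few names and returns null for anything else.
std::unique_ptr<Codec> codecForName(const std::string& name)
{
    if (name == "iso-8859-1" || name == "utf-8" || name == "windows-1251")
        return std::unique_ptr<Codec>(new Codec{name});
    return nullptr;
}

// Hypothetical resource class illustrating a setCharset() setter.
class ResourceSketch {
public:
    ResourceSketch() : m_codec(new Codec{"iso-8859-1"}) {}

    void setCharset(const std::string& charset)
    {
        std::unique_ptr<Codec> codec = codecForName(charset);
        if (!codec)
            return; // Assumed policy: unknown charset, keep the current codec.
        m_codec = std::move(codec); // The old codec is released only now.
    }

    const std::string& codecName() const { return m_codec->name; }

private:
    std::unique_ptr<Codec> m_codec;
};
```

Looking up the new codec before discarding the old one means the object is never left without a codec, whatever the header contains.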
Comment 7 Maciej Stachowiak 2005-10-02 21:29:47 PDT
Comment on attachment 4079
proposed patch

Comment 8 Vicki Murley 2005-10-24 10:35:45 PDT
I'll commit this.
Comment 9 Alexey Proskuryakov 2005-10-24 21:25:22 PDT
Bug 5484 contains a fix for this patch.
Comment 10 Dan Wood 2005-11-21 10:49:14 PST
*** Bug 4879 has been marked as a duplicate of this bug. ***