RESOLVED INVALID 32416
[Qt] QtWebKit fails to detect the character encoding through BOM detection for UTF-8 for text/html docs
https://bugs.webkit.org/show_bug.cgi?id=32416
Petri Ojala
Reported 2009-12-11 04:07:30 PST
When the content type is text/html and the encoding is not specified through the HTTP header, the XML encoding declaration, or a meta tag, the UA must look at the BOM, determine whether the document is UTF-8, and display the contents accordingly. In this test case, the test file is saved in UTF-8 encoding.

Steps to reproduce:
1. Load: http://waplabdc.nokia-boston.com/browser/users/charset/Charset_detection/index.asp
2. Select Hindi as the phone language and either utf-8 or ISCII as the encoding from the select lists, then click the "Test Link".
3. An index page with a list of tests is loaded.
4. Click the link: nocharset_xhtml_text_html

Expected result: A page is loaded that shows the text "nokia" in a Hindi font.

Actual result: Unidentified characters are displayed instead of the Hindi font.

Note that if the content type is application/xhtml+xml, application/xml, or text/xml, CWRT correctly identifies the charset encoding, provided the document served was originally saved as a UTF-8 document.
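The lookup order described in this report can be sketched roughly as follows. This is a minimal illustration, not WebKit's decoder: the helper names `sniff_bom` and `decode_html`, the `declared` parameter, and the windows-1252 default are all assumptions made for the example.

```python
import codecs

# Only the UTF-8 BOM matters for this report; the UTF-16 BOMs are listed for completeness.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def sniff_bom(data: bytes):
    """Return (encoding, bom_length) if data starts with a known BOM, else (None, 0)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_html(data: bytes, declared=None, default="windows-1252"):
    """Hypothetical helper: declared charset first, then BOM, then a default encoding."""
    if declared:
        # Charset taken from the HTTP header, XML declaration, or meta tag.
        return data.decode(declared, errors="replace")
    encoding, skip = sniff_bom(data)
    if encoding:
        # Drop the BOM bytes before decoding the rest of the document.
        return data[skip:].decode(encoding, errors="replace")
    # No declaration and no BOM: fall back to the assumed default encoding.
    return data.decode(default, errors="replace")
```

With no declared charset, a document that starts with the bytes EF BB BF is decoded as UTF-8. A document without those bytes falls through to the default.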
Attachments
Layout test (968 bytes, text/html)
2010-04-21 04:25 PDT, Petri Ojala
no flags
Layout test (950 bytes, application/octet-stream)
2010-04-28 00:18 PDT, Petri Ojala
no flags
Benjamin Poulain
Comment 1 2010-03-16 03:32:19 PDT
This is not platform specific. Petri, could you create a new layout test for this issue?
Petri Ojala
Comment 2 2010-04-20 03:29:09 PDT
I tried to find a way to create a test, without success.

Some problems:
1. I can't copy the page source to a local file because of its type (asp), so I need to execute index.asp on the server. Is it acceptable to access the internet in this case?
2. To load the actual result page (nocharset_xhtml_text_html.asp), I need to set a cookie (language + enc) as index.asp does. Is there any way to set a cookie for a remote host from a local test page?
Benjamin Poulain
Comment 3 2010-04-20 10:03:31 PDT
(In reply to comment #2)
> I tried to find a way to create a test, without success.
>
> Some problems:
> 1. I can't copy the page source to local because of the type (asp). So I need
> to execute the index.asp in the server. Is it acceptable to access internet in
> this case?
> 2. To load the actual result page (nocharset_xhtml_text_html.asp), I need to
> set a cookie (language + enc) like in index.asp. Is there any way to set the
> cookie to remote host from a local test page?

I don't get it. Why can't you just save the generated page and encode it with a BOM?
Petri Ojala
Comment 4 2010-04-21 04:25:53 PDT
Created attachment 53941 [details]
Layout test

Ok. I added a test page (attachment) to the layout tests.

Benjamin, could you please check whether this test page is ok?
Benjamin Poulain
Comment 5 2010-04-26 05:47:45 PDT
(In reply to comment #4)
> Ok. I did add a test page (attachment) to layout tests.
>
> Benjamin, could you please check if this test page is ok?

I meant a real layout test, one that can be added directly to WebKit. If you have a colleague working on WebKit, ask him for some help to get started.
Petri Ojala
Comment 6 2010-04-28 00:18:02 PDT
Created attachment 54530 [details]
Layout test

Expected result added. This should now be a valid layout test. At least, I was able to run it with run-webkit-tests.

Unzip the files to LayoutTests/fast/encoding (I think that is the correct folder).

Please add these test files to the trunk together with the fix.
Benjamin Poulain
Comment 7 2010-04-28 05:33:33 PDT
(In reply to comment #6)
> Expected result added. This should now be a valid layout test.
> At least, I was able run it with run-webkit-tests.
>
> Unzip files to Layouttests/fast/encoding
> (I think it could be the correct folder)
>
> Please, add these test files to the trunk together with the fix.

Yep, that looks a lot better for a layout test.

To be integrated, the file would need to start with the BOM and must not contain the <meta> tag.

If you want to work on the bug, we can review your patch and the test so you get everything right. That could be an interesting way for you to learn how to contribute to WebKit.
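One way to bring a saved page in line with the two requirements above (BOM present, no <meta> charset declaration) is sketched here. It assumes the saved page is already UTF-8 encoded. The helper name `prepare_bom_test` and the regular expression are illustrative and not part of any WebKit tooling.

```python
import codecs
import re

# Matches <meta ...> tags that mention "charset" (both the http-equiv and short forms).
_META_CHARSET = re.compile(rb"<meta\b[^>]*\bcharset\b[^>]*>\s*", re.IGNORECASE)


def prepare_bom_test(html: bytes) -> bytes:
    """Strip <meta> charset declarations and make sure the UTF-8 BOM comes first.

    Assumes `html` holds UTF-8 encoded bytes; nothing is re-encoded here.
    """
    body = _META_CHARSET.sub(b"", html)
    if body.startswith(codecs.BOM_UTF8):
        return body
    return codecs.BOM_UTF8 + body
```

Running the helper a second time on its own output returns the bytes unchanged, so it can be applied safely to a file that already has the BOM.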
Benjamin Poulain
Comment 8 2010-04-30 04:09:55 PDT
Closing as invalid. There is no BOM at the beginning of the document at http://waplabdc.nokia-boston.com/browser/users/charset/Charset_detection/index.asp . It makes sense to open it with the default encoding. Is it a misconfiguration of the server?
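The observed symptom fits this explanation. A small sketch, assuming the default encoding is windows-1252 and using an illustrative Devanagari string (neither is confirmed by the report), shows how UTF-8 bytes served without a BOM turn into unrelated characters:

```python
import codecs

# Illustrative text only; the actual string on the test page is not known.
text = "\u0928\u094b\u0915\u093f\u092f\u093e"  # "nokia" written in Devanagari

# Bytes as a server would send them when no BOM is prepended.
raw = text.encode("utf-8")
has_bom = raw.startswith(codecs.BOM_UTF8)

# Decoding with the assumed default encoding yields mojibake.
wrong = raw.decode("windows-1252", errors="replace")

# Decoding as UTF-8 restores the original text.
right = raw.decode("utf-8")

print(has_bom)
print(wrong == text)
print(right == text)
```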