When serving a page with these headers:
-- CUT --
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Connection: Keep-Alive
Content-Type: application/xhtml+xml; charset=UTF-8
Date: Sun, 26 Oct 2008 09:25:01 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Keep-Alive: timeout=15, max=100
Pragma: no-cache
Server: Apache
Transfer-Encoding: Identity
-- CUT --
the encoding of the page is not detected as UTF-8. I've tried inserting <?xml version="1.0" encoding="UTF-8"?> as the page's first line, without success. The page in question is an XHTML 1.1 page with heavy use of JS and dynamically loaded inline SVG graphs (so application/xhtml+xml is required).

Version: WebKit nightly r37894
OS: Mac OS X 10.4.11
Doesn't happen in: Opera 9.6, FF 3.0.1
Is this the same as bug 18308? Hard to tell from the description, but we certainly do honor the aforementioned ways to specify charset in usual cases, not to mention that UTF-8 is the default for application/xhtml+xml.
Ok. This has nothing to do with static content (it renders correctly). I think I have a simplified test case.

File: test.html
Served with "Content-Type: application/xhtml+xml; charset=UTF-8"
-- CUT --
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ru" dir="ltr">
<head>
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
</head>
<body>
<div id="a"></div>
<script type="text/javascript" src="script.js"></script>
</body>
</html>
-- CUT --

File: script.js
Served with "Content-Type: application/x-javascript" (note no charset!)
-- CUT --
var d1 = document.getElementById('a');
var d2 = document.createElement('div');
var x = 'АБВГД';
d2.appendChild(d2.ownerDocument.createTextNode(x));
d1.appendChild(d2);
-- CUT --

The script should insert 5 uppercase Cyrillic letters into div#a, but it does not.
Several additional facts I missed. 1. Script runs ok when embedded into the page (like inside <head>). 2. Script runs ok when test.html is served with text/html.
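For illustration, the symptom described above is consistent with script.js being decoded with a single-byte encoding instead of UTF-8. The sketch below is not WebKit code. It assumes the file is saved as UTF-8 and uses Latin-1 as a stand-in for whatever fallback is actually applied; the helper names decodeAsUtf8 and decodeAsLatin1 are made up for this example.

```javascript
// Illustration only: how UTF-8 bytes look when decoded with the wrong charset.
// Assumption: script.js is saved as UTF-8; Latin-1 stands in for the fallback.
var text = '\u0410\u0411\u0412\u0413\u0414'; // five uppercase Cyrillic letters
var bytes = new TextEncoder().encode(text);   // UTF-8: two bytes per letter

// Hypothetical helper: decode the bytes as UTF-8 (the intended reading).
function decodeAsUtf8(b) {
  return new TextDecoder('utf-8').decode(b);
}

// Hypothetical helper: treat every byte as one Latin-1 character.
function decodeAsLatin1(b) {
  return Array.from(b, function (byte) {
    return String.fromCharCode(byte);
  }).join('');
}

console.log(decodeAsUtf8(bytes) === text);  // UTF-8 round-trips the string
console.log(decodeAsLatin1(bytes).length);  // one character per byte
```

Decoding byte by byte yields twice as many characters as the source string has, none of them Cyrillic, which is the kind of garbage a text node would receive if the script's charset were guessed wrong.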
Ugh! Confirmed, and this isn't even a ToT regression.

(In reply to comment #2)
> <meta http-equiv="Content-Type" content="application/xhtml+xml;
> charset=UTF-8" />

FWIW, meta declarations have no effect for XHTML documents (not that this affects the validity of this bug in any way).
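As a possible way to sidestep the problem in the test case, script.js can be written with only ASCII bytes by using \u escapes, so the result should not depend on which charset the browser picks for the external script (as long as it is ASCII-compatible). This is a sketch, not a verified workaround for this bug, and it assumes the five intended letters are U+0410 through U+0414.

```javascript
// Variant of script.js that contains only ASCII bytes.
// Assumption: the intended text is the Cyrillic letters U+0410..U+0414.
var x = '\u0410\u0411\u0412\u0413\u0414';

// Guard so the snippet also loads outside a browser, where no document exists.
if (typeof document !== 'undefined') {
  var d1 = document.getElementById('a');
  var d2 = document.createElement('div');
  d2.appendChild(d2.ownerDocument.createTextNode(x));
  d1.appendChild(d2);
}
```

The escapes are resolved by the JavaScript parser after the file has been decoded, which is why the charset guess no longer matters for this string.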
Created attachment 24687: proposed fix

Talk about coincidences - it turns out that I found a site affected by this very bug a few days ago, but hadn't had the time to investigate it yet.
Comment on attachment 24687: proposed fix

r=me
Fixed in <http://trac.webkit.org/changeset/37924>.