Steps to reproduce:
1) Make Safari load (either in the content area or through XMLHttpRequest) an XML document that does not have an XML declaration that declares the character encoding AND does not have a BOM AND is encoded in UTF-8 AND contains characters from outside the ASCII range AND is served as either application/xml or application/xhtml+xml AND has no charset parameter on the HTTP layer. (Although the above looks very specific, the conditions commonly hold true.)
2) Observe.

Actual results: The bytes are decoded as characters according to the Default Encoding in Appearance preferences.

Expected results: The bytes are decoded as characters according to UTF-8, as per section 3.2 of RFC 3023, which defers to XML 1.0 section 4.3.3.

Additional information: Besides the obvious implications of this bug, there are two less obvious ones:
1) Safari cannot properly consume Canonical XML.
2) Safari cannot properly consume XML documents it has produced itself via XMLHttpRequest POST!
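To illustrate the symptom, here is a minimal Python sketch of what happens when UTF-8 bytes are decoded with a non-UTF-8 default. The sample document and the choice of ISO-8859-1 as the stand-in for the "Default Encoding" preference are assumptions for illustration; the actual preference value depends on the user's setup.

```python
# Hypothetical document: no XML declaration, no BOM, UTF-8 encoded,
# with one character outside the ASCII range.
doc = "<p>caf\u00e9</p>".encode("utf-8")

# Expected behaviour: decode as UTF-8 when nothing else declares an encoding.
expected = doc.decode("utf-8")

# Reported behaviour: decode with the user's default encoding.
# ISO-8859-1 is only an assumed example of such a default.
actual = doc.decode("iso-8859-1")

print(expected)
print(actual)
```

The two-byte UTF-8 sequence for the accented letter is read as two separate characters under the single-byte encoding, which is the garbling described above.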
Would you be able to attach a test document? Cheers, Oliver
What reduction is needed beyond the case that has been in the URL field all along?
Behaviour is wrong (confirmed against Firefox).
Created attachment 3827: proposed patch

Well, the XML spec is pretty explicit about files that do not have an encoding declaration in the text declaration: they should be UTF-8 or UTF-16, unless a higher-level protocol defines a charset (XML 1.0, section 4.3.3).
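The fallback order described here can be sketched roughly as follows. This is not the attached patch, only a simplified illustration in Python: the function name `pick_xml_encoding` is hypothetical, and the sketch ignores UTF-32, EBCDIC, and BOM-less UTF-16, which a full implementation of the XML 1.0 autodetection appendix would have to consider.

```python
import codecs
import re
from typing import Optional

# Matches an encoding pseudo-attribute in a leading XML declaration,
# assuming the declaration is written in an ASCII-compatible encoding.
_DECL = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def pick_xml_encoding(body: bytes, http_charset: Optional[str] = None) -> str:
    """Hypothetical helper: choose an encoding for an XML entity."""
    # 1. A higher-level protocol (e.g. an HTTP charset parameter) wins.
    if http_charset:
        return http_charset.lower()
    # 2. A byte order mark identifies UTF-8 or UTF-16.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if body.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # 3. An explicit encoding declaration in the document.
    match = _DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Nothing declared: fall back to UTF-8, not a user preference.
    return "utf-8"
```

For the test case in this bug (no charset, no BOM, no declaration), the sketch falls through to the final branch and returns "utf-8".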
The file from the bug URL can serve as a test case (without a link to the next test, of course).
Comment on attachment 3827: proposed patch

Is there any other browser that has this behavior? The comments above lead me to believe this is not working this way in Firefox.
Gecko used to have this same bug (at least in content area--not sure about XMLHttpRequest), but it has been fixed.
Henri, which Gecko bugfix are you referring to? I see that Firefox 1.0.5 renders the test as expected, but I couldn't find anything in Bugzilla. I found <https://bugzilla.mozilla.org/show_bug.cgi?id=247024>, but it talks about a different issue: documents transferred with MIME type text/xml should default to us-ascii, not utf-8. I'm not sure if WebKit has the same problem, but if it has, that should be in a separate report IMO.
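The distinction between the two issues can be summarised in a small sketch. It assumes the reading of RFC 3023 given in this thread: text/xml without a charset parameter defaults to us-ascii, while application/xml and application/xhtml+xml defer to the document itself. The function name `default_charset` is hypothetical.

```python
from typing import Optional

def default_charset(media_type: str) -> Optional[str]:
    """Hypothetical helper: default when Content-Type has no charset.

    Returns an encoding name, or None when the XML document's own
    BOM or declaration (else UTF-8) should decide.
    """
    media_type = media_type.strip().lower()
    if media_type == "text/xml":
        # The separate Mozilla issue: text/xml defaults to us-ascii.
        return "us-ascii"
    if media_type in ("application/xml", "application/xhtml+xml"):
        # This bug: defer to the rules of XML 1.0 section 4.3.3.
        return None
    raise ValueError(f"not an XML media type handled here: {media_type}")
```

A result of None means the caller should inspect the document bytes rather than consult a user preference.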
Comment on attachment 3827: proposed patch

I thought about it a lot, and I think it's fine to land the fix just like this.
Mass moving XML DOM bugs to the "DOM" Component.