At www.becu.org there's a missing character glyph, the <?> symbol, at what should be some whitespace. The reason for this is that we're interpreting the page as UTF-8, despite the fact that it's not labeled as such. That's because in this file there is the following stray content (not at the beginning, after the entire <head> section). <?xml version="1.0" encoding="UTF-16"?> This causes the TextResourceDecoder to decide the file is UTF-8! Doesn't happen in other browsers.
<rdar://problem/5400664> covers this bug and some other problems.
This is a regression from shipping Safari/WebKit. Firefox 2 also honors an XML encoding declaration in HTML documents (which is the historical reason why we do so, too, ignoring HTML5), but only if it's at the very beginning of the file. Of course, both browsers detect an error if an XML declaration is not at the very beginning of an XML file. Also, Firefox doesn't have our quirk of treating a UTF-16 declaration in 8-bit files as UTF-8.
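For illustration, the quirk described above can be sketched roughly as follows. This is only a sketch, not the TextResourceDecoder code: the function name is hypothetical, and treating the UTF-16LE and UTF-16BE labels the same way as plain UTF-16 is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of WebKit. The declaration was parsed as
// 8-bit text, so a claim that the file is UTF-16 cannot be accurate; the
// quirk is to fall back to UTF-8 in that case.
std::string encodingForEightBitDeclaration(const std::string& declared)
{
    // Compare case-insensitively, but return other labels unchanged.
    std::string upper = declared;
    std::transform(upper.begin(), upper.end(), upper.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

    // Assumption: the LE/BE variants are handled like plain UTF-16.
    if (upper == "UTF-16" || upper == "UTF-16LE" || upper == "UTF-16BE")
        return "UTF-8";
    return declared;
}
```

With this quirk in place, the stray declaration on the reported page is enough to switch the whole document to UTF-8, which is why it matters where in the file the declaration is allowed to appear.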
Created attachment 15951 [details]
test case (non-regression)
Here, the XML declaration is not at the beginning of the file, but before BODY - both shipping Safari/WebKit and TOT fail in this case (although the test doesn't work in the former due to missing support for document.characterSet).
Alexey, is your plan to ignore <?xml> that are not at the start of the file? I think that would probably be sufficient.
(In reply to comment #4)
> Alexey, is your plan to ignore <?xml> that are not at the start of the file? I
> think that would probably be sufficient.
Yes, that's precisely what I'm going to do here.
Created attachment 15954 [details]
proposed fix
/me wonders how many off-by-one errors he managed to introduce this time.
Comment on attachment 15954 [details]
proposed fix
+    // Is there enough data available to check for XML declaration?
+    if (m_buffer.size() < 8)
+        return false;
Why not put that check before setting up ptr and pEnd?
r=me
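A rough sketch of the approach discussed above, with the size check done before any pointers are set up. This is not the attached patch: the function name and the use of std::vector for the buffer are assumptions made for illustration, and only the 8-byte threshold comes from the quoted hunk.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, not the WebKit code: returns true only when the
// buffer begins with an XML declaration ("<?xml" followed by whitespace).
// A declaration anywhere else in the data is ignored.
bool startsWithXMLDeclaration(const std::vector<char>& buffer)
{
    // Is there enough data available to check for an XML declaration?
    // Done first, before setting up any pointers into the buffer.
    if (buffer.size() < 8)
        return false;

    const char* ptr = buffer.data();
    if (std::memcmp(ptr, "<?xml", 5) != 0)
        return false;

    // "<?xml" must be followed by whitespace; otherwise this is some other
    // processing instruction, such as "<?xml-stylesheet".
    const char next = ptr[5];
    return next == ' ' || next == '\t' || next == '\n' || next == '\r';
}
```

Under these assumptions, a declaration that appears after the HEAD section, as on the reported page, never reaches the encoding lookup at all.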
Oh, and don't forget to land a test case for what this fixed!
(In reply to comment #7)
> Why not put that check before setting up ptr and pEnd?
OK
> Oh, and don't forget to land a test case for what this fixed!
Do you mean that there should be a test case specifically for XML declaration after HEAD? Is the included test not sufficient?
(In reply to comment #9)
> Do you mean that there should be a test case specifically for XML declaration
> after HEAD? Is the included test not sufficient?
I think the included test is sufficient. Sorry, I completely overlooked it. I would have expected a test that was more like the original site too, but I don't think that's really all that helpful or important.
Committed revision 25066.
Oops, forgot to move the check for m_buffer.size() as promised.