UNCONFIRMED 47967
Plain Text Representations of XML Source Code without .txt Extension Result in Parsing
https://bugs.webkit.org/show_bug.cgi?id=47967
Hugh Guiney
Reported 2010-10-19 23:11:33 PDT
I saved an XHTML document as a text file so that I could display the source code for users to download. I have Apache HTTPD set up to leave off extensions with MultiViews, so although there is only one resource, xhtml.txt, it is accessible via /xhtml as well as /xhtml.txt. In either case, the response headers report "Content-Type:text/plain; charset=utf-8". And yet, if I request /xhtml, WebKit renders the document incorrectly, parsing it as if it were actual XHTML. If I request it with .txt on the end, WebKit renders the document correctly as plain text. However, if I remove the XML prolog, it is rendered as plain text in both instances. Gecko and Presto both display plain text regardless of whether an XML prolog is present.
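For anyone who wants to try this without an Apache MultiViews setup, here is a minimal sketch of a local server that serves the same bytes at /xhtml and /xhtml.txt with the same text/plain header. The names `SAMPLE`, `PlainTextHandler` and `make_server` are made up for illustration, and the sample body is an assumption, not the attached file.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical sample document with an XML prolog; this is NOT the
# attachment from this report, only a stand-in for illustration.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml"/>\n'
)


class PlainTextHandler(BaseHTTPRequestHandler):
    """Serve the same bytes, with the same header, at both URLs."""

    def do_GET(self):
        if self.path not in ("/xhtml", "/xhtml.txt"):
            self.send_error(404)
            return
        self.send_response(200)
        # Same declared type regardless of whether the URL ends in .txt.
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(SAMPLE)))
        self.end_headers()
        self.wfile.write(SAMPLE)

    def log_message(self, format, *args):
        # Keep the console quiet; request logging is not needed here.
        pass


def make_server(port=0):
    """Build a server bound to localhost; port 0 picks any free port."""
    return HTTPServer(("127.0.0.1", port), PlainTextHandler)
```

Calling `make_server(8000).serve_forever()` and loading both URLs in the browser under test should show whether the rendering differs even though the headers are identical.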
Attachments
XHTML as Plain Text (66 bytes, text/plain)
2010-10-20 12:55 PDT, Hugh Guiney
no flags
Alexey Proskuryakov
Comment 1 2010-10-20 12:14:08 PDT
The draft specification governing content type sniffing is at <http://tools.ietf.org/html/draft-abarth-mime-sniff>. I don't know whether Safari works in accordance with it here.
Adam Barth
Comment 2 2010-10-20 12:15:20 PDT
The draft prevents sniffing of XML from text/plain. Safari does sniff from text/plain in some cases.
Alexey Proskuryakov
Comment 3 2010-10-20 12:33:44 PDT
The same document may still get sniffed as HTML. I guess we need to have the original document to find out what exactly happened. Reporter, would it be possible for you to upload it?
Hugh Guiney
Comment 4 2010-10-20 12:55:20 PDT
Created attachment 71322 [details] XHTML as Plain Text
Alexey Proskuryakov
Comment 5 2010-10-20 13:02:58 PDT
Thanks! Yes, that's sniffed as <application/xhtml+xml>. Adam, what does the spec say? Should it be text/plain, or text/html?
Adam Barth
Comment 6 2010-10-20 13:07:21 PDT
> Adam, what does the spec say? Should it be text/plain, or text/html?

text/plain. We should never sniff HTML or XML from text/plain. The spec says text/plain can only become types that are not scriptable.
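The rule stated above can be sketched roughly as follows. This is a toy illustration only, not the draft's algorithm and not WebKit's code: `SCRIPTABLE_TYPES`, `naive_sniff` and `effective_type` are hypothetical names, and the list of scriptable types is an assumption.

```python
# Assumed set of scriptable types, for illustration only.
SCRIPTABLE_TYPES = {
    "text/html",
    "application/xhtml+xml",
    "application/xml",
    "text/xml",
    "image/svg+xml",
}


def naive_sniff(body: bytes) -> str:
    """Toy sniffer that guesses a type from the leading bytes."""
    if b"\x00" in body[:512]:
        return "application/octet-stream"
    head = body.lstrip()[:64].lower()
    if head.startswith(b"<?xml"):
        return "application/xhtml+xml"
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "text/html"
    return "text/plain"


def effective_type(declared: str, body: bytes) -> str:
    """Apply the rule: text/plain may only become a non-scriptable type."""
    essence = declared.split(";", 1)[0].strip().lower()
    if essence != "text/plain":
        # This sketch only models the text/plain case.
        return essence
    guess = naive_sniff(body)
    if guess in SCRIPTABLE_TYPES:
        # Never upgrade text/plain to HTML or XML.
        return "text/plain"
    return guess
```

With this sketch, a body starting with an XML prolog and declared as "text/plain; charset=utf-8" stays text/plain, even though the toy sniffer on its own would guess application/xhtml+xml.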