47967 – Plain Text Representations of XML Source Code without .txt Extension Result in Parsing

Status:	UNCONFIRMED

Alias:	None

Product:	WebKit
Classification:	Unclassified
Component:	Page Loading
Version:	528+ (Nightly build)
Hardware:	Mac OS X 10.5

Importance:	P2 Normal
Assignee:	Nobody

URL:
Keywords:

Depends on:
Blocks:

Reported:	2010-10-19 23:11 PDT by Hugh Guiney
Modified:	2023-09-26 19:05 PDT
CC List:	4 users

See Also:

Attachments
XHTML as Plain Text (66 bytes, text/plain) 2010-10-20 12:55 PDT, Hugh Guiney

Description Hugh Guiney 2010-10-19 23:11:33 PDT

I saved an XHTML document as a text file so that users could view and download its source code. I have Apache HTTPD set up with MultiViews so that extensions can be left off. There is only one resource, xhtml.txt, but it is accessible via /xhtml as well as /xhtml.txt. In either case, the response headers report "Content-Type:text/plain; charset=utf-8". And yet, if I request /xhtml, WebKit renders the document incorrectly, parsing it as if it were actual XHTML. If I request it with .txt on the end, it renders the document correctly as plain text. However, if I remove the XML prolog, it is rendered as plain text in both cases. Gecko and Presto both display plain text regardless of whether an XML prolog is present.
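For anyone without an Apache MultiViews setup, the serving side of the report can be approximated with a minimal Python sketch. It is not the reporter's configuration: the port, the `build_response` and `serve` names, and the sample body are all hypothetical, and the body is only a stand-in for the attached file. The one thing it takes from the report is that /xhtml and /xhtml.txt return the same bytes with the same text/plain header.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for the attached document; the real attachment's
# contents are not reproduced here. It starts with an XML prolog because
# the report says the prolog is what triggers the behaviour.
SAMPLE_BODY = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hi</p></body></html>\n'
).encode("utf-8")

# Both paths map to the same resource, as MultiViews does for xhtml.txt.
ROUTES = ("/xhtml", "/xhtml.txt")


def build_response(path):
    """Return (status, headers, body) for a request path."""
    if path in ROUTES:
        headers = {
            "Content-Type": "text/plain; charset=utf-8",
            "Content-Length": str(len(SAMPLE_BODY)),
        }
        return 200, headers, SAMPLE_BODY
    return 404, {"Content-Type": "text/plain; charset=utf-8", "Content-Length": "0"}, b""


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers, body = build_response(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve on localhost until interrupted (port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and loading http://127.0.0.1:8000/xhtml and http://127.0.0.1:8000/xhtml.txt in turn gives two URLs that differ only in the extension, which is the comparison described above.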

Comment 1 Alexey Proskuryakov 2010-10-20 12:14:08 PDT

The draft specification governing content type sniffing is at <http://tools.ietf.org/html/draft-abarth-mime-sniff>. I don't know whether Safari works in accordance with it here.

Comment 2 Adam Barth 2010-10-20 12:15:20 PDT

The draft prevents sniffing of XML from text/plain.  Safari does sniff from text/plain in some cases.

Comment 3 Alexey Proskuryakov 2010-10-20 12:33:44 PDT

The same document may still get sniffed as HTML.

I guess we need to have the original document to find out what exactly happened. Reporter, would it be possible for you to upload it?

Comment 4 Hugh Guiney 2010-10-20 12:55:20 PDT

Created attachment 71322
XHTML as Plain Text

Comment 5 Alexey Proskuryakov 2010-10-20 13:02:58 PDT

Thanks! Yes, that's sniffed as <application/xhtml+xml>.

Adam, what does the spec say? Should it be text/plain, or text/html?

Comment 6 Adam Barth 2010-10-20 13:07:21 PDT

> Adam, what does the spec say? Should it be text/plain, or text/html?

text/plain.  We should never sniff HTML or XML from text/plain.  The spec says text/plain can only become types that are not scriptable.
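The rule in this comment can be sketched in a few lines of Python. This is only an illustration of the constraint as stated here, not the draft's actual algorithm and not WebKit code: the `SCRIPTABLE_TYPES` set and the `resolve_text_plain` name are assumptions made for the example, and the list of scriptable types is not taken from the draft.

```python
# Assumed (not from the draft): a small set of types that can execute script.
SCRIPTABLE_TYPES = frozenset({
    "text/html",
    "application/xhtml+xml",
    "application/xml",
    "text/xml",
    "image/svg+xml",
})


def resolve_text_plain(sniffed_type):
    """Pick the final type for a response declared as text/plain.

    A sniffed type is accepted only if it is not scriptable; otherwise
    the declared text/plain is kept.
    """
    if sniffed_type.strip().lower() in SCRIPTABLE_TYPES:
        return "text/plain"
    return sniffed_type
```

Under this sketch, the attachment sniffed as application/xhtml+xml in comment 5 would stay text/plain, while a non-scriptable result such as application/octet-stream would be allowed through.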