Bug 5998 - WebKit should recognize anything with MIME type *+xml as xml.
Summary: WebKit should recognize anything with MIME type *+xml as xml.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM
Version: 420+
Hardware: Macintosh OS X 10.4
Priority: P4 Normal
Assignee: Eric Seidel (no email)
URL:
Keywords: InRadar
Depends on:
Blocks: 8645
Reported: 2005-12-07 23:35 PST by Eric Seidel (no email)
Modified: 2019-02-06 09:02 PST

Attachments
Fix, including comprehensive test case. (11.05 KB, patch)
2006-04-28 02:44 PDT, Eric Seidel (no email)
andersca: review+

Description Eric Seidel (no email) 2005-12-07 23:35:39 PST
WebKit should recognize anything with MIME type *+xml as xml.

From email:

	From: 	  chris
	Subject: 	CDF and the +xml convention
	Date: 	December 7, 2005 12:35:54 PM PST
	To: 	  eseidel
	Cc: 	  mjs
	Reply-To: 	  chris

On Wednesday, December 7, 2005, 7:39:05 PM, Eric wrote:

ES> Chris,

ES> Could you explain to me how, according to RFC 3023, browsers are
ES> supposed to detect that that is XML?

From the introduction:

   To enable the exchange of XML network entities, this document
   standardizes five new media types -- text/xml, application/xml,
   text/xml-external-parsed-entity,
   application/xml-external-parsed-entity, and application/xml-dtd -- as
   well as a naming convention for identifying XML-based MIME media
   types.

also

7. A Naming Convention for XML-Based Media Types

   This document recommends the use of a naming convention (a suffix of
   '+xml') for identifying XML-based MIME media types, whatever their
   particular content may represent.  This allows the use of generic XML
   processors and technologies on a wide variety of different XML
   document types at a minimum cost, using existing frameworks for media
   type registration.
   [...]
   Some areas where 'generic' processing is useful include:

   o  Browsing - An XML browser can display any XML document with a
      provided [CSS] or [XSLT] style sheet, whatever the vocabulary of
      that document.   

+xml as the final token on the subtype means that the media type has
used this naming convention and is identifying itself as an XML-based
media type.
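
The suffix check described above can be sketched in a few lines. This is a minimal illustration, not WebKit's code: the function name is_xml_mime_type is hypothetical, and only text/xml and application/xml are listed explicitly alongside the +xml suffix rule.

```python
def is_xml_mime_type(content_type: str) -> bool:
    """Return True if the media type identifies itself as XML-based."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    essence = content_type.split(";", 1)[0].strip().lower()
    if essence in ("text/xml", "application/xml"):
        return True
    type_, sep, subtype = essence.partition("/")
    # Require a type, a slash, and a non-empty name before the "+xml" suffix.
    return (
        bool(type_)
        and bool(sep)
        and subtype.endswith("+xml")
        and len(subtype) > len("+xml")
    )
```

Under this sketch, the unregistered type application/foobar+xml used in the test is accepted as XML, while text/html is not.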

This test is a test of that; it's not necessarily good practice, just a
test.

ES> Safari (in my mind) correctly notices the bogus MIME type and  
ES> downloads the file to your desktop (not even trying to render it).

That is conformant behaviour for unknown types; but it's also conformant
to parse the XML (since it's identified as XML) and use namespace
dispatching.

ES> When you drop the file on Safari from your desktop, the Mac OS
ES> "Launch Services" attempts to determine what sort of file it is, and
ES> seems to decide text/html (given the initially sniffed <html> content
ES> and the fact that I can tell our code is taking the HTML parser path
ES> instead of the XML parser path).

That is unfortunate.

ES>   This leads to a correct rendering of
ES> the text content of the file, but ignores the SVG tags entirely;
ES> instead, the HTML parser treats them as bogus HTML tags (creating
ES> plain old Elements for them).

That is conformant (unfortunately) for text/html.

ES> One might argue with this second "drop on Safari" behavior, given  
ES> that a namespace is specified...

Yes, sniffing is complicated, and there is not much compound document
stuff around currently; also, it is unsafe to assume that any random
html-looking document is well formed (in fact, it is statistically most
improbable that it is well formed).

Note though that the RFC specifically forbids using +xml for anything that
is not XML, so in the case where there is an authoritative media type (e.g.
HTTP, MIME email, RTP) it can be safely assumed that the content is XML.

ES> But what would in your mind be "correct" behavior for this file (w/o  
ES> breaking the rest of the web),

When received from the server, parse it and look for known
namespaces, known stylesheet languages etc.
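
A rough sketch of the "parse it and look for known namespaces" step, using Python's standard ElementTree parser. The KNOWN_NAMESPACES table and the function name are assumptions made for illustration; a real browser would dispatch to renderers rather than return labels.

```python
import xml.etree.ElementTree as ET

# Assumption: the vocabularies this hypothetical browser knows how to render.
KNOWN_NAMESPACES = {
    "http://www.w3.org/1999/xhtml": "XHTML",
    "http://www.w3.org/2000/svg": "SVG",
}


def known_namespaces_in(document: str) -> set:
    """Parse the document as XML and return the known vocabularies it uses."""
    root = ET.fromstring(document)  # raises ET.ParseError if not well formed
    found = set()
    for element in root.iter():
        tag = element.tag
        # ElementTree spells namespaced tags as "{namespace-uri}localname".
        if isinstance(tag, str) and tag.startswith("{"):
            uri = tag[1:].split("}", 1)[0]
            if uri in KNOWN_NAMESPACES:
                found.add(KNOWN_NAMESPACES[uri])
    return found
```

For a compound document like the SVG-in-XHTML test, this would report both vocabularies, which is the information namespace dispatching needs.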

On the local filesystem, in the absence of any filesystem metadata, I
can't see an alternative to sniffing, but a check for likely XML and a
well-formedness check to verify would seem like a better approach than
assuming anything with html-looking element names is text/html.
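
One way to picture that two-step sniff is below. It is only a sketch: the function names are hypothetical, the "likely XML" test is simply a leading "<", and the well-formedness check uses Python's expat binding rather than anything in WebKit or Launch Services.

```python
import xml.parsers.expat


def is_well_formed_xml(data: bytes) -> bool:
    """Return True if expat can parse the whole input without error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


def sniff_local_file(data: bytes) -> str:
    """Guess a parser for a local file that carries no media type metadata."""
    # Cheap "likely XML" check first: markup must start the document.
    looks_like_xml = data.lstrip().startswith(b"<")
    if looks_like_xml and is_well_formed_xml(data):
        return "xml"
    # Fall back to the tolerant HTML path for everything else.
    return "html"
```

Note that this sketch would also route a well-formed, namespace-free HTML file down the XML path, so a real implementation would likely combine it with a namespace check.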

ES>  and down what path of logical  
ES> decisions must we tread (as a browser dealing w/ a bogus file) in  
ES> order to achieve this correct behavior?

Does the above help?

ES> Thanks for your time.

No problem. Happy to discuss it further, here or on public-cdf@w3.org.
Apple, as members of W3C and implementors of an XML-based browser, would
of course be most welcome to join the CDF WG as well.

ES> -eric

ES> On Dec 7, 2005, at 10:17 AM, Chris Lilley wrote:

On Wednesday, December 7, 2005, 3:21:02 PM, jeff_schiller wrote:

j> Here's my take:
j> http://blog.codedread.com/archives/2005/12/06/the-svg-roller-coaster/

You note there that Opera "plans to support CDF in a big way, which is
fantastic news.". It is, I agree; part of the point of SVG being in
XML is that it can be used as a graphical namespace - something that knows
how to draw itself - in compound documents.

I put together a quick test, used in a recent CDF WG meeting; it serves
a compound document as a bogus unregistered media type,
application/foobar+xml with an unknown/unsniffable filename extension of
.foobar

Thus the only thing that is known about it, per RFC 3023, is that it is
in XML.

Opera 9 happily renders it, finding both the XHTML and SVG namespaces in
there.

http://www.w3.org/2005/10/SVGinXHTML.foobar

-- 
 Chris Lilley 
 Chair, W3C SVG Working Group
 W3C Graphics Activity Lead
 Co-Chair, W3C Hypertext CG



Comment 1 Eric Seidel (no email) 2006-04-28 02:44:46 PDT
Created attachment 8018
Fix, including comprehensive test case.
Comment 2 Eric Seidel (no email) 2006-04-28 02:45:30 PDT
This is in radar as:
<rdar://problem/4031511> XmlHttpRequest doesn't allow responses with Content-Type: application/soap+xml
Comment 3 Anders Carlsson 2006-04-28 02:53:00 PDT
Comment on attachment 8018
Fix, including comprehensive test case.

Looks great, r=me!

I'd like it if you could make that //FIXME into a bug report and refer to it in the comments.
Comment 4 Eric Seidel (no email) 2006-04-28 03:54:22 PDT
Actually, the patch I posted only fixes XMLHttpRequest.  I'll open a separate bug to track fixes needed for WebKit.
Comment 5 Lucas Forschler 2019-02-06 09:02:59 PST
Mass moving XML DOM bugs to the "DOM" Component.