RESOLVED FIXED 6290
XML documents with "html" in dtd name use XHTML entities when they shouldn't
https://bugs.webkit.org/show_bug.cgi?id=6290
Summary XML documents with "html" in dtd name use XHTML entities when they shouldn't
Ian 'Hixie' Hickson
Reported 2005-12-29 16:56:35 PST
Currently any XML file with a DOCTYPE whose "name" part is "html" triggers "HTML" parsing mode. This is wrong. It should be based on the PUBLIC identifier. TESTCASE: http://www.hixie.ch/tests/adhoc/xml/parsing/010.xml This is a regression caused by bug 4301.
Attachments
fix & test (4.20 KB, patch)
2005-12-29 18:36 PST, Eric Seidel (no email)
mjs: review+
Eric Seidel (no email)
Comment 1 2005-12-29 17:02:01 PST
This is such a trivial edgecase.
Ian 'Hixie' Hickson
Comment 2 2005-12-29 17:11:02 PST
Not that much of an edgecase, XHTML2 will likely trip this, and even if XHTML2 isn't much in use today, it eventually may be. It would be sad for XHTML2 authors if they were confused by old builds of Safari. The bug is that externalSubsetHandler() is checking |name|, but should be checking |externalId|. Here are the IDs to compare against (taken from Mozilla's list, with the MathML DTDs removed): "-//W3C//DTD XHTML 1.0 Transitional//EN" "-//W3C//DTD XHTML 1.1//EN" "-//W3C//DTD XHTML 1.0 Strict//EN" "-//W3C//DTD XHTML 1.0 Frameset//EN" "-//W3C//DTD XHTML Basic 1.0//EN" "-//WAPFORUM//DTD XHTML Mobile 1.0//EN"
Eric Seidel (no email)
Comment 3 2005-12-29 17:36:23 PST
Calling this a regression is pretty strong language. As far as I can tell this test case only "passed" before, because html entities were not supported in xhtml at all in Safari. Perhaps in the expat days this test case passed for valid reasons. Now html entities are supported in anything that looks like an xhtml document. This one happens to look enough like xhtml that we treat it as one. Eventually, we'll need a HashSet, and a list of allowed doc types.
Ian 'Hixie' Hickson
Comment 4 2005-12-29 17:38:49 PST
Why not just a set of six string comparisons in the place of the one comparison you have now? You'll almost always either match the first one or not enter the function, so it shouldn't be a perf hit. It's a regression because before you used to not have this bug, and now you do. That's what a regression is: the introduction of a new bug.
Eric Seidel (no email)
Comment 5 2005-12-29 18:36:07 PST
Created attachment 5364 [details] fix & test
Ian 'Hixie' Hickson
Comment 6 2005-12-29 19:01:28 PST
You have this line: + || (extId == "-//W3C//DTD XHTML Basic 1.0//EN") ...in there twice. (In the Mozilla codebase they have the MathML 2.0 DOCTYPE where you have the second occurance, but you don't want to put that in because that isn't XHTML and has different entities. It's probably sort-of ok to have the XHTML+MathML DOCTYPEs in there, I guess, though. It'll at least work for some pages, anyway.) Other than that, looks great. Thanks. :-)
Maciej Stachowiak
Comment 7 2005-12-29 20:00:49 PST
Comment on attachment 5364 [details] fix & test r=me
Joost de Valk (AlthA)
Comment 8 2006-01-22 04:56:38 PST
Removing keyword(s) since bug is fixed.
Joost de Valk (AlthA)
Comment 9 2006-01-22 05:01:00 PST
Removing keyword(s) since bug is fixed.
Eric Seidel (no email)
Comment 10 2006-01-31 21:20:40 PST
Removing Regression keyword from bugs already fixed.
Note You need to log in before you can comment on or make changes to this bug.