Summary: | Processing instructions inside DOCTYPE internal subset are parsed incorrectly (by libxml2?) | ||
---|---|---|---|
Product: | WebKit | Reporter: | Leif Halvard Silli <xn--mlform-iua> |
Component: | DOM | Assignee: | Nobody <webkit-unassigned> |
Status: | NEW --- | ||
Severity: | Major | CC: | cdumez |
Priority: | P2 | ||
Version: | 528+ (Nightly build) | ||
Hardware: | All | ||
OS: | All | ||
Description
Leif Halvard Silli
2010-03-09 20:17:42 PST
I will once again stress that this bug is about application/xhtml+xml parsing.
Created attachment 74246 [details]
test
Same test as an attachment
Created attachment 74247 [details]
test
Modified to pass in Firefox.
This is weird - the only callbacks we get from libxml2 are startDocumentHandler, internalSubsetHandler and then normalErrorHandler, so this looks almost like a libxml2 bug. Note that internalSubsetHandler only carries name, externalID and systemID - we certainly aren't handling the DTD itself in WebKit. But command-line xmllint doesn't seem to have a problem with this file.
Created attachment 109901 [details]
Shows that Webkit *does* follow XML PI-syntax
My diagnosis was wrong: the attached XHTML file includes an HTML PI inside the DTD, and WebKit then correctly reports that the PI never ends (because there is no "?>" to end it).
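The rule behind this diagnosis can be checked with any conforming XML parser: a processing instruction runs until the first "?>", so a PI without that terminator is a well-formedness error. Below is a minimal sketch, assuming Python's bundled expat parser as a stand-in. It is not libxml2 and does not exercise WebKit's code path. The two sample documents and the helper name is_well_formed are made up for illustration and are not the attachments.

```python
import xml.parsers.expat


def is_well_formed(document: str) -> bool:
    """Return True if expat parses the whole document without an error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Hypothetical sample: the PI in the internal subset is closed with "?>".
TERMINATED = (
    '<!DOCTYPE html [\n'
    '<?pi some data ?>\n'
    ']>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"/>'
)

# Hypothetical sample: same document, but the PI has no "?>" anywhere.
UNTERMINATED = (
    '<!DOCTYPE html [\n'
    '<?pi some data >\n'
    ']>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"/>'
)

terminated_ok = is_well_formed(TERMINATED)
unterminated_ok = is_well_formed(UNTERMINATED)
print(terminated_ok, unterminated_ok)
```

In the second document the parser keeps looking for "?>" until the input ends, which matches the "PI never ends" report described above.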
Created attachment 109902 [details]
Shows that Webkit accepts a closed comment inside the PI
An XML comment inside an XML processing instruction is not an XML comment. But WebKit apparently sees it as one. And as long as it perceives it as a well-formed comment, it accepts it - as the demo shows.
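To illustrate the first sentence, here is a small sketch that records which parser events fire for a PI whose body contains comment-like text. It assumes Python's bundled expat parser, not libxml2 or WebKit. The sample document and the helper name collect_events are hypothetical.

```python
import xml.parsers.expat


def collect_events(document: str):
    """Record processing-instruction and comment events reported by expat."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.ProcessingInstructionHandler = (
        lambda target, data: events.append(("pi", target, data))
    )
    parser.CommentHandler = lambda data: events.append(("comment", data))
    parser.Parse(document, True)
    return events


# Hypothetical sample: "<!-- ... -->" appears inside the PI, not as markup.
DOC = (
    '<!DOCTYPE html [\n'
    '<?pi <!-- looks like a comment --> ?>\n'
    ']>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"/>'
)

events = collect_events(DOC)
for event in events:
    print(event)
```

With this parser only a PI event is reported and the comment handler is never called: the "<!--" and "-->" sequences are part of the PI data.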
Created attachment 109903 [details]
Reduction of the problem: WebKit doesn't accept an "unclosed" comment inside the PI
Add minimal demo to show what Webkit doesn't accept.
Created attachment 109906 [details]
Workaround: Shows how to circumvent the problem - perhaps point at a solution?
This test file shows how to work around the problem. Please read the comments in the test file.
Created attachment 109928 [details]
Workaround 2: Here the ]> appears right after the processing instruction has started
In this new attachment, the ]> comes right after the processing instruction has begun:
<!DOCTYPE html SYSTEM "about:legacy">
<?pi ]>
<whatever><!--goes here
?>
]>
So, seemingly, as long as
a) WebKit is able to find 2 occurrences of the string ']>', and
b) the string occurs immediately after the PI has begun or
inside (!) a comment right after the DTD has ended,
then WebKit allows any content inside the processing instruction.
(For more comments and speculation, see the attachment.)
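For comparison, a conforming parser needs none of these conditions, because everything up to "?>" is opaque PI data. The sketch below runs three PI bodies modelled loosely on the cases discussed in this bug (closed comment, unclosed comment, ']>' right after the PI starts) through Python's bundled expat parser. The documents are reconstructions for illustration, not the attachments. The names wrap, pi_data and VARIANTS are hypothetical, and the internal subset brackets are an assumption about how the test files are structured.

```python
import xml.parsers.expat


def wrap(pi_body: str) -> str:
    """Place a PI with the given body inside a DOCTYPE internal subset."""
    return (
        '<!DOCTYPE html SYSTEM "about:legacy" [\n'
        '<?pi ' + pi_body + '?>\n'
        ']>\n'
        '<html xmlns="http://www.w3.org/1999/xhtml"/>'
    )


def pi_data(document: str):
    """Return the data of every processing instruction expat reports."""
    found = []
    parser = xml.parsers.expat.ParserCreate()
    parser.ProcessingInstructionHandler = (
        lambda target, data: found.append(data)
    )
    parser.Parse(document, True)  # raises ExpatError if not well-formed
    return found


# Hypothetical PI bodies, one per case.
VARIANTS = {
    "closed_comment": '<whatever><!--goes here-->\n',
    "unclosed_comment": '<whatever><!--goes here\n',
    "bracket_first": ']>\n<whatever><!--goes here\n',
}

results = {name: pi_data(wrap(body)) for name, body in VARIANTS.items()}
for name, data in results.items():
    print(name, data)
```

All three variants parse without error here, and the ']>' in the last one ends up inside the PI data instead of closing the DOCTYPE.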
Created attachment 109929 [details]
Workaround 3: Add a comment inside the DTD, after the PI
Created attachment 109930 [details]
Workaround: Shows that an "HTML5 comment" - a "short comment" (<!-->) can be used as workaround
Mass moving XML DOM bugs to the "DOM" Component.