Firefox treats the document type node as the first child of the document. We don't add document type nodes as children at all.
*** Bug 16636 has been marked as a duplicate of this bug. ***
Committed revision 29067.
Marked resolved by accident.
Oh, Eric just pointed out that it's not just XML!
I guess it probably won't hurt compatibility too much if it makes us match Gecko. I wonder what Trident/IE does with a doctype node.
See also: bug 15290, bug 15411.
Created attachment 18797 [details] test case
Sadly this causes 2 Acid3 failures.
Created attachment 19146 [details] Work in progress Just saving my work. Not ready for review yet.
Created attachment 19147 [details] Patch that gets Acid3 up to 84/100 I still have to make view source mode work with doctypes so that I don't break the Web Inspector.
Comment on attachment 19147 [details] Patch that gets Acid3 up to 84/100 I like it! 405 virtual void determineParseMode() {}; You made this protected, but it should be private. You only need it protected if derived classes need to *call* it. You can always override a virtual regardless of the visibility. And the derived class only needs to override, not call. It should be private in HTMLDocument too. Also, no semicolon after the braces. Do we need to do something to prohibit a second DOCTYPE node in XML, or does the parser already take care of that? Maybe this doesn't even handle he XML case yet? 858 bool isWhitespace = c == '\r' || c == '\n' || c == '\t' || c == ' '; Don't we already have a function for exactly that set of whitespace? All our different whitespace checks drive me crazy. 868 // Malformed. Just exit. Extra space here in this comment after the period. 902 if (Unicode::toLower(c) == publicStart[m_doctypeSearchCount]) { 910 } else if (Unicode::toLower(c) == systemStart[m_doctypeSearchCount]) { 1072 if (Unicode::toLower(*src) == doctypeStart[m_doctypeSearchCount]) { Unicode::toLower is overkill here. You want the much faster toASCIILower that works only for ASCII characters, because you know the that characters you are looking for are ASCII. I think we need test cases for every single branch in the parsing logic. I honestly can't read the parsing code well enough to be sure it's right because of so many if statements and branches. Only tests will convince me and I assume that' true for you too. 102 DoctypeToken() 103 {} I'd prefer this either all on one line, or with the braces on separate lines. 105 void reset() { Please put the opening brace here on a separate line since this is a function. 116 String name() const { return String(m_name.data(), m_name.size()); } 117 String publicID() const { return String(m_publicID.data(), m_publicID.size()); } 118 String systemID() const { return String(m_systemID.data(), m_systemID.size()); } Since we only use these when we're *done* with the doctype, we should use String::adopt, which is more efficient, but leaves the vector empty. I think that we could just leave these out entirely and write this: document->setDocType(new DocumentType(document, String::adopt(t->m_name), String::adopt(t->m_publicID), String::adopt(t->m_systemID))); That's all my comments.
Created attachment 19186 [details] More cleanup.
Comment on attachment 19186 [details] More cleanup. I know you didn't post this for review, yet, but here are my thoughts anyway. I read through the DocType node manipulation bits, not through the doctype parsing bits yet. 1. Obviously this needs test cases (I'm sure you have some). 2. I'm not sure what other engines do when you add a second doctype node to a document (via js), then remove the first. Does the second then become the doctype? In our code, when you add a 3rd after removing the first it would suddenly become the doctype. What if you have 3 in the doc when you remove the first? which becomes the doctype (if other parsers do that).
"if the node to append is one of this node's ancestors or this node itself, or if this node is of type Document and the DOM application attempts to append a second DocumentType or Element node", then you should raise a "HIERARCHY_REQUEST_ERR" exception. See http://www.w3.org/TR/DOM-Level-3-Core/core.html
(In reply to comment #13) > 1. Obviously this needs test cases (I'm sure you have some). We can probably borrow some testcases from html5lib. Anne pointed me at the following files, all of which contain some doctype tokenizer tests: http://html5lib.googlecode.com/svn/trunk/testdata/tokenizer/test1.test http://html5lib.googlecode.com/svn/trunk/testdata/tokenizer/test2.test http://html5lib.googlecode.com/svn/trunk/testdata/tokenizer/test3.test http://html5lib.googlecode.com/svn/trunk/testdata/tokenizer/test4.test
Created attachment 19216 [details] Finish view source mode. Fix bugs in the inspector code.
Comment on attachment 19216 [details] Finish view source mode. Fix bugs in the inspector code. Sam Weinig has written a test harness that tests a bunch of doctypes and makes sure the mode that is picked is correct. I will land his harness in the tree once it is ready (along with this patch). Ready for comments again.
Created attachment 19218 [details] Patch that passes all of Sam's tests This patch passes every single doctype rule contained in the HTML5 specification (Sam's tests cover all of them).
Created attachment 19219 [details] Work in progress There are still many layout tests to fix. I am having to change layout tests that have bugs in them (e.g., svg/dynamic-updates has a broken test harness that does document.firstChild instead of document.documentElement).
Created attachment 19221 [details] Ready for review. This patch has been through the layout tests. Many tests are changing results: (1) DOM tests that failed are now passing and/or changing results slightly. (2) about:blank is now properly in quirks mode instead of strict mode (affects every single test that uses about:blank) (3) Positional dumping tests now change when a doctype is placed before the root element (4) XSL transform results now have the correct mode One major change I had to make is that the construction of the style selector is now delayed until a Document is attached. Document construction time was too early, since the compat mode had not yet been initialized (the HTMLDocument is not even constructed yet in the base class constructor).
Comment on attachment 19221 [details] Ready for review. + String text = "<"; + text += String::adopt(doctypeToken->m_source); + text += ">"; + addText(text, "webkit-html-doctype"); This could be quite a bit more efficient like so: Vector<UChar> buffer(doctypeToken->m_source.size() + 2); buffer[0] = "<"; buffer.insert(1, doctypeToken->m_source); buffer[buffer.size() - 1] = ">"; addText(String::adopt(buffer), "webkit-html-doctype");
Created attachment 19235 [details] doctype tests
Comment on attachment 19235 [details] doctype tests +FAIL: the Doctype was Standards. It was expected to be Quirks. Doctype: <!-- comment --><!DOCTYPE html> Is this expected?
Comment on attachment 19235 [details] doctype tests Yeah that is a bug. I will fix the test.
Created attachment 19241 [details] Final patch
Comment on attachment 19241 [details] Final patch Yeah!
Fixed in r30431.
Mass moving XML DOM bugs to the "DOM" Component.