12751 – DocumentType nodes aren't part of the Document (affects Acid3)

RESOLVED FIXED12751

DocumentType nodes aren't part of the Document (affects Acid3)

https://bugs.webkit.org/show_bug.cgi?id=12751

Summary DocumentType nodes aren't part of the Document (affects Acid3)

Anders Carlsson

Reported 2007-02-12 14:51:14 PST

Firefox treats the document type node as the first child of the document. We don't add document type nodes as children at all.

Attachments
test case (456 bytes, text/html) 2008-01-30 12:55 PST, Robert Blaut	no flags	Details
Work in progress (34.33 KB, patch) 2008-02-15 15:15 PST, Dave Hyatt	no flags	Details Formatted Diff Diff
Patch that gets Acid3 up to 84/100 (36.19 KB, patch) 2008-02-15 15:52 PST, Dave Hyatt	no flags	Details Formatted Diff Diff
More cleanup. (37.52 KB, patch) 2008-02-18 06:37 PST, Dave Hyatt	no flags	Details Formatted Diff Diff
Finish view source mode. Fix bugs in the inspector code. (45.43 KB, patch) 2008-02-19 15:44 PST, Dave Hyatt	no flags	Details Formatted Diff Diff
Patch that passes all of Sam's tests (51.11 KB, patch) 2008-02-19 20:37 PST, Dave Hyatt	no flags	Details Formatted Diff Diff
Work in progress (51.31 KB, patch) 2008-02-19 22:04 PST, Dave Hyatt	no flags	Details Formatted Diff Diff
Ready for review. (53.95 KB, patch) 2008-02-20 01:21 PST, Dave Hyatt	no flags	Details Formatted Diff Diff
doctype tests (22.77 KB, patch) 2008-02-20 13:16 PST, Sam Weinig	no flags	Details Formatted Diff Diff
Final patch (260.54 KB, patch) 2008-02-20 14:25 PST, Dave Hyatt	sam: review+	Details Formatted Diff Diff
Show Obsolete (8) View All Add attachment proposed patch, testcase, etc.

Alexey Proskuryakov

Comment 1 2007-12-28 07:00:40 PST

*** Bug 16636 has been marked as a duplicate of this bug. ***

Darin Adler

Comment 2 2008-01-01 11:35:54 PST

Committed revision 29067.

Darin Adler

Comment 3 2008-01-01 12:01:20 PST

Marked resolved by accident.

Darin Adler

Comment 4 2008-01-01 12:04:36 PST

Oh, Eric just pointed out that it's not just XML!

Darin Adler

Comment 5 2008-01-01 12:05:22 PST

I guess it probably won't hurt compatibility too much if it makes us match Gecko. I wonder what Trident/IE does with a doctype node.

Alexey Proskuryakov

Comment 6 2008-01-06 02:05:17 PST

See also: bug 15290, bug 15411.

Robert Blaut

Comment 7 2008-01-30 12:55:27 PST

Created attachment 18797 [details] test case

Eric Seidel (no email)

Comment 8 2008-02-08 15:07:57 PST

Sadly this causes 2 Acid3 failures.

Dave Hyatt

Comment 9 2008-02-15 15:15:35 PST

Created attachment 19146 [details] Work in progress Just saving my work. Not ready for review yet.

Dave Hyatt

Comment 10 2008-02-15 15:52:31 PST

Created attachment 19147 [details] Patch that gets Acid3 up to 84/100 I still have to make view source mode work with doctypes so that I don't break the Web Inspector.

Darin Adler

Comment 11 2008-02-16 07:59:19 PST

Comment on attachment 19147 [details] Patch that gets Acid3 up to 84/100 I like it! 405 virtual void determineParseMode() {}; You made this protected, but it should be private. You only need it protected if derived classes need to *call* it. You can always override a virtual regardless of the visibility. And the derived class only needs to override, not call. It should be private in HTMLDocument too. Also, no semicolon after the braces. Do we need to do something to prohibit a second DOCTYPE node in XML, or does the parser already take care of that? Maybe this doesn't even handle he XML case yet? 858 bool isWhitespace = c == '\r' || c == '\n' || c == '\t' || c == ' '; Don't we already have a function for exactly that set of whitespace? All our different whitespace checks drive me crazy. 868 // Malformed. Just exit. Extra space here in this comment after the period. 902 if (Unicode::toLower(c) == publicStart[m_doctypeSearchCount]) { 910 } else if (Unicode::toLower(c) == systemStart[m_doctypeSearchCount]) { 1072 if (Unicode::toLower(*src) == doctypeStart[m_doctypeSearchCount]) { Unicode::toLower is overkill here. You want the much faster toASCIILower that works only for ASCII characters, because you know the that characters you are looking for are ASCII. I think we need test cases for every single branch in the parsing logic. I honestly can't read the parsing code well enough to be sure it's right because of so many if statements and branches. Only tests will convince me and I assume that' true for you too. 102 DoctypeToken() 103 {} I'd prefer this either all on one line, or with the braces on separate lines. 105 void reset() { Please put the opening brace here on a separate line since this is a function. 116 String name() const { return String(m_name.data(), m_name.size()); } 117 String publicID() const { return String(m_publicID.data(), m_publicID.size()); } 118 String systemID() const { return String(m_systemID.data(), m_systemID.size()); } Since we only use these when we're *done* with the doctype, we should use String::adopt, which is more efficient, but leaves the vector empty. I think that we could just leave these out entirely and write this: document->setDocType(new DocumentType(document, String::adopt(t->m_name), String::adopt(t->m_publicID), String::adopt(t->m_systemID))); That's all my comments.

Dave Hyatt

Comment 12 2008-02-18 06:37:49 PST

Created attachment 19186 [details] More cleanup.

Eric Seidel (no email)

Comment 13 2008-02-18 11:13:52 PST

Comment on attachment 19186 [details] More cleanup. I know you didn't post this for review, yet, but here are my thoughts anyway. I read through the DocType node manipulation bits, not through the doctype parsing bits yet. 1. Obviously this needs test cases (I'm sure you have some). 2. I'm not sure what other engines do when you add a second doctype node to a document (via js), then remove the first. Does the second then become the doctype? In our code, when you add a 3rd after removing the first it would suddenly become the doctype. What if you have 3 in the doc when you remove the first? which becomes the doctype (if other parsers do that).

Ian 'Hixie' Hickson

Comment 14 2008-02-18 12:28:12 PST

"if the node to append is one of this node's ancestors or this node itself, or if this node is of type Document and the DOM application attempts to append a second DocumentType or Element node", then you should raise a "HIERARCHY_REQUEST_ERR" exception. See http://www.w3.org/TR/DOM-Level-3-Core/core.html

Adam Roben (:aroben)

Comment 15 2008-02-19 00:14:13 PST

(In reply to comment #13) > 1. Obviously this needs test cases (I'm sure you have some). We can probably borrow some testcases from html5lib. Anne pointed me at the following files, all of which contain some doctype tokenizer tests: http://html5lib.googlecode.com/svn/trunk/testdata/tokenizer/test1.test http://html5lib.googlecode.com/svn/trunk/testdata/tokenizer/test2.test http://html5lib.googlecode.com/svn/trunk/testdata/tokenizer/test3.test http://html5lib.googlecode.com/svn/trunk/testdata/tokenizer/test4.test

Dave Hyatt

Comment 16 2008-02-19 15:44:52 PST

Created attachment 19216 [details] Finish view source mode. Fix bugs in the inspector code.

Dave Hyatt

Comment 17 2008-02-19 18:24:05 PST

Comment on attachment 19216 [details] Finish view source mode. Fix bugs in the inspector code. Sam Weinig has written a test harness that tests a bunch of doctypes and makes sure the mode that is picked is correct. I will land his harness in the tree once it is ready (along with this patch). Ready for comments again.

Dave Hyatt

Comment 18 2008-02-19 20:37:32 PST

Created attachment 19218 [details] Patch that passes all of Sam's tests This patch passes every single doctype rule contained in the HTML5 specification (Sam's tests cover all of them).

Dave Hyatt

Comment 19 2008-02-19 22:04:40 PST

Created attachment 19219 [details] Work in progress There are still many layout tests to fix. I am having to change layout tests that have bugs in them (e.g., svg/dynamic-updates has a broken test harness that does document.firstChild instead of document.documentElement).

Dave Hyatt

Comment 20 2008-02-20 01:21:06 PST

Created attachment 19221 [details] Ready for review. This patch has been through the layout tests. Many tests are changing results: (1) DOM tests that failed are now passing and/or changing results slightly. (2) about:blank is now properly in quirks mode instead of strict mode (affects every single test that uses about:blank) (3) Positional dumping tests now change when a doctype is placed before the root element (4) XSL transform results now have the correct mode One major change I had to make is that the construction of the style selector is now delayed until a Document is attached. Document construction time was too early, since the compat mode had not yet been initialized (the HTMLDocument is not even constructed yet in the base class constructor).

Adam Roben (:aroben)

Comment 21 2008-02-20 09:54:55 PST

Comment on attachment 19221 [details] Ready for review. + String text = "<"; + text += String::adopt(doctypeToken->m_source); + text += ">"; + addText(text, "webkit-html-doctype"); This could be quite a bit more efficient like so: Vector<UChar> buffer(doctypeToken->m_source.size() + 2); buffer[0] = "<"; buffer.insert(1, doctypeToken->m_source); buffer[buffer.size() - 1] = ">"; addText(String::adopt(buffer), "webkit-html-doctype");

Sam Weinig

Comment 22 2008-02-20 13:16:40 PST

Created attachment 19235 [details] doctype tests

mitz

Comment 23 2008-02-20 13:29:05 PST

Comment on attachment 19235 [details] doctype tests +FAIL: the Doctype was Standards. It was expected to be Quirks. Doctype: <!DOCTYPE html> Is this expected?

Dave Hyatt

Comment 24 2008-02-20 13:37:35 PST

Comment on attachment 19235 [details] doctype tests Yeah that is a bug. I will fix the test.

Dave Hyatt

Comment 25 2008-02-20 14:25:34 PST

Created attachment 19241 [details] Final patch

Sam Weinig

Comment 26 2008-02-20 14:38:49 PST

Comment on attachment 19241 [details] Final patch Yeah!

Dave Hyatt

Comment 27 2008-02-20 14:49:35 PST

Fixed in r30431.

Lucas Forschler

Comment 28 2019-02-06 09:03:22 PST

Mass moving XML DOM bugs to the "DOM" Component.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 420+

Hardware Mac

OS OS X 10.4

Product WebKit

Component DOM

Assignee

Dave Hyatt

Reported

2007-02-12 14:51 PST

Modified

2019-02-06 09:03 PST History

CC List

6 users Show

URL

http://acid3.acidtests.org/

Keywords HasReduction

Duplicates (1)

16636 View as bug list

Depends on

Blocks

Acid3 17080

Dependencies

tree graph