Bug 35043

Summary: Strict DTD should always trigger strict mode, also when there is an internal subset
Product: WebKit
Reporter: Leif Halvard Silli <xn--mlform-iua>
Component: DOM
Assignee: Nobody <webkit-unassigned>
Status: RESOLVED CONFIGURATION CHANGED
Severity: Normal
CC: abarth, ahmad.saleem792, ap, bfulgham, cdumez, rniwa
Priority: P2
Version: 528+ (Nightly build)
Hardware: All
OS: All
URL: http://www.xn--mlform-iua.no.no/html4-or-html5/

Description Leif Halvard Silli 2010-02-17 10:25:10 PST
This bug also relates to bug 9280. 

The test page (http://www.xn--mlform-iua.no.no/html4-or-html5/) triggers QuirksMode in Safari 4 and onwards. This is caused by the internal subset of the DOCTYPE:


       --- Example 1: ------ (Line 2 is the internal subset) ----

            <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 
               [<!ATTLIST P myattr   CDATA #implied >] 
              >

       ---End of example ------------------------------


The internal subset causes "]>" to be displayed in the page. However, that is in principle a separate issue (bug 9280), and it can be solved by a workaround: http://www.målform.no/html4-or-html5/workaround

The issue here is that WebKit is the only browser that treats the internal subset as a QuirksMode trigger. (EXCEPTION: Opera 10.5 beta. Nightly Minefield is not an exception.)

Information from Philip Taylor of the HTMLwg (http://www.w3.org/mid/4B7C03BA.4050903@cam.ac.uk) says that the above DOCTYPE (with internal subset) should trigger Standards mode WHEN THERE IS A SYSTEM IDENTIFIER, like this:


       --- Example 2: ------ (Line 2 is the internal subset) ---

      <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd" 
      [<!ATTLIST P myattr   CDATA #implied >]
       >

       ---End of example ------------------------------


Test page: http://www.målform.no/html4-or-html5/take2

And, true enough, in Opera 10.5 beta (the only other UA to react with QuirksMode to Example 1), this triggers Standards Mode. In Safari, however, adding the system identifier makes no difference.
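
To make the difference between Example 1 and Example 2 concrete, here is a minimal Python sketch of the rule described above. It is not WebKit or Opera code. The function name `classify` and the regular expression are hypothetical, and the sketch assumes the rule as reported here: unexpected content such as "[" before any system identifier forces QuirksMode, while the same content after a system identifier is ignored. It does not consult the list of public identifiers that trigger QuirksMode on their own.

```python
import re

# Hypothetical helper for illustration only: models how an internal subset
# affects mode selection for a PUBLIC doctype, under the assumptions above.
_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"(?P<public>[^"]*)"'  # public identifier
    r'(?:\s+"(?P<system>[^"]*)")?'                       # optional system identifier
    r'\s*(?P<rest>[^>]*)>',                              # anything else up to the first ">"
    re.IGNORECASE,
)


def classify(doctype: str) -> str:
    """Return "quirks" or "standards" for the doctype string."""
    match = _DOCTYPE.match(doctype.strip())
    if match is None:
        raise ValueError("not a PUBLIC doctype this sketch understands")
    has_system = match.group("system") is not None
    has_extra = bool(match.group("rest").strip())
    # Assumption: extra content with no system identifier forces quirks;
    # extra content after a system identifier is skipped without penalty.
    if has_extra and not has_system:
        return "quirks"
    return "standards"


example_1 = '''<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   [<!ATTLIST P myattr   CDATA #implied >]
  >'''
example_2 = '''<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"
  [<!ATTLIST P myattr   CDATA #implied >]
  >'''

print(classify(example_1))
print(classify(example_2))
```

Note that the pattern stops at the first ">", which sits inside the internal subset, so the trailing "]" and ">" fall outside the matched doctype.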

Whether the lack of a system identifier should trigger Quirks mode for a Doctype that currently triggers Standards mode is, however, itself questioned by Boris Zbarsky of the HTMLwg:
http://www.w3.org/mid/4B7C0A50.305@mit.edu
Comment 1 Alexey Proskuryakov 2010-02-17 13:55:47 PST
If we diverge from other browsers, then this is something to fix in WebKit. There is no indication that there are any actual Web pages affected, so downgrading priority.

It's also not clear if there is any use for internal DTD subsets in HTML, other than fooling validity checkers into accepting invalid content.
Comment 2 Leif Halvard Silli 2010-02-17 14:53:05 PST
I don't understand why you refer to "validation" as "fooling".
Comment 3 Leif Halvard Silli 2010-02-17 17:29:51 PST
DIVERGING: 

  I'll just make it clear that it is not only about "other browsers", but also about HTML5. HTML5 requires Standards Mode when there is a system identifier in the DOCTYPE, whereas Safari makes no distinction either way.

PAGES AFFECTED versus the issue of Quirks/Standards mode: 

  That QuirksMode is triggered when it shouldn't be should have high priority: I suggest priority 2.

  There are indeed not that many affected pages, compared to the whole Web. And the effect of QuirksMode isn't always easy to spot either, I think. HOWEVER, the very issue of Quirks/Standards triggering via the DOCTYPE is important. It is the only function that doctypes have in the HTML5 world.

  I assume that the current behaviour in Safari/WebKit is a result of adapting to HTML5 and thus that internal subsets did not trigger QuirksMode in Safari 3.  Adapting to HTML5 should at least not lead to more doctypes becoming Quirks triggers than those HTML5 specifies.


USE OF INTERNAL DTD SUBSETS:

  According to Jukka Korpela, internal DTD subsets are a very useful approach - see after the last quotation in this message: http://www.w3.org/mid/AF25D023B0B34E1C97D8F44ABDE365F3@JukanPC

  It is not very fruitful to assume that private DTDs and internal DTD subsets can only be used to add useless, invalid stuff. 

  An internal DTD subset can e.g. also be used to add an HTML5 feature - such as the time element - to an HTML4 document. I don't see how it would be meaningful to label such an effort as "fooling validity checkers into accepting invalid content". The HTML5 spec says that user agents should behave the same, regardless of possible doctypes/versioning. The exception is QuirksMode, for which HTML5 keeps a list of QuirksMode triggers.

  Nevertheless, whether internal DTD subsets are useful is not within the scope of HTML5, as HTML5 only defines what triggers QuirksMode and what triggers StandardsMode.  But in the HTML4 and XHTML worlds, internal subsets are by definition useful if one wants to use DTD-based validation.
Comment 4 Adam Barth 2010-09-24 16:29:05 PDT
Our behavior here now matches the HTML5 spec.  We can add tests and close this bug.
Comment 5 Leif Halvard Silli 2011-10-05 19:38:03 PDT
(In reply to comment #4)
> Our behavior here now matches the HTML5 spec.  We can add tests and close this bug.

I agree that this bug is fixed. Should I make it "resolved"?
Comment 6 Adam Barth 2011-10-05 19:42:47 PDT
Ideally we'd add some tests first.
Comment 7 Leif Halvard Silli 2011-10-05 19:52:55 PDT
(In reply to comment #6)
> Ideally we'd add some tests first.

As in "attach some files to this bug"?
Comment 8 Adam Barth 2011-10-05 21:42:38 PDT
As in writing a LayoutTest:
http://www.webkit.org/quality/testwriting.html
Comment 9 Ahmad Saleem 2022-08-05 16:25:47 PDT
WPT does have coverage of DOCTYPE:

https://wpt.fyi/results/?label=master&label=experimental&aligned&view=subtest&q=doctype

Do we need any more tests? Or can we mark this as "RESOLVED LATER" or "RESOLVED WONTFIX", since WPT ensures compliance with web standards and, for DOCTYPE, all browsers pass all tests? Thanks!
Comment 10 Brent Fulgham 2022-08-05 18:40:39 PDT
Yes, I think we can just rely on WPT to confirm proper behavior here.