Bug 162166 - Update XHTMLParser to recognize "-//W3C//DTD MathML 2.0//EN" public identifier
Summary: Update XHTMLParser to recognize "-//W3C//DTD MathML 2.0//EN" public identifier
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Chris Dumez
URL: https://html.spec.whatwg.org/#parsing...
Keywords: WebExposed
: 162215 (view as bug list)
Depends on:
Blocks:
 
Reported: 2016-09-19 10:37 PDT by Chris Dumez
Modified: 2016-09-19 14:13 PDT (History)
8 users (show)

See Also:


Attachments
Patch (809.50 KB, patch)
2016-09-19 10:43 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews117 for mac-yosemite (1.46 MB, application/zip)
2016-09-19 11:46 PDT, Build Bot
no flags Details
Patch (810.43 KB, patch)
2016-09-19 12:10 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (810.23 KB, patch)
2016-09-19 13:35 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Dumez 2016-09-19 10:37:41 PDT
XHTMLParser does not recognize some of the public identifiers in the specification:
- https://html.spec.whatwg.org/#parsing-xhtml-documents

In particular, we were missing the following ones: "-//W3C//DTD MathML 2.0//EN" and "-//WAPFORUM//DTD XHTML Mobile 1.0//EN".

Firefox recognizes those properly.
Comment 1 Chris Dumez 2016-09-19 10:43:35 PDT
Created attachment 289237 [details]
Patch
Comment 2 Alex Christensen 2016-09-19 11:06:02 PDT
Comment on attachment 289237 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=289237&action=review

> Source/WebCore/xml/parser/XMLDocumentParserLibxml2.cpp:1315
>          || (extId == "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN")
>          || (extId == "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN")
> +        || (extId == "-//W3C//DTD MathML 2.0//EN")
>          || (extId == "-//WAPFORUM//DTD XHTML Mobile 1.0//EN")

This code could be much more efficient.  Right now we unnecessarily allocate and copy a String, then do many unnecessary comparisons.  We should do something like isSpecialScheme, except we shouldn't even need to calculate the length first because it seems to be null-terminated.
Comment 3 Build Bot 2016-09-19 11:46:22 PDT
Comment on attachment 289237 [details]
Patch

Attachment 289237 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/2106869

New failing tests:
imported/w3c/web-platform-tests/html/the-xhtml-syntax/parsing-xhtml-documents/xhtml-mathml-dtd-entity-support.htm
Comment 4 Build Bot 2016-09-19 11:46:27 PDT
Created attachment 289244 [details]
Archive of layout-test-results from ews117 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews117  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 5 Chris Dumez 2016-09-19 12:05:12 PDT
Looks like the new test hits an assertion on debug builds:
xception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x00000000bbadbeef

VM Regions Near 0xbbadbeef:
--> 
    __TEXT                 0000000102d4a000-0000000102e0c000 [  776K] r-x/rwx SM=COW  /Volumes/VOLUME/*

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   com.apple.JavaScriptCore      	0x00000001043a05d7 WTFCrash + 39
1   com.apple.WebCore             	0x000000010bc7bd2c WebCore::convertUTF16EntityToUTF8(unsigned short const*, unsigned long, char*, unsigned long) + 188 (XMLDocumentParserLibxml2.cpp:1216)
2   com.apple.WebCore             	0x000000010bc7bbed WebCore::getXHTMLEntity(unsigned char const*) + 157 (XMLDocumentParserLibxml2.cpp:1229)
3   com.apple.WebCore             	0x000000010bc7a577 WebCore::getEntityHandler(void*, unsigned char const*) + 295 (XMLDocumentParserLibxml2.cpp:1276)
4   libxml2.2.dylib               	0x00007fff891300a8 xmlParseEntityRef + 225
5   libxml2.2.dylib               	0x00007fff8912f79b xmlParseReference + 194
6   libxml2.2.dylib               	0x00007fff891368ef xmlParseTryOrFinish + 5405
7   libxml2.2.dylib               	0x00007fff891351dd xmlParseChunk + 901
8   com.apple.WebCore             	0x000000010bc77ca8 WebCore::XMLDocumentParser::doWrite(WTF::String const&) + 408 (XMLDocumentParserLibxml2.cpp:690)
9   com.apple.WebCore             	0x000000010bc74988 WebCore::XMLDocumentParser::append(WTF::RefPtr<WTF::StringImpl>&&) + 312 (XMLDocumentParser.cpp:122)
10  com.apple.WebCore             	0x0000000109ac7a4c WebCore::Document::setContent(WTF::String const&) + 108 (Document.cpp:1402)
11  com.apple.WebCore             	0x0000000109b9c4bd WebCore::DOMParser::parseFromString(WTF::String const&, WTF::String const&, int&) + 285 (DOMParser.cpp:40)
12  com.apple.WebCore             	0x000000010a5628ae WebCore::jsDOMParserPrototypeFunctionParseFromString(JSC::ExecState*) + 1038 (JSDOMParser.cpp:187)
Comment 6 Chris Dumez 2016-09-19 12:06:56 PDT
(In reply to comment #2)
> Comment on attachment 289237 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=289237&action=review
> 
> > Source/WebCore/xml/parser/XMLDocumentParserLibxml2.cpp:1315
> >          || (extId == "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN")
> >          || (extId == "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN")
> > +        || (extId == "-//W3C//DTD MathML 2.0//EN")
> >          || (extId == "-//WAPFORUM//DTD XHTML Mobile 1.0//EN")
> 
> This code could be much more efficient.  Right now we unnecessarily allocate
> and copy a String, then do many unnecessary comparisons.  We should do
> something like isSpecialScheme, except we shouldn't even need to calculate
> the length first because it seems to be null-terminated.

It can probably be more efficient yes, but this is the common pattern in this file to convert xmlChar* to String first using toString, which does a reinterpret_cast and a String::fromUtf8(). I'd rather not do the refactoring in the same patch.
Comment 7 Chris Dumez 2016-09-19 12:10:36 PDT
Created attachment 289246 [details]
Patch
Comment 8 Chris Dumez 2016-09-19 12:11:45 PDT
Comment on attachment 289244 [details]
Archive of layout-test-results from ews117 for mac-yosemite

Dealing with the debug crash via Bug 162215 since it is not related to my code change (the new test hits an assertion in debug builds without my change).
Comment 9 Chris Dumez 2016-09-19 13:35:12 PDT
Created attachment 289251 [details]
Patch
Comment 10 WebKit Commit Bot 2016-09-19 14:12:48 PDT
Comment on attachment 289251 [details]
Patch

Clearing flags on attachment: 289251

Committed r206118: <http://trac.webkit.org/changeset/206118>
Comment 11 WebKit Commit Bot 2016-09-19 14:12:54 PDT
All reviewed patches have been landed.  Closing bug.
Comment 12 Chris Dumez 2016-09-19 14:13:27 PDT
*** Bug 162215 has been marked as a duplicate of this bug. ***