RESOLVED FIXED 15290
REGRESSION (r14184-r14216): Duplicate DOCTYPE definitions when saving files as webarchives
https://bugs.webkit.org/show_bug.cgi?id=15290
Summary REGRESSION (r14184-r14216): Duplicate DOCTYPE definitions when saving files a...
Eric Seidel (no email)
Reported 2007-09-26 14:27:17 PDT
Duplicate DOCTYPE definitions when saving SVGs as WebArchives Save an SVG as a WebArchive and you end up with a duplicate DOCTYPE definition. This then breaks parsing when trying to open the archive.
Attachments
an example busted archive (30.79 KB, application/x-webarchive)
2007-09-26 14:27 PDT, Eric Seidel (no email)
no flags
minimal SVG file to reproduce the problem (156 bytes, image/svg+xml)
2007-09-26 14:34 PDT, Eric Seidel (no email)
no flags
minimal example busted SVG archive (497 bytes, application/x-webarchive)
2007-09-26 14:34 PDT, Eric Seidel (no email)
no flags
minimal example html file (105 bytes, text/html)
2007-09-26 14:41 PDT, Eric Seidel (no email)
no flags
minimal example html archive (still loads, just is clearly serialized incorrectly) (457 bytes, application/x-webarchive)
2007-09-26 14:42 PDT, Eric Seidel (no email)
no flags
Patch v1 (5.43 KB, patch)
2008-07-03 09:36 PDT, David Kilzer (:ddkilzer)
no flags
Patch v2 (6.41 KB, patch)
2008-07-04 07:33 PDT, David Kilzer (:ddkilzer)
mitz: review+
Eric Seidel (no email)
Comment 1 2007-09-26 14:27:45 PDT
Created attachment 16405 [details] an example busted archive
Eric Seidel (no email)
Comment 2 2007-09-26 14:34:12 PDT
Created attachment 16406 [details] minimal SVG file to reproduce the problem
Eric Seidel (no email)
Comment 3 2007-09-26 14:34:45 PDT
Created attachment 16407 [details] minimal example busted SVG archive
Eric Seidel (no email)
Comment 4 2007-09-26 14:36:56 PDT
I believe this is caused by this line in markup.cpp: case Node::DOCUMENT_NODE: { // Documents do not normally contain a docType as a child node, force it to print here instead. const DocumentType* docType = static_cast<const Document*>(node)->doctype(); if (docType) return docType->toString().deprecatedString(); return ""; }
Eric Seidel (no email)
Comment 5 2007-09-26 14:39:50 PDT
This actually applies to HTML as well, it just doesn't break the document, since the HTML parser is more forgiving. This may be a regression from 2.0 (likely, since we have made major improvements to XML serialization since then).
Eric Seidel (no email)
Comment 6 2007-09-26 14:41:14 PDT
Created attachment 16408 [details] minimal example html file
Eric Seidel (no email)
Comment 7 2007-09-26 14:42:09 PDT
Created attachment 16409 [details] minimal example html archive (still loads, just is clearly serialized incorrectly)
Eric Seidel (no email)
Comment 8 2007-09-26 14:49:15 PDT
So I tested with Safari 2.0 and this is actually a regression. However it only affects XHTML (since SVG didn't work in Safari 2) so I'm not sure anyone cares. I've added the keyword (since it is technically a regression), but someone else can certainly remove it given the only-affecting-xhtml circumstances.
Eric Seidel (no email)
Comment 9 2007-09-26 16:00:15 PDT
Actually, the bug seems to be from the use of: _stringWithDocumentTypeStringAndMarkupString which also pre-pends the DOCTYPE to the beginning of the file as well as createMarkup (when printing the DocumentNode).
David Kilzer (:ddkilzer)
Comment 10 2007-09-27 09:57:45 PDT
Unfortunately, it appears that the method that DumpRenderTree uses to create webarchives is not the same method that Safari uses. (In reply to comment #8) > So I tested with Safari 2.0 and this is actually a regression. However it only > affects XHTML (since SVG didn't work in Safari 2) so I'm not sure anyone cares. Actually, it affects any HTML or XML document with a pre-existing DOCTYPE.
David Kilzer (:ddkilzer)
Comment 11 2007-09-27 10:11:11 PDT
An autospade run says: Works: r14184 Fails: r14216
David Kilzer (:ddkilzer)
Comment 12 2007-09-27 10:15:43 PDT
* STEPS TO REPRODUCE 1. Launch Safari/WebKit. 2. Open Attachment 16408 [details]. 3. Save as webarchive. 4. Open saved webarchive. 5. View source of page. * EXPECTED RESULTS There should be only one "<!DOCTYPE>" tag. * ACTUAL RESULTS There are two "<!DOCTYPE>" tags at the top of the file.
David Kilzer (:ddkilzer)
Comment 13 2007-09-27 10:16:13 PDT
David Kilzer (:ddkilzer)
Comment 14 2007-09-27 10:23:30 PDT
(In reply to comment #11) > An autospade run says: > Works: r14184 Fails: r14216 Revision r14204 looks most suspicious. http://trac.webkit.org/projects/webkit/changeset/14204
Eric Seidel (no email)
Comment 15 2007-09-27 11:02:50 PDT
I don't think it was 14204. I had a patch locally on my system to remove "realDocType()" as it's the same as the default implementation of docType() (which is no longer overridden by anyone and thus realDocType() doesn't need to exist)
David Kilzer (:ddkilzer)
Comment 16 2008-07-03 09:36:31 PDT
Created attachment 22066 [details] Patch v1 Proposed fix. Tested both the SVG and HTML examples. Layout tests pass as well.
mitz
Comment 17 2008-07-03 09:38:28 PDT
Comment on attachment 22066 [details] Patch v1 r=me
David Kilzer (:ddkilzer)
Comment 18 2008-07-03 10:35:45 PDT
Comment on attachment 22066 [details] Patch v1 Clearing review flag for now. I think I may need to check to see that the "first" node is a DOCTYPE node since this method may be called for any arbitrary node in the DOM.
David Kilzer (:ddkilzer)
Comment 19 2008-07-04 07:33:28 PDT
Created attachment 22087 [details] Patch v2 Added checks for the node being a DOCUMENT_NODE or DOCUMENT_TYPE_NODE and only prepending the document type if the node type is NOT one of those.
mitz
Comment 20 2008-07-04 16:02:33 PDT
Comment on attachment 22087 [details] Patch v2 r=me
David Kilzer (:ddkilzer)
Comment 21 2008-07-04 21:52:00 PDT
Committed r35006
David Kilzer (:ddkilzer)
Comment 22 2008-07-05 10:58:00 PDT
Fixed ChangeLog entry in r35014.
Note You need to log in before you can comment on or make changes to this bug.