Bug 7480 - non-HTML elems w/o children in HTML docs get serialized self-closing
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM
Version: 420+
Hardware: Macintosh OS X 10.4
Importance: P3 Trivial
Assignee: Darin Adler
Reported: 2006-02-26 09:12 PST by Darin Adler
Modified: 2019-02-06 09:02 PST

Attachments
patch (5.98 KB, patch)
2006-02-26 15:59 PST, Darin Adler
mjs: review-
patch, presumably will get review- because of lack of a layout test (1.58 KB, patch)
2006-02-27 00:46 PST, Darin Adler
no flags
patch (9.30 KB, patch)
2006-03-01 00:36 PST, Darin Adler
eric: review+

Description Darin Adler 2006-02-26 09:12:30 PST
I looked at the code in markup.cpp and by code inspection noticed that the logic was wrong. My fixed version makes XML serialization work a little better.
Comment 1 Darin Adler 2006-02-26 15:59:35 PST
Created attachment 6749
patch
Comment 2 Darin Adler 2006-02-26 16:00:27 PST
By the way, when I tried the tests, Gecko used "<tag/>" rather than "<tag />".
Comment 3 Maciej Stachowiak 2006-02-26 22:51:03 PST
Proposed rules for serialization:

1) For HTML documents (HTMLDocument, parsed as HTML), always use HTML serialization rules. Elements that must be empty get no close tag; other elements get a close tag even if they happen to have no children.
2) For XML documents, serialize as follows: HTML elements that must be empty in HTML are serialized with the " />" syntax; other HTML elements are always serialized with a close tag; non-HTML elements follow normal XML rules.

The benefit of this is that if you serialize an XHTML document and then treat the result as HTML, it will work. People may want to edit as XHTML, but they will almost invariably serve the result with a text/html MIME type on the web.

This is what the old code was trying to do. I am not sure if it had a bug in implementing these rules. But the patch clearly makes it not follow these rules.


Comment 4 Maciej Stachowiak 2006-02-26 22:52:05 PST
(By "normal XML rules" I mean don't bother with the extra space before "/>" when using minimized syntax.)
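
The rules proposed in comments 3 and 4 can be sketched roughly as follows, for an element that has no children. This is an illustrative Python sketch only, not the code in markup.cpp: the function name, its parameters, and the partial list of must-be-empty elements are all assumptions made for the example.

```python
# Hypothetical sketch of the proposed serialization rules; not WebKit code.
# Partial, illustrative list of HTML elements that must be empty.
MUST_BE_EMPTY = {"br", "hr", "img", "input", "link", "meta"}


def serialize_childless_element(tag, in_html_document, is_html_element):
    """Return the markup for an element that has no children."""
    must_be_empty = is_html_element and tag.lower() in MUST_BE_EMPTY
    if in_html_document:
        # Rule 1: HTML serialization. Only must-be-empty elements omit the
        # close tag; everything else, including non-HTML elements, gets one.
        return f"<{tag}>" if must_be_empty else f"<{tag}></{tag}>"
    if is_html_element:
        # Rule 2: HTML elements in an XML document. Use " />" for
        # must-be-empty elements and an explicit close tag otherwise.
        return f"<{tag} />" if must_be_empty else f"<{tag}></{tag}>"
    # Rule 2, non-HTML elements: normal XML minimized syntax, with no
    # extra space before "/>" (per comment 4).
    return f"<{tag}/>"
```

Under these assumptions, a childless non-HTML element in an HTML document (the case in this bug's title) serializes with an explicit close tag rather than as self-closing.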
Comment 5 Darin Adler 2006-02-27 00:39:27 PST
Now I understand what the actual bug is. I'll do a new patch.
Comment 6 Darin Adler 2006-02-27 00:46:05 PST
Created attachment 6756
patch, presumably will get review- because of lack of a layout test
Comment 7 Darin Adler 2006-02-27 09:24:04 PST
Comment on attachment 6756
patch, presumably will get review- because of lack of a layout test

Strangely, I discovered that custom elements in HTML documents are HTML elements. This doesn't make sense to me. I'll investigate further.
Comment 8 Darin Adler 2006-03-01 00:36:20 PST
Created attachment 6788
patch
Comment 9 Eric Seidel (no email) 2006-03-01 09:58:27 PST
Comment on attachment 6788
patch

Looks good. As you mentioned before, it does seem a bit odd that unrecognized elements end up as HTMLElements in an HTML document (and thus are serialized in all caps).

r=me
Comment 10 Maciej Stachowiak 2006-03-01 23:37:56 PST
I believe all browsers will make unrecognized elements into HTMLElements in an HTML document, and the specs may even require this. HTML4 mandates supporting unknown elements and rendering their contents.
Comment 11 Lucas Forschler 2019-02-06 09:02:30 PST
Mass moving XML DOM bugs to the "DOM" Component.