Bug 33191 - CDATA sections are merged into Text nodes when normalize() is used
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
Importance: P2 Normal
Assignee: Darin Adler
URL:
Keywords: EasyFix
Depends on:
Blocks:
 
Reported: 2010-01-04 18:13 PST by William J. Edney
Modified: 2019-02-06 09:04 PST
CC List: 3 users

See Also:


Attachments
Testcase showing invalid merging behavior (1022 bytes, text/html)
2010-01-04 18:13 PST, William J. Edney
no flags
patch (5.19 KB, patch)
2010-01-05 17:30 PST, Darin Adler
mitz: review+

Description William J. Edney 2010-01-04 18:13:20 PST
Created attachment 45855
Testcase showing invalid merging behavior

As the attached testcase demonstrates, WebKit violates the DOM Specification by merging adjacent CDATA sections into Text nodes when normalize() is used.

The 'normalize()' method in the DOM Level 2 specification says:

"Puts all Text nodes in the full depth of the sub-tree underneath this Node, including attribute nodes, into a "normal" form where only structure (e.g., elements, comments, processing instructions, CDATA sections, and entity references) separates Text nodes".

Further, section 1.3, which describes CDATA sections, states that adjacent CDATA sections are not to be merged with each other when normalize() is used:

"Adjacent CDATASection nodes are not merged by use of the normalize method of the Node interface"

The attached testcase shows that the XML chunk starts out with 3 nodes under the document element in this order:

- Text node
- CDATA Section node
- Text node

and confirms that the document element has 3 child nodes before normalization and 1 child node after normalization (when it should still have 3, having preserved the above structure).
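For illustration only (this is not the attached testcase, which is an HTML page run in the browser), the expected behavior can be sketched with Python's standard xml.dom.minidom. The element name "root" and the text contents are made up for the example. A conforming normalize() should leave all three children in place:

```python
from xml.dom import minidom


def build_root():
    # Build the structure described above by hand:
    # Text node, CDATA Section node, Text node.
    # The element name and string contents are illustrative assumptions.
    doc = minidom.Document()
    root = doc.createElement("root")
    doc.appendChild(root)
    root.appendChild(doc.createTextNode("before "))
    root.appendChild(doc.createCDATASection("inside"))
    root.appendChild(doc.createTextNode(" after"))
    return root


root = build_root()
# nodeType 3 is TEXT_NODE, nodeType 4 is CDATA_SECTION_NODE.
types_before = [child.nodeType for child in root.childNodes]
root.normalize()
types_after = [child.nodeType for child in root.childNodes]

print(types_before)
print(types_after)
```

In minidom the two lists are identical, because the CDATA section separates the two Text nodes and so nothing is adjacent to merge.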

This test passes on Mozilla and IE.

This is failing on the latest WebKit build I have: build 52571.

Cheers,

- Bill
Comment 1 Darin Adler 2010-01-05 11:03:37 PST
Looking at Node::normalize, it seems the problem is using nodeType, which returns TEXT_NODE for both text nodes and character data nodes. Instead we should call isTextNode and isElementNode in that function.
Comment 2 Darin Adler 2010-01-05 14:51:55 PST
Turns out I had it backwards. CDATASection nodes are text nodes. So we need to call nodeType more, not less.
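(Illustration, not the WebKit patch: a simplified sketch in Python's xml.dom.minidom, where CDATASection also happens to be a subclass of Text. The helper names merge_adjacent, by_class, by_node_type and count_after are hypothetical. The sketch ignores empty Text nodes and does not recurse into child elements; it only contrasts a class-based check with a nodeType-based check.)

```python
from xml.dom import minidom
from xml.dom.minidom import Node, Text


def merge_adjacent(parent, is_mergeable):
    # Merge each child into the previous sibling when both are "mergeable".
    previous = None
    for child in list(parent.childNodes):
        if previous is not None and is_mergeable(previous) and is_mergeable(child):
            previous.data += child.data
            parent.removeChild(child)
        else:
            previous = child


def by_class(node):
    # Class-based check: also true for CDATASection, a subclass of Text.
    return isinstance(node, Text)


def by_node_type(node):
    # nodeType-based check: true only for plain Text nodes.
    return node.nodeType == Node.TEXT_NODE


def count_after(is_mergeable):
    # Build Text, CDATA Section, Text and count the children after merging.
    doc = minidom.Document()
    root = doc.createElement("root")
    doc.appendChild(root)
    root.appendChild(doc.createTextNode("before "))
    root.appendChild(doc.createCDATASection("inside"))
    root.appendChild(doc.createTextNode(" after"))
    merge_adjacent(root, is_mergeable)
    return len(root.childNodes)


print(count_after(by_class))      # class check collapses everything
print(count_after(by_node_type))  # nodeType check keeps the structure
```

The class-based check reproduces the reported symptom (one child after merging), while the nodeType-based check preserves all three children.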
Comment 3 Darin Adler 2010-01-05 14:56:00 PST
I’ll fix this.
Comment 4 Darin Adler 2010-01-05 17:30:16 PST
Created attachment 45947
patch
Comment 5 Darin Adler 2010-01-05 17:36:12 PST
http://trac.webkit.org/changeset/52840
Comment 6 William J. Edney 2010-01-05 21:53:36 PST
That was fast!

Thanks Darin!

Cheers,

- Bill
Comment 7 Lucas Forschler 2019-02-06 09:04:13 PST
Mass moving XML DOM bugs to the "DOM" Component.