Bug 33191

Summary: CDATA sections are merged into Text nodes when normalize() is used
Product: WebKit Reporter: William J. Edney <bedney>
Component: DOM Assignee: Darin Adler <darin>
Severity: Normal CC: ap, cdumez, darin
Priority: P2 Keywords: EasyFix
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Description                                  Flags
Testcase showing invalid merging behavior
patch                                        mitz: review+

Description William J. Edney 2010-01-04 18:13:20 PST
Created attachment 45855
Testcase showing invalid merging behavior

As the attached testcase demonstrates, WebKit violates the DOM specification by merging CDATA sections into adjacent Text nodes when normalize() is used.

The description of the normalize() method in the DOM Level 2 specification says:

"Puts all Text nodes in the full depth of the sub-tree underneath this Node, including attribute nodes, into a "normal" form where only structure (e.g., elements, comments, processing instructions, CDATA sections, and entity references) separates Text nodes".

Further, section 1.3, which describes CDATA sections, states that adjacent CDATA sections are not to be merged with each other when normalize() is used:

"Adjacent CDATASection nodes are not merged by use of the normalize method of the Node interface"

The attached testcase shows that the XML chunk starts out with 3 nodes under the document element in this order:

- Text node
- CDATA Section node
- Text node

and confirms that the document element has 3 child nodes before normalization and 1 child node after normalization (when it should still have 3, having preserved the above structure).
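
For readers without the attachment, the scenario can be approximated outside a browser. The following is a minimal sketch, not the attached testcase: it assumes Python's standard xml.dom.minidom as a stand-in DOM, and the element name root, the sample strings, and the helper child_types are made up for illustration. An extra Text node is appended so that normalize() has something legitimate to merge.

```python
from xml.dom import Node, minidom


def child_types(element):
    """Return the nodeType of each direct child, in document order."""
    return [child.nodeType for child in element.childNodes]


# Text node, CDATA section node, Text node (the structure described above).
doc = minidom.parseString("<root>before<![CDATA[raw & data]]>after</root>")
root = doc.documentElement

# Hypothetical addition: a second, adjacent Text node that normalize()
# is allowed to merge into the preceding Text node.
root.appendChild(doc.createTextNode(" tail"))
types_before = child_types(root)

root.normalize()
types_after = child_types(root)

print("before:", types_before)
print("after: ", types_after)
print("CDATA kept:", Node.CDATA_SECTION_NODE in types_after)
```

In a DOM that follows the specification text quoted above, the CDATA section should remain a separate child, and only the two trailing Text nodes should be combined.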

This test passes on Mozilla and IE.

This fails on the latest WebKit build I have: build 52571.


- Bill
Comment 1 Darin Adler 2010-01-05 11:03:37 PST
Looking at Node::normalize, it seems the problem is the use of nodeType, which returns TEXT_NODE for both text nodes and character data nodes. Instead, we should call isTextNode and isElementNode in that function.
Comment 2 Darin Adler 2010-01-05 14:51:55 PST
Turns out I had it backwards. CDATASection nodes are text nodes. So we need to call nodeType more, not less.
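
The distinction can be sketched outside WebKit. This is not the WebKit patch; it is an illustration that assumes Python's xml.dom.minidom, where CDATASection happens to be a subclass of Text, so a class-based check treats CDATA sections as mergeable while a nodeType check does not. The helpers merge_adjacent, by_class, by_node_type, and count_after are hypothetical names.

```python
from xml.dom import Node, minidom


def merge_adjacent(parent, is_mergeable):
    """Merge each run of adjacent children accepted by is_mergeable."""
    previous = None
    for child in list(parent.childNodes):
        if previous is not None and is_mergeable(previous) and is_mergeable(child):
            # Fold the child's text into the previous node and drop the child.
            previous.data += child.data
            parent.removeChild(child)
        else:
            previous = child


def by_class(node):
    # Too broad: in minidom a CDATASection is also an instance of Text.
    return isinstance(node, minidom.Text)


def by_node_type(node):
    # Narrow: only plain Text nodes report TEXT_NODE here.
    return node.nodeType == Node.TEXT_NODE


def count_after(is_mergeable):
    """Parse a Text/CDATA/Text sample, merge, and count remaining children."""
    doc = minidom.parseString("<root>a<![CDATA[b]]>c</root>")
    root = doc.documentElement
    merge_adjacent(root, is_mergeable)
    return len(root.childNodes)


merged_by_class = count_after(by_class)
merged_by_type = count_after(by_node_type)
print(merged_by_class, merged_by_type)
```

The class-based check collapses all three children into one, which matches the behavior reported in this bug, while the nodeType check leaves the three children in place.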
Comment 3 Darin Adler 2010-01-05 14:56:00 PST
I’ll fix this.
Comment 4 Darin Adler 2010-01-05 17:30:16 PST
Created attachment 45947
Comment 5 Darin Adler 2010-01-05 17:36:12 PST
Comment 6 William J. Edney 2010-01-05 21:53:36 PST
That was fast!

Thanks, Darin!


- Bill
Comment 7 Lucas Forschler 2019-02-06 09:04:13 PST
Mass moving XML DOM bugs to the "DOM" Component.