25822 – DOM normalize does not remove empty text node between element nodes

RESOLVED FIXED25822

DOM normalize does not remove empty text node between element nodes

https://bugs.webkit.org/show_bug.cgi?id=25822

Summary DOM normalize does not remove empty text node between element nodes

Kai Brüning

Reported 2009-05-15 07:52:25 PDT

The DOM node normalize() function does not remove empty text nodes which are not adjacent to other text nodes nor the only child node. That is, empty text nodes between element nodes or at the start or end of the child list are not removed. This is clearly in contradiction to the DOM level 2 core specification. Note also that it works in FireFox (tested with 3.0.10). A test case is attached, a patch forthcoming.

Attachments
Test case for this bug (884 bytes, text/html) 2009-05-15 07:53 PDT, Kai Brüning	no flags	Details
Patch to fix this bug (5.59 KB, patch) 2009-05-15 09:23 PDT, Kai Brüning	darin: review+	Details Formatted Diff Diff
Path with suggested code changes (6.07 KB, patch) 2009-05-15 13:51 PDT, Kai Brüning	darin: review+	Details Formatted Diff Diff
Patch with more thorough set of test cases (11.17 KB, patch) 2009-05-17 06:54 PDT, Kai Brüning	darin: review+	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Kai Brüning

Comment 1 2009-05-15 07:53:19 PDT

Created attachment 30384 [details] Test case for this bug

Kai Brüning

Comment 2 2009-05-15 09:23:19 PDT

Created attachment 30390 [details] Patch to fix this bug I do not have a Tiger machine to run layout tests, but I made sure that the set of failed layout tests under Leopard did not change by the patch.

Darin Adler

Comment 3 2009-05-15 10:26:11 PDT

Comment on attachment 30390 [details] Patch to fix this bug > - for (; node; node = node->traverseNextNodePostOrder()) { > + for (; node; ) { Should be changed a while, then. > + } else { > + node = node->traverseNextNodePostOrder(); WebKit style doesn't use braces around single lines. Typically we'd use "early continue" in a case like this: if (type != TEXT_NODE) { node = node->traverseNextNodePostOrder(); continue; } I'd really like to see a more thorough set of test cases. Adding a test for only the single case that shows the current bug seems too minimal. r=me as is -- none of my requested changes are mandatory

Kai Brüning

Comment 4 2009-05-15 13:51:35 PDT

Created attachment 30400 [details] Path with suggested code changes I integrated the code-level suggestions of Darin. Adding more test will take a little longer; guess I expected tests for the other cases to already exist. Should this be one file per tested case or is it ok to combine several cases into one test file?

Kai Brüning

Comment 5 2009-05-15 13:56:36 PDT

One additional remark: I am a little puzzled by the fact that the original code carefully and specifically handled the "single empty text node child" case, but ignored the mixed empty text node/elements case. I just would feel better knowing the rationale behind that code.

Darin Adler

Comment 6 2009-05-15 14:15:23 PDT

(In reply to comment #4) > Should this be one file per tested case or is it ok to combine several cases > into one test file? A single test file is better. The key to making a good test is making sure you can understand what its testing and see what's failing and what's working.

Darin Adler

Comment 7 2009-05-15 14:16:11 PDT

(In reply to comment #5) > One additional remark: I am a little puzzled by the fact that the original code > carefully and specifically handled the "single empty text node child" case, but > ignored the mixed empty text node/elements case. I just would feel better > knowing the rationale behind that code. You can look at the ChangeLog to see if there are any comments that make it clear. I doubt there was any rationale -- it was probably a simple mistake. I trust test cases more than I trust code.

Kai Brüning

Comment 8 2009-05-17 06:54:19 PDT

Created attachment 30421 [details] Patch with more thorough set of test cases I extended the test case to (hopefully) test the full range of the normalize() functionality. To express this, I renamed it to "normalize.html". The test lives inside "dom", therefore I consider the term "normalize" well defined. Please advice if a longer name is necessary. Having these tests and after learning that document->textNodesMerged() does adjust DOMRanges to the changed text nodes and nothing more, I changed the code of Node::normalize() a little more to make it more concise and less historic. Concerning rationale of the old code, I checked the ChangeLogs and found the entry about the addition of the "single empty text node child" case. It is indeed rather inexpressively, just stating the addition of this case.

Darin Adler

Comment 9 2009-05-17 08:51:13 PDT

Comment on attachment 30421 [details] Patch with more thorough set of test cases > + // Merge non-empty text nodes. > + while (1) { > + Node* nextSibling = node->nextSibling(); > + if (!nextSibling || !nextSibling->isTextNode()) { > + node = node->traverseNextNodePostOrder(); > + break; > + } I think it would be clearer to have the node = node->traverseNextNodePostOrder() line outside the if statement. I also would prefer to have a while condition instead of just while (1). Here's how I'd write it: while (Node* nextSibling = node->nextSibling()) { if (!nextSiblng->isTextNode()) break; ... } node = node->traverseNextNodePostOrder(); Patch looks good, r=me even without that change.

Darin Adler

Comment 10 2009-05-17 09:02:47 PDT

I'll take care of landing this. I think I'll tweak the test a bit.

Kai Brüning

Comment 11 2009-05-17 11:11:22 PDT

> I'll take care of landing this. I think I'll tweak the test a bit. Great. Please tweak the code, too, as you suggested. I agree that both changes make it clearer.

Darin Adler

Comment 12 2009-05-18 14:16:27 PDT

http://trac.webkit.org/changeset/43809

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Mac

OS OS X 10.5

Product WebKit

Component DOM

Assignee

Darin Adler

Reported

2009-05-15 07:52 PDT

Modified

2009-05-18 14:16 PDT History

CC List

0 users

URL

Keywords

Depends on

Blocks