Bug 3492 - TreeWalker implementation needs to be fixed (affects Acid3)
Summary: TreeWalker implementation needs to be fixed (affects Acid3)
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 412
Hardware: Mac OS X 10.4
: P3 Normal
Assignee: Darin Adler
: 16759 (view as bug list)
Depends on:
Blocks: Acid3
  Show dependency treegraph
Reported: 2005-06-12 20:55 PDT by Darin Adler
Modified: 2019-02-06 09:03 PST (History)
7 users (show)

See Also:

Here's a partial patch with my work in progress to fix this. (48.19 KB, patch)
2005-06-12 21:04 PDT, Darin Adler
no flags Details | Formatted Diff | Diff
another patch -- not sure if it's better than the first try (70.84 KB, patch)
2005-08-27 09:23 PDT, Darin Adler
no flags Details | Formatted Diff | Diff
patch -- seems to be great, but needs some more tests (73.64 KB, patch)
2008-01-08 22:51 PST, Darin Adler
no flags Details | Formatted Diff | Diff
much better patch -- not perfect on Acid3, but makes us pass 4 more tests (from 69 to 73 when I tried it) (112.48 KB, patch)
2008-02-03 18:10 PST, Darin Adler
eric: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Darin Adler 2005-06-12 20:55:06 PDT
I started investigating TreeWalker the other day, and learned:

    1) All the layout tests in layout-tests/traversal have incorrect results checked in.
    2) The implementation doesn't really come close on handling cases where many nodes are filtered 
out. It has a bunch of incorrect assumptions in the various operation implementations.
    3) The code does not do enough referencing of nodes it's holding on to -- must remember that the 
filter function can be arbitrary JavaScript, so a node could be removed from the document and all other 
references released at any time.
    4) There's an "ancestorRejected" function in there that can't possibly be used in a correct 
implementation; a careful reading of the DOM specification makes it clear that you never have to look at 
your ancestors -- you simply have to look at parents as you encounter them during the tree walk.
Comment 1 Darin Adler 2005-06-12 21:04:03 PDT
Created attachment 2285 [details]
Here's a partial patch with my work in progress to fix this.

The implementation is about 80% there. Need to add a lot of new test cases,
though. This simply includes the existing test cases, but fixed to quickly
Comment 2 Darin Adler 2005-06-12 21:10:07 PDT
I meant to say "fixed to write out results in text form, and with results that I believe may be correct".
Comment 3 Maciej Stachowiak 2005-06-12 23:44:26 PDT
Assigning to Darin since he is working on it. Also, have you checked out the KDOM implementation of 
TreeWalker? I think they have a complete and well-tested implementation which may be easy to port to 
our DOM.
Comment 4 Ian 'Hixie' Hickson 2005-06-13 03:43:46 PDT
In theory if I write any tests they'll end up here:
I don't have any at the moment, though.
Comment 5 Darin Adler 2005-06-14 08:53:30 PDT
Maybe KDOM is the right place to get it from.
Comment 6 Maciej Stachowiak 2005-06-14 11:43:17 PDT
Actually, the KDO guys tell me their implementation is mostly untested, so continuing with this fix might 
be better after all.
Comment 7 Darin Adler 2005-08-27 09:23:14 PDT
Created attachment 3609 [details]
another patch -- not sure if it's better than the first try
Comment 8 Eric Seidel (no email) 2007-09-30 14:43:30 PDT
Hum.  Perhaps a better way to go about fixing this would be to break out the tests (land them even) and then go about fixing them one at a time.  Doesn't require as long an attention span. ;)
Comment 9 Sam Weinig 2007-09-30 21:17:17 PDT
I don't believe any new tests were added, just if (window.layoutTestController) layoutTestController.dumpAsText(); was added to each test and they have new/correct output.

Comment 10 Darin Adler 2008-01-08 13:05:23 PST
I have a new version that I think is correct.
Comment 11 Darin Adler 2008-01-08 22:51:04 PST
Created attachment 18345 [details]
patch -- seems to be great, but needs some more tests
Comment 12 Darin Adler 2008-01-09 14:31:21 PST
Things I need to do to get this patch in better shape:

    1) catch the ObjC exceptions in the ObjC filter class (look at acceptNode implementations)
    2) add tests for the various things I fixed besides the exception propagation

To make a thorough enough set of tests, I'll need to think about a separate test for each thing I fixed and also look at the tests already attached to bug 16743, bug 16744, and bug 16759.
Comment 13 Darin Adler 2008-01-09 14:31:53 PST
*** Bug 16759 has been marked as a duplicate of this bug. ***
Comment 14 Eric Seidel (no email) 2008-01-17 17:42:36 PST
FYI, this patch changes behavior on the Acid3 test:

Test 6: expected [object HTMLTitleElement], got: null - failed to handle regrafting correctly
is now fixed.

Test 5: expected ,, got: . - expectation 22 failed
has changed to:
Test 5: expected [object HTMLBodyElement], got: null - expectation 20 failed

Tests 2 and 4 remain the same (failures)

Test 1: method [object NodeIterator].nextNode() didn't forward exception
has changed to:
Test 1: expected null, got: [object HTMLBodyElement] - w.nextSibling() didn't return the right node
Comment 15 Eric Seidel (no email) 2008-01-17 17:59:36 PST
It looks like Hixie already has a small suite of DOM Traversal tests:


for both NodeIterator and TreeWalker.
Comment 17 Darin Adler 2008-01-18 16:00:22 PST
Maks Orlovich just checked in a new implementation for KHTML that might be usable.
Comment 18 Eric Seidel (no email) 2008-01-24 10:46:55 PST
The KDE folks made similar fixes:

We should at least check out their test cases.
Comment 19 Darin Adler 2008-02-03 14:29:43 PST
http://www.hixie.ch/tests/adhoc/dom/traversal/node-iterator/001.xml fails because we don't parse an empty CDATA. I'm going to change a copy of the tests so I can use the rest of it; we could use a separate bug about that problem.
Comment 20 Darin Adler 2008-02-03 18:10:03 PST
Created attachment 18894 [details]
much better patch -- not perfect on Acid3, but makes us pass 4 more tests (from 69 to 73 when I tried it)

This patch fixes a *lot* of little issues. I grabbed the test case that Maks made for KHTML and we now pass that.

But there's still some work needed to pass the Acid3 tests, including some possible tweaks needed to the tests. I think this is far enough along that it probably would be good to review and land it even though it's not enough to pass all the relevant Acid3 tests.
Comment 21 Darin Adler 2008-02-03 18:15:57 PST
(In reply to comment #20)
> This patch fixes a *lot* of little issues. I grabbed the test case that Maks
> made for KHTML and we now pass that.

Oh, and also all of Hixie's NodeIterator tests, which are also in this patch.
Comment 22 Eric Seidel (no email) 2008-02-03 23:10:20 PST
You must mean from 71 to 75. :)  or 76 once I post (and land) an improved copy of bug 17125.
Comment 23 Eric Seidel (no email) 2008-02-07 09:36:23 PST
Comment on attachment 18894 [details]
much better patch -- not perfect on Acid3, but makes us pass 4 more tests (from 69 to 73 when I tried it)

// FIXME: Should we add takeException as a member of ExecState?
IMO: yes.

if (exec->hadException()) {
    exception = takeException(exec);
    return NodeFilter::FILTER_REJECT;
These could be even shorter using a helper:
return rejectNodeWithException(exec);

We should file a bug about extending the js bindings mechanism to handle the exception passing implemented by JSTreeWalkerCustom.cpp
[PassesException] or [RethrowsException]

const Node *n = this;
Node* traversePreviousSiblingPostOrder(const Node *stayWithin = 0) const;

Seems slightly silly to clear "exception" every time through the loop in Node* NodeIterator::nextNode(ExceptionCode& ec, JSValue*& exception)

 153 void NodeIterator::notifyBeforeNodeRemoval(Node* removedNode)
//updateForNodeRemoval(removedNode, m_candidateNode);

I think this check in TreeWalker::parentNode() to avoid the ref churn is probably unnecessary:
    if (m_current == root())
 56         return 0;

It might be worth adding a private helper function:
bool shouldAcceptNode(Node*, JSValue*& exception); due to how many times the rather-long: acceptNode(m_candidateNode.node.get(), exception) == NodeFilter::FILTER_ACCEPT string used.

It would also be nice if this common hunk of code could be pushed into a common function:
if (acceptNodeResult == NodeFilter::FILTER_ACCEPT) {
 280             m_current = node.release();
 281             return m_current.get();
 282         }

Overall this looks great!  Certainly much much better than what's currently in TOT.
Comment 24 Darin Adler 2008-02-08 02:09:35 PST
I made some of your suggested changes, and left others for the future.
Comment 25 Darin Adler 2008-02-08 02:36:01 PST
Committed revision 30089.

There may be Acid3 issues remaining, but we should track them with new/separate bugs.
Comment 26 Lucas Forschler 2019-02-06 09:03:57 PST
Mass moving XML DOM bugs to the "DOM" Component.