I started investigating TreeWalker the other day, and learned: 1) All the layout tests in layout-tests/traversal have incorrect results checked in. 2) The implementation doesn't really come close on handling cases where many nodes are filtered out. It has a bunch of incorrect assumptions in the various operation implementations. 3) The code does not do enough referencing of nodes it's holding on to -- must remember that the filter function can be arbitrary JavaScript, so a node could be removed from the document and all other references released at any time. 4) There's an "ancestorRejected" function in there that can't possibly be used in a correct implementation; a careful reading of the DOM specification makes it clear that you never have to look at your ancestors -- you simply have to look at parents as you encounter them during the tree walk.
Created attachment 2285 [details] Here's a partial patch with my work in progress to fix this. The implementation is about 80% there. Need to add a lot of new test cases, though. This simply includes the existing test cases, but fixed to quickly
I meant to say "fixed to write out results in text form, and with results that I believe may be correct".
Assigning to Darin since he is working on it. Also, have you checked out the KDOM implementation of TreeWalker? I think they have a complete and well-tested implementation which may be easy to port to our DOM.
In theory if I write any tests they'll end up here: http://hixie.ch/tests/adhoc/dom/traversal/ I don't have any at the moment, though.
Maybe KDOM is the right place to get it from.
Actually, the KDO guys tell me their implementation is mostly untested, so continuing with this fix might be better after all.
Created attachment 3609 [details] another patch -- not sure if it's better than the first try
Hum. Perhaps a better way to go about fixing this would be to break out the tests (land them even) and then go about fixing them one at a time. Doesn't require as long an attention span. ;)
I don't believe any new tests were added, just if (window.layoutTestController) layoutTestController.dumpAsText(); was added to each test and they have new/correct output.
I have a new version that I think is correct.
Created attachment 18345 [details] patch -- seems to be great, but needs some more tests
Things I need to do to get this patch in better shape: 1) catch the ObjC exceptions in the ObjC filter class (look at acceptNode implementations) 2) add tests for the various things I fixed besides the exception propagation To make a thorough enough set of tests, I'll need to think about a separate test for each thing I fixed and also look at the tests already attached to bug 16743, bug 16744, and bug 16759.
*** Bug 16759 has been marked as a duplicate of this bug. ***
FYI, this patch changes behavior on the Acid3 test: Test 6: expected [object HTMLTitleElement], got: null - failed to handle regrafting correctly is now fixed. Test 5: expected ,, got: . - expectation 22 failed has changed to: Test 5: expected [object HTMLBodyElement], got: null - expectation 20 failed Tests 2 and 4 remain the same (failures) Test 1: method [object NodeIterator].nextNode() didn't forward exception has changed to: Test 1: expected null, got: [object HTMLBodyElement] - w.nextSibling() didn't return the right node
It looks like Hixie already has a small suite of DOM Traversal tests: http://www.hixie.ch/tests/adhoc/dom/traversal/ for both NodeIterator and TreeWalker.
With your patch applied we fail: http://www.hixie.ch/tests/adhoc/dom/traversal/node-iterator/001.xml http://www.hixie.ch/tests/adhoc/dom/traversal/node-iterator/002.xml http://www.hixie.ch/tests/adhoc/dom/traversal/node-iterator/010.xml It turns out there are no TreeWalker tests. :)
Maks Orlovich just checked in a new implementation for KHTML that might be usable.
The KDE folks made similar fixes: http://websvn.kde.org/?view=rev&revision=763246 http://websvn.kde.org/?view=rev&revision=763253 We should at least check out their test cases.
http://www.hixie.ch/tests/adhoc/dom/traversal/node-iterator/001.xml fails because we don't parse an empty CDATA. I'm going to change a copy of the tests so I can use the rest of it; we could use a separate bug about that problem.
Created attachment 18894 [details] much better patch -- not perfect on Acid3, but makes us pass 4 more tests (from 69 to 73 when I tried it) This patch fixes a *lot* of little issues. I grabbed the test case that Maks made for KHTML and we now pass that. But there's still some work needed to pass the Acid3 tests, including some possible tweaks needed to the tests. I think this is far enough along that it probably would be good to review and land it even though it's not enough to pass all the relevant Acid3 tests.
(In reply to comment #20) > This patch fixes a *lot* of little issues. I grabbed the test case that Maks > made for KHTML and we now pass that. Oh, and also all of Hixie's NodeIterator tests, which are also in this patch.
You must mean from 71 to 75. :) or 76 once I post (and land) an improved copy of bug 17125.
Comment on attachment 18894 [details] much better patch -- not perfect on Acid3, but makes us pass 4 more tests (from 69 to 73 when I tried it) // FIXME: Should we add takeException as a member of ExecState? IMO: yes. if (exec->hadException()) { exception = takeException(exec); return NodeFilter::FILTER_REJECT; } These could be even shorter using a helper: return rejectNodeWithException(exec); :sigh: We should file a bug about extending the js bindings mechanism to handle the exception passing implemented by JSTreeWalkerCustom.cpp [PassesException] or [RethrowsException] const Node *n = this; Node* traversePreviousSiblingPostOrder(const Node *stayWithin = 0) const; Seems slightly silly to clear "exception" every time through the loop in Node* NodeIterator::nextNode(ExceptionCode& ec, JSValue*& exception) 153 void NodeIterator::notifyBeforeNodeRemoval(Node* removedNode) has: //updateForNodeRemoval(removedNode, m_candidateNode); I think this check in TreeWalker::parentNode() to avoid the ref churn is probably unnecessary: if (m_current == root()) 56 return 0; It might be worth adding a private helper function: bool shouldAcceptNode(Node*, JSValue*& exception); due to how many times the rather-long: acceptNode(m_candidateNode.node.get(), exception) == NodeFilter::FILTER_ACCEPT string used. It would also be nice if this common hunk of code could be pushed into a common function: if (acceptNodeResult == NodeFilter::FILTER_ACCEPT) { 280 m_current = node.release(); 281 return m_current.get(); 282 } Overall this looks great! Certainly much much better than what's currently in TOT.
I made some of your suggested changes, and left others for the future.
Committed revision 30089. There may be Acid3 issues remaining, but we should track them with new/separate bugs.
Mass moving XML DOM bugs to the "DOM" Component.