Bug 26005

Summary: Optimization for XPath //* does not preserve context size
Product: WebKit Reporter: Alexey Proskuryakov <ap>
Component: XMLAssignee: Alexey Proskuryakov <ap>
Status: RESOLVED FIXED    
Severity: Normal CC: fearphage+bugmail, kangax
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
test case
none
test case showing before and after dom append
none
proposed fix darin: review+

Alexey Proskuryakov
Reported 2009-05-25 00:33:46 PDT
//* is an abbreviated form for /descendant-or-self::node()/child::*. When applying predicates to the last step, the context should contain all children elements of a given node, so e.g. //*[2] locates all second children elements in document. We wrongfully optimize this to /descendant:node()/self::*, so the context for predicates always contains a single element.
Attachments
test case (4.14 KB, text/html)
2009-05-25 00:37 PDT, Alexey Proskuryakov
no flags
test case showing before and after dom append (590 bytes, text/html)
2009-05-27 06:01 PDT, Phred
no flags
proposed fix (29.73 KB, patch)
2009-05-28 03:05 PDT, Alexey Proskuryakov
darin: review+
Alexey Proskuryakov
Comment 1 2009-05-25 00:37:14 PDT
Created attachment 30645 [details] test case
Phred
Comment 2 2009-05-27 06:01:25 PDT
Created attachment 30705 [details] test case showing before and after dom append As best I can discern the spec http://www.w3.org/TR/xpath#path-abbrev : "//para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node" There exist no parent document for the context node in the original URL, thus there should be no answer (null). Firefox 2 returned null and Opera currently returns null in this case. However once it is appended to the document, the answer refers to the root element as it should. The bug here is that you are executing from the current document root regardless of whether the context node exist in the document or not.
Phred
Comment 3 2009-05-27 06:08:55 PDT
The previous comment was actually meant for Bug 25931.
Alexey Proskuryakov
Comment 4 2009-05-28 03:05:11 PDT
Created attachment 30733 [details] proposed fix
Darin Adler
Comment 5 2009-05-28 09:04:54 PDT
Comment on attachment 30733 [details] proposed fix > + * Copyright (C) 2005 Frerich Raabe <raabe@kde.org> > + * Copyright (C) 2006, 2009 Apple, Inc. No comma in Apple Inc. > + switch (axis) { > + case Step::AttributeAxis: > + return Node::ATTRIBUTE_NODE; > + case Step::NamespaceAxis: > + return Node::XPATH_NAMESPACE_NODE; > + default: > + return Node::ELEMENT_NODE; > + } I like to leave out the "default" case so that gcc can warn me about unhandled enum values. This is practical when the cases end with return statements, since the default return statement can go outside the switch, perhaps with an ASSERT_NOT_REACHED(). > + // In XPath land, namespace nodes are not accessible on the attribute axis. > + if (node->namespaceURI() == "http://www.w3.org/2000/xmlns/") > + return false; Since namespaces are atomic strings, comparing to an AtomicString constant may be slightly more efficient, and it would also be nice to have that string in some common place. r=me
Alexey Proskuryakov
Comment 6 2009-05-28 09:26:07 PDT
Committed <http://trac.webkit.org/changeset/44232>. (In reply to comment #5) > No comma in Apple Inc. Fixed. > I like to leave out the "default" case so that gcc can warn me about unhandled > enum values. This is practical when the cases end with return statements, since > the default return statement can go outside the switch, perhaps with an > ASSERT_NOT_REACHED(). I think that in this case, leaving element axis as default is more natural. > Since namespaces are atomic strings, comparing to an AtomicString constant may > be slightly more efficient, and it would also be nice to have that string in > some common place. Agreed, will keep this in mind for future patches.
Note You need to log in before you can comment on or make changes to this bug.