WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
74665
Poor XPath performance when evaluating an expression that returns a lot of nodes
https://bugs.webkit.org/show_bug.cgi?id=74665
Summary
Poor XPath performance when evaluating an expression that returns a lot of nodes
Alexey Proskuryakov
Reported
2011-12-15 16:42:47 PST
Steps to reproduce: open <
http://opensource.apple.com/source/SQLite/SQLite-74/derived_source/sqlite3.c
>. After loading, Safari will freeze for a long time. The reason is that Safari Reader evaluates an XPath expression, which returns about 180K nodes. Sorting such a node set takes a long time. The sorting function is optimized for small node sets in large documents, and this is the opposite of it. <
rdar://problem/10517146
>
Attachments
proposed fix
(4.41 KB, patch)
2011-12-15 16:49 PST
,
Alexey Proskuryakov
darin
: review+
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Alexey Proskuryakov
Comment 1
2011-12-15 16:49:14 PST
Created
attachment 119519
[details]
proposed fix
Darin Adler
Comment 2
2011-12-15 17:02:18 PST
Comment on
attachment 119519
[details]
proposed fix View in context:
https://bugs.webkit.org/attachment.cgi?id=119519&action=review
> Source/WebCore/xml/XPathNodeSet.cpp:38 > +// When a node set is large, sorting it by traversing the whole document is better (we can > +// assume that we aren't dealing with documents that we cannot even traverse in reasonable time). > +const unsigned traversalSortCutoff = 10000;
Maybe comparing the node set to the total number of nodes in the document would be better. I guess we don’t really know what that is, so a hardcoded numbers seems fine.
> Source/WebCore/xml/XPathNodeSet.cpp:203 > + if (node->isAttributeNode()) > + containsAttributeNodes = true;
Since isAttributeNode is a virtual function we might want to do the much faster checks that have flags right in the Node first: if (!node->isElementNode() && !node->isTextNode() && node->isAttributeNode()) We might even consider putting that optimization into the isAttributeNode function itself.
> Source/WebCore/xml/XPathNodeSet.cpp:211 > + if (nodes.contains(n)) > + sortedNodes.append(n);
Would it be more or less efficient to remove each node from the set as it’s handled?
> Source/WebCore/xml/XPathNodeSet.cpp:216 > + if (!containsAttributeNodes || !n->hasAttributes()) > + continue; > + > + NamedNodeMap* attributes = n->attributes();
It would be more efficient to write it like this: if (!containsAttributeNodes || !n->isElementNode()) continue; NamedNodeMap* attributes = toElement(n)->attributes(true /* read-only */); if (!attributes) continue; Avoids doing some work twice in the hasAttributes and attributes functions.
> Source/WebCore/xml/XPathNodeSet.cpp:226 > + const_cast<Vector<RefPtr<Node> >& >(m_nodes).swap(sortedNodes);
No need for the space after the "&" character. Is const_cast preferred over mutable? I think a typedef for Vector<RefPtr<Node> > would help make this more readable.
Alexey Proskuryakov
Comment 3
2011-12-16 11:01:03 PST
Committed <
http://trac.webkit.org/changeset/103082
>.
> We might even consider putting that optimization into the isAttributeNode function itself.
Seems more appropriate to optimize isAttributeNode when needed. In this particular instance, this micro optimization feels unwarranted, as we do many more operations per node than this virtual call.
> Would it be more or less efficient to remove each node from the set as it’s handled?
My gut feeling is that it would be less efficient. But I didn't measure.
> It would be more efficient to write it like this:
Done.
> Is const_cast preferred over mutable?
I think so. The vector is not mutable, it's only sortable. We don't want to change it in other const methods.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug