<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>74665</bug_id>
          
          <creation_ts>2011-12-15 16:42:47 -0800</creation_ts>
          <short_desc>Poor XPath performance when evaluating an expression that returns a lot of nodes</short_desc>
          <delta_ts>2011-12-16 11:01:03 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>XML</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Alexey Proskuryakov">ap</reporter>
          <assigned_to name="Alexey Proskuryakov">ap</assigned_to>
          
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>522328</commentid>
    <comment_count>0</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2011-12-15 16:42:47 -0800</bug_when>
    <thetext>Steps to reproduce: open &lt;http://opensource.apple.com/source/SQLite/SQLite-74/derived_source/sqlite3.c&gt;. After loading, Safari will freeze for a long time.

The reason is that Safari Reader evaluates an XPath expression, which returns about 180K nodes. Sorting such a node set takes a long time.

The sorting function is optimized for small node sets in large documents, and this is the opposite of it.

&lt;rdar://problem/10517146&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>522336</commentid>
    <comment_count>1</comment_count>
      <attachid>119519</attachid>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2011-12-15 16:49:14 -0800</bug_when>
    <thetext>Created attachment 119519
proposed fix</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>522354</commentid>
    <comment_count>2</comment_count>
      <attachid>119519</attachid>
    <who name="Darin Adler">darin</who>
    <bug_when>2011-12-15 17:02:18 -0800</bug_when>
    <thetext>Comment on attachment 119519
proposed fix

View in context: https://bugs.webkit.org/attachment.cgi?id=119519&amp;action=review

&gt; Source/WebCore/xml/XPathNodeSet.cpp:38
&gt; +// When a node set is large, sorting it by traversing the whole document is better (we can
&gt; +// assume that we aren&apos;t dealing with documents that we cannot even traverse in reasonable time).
&gt; +const unsigned traversalSortCutoff = 10000;

Maybe comparing the node set to the total number of nodes in the document would be better. I guess we don’t really know what that is, so a hardcoded numbers seems fine.

&gt; Source/WebCore/xml/XPathNodeSet.cpp:203
&gt; +        if (node-&gt;isAttributeNode())
&gt; +            containsAttributeNodes = true;

Since isAttributeNode is a virtual function we might want to do the much faster checks that have flags right in the Node first:

    if (!node-&gt;isElementNode() &amp;&amp; !node-&gt;isTextNode() &amp;&amp; node-&gt;isAttributeNode())

We might even consider putting that optimization into the isAttributeNode function itself.

&gt; Source/WebCore/xml/XPathNodeSet.cpp:211
&gt; +        if (nodes.contains(n))
&gt; +            sortedNodes.append(n);

Would it be more or less efficient to remove each node from the set as it’s handled?

&gt; Source/WebCore/xml/XPathNodeSet.cpp:216
&gt; +        if (!containsAttributeNodes || !n-&gt;hasAttributes())
&gt; +            continue;
&gt; +
&gt; +        NamedNodeMap* attributes = n-&gt;attributes();

It would be more efficient to write it like this:

    if (!containsAttributeNodes || !n-&gt;isElementNode())
        continue;

    NamedNodeMap* attributes = toElement(n)-&gt;attributes(true /* read-only */);
    if (!attributes)
        continue;

Avoids doing some work twice in the hasAttributes and attributes functions.

&gt; Source/WebCore/xml/XPathNodeSet.cpp:226
&gt; +    const_cast&lt;Vector&lt;RefPtr&lt;Node&gt; &gt;&amp; &gt;(m_nodes).swap(sortedNodes);

No need for the space after the &quot;&amp;&quot; character.

Is const_cast preferred over mutable?

I think a typedef for Vector&lt;RefPtr&lt;Node&gt; &gt; would help make this more readable.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>522846</commentid>
    <comment_count>3</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2011-12-16 11:01:03 -0800</bug_when>
    <thetext>Committed &lt;http://trac.webkit.org/changeset/103082&gt;.

&gt; We might even consider putting that optimization into the isAttributeNode function itself.

Seems more appropriate to optimize isAttributeNode when needed. In this particular instance, this micro optimization feels unwarranted, as we do many more operations per node than this virtual call.

&gt; Would it be more or less efficient to remove each node from the set as it’s handled?

My gut feeling is that it would be less efficient. But I didn&apos;t measure.

&gt; It would be more efficient to write it like this:

Done.

&gt; Is const_cast preferred over mutable?

I think so. The vector is not mutable, it&apos;s only sortable. We don&apos;t want to change it in other const methods.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>119519</attachid>
            <date>2011-12-15 16:49:14 -0800</date>
            <delta_ts>2011-12-15 17:02:18 -0800</delta_ts>
            <desc>proposed fix</desc>
            <filename>XPathForReader.txt</filename>
            <type>text/plain</type>
            <size>4520</size>
            <attacher name="Alexey Proskuryakov">ap</attacher>
            
              <data encoding="base64">SW5kZXg6IFNvdXJjZS9XZWJDb3JlL0NoYW5nZUxvZwo9PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBTb3VyY2UvV2Vi
Q29yZS9DaGFuZ2VMb2cJKHJldmlzaW9uIDEwMjk5NykKKysrIFNvdXJjZS9XZWJDb3JlL0NoYW5n
ZUxvZwkod29ya2luZyBjb3B5KQpAQCAtMSwzICsxLDI3IEBACisyMDExLTEyLTE1ICBBbGV4ZXkg
UHJvc2t1cnlha292ICA8YXBAYXBwbGUuY29tPgorCisgICAgICAgIFBvb3IgWFBhdGggcGVyZm9y
bWFuY2Ugd2hlbiBldmFsdWF0aW5nIGFuIGV4cHJlc3Npb24gdGhhdCByZXR1cm5zIGEgbG90IG9m
IG5vZGVzCisgICAgICAgIGh0dHBzOi8vYnVncy53ZWJraXQub3JnL3Nob3dfYnVnLmNnaT9pZD03
NDY2NQorICAgICAgICA8cmRhcjovL3Byb2JsZW0vMTA1MTcxNDY+CisKKyAgICAgICAgUmV2aWV3
ZWQgYnkgTk9CT0RZIChPT1BTISkuCisKKyAgICAgICAgTm8gY2hhbmdlIGluIGZ1bmNpdG9uYWxp
dHkuIFdlbGwgY292ZXJlZCBieSBleGlzdGluZyB0ZXN0cyAocmFuIHRoZW0gd2l0aCB6ZXJvIGN1
dG9mZiB0bworICAgICAgICBleGVjdXRlIHRoZSBuZXcgY29kZSBwYXRoKS4KKworICAgICAgICBP
dXIgc29ydGluZyBmdW5jdGlvbiBpcyBvcHRpbWl6ZWQgZm9yIHNtYWxsIG5vZGUgc2V0cyBpbiBs
YXJnZSBkb2N1bWVudHMsIGFuZCB0aGlzIGlzIHRoZQorICAgICAgICBvcHBvc2l0ZSBvZiBpdC4g
QWRkZWQgYW5vdGhlciBvbmUgdGhhdCB0cmF2ZXJzZXMgdGhlIHdob2xlIGRvY3VtZW50LCBhZGRp
bmcgbm9kZXMgZnJvbSB0aGUKKyAgICAgICAgbm9kZSBzZXQgdG8gc29ydGVkIGxpc3QuIFRoYXQg
ZG9lc24ndCBncm93IHdpdGggdGhlIG51bWJlciBvZiBub2RlcyBuZWFybHkgYXMgZmFzdC4KKwor
ICAgICAgICBDdXRvZmYgYW1vdW50IGNob3NlbiBmb3IgdGhlIGRvY3VtZW50IHJlZmVyZW5jZWQg
aW4gYnVnIC0gdGhpcyBpcyByb3VnaGx5IHdoZXJlIHRoZSBhbGdvcml0aG1zCisgICAgICAgIGhh
dmUgdGhlIHNhbWUgcGVyZm9ybWFuY2Ugb24gaXQuCisKKyAgICAgICAgKiB4bWwvWFBhdGhOb2Rl
U2V0LmNwcDoKKyAgICAgICAgKFdlYkNvcmU6OlhQYXRoOjpOb2RlU2V0Ojpzb3J0KToKKyAgICAg
ICAgKFdlYkNvcmU6OlhQYXRoOjpmaW5kUm9vdE5vZGUpOgorICAgICAgICAoV2ViQ29yZTo6WFBh
dGg6Ok5vZGVTZXQ6OnRyYXZlcnNhbFNvcnQpOgorICAgICAgICAqIHhtbC9YUGF0aE5vZGVTZXQu
aDoKKwogMjAxMS0xMi0xNSAgUnlvc3VrZSBOaXdhICA8cm5pd2FAd2Via2l0Lm9yZz4KIAogICAg
ICAgICBUb3VjaCBtYWtlX25hbWUucGwgaW4gYW4gYXR0ZW1wdCB0byBtYWtlIFF0IGJvdHMgaGFw
cHkuCkluZGV4OiBTb3VyY2UvV2ViQ29yZS94bWwvWFBhdGhOb2RlU2V0LmNwcAo9PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
Ci0tLSBTb3VyY2UvV2ViQ29yZS94bWwvWFBhdGhOb2RlU2V0LmNwcAkocmV2aXNpb24gMTAyNDU0
KQorKysgU291cmNlL1dlYkNvcmUveG1sL1hQYXRoTm9kZVNldC5jcHAJKHdvcmtpbmcgY29weSkK
QEAgLTMzLDYgKzMzLDEwIEBACiBuYW1lc3BhY2UgV2ViQ29yZSB7CiBuYW1lc3BhY2UgWFBhdGgg
ewogCisvLyBXaGVuIGEgbm9kZSBzZXQgaXMgbGFyZ2UsIHNvcnRpbmcgaXQgYnkgdHJhdmVyc2lu
ZyB0aGUgd2hvbGUgZG9jdW1lbnQgaXMgYmV0dGVyICh3ZSBjYW4KKy8vIGFzc3VtZSB0aGF0IHdl
IGFyZW4ndCBkZWFsaW5nIHdpdGggZG9jdW1lbnRzIHRoYXQgd2UgY2Fubm90IGV2ZW4gdHJhdmVy
c2UgaW4gcmVhc29uYWJsZSB0aW1lKS4KK2NvbnN0IHVuc2lnbmVkIHRyYXZlcnNhbFNvcnRDdXRv
ZmYgPSAxMDAwMDsKKwogc3RhdGljIGlubGluZSBOb2RlKiBwYXJlbnRXaXRoRGVwdGgodW5zaWdu
ZWQgZGVwdGgsIGNvbnN0IFZlY3RvcjxOb2RlKj4mIHBhcmVudHMpCiB7CiAgICAgQVNTRVJUKHBh
cmVudHMuc2l6ZSgpID49IGRlcHRoICsgMSk7CkBAIC0xNDAsNyArMTQ0LDEyIEBAIHZvaWQgTm9k
ZVNldDo6c29ydCgpIGNvbnN0CiAgICAgICAgIGNvbnN0X2Nhc3Q8Ym9vbCY+KG1faXNTb3J0ZWQp
ID0gdHJ1ZTsKICAgICAgICAgcmV0dXJuOwogICAgIH0KLSAgICAKKworICAgIGlmIChub2RlQ291
bnQgPiB0cmF2ZXJzYWxTb3J0Q3V0b2ZmKSB7CisgICAgICAgIHRyYXZlcnNhbFNvcnQoKTsKKyAg
ICAgICAgcmV0dXJuOworICAgIH0KKwogICAgIGJvb2wgY29udGFpbnNBdHRyaWJ1dGVOb2RlcyA9
IGZhbHNlOwogICAgIAogICAgIFZlY3RvcjxWZWN0b3I8Tm9kZSo+ID4gcGFyZW50TWF0cml4KG5v
ZGVDb3VudCk7CkBAIC0xNjcsNiArMTc2LDU2IEBAIHZvaWQgTm9kZVNldDo6c29ydCgpIGNvbnN0
CiAgICAgY29uc3RfY2FzdDxWZWN0b3I8UmVmUHRyPE5vZGU+ID4mID4obV9ub2Rlcykuc3dhcChz
b3J0ZWROb2Rlcyk7CiB9CiAKK3N0YXRpYyBOb2RlKiBmaW5kUm9vdE5vZGUoTm9kZSogbm9kZSkK
K3sKKyAgICBpZiAobm9kZS0+aXNBdHRyaWJ1dGVOb2RlKCkpCisgICAgICAgIG5vZGUgPSBzdGF0
aWNfY2FzdDxBdHRyKj4obm9kZSktPm93bmVyRWxlbWVudCgpOworICAgIGlmIChub2RlLT5pbkRv
Y3VtZW50KCkpCisgICAgICAgIG5vZGUgPSBub2RlLT5kb2N1bWVudCgpOworICAgIGVsc2Ugewor
ICAgICAgICB3aGlsZSAoTm9kZSogcGFyZW50ID0gbm9kZS0+cGFyZW50Tm9kZSgpKQorICAgICAg
ICAgICAgbm9kZSA9IHBhcmVudDsKKyAgICB9CisgICAgcmV0dXJuIG5vZGU7Cit9CisKK3ZvaWQg
Tm9kZVNldDo6dHJhdmVyc2FsU29ydCgpIGNvbnN0Cit7CisgICAgSGFzaFNldDxOb2RlKj4gbm9k
ZXM7CisgICAgYm9vbCBjb250YWluc0F0dHJpYnV0ZU5vZGVzID0gZmFsc2U7CisKKyAgICB1bnNp
Z25lZCBub2RlQ291bnQgPSBtX25vZGVzLnNpemUoKTsKKyAgICBBU1NFUlQobm9kZUNvdW50ID4g
MSk7CisgICAgZm9yICh1bnNpZ25lZCBpID0gMDsgaSA8IG5vZGVDb3VudDsgKytpKSB7CisgICAg
ICAgIE5vZGUqIG5vZGUgPSBtX25vZGVzW2ldLmdldCgpOworICAgICAgICBub2Rlcy5hZGQobm9k
ZSk7CisgICAgICAgIGlmIChub2RlLT5pc0F0dHJpYnV0ZU5vZGUoKSkKKyAgICAgICAgICAgIGNv
bnRhaW5zQXR0cmlidXRlTm9kZXMgPSB0cnVlOworICAgIH0KKworICAgIFZlY3RvcjxSZWZQdHI8
Tm9kZT4gPiBzb3J0ZWROb2RlczsKKyAgICBzb3J0ZWROb2Rlcy5yZXNlcnZlSW5pdGlhbENhcGFj
aXR5KG5vZGVDb3VudCk7CisKKyAgICBmb3IgKE5vZGUqIG4gPSBmaW5kUm9vdE5vZGUobV9ub2Rl
cy5maXJzdCgpLmdldCgpKTsgbjsgbiA9IG4tPnRyYXZlcnNlTmV4dE5vZGUoKSkgeworICAgICAg
ICBpZiAobm9kZXMuY29udGFpbnMobikpCisgICAgICAgICAgICBzb3J0ZWROb2Rlcy5hcHBlbmQo
bik7CisKKyAgICAgICAgaWYgKCFjb250YWluc0F0dHJpYnV0ZU5vZGVzIHx8ICFuLT5oYXNBdHRy
aWJ1dGVzKCkpCisgICAgICAgICAgICBjb250aW51ZTsKKworICAgICAgICBOYW1lZE5vZGVNYXAq
IGF0dHJpYnV0ZXMgPSBuLT5hdHRyaWJ1dGVzKCk7CisgICAgICAgIHVuc2lnbmVkIGF0dHJpYnV0
ZUNvdW50ID0gYXR0cmlidXRlcy0+bGVuZ3RoKCk7CisgICAgICAgIGZvciAodW5zaWduZWQgaSA9
IDA7IGkgPCBhdHRyaWJ1dGVDb3VudDsgKytpKSB7CisgICAgICAgICAgICBBdHRyKiBhdHRyaWJ1
dGUgPSBhdHRyaWJ1dGVzLT5hdHRyaWJ1dGVJdGVtKGkpLT5hdHRyKCk7CisgICAgICAgICAgICBp
ZiAoYXR0cmlidXRlICYmIG5vZGVzLmNvbnRhaW5zKGF0dHJpYnV0ZSkpCisgICAgICAgICAgICAg
ICAgc29ydGVkTm9kZXMuYXBwZW5kKGF0dHJpYnV0ZSk7CisgICAgICAgIH0KKyAgICB9CisKKyAg
ICBBU1NFUlQoc29ydGVkTm9kZXMuc2l6ZSgpID09IG5vZGVDb3VudCk7CisgICAgY29uc3RfY2Fz
dDxWZWN0b3I8UmVmUHRyPE5vZGU+ID4mID4obV9ub2Rlcykuc3dhcChzb3J0ZWROb2Rlcyk7Cit9
CisKIHZvaWQgTm9kZVNldDo6cmV2ZXJzZSgpCiB7CiAgICAgaWYgKG1fbm9kZXMuaXNFbXB0eSgp
KQpJbmRleDogU291cmNlL1dlYkNvcmUveG1sL1hQYXRoTm9kZVNldC5oCj09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0t
IFNvdXJjZS9XZWJDb3JlL3htbC9YUGF0aE5vZGVTZXQuaAkocmV2aXNpb24gMTAyNDU0KQorKysg
U291cmNlL1dlYkNvcmUveG1sL1hQYXRoTm9kZVNldC5oCSh3b3JraW5nIGNvcHkpCkBAIC03MSw2
ICs3MSw4IEBAIG5hbWVzcGFjZSBXZWJDb3JlIHsKICAgICAgICAgICAgIHZvaWQgcmV2ZXJzZSgp
OwogICAgICAgICAKICAgICAgICAgcHJpdmF0ZToKKyAgICAgICAgICAgIHZvaWQgdHJhdmVyc2Fs
U29ydCgpIGNvbnN0OworCiAgICAgICAgICAgICBib29sIG1faXNTb3J0ZWQ7CiAgICAgICAgICAg
ICBib29sIG1fc3VidHJlZXNBcmVEaXNqb2ludDsKICAgICAgICAgICAgIFZlY3RvcjxSZWZQdHI8
Tm9kZT4gPiBtX25vZGVzOwo=
</data>
<flag name="review"
          id="119569"
          type_id="1"
          status="+"
          setter="darin"
    />
          </attachment>
      

    </bug>

</bugzilla>