<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>143393</bug_id>
          
          <creation_ts>2015-04-03 16:46:28 -0700</creation_ts>
          <short_desc>[Content Extensions] Sort DFANodes by importance</short_desc>
          <delta_ts>2015-04-06 17:11:50 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>WebCore Misc.</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Alex Christensen">achristensen</reporter>
          <assigned_to name="Alex Christensen">achristensen</assigned_to>
          <cc>barraclough</cc>
    
    <cc>benjamin</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1082719</commentid>
    <comment_count>0</comment_count>
    <who name="Alex Christensen">achristensen</who>
    <bug_when>2015-04-03 16:46:28 -0700</bug_when>
    <thetext>The DFA nodes are currently compiled in the order they are added, which provides pretty-good memory locality compared to random ordering, but we could do better.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1082722</commentid>
    <comment_count>1</comment_count>
      <attachid>250107</attachid>
    <who name="Alex Christensen">achristensen</who>
    <bug_when>2015-04-03 16:52:42 -0700</bug_when>
    <thetext>Created attachment 250107
Patch</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1083112</commentid>
    <comment_count>2</comment_count>
      <attachid>250107</attachid>
    <who name="Geoffrey Garen">ggaren</who>
    <bug_when>2015-04-06 11:13:50 -0700</bug_when>
    <thetext>Comment on attachment 250107
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=250107&amp;action=review

&gt; Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:202
&gt; +static Vector&lt;unsigned&gt; zeroes(size_t count)
&gt; +{
&gt; +    Vector&lt;unsigned&gt; vector;
&gt; +    vector.reserveInitialCapacity(count);
&gt; +    for (unsigned i = 0; i &lt; count; ++i)
&gt; +        vector.append(0);
&gt; +    return vector;
&gt; +}

Wat. Plz use Vector::fill, which uses memset, which uses SIMD, which makes me happy.

&gt; Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:240
&gt; +static Vector&lt;unsigned&gt; incomingTransitions(const DFA&amp; dfa)
&gt; +{
&gt; +    Vector&lt;unsigned&gt; transitionsIn = zeroes(dfa.size());
&gt; +    for (unsigned i = 0; i &lt; dfa.size(); ++i) {
&gt; +        for (unsigned transition : dfa.nodeAt(i).transitions.values())
&gt; +            transitionsIn[transition]++;
&gt; +    }
&gt; +    return transitionsIn;
&gt; +}
&gt; +
&gt; +static Vector&lt;unsigned&gt; outgoingTransitions(const DFA&amp; dfa)
&gt; +{
&gt; +    Vector&lt;unsigned&gt; transitionsOut;
&gt; +    transitionsOut.reserveInitialCapacity(dfa.size());
&gt; +    for (unsigned i = 0; i &lt; dfa.size(); ++i)
&gt; +        transitionsOut.append(dfa.nodeAt(i).transitions.size());
&gt; +    return transitionsOut;
&gt; +}

Can we compute these in the same loop, to avoid looping twice? Or is this loop really inexpensive?

&gt; Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:279
&gt; +        // We should tune this objective function before landing, but this gives a slight improvement.

How much of an improvement is this?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1083220</commentid>
    <comment_count>3</comment_count>
      <attachid>250107</attachid>
    <who name="Alex Christensen">achristensen</who>
    <bug_when>2015-04-06 17:11:50 -0700</bug_when>
    <thetext>Comment on attachment 250107
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=250107&amp;action=review

We should look into this and measure different objective functions&apos; effects on memory usage after minimizing the size of the DFA bytecode.

&gt;&gt; Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:202
&gt;&gt; +}
&gt; 
&gt; Wat. Plz use Vector::fill, which uses memset, which uses SIMD, which makes me happy.

I knew there should be something in Vector to do this.

&gt;&gt; Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:240
&gt;&gt; +}
&gt; 
&gt; Can we compute these in the same loop, to avoid looping twice? Or is this loop really inexpensive?

Unless we do things terribly wrong, most of the time compiling is spent in NFAToDFA::convert.  This is really inexpensive, but we could do both in the same loop and organize the functions differently.

&gt;&gt; Source/WebCore/contentextensions/DFABytecodeCompiler.cpp:279
&gt;&gt; +        // We should tune this objective function before landing, but this gives a slight improvement.
&gt; 
&gt; How much of an improvement is this?

Not as much as I want.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>250107</attachid>
            <date>2015-04-03 16:52:42 -0700</date>
            <delta_ts>2015-04-06 17:11:50 -0700</delta_ts>
            <desc>Patch</desc>
            <filename>bug-143393-20150403165156.patch</filename>
            <type>text/plain</type>
            <size>5305</size>
            <attacher name="Alex Christensen">achristensen</attacher>
            
              <data encoding="base64">SW5kZXg6IFNvdXJjZS9XZWJDb3JlL0NoYW5nZUxvZwo9PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBTb3VyY2UvV2Vi
Q29yZS9DaGFuZ2VMb2cJKHJldmlzaW9uIDE4MjMzNykKKysrIFNvdXJjZS9XZWJDb3JlL0NoYW5n
ZUxvZwkod29ya2luZyBjb3B5KQpAQCAtMSwzICsxLDE5IEBACisyMDE1LTA0LTAzICBBbGV4IENo
cmlzdGVuc2VuICA8YWNocmlzdGVuc2VuQHdlYmtpdC5vcmc+CisKKyAgICAgICAgW0NvbnRlbnQg
RXh0ZW5zaW9uc10gU29ydCBERkFOb2RlcyBieSBpbXBvcnRhbmNlCisgICAgICAgIGh0dHBzOi8v
YnVncy53ZWJraXQub3JnL3Nob3dfYnVnLmNnaT9pZD0xNDMzOTMKKworICAgICAgICBSZXZpZXdl
ZCBieSBOT0JPRFkgKE9PUFMhKS4KKworICAgICAgICAqIGNvbnRlbnRleHRlbnNpb25zL0RGQUJ5
dGVjb2RlQ29tcGlsZXIuY3BwOgorICAgICAgICAoV2ViQ29yZTo6Q29udGVudEV4dGVuc2lvbnM6
Onplcm9lcyk6CisgICAgICAgIChXZWJDb3JlOjpDb250ZW50RXh0ZW5zaW9uczo6ZGlzdGFuY2Vz
RnJvbVJvb3QpOgorICAgICAgICAoV2ViQ29yZTo6Q29udGVudEV4dGVuc2lvbnM6OmluY29taW5n
VHJhbnNpdGlvbnMpOgorICAgICAgICAoV2ViQ29yZTo6Q29udGVudEV4dGVuc2lvbnM6Om91dGdv
aW5nVHJhbnNpdGlvbnMpOgorICAgICAgICAoV2ViQ29yZTo6Q29udGVudEV4dGVuc2lvbnM6Omhp
dENvdW50KToKKyAgICAgICAgKFdlYkNvcmU6OkNvbnRlbnRFeHRlbnNpb25zOjpERkFCeXRlY29k
ZUNvbXBpbGVyOjpjb21waWxlKToKKyAgICAgICAgU29ydCBERkEgbm9kZXMgYnkgaW1wb3J0YW5j
ZSBiZWZvcmUgY29tcGlsaW5nIHRoZW0gdG8gaW1wcm92ZSBtZW1vcnkgbG9jYWxpdHkgd2hlbiBp
bnRlcnByZXRpbmcuCisKIDIwMTUtMDQtMDMgIEFsZXggQ2hyaXN0ZW5zZW4gIDxhY2hyaXN0ZW5z
ZW5Ad2Via2l0Lm9yZz4KIAogICAgICAgICBbQ29udGVudCBFeHRlbnNpb25zXSBBZGQgbWVtb3J5
IHJlcG9ydGluZy4KSW5kZXg6IFNvdXJjZS9XZWJDb3JlL2NvbnRlbnRleHRlbnNpb25zL0RGQUJ5
dGVjb2RlQ29tcGlsZXIuY3BwCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIFNvdXJjZS9XZWJDb3JlL2NvbnRlbnRl
eHRlbnNpb25zL0RGQUJ5dGVjb2RlQ29tcGlsZXIuY3BwCShyZXZpc2lvbiAxODIzMzUpCisrKyBT
b3VyY2UvV2ViQ29yZS9jb250ZW50ZXh0ZW5zaW9ucy9ERkFCeXRlY29kZUNvbXBpbGVyLmNwcAko
d29ya2luZyBjb3B5KQpAQCAtMTkyLDYgKzE5Miw3NCBAQCB2b2lkIERGQUJ5dGVjb2RlQ29tcGls
ZXI6OmNvbXBpbGVDaGVja0ZvCiAgICAgICAgIGVtaXRDaGVja1ZhbHVlUmFuZ2UocmFuZ2UubWlu
LCByYW5nZS5tYXgsIHJhbmdlLmRlc3RpbmF0aW9uLCByYW5nZS5jYXNlU2Vuc2l0aXZlKTsKIH0K
IAorc3RhdGljIFZlY3Rvcjx1bnNpZ25lZD4gemVyb2VzKHNpemVfdCBjb3VudCkKK3sKKyAgICBW
ZWN0b3I8dW5zaWduZWQ+IHZlY3RvcjsKKyAgICB2ZWN0b3IucmVzZXJ2ZUluaXRpYWxDYXBhY2l0
eShjb3VudCk7CisgICAgZm9yICh1bnNpZ25lZCBpID0gMDsgaSA8IGNvdW50OyArK2kpCisgICAg
ICAgIHZlY3Rvci5hcHBlbmQoMCk7CisgICAgcmV0dXJuIHZlY3RvcjsKK30KKworc3RhdGljIFZl
Y3Rvcjx1bnNpZ25lZD4gZGlzdGFuY2VzRnJvbVJvb3QoY29uc3QgREZBJiBkZmEpCit7CisgICAg
VmVjdG9yPHVuc2lnbmVkPiBkaXN0YW5jZXMgPSB6ZXJvZXMoZGZhLnNpemUoKSk7CisgICAgVmVj
dG9yPHVuc2lnbmVkPiBsZWF2ZXM7CisgICAgbGVhdmVzLmFwcGVuZChkZmEucm9vdCgpKTsKKyAg
ICB3aGlsZSAobGVhdmVzLnNpemUoKSkgeworICAgICAgICB1bnNpZ25lZCBsZWFmSW5kZXggPSBs
ZWF2ZXMudGFrZUxhc3QoKTsKKyAgICAgICAgdW5zaWduZWQgbGVhZkRpc3RhbmNlID0gZGlzdGFu
Y2VzW2xlYWZJbmRleF07CisgICAgICAgIGNvbnN0IERGQU5vZGUmIGxlYWYgPSBkZmEubm9kZUF0
KGxlYWZJbmRleCk7CisgICAgICAgIGZvciAodW5zaWduZWQgdHJhbnNpdGlvbiA6IGxlYWYudHJh
bnNpdGlvbnMudmFsdWVzKCkpIHsKKyAgICAgICAgICAgIGlmICghZGlzdGFuY2VzW3RyYW5zaXRp
b25dKSB7CisgICAgICAgICAgICAgICAgZGlzdGFuY2VzW3RyYW5zaXRpb25dID0gbGVhZkRpc3Rh
bmNlICsgMTsKKyAgICAgICAgICAgICAgICBsZWF2ZXMuYXBwZW5kKHRyYW5zaXRpb24pOworICAg
ICAgICAgICAgfQorICAgICAgICB9CisgICAgfQorICAgIHJldHVybiBkaXN0YW5jZXM7Cit9CisK
K3N0YXRpYyBWZWN0b3I8dW5zaWduZWQ+IGluY29taW5nVHJhbnNpdGlvbnMoY29uc3QgREZBJiBk
ZmEpCit7CisgICAgVmVjdG9yPHVuc2lnbmVkPiB0cmFuc2l0aW9uc0luID0gemVyb2VzKGRmYS5z
aXplKCkpOworICAgIGZvciAodW5zaWduZWQgaSA9IDA7IGkgPCBkZmEuc2l6ZSgpOyArK2kpIHsK
KyAgICAgICAgZm9yICh1bnNpZ25lZCB0cmFuc2l0aW9uIDogZGZhLm5vZGVBdChpKS50cmFuc2l0
aW9ucy52YWx1ZXMoKSkKKyAgICAgICAgICAgIHRyYW5zaXRpb25zSW5bdHJhbnNpdGlvbl0rKzsK
KyAgICB9CisgICAgcmV0dXJuIHRyYW5zaXRpb25zSW47Cit9CisKK3N0YXRpYyBWZWN0b3I8dW5z
aWduZWQ+IG91dGdvaW5nVHJhbnNpdGlvbnMoY29uc3QgREZBJiBkZmEpCit7CisgICAgVmVjdG9y
PHVuc2lnbmVkPiB0cmFuc2l0aW9uc091dDsKKyAgICB0cmFuc2l0aW9uc091dC5yZXNlcnZlSW5p
dGlhbENhcGFjaXR5KGRmYS5zaXplKCkpOworICAgIGZvciAodW5zaWduZWQgaSA9IDA7IGkgPCBk
ZmEuc2l6ZSgpOyArK2kpCisgICAgICAgIHRyYW5zaXRpb25zT3V0LmFwcGVuZChkZmEubm9kZUF0
KGkpLnRyYW5zaXRpb25zLnNpemUoKSk7CisgICAgcmV0dXJuIHRyYW5zaXRpb25zT3V0OworfQor
CitzdGF0aWMgVmVjdG9yPHVuc2lnbmVkPiBoaXRDb3VudChjb25zdCBERkEmIGRmYSwgY29uc3Qg
Y2hhciogdXJsKQoreworICAgIHVuc2lnbmVkIGxlbmd0aCA9IHN0cmxlbih1cmwpOworICAgIFZl
Y3Rvcjx1bnNpZ25lZD4gaGl0cyA9IHplcm9lcyhkZmEuc2l6ZSgpKTsKKyAgICB1bnNpZ25lZCBj
dXJyZW50Tm9kZSA9IGRmYS5yb290KCk7CisgICAgZm9yICh1bnNpZ25lZCBpID0gMDsgaSA8IGxl
bmd0aDsgKytpKSB7CisgICAgICAgIGhpdHNbY3VycmVudE5vZGVdKys7CisKKyAgICAgICAgY2hh
ciBjdXJyZW50Q2hhcmFjdGVyID0gdXJsW2ldOworICAgICAgICBjb25zdCBhdXRvJiBub2RlID0g
ZGZhLm5vZGVBdChpKTsKKyAgICAgICAgY29uc3QgYXV0byBpdCA9IG5vZGUudHJhbnNpdGlvbnMu
ZmluZChzdGF0aWNfY2FzdDx1aW50MTZfdD4oY3VycmVudENoYXJhY3RlcikpOworICAgICAgICBp
ZiAoaXQgIT0gbm9kZS50cmFuc2l0aW9ucy5lbmQoKSkKKyAgICAgICAgICAgIGN1cnJlbnROb2Rl
ID0gaXQtPnZhbHVlOworICAgICAgICBlbHNlIGlmIChub2RlLmhhc0ZhbGxiYWNrVHJhbnNpdGlv
bikKKyAgICAgICAgICAgIGN1cnJlbnROb2RlID0gbm9kZS5mYWxsYmFja1RyYW5zaXRpb247Cisg
ICAgICAgIGVsc2UKKyAgICAgICAgICAgIGJyZWFrOworICAgIH0KKyAgICByZXR1cm4gaGl0czsK
K30KKwogdm9pZCBERkFCeXRlY29kZUNvbXBpbGVyOjpjb21waWxlKCkKIHsKICAgICAvLyBERkEg
aGVhZGVyLgpAQCAtMTk5LDExICsyNjcsMzMgQEAgdm9pZCBERkFCeXRlY29kZUNvbXBpbGVyOjpj
b21waWxlKCkKICAgICBhcHBlbmQ8dW5zaWduZWQ+KG1fYnl0ZWNvZGUsIDApOwogICAgIG1fbm9k
ZVN0YXJ0T2Zmc2V0cy5yZXNpemUobV9kZmEuc2l6ZSgpKTsKICAgICAKKyAgICBWZWN0b3I8dW5z
aWduZWQ+IGRmciA9IGRpc3RhbmNlc0Zyb21Sb290KG1fZGZhKTsKKyAgICBWZWN0b3I8dW5zaWdu
ZWQ+IGl0ID0gaW5jb21pbmdUcmFuc2l0aW9ucyhtX2RmYSk7CisgICAgVmVjdG9yPHVuc2lnbmVk
PiBvdCA9IG91dGdvaW5nVHJhbnNpdGlvbnMobV9kZmEpOworICAgIFZlY3Rvcjx1bnNpZ25lZD4g
aGMxID0gaGl0Q291bnQobV9kZmEsICJodHRwczovL3d3dy53ZWJraXQub3JnLyIpOworICAgIFZl
Y3Rvcjx1bnNpZ25lZD4gaGMyID0gaGl0Q291bnQobV9kZmEsICJodHRwOi8vZW4ud2lraXBlZGlh
Lm9yZy93aWtpL0hUTUw1Iik7CisKKyAgICBWZWN0b3I8c3RkOjpwYWlyPHVuc2lnbmVkLCBkb3Vi
bGU+PiBub2RlU2NvcmVzOworICAgIG5vZGVTY29yZXMucmVzZXJ2ZUluaXRpYWxDYXBhY2l0eSht
X2RmYS5zaXplKCkpOworICAgIGZvciAodW5zaWduZWQgaSA9IDA7IGkgPCBtX2RmYS5zaXplKCk7
ICsraSkgeworICAgICAgICAvLyBXZSBzaG91bGQgdHVuZSB0aGlzIG9iamVjdGl2ZSBmdW5jdGlv
biBiZWZvcmUgbGFuZGluZywgYnV0IHRoaXMgZ2l2ZXMgYSBzbGlnaHQgaW1wcm92ZW1lbnQuCisg
ICAgICAgIGRvdWJsZSBzY29yZSA9IChkZnJbaV0gPCA0ID8gNTAuMCA6IDAuMCkgKyBsb2coaXRb
aV0gKyAxKSArIGxvZyhvdFtpXSArIDEpICsgaGMxW2ldICogNTAuMCArIGhjMltpXSAqIDUwLjA7
CisgICAgICAgIG5vZGVTY29yZXMuYXBwZW5kKHN0ZDo6bWFrZV9wYWlyKGksIHNjb3JlKSk7Cisg
ICAgfQorICAgIAorICAgIC8vIFB1dCAiaW1wb3J0YW50IiBub2RlcyBhdCB0aGUgYmVnaW5uaW5n
IHRvIGltcHJvdmUgbWVtb3J5IGFjY2VzcyBsb2NhbGl0eS4KKyAgICBhdXRvIGNvbXBhcmVOb2Rl
U2NvcmVzID0gW10oY29uc3Qgc3RkOjpwYWlyPHVuc2lnbmVkLCBkb3VibGU+JiBhLCBjb25zdCBz
dGQ6OnBhaXI8dW5zaWduZWQsIGRvdWJsZT4mIGIpCisgICAgeyAKKyAgICAgICAgcmV0dXJuIGEu
c2Vjb25kID4gYi5zZWNvbmQ7CisgICAgfTsKKyAgICBzdGQ6OnNvcnQobm9kZVNjb3Jlcy5iZWdp
bigpLCBub2RlU2NvcmVzLmVuZCgpLCBjb21wYXJlTm9kZVNjb3Jlcyk7CisgICAgCiAgICAgLy8g
TWFrZSBzdXJlIHRoZSByb290IGlzIGFsd2F5cyBhdCB0aGUgYmVnaW5uaW5nIG9mIHRoZSBieXRl
Y29kZS4KICAgICBjb21waWxlTm9kZShtX2RmYS5yb290KCkpOwogICAgIGZvciAodW5zaWduZWQg
aSA9IDA7IGkgPCBtX2RmYS5zaXplKCk7IGkrKykgewotICAgICAgICBpZiAoaSAhPSBtX2RmYS5y
b290KCkpCi0gICAgICAgICAgICBjb21waWxlTm9kZShpKTsKKyAgICAgICAgdW5zaWduZWQgaW5k
ZXggPSBub2RlU2NvcmVzW2ldLmZpcnN0OworICAgICAgICBpZiAoaW5kZXggIT0gbV9kZmEucm9v
dCgpKQorICAgICAgICAgICAgY29tcGlsZU5vZGUoaW5kZXgpOwogICAgIH0KIAogICAgIC8vIExp
bmsuCg==
</data>
<flag name="review"
          id="274935"
          type_id="1"
          status="-"
          setter="achristensen"
    />
          </attachment>
      

    </bug>

</bugzilla>