<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>236532</bug_id>
          
          <creation_ts>2022-02-11 15:47:40 -0800</creation_ts>
          <short_desc>look up InputTypeFactoryMap with an ASCII lowercase string instead of using a ASCIICaseInsensitiveHash</short_desc>
          <delta_ts>2022-02-13 10:29:10 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>DOM</component>
          <version>WebKit Nightly Build</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Cameron McCormack (:heycam)">heycam</reporter>
          <assigned_to name="Cameron McCormack (:heycam)">heycam</assigned_to>
          <cc>cdumez</cc>
    
    <cc>changseok</cc>
    
    <cc>darin</cc>
    
    <cc>esprehn+autocc</cc>
    
    <cc>ews-watchlist</cc>
    
    <cc>gyuyoung.kim</cc>
    
    <cc>mifenton</cc>
    
    <cc>mmaxfield</cc>
    
    <cc>webkit-bug-importer</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1840602</commentid>
    <comment_count>0</comment_count>
    <who name="Cameron McCormack (:heycam)">heycam</who>
    <bug_when>2022-02-11 15:47:40 -0800</bug_when>
    <thetext>InputType::create looks up the InputTypeFactoryMap based on the AtomString value of the &lt;input type&gt; attribute.  The HashMap uses an ASCIICaseInsensitiveHash, but the AtomStrings stored in the map are all ASCII lowercase to begin with.  This means that we spend time doing an ASCII case insensitive hash computation on the query string.  Most content already supplies an ASCII lowercase type value, so it&apos;s less work to ASCII lowercase the type value and then look up the HashMap using the regular hash for AtomStrings (i.e., pulling the hash out of AtomString).

Doing this is a 0.5% improvement on a couple of Speedometer 2 subtests, and a 0.1% improvement to the overall score.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1840604</commentid>
    <comment_count>1</comment_count>
      <attachid>451754</attachid>
    <who name="Cameron McCormack (:heycam)">heycam</who>
    <bug_when>2022-02-11 15:50:51 -0800</bug_when>
    <thetext>Created attachment 451754
Patch</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1840688</commentid>
    <comment_count>2</comment_count>
    <who name="EWS">ews-feeder</who>
    <bug_when>2022-02-12 07:09:29 -0800</bug_when>
    <thetext>Committed r289691 (247176@main): &lt;https://commits.webkit.org/247176@main&gt;

All reviewed patches have been landed. Closing bug and clearing flags on attachment 451754.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1840689</commentid>
    <comment_count>3</comment_count>
    <who name="Radar WebKit Bug Importer">webkit-bug-importer</who>
    <bug_when>2022-02-12 07:10:18 -0800</bug_when>
    <thetext>&lt;rdar://problem/88855498&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1840728</commentid>
    <comment_count>4</comment_count>
      <attachid>451754</attachid>
    <who name="Darin Adler">darin</who>
    <bug_when>2022-02-12 14:10:16 -0800</bug_when>
    <thetext>Comment on attachment 451754
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=451754&amp;action=review

&gt; Source/WebCore/ChangeLog:3
&gt; +        Look up InputTypeFactoryMap with an ASCII lowercase string instead of using a ASCIICaseInsensitiveHash

This seems like a great speedup, but if the cost is the hash computation there’s another approach we could have taken: We can change ASCIICaseInsensitiveHash::hash(StringImpl&amp;) to check if there are any ASCII uppercase characters. If there are none, then it can use string.hash(). If we do this, it would optimize *all* uses of ASCIICaseInsensitiveHash that are typically used on already-lowercase strings. However, if the optimization here benefits from other savings as well, then perhaps my suggestion would not work as well as this did.

What do you think, Cameron?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1840751</commentid>
    <comment_count>5</comment_count>
    <who name="Cameron McCormack (:heycam)">heycam</who>
    <bug_when>2022-02-12 17:45:42 -0800</bug_when>
    <thetext>(In reply to Darin Adler from comment #4)
&gt; This seems like a great speedup, but if the cost is the hash computation
&gt; there’s another approach we could have taken: We can change
&gt; ASCIICaseInsensitiveHash::hash(StringImpl&amp;) to check if there are any ASCII
&gt; uppercase characters. If there are none, then it can use string.hash(). If
&gt; we do this, it would optimize *all* uses of ASCIICaseInsensitiveHash that
&gt; are typically used on already-lowercase strings. However, if the
&gt; optimization here benefits from other savings as well, then perhaps my
&gt; suggestion would not work as well as this did.
&gt; 
&gt; What do you think, Cameron?

Nice idea, and I think it is worth trying that out.  In this particular case, I think it&apos;s still advantageous to move away from ASCIICaseInsensitiveHash, as it allows us to skip hashing the key upon insertion too (in createInputTypeFactoryMap).

Is it possible to configure HashMap in a way that hashing of keys on insertion is done differently from hashing of keys passed when looking up the map?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1840752</commentid>
    <comment_count>6</comment_count>
    <who name="Cameron McCormack (:heycam)">heycam</who>
    <bug_when>2022-02-12 17:48:08 -0800</bug_when>
    <thetext>Maybe we would only want to make your suggested ASCIICaseInsensitiveHash change if we were sure that all performance critical uses of ASCIICaseInsensitiveHash and AtomStrings tend to get called with already ASCII lowercase strings.

Or we could have a different hash trait type which means &quot;compute an ASCII case insensitive hash but the string probably is already ASCII case insensitive&quot;.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1841020</commentid>
    <comment_count>7</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2022-02-13 10:21:41 -0800</bug_when>
    <thetext>On reflection, I think I have a better idea. We should use SortedArrayMap instead of HashMap for InputType::create. Then there will be no hashing at all. I suspect it can match or exceed the speed of the HashMap; I believe the list is short enough that it will be a linear search, but maybe big enough for binary. Because this is not a dynamic data structure, but rather something that can be initialized at compile time, SortedArrayMap should be good.

I&apos;ve learned to never doubt the speed one can get from hashing, so there is a slim chance that won&apos;t work, though!

(In reply to Cameron McCormack (:heycam) from comment #5)
&gt; Nice idea, and I think it is worth trying that out.  In this particular
&gt; case, I think it&apos;s still advantageous to move away from
&gt; ASCIICaseInsensitiveHash, as it allows us to skip hashing the key upon
&gt; insertion too (in createInputTypeFactoryMap).

Sorry, I don’t understand this.

As far as I can tell, neither the solution you implemented nor the ASCIICaseInsensitiveHash::hash solution I suggested will involve any extra hashing on insertion. In both cases, we detect that the String is already all lowercase, and use the hash stored in the String.

In the solution you implemented, we are relying on knowing that there are no ASCII uppercase characters in the string and so it uses String::hash.

In the solution I suggested, the hash table add function will call the ASCIICaseInsensitiveHash::hash function. That function will detect that the String lacks uppercase ASCII characters and then it will call String::hash.

It’s true that my suggestion incurs an &quot;are there any uppercase ASCII letters&quot; check on each string during insertion that yours did not, but no additional hashing.

&gt; Is it possible to configure HashMap in a way that hashing of keys on
&gt; insertion is done differently from hashing of keys passed when looking up
&gt; the map?

I don’t think that’s needed here, but if we wanted to use it, the answer is yes.

HashMap provides HashTranslator versions of the functions find, contains, get, inlineGet, and add, and in all of these we can use a different type that hashes differently by providing HashTranslator traits.

This could be used to make a version of add that already knows there are no uppercase ASCII letters, and thus optimize the hash table building process. But that’s not an important optimization for a function that runs once, I don’t think.

If the speed of building the table matters at all, then my SortedArrayMap idea becomes even better, because the map is built at compile/load time, not at runtime.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1841021</commentid>
    <comment_count>8</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2022-02-13 10:24:18 -0800</bug_when>
    <thetext>(In reply to Cameron McCormack (:heycam) from comment #6)
&gt; Maybe we would only want to make your suggested ASCIICaseInsensitiveHash
&gt; change if we were sure that all performance critical uses of
&gt; ASCIICaseInsensitiveHash and AtomStrings tend to get called with already
&gt; ASCII lowercase strings.
&gt; 
&gt; Or we could have a different hash trait type which means &quot;compute an ASCII
&gt; case insensitive hash but the string probably is already ASCII case
&gt; insensitive&quot;.

I see your point.

If there are places where we use ASCIICaseInsensitiveHash, but the strings often have some uppercase ASCII letters in them, then the extra checking would be wasteful.

But since this check is quite inexpensive compared to the hash computation, I am guessing we wouldn’t need two differently tuned versions.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1841023</commentid>
    <comment_count>9</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2022-02-13 10:27:56 -0800</bug_when>
    <thetext>One other thought: If this is so hot that all that hash computation is significant, maybe we also want to make a histogram and find out what values are actually used in practice and add special cases for the hottest most common values:

    if (string == &quot;text&quot;)
        &lt;do the thing and don&apos;t even bother with any fancy data structure&gt;

Adding a case like this could speed things up further! All depends on this really being super-hot.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1841024</commentid>
    <comment_count>10</comment_count>
    <who name="Darin Adler">darin</who>
    <bug_when>2022-02-13 10:29:10 -0800</bug_when>
    <thetext>I think I like SortedArrayMap best of the above ideas.

But also, we should probably think about the ASCIICaseInsensitiveHash optimization for the benefit of the other uses of it. I bet it will be a net win, maybe there are even other places where we can see a measurable speedup on some benchmark.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>451754</attachid>
            <date>2022-02-11 15:50:51 -0800</date>
            <delta_ts>2022-02-12 07:09:31 -0800</delta_ts>
            <desc>Patch</desc>
            <filename>bug-236532-20220212105050.patch</filename>
            <type>text/plain</type>
            <size>2889</size>
            <attacher name="Cameron McCormack (:heycam)">heycam</attacher>
            
              <data encoding="base64">U3VidmVyc2lvbiBSZXZpc2lvbjogMjg5NDExCmRpZmYgLS1naXQgYS9Tb3VyY2UvV2ViQ29yZS9D
aGFuZ2VMb2cgYi9Tb3VyY2UvV2ViQ29yZS9DaGFuZ2VMb2cKaW5kZXggMjc0MGVhMjAwNTg2NDIz
YzkzMWY1NmQ4MDBjYmM2ODMxNWM5MzYxNC4uYjEzMGUwNDBmOTU5NTVmZTc0ZGMzYWFlMzVjZjUz
YmRjZTU0MDgzNSAxMDA2NDQKLS0tIGEvU291cmNlL1dlYkNvcmUvQ2hhbmdlTG9nCisrKyBiL1Nv
dXJjZS9XZWJDb3JlL0NoYW5nZUxvZwpAQCAtMSwzICsxLDI2IEBACisyMDIyLTAyLTExICBDYW1l
cm9uIE1jQ29ybWFjayAgPGhleWNhbUBhcHBsZS5jb20+CisKKyAgICAgICAgTG9vayB1cCBJbnB1
dFR5cGVGYWN0b3J5TWFwIHdpdGggYW4gQVNDSUkgbG93ZXJjYXNlIHN0cmluZyBpbnN0ZWFkIG9m
IHVzaW5nIGEgQVNDSUlDYXNlSW5zZW5zaXRpdmVIYXNoCisgICAgICAgIGh0dHBzOi8vYnVncy53
ZWJraXQub3JnL3Nob3dfYnVnLmNnaT9pZD0yMzY1MzIKKworICAgICAgICBSZXZpZXdlZCBieSBO
T0JPRFkgKE9PUFMhKS4KKworICAgICAgICBJbnB1dFR5cGU6OmNyZWF0ZSBsb29rcyB1cCB0aGUg
SW5wdXRUeXBlRmFjdG9yeU1hcCBiYXNlZCBvbiB0aGUKKyAgICAgICAgQXRvbVN0cmluZyB2YWx1
ZSBvZiB0aGUgPGlucHV0IHR5cGU+IGF0dHJpYnV0ZS4gVGhlIEhhc2hNYXAgdXNlcyBhbgorICAg
ICAgICBBU0NJSUNhc2VJbnNlbnNpdGl2ZUhhc2gsIGJ1dCB0aGUgQXRvbVN0cmluZ3Mgc3RvcmVk
IGluIHRoZSBtYXAgYXJlIGFsbAorICAgICAgICBBU0NJSSBsb3dlcmNhc2UgdG8gYmVnaW4gd2l0
aC4gVGhpcyBtZWFucyB0aGF0IHdlIHNwZW5kIHRpbWUgZG9pbmcgYW4KKyAgICAgICAgQVNDSUkg
Y2FzZSBpbnNlbnNpdGl2ZSBoYXNoIGNvbXB1dGF0aW9uIG9uIHRoZSBxdWVyeSBzdHJpbmcuIE1v
c3QKKyAgICAgICAgY29udGVudCBhbHJlYWR5IHN1cHBsaWVzIGFuIEFTQ0lJIGxvd2VyY2FzZSB0
eXBlIHZhbHVlLCBzbyBpdCdzIGxlc3MKKyAgICAgICAgd29yayB0byBBU0NJSSBsb3dlcmNhc2Ug
dGhlIHR5cGUgdmFsdWUgYW5kIHRoZW4gbG9vayB1cCB0aGUgSGFzaE1hcAorICAgICAgICB1c2lu
ZyB0aGUgcmVndWxhciBoYXNoIGZvciBBdG9tU3RyaW5ncyAoaS5lLiwgcHVsbGluZyB0aGUgaGFz
aCBvdXQgb2YKKyAgICAgICAgQXRvbVN0cmluZykuCisKKyAgICAgICAgRG9pbmcgdGhpcyBpcyBh
IDAuNSUgaW1wcm92ZW1lbnQgb24gYSBjb3VwbGUgb2YgU3BlZWRvbWV0ZXIgMiBzdWJ0ZXN0cywK
KyAgICAgICAgYW5kIGEgMC4xJSBpbXByb3ZlbWVudCB0byB0aGUgb3ZlcmFsbCBzY29yZS4KKwor
ICAgICAgICAqIGh0bWwvSW5wdXRUeXBlLmNwcDoKKyAgICAgICAgKFdlYkNvcmU6OklucHV0VHlw
ZTo6Y3JlYXRlKToKKwogMjAyMi0wMi0xMCAgQ2FtZXJvbiBNY0Nvcm1hY2sgIDxoZXljYW1AYXBw
bGUuY29tPgogCiAgICAgICAgIE1ha2UgV2lkZ2V0SGllcmFyY2h5VXBkYXRlc1N1c3BlbnNpb25T
Y29wZSBjaGVhcGVyIGlmIGl0IGhhcyBub3RoaW5nIHRvIGRvCmRpZmYgLS1naXQgYS9Tb3VyY2Uv
V2ViQ29yZS9odG1sL0lucHV0VHlwZS5jcHAgYi9Tb3VyY2UvV2ViQ29yZS9odG1sL0lucHV0VHlw
ZS5jcHAKaW5kZXggMWM3Nzc3NjUxZmMzZjkzYmY2OGIzMTJhNDkxYzcwN2Y1MTkxOGM4OC4uMTJh
OWRiMjliZjEzZmEyMDMxNmY5OTAxY2YwMTgzODMwMWIxNGEwMCAxMDA2NDQKLS0tIGEvU291cmNl
L1dlYkNvcmUvaHRtbC9JbnB1dFR5cGUuY3BwCisrKyBiL1NvdXJjZS9XZWJDb3JlL2h0bWwvSW5w
dXRUeXBlLmNwcApAQCAtOTAsNyArOTAsNyBAQCB1c2luZyBuYW1lc3BhY2UgSFRNTE5hbWVzOwog
dHlwZWRlZiBib29sIChTZXR0aW5nczo6KklucHV0VHlwZUNvbmRpdGlvbmFsRnVuY3Rpb24pKCkg
Y29uc3Q7CiB0eXBlZGVmIGNvbnN0IEF0b21TdHJpbmcmICgqSW5wdXRUeXBlTmFtZUZ1bmN0aW9u
KSgpOwogdHlwZWRlZiBSZWY8SW5wdXRUeXBlPiAoKklucHV0VHlwZUZhY3RvcnlGdW5jdGlvbiko
SFRNTElucHV0RWxlbWVudCYpOwotdHlwZWRlZiBIYXNoTWFwPEF0b21TdHJpbmcsIHN0ZDo6cGFp
cjxJbnB1dFR5cGVDb25kaXRpb25hbEZ1bmN0aW9uLCBJbnB1dFR5cGVGYWN0b3J5RnVuY3Rpb24+
LCBBU0NJSUNhc2VJbnNlbnNpdGl2ZUhhc2g+IElucHV0VHlwZUZhY3RvcnlNYXA7Cit0eXBlZGVm
IEhhc2hNYXA8QXRvbVN0cmluZywgc3RkOjpwYWlyPElucHV0VHlwZUNvbmRpdGlvbmFsRnVuY3Rp
b24sIElucHV0VHlwZUZhY3RvcnlGdW5jdGlvbj4+IElucHV0VHlwZUZhY3RvcnlNYXA7CiAKIHRl
bXBsYXRlPGNsYXNzIFQ+IHN0YXRpYyBSZWY8SW5wdXRUeXBlPiBjcmVhdGVJbnB1dFR5cGUoSFRN
TElucHV0RWxlbWVudCYgZWxlbWVudCkKIHsKQEAgLTE1MCw3ICsxNTAsNyBAQCBSZWY8SW5wdXRU
eXBlPiBJbnB1dFR5cGU6OmNyZWF0ZShIVE1MSW5wdXRFbGVtZW50JiBlbGVtZW50LCBjb25zdCBB
dG9tU3RyaW5nJiB0eQogewogICAgIGlmICghdHlwZU5hbWUuaXNFbXB0eSgpKSB7CiAgICAgICAg
IHN0YXRpYyBOZXZlckRlc3Ryb3llZCBmYWN0b3J5TWFwID0gY3JlYXRlSW5wdXRUeXBlRmFjdG9y
eU1hcCgpOwotICAgICAgICBhdXRvJiYgW2NvbmRpdGlvbmFsLCBmYWN0b3J5XSA9IGZhY3RvcnlN
YXAuZ2V0KCkuZ2V0KHR5cGVOYW1lKTsKKyAgICAgICAgYXV0byYmIFtjb25kaXRpb25hbCwgZmFj
dG9yeV0gPSBmYWN0b3J5TWFwLmdldCgpLmdldCh0eXBlTmFtZS5jb252ZXJ0VG9BU0NJSUxvd2Vy
Y2FzZSgpKTsKICAgICAgICAgaWYgKGZhY3RvcnkgJiYgKCFjb25kaXRpb25hbCB8fCBzdGQ6Omlu
dm9rZShjb25kaXRpb25hbCwgZWxlbWVudC5kb2N1bWVudCgpLnNldHRpbmdzKCkpKSkKICAgICAg
ICAgICAgIHJldHVybiBmYWN0b3J5KGVsZW1lbnQpOwogICAgIH0K
</data>

          </attachment>
      

    </bug>

</bugzilla>