<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>100575</bug_id>
          
          <creation_ts>2012-10-26 16:46:57 -0700</creation_ts>
          <short_desc>Try to create AtomicString as 8 bit where possible</short_desc>
          <delta_ts>2012-10-28 14:33:16 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>JavaScriptCore</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Michael Saboff">msaboff</reporter>
          <assigned_to name="Michael Saboff">msaboff</assigned_to>
          <cc>benjamin</cc>
    
    <cc>webkit.review.bot</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>752577</commentid>
    <comment_count>0</comment_count>
    <who name="Michael Saboff">msaboff</who>
    <bug_when>2012-10-26 16:46:57 -0700</bug_when>
    <thetext>Most AtomicStrings contain only 8 bit data, even when they are created from a UChar* source.  When an AtomicString is created from UChar data, the data should be checked for 8 bit characters and an 8 bit string should be created if possible.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>752611</commentid>
    <comment_count>1</comment_count>
      <attachid>171058</attachid>
    <who name="Michael Saboff">msaboff</who>
    <bug_when>2012-10-26 17:23:37 -0700</bug_when>
    <thetext>Created attachment 171058
Patch</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>752829</commentid>
    <comment_count>2</comment_count>
      <attachid>171058</attachid>
    <who name="WebKit Review Bot">webkit.review.bot</who>
    <bug_when>2012-10-27 13:33:32 -0700</bug_when>
    <thetext>Comment on attachment 171058
Patch

Clearing flags on attachment: 171058

Committed r132739: &lt;http://trac.webkit.org/changeset/132739&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>752830</commentid>
    <comment_count>3</comment_count>
    <who name="WebKit Review Bot">webkit.review.bot</who>
    <bug_when>2012-10-27 13:33:36 -0700</bug_when>
    <thetext>All reviewed patches have been landed.  Closing bug.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>752837</commentid>
    <comment_count>4</comment_count>
      <attachid>171058</attachid>
    <who name="Darin Adler">darin</who>
    <bug_when>2012-10-27 14:04:58 -0700</bug_when>
    <thetext>Comment on attachment 171058
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=171058&amp;action=review

&gt; Source/WTF/wtf/text/StringImpl.cpp:276
&gt; +    LChar* data;
&gt; +    RefPtr&lt;StringImpl&gt; string = createUninitialized(length, data);
&gt; +
&gt; +    for (size_t i = 0; i &lt; length; ++i) {
&gt; +        if (characters[i] &amp; 0xff00)
&gt; +            return create(characters, length);
&gt; +        data[i] = static_cast&lt;LChar&gt;(characters[i]);
&gt; +    }

What’s the performance hit like here? Did you do any measurements?

We’re doing a second memory allocation every time we end up creating a 16-bit string, so that case is a bit slow. Also, I could imagine we might get better speed if we did the Latin-1 preflight using the techniques from ASCIIFastPath that process many characters at a time quickly, even though that would be a separate loop.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>752888</commentid>
    <comment_count>5</comment_count>
    <who name="Michael Saboff">msaboff</who>
    <bug_when>2012-10-27 21:50:31 -0700</bug_when>
    <thetext>(In reply to comment #4)
&gt; (From update of attachment 171058 [details])
&gt; View in context: https://bugs.webkit.org/attachment.cgi?id=171058&amp;action=review
&gt; 
&gt; &gt; Source/WTF/wtf/text/StringImpl.cpp:276
&gt; &gt; +    LChar* data;
&gt; &gt; +    RefPtr&lt;StringImpl&gt; string = createUninitialized(length, data);
&gt; &gt; +
&gt; &gt; +    for (size_t i = 0; i &lt; length; ++i) {
&gt; &gt; +        if (characters[i] &amp; 0xff00)
&gt; &gt; +            return create(characters, length);
&gt; &gt; +        data[i] = static_cast&lt;LChar&gt;(characters[i]);
&gt; &gt; +    }
&gt; 
&gt; What’s the performance hit like here? Did you do any measurements?

I ran benchmarks on a set of changes that included this one.  SunSpider, V8 v7, and Kraken showed no change in performance.  I then ran the Parser performance tests, and overall there was no change there either.
 
&gt; We’re doing a second memory allocation every time we end up creating a 16-bit string, so that case is a bit slow. Also, I could imagine we might get better speed if we did the Latin-1 preflight using the techniques from ASCIIFastPath that process many characters at a time quickly, even though that would be a separate loop.
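
<![CDATA[
The separate preflight loop described in the quoted paragraph could be sketched roughly as follows. This is a standalone illustration, not the ASCIIFastPath code: isAllLatin1 and narrowIfLatin1 are hypothetical names, and std::u16string and std::vector stand in for UChar buffers and StringImpl. It ORs four code units at a time into an accumulator and tests the 0xFF00 bits once after the scan.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, not WTF API: returns true if no code unit in
// [chars, chars + length) has any of the 0xFF00 bits set.
bool isAllLatin1(const char16_t* chars, size_t length)
{
    const uint64_t highBytes = 0xFF00FF00FF00FF00ULL;
    uint64_t bits = 0;
    size_t i = 0;
    // OR four code units per iteration; memcpy sidesteps alignment and aliasing issues.
    for (; i + 4 <= length; i += 4) {
        uint64_t word;
        std::memcpy(&word, chars + i, sizeof(word));
        bits |= word;
    }
    // Fold in the remaining zero to three code units one at a time.
    char16_t tail = 0;
    for (; i < length; ++i)
        tail |= chars[i];
    // Single test after the scan instead of one branch per character.
    return !(bits & highBytes) && !(tail & 0xFF00);
}

// Hypothetical caller: allocates the 8 bit buffer only after the preflight succeeds,
// so a 16 bit input costs one scan and no discarded allocation.
bool narrowIfLatin1(const std::u16string& in, std::vector<uint8_t>& out)
{
    if (!isAllLatin1(in.data(), in.size()))
        return false;
    out.resize(in.size());
    for (size_t i = 0; i < in.size(); ++i)
        out[i] = static_cast<uint8_t>(in[i]);
    return true;
}
```

The trade-off in this sketch is that 8 bit input is read twice (scan, then copy), which may or may not beat the single fused loop for strings of a few tens of characters; only measurement would tell.
]]>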

This function is intended for mostly 8 bit strings.  For AtomicStrings, the 16 bit path appears to be rarely used (a little over 1% of the time in my tests).  I instrumented the 8 and 16 bit paths.  Here is the number of times we took the 16 bit path for each website:
    apple.com 0
    cnn.com 3
    yahoo.com 1
    facebook.com 1
    daringfireball.net 42
    ebay.com 1
    google.com 1
    reddit.com 0
    wikipedia.com 55
    amazon.com 0
    lemonde.fr 5
    japantimes.co.jp 7
    weather.com 1
    espn.com 0
    webkit.org 0
Across these sites, we took the 8 bit path 10960 times and the 16 bit path 117 times.
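
<![CDATA[
For illustration, counting of this kind could be sketched as below. This is a standalone approximation, not the instrumentation that was actually used: PathCounts, NarrowOrWide and create8BitIfPossibleSketch are hypothetical names, and standard containers stand in for StringImpl.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical counters; the names are illustrative, not taken from the patch.
struct PathCounts {
    unsigned long eightBit = 0;
    unsigned long sixteenBit = 0;
    double sixteenBitPercent() const
    {
        unsigned long total = eightBit + sixteenBit;
        return total ? 100.0 * sixteenBit / total : 0.0;
    }
};

// Stand-in for a string that is stored either 8 bit or 16 bit.
struct NarrowOrWide {
    std::vector<uint8_t> narrow;
    std::u16string wide;
    bool is8Bit = true;
};

// Same shape as the loop in the patch: narrow eagerly and fall back to the
// 16 bit representation on the first character with bits set in 0xff00.
NarrowOrWide create8BitIfPossibleSketch(const std::u16string& characters, PathCounts& counts)
{
    NarrowOrWide result;
    result.narrow.resize(characters.size());
    for (size_t i = 0; i < characters.size(); ++i) {
        if (characters[i] & 0xff00) {
            ++counts.sixteenBit;
            result.narrow.clear();
            result.wide = characters;
            result.is8Bit = false;
            return result;
        }
        result.narrow[i] = static_cast<uint8_t>(characters[i]);
    }
    ++counts.eightBit;
    return result;
}
```

With the totals above (10960 and 117), sixteenBitPercent() comes out at roughly 1.06.
]]>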

I thought about employing the techniques in ASCIIFastPath, but the performance tests didn&apos;t show that there was an issue.  If you like, I can file a defect to investigate performance improvements.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>752993</commentid>
    <comment_count>6</comment_count>
    <who name="Benjamin Poulain">benjamin</who>
    <bug_when>2012-10-28 14:33:16 -0700</bug_when>
    <thetext>Please measure on ARM, where both branches and memory accesses are relatively more expensive.

Since the 16 bit case is used only about 1% of the time, consider branching on 0xFF00 only once at the end instead of on every iteration.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>171058</attachid>
            <date>2012-10-26 17:23:37 -0700</date>
            <delta_ts>2012-10-27 14:04:58 -0700</delta_ts>
            <desc>Patch</desc>
            <filename>100575.patch</filename>
            <type>text/plain</type>
            <size>3568</size>
            <attacher name="Michael Saboff">msaboff</attacher>
            
              <data encoding="base64">SW5kZXg6IFNvdXJjZS9XVEYvQ2hhbmdlTG9nCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIFNvdXJjZS9XVEYvQ2hh
bmdlTG9nCShyZXZpc2lvbiAxMzI3MTIpCisrKyBTb3VyY2UvV1RGL0NoYW5nZUxvZwkod29ya2lu
ZyBjb3B5KQpAQCAtMSwzICsxLDIyIEBACisyMDEyLTEwLTI2ICBNaWNoYWVsIFNhYm9mZiAgPG1z
YWJvZmZAYXBwbGUuY29tPgorCisgICAgICAgIFRyeSB0byBjcmVhdGUgQXRvbWljU3RyaW5nIGFz
IDggYml0IHdoZXJlIHBvc3NpYmxlCisgICAgICAgIGh0dHBzOi8vYnVncy53ZWJraXQub3JnL3No
b3dfYnVnLmNnaT9pZD0xMDA1NzUKKworICAgICAgICBSZXZpZXdlZCBieSBOT0JPRFkgKE9PUFMh
KS4KKworICAgICAgICBBZGRlZCBTdHJpbmdJbXBsOjpjcmVhdGU4Qml0SWZQb3NzaWJsZSgpIHRo
YXQgZmlyc3QgdHJpZXMgdG8gY3JlYXRlIGFuIDggYml0IHN0cmluZy4gIElmIGl0IGZpbmRzIGEg
MTYgYml0IGNoYXJhY3RlcgorICAgICAgICBkdXJpbmcgcHJvY2Vzc2luZywgaXQgY2FsbHMgdGhl
IHN0YW5kYXJkIGNyZWF0ZSgpIG1ldGhvZC4gIFRoZSBhc3N1bXB0aW9uIGlzIHRoYXQgdGhpcyB3
aWxsIGJlIHVzZWQgb24gbW9zdGx5IDggYml0CisgICAgICAgIHN0cmluZ3MgYW5kIG9uZXMgdGhh
dCBhcmUgc2hvcnRlciAoaW4gdGhlIHRlbnMgb2YgY2hhcmFjdGVycykuICBDaGFuZ2VkIEF0b21p
Y1N0cmluZyB0byB1c2UgdGhlIG5ldyBjcmVhdGlvbiBtZXRob2QKKyAgICAgICAgZm9yIFVDaGFy
IGJhc2VkIGNvbnN0cnVjdGlvbi4KKworICAgICAgICAqIHd0Zi90ZXh0L0F0b21pY1N0cmluZy5j
cHA6CisgICAgICAgIChXVEY6OlVDaGFyQnVmZmVyVHJhbnNsYXRvcjo6dHJhbnNsYXRlKToKKyAg
ICAgICAgKiB3dGYvdGV4dC9TdHJpbmdJbXBsLmNwcDoKKyAgICAgICAgKFdURjo6U3RyaW5nSW1w
bDo6Y3JlYXRlOEJpdElmUG9zc2libGUpOgorICAgICAgICAqIHd0Zi90ZXh0L1N0cmluZ0ltcGwu
aDoKKyAgICAgICAgKFdURjo6U3RyaW5nSW1wbDo6Y3JlYXRlOEJpdElmUG9zc2libGUpOgorCiAy
MDEyLTEwLTI2ICBTaGVyaWZmIEJvdCAgPHdlYmtpdC5yZXZpZXcuYm90QGdtYWlsLmNvbT4KIAog
ICAgICAgICBVbnJldmlld2VkLCByb2xsaW5nIG91dCByMTMyNjg5LgpJbmRleDogU291cmNlL1dU
Ri93dGYvdGV4dC9BdG9taWNTdHJpbmcuY3BwCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIFNvdXJjZS9XVEYvd3Rm
L3RleHQvQXRvbWljU3RyaW5nLmNwcAkocmV2aXNpb24gMTMyNTEwKQorKysgU291cmNlL1dURi93
dGYvdGV4dC9BdG9taWNTdHJpbmcuY3BwCSh3b3JraW5nIGNvcHkpCkBAIC0xMzUsNyArMTM1LDcg
QEAgc3RydWN0IFVDaGFyQnVmZmVyVHJhbnNsYXRvciB7CiAKICAgICBzdGF0aWMgdm9pZCB0cmFu
c2xhdGUoU3RyaW5nSW1wbComIGxvY2F0aW9uLCBjb25zdCBVQ2hhckJ1ZmZlciYgYnVmLCB1bnNp
Z25lZCBoYXNoKQogICAgIHsKLSAgICAgICAgbG9jYXRpb24gPSBTdHJpbmdJbXBsOjpjcmVhdGUo
YnVmLnMsIGJ1Zi5sZW5ndGgpLmxlYWtSZWYoKTsKKyAgICAgICAgbG9jYXRpb24gPSBTdHJpbmdJ
bXBsOjpjcmVhdGU4Qml0SWZQb3NzaWJsZShidWYucywgYnVmLmxlbmd0aCkubGVha1JlZigpOwog
ICAgICAgICBsb2NhdGlvbi0+c2V0SGFzaChoYXNoKTsKICAgICAgICAgbG9jYXRpb24tPnNldElz
QXRvbWljKHRydWUpOwogICAgIH0KSW5kZXg6IFNvdXJjZS9XVEYvd3RmL3RleHQvU3RyaW5nSW1w
bC5jcHAKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PQotLS0gU291cmNlL1dURi93dGYvdGV4dC9TdHJpbmdJbXBsLmNwcAko
cmV2aXNpb24gMTMyNTEwKQorKysgU291cmNlL1dURi93dGYvdGV4dC9TdHJpbmdJbXBsLmNwcAko
d29ya2luZyBjb3B5KQpAQCAtMjYxLDYgKzI2MSwyMyBAQCBQYXNzUmVmUHRyPFN0cmluZ0ltcGw+
IFN0cmluZ0ltcGw6OmNyZWF0CiAgICAgcmV0dXJuIHN0cmluZy5yZWxlYXNlKCk7CiB9CiAKK1Bh
c3NSZWZQdHI8U3RyaW5nSW1wbD4gU3RyaW5nSW1wbDo6Y3JlYXRlOEJpdElmUG9zc2libGUoY29u
c3QgVUNoYXIqIGNoYXJhY3RlcnMsIHVuc2lnbmVkIGxlbmd0aCkKK3sKKyAgICBpZiAoIWNoYXJh
Y3RlcnMgfHwgIWxlbmd0aCkKKyAgICAgICAgcmV0dXJuIGVtcHR5KCk7CisKKyAgICBMQ2hhciog
ZGF0YTsKKyAgICBSZWZQdHI8U3RyaW5nSW1wbD4gc3RyaW5nID0gY3JlYXRlVW5pbml0aWFsaXpl
ZChsZW5ndGgsIGRhdGEpOworCisgICAgZm9yIChzaXplX3QgaSA9IDA7IGkgPCBsZW5ndGg7ICsr
aSkgeworICAgICAgICBpZiAoY2hhcmFjdGVyc1tpXSAmIDB4ZmYwMCkKKyAgICAgICAgICAgIHJl
dHVybiBjcmVhdGUoY2hhcmFjdGVycywgbGVuZ3RoKTsKKyAgICAgICAgZGF0YVtpXSA9IHN0YXRp
Y19jYXN0PExDaGFyPihjaGFyYWN0ZXJzW2ldKTsKKyAgICB9CisKKyAgICByZXR1cm4gc3RyaW5n
LnJlbGVhc2UoKTsKK30KKwogUGFzc1JlZlB0cjxTdHJpbmdJbXBsPiBTdHJpbmdJbXBsOjpjcmVh
dGUoY29uc3QgTENoYXIqIHN0cmluZykKIHsKICAgICBpZiAoIXN0cmluZykKSW5kZXg6IFNvdXJj
ZS9XVEYvd3RmL3RleHQvU3RyaW5nSW1wbC5oCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIFNvdXJjZS9XVEYvd3Rm
L3RleHQvU3RyaW5nSW1wbC5oCShyZXZpc2lvbiAxMzI1MTApCisrKyBTb3VyY2UvV1RGL3d0Zi90
ZXh0L1N0cmluZ0ltcGwuaAkod29ya2luZyBjb3B5KQpAQCAtMzUyLDYgKzM1Miw3IEBAIHB1Ymxp
YzoKIAogICAgIFdURl9FWFBPUlRfU1RSSU5HX0FQSSBzdGF0aWMgUGFzc1JlZlB0cjxTdHJpbmdJ
bXBsPiBjcmVhdGUoY29uc3QgVUNoYXIqLCB1bnNpZ25lZCBsZW5ndGgpOwogICAgIFdURl9FWFBP
UlRfU1RSSU5HX0FQSSBzdGF0aWMgUGFzc1JlZlB0cjxTdHJpbmdJbXBsPiBjcmVhdGUoY29uc3Qg
TENoYXIqLCB1bnNpZ25lZCBsZW5ndGgpOworICAgIFdURl9FWFBPUlRfU1RSSU5HX0FQSSBzdGF0
aWMgUGFzc1JlZlB0cjxTdHJpbmdJbXBsPiBjcmVhdGU4Qml0SWZQb3NzaWJsZShjb25zdCBVQ2hh
ciosIHVuc2lnbmVkIGxlbmd0aCk7CiAgICAgQUxXQVlTX0lOTElORSBzdGF0aWMgUGFzc1JlZlB0
cjxTdHJpbmdJbXBsPiBjcmVhdGUoY29uc3QgY2hhciogcywgdW5zaWduZWQgbGVuZ3RoKSB7IHJl
dHVybiBjcmVhdGUocmVpbnRlcnByZXRfY2FzdDxjb25zdCBMQ2hhcio+KHMpLCBsZW5ndGgpOyB9
CiAgICAgV1RGX0VYUE9SVF9TVFJJTkdfQVBJIHN0YXRpYyBQYXNzUmVmUHRyPFN0cmluZ0ltcGw+
IGNyZWF0ZShjb25zdCBMQ2hhciopOwogICAgIEFMV0FZU19JTkxJTkUgc3RhdGljIFBhc3NSZWZQ
dHI8U3RyaW5nSW1wbD4gY3JlYXRlKGNvbnN0IGNoYXIqIHMpIHsgcmV0dXJuIGNyZWF0ZShyZWlu
dGVycHJldF9jYXN0PGNvbnN0IExDaGFyKj4ocykpOyB9Cg==
</data>

          </attachment>
      

    </bug>

</bugzilla>