Bug 100575 - Try to create AtomicString as 8 bit where possible
Summary: Try to create AtomicString as 8 bit where possible
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Michael Saboff
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-10-26 16:46 PDT by Michael Saboff
Modified: 2012-10-28 14:33 PDT (History)
2 users (show)

See Also:


Attachments
Patch (3.48 KB, patch)
2012-10-26 17:23 PDT, Michael Saboff
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Saboff 2012-10-26 16:46:57 PDT
Most AtomicString's contain strings composed of 8 bit data even when created using a UChar* source.  When creating an AtomicString with UChar data, the data should be checked for 8 bit characters and an 8 bit string should be created if possible.
Comment 1 Michael Saboff 2012-10-26 17:23:37 PDT
Created attachment 171058 [details]
Patch
Comment 2 WebKit Review Bot 2012-10-27 13:33:32 PDT
Comment on attachment 171058 [details]
Patch

Clearing flags on attachment: 171058

Committed r132739: <http://trac.webkit.org/changeset/132739>
Comment 3 WebKit Review Bot 2012-10-27 13:33:36 PDT
All reviewed patches have been landed.  Closing bug.
Comment 4 Darin Adler 2012-10-27 14:04:58 PDT
Comment on attachment 171058 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=171058&action=review

> Source/WTF/wtf/text/StringImpl.cpp:276
> +    LChar* data;
> +    RefPtr<StringImpl> string = createUninitialized(length, data);
> +
> +    for (size_t i = 0; i < length; ++i) {
> +        if (characters[i] & 0xff00)
> +            return create(characters, length);
> +        data[i] = static_cast<LChar>(characters[i]);
> +    }

What’s the performance hit like here? Did you do any measurement.

We’re doing a second memory allocation every time we end up creating a 16-bit string, so that case is a bit slow. Also, I could imagine we might get better speed if we did the Latin-1 preflight using the techniques from ASCIIFastPath that process many characters at a time quickly, even though that would be a separate loop.
Comment 5 Michael Saboff 2012-10-27 21:50:31 PDT
(In reply to comment #4)
> (From update of attachment 171058 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=171058&action=review
> 
> > Source/WTF/wtf/text/StringImpl.cpp:276
> > +    LChar* data;
> > +    RefPtr<StringImpl> string = createUninitialized(length, data);
> > +
> > +    for (size_t i = 0; i < length; ++i) {
> > +        if (characters[i] & 0xff00)
> > +            return create(characters, length);
> > +        data[i] = static_cast<LChar>(characters[i]);
> > +    }
> 
> What’s the performance hit like here? Did you do any measurement.

I ran benchmarks on a set of changes including this one.  I ran sunspider, V8v7 and kraken.  No change in their performance.  I then ran the Parser performance tests and overall there wasn't a change.
 
> We’re doing a second memory allocation every time we end up creating a 16-bit string, so that case is a bit slow. Also, I could imagine we might get better speed if we did the Latin-1 preflight using the techniques from ASCIIFastPath that process many characters at a time quickly, even though that would be a separate loop.

This function is intended for mostly 8 bit strings.  For AtomicStrings, the 16 bit path appears to be rarely used (a little over 1% of the time in my tests).  I instrumented the 8 and 16 bit paths.  Here are the number of times we took the 16 bit path for a given website:
    apple.com 0
    cnn.com 3
    yahoo.com 1
    facebook.com 1
    daringfireball.net 42
    ebay.com 1
    google.com 1
    reddit.com 0
    wikipedia.com 55
    amazon.com 0
    lemonde.fr 5
    japantimes.co.jp 7
    weather.com 1
    espn.com 0
    webkit.org 0
Across these sites, we took the 8 bit path 10960 times and the 16 bit path 117 times.

I thought about employing the techniques in ASCIIFastPath, but the performance tests didn't show that there was an issue.  If you like, I can file a defect to investigate performance improvements.
Comment 6 Benjamin Poulain 2012-10-28 14:33:16 PDT
Please measure on ARM, both branches and memory accesses are relatively more expensive.

Since the 16bits case is only used in 1% of the case, consider branching in 0xFF00 only once at the end instead of for every iteration.