100575 – Try to create AtomicString as 8 bit where possible

RESOLVED FIXED100575

Try to create AtomicString as 8 bit where possible

https://bugs.webkit.org/show_bug.cgi?id=100575

Summary Try to create AtomicString as 8 bit where possible

Michael Saboff

Reported 2012-10-26 16:46:57 PDT

Most AtomicString's contain strings composed of 8 bit data even when created using a UChar* source. When creating an AtomicString with UChar data, the data should be checked for 8 bit characters and an 8 bit string should be created if possible.

Attachments
Patch (3.48 KB, patch) 2012-10-26 17:23 PDT, Michael Saboff	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Michael Saboff

Comment 1 2012-10-26 17:23:37 PDT

Created attachment 171058 [details] Patch

WebKit Review Bot

Comment 2 2012-10-27 13:33:32 PDT

Comment on attachment 171058 [details] Patch Clearing flags on attachment: 171058 Committed r132739: <http://trac.webkit.org/changeset/132739>

WebKit Review Bot

Comment 3 2012-10-27 13:33:36 PDT

All reviewed patches have been landed. Closing bug.

Darin Adler

Comment 4 2012-10-27 14:04:58 PDT

Comment on attachment 171058 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=171058&action=review > Source/WTF/wtf/text/StringImpl.cpp:276 > + LChar* data; > + RefPtr<StringImpl> string = createUninitialized(length, data); > + > + for (size_t i = 0; i < length; ++i) { > + if (characters[i] & 0xff00) > + return create(characters, length); > + data[i] = static_cast<LChar>(characters[i]); > + } What’s the performance hit like here? Did you do any measurement. We’re doing a second memory allocation every time we end up creating a 16-bit string, so that case is a bit slow. Also, I could imagine we might get better speed if we did the Latin-1 preflight using the techniques from ASCIIFastPath that process many characters at a time quickly, even though that would be a separate loop.

Michael Saboff

Comment 5 2012-10-27 21:50:31 PDT

(In reply to comment #4) > (From update of attachment 171058 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=171058&action=review > > > Source/WTF/wtf/text/StringImpl.cpp:276 > > + LChar* data; > > + RefPtr<StringImpl> string = createUninitialized(length, data); > > + > > + for (size_t i = 0; i < length; ++i) { > > + if (characters[i] & 0xff00) > > + return create(characters, length); > > + data[i] = static_cast<LChar>(characters[i]); > > + } > > What’s the performance hit like here? Did you do any measurement. I ran benchmarks on a set of changes including this one. I ran sunspider, V8v7 and kraken. No change in their performance. I then ran the Parser performance tests and overall there wasn't a change. > We’re doing a second memory allocation every time we end up creating a 16-bit string, so that case is a bit slow. Also, I could imagine we might get better speed if we did the Latin-1 preflight using the techniques from ASCIIFastPath that process many characters at a time quickly, even though that would be a separate loop. This function is intended for mostly 8 bit strings. For AtomicStrings, the 16 bit path appears to be rarely used (a little over 1% of the time in my tests). I instrumented the 8 and 16 bit paths. Here are the number of times we took the 16 bit path for a given website: apple.com 0 cnn.com 3 yahoo.com 1 facebook.com 1 daringfireball.net 42 ebay.com 1 google.com 1 reddit.com 0 wikipedia.com 55 amazon.com 0 lemonde.fr 5 japantimes.co.jp 7 weather.com 1 espn.com 0 webkit.org 0 Across these sites, we took the 8 bit path 10960 times and the 16 bit path 117 times. I thought about employing the techniques in ASCIIFastPath, but the performance tests didn't show that there was an issue. If you like, I can file a defect to investigate performance improvements.

Benjamin Poulain

Comment 6 2012-10-28 14:33:16 PDT

Please measure on ARM, both branches and memory accesses are relatively more expensive. Since the 16bits case is only used in 1% of the case, consider branching in 0xFF00 only once at the end instead of for every iteration.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component JavaScriptCore

Assignee

Michael Saboff

Reported

2012-10-26 16:46 PDT

Modified

2012-10-28 14:33 PDT History

CC List

2 users Show

URL

Keywords

Depends on

Blocks