WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
30186
Optimization: Many StringImpl transformations are no-ops and should just return 'this'
https://bugs.webkit.org/show_bug.cgi?id=30186
Summary
Optimization: Many StringImpl transformations are no-ops and should just retu...
Jens Alfke
Reported
2009-10-07 15:05:38 PDT
I've instrumented the WebCore::StringImpl methods lower(), stripWhiteSpace() and simplifyWhiteSpace(), and found that, when loading common websites like yahoo.com or GMail, the great majority of calls are no-ops, i.e. return a string identical to the receiver. These could be optimized to check for no-op before allocating memory, and simply return 'this'. (Interestingly, upper() does not follow this pattern -- most of the calls do something. I suspect this is because WebKit uses this method to canonicalize HTML tag names, but almost all sites use lowercase tags.) The fix to lower() will slow the nontrivial case down slightly, as it scans the string twice instead of once. Given caching, this is probably not a big deal, especially considering that most calls are no-ops and will save a large number of CPU cycles in malloc and copying. The fixes to stripWhiteSpace() and simplifyWhiteSpace() are just simple checks at the end that only take a few instructions. I'll add a patch and flag for review once my previous StringImpl patch(es) are submitted.
Attachments
patch
(2.46 KB, patch)
2009-10-08 16:11 PDT
,
Jens Alfke
no flags
Details
Formatted Diff
Diff
patch 2
(3.52 KB, patch)
2009-10-08 16:30 PDT
,
Jens Alfke
levin
: review-
Details
Formatted Diff
Diff
patch 3
(4.29 KB, patch)
2009-10-09 10:46 PDT
,
Jens Alfke
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Jens Alfke
Comment 1
2009-10-08 16:11:50 PDT
Created
attachment 40917
[details]
patch Here's the patch. Note that I added a comment to upper() explaining why it isn't optimized the way lower() is.
Jens Alfke
Comment 2
2009-10-08 16:30:54 PDT
Created
attachment 40918
[details]
patch 2 Fixed two stylistic nits dave_levin found. And added changelog entry.
Jens Alfke
Comment 3
2009-10-08 16:50:45 PDT
Here are some statistics I collected of the number of StringImpls created through various means. In each case I launched the browser, opened the page, and immediately quit. Each line shows the number of calls followed by my shorthand for the specific call, e.g. "adopt" is StringImpl::adopt. "createC" is StringImpl::create taking a C string. "~dt~" is the destructor. The "-noop" suffix denotes a case where the method was a no-op and returned the original string. The only one of these optimizations that might be controversial is the lower() one since it takes extra cycles to pre-scan the string. But the numbers below show that in each test case at least 80% of the calls were no-ops, so the optimization's definitely a time-saver as well as a memory saver. Yahoo! home page: 905 [[adopt]] 5706 [[createC]] 3689 [[create]] 64 [[lower-noop]] 9 [[lower]] 2 [[simplify-noop]] 40 [[strip-noop]] 268 [[upper]] 8098 [[~dt~]] GMail (home page of my account): 2 [[strip]] 9 [[lower]] 9 [[strip-noop]] 16 [[upper-noop]] 121 [[secure]] 131 [[lower-noop]] 630 [[upper]] 2128 [[adopt]] 7394 [[createC]] 14391 [[~dt~]] 17240 [[create]] Pitchfork.com (a music-review site, fairly complex page): 1 [[upper-noop]] 8 [[strip]] 39 [[lower]] 65 [[strip-noop]] 164 [[lower-noop]] 1717 [[upper]] 3191 [[adopt]] 5196 [[createC]] 7755 [[create]] 13169 [[~dt~]]
David Levin
Comment 4
2009-10-08 17:00:22 PDT
Comment on
attachment 40918
[details]
patch 2 Thanks. This looks good. Just a little clean and a potential bug to address.
> Index: WebCore/ChangeLog > +2009-10-08 Jens Alfke <
snej@chromium.org
> > + > + Reviewed by NOBODY (OOPS!). > +
Please add bug title bug link blank line here.
> + Optimized StringImpl methods lower(), stripWhiteSpace() and simplifyWhiteSpace() to > + detect no-ops and return this instead of creating a new instance. > + Empirical testing shows that the majority of calls to these methods are no-ops, making > + this worthwhile even if (in the case of lower()) the non-no-op case is slightly slowed. > + Upper() is very rarely a no-op, so it wasn't worthwhile to optimize it.
> Index: WebCore/platform/text/StringImpl.cpp
> PassRefPtr<StringImpl> StringImpl::lower()
> + UChar* data; > + RefPtr<StringImpl> newImpl = createUninitialized(m_length, data); > + > + if (!(ored & ~0x7F)) { > + // Do a faster loop for the case where all the characters are ASCII. > + for (int i = 0; i < length; i++) { > + UChar c = m_data[i]; > + data[i] = toASCIILower(c); > + } > + return newImpl;
return newImpl.release(); Since you changed to a RefPtr (at my request).
> @@ -346,6 +363,9 @@ PassRefPtr<StringImpl> StringImpl::simpl > if (outc > 0 && to[outc - 1] == ' ') > outc--; > > + if (static_cast<unsigned>(outc) == m_length) > + return this; > +
It looks like outc may equal m_length, but a newline may have been converted to a space. In this case, "this" is not equivalent to "data".
David Levin
Comment 5
2009-10-08 17:07:36 PDT
Note this change "return newImpl.release();" in StringImpl::lower will need to happen at the other return sides in the function as well.
Darin Adler
Comment 6
2009-10-08 17:08:41 PDT
Comment on
attachment 40918
[details]
patch 2
> + // Nothing to do if the string is all ascii with no uppercase.
We call it ASCII, not ascii.
> + // (This method could be optimized for no-op cases the way lower() is, > + // but in empirical testing, few actual calls to upper() are no-ops, so > + // it wouldn't be worth the extra time for pre-scanning.)
I suggest replacing the word "method" with the word "function" and taking out the parentheses. You should check call sites for isLower -- I suspect if you did a function in AtomicString that was similarly optimized we could eliminate the isLower function entirely. Also, those call sites check isLower when what they are actually trying to check is "would not change if lower was called", which is not the same thing!
Jens Alfke
Comment 7
2009-10-09 10:46:19 PDT
Created
attachment 40953
[details]
patch 3 Fixed the issues that were pointed out. Fixing the bug in simplifyWhiteSpace required adding a test inside the space-skipping loop. I don't think it'll slow it down much, and in my testing I found that this method isn't called much, and all of the calls I did see were no-ops; so I think it's still worthwhile.
Darin Adler
Comment 8
2009-10-09 10:57:21 PDT
Comment on
attachment 40953
[details]
patch 3 r=me
WebKit Commit Bot
Comment 9
2009-10-09 11:11:23 PDT
Comment on
attachment 40953
[details]
patch 3 Rejecting patch 40953 from commit-queue.
snej@chromium.org
does not have committer permissions according to
http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/modules/committers.py
.
Jens Alfke
Comment 10
2009-10-09 13:56:33 PDT
Oops, sorry, I must have misunderstood what setting the commit-queue flag meant. Dave, would you mind checking this in? Thanks!
WebKit Commit Bot
Comment 11
2009-10-09 14:24:35 PDT
Comment on
attachment 40953
[details]
patch 3 Clearing flags on attachment: 40953 Committed
r49403
: <
http://trac.webkit.org/changeset/49403
>
WebKit Commit Bot
Comment 12
2009-10-09 14:24:39 PDT
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug