Bug 31415

Summary:

[v8] do not copy data twice when converting short v8 string into WebCore::String

Product:

WebKit

Reporter:

anton muhin <antonm>

Component:

WebCore Misc.

Assignee:

Nobody <webkit-unassigned>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

commit-queue, dimich, eric, fishd, jens, vitalyr

Priority:

Version:

528+ (Nightly build)

Hardware:

All

OS:

All

Attachments:

Description	Flags
First take	none
Addressing Vitaly's comments	fishd: review-
Addressing Darin's concerns	none

anton muhin

Reported 2009-11-12 09:47:29 PST

Using WebCore::String::String(const UChar*, int length) leads to another copy of data, so we'd better use String::createUninitialized directly

Attachments
First take (3.95 KB, patch) 2009-11-12 09:50 PST, anton muhin	no flags	Details Formatted Diff Diff
Addressing Vitaly's comments (4.21 KB, patch) 2009-11-12 11:09 PST, anton muhin	fishd: review-	Details Formatted Diff Diff
Addressing Darin's concerns (4.16 KB, patch) 2009-11-25 15:19 PST, anton muhin	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

anton muhin

Comment 1 2009-11-12 09:50:50 PST

Created attachment 43070 [details] First take

Vitaly Repeshko

Comment 2 2009-11-12 10:05:53 PST

(In reply to comment #1) > Created an attachment (id=43070) [details] > First take LGTM. > + static String convertFromV8(v8::Handle<v8::String> v8String, int length) { ASSERT v8String->Length() == length. > + UChar* buffer; > + String result = String::createUninitialized(length, buffer); > + v8String->Write(reinterpret_cast<uint16_t*>(buffer), 0, length); > + return result; > + } Perhaps add a comment that using an inline buffer doesn't make sense here to prevent future attempts to "optimize" this. > + static AtomicString convertFromV8(v8::Handle<v8::String> v8String, int length) { ASSERT v8String->Length() == length. > + static const int inlineBufferSize = 16; > + if (length <= inlineBufferSize) { > + UChar inlineBuffer[inlineBufferSize]; > + v8String->Write(reinterpret_cast<uint16_t*>(inlineBuffer), 0, length); > + return AtomicString(inlineBuffer, length); > + } else { > + UChar* buffer; > + String tmp = String::createUninitialized(length, buffer); > + v8String->Write(reinterpret_cast<uint16_t*>(buffer), 0, length); > + return AtomicString(tmp); > + } > + }

anton muhin

Comment 3 2009-11-12 11:09:22 PST

Created attachment 43080 [details] Addressing Vitaly's comments

anton muhin

Comment 4 2009-11-12 11:11:16 PST

(In reply to comment #2) > (In reply to comment #1) > > Created an attachment (id=43070) [details] [details] > > First take > > LGTM. > > > > + static String convertFromV8(v8::Handle<v8::String> v8String, int length) { > > ASSERT v8String->Length() == length. > > > + UChar* buffer; > > + String result = String::createUninitialized(length, buffer); > > + v8String->Write(reinterpret_cast<uint16_t*>(buffer), 0, length); > > + return result; > > + } > > Perhaps add a comment that using an inline buffer doesn't make sense here to > prevent future attempts to "optimize" this. Done. > > + static AtomicString convertFromV8(v8::Handle<v8::String> v8String, int length) { > > ASSERT v8String->Length() == length. Done. Thanks a lot for comments, Vitaly.

Dmitry Titov

Comment 5 2009-11-12 14:48:40 PST

Comment on attachment 43080 [details] Addressing Vitaly's comments > Index: WebCore/ChangeLog > + Do not use WebCore::String::String(const UChar*, int length) to convert > + short v8 strings. > + > + Plus added string traits. It is good to add the reason for the change to the ChangeLog as well as what was done. Ideally, perf optimizations would come with some measurement to show the actual win. > + static String convertFromV8(v8::Handle<v8::String> v8String, int length) { > + ASSERT(v8String->Length() == length); > + // NOTE: as of now, String(const UChar*, int) performs String::createUninitialized > + // anyway, so no need to optimize like we do for AtomicString below. > + UChar* buffer; > + String result = String::createUninitialized(length, buffer); > + v8String->Write(reinterpret_cast<uint16_t*>(buffer), 0, length); > + return result; It is unfortunate that this introduces almost a duplicate of the code... Could you consider having the StringTraits to have some simple bool flag, like "useInlineBuffer" and have a single copy of the code, something like this: if (StringTraits<StringType>::useInlineBuffer() && length <= inlineBufferSize) ... use inline buffer ... } else { ... use createUninitialized ... } or some other way to keep the code in one place?

anton muhin

Comment 6 2009-11-13 05:08:50 PST

(In reply to comment #5) > (From update of attachment 43080 [details]) > > Index: WebCore/ChangeLog > > > + Do not use WebCore::String::String(const UChar*, int length) to convert > > + short v8 strings. > > + > > + Plus added string traits. > > It is good to add the reason for the change to the ChangeLog as well as what > was done. > Ideally, perf optimizations would come with some measurement to show the actual > win. It has no performance impact or completely minimal. The idea was not to speed up some things, but to make it clear that this optimization is actually wrong for this particular case. > > > + static String convertFromV8(v8::Handle<v8::String> v8String, int length) { > > + ASSERT(v8String->Length() == length); > > + // NOTE: as of now, String(const UChar*, int) performs String::createUninitialized > > + // anyway, so no need to optimize like we do for AtomicString below. > > + UChar* buffer; > > + String result = String::createUninitialized(length, buffer); > > + v8String->Write(reinterpret_cast<uint16_t*>(buffer), 0, length); > > + return result; > > It is unfortunate that this introduces almost a duplicate of the code... Could > you consider having the StringTraits to have some simple bool flag, like > "useInlineBuffer" and have a single copy of the code, something like this: > if (StringTraits<StringType>::useInlineBuffer() && length <= inlineBufferSize) > ... use inline buffer ... > } else { > ... use createUninitialized ... > } > or some other way to keep the code in one place? Sure we could. The problem: there are only three lines of the code (including UChar* buffer declaration), so I am not sure that in this particular case it's worth the effort. But if you still think it's a good idea, I'd definitely do that. And thanks a lot for comments.

anton muhin

Comment 7 2009-11-17 06:00:12 PST

Comment on attachment 43080 [details] Addressing Vitaly's comments ping

Dmitry Titov

Comment 8 2009-11-17 13:52:53 PST

> It has no performance impact or completely minimal. The idea was not to speed > up some things, but to make it clear that this optimization is actually wrong > for this particular case. I'm not sure I understand the idea of the patch then. It replaces 15 lines of code with 43, to document that certain optimization does not always produce a perf win... Could a comment suffice? It seems the purpose of v8StringToWebCoreString templated function is to fold implementation of 2 other functions, v8StringToWebCoreString and v8StringToAtomicWebCoreString, into one, to avoid code duplication. It is strange to branch into 2 separate implementations again from v8StringToWebCoreString template. Perhaps 2 separate implementations from the start would be more readable?

anton muhin

Comment 9 2009-11-18 05:32:37 PST

(In reply to comment #8) > > It has no performance impact or completely minimal. The idea was not to speed > > up some things, but to make it clear that this optimization is actually wrong > > for this particular case. > > I'm not sure I understand the idea of the patch then. It replaces 15 lines of > code with 43, to document that certain optimization does not always produce a > perf win... Could a comment suffice? It is an optimization (for small strings we do not copy them twice). But perf effect as measured by current benchmarks is minimal if any. However, I think it should be easy to cook up a benchmark that shows the benefit---many conversions of short strings. I am not sure number of lines removed/added is a good estimator. E.g. I removed two lines and added 10 to use a common idiom of traits when fetching webcore string from the resource (instead of overloading which is less intuitive imho). > It seems the purpose of v8StringToWebCoreString templated function is to fold > implementation of 2 other functions, v8StringToWebCoreString and > v8StringToAtomicWebCoreString, into one, to avoid code duplication. It is > strange to branch into 2 separate implementations again from > v8StringToWebCoreString template. Perhaps 2 separate implementations from the > start would be more readable? The problem is there is no trivial prologue (checking if the string is already externalized) and epilogue (which externalizes the string if necessary) which I'd like not to duplicate.

Darin Fisher (:fishd, Google)

Comment 10 2009-11-24 14:55:20 PST

Comment on attachment 43080 [details] Addressing Vitaly's comments > Index: WebCore/bindings/v8/V8Binding.cpp > +template <class S> struct StringTraits > +{ > + static S getFromResource(WebCoreStringResource* resource); > + > + static S convertFromV8(v8::Handle<v8::String> v8String, int length); how about naming these fromStringResource and fromV8String? > +template<> > +struct StringTraits<String> > +{ > + static String getFromResource(WebCoreStringResource* resource) { > + return resource->webcoreString(); > + } please place the opening bracket on a new line per webkit style. same goes for the other functions in this patch. > + static AtomicString convertFromV8(v8::Handle<v8::String> v8String, int length) { ... > + return AtomicString(inlineBuffer, length); > + } else { nit: no need for else after return. R- because of style issues. otherwise, the substance looks good.

anton muhin

Comment 11 2009-11-25 15:19:08 PST

Created attachment 43872 [details] Addressing Darin's concerns

anton muhin

Comment 12 2009-11-25 15:21:00 PST

(In reply to comment #10) > (From update of attachment 43080 [details]) > > Index: WebCore/bindings/v8/V8Binding.cpp > > > +template <class S> struct StringTraits > > +{ > > + static S getFromResource(WebCoreStringResource* resource); > > + > > + static S convertFromV8(v8::Handle<v8::String> v8String, int length); > > how about naming these fromStringResource and fromV8String? Done. > > +template<> > > +struct StringTraits<String> > > +{ > > + static String getFromResource(WebCoreStringResource* resource) { > > + return resource->webcoreString(); > > + } > > please place the opening bracket on a new line per webkit style. same goes > for the other functions in this patch. Hopefully done---I hope we are allowed to keep { on the line of if, at least that was the case before. > > > > + static AtomicString convertFromV8(v8::Handle<v8::String> v8String, int length) { > ... > > + return AtomicString(inlineBuffer, length); > > + } else { > > nit: no need for else after return. > > > R- because of style issues. otherwise, the substance looks good. Many thanks for review.

Dmitry Titov

Comment 13 2009-11-25 18:05:12 PST

Comment on attachment 43872 [details] Addressing Darin's concerns Darin Fisher and I discussed it a bit. Although he's noted the style issues, he didn't mean to say the discussion we had before is concluded. Usually, perf improvement should measurably improve perf, so if there is no test but you say it can easily be created then lets add a test, measure and know what we are gaining. Having a test also prevents possible regressions. We have a v8-specific set of tests in src/data/dom_perf - could you add a specific test here? Again, if there is no measurable difference, I don't see why make the change. Agree there could be an extra copy of up to 16 bytes to a stack array. But if it's impossible to measure, why split the code at all? r- because it needs a test.

Darin Fisher (:fishd, Google)

Comment 14 2009-11-26 00:06:57 PST

I agree with Dmitry. My comment #10 was made without reviewing the commentary in this bug. I had just jumped to the patch from the request queue :-(

anton muhin

Comment 15 2009-12-01 09:52:12 PST

(In reply to comment #13) > (From update of attachment 43872 [details]) > Darin Fisher and I discussed it a bit. Although he's noted the style issues, he > didn't mean to say the discussion we had before is concluded. > > Usually, perf improvement should measurably improve perf, so if there is no > test but you say it can easily be created then lets add a test, measure and > know what we are gaining. Having a test also prevents possible regressions. We > have a v8-specific set of tests in src/data/dom_perf - could you add a specific > test here? > > Again, if there is no measurable difference, I don't see why make the change. > Agree there could be an extra copy of up to 16 bytes to a stack array. But if > it's impossible to measure, why split the code at all? > > r- because it needs a test. For Dromaeo DOM core (which is one of our primary benchmarks now and the only benchmark we're lagging Safari for now), that gives ~2.5% speedup for overall score (4 runs, dropping the slowest) and, more importantly, ~10% for DOM query (which is, together with DOM Modification) is there we're behind Safari.

anton muhin

Comment 16 2009-12-03 11:12:10 PST

ping

Dmitry Titov

Comment 17 2009-12-04 10:51:10 PST

Comment on attachment 43872 [details] Addressing Darin's concerns Good improvement on Dromaeo! r=me

anton muhin

Comment 18 2009-12-04 10:55:52 PST

Thanks a lot for review, Dmitry and Darin, Attempting to commit-queue+ as is, but if it got rotten would upload fresh patch (just rebaselined to the current WebKit).

WebKit Commit Bot

Comment 19 2009-12-04 12:16:13 PST

Comment on attachment 43872 [details] Addressing Darin's concerns Clearing flags on attachment: 43872 Committed r51707: <http://trac.webkit.org/changeset/51707>

WebKit Commit Bot

Comment 20 2009-12-04 12:16:21 PST

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.