Summary: | [v8] Cache atomic strings in externalized v8 strings | ||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Christian Plesner Hansen <christian.plesner.hansen> | ||||||||||||||||||||
Component: | WebCore Misc. | Assignee: | David Levin <levin> | ||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||
Severity: | Normal | CC: | antonm, eric, levin, sam | ||||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||||
Hardware: | All | ||||||||||||||||||||||
OS: | All | ||||||||||||||||||||||
Attachments: |
|
Description
Christian Plesner Hansen
2009-07-28 05:50:43 PDT
Created attachment 33618 [details]
Initial version
Created attachment 33620 [details]
Updated ChangeLog
Just a minor comments/questions. 1) Looks like you don't use HintAtomic and there is no support for it in codegen. Maybe this change should be just postponed? 2) do we need to store String now? String(StringImpl*) ctor seems very cheap. Maybe only store AtomicString even for String (it might though perform a find on creation, but it might benefit us later as most of DOM api uses AtomicStrings anyway)? 3) Safari has a nice optimization that if we don't find AtmoicString, we're guaranteed not to get node by the given idea for example. Is it possible to incorporate it into Chrome as well? Of course, not for this CL. > 1) Looks like you don't use HintAtomic and there is no support for it in > codegen. Maybe this change should be just postponed? I don't use HintAtomic value but I do use the presence/absence of a V8Custom annotation in line 1574. > 2) do we need to store String now? String(StringImpl*) ctor seems very cheap. > Maybe only store AtomicString even for String (it might though perform a find > on creation, but it might benefit us later as most of DOM api uses > AtomicStrings anyway)? The reason for the m_impl field is to make sure the String value is kept alive even after it has been converted to an AtomicString. V8 may use the buffer underlying m_impl even after m_atomic_impl has been set so both have to be kept alive. > 3) Safari has a nice optimization that if we don't find AtmoicString, we're > guaranteed not to get node by the given idea for example. Is it possible to > incorporate it into Chrome as well? Of course, not for this CL. Good point, we should definitely do that. (In reply to comment #4) > > 1) Looks like you don't use HintAtomic and there is no support for it in > > codegen. Maybe this change should be just postponed? > > I don't use HintAtomic value but I do use the presence/absence of a V8Custom > annotation in line 1574. > > > 2) do we need to store String now? String(StringImpl*) ctor seems very cheap. > > Maybe only store AtomicString even for String (it might though perform a find > > on creation, but it might benefit us later as most of DOM api uses > > AtomicStrings anyway)? > > The reason for the m_impl field is to make sure the String value is kept alive > even after it has been converted to an AtomicString. V8 may use the buffer > underlying m_impl even after m_atomic_impl has been set so both have to be kept > alive. AtomicString has String m_string field, cannot we reuse that instead? > > > 3) Safari has a nice optimization that if we don't find AtmoicString, we're > > guaranteed not to get node by the given idea for example. Is it possible to > > incorporate it into Chrome as well? Of course, not for this CL. > > Good point, we should definitely do that. Created attachment 33628 [details]
Check that cached atomic string stay on the main thread
The issue was raised in another review that atomic string are not safe to use across threads. I've added an assertion that only the main thread accesses the cached atomic string which, as far as I can tell, is a safe assumption to make. It is still possible to use the atomic string on other threads as if it were a plain string; I assume that is safe. > AtomicString has String m_string field, cannot we reuse that instead?
That field is already being used to hold the string value shared between all AtomicString values with the same contents.
Comment on attachment 33628 [details]
Check that cached atomic string stay on the main thread
There are tabs in your ChangeLog.
m_atomic_impl is not WebKit style. m_atomicString would be, or m_atomicImpl (although we've historically tried to kill Impl, excepting a few places)
+ if (m_atomic_impl.isNull()) {
+ m_atomic_impl = AtomicString(m_impl);
+ }
is not WK style.
WK uses 4 space indent:
+ if (!v8String->MakeExternal(resource)) {
+ // In case of a failure delete the external resource as it was not used.
+ delete resource;
+ }
I think you should consider running WebKit's copy of cpplint, it's found in WebKitTools/Scripts/
JSC does this by passing UStrings directly to an AtomicString constructor (they're implicitly constructed when getElementById is called).
PassRefPtr<StringImpl> AtomicString::add(const JSC::UString& ustring)
is where the real magic is. (Just hashes the string and makes a StringImpl wrapper if necessary.)
Does V8 not want to just do the same thing? Add a new constructor for AtomicString for a Handle<String> (or whatever the appropriate v8 string base type is).
The r- was for the style stuff mostly. But I'm wondering if you really need to/want to add a custom attribute for this. > There are tabs in your ChangeLog. > > m_atomic_impl is not WebKit style. m_atomicString would be, or m_atomicImpl > (although we've historically tried to kill Impl, excepting a few places) > > + if (m_atomic_impl.isNull()) { > + m_atomic_impl = AtomicString(m_impl); > + } > > is not WK style. > > WK uses 4 space indent: > + if (!v8String->MakeExternal(resource)) { > + // In case of a failure delete the external resource as it was not used. > + delete resource; > + } > > I think you should consider running WebKit's copy of cpplint, it's found in > WebKitTools/Scripts/ Sorry, it's been like two years since I last worked on WK code. I'll lint and clean up the code. > JSC does this by passing UStrings directly to an AtomicString constructor > (they're implicitly constructed when getElementById is called). > > PassRefPtr<StringImpl> AtomicString::add(const JSC::UString& ustring) > > is where the real magic is. (Just hashes the string and makes a StringImpl > wrapper if necessary.) It's the 'add' call I'm trying to get rid of; currently around 10% of getElementById is spent just looking up atomic strings and that seems to hold for most operations that use them. > The r- was for the style stuff mostly. But I'm wondering if you really need
> to/want to add a custom attribute for this.
Just so I understand what you're asking: are you suggesting getting rid of the annotation altogether or replacing [V8Custom=...] with a special-purpose [Atomic] annotation?
So do you believe JSC to be slow for the same reasons? Or is add() somehow faster for JSC? > So do you believe JSC to be slow for the same reasons? Or is add() somehow
> faster for JSC?
I haven't looked at JSC's code but unless they're doing something clever to avoid it I would expect they're doing at least one call to add() that could be saved if there was somewhere to cache the atomic string.
I believe both UString and StringImpl know how to cache their hash values. So add() is just a single lookup in the AtomicString hash table, you don't have to re-hash the string. (In reply to comment #15) > I believe both UString and StringImpl know how to cache their hash values. So > add() is just a single lookup in the AtomicString hash table, you don't have to > re-hash the string. We currently cache a StringImpl so if they do hash caching we already get the benefits of that. The lookup itself still shows up high on our profiles. Also, to answer your question from before: I think add() is as fast for us as for JSC. Comment on attachment 33628 [details]
Check that cached atomic string stay on the main thread
If what you say is true, and that add() would show up just as much for JSC, it seems we should fix this in a different way which fixes it for all JS engines. :) Hard to know without seeing the profile in question.
Created attachment 33695 [details]
Fixed style issues; added main thread assert
(In reply to comment #17) > (From update of attachment 33628 [details]) > If what you say is true, and that add() would show up just as much for JSC, it > seems we should fix this in a different way which fixes it for all JS engines. > :) Hard to know without seeing the profile in question. I agree, a solution that covers both JSC and v8 would be better. Just as an experiment I tried replacing my changes with a naive atomic string cache in StringImpls and it makes my getElementById microbenchmark ~8% faster. So that might be worth doing if the cost of an extra word per StringImpl is acceptable. I've only tried it in webkit/v8 though because webkit/jsc doesn't run on linux (as far as I know). Comment on attachment 33695 [details]
Fixed style issues; added main thread assert
Of course JSC runs on linux. :) I still think v8-specific hacks are a bad thing.
Spacing:
+ if (!v8String->MakeExternal(resource)) {
+ // In case of a failure delete the external resource as it was not used.
+ delete resource;
+ }
What test are you profiling against? > Of course JSC runs on linux. :) I still think v8-specific hacks are a bad
> thing.
I know jsc runs on linux but I assumed that chrome-jsc doesn't. I may be wrong
about that and it would be very useful if it does run.
I agree that a solution that improves both vms would be better but I don't know
enough about webkit to decide how best to implement it or decide if it should
be implemented at all. V8's string representation happens to be particularly
well suited for this kind of caching because we allow transparent
representation changes; I don't consider this a hack but just taking advantage
of the hooks that are available in v8 and not, as far as I know, in jsc.
> What test are you profiling against?
Right now I'm working specifically on document.getElementById because it performed abysmally in chromium so the benchmark I'm running is a tight loop that just calls document.getElementById over and over again. But I expect the atom conversion part of getElementById is representative of all operations that use atomic strings.
Created attachment 33721 [details]
Fixed whitespace issue
Comment on attachment 33721 [details] Fixed whitespace issue Just a few things to address and this can be done. Please get the patch to be based from WebKit or else it will be harder to commit this for you (i.e. in this case, all path for files below should be prefixed with WebCore/). > Index: ChangeLog > =================================================================== > --- ChangeLog (revision 46477) > +++ ChangeLog (working copy) > @@ -1,3 +1,21 @@ > +2009-07-28 Christian Plesner Hansen <christian.plesner.hansen@gmail.com> > + > + Reviewed by NOBODY (OOPS!). > + Add title and bug link like this: [v8] Cache atomic strings in externalized v8 strings https://bugs.webkit.org/show_bug.cgi?id=27762 > + Cache atomic strings in externalized strings the first time > + the string is converted. Added V8Custom IDL annotation for > + hinting that converting a string directly to an atomic string is a > + good idea. Get rid of TABs in the changelog. > Index: bindings/v8/V8Binding.cpp > + explicit WebCoreStringResource(const AtomicString& string) > + : m_impl(string.impl()), m_atomicImpl(string.impl()) Please format like this: explicit WebCoreStringResource(const AtomicString& string) : m_impl(string.impl()) , m_atomicImpl(string.impl()) > + AtomicString atomicString() > + { > + ASSERT(WTF::isMainThread()); > + if (m_atomicImpl.isNull()) > + m_atomicImpl = AtomicString(m_impl); It seems that you should adjust the external memory here: if (m_atomicImpl.isNull()) { m_atomicImpl = AtomicString(m_impl); if (!m_impl.inTable()) v8::V8::AdjustAmountOfExternalAllocatedMemory(2 * length()); } And change the destructor like this: virtual ~WebCoreStringResource() { ASSERT(WTF::isMainThread()); int reducedExternalMemory = -2 * length(); if (!m_impl.inTable()) reducedExternalMemory *= 2; v8::V8::AdjustAmountOfExternalAllocatedMemory(reducedExternalMemory); } > private: > // A shallow copy of the string. Keeps the string buffer alive until the V8 engine garbage collects it. > String m_impl; > + // If this string is atomic or has been made atomic earlier the > + // atomic string is held here. In the case where the string starts Please only put one space after periods. > + // off non-atomic and becomes atomic later it is necessary to keep > + // the original string alive because v8 may keep derived pointers > + // into that string. > + AtomicString m_atomicImpl; Why is it m_atomicImpl instead of just m_atomicString? > @@ -130,13 +151,34 @@ String v8ValueToWebCoreString(v8::Handle > > AtomicString v8StringToAtomicWebCoreString(v8::Handle<v8::String> v8String) > { > + WebCoreStringResource* stringResource = static_cast<WebCoreStringResource*>(v8String->GetExternalStringResource()); > + if (stringResource) > + return stringResource->atomicString(); > + > + int length = v8String->Length(); > + if (!length) { > + // Avoid trying to morph empty strings, as they do not have enough room to contain the external reference. > + return StringImpl::empty(); > + } > + > + UChar* buffer; > + String plainResult = String::createUninitialized(length, buffer); > + v8String->Write(reinterpret_cast<uint16_t*>(buffer), 0, length); > + AtomicString result(plainResult); > + > + WebCoreStringResource* resource = new WebCoreStringResource(result); > + if (!v8String->MakeExternal(resource)) { > + // In case of a failure delete the external resource as it was not used. > + delete resource; > + } > + return result; This duplicates a fair amount of code from v8StringToWebCoreString. Can you factor the code to avoid this? btw, if you create a new function (or add to an existing function) and want a "bool" parameter, please make it an enum instead. Created attachment 33865 [details]
Next iteration
Fixed all issues except one. I don't see any tabs in the ChangeLog but I had to create a new one anyway so if there were any they should be gone now. There is one potential issue left though. Is is correct that an atomic thread created on one thread cannot safely be used on another? Or is it just that the operation of making a string atomic is not thread safe? If it is the first case then as far as I can see plain String objects can't be shared either, which my changes assume they can. > Please get the patch to be based from WebKit or else it will be harder to > commit this for you (i.e. in this case, all path for files below should be > prefixed with WebCore/). On linux, using svn-create-patch from third_party/WebKit gave me an empty patch so I had to be a little creative. Let me know if the patch gives you any problems. > Why is it m_atomicImpl instead of just m_atomicString? Renamed one to m_plainString and the other to m_atomicString to make it clear what the difference is. Comment on attachment 33865 [details] Next iteration Just a few more thigns. > Index: WebCore/ChangeLog > +2009-07-31 Christian Plesner Hansen <christian.plesner.hansen@gmail.com> > + > + Reviewed by NOBODY (OOPS!). > + > + [V8] Cache atomic strings in externalized v8 strings > + https://bugs.webkit.org/show_bug.cgi?id=27762 > + > + * bindings/scripts/CodeGeneratorV8.pm: > + * bindings/v8/V8Binding.cpp: > + (WebCore::WebCoreStringResource::WebCoreStringResource): > + (WebCore::WebCoreStringResource::~WebCoreStringResource): > + (WebCore::WebCoreStringResource::data): > + (WebCore::WebCoreStringResource::length): > + (WebCore::WebCoreStringResource::webcoreString): > + (WebCore::WebCoreStringResource::atomicString): > + (WebCore::WebCoreStringResource::forString): > + (WebCore::v8StringToWebCoreString): > + (WebCore::v8StringToAtomicWebCoreString): > + (WebCore::v8ValueToWebCoreString): > + (WebCore::v8ValueToAtomicWebCoreString): I think this is ok for this change, but in general (brief) per function comments are encouraged to explain what happened in the function. See other entries in the change log. > Index: WebCore/bindings/v8/V8Binding.cpp > @@ -49,33 +49,66 @@ namespace WebCore { > class WebCoreStringResource : public v8::String::ExternalStringResource { > public: > explicit WebCoreStringResource(const String& string) > - : m_impl(string.impl()) > + : m_plainString(string) > { Nice to add assert here: ASSERT(WTF::isMainThread()); > v8::V8::AdjustAmountOfExternalAllocatedMemory(2 * length()); > } > > + explicit WebCoreStringResource(const AtomicString& string) > + : m_plainString(string) > + , m_atomicString(string) > + { > + ASSERT(WTF::isMainThread()); > + v8::V8::AdjustAmountOfExternalAllocatedMemory(2 * length()); > + } > + > virtual ~WebCoreStringResource() > { > - v8::V8::AdjustAmountOfExternalAllocatedMemory(-2 * length()); Nice to add assert here: ASSERT(WTF::isMainThread()); (Really all we care about is that everything is on the same thread and for now this will do.) > + int reducedExternalMemory = -2 * length(); > + if (!m_plainString.impl()->inTable()) > + reducedExternalMemory *= 2; > + v8::V8::AdjustAmountOfExternalAllocatedMemory(reducedExternalMemory); > } > + AtomicString atomicString() > + { > + ASSERT(WTF::isMainThread()); > + if (m_atomicString.isNull()) { > + m_atomicString = AtomicString(m_plainString); > + if (m_plainString.impl()->inTable()) This should be if (!m_plainString.impl()->inTable()) > + v8::V8::AdjustAmountOfExternalAllocatedMemory(2 * length()); > + } > + return m_atomicString; > + } > + > + static WebCoreStringResource *forString(v8::Handle<v8::String> v8String) Move the "*" static WebCoreStringResource* Also the name "forString" doesn't read that well to me. How about toStringResource? (toString may have worked but there are so many *String* in here, that I thought it would be nice to distinguish.) > +AtomicString v8StringToAtomicWebCoreString(v8::Handle<v8::String> v8String) > +{ > + WebCoreStringResource* stringResource = WebCoreStringResource::forString(v8String); > + if (!stringResource) { > + // If this string hasn't been externalized we force it now. Nice to add a comma after the if clause. If this string hasn't been externalized, we force it now. And as discussed, this solution removed the duplicate code but increased memory usage unnecessarily. Created attachment 33879 [details]
Fixed memory waste etc.
> This should be > if (!m_plainString.impl()->inTable()) Eek, how did I get this wrong?!? > Also the name "forString" doesn't read that well to me. How about > toStringResource? (toString may have worked but there are so many *String* in > here, that I thought it would be nice to distinguish.) That's a v8-ism inherited, I believe, from SmallTalk. I've changed it. > And as discussed, this solution removed the duplicate code but increased memory > usage unnecessarily. I've changed it to avoid that. Comment on attachment 33879 [details]
Fixed memory waste etc.
If you want a review, it is good to set the r? (or no one will see this). I'm adding it for you.
Thanks! I thought that was just the first time. Comment on attachment 33879 [details] Fixed memory waste etc. Just a few things that can be fixed on landing.. > Index: WebCore/ChangeLog > =================================================================== > +2009-07-31 Christian Plesner Hansen <christian.plesner.hansen@gmail.com> > + > + Reviewed by NOBODY (OOPS!). > + > + [V8] Cache atomic strings in externalized v8 strings > + https://bugs.webkit.org/show_bug.cgi?id=27762 > + I should have said this before, but typically there is a comment here about why no layout tests were added if none were. In this case, there is no change in functionality, so you could say: No change in behavior, so no tests. > Index: WebCore/bindings/v8/V8Binding.cpp > =================================================================== > +String v8StringToWebCoreString(v8::Handle<v8::String> v8String, ExternalMode external, > + StringType type) This is typically indented to align with the (. > Index: WebCore/bindings/v8/V8Binding.h Enum members should user InterCaps with an initial capital letter. > + enum ExternalMode { EXTERNALIZE, DONT_EXTERNALIZE }; Externalize, DoNotExternalize > + enum StringType { PLAIN, ATOMIC }; Since these enum are in the WebCore namespace (and cased like classes), it feels like the names for StringType should be a little more verbose. Here's some simple alternatives: PlainStringType AtomicStringType Comment on attachment 33879 [details]
Fixed memory waste etc.
I still disagree with this approach. 1. it seems silly to hold two StringImpl pointers, one in a String and one in an AtomicString. 2. I think this is wrong to "fix" this issue just for V8. If AtomicString::add is really such an issue, it should be fixed using StringImpl changes instead of in the individual JS engines.
Comment on attachment 33879 [details]
Fixed memory waste etc.
You still haven't convinced me that this need to be fixed in V8 and not in WebCore.
Do you have a copy of your test so that one of us can run it in a regular WebKit (non-chomium) build and see if this is really just a V8 problem? Created attachment 33890 [details]
GetElementById benchmark
Responding to the several comments asking for the benchmark:
Christian told me where to get the benchmark. I did a few minor changes (added the explanation text that he told me and made it one file instead of having a separate one for the js so that it was more self contained), so I take credit for any ugliness, mistakes, etc.
I tried to be brief as possible in this, but it still was long. Sorry. My overall conclusion: If devs want to increase the perf of code using v8 bindings by taking advantage of a v8 feature and there doesn't seem a clear path to do something more generic, that seems like a really reasonable thing to do and not require an equivalent change in JSC. (The reverse goes without saying.) Details: 1. The big issue seems to be "Why is this needed in V8 and not in something done for both JSC and V8?" For V8, this patch solution uses v8::string::MakeExternal (which stores the WebCore String and uses that buffer pointer inside of V8). JSC uses features of UString to speed up this code (-- for example, keeping the precomputed hash). In addition, I believe that JSC could do even by taking advantage of the fact that it can share the UChar* buffer from the UString. After debugging both JSC and V8 on the benchmar, I saw no significant difference in way that AtomicStrings are formed from UStrings vs String, but the code path for UString was obviously invested in to make sure it preformed well. I'm not sure why this seemed to be a good optimization for Chromium (and has not been for Safari), but if there isn't something apparent in AtomicString that can be fixed, then it seems reasonable to use a feature of v8::String to handle it. 2. The second issue was "It seems silly to hold two StringImpl pointers, one in a String and one in an AtomicString." A. The WebCoreStringResource may be created with a WebCore::String to begin with. (It doesn't make sense to incur the cost of an AtomicString if it isn't need.) B. When WebCoreStringResource is created, it calls MakeExternal with that WebCore::String. At this point the buffer in that WebCore::String must stay around for as long as the v8::String is alive. C. Later, that same string value may come from v8 and want to be AtomicString. In that case, the AtomicString is cached as well due to performance reasons. Ok. Talking with Dave Levin over IM: 1. JSC has this same hot-spot. 2. JSC based WebKit is 4x as fast as V8 WebKit on this same benchmark. 3. It's unclear what other things v8 is doing to make it so slow on this benchmark. 4. We could add this optimization to JSC in a similar way by converting UString's hash into a hash/stringimpl* union and keeping around a bit to know which it was. Or we could add a StringImpl* to UString. Neither seems necessary at this point. I no-longer object to this change. I'm still confused as to if this is what accounts for the 4x slowdown in v8 vs. JSC on this benchmark. I'll do a profile of chromium to compare to the one I did for webkit using nightlies and try to see why there is a huge difference here. Your free to move forward with this change, btw. I no longer mean to stand in your way. DEBUG r46722 WebKit (JSC) with Safari Version 4.0 (5530.17, 532+) on my machine: Runs: 10 Average: 0.5603 Deviation: 0.1520% Values: 0.5602 0.5605 0.5605 0.5587 0.5596 0.5599 0.5596 0.5612 0.5618 0.5609 RELEASE r46722 WebKit (JSC) with Safari Version 4.0 (5530.17, 532+) on my machine: Runs: 10 Average: 6.580 Deviation: 0.2311% Values: 6.591 6.585 6.585 6.579 6.542 6.591 6.585 6.585 6.560 6.591 RELEASE Chrome 3.0.196.0 for Mac: Runs: 10 Average: 1.610 Deviation: 0.5771% Values: 1.599 1.594 1.610 1.608 1.609 1.618 1.623 1.623 1.610 1.604 Larger numbers are better. Created attachment 33998 [details] Shark sample of benchmark in WebKit r46722 built RELEASE for Mac with JSC AtomicString::add is 4th on the list for JSC. Most of the time seems to be spent ref-counting and looking up nodes via toJS(Node*) I don't have a version of Chrome with symbols, so although I made a Shark sample for that too, it's useless. Comment on attachment 33879 [details]
Fixed memory waste etc.
Per conversation with Eric and the comments in this bug, I'm adding my r+ back to this patch.
Assigning to levin to fix style issue on landing. |