Bug 94574

Summary: [V8] Remove String::New() from V8 binding (Part 1)
Product: WebKit Reporter: Kentaro Hara <haraken>
Component: WebCore JavaScriptAssignee: Kentaro Hara <haraken>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, japhet, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 93095    
Attachments:
Description Flags
Patch none

Description Kentaro Hara 2012-08-20 23:04:54 PDT
Currently, V8 binding mixes String::New(), String::NewSymbol() and v8String().

String::New() should be replaced with String::NewSymbol() or v8String(), depending on use cases:

- If it is a symbol (e.g. attribute name, constant string, etc), String::NewSymbol() should be used. Cache of the created symbols is managed by V8.

- If it is not a symbol, v8String() should be used. Cache of the created strings is managed by V8 binding (i.e. StringCache class).
Comment 1 Kentaro Hara 2012-08-20 23:07:06 PDT
Created attachment 159624 [details]
Patch
Comment 2 WebKit Review Bot 2012-08-21 00:33:25 PDT
Comment on attachment 159624 [details]
Patch

Clearing flags on attachment: 159624

Committed r126141: <http://trac.webkit.org/changeset/126141>
Comment 3 WebKit Review Bot 2012-08-21 00:33:28 PDT
All reviewed patches have been landed.  Closing bug.
Comment 4 Kentaro Hara 2012-08-21 04:39:22 PDT
(In reply to comment #0)
> - If it is a symbol (e.g. attribute name, constant string, etc), String::NewSymbol() should be used. Cache of the created symbols is managed by V8.
> 
> - If it is not a symbol, v8String() should be used. Cache of the created strings is managed by V8 binding (i.e. StringCache class).

This approach looks bad... I found that v8::String::NewSymbol() is much slower than v8String(). We should _always_ use v8String().

I measured the performance:

// 83 nano sec
static v8::Handle<v8::Value> attr1AttrGetter(v8::Local<v8::String> name, const v8::AccessorInfo& info)
{
    static String str1 = String("foo");
    static String str2 = String("foo");
    v8::Handle<v8::Value> v1 = v8String(str1);
    v8::Handle<v8::Value> v2 = v8String(str2);
    if (!v1.IsEmpty() && !v2.IsEmpty())
        return v8Undefined();
    return v8::Null(); // Never reach.
}

// 86 nano sec
static v8::Handle<v8::Value> attr2AttrGetter(v8::Local<v8::String> name, const v8::AccessorInfo& info)
{
    static String str1 = String("foo");
    static String str2 = String("bar");
    v8::Handle<v8::Value> v1 = v8String(str1);
    v8::Handle<v8::Value> v2 = v8String(str2);
    if (!v1.IsEmpty() && !v2.IsEmpty())
        return v8Undefined();
    return v8::Null(); // Never reach.
}

// 376 nano sec
static v8::Handle<v8::Value> attr3AttrGetter(v8::Local<v8::String> name, const v8::AccessorInfo& info)
{
    v8::Handle<v8::Value> v1 = v8::String::NewSymbol("foo", 3);
    v8::Handle<v8::Value> v2 = v8::String::NewSymbol("foo", 3);
    if (!v1.IsEmpty() && !v2.IsEmpty())
        return v8Undefined();
    return v8::Null(); // Never reach.
}

// 377 nano sec
static v8::Handle<v8::Value> attr4AttrGetter(v8::Local<v8::String> name, const v8::AccessorInfo& info)
{
    v8::Handle<v8::Value> v1 = v8::String::NewSymbol("foo", 3);
    v8::Handle<v8::Value> v2 = v8::String::NewSymbol("bar", 3);
    if (!v1.IsEmpty() && !v2.IsEmpty())
        return v8Undefined();
    return v8::Null(); // Never reach.
}
Comment 5 Adam Barth 2012-08-21 08:14:08 PDT
Presumably there's a perf win when comparing symbols later or in memory because symbols can share the underlying storage of the characters.

Perhaps we should use symbols in places like template construction that we execute once and then use over and over again?
Comment 6 Kentaro Hara 2012-08-21 08:38:41 PDT
(In reply to comment #5)
> Presumably there's a perf win when comparing symbols later or in memory because symbols can share the underlying storage of the characters.

Maybe. But v8String() also shares the underlying storage (i.e. WebCoreStringResource). v8String() implements a couple of fast paths to access cached WebCoreStringResource, which I think would be the reason why v8String() shows a better performance than v8::String::NewSymbol().

> Perhaps we should use symbols in places like template construction that we execute once and then use over and over again?

Specifically, what kind of code are you assuming?
Comment 7 Adam Barth 2012-08-21 08:44:11 PDT
> Specifically, what kind of code are you assuming?

The code in the V8 bindings that uses the term "template".  Specifically, it's code that we run at startup to teach V8 how to make various DOM objects.  We build "template" objects in V8 that we then clone when we want to instantiate a DOM object.

I'm just guessing at what NewSymbol does based on how symbolized strings are used in similar VMs.  Typically, they're optimized for comparison operations, like AtomicString in WebCore.  I don't actually know if they're useful for us.  I'm just speculating.
Comment 8 Kentaro Hara 2012-08-21 09:01:55 PDT
(In reply to comment #7)
> > Specifically, what kind of code are you assuming?
> 
> The code in the V8 bindings that uses the term "template".  Specifically, it's code that we run at startup to teach V8 how to make various DOM objects.  We build "template" objects in V8 that we then clone when we want to instantiate a DOM object.

Got it... But the result in https://bugs.webkit.org/show_bug.cgi?id=94588#c1 implies that v8String() would be faster than v8::String::NewSymbol() even for the "template" code?

In the results I confirmed that:

  for (i = 0; i < 1000000; i++)
    v8String(String("foo"));

is faster than

  for (i = 0; i < 1000000; i++)
    v8::String::NewSymbol("foo");

and that

  for (i = 0; i < 1000000; i++)
    v8String(String("foo")); v8String(String("bar"));

is faster than

  for (i = 0; i < 1000000; i++)
    v8::String::NewSymbol("foo"); v8::String::NewSymbol("bar");

Or should I measure the performance with a lot of more strings?

> I'm just guessing at what NewSymbol does based on how symbolized strings are used in similar VMs.  Typically, they're optimized for comparison operations, like AtomicString in WebCore.  I don't actually know if they're useful for us.  I'm just speculating.

FYI, in case of v8String(), StringCache manages:

- a persistent handle of the most recently accessed StringImpl()
- a hashmap from StringImpl()s to persistent handles.
Comment 9 Adam Barth 2012-08-21 09:29:27 PDT
You're just measuring creation.  What about comparison of the strings?
Comment 10 Kentaro Hara 2012-08-22 23:50:52 PDT
(In reply to comment #9)
> You're just measuring creation.  What about comparison of the strings?

abarth: You're quite right. I had a discussion with the V8 team.

[1] String comparison of String::NewSymbol() is faster than string comparison of String::New(). Thus we should use String::NewSymbol() for symbols, and String::New() for other strings.

Note: The string comparison is done when properties are looked up in V8, but it just affects the performance of property accesses through V8 APIs (e.g. Get(), Set(), Has(), etc). It does not affect the performance of property accesses in JIT code (i.e. it does not affect the overhead of calling back xxxAttrGetter()/xxxAttrSetter()). I confirmed it.

[2] We can improve the performance by using a persistent handle for String::NewSymbol().

  object->Set(String::NewSymbol("foo")); // Local handle A.
  object->Get(String::NewSymbol("foo")); // Local handle B. Given that local handle B is different from local handle A, V8 needs to compare two symbols.

  Persistent<String> handle = Persistent<String>::New(String::NewSymbol("foo"));
  object->Set(handle); // Persistent handle A.
  object->Get(handle); // Persistent handle A. Given that two persistent handles are identical, V8 just needs to compare two handles. This is faster than comparing two symbols.

This persistent handle won't cause memory bloat. Given that the string contents (e.g. "foo") are already stored in object, the additional memory required for the persistent handle is just two pointers per persistent handle (according to the V8 team).


[1] is easy to implement. So I'll do [1] from now.

[2] will require a substantial amount of work. Furthermore, as mentioned above, it just affects the performance of property accesses through V8 APIs. We are not sure how critical the performance is in the real world. So I'll do [2] when I find that it is critical.