Bug 20209 - Atomize constant strings
Summary: Atomize constant strings
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P2 Normal
Assignee: Gavin Barraclough
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-07-29 06:52 PDT by Gavin Barraclough
Modified: 2008-07-30 08:35 PDT (History)
3 users (show)

See Also:


Attachments
fix (1.45 KB, patch)
2008-07-29 06:53 PDT, Gavin Barraclough
ggaren: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gavin Barraclough 2008-07-29 06:52:25 PDT
Example of code that currently performs poorly:

function get(self, keyword) {
    // do stuff
    return self[keyword];
}

function f1() {
    var localGet = get;
    var object = {};
    var start = new Date;
    for (var i =0; i < 10000000; i++)
        get(object, 'foo');
    print("f1 time taken: " + (new Date - start)/1000 + "s");
}

function f2() {
    var object = {};
    var startTime = new Date;
    for (var i =0; i < 10000000; i++)
        object['foo'];
    print("f2 time taken: " + (new Date - startTime)/1000 + "s");
}

f1();
f2();

The problem in the above script is that the string 'foo' is used twice as an identifier.  Only 'foo' from f1() will be added to the identifier map, degrading the performance of f2().  Fix this by turning constant strings into identifiers during code generation.

(patch following).
Comment 1 Gavin Barraclough 2008-07-29 06:53:21 PDT
Created attachment 22537 [details]
fix
Comment 2 Alexey Proskuryakov 2008-07-29 08:24:11 PDT
I'm getting some weird results on the above test with a r35249 nightly. f1 takes ~2.3s, while f2 alternates between 1.6s and 0.9s (odd runs are slow, even runs are fast, no values in between).

On a Shark profile of a slow run, 36% of time is spent in Identifier::addSlowCase, and for fast runs, this function is not on the profile.
Comment 3 Geoffrey Garen 2008-07-29 15:44:42 PDT
> I'm getting some weird results on the above test with a r35249 nightly. f1
> takes ~2.3s, while f2 alternates between 1.6s and 0.9s (odd runs are slow, even
> runs are fast, no values in between).

Perhaps the garbage collector's effect on the lifetime of an identifier is in play here.
Comment 4 Geoffrey Garen 2008-07-29 15:52:22 PDT
Comment on attachment 22537 [details]
fix

Note that the patch I suggested to convert a['s'] to a.s during codegen would also fix this particular test case.

+    // String literals are atomized (internalized) in the identifier map.

Probably better to state the why here, rather than the what. For example, "We atomize constant strings, in case they're later used in property lookup."

+        a script contains multiple identical strings that are used as keys keys to identify

Typo: "keys keys" should be "keys."

Please add a bugzilla link to the ChangeLog.

r=me
Comment 5 Gavin Barraclough 2008-07-30 07:56:55 PDT
(In reply to comment #4)
> (From update of attachment 22537 [details] [edit])
> Note that the patch I suggested to convert a['s'] to a.s during codegen would
> also fix this particular test case.

Ah yes, that's quite true, thanks – I had writing a test case for this on my todo list, since we don't hit it in sunspider – that's one problem solved! – will raise a bug & submit a patch.
G.
Comment 6 Gavin Barraclough 2008-07-30 07:59:24 PDT
Sending        JavaScriptCore/ChangeLog
Sending        JavaScriptCore/kjs/nodes.cpp
Transmitting file data ..
Committed revision 35446.

Comment 7 Gavin Barraclough 2008-07-30 08:35:33 PDT
Optimize [ "constant string" ] into an identifier access.

https://bugs.webkit.org/show_bug.cgi?id=20227