Bug 99118 - [JSC] >10% of Dromaeo dom-query.html is atomizing the same StringImpl's passed from JSC to getElementByTagName
Summary: [JSC] >10% of Dromaeo dom-query.html is atomizing the same StringImpl's passe...
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks: 75571 92486 99191
  Show dependency treegraph
 
Reported: 2012-10-11 16:49 PDT by Eric Seidel (no email)
Modified: 2012-10-15 16:19 PDT (History)
6 users (show)

See Also:


Attachments
Patch I used to generate addSlowCase stats (2.57 KB, patch)
2012-10-12 16:07 PDT, Eric Seidel (no email)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2012-10-11 16:49:39 PDT
JSC Should atomize constant strings before handing them to WebCore

The Dromaeo dom-query benchmark has the following subtest:

for ( var i = 0; i < num; i++ ) {
	var elems = document.getElementsByTagName("*");
	ret = elems[elems.length-1].nodeType;
}

That spends nearly 10% of total time in AtomicString hash lookups for "*".  "*" is of course already in the AtomicString hash (also a global called WTF::starAtom).

I suspect that JSC may already have code to identify "*" as a constant.  If it were to notice that it is a short string, and to do an AtomicString lookup for it, then it could pass WebCore an already atomized StringImpl, which would cause the AtomicString(const String&) constructor to go down the fast-path, and presumably drop 10% off this test!

I suspect variants of this may affect real world usage, but I don't know how much of a cost Atomizing constants in JSC would be on other benchmarks (or how complicated it would be).  I leave those up to you JSC experts.

I have not yet tested V8 to see if they already do this.
Comment 1 Geoffrey Garen 2012-10-11 16:59:42 PDT
<rdar://problem/12484322>
Comment 2 Geoffrey Garen 2012-10-11 17:00:57 PDT
For this optimization, JSC could put Identifiers in the AtomicString table instead of or in addition to the Identifier table.
Comment 3 Eric Seidel (no email) 2012-10-11 19:19:54 PDT
Looking at the sample more, it looks like this is more like avoiding AtomicString lookup due to Identifier/AtomicString StringImpl* mismatches has the potential to be more like at 15+% win on dom-query.html:

Running Time	Self		Symbol Name
3472.0ms    7.5%	3472.0	 	WTF::HashTableAddResult<WTF::HashTableIterator<WTF::StringImpl*, WTF::StringImpl*, WTF::IdentityExtractor, WTF::StringHash, WTF::HashTraits<WTF::StringImpl*>, WTF::HashTraits<WTF::StringImpl*> > > WTF::HashTable<WTF::StringImpl*, WTF::StringImpl*, WTF::IdentityExtractor, WTF::StringHash, WTF::HashTraits<WTF::StringImpl*>, WTF::HashTraits<WTF::StringImpl*> >::add<WTF::IdentityHashTranslator<WTF::StringHash>, WTF::StringImpl*, WTF::StringImpl*>(WTF::StringImpl* const&, WTF::StringImpl* const&)
1615.0ms    3.4%	1615.0	 	WTF::StringHash::equal(WTF::StringImpl const*, WTF::StringImpl const*)
1524.0ms    3.2%	1524.0	 	WTF::AtomicString::addSlowCase(WTF::StringImpl*)
561.0ms    1.2%	561.0	 	WTF::StringImpl::hashSlowCase() const

Are all samples which stem from this in dom-query.html. :)
Comment 4 Eric Seidel (no email) 2012-10-11 19:34:16 PDT
This is also another 8-10% of dom-attr.html.
Comment 5 Oliver Hunt 2012-10-12 12:04:48 PDT
JSC does intern all string literals in source code.  I'm unsure how feasible sharing jsc's interned strings with AtomicString is (multiple JSC contexts have separate interned strings for the same literal for example).  Also it would require some communication between JSC and WebCore of the sort that would be easier if we weren't having to deal with two JS engines.

That said, what strings are we seeing that aren't being atomised?

There are some ways to optimize these kinds of lookup to avoid hashing in common cases.
Comment 6 Eric Seidel (no email) 2012-10-12 12:58:50 PDT
(In reply to comment #5)
> JSC does intern all string literals in source code.  I'm unsure how feasible sharing jsc's interned strings with AtomicString is (multiple JSC contexts have separate interned strings for the same literal for example).  Also it would require some communication between JSC and WebCore of the sort that would be easier if we weren't having to deal with two JS engines.

For what it's worth, I believe v8 has the same bug.

> That said, what strings are we seeing that aren't being atomised?

The case is that "*" or "div" or whatever the author has specified is correctly being Interned (as you noted above), but then that same Interned StringImpl is being handed to getElementsByTagName, getElementById, etc. every time.  Since that StringImpl is not marked isAtomic(), we do a hash-lookup to find the corresponding Atomized StringImpl (which is slow):
http://trac.webkit.org/browser/trunk/Source/WTF/wtf/text/AtomicString.h#L176

A win here would either be to have JSC use the same underlying StringImpl as WebCore (or vice-versa), or to have some fancy way to map between them.

One possible fix might be to pre-warm JSC's unique StringImpl table with the ones from WebCore?
Comment 7 Eric Seidel (no email) 2012-10-12 13:01:06 PDT
I'm not sure I was clear above.  JSC hands WebCore the same StringImpl for constants, every time.  But unfortunately that StringImpl is not marked isAtomic(), so when WebCore needs an AtomicString, it does a hash-lookup from the StringImpl provided by JSC, to find the corresponding "Atomic" StringImpl in WebCore.  That hash lookup is slow, and ideally would be avoided by having a faster way to map from JSC constants to AtomicStrings. :)
Comment 8 Oliver Hunt 2012-10-12 13:15:14 PDT
(In reply to comment #6)
> (In reply to comment #5)
> > JSC does intern all string literals in source code.  I'm unsure how feasible sharing jsc's interned strings with AtomicString is (multiple JSC contexts have separate interned strings for the same literal for example).  Also it would require some communication between JSC and WebCore of the sort that would be easier if we weren't having to deal with two JS engines.
> 
> For what it's worth, I believe v8 has the same bug.
> 
> > That said, what strings are we seeing that aren't being atomised?
> 
> The case is that "*" or "div" or whatever the author has specified is correctly being Interned (as you noted above), but then that same Interned StringImpl is being handed to getElementsByTagName, getElementById, etc. every time.  Since that StringImpl is not marked isAtomic(), we do a hash-lookup to find the corresponding Atomized StringImpl (which is slow):
> http://trac.webkit.org/browser/trunk/Source/WTF/wtf/text/AtomicString.h#L176
> 
> A win here would either be to have JSC use the same underlying StringImpl as WebCore (or vice-versa), or to have some fancy way to map between them.
> 
> One possible fix might be to pre-warm JSC's unique StringImpl table with the ones from WebCore?

StringImpl's are unique per JSC-heap, which isn't a guarantee that they're unique in the context of the DOM.  We could maybe fix that for commonJSGlobalData, but that feels fragile.


What I was wanting to know is how many unique strings end up missing the cache hit (i mean non-Atomic strings coming from JS)?

Some stats on what those strings are just accumulated via printf logging into a histogram.

Depending on the spread I have an idea for a way to greatly improve the lookup speed even on a miss, when the string is already present in the Atomics table.
Comment 9 Eric Seidel (no email) 2012-10-12 13:33:19 PDT
(In reply to comment #8)
> (In reply to comment #6)
> > (In reply to comment #5)
> What I was wanting to know is how many unique strings end up missing the cache hit (i mean non-Atomic strings coming from JS)?

Do you mean specific to this benchmark?  Or some larger set of pages?

My understanding is that any string created in JS, by JSC, will be Interned, but never be an AtomicString, and thus need to go through a hash lookup before it can be used in WebCore as an AtomicString?

Every DOM API which takes an AtomicString as a parameter, will require a Hash lookup when taking a string from JSC, if I'm understanding correctly.  If JSC got the string from the DOM/WebCore originally and handed us back the same StringImpl, then we'd avoid this hit of course.

> Some stats on what those strings are just accumulated via printf logging into a histogram.
> 
> Depending on the spread I have an idea for a way to greatly improve the lookup speed even on a miss, when the string is already present in the Atomics table.

Happy to oblidge, but could you restate what data you're looking for, I'm not sure I understand yet?

It sounds like you'd be interested in having some information on how often a StringImpl which came from JSC is used for a hash lookup in AtomicString?  Or maybe how often an Interned string from JSC is?  I assume over something like the PLT or some representative set of pages?
Comment 10 Eric Seidel (no email) 2012-10-12 13:33:56 PDT
(I hope I'm using "Interned" correctly, I've interpreted that to mean the JSC equivilent of unique/atomic.)
Comment 11 Oliver Hunt 2012-10-12 13:44:12 PDT
> Happy to oblidge, but could you restate what data you're looking for, I'm not sure I understand yet?

I want to know what strings we end up doing a hash lookup for - i don't care if they're interned in the JS engine, as that isn't something we can take advantage of (multiple GC heaps == multiple "unique" versions of the same string).

What I want is a histogram of strings that require a hash lookup to get to the atomicstring.
Comment 12 Eric Seidel (no email) 2012-10-12 15:16:47 PDT
By hist(In reply to comment #11)
> > Happy to oblidge, but could you restate what data you're looking for, I'm not sure I understand yet?
> 
> I want to know what strings we end up doing a hash lookup for - i don't care if they're interned in the JS engine, as that isn't something we can take advantage of (multiple GC heaps == multiple "unique" versions of the same string).
> 
> What I want is a histogram of strings that require a hash lookup to get to the atomicstring.

Which buckets did you want for such a histogram?  Were you looking for a histogram over strings bucketed by length?  Are you looking for a frequency count based on string content?  (Like "foo" : 1M times, "bar": 9 times, "baz" 75 times)?

I added the following to:

PassRefPtr<StringImpl> AtomicString::add(StringImpl* r)
{
    if (!r)
        return r;
    printf("add\n");
    if (r->isAtomic())
        return r;
    printf("slow\n");
    return addSlowCase(r);
}

And then did run-safari Dromeao/dom-query.html > strings.txt

% grep add strings.txt | wc -l
 151839657
% grep slow strings.txt | wc -l
 115803580

So 76% of all AtomicString(StringImpl*) calls, ended up doing a hash lookup during Dromeao/dom-query.html.

I don't think that number is particularly interesting, other than that it's lower than I expect it to be (I expected something like 99% on this benchmark.)

NOTE: That is counting strings which came into AtomicString() already as a StringImpl.  Meaning they were turned into a String, w/o being atomized (this excludes constants from C++, buffers from the parser, etc.)
Comment 13 Eric Seidel (no email) 2012-10-12 15:23:38 PDT
To be clear, the add(StringImpl*) case covers AtomicStrings which had been converted to String/StringImple somewhere else in the engine.

I chose this entry point to avoid counting strings which came from some other source (like constants in c++, or the HTML parser), or for strings which are created from a substring of other strings (add(StringImpl*, int start, int end) is a wholly separate beast).

I'm generating frequency count data for add(StringImpl*) from the web now.  (I don't think frequency count data from dom-query.html would be very interesting... and it's huge anyway.)
Comment 14 Oliver Hunt 2012-10-12 15:24:58 PDT
(In reply to comment #13)
> To be clear, the add(StringImpl*) case covers AtomicStrings which had been converted to String/StringImple somewhere else in the engine.
> 
> I chose this entry point to avoid counting strings which came from some other source (like constants in c++, or the HTML parser), or for strings which are created from a substring of other strings (add(StringImpl*, int start, int end) is a wholly separate beast).
> 
> I'm generating frequency count data for add(StringImpl*) from the web now.  (I don't think frequency count data from dom-query.html would be very interesting... and it's huge anyway.)

I don't care about the frequency of hit vs. non-hit, i want a histogram of what strings take the slow case
Comment 15 Eric Seidel (no email) 2012-10-12 15:55:26 PDT
1469 addSlowCase(StringImpl*) calls loading apple.com
top 10 strings by content:
 430 data-hires
 126 Referer
 115 User-Agent
  57 http
  40 Accept
  33 \u000A\u0009
  30 onclick
  21 mousedown
  19 div
  15 \u000A\u0009\u0009\u0009\u0009\u0009

8835 addSlowCase(StringImpl*) calls loading my gmail.com
top 10 strings by content:
 274 div
 250 User-Agent
 243 Referer
 219 mouseover
 200 email
 195 keydown
 183 mouseout
 174 click
 146 \u000A  
 145 .ajl

26518 addSlowCase(StringImpl*) calls loading a docs.google.com spreadsheet
2536 div
1286 role
 747 -webkit-user-select: none;
 601 goog-menuitem-content
 542 goog-menuitem
 537 *
 488 mousedown
 470 span
 346 mouseup
 342 id

Remember, these stats only track strings which are being added to the AtomicString hash, which already existed as StringImpls in our engine before that point.
Comment 16 Geoffrey Garen 2012-10-12 16:01:54 PDT
FWIW, I believe this problem case only arises if WebCore already has an entry in the AtomicString table for the same string. (If the JS string is the first instance of the string, it will "win" in the table.)
Comment 17 Eric Seidel (no email) 2012-10-12 16:07:34 PDT
Created attachment 168504 [details]
Patch I used to generate addSlowCase stats
Comment 18 Eric Seidel (no email) 2012-10-12 16:08:14 PDT
Sorry, webkit-patch upload doesn't have a --no-assign option yet. :)
Comment 19 Geoffrey Garen 2012-10-13 09:50:35 PDT
Since JSC has looser threading rules than WebCore, I wonder if WebCore could drive this optimization. For example, WebCore could put AtomicStrings in the Identifier table, or JSC could have an optional interface for supplying your own identifier table, and WebCore could supply the AtomicString table.
Comment 20 Eric Seidel (no email) 2012-10-15 16:13:32 PDT
I've now confirmed v8 has the same trouble. Filing a separate bug for that.
Comment 21 Eric Seidel (no email) 2012-10-15 16:19:29 PDT
bug 99383 is the v8 equivalent. Feel free to re-title, close, however you like to this bug. I think I've added all the necessary information at this point.