[JSC] >10% of Dromaeo dom-query.html is atomizing the same StringImpl's passed from JSC to getElementByTagName
https://bugs.webkit.org/show_bug.cgi?id=99118
Summary [JSC] >10% of Dromaeo dom-query.html is atomizing the same StringImpl's passe...
Eric Seidel (no email)
Reported 2012-10-11 16:49:39 PDT
JSC Should atomize constant strings before handing them to WebCore The Dromaeo dom-query benchmark has the following subtest: for ( var i = 0; i < num; i++ ) { var elems = document.getElementsByTagName("*"); ret = elems[elems.length-1].nodeType; } That spends nearly 10% of total time in AtomicString hash lookups for "*". "*" is of course already in the AtomicString hash (also a global called WTF::starAtom). I suspect that JSC may already have code to identify "*" as a constant. If it were to notice that it is a short string, and to do an AtomicString lookup for it, then it could pass WebCore an already atomized StringImpl, which would cause the AtomicString(const String&) constructor to go down the fast-path, and presumably drop 10% off this test! I suspect variants of this may affect real world usage, but I don't know how much of a cost Atomizing constants in JSC would be on other benchmarks (or how complicated it would be). I leave those up to you JSC experts. I have not yet tested V8 to see if they already do this.
Attachments
Patch I used to generate addSlowCase stats (2.57 KB, patch)
2012-10-12 16:07 PDT, Eric Seidel (no email)
no flags
Geoffrey Garen
Comment 1 2012-10-11 16:59:42 PDT
Geoffrey Garen
Comment 2 2012-10-11 17:00:57 PDT
For this optimization, JSC could put Identifiers in the AtomicString table instead of or in addition to the Identifier table.
Eric Seidel (no email)
Comment 3 2012-10-11 19:19:54 PDT
Looking at the sample more, it looks like this is more like avoiding AtomicString lookup due to Identifier/AtomicString StringImpl* mismatches has the potential to be more like at 15+% win on dom-query.html: Running Time Self Symbol Name 3472.0ms 7.5% 3472.0 WTF::HashTableAddResult<WTF::HashTableIterator<WTF::StringImpl*, WTF::StringImpl*, WTF::IdentityExtractor, WTF::StringHash, WTF::HashTraits<WTF::StringImpl*>, WTF::HashTraits<WTF::StringImpl*> > > WTF::HashTable<WTF::StringImpl*, WTF::StringImpl*, WTF::IdentityExtractor, WTF::StringHash, WTF::HashTraits<WTF::StringImpl*>, WTF::HashTraits<WTF::StringImpl*> >::add<WTF::IdentityHashTranslator<WTF::StringHash>, WTF::StringImpl*, WTF::StringImpl*>(WTF::StringImpl* const&, WTF::StringImpl* const&) 1615.0ms 3.4% 1615.0 WTF::StringHash::equal(WTF::StringImpl const*, WTF::StringImpl const*) 1524.0ms 3.2% 1524.0 WTF::AtomicString::addSlowCase(WTF::StringImpl*) 561.0ms 1.2% 561.0 WTF::StringImpl::hashSlowCase() const Are all samples which stem from this in dom-query.html. :)
Eric Seidel (no email)
Comment 4 2012-10-11 19:34:16 PDT
This is also another 8-10% of dom-attr.html.
Oliver Hunt
Comment 5 2012-10-12 12:04:48 PDT
JSC does intern all string literals in source code. I'm unsure how feasible sharing jsc's interned strings with AtomicString is (multiple JSC contexts have separate interned strings for the same literal for example). Also it would require some communication between JSC and WebCore of the sort that would be easier if we weren't having to deal with two JS engines. That said, what strings are we seeing that aren't being atomised? There are some ways to optimize these kinds of lookup to avoid hashing in common cases.
Eric Seidel (no email)
Comment 6 2012-10-12 12:58:50 PDT
(In reply to comment #5) > JSC does intern all string literals in source code. I'm unsure how feasible sharing jsc's interned strings with AtomicString is (multiple JSC contexts have separate interned strings for the same literal for example). Also it would require some communication between JSC and WebCore of the sort that would be easier if we weren't having to deal with two JS engines. For what it's worth, I believe v8 has the same bug. > That said, what strings are we seeing that aren't being atomised? The case is that "*" or "div" or whatever the author has specified is correctly being Interned (as you noted above), but then that same Interned StringImpl is being handed to getElementsByTagName, getElementById, etc. every time. Since that StringImpl is not marked isAtomic(), we do a hash-lookup to find the corresponding Atomized StringImpl (which is slow): http://trac.webkit.org/browser/trunk/Source/WTF/wtf/text/AtomicString.h#L176 A win here would either be to have JSC use the same underlying StringImpl as WebCore (or vice-versa), or to have some fancy way to map between them. One possible fix might be to pre-warm JSC's unique StringImpl table with the ones from WebCore?
Eric Seidel (no email)
Comment 7 2012-10-12 13:01:06 PDT
I'm not sure I was clear above. JSC hands WebCore the same StringImpl for constants, every time. But unfortunately that StringImpl is not marked isAtomic(), so when WebCore needs an AtomicString, it does a hash-lookup from the StringImpl provided by JSC, to find the corresponding "Atomic" StringImpl in WebCore. That hash lookup is slow, and ideally would be avoided by having a faster way to map from JSC constants to AtomicStrings. :)
Oliver Hunt
Comment 8 2012-10-12 13:15:14 PDT
(In reply to comment #6) > (In reply to comment #5) > > JSC does intern all string literals in source code. I'm unsure how feasible sharing jsc's interned strings with AtomicString is (multiple JSC contexts have separate interned strings for the same literal for example). Also it would require some communication between JSC and WebCore of the sort that would be easier if we weren't having to deal with two JS engines. > > For what it's worth, I believe v8 has the same bug. > > > That said, what strings are we seeing that aren't being atomised? > > The case is that "*" or "div" or whatever the author has specified is correctly being Interned (as you noted above), but then that same Interned StringImpl is being handed to getElementsByTagName, getElementById, etc. every time. Since that StringImpl is not marked isAtomic(), we do a hash-lookup to find the corresponding Atomized StringImpl (which is slow): > http://trac.webkit.org/browser/trunk/Source/WTF/wtf/text/AtomicString.h#L176 > > A win here would either be to have JSC use the same underlying StringImpl as WebCore (or vice-versa), or to have some fancy way to map between them. > > One possible fix might be to pre-warm JSC's unique StringImpl table with the ones from WebCore? StringImpl's are unique per JSC-heap, which isn't a guarantee that they're unique in the context of the DOM. We could maybe fix that for commonJSGlobalData, but that feels fragile. What I was wanting to know is how many unique strings end up missing the cache hit (i mean non-Atomic strings coming from JS)? Some stats on what those strings are just accumulated via printf logging into a histogram. Depending on the spread I have an idea for a way to greatly improve the lookup speed even on a miss, when the string is already present in the Atomics table.
Eric Seidel (no email)
Comment 9 2012-10-12 13:33:19 PDT
(In reply to comment #8) > (In reply to comment #6) > > (In reply to comment #5) > What I was wanting to know is how many unique strings end up missing the cache hit (i mean non-Atomic strings coming from JS)? Do you mean specific to this benchmark? Or some larger set of pages? My understanding is that any string created in JS, by JSC, will be Interned, but never be an AtomicString, and thus need to go through a hash lookup before it can be used in WebCore as an AtomicString? Every DOM API which takes an AtomicString as a parameter, will require a Hash lookup when taking a string from JSC, if I'm understanding correctly. If JSC got the string from the DOM/WebCore originally and handed us back the same StringImpl, then we'd avoid this hit of course. > Some stats on what those strings are just accumulated via printf logging into a histogram. > > Depending on the spread I have an idea for a way to greatly improve the lookup speed even on a miss, when the string is already present in the Atomics table. Happy to oblidge, but could you restate what data you're looking for, I'm not sure I understand yet? It sounds like you'd be interested in having some information on how often a StringImpl which came from JSC is used for a hash lookup in AtomicString? Or maybe how often an Interned string from JSC is? I assume over something like the PLT or some representative set of pages?
Eric Seidel (no email)
Comment 10 2012-10-12 13:33:56 PDT
(I hope I'm using "Interned" correctly, I've interpreted that to mean the JSC equivilent of unique/atomic.)
Oliver Hunt
Comment 11 2012-10-12 13:44:12 PDT
> Happy to oblidge, but could you restate what data you're looking for, I'm not sure I understand yet? I want to know what strings we end up doing a hash lookup for - i don't care if they're interned in the JS engine, as that isn't something we can take advantage of (multiple GC heaps == multiple "unique" versions of the same string). What I want is a histogram of strings that require a hash lookup to get to the atomicstring.
Eric Seidel (no email)
Comment 12 2012-10-12 15:16:47 PDT
By hist(In reply to comment #11) > > Happy to oblidge, but could you restate what data you're looking for, I'm not sure I understand yet? > > I want to know what strings we end up doing a hash lookup for - i don't care if they're interned in the JS engine, as that isn't something we can take advantage of (multiple GC heaps == multiple "unique" versions of the same string). > > What I want is a histogram of strings that require a hash lookup to get to the atomicstring. Which buckets did you want for such a histogram? Were you looking for a histogram over strings bucketed by length? Are you looking for a frequency count based on string content? (Like "foo" : 1M times, "bar": 9 times, "baz" 75 times)? I added the following to: PassRefPtr<StringImpl> AtomicString::add(StringImpl* r) { if (!r) return r; printf("add\n"); if (r->isAtomic()) return r; printf("slow\n"); return addSlowCase(r); } And then did run-safari Dromeao/dom-query.html > strings.txt % grep add strings.txt | wc -l 151839657 % grep slow strings.txt | wc -l 115803580 So 76% of all AtomicString(StringImpl*) calls, ended up doing a hash lookup during Dromeao/dom-query.html. I don't think that number is particularly interesting, other than that it's lower than I expect it to be (I expected something like 99% on this benchmark.) NOTE: That is counting strings which came into AtomicString() already as a StringImpl. Meaning they were turned into a String, w/o being atomized (this excludes constants from C++, buffers from the parser, etc.)
Eric Seidel (no email)
Comment 13 2012-10-12 15:23:38 PDT
To be clear, the add(StringImpl*) case covers AtomicStrings which had been converted to String/StringImple somewhere else in the engine. I chose this entry point to avoid counting strings which came from some other source (like constants in c++, or the HTML parser), or for strings which are created from a substring of other strings (add(StringImpl*, int start, int end) is a wholly separate beast). I'm generating frequency count data for add(StringImpl*) from the web now. (I don't think frequency count data from dom-query.html would be very interesting... and it's huge anyway.)
Oliver Hunt
Comment 14 2012-10-12 15:24:58 PDT
(In reply to comment #13) > To be clear, the add(StringImpl*) case covers AtomicStrings which had been converted to String/StringImple somewhere else in the engine. > > I chose this entry point to avoid counting strings which came from some other source (like constants in c++, or the HTML parser), or for strings which are created from a substring of other strings (add(StringImpl*, int start, int end) is a wholly separate beast). > > I'm generating frequency count data for add(StringImpl*) from the web now. (I don't think frequency count data from dom-query.html would be very interesting... and it's huge anyway.) I don't care about the frequency of hit vs. non-hit, i want a histogram of what strings take the slow case
Eric Seidel (no email)
Comment 15 2012-10-12 15:55:26 PDT
1469 addSlowCase(StringImpl*) calls loading apple.com top 10 strings by content: 430 data-hires 126 Referer 115 User-Agent 57 http 40 Accept 33 \u000A\u0009 30 onclick 21 mousedown 19 div 15 \u000A\u0009\u0009\u0009\u0009\u0009 8835 addSlowCase(StringImpl*) calls loading my gmail.com top 10 strings by content: 274 div 250 User-Agent 243 Referer 219 mouseover 200 email 195 keydown 183 mouseout 174 click 146 \u000A 145 .ajl 26518 addSlowCase(StringImpl*) calls loading a docs.google.com spreadsheet 2536 div 1286 role 747 -webkit-user-select: none; 601 goog-menuitem-content 542 goog-menuitem 537 * 488 mousedown 470 span 346 mouseup 342 id Remember, these stats only track strings which are being added to the AtomicString hash, which already existed as StringImpls in our engine before that point.
Geoffrey Garen
Comment 16 2012-10-12 16:01:54 PDT
FWIW, I believe this problem case only arises if WebCore already has an entry in the AtomicString table for the same string. (If the JS string is the first instance of the string, it will "win" in the table.)
Eric Seidel (no email)
Comment 17 2012-10-12 16:07:34 PDT
Created attachment 168504 [details] Patch I used to generate addSlowCase stats
Eric Seidel (no email)
Comment 18 2012-10-12 16:08:14 PDT
Sorry, webkit-patch upload doesn't have a --no-assign option yet. :)
Geoffrey Garen
Comment 19 2012-10-13 09:50:35 PDT
Since JSC has looser threading rules than WebCore, I wonder if WebCore could drive this optimization. For example, WebCore could put AtomicStrings in the Identifier table, or JSC could have an optional interface for supplying your own identifier table, and WebCore could supply the AtomicString table.
Eric Seidel (no email)
Comment 20 2012-10-15 16:13:32 PDT
I've now confirmed v8 has the same trouble. Filing a separate bug for that.
Eric Seidel (no email)
Comment 21 2012-10-15 16:19:29 PDT
bug 99383 is the v8 equivalent. Feel free to re-title, close, however you like to this bug. I think I've added all the necessary information at this point.
Note You need to log in before you can comment on or make changes to this bug.