Bug 20645

Summary: Duplicated constants
Product: WebKit Reporter: Judit Jász <jasy>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ggaren
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: All   
Attachments:
Description Flags
proposed patch
ggaren: review-
Modification of the solution according to the reviewer's comments.
ggaren: review-
Improvements of the earlier patch according to the comments.
darin: review-
Updating the patch to the actual revision. darin: review+

Judit Jász
Reported 2008-09-04 04:32:17 PDT
The CodeGenerator differentiates the constants with their JSValue* pointer instead of their contents. In this way many cases there are different constants with the same values. The memory consumption of a particular CodeBlock can be better if it would not duplicate the constants with the same values.
Attachments
proposed patch (2.95 KB, patch)
2008-09-04 04:34 PDT, Judit Jász
ggaren: review-
Modification of the solution according to the reviewer's comments. (3.52 KB, patch)
2008-09-05 05:43 PDT, Judit Jász
ggaren: review-
Improvements of the earlier patch according to the comments. (6.33 KB, patch)
2008-09-19 07:40 PDT, Judit Jász
darin: review-
Updating the patch to the actual revision. (3.49 KB, patch)
2008-09-21 12:29 PDT, Judit Jász
darin: review+
Judit Jász
Comment 1 2008-09-04 04:34:19 PDT
Created attachment 23166 [details] proposed patch
Geoffrey Garen
Comment 2 2008-09-04 08:47:47 PDT
Comment on attachment 23166 [details] proposed patch Is the point of this patch just to unique numbers and strings within a codeblock? I think it is. For everything else, comparing JSValue* would be enough. Rather than making a JSValue*, then later extracting and comparing the underlying data, I think we should just to add a version of emitLoad() for Identifier, and one for double, which did the unique-ing based on hashes of Identifier and double. That would eliminate a lot of complexity, and substantially improve the double hash function, which in this patch just truncates the double to unsigned. It would also reduce memory and GC pressure, in the case where you actually found a duplicate constant (since you would avoid allocating the duplicate just to discover that it was a duplicate).
Judit Jász
Comment 3 2008-09-05 05:43:16 PDT
Created attachment 23190 [details] Modification of the solution according to the reviewer's comments.
Geoffrey Garen
Comment 4 2008-09-18 08:58:03 PDT
Thanks, Judit, this latest patch looks a lot better. Three comments: 1. I'd like to improve the string map a bit. Right now, the typedef is + typedef HashMap<RefPtr<UString::Rep>, JSValue*> StringMap; We can get rid of the RefPtr in that typedef. The string is owned by the syntax tree, so there's no need for the code generator to ref and deref it. Also, because we know that we're hashing Identifier strings, we can use IdentifierRepHash, which is optimized for the case where the string has already computed a hash code. 2. Though your patch didn't introduce this problem, it made me realize that emitLoad for UString and its use are a little weird: + RegisterID* emitLoad(RegisterID* dst, const UString&); - return generator.emitLoad(dst, jsOwnedString(generator.globalExec(), Identifier(generator.globalExec(), m_value).ustring())); + return generator.emitLoad(dst, m_value); emitLoad is declared to operate on UStrings, but in fact it only wants to operate on Identifiers! A more efficient solution would be to declare emitLoad to operate on an Identifier: "RegisterID* emitLoad(RegisterID* dst, const Identifier&);". And StringNode, instead of holding m_value as a UString, should hold m_value as an Identifier. That way, we won't have to make a temporary Identifier every time we encounter a string in the syntax tree. 3. A simpler way to write + NumberMap::iterator iterator = m_numberMap.find(d); + if (iterator != m_numberMap.end()) + return emitLoad(dst, iterator->second); is if (JSValue* v = m_numberMap.get(d)) return emitLoad(dst, v); The same goes for emitLoad of a string. I'm going to say r- because I think these would be good improvements to make, but overall this patch looks really good!
Judit Jász
Comment 5 2008-09-19 07:40:14 PDT
Created attachment 23565 [details] Improvements of the earlier patch according to the comments.
Darin Adler
Comment 6 2008-09-21 11:36:22 PDT
Comment on attachment 23565 [details] Improvements of the earlier patch according to the comments. Patch looks good, but it needs to be merged with a change I made to StringNode. Sorry!
Judit Jász
Comment 7 2008-09-21 12:29:14 PDT
Created attachment 23630 [details] Updating the patch to the actual revision.
Darin Adler
Comment 8 2008-09-21 13:26:55 PDT
Comment on attachment 23630 [details] Updating the patch to the actual revision. This looks good. Does it slow SunSpider down, or speed it up, or have no effect? + Reviewed by NOBODY (OOPS!). + Duplications of constants values of CodeBlocks are eliminated. ChangeLog should include the bug number. It's also better if the list of files and functions below says what was done at each call site. + JSValue* number = NULL; This variable doesn't need to be initialized, because it's always set on the next line. Also, our coding style says we use 0 instead of NULL. Further, this can be written in a way that doesn't hash twice, using more advanced features of the HashMap. pair<NumberMap::iterator, bool> addResult = m_numberMap.add(d, 0); JSValue* number; if (!addResult.second) number = addResult.first->second; else { number = jsNumber(globalExec(), d); addResult.first->second = number; } return emitLoad(dst, number); While the above code looks a bit more confusing, it does only one hash table lookup, while the code in your patch does two. You can do the same thing for the string case. + JSValue* str = NULL; I'd suggest using "string" instead of "str". + typedef HashMap<UString::Rep*, JSValue*, IdentifierRepHash > IdentifierStringMap; No need for a space before the ">" symbol. I'm going to say r=me because my suggestions are probably optional; I think I'll make the improvements and then test performance.
Darin Adler
Comment 9 2008-09-21 13:35:38 PDT
Note You need to log in before you can comment on or make changes to this bug.