Bug 23068

Summary: Merge m_transitionCount and m_offset in Structure
Product: WebKit Reporter: Alice Liu <alice.barraclough>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Severity: Normal CC: darin
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Description Flags
patch darin: review+

Description Alice Liu 2009-01-01 14:43:43 PST
The m_transitionCount and m_cachedTransistionOffset of StructureID can be merged into one field to save a word per StructureID.
Comment 1 Alice Liu 2009-01-01 14:52:11 PST
Created attachment 26352 [details]

memory usage stats for Structure during membuster first 30 windows
47001 blocks ( 3%), 5749 KB ( 4%) -- JavaScript Structure
WIth Patch:
47123 blocks ( 3%), 5480 KB ( 4%) -- JavaScript Structure

run 1: Total:                  811.1ms +/- 0.4%
run 2: Total:                  847.2ms +/- 2.0%

With Patch:
run 1:Total:                  826.5ms +/- 0.7%
run 2:Total:                  810.2ms +/- 0.3%
Comment 2 Darin Adler 2009-01-01 17:11:39 PST
Comment on attachment 26352 [details]

Looks great.

I don't entirely understand the change, because m_offset starts with the value -1 (WTF::notFound), and m_transitionCount used to start with the value 0. It seems that to get the transition length you need to add 1 to m_offset, and in fact you also would need to check for notFound.

> +        * runtime/Structure.cpp:
> +        (JSC::Structure::Structure):
> +        (JSC::Structure::addPropertyTransition):
> +        (JSC::Structure::changePrototypeTransition):
> +        (JSC::Structure::getterSetterTransition):
> +        * runtime/Structure.h:
> +        (JSC::Structure::isEmpty):

I prefer patches with comments for each function. And in header files I normally just delete the function names added by the prepare-ChangeLog script, unless the change is really worth commenting on.

> -    , m_offset(WTF::notFound)
> +    , m_offset((char)WTF::notFound)

We try to avoid C-style casts in our C++ code. In the case of this code, there's no need to use the WTF::notFound constant; it was handy when the variable was of size_t, but there's nothing special about that particular named constant and now that it's the wrong type it's actively harmful because it's causing us to put type casts in. We can use our own constant instead. In addition, in general it's best to avoid using the "char" type for anything other than string characters, because whether it's a signed or unsigned type can vary between platforms. So I suggest using "signed char" or "unsigned char" when you want a 1-byte integer and care about whether it's signed or not. For this use, I suggest unsigned char for the offset and then we can use this static data member in the Structure class instead:

    const unsigned char noOffset = 0xFF;

Then we won't need to use WTF::notFound at all. Also, the best style is to use the WTF:: prefix only in headers, and in cpp files we would do "using namespace WTF" and then omit the WTF:: prefix.

> -    if (structure->m_transitionCount > s_maxTransitionLength) {
> +    if (structure->m_offset > s_maxTransitionLength) { // transitions move in step with offsets, so we can use offset here to reduce footprint of the object

This comment is good, but a little confusing.

In the future when I read this, I won't really know what "move in step with" means and "reduce footprint" is a little confusing too, once the footprint is reduced. It's more a comment from our current perspective than a future one.

One idea is that the comment could say, "Since the number of transitions is always the same as the value of m_offset, we can check the transition length by looking at it." Another is that, "Since the number of transitions is always the same as m_offset, we keep the size of Structure down by not storing both."

Another idea is to have an inline function called transitionCount that just returns m_offset. It seems that would be a great place for the comment, and it's nice to keep it out of the middle of the function. And also, if the transition count really is the offset minus one, then that's the perfect place to do that math.

> -        bool isEmpty() const { return m_propertyTable ? !m_propertyTable->keyCount : m_offset == WTF::notFound; }
> +        bool isEmpty() const { return m_propertyTable ? !m_propertyTable->keyCount : (size_t)m_offset == WTF::notFound; }

Again, we'd want to avoid the C++ style cast, either using static_cast, or better, avoiding the cast entirely (see my comments above).

I think the patch as-is is OK. It gets us the memory saving, and it's pretty clear. However, there are two weaknesses:

    1) I think the logic has an off-by-one error now because we use m_offset as if it was the transition count, and it's really the transition count minus one. The reason this off-by-one error is not a big deal is that it just means we allow less transition before switching to a dictionary representation, which is mostly harmless.

    2) The second reason the off-by-one error is harmless is that the initial value of m_offset, notFound, ends up being less than s_maxTransitionLength because "char" is signed on the platforms we normally work on. But this is not good to rely on. Platforms are allowed to have "char" be an unsigned type.

I'm going to say r=me as-is, but I think it would be better to fix the things I mentioned above. Feel free to clear the review flag and post a new patch if you decide to.
Comment 3 Alice Liu 2009-01-03 00:25:58 PST
Created attachment 26390 [details]

This patch addresses everything Darin mentioned (at least i hope so).  I ended up using signed char instead of unsigned char only because i was having a lot of trouble debugging why my unsigned char version of this patch was crashing in garbage collection.
Comment 4 Darin Adler 2009-01-03 10:58:56 PST
Comment on attachment 26390 [details]

Comment 5 Alice Liu 2009-01-04 16:19:20 PST
Sending        JavaScriptCore/ChangeLog
Sending        JavaScriptCore/runtime/Structure.cpp
Sending        JavaScriptCore/runtime/Structure.h
Transmitting file data ...
Committed revision 39593.