Bug 30141 - Make AtomicString construct StringImpls in single malloc blocks
Summary: Make AtomicString construct StringImpls in single malloc blocks
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P3 Enhancement
Assignee: David Levin
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-10-06 13:38 PDT by Jens Alfke
Modified: 2009-10-08 14:43 PDT (History)
2 users (show)

See Also:


Attachments
patch (7.93 KB, patch)
2009-10-08 10:14 PDT, Jens Alfke
darin: review-
Details | Formatted Diff | Diff
patch 2 (7.75 KB, patch)
2009-10-08 12:29 PDT, Jens Alfke
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jens Alfke 2009-10-06 13:38:08 PDT
When AtomicString creates StringImpls, it calls 'new StringImpl(....)', which ends up allocating two heap blocks (one for the StringImpl object, the other for the character buffer.) It's very easy to change this to call 'StringImpl::create(...)' instead, which allocates the string and buffer in a single block.

This saves the time of an extra malloc call, and the overhead of the block header (~20 bytes on OS X, IIRC.) It also improves locality of reference.

I have a patch for this, but it depends on the patch for bug 29500 so I'll wait for that one to get submitted.
Comment 1 Jens Alfke 2009-10-08 10:14:01 PDT
Created attachment 40887 [details]
patch
Comment 2 Darin Adler 2009-10-08 10:31:22 PDT
Comment on attachment 40887 [details]
patch

> +        Make AtomicString create its StringImpl via create(), not the constructor,
> +        so it gets allocated in a single heap block, saving memory and CPU cycles.

Beautiful!

> +        This eliminates two StringImpl constructors, making the remaining ones
> +        unambiguous, so the "AdoptBuffer" parameter is no longer needed.

The behavior of this constructor is a bit unconventional because it takes ownership of a buffer, but I guess that's OK because it's private. We should have a comment in the header making it clear that it adopts the buffer, though; this would be completely unclear if reading the header.

> +        Added const attribute to UChar* in StringImpl constructor, eliminating the
> +        need for several const_casts in calls to it.

Great!

It is a bit strange to have a function take ownership of a buffer but be given a const UChar* to it. But I guess the comment could clarify that sufficiently.

> +        StringImpl also unfriends AtomicString (OMG drama!!!)

Superb!

But since nobody else needs access to the setHash or setInTable functions, is having those functions be public better than having AtomicString be a friend?

> -        location = new StringImpl(c, strlen(c), hash); 
> +        location = StringImpl::create(c).releaseRef(); 
> +        location->setHash(hash);
> +        location->setInTable();

Seems slightly unfortunate to set the hash twice and to set the in-table bit first to 0 and then to 1. Any clean way to avoid that?

On a related note, maybe the best thing here would be to have two create functions specifically intended for AtomicString's use instead of having the code here in AtomicString.cpp? Could we then eliminate both setHash and setInTable? I think that would be good.

> +    void setHash(unsigned hash) {ASSERT(!m_hash); m_hash = hash;}

I don't think it was worthwhile to add this public function that is an error for anyone except AtomicString to use just so AtomicString doesn't have to be a friend. Please consider my suggestion above about how to avoid it.

You need some spaces inside the braces here.

review- but this seems almost right!
Comment 3 Jens Alfke 2009-10-08 12:04:11 PDT
>On a related note, maybe the best thing here would be to have two create
>functions specifically intended for AtomicString's use instead of having the
>code here in AtomicString.cpp? Could we then eliminate both setHash and
>setInTable? I think that would be good.

I actually had it that way at first, but it took significantly more code (had to add a hash parameter to the constructor and createUninitialized and two of the create methods, then create public create methods that don't take the hash and call through with zero.)

I just changed it back after your comment, but looking at it, I think this ends up taking more cycles than setting m_hash twice, due to the extra hash parameter being passed around through several levels of call stack on every StringImpl creation.

The existing code was already setting the flags twice via a call to setInTable, because they're always initialized to zero (which is free as a side effect of copying the pointer.)

It's not possible to eliminate setInTable because AtomicString::add(StringImpl*) has to call it. Given that, I prefer to have the other calls to setInTable in AtomicString.cpp, because that's the code that manages the table.

I just noticed that setHash can be made private, since only the friended XXXTranslator classes ever call it.
Comment 4 Jens Alfke 2009-10-08 12:29:34 PDT
Created attachment 40900 [details]
patch 2

- Made setHash private, since it only needs to be called by the friend XXXTranslator classes.
- Added whitespace in setHash definition.
- Added explanatory comment about the StringImpl ctor.

After investigating I decided against Darin's suggestion of customized create methods, because it introduced more code, including passing a hash parameter through several layers of method calls, which I think cancels out any win from only initializing the member once. (See my previous comment.)
Comment 5 Darin Adler 2009-10-08 13:21:02 PDT
(In reply to comment #3)
> I just changed it back after your comment, but looking at it, I think this ends
> up taking more cycles than setting m_hash twice, due to the extra hash
> parameter being passed around through several levels of call stack on every
> StringImpl creation.

Typically the way to avoid this is inlining, but I accept your explanation.
Comment 6 David Levin 2009-10-08 14:19:12 PDT
Assigned to Levin to landing.
Comment 7 David Levin 2009-10-08 14:43:53 PDT
Committed as http://trac.webkit.org/changeset/49322