WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
23175
String and UString should be able to share a UChar* buffer.
https://bugs.webkit.org/show_bug.cgi?id=23175
Summary
String and UString should be able to share a UChar* buffer.
David Levin
Reported
2009-01-07 15:20:21 PST
Also, it would be nice to share a String across threads and share the buffer in that case as well.
Attachments
Patch for bug.
(80.07 KB, patch)
2009-01-07 17:29 PST
,
David Levin
no flags
Details
Formatted Diff
Diff
Attempted to make the SharedDataHolder usage more clear.
(82.39 KB, patch)
2009-01-08 17:06 PST
,
David Levin
no flags
Details
Formatted Diff
Diff
Part 1. Some refactoring work.
(32.42 KB, patch)
2009-01-09 04:53 PST
,
David Levin
no flags
Details
Formatted Diff
Diff
Minor change to previous patch.
(1.99 KB, patch)
2009-01-09 11:22 PST
,
David Levin
no flags
Details
Formatted Diff
Diff
Part 2: Refactor UString::Rep into two classes which reduces memory usage for the base Rep class.
(27.80 KB, patch)
2009-01-11 11:32 PST
,
David Levin
darin
: review-
Details
Formatted Diff
Diff
Addressed comments -- Part 2: Refactor UString::Rep into two classes which reduces memory usage for the base Rep class.
(28.36 KB, patch)
2009-01-11 16:51 PST
,
David Levin
no flags
Details
Formatted Diff
Diff
Addressed Darin's comments from the last review.
(28.30 KB, patch)
2009-01-11 18:06 PST
,
David Levin
no flags
Details
Formatted Diff
Diff
Part3: Allow UString to share a UChar* with another class.
(32.77 KB, patch)
2009-01-12 04:39 PST
,
David Levin
no flags
Details
Formatted Diff
Diff
Part 3: Add CrossThreadRefCounted.
(10.05 KB, patch)
2009-02-02 16:30 PST
,
David Levin
no flags
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
David Levin
Comment 1
2009-01-07 17:29:01 PST
Created
attachment 26515
[details]
Patch for bug.
David Levin
Comment 2
2009-01-07 17:34:25 PST
There is a simple follow on patch that could enable StringImpl to automatically share the buf from a UString, but I'm slightly more worried about potential memory regressions (due to UString's buffers being bigger than StringImpl), so I wanted to get in this part first.
David Levin
Comment 3
2009-01-07 22:51:00 PST
If anyone wants to start looking at the diff, it would be great. However, I'm looking at changing making some changes to mrowe's comments in irc. My summary: Basically "SharedDataHolder" is confusing and badly named (and I goofed in the changelog "visa versa" should be "vice versa"). So I've be changing SharedDataHolder some how to make it clearer in how to use it (and likely changing its name).
David Levin
Comment 4
2009-01-08 17:06:22 PST
Created
attachment 26550
[details]
Attempted to make the SharedDataHolder usage more clear. This has all known changes done to it. There is one known issue with the change: JavaScriptCore/wtf/SharedDataHolder.h no longer matches the contents of the file. This means that I should change the name of the file and the references to it in various places (build files, includes, and ChangeLog). However, in case there is more feedback about the naming of this class, I avoid those changes for the moment.
David Levin
Comment 5
2009-01-08 17:13:41 PST
Now it is ready for review.
David Levin
Comment 6
2009-01-09 04:53:19 PST
Created
attachment 26563
[details]
Part 1. Some refactoring work.
David Levin
Comment 7
2009-01-09 04:55:49 PST
Here's a much smaller patch that is ready for review. After this one gets in, I'll do another incremental one.
Oliver Hunt
Comment 8
2009-01-09 09:55:41 PST
Committing to
http://svn.webkit.org/repository/webkit/trunk
... M JavaScriptCore/ChangeLog M JavaScriptCore/GNUmakefile.am M JavaScriptCore/JavaScriptCore.exp M JavaScriptCore/JavaScriptCore.vcproj/WTF/WTF.vcproj M JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj M JavaScriptCore/runtime/Identifier.cpp M JavaScriptCore/runtime/InitializeThreading.cpp M JavaScriptCore/runtime/JSGlobalData.cpp M JavaScriptCore/runtime/PropertyNameArray.cpp M JavaScriptCore/runtime/UString.cpp M JavaScriptCore/runtime/UString.h M JavaScriptGlue/ChangeLog M WebCore/ChangeLog Committed
r39747
Committing to
http://svn.webkit.org/repository/webkit/trunk
... A JavaScriptGlue/ForwardingHeaders/wtf/PtrAndFlags.h A WebCore/ForwardingHeaders/wtf/PtrAndFlags.h Committed
r39748
David Levin
Comment 9
2009-01-09 11:22:41 PST
Created
attachment 26565
[details]
Minor change to previous patch.
Oliver Hunt
Comment 10
2009-01-09 12:17:30 PST
Comment on
attachment 26565
[details]
Minor change to previous patch. Committing to
http://svn.webkit.org/repository/webkit/trunk
... M JavaScriptCore/ChangeLog M JavaScriptCore/jsc.cpp M JavaScriptCore/runtime/JSGlobalData.cpp Committed
r39755
Alexey Proskuryakov
Comment 11
2009-01-11 01:45:10 PST
Comment on
attachment 26563
[details]
Part 1. Some refactoring work. Clearing review flags from committed patches to get the bug out of commit queue.
David Levin
Comment 12
2009-01-11 11:32:33 PST
Created
attachment 26609
[details]
Part 2: Refactor UString::Rep into two classes which reduces memory usage for the base Rep class.
Darin Adler
Comment 13
2009-01-11 14:14:15 PST
Comment on
attachment 26609
[details]
Part 2: Refactor UString::Rep into two classes which reduces memory usage for the base Rep class. This looks good.
> +static const unsigned numCharactersToStore = 0x100; // Note that SmallStrings::m_singleCharacterStrings should be the same size.
We have COMPILE_ASSERT so that such things can be checked rather than just mentioned in comments.
> +} // namespace JSC > + > +#endif // SmallStrings_h
If the patch is going to touch some files just to add comments, then I will be in full nitpick mode. I suggest only one space before "//" to match the style elsewhere, rather than two spaces.
> - rep->capacity = newCapacity; > + rep->baseString()->capacity = newCapacity;
I would have expected you to use "base" here rather than re-fetching rep->baseString(). Is this better in some way?
> - bool baseIsSelf() const { return baseString == this; } > + bool baseIsSelf() const { return m_identifierTableAndFlags.isFlagSet(BaseStringFlag); }
Why can't we keep the old definition, but move it down in the file where it can see the BaseString class definition? Or can't this check if baseString is zero instead? Do we really need the bit in the flags? Does the bit help performance or something?
> - UChar* data() const { return baseString->buf + baseString->preCapacity + offset; } > + UChar* data() const;
I'm surprised this can be non-inline without hurting performance slightly.
> + void setBaseString(PassRefPtr<BaseString> base) { ASSERT(base != this); m_baseString = base.releaseRef(); } > + BaseString* baseString() { return reinterpret_cast<BaseString*>(baseIsSelf() ? this : m_baseString); } > + const BaseString* baseString() const { return const_cast<const BaseString*>(const_cast<Rep*>(this)->baseString()); }
It's annoying to have these reinterpret_cast here. Instead you should just define these inline functions outside the class definition, in a place that can see the definition of BaseString.
> + void* m_baseString; // If "this" is a BaseString instance, it is 0. BaseString* otherwise.
Seems really ugly to have this be a void*. Can't it be a BaseString* instead? When is the fact that it's a void* helpful?
> + > + static BaseString& null() { return *nullBaseString; } > + static BaseString& empty() { return *emptyBaseString; } > + private: > + friend void initializeUString();
I'd prefer a blank line before the private line.
> + inline UChar* UString::Rep::data() const > + { > + const UString::BaseString* base = baseString(); > + return base->buf + base->preCapacity + offset; > + }
Do you need the UString:: qualifier on BaseString here?
> + UString::BaseString* base = m_rep->baseString();
And here? I'm going to say review- because of the reinterpret_cast question, and the question of whether we really do need a bit to tell whether something is a base string. But this does seem like a great change and about ready to land.
David Levin
Comment 14
2009-01-11 16:51:55 PST
Created
attachment 26618
[details]
Addressed comments -- Part 2: Refactor UString::Rep into two classes which reduces memory usage for the base Rep class.
> > +static const unsigned numCharactersToStore = 0x100; // Note that SmallStrings::m_singleCharacterStrings should be the same size. >We have COMPILE_ASSERT so that such things can be checked rather than just mentioned in comments.
Fixed.
> > +} // namespace JSC > > + > > +#endif // SmallStrings_h > I suggest only one space before "//" to match the style elsewhere, rather than two spaces.
Fixed.
> > - rep->capacity = newCapacity; > > + rep->baseString()->capacity = newCapacity;
> I would have expected you to use "base" here rather than re-fetching > rep->baseString(). Is this better in some way?
rep has just been recreated, so the local variable base is no longer valid at this point.
> > + UChar* data() const; > I'm surprised this can be non-inline without hurting performance slightly.
Still inlined just not defined in the class definition.
> > + void setBaseString(PassRefPtr<BaseString> base) { ASSERT(base != this); m_baseString = base.releaseRef(); } > > + BaseString* baseString() { return reinterpret_cast<BaseString*>(baseIsSelf() ? this : m_baseString); } > > + const BaseString* baseString() const { return const_cast<const BaseString*>(const_cast<Rep*>(this)->baseString()); }
> It's annoying to have these reinterpret_cast here. Instead you should just > define these inline functions outside the class definition, in a place that can > see the definition of BaseString.
Fixed.
> > + static BaseString& empty() { return *emptyBaseString; } > > + private: > > + friend void initializeUString(); > I'd prefer a blank line before the private line.
Fixed.
> + inline UChar* UString::Rep::data() const > + { > + const UString::BaseString* base = baseString(); > Do you need the UString:: qualifier on BaseString here?
Fixed.
> > + UString::BaseString* base = m_rep->baseString(); >And here?
Fixed.
> > - bool baseIsSelf() const { return baseString == this; } >> + bool baseIsSelf() const { return m_identifierTableAndFlags.isFlagSet(BaseStringFlag); }
> Why can't we keep the old definition, but move it down in the file where it can...
and
> > + void* m_baseString; // If "this" is a BaseString instance, it is 0. BaseString* otherwise. > Seems really ugly to have this be a void*. Can't it be a BaseString* instead? > When is the fact that it's a void* helpful?
and
> I'm going to say review- because of the reinterpret_cast question, and the > question of whether we really do need a bit to tell whether something is a base > string.
In the next change, I'll make this point to something else if it is a BaseString. This is the reason for the flag and putting 0 in there in the case of the BaseString and for making a it void* that no code can use directly. Why re-use this field? In order to not make BaseString any bigger. (I've been under the impression that this is desired.)
Darin Adler
Comment 15
2009-01-11 17:47:54 PST
(In reply to
comment #14
)
> In the next change, I'll make this point to something else if it is a > BaseString. This is the reason for the flag and putting 0 in there in the case > of the BaseString and for making a it void* that no code can use directly.
A union is typically better than a void* to use the same storage for pointers of two different types.
Darin Adler
Comment 16
2009-01-11 17:51:57 PST
Comment on
attachment 26618
[details]
Addressed comments -- Part 2: Refactor UString::Rep into two classes which reduces memory usage for the base Rep class.
> + COMPILE_ASSERT(numCharactersToStore == sizeof(SmallStrings::m_singleCharacterStrings) / sizeof(SmallStrings::m_singleCharacterStrings[0]), > + WTF_IsNumCharactersConstInSyncWithClassUsage);
No need to put that WTF_ prefix on the COMPILE_ASSERT argument. That name is only used for a local symbol, which can't conflict with anything outside this function. The only place you *might* want that prefix would be if you used COMPILE_ASSERT in a header, and even there I don't think it’s needed. Do you need those SmallStrings:: qualifiers? I'd think this would compile without them since we’re inside a SmallStrings constructor definition.
> + void setBaseString(PassRefPtr<BaseString> base);
This is a classic case where we would not name the argument in the function declaration, since the argument's type and the function name make it perfectly clear without a name. r=me
David Levin
Comment 17
2009-01-11 18:06:04 PST
Created
attachment 26619
[details]
Addressed Darin's comments from the last review. The only change is to fix the two comments on the R+ review from before.
Darin Adler
Comment 18
2009-01-11 19:19:59 PST
Comment on
attachment 26619
[details]
Addressed Darin's comments from the last review. Landed as
http://trac.webkit.org/changeset/39815
Darin Adler
Comment 19
2009-01-11 19:20:13 PST
Comment on
attachment 26619
[details]
Addressed Darin's comments from the last review. Clearing review flag because it was landed.
David Levin
Comment 20
2009-01-12 04:39:18 PST
Created
attachment 26629
[details]
Part3: Allow UString to share a UChar* with another class. The underlying class used for sharing also allows for sharing across threads which will be used in StringImpl::copy.
David Levin
Comment 21
2009-01-26 22:23:19 PST
I currently like the names FastRefCountForThreads or FastThreadableRefCount instead of RefCountedMalloced but I'd like a second opinion before I change it. (The fact that it hold something that was malloc'ed is easily changeable if it is ever desired to be used to for another allocation type.)
Darin Adler
Comment 22
2009-01-27 08:57:25 PST
(In reply to
comment #21
)
> I currently like the names > FastRefCountForThreads > or > FastThreadableRefCount > > instead of > RefCountedMalloced > > but I'd like a second opinion before I change it. (The fact that it hold > something that was malloc'ed is easily changeable if it is ever desired to be > used to for another allocation type.)
I like the general direction here; the new suggested names seem better. Despite our unfortunate use of the word "fast" in "fastMalloc", I generally would like to avoid the word "fast". It makes me want to ask the question, "When we would I ever *not* use this?" I don't want my other code to be slow! This is easier to do in person, but generally I like to describe the class and what it's for in prose, and then select words from that discussion to name the class. That exercise often works well.
David Levin
Comment 23
2009-01-27 12:56:12 PST
> describe the class and what it's for in prose
Here's my attempt: The class is to allow for sharing an item with multiple objects across threads that can each control its lifetime independently (using a ref counting scheme) which is similar to RefCounted. It particularly useful in performance critical scenarios where ref/deref are done often without significantly impacting performance and avoids using atomic* until absolutely necessary (unlike ThreadSafeShared which does it on every call). In return, the way to use it is slightly more complicated than ThreadSafeShared. An instance of the class is not threadsafe, but it is designed to be usable in a multi-thread scenario. The code has to call ::copy() to get another instance which can be used on another thread (while still holding on to the same object).
David Levin
Comment 24
2009-01-29 13:57:51 PST
Comment on
attachment 26618
[details]
Addressed comments -- Part 2: Refactor UString::Rep into two classes which reduces memory usage for the base Rep class. Removed r+ b/c it was landed.
David Levin
Comment 25
2009-02-02 16:30:10 PST
Created
attachment 27266
[details]
Part 3: Add CrossThreadRefCounted. Ready for review. Also I broke up the previous (part 3) patch into something smaller with the hope that it will be easier to review like this, and came up with a better name for my class. The class isn't in the build yet because it isn't used in this patch. (It will be used in the next patch.)
Darin Adler
Comment 26
2009-03-09 09:38:37 PDT
Comment on
attachment 27266
[details]
Part 3: Add CrossThreadRefCounted.
> + // ThreadSafeShared can have a siginificant perf impact when used in low level classes
Typo here in the spelling of "significant".
> + // data is freed using *fastFree*.
I believe this comment is wrong. I think the data is deallocated with delete, not fastFree.
> + // Used to make an instance that can be used on another thread. > + PassRefPtr<CrossThreadRefCounted<T> > copy();
I think this naming is getting increasingly unclear. The functions that copy objects to be used on other threads probably need a name that somehow talks about threads rather then just the word "copy".
> + bool isShared() const { return !m_refCounter.hasOneRef() || dataAccessMustBeThreadSafe(); }
I don't understand the semantics of this function. When would you call it and what would you do based on the result.
> + bool dataAccessMustBeThreadSafe() const { return m_threadSafeRefCounter && !m_threadSafeRefCounter->hasOneRef(); }
Same question.
> + ~CrossThreadRefCounted() > + { > + if (m_data && !m_threadSafeRefCounter) > + delete m_data; > + }
The null check of m_data is redundant with what the compiler will already do automatically, so I suggest omitting it.
> +#if USE(LOCKFREE_THREADSAFESHARED) > + if (atomicDecrement(&m_refCount) <= 0) > +#else > + { > + MutexLocker locker(m_mutex); > + --m_refCount; > + } > + if (m_refCount <= 0) > +#endif > + return true;
I think this would read more clearly if the return statement was repeated isnide the #if. r=me
David Levin
Comment 27
2009-03-09 14:06:24 PDT
Comment on
attachment 27266
[details]
Part 3: Add CrossThreadRefCounted. Committed in
r41536
.
David Levin
Comment 28
2009-06-29 16:00:15 PDT
This is all working now. Changing to FIXED.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug