Bug 69913

Summary: Use realloc() to expand/shrink StringBuilder buffer
Product: WebKit Reporter: Xianzhu Wang <wangxianzhu>
Component: Web Template FrameworkAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: darin, kpiascik, webkit.review.bot, yong.li.webkit
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
patch
darin: review-, webkit-ews: commit-queue-
patch v2 none

Description Xianzhu Wang 2011-10-12 01:26:39 PDT
For now StringBuilder uses malloc(StringImpl::createUninitialized()) and memcpy to expand and shrink buffers. I found that replace them with realloc() can reduce many unnecessary actual allocations and copies. Experiments showed that where realloc() is applicable, 45.3% of realloc()s return the same pointer as the input i.e. no actual copies.
Comment 1 Xianzhu Wang 2011-10-12 02:09:52 PDT
Created attachment 110654 [details]
patch
Comment 2 WebKit Review Bot 2011-10-12 02:11:58 PDT
Attachment 110654 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/wtf/text/StringImpl.h:187:  The parameter name "impl" adds no information, so it should be removed.  [readability/parameter_name] [5]
Source/JavaScriptCore/wtf/text/StringImpl.cpp:111:  Local variables should never be PassRefPtr (see http://webkit.org/coding/RefPtr.html).  [readability/pass_ptr] [5]
Total errors found: 2 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Early Warning System Bot 2011-10-12 03:35:39 PDT
Comment on attachment 110654 [details]
patch

Attachment 110654 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/10026841
Comment 4 Darin Adler 2011-10-12 09:50:23 PDT
Comment on attachment 110654 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=110654&action=review

review- because of the EWS build failures

> Source/JavaScriptCore/wtf/text/StringBuilder.cpp:110
> +    // otherwise fallback to "allocate and copy" method.

The verb is “fall back”. The word “fallback” is a noun. This should say “fall back”.

> Source/JavaScriptCore/wtf/text/StringImpl.cpp:93
> +PassRefPtr<StringImpl> StringImpl::reallocate(PassRefPtr<StringImpl> impl, unsigned length, UChar*& data)

I think “impl” is a strange name for this argument. You could just as easily call it “string”. I’d probably call it oldString or existingString or originalString.

> Source/JavaScriptCore/wtf/text/StringImpl.cpp:112
> +    PassRefPtr<StringImpl> result = adoptRef(new (string) StringImpl(length));
> +    return result;

The style bot is right to complain about the use of PassRefPtr. The best way to fix it here is to just call it right in the return statement.

>> Source/JavaScriptCore/wtf/text/StringImpl.h:187
>> +    static PassRefPtr<StringImpl> reallocate(PassRefPtr<StringImpl> impl, unsigned length, UChar*& data);
> 
> The parameter name "impl" adds no information, so it should be removed.  [readability/parameter_name] [5]

I agree with the style bot on this.
Comment 5 Xianzhu Wang 2011-10-12 19:06:00 PDT
Comment on attachment 110654 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=110654&action=review

Thanks Darin for review. I shouldn't have been that hurry going home without running check-webkit-style and checking carefully :)

>> Source/JavaScriptCore/wtf/text/StringBuilder.cpp:110
>> +    // otherwise fallback to "allocate and copy" method.
> 
> The verb is “fall back”. The word “fallback” is a noun. This should say “fall back”.

Done.

>> Source/JavaScriptCore/wtf/text/StringImpl.cpp:93
>> +PassRefPtr<StringImpl> StringImpl::reallocate(PassRefPtr<StringImpl> impl, unsigned length, UChar*& data)
> 
> I think “impl” is a strange name for this argument. You could just as easily call it “string”. I’d probably call it oldString or existingString or originalString.

Changed to originalString.

>> Source/JavaScriptCore/wtf/text/StringImpl.cpp:112
>> +    return result;
> 
> The style bot is right to complain about the use of PassRefPtr. The best way to fix it here is to just call it right in the return statement.

Done. This was caused by incomplete debug code cleanup.

>>> Source/JavaScriptCore/wtf/text/StringImpl.h:187
>>> +    static PassRefPtr<StringImpl> reallocate(PassRefPtr<StringImpl> impl, unsigned length, UChar*& data);
>> 
>> The parameter name "impl" adds no information, so it should be removed.  [readability/parameter_name] [5]
> 
> I agree with the style bot on this.

Changed to originalString and use that name in the comments.
Comment 6 Xianzhu Wang 2011-10-12 19:08:46 PDT
Created attachment 110792 [details]
patch v2
Comment 7 WebKit Review Bot 2011-10-13 10:00:05 PDT
Comment on attachment 110792 [details]
patch v2

Clearing flags on attachment: 110792

Committed r97371: <http://trac.webkit.org/changeset/97371>
Comment 8 WebKit Review Bot 2011-10-13 10:00:09 PDT
All reviewed patches have been landed.  Closing bug.
Comment 9 Yong Li 2012-01-30 10:54:31 PST
While it boosts performance, it also costs memory usage. Are we sure we really want it? Especially in shrinkToFit. In that function we check the length and see if we should shrink memory. When we decide to shrink memory, we are sure we want to create a new copy. But with your patch, it depends on how realloc is implemented. The more frequently you see realloc returns the same pointer, the more memory waste is happening. The buffer will not be destroyed when StringBuilder destructs in most cases, instead, it will be held by WTF::String object returned by toString, and can live with some JS object. So I suggest to revert this patch.
Comment 10 Xianzhu Wang 2012-01-30 12:14:28 PST
(In reply to comment #9)
> The more frequently you see realloc returns the same pointer, the more memory waste is happening.

Could you explain more about the above sentence?
Comment 11 Yong Li 2012-01-30 12:25:28 PST
(In reply to comment #10)
> (In reply to comment #9)
> > The more frequently you see realloc returns the same pointer, the more memory waste is happening.
> 
> Could you explain more about the above sentence?

When realloc returns same pointer, it means the new size is smaller than the old or original size, and the memory allocator can reuse the original memory block. Usually (there are specials), the memory allocator doesn't free the rest of the memory block that is not used any more. For eample, the original buffer in StringBuilder is 2K, but the length is 512 bytes. When the string build job finishes, StringBuilder::toString() calls shrinkToFit(), which is supposed to allocate a new 512-byte buffer and free the 2K buffer. But with the patch, realloc can still use the 2K buffer which leaves 1.5K waste.
Comment 12 Xianzhu Wang 2012-01-30 12:35:10 PST
(In reply to comment #11)
> (In reply to comment #10)
> > (In reply to comment #9)
> > > The more frequently you see realloc returns the same pointer, the more memory waste is happening.
> > 
> > Could you explain more about the above sentence?
> 
> When realloc returns same pointer, it means the new size is smaller than the old or original size, and the memory allocator can reuse the original memory block. Usually (there are specials), the memory allocator doesn't free the rest of the memory block that is not used any more. For eample, the original buffer in StringBuilder is 2K, but the length is 512 bytes. When the string build job finishes, StringBuilder::toString() calls shrinkToFit(), which is supposed to allocate a new 512-byte buffer and free the 2K buffer. But with the patch, realloc can still use the 2K buffer which leaves 1.5K waste.

Could you provide link of memory allocator which works like this?
Comment 13 Yong Li 2012-01-30 13:36:40 PST
(In reply to comment #12)
> (In reply to comment #11)
> > (In reply to comment #10)
> > > (In reply to comment #9)
> > > > The more frequently you see realloc returns the same pointer, the more memory waste is happening.
> > > 
> > > Could you explain more about the above sentence?
> > 
> > When realloc returns same pointer, it means the new size is smaller than the old or original size, and the memory allocator can reuse the original memory block. Usually (there are specials), the memory allocator doesn't free the rest of the memory block that is not used any more. For eample, the original buffer in StringBuilder is 2K, but the length is 512 bytes. When the string build job finishes, StringBuilder::toString() calls shrinkToFit(), which is supposed to allocate a new 512-byte buffer and free the 2K buffer. But with the patch, realloc can still use the 2K buffer which leaves 1.5K waste.
> 
> Could you provide link of memory allocator which works like this?

Never mind. It seems it is the memory allocator's responsibility to reuse the new free space if the new free space is significant. If it cannot, it should try to allocate a new block. Sorry for this