RESOLVED FIXED 221766
Reduce string copies when converting from NSString/CFStringRef to WTF::String
https://bugs.webkit.org/show_bug.cgi?id=221766
Summary Reduce string copies when converting from NSString/CFStringRef to WTF::String
Alex Christensen
Reported 2021-02-11 11:46:29 PST
Reduce string copies when converting from NSString/CFStringRef to WTF::String
Attachments
Patch (5.95 KB, patch)
2021-02-11 11:52 PST, Alex Christensen
no flags
Patch (5.09 KB, patch)
2021-02-11 12:17 PST, Alex Christensen
no flags
Alex Christensen
Comment 1 2021-02-11 11:52:58 PST
Geoffrey Garen
Comment 2 2021-02-11 12:12:40 PST
Comment on attachment 420020 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420020&action=review r=me -- but please revert the buffer casting behavior before landing (or post a new patch) > Source/WTF/wtf/text/cf/StringCF.cpp:52 > + buffer.resize(size * sizeof(UChar) / sizeof(LChar)); > + > + StringBuffer<UChar>& ucharBuffer = *reinterpret_cast<StringBuffer<UChar>*>(&buffer); I don't think we should do this part. It's not type-safe or strict-aliasing-safe. Also, in order for the resize() to be an optimization, malloc would need to over-allocate the original buffer by 2X. In bmalloc, that will never happen. So, I recommend just using a new StringBuffer. Or, if I've missed something and there's a reason to believe that resize() will be faster, then I recommend adding a releaseRaw() that gives you the malloc buffer, along with an adopting constructor. That's type-safe and strict-aliasing-safe. (Side note: I think the argument to resize() should have been just "size".)
Alex Christensen
Comment 3 2021-02-11 12:17:58 PST
Geoffrey Garen
Comment 4 2021-02-11 12:19:41 PST
> Also, in order for the resize() to be an optimization, > malloc would need to over-allocate the original buffer by 2X. In bmalloc, > that will never happen. Well, I guess it could be an optimization if the allocation were large and the adjacent memory were free? Anyway, in that case, you'd only save the function call to malloc(); the data copy would still have to restart from scratch. So, I'm not sure it would be a win.
Alex Christensen
Comment 5 2021-02-11 12:21:26 PST
If resize doesn't find adjacent memory, wouldn't it need to copy the bytes that we are going to throw away? I think two malloc calls is actually most efficient. I added a scope to free the first before allocating the second
Geoffrey Garen
Comment 6 2021-02-11 12:23:03 PST
Comment on attachment 420025 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420025&action=review > Source/WTF/wtf/text/cf/StringCF.cpp:47 > + StringBuffer<LChar> buffer(size); > CFIndex usedBufLen; > - CFIndex convertedsize = CFStringGetBytes(str, CFRangeMake(0, size), kCFStringEncodingISOLatin1, 0, false, lcharBuffer.data(), size, &usedBufLen); > - if ((convertedsize == size) && (usedBufLen == size)) { > - m_impl = StringImpl::create(lcharBuffer.data(), size); > + CFIndex convertedSize = CFStringGetBytes(str, CFRangeMake(0, size), kCFStringEncodingISOLatin1, 0, false, buffer.characters(), size, &usedBufLen); > + if (convertedSize == size && usedBufLen == size) { If we ever regret this behavior, I suppose we can preflight it with a CFStringGetFastestEncoding() check?
Geoffrey Garen
Comment 7 2021-02-11 12:23:47 PST
(In reply to Alex Christensen from comment #5) > If resize doesn't find adjacent memory, wouldn't it need to copy the bytes > that we are going to throw away? Yes, good point.
EWS
Comment 8 2021-02-11 13:30:31 PST
commit-queue failed to commit attachment 420025 [details] to WebKit repository. To retry, please set cq+ flag again.
EWS
Comment 9 2021-02-11 14:24:05 PST
Committed r272752: <https://commits.webkit.org/r272752> All reviewed patches have been landed. Closing bug and clearing flags on attachment 420025 [details].
Radar WebKit Bug Importer
Comment 10 2021-02-11 14:25:27 PST
Darin Adler
Comment 11 2021-02-12 13:30:24 PST
Comment on attachment 420025 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=420025&action=review > Source/WTF/wtf/text/cf/StringCF.cpp:48 > + m_impl = StringImpl::adopt(WTFMove(buffer)); This makes a 2-block StringImpl (BufferOwned), but the old code made a 1-block StringImpl (BufferInternal). Doesn’t that mean we are using more memory now? > Source/WTF/wtf/text/cf/StringCF.cpp:55 > + m_impl = StringImpl::adopt(WTFMove(ucharBuffer)); Ditto.
Geoffrey Garen
Comment 12 2021-02-12 14:09:23 PST
> This makes a 2-block StringImpl (BufferOwned), but the old code made a > 1-block StringImpl (BufferInternal). Doesn’t that mean we are using more > memory now? bmalloc small allocations are precise, so I don't think BufferOwned intrinsically uses more memory. But maybe having two objects instead of one increases risk of fragmentation.
Note You need to log in before you can comment on or make changes to this bug.