RESOLVED FIXED 239412
Drop String::truncate() and use String::left() instead
https://bugs.webkit.org/show_bug.cgi?id=239412
Summary Drop String::truncate() and use String::left() instead
Chris Dumez
Reported 2022-04-15 17:13:24 PDT
String::truncate() and String::left() have identical behavior. The only difference is that truncate() modifies the String in place (which is a bit confusing), while left() returns a new String, without modifying the original. To simplify our API, I am dropping String::truncate().
Attachments
Patch (19.44 KB, patch)
2022-04-15 17:16 PDT, Chris Dumez
no flags
Patch (21.23 KB, patch)
2022-04-16 15:43 PDT, Chris Dumez
no flags
Chris Dumez
Comment 1 2022-04-15 17:16:44 PDT
Darin Adler
Comment 2 2022-04-16 14:11:05 PDT
Comment on attachment 457737 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=457737&action=review > Source/WebCore/dom/FragmentDirectiveParser.cpp:105 > + firstToken = firstToken.left(tokens.first().length() - 2); > > if (auto prefix = WTF::URLParser::formURLDecode(tokens.takeFirst())) It’s unnecessary to truncate in place. Instead it can be more like this: auto takenFirstToken = tokens.takeFirst(); if (auto prefix = URLParser::formURLDecode(StringView(takenFirstToken).left(firstToken.length() - 2))) Notice that this means we don’t have to call String::left, which allocates memory! > Source/WebCore/html/HTMLImageElement.cpp:216 > String type = typeAttribute.string(); > - type.truncate(type.find(';')); > - type = stripLeadingAndTrailingHTMLSpaces(type); > + type = stripLeadingAndTrailingHTMLSpaces(type.left(type.find(';'))); > if (!type.isEmpty() && !MIMETypeRegistry::isSupportedImageVideoOrSVGMIMEType(type)) > continue; This allocates an intermediate string after finding the ";", but before stripping the spaces, so it can allocate two strings. We can do this without double allocation. StringView typeView = typeAttribute; typeView = typeView.left(typeView.find(';)).stripLeadingAndTrailingMatchedCharacters(isHTMLSpace<UChar>); if (!typeView.isEmpty()) { auto typeString = typeView == typeAttribute ? typeAttribute.string() : typeView.toString(); if (!MIMETypeRegistry::isSupportedImageVideoOrSVGMIMEType(typeString)) continue; } But unfortunately it’s longer code. Not sure how you feel about this. > Source/WebCore/html/TextFieldInputType.cpp:627 > + eventText = eventText.left(textLength).replace("\r\n", " ").replace('\r', ' ').replace('\n', ' '); New code is no worse, but so inefficient to make so many buffers! This operation could be written to use a maximum of one buffer by working on a copy of the string in place, but instead way this code works, we can have up to 3 intermediate String objects plus the final String object passed into limitText, which might create one more. A worst case of 5 malloc/free pairs instead of 1. Probably doesn’t matter, though. > Source/WebCore/loader/FrameLoader.cpp:760 > + headerContentLanguage = stripLeadingAndTrailingHTMLSpaces(headerContentLanguage.left(headerContentLanguage.find(','))); // notFound == -1 == don't truncate. This can allocate two strings instead of 1. This is the exact same operation as HTMLImageElement::bestFitSourceFromPictureElement, but with "," instead of ";", with the same opportunity for a better algorithm. > Source/WebCore/platform/graphics/avfoundation/objc/AVAssetTrackUtilities.mm:49 > + auto codecStringView = StringView(codecString).left(codecString.find('.')); > + auto codecIdentifier = FourCC::fromString(codecStringView.left(4)); How about a one liner: if (auto identifier = FourCC::fromString(StringView(codecString).left(codecString.find('.')).left(4))) > Source/WebKitLegacy/win/WebDownloadCFNet.cpp:195 > + m_destination = StringView(m_bundlePath).left(m_destination.length() - DownloadBundle::fileExtension().length()).toString(); I’m not sure I understand why isolatedCopy is needed. If it’s not, then we should use String::left instead of StringView::left. If it is needed, then not thrilled that it’s now hidden in the use of StringView::toString. It good for efficiency that we don’t make an extra copy, though, if we do need it! > Tools/WebKitTestRunner/InjectedBundle/InjectedBundlePage.cpp:1342 > + messageString = messageString.left(messageString.find(UChar(0))); Could use nullCharacter from CharacterNames.h instead of UChar(0).
Chris Dumez
Comment 3 2022-04-16 15:36:52 PDT
(In reply to Darin Adler from comment #2) > Comment on attachment 457737 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=457737&action=review > > > Source/WebCore/dom/FragmentDirectiveParser.cpp:105 > > + firstToken = firstToken.left(tokens.first().length() - 2); > > > > if (auto prefix = WTF::URLParser::formURLDecode(tokens.takeFirst())) > > It’s unnecessary to truncate in place. Instead it can be more like this: > > auto takenFirstToken = tokens.takeFirst(); > if (auto prefix = > URLParser::formURLDecode(StringView(takenFirstToken).left(firstToken. > length() - 2))) > > Notice that this means we don’t have to call String::left, which allocates > memory! > > > Source/WebCore/html/HTMLImageElement.cpp:216 > > String type = typeAttribute.string(); > > - type.truncate(type.find(';')); > > - type = stripLeadingAndTrailingHTMLSpaces(type); > > + type = stripLeadingAndTrailingHTMLSpaces(type.left(type.find(';'))); > > if (!type.isEmpty() && !MIMETypeRegistry::isSupportedImageVideoOrSVGMIMEType(type)) > > continue; > > This allocates an intermediate string after finding the ";", but before > stripping the spaces, so it can allocate two strings. We can do this without > double allocation. > > StringView typeView = typeAttribute; > typeView = > typeView.left(typeView.find(';)). > stripLeadingAndTrailingMatchedCharacters(isHTMLSpace<UChar>); > if (!typeView.isEmpty()) { > auto typeString = typeView == typeAttribute ? typeAttribute.string() > : typeView.toString(); > if > (!MIMETypeRegistry::isSupportedImageVideoOrSVGMIMEType(typeString)) > continue; > } > > But unfortunately it’s longer code. Not sure how you feel about this. > > > Source/WebCore/html/TextFieldInputType.cpp:627 > > + eventText = eventText.left(textLength).replace("\r\n", " ").replace('\r', ' ').replace('\n', ' '); > > New code is no worse, but so inefficient to make so many buffers! This > operation could be written to use a maximum of one buffer by working on a > copy of the string in place, but instead way this code works, we can have up > to 3 intermediate String objects plus the final String object passed into > limitText, which might create one more. A worst case of 5 malloc/free pairs > instead of 1. Probably doesn’t matter, though. > > > Source/WebCore/loader/FrameLoader.cpp:760 > > + headerContentLanguage = stripLeadingAndTrailingHTMLSpaces(headerContentLanguage.left(headerContentLanguage.find(','))); // notFound == -1 == don't truncate. > > This can allocate two strings instead of 1. This is the exact same operation > as HTMLImageElement::bestFitSourceFromPictureElement, but with "," instead > of ";", with the same opportunity for a better algorithm. > > > Source/WebCore/platform/graphics/avfoundation/objc/AVAssetTrackUtilities.mm:49 > > + auto codecStringView = StringView(codecString).left(codecString.find('.')); > > + auto codecIdentifier = FourCC::fromString(codecStringView.left(4)); > > How about a one liner: > > if (auto identifier = > FourCC::fromString(StringView(codecString).left(codecString.find('.')). > left(4))) > > > Source/WebKitLegacy/win/WebDownloadCFNet.cpp:195 > > + m_destination = StringView(m_bundlePath).left(m_destination.length() - DownloadBundle::fileExtension().length()).toString(); > > I’m not sure I understand why isolatedCopy is needed. If it’s not, then we > should use String::left instead of StringView::left. > > If it is needed, then not thrilled that it’s now hidden in the use of > StringView::toString. It good for efficiency that we don’t make an extra > copy, though, if we do need it! I only dropped it because StringView().toString() was going to create a fresh (thus isolated) String anyway. That said, you are right that it is a bit obscure. I am going to use StringView().toString().isolatedCopy() for clarity. There should be no cost in perf anyway because isolatedCopy() will get optimized out since this is an r-value reference with a refcount of 1. > > > Tools/WebKitTestRunner/InjectedBundle/InjectedBundlePage.cpp:1342 > > + messageString = messageString.left(messageString.find(UChar(0))); > > Could use nullCharacter from CharacterNames.h instead of UChar(0).
Chris Dumez
Comment 4 2022-04-16 15:43:11 PDT
EWS
Comment 5 2022-04-16 20:57:43 PDT
Committed r292945 (249710@main): <https://commits.webkit.org/249710@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 457757 [details].
Radar WebKit Bug Importer
Comment 6 2022-04-16 20:58:14 PDT
Note You need to log in before you can comment on or make changes to this bug.