RESOLVED FIXED 226803
Avoid some calls to StringView::toString() / StringView::toStringWithoutCopying()
https://bugs.webkit.org/show_bug.cgi?id=226803
Summary Avoid some calls to StringView::toString() / StringView::toStringWithoutCopyi...
Chris Dumez
Reported 2021-06-08 20:00:03 PDT
Avoid some calls to StringView::toString() / StringView::toStringWithoutCopying().
Attachments
Patch (36.84 KB, patch)
2021-06-08 20:02 PDT, Chris Dumez
no flags
Patch (40.66 KB, patch)
2021-06-09 09:37 PDT, Chris Dumez
no flags
Chris Dumez
Comment 1 2021-06-08 20:02:48 PDT
Darin Adler
Comment 2 2021-06-09 08:45:15 PDT
Comment on attachment 430936 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=430936&action=review > Source/WTF/wtf/URL.cpp:559 > + if constexpr (std::is_same_v<StringType, StringView>) > + return input.toString(); > + else > + return input; Another way to write this is: return makeString(input); We would have to make sure it’s optimized well enough for the single argument StringView and String cases, but I think it’s better than writing the if statement. > Source/WTF/wtf/text/TextStream.h:71 > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); If we remove the const char* and String overloads rather than adding an AtomString one, will the StringView overload just be used instead? If so, would be nice to remove them. Also would be good at some point to find a way to support everything that StringConcatenate does without having to write separate functions for each. > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > + auto stringView = range.consumeIncludingWhitespace().value(); Do we really need to change the variable name? > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { This should use a loop more like this: for (auto character : gridRowNames.codeUnits()) { > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); If this is hot enough, should consider a StringView::convertToASCIILowercaseAtom that optimizes the case for short strings at least, and goes directly to an AtomString rather than allocating and destroying a StringImpl if it happens to match an existing AtomString.
Chris Dumez
Comment 3 2021-06-09 08:47:47 PDT
(In reply to Darin Adler from comment #2) > Comment on attachment 430936 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=430936&action=review > > > Source/WTF/wtf/URL.cpp:559 > > + if constexpr (std::is_same_v<StringType, StringView>) > > + return input.toString(); > > + else > > + return input; > > Another way to write this is: > > return makeString(input); > > We would have to make sure it’s optimized well enough for the single > argument StringView and String cases, but I think it’s better than writing > the if statement. Will check. > > > Source/WTF/wtf/text/TextStream.h:71 > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > If we remove the const char* and String overloads rather than adding an > AtomString one, will the StringView overload just be used instead? If so, > would be nice to remove them. I think I tried that. If I remember correctly, the compiler complained that it was ambiguous for String because it could use the overload taking a StringView or the one taking a bool :/ > Also would be good at some point to find a way to support everything that > StringConcatenate does without having to write separate functions for each. > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > + auto stringView = range.consumeIncludingWhitespace().value(); > > Do we really need to change the variable name? Probably not. > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > This should use a loop more like this: > > for (auto character : gridRowNames.codeUnits()) { OK. > > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > If this is hot enough, should consider a > StringView::convertToASCIILowercaseAtom that optimizes the case for short > strings at least, and goes directly to an AtomString rather than allocating > and destroying a StringImpl if it happens to match an existing AtomString. I'll check.
Chris Dumez
Comment 4 2021-06-09 08:57:27 PDT
(In reply to Darin Adler from comment #2) > Comment on attachment 430936 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=430936&action=review > > > Source/WTF/wtf/URL.cpp:559 > > + if constexpr (std::is_same_v<StringType, StringView>) > > + return input.toString(); > > + else > > + return input; > > Another way to write this is: > > return makeString(input); > > We would have to make sure it’s optimized well enough for the single > argument StringView and String cases, but I think it’s better than writing > the if statement. It doesn't seem like makeString() would currently work with a StringView parameter as I don't see a StringTypeAdapter for StringView. For String, the code is a little hard to follow but it really doesn't look like it is as efficient as ref'ing the StringImpl. It looks to me that the implementation creates a new StringImpl and then asks the StringTypeAdapter to copy the characters into it. > > > Source/WTF/wtf/text/TextStream.h:71 > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > If we remove the const char* and String overloads rather than adding an > AtomString one, will the StringView overload just be used instead? If so, > would be nice to remove them. > > Also would be good at some point to find a way to support everything that > StringConcatenate does without having to write separate functions for each. > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > + auto stringView = range.consumeIncludingWhitespace().value(); > > Do we really need to change the variable name? > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > This should use a loop more like this: > > for (auto character : gridRowNames.codeUnits()) { > > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > If this is hot enough, should consider a > StringView::convertToASCIILowercaseAtom that optimizes the case for short > strings at least, and goes directly to an AtomString rather than allocating > and destroying a StringImpl if it happens to match an existing AtomString. (In reply to Chris Dumez from comment #3) > (In reply to Darin Adler from comment #2) > > Comment on attachment 430936 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=430936&action=review > > > > > Source/WTF/wtf/URL.cpp:559 > > > + if constexpr (std::is_same_v<StringType, StringView>) > > > + return input.toString(); > > > + else > > > + return input; > > > > Another way to write this is: > > > > return makeString(input); > > > > We would have to make sure it’s optimized well enough for the single > > argument StringView and String cases, but I think it’s better than writing > > the if statement. > > Will check. > > > > > > Source/WTF/wtf/text/TextStream.h:71 > > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > > > If we remove the const char* and String overloads rather than adding an > > AtomString one, will the StringView overload just be used instead? If so, > > would be nice to remove them. > > I think I tried that. If I remember correctly, the compiler complained that > it was ambiguous for String because it could use the overload taking a > StringView or the one taking a bool :/ > > > Also would be good at some point to find a way to support everything that > > StringConcatenate does without having to write separate functions for each. > > > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > > + auto stringView = range.consumeIncludingWhitespace().value(); > > > > Do we really need to change the variable name? > > Probably not. > > > > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > > > This should use a loop more like this: > > > > for (auto character : gridRowNames.codeUnits()) { > > OK. > > > > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > > > If this is hot enough, should consider a > > StringView::convertToASCIILowercaseAtom that optimizes the case for short > > strings at least, and goes directly to an AtomString rather than allocating > > and destroying a StringImpl if it happens to match an existing AtomString. > > I'll check.
Chris Dumez
Comment 5 2021-06-09 09:00:59 PDT
(In reply to Darin Adler from comment #2) > Comment on attachment 430936 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=430936&action=review > > > Source/WTF/wtf/URL.cpp:559 > > + if constexpr (std::is_same_v<StringType, StringView>) > > + return input.toString(); > > + else > > + return input; > > Another way to write this is: > > return makeString(input); > > We would have to make sure it’s optimized well enough for the single > argument StringView and String cases, but I think it’s better than writing > the if statement. > > > Source/WTF/wtf/text/TextStream.h:71 > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > If we remove the const char* and String overloads rather than adding an > AtomString one, will the StringView overload just be used instead? If so, > would be nice to remove them. > > Also would be good at some point to find a way to support everything that > StringConcatenate does without having to write separate functions for each. > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > + auto stringView = range.consumeIncludingWhitespace().value(); > > Do we really need to change the variable name? > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > This should use a loop more like this: > > for (auto character : gridRowNames.codeUnits()) { > > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > If this is hot enough, should consider a > StringView::convertToASCIILowercaseAtom that optimizes the case for short > strings at least, and goes directly to an AtomString rather than allocating > and destroying a StringImpl if it happens to match an existing AtomString. Do you mean that for strings that are short enough, we'd iterate through it and optimize the case where the input StringView is already all lowercase?
Chris Dumez
Comment 6 2021-06-09 09:04:17 PDT
(In reply to Chris Dumez from comment #5) > (In reply to Darin Adler from comment #2) > > Comment on attachment 430936 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=430936&action=review > > > > > Source/WTF/wtf/URL.cpp:559 > > > + if constexpr (std::is_same_v<StringType, StringView>) > > > + return input.toString(); > > > + else > > > + return input; > > > > Another way to write this is: > > > > return makeString(input); > > > > We would have to make sure it’s optimized well enough for the single > > argument StringView and String cases, but I think it’s better than writing > > the if statement. > > > > > Source/WTF/wtf/text/TextStream.h:71 > > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > > > If we remove the const char* and String overloads rather than adding an > > AtomString one, will the StringView overload just be used instead? If so, > > would be nice to remove them. > > > > Also would be good at some point to find a way to support everything that > > StringConcatenate does without having to write separate functions for each. > > > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > > + auto stringView = range.consumeIncludingWhitespace().value(); > > > > Do we really need to change the variable name? > > > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > > > This should use a loop more like this: > > > > for (auto character : gridRowNames.codeUnits()) { > > > > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > > > If this is hot enough, should consider a > > StringView::convertToASCIILowercaseAtom that optimizes the case for short > > strings at least, and goes directly to an AtomString rather than allocating > > and destroying a StringImpl if it happens to match an existing AtomString. > > Do you mean that for strings that are short enough, we'd iterate through it > and optimize the case where the input StringView is already all lowercase? Looks like the version in StringImpl has a similar optimization (without length restriction): ``` template<StringImpl::CaseConvertType type, typename CharacterType> ALWAYS_INLINE Ref<StringImpl> StringImpl::convertASCIICase(StringImpl& impl, const CharacterType* data, unsigned length) { unsigned failingIndex; for (unsigned i = 0; i < length; ++i) { CharacterType character = data[i]; if (type == CaseConvertType::Lower ? UNLIKELY(isASCIIUpper(character)) : LIKELY(isASCIILower(character))) { failingIndex = i; goto SlowPath; } } return impl; SlowPath: CharacterType* newData; auto newImpl = createUninitializedInternalNonEmpty(length, newData); copyCharacters(newData, data, failingIndex); for (unsigned i = failingIndex; i < length; ++i) newData[i] = type == CaseConvertType::Lower ? toASCIILower(data[i]) : toASCIIUpper(data[i]); return newImpl; } ```
Chris Dumez
Comment 7 2021-06-09 09:37:08 PDT
Darin Adler
Comment 8 2021-06-09 11:12:10 PDT
(In reply to Chris Dumez from comment #4) > It doesn't seem like makeString() would currently work with a StringView > parameter as I don't see a StringTypeAdapter for StringView. It definitely works. StringTypeAdapter<StringView> is in StringView.h, not StringConcatenate.h. > For String, the > code is a little hard to follow but it really doesn't look like it is as > efficient as ref'ing the StringImpl. It looks to me that the implementation > creates a new StringImpl and then asks the StringTypeAdapter to copy the > characters into it. Yes, I was talking about optimizing that case, not expecting that it was already optimized. We could and should optimize makeString when called with a single String or single StringView, to make it more useful for generic programming.
Darin Adler
Comment 9 2021-06-09 11:15:14 PDT
(In reply to Chris Dumez from comment #5) > Do you mean that for strings that are short enough, we'd iterate through it > and optimize the case where the input StringView is already all lowercase? The point here is to avoid allocating a new StringImpl when there is already an AtomStringImpl with the correct characters. Presumably the reason we are using AtomString is that such duplication happens and we want to save memory when it does, but it’s also nice to save the extra work from allocating/deallocating in that case. Yes, when the characters already don’t contain any uppercase ASCII, we can just look up the AtomString without even copying characters. That’s what you are talking about above. But there is another case we could optimize: when the string is short enough, we could copy the lowercased characters into a fixed size buffer without allocating anything on the heap, and then look up the AtomString without allocating/destroying memory on the heap if the string happens to already exist.
EWS
Comment 10 2021-06-09 13:46:31 PDT
Committed r278669 (238651@main): <https://commits.webkit.org/238651@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 430971 [details].
Radar WebKit Bug Importer
Comment 11 2021-06-09 13:47:20 PDT
Note You need to log in before you can comment on or make changes to this bug.