WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
226803
Avoid some calls to StringView::toString() / StringView::toStringWithoutCopying()
https://bugs.webkit.org/show_bug.cgi?id=226803
Summary
Avoid some calls to StringView::toString() / StringView::toStringWithoutCopyi...
Chris Dumez
Reported
2021-06-08 20:00:03 PDT
Avoid some calls to StringView::toString() / StringView::toStringWithoutCopying().
Attachments
Patch
(36.84 KB, patch)
2021-06-08 20:02 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(40.66 KB, patch)
2021-06-09 09:37 PDT
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Chris Dumez
Comment 1
2021-06-08 20:02:48 PDT
Created
attachment 430936
[details]
Patch
Darin Adler
Comment 2
2021-06-09 08:45:15 PDT
Comment on
attachment 430936
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=430936&action=review
> Source/WTF/wtf/URL.cpp:559 > + if constexpr (std::is_same_v<StringType, StringView>) > + return input.toString(); > + else > + return input;
Another way to write this is: return makeString(input); We would have to make sure it’s optimized well enough for the single argument StringView and String cases, but I think it’s better than writing the if statement.
> Source/WTF/wtf/text/TextStream.h:71 > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*);
If we remove the const char* and String overloads rather than adding an AtomString one, will the StringView overload just be used instead? If so, would be nice to remove them. Also would be good at some point to find a way to support everything that StringConcatenate does without having to write separate functions for each.
> Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > + auto stringView = range.consumeIncludingWhitespace().value();
Do we really need to change the variable name?
> Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > + for (unsigned i = 0; i < gridRowNames.length(); ++i) {
This should use a loop more like this: for (auto character : gridRowNames.codeUnits()) {
> Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString();
If this is hot enough, should consider a StringView::convertToASCIILowercaseAtom that optimizes the case for short strings at least, and goes directly to an AtomString rather than allocating and destroying a StringImpl if it happens to match an existing AtomString.
Chris Dumez
Comment 3
2021-06-09 08:47:47 PDT
(In reply to Darin Adler from
comment #2
)
> Comment on
attachment 430936
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=430936&action=review
> > > Source/WTF/wtf/URL.cpp:559 > > + if constexpr (std::is_same_v<StringType, StringView>) > > + return input.toString(); > > + else > > + return input; > > Another way to write this is: > > return makeString(input); > > We would have to make sure it’s optimized well enough for the single > argument StringView and String cases, but I think it’s better than writing > the if statement.
Will check.
> > > Source/WTF/wtf/text/TextStream.h:71 > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > If we remove the const char* and String overloads rather than adding an > AtomString one, will the StringView overload just be used instead? If so, > would be nice to remove them.
I think I tried that. If I remember correctly, the compiler complained that it was ambiguous for String because it could use the overload taking a StringView or the one taking a bool :/
> Also would be good at some point to find a way to support everything that > StringConcatenate does without having to write separate functions for each. > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > + auto stringView = range.consumeIncludingWhitespace().value(); > > Do we really need to change the variable name?
Probably not.
> > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > This should use a loop more like this: > > for (auto character : gridRowNames.codeUnits()) {
OK.
> > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > If this is hot enough, should consider a > StringView::convertToASCIILowercaseAtom that optimizes the case for short > strings at least, and goes directly to an AtomString rather than allocating > and destroying a StringImpl if it happens to match an existing AtomString.
I'll check.
Chris Dumez
Comment 4
2021-06-09 08:57:27 PDT
(In reply to Darin Adler from
comment #2
)
> Comment on
attachment 430936
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=430936&action=review
> > > Source/WTF/wtf/URL.cpp:559 > > + if constexpr (std::is_same_v<StringType, StringView>) > > + return input.toString(); > > + else > > + return input; > > Another way to write this is: > > return makeString(input); > > We would have to make sure it’s optimized well enough for the single > argument StringView and String cases, but I think it’s better than writing > the if statement.
It doesn't seem like makeString() would currently work with a StringView parameter as I don't see a StringTypeAdapter for StringView. For String, the code is a little hard to follow but it really doesn't look like it is as efficient as ref'ing the StringImpl. It looks to me that the implementation creates a new StringImpl and then asks the StringTypeAdapter to copy the characters into it.
> > > Source/WTF/wtf/text/TextStream.h:71 > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > If we remove the const char* and String overloads rather than adding an > AtomString one, will the StringView overload just be used instead? If so, > would be nice to remove them. > > Also would be good at some point to find a way to support everything that > StringConcatenate does without having to write separate functions for each. > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > + auto stringView = range.consumeIncludingWhitespace().value(); > > Do we really need to change the variable name? > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > This should use a loop more like this: > > for (auto character : gridRowNames.codeUnits()) { > > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > If this is hot enough, should consider a > StringView::convertToASCIILowercaseAtom that optimizes the case for short > strings at least, and goes directly to an AtomString rather than allocating > and destroying a StringImpl if it happens to match an existing AtomString.
(In reply to Chris Dumez from
comment #3
)
> (In reply to Darin Adler from
comment #2
) > > Comment on
attachment 430936
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=430936&action=review
> > > > > Source/WTF/wtf/URL.cpp:559 > > > + if constexpr (std::is_same_v<StringType, StringView>) > > > + return input.toString(); > > > + else > > > + return input; > > > > Another way to write this is: > > > > return makeString(input); > > > > We would have to make sure it’s optimized well enough for the single > > argument StringView and String cases, but I think it’s better than writing > > the if statement. > > Will check. > > > > > > Source/WTF/wtf/text/TextStream.h:71 > > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > > > If we remove the const char* and String overloads rather than adding an > > AtomString one, will the StringView overload just be used instead? If so, > > would be nice to remove them. > > I think I tried that. If I remember correctly, the compiler complained that > it was ambiguous for String because it could use the overload taking a > StringView or the one taking a bool :/ > > > Also would be good at some point to find a way to support everything that > > StringConcatenate does without having to write separate functions for each. > > > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > > + auto stringView = range.consumeIncludingWhitespace().value(); > > > > Do we really need to change the variable name? > > Probably not. > > > > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > > > This should use a loop more like this: > > > > for (auto character : gridRowNames.codeUnits()) { > > OK. > > > > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > > > If this is hot enough, should consider a > > StringView::convertToASCIILowercaseAtom that optimizes the case for short > > strings at least, and goes directly to an AtomString rather than allocating > > and destroying a StringImpl if it happens to match an existing AtomString. > > I'll check.
Chris Dumez
Comment 5
2021-06-09 09:00:59 PDT
(In reply to Darin Adler from
comment #2
)
> Comment on
attachment 430936
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=430936&action=review
> > > Source/WTF/wtf/URL.cpp:559 > > + if constexpr (std::is_same_v<StringType, StringView>) > > + return input.toString(); > > + else > > + return input; > > Another way to write this is: > > return makeString(input); > > We would have to make sure it’s optimized well enough for the single > argument StringView and String cases, but I think it’s better than writing > the if statement. > > > Source/WTF/wtf/text/TextStream.h:71 > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > If we remove the const char* and String overloads rather than adding an > AtomString one, will the StringView overload just be used instead? If so, > would be nice to remove them. > > Also would be good at some point to find a way to support everything that > StringConcatenate does without having to write separate functions for each. > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > + auto stringView = range.consumeIncludingWhitespace().value(); > > Do we really need to change the variable name? > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > This should use a loop more like this: > > for (auto character : gridRowNames.codeUnits()) { > > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > If this is hot enough, should consider a > StringView::convertToASCIILowercaseAtom that optimizes the case for short > strings at least, and goes directly to an AtomString rather than allocating > and destroying a StringImpl if it happens to match an existing AtomString.
Do you mean that for strings that are short enough, we'd iterate through it and optimize the case where the input StringView is already all lowercase?
Chris Dumez
Comment 6
2021-06-09 09:04:17 PDT
(In reply to Chris Dumez from
comment #5
)
> (In reply to Darin Adler from
comment #2
) > > Comment on
attachment 430936
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=430936&action=review
> > > > > Source/WTF/wtf/URL.cpp:559 > > > + if constexpr (std::is_same_v<StringType, StringView>) > > > + return input.toString(); > > > + else > > > + return input; > > > > Another way to write this is: > > > > return makeString(input); > > > > We would have to make sure it’s optimized well enough for the single > > argument StringView and String cases, but I think it’s better than writing > > the if statement. > > > > > Source/WTF/wtf/text/TextStream.h:71 > > > WTF_EXPORT_PRIVATE TextStream& operator<<(const char*); > > > > If we remove the const char* and String overloads rather than adding an > > AtomString one, will the StringView overload just be used instead? If so, > > would be nice to remove them. > > > > Also would be good at some point to find a way to support everything that > > StringConcatenate does without having to write separate functions for each. > > > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:539 > > > + auto stringView = range.consumeIncludingWhitespace().value(); > > > > Do we really need to change the variable name? > > > > > Source/WebCore/css/parser/CSSPropertyParser.cpp:3337 > > > + for (unsigned i = 0; i < gridRowNames.length(); ++i) { > > > > This should use a loop more like this: > > > > for (auto character : gridRowNames.codeUnits()) { > > > > > Source/WebCore/css/parser/CSSPropertyParserHelpers.cpp:3566 > > > + return isPredefinedCounterStyle(nameToken.id()) ? AtomString { name.convertToASCIILowercase() } : name.toAtomString(); > > > > If this is hot enough, should consider a > > StringView::convertToASCIILowercaseAtom that optimizes the case for short > > strings at least, and goes directly to an AtomString rather than allocating > > and destroying a StringImpl if it happens to match an existing AtomString. > > Do you mean that for strings that are short enough, we'd iterate through it > and optimize the case where the input StringView is already all lowercase?
Looks like the version in StringImpl has a similar optimization (without length restriction): ``` template<StringImpl::CaseConvertType type, typename CharacterType> ALWAYS_INLINE Ref<StringImpl> StringImpl::convertASCIICase(StringImpl& impl, const CharacterType* data, unsigned length) { unsigned failingIndex; for (unsigned i = 0; i < length; ++i) { CharacterType character = data[i]; if (type == CaseConvertType::Lower ? UNLIKELY(isASCIIUpper(character)) : LIKELY(isASCIILower(character))) { failingIndex = i; goto SlowPath; } } return impl; SlowPath: CharacterType* newData; auto newImpl = createUninitializedInternalNonEmpty(length, newData); copyCharacters(newData, data, failingIndex); for (unsigned i = failingIndex; i < length; ++i) newData[i] = type == CaseConvertType::Lower ? toASCIILower(data[i]) : toASCIIUpper(data[i]); return newImpl; } ```
Chris Dumez
Comment 7
2021-06-09 09:37:08 PDT
Created
attachment 430971
[details]
Patch
Darin Adler
Comment 8
2021-06-09 11:12:10 PDT
(In reply to Chris Dumez from
comment #4
)
> It doesn't seem like makeString() would currently work with a StringView > parameter as I don't see a StringTypeAdapter for StringView.
It definitely works. StringTypeAdapter<StringView> is in StringView.h, not StringConcatenate.h.
> For String, the > code is a little hard to follow but it really doesn't look like it is as > efficient as ref'ing the StringImpl. It looks to me that the implementation > creates a new StringImpl and then asks the StringTypeAdapter to copy the > characters into it.
Yes, I was talking about optimizing that case, not expecting that it was already optimized. We could and should optimize makeString when called with a single String or single StringView, to make it more useful for generic programming.
Darin Adler
Comment 9
2021-06-09 11:15:14 PDT
(In reply to Chris Dumez from
comment #5
)
> Do you mean that for strings that are short enough, we'd iterate through it > and optimize the case where the input StringView is already all lowercase?
The point here is to avoid allocating a new StringImpl when there is already an AtomStringImpl with the correct characters. Presumably the reason we are using AtomString is that such duplication happens and we want to save memory when it does, but it’s also nice to save the extra work from allocating/deallocating in that case. Yes, when the characters already don’t contain any uppercase ASCII, we can just look up the AtomString without even copying characters. That’s what you are talking about above. But there is another case we could optimize: when the string is short enough, we could copy the lowercased characters into a fixed size buffer without allocating anything on the heap, and then look up the AtomString without allocating/destroying memory on the heap if the string happens to already exist.
EWS
Comment 10
2021-06-09 13:46:31 PDT
Committed
r278669
(
238651@main
): <
https://commits.webkit.org/238651@main
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 430971
[details]
.
Radar WebKit Bug Importer
Comment 11
2021-06-09 13:47:20 PDT
<
rdar://problem/79096267
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug