WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
170925
Using StringView.split() instead of String.split() in some places
https://bugs.webkit.org/show_bug.cgi?id=170925
Summary
Using StringView.split() instead of String.split() in some places
Daniel Bates
Reported
2017-04-17 17:08:49 PDT
Use StringView.split() instead of String.split() in some places to avoid intermediary allocation of substrings.
Attachments
Patch
(14.72 KB, patch)
2017-04-17 17:09 PDT
,
Daniel Bates
no flags
Details
Formatted Diff
Diff
Patch
(23.60 KB, patch)
2017-04-18 12:22 PDT
,
Daniel Bates
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews116 for mac-elcapitan
(1.81 MB, application/zip)
2017-04-18 14:35 PDT
,
Build Bot
no flags
Details
Patch
(23.78 KB, patch)
2017-04-18 16:38 PDT
,
Daniel Bates
no flags
Details
Formatted Diff
Diff
Patch
(26.58 KB, patch)
2017-04-28 17:22 PDT
,
Daniel Bates
darin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Daniel Bates
Comment 1
2017-04-17 17:09:52 PDT
Created
attachment 307323
[details]
Patch
Sam Weinig
Comment 2
2017-04-17 18:42:37 PDT
Comment on
attachment 307323
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=307323&action=review
> Source/WebCore/accessibility/AccessibilityObject.cpp:2241 > + for (StringView roleName : StringView(value).split(' ')) {
I would use auto instead of StringView here.
> Source/WebCore/accessibility/AccessibilityObject.cpp:2242 > + if (AccessibilityRole role = ariaRoleMap().get(roleName.toStringWithoutCopying()))
Seems unfortunate that you have to allocate a String here (though it's good you don't have to copy the characters. Seems like a good place to use a HashTranslator.
> Source/WebCore/accessibility/AccessibilityObject.cpp:3079 > + for (StringView idName : StringView(idList).split(' ')) {
Same, I would use auto.
> Source/WebCore/accessibility/AccessibilityObject.cpp:3080 > + if (Element* idElement = treeScope.getElementById(idName.toStringWithoutCopying()))
I think here you want toAtomicString(), as getElementById will probably go ahead and do that if you don't.
> Source/WebCore/html/parser/XSSAuditor.cpp:245 > + if (protocolIsJavaScript(value.toStringWithoutCopying()))
Seems like protocolIsJavaScript should have an overload that takes a StringView.
Daniel Bates
Comment 3
2017-04-18 12:22:52 PDT
Created
attachment 307402
[details]
Patch Updated patch based on feedback from Sam Weinig.
Build Bot
Comment 4
2017-04-18 14:35:07 PDT
Comment on
attachment 307402
[details]
Patch
Attachment 307402
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/3558947
New failing tests: http/tests/inspector/network/resource-sizes-network.html
Build Bot
Comment 5
2017-04-18 14:35:09 PDT
Created
attachment 307418
[details]
Archive of layout-test-results from ews116 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews116 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Daniel Bates
Comment 6
2017-04-18 16:38:53 PDT
Created
attachment 307434
[details]
Patch Fix the GTK EWS; rename the parameter to WebCore::semicolonSeparatedValueContainsJavaScriptURL() so that its name does not collide with the name of the for-loop variable.
Darin Adler
Comment 7
2017-04-18 18:43:51 PDT
Comment on
attachment 307434
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=307434&action=review
> Source/WTF/wtf/HashMap.h:301 > + KeyValuePairType* entry = const_cast<HashTableType&>(m_impl).template lookup<HashMapTranslatorAdapter<KeyValuePairTraits, HashTranslator>>(value);
Seems like this should be auto or auto* instead of KeyValuePairType*.
> Source/WTF/wtf/HashMap.h:406 > + return get<typename HashTableType::IdentityTranslatorType>(key);
Why doesn’t this use IdentityTranslatorType instead of typename HashTableType::IdentityTranslatorType?
> Source/WebCore/accessibility/AccessibilityObject.cpp:2242 > + if (AccessibilityRole role = ariaRoleMap().get<typename ARIARoleMap::IdentityTranslatorType>(roleName)) // Identity translator supports StringView; get() does not.
That is so peculiar. Even with the comment it’s so non-obvious that an "identity" translator would allow things that the get function would not allow itself. The comment just makes me wonder why! I wonder if there is some more elegant way to handle this. Or some clearer way to explain why explicit use of the identity translator makes this work.
> Source/WebCore/accessibility/AccessibilityObject.cpp:3080 > + if (Element* idElement = treeScope.getElementById(idName.toAtomicString()))
Could consider optimizing this by using toExistingAtomicString; if that function exists and we would also have to make sure null string is handled properly. Could also use auto* here instead of Element*.
> Source/WebCore/platform/URL.cpp:1149 > + return protocolIs(url.toStringWithoutCopying(), "javascript");
This is an inefficient algorithm, allocating memory every time. We can do this much more efficiently by either copying and pasting the body of the protocolIs(const String&, const char*) function or using templates to make a StringView version. Also there is a very strange FIXME above protocolIs(const String&, const char*) that says: "Why is this different than protocolIs(StringView, const char*)?' That seems to imply that there was a protocolIs function that took a StringView at one time. Perhaps it was deleted? The FIXME should go.
> Source/WebCore/platform/graphics/avfoundation/CDMPrivateMediaSourceAVFObjC.mm:107 > + StringView keySystem = m_cdm->keySystem();
This looks unsafe. What type does m_cdm->keySystem() return? If it returns String then this is unsafe. If it returns const String& then this is safe. I worry that someone will change it from returning const String& to returning String and then make this code invalid. Hard to write this so it’s obviously safe. Think of StringView like a raw pointer. There’s always the question of why the StringView is still good after you initialize it from the result of an expression that might involve a temporary String. Generally we should avoid local variables of type StringView that are initialized from a temporary String or look like they might be. Arguments of type StringView don’t have this problem, many cases of local variable StringView that are not initialized from temporary Strings are fine, and various other idioms are OK.
> Source/WebCore/svg/SVGToOTFFontConversion.cpp:1476 > + auto segments = StringView(m_fontFaceElement->attributeWithoutSynchronization(SVGNames::font_weightAttr)).split(' '); > + for (auto segment : segments) {
I would have written this; auto& weight = m_fontFaceElement->attributeWithoutSynchronization(SVGNames::font_weightAttr); for (auto segment: StringView(weight).split(' ')) { Makes the object lifetimes more obviously correct. Otherwise, you have the puzzle of why it’s OK for the StringView to outlive the result of attributeWithoutSynchronization.
> Source/WebCore/svg/SVGToOTFFontConversion.cpp:1489 > + segments = StringView(m_fontFaceElement->attributeWithoutSynchronization(SVGNames::font_styleAttr)).split(' '); > + for (auto segment : segments) {
Same issue here.
> Source/WebCore/testing/Internals.cpp:3509 > +void Internals::setPageMuted(const String& statesString)
I think you can just change the argument type to StringView instead.
Sam Weinig
Comment 8
2017-04-18 19:08:04 PDT
(In reply to Darin Adler from
comment #7
)
> Comment on
attachment 307434
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=307434&action=review
> > > Source/WTF/wtf/HashMap.h:406 > > + return get<typename HashTableType::IdentityTranslatorType>(key); > > Why doesn’t this use IdentityTranslatorType instead of typename > HashTableType::IdentityTranslatorType? > > > Source/WebCore/accessibility/AccessibilityObject.cpp:2242 > > + if (AccessibilityRole role = ariaRoleMap().get<typename ARIARoleMap::IdentityTranslatorType>(roleName)) // Identity translator supports StringView; get() does not. > > That is so peculiar. Even with the comment it’s so non-obvious that an > "identity" translator would allow things that the get function would not > allow itself. The comment just makes me wonder why! I wonder if there is > some more elegant way to handle this. Or some clearer way to explain why > explicit use of the identity translator makes this work.
One option, though I am not 100% sure this would work, would be to change HashMap::get from: MappedPeekType get(const KeyType&) const; to template<typename K> MappedPeekType get(K&&) const; And make get call m_impl.lookup<IdentityTranslator>(std::forward<K>(k)) (or some such). I would probably make sense to also do something like static_assert that HashTraits::hash(k) is invokable, to make errors slightly more clear.
Sam Weinig
Comment 9
2017-04-18 19:13:32 PDT
(In reply to Sam Weinig from
comment #8
)
> (In reply to Darin Adler from
comment #7
) > > Comment on
attachment 307434
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=307434&action=review
> > > > > > Source/WTF/wtf/HashMap.h:406 > > > + return get<typename HashTableType::IdentityTranslatorType>(key); > > > > Why doesn’t this use IdentityTranslatorType instead of typename > > HashTableType::IdentityTranslatorType? > > > > > Source/WebCore/accessibility/AccessibilityObject.cpp:2242 > > > + if (AccessibilityRole role = ariaRoleMap().get<typename ARIARoleMap::IdentityTranslatorType>(roleName)) // Identity translator supports StringView; get() does not. > > > > That is so peculiar. Even with the comment it’s so non-obvious that an > > "identity" translator would allow things that the get function would not > > allow itself. The comment just makes me wonder why! I wonder if there is > > some more elegant way to handle this. Or some clearer way to explain why > > explicit use of the identity translator makes this work. > > One option, though I am not 100% sure this would work, would be to change > HashMap::get from: > > MappedPeekType get(const KeyType&) const; > > to > > template<typename K> > MappedPeekType get(K&&) const; > > And make get call m_impl.lookup<IdentityTranslator>(std::forward<K>(k)) (or > some such). I would probably make sense to also do something like > static_assert that HashTraits::hash(k) is invokable, to make errors slightly > more clear.
Alternatively, writing a specific Translator just for StringView might be the most straightforward and clear way to go.
Daniel Bates
Comment 10
2017-04-19 11:58:20 PDT
(In reply to Darin Adler from
comment #7
)
> Comment on
attachment 307434
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=307434&action=review
> > > Source/WTF/wtf/HashMap.h:301 > > + KeyValuePairType* entry = const_cast<HashTableType&>(m_impl).template lookup<HashMapTranslatorAdapter<KeyValuePairTraits, HashTranslator>>(value); > > Seems like this should be auto or auto* instead of KeyValuePairType*. >
I can make this change if strongly desired. I feel doing this makes the code less readable. I understand the advantages of using auto/auto* to prevent narrowing conversions and make the code future proof. Can you please enlighten me on how making your proposed change makes it easier for a person reading this code to discover that the right-hand side of the expression returns an object of type KeyValuePairType, ultimately derived from struct KeyValuePair, that exposes member fields key and value? Suppose in the future KeyValuePairType is made to be effectively an alias for type NameValuePair and a NameValuePair object has member fields name and value. For some reason a person needs to modify this function (HashMap::get<>()) to access the key in the looked up result. How does your proposed change make it easier to discover that entry->name is the way to access this information? Are you assuming that a person working on this code will be using an editor with autocompletion? Are you assuming that a person using knowledge of typical hash table/map operations and intuition could determine this? If the latter, how does a person avoid the need to guess-and-check (e.g. try entry->key then entry->name)?
> > Source/WTF/wtf/HashMap.h:406 > > + return get<typename HashTableType::IdentityTranslatorType>(key); > > Why doesn’t this use IdentityTranslatorType instead of typename > HashTableType::IdentityTranslatorType? >
Will change to use IdentityTranslatorType.
> > Source/WebCore/accessibility/AccessibilityObject.cpp:2242 > > + if (AccessibilityRole role = ariaRoleMap().get<typename ARIARoleMap::IdentityTranslatorType>(roleName)) // Identity translator supports StringView; get() does not. > > That is so peculiar. Even with the comment it’s so non-obvious that an > "identity" translator would allow things that the get function would not > allow itself. The comment just makes me wonder why! I wonder if there is > some more elegant way to handle this. Or some clearer way to explain why > explicit use of the identity translator makes this work. >
I am also not happy with this code and open to suggestions. I considered writing a HashTranslator-specific to the use case in this code, case-insensitive hashing a of a StringView. I chose not to take this approach because it would duplicate code in ASCIICaseInsensitiveHash. I also spoke to Anders Carlsson yesterday (04/19) about these approaches as well as the pecular code in question and asked for suggestions. In this discussion I brought up the idea Sam Weinig mentioned in
comment #8
about making HashMap::get() take a templated type as its parameter instead of const KeyType&. Anders felt this was error prone and could lead to subtle bugs where we may return the wrong result. Another idea is to pass ASCIICaseInsensitiveHash for "typename ARIARoleMap::IdentityTranslatorType" in the above code or to define an alias for it, maybe ASCIICaseInsensitiveStringViewHash. At the time I wrote the patch I chose against these approaches because I was worried that ASCIICaseInsensitiveStringViewHash could get out-of-sync with the hash function defined for the ARIA role map (ASCIICaseInsensitiveHash). Maybe there is nothing to worry?
> > Source/WebCore/accessibility/AccessibilityObject.cpp:3080 > > + if (Element* idElement = treeScope.getElementById(idName.toAtomicString())) > > Could consider optimizing this by using toExistingAtomicString; if that > function exists and we would also have to make sure null string is handled > properly. > > Could also use auto* here instead of Element*. >
Will use auto*.
> > Source/WebCore/platform/URL.cpp:1149 > > + return protocolIs(url.toStringWithoutCopying(), "javascript"); > > This is an inefficient algorithm, allocating memory every time. We can do > this much more efficiently by either copying and pasting the body of the > protocolIs(const String&, const char*) function or using templates to make a > StringView version. > > Also there is a very strange FIXME above protocolIs(const String&, const > char*) that says: "Why is this different than protocolIs(StringView, const > char*)?'
Oops! I meant to remove this comment as it is out-of-date.
> That seems to imply that there was a protocolIs function that took > a StringView at one time.
> Perhaps it was deleted? The FIXME should go.
Yes, the overload referenced in the comment was removed.
> > > Source/WebCore/platform/graphics/avfoundation/CDMPrivateMediaSourceAVFObjC.mm:107 > > + StringView keySystem = m_cdm->keySystem(); > > This looks unsafe. What type does m_cdm->keySystem() return? If it returns > String then this is unsafe. If it returns const String& then this is safe.
It returns a const String&.
> I worry that someone will change it from returning const String& to returning > String and then make this code invalid.
I agree assuming that CDM::keySystem() will always returns a const String& is error prone. I will change this code to read: String keySystem = m_cdm->keySystem(); // Local copy for StringView usage StringView keySystemStringView { keySystem }; ASSERT(validKeySystemRE().match(keySystem) >= 0); ... (I am open to name suggestions). And will update the remainder of the code in this function to use keySystemStringView, as applicable.
> Hard to write this so it’s obviously > safe. Think of StringView like a raw pointer. There’s always the question of > why the StringView is still good after you initialize it from the result of > an expression that might involve a temporary String. > > Generally we should avoid local variables of type StringView that are > initialized from a temporary String or look like they might be. Arguments of > type StringView don’t have this problem, many cases of local variable > StringView that are not initialized from temporary Strings are fine, and > various other idioms are OK. > > > Source/WebCore/svg/SVGToOTFFontConversion.cpp:1476 > > + auto segments = StringView(m_fontFaceElement->attributeWithoutSynchronization(SVGNames::font_weightAttr)).split(' '); > > + for (auto segment : segments) { > > I would have written this; > > auto& weight = > m_fontFaceElement->attributeWithoutSynchronization(SVGNames:: > font_weightAttr); > for (auto segment: StringView(weight).split(' ')) { > > Makes the object lifetimes more obviously correct. Otherwise, you have the > puzzle of why it’s OK for the StringView to outlive the result of > attributeWithoutSynchronization. >
Will change.
> > Source/WebCore/svg/SVGToOTFFontConversion.cpp:1489 > > + segments = StringView(m_fontFaceElement->attributeWithoutSynchronization(SVGNames::font_styleAttr)).split(' '); > > + for (auto segment : segments) { > > Same issue here. >
Will change.
> > Source/WebCore/testing/Internals.cpp:3509 > > +void Internals::setPageMuted(const String& statesString) > > I think you can just change the argument type to StringView instead.
Will change.
Daniel Bates
Comment 11
2017-04-19 12:11:09 PDT
(In reply to Darin Adler from
comment #7
)
> > Source/WebCore/accessibility/AccessibilityObject.cpp:3080 > > + if (Element* idElement = treeScope.getElementById(idName.toAtomicString())) > > Could consider optimizing this by using toExistingAtomicString; if that > function exists and we would also have to make sure null string is handled > properly. >
I did not mean to ignore this comment in my reply in
comment #10
. StringView::toExistingAtomicString() does not exist.
Darin Adler
Comment 12
2017-04-19 15:39:55 PDT
Comment on
attachment 307434
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=307434&action=review
>>> Source/WTF/wtf/HashMap.h:301 >>> + KeyValuePairType* entry = const_cast<HashTableType&>(m_impl).template lookup<HashMapTranslatorAdapter<KeyValuePairTraits, HashTranslator>>(value); >> >> Seems like this should be auto or auto* instead of KeyValuePairType*. > > I can make this change if strongly desired. I feel doing this makes the code less readable. I understand the advantages of using auto/auto* to prevent narrowing conversions and make the code future proof. Can you please enlighten me on how making your proposed change makes it easier for a person reading this code to discover that the right-hand side of the expression returns an object of type KeyValuePairType, ultimately derived from struct KeyValuePair, that exposes member fields key and value? Suppose in the future KeyValuePairType is made to be effectively an alias for type NameValuePair and a NameValuePair object has member fields name and value. For some reason a person needs to modify this function (HashMap::get<>()) to access the key in the looked up result. How does your proposed change make it easier to discover that entry->name is the way to access this information? Are you assuming that a person working on this code will be using an editor with autocompletion? Are you assuming that a person using knowledge of typical hash table/map operations and intuition could determine this? If the latter, how does a person avoid the need to guess-and-check (e.g. try entry->key then entry->name)?
If there was no null check we would likely not name the type because we wouldn’t need a local variable. The code is: look up the entry in the hash map return the value from inside that entry (unless there is none) I don’t think that the exact type used for entries in the hash map is relevant to understanding the code is correct. The code "entry->value" is self explanatory without knowing the specific type of entry.
>>> Source/WebCore/accessibility/AccessibilityObject.cpp:3080 >>> + if (Element* idElement = treeScope.getElementById(idName.toAtomicString())) >> >> Could consider optimizing this by using toExistingAtomicString; if that function exists and we would also have to make sure null string is handled properly. >> >> Could also use auto* here instead of Element*. > > Will use auto*.
At some point we should add toExistingAtomicString; it’s wasteful to create an atomic string and then destroy it in the case where the name is not already an atomic string.
>>> Source/WebCore/platform/URL.cpp:1149 >>> + return protocolIs(url.toStringWithoutCopying(), "javascript"); >> >> This is an inefficient algorithm, allocating memory every time. We can do this much more efficiently by either copying and pasting the body of the protocolIs(const String&, const char*) function or using templates to make a StringView version. >> >> Also there is a very strange FIXME above protocolIs(const String&, const char*) that says: "Why is this different than protocolIs(StringView, const char*)?' That seems to imply that there was a protocolIs function that took a StringView at one time. Perhaps it was deleted? The FIXME should go. > > Oops! I meant to remove this comment as it is out-of-date.
Removing the comment is good. Using an efficient algorithm is important in the long run; we don’t want to allocate memory just to compare a URL against a protocol.
Sam Weinig
Comment 13
2017-04-20 09:39:11 PDT
(In reply to Daniel Bates from
comment #10
)
> (In reply to Darin Adler from
comment #7
) > > Comment on
attachment 307434
[details]
> > Patch > > > > > Source/WebCore/accessibility/AccessibilityObject.cpp:2242 > > > + if (AccessibilityRole role = ariaRoleMap().get<typename ARIARoleMap::IdentityTranslatorType>(roleName)) // Identity translator supports StringView; get() does not. > > > > That is so peculiar. Even with the comment it’s so non-obvious that an > > "identity" translator would allow things that the get function would not > > allow itself. The comment just makes me wonder why! I wonder if there is > > some more elegant way to handle this. Or some clearer way to explain why > > explicit use of the identity translator makes this work. > > > > I am also not happy with this code and open to suggestions. I considered > writing a HashTranslator-specific to the use case in this code, > case-insensitive hashing a of a StringView. I chose not to take this > approach because it would duplicate code in ASCIICaseInsensitiveHash. I also > spoke to Anders Carlsson yesterday (04/19) about these approaches as well as > the pecular code in question and asked for suggestions. In this discussion I > brought up the idea Sam Weinig mentioned in
comment #8
about making > HashMap::get() take a templated type as its parameter instead of const > KeyType&. Anders felt this was error prone and could lead to subtle bugs > where we may return the wrong result.
> I am curious in what ways it would be error prone? Perhaps we could figure out a way to avoid those issues.
Sam Weinig
Comment 14
2017-04-20 09:43:40 PDT
(In reply to Darin Adler from
comment #12
)
> Comment on
attachment 307434
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=307434&action=review
> > >>> Source/WebCore/accessibility/AccessibilityObject.cpp:3080 > >>> + if (Element* idElement = treeScope.getElementById(idName.toAtomicString())) > >> > >> Could consider optimizing this by using toExistingAtomicString; if that function exists and we would also have to make sure null string is handled properly.
Alternatively, we could add an additional overload to getElementById() that takes a StringView, since getElementById() already has the optimization for String, where it does nothing, if an atomic string cannot be made. That way, we could avoid exposing the fact knowledge that if an atomic string is not present, there can be no associated element. But maybe we already expose that fact for the bindings. Either way, we should add toExistingAtomicString() to StringView since it will no doubt be useful elsewhere.
Daniel Bates
Comment 15
2017-04-28 14:30:43 PDT
(In reply to Sam Weinig from
comment #14
)
> (In reply to Darin Adler from
comment #12
) > > Comment on
attachment 307434
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=307434&action=review
> > > > >>> Source/WebCore/accessibility/AccessibilityObject.cpp:3080 > > >>> + if (Element* idElement = treeScope.getElementById(idName.toAtomicString())) > > >> > > >> Could consider optimizing this by using toExistingAtomicString; if that function exists and we would also have to make sure null string is handled properly. > > Alternatively, we could add an additional overload to getElementById() that > takes a StringView, since getElementById() already has the optimization for > String, where it does nothing, if an atomic string cannot be made. That way, > we could avoid exposing the fact knowledge that if an atomic string is not > present, there can be no associated element. But maybe we already expose > that fact for the bindings. Either way, we should add > toExistingAtomicString() to StringView since it will no doubt be useful > elsewhere.
Filed
bug #171405
to add StringView::toExistingAtomicString(). I will update this patch to add an overload of TreeScope::getElementById() that takes a StringView and is implemented in terms of StringView::toExistingAtomicString(). Notice that both the String and AtomicString constructors take a RefPtr<AtomicStringImpl> (returned by StringView::toExistingAtomicString()). I chose to also add the overload to avoid the need to disambiguate between TreeScope::getElementById(const String&) and TreeScope::getElementById(const AtomicString&) at the call site. Therefore the code at the call site will read: ... if (Element* idElement = treeScope.getElementById(idName)) ...
Daniel Bates
Comment 16
2017-04-28 15:25:55 PDT
(In reply to Darin Adler from
comment #12
)
> [...] > >>> Source/WebCore/accessibility/AccessibilityObject.cpp:3080 > >>> + if (Element* idElement = treeScope.getElementById(idName.toAtomicString())) > >> > >> Could consider optimizing this by using toExistingAtomicString; if that function exists and we would also have to make sure null string is handled properly. > >> > >> Could also use auto* here instead of Element*. > > > > Will use auto*. > > At some point we should add toExistingAtomicString; it’s wasteful to create > an atomic string and then destroy it in the case where the name is not > already an atomic string. >
Added StringView::toExistingAtomicString() in
bug #171405
. Will update patch to make use of it. See
comment #15
for more details.
> >>> Source/WebCore/platform/URL.cpp:1149 > >>> + return protocolIs(url.toStringWithoutCopying(), "javascript"); > >> > >> This is an inefficient algorithm, allocating memory every time. We can do this much more efficiently by either copying and pasting the body of the protocolIs(const String&, const char*) function or using templates to make a StringView version. > >> > >> Also there is a very strange FIXME above protocolIs(const String&, const char*) that says: "Why is this different than protocolIs(StringView, const char*)?' That seems to imply that there was a protocolIs function that took a StringView at one time. Perhaps it was deleted? The FIXME should go. > > > > Oops! I meant to remove this comment as it is out-of-date. > > Removing the comment is good. Using an efficient algorithm is important in > the long run; we don’t want to allocate memory just to compare a URL against > a protocol.
Removed out-of-date comment and added WebCore::protocolIs(StringView) with a template-based implementation in
bug #171396
. Will update patch to make use of it.
Daniel Bates
Comment 17
2017-04-28 17:22:29 PDT
Created
attachment 308626
[details]
Patch Updated patch based on feedback from Darin Adler and Sam Weinig.
Darin Adler
Comment 18
2017-05-01 09:30:22 PDT
Comment on
attachment 308626
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=308626&action=review
> Source/WTF/wtf/HashMap.h:72 > + using IdentityTranslatorType = typename HashTableType::IdentityTranslatorType;
Seems fine to have this. Not 100% sure it should be public; would probably be fine to have it private.
> Source/WebCore/accessibility/AccessibilityObject.cpp:2131 > +// FIXME: Find a way to incorporate this functionality into ASCIICaseInsensitiveHash and allow > +// a HashMap whose keys are type String to perform operations when given a key of type StringView. > +struct ASCIICaseInsensitiveStringViewHashTranslator { > + static unsigned hash(StringView key) > + { > + if (key.is8Bit()) > + return ASCIICaseInsensitiveHash::hash(key.characters8(), key.length()); > + return ASCIICaseInsensitiveHash::hash(key.characters16(), key.length()); > + } > + > + static bool equal(const String& a, StringView b) > + { > + return equalIgnoringASCIICaseCommon(a, b); > + } > +};
That comment seems fine, but it’s unclear why even in its current form this is here and not in the the header with ASCIICaseSensitiveHash.
> Source/WebCore/dom/TreeScope.cpp:122 > + if (RefPtr<AtomicStringImpl> atomicElementId = elementId.toExistingAtomicString())
I would use auto here instead of writing out RefPtr.
> Source/WebCore/html/parser/XSSAuditor.cpp:242 > +static bool semicolonSeparatedValueContainsJavaScriptURL(const String& semicolonSeparatedValue)
Should change argument type to StringView. The one call site for this function calls stripLeadingAndTrailingHTMLSpaces before calling this function. This is peculiar because it means that leading HTML spaces on the entire value are ignored, but not spaces after semicolons. I am not sure that is actually the desired behavior. Also, stripping trailing spaces is pointless; such spaces will be ignored anyway. We might want to do that stripping of leading spaces on each substring before calling protocolIsJavaScript. Note that stripLeadingAndTrailingHTMLSpaces or just stripping leading spaces are natural operations to support on a StringView rather than a String when the result is going to inspected rather than stored. There is no need to allocate a new String for the substring computed after trimming off possible spaces. Generally speaking StringView is excellent for substring operations like this one as long as the substring is not going to be turned back into a String for storage. Worth coming back to and tidying up later.
Daniel Bates
Comment 19
2017-05-02 15:48:28 PDT
(In reply to Darin Adler from
comment #18
)
> Comment on
attachment 308626
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=308626&action=review
> > > Source/WTF/wtf/HashMap.h:72 > > + using IdentityTranslatorType = typename HashTableType::IdentityTranslatorType; > > Seems fine to have this. Not 100% sure it should be public; would probably > be fine to have it private. >
Will make private.
> > Source/WebCore/accessibility/AccessibilityObject.cpp:2131 > > +// FIXME: Find a way to incorporate this functionality into ASCIICaseInsensitiveHash and allow > > +// a HashMap whose keys are type String to perform operations when given a key of type StringView. > > +struct ASCIICaseInsensitiveStringViewHashTranslator { > > + static unsigned hash(StringView key) > > + { > > + if (key.is8Bit()) > > + return ASCIICaseInsensitiveHash::hash(key.characters8(), key.length()); > > + return ASCIICaseInsensitiveHash::hash(key.characters16(), key.length()); > > + } > > + > > + static bool equal(const String& a, StringView b) > > + { > > + return equalIgnoringASCIICaseCommon(a, b); > > + } > > +}; > > That comment seems fine, but it’s unclear why even in its current form this > is here and not in the the header with ASCIICaseSensitiveHash. >
Moved to file Source/WTF/wtf/text/StringHash.h (where ASCIICaseSensitiveHash is defined).
> > Source/WebCore/dom/TreeScope.cpp:122 > > + if (RefPtr<AtomicStringImpl> atomicElementId = elementId.toExistingAtomicString()) > > I would use auto here instead of writing out RefPtr. >
Will use auto.
> > Source/WebCore/html/parser/XSSAuditor.cpp:242 > > +static bool semicolonSeparatedValueContainsJavaScriptURL(const String& semicolonSeparatedValue) > > Should change argument type to StringView. >
Will change.
> The one call site for this function calls stripLeadingAndTrailingHTMLSpaces > before calling this function. This is peculiar because it means that leading > HTML spaces on the entire value are ignored, but not spaces after > semicolons. I am not sure that is actually the desired behavior. Also, > stripping trailing spaces is pointless; such spaces will be ignored anyway. > We might want to do that stripping of leading spaces on each substring > before calling protocolIsJavaScript. Note that > stripLeadingAndTrailingHTMLSpaces or just stripping leading spaces are > natural operations to support on a StringView rather than a String when the > result is going to inspected rather than stored. There is no need to > allocate a new String for the substring computed after trimming off possible > spaces. Generally speaking StringView is excellent for substring operations > like this one as long as the substring is not going to be turned back into a > String for storage. Worth coming back to and tidying up later.
I hope you do not mind that I defer looking into this issue for now.
Daniel Bates
Comment 20
2017-05-02 15:51:48 PDT
Committed
r216102
: <
http://trac.webkit.org/changeset/216102
>
Daniel Bates
Comment 21
2017-05-02 16:08:01 PDT
Committed build fix in <
http://trac.webkit.org/changeset/216105
>.
Daniel Bates
Comment 22
2017-05-02 16:15:53 PDT
Fixed up ChangeLog entry in <
http://trac.webkit.org/changeset/216106
>.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug