RESOLVED FIXED 187864
WTF::StringView::split should have an allowEmptyEntries flag
https://bugs.webkit.org/show_bug.cgi?id=187864
Summary WTF::StringView::split should have an allowEmptyEntries flag
Ross Kirsling
Reported 2018-07-20 10:39:20 PDT
WTF::String::split takes an allowEmptyEntries flag -- StringView should follow suit.
Attachments
Patch (7.42 KB, patch)
2018-07-20 10:54 PDT, Ross Kirsling
no flags
Patch (7.50 KB, patch)
2018-07-20 13:36 PDT, Ross Kirsling
no flags
Patch (7.49 KB, patch)
2018-07-20 21:56 PDT, Ross Kirsling
no flags
Patch (7.69 KB, patch)
2018-07-23 10:07 PDT, Ross Kirsling
no flags
Ross Kirsling
Comment 1 2018-07-20 10:54:34 PDT
Konstantin Tokarev
Comment 2 2018-07-20 11:07:39 PDT
I think this is not the best API, because by looking at call site like a.split(' ', true) you won't have a slightest idea what does it mean until you look up implementation. In Qt we call this "boolean parameter trap" [1]. It would be better to use enum value, or other kind of named and typed flag [1] https://wiki.qt.io/API_Design_Principles#The_Boolean_Parameter_Trap
Ross Kirsling
Comment 3 2018-07-20 11:11:40 PDT
(In reply to Konstantin Tokarev from comment #2) > I think this is not the best API, because by looking at call site like > > a.split(' ', true) > > you won't have a slightest idea what does it mean until you look up > implementation. In Qt we call this "boolean parameter trap" [1]. It would be > better to use enum value, or other kind of named and typed flag > > [1] https://wiki.qt.io/API_Design_Principles#The_Boolean_Parameter_Trap That's a fair point, but I was assuming we would want the API to align with String::split: https://github.com/WebKit/webkit/blob/master/Source/WTF/wtf/text/WTFString.cpp#L705
Konstantin Tokarev
Comment 4 2018-07-20 11:14:39 PDT
I think it would be better to fix String::split as well, though the latter is a bit more readable: split(separator, result) vs split(separator, true, result)
Ross Kirsling
Comment 5 2018-07-20 13:36:53 PDT
Ross Kirsling
Comment 6 2018-07-20 21:56:19 PDT
Fujii Hironori
Comment 7 2018-07-22 23:04:23 PDT
Assertion fails in Debug build. > ASSERTION FAILED: m_position < m_result.m_string.length() > c:\webkit\ga\webkitbuild\debug\derivedsources\forwardingheaders\wtf\text\stringview.h(946) : WTF::StringView::SplitResult::Iterator::operator *
Ross Kirsling
Comment 8 2018-07-23 10:07:38 PDT
Ross Kirsling
Comment 9 2018-07-23 10:09:29 PDT
(In reply to Fujii Hironori from comment #7) > Assertion fails in Debug build. > > > ASSERTION FAILED: m_position < m_result.m_string.length() > > c:\webkit\ga\webkitbuild\debug\derivedsources\forwardingheaders\wtf\text\stringview.h(946) : WTF::StringView::SplitResult::Iterator::operator * Whoops, thanks! Assertion updated.
Konstantin Tokarev
Comment 10 2018-07-23 10:59:59 PDT
Comment on attachment 345579 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=345579&action=review > Source/WTF/wtf/text/StringView.h:953 > + return m_position == other.m_position && m_isDone == other.m_isDone; This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator?
Konstantin Tokarev
Comment 11 2018-07-23 11:04:03 PDT
Comment on attachment 345579 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=345579&action=review >> Source/WTF/wtf/text/StringView.h:953 >> + return m_position == other.m_position && m_isDone == other.m_isDone; > > This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries > > Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? Actually, it seems that m_isDone affects control flow only when assers are enabled, this means that it should be guarded with #ifndef NDEBUG, and it should be excluded from operator==
Ross Kirsling
Comment 12 2018-07-23 11:16:40 PDT
(In reply to Konstantin Tokarev from comment #11) > Comment on attachment 345579 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=345579&action=review > > >> Source/WTF/wtf/text/StringView.h:953 > >> + return m_position == other.m_position && m_isDone == other.m_isDone; > > > > This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries > > > > Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? > > Actually, it seems that m_isDone affects control flow only when assers are > enabled, this means that it should be guarded with #ifndef NDEBUG, and it > should be excluded from operator== No, the asserts are just asserts; I simply forgot to update them, as I was in Release... m_isDone exists to prevent having begin() == end() one iteration too early when the input string is either empty or ends in a separator. That is, when we reach the point where m_position == length(), we still have one more empty entry that must be returned. Thus m_position alone cannot suffice as an end condition.
Konstantin Tokarev
Comment 13 2018-07-23 14:57:13 PDT
Comment on attachment 345579 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=345579&action=review >>>> Source/WTF/wtf/text/StringView.h:953 >>>> + return m_position == other.m_position && m_isDone == other.m_isDone; >>> >>> This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries >>> >>> Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? >> >> Actually, it seems that m_isDone affects control flow only when assers are enabled, this means that it should be guarded with #ifndef NDEBUG, and it should be excluded from operator== > > No, the asserts are just asserts; I simply forgot to update them, as I was in Release... > > m_isDone exists to prevent having begin() == end() one iteration too early when the input string is either empty or ends in a separator. That is, when we reach the point where m_position == length(), we still have one more empty entry that must be returned. Thus m_position alone cannot suffice as an end condition. Thanks for explanation, now I have better idea of what's going on here. However, I don't get why empty string (corresponding to result.m_string.isEmpty()) should produce any entries at all - it doesn't have any separators.
Ross Kirsling
Comment 14 2018-07-23 15:31:02 PDT
(In reply to Konstantin Tokarev from comment #13) > Comment on attachment 345579 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=345579&action=review > > >>>> Source/WTF/wtf/text/StringView.h:953 > >>>> + return m_position == other.m_position && m_isDone == other.m_isDone; > >>> > >>> This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries > >>> > >>> Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? > >> > >> Actually, it seems that m_isDone affects control flow only when assers are enabled, this means that it should be guarded with #ifndef NDEBUG, and it should be excluded from operator== > > > > No, the asserts are just asserts; I simply forgot to update them, as I was in Release... > > > > m_isDone exists to prevent having begin() == end() one iteration too early when the input string is either empty or ends in a separator. That is, when we reach the point where m_position == length(), we still have one more empty entry that must be returned. Thus m_position alone cannot suffice as an end condition. > > Thanks for explanation, now I have better idea of what's going on here. > However, I don't get why empty string (corresponding to > result.m_string.isEmpty()) should produce any entries at all - it doesn't > have any separators. I'm just following the standard behavior of string split in every language I am aware of (including WTF::String::split and boost::algorithm::split). The rationale is that we want "abc".split(',') to return { "abc" } (just the original string), so that sort of implies that "".split(',') should return { "" }.
WebKit Commit Bot
Comment 15 2018-07-23 17:01:55 PDT
Comment on attachment 345579 [details] Patch Clearing flags on attachment: 345579 Committed r234122: <https://trac.webkit.org/changeset/234122>
WebKit Commit Bot
Comment 16 2018-07-23 17:01:57 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 17 2018-07-23 17:02:46 PDT
Darin Adler
Comment 18 2018-07-23 17:38:02 PDT
I agree that using a boolean here is not good style and I am unhappy to see this pattern repeated from WTF::String rather than fixing it in both places.
Darin Adler
Comment 19 2018-07-23 17:39:28 PDT
(In reply to Darin Adler from comment #18) > I agree that using a boolean here is not good style and I am unhappy to see > this pattern repeated from WTF::String rather than fixing it in both places. Oh, I see, we used a named argument instead of a boolean -- better. However, I think we should simply have used a new function named splitAllowingEmptyEntries.
Ross Kirsling
Comment 20 2018-07-23 17:43:15 PDT
(In reply to Darin Adler from comment #19) > (In reply to Darin Adler from comment #18) > > I agree that using a boolean here is not good style and I am unhappy to see > > this pattern repeated from WTF::String rather than fixing it in both places. > > Oh, I see, we used a named argument instead of a boolean -- better. However, > I think we should simply have used a new function named > splitAllowingEmptyEntries. I'd be happy to address this for String and StringView alike in a new ticket -- would that be okay?
Ross Kirsling
Comment 21 2018-07-24 10:35:59 PDT
(In reply to Ross Kirsling from comment #20) > I'd be happy to address this for String and StringView alike in a new ticket > -- would that be okay? Created https://bugs.webkit.org/show_bug.cgi?id=187963 for this purpose.
Note You need to log in before you can comment on or make changes to this bug.