WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
187864
WTF::StringView::split should have an allowEmptyEntries flag
https://bugs.webkit.org/show_bug.cgi?id=187864
Summary
WTF::StringView::split should have an allowEmptyEntries flag
Ross Kirsling
Reported
2018-07-20 10:39:20 PDT
WTF::String::split takes an allowEmptyEntries flag -- StringView should follow suit.
Attachments
Patch
(7.42 KB, patch)
2018-07-20 10:54 PDT
,
Ross Kirsling
no flags
Details
Formatted Diff
Diff
Patch
(7.50 KB, patch)
2018-07-20 13:36 PDT
,
Ross Kirsling
no flags
Details
Formatted Diff
Diff
Patch
(7.49 KB, patch)
2018-07-20 21:56 PDT
,
Ross Kirsling
no flags
Details
Formatted Diff
Diff
Patch
(7.69 KB, patch)
2018-07-23 10:07 PDT
,
Ross Kirsling
no flags
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Ross Kirsling
Comment 1
2018-07-20 10:54:34 PDT
Created
attachment 345461
[details]
Patch
Konstantin Tokarev
Comment 2
2018-07-20 11:07:39 PDT
I think this is not the best API, because by looking at call site like a.split(' ', true) you won't have a slightest idea what does it mean until you look up implementation. In Qt we call this "boolean parameter trap" [1]. It would be better to use enum value, or other kind of named and typed flag [1]
https://wiki.qt.io/API_Design_Principles#The_Boolean_Parameter_Trap
Ross Kirsling
Comment 3
2018-07-20 11:11:40 PDT
(In reply to Konstantin Tokarev from
comment #2
)
> I think this is not the best API, because by looking at call site like > > a.split(' ', true) > > you won't have a slightest idea what does it mean until you look up > implementation. In Qt we call this "boolean parameter trap" [1]. It would be > better to use enum value, or other kind of named and typed flag > > [1]
https://wiki.qt.io/API_Design_Principles#The_Boolean_Parameter_Trap
That's a fair point, but I was assuming we would want the API to align with String::split:
https://github.com/WebKit/webkit/blob/master/Source/WTF/wtf/text/WTFString.cpp#L705
Konstantin Tokarev
Comment 4
2018-07-20 11:14:39 PDT
I think it would be better to fix String::split as well, though the latter is a bit more readable: split(separator, result) vs split(separator, true, result)
Ross Kirsling
Comment 5
2018-07-20 13:36:53 PDT
Created
attachment 345478
[details]
Patch
Ross Kirsling
Comment 6
2018-07-20 21:56:19 PDT
Created
attachment 345504
[details]
Patch
Fujii Hironori
Comment 7
2018-07-22 23:04:23 PDT
Assertion fails in Debug build.
> ASSERTION FAILED: m_position < m_result.m_string.length() > c:\webkit\ga\webkitbuild\debug\derivedsources\forwardingheaders\wtf\text\stringview.h(946) : WTF::StringView::SplitResult::Iterator::operator *
Ross Kirsling
Comment 8
2018-07-23 10:07:38 PDT
Created
attachment 345579
[details]
Patch
Ross Kirsling
Comment 9
2018-07-23 10:09:29 PDT
(In reply to Fujii Hironori from
comment #7
)
> Assertion fails in Debug build. > > > ASSERTION FAILED: m_position < m_result.m_string.length() > > c:\webkit\ga\webkitbuild\debug\derivedsources\forwardingheaders\wtf\text\stringview.h(946) : WTF::StringView::SplitResult::Iterator::operator *
Whoops, thanks! Assertion updated.
Konstantin Tokarev
Comment 10
2018-07-23 10:59:59 PDT
Comment on
attachment 345579
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=345579&action=review
> Source/WTF/wtf/text/StringView.h:953 > + return m_position == other.m_position && m_isDone == other.m_isDone;
This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator?
Konstantin Tokarev
Comment 11
2018-07-23 11:04:03 PDT
Comment on
attachment 345579
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=345579&action=review
>> Source/WTF/wtf/text/StringView.h:953 >> + return m_position == other.m_position && m_isDone == other.m_isDone; > > This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries > > Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator?
Actually, it seems that m_isDone affects control flow only when assers are enabled, this means that it should be guarded with #ifndef NDEBUG, and it should be excluded from operator==
Ross Kirsling
Comment 12
2018-07-23 11:16:40 PDT
(In reply to Konstantin Tokarev from
comment #11
)
> Comment on
attachment 345579
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=345579&action=review
> > >> Source/WTF/wtf/text/StringView.h:953 > >> + return m_position == other.m_position && m_isDone == other.m_isDone; > > > > This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries > > > > Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? > > Actually, it seems that m_isDone affects control flow only when assers are > enabled, this means that it should be guarded with #ifndef NDEBUG, and it > should be excluded from operator==
No, the asserts are just asserts; I simply forgot to update them, as I was in Release... m_isDone exists to prevent having begin() == end() one iteration too early when the input string is either empty or ends in a separator. That is, when we reach the point where m_position == length(), we still have one more empty entry that must be returned. Thus m_position alone cannot suffice as an end condition.
Konstantin Tokarev
Comment 13
2018-07-23 14:57:13 PDT
Comment on
attachment 345579
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=345579&action=review
>>>> Source/WTF/wtf/text/StringView.h:953 >>>> + return m_position == other.m_position && m_isDone == other.m_isDone; >>> >>> This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries >>> >>> Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? >> >> Actually, it seems that m_isDone affects control flow only when assers are enabled, this means that it should be guarded with #ifndef NDEBUG, and it should be excluded from operator== > > No, the asserts are just asserts; I simply forgot to update them, as I was in Release... > > m_isDone exists to prevent having begin() == end() one iteration too early when the input string is either empty or ends in a separator. That is, when we reach the point where m_position == length(), we still have one more empty entry that must be returned. Thus m_position alone cannot suffice as an end condition.
Thanks for explanation, now I have better idea of what's going on here. However, I don't get why empty string (corresponding to result.m_string.isEmpty()) should produce any entries at all - it doesn't have any separators.
Ross Kirsling
Comment 14
2018-07-23 15:31:02 PDT
(In reply to Konstantin Tokarev from
comment #13
)
> Comment on
attachment 345579
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=345579&action=review
> > >>>> Source/WTF/wtf/text/StringView.h:953 > >>>> + return m_position == other.m_position && m_isDone == other.m_isDone; > >>> > >>> This looks suspicious to me. 2 iterators may have same position but differ in m_isDone value only if they have result.m_allowEmptyEntries > >>> > >>> Shouldn't we compare that value instead of m_isDone? May iterators with different allowEmptyEntries be ever considered equal? What is the usage context of this operator? > >> > >> Actually, it seems that m_isDone affects control flow only when assers are enabled, this means that it should be guarded with #ifndef NDEBUG, and it should be excluded from operator== > > > > No, the asserts are just asserts; I simply forgot to update them, as I was in Release... > > > > m_isDone exists to prevent having begin() == end() one iteration too early when the input string is either empty or ends in a separator. That is, when we reach the point where m_position == length(), we still have one more empty entry that must be returned. Thus m_position alone cannot suffice as an end condition. > > Thanks for explanation, now I have better idea of what's going on here. > However, I don't get why empty string (corresponding to > result.m_string.isEmpty()) should produce any entries at all - it doesn't > have any separators.
I'm just following the standard behavior of string split in every language I am aware of (including WTF::String::split and boost::algorithm::split). The rationale is that we want "abc".split(',') to return { "abc" } (just the original string), so that sort of implies that "".split(',') should return { "" }.
WebKit Commit Bot
Comment 15
2018-07-23 17:01:55 PDT
Comment on
attachment 345579
[details]
Patch Clearing flags on attachment: 345579 Committed
r234122
: <
https://trac.webkit.org/changeset/234122
>
WebKit Commit Bot
Comment 16
2018-07-23 17:01:57 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 17
2018-07-23 17:02:46 PDT
<
rdar://problem/42521814
>
Darin Adler
Comment 18
2018-07-23 17:38:02 PDT
I agree that using a boolean here is not good style and I am unhappy to see this pattern repeated from WTF::String rather than fixing it in both places.
Darin Adler
Comment 19
2018-07-23 17:39:28 PDT
(In reply to Darin Adler from
comment #18
)
> I agree that using a boolean here is not good style and I am unhappy to see > this pattern repeated from WTF::String rather than fixing it in both places.
Oh, I see, we used a named argument instead of a boolean -- better. However, I think we should simply have used a new function named splitAllowingEmptyEntries.
Ross Kirsling
Comment 20
2018-07-23 17:43:15 PDT
(In reply to Darin Adler from
comment #19
)
> (In reply to Darin Adler from
comment #18
) > > I agree that using a boolean here is not good style and I am unhappy to see > > this pattern repeated from WTF::String rather than fixing it in both places. > > Oh, I see, we used a named argument instead of a boolean -- better. However, > I think we should simply have used a new function named > splitAllowingEmptyEntries.
I'd be happy to address this for String and StringView alike in a new ticket -- would that be okay?
Ross Kirsling
Comment 21
2018-07-24 10:35:59 PDT
(In reply to Ross Kirsling from
comment #20
)
> I'd be happy to address this for String and StringView alike in a new ticket > -- would that be okay?
Created
https://bugs.webkit.org/show_bug.cgi?id=187963
for this purpose.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug