Bug 159609 - Relax ordering requirements on StringView::CodePoints iterator
Summary: Relax ordering requirements on StringView::CodePoints iterator
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Myles C. Maxfield
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-10 13:01 PDT by Myles C. Maxfield
Modified: 2016-07-13 17:08 PDT (History)
9 users (show)

See Also:


Attachments
Patch (6.10 KB, patch)
2016-07-10 13:02 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (5.15 KB, patch)
2016-07-11 16:47 PDT, Myles C. Maxfield
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Myles C. Maxfield 2016-07-10 13:01:31 PDT
Relax ordering requirements on StringView::CodePoints iterator
Comment 1 Myles C. Maxfield 2016-07-10 13:02:38 PDT
Created attachment 283290 [details]
Patch
Comment 2 Myles C. Maxfield 2016-07-11 16:47:28 PDT
Created attachment 283365 [details]
Patch
Comment 3 Darin Adler 2016-07-11 19:27:09 PDT
Comment on attachment 283365 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=283365&action=review

Seems OK, but can be coded a bit tighter.

> Source/WTF/wtf/text/StringView.h:656
> +    void advanceCurrentCodePoint();

I’d just call this “advance” or (see below) not have this function at all.

> Source/WTF/wtf/text/StringView.h:660
> +    unsigned m_index;
> +    unsigned m_indexEnd;

We don’t need to add m_indexEnd. We can just change the meaning of m_index to have it point to the start of the next code point. Maybe rename it m_nextCodePointOffset.

> Source/WTF/wtf/text/StringView.h:661
> +    UChar32 m_currentCodePoint;

Not sure this needs the word “current” in its name. We didn’t call it m_currentIndex.

> Source/WTF/wtf/text/StringView.h:717
> +    , m_indexEnd(index)

As I said above, not needed.

> Source/WTF/wtf/text/StringView.h:719
> +    advanceCurrentCodePoint();

Could just write one of these lines here instead:

    ++*this;
    operator++();

Then we would not need a separate advanceCurrentCodePoint function.

> Source/WTF/wtf/text/StringView.h:728
> +    if (m_indexEnd == m_stringView.length()) {
> +        m_currentCodePoint = 0;
> +        return;
> +    }

I’m not sure I understand the importance of this. Do we need a feature where we let you safely advance past the last character and return 0 for the current code point once you do that? Maybe an assertion instead?

> Source/WTF/wtf/text/StringView.h:729
> +    UChar32 result = 0;

No need for the "= 0" here, both sides of the if below set the result.

> Source/WTF/wtf/text/StringView.h:734
> +    if (m_stringView.is8Bit())
> +        result = m_stringView.characters8()[m_indexEnd++];
> +    else
> +        U16_NEXT(m_stringView.characters16(), m_indexEnd, m_stringView.length(), result);
> +    m_currentCodePoint = result;

We do not need "result". Just use m_currentCodePoint directly.

> Source/WTF/wtf/text/StringView.h:739
> +    advanceCurrentCodePoint();

We should be asserting here that m_nextCodePointOffset < m_stringView.length().

> Source/WTF/wtf/text/StringView.h:745
> +    return m_currentCodePoint;

Seems like we should be asserting here that m_nextCodePointOffset <= m_stringView.length().

> Source/WTF/wtf/text/StringView.h:752
> +    ASSERT(!result || (m_indexEnd == other.m_indexEnd && m_currentCodePoint == other.m_currentCodePoint));

This assertion seems like overkill to me. Doesn’t really seem relevant to the job of comparing two iterators.
Comment 4 Darin Adler 2016-07-11 19:28:52 PDT
Not sure m_nextCodePointOffset is a good name. Maybe m_index is fine.
Comment 5 Myles C. Maxfield 2016-07-12 11:47:50 PDT
Comment on attachment 283365 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=283365&action=review

>> Source/WTF/wtf/text/StringView.h:660
>> +    unsigned m_indexEnd;
> 
> We don’t need to add m_indexEnd. We can just change the meaning of m_index to have it point to the start of the next code point. Maybe rename it m_nextCodePointOffset.

The index of the current code point is necessary to make sure that an iterator which points to the last code point is distinct from end().

>> Source/WTF/wtf/text/StringView.h:728
>> +    }
> 
> I’m not sure I understand the importance of this. Do we need a feature where we let you safely advance past the last character and return 0 for the current code point once you do that? Maybe an assertion instead?

This is how end() works. End represents just off the end of the StringView.
Comment 6 Myles C. Maxfield 2016-07-12 11:48:52 PDT
Committed r203119: <http://trac.webkit.org/changeset/203119>
Comment 7 Darin Adler 2016-07-12 14:38:52 PDT
Comment on attachment 283365 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=283365&action=review

>>> Source/WTF/wtf/text/StringView.h:660
>>> +    unsigned m_indexEnd;
>> 
>> We don’t need to add m_indexEnd. We can just change the meaning of m_index to have it point to the start of the next code point. Maybe rename it m_nextCodePointOffset.
> 
> The index of the current code point is necessary to make sure that an iterator which points to the last code point is distinct from end().

Or we could just use a boolean for that, which I think would be clearer than having two index values. Or we could use a special index value for end(), such as 0xFFFFFFFF, which can never be an index into a StringView, since it’s more than the maximum string length. Or we could use Optional<unsigned> for the index.

>>> Source/WTF/wtf/text/StringView.h:728
>>> +    }
>> 
>> I’m not sure I understand the importance of this. Do we need a feature where we let you safely advance past the last character and return 0 for the current code point once you do that? Maybe an assertion instead?
> 
> This is how end() works. End represents just off the end of the StringView.

Sure, but why set m_currentCodePoint since it’s illegal to look at it?
Comment 8 Myles C. Maxfield 2016-07-13 17:08:24 PDT
(In reply to comment #7)
> Comment on attachment 283365 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=283365&action=review
> 
> >>> Source/WTF/wtf/text/StringView.h:660
> >>> +    unsigned m_indexEnd;
> >> 
> >> We don’t need to add m_indexEnd. We can just change the meaning of m_index to have it point to the start of the next code point. Maybe rename it m_nextCodePointOffset.
> > 
> > The index of the current code point is necessary to make sure that an iterator which points to the last code point is distinct from end().
> 
> Or we could just use a boolean for that, which I think would be clearer than
> having two index values. Or we could use a special index value for end(),
> such as 0xFFFFFFFF, which can never be an index into a StringView, since
> it’s more than the maximum string length. Or we could use Optional<unsigned>
> for the index.
> 
> >>> Source/WTF/wtf/text/StringView.h:728
> >>> +    }
> >> 
> >> I’m not sure I understand the importance of this. Do we need a feature where we let you safely advance past the last character and return 0 for the current code point once you do that? Maybe an assertion instead?
> > 
> > This is how end() works. End represents just off the end of the StringView.
> 
> Sure, but why set m_currentCodePoint since it’s illegal to look at it?

Updated in https://bugs.webkit.org/show_bug.cgi?id=159749