Bug 213929

Summary: Convert DateComponents to use StringParsingBuffer
Product: WebKit Reporter: Sam Weinig <sam>
Component: New BugsAssignee: Sam Weinig <sam>
Status: RESOLVED FIXED    
Severity: Normal CC: andersca, darin, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch none

Description Sam Weinig 2020-07-03 10:37:59 PDT
Convert DateComponents to use StringParsingBuffer
Comment 1 Sam Weinig 2020-07-03 10:41:38 PDT Comment hidden (obsolete)
Comment 2 Sam Weinig 2020-07-30 09:58:24 PDT Comment hidden (obsolete)
Comment 3 Sam Weinig 2020-07-30 11:03:59 PDT
Created attachment 405591 [details]
Patch
Comment 4 Darin Adler 2020-07-30 12:37:49 PDT
Comment on attachment 405591 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405591&action=review

> Source/WebCore/platform/DateComponents.cpp:44
> +// HTML5 uses ISO-8601 format with year >= 1. Gregorian calendar started in

Continuing to call the HTML specification "HTML5" is quaint.

> Source/WebCore/platform/DateComponents.cpp:112
> +    skipWhile<CharacterType, isASCIIDigit>(buffer);

Is there no way to make this idiom figure out the character type from the buffer argument?

> Source/WebCore/platform/DateComponents.cpp:120
> +    if (maximumNumberOfDigitsToParse > buffer.lengthRemaining() || !maximumNumberOfDigitsToParse)
> +        return WTF::nullopt;

Subtly awkward naming that caller must *not* pass in a large value for maximumNumberOfDigitsToParse. Doesn't really feel like it’s a maximum from the caller’s point of view. The function could clamp instead of just returning.

> Source/WebCore/platform/DateComponents.cpp:142
> +    if (value && (*value < minimumValue || *value > maximumValue))

I would have written:

    if (!(value && *value >= minimumValue && *value <= maximumValue))

Handles null in a different way, but I slightly prefer it. I think one reason is that this idiom does the right thing with non-finite floating point values. Also if refactoring to use a function, it’s more logical to have an "is good" function than an "is bad" function. In fact, I kind of wish the WTF::String::isNull function had the opposite sense and had some awesome name for the same reason.

> Source/WebCore/platform/DateComponents.cpp:465
> +    StringParsingBuffer<CharacterType> temporaryBuffer = buffer;

Maybe just use auto here?

> Source/WebCore/platform/DateComponents.cpp:485
> +                        // Regardless of the number of digits, we only ever parse at most 3. All other
> +                        // digits after that are ignored, but the buffer is incremented as if they were
> +                        // all parsed.

I don’t see the code that increments the buffer as if they were all parsed.

> Source/WebCore/platform/DateComponents.cpp:659
> +    if (doubleYear < minimumYear || maximumYear < doubleYear)

Same thought about writing this in the !"is good" way instead of the "is bad" way. And here it’s literally floating point, so reversing it would let us optimize by removing the std::isfinite check above.

> Source/WebCore/platform/DateComponents.cpp:697
> +        if (m_year <= minimumYear)

Fascinating that this is "<=". I’m sure we have test coverage.
Comment 5 Sam Weinig 2020-07-30 14:25:55 PDT
(In reply to Darin Adler from comment #4)
> Comment on attachment 405591 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=405591&action=review
> 
> > Source/WebCore/platform/DateComponents.cpp:44
> > +// HTML5 uses ISO-8601 format with year >= 1. Gregorian calendar started in
> 
> Continuing to call the HTML specification "HTML5" is quaint.

Yeah. Fixed.

> 
> > Source/WebCore/platform/DateComponents.cpp:112
> > +    skipWhile<CharacterType, isASCIIDigit>(buffer);
> 
> Is there no way to make this idiom figure out the character type from the
> buffer argument?

I'll see, but so far, unless we convert to using function objects rather than function pointers, we can't deduce CharacterType.

> 
> > Source/WebCore/platform/DateComponents.cpp:120
> > +    if (maximumNumberOfDigitsToParse > buffer.lengthRemaining() || !maximumNumberOfDigitsToParse)
> > +        return WTF::nullopt;
> 
> Subtly awkward naming that caller must *not* pass in a large value for
> maximumNumberOfDigitsToParse. Doesn't really feel like it’s a maximum from
> the caller’s point of view. The function could clamp instead of just
> returning.

I think this whole maximum number concept can be removed eventually.

> 
> > Source/WebCore/platform/DateComponents.cpp:142
> > +    if (value && (*value < minimumValue || *value > maximumValue))
> 
> I would have written:
> 
>     if (!(value && *value >= minimumValue && *value <= maximumValue))
> 
> Handles null in a different way, but I slightly prefer it. I think one
> reason is that this idiom does the right thing with non-finite floating
> point values. Also if refactoring to use a function, it’s more logical to
> have an "is good" function than an "is bad" function. In fact, I kind of
> wish the WTF::String::isNull function had the opposite sense and had some
> awesome name for the same reason.

Ok. Changed.

> 
> > Source/WebCore/platform/DateComponents.cpp:465
> > +    StringParsingBuffer<CharacterType> temporaryBuffer = buffer;
> 
> Maybe just use auto here?

Fixed.

> 
> > Source/WebCore/platform/DateComponents.cpp:485
> > +                        // Regardless of the number of digits, we only ever parse at most 3. All other
> > +                        // digits after that are ignored, but the buffer is incremented as if they were
> > +                        // all parsed.
> 
> I don’t see the code that increments the buffer as if they were all parsed.

It's right below. 

                    // Only after successfully parsing and validating 'millisecond', do we increment the real buffer.
                    buffer += 1 + digitsLength;


> 
> > Source/WebCore/platform/DateComponents.cpp:659
> > +    if (doubleYear < minimumYear || maximumYear < doubleYear)
> 
> Same thought about writing this in the !"is good" way instead of the "is
> bad" way. And here it’s literally floating point, so reversing it would let
> us optimize by removing the std::isfinite check above.

Nice.
Comment 6 Sam Weinig 2020-07-30 14:29:07 PDT
(In reply to Sam Weinig from comment #5)
> (In reply to Darin Adler from comment #4)
> > Comment on attachment 405591 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=405591&action=review
> 
> > 
> > > Source/WebCore/platform/DateComponents.cpp:112
> > > +    skipWhile<CharacterType, isASCIIDigit>(buffer);
> > 
> > Is there no way to make this idiom figure out the character type from the
> > buffer argument?
> 
> I'll see, but so far, unless we convert to using function objects rather
> than function pointers, we can't deduce CharacterType.
> 

I guess an alternative option would be to just stop trying to deduce CharacterType and have two overloads of skipWhile, one for LChar, one for UChar:

template<bool characterPredicate(LChar)> void skipWhile(const LChar*& position, const LChar* end)
{
    while (position < end && characterPredicate(*position))
        ++position;
}

template<bool characterPredicate(UChar)> void skipWhile(const UChar*& position, const UChar* end)
{
    while (position < end && characterPredicate(*position))
        ++position;
}
Comment 7 Sam Weinig 2020-07-30 14:29:41 PDT
(In reply to Sam Weinig from comment #6)
> (In reply to Sam Weinig from comment #5)
> > (In reply to Darin Adler from comment #4)
> > > Comment on attachment 405591 [details]
> > > Patch
> > > 
> > > View in context:
> > > https://bugs.webkit.org/attachment.cgi?id=405591&action=review
> > 
> > > 
> > > > Source/WebCore/platform/DateComponents.cpp:112
> > > > +    skipWhile<CharacterType, isASCIIDigit>(buffer);
> > > 
> > > Is there no way to make this idiom figure out the character type from the
> > > buffer argument?
> > 
> > I'll see, but so far, unless we convert to using function objects rather
> > than function pointers, we can't deduce CharacterType.
> > 
> 
> I guess an alternative option would be to just stop trying to deduce
> CharacterType and have two overloads of skipWhile, one for LChar, one for
> UChar:
> 
> template<bool characterPredicate(LChar)> void skipWhile(const LChar*&
> position, const LChar* end)
> {
>     while (position < end && characterPredicate(*position))
>         ++position;
> }
> 
> template<bool characterPredicate(UChar)> void skipWhile(const UChar*&
> position, const UChar* end)
> {
>     while (position < end && characterPredicate(*position))
>         ++position;
> }

This is untested, might not compile.
Comment 8 Sam Weinig 2020-07-30 14:31:00 PDT Comment hidden (obsolete)
Comment 9 Sam Weinig 2020-07-30 17:50:20 PDT Comment hidden (obsolete)
Comment 10 Sam Weinig 2020-07-30 18:05:14 PDT
Created attachment 405647 [details]
Patch
Comment 11 EWS 2020-07-30 19:05:41 PDT
Committed r265126: <https://trac.webkit.org/changeset/265126>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 405647 [details].
Comment 12 Radar WebKit Bug Importer 2020-07-30 19:06:19 PDT
<rdar://problem/66359344>