Bug 213929 - Convert DateComponents to use StringParsingBuffer
Summary: Convert DateComponents to use StringParsingBuffer
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Sam Weinig
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-07-03 10:37 PDT by Sam Weinig
Modified: 2020-07-30 19:06 PDT (History)
3 users (show)

See Also:


Attachments
Patch (25.92 KB, patch)
2020-07-03 10:41 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (24.41 KB, patch)
2020-07-30 09:58 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (24.59 KB, patch)
2020-07-30 11:03 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (24.77 KB, patch)
2020-07-30 14:31 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (24.21 KB, patch)
2020-07-30 17:50 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (24.65 KB, patch)
2020-07-30 18:05 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sam Weinig 2020-07-03 10:37:59 PDT
Convert DateComponents to use StringParsingBuffer
Comment 1 Sam Weinig 2020-07-03 10:41:38 PDT Comment hidden (obsolete)
Comment 2 Sam Weinig 2020-07-30 09:58:24 PDT Comment hidden (obsolete)
Comment 3 Sam Weinig 2020-07-30 11:03:59 PDT
Created attachment 405591 [details]
Patch
Comment 4 Darin Adler 2020-07-30 12:37:49 PDT
Comment on attachment 405591 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=405591&action=review

> Source/WebCore/platform/DateComponents.cpp:44
> +// HTML5 uses ISO-8601 format with year >= 1. Gregorian calendar started in

Continuing to call the HTML specification "HTML5" is quaint.

> Source/WebCore/platform/DateComponents.cpp:112
> +    skipWhile<CharacterType, isASCIIDigit>(buffer);

Is there no way to make this idiom figure out the character type from the buffer argument?

> Source/WebCore/platform/DateComponents.cpp:120
> +    if (maximumNumberOfDigitsToParse > buffer.lengthRemaining() || !maximumNumberOfDigitsToParse)
> +        return WTF::nullopt;

Subtly awkward naming that caller must *not* pass in a large value for maximumNumberOfDigitsToParse. Doesn't really feel like it’s a maximum from the caller’s point of view. The function could clamp instead of just returning.

> Source/WebCore/platform/DateComponents.cpp:142
> +    if (value && (*value < minimumValue || *value > maximumValue))

I would have written:

    if (!(value && *value >= minimumValue && *value <= maximumValue))

Handles null in a different way, but I slightly prefer it. I think one reason is that this idiom does the right thing with non-finite floating point values. Also if refactoring to use a function, it’s more logical to have an "is good" function than an "is bad" function. In fact, I kind of wish the WTF::String::isNull function had the opposite sense and had some awesome name for the same reason.

> Source/WebCore/platform/DateComponents.cpp:465
> +    StringParsingBuffer<CharacterType> temporaryBuffer = buffer;

Maybe just use auto here?

> Source/WebCore/platform/DateComponents.cpp:485
> +                        // Regardless of the number of digits, we only ever parse at most 3. All other
> +                        // digits after that are ignored, but the buffer is incremented as if they were
> +                        // all parsed.

I don’t see the code that increments the buffer as if they were all parsed.

> Source/WebCore/platform/DateComponents.cpp:659
> +    if (doubleYear < minimumYear || maximumYear < doubleYear)

Same thought about writing this in the !"is good" way instead of the "is bad" way. And here it’s literally floating point, so reversing it would let us optimize by removing the std::isfinite check above.

> Source/WebCore/platform/DateComponents.cpp:697
> +        if (m_year <= minimumYear)

Fascinating that this is "<=". I’m sure we have test coverage.
Comment 5 Sam Weinig 2020-07-30 14:25:55 PDT
(In reply to Darin Adler from comment #4)
> Comment on attachment 405591 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=405591&action=review
> 
> > Source/WebCore/platform/DateComponents.cpp:44
> > +// HTML5 uses ISO-8601 format with year >= 1. Gregorian calendar started in
> 
> Continuing to call the HTML specification "HTML5" is quaint.

Yeah. Fixed.

> 
> > Source/WebCore/platform/DateComponents.cpp:112
> > +    skipWhile<CharacterType, isASCIIDigit>(buffer);
> 
> Is there no way to make this idiom figure out the character type from the
> buffer argument?

I'll see, but so far, unless we convert to using function objects rather than function pointers, we can't deduce CharacterType.

> 
> > Source/WebCore/platform/DateComponents.cpp:120
> > +    if (maximumNumberOfDigitsToParse > buffer.lengthRemaining() || !maximumNumberOfDigitsToParse)
> > +        return WTF::nullopt;
> 
> Subtly awkward naming that caller must *not* pass in a large value for
> maximumNumberOfDigitsToParse. Doesn't really feel like it’s a maximum from
> the caller’s point of view. The function could clamp instead of just
> returning.

I think this whole maximum number concept can be removed eventually.

> 
> > Source/WebCore/platform/DateComponents.cpp:142
> > +    if (value && (*value < minimumValue || *value > maximumValue))
> 
> I would have written:
> 
>     if (!(value && *value >= minimumValue && *value <= maximumValue))
> 
> Handles null in a different way, but I slightly prefer it. I think one
> reason is that this idiom does the right thing with non-finite floating
> point values. Also if refactoring to use a function, it’s more logical to
> have an "is good" function than an "is bad" function. In fact, I kind of
> wish the WTF::String::isNull function had the opposite sense and had some
> awesome name for the same reason.

Ok. Changed.

> 
> > Source/WebCore/platform/DateComponents.cpp:465
> > +    StringParsingBuffer<CharacterType> temporaryBuffer = buffer;
> 
> Maybe just use auto here?

Fixed.

> 
> > Source/WebCore/platform/DateComponents.cpp:485
> > +                        // Regardless of the number of digits, we only ever parse at most 3. All other
> > +                        // digits after that are ignored, but the buffer is incremented as if they were
> > +                        // all parsed.
> 
> I don’t see the code that increments the buffer as if they were all parsed.

It's right below. 

                    // Only after successfully parsing and validating 'millisecond', do we increment the real buffer.
                    buffer += 1 + digitsLength;


> 
> > Source/WebCore/platform/DateComponents.cpp:659
> > +    if (doubleYear < minimumYear || maximumYear < doubleYear)
> 
> Same thought about writing this in the !"is good" way instead of the "is
> bad" way. And here it’s literally floating point, so reversing it would let
> us optimize by removing the std::isfinite check above.

Nice.
Comment 6 Sam Weinig 2020-07-30 14:29:07 PDT
(In reply to Sam Weinig from comment #5)
> (In reply to Darin Adler from comment #4)
> > Comment on attachment 405591 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=405591&action=review
> 
> > 
> > > Source/WebCore/platform/DateComponents.cpp:112
> > > +    skipWhile<CharacterType, isASCIIDigit>(buffer);
> > 
> > Is there no way to make this idiom figure out the character type from the
> > buffer argument?
> 
> I'll see, but so far, unless we convert to using function objects rather
> than function pointers, we can't deduce CharacterType.
> 

I guess an alternative option would be to just stop trying to deduce CharacterType and have two overloads of skipWhile, one for LChar, one for UChar:

template<bool characterPredicate(LChar)> void skipWhile(const LChar*& position, const LChar* end)
{
    while (position < end && characterPredicate(*position))
        ++position;
}

template<bool characterPredicate(UChar)> void skipWhile(const UChar*& position, const UChar* end)
{
    while (position < end && characterPredicate(*position))
        ++position;
}
Comment 7 Sam Weinig 2020-07-30 14:29:41 PDT
(In reply to Sam Weinig from comment #6)
> (In reply to Sam Weinig from comment #5)
> > (In reply to Darin Adler from comment #4)
> > > Comment on attachment 405591 [details]
> > > Patch
> > > 
> > > View in context:
> > > https://bugs.webkit.org/attachment.cgi?id=405591&action=review
> > 
> > > 
> > > > Source/WebCore/platform/DateComponents.cpp:112
> > > > +    skipWhile<CharacterType, isASCIIDigit>(buffer);
> > > 
> > > Is there no way to make this idiom figure out the character type from the
> > > buffer argument?
> > 
> > I'll see, but so far, unless we convert to using function objects rather
> > than function pointers, we can't deduce CharacterType.
> > 
> 
> I guess an alternative option would be to just stop trying to deduce
> CharacterType and have two overloads of skipWhile, one for LChar, one for
> UChar:
> 
> template<bool characterPredicate(LChar)> void skipWhile(const LChar*&
> position, const LChar* end)
> {
>     while (position < end && characterPredicate(*position))
>         ++position;
> }
> 
> template<bool characterPredicate(UChar)> void skipWhile(const UChar*&
> position, const UChar* end)
> {
>     while (position < end && characterPredicate(*position))
>         ++position;
> }

This is untested, might not compile.
Comment 8 Sam Weinig 2020-07-30 14:31:00 PDT Comment hidden (obsolete)
Comment 9 Sam Weinig 2020-07-30 17:50:20 PDT Comment hidden (obsolete)
Comment 10 Sam Weinig 2020-07-30 18:05:14 PDT
Created attachment 405647 [details]
Patch
Comment 11 EWS 2020-07-30 19:05:41 PDT
Committed r265126: <https://trac.webkit.org/changeset/265126>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 405647 [details].
Comment 12 Radar WebKit Bug Importer 2020-07-30 19:06:19 PDT
<rdar://problem/66359344>