Bug 144314

Summary: String#startsWith/endsWith/includes don't handle Infinity position/endPosition args correctly
Product: WebKit Reporter: Jordan Harband <ljharb>
Component: JavaScriptCoreAssignee: Jordan Harband <ljharb>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, commit-queue, darin, fpizlo, simon.fraser, ysuzuki
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
URL: https://people.mozilla.org/~jorendorff/es6-draft.html#sec-string.prototype.startswith
See Also: https://bugs.webkit.org/show_bug.cgi?id=244196
Attachments:
Description Flags
Patch
none
Patch
none
Patch none

Description Jordan Harband 2015-04-27 22:32:29 PDT
`'abc'.startsWith('a', Infinity)` should return `false`, but currently returns `true`.
Per https://people.mozilla.org/~jorendorff/es6-draft.html#sec-string.prototype.startswith, step 12, `Let start be min(max(pos, 0), len).` - `pos` is `Infinity`, `max(Infinity, 0)` returns `Infinity`, and `min(Infinity, len)` returns `len`. Thus, the search must start after the end of the string, meaning `true` is impossible, as if `position` was the length of the string.

`'abc'.endsWith('bc', Infinity))` should return `true`, but currently returns `false`.
Per https://people.mozilla.org/~jorendorff/es6-draft.html#sec-string.prototype.endswith, step 12, `Let end be min(max(pos, 0), len).` - `pos` is `Infinity`, `max(Infinity, 0)` returns `Infinity`, and `min(Infinity, len)` returns `len`. Thus, in step 14, `Let start be end - searchLength.`, `start` is 0, so the entire string should be searched, as if `endPosition` was `undefined`.
Comment 1 Jordan Harband 2015-04-29 14:54:15 PDT
Created attachment 251988 [details]
Patch
Comment 2 Darin Adler 2015-04-29 19:27:16 PDT
Comment on attachment 251988 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=251988&action=review

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1678
> +    return static_cast<int32_t>(value);

Why cast to int32_t instead of casting to unsigned?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1700
> +    unsigned len = stringToSearchIn.length();

Could you please use the word “length” instead of the abbreviation “len” here?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1703
> +    if (positionArg.isUInt32())
> +        start = positionArg.asUInt32();

I think this should say:

    if (positionArg.isInt32())
        start = std::max(0, positionArg.asInt32());

Covers more cases with the fast path code, and I see no downside to it.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1708
> +        double position = positionArg.toInteger(exec);
> +        if (exec->hadException())
> +            return JSValue::encode(jsUndefined());
> +        start = clampInt32(position, 0, len);

I would write this without the local variable:

    start = clampInt32(positionArg.toInteger(exec), 0, length);
    if (exec->hadException())
        return JSValue::encode(jsUndefined());

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1712
> +    if ((searchString.length() + start) > len)
> +        return JSValue::encode(jsBoolean(false));

What prevents searchString.length() + start from overflowing? I think this should say:

    if (searchString.length() > len - start)

But also, I don’t understand why this check is needed. Doesn’t stringToSearchIn.hasInfixStartingAt handle this properly? Does removing this check cause a test to fail?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1735
> +    unsigned len = stringToSearchIn.length();

Could you please use the word “length” instead of the abbreviation “len” here?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1740
> +    if (endPositionArg.isUInt32())
> +        end = endPositionArg.asUInt32();

I think this should say:

    if (endPositionArg.isInt32())
        end = std::max(0, endPositionArg.asInt32());

Covers more cases with the fast path code, and I see no downside to it.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1745
> +        double pos = endPositionArg.toInteger(exec);
> +        if (exec->hadException())
> +            return JSValue::encode(jsUndefined());
> +        end = clampInt32(pos, 0, len);

Could you please use the word “position” instead of the abbreviation “pos” here?

Or better, write this without the local variable:

    end = clampInt32(endPositionArg.toInteger(exec), 0, length);
    if (exec->hadException())
        return JSValue::encode(jsUndefined());

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1749
> +    if (end < searchString.length())
> +        return JSValue::encode(jsBoolean(false));

I don’t understand why this check is needed. Doesn’t stringToSearchIn.hasInfixStartingAt handle this properly? Does removing this check cause a test to fail?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1773
> +    unsigned len = stringToSearchIn.length();

Could you please use the word “length” instead of the abbreviation “len” here?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1776
> +    if (positionArg.isUInt32())
> +        start = positionArg.asUInt32();

I think this should say:

    if (positionArg.isInt32())
        start = std::max(0, positionArg.asInt32());

Covers more cases with the fast path code, and I see no downside to it.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1781
> +        double position = positionArg.toInteger(exec);
> +        if (exec->hadException())
> +            return JSValue::encode(jsUndefined());
> +        start = clampInt32(position, 0, len);

I would write this without the local variable:

    start = clampInt32(positionArg.toInteger(exec), 0, length);
    if (exec->hadException())
        return JSValue::encode(jsUndefined());
Comment 3 Jordan Harband 2015-04-29 21:06:40 PDT
> Why cast to int32_t instead of casting to unsigned?
No particular reason, I'll change it.

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:1700
> > +    unsigned len = stringToSearchIn.length();
> 
> Could you please use the word “length” instead of the abbreviation “len”
> here?
Sure - I used "len" because that's what the spec uses. I feel there's a lot of value in using the same steps and variable names as the spec when possible (there's also numerous examples in this file of "len" usage). I'll change it to "length" throughout my patch.

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:1703
> > +    if (positionArg.isUInt32())
> > +        start = positionArg.asUInt32();
> 
> I think this should say:
> 
>     if (positionArg.isInt32())
>         start = std::max(0, positionArg.asInt32());
> 
> Covers more cases with the fast path code, and I see no downside to it.
Sounds good! Only downside I can see is the extra "max" call in the UInt32 case, but I suppose that's negligible.

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:1708
> > +        double position = positionArg.toInteger(exec);
> > +        if (exec->hadException())
> > +            return JSValue::encode(jsUndefined());
> > +        start = clampInt32(position, 0, len);
> 
> I would write this without the local variable:
> 
>     start = clampInt32(positionArg.toInteger(exec), 0, length);
>     if (exec->hadException())
>         return JSValue::encode(jsUndefined());
Done (I'd thought that avoiding the clampInt32 call when there was an exception would be more performant)

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:1712
> > +    if ((searchString.length() + start) > len)
> > +        return JSValue::encode(jsBoolean(false));
> 
> What prevents searchString.length() + start from overflowing? I think this
> should say:
> 
>     if (searchString.length() > len - start)
Good call, I hadn't thought about overflow.

> But also, I don’t understand why this check is needed. Doesn’t
> stringToSearchIn.hasInfixStartingAt handle this properly? Does removing this
> check cause a test to fail?
Removing this check does not cause tests to fail, but since it's explicitly called out in the spec, and because it seems more performant to avoid the hasInfixStartingAt call entirely in that case, I'd left it in. I'll remove these checks.
Comment 4 Jordan Harband 2015-04-29 21:14:40 PDT
Created attachment 252036 [details]
Patch
Comment 5 Darin Adler 2015-04-30 14:06:52 PDT
(In reply to comment #3)
> > I think this should say:
> > 
> >     if (positionArg.isInt32())
> >         start = std::max(0, positionArg.asInt32());
> > 
> > Covers more cases with the fast path code, and I see no downside to it.
> Sounds good! Only downside I can see is the extra "max" call in the UInt32
> case, but I suppose that's negligible.

It’s less of a downside than you might think. The isUInt32 function includes a comparison with zero; its name is misleading, as it’s really just a helper that checks for a nonnegative Int32. So by using isInt32 instead, we trade that comparison with zero for the same comparison with zero done as part of std::max. So not just negligible, but probably nonexistent!

> > I would write this without the local variable:
> > 
> >     start = clampInt32(positionArg.toInteger(exec), 0, length);
> >     if (exec->hadException())
> >         return JSValue::encode(jsUndefined());
> Done (I'd thought that avoiding the clampInt32 call when there was an
> exception would be more performant)

It’s worth thinking about that type of thing, but a tiny bit better performance when there is an exception is not something should worry about optimizing for.

> Removing this check does not cause tests to fail, but since it's explicitly
> called out in the spec, and because it seems more performant to avoid the
> hasInfixStartingAt call entirely in that case, I'd left it in. I'll remove
> these checks.

It might have seemed like a performance optimization and it did make that case slightly faster. But by adding that branch we made all other cases slower!
Comment 6 Darin Adler 2015-04-30 14:10:57 PDT
Comment on attachment 252036 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=252036&action=review

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1672
> +static inline unsigned clampInt32(double value, unsigned min, unsigned max)

It occurs to me that this function name is not good. The argument types are 32-bit unsigned values, not what we normally mean by “Int32”. It also worth noting that this function truncates as well as clamping. Would be nice to keep the name brief but I think we can do better, maybe clampAndTruncateToUnsigned?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1700
> +    unsigned length = stringToSearchIn.length();

I just noticed that this length is only used in the clamp call below. We should slightly optimize by not putting it into a local variable in the isInt32 code path.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1743
> +    return JSValue::encode(jsBoolean(stringToSearchIn.hasInfixEndingAt(searchString, std::min<unsigned>(end, length))));

This should just be std::min, not std::min<unsigned>. Both arguments are unsigned so there is no need to specify the type explicitly.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1765
> +    unsigned length = stringToSearchIn.length();

I just noticed that this length is only used in the clamp call below. We should slightly optimize by not putting it into a local variable in the isInt32 code path.
Comment 7 Jordan Harband 2015-04-30 16:09:39 PDT
Thanks, all fixed in the next patch
Comment 8 Jordan Harband 2015-04-30 16:41:58 PDT
Created attachment 252113 [details]
Patch
Comment 9 Darin Adler 2015-05-01 14:57:43 PDT
Comment on attachment 252113 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=252113&action=review

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1707
> +        if (exec->hadException())
> +            return JSValue::encode(jsUndefined());

If we cared to, we could get away with leaving out this early exit. We would see if we wrote a test that the code has no effect other than performance optimization; roughly speaking, there’d be no way to write a test that would notice the presence or absence of this code. That’s because the code below has no observable side effects. It’s hard to build tests that cover exception handling in cases like this -- we need to pass objects in that have custom valueOf functions with side effects or something like that. No reason you have to do that in this patch, though.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1740
> +        if (exec->hadException())
> +            return JSValue::encode(jsUndefined());

Same thing.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1772
> +        if (exec->hadException())
> +            return JSValue::encode(jsUndefined());

Same here.
Comment 10 WebKit Commit Bot 2015-05-01 15:45:47 PDT
Comment on attachment 252113 [details]
Patch

Clearing flags on attachment: 252113

Committed r183694: <http://trac.webkit.org/changeset/183694>
Comment 11 WebKit Commit Bot 2015-05-01 15:45:52 PDT
All reviewed patches have been landed.  Closing bug.