Bug 140047 - Implement ES6 String.prototype.repeat(count)
Summary: Implement ES6 String.prototype.repeat(count)
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Yusuke Suzuki
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-02 17:34 PST by Yusuke Suzuki
Modified: 2015-01-06 12:04 PST (History)
4 users (show)

See Also:


Attachments
Patch (10.07 KB, patch)
2015-01-02 18:02 PST, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews105 for mac-mountainlion-wk2 (515.56 KB, application/zip)
2015-01-02 18:35 PST, Build Bot
no flags Details
Archive of layout-test-results from ews100 for mac-mountainlion (516.06 KB, application/zip)
2015-01-02 18:48 PST, Build Bot
no flags Details
Patch (14.69 KB, patch)
2015-01-02 18:51 PST, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (16.90 KB, patch)
2015-01-06 06:10 PST, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (17.50 KB, patch)
2015-01-06 06:36 PST, Yusuke Suzuki
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yusuke Suzuki 2015-01-02 17:34:52 PST
Implement ES6 String.prototype.repeat(count)
Comment 1 Yusuke Suzuki 2015-01-02 18:02:07 PST
Created attachment 243909 [details]
Patch
Comment 2 Build Bot 2015-01-02 18:35:11 PST
Comment on attachment 243909 [details]
Patch

Attachment 243909 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/6324153409863680

New failing tests:
js/Object-getOwnPropertyNames.html
Comment 3 Build Bot 2015-01-02 18:35:13 PST
Created attachment 243910 [details]
Archive of layout-test-results from ews105 for mac-mountainlion-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews105  Port: mac-mountainlion-wk2  Platform: Mac OS X 10.8.5
Comment 4 Build Bot 2015-01-02 18:48:17 PST
Comment on attachment 243909 [details]
Patch

Attachment 243909 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.appspot.com/results/6369738045259776

New failing tests:
js/Object-getOwnPropertyNames.html
Comment 5 Build Bot 2015-01-02 18:48:19 PST
Created attachment 243911 [details]
Archive of layout-test-results from ews100 for mac-mountainlion

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews100  Port: mac-mountainlion  Platform: Mac OS X 10.8.5
Comment 6 Yusuke Suzuki 2015-01-02 18:51:50 PST
Created attachment 243912 [details]
Patch
Comment 7 Darin Adler 2015-01-04 13:34:30 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

Looks generally good. Needs a little work.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:688
> +    JSValue a0 = exec->argument(0);
> +    double n = a0.toInteger(exec);

No need to put argument 0 into a local variable since we are using it only once.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:692
> +    if (n < 0 || !std::isfinite(n))
> +        return throwVMError(exec, createRangeError(exec, ASCIILiteral("repeat() argument must be greater than or equal to 0 and not be infinity.")));

This error message doesn’t seem quite right for NaN.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:694
> +    if ((n * string->length()) > std::numeric_limits<int32_t>::max())

We don’t normally use parentheses for an expression like this. We are used to recognizing that arithmetic operators like * bind more tightly that comparison operators like >.

We should probably cut down a little bit on the floating point math. We can do this:

    if (n > std::numeric_limits<int32_t>::max() / string->length())

I don’t understand why the math is being done with int32_t rather than unsigned. String lengths are unsigned, not signed, so it doesn’t make sense to mention <int32_t>::max() here rather than <unsigned>::max().

> Source/JavaScriptCore/runtime/StringPrototype.cpp:712
> +    JSString* result = jsEmptyString(exec);
> +    int32_t count = static_cast<int32_t>(n);
> +    while (count > 0) {
> +        if (count & 1) {
> +            JSValue concatenated = jsString(exec, result, string);
> +            if (exec->hadException())
> +                return JSValue::encode(jsUndefined());
> +            result = jsCast<JSString*>(concatenated);
> +        }
> +        count >>= 1;
> +        JSValue concatenated = jsString(exec, string, string);
> +        if (exec->hadException())
> +            return JSValue::encode(jsUndefined());
> +        string = jsCast<JSString*>(concatenated);
> +    }
> +    return JSValue::encode(result);

This is unnecessarily inefficient. We should create only a single JSString, not all these extra ones. Each of those intermediate strings will need to be allocated and then garbage collected so it’s inappropriately slow. The code should look more like this:

    VM& vm = exec->vm();

    if (!string->length() || !repeatCount)
        return jsEmptyString(vm);

    if (repeatCount == 1)
        return string;

    JSRopeString::RopeBuilder ropeBuilder(vm);

    for (unsigned i = 0; i < count; ++i) {
        if (!ropeBuilder.append(string))
            return throwOutOfMemoryError(exec);
    }

    return ropeBuilder.release();

I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness.

The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need.
Comment 8 Darin Adler 2015-01-04 13:36:59 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

>> Source/JavaScriptCore/runtime/StringPrototype.cpp:692
>> +        return throwVMError(exec, createRangeError(exec, ASCIILiteral("repeat() argument must be greater than or equal to 0 and not be infinity.")));
> 
> This error message doesn’t seem quite right for NaN.

Oh, I see, this won’t happen for NaN because toInteger is guaranteed to not return NaN. In that case, the code should use std::isinf rather than std::isfinite, since it is more efficient:

    if (n < 0 || std::isinf(n))
Comment 9 Darin Adler 2015-01-04 13:37:51 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:692
>>> +        return throwVMError(exec, createRangeError(exec, ASCIILiteral("repeat() argument must be greater than or equal to 0 and not be infinity.")));
>> 
>> This error message doesn’t seem quite right for NaN.
> 
> Oh, I see, this won’t happen for NaN because toInteger is guaranteed to not return NaN. In that case, the code should use std::isinf rather than std::isfinite, since it is more efficient:
> 
>     if (n < 0 || std::isinf(n))

Do other error strings end in periods? I note that the one for out-of-memory does not (you can see that in the test result).
Comment 10 Darin Adler 2015-01-04 13:39:29 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

>> Source/JavaScriptCore/runtime/StringPrototype.cpp:694
>> +    if ((n * string->length()) > std::numeric_limits<int32_t>::max())
> 
> We don’t normally use parentheses for an expression like this. We are used to recognizing that arithmetic operators like * bind more tightly that comparison operators like >.
> 
> We should probably cut down a little bit on the floating point math. We can do this:
> 
>     if (n > std::numeric_limits<int32_t>::max() / string->length())
> 
> I don’t understand why the math is being done with int32_t rather than unsigned. String lengths are unsigned, not signed, so it doesn’t make sense to mention <int32_t>::max() here rather than <unsigned>::max().

Hmm, reading more code it seems there is some kind of int32_t limit for JSString.
Comment 11 Yusuke Suzuki 2015-01-04 14:09:15 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

Thank you for your review!

>> Source/JavaScriptCore/runtime/StringPrototype.cpp:688
>> +    double n = a0.toInteger(exec);
> 
> No need to put argument 0 into a local variable since we are using it only once.

Nice! I'll fix it.

>>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:692
>>>> +        return throwVMError(exec, createRangeError(exec, ASCIILiteral("repeat() argument must be greater than or equal to 0 and not be infinity.")));
>>> 
>>> This error message doesn’t seem quite right for NaN.
>> 
>> Oh, I see, this won’t happen for NaN because toInteger is guaranteed to not return NaN. In that case, the code should use std::isinf rather than std::isfinite, since it is more efficient:
>> 
>>     if (n < 0 || std::isinf(n))
> 
> Do other error strings end in periods? I note that the one for out-of-memory does not (you can see that in the test result).

> Oh, I see, this won’t happen for NaN because toInteger is guaranteed to not return NaN. In that case, the code should use std::isinf rather than std::isfinite, since it is more efficient:

Looks very nice! I'll change to `if (n < 0 || std::isinf(n))`.

> Do other error strings end in periods? I note that the one for out-of-memory does not (you can see that in the test result).

Seeing the code, some ends in periods[1], but some doesn't end in periods[2].
But seeing the results, it seems that ending without periods is major, so I'll follow it (dropping the period).

[1]: http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/RegExpPrototype.cpp#L116
[2]: http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/DatePrototype.cpp#L1083

>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:694
>>> +    if ((n * string->length()) > std::numeric_limits<int32_t>::max())
>> 
>> We don’t normally use parentheses for an expression like this. We are used to recognizing that arithmetic operators like * bind more tightly that comparison operators like >.
>> 
>> We should probably cut down a little bit on the floating point math. We can do this:
>> 
>>     if (n > std::numeric_limits<int32_t>::max() / string->length())
>> 
>> I don’t understand why the math is being done with int32_t rather than unsigned. String lengths are unsigned, not signed, so it doesn’t make sense to mention <int32_t>::max() here rather than <unsigned>::max().
> 
> Hmm, reading more code it seems there is some kind of int32_t limit for JSString.

Yes. Seeing the `JSValue jsString(ExecState* exec, JSString* s1, JSString* s2)` code, I found that there's a int32_t length limitation for JSString.
http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/Operations.h#L47

Maybe, IMHO, it's introduced for some optimizations. By limiting JSString's length in int32_t, string.length results can be assumed within int32_t JSValue range.

At this time, I'll use `JSRopeString::RopeBuilder` and dropping this check :)

>> Source/JavaScriptCore/runtime/StringPrototype.cpp:712
>> +    return JSValue::encode(result);
> 
> This is unnecessarily inefficient. We should create only a single JSString, not all these extra ones. Each of those intermediate strings will need to be allocated and then garbage collected so it’s inappropriately slow. The code should look more like this:
> 
>     VM& vm = exec->vm();
> 
>     if (!string->length() || !repeatCount)
>         return jsEmptyString(vm);
> 
>     if (repeatCount == 1)
>         return string;
> 
>     JSRopeString::RopeBuilder ropeBuilder(vm);
> 
>     for (unsigned i = 0; i < count; ++i) {
>         if (!ropeBuilder.append(string))
>             return throwOutOfMemoryError(exec);
>     }
> 
>     return ropeBuilder.release();
> 
> I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness.
> 
> The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need.

Ah, thank you!! I've misunderstood the JSC's String representation (I assumed that JSC has simple Seq/Cons Strings like V8, but JSC has multiple (3) slots).
JSRopeString::RopeBuilder efficiently manages ropes, it looks very nice.
I'll change the code to it.

> I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness.

Additionally, I think checking "empty strings, for repeat count 0, for repeat count 1" allow JSC to execute `"".repeat(very large number)` without a OOM error. So I think introducing it looks nice.

> The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need.

Agreed. At the end of the execution, both cases produces the OOM error, so there's no observable results except for the consumed time.
Comment 12 Yusuke Suzuki 2015-01-04 14:13:56 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:712
>>> +    return JSValue::encode(result);
>> 
>> This is unnecessarily inefficient. We should create only a single JSString, not all these extra ones. Each of those intermediate strings will need to be allocated and then garbage collected so it’s inappropriately slow. The code should look more like this:
>> 
>>     VM& vm = exec->vm();
>> 
>>     if (!string->length() || !repeatCount)
>>         return jsEmptyString(vm);
>> 
>>     if (repeatCount == 1)
>>         return string;
>> 
>>     JSRopeString::RopeBuilder ropeBuilder(vm);
>> 
>>     for (unsigned i = 0; i < count; ++i) {
>>         if (!ropeBuilder.append(string))
>>             return throwOutOfMemoryError(exec);
>>     }
>> 
>>     return ropeBuilder.release();
>> 
>> I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness.
>> 
>> The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need.
> 
> Ah, thank you!! I've misunderstood the JSC's String representation (I assumed that JSC has simple Seq/Cons Strings like V8, but JSC has multiple (3) slots).
> JSRopeString::RopeBuilder efficiently manages ropes, it looks very nice.
> I'll change the code to it.

**Reformatting my exiting replies in the review to easily see in the "Review Patch" viewer.**

your comment: "I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness."

Additionally, I think checking "empty strings, for repeat count 0, for repeat count 1" allow JSC to execute `"".repeat(very large number)` without a OOM error. So I think introducing it looks nice.

your comment: "The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need."

Agreed. At the end of the execution, both cases produces the OOM error, so there's no observable results except for the consumed time.
Comment 13 Darin Adler 2015-01-04 14:17:53 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

>>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:712
>>>> +    return JSValue::encode(result);
>>> 
>>> This is unnecessarily inefficient. We should create only a single JSString, not all these extra ones. Each of those intermediate strings will need to be allocated and then garbage collected so it’s inappropriately slow. The code should look more like this:
>>> 
>>>     VM& vm = exec->vm();
>>> 
>>>     if (!string->length() || !repeatCount)
>>>         return jsEmptyString(vm);
>>> 
>>>     if (repeatCount == 1)
>>>         return string;
>>> 
>>>     JSRopeString::RopeBuilder ropeBuilder(vm);
>>> 
>>>     for (unsigned i = 0; i < count; ++i) {
>>>         if (!ropeBuilder.append(string))
>>>             return throwOutOfMemoryError(exec);
>>>     }
>>> 
>>>     return ropeBuilder.release();
>>> 
>>> I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness.
>>> 
>>> The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need.
>> 
>> Ah, thank you!! I've misunderstood the JSC's String representation (I assumed that JSC has simple Seq/Cons Strings like V8, but JSC has multiple (3) slots).
>> JSRopeString::RopeBuilder efficiently manages ropes, it looks very nice.
>> I'll change the code to it.
> 
> **Reformatting my exiting replies in the review to easily see in the "Review Patch" viewer.**
> 
> your comment: "I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness."
> 
> Additionally, I think checking "empty strings, for repeat count 0, for repeat count 1" allow JSC to execute `"".repeat(very large number)` without a OOM error. So I think introducing it looks nice.
> 
> your comment: "The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need."
> 
> Agreed. At the end of the execution, both cases produces the OOM error, so there's no observable results except for the consumed time.

I see your point, but single character strings with a large repeat count would be an optimization for string length 1, not repeat count 1.

An optimized path for single character strings might make sense. Building a rope where each string in the rope is a single character is really inefficient. There's probably a cutoff for a small enough string size where we’d want to use StringBuilder instead of RopeBuilder and create the JSString from a single String instead of making a rope. It’s straightforward to use StringBuilder; just make one, then call reserveCapacity, then append in a loop. Or for single characters we could just use String::createUninitialized rather than using StringBuilder or RopeBuilder.
Comment 14 Yusuke Suzuki 2015-01-04 14:35:38 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

>>>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:712
>>>>> +    return JSValue::encode(result);
>>>> 
>>>> This is unnecessarily inefficient. We should create only a single JSString, not all these extra ones. Each of those intermediate strings will need to be allocated and then garbage collected so it’s inappropriately slow. The code should look more like this:
>>>> 
>>>>     VM& vm = exec->vm();
>>>> 
>>>>     if (!string->length() || !repeatCount)
>>>>         return jsEmptyString(vm);
>>>> 
>>>>     if (repeatCount == 1)
>>>>         return string;
>>>> 
>>>>     JSRopeString::RopeBuilder ropeBuilder(vm);
>>>> 
>>>>     for (unsigned i = 0; i < count; ++i) {
>>>>         if (!ropeBuilder.append(string))
>>>>             return throwOutOfMemoryError(exec);
>>>>     }
>>>> 
>>>>     return ropeBuilder.release();
>>>> 
>>>> I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness.
>>>> 
>>>> The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need.
>>> 
>>> Ah, thank you!! I've misunderstood the JSC's String representation (I assumed that JSC has simple Seq/Cons Strings like V8, but JSC has multiple (3) slots).
>>> JSRopeString::RopeBuilder efficiently manages ropes, it looks very nice.
>>> I'll change the code to it.
>> 
>> **Reformatting my exiting replies in the review to easily see in the "Review Patch" viewer.**
>> 
>> your comment: "I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness."
>> 
>> Additionally, I think checking "empty strings, for repeat count 0, for repeat count 1" allow JSC to execute `"".repeat(very large number)` without a OOM error. So I think introducing it looks nice.
>> 
>> your comment: "The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need."
>> 
>> Agreed. At the end of the execution, both cases produces the OOM error, so there's no observable results except for the consumed time.
> 
> I see your point, but single character strings with a large repeat count would be an optimization for string length 1, not repeat count 1.
> 
> An optimized path for single character strings might make sense. Building a rope where each string in the rope is a single character is really inefficient. There's probably a cutoff for a small enough string size where we’d want to use StringBuilder instead of RopeBuilder and create the JSString from a single String instead of making a rope. It’s straightforward to use StringBuilder; just make one, then call reserveCapacity, then append in a loop. Or for single characters we could just use String::createUninitialized rather than using StringBuilder or RopeBuilder.

That's very nice insight!
Repeating a single character is considered as very common use cases (e.g. `' '.repeat(4)` to generate 4 spaces for indentation). Optimizing for this case seems reasonable.

And one note about the current code is that, in the current implementation, we concat the strings in the loop (`string` variable in the code) even if we don't use it (not adding it to `result`). As a result, we only loop log(N) times. If the N becomes larger, it might have an effect. But on the other hand, we can say that we just delay the processing time until resolving the ropes. In the current code, I took the log(N) method.
What do you think of this? I'd like to hear your thought.
Comment 15 Darin Adler 2015-01-04 15:13:15 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

>>>>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:712
>>>>>> +    return JSValue::encode(result);
>>>>> 
>>>>> This is unnecessarily inefficient. We should create only a single JSString, not all these extra ones. Each of those intermediate strings will need to be allocated and then garbage collected so it’s inappropriately slow. The code should look more like this:
>>>>> 
>>>>>     VM& vm = exec->vm();
>>>>> 
>>>>>     if (!string->length() || !repeatCount)
>>>>>         return jsEmptyString(vm);
>>>>> 
>>>>>     if (repeatCount == 1)
>>>>>         return string;
>>>>> 
>>>>>     JSRopeString::RopeBuilder ropeBuilder(vm);
>>>>> 
>>>>>     for (unsigned i = 0; i < count; ++i) {
>>>>>         if (!ropeBuilder.append(string))
>>>>>             return throwOutOfMemoryError(exec);
>>>>>     }
>>>>> 
>>>>>     return ropeBuilder.release();
>>>>> 
>>>>> I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness.
>>>>> 
>>>>> The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need.
>>>> 
>>>> Ah, thank you!! I've misunderstood the JSC's String representation (I assumed that JSC has simple Seq/Cons Strings like V8, but JSC has multiple (3) slots).
>>>> JSRopeString::RopeBuilder efficiently manages ropes, it looks very nice.
>>>> I'll change the code to it.
>>> 
>>> **Reformatting my exiting replies in the review to easily see in the "Review Patch" viewer.**
>>> 
>>> your comment: "I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness."
>>> 
>>> Additionally, I think checking "empty strings, for repeat count 0, for repeat count 1" allow JSC to execute `"".repeat(very large number)` without a OOM error. So I think introducing it looks nice.
>>> 
>>> your comment: "The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need."
>>> 
>>> Agreed. At the end of the execution, both cases produces the OOM error, so there's no observable results except for the consumed time.
>> 
>> I see your point, but single character strings with a large repeat count would be an optimization for string length 1, not repeat count 1.
>> 
>> An optimized path for single character strings might make sense. Building a rope where each string in the rope is a single character is really inefficient. There's probably a cutoff for a small enough string size where we’d want to use StringBuilder instead of RopeBuilder and create the JSString from a single String instead of making a rope. It’s straightforward to use StringBuilder; just make one, then call reserveCapacity, then append in a loop. Or for single characters we could just use String::createUninitialized rather than using StringBuilder or RopeBuilder.
> 
> That's very nice insight!
> Repeating a single character is considered as very common use cases (e.g. `' '.repeat(4)` to generate 4 spaces for indentation). Optimizing for this case seems reasonable.
> 
> And one note about the current code is that, in the current implementation, we concat the strings in the loop (`string` variable in the code) even if we don't use it (not adding it to `result`). As a result, we only loop log(N) times. If the N becomes larger, it might have an effect. But on the other hand, we can say that we just delay the processing time until resolving the ropes. In the current code, I took the log(N) method.
> What do you think of this? I'd like to hear your thought.

The rope builder still loops through all the strings to make each rope, so the O(log(N)) is a false economy; the looping across every string is still happening and the code is still O(N), not O(log(N)). Our single loop that runs across all N strings is more efficient than the loop that calls the rope code log(N) times.
Comment 16 Yusuke Suzuki 2015-01-04 15:22:24 PST
Comment on attachment 243912 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=243912&action=review

>>>>>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:712
>>>>>>> +    return JSValue::encode(result);
>>>>>> 
>>>>>> This is unnecessarily inefficient. We should create only a single JSString, not all these extra ones. Each of those intermediate strings will need to be allocated and then garbage collected so it’s inappropriately slow. The code should look more like this:
>>>>>> 
>>>>>>     VM& vm = exec->vm();
>>>>>> 
>>>>>>     if (!string->length() || !repeatCount)
>>>>>>         return jsEmptyString(vm);
>>>>>> 
>>>>>>     if (repeatCount == 1)
>>>>>>         return string;
>>>>>> 
>>>>>>     JSRopeString::RopeBuilder ropeBuilder(vm);
>>>>>> 
>>>>>>     for (unsigned i = 0; i < count; ++i) {
>>>>>>         if (!ropeBuilder.append(string))
>>>>>>             return throwOutOfMemoryError(exec);
>>>>>>     }
>>>>>> 
>>>>>>     return ropeBuilder.release();
>>>>>> 
>>>>>> I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness.
>>>>>> 
>>>>>> The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need.
>>>>> 
>>>>> Ah, thank you!! I've misunderstood the JSC's String representation (I assumed that JSC has simple Seq/Cons Strings like V8, but JSC has multiple (3) slots).
>>>>> JSRopeString::RopeBuilder efficiently manages ropes, it looks very nice.
>>>>> I'll change the code to it.
>>>> 
>>>> **Reformatting my exiting replies in the review to easily see in the "Review Patch" viewer.**
>>>> 
>>>> your comment: "I’m not sure if we need the special case for empty strings, for repeat count 0, for repeat count 1, and for large lengths that would overflow. All of them are performance optimizations and not needed for correctness."
>>>> 
>>>> Additionally, I think checking "empty strings, for repeat count 0, for repeat count 1" allow JSC to execute `"".repeat(very large number)` without a OOM error. So I think introducing it looks nice.
>>>> 
>>>> your comment: "The rope builder code will handle out of memory correctly without an explicit length check. And even with a length check there are going to be large repeat counts that will cause the code to loop for a long time before detecting that it ran out of memory. The length check is *not* needed to prevent overflow, so it’s solely a performance optimization for overflow cases, one that I’m not sure we need."
>>>> 
>>>> Agreed. At the end of the execution, both cases produces the OOM error, so there's no observable results except for the consumed time.
>>> 
>>> I see your point, but single character strings with a large repeat count would be an optimization for string length 1, not repeat count 1.
>>> 
>>> An optimized path for single character strings might make sense. Building a rope where each string in the rope is a single character is really inefficient. There's probably a cutoff for a small enough string size where we’d want to use StringBuilder instead of RopeBuilder and create the JSString from a single String instead of making a rope. It’s straightforward to use StringBuilder; just make one, then call reserveCapacity, then append in a loop. Or for single characters we could just use String::createUninitialized rather than using StringBuilder or RopeBuilder.
>> 
>> That's very nice insight!
>> Repeating a single character is considered as very common use cases (e.g. `' '.repeat(4)` to generate 4 spaces for indentation). Optimizing for this case seems reasonable.
>> 
>> And one note about the current code is that, in the current implementation, we concat the strings in the loop (`string` variable in the code) even if we don't use it (not adding it to `result`). As a result, we only loop log(N) times. If the N becomes larger, it might have an effect. But on the other hand, we can say that we just delay the processing time until resolving the ropes. In the current code, I took the log(N) method.
>> What do you think of this? I'd like to hear your thought.
> 
> The rope builder still loops through all the strings to make each rope, so the O(log(N)) is a false economy; the looping across every string is still happening and the code is still O(N), not O(log(N)). Our single loop that runs across all N strings is more efficient than the loop that calls the rope code log(N) times.

Thanks, I see. So I'll fix the code to use RopeBuilder.
Comment 17 Yusuke Suzuki 2015-01-06 06:10:16 PST
Created attachment 244055 [details]
Patch
Comment 18 Yusuke Suzuki 2015-01-06 06:36:42 PST
Created attachment 244056 [details]
Patch
Comment 19 Yusuke Suzuki 2015-01-06 06:46:37 PST
Comment on attachment 244056 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=244056&action=review

I've updated the patch :)

> Source/JavaScriptCore/runtime/StringPrototype.cpp:707
> +    double repeatCountDouble = exec->argument(0).toInteger(exec);

Directly use `exec->argument(0)`.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:710
> +    if (repeatCountDouble < 0 || std::isinf(repeatCountDouble))

Use `isinf`.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:711
> +        return throwVMError(exec, createRangeError(exec, ASCIILiteral("repeat() argument must be greater than or equal to 0 and not be infinity")));

Dropped the period.
Seeing the code, some ends in periods[1], but some doesn't end in periods[2].
But seeing the results, it seems that ending without periods is major, so I've followed :)

1: http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/RegExpPrototype.cpp#L116
2: http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/DatePrototype.cpp#L1083

> Source/JavaScriptCore/runtime/StringPrototype.cpp:715
> +    if (!string->length() || !repeatCountDouble)

Fast path for length == 0 OR repeatCount == 0.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:718
> +    if (repeatCountDouble == 1)

Fast path for repeatCount == 1.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:721
> +    // JSString requires the limitation that its length is in the range of int32_t.

At first, I used `std::numeric_limits<unsigned>::max()` for checking the length.
But JSString implicitly requires the StringImpl's length is in range of int32_t.
(http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/JSString.h#L127)

So we need to check `std::numeric_limits<int32_t>::max()` here.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:722
> +    if (repeatCountDouble > std::numeric_limits<int32_t>::max() / string->length())

At first, we use the JSRopeString's OOM check.
However, since it requires N time loop, it consumes much time and tests cannot be passed due to timeout.
So in this time, I inserted the check that darin suggested.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:724
> +    unsigned repeatCount = static_cast<unsigned>(repeatCountDouble);

Reach here if repeatCountDouble doesn't exceed the int32_t max value. So we can cast safely.
(Since string->length() == 0 case early returned, string->length() is always > 0 here. And since the above `repeatCountDouble > std::numeric_limits<int32_t>::max() / string->length()` test is passed, `repeatCountDouble should be <= std::numeric_limits<int32_t>::max()`.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:727
> +    // allocating a sequential buffer and fill with the repeated string for efficiency.

For repeating small string (in this time, length == 1), we use StringImpl::tryCreateUninitialized path.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:732
> +        return JSValue::encode(repeatSmallString<UChar>(exec, s, repeatCount));

Extracted a repeating code into a function with CharacterType.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:735
> +    JSRopeString::RopeBuilder ropeBuilder(vm);

Use JSRopeString::RopeBuilder for constructing the string.
Comment 20 Darin Adler 2015-01-06 10:41:51 PST
Comment on attachment 244056 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=244056&action=review

> Source/JavaScriptCore/runtime/StringPrototype.cpp:728
> +    if (string->length() == 1) {

If this is really for single character strings, then we can write a much tighter version of the repeatSmallString function, and I think we should.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:729
> +        String s = string->value(exec);

Please don’t use a single letter variable name.
Comment 21 Yusuke Suzuki 2015-01-06 11:19:45 PST
Comment on attachment 244056 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=244056&action=review

Thank you for your review!

>> Source/JavaScriptCore/runtime/StringPrototype.cpp:728
>> +    if (string->length() == 1) {
> 
> If this is really for single character strings, then we can write a much tighter version of the repeatSmallString function, and I think we should.

Yes. So instead of generic version of repeatSmallString function, I've introduced new version optimized for single character. In this version, I've used `std::fill_n` to fill the buffer.

>> Source/JavaScriptCore/runtime/StringPrototype.cpp:729
>> +        String s = string->value(exec);
> 
> Please don’t use a single letter variable name.

I've changed it to `repeatedString` variable :)
Comment 22 Darin Adler 2015-01-06 11:32:03 PST
Comment on attachment 244056 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=244056&action=review

>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:728
>>> +    if (string->length() == 1) {
>> 
>> If this is really for single character strings, then we can write a much tighter version of the repeatSmallString function, and I think we should.
> 
> Yes. So instead of generic version of repeatSmallString function, I've introduced new version optimized for single character. In this version, I've used `std::fill_n` to fill the buffer.

Also should decide LChar vs. UChar based on the value of the character, not on whether the string happens to be 8-bit.
Comment 23 Yusuke Suzuki 2015-01-06 11:33:22 PST
Committed r177978: <http://trac.webkit.org/changeset/177978>
Comment 24 Yusuke Suzuki 2015-01-06 12:04:39 PST
Comment on attachment 244056 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=244056&action=review

>>>> Source/JavaScriptCore/runtime/StringPrototype.cpp:728
>>>> +    if (string->length() == 1) {
>>> 
>>> If this is really for single character strings, then we can write a much tighter version of the repeatSmallString function, and I think we should.
>> 
>> Yes. So instead of generic version of repeatSmallString function, I've introduced new version optimized for single character. In this version, I've used `std::fill_n` to fill the buffer.
> 
> Also should decide LChar vs. UChar based on the value of the character, not on whether the string happens to be 8-bit.

Oops, before landing the patch, I've missed reflecting your last comment, sorry for troubling you.
I've created the subsequent patch for fixing this issue[1].

1: https://bugs.webkit.org/show_bug.cgi?id=140139