Bug 156013

Summary: ES6: Implement String.prototype.split and RegExp.prototype[@@split].
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, commit-queue, fpizlo, ggaren, keith_miller, msaboff, saam, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Local Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 156557    
Bug Blocks: 156331    
Attachments:
Description Flags
work in progress: still need tests an a real @isPureRegExp implementation.
none
work in progress: need to apply get_by_id_pure next.
none
proposed patch.
none
proposed patch.
none
x86_64 benchmark results.
none
proposed patch: with a better name for hasObservableSideEffectsForRegExpSplit().
keith_miller: review+
patch for landing (w/ Keith's feedback resolved).
none
Diff between patches.
none
patch for landing 2: removed unneeded explicit setting of result.length.
none
Diff between patches 2.
none
browser sunspider benchmark baseline on MBP
none
browser sunspider benchmark new on MBP
none
browser sunspider benchmark baseline on MacPro
none
browser sunspider benchmark new on MacPro none

Description Mark Lam 2016-03-30 00:47:40 PDT
Patch coming.
Comment 1 Radar WebKit Bug Importer 2016-03-30 00:49:37 PDT
<rdar://problem/25435210>
Comment 2 Mark Lam 2016-04-01 10:31:59 PDT
Created attachment 275415 [details]
work in progress: still need tests an a real @isPureRegExp implementation.
Comment 3 Mark Lam 2016-04-05 14:59:27 PDT
Created attachment 275697 [details]
work in progress: need to apply get_by_id_pure next.
Comment 4 Mark Lam 2016-04-06 14:16:18 PDT
Created attachment 275817 [details]
proposed patch.

This patch should be correct.  Some layout tests may need to be rebased, and I'll do if needed that after I run them.

Note that hasObservableSideEffects() in the JS regexp split implementation checks a lot of stuff, and I have to call it twice for subclasses of RegExp because their species constructor does not match the RegExp one by definition.  Hence, there might be some perf implications here.  Will find out after @tryGetById gets some DFG love from Keith.
Comment 5 Mark Lam 2016-04-12 09:36:49 PDT
Created attachment 276240 [details]
proposed patch.
Comment 6 Mark Lam 2016-04-12 09:38:16 PDT
Created attachment 276241 [details]
x86_64 benchmark results.
Comment 7 Mark Lam 2016-04-12 09:39:15 PDT
Perf is neutral with this patch.  All JSC and layout tests pass on x86_64 with a release build.
Comment 8 WebKit Commit Bot 2016-04-12 09:39:49 PDT
Attachment 276240 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ChangeLog:8:  Need whitespace between colon and description  [changelog/filechangedescriptionwhitespace] [5]
Total errors found: 1 in 80 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 Mark Lam 2016-04-12 10:02:48 PDT
Created attachment 276243 [details]
proposed patch: with a better name for hasObservableSideEffectsForRegExpSplit().
Comment 10 Keith Miller 2016-04-12 10:46:38 PDT
Comment on attachment 276243 [details]
proposed patch: with a better name for hasObservableSideEffectsForRegExpSplit().

View in context: https://bugs.webkit.org/attachment.cgi?id=276243&action=review

r=me with comments.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:101
> +    let builtinExec = @RegExp.prototype.@exec;

What's the advantage of having this on the RegExp prototype over the global object?

> Source/JavaScriptCore/builtins/RegExpPrototype.js:114
> +    if (regexpExec !== @RegExp.prototype.@exec)

Ditto.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:154
> +    if (!(this instanceof @Object))

This is not correct and slow! I can do: foo = {}; foo.__proto__ = null;  split.call(foo,...) and I will get a type error when it should work fine. You should use the @isObject function, which is also much faster.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:172
> +    let unicodeMatching = flags.includes("u");

What happens here if someone deletes String.prototype.includes? Does this still work?

> Source/JavaScriptCore/builtins/RegExpPrototype.js:189
> +    // 13. If limit is undefined, let lim be 232-1; else let lim be ? ToUint32(limit).

Is this supposed to be 2^32-1?

> Source/JavaScriptCore/builtins/RegExpPrototype.js:207
> +        result[0] = str;

I think you need to use @putDirectByVal here. CreateDataProperty does not look at the prototype chain, which result[0] will do.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:259
> +                    result.push(nextCapture);

I don't think you can use push here since result might not be an array and push might be customized. You need to do @putDirectByVal.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:274
> +    result.push(remainingStr);

Ditto.

> Source/JavaScriptCore/runtime/JSGlobalObject.cpp:652
> +        // @substrInternal should not have any observable side effects (e.g. it should not call
> +        // GetMethod(..., @@toPrimitive) on the thisValue). It is ok to use the default
> +        // stringProtoFuncSubstr as the implementation of @substrInternal because @substrInternal
> +        // will only be called by builtins, which will guarantee that we only pass it a string
> +        // thisValue. As a result, stringProtoFuncSubstr will not need to call toString() on the
> +        // thisValue, and there will be no observable side-effects.

Why is this comment here and not at the implementation of stringProtoFuncSubstr?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1189
>      JSValue separatorValue = exec->argument(0);

This can be uncheckedArgument since it's only called by us.
Comment 11 Mark Lam 2016-04-12 14:56:37 PDT
Comment on attachment 276243 [details]
proposed patch: with a better name for hasObservableSideEffectsForRegExpSplit().

View in context: https://bugs.webkit.org/attachment.cgi?id=276243&action=review

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:101
>> +    let builtinExec = @RegExp.prototype.@exec;
> 
> What's the advantage of having this on the RegExp prototype over the global object?

This was a pre-existing idiom from the @@match implementation.  In the interest of not changing too many things at once, I'll leave it as is for now.  I'll do a follow up patch to bring this over to a global.

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:114
>> +    if (regexpExec !== @RegExp.prototype.@exec)
> 
> Ditto.

Same as above.

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:154
>> +    if (!(this instanceof @Object))
> 
> This is not correct and slow! I can do: foo = {}; foo.__proto__ = null;  split.call(foo,...) and I will get a type error when it should work fine. You should use the @isObject function, which is also much faster.

Thanks.  Fixed.

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:172
>> +    let unicodeMatching = flags.includes("u");
> 
> What happens here if someone deletes String.prototype.includes? Does this still work?

Thanks.  Will fix and add a test case.

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:189
>> +    // 13. If limit is undefined, let lim be 232-1; else let lim be ? ToUint32(limit).
> 
> Is this supposed to be 2^32-1?

Yes.  Fixed.

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:207
>> +        result[0] = str;
> 
> I think you need to use @putDirectByVal here. CreateDataProperty does not look at the prototype chain, which result[0] will do.

Thanks.  Fixed.

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:259
>> +                    result.push(nextCapture);
> 
> I don't think you can use push here since result might not be an array and push might be customized. You need to do @putDirectByVal.

Thanks. Fixed.

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:274
>> +    result.push(remainingStr);
> 
> Ditto.

Fixed.

>> Source/JavaScriptCore/runtime/JSGlobalObject.cpp:652
>> +        // thisValue, and there will be no observable side-effects.
> 
> Why is this comment here and not at the implementation of stringProtoFuncSubstr?

Because this is where we define @substrInternal, and this comment pertains to @substrInternal.

On further consideration, I think it's clearer to have an explicit builtinStringSubstrInternal() function and to move this comment there.  Plus I can add some asserts there.  I'm going with that.

>> Source/JavaScriptCore/runtime/StringPrototype.cpp:1189
>>      JSValue separatorValue = exec->argument(0);
> 
> This can be uncheckedArgument since it's only called by us.

Thanks.  Fixed.
Comment 12 Mark Lam 2016-04-12 14:57:32 PDT
Created attachment 276281 [details]
patch for landing (w/ Keith's feedback resolved).
Comment 13 Mark Lam 2016-04-12 14:58:02 PDT
Created attachment 276282 [details]
Diff between patches.
Comment 14 Keith Miller 2016-04-12 15:22:15 PDT
Comment on attachment 276281 [details]
patch for landing (w/ Keith's feedback resolved).

View in context: https://bugs.webkit.org/attachment.cgi?id=276281&action=review

> Source/JavaScriptCore/builtins/RegExpPrototype.js:209
> +        result.length = 1;

unneeded.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:249
> +                if (resultLength == limit) {
> +                    result.length = resultLength;
> +                    return result;
> +                }

unneeded.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:270
> +                    if (resultLength == limit) {
> +                        result.length = resultLength;
> +                        return result;
> +                    }

unneeded.

> Source/JavaScriptCore/builtins/RegExpPrototype.js:283
> +    result.length = ++resultLength;

unneeded.
Comment 15 Keith Miller 2016-04-12 15:23:29 PDT
Comment on attachment 276281 [details]
patch for landing (w/ Keith's feedback resolved).

View in context: https://bugs.webkit.org/attachment.cgi?id=276281&action=review

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:249
>> +                }
> 
> unneeded.

Whoops, this should just refer the length setting part.

>> Source/JavaScriptCore/builtins/RegExpPrototype.js:270
>> +                    }
> 
> unneeded.

Ditto.
Comment 16 Mark Lam 2016-04-12 15:35:07 PDT
Created attachment 276285 [details]
patch for landing 2: removed unneeded explicit setting of result.length.
Comment 17 Mark Lam 2016-04-12 15:35:36 PDT
Created attachment 276286 [details]
Diff between patches 2.
Comment 18 Mark Lam 2016-04-12 15:41:06 PDT
Thanks for all the reviews.  Landed in r199393: <http://trac.webkit.org/r199393>.
Comment 20 Alexey Proskuryakov 2016-04-12 20:02:08 PDT
The rollout didn't fix 32-bit failures, still occurring with r199401.
Comment 21 Mark Lam 2016-04-13 10:47:23 PDT
The crash issue was due to https://bugs.webkit.org/show_bug.cgi?id=156532, and is now fixed.

Re-landed r199393 in r199502: <http://trac.webkit.org/r199502>.
Comment 22 WebKit Commit Bot 2016-04-13 14:56:34 PDT
Re-opened since this is blocked by bug 156557
Comment 23 Mark Lam 2016-04-19 10:15:52 PDT
Created attachment 276732 [details]
browser sunspider benchmark baseline on MBP
Comment 24 Mark Lam 2016-04-19 10:16:33 PDT
Created attachment 276733 [details]
browser sunspider benchmark new on MBP
Comment 25 Mark Lam 2016-04-19 10:19:27 PDT
I've benchmarked this patch (on rev 199568) many times on a MBP with the in browser sunspider, and the results show no regression.  Whatever perf regression appears on arewefastyet.com previously are probably due to interplay with other factors (maybe cache effects?).

I will re-land and monitor the bots.
Comment 26 Mark Lam 2016-04-19 10:25:41 PDT
Re-landed in r199731: <http://trac.webkit.org/r199731>.
Comment 27 Mark Lam 2016-04-19 12:05:06 PDT
Created attachment 276742 [details]
browser sunspider benchmark baseline on MacPro
Comment 28 Mark Lam 2016-04-19 12:06:25 PDT
Created attachment 276743 [details]
browser sunspider benchmark new on MacPro

In browser Sunspider benchmark results on my MacPro agrees.  There's no perf regression here.
Comment 29 Mark Lam 2016-04-19 14:20:02 PDT
Strange.  http://arewefastyet.com/#machine=29&view=breakdown&suite=ss is still complaining that the time for ss-format-xparb has nearly doubled (likely due to this patch).  But locally, I can't replicate the issue at all.  Investigating ...
Comment 30 Mark Lam 2016-04-19 14:29:34 PDT
Command line sunspider run without warmup ...

                              base                      new                                        
Normal FTL on:
date-format-xparb        5.9714+-0.0932            5.8345+-0.2865          might be 1.0235x faster
FTL disable:
date-format-xparb        5.7622+-0.2792            5.6508+-0.1500          might be 1.0197x faster
DFG disable:
date-format-xparb       13.0135+-0.5711           12.5326+-0.3871          might be 1.0384x faster
Baseline JIT disabled:
date-format-xparb       20.2147+-0.3067           20.1280+-0.3900        

Based on the above, it does not look like there are any regressions at any tier level either.
Comment 31 Mark Lam 2016-04-19 15:35:49 PDT
All internal test bots also show no regression.  The arewefastyet bot is an anomaly.  Will ignore and move on to other things.