Bug 202471 - Implement String.prototype.replaceAll
Summary: Implement String.prototype.replaceAll
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Ross Kirsling
URL:
Keywords: InRadar
Depends on: 204475
Blocks:
  Show dependency treegraph
 
Reported: 2019-10-02 08:54 PDT by Mathias Bynens
Modified: 2019-11-21 15:35 PST (History)
20 users (show)

See Also:


Attachments
Patch (12.77 KB, patch)
2019-11-17 20:37 PST, Ross Kirsling
no flags Details | Formatted Diff | Diff
Patch (18.54 KB, patch)
2019-11-17 21:08 PST, Ross Kirsling
no flags Details | Formatted Diff | Diff
Patch (21.11 KB, patch)
2019-11-18 15:07 PST, Ross Kirsling
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews212 for win-future (14.51 MB, application/zip)
2019-11-18 18:02 PST, EWS Watchlist
no flags Details
Patch (24.56 KB, patch)
2019-11-18 21:49 PST, Ross Kirsling
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews214 for win-future (14.87 MB, application/zip)
2019-11-19 03:48 PST, EWS Watchlist
no flags Details
Patch for landing (25.49 KB, patch)
2019-11-19 20:13 PST, Ross Kirsling
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mathias Bynens 2019-10-02 08:54:12 PDT
The proposal reached stage 3 at today’s TC39 meeting.

Repository: https://github.com/tc39/proposal-string-replaceall
Comment 1 Radar WebKit Bug Importer 2019-10-02 13:17:52 PDT
<rdar://problem/55921557>
Comment 2 Ross Kirsling 2019-11-17 20:37:52 PST
Created attachment 383727 [details]
Patch
Comment 3 Ross Kirsling 2019-11-17 20:45:15 PST
Notes:

- This passes all Test262 tests (pending merge here: https://github.com/tc39/test262/pull/2423), except for one that also fails for `replace` itself.

- Instead of adding replaceAllUsingStringSearch, we could alternatively just add to replaceUsingStringSearch, but I was concerned about affecting performance for the existing path.
Comment 4 Ross Kirsling 2019-11-17 20:51:18 PST Comment hidden (obsolete)
Comment 5 Ross Kirsling 2019-11-17 21:08:19 PST
Created attachment 383729 [details]
Patch
Comment 6 Ross Kirsling 2019-11-18 15:07:45 PST
Created attachment 383797 [details]
Patch
Comment 7 Yusuke Suzuki 2019-11-18 16:30:09 PST
Comment on attachment 383797 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=383797&action=review

Nice! Some comments.

> Source/JavaScriptCore/builtins/StringPrototype.js:267
> +    if (@isUndefinedOrNull(this))
> +        @throwTypeError("String.prototype.replaceAll requires |this| not to be null nor undefined");

Can you test it?

> Source/JavaScriptCore/builtins/StringPrototype.js:270
> +        if (@isRegExp(search) && !@stringIncludesInternal.@call(@toString(search.flags), "g"))

Can you test the pattern that tells incorrect flag information, like,

class DerivedRegExp extends RegExp {
    constructor(pattern, flags) { super(pattern, flags); }
    get flags() { return "INVALID" };
};

> Source/JavaScriptCore/builtins/StringPrototype.js:271
> +            @throwTypeError("String.prototype.replaceAll argument must not be a non-global regular expression");

Can you test this error?

> Source/JavaScriptCore/builtins/StringPrototype.js:273
> +        let replacer = search.@replaceSymbol;

Use `var` in builtin as much as possible.

> Source/JavaScriptCore/builtins/StringPrototype.js:279
> +    }

Can you test the code when @@replace function is deleted, replaced etc.?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:859
> +            JSValue replacement = call(globalObject, replaceValue, callType, callData, jsUndefined(), args);

Can you make this repeated calling efficient by using CachedCall?
Comment 8 EWS Watchlist 2019-11-18 18:02:31 PST Comment hidden (obsolete)
Comment 9 EWS Watchlist 2019-11-18 18:02:34 PST Comment hidden (obsolete)
Comment 10 Ross Kirsling 2019-11-18 21:01:32 PST
(In reply to Yusuke Suzuki from comment #7)
> > Source/JavaScriptCore/builtins/StringPrototype.js:270
> > +        if (@isRegExp(search) && !@stringIncludesInternal.@call(@toString(search.flags), "g"))
> 
> Can you test the pattern that tells incorrect flag information, like,
> 
> class DerivedRegExp extends RegExp {
>     constructor(pattern, flags) { super(pattern, flags); }
>     get flags() { return "INVALID" };
> };

But that'll just trigger a runtime SyntaxError when we parse the flags in regExpProtoFuncCompile, before we even manage to call the getter. We can test with objects like these though:

  // bad:
  { [Symbol.match]() {} }

  // good:
  { [Symbol.match]() {}, flags: 'g' }

That's what I did for the same path in String#matchAll, but I'd forgotten to do it here.
Comment 11 Ross Kirsling 2019-11-18 21:49:16 PST
Created attachment 383841 [details]
Patch
Comment 12 Ross Kirsling 2019-11-18 21:55:58 PST
Comment on attachment 383841 [details]
Patch

Addressed feedback.

Also switched to modifying replaceUsingStringSearch instead of duplicating it, as this appears to be perf-neutral on JetStream 2 after all.
Comment 13 Ross Kirsling 2019-11-19 00:11:11 PST
(In reply to Ross Kirsling from comment #10)
> But that'll just trigger a runtime SyntaxError when we parse the flags in
> regExpProtoFuncCompile, before we even manage to call the getter.

Er whoops, I misspoke here -- I meant to say "before we even manage to call stringIncludesInternal".
Comment 14 EWS Watchlist 2019-11-19 03:48:35 PST Comment hidden (obsolete)
Comment 15 EWS Watchlist 2019-11-19 03:48:38 PST Comment hidden (obsolete)
Comment 16 Yusuke Suzuki 2019-11-19 18:58:49 PST
Comment on attachment 383841 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=383841&action=review

r=me with nits.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:803
> +        if (callType != CallType::None) {

`if (cachedCall)` would be nice since we are using cachedCall in this brace. And this is the same to `callType != CallType::None`.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:823
> +        if (callType != CallType::None) {

`if (cachedCall)` would be clean.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:842
> +    RELEASE_AND_RETURN(scope, jsSpliceSubstringsWithSeparators(globalObject, jsString, string, sourceRanges.data(), sourceRanges.size(), replacements.data(), replacements.size()));

Previously, we are calling `jsString()` with three substrings. But now, jsSpliceSubstringsWithSeparators is allocating a new String buffer for this case (`!isGlobal`) case.
Is it efficient in terms of memory? Should we do this in jsSpliceSubstringsWithSeparators?
Comment 17 Ross Kirsling 2019-11-19 20:13:34 PST
Created attachment 383940 [details]
Patch for landing
Comment 18 WebKit Commit Bot 2019-11-19 21:11:26 PST
The commit-queue encountered the following flaky tests while processing attachment 383940 [details]:

compositing/backing/backing-store-attachment-with-rotation.html bug 204394 (author: simon.fraser@apple.com)
The commit-queue is continuing to process your patch.
Comment 19 WebKit Commit Bot 2019-11-19 21:12:21 PST
Comment on attachment 383940 [details]
Patch for landing

Clearing flags on attachment: 383940

Committed r252683: <https://trac.webkit.org/changeset/252683>
Comment 20 WebKit Commit Bot 2019-11-19 21:12:24 PST
All reviewed patches have been landed.  Closing bug.
Comment 21 Darin Adler 2019-11-20 15:23:22 PST
Comment on attachment 383940 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=383940&action=review

> Source/JavaScriptCore/runtime/StringPrototype.cpp:779
> +static ALWAYS_INLINE JSString* replaceUsingStringSearch(VM& vm, JSGlobalObject* globalObject, CallFrame* callFrame, JSString* jsString, JSValue searchValue, JSValue replaceValue, bool isGlobal)

We normally use enumerations for arguments like this "isGlobal" one rather than using named constants at the call site.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:807
> +    Vector<StringRange, 16> sourceRanges;
> +    Vector<String, 16> replacements;

This approach looks very different. Do we have performance test coverage?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:817
> +            if (UNLIKELY(cachedCall->hasOverflowedArguments()))
> +                OUT_OF_MEMORY(globalObject, scope);

I don’t think this is needed when appending exactly 3 arguments. Maybe this was copied and pasted from code that appends arbitrary numbers of arguments, where it would be needed. Or did I miss something?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:819
> +            RETURN_IF_EXCEPTION(scope, nullptr);

What operation makes this RETURN_IF_EXCEPTION needed? I don’t see anything above that can result in an exception. Or did I miss something?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:828
> +        int ovector[2] = { static_cast<int>(matchStart),  static_cast<int>(matchEnd)};

Not new code, just moved: Formatting mistake here. Missing space before "}".

I think this should be moved inside else block just below.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:831
> +            RETURN_IF_EXCEPTION(scope, nullptr);

What operation makes this RETURN_IF_EXCEPTION needed? I don’t see anything above that can result in an exception. Or did I miss something?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:843
> +        matchStart = string.find(searchString, UNLIKELY(!searchStringLength) ? endOfLastMatch + 1 : endOfLastMatch);

Seems unnecessary to use this UNLIKELY here. Premature optimization perhaps.

I don’t understand the searchStringLength 0 case. I don’t see a test case covering this.

> Source/WTF/wtf/text/StringCommon.h:574
> +    if (start > haystack.length())
> +        return notFound;

This is a subtle change in behavior, and this helper is used in many places. How did you test that this is safe across all use of this in WebKit?

> Source/WTF/wtf/text/StringCommon.h:577
>      if (!needleLength)
>          return std::min(start, haystack.length());

This can now be simplified to just:

    if (!needleLength)
        return start;

Would be nice to do that.
Comment 22 Ross Kirsling 2019-11-20 16:31:53 PST
(In reply to Darin Adler from comment #21)
> > Source/JavaScriptCore/runtime/StringPrototype.cpp:779
> > +static ALWAYS_INLINE JSString* replaceUsingStringSearch(VM& vm, JSGlobalObject* globalObject, CallFrame* callFrame, JSString* jsString, JSValue searchValue, JSValue replaceValue, bool isGlobal)
> 
> We normally use enumerations for arguments like this "isGlobal" one rather
> than using named constants at the call site.

Can do.

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:807
> > +    Vector<StringRange, 16> sourceRanges;
> > +    Vector<String, 16> replacements;
> 
> This approach looks very different. Do we have performance test coverage?

All the changes to this function are effectively aligning the replace('foo', ...) path with the replace(/foo/, ...) path (i.e. replaceUsingRegExpSearch with !global).

I was certainly worried about this affecting performance at first (comment 3), but JetStream 2 results appeared to be neutral (comment 12).

This should hopefully be confirmable by perf bots since this patch landed, but unfortunately I don't think they're externally visible right now(?).

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:817
> > +            if (UNLIKELY(cachedCall->hasOverflowedArguments()))
> > +                OUT_OF_MEMORY(globalObject, scope);
> 
> I don’t think this is needed when appending exactly 3 arguments. Maybe this
> was copied and pasted from code that appends arbitrary numbers of arguments,
> where it would be needed. Or did I miss something?

Good point, given that this was an ASSERT before, it ought to still be one now.

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:819
> > +            RETURN_IF_EXCEPTION(scope, nullptr);
> 
> What operation makes this RETURN_IF_EXCEPTION needed? I don’t see anything
> above that can result in an exception. Or did I miss something?

It's true that it wasn't here before, but it is an arbitrary function call, so I figured it had just been missing?

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:828
> > +        int ovector[2] = { static_cast<int>(matchStart),  static_cast<int>(matchEnd)};
> 
> Not new code, just moved: Formatting mistake here. Missing space before "}".
> 
> I think this should be moved inside else block just below.

Indeed.

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:831
> > +            RETURN_IF_EXCEPTION(scope, nullptr);
> 
> What operation makes this RETURN_IF_EXCEPTION needed? I don’t see anything
> above that can result in an exception. Or did I miss something?

Hmm, yeah, this one does seem like overkill.

> > Source/JavaScriptCore/runtime/StringPrototype.cpp:843
> > +        matchStart = string.find(searchString, UNLIKELY(!searchStringLength) ? endOfLastMatch + 1 : endOfLastMatch);
> 
> Seems unnecessary to use this UNLIKELY here. Premature optimization perhaps.

I went with it and figured I'd see if somebody complained. :) I can certainly reverse it.

> I don’t understand the searchStringLength 0 case. I don’t see a test case
> covering this.

Right, I should add one. The point is that replacing '' with 'z' in 'abc' should produce 'zazbzcz', but if we don't shove the start position over manually, we'll just keep inserting 'z' in place until we run out of memory.

> > Source/WTF/wtf/text/StringCommon.h:574
> > +    if (start > haystack.length())
> > +        return notFound;
> 
> This is a subtle change in behavior, and this helper is used in many places.
> How did you test that this is safe across all use of this in WebKit?

I originally planned to do this in a follow-up patch because I expected some degree of breakage (obsoleted comment 4), but was surprised to encounter none whatsoever.

I would hope that nobody is *silently* depending on this in untested behavior, but my hypothesis is that because of the unexpectedness of the existing logic, any callsite where this would be an issue was likely already guarding for it.

> > Source/WTF/wtf/text/StringCommon.h:577
> >      if (!needleLength)
> >          return std::min(start, haystack.length());
> 
> This can now be simplified to just:
> 
>     if (!needleLength)
>         return start;
> 
> Would be nice to do that.

Oops, thanks!
Comment 23 Ross Kirsling 2019-11-20 17:16:37 PST
Committed r252721: <https://trac.webkit.org/changeset/252721>
Comment 24 Maciej Stachowiak 2019-11-20 18:19:37 PST
This probably should have had a feature flag.
Comment 25 Ross Kirsling 2019-11-20 18:30:40 PST
(In reply to Maciej Stachowiak from comment #24)
> This probably should have had a feature flag.

I wasn't quite sure, since it seems like we've tended not to do so for new built-in methods (like String#matchAll or Promise.allSettled).
Comment 26 WebKit Commit Bot 2019-11-21 15:15:28 PST
Re-opened since this is blocked by bug 204475
Comment 27 Ross Kirsling 2019-11-21 15:19:14 PST Comment hidden (obsolete)
Comment 28 Yusuke Suzuki 2019-11-21 15:27:37 PST
(In reply to WebKit Commit Bot from comment #26)
> Re-opened since this is blocked by bug 204475

Ooooooops, I was completely wrong. This regression is caused by r252684, not by r252683. I'm now manually re-landing.
Comment 29 Yusuke Suzuki 2019-11-21 15:35:27 PST
Committed r252754: <https://trac.webkit.org/changeset/252754>