Bug 191786

Summary: [Cocoa] [WebKit2] Add support for replacing find-in-page text matches
Product: WebKit Reporter: Wenson Hsieh <wenson_hsieh>
Component: HTML EditingAssignee: Wenson Hsieh <wenson_hsieh>
Status: RESOLVED FIXED    
Severity: Normal CC: bdakin, commit-queue, dbates, ews-watchlist, mitz, rniwa, thorton, webkit-bug-importer, wenson_hsieh
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
WIP (no tests)
none
Archive of layout-test-results from ews123 for ios-simulator-wk2
none
First pass
none
Patch
none
Archive of layout-test-results from ews126 for ios-simulator-wk2
none
Fire 'insertReplacementText' input events.
none
Patch
none
Slightly simpler approach
rniwa: review+
Patch for landing none

Description Wenson Hsieh 2018-11-16 15:48:56 PST
<rdar://problem/45813871>
Comment 1 Wenson Hsieh 2018-11-16 15:52:26 PST
Created attachment 355144 [details]
WIP (no tests)
Comment 2 EWS Watchlist 2018-11-16 21:27:08 PST Comment hidden (obsolete)
Comment 3 EWS Watchlist 2018-11-16 21:27:10 PST Comment hidden (obsolete)
Comment 4 Wenson Hsieh 2018-11-16 22:42:05 PST
Created attachment 355182 [details]
First pass
Comment 5 Ryosuke Niwa 2018-11-16 23:15:35 PST
Comment on attachment 355182 [details]
First pass

View in context: https://bugs.webkit.org/attachment.cgi?id=355182&action=review

> Source/WebCore/page/Page.cpp:778
> +    ListHashSet<ContainerNode*> containerNodesInOrderOfReplacement;
> +    HashMap<ContainerNode*, Vector<FindReplacementRange>> rangesByContainerNode;

Can we use Ref/RefPtr instead?

> Source/WebCore/page/Page.cpp:782
> +        auto existingEntry = rangesByContainerNode.find(container);
> +        if (existingEntry == rangesByContainerNode.end())

Why can't we just do: rangesByContainerNode[container].append(range)?

> Source/WebCore/page/Page.cpp:789
> +    for (auto* container : containerNodesInOrderOfReplacement) {

Why not just rangesByContainerNode.keys() ?

> Source/WebCore/page/Page.cpp:796
> +        // Iterate backwards through ranges, such that editing offsets near the start of the document
> +        // aren't clobbered due to text replacement.

Hm... the fact we'll be doing replacement backwards would be observable.
Are we sure we want to do that?

> Source/WebCore/page/Page.cpp:829
> +        FindReplacementRange rangeInfo;
> +        rangeInfo.root = WTFMove(highestRoot);

FindReplacementRange rangeInfo { WTFMove(highestRoot) } ?
Although... it might be cleaner to define two local size_t's and just construct FindReplacementRange when you call append.
Comment 6 Ryosuke Niwa 2018-11-16 23:16:15 PST
I can't do the full review... my brain is toast after looking at so many PSON related API test failures.
Comment 7 Wenson Hsieh 2018-11-16 23:30:40 PST
Comment on attachment 355182 [details]
First pass

View in context: https://bugs.webkit.org/attachment.cgi?id=355182&action=review

>> Source/WebCore/page/Page.cpp:778
>> +    HashMap<ContainerNode*, Vector<FindReplacementRange>> rangesByContainerNode;
> 
> Can we use Ref/RefPtr instead?

Sure!

(I avoided it here because `Vector<FindReplacementRange>&& ranges` keeps a strong Ref to the editing roots anyways, so the other data structures shouldn't need to worry about keeping the node alive).

>> Source/WebCore/page/Page.cpp:789
>> +    for (auto* container : containerNodesInOrderOfReplacement) {
> 
> Why not just rangesByContainerNode.keys() ?

I did this so that the order in which text replacement occurs would be deterministic (one of the layout tests I wrote was flaky because the final selection would not always end up in the same place, since the order of `rangesByContainerNode.keys()` isn't predictable).

On second thought, I should really just make this replace text in the order of Ranges. It's a tad annoying, because we'd need to keep a map of editing root to some sort of "replacement offset amount" which we'd use to offset latter editing ranges by as we replace. I suppose this is a similar problem as the one solved in spellchecking code in Editor.cpp, when there are multiple autocorrection checking results in a single paragraph.

>> Source/WebCore/page/Page.cpp:796
>> +        // aren't clobbered due to text replacement.
> 
> Hm... the fact we'll be doing replacement backwards would be observable.
> Are we sure we want to do that?

I *think* it's not a big deal since nothing mandates that replacement needs to be in the same order as the list of matches. That being said, I'll try to change this to replace text in order.

>> Source/WebCore/page/Page.cpp:829
>> +        rangeInfo.root = WTFMove(highestRoot);
> 
> FindReplacementRange rangeInfo { WTFMove(highestRoot) } ?
> Although... it might be cleaner to define two local size_t's and just construct FindReplacementRange when you call append.

Sounds good — I'll do the latter.
Comment 8 Wenson Hsieh 2018-11-16 23:31:12 PST
Comment on attachment 355182 [details]
First pass

View in context: https://bugs.webkit.org/attachment.cgi?id=355182&action=review

>> Source/WebCore/page/Page.cpp:782
>> +        if (existingEntry == rangesByContainerNode.end())
> 
> Why can't we just do: rangesByContainerNode[container].append(range)?

👍
Comment 9 Wenson Hsieh 2018-11-17 18:02:47 PST
Created attachment 355217 [details]
Patch
Comment 10 Wenson Hsieh 2018-11-17 18:03:10 PST
Comment on attachment 355182 [details]
First pass

View in context: https://bugs.webkit.org/attachment.cgi?id=355182&action=review

>>> Source/WebCore/page/Page.cpp:782
>>> +        if (existingEntry == rangesByContainerNode.end())
>> 
>> Why can't we just do: rangesByContainerNode[container].append(range)?
> 
> 👍

On second thought, HashMap doesn't seem to support the subscript operator. However, in the process of looking for one, I stumbled into HashMap::ensure, which seems more like what I want to use here.

I changed this to:

    rangesByContainerNode.ensure(container, [] {
        return Vector<FindReplacementRange>();
    }).iterator->value.append(range);

...which seems cleaner than adding a branch.

>>> Source/WebCore/page/Page.cpp:796
>>> +        // aren't clobbered due to text replacement.
>> 
>> Hm... the fact we'll be doing replacement backwards would be observable.
>> Are we sure we want to do that?
> 
> I *think* it's not a big deal since nothing mandates that replacement needs to be in the same order as the list of matches. That being said, I'll try to change this to replace text in order.

Ryosuke and I chatted earlier, and agreed that since this (1) isn't exposed in Safari, and (2) iterating backwards to perform text replacements would make "Replace All" more robust against find-in-page/TextIterator bugs that may cause us to be off-by-one, it's okay to just keep it this way for now.

If this turns out to be a problem, I can tweak this in a followup patch.
Comment 11 EWS Watchlist 2018-11-18 00:23:51 PST Comment hidden (obsolete)
Comment 12 EWS Watchlist 2018-11-18 00:23:52 PST Comment hidden (obsolete)
Comment 13 Wenson Hsieh 2018-11-18 16:42:24 PST
Created attachment 355245 [details]
Fire 'insertReplacementText' input events.
Comment 14 Ryosuke Niwa 2018-11-19 15:06:12 PST
Comment on attachment 355245 [details]
Fire 'insertReplacementText' input events.

View in context: https://bugs.webkit.org/attachment.cgi?id=355245&action=review

> Source/WebCore/ChangeLog:26
> +        * page/EditorClient.h:
> +        (WebCore::EditorClient::findStringMatchesInPage):
> +        (WebCore::EditorClient::replaceFindMatchesAtIndices):
> +
> +        Add testing hooks to simulate find-in-page, as well as replacing find-in-page matches.

This is not right. We shouldn't be adding EditorClient callbacks just to expose things to Internals.
The right approach is to expose new APIs on testRunner or textInputController, etc... instead.
r- because of this.

> Source/WebCore/page/Page.cpp:782
> +            return Vector<FindReplacementRange>();

Nit: Use { } ?

> Source/WebCore/page/Page.cpp:783
> +        }).iterator->value.append(range);

Can't we just insert this range at the right location instead?

> Source/WebCore/page/Page.cpp:787
> +    for (auto& container : containerNodesInOrderOfReplacement) {

It's strange that this API orders ranges within each root editable element in the tree order
yet keeps the order by which each root editable element appeared in the vector of ranges it received.
I think it's cleaner to always sort container nodes & ranges, or not do either.

> Source/WebCore/page/Page.cpp:796
> +            return (first.length + first.location) > (second.length + second.location);

Instead of sorting it here?

> Source/WebCore/page/Page.cpp:816
> +    Vector<FindReplacementRange> replacementRanges;

Why don't we reserve the initial capacity of rangesToReplace.size() to avoid multiple mallocs?
In most cases, we would not skip ranges. Even in that case, shrinking it later is probably more efficient.
Comment 15 Wenson Hsieh 2018-11-19 23:43:27 PST
Comment on attachment 355245 [details]
Fire 'insertReplacementText' input events.

View in context: https://bugs.webkit.org/attachment.cgi?id=355245&action=review

>> Source/WebCore/ChangeLog:26
>> +        Add testing hooks to simulate find-in-page, as well as replacing find-in-page matches.
> 
> This is not right. We shouldn't be adding EditorClient callbacks just to expose things to Internals.
> The right approach is to expose new APIs on testRunner or textInputController, etc... instead.
> r- because of this.

Fixed! Moved this logic over to TestRunner.

>> Source/WebCore/page/Page.cpp:782
>> +            return Vector<FindReplacementRange>();
> 
> Nit: Use { } ?

Doing that as-is fails with a compilation error, "cannot deduce lambda return type from initializer list".

However, I can return { } if I explicitly make the lambda return type Vector<FindReplacementRange>. Changed to:

    rangesByContainerNode.ensure(container, [] () -> Vector<FindReplacementRange> {
        return { };
    }).iterator->value.append(range);

>> Source/WebCore/page/Page.cpp:783
>> +        }).iterator->value.append(range);
> 
> Can't we just insert this range at the right location instead?

Sure! Changed to an insertion sort.

At least in the case of find and replace, ranges is going to be very close to (if not already in) sorted order anyways, so this should reduce to simply appending each range to the end in a lot of cases.

>> Source/WebCore/page/Page.cpp:787
>> +    for (auto& container : containerNodesInOrderOfReplacement) {
> 
> It's strange that this API orders ranges within each root editable element in the tree order
> yet keeps the order by which each root editable element appeared in the vector of ranges it received.
> I think it's cleaner to always sort container nodes & ranges, or not do either.

I changed this to iterate backwards through both container nodes as well as ranges per container node.

>> Source/WebCore/page/Page.cpp:816
>> +    Vector<FindReplacementRange> replacementRanges;
> 
> Why don't we reserve the initial capacity of rangesToReplace.size() to avoid multiple mallocs?
> In most cases, we would not skip ranges. Even in that case, shrinking it later is probably more efficient.

Sounds good — reserved the initial capacity to rangesToReplace.size();

This also makes a lot of sense, considering that in the vast majority of cases where we'll exercise this codepath, all of the rangesToReplace will be editable and emit a FindReplacementRange anyways.
Comment 16 Wenson Hsieh 2018-11-19 23:51:54 PST
Created attachment 355315 [details]
Patch
Comment 17 Wenson Hsieh 2018-11-20 14:43:04 PST
Created attachment 355366 [details]
Slightly simpler approach
Comment 18 Ryosuke Niwa 2018-11-21 16:38:21 PST
Comment on attachment 355366 [details]
Slightly simpler approach

View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review

> Source/WebCore/page/Page.cpp:781
> +        auto& rangeList = rangesByContainerNode.ensure(range.root, [] () -> Vector<FindReplacementRange> {
> +            return { };
> +        }).iterator->value;

I think returning Vector<FindReplacementRange> { } is probably cleaner rather than specifying the return type.

> Source/WebCore/page/Page.cpp:784
> +        // Iterate backwards through ranges when replacing text, such that editing offsets near the start
> +        // of the document aren't clobbered due to text replacement.

Not sure "iterating backwards" is the right comment here.
Maybe just say sort the range by the end offset?

> Source/WebCore/page/Page.cpp:821
> +        // If the containers are in different frames altogether, compare their frames in traversal order instead.

This comment seems redundant. It's pretty self-evident from the code. Remove?

> Source/WebCore/page/Page.cpp:831
> +        for (auto iterator = ranges.rbegin(); iterator != ranges.rend(); ++iterator) {

The comment about how we do replacement in the reverse tree order is more relevant here.
Comment 19 Wenson Hsieh 2018-11-21 19:44:49 PST
Comment on attachment 355366 [details]
Slightly simpler approach

View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review

Thanks for the review!

>> Source/WebCore/page/Page.cpp:781
>> +        }).iterator->value;
> 
> I think returning Vector<FindReplacementRange> { } is probably cleaner rather than specifying the return type.

Sounds good — changed to return Vector<FindReplacementRange> { };

>> Source/WebCore/page/Page.cpp:784
>> +        // of the document aren't clobbered due to text replacement.
> 
> Not sure "iterating backwards" is the right comment here.
> Maybe just say sort the range by the end offset?

👍🏻 Reworded this comment accordingly, and moved the bit about not clobbering ranges further down.

>> Source/WebCore/page/Page.cpp:821
>> +        // If the containers are in different frames altogether, compare their frames in traversal order instead.
> 
> This comment seems redundant. It's pretty self-evident from the code. Remove?

Ok — removed!

>> Source/WebCore/page/Page.cpp:831
>> +        for (auto iterator = ranges.rbegin(); iterator != ranges.rend(); ++iterator) {
> 
> The comment about how we do replacement in the reverse tree order is more relevant here.

Added the earlier comment here.
Comment 20 Wenson Hsieh 2018-11-21 20:25:00 PST
Created attachment 355461 [details]
Patch for landing
Comment 21 WebKit Commit Bot 2018-11-21 21:04:06 PST
Comment on attachment 355461 [details]
Patch for landing

Clearing flags on attachment: 355461

Committed r238438: <https://trac.webkit.org/changeset/238438>
Comment 22 Daniel Bates 2018-11-21 23:32:03 PST
Comment on attachment 355366 [details]
Slightly simpler approach

View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review

> Source/WebCore/page/Page.cpp:825
> +    for (auto container : containerNodesInOrderOfReplacement) {

This code causes unnecessary ref count churn. container should be auto&.

> Source/WebCore/page/Page.cpp:836
> +            frame->selection().setSelectedRange(range.get(), DOWNSTREAM, true);

What is the purpose of the Boolean arguments. Should consider having this function take enums, use an inline comment here, or a named local that describes the Boolean.

> Source/WebCore/page/Page.cpp:837
> +            frame->editor().replaceSelectionWithText(replacementText, true, false, EditAction::InsertReplacement);

Ditto.
Comment 23 Wenson Hsieh 2018-11-21 23:38:40 PST
Comment on attachment 355366 [details]
Slightly simpler approach

View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review

>> Source/WebCore/page/Page.cpp:825
>> +    for (auto container : containerNodesInOrderOfReplacement) {
> 
> This code causes unnecessary ref count churn. container should be auto&.

Good point. I'll fix this in a followup.

>> Source/WebCore/page/Page.cpp:836
>> +            frame->selection().setSelectedRange(range.get(), DOWNSTREAM, true);
> 
> What is the purpose of the Boolean arguments. Should consider having this function take enums, use an inline comment here, or a named local that describes the Boolean.

I'll refactor setSelectedRange() and replaceSelectionWithText() to take `enum class`es instead of bools for these arguments in a followup as well.
Comment 24 Daniel Bates 2018-11-21 23:46:36 PST
Comment on attachment 355366 [details]
Slightly simpler approach

View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review

> Source/WebKit/WebProcess/InjectedBundle/API/c/WKBundlePage.cpp:464
> +    for (size_t arrayIndex = 0; arrayIndex < matchIndices->size(); ++arrayIndex) {

It would be more idiomatic to write this loop using i as the name of the index variable. Caching the size of the vector would also be a nice improvement or is the compiler smart enough to realize matchIndices isn't being mutated?

> Source/WebKit/WebProcess/WebPage/FindController.cpp:102
> +uint32_t FindController::replaceMatches(Vector<uint32_t>&& matchIndices, const String& replacementText, bool selectionOnly)

This function does not take ownership of matchIndices. So, i suggest it be passed as const lvalue refererenxe instead of rvalue-reference.

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:3799
> +void WebPage::replaceStringMatchesFromInjectedBundle(Vector<uint32_t>&& matchIndices, const String& replacementText, bool selectionOnly)

Ditto.

> Source/WebKit/WebProcess/WebPage/WebPage.cpp:3834
> +void WebPage::replaceMatches(Vector<uint32_t>&& matchIndices, const String& replacementText, bool selectionOnly, CallbackID callbackID)

Ditto.