RESOLVED FIXED 191786
[Cocoa] [WebKit2] Add support for replacing find-in-page text matches
https://bugs.webkit.org/show_bug.cgi?id=191786
Summary [Cocoa] [WebKit2] Add support for replacing find-in-page text matches
Wenson Hsieh
Reported 2018-11-16 15:48:56 PST
Attachments
WIP (no tests) (23.94 KB, patch)
2018-11-16 15:52 PST, Wenson Hsieh
no flags
Archive of layout-test-results from ews123 for ios-simulator-wk2 (6.97 MB, application/zip)
2018-11-16 21:27 PST, EWS Watchlist
no flags
First pass (51.84 KB, patch)
2018-11-16 22:42 PST, Wenson Hsieh
no flags
Patch (51.88 KB, patch)
2018-11-17 18:02 PST, Wenson Hsieh
no flags
Archive of layout-test-results from ews126 for ios-simulator-wk2 (2.40 MB, application/zip)
2018-11-18 00:23 PST, EWS Watchlist
no flags
Fire 'insertReplacementText' input events. (56.41 KB, patch)
2018-11-18 16:42 PST, Wenson Hsieh
no flags
Patch (65.67 KB, patch)
2018-11-19 23:51 PST, Wenson Hsieh
no flags
Slightly simpler approach (63.87 KB, patch)
2018-11-20 14:43 PST, Wenson Hsieh
rniwa: review+
Patch for landing (63.81 KB, patch)
2018-11-21 20:25 PST, Wenson Hsieh
no flags
Wenson Hsieh
Comment 1 2018-11-16 15:52:26 PST
Created attachment 355144 [details] WIP (no tests)
EWS Watchlist
Comment 2 2018-11-16 21:27:08 PST Comment hidden (obsolete)
EWS Watchlist
Comment 3 2018-11-16 21:27:10 PST Comment hidden (obsolete)
Wenson Hsieh
Comment 4 2018-11-16 22:42:05 PST
Created attachment 355182 [details] First pass
Ryosuke Niwa
Comment 5 2018-11-16 23:15:35 PST
Comment on attachment 355182 [details] First pass View in context: https://bugs.webkit.org/attachment.cgi?id=355182&action=review > Source/WebCore/page/Page.cpp:778 > + ListHashSet<ContainerNode*> containerNodesInOrderOfReplacement; > + HashMap<ContainerNode*, Vector<FindReplacementRange>> rangesByContainerNode; Can we use Ref/RefPtr instead? > Source/WebCore/page/Page.cpp:782 > + auto existingEntry = rangesByContainerNode.find(container); > + if (existingEntry == rangesByContainerNode.end()) Why can't we just do: rangesByContainerNode[container].append(range)? > Source/WebCore/page/Page.cpp:789 > + for (auto* container : containerNodesInOrderOfReplacement) { Why not just rangesByContainerNode.keys() ? > Source/WebCore/page/Page.cpp:796 > + // Iterate backwards through ranges, such that editing offsets near the start of the document > + // aren't clobbered due to text replacement. Hm... the fact we'll be doing replacement backwards would be observable. Are we sure we want to do that? > Source/WebCore/page/Page.cpp:829 > + FindReplacementRange rangeInfo; > + rangeInfo.root = WTFMove(highestRoot); FindReplacementRange rangeInfo { WTFMove(highestRoot) } ? Although... it might be cleaner to define two local size_t's and just construct FindReplacementRange when you call append.
Ryosuke Niwa
Comment 6 2018-11-16 23:16:15 PST
I can't do the full review... my brain is toast after looking at so many PSON related API test failures.
Wenson Hsieh
Comment 7 2018-11-16 23:30:40 PST
Comment on attachment 355182 [details] First pass View in context: https://bugs.webkit.org/attachment.cgi?id=355182&action=review >> Source/WebCore/page/Page.cpp:778 >> + HashMap<ContainerNode*, Vector<FindReplacementRange>> rangesByContainerNode; > > Can we use Ref/RefPtr instead? Sure! (I avoided it here because `Vector<FindReplacementRange>&& ranges` keeps a strong Ref to the editing roots anyways, so the other data structures shouldn't need to worry about keeping the node alive). >> Source/WebCore/page/Page.cpp:789 >> + for (auto* container : containerNodesInOrderOfReplacement) { > > Why not just rangesByContainerNode.keys() ? I did this so that the order in which text replacement occurs would be deterministic (one of the layout tests I wrote was flaky because the final selection would not always end up in the same place, since the order of `rangesByContainerNode.keys()` isn't predictable). On second thought, I should really just make this replace text in the order of Ranges. It's a tad annoying, because we'd need to keep a map of editing root to some sort of "replacement offset amount" which we'd use to offset latter editing ranges by as we replace. I suppose this is a similar problem as the one solved in spellchecking code in Editor.cpp, when there are multiple autocorrection checking results in a single paragraph. >> Source/WebCore/page/Page.cpp:796 >> + // aren't clobbered due to text replacement. > > Hm... the fact we'll be doing replacement backwards would be observable. > Are we sure we want to do that? I *think* it's not a big deal since nothing mandates that replacement needs to be in the same order as the list of matches. That being said, I'll try to change this to replace text in order. >> Source/WebCore/page/Page.cpp:829 >> + rangeInfo.root = WTFMove(highestRoot); > > FindReplacementRange rangeInfo { WTFMove(highestRoot) } ? > Although... it might be cleaner to define two local size_t's and just construct FindReplacementRange when you call append. Sounds good — I'll do the latter.
Wenson Hsieh
Comment 8 2018-11-16 23:31:12 PST
Comment on attachment 355182 [details] First pass View in context: https://bugs.webkit.org/attachment.cgi?id=355182&action=review >> Source/WebCore/page/Page.cpp:782 >> + if (existingEntry == rangesByContainerNode.end()) > > Why can't we just do: rangesByContainerNode[container].append(range)? 👍
Wenson Hsieh
Comment 9 2018-11-17 18:02:47 PST
Wenson Hsieh
Comment 10 2018-11-17 18:03:10 PST
Comment on attachment 355182 [details] First pass View in context: https://bugs.webkit.org/attachment.cgi?id=355182&action=review >>> Source/WebCore/page/Page.cpp:782 >>> + if (existingEntry == rangesByContainerNode.end()) >> >> Why can't we just do: rangesByContainerNode[container].append(range)? > > 👍 On second thought, HashMap doesn't seem to support the subscript operator. However, in the process of looking for one, I stumbled into HashMap::ensure, which seems more like what I want to use here. I changed this to: rangesByContainerNode.ensure(container, [] { return Vector<FindReplacementRange>(); }).iterator->value.append(range); ...which seems cleaner than adding a branch. >>> Source/WebCore/page/Page.cpp:796 >>> + // aren't clobbered due to text replacement. >> >> Hm... the fact we'll be doing replacement backwards would be observable. >> Are we sure we want to do that? > > I *think* it's not a big deal since nothing mandates that replacement needs to be in the same order as the list of matches. That being said, I'll try to change this to replace text in order. Ryosuke and I chatted earlier, and agreed that since this (1) isn't exposed in Safari, and (2) iterating backwards to perform text replacements would make "Replace All" more robust against find-in-page/TextIterator bugs that may cause us to be off-by-one, it's okay to just keep it this way for now. If this turns out to be a problem, I can tweak this in a followup patch.
EWS Watchlist
Comment 11 2018-11-18 00:23:51 PST Comment hidden (obsolete)
EWS Watchlist
Comment 12 2018-11-18 00:23:52 PST Comment hidden (obsolete)
Wenson Hsieh
Comment 13 2018-11-18 16:42:24 PST
Created attachment 355245 [details] Fire 'insertReplacementText' input events.
Ryosuke Niwa
Comment 14 2018-11-19 15:06:12 PST
Comment on attachment 355245 [details] Fire 'insertReplacementText' input events. View in context: https://bugs.webkit.org/attachment.cgi?id=355245&action=review > Source/WebCore/ChangeLog:26 > + * page/EditorClient.h: > + (WebCore::EditorClient::findStringMatchesInPage): > + (WebCore::EditorClient::replaceFindMatchesAtIndices): > + > + Add testing hooks to simulate find-in-page, as well as replacing find-in-page matches. This is not right. We shouldn't be adding EditorClient callbacks just to expose things to Internals. The right approach is to expose new APIs on testRunner or textInputController, etc... instead. r- because of this. > Source/WebCore/page/Page.cpp:782 > + return Vector<FindReplacementRange>(); Nit: Use { } ? > Source/WebCore/page/Page.cpp:783 > + }).iterator->value.append(range); Can't we just insert this range at the right location instead? > Source/WebCore/page/Page.cpp:787 > + for (auto& container : containerNodesInOrderOfReplacement) { It's strange that this API orders ranges within each root editable element in the tree order yet keeps the order by which each root editable element appeared in the vector of ranges it received. I think it's cleaner to always sort container nodes & ranges, or not do either. > Source/WebCore/page/Page.cpp:796 > + return (first.length + first.location) > (second.length + second.location); Instead of sorting it here? > Source/WebCore/page/Page.cpp:816 > + Vector<FindReplacementRange> replacementRanges; Why don't we reserve the initial capacity of rangesToReplace.size() to avoid multiple mallocs? In most cases, we would not skip ranges. Even in that case, shrinking it later is probably more efficient.
Wenson Hsieh
Comment 15 2018-11-19 23:43:27 PST
Comment on attachment 355245 [details] Fire 'insertReplacementText' input events. View in context: https://bugs.webkit.org/attachment.cgi?id=355245&action=review >> Source/WebCore/ChangeLog:26 >> + Add testing hooks to simulate find-in-page, as well as replacing find-in-page matches. > > This is not right. We shouldn't be adding EditorClient callbacks just to expose things to Internals. > The right approach is to expose new APIs on testRunner or textInputController, etc... instead. > r- because of this. Fixed! Moved this logic over to TestRunner. >> Source/WebCore/page/Page.cpp:782 >> + return Vector<FindReplacementRange>(); > > Nit: Use { } ? Doing that as-is fails with a compilation error, "cannot deduce lambda return type from initializer list". However, I can return { } if I explicitly make the lambda return type Vector<FindReplacementRange>. Changed to: rangesByContainerNode.ensure(container, [] () -> Vector<FindReplacementRange> { return { }; }).iterator->value.append(range); >> Source/WebCore/page/Page.cpp:783 >> + }).iterator->value.append(range); > > Can't we just insert this range at the right location instead? Sure! Changed to an insertion sort. At least in the case of find and replace, ranges is going to be very close to (if not already in) sorted order anyways, so this should reduce to simply appending each range to the end in a lot of cases. >> Source/WebCore/page/Page.cpp:787 >> + for (auto& container : containerNodesInOrderOfReplacement) { > > It's strange that this API orders ranges within each root editable element in the tree order > yet keeps the order by which each root editable element appeared in the vector of ranges it received. > I think it's cleaner to always sort container nodes & ranges, or not do either. I changed this to iterate backwards through both container nodes as well as ranges per container node. >> Source/WebCore/page/Page.cpp:816 >> + Vector<FindReplacementRange> replacementRanges; > > Why don't we reserve the initial capacity of rangesToReplace.size() to avoid multiple mallocs? > In most cases, we would not skip ranges. Even in that case, shrinking it later is probably more efficient. Sounds good — reserved the initial capacity to rangesToReplace.size(); This also makes a lot of sense, considering that in the vast majority of cases where we'll exercise this codepath, all of the rangesToReplace will be editable and emit a FindReplacementRange anyways.
Wenson Hsieh
Comment 16 2018-11-19 23:51:54 PST
Wenson Hsieh
Comment 17 2018-11-20 14:43:04 PST
Created attachment 355366 [details] Slightly simpler approach
Ryosuke Niwa
Comment 18 2018-11-21 16:38:21 PST
Comment on attachment 355366 [details] Slightly simpler approach View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review > Source/WebCore/page/Page.cpp:781 > + auto& rangeList = rangesByContainerNode.ensure(range.root, [] () -> Vector<FindReplacementRange> { > + return { }; > + }).iterator->value; I think returning Vector<FindReplacementRange> { } is probably cleaner rather than specifying the return type. > Source/WebCore/page/Page.cpp:784 > + // Iterate backwards through ranges when replacing text, such that editing offsets near the start > + // of the document aren't clobbered due to text replacement. Not sure "iterating backwards" is the right comment here. Maybe just say sort the range by the end offset? > Source/WebCore/page/Page.cpp:821 > + // If the containers are in different frames altogether, compare their frames in traversal order instead. This comment seems redundant. It's pretty self-evident from the code. Remove? > Source/WebCore/page/Page.cpp:831 > + for (auto iterator = ranges.rbegin(); iterator != ranges.rend(); ++iterator) { The comment about how we do replacement in the reverse tree order is more relevant here.
Wenson Hsieh
Comment 19 2018-11-21 19:44:49 PST
Comment on attachment 355366 [details] Slightly simpler approach View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review Thanks for the review! >> Source/WebCore/page/Page.cpp:781 >> + }).iterator->value; > > I think returning Vector<FindReplacementRange> { } is probably cleaner rather than specifying the return type. Sounds good — changed to return Vector<FindReplacementRange> { }; >> Source/WebCore/page/Page.cpp:784 >> + // of the document aren't clobbered due to text replacement. > > Not sure "iterating backwards" is the right comment here. > Maybe just say sort the range by the end offset? 👍🏻 Reworded this comment accordingly, and moved the bit about not clobbering ranges further down. >> Source/WebCore/page/Page.cpp:821 >> + // If the containers are in different frames altogether, compare their frames in traversal order instead. > > This comment seems redundant. It's pretty self-evident from the code. Remove? Ok — removed! >> Source/WebCore/page/Page.cpp:831 >> + for (auto iterator = ranges.rbegin(); iterator != ranges.rend(); ++iterator) { > > The comment about how we do replacement in the reverse tree order is more relevant here. Added the earlier comment here.
Wenson Hsieh
Comment 20 2018-11-21 20:25:00 PST
Created attachment 355461 [details] Patch for landing
WebKit Commit Bot
Comment 21 2018-11-21 21:04:06 PST
Comment on attachment 355461 [details] Patch for landing Clearing flags on attachment: 355461 Committed r238438: <https://trac.webkit.org/changeset/238438>
Daniel Bates
Comment 22 2018-11-21 23:32:03 PST
Comment on attachment 355366 [details] Slightly simpler approach View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review > Source/WebCore/page/Page.cpp:825 > + for (auto container : containerNodesInOrderOfReplacement) { This code causes unnecessary ref count churn. container should be auto&. > Source/WebCore/page/Page.cpp:836 > + frame->selection().setSelectedRange(range.get(), DOWNSTREAM, true); What is the purpose of the Boolean arguments. Should consider having this function take enums, use an inline comment here, or a named local that describes the Boolean. > Source/WebCore/page/Page.cpp:837 > + frame->editor().replaceSelectionWithText(replacementText, true, false, EditAction::InsertReplacement); Ditto.
Wenson Hsieh
Comment 23 2018-11-21 23:38:40 PST
Comment on attachment 355366 [details] Slightly simpler approach View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review >> Source/WebCore/page/Page.cpp:825 >> + for (auto container : containerNodesInOrderOfReplacement) { > > This code causes unnecessary ref count churn. container should be auto&. Good point. I'll fix this in a followup. >> Source/WebCore/page/Page.cpp:836 >> + frame->selection().setSelectedRange(range.get(), DOWNSTREAM, true); > > What is the purpose of the Boolean arguments. Should consider having this function take enums, use an inline comment here, or a named local that describes the Boolean. I'll refactor setSelectedRange() and replaceSelectionWithText() to take `enum class`es instead of bools for these arguments in a followup as well.
Daniel Bates
Comment 24 2018-11-21 23:46:36 PST
Comment on attachment 355366 [details] Slightly simpler approach View in context: https://bugs.webkit.org/attachment.cgi?id=355366&action=review > Source/WebKit/WebProcess/InjectedBundle/API/c/WKBundlePage.cpp:464 > + for (size_t arrayIndex = 0; arrayIndex < matchIndices->size(); ++arrayIndex) { It would be more idiomatic to write this loop using i as the name of the index variable. Caching the size of the vector would also be a nice improvement or is the compiler smart enough to realize matchIndices isn't being mutated? > Source/WebKit/WebProcess/WebPage/FindController.cpp:102 > +uint32_t FindController::replaceMatches(Vector<uint32_t>&& matchIndices, const String& replacementText, bool selectionOnly) This function does not take ownership of matchIndices. So, i suggest it be passed as const lvalue refererenxe instead of rvalue-reference. > Source/WebKit/WebProcess/WebPage/WebPage.cpp:3799 > +void WebPage::replaceStringMatchesFromInjectedBundle(Vector<uint32_t>&& matchIndices, const String& replacementText, bool selectionOnly) Ditto. > Source/WebKit/WebProcess/WebPage/WebPage.cpp:3834 > +void WebPage::replaceMatches(Vector<uint32_t>&& matchIndices, const String& replacementText, bool selectionOnly, CallbackID callbackID) Ditto.
Note You need to log in before you can comment on or make changes to this bug.