From http://code.google.com/p/chromium/issues/detail?id=139552 Open attached example (http://code.google.com/p/chromium/issues/attachmentText?id=139552&aid=1395520000000&name=test.html&token=_zfslVPKMGj56AZt5m4QuCjHpJ0%3A1345659739164) and press "Highlight" button 2x It takes around 791ms on my machine (i7 2.8) where in Chrome 15 it only takes 30ms. Now press the "Collapse selection before syntax highlight" which calls: window.getSelection().collapse() before it assigns new innerHTML. When selection is collapsed I get around 30ms. Chrome 15 is around 30-40ms in both cases (slightly better when not checked)
I wonder if this came from the original HTML5 Parser work. It's not clear to me what the timeframe of this regression is.
Created attachment 191895 [details] Test case (original from Chromium, by jaroslav.benc) (Uploading original test case.)
You can also use Performance/Bindings/innerHTML-setter.html for micro benchmarks.
Regression is in Chromium r-103242 http://crrev.com/103242 which is a WebKit roll r96201:96267 (r96201:r96267). Revision r96257 by Ryosuke, fixing: Bug 62092 - Setting innerText to an empty string on editable div loses focus ...looks suspicious, and is the likely culprit. This change caused a different regression, fixed in r115460 namely: Bug 83983 - REGRESSION(r96257): Deleting a large amount of text is very slow The changeset message for r115460 is very instructive: "The change in r96257 did not cause the performance regression per se, but exposed a problem ... If the container has a very large number of children, we walk the entire list of child nodes in the container simply to find out how many they are." So the concrete issue looks like: Updating selection on (start/end?) child node removal is slow (b/c triggers some underlying slow computation, say offset). Prior to r96257 the selection was *cleared* if start/end node was deleted, so this wasn't triggered. When this bug was fixed (preserving the selection), it hit the slow computation. Revision r115460 fixed some of these regressions, but, given this bug, not all. Probably has similar cause and fix to that bug. Other notes: Ryosuke comments at http://crbug.com/139552#c3 that: "This is probably because we're updating selection on every child node removal." To Kentaro: Thanks for the benchmark ref! I'll next work on building a simple test case, as we'll need it anyway, and then look at which slow code we're hitting.
Some links. WebKit roll revision log: http://trac.webkit.org/log/trunk/?rev=96267&stop_rev=96201 Full logs: http://trac.webkit.org/log/trunk/?rev=96267&stop_rev=96201&verbose=on Revision r96257 for Bug 62092 - Setting innerText to an empty string on editable div loses focus http://trac.webkit.org/changeset/96257 Revision r115460 for Bug 83983 - REGRESSION(r96257): Deleting a large amount of text is very slow http://trac.webkit.org/changeset/115460
nbarth: Great discovery! Let's dive into the fix!
(In reply to comment #4) > Regression is in Chromium r-103242 http://crrev.com/103242 > which is a WebKit roll r96201:96267 (r96201:r96267). > > Revision r96257 by Ryosuke, fixing: That's not my patch although I reviewed it. The key is to mass-remove nodes as much as possible since the cost of updating selection manifests only at a node removal.
(In reply to comment #7) > (In reply to comment #4) > > Regression is in Chromium r-103242 http://crrev.com/103242 > > which is a WebKit roll r96201:96267 (r96201:r96267). > > > > Revision r96257 by Ryosuke, fixing: > > That's not my patch although I reviewed it. Oops, sorry! (For the record, patch by Una Sabovic) > The key is to mass-remove nodes as much as possible since the cost of updating selection manifests only at a node removal. Abstractly, the two natural solutions are: 1. Make updating selection fast 2. Avoid updating selection (e.g., batch node removals) Looking at the solution in r115460 (by Enrica Casucci ;-) for Bug 83983 - REGRESSION(r96257): Deleting a large amount of text is very slow http://trac.webkit.org/changeset/115460 ...looks like it just speeds up the computation, without needing to make major code changes (i.e., #1). Perhaps we can do something similar here. In terms of numbers, I'm guessing we're hitting an O(n^2) algorithm here, presumably doing something O(n) to update the selection for each of the n child nodes. If we can make updating the selection O(1) without much trouble, this would solve it.
(In reply to comment #8) > (In reply to comment #7) > > (In reply to comment #4) > > > Regression is in Chromium r-103242 http://crrev.com/103242 > > > which is a WebKit roll r96201:96267 (r96201:r96267). > > > > > > Revision r96257 by Ryosuke, fixing: > > > > That's not my patch although I reviewed it. > > Oops, sorry! (For the record, patch by Una Sabovic) > > > The key is to mass-remove nodes as much as possible since the cost of updating selection manifests only at a node removal. > > Abstractly, the two natural solutions are: > 1. Make updating selection fast > 2. Avoid updating selection (e.g., batch node removals) We need to do 2. There's nothing of significance we can do for 1 at this point.
(In reply to comment #9) > (In reply to comment #8) > > Abstractly, the two natural solutions are: > > 1. Make updating selection fast > > 2. Avoid updating selection (e.g., batch node removals) > > We need to do 2. There's nothing of significance we can do for 1 at this point. I was afraid you would say that... (>.<)
Created attachment 192416 [details] Test case (revised: simplified, configurable) Narrowed down test case -- specific bug is: setting innerHTML with selection is O(n^2) in number of children of *existing* value Notably, clearing innerHTML (setting to "") takes O(n^2) time. This looks very similar to the other regression caused by the original bug: Bug 83983 - REGRESSION(r96257): Deleting a large amount of text is very slow This test case just sets innerHTML to a large number of <br>, then clears it, allowing you to vary the number of children and whether the selection is set or not, seeing performance in each case. With selection set, performance is O(n^2); with selection unset, performance is O(n). For reference: * setting innerHTML is O(n) in number of children of *new* value * div.focus() is O(n) in number of children * window.getSelection().collapse() is O(n) in number of children
(In reply to comment #1) > I wonder if this came from the original HTML5 Parser work. It's not clear to me what the timeframe of this regression is. Just to reply to this: This isn't related to the HTML5 Parser work; it's just due to a bugfix meaning we preserve a selection (instead of clearing it), hitting a slow code path we hadn't hit before.
Created attachment 192635 [details] Test case (revised: various selection types) Revised test case to allow various selection types: None, Caret, and Range. (Code cleanup too, e.g., explicitly setting selection, rather than implicitly setting via focus.) Existing example used Caret selection (from focus); having range (selecting all innerHTML) also quadratic and even slower (factor of about 3).
Created attachment 192637 [details] Initial performance test * platform/gtk/TestExpectations: Removing a few expectations for tests that were rolled out in r145296. * platform/gtk/editing/pasteboard/paste-text-016-expected.txt: Rebaselining after r145296. * platform/gtk/fast/dynamic/002-expected.txt: Ditto. git-svn-id: http://svn.webkit.org/repository/webkit/trunk@145343 268f45cc-cd09-0410-ab3c-d52691b4dbfc
Comment on attachment 192637 [details] Initial performance test View in context: https://bugs.webkit.org/attachment.cgi?id=192637&action=review > PerformanceTests/ChangeLog:10 > + * Parser/innerHTML-setter-selection-long.html: Added. Since this is asymptotic behavior, you should be able to test it inside LayoutTests/perf/
(In reply to comment #14) > Created an attachment (id=192637) [details] Sorry, wrong comment (fixed my script). Just uploaded a performance test.
To clarify my position, please don't add a test for this bug as a performance test. This bug is about an asymptotic behavior, and should be tested by a layout test using magnitude-perf.js.
(In reply to comment #17) > To clarify my position, please don't add a test for this bug as a performance test. This bug is about an asymptotic behavior, and should be tested by a layout test using magnitude-perf.js. nbarth: c.f. See this patch for how to write a layout perf test. https://bugs.webkit.org/attachment.cgi?id=191094&action=review
(In reply to comment #15) (In reply to comment #17) > To clarify my position, please don't add a test for this bug as a performance test. This bug is about an asymptotic behavior, and should be tested by a layout test using magnitude-perf.js. Thanks, will do! What's the rule for when to put tests in PerformanceTests/ vs. LayoutTests/perf/ ? Is it simply: * asymptotic behavior in LayoutTests/perf/ using magnitude-perf.js * other (performance etc.) in PerformanceTests/
(In reply to comment #19) > > What's the rule for when to put tests in PerformanceTests/ vs. LayoutTests/perf/ ? > Is it simply: > * asymptotic behavior in LayoutTests/perf/ using magnitude-perf.js > * other (performance etc.) in PerformanceTests/ Yes. Whenever we're testing asymptotic behavior, we should be writing a layout test using magnitude-perf.js.
Created attachment 192669 [details] Convert test to order-of-magnitude regression test
(In reply to comment #20) > Whenever we're testing asymptotic behavior, we should be writing a layout test using magnitude-perf.js. Got it, thanks! Order-of-growth test using magnitude-perf.js uploaded.
Comment on attachment 192669 [details] Convert test to order-of-magnitude regression test View in context: https://bugs.webkit.org/attachment.cgi?id=192669&action=review > LayoutTests/perf/innerHTML-setter-selection-long.html:12 > + var childArray = new Array(magnitude); > + for (var i = 0; i < magnitude; ++i) { > + childArray[i] = "<br>"; > + } > + childList = childArray.join("\n"); Nit: This whole thing can just be: childList = new Array(magnitude).join("<br>\n");
Created attachment 192848 [details] Simplify initialization in test
Created attachment 192850 [details] Test case (tweak: simpler initialization)
(In reply to comment #23) >>[...long string initialization...] > Nit: This whole thing can just be: > childList = new Array(magnitude).join("<br>\n"); Thanks for the tip -- tests revised! BTW, there's a fenceposting error (join fills in gaps), so you need to add 1 to get it right: childList = new Array(magnitude + 1).join("<br>\n");
I’ve tracked down the cause and have a proposed solution. First, to explain why the current code is O(n^2). Setting innerHTML first removes all the children in the DOM subtree. If there is a selection, it checks each of the n nodes to see if it is in the selection, which involves 3 linear searches, which is O(n) work each (3 * average n/2 ~= 3n/2), for O(n^2) work. See attached DOT file for call graph from gperf-tools; you can see the slow section in the 2 big boxes at the bottom, under compareNode. In detail, at the top we have HTMLElement::setInnerHTML in the middle, in function FrameSelection::respondToNodeModification the line: Range::CompareResults compareResult = range->compareNode(node, ec); gets called for almost every node (except the first, and presumably last). Range::compareNode is the slow function (O(n)): it first walks backwards through the tree (calling previousSibling), then walks forwards twice (calling nextSibling), hence at the bottom previousSibling and nextSibling are the hot spots, in a 1:2 ratio. Stack trace, starting from setInnerHTML (indented section is secondary branch): WebCore::Node::nextSibling() << primary hot spot WebCore::ContainerNode::childNode(unsigned int) const << tertiary hot spot WebCore::Node::childNode(unsigned int) const WebCore::Range::checkNodeWOffset(WebCore::Node*, int, int&) const WebCore::Range::comparePoint(WebCore::Node*, int, int&) const WebCore::Node::previousSibling() << secondary hot spot WebCore::Node::nodeIndex << quaternary hot spot WebCore::Range::compareNode(WebCore::Node*, int&) const WebCore::FrameSelection::respondToNodeModification(WebCore::Node*, bool, bool, bool, bool) WebCore::FrameSelection::nodeWillBeRemoved(WebCore::Node*) WebCore::Document::nodeChildrenWillBeRemoved(WebCore::ContainerNode*) WebCore::willRemoveChildren(WebCore::ContainerNode*) WebCore::ContainerNode::removeChildren() WebCore::replaceChildrenWithFragment(WebCore::ContainerNode*, WTF::PassRefPtr<WebCore::DocumentFragment>, int&) WebCore::HTMLElement::setInnerHTML(WTF::String const&, int&)
Created attachment 197500 [details] Call graph from gperftools (as GraphViz DOT file) You can see the slow section in the 2 big boxes at the bottom, under compareNode. A simple DOT viewer is available here: https://code.google.com/p/jrfonseca/wiki/XDot
Turning to solutions, two options present themselves: 1. Low-level: still check each node against the selection, but update the selection so it doesn’t hit the search 2. High-level: update the selection once for the whole container as a batch, rather than for each child Proposed patch (next post) takes approach #2; resulting code is much clearer than a low-level solution, and avoids many possible problems with trying to optimize updating the selection one node at a time (e.g., what if selection is a caret in the middle of innerHTML, and then it’s set to zero, etc.?); this follows Ryosuke’s rec in comment #9: https://bugs.webkit.org/show_bug.cgi?id=94733#c9 This also yields calculation overhead that is O(1) (in number of children), since we only need to make a single check for if the boundary points of the selection are in the deleted subtree, and resulting code is very fast.
Discussion moved to: https://code.google.com/p/chromium/issues/detail?id=139552
(CC'ing people who've touched FrameSelection::respondToNodeModification)
As I mentioned earlier, the solution here is to detect this case and update the selection pre-emptively in setInnerHTML or removeAllChildren.
Created attachment 197504 [details] Proposed solution (first draft)
(In reply to comment #32) > As I mentioned earlier, the solution here is to detect this case and update the selection pre-emptively in setInnerHTML or removeAllChildren. Done in patch (uploaded); hopefully works as is. Otherwise further development against Blink (hopefully easy to port), over at Chromium Issue Tracker: https://code.google.com/p/chromium/issues/detail?id=139552
Comment on attachment 197504 [details] Proposed solution (first draft) View in context: https://bugs.webkit.org/attachment.cgi?id=197504&action=review > Source/WebCore/editing/FrameSelection.cpp:389 > + Position start = m_selection.start(); > + Position end = m_selection.end(); We need to also check base & extent. I'm very skeptical that this will do the right thing in all cases. I'd much prefer adding some flag to respondToNodeModification and handle it there. > Source/WebCore/editing/FrameSelection.cpp:394 > + if (startRemoved) > + start = Position(container, 0, Position::PositionIsOffsetInAnchor); Please use firstPositionInNode(container). > Source/WebCore/editing/FrameSelection.cpp:396 > + end = Position(container, 0, Position::PositionIsOffsetInAnchor); Ditto. > Source/WebCore/editing/FrameSelection.cpp:406 > + setSelection(VisibleSelection(), DoNotSetFocus); > + > + clearRenderViewSelection(m_selection.start()); The ordering here matters. This is why respondToNodeModification has a flag for it. Another reason to modify respondToNodeModification instead of duplicating code.
FYI, this has been fixed in Chromium due to a separate change: https://code.google.com/p/chromium/issues/detail?id=139552
Please don't assign bugs to other contributors without first asking if he or she is intending to be working on it. As it turns out, I don't intend to work on this bug.
(In reply to comment #37) > Please don't assign bugs to other contributors without first asking if he or she is intending to be working on it. As it turns out, I don't intend to work on this bug. Sorry about that.