RESOLVED FIXED 104332
NodeRenderingContext is slow due to ComposedShadowTreeWalker
https://bugs.webkit.org/show_bug.cgi?id=104332
Summary NodeRenderingContext is slow due to ComposedShadowTreeWalker
Hajime Morrita
Reported 2012-12-06 18:16:08 PST
This is variant of Bug 103208. Since I don't want to obsolete the original patch there, I'd file a separate bug. My attempt is to kill the slowdown even if SHADOW_DOM is on.
Attachments
Patch (13.80 KB, patch)
2012-12-07 03:11 PST, Hajime Morrita
no flags
dom-modify innerHTML result (21.96 KB, text/html)
2012-12-09 20:14 PST, Hajime Morrita
no flags
Patch (17.78 KB, patch)
2012-12-09 22:33 PST, Hajime Morrita
no flags
Benchmark result with non-bitflag version (23.06 KB, text/html)
2012-12-10 19:35 PST, Hajime Morrita
no flags
Patch (36.54 KB, patch)
2012-12-12 01:40 PST, Hajime Morrita
no flags
WIP (41.41 KB, patch)
2012-12-12 03:19 PST, Hajime Morrita
no flags
WIP (43.94 KB, patch)
2012-12-12 04:00 PST, Hajime Morrita
no flags
WIP (46.04 KB, patch)
2012-12-12 04:51 PST, Hajime Morrita
no flags
Patch (44.91 KB, patch)
2012-12-12 22:37 PST, Hajime Morrita
no flags
Fixed ChangeLog (44.63 KB, patch)
2012-12-12 22:42 PST, Hajime Morrita
no flags
Patch (45.14 KB, patch)
2012-12-13 00:09 PST, Hajime Morrita
no flags
benchmark result showing 2-3% win for document-modify-innerhtml (22.01 KB, text/html)
2012-12-13 00:11 PST, Hajime Morrita
no flags
Patch for landing (45.31 KB, patch)
2012-12-13 21:06 PST, Hajime Morrita
no flags
Patch for landing (45.31 KB, patch)
2012-12-13 21:15 PST, Hajime Morrita
no flags
Hajime Morrita
Comment 1 2012-12-07 03:11:09 PST
Hajime Morrita
Comment 2 2012-12-07 03:14:31 PST
Need to measure this on Apple Mac. dom-modify is too noisy on Chromium to see difference.
Hajime Morrita
Comment 3 2012-12-07 03:52:44 PST
(In reply to comment #2) > Need to measure this on Apple Mac. dom-modify is too noisy on Chromium to see difference. Will do next week....
Hajime Morrita
Comment 4 2012-12-09 20:14:43 PST
Created attachment 178462 [details] dom-modify innerHTML result I confirmed similar progression similar to what Antti did at Bug 103208. Note that this is not whole dom-modify. It runs only innerHTML sub test.
Hajime Morrita
Comment 5 2012-12-09 20:15:30 PST
First three trial came from patched, last three from original.
Hajime Morrita
Comment 6 2012-12-09 22:33:20 PST
Dimitri Glazkov (Google)
Comment 7 2012-12-10 10:16:13 PST
Comment on attachment 178476 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=178476&action=review This looks nice. I am not sure whether the ShadowTreeJoint is the right term, since it needs explanation. Not sure if I have better ideas, but it seems like the shadow tree joint node is a node that impacts shadow tree algorithms... ImpactsShadowDOMAlgorithms is way too verbose though :) > Source/WebCore/ChangeLog:29 > + This chagne has 3-4% speedup on Dromaeo/dom-modify/innerHTML. chagne -> change > Source/WebCore/dom/ContainerNodeAlgorithms.h:261 > + bool inDocument = node->inDocument(); > + bool isShadowTreeJoint = node->isShadowTreeJoint(); > + bool isContainerNode = node->isContainerNode(); Not sure why all these guys need to be pre-declared upfront.
Elliott Sprehn
Comment 8 2012-12-10 10:38:37 PST
Comment on attachment 178476 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=178476&action=review It seems like we could do this without the flag and having to maintain it everywhere. It also seems like we can make CSTW::nextSibling and previousSibling inline and have them check a bitfield based on the m_node that the ComposedShadowTreeWalker was created with for "started on a joint node". If that bitfield is false then the walker should fast path itself. Callers should not need to use special static methods. > Source/WebCore/ChangeLog:23 > + - The node is a child of shadow host. Can we do this without the flag? Why not just parentOrHostNode() for being either an InsertionPointNode or having an elementShadow ? > Source/WebCore/ChangeLog:49 > + (WebCore::Element::childrenChanged): Maingains shadow tree joint flag. Misspelled maintains? You have this all over the description. :) > Source/WebCore/dom/NodeRenderingContext.cpp:95 > + for (Node* sibling = ComposedShadowTreeWalker::traverseNextSibling(m_node); sibling; sibling = ComposedShadowTreeWalker::traverseNextSibling(sibling)) { This doesn't seem right because if you started a non-joint node you'll never run into a joint node by walking siblings per your definition in the ChangeLog so you shouldn't need this complexity where you subvert the walker and use static methods. If that's the case when Walker::nextSibling() can just be smart enough to handle this all for you.
Elliott Sprehn
Comment 9 2012-12-10 10:44:44 PST
(In reply to comment #8) > (From update of attachment 178476 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=178476&action=review > ... > > Can we do this without the flag? Why not just parentOrHostNode() for being either an InsertionPointNode or having an elementShadow ? Specifically, this check seems equivalent: // These are bitfield checks, they should be super fast. bool isShadowPivot() { return !parentOrHostNode()->isInsertionPoint() && (!parentOrHostNode()->isElementNode() || toElement(parentOrHostNode())->shadow(); } and you could do: inline void ComposedShadowTreeWalker::nextSibling() { m_node = m_node->isShadowPivot() ? slowTraverse() : m_node->nextSibling(); }
Hajime Morrita
Comment 10 2012-12-10 16:33:42 PST
I consume a flag here since even checking parent node and rare data is a const considering what we can do without shadow is just fetching a word for m_firstChild. I hope this change be comparable to Bug 103208, where these whole paths are compiled out by SHADOW_DOM flag.
Elliott Sprehn
Comment 11 2012-12-10 16:57:10 PST
(In reply to comment #10) > I consume a flag here since even checking parent node and rare data is a > const considering what we can do without shadow is just fetching > a word for m_firstChild. > > I hope this change be comparable to Bug 103208, where these whole paths are > compiled out by SHADOW_DOM flag. Can you try the bool on CSTW construction and inline path instead? We also can't compile this check out as we need to support traversing the composed children for the new DOM based generated content to work properly with <input>... unless we plan to drop support for all generated content on shadows again.
Elliott Sprehn
Comment 12 2012-12-10 16:58:43 PST
(In reply to comment #11) > (In reply to comment #10) > > I consume a flag here since even checking parent node and rare data is a > > const considering what we can do without shadow is just fetching > > a word for m_firstChild. > > > > I hope this change be comparable to Bug 103208, where these whole paths are > > compiled out by SHADOW_DOM flag. > > Can you try the bool on CSTW construction and inline path instead? We also can't compile this check out as we need to support traversing the composed children for the new DOM based generated content to work properly with <input>... unless we plan to drop support for all generated content on shadows again. Also note that Antti himself has said there's no reason to avoid bitfield checks with fancy acrobatics: https://bugs.webkit.org/show_bug.cgi?id=101448 So checking parentOrHostNode()->isShadowRoot() || parentOrHostNode()->isInsertionPoint() on CSTW should be fine.
Hajime Morrita
Comment 13 2012-12-10 17:11:16 PST
(In reply to comment #12) > (In reply to comment #11) > > (In reply to comment #10) > > > I consume a flag here since even checking parent node and rare data is a > > > const considering what we can do without shadow is just fetching > > > a word for m_firstChild. > > > > > > I hope this change be comparable to Bug 103208, where these whole paths are > > > compiled out by SHADOW_DOM flag. > > > > Can you try the bool on CSTW construction and inline path instead? We also can't compile this check out as we need to support traversing the composed children for the new DOM based generated content to work properly with <input>... unless we plan to drop support for all generated content on shadows again. > > Also note that Antti himself has said there's no reason to avoid bitfield checks with fancy acrobatics: https://bugs.webkit.org/show_bug.cgi?id=101448 This doesn't conflict introducing shadow-tree-joint flag ... Anyway, I'm now aware that we need to take care of generated content here. But I don't think it is good idea to mix shadow tree walking and generated content. I'm introducing NodeRenderingTraversal module as a sister of NodeTraversal ns, which is going to be added by Bug 104507. We could manage generated content things there. Let me try... before you land Bug 178634 :-)
Hajime Morrita
Comment 14 2012-12-10 19:35:24 PST
Created attachment 178690 [details] Benchmark result with non-bitflag version I tried non bitflag version but the benchmark doesn't say it's good. The last three columns came from non-flag version.
Elliott Sprehn
Comment 15 2012-12-10 21:04:19 PST
(In reply to comment #14) > Created an attachment (id=178690) [details] > Benchmark result with non-bitflag version > > I tried non bitflag version but the benchmark doesn't say it's good. > The last three columns came from non-flag version. Hmm, that's really suspicious since isShadowRoot() and isInsertionPoint() are both bitfields and checking two bitfields vs 1 (isShadowPivot) shouldn't have any noticeable difference... Okay lets go with this and perhaps we can try to get rid of this new bitfield in the future.
Dimitri Glazkov (Google)
Comment 16 2012-12-11 09:47:32 PST
Comment on attachment 178476 [details] Patch Please take care of the remaining comments and let's file two bugs: one to track better naming of ShadowTreeJoint and one to combine flags.
Hajime Morrita
Comment 17 2012-12-11 16:09:53 PST
(In reply to comment #16) > (From update of attachment 178476 [details]) > Please take care of the remaining comments and let's file two bugs: one to track better naming of ShadowTreeJoint and one to combine flags. I can't land this since it conflicts Elliot's pseudo element patch. I'll adopt the change and ask for review again.
Hajime Morrita
Comment 18 2012-12-12 01:40:25 PST
Elliott Sprehn
Comment 19 2012-12-12 02:12:54 PST
Comment on attachment 178998 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=178998&action=review This patch is going to make generated content much slower, and also makes appending children to a ShadowRoot n^2 which is bad. Please don't do this :( > Source/WebCore/dom/Element.cpp:2160 > + child->resetShadowTreeJoint(); This doesn't seem like the right place for this. Why does adding a pseudo element reset the join bits? I don't like this because it means adding :before and :after to a page dynamically now causes an extra walk across all children of the node you defined it on. ex. You made <div> ... thousands of nodes ... </div> and then adding div:before {} much slower by introducing a new DOM walk. > Source/WebCore/dom/Element.h:373 > + bool isChildShadowTreeJoint() const { return isInsertionPoint() || shadow() || hasPseudoElements(); } This is way too agressive. You're making everything with :before or :after into a joint node which is not correct. > Source/WebCore/dom/ShadowRoot.cpp:330 > + child->setShadowTreeJoint(); Again this is bad, you make appendChild() to ShadowRoot into an O(n^2) operation. This needs to be handled inside appendChild() itself where you check if the parent isShadowRoot() and update.
Elliott Sprehn
Comment 20 2012-12-12 02:26:33 PST
(In reply to comment #19) > (From update of attachment 178998 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=178998&action=review > ... > Again this is bad, you make appendChild() to ShadowRoot into an O(n^2) operation. This needs to be handled inside appendChild() itself where you check if the parent isShadowRoot() and update. More correctly Node::insertedInto, but adding a ShadowRoot to an existing element needs to update every child as well so that won't work... instead what if inside Node::attach() and Node::detach() you look at your parent node and update the bits? That removes the n^2 traversals. Also can we call the bit "needsShadowRootAwareAttach" ? :before and :after are not "Shadow Joints", especially when the page has no shadows. :)
Hajime Morrita
Comment 21 2012-12-12 03:19:32 PST
Hajime Morrita
Comment 22 2012-12-12 03:43:36 PST
Comment on attachment 178998 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=178998&action=review Elliot, I noticed your comment now. I'll do some routine work like adding files to exp file before addressing your comment. >> Source/WebCore/dom/Element.cpp:2160 >> + child->resetShadowTreeJoint(); > > This doesn't seem like the right place for this. Why does adding a pseudo element reset the join bits? I don't like this because it means adding :before and :after to a page dynamically now causes an extra walk across all children of the node you defined it on. > > ex. You made <div> ... thousands of nodes ... </div> and then adding div:before {} much slower by introducing a new DOM walk. Right. we need some compromise here. >> Source/WebCore/dom/ShadowRoot.cpp:330 >> + child->setShadowTreeJoint(); > > Again this is bad, you make appendChild() to ShadowRoot into an O(n^2) operation. This needs to be handled inside appendChild() itself where you check if the parent isShadowRoot() and update. Oh I overlooked the comment. I should take care of this in coming patch.
Hajime Morrita
Comment 23 2012-12-12 04:00:53 PST
Hajime Morrita
Comment 24 2012-12-12 04:51:10 PST
WebKit Review Bot
Comment 25 2012-12-12 06:06:04 PST
Comment on attachment 179025 [details] WIP Attachment 179025 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/15259959 New failing tests: fast/dom/shadow/pseudoclass-update-disabled-fieldset.html fast/dom/shadow/dynamically-created-shadow-root.html fast/dom/shadow/pseudoclass-update-link-anchor.html fast/dom/shadow/focus-navigation.html fast/dom/shadow/pseudoclass-update-target.html fast/dom/shadow/distribution-className-modified.html fast/dom/shadow/pseudoclass-update-enabled-fieldset.html fast/dom/shadow/content-element-select-dynamic.html editing/shadow/select-contenteditable-shadowhost.html fast/dom/shadow/pseudoclass-update-indeterminate-progress.html fast/dom/shadow/content-element-move.html fast/dom/shadow/pseudoclass-update-checked-input.html fast/dom/shadow/pseudoclass-update-disabled-select.html fast/dom/shadow/pseudoclass-update-visited-area.html fast/dom/shadow/pseudoclass-update-disabled-textarea.html fast/dom/shadow/pseudoclass-update-enabled-select.html fast/dom/shadow/distribution-id-modified.html fast/dom/shadow/no-renderers-for-light-children.html fast/dom/shadow/pseudoclass-update-indeterminate-input.html editing/shadow/shadow-selection-not-exported.html editing/shadow/contenteditable-propagation-at-shadow-boundary.html fast/dom/shadow/pseudoclass-update-link-area.html fast/dom/shadow/pseudoclass-update-visited-anchor.html fast/dom/shadow/multiple-shadowroot-rendering.html fast/dom/shadow/pseudoclass-update-enabled-button.html fast/dom/shadow/pseudoclass-update-disabled-input.html fast/dom/shadow/pseudoclass-update-enabled-input.html fast/dom/shadow/distribution-attribute-modified.html fast/dom/shadow/pseudoclass-update-enabled-textarea.html fast/dom/shadow/pseudoclass-update-disabled-button.html
WebKit Review Bot
Comment 26 2012-12-12 06:54:03 PST
Comment on attachment 179025 [details] WIP Attachment 179025 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/15296118 New failing tests: fast/dom/shadow/pseudoclass-update-disabled-fieldset.html fast/dom/shadow/dynamically-created-shadow-root.html fast/dom/shadow/pseudoclass-update-link-anchor.html fast/dom/shadow/focus-navigation.html fast/dom/shadow/pseudoclass-update-target.html fast/dom/shadow/distribution-className-modified.html fast/dom/shadow/pseudoclass-update-enabled-fieldset.html fast/dom/shadow/content-element-select-dynamic.html editing/shadow/select-contenteditable-shadowhost.html fast/dom/shadow/pseudoclass-update-indeterminate-progress.html fast/dom/shadow/content-element-move.html fast/dom/shadow/pseudoclass-update-checked-input.html fast/dom/shadow/pseudoclass-update-disabled-select.html fast/dom/shadow/pseudoclass-update-visited-area.html fast/dom/shadow/pseudoclass-update-disabled-textarea.html fast/dom/shadow/pseudoclass-update-enabled-select.html fast/dom/shadow/distribution-id-modified.html fast/dom/shadow/no-renderers-for-light-children.html fast/dom/shadow/pseudoclass-update-indeterminate-input.html editing/shadow/shadow-selection-not-exported.html editing/shadow/contenteditable-propagation-at-shadow-boundary.html fast/dom/shadow/pseudoclass-update-link-area.html fast/dom/shadow/pseudoclass-update-visited-anchor.html fast/dom/shadow/multiple-shadowroot-rendering.html fast/dom/shadow/pseudoclass-update-enabled-button.html fast/dom/shadow/pseudoclass-update-disabled-input.html fast/dom/shadow/pseudoclass-update-enabled-input.html fast/dom/shadow/distribution-attribute-modified.html fast/dom/shadow/pseudoclass-update-enabled-textarea.html fast/dom/shadow/pseudoclass-update-disabled-button.html
Elliott Sprehn
Comment 27 2012-12-12 19:05:17 PST
I had an idea for an even better way to do this, lets just pass the need for a shadow tree walker down through ::attach() into createRendererIfNeeded and finally sets a boolean on the constructor for NodeRenderingContext. and then you just need to update ContainerNode::attachChildren to do: // outside the loop. bool needsWalker = isShadowRoot() || isInsertionPoint() || hasPseudoElements(); // inside the loop... child->attach(needsWalker); and inside Element::attach(bool walkerIsNeeded) you do createRendererIfNeeded(walkerIsNeeded) and NodeRenderingContext(this, walkerIsNeeded).createRendererForElementIfNeeded(); This saves us from all these n^2 traversals and having to maintain the bit and should be very fast.
Hajime Morrita
Comment 28 2012-12-12 20:36:07 PST
(In reply to comment #27) > I had an idea for an even better way to do this, lets just pass the need for a shadow tree walker down through ::attach() into createRendererIfNeeded and finally sets a boolean on the constructor for NodeRenderingContext. My thought was a variation of this - We can pass NodeRenderingContext as a parameter to attach(). We can tell NRC which node is the parent/caller of attach(). Changing the signature of attach() is certainly a big change and that's why I'm hesitating to do that. But we could probably do that in a follow up patch instead of this one.
Hajime Morrita
Comment 29 2012-12-12 21:41:06 PST
(In reply to comment #27) > > and NodeRenderingContext(this, walkerIsNeeded).createRendererForElementIfNeeded(); Noticed that this approach might not work since the parent can change during nextSibling/previosuSibling traversal: The next sibling of fast-traversable-node can be not-fast-traversable. It can be an insertion point for example.
Hajime Morrita
Comment 30 2012-12-12 22:37:15 PST
Hajime Morrita
Comment 31 2012-12-12 22:42:01 PST
Created attachment 179200 [details] Fixed ChangeLog
Hajime Morrita
Comment 32 2012-12-13 00:09:26 PST
Hajime Morrita
Comment 33 2012-12-13 00:11:40 PST
Created attachment 179215 [details] benchmark result showing 2-3% win for document-modify-innerhtml This one is no longer as strong as before. The improvement is now 2-3%. I'd like to land this and attack further in separate change. The patch will be bigger otherwise.
Dimitri Glazkov (Google)
Comment 34 2012-12-13 20:40:01 PST
Comment on attachment 179214 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=179214&action=review > Source/WebCore/ChangeLog:17 > + chagne, NRC directly used CSTW. This NRT module hides and narrows chagne -> change > Source/WebCore/ChangeLog:21 > + doesn't need to use CSTW and just goes neighboring nodes in a plain DOM way. doesn't -> don't, "just goes _to_ neighboring nodes" > Source/WebCore/ChangeLog:25 > + - CSTW::ParentTraversalDetails is moved and renamed to This could've been a nice separate patch :-\ > Source/WebCore/ChangeLog:37 > + node flag. Each DOM node is now markd as NeedsShadowTreeWalker if > + it requires non-trivial traversal in NRT which uses CSTW. This This is great. > Source/WebCore/ChangeLog:44 > + - The node is a pseudo element or or...? :) > Source/WebCore/ChangeLog:46 > + This criteria criteria is defined in Node::needsShadowTreeWalkerSlow(). The node actually needs zap one "criteria" from the sentence. You have two. > Source/WebCore/dom/ContainerNode.h:268 > +inline bool Node::needsShadowTreeWalker() const Looks like this is the hottest function in the whole patch. > Source/WebCore/dom/NodeRenderingTraversal.h:36 > +namespace NodeRenderingTraversal { The name is a touch obtuse, but I don't have better ideas. > Source/WebCore/html/HTMLOptGroupElement.cpp:29 > +#include "ElementShadow.h" Why do we need this? > Source/WebCore/html/HTMLOptionElement.cpp:32 > +#include "ElementShadow.h" And this?
Hajime Morrita
Comment 35 2012-12-13 21:00:47 PST
Thanks for the review! I can land this. (In reply to comment #34) > > Source/WebCore/dom/ContainerNode.h:268 > > +inline bool Node::needsShadowTreeWalker() const > > Looks like this is the hottest function in the whole patch. > Yeah, it's sad to see the parent here. I should figure out how we can get rid of it. > > > Source/WebCore/html/HTMLOptGroupElement.cpp:29 > > +#include "ElementShadow.h" > > Why do we need this? > > > Source/WebCore/html/HTMLOptionElement.cpp:32 > > +#include "ElementShadow.h" > > And this? We originally included ComposedShadowTreeWalker.h through NodeRenderingContext.h but we no longer do. That change revealed these implicit dependency. I'll explain it in ChangeLog.
Hajime Morrita
Comment 36 2012-12-13 21:06:52 PST
Created attachment 179411 [details] Patch for landing
WebKit Review Bot
Comment 37 2012-12-13 21:09:35 PST
Comment on attachment 179411 [details] Patch for landing Rejecting attachment 179411 [details] from commit-queue. Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 1 /mnt/git/webkit-commit-queue/Source/WebCore/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive). Full output: http://queues.webkit.org/results/15319529
Hajime Morrita
Comment 38 2012-12-13 21:15:34 PST
Created attachment 179413 [details] Patch for landing
WebKit Review Bot
Comment 39 2012-12-13 22:51:54 PST
Comment on attachment 179413 [details] Patch for landing Clearing flags on attachment: 179413 Committed r137715: <http://trac.webkit.org/changeset/137715>
WebKit Review Bot
Comment 40 2012-12-13 22:52:02 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.