RESOLVED FIXED 73737
HIERARCHY_REQUEST_ERR check in checkAcceptChild should be optimized for newChild without children
https://bugs.webkit.org/show_bug.cgi?id=73737
Summary HIERARCHY_REQUEST_ERR check in checkAcceptChild should be optimized for newCh...
Ryosuke Niwa
Reported 2011-12-02 23:32:11 PST
Per statistics posted in http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-December/034038.html, it's pretty clear that the check for HIERARCHY_REQUEST_ERR in checkAcceptChild should be optimized for the case where newChild doesn't have any children or less than 10 children.
Attachments
perf test (1.09 KB, text/html)
2011-12-03 01:15 PST, Ryosuke Niwa
no flags
Patch (1.96 KB, patch)
2011-12-03 01:35 PST, Ryosuke Niwa
no flags
Also implement arv's optimization (2.31 KB, patch)
2011-12-04 00:03 PST, Ryosuke Niwa
no flags
new perf test (1.21 KB, text/html)
2011-12-04 01:14 PST, Ryosuke Niwa
no flags
Ryosuke Niwa
Comment 1 2011-12-02 23:32:37 PST
Diff I used to collect statistics for completeness: Index: Source/WebCore/dom/Node.cpp =================================================================== --- Source/WebCore/dom/Node.cpp (revision 101848) +++ Source/WebCore/dom/Node.cpp (working copy) @@ -1309,6 +1309,15 @@ // HIERARCHY_REQUEST_ERR: Raised if this node is of a type that does not allow children of the type of the // newChild node, or if the node to append is one of this node's ancestors. + { + unsigned i = 0; + for (Node* node = newChild; node && i < 100; node = node->traverseNextNode(newChild)) + i++; + if (i < 100) + fprintf(stderr, "inserting child with %d nodes\n", i); + else + fprintf(stderr, "inserting child with 100+ nodes\n"); + } if (newChild == newParent || newParent->isDescendantOf(newChild)) { ec = HIERARCHY_REQUEST_ERR;
Ryosuke Niwa
Comment 2 2011-12-03 01:15:27 PST
Created attachment 117749 [details] perf test
Ryosuke Niwa
Comment 3 2011-12-03 01:28:59 PST
I took 11 samples on Apple's mac port applied on r101862. Without patch: shallow tree: 2ms deep tree: 4.09ms (std:0.3) deeper tree: 11.45ms (std:0.5) With patch: shallow tree: 2ms deep tree: 3ms deeper tree: 6.91ms (std 0.3) 40% improvement! The number, however, indicates that we have at least one more linear algorithm hidden somewhere.
Ryosuke Niwa
Comment 4 2011-12-03 01:35:21 PST
Erik Arvidsson
Comment 5 2011-12-03 12:28:41 PST
Thanks for bringing this up. I forgot we talked about this a while ago. I thought of another simple optimization we could do here. If we are comparing two trees that have different inDocument state, then we know that one cannot be a descendant of the other: if (inDocument() != other->inDocument()) return false; I'm curious how many of the cases you saw where a node is appended that has no children falls into the above category? My gut feeling is that the building of the disconnected tree falls into the no child case and that the attachment to the document would fall into the above case.
Ryosuke Niwa
Comment 6 2011-12-03 15:09:59 PST
(In reply to comment #5) > Thanks for bringing this up. I forgot we talked about this a while ago. > > I thought of another simple optimization we could do here. If we are comparing two trees that have different inDocument state, then we know that one cannot be a descendant of the other: > > if (inDocument() != other->inDocument()) > return false; > > I'm curious how many of the cases you saw where a node is appended that has no children falls into the above category? My gut feeling is that the building of the disconnected tree falls into the no child case and that the attachment to the document would fall into the above case. That's an excellent obervation obervstion. I'm going to collect a stat on this as well as the number of ancesors newParent has.
Darin Adler
Comment 7 2011-12-03 19:27:18 PST
That code counts the total number of descendants, not the number of children.
Ryosuke Niwa
Comment 8 2011-12-03 21:00:34 PST
(In reply to comment #7) > That code counts the total number of descendants, not the number of children. Oh, right. That's what I meant. New stat coming per Erik's comment. Will wait landing this patch until that.
Ryosuke Niwa
Comment 9 2011-12-03 23:19:09 PST
(In reply to comment #5) > I'm curious how many of the cases you saw where a node is appended that has no children falls into the above category? My gut feeling is that the building of the disconnected tree falls into the no child case and that the attachment to the document would fall into the above case. 59%. Sample size: 42,000 - 1 n = Number of inserted nodes 1 2 7 6 100+ 3 10+ 5 4 20+ 0.679 0.089 0.038 0.029 0.028 0.024 0.021 0.018 0.013 0.011 d = Number of ancestors of new parent 1 20+ 10+ 3 2 4 5 9 8 6 0.544 0.160 0.114 0.077 0.061 0.011 0.010 0.008 0.006 0.006 P(either but not both in document) = 0.625610133575 P(d=1 | n=1) ≈ 0.50007 P(parent not in document | n=1) ≈ 0.585 P(child not in document | n=1) ≈ 0.00681 P(either but not both in document | n=1) ≈ 0.589 P(either but not both in document) ≈ 0.626 P(n = 1, either but not both in document) ≈ 0.400 P(n = 1 | either but not both in document) ≈ 0.639
Ryosuke Niwa
Comment 10 2011-12-03 23:55:21 PST
My data suggests that we should do both optimizations.
Ryosuke Niwa
Comment 11 2011-12-04 00:03:54 PST
Created attachment 117784 [details] Also implement arv's optimization
Ryosuke Niwa
Comment 12 2011-12-04 00:48:16 PST
Resubmitting the patch for review since I've implemented Erik's optimization as well. Most significantly, P(n = 1 or either but not both in document) ≈ 0.904. So this patch will make 90% of cases really fast.
Ryosuke Niwa
Comment 13 2011-12-04 01:14:12 PST
Created attachment 117785 [details] new perf test Given that parent having more than 30 ancestors is very rare, I've updated my perf test to reflect this.
Erik Arvidsson
Comment 14 2011-12-04 11:12:03 PST
(In reply to comment #12) > Most significantly, P(n = 1 or either but not both in document) ≈ 0.904. So this patch will make 90% of cases really fast. 90%! Wow that is huge. I expect this to have big impact on Gmail. We should ask them for numbers to see how much this brings down their latency.
Ryosuke Niwa
Comment 15 2011-12-04 12:30:04 PST
Comment on attachment 117784 [details] Also implement arv's optimization Thanks for the reviews, Darin!
Ryosuke Niwa
Comment 16 2011-12-04 12:32:20 PST
(In reply to comment #14) > (In reply to comment #12) > > Most significantly, P(n = 1 or either but not both in document) ≈ 0.904. So this patch will make 90% of cases really fast. > > 90%! Wow that is huge. I expect this to have big impact on Gmail. We should ask them for numbers to see how much this brings down their latency. Hm... we still have a linear performance, and there are so many other APIs that are much slower so I'll be surprised this had any meaningful impact on Gmail but it'll great if this patch improved their latency :)
WebKit Review Bot
Comment 17 2011-12-04 13:53:13 PST
Comment on attachment 117784 [details] Also implement arv's optimization Clearing flags on attachment: 117784 Committed r101967: <http://trac.webkit.org/changeset/101967>
WebKit Review Bot
Comment 18 2011-12-04 13:53:18 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.