Summary:  HIERARCHY_REQUEST_ERR check in checkAcceptChild should be optimized for newChild without children  

Product:  WebKit  Reporter:  Ryosuke Niwa <rniwa>  
Component:  DOM  Assignee:  Ryosuke Niwa <rniwa>  
Status:  RESOLVED FIXED  
Severity:  Normal  CC:  arv, darin, eae, ggaren, mjs, ojan, sam, webkit.review.bot  
Priority:  P2  
Version:  528+ (Nightly build)  
Hardware:  Unspecified  
OS:  Unspecified  
Bug Depends on:  73802  
Bug Blocks:  73853  
Attachments: 

Description
Ryosuke Niwa
20111202 23:32:11 PST
Diff I used to collect statistics for completeness: Index: Source/WebCore/dom/Node.cpp ===================================================================  Source/WebCore/dom/Node.cpp (revision 101848) +++ Source/WebCore/dom/Node.cpp (working copy) @@ 1309,6 +1309,15 @@ // HIERARCHY_REQUEST_ERR: Raised if this node is of a type that does not allow children of the type of the // newChild node, or if the node to append is one of this node's ancestors. + { + unsigned i = 0; + for (Node* node = newChild; node && i < 100; node = node>traverseNextNode(newChild)) + i++; + if (i < 100) + fprintf(stderr, "inserting child with %d nodes\n", i); + else + fprintf(stderr, "inserting child with 100+ nodes\n"); + } if (newChild == newParent  newParent>isDescendantOf(newChild)) { ec = HIERARCHY_REQUEST_ERR; Created attachment 117749 [details]
perf test
I took 11 samples on Apple's mac port applied on r101862. Without patch: shallow tree: 2ms deep tree: 4.09ms (std:0.3) deeper tree: 11.45ms (std:0.5) With patch: shallow tree: 2ms deep tree: 3ms deeper tree: 6.91ms (std 0.3) 40% improvement! The number, however, indicates that we have at least one more linear algorithm hidden somewhere. Created attachment 117751 [details]
Patch
Thanks for bringing this up. I forgot we talked about this a while ago. I thought of another simple optimization we could do here. If we are comparing two trees that have different inDocument state, then we know that one cannot be a descendant of the other: if (inDocument() != other>inDocument()) return false; I'm curious how many of the cases you saw where a node is appended that has no children falls into the above category? My gut feeling is that the building of the disconnected tree falls into the no child case and that the attachment to the document would fall into the above case. (In reply to comment #5) > Thanks for bringing this up. I forgot we talked about this a while ago. > > I thought of another simple optimization we could do here. If we are comparing two trees that have different inDocument state, then we know that one cannot be a descendant of the other: > > if (inDocument() != other>inDocument()) > return false; > > I'm curious how many of the cases you saw where a node is appended that has no children falls into the above category? My gut feeling is that the building of the disconnected tree falls into the no child case and that the attachment to the document would fall into the above case. That's an excellent obervation obervstion. I'm going to collect a stat on this as well as the number of ancesors newParent has. That code counts the total number of descendants, not the number of children. (In reply to comment #7) > That code counts the total number of descendants, not the number of children. Oh, right. That's what I meant. New stat coming per Erik's comment. Will wait landing this patch until that. (In reply to comment #5) > I'm curious how many of the cases you saw where a node is appended that has no children falls into the above category? My gut feeling is that the building of the disconnected tree falls into the no child case and that the attachment to the document would fall into the above case. 59%. Sample size: 42,000  1 n = Number of inserted nodes 1 2 7 6 100+ 3 10+ 5 4 20+ 0.679 0.089 0.038 0.029 0.028 0.024 0.021 0.018 0.013 0.011 d = Number of ancestors of new parent 1 20+ 10+ 3 2 4 5 9 8 6 0.544 0.160 0.114 0.077 0.061 0.011 0.010 0.008 0.006 0.006 P(either but not both in document) = 0.625610133575 P(d=1  n=1) ≈ 0.50007 P(parent not in document  n=1) ≈ 0.585 P(child not in document  n=1) ≈ 0.00681 P(either but not both in document  n=1) ≈ 0.589 P(either but not both in document) ≈ 0.626 P(n = 1, either but not both in document) ≈ 0.400 P(n = 1  either but not both in document) ≈ 0.639 My data suggests that we should do both optimizations. Created attachment 117784 [details]
Also implement arv's optimization
Resubmitting the patch for review since I've implemented Erik's optimization as well. Most significantly, P(n = 1 or either but not both in document) ≈ 0.904. So this patch will make 90% of cases really fast. Created attachment 117785 [details]
new perf test
Given that parent having more than 30 ancestors is very rare, I've updated my perf test to reflect this.
(In reply to comment #12) > Most significantly, P(n = 1 or either but not both in document) ≈ 0.904. So this patch will make 90% of cases really fast. 90%! Wow that is huge. I expect this to have big impact on Gmail. We should ask them for numbers to see how much this brings down their latency. Comment on attachment 117784 [details]
Also implement arv's optimization
Thanks for the reviews, Darin!
(In reply to comment #14) > (In reply to comment #12) > > Most significantly, P(n = 1 or either but not both in document) ≈ 0.904. So this patch will make 90% of cases really fast. > > 90%! Wow that is huge. I expect this to have big impact on Gmail. We should ask them for numbers to see how much this brings down their latency. Hm... we still have a linear performance, and there are so many other APIs that are much slower so I'll be surprised this had any meaningful impact on Gmail but it'll great if this patch improved their latency :) Comment on attachment 117784 [details] Also implement arv's optimization Clearing flags on attachment: 117784 Committed r101967: <http://trac.webkit.org/changeset/101967> All reviewed patches have been landed. Closing bug. 