55626 – Node::attach() and Node::nextRenderer() account for over 50% of Peacekeeper's domDynamicCreationCreateElement

RESOLVED WONTFIX 55626

Node::attach() and Node::nextRenderer() account for over 50% of Peacekeeper's domDynamicCreationCreateElement

https://bugs.webkit.org/show_bug.cgi?id=55626

Summary Node::attach() and Node::nextRenderer() account for over 50% of Peacekeeper's...

Eric Seidel (no email)

Reported 2011-03-02 15:39:05 PST

Node::attach() and Node::nextRenderer() account for over 50% of Peacekeeper's domDynamicCreationCreateElement At least according to my latest shark samples. 80% of that time is spent in the JNZ instruction caused by "if (node->renderer())" in the "walk my siblings for renderer" loops in both functions. I suspect we're just stalling hard, as the hardware isn't able to predict what memory to pre-fetch for the linked list walk. Maciej suggested one way to prevent this might be to use beforeRenderer() instead, since that's more likely to be already set than nextRenderer. It's also possible that lazyAttach() is defeating this optimization in nextRenderer: if (parentOrHostNode() && !parentOrHostNode()->attached()) return 0; However I tried adding a !parentOrHostNode()->renderer() check into that if an saw no change in the benchmark results. This did not show up in my previous testing, as my hacked copy of peacekeeper was using display: none. Looks like the real one doesn't do that. :) The linked-list walk in attach() came from https://bugs.webkit.org/show_bug.cgi?id=14134. The nextRenderer() walk has been with us forever.

Attachments
A shark sample from earlier today showing about 40% in these two functions. (4.59 MB, application/octet-stream) 2011-03-02 15:40 PST, Eric Seidel (no email)	no flags	Details
View All Add attachment proposed patch, testcase, etc.

Eric Seidel (no email)

Comment 1 2011-03-02 15:40:21 PST

Created attachment 84477 [details] A shark sample from earlier today showing about 40% in these two functions. Later samples showed the higher 50% which I quoted in the bug title. Either way, seems like this is the hot spot. :)

Eric Seidel (no email)

Comment 2 2011-03-02 16:01:22 PST

Maciej pointed out over IRC that replacing children with innerHTML (which is exactly what peacekeeper does) subverts the nextRenderer check, because the parent remains attached and all the kids have no renderers. Meaning this is then O(N^2) the number of dom nodes added via innerHTML. :(

Eric Seidel (no email)

Comment 3 2011-03-02 16:54:08 PST

I suspect that http://trac.webkit.org/changeset/75277 (from bug 49481) may have caused us to call nextRenderer() more often, and may have caused a regression on this benchmark, but having not tested I can't say for sure.

Eric Seidel (no email)

Comment 4 2011-03-03 15:33:55 PST

Preparing to fix the createRendererIfNeeded case with bug 55720

Eric Seidel (no email)

Comment 5 2011-03-04 18:13:47 PST

Maciej suggested computing nextRenderer from the previous renderer for the createRendererIfNeeded() case. However after a few attempts I'm not sure it's that easy. nextRenderer() avoids anonymous renderers and continuations, and gives you the next renderer relating to the position in the DOM (not necessarily relating to where something would be rendered. My most recent attempt: static RenderObject* fastNextRendererForCreateRenderer(Node* node) { // It is more common to have a previousRenderer than a nextRenderer since // we attach from first to last child in the list. RenderObject* previousRenderer = node->previousRenderer(); RenderObject* nextRenderer = 0; if (previousRenderer) { // previousRenderer may be wrapped in an anonymous renderer. if (previousRenderer->parent() != node->parentOrHostNode()->renderer()) return node->nextRenderer(); nextRenderer = previousRenderer->nextSibling(); } else nextRenderer = node->parentOrHostNode()->renderer()->firstChild(); // If we found an anonymous renderer fall back to a (slow) walk of the node list. if (nextRenderer && nextRenderer->isAnonymous()) return node->nextRenderer(); ASSERT(nextRenderer == node->nextRenderer()); if (!nextRenderer) return 0; ASSERT(nextRenderer != node->renderer()); ASSERT(nextRenderer->node()->parentOrHostNode() == node->parentOrHostNode()); return nextRenderer; } Fails editing/deleting/pruning-after-merge-1.html (And a few other editing tests) still.

Eric Seidel (no email)

Comment 6 2011-03-04 18:31:03 PST

It might be possible to re-write addChild to compute nextRenderer itself in the few places it may be needed. (It certainly could reach through renderer->node()->nextRenderer() in all non-anonymous cases). Or we could create a RendererAddRequest object to hold nextRenderer in the few places it's actually used. Interestingly enough, when I first did a niave implementation of this fastNextRenderer, it disappeared off of the peacekeeper shark sample, but the peacekeeper numbers didn't get any better. Making me wonder if shark is leading me astray.

Eric Seidel (no email)

Comment 7 2011-03-07 11:43:48 PST

(In reply to comment #6) > Interestingly enough, when I first did a niave implementation of this fastNextRenderer, it disappeared off of the peacekeeper shark sample, but the peacekeeper numbers didn't get any better. Making me wonder if shark is leading me astray. Someone who wanted to work on this bug could and *should* confirm that, by simply replacing the nextRenderer() call in createRendererIfNecessary with "0" and re-running peacekeeper. I saw no change in time on my machine, *despite* the previously 25% of the sample which was in nextRenderer() disappearing. However it's possible I misread the peacekeeper numbers. My current theory is that Shark's 25% attribution to nextRenderer() is a bug.

Ahmad Saleem

Comment 8 2024-04-10 06:54:43 PDT

Took test case from Mozilla bug - https://bug922016.bmoattachments.org/attachment.cgi?id=811946 *** Firefox Nightly 126 *** 161 ms *** Safari Technology Preview 191 *** 118 ms *** Chrome Canary 125 *** 191 ms ____ @Rysouke - do we need to squeeze anymore performance from this old benchmark or Speedometer would cover these and this can be closed?

Ryosuke Niwa

Comment 9 2024-04-10 12:51:10 PDT

We're no longer tracking this benchmark.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution WONTFIX

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware PC

OS OS X 10.5

Product WebKit

Component New Bugs

Assignee

Nobody

Reported

2011-03-02 15:39 PST

Modified

2024-04-10 12:51 PDT History

CC List

7 users Show

URL

Keywords

Depends on

55720

Blocks

Dependencies

tree graph