Bug 55626 - Node::attach() and Node::nextRenderer() account for over 50% of Peacekeeper's domDynamicCreationCreateElement
Summary: Node::attach() and Node::nextRenderer() account for over 50% of Peacekeeper's...
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 55720
Blocks:
  Show dependency treegraph
 
Reported: 2011-03-02 15:39 PST by Eric Seidel (no email)
Modified: 2024-04-10 12:51 PDT (History)
7 users (show)

See Also:


Attachments
A shark sample from earlier today showing about 40% in these two functions. (4.59 MB, application/octet-stream)
2011-03-02 15:40 PST, Eric Seidel (no email)
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2011-03-02 15:39:05 PST
Node::attach() and Node::nextRenderer() account for over 50% of Peacekeeper's domDynamicCreationCreateElement

At least according to my latest shark samples.

80% of that time is spent in the JNZ instruction caused by "if (node->renderer())" in the "walk my siblings for renderer" loops in both functions.  I suspect we're just stalling hard, as the hardware isn't able to predict what memory to pre-fetch for the linked list walk.

Maciej suggested one way to prevent this might be to use beforeRenderer() instead, since that's more likely to be already set than nextRenderer.

It's also possible that lazyAttach() is defeating this optimization in nextRenderer:
    if (parentOrHostNode() && !parentOrHostNode()->attached())
        return 0;

However I tried adding a !parentOrHostNode()->renderer() check into that if an saw no change in the benchmark results.

This did not show up in my previous testing, as my hacked copy of peacekeeper was using display: none.  Looks like the real one doesn't do that. :)

The linked-list walk in attach() came from https://bugs.webkit.org/show_bug.cgi?id=14134.  The nextRenderer() walk has been with us forever.
Comment 1 Eric Seidel (no email) 2011-03-02 15:40:21 PST
Created attachment 84477 [details]
A shark sample from earlier today showing about 40% in these two functions.

Later samples showed the higher 50% which I quoted in the bug title.  Either way, seems like this is the hot spot. :)
Comment 2 Eric Seidel (no email) 2011-03-02 16:01:22 PST
Maciej pointed out over IRC that replacing children with innerHTML (which is exactly what peacekeeper does) subverts the nextRenderer check, because the parent remains attached and all the kids have no renderers. Meaning this is then O(N^2) the number of dom nodes added via innerHTML. :(
Comment 3 Eric Seidel (no email) 2011-03-02 16:54:08 PST
I suspect that http://trac.webkit.org/changeset/75277 (from bug 49481) may have caused us to call nextRenderer() more often, and may have caused a regression on this benchmark, but having not tested I can't say for sure.
Comment 4 Eric Seidel (no email) 2011-03-03 15:33:55 PST
Preparing to fix the createRendererIfNeeded case with bug 55720
Comment 5 Eric Seidel (no email) 2011-03-04 18:13:47 PST
Maciej suggested computing nextRenderer from the previous renderer for the createRendererIfNeeded() case.

However after a few attempts I'm not sure it's that easy.  nextRenderer() avoids anonymous renderers and continuations, and gives you the next renderer relating to the position in the DOM (not necessarily relating to where something would be rendered.

My most recent attempt:

static RenderObject* fastNextRendererForCreateRenderer(Node* node)
{
    // It is more common to have a previousRenderer than a nextRenderer since
    // we attach from first to last child in the list.
    RenderObject* previousRenderer = node->previousRenderer();
    RenderObject* nextRenderer = 0;
    if (previousRenderer) {
        // previousRenderer may be wrapped in an anonymous renderer.
        if (previousRenderer->parent() != node->parentOrHostNode()->renderer())
            return node->nextRenderer();
        nextRenderer = previousRenderer->nextSibling();
    } else
        nextRenderer = node->parentOrHostNode()->renderer()->firstChild();

    // If we found an anonymous renderer fall back to a (slow) walk of the node list.
    if (nextRenderer && nextRenderer->isAnonymous())
        return node->nextRenderer();

    ASSERT(nextRenderer == node->nextRenderer());
    if (!nextRenderer)
        return 0;

    ASSERT(nextRenderer != node->renderer());
    ASSERT(nextRenderer->node()->parentOrHostNode() == node->parentOrHostNode());
    return nextRenderer;
}

Fails editing/deleting/pruning-after-merge-1.html (And a few other editing tests) still.
Comment 6 Eric Seidel (no email) 2011-03-04 18:31:03 PST
It might be possible to re-write addChild to compute nextRenderer itself in the few places it may be needed.  (It certainly could reach through renderer->node()->nextRenderer() in all non-anonymous cases).

Or we could create a RendererAddRequest object to hold nextRenderer in the few places it's actually used.

Interestingly enough, when I first did a niave implementation of this fastNextRenderer, it disappeared off of the peacekeeper shark sample, but the peacekeeper numbers didn't get any better.  Making me wonder if shark is leading me astray.
Comment 7 Eric Seidel (no email) 2011-03-07 11:43:48 PST
(In reply to comment #6)
> Interestingly enough, when I first did a niave implementation of this fastNextRenderer, it disappeared off of the peacekeeper shark sample, but the peacekeeper numbers didn't get any better.  Making me wonder if shark is leading me astray.

Someone who wanted to work on this bug could and *should* confirm that, by simply replacing the nextRenderer() call in createRendererIfNecessary with "0" and re-running peacekeeper.  I saw no change in time on my machine, *despite* the previously 25% of the sample which was in nextRenderer() disappearing.   However it's possible I misread the peacekeeper numbers.  My current theory is that Shark's 25% attribution to nextRenderer() is a bug.
Comment 8 Ahmad Saleem 2024-04-10 06:54:43 PDT
Took test case from Mozilla bug - https://bug922016.bmoattachments.org/attachment.cgi?id=811946

*** Firefox Nightly 126 ***

161 ms

*** Safari Technology Preview 191 ***

118 ms

*** Chrome Canary 125 ***

191 ms

____

@Rysouke - do we need to squeeze anymore performance from this old benchmark or Speedometer would cover these and this can be closed?
Comment 9 Ryosuke Niwa 2024-04-10 12:51:10 PDT
We're no longer tracking this benchmark.