WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
REOPENED
103868
Change ChildNodeInsertionNotifier::m_postInsertionsNotificationTargets from a Vector to an OwnPtr
https://bugs.webkit.org/show_bug.cgi?id=103868
Summary
Change ChildNodeInsertionNotifier::m_postInsertionsNotificationTargets from a...
Kentaro Hara
Reported
2012-12-03 01:49:42 PST
ChildNodeInsertionNotifier::m_postInsertionsNotificationTargets is used for HTMLFrameElement and HTMLBodyElement only, which would be a cold path. We can change it from a Vector to an OwnPtr<Vector> so that ChildNodeInsertionNotifier can be allocated faster. This will slightly optimize Dromaeo/dom-modify.
Attachments
Patch
(4.13 KB, patch)
2012-12-03 01:54 PST
,
Kentaro Hara
no flags
Details
Formatted Diff
Diff
patch for landing
(4.10 KB, patch)
2012-12-03 17:12 PST
,
Kentaro Hara
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Kentaro Hara
Comment 1
2012-12-03 01:54:04 PST
Created
attachment 177209
[details]
Patch
Andreas Kling
Comment 2
2012-12-03 06:25:10 PST
Are we confident that this is a real improvement? It looks to me like the Vector is never appended to in the common case, which means it would never allocate a buffer. While this patch shrinks ChildNodeInsertionNotifier, it shouldn't cause any fewer allocations. In fact, it adds allocations to the iframe and body cases.
Kentaro Hara
Comment 3
2012-12-03 07:36:04 PST
> Are we confident that this is a real improvement?
Let me check it again in micro benchmarks tomorrow. This is one of the optimizations I tried in
bug 88834
to offset performance regression introduced by another patch (the patch didn't land).
Darin Adler
Comment 4
2012-12-03 10:38:00 PST
Comment on
attachment 177209
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=177209&action=review
> Source/WebCore/dom/ContainerNodeAlgorithms.h:50 > + OwnPtr<Vector< RefPtr<Node> > > m_postInsertionNotificationTargets;
The space in Vector< RefPtr is not the correct style. It should be Vector<RefPtr.
> Source/WebCore/dom/ContainerNodeAlgorithms.h:197 > + if (UNLIKELY(Node::InsertionShouldCallDidNotifySubtreeInsertions == node->insertedInto(m_insertionPoint))) {
Are you sure this improved performance? Normally we only use LIKELY and UNLIKELY when we have concrete evidence it’s helpful.
> Source/WebCore/dom/ContainerNodeAlgorithms.h:199 > + m_postInsertionNotificationTargets = adoptPtr(new Vector<RefPtr<Node> >());
The () here is not needed.
> Source/WebCore/dom/ContainerNodeAlgorithms.h:213 > + m_postInsertionNotificationTargets = adoptPtr(new Vector<RefPtr<Node> >());
The () here is not needed.
Kentaro Hara
Comment 5
2012-12-03 17:12:03 PST
Created
attachment 177378
[details]
patch for landing
Kentaro Hara
Comment 6
2012-12-03 17:13:31 PST
(In reply to
comment #4
)
> (From update of
attachment 177209
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=177209&action=review
> > > Source/WebCore/dom/ContainerNodeAlgorithms.h:50 > > + OwnPtr<Vector< RefPtr<Node> > > m_postInsertionNotificationTargets; > > The space in Vector< RefPtr is not the correct style. It should be Vector<RefPtr.
Fixed.
> > Source/WebCore/dom/ContainerNodeAlgorithms.h:197 > > + if (UNLIKELY(Node::InsertionShouldCallDidNotifySubtreeInsertions == node->insertedInto(m_insertionPoint))) { > > Are you sure this improved performance? Normally we only use LIKELY and UNLIKELY when we have concrete evidence it’s helpful.
You're right. I couldn't observe any effect. Removed.
> > Source/WebCore/dom/ContainerNodeAlgorithms.h:199 > > + m_postInsertionNotificationTargets = adoptPtr(new Vector<RefPtr<Node> >()); > > The () here is not needed. > > > Source/WebCore/dom/ContainerNodeAlgorithms.h:213 > > + m_postInsertionNotificationTargets = adoptPtr(new Vector<RefPtr<Node> >()); > > The () here is not needed.
Fixed.
Kentaro Hara
Comment 7
2012-12-03 17:15:02 PST
(In reply to
comment #2
)
> Are we confident that this is a real improvement?
I confirmed, yes. When we insert a tree whose depth is 2 or larger, insertion time is improved.
Andreas Kling
Comment 8
2012-12-03 17:17:34 PST
This still looks wrong to me. ChildNodeInsertionNotifier is only created on the stack, so all this patch does is save 8 bytes of stack space in the common case, and adds a heap allocation for body and iframe elements. Where is the improvement coming from?
Kentaro Hara
Comment 9
2012-12-03 17:20:05 PST
Comment on
attachment 177378
[details]
patch for landing ok, then let me investigate the details later. (Maybe instruction cache? A Vector allocation code is removed from a hot call path, which is larger than a OwnPtr allocation code?)
Kentaro Hara
Comment 10
2012-12-03 23:04:38 PST
(In reply to
comment #8
)
> This still looks wrong to me. ChildNodeInsertionNotifier is only created on the stack, so all this patch does is save 8 bytes of stack space in the common case, and adds a heap allocation for body and iframe elements. Where is the improvement coming from?
for (NodeVector::const_iterator it = targets.begin(); it != targets.end(); ++it) { Node* child = it->get(); insertBeforeCommon(next.get(), child); childrenChanged(true, nextChildPreviousSibling.get(), nextChild, 1); asm("_ppppp:"); ChildNodeInsertionNotifier tmp(this); asm("_qqqqq:"); tmp.notify(child); } Here are assemblies for between _ppppp and _qqqqq. Before this patch: 00000000005f9d6b <_pppppp>: 5f9d6b: 48 89 6c 24 10 mov %rbp,0x10(%rsp) 5f9d70: 48 8d 44 24 18 lea 0x18(%rsp),%rax 5f9d75: 48 c7 40 10 00 00 00 movq $0x0,0x10(%rax) 5f9d7c: 00 5f9d7d: 48 c7 40 08 00 00 00 movq $0x0,0x8(%rax) 5f9d84: 00 5f9d85: 48 c7 00 00 00 00 00 movq $0x0,(%rax) 00000000005f9d8c <_qqqqqq>: After this patch: 00000000005f9c5d <_pppppp>: 5f9c5d: 48 89 1c 24 mov %rbx,(%rsp) 5f9c61: 48 c7 44 24 08 00 00 movq $0x0,0x8(%rsp) 5f9c68: 00 00 00000000005f9c6a <_qqqqqq>: In this way, instructions per ChildNodeInsertionNotifier's constructor are reduced. In particular, note that ChildNodeInsertionNotifier's constructor is used inside for loops.
WebKit Review Bot
Comment 11
2012-12-04 00:05:21 PST
Comment on
attachment 177378
[details]
patch for landing Clearing flags on attachment: 177378 Committed
r136481
: <
http://trac.webkit.org/changeset/136481
>
WebKit Review Bot
Comment 12
2012-12-04 00:05:26 PST
All reviewed patches have been landed. Closing bug.
Andreas Kling
Comment 13
2012-12-04 18:12:59 PST
(In reply to
comment #10
)
> (In reply to
comment #8
) > > This still looks wrong to me. ChildNodeInsertionNotifier is only created on the stack, so all this patch does is save 8 bytes of stack space in the common case, and adds a heap allocation for body and iframe elements. Where is the improvement coming from? > > for (NodeVector::const_iterator it = targets.begin(); it != targets.end(); ++it) { > Node* child = it->get(); > insertBeforeCommon(next.get(), child); > childrenChanged(true, nextChildPreviousSibling.get(), nextChild, 1); > asm("_ppppp:"); > ChildNodeInsertionNotifier tmp(this); > asm("_qqqqq:"); > tmp.notify(child); > } > > > Here are assemblies for between _ppppp and _qqqqq. > > > Before this patch: > > 00000000005f9d6b <_pppppp>: > 5f9d6b: 48 89 6c 24 10 mov %rbp,0x10(%rsp) > 5f9d70: 48 8d 44 24 18 lea 0x18(%rsp),%rax > 5f9d75: 48 c7 40 10 00 00 00 movq $0x0,0x10(%rax) > 5f9d7c: 00 > 5f9d7d: 48 c7 40 08 00 00 00 movq $0x0,0x8(%rax) > 5f9d84: 00 > 5f9d85: 48 c7 00 00 00 00 00 movq $0x0,(%rax) > > 00000000005f9d8c <_qqqqqq>: > > After this patch: > > 00000000005f9c5d <_pppppp>: > 5f9c5d: 48 89 1c 24 mov %rbx,(%rsp) > 5f9c61: 48 c7 44 24 08 00 00 movq $0x0,0x8(%rsp) > 5f9c68: 00 00 > > 00000000005f9c6a <_qqqqqq>: > > > > In this way, instructions per ChildNodeInsertionNotifier's constructor are reduced. In particular, note that ChildNodeInsertionNotifier's constructor is used inside for loops.
If making the ChildNodeInsertionNotifier constructor 3 instructions shorter gives you a ~1.4% speed boost on Dromaeo/dom-modify, you should probably move construction out of the loop which would be trivial since you're always passing 'this' to the constructor. You could also move the reffing/unreffing of the Document out of ChildNodeInsertionNotifier::notify() since that Document* should be the same for every call. Personally, I don't trust these numbers but if they are indeed true then there is much work here to be done.
Kentaro Hara
Comment 14
2012-12-05 16:32:29 PST
Reverted
r136481
for reason: it might have regressed dom_perf/CloneNodes (See
bug 104177
) Committed
r136777
: <
http://trac.webkit.org/changeset/136777
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug