Bug 229680 - REGRESSION (r272900): wpt.fyi loading performance is very slow (regressed, and slower than other browsers)
Summary: REGRESSION (r272900): wpt.fyi loading performance is very slow (regressed, an...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: Safari Technology Preview
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on: 229960
Blocks: 148695 221386
  Show dependency treegraph
 
Reported: 2021-08-30 12:24 PDT by Simon Fraser (smfr)
Modified: 2021-09-06 14:12 PDT (History)
15 users (show)

See Also:


Attachments
patch (1.65 KB, patch)
2021-08-31 05:07 PDT, Antti Koivisto
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Fraser (smfr) 2021-08-30 12:24:17 PDT
Load https://wpt.fyi/results/?label=master&label=experimental&aligned. Scroll down and click on the link for the css/ tests. Note how long it takes to load the new page.

Before r272900: a few seconds.
After 272900: many tens of seconds.
Comment 1 Simon Fraser (smfr) 2021-08-30 12:25:58 PDT
Sample shows:

 +   !   1620 WebCore::Element::dispatchMouseEvent(WebCore::PlatformMouseEvent const&, WTF::AtomString const&, int, WebCore::Element*, WebCore::IsSyntheticClick)  (in WebCore) + 802  [0x10814d432]
 +   !     1620 WebCore::EventDispatcher::dispatchEvent(WebCore::Node&, WebCore::Event&)  (in WebCore) + 4771  [0x1081683a3]
 +   !       1620 WebCore::dispatchEventInDOM(WebCore::Event&, WebCore::EventPath const&)  (in WebCore) + 240  [0x108168e40]
 +   !         1620 WebCore::EventTarget::fireEventListeners(WebCore::Event&, WebCore::EventTarget::EventInvokePhase)  (in WebCore) + 611  [0x10816c3e3]
 +   !           1620 WebCore::EventTarget::innerInvokeEventListeners(WebCore::Event&, WTF::Vector<WTF::RefPtr<WebCore::RegisteredEventListener, WTF::RawPtrTraits<WebCore::RegisteredEventListener>, WTF::DefaultRefDerefTraits<WebCore::RegisteredEventListener> >, 1ul, WTF::CrashOnOverflow, 16ul, WTF::FastMalloc>, WebCore::EventTarget::EventInvokePhase)  (in WebCore) + 428  [0x10816cb6c]
 +   !             1507 WebCore::JSEventListener::handleEvent(WebCore::ScriptExecutionContext&, WebCore::Event&)  (in WebCore) + 1965  [0x107e286ad]
 +   !             : 1507 WebCore::JSExecState::~JSExecState()  (in WebCore) + 214  [0x107e2c756]
 +   !             :   1507 WebCore::MicrotaskQueue::performMicrotaskCheckpoint()  (in WebCore) + 149  [0x108181d35]
 +   !             :     1507 WTF::Detail::CallableWrapper<WebCore::WindowEventLoop::queueMutationObserverCompoundMicrotask()::$_0, void>::call()  (in WebCore) + 71  [0x1081e6797]
 +   !             :       1507 WebCore::MutationObserver::notifyMutationObservers(WebCore::WindowEventLoop&)  (in WebCore) + 2255  [0x10818844f]
 +   !             :         1507 WebCore::JSMutationCallback::handleEvent(WebCore::MutationObserver&, WTF::Vector<WTF::Ref<WebCore::MutationRecord, WTF::RawPtrTraits<WebCore::MutationRecord> >, 0ul, WTF::CrashOnOverflow, 16ul, WTF::FastMalloc> const&, WebCore::MutationObserver&)  (in WebCore) + 515  [0x10758deb3]
 +   !             :           1507 WebCore::JSCallbackData::invokeCallback(WebCore::JSDOMGlobalObject&, JSC::JSObject*, JSC::JSValue, JSC::MarkedArgumentBuffer&, WebCore::JSCallbackData::CallbackType, JSC::PropertyName, WTF::NakedPtr<JSC::Exception>&)  (in WebCore) + 611  [0x107e10b33]
 +   !             :             1507 JSC::profiledCall(JSC::JSGlobalObject*, JSC::ProfilingReason, JSC::JSValue, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&, WTF::NakedPtr<JSC::Exception>&)  (in JavaScriptCore) + 174  [0x10bb8bdce]
 +   !             :               1507 JSC::Interpreter::executeCall(JSC::JSGlobalObject*, JSC::JSObject*, JSC::CallData const&, JSC::JSValue, JSC::ArgList const&)  (in JavaScriptCore) + 555  [0x10b92c81b]
 +   !             :                 1507 vmEntryToJavaScript  (in JavaScriptCore) + 216  [0x10b251326]
 +   !             :                   1501 ???  (in <unknown binary>)  [0x3a716bc303aa]
 +   !             :                   | 1501 ???  (in <unknown binary>)  [0x3a716ba2c317]
 +   !             :                   |   1499 ???  (in <unknown binary>)  [0x3a716be87fca]
 +   !             :                   |   + 1452 ???  (in <unknown binary>)  [0x3a716bc177d8]
 +   !             :                   |   + ! 1186 ???  (in <unknown binary>)  [0x3a716bdfa779]
 +   !             :                   |   + ! : 1186 ???  (in <unknown binary>)  [0x3a716ba011d8]
 +   !             :                   |   + ! :   1058 WebCore::jsNodePrototypeFunction_insertBefore(JSC::JSGlobalObject*, JSC::CallFrame*)  (in WebCore) + 392  [0x1075bb4a8]
 +   !             :                   |   + ! :   | 718 WebCore::ContainerNode::insertBefore(WebCore::Node&, WebCore::Node*)  (in WebCore) + 1697  [0x1080ea671]
 +   !             :                   |   + ! :   | + 718 WebCore::HTMLElement::childrenChanged(WebCore::ContainerNode::ChildChange const&)  (in WebCore) + 18  [0x10834fcb2]
 +   !             :                   |   + ! :   | +   697 WebCore::SlotAssignment::didChangeSlot(WTF::AtomString const&, WebCore::ShadowRoot&)  (in WebCore) + 260  [0x1081c9f24]
 +   !             :                   |   + ! :   | +   ! 364 WebCore::RenderTreeUpdater::tearDownRenderers(WebCore::Element&, WebCore::RenderTreeUpdater::TeardownType, WebCore::RenderTreeBuilder&)  (in WebCore) + 332  [0x108cd7f3c]
 +   !             :                   |   + ! :   | +   ! : 106 WebCore::ComposedTreeIterator::traverseNextInShadowTree()  (in WebCore) + 96  [0x1080e33a0]
 +   !             :                   |   + ! :   | +   ! : | 106 WebCore::ElementAndTextDescendantIterator::traverseNext()  (in WebCore) + 196,45,...  [0x1080e37b4,0x1080e371d,...]
 +   !             :                   |   + ! :   | +   ! : 67 WebCore::ComposedTreeIterator::traverseNextInShadowTree()  (in WebCore) + 192  [0x1080e3400]
 +   !             :                   |   + ! :   | +   ! : | 67 WebCore::HTMLSlotElement::assignedNodes() const  (in WebCore) + 64  [0x1083dd940]
 +   !             :                   |   + ! :   | +   ! : |   67 WebCore::SlotAssignment::assignedNodesForSlot(WebCore::HTMLSlotElement const&, WebCore::ShadowRoot&)  (in WebCore) + 353  [0x1081c6181]
 +   !             :                   |   + ! :   | +   ! : |     60 WebCore::SlotAssignment::assignSlots(WebCore::ShadowRoot&)  (in WebCore) + 258  [0x1081c9032]
 +   !             :                   |   + ! :   | +   ! : |     + 34 WebCore::SlotAssignment::assignToSlot(WebCore::Node&, WTF::AtomString const&)  (in WebCore) + 730,340,...  [0x1081ca63a,0x1081ca4b4,...]
Comment 2 Radar WebKit Bug Importer 2021-08-30 12:26:15 PDT
<rdar://problem/82541045>
Comment 3 Simon Fraser (smfr) 2021-08-30 14:01:56 PDT
<rdar://81789435>
Comment 4 Carlos Garcia Campos 2021-08-31 02:12:59 PDT
hmm, this happens because a lot of elements are added to the same shadow root, so didChangeDefaultSlot() and hostChildElementDidChange() are called too many times. Iterating the composed tree every time makes things super slow. I don't think we need to tear down renderers on every didChangeSlot call, only the ones coming from hostChildElementDidChange() and not in all those cases either. IIRC the problem was only when replacing a previous tree and there was a node with display contents. So, I think we need to find a more specific solution.
Comment 5 Antti Koivisto 2021-08-31 05:07:13 PDT
Created attachment 436877 [details]
patch
Comment 6 Carlos Garcia Campos 2021-08-31 05:44:28 PDT
Comment on attachment 436877 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=436877&action=review

> Source/WebCore/rendering/updating/RenderTreeUpdater.cpp:544
> +    if (!host.renderer() && !host.hasDisplayContents())
> +        return;

Oh, good catch! this is consistent with destroyRenderTreeIfNeeded().
Comment 7 Antti Koivisto 2021-08-31 05:50:37 PDT
> Oh, good catch! this is consistent with destroyRenderTreeIfNeeded().

This test should/could probably be in tearDownRenderers but lets do a minimal fix here.
Comment 8 Darin Adler 2021-08-31 11:06:48 PDT
Comment on attachment 436877 [details]
patch

Any good way to regression-test this performance optimization? Maybe we could make a test that is *super-slow* if this is broken.
Comment 9 Antti Koivisto 2021-08-31 12:03:46 PDT
> Any good way to regression-test this performance optimization? Maybe we
> could make a test that is *super-slow* if this is broken.

Those sort of benchmark is layout tests have often been problematic (hard to make reliable and speedy enough). We definitely should have coverage in our regular performance tests for Shadow DOM as it is getting increasingly popular. Speedometer3 is apparently planned to have some. It might also make sense to add microbenchmarks.
Comment 10 EWS 2021-08-31 12:04:26 PDT
Committed r281813 (241149@main): <https://commits.webkit.org/241149@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 436877 [details].
Comment 11 Darin Adler 2021-08-31 13:32:50 PDT
(In reply to Antti Koivisto from comment #9)
> > Any good way to regression-test this performance optimization? Maybe we
> > could make a test that is *super-slow* if this is broken.
> 
> Those sort of benchmark is layout tests have often been problematic (hard to
> make reliable and speedy enough). We definitely should have coverage in our
> regular performance tests for Shadow DOM as it is getting increasingly
> popular. Speedometer3 is apparently planned to have some. It might also make
> sense to add microbenchmarks.

Seems to me that this optimization for this issue is so huge, it lends itself to a fast test that would simply hang without the optimization. Maybe that’s not right?
Comment 12 Antti Koivisto 2021-09-01 02:14:22 PDT
Yeah, maybe. I'll try.
Comment 13 Sam Sneddon [:gsnedders] 2021-09-01 10:25:00 PDT
Yeah, for anything like this where we're doing > O(n) work it's not too hard to write a regression test; I'm against writing tests for checking whether O(n) differs from O(n), but O(n) (or O(1) in this case) versus O(n^2) is pretty easy to detect.
Comment 14 Antti Koivisto 2021-09-06 07:06:06 PDT
Perf test in 229960