We can optimize the performance of querySelector(".class") and querySelector("tag"), by using the result of Node::getElementsClassName() and Node::getElementsByTagName() instead of traversing the DOM tree.
Created attachment 145756 [details] Patch
Comment on attachment 145756 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=145756&action=review > Source/WebCore/dom/SelectorQuery.cpp:132 > + RefPtr<NodeList> nodeList = match == CSSSelector::Class ? > + rootNode->getElementsByClassName(selector->value()) : > + rootNode->getElementsByTagName(selector->tag().localName()); > + ASSERT(nodeList); > + // If we do not keep a reference to the nodeList, nodeList is destructed when > + // this method returns. Then getElementsByClassName()/getElementsByTagName() > + // always causes cache misses and has to create nodeList every time. > + // To prevent the situation, we keep the reference in a HashSet. > + m_cachedNodeList.add(nodeList); Isn't this only helpful if the same query is executed without document changing between the calls? Of the document mutates the node list will invalid and has to recreated anyway. Optimizing for the same query on the same unmutated document seems more like a benchmark hack than something that help in real world. The id case is different as we always maintain a live id map. Seems like getElementsByClassName/TagName caching should work without explicitly holding the reference. Is that a bug in those implementations? > Source/WebCore/dom/SelectorQuery.h:67 > + HashSet<RefPtr<NodeList> > m_cachedNodeList; HashSets are large. Having an instance per-query could consumes lots of memory.
Thanks for the quick review! (In reply to comment #2) > Isn't this only helpful if the same query is executed without document changing between the calls? Of the document mutates the node list will invalid and has to recreated anyway. Optimizing for the same query on the same unmutated document seems more like a benchmark hack than something that help in real world. I think no. It would be helpful in a lot of real-world cases. For example: <html><body><div class="a"></div><div class="a"></div><script> document.getElementsByClassName("a"); // Creates NodeRareData::m_classNodeListCache document.getElementsByClassName("a"); // Cache hit div = document.createElement("div"); div.className = "a"; document.body.appendChild(div); // div is added to m_classNodeListCache document.getElementsByClassName("a"); // Cache hit </script></body></html> m_classNodeListCache is sometimes invalidated (I do not fully understand the exact condition of invalidation), but m_classNodeListCache is kept being alive over simple document mutations as above. (We can observe cache hit/miss by inserting printf() in Node::getElementsByClassName()). > The id case is different as we always maintain a live id map. Yeah, the id map is always alive. On the other hand, the class map and the tag map are almost alive but sometimes invalidated. (I think that "almost alive" is the reason why document.getElementsByClassName() are maintaining the cached node list. Otherwise, document.getElementsByClassName() will just traverse the DOM tree.) > Seems like getElementsByClassName/TagName caching should work without explicitly holding the reference. Is that a bug in those implementations? I do not think it is a bug. In case of document.getElementsByClassName(), the cached node list works well without explicitly holding the reference, because document.getElementsByClassName() creates a wrapper object of the cached node list, and the wrapper object holds a reference to the cached node list. In case of querySelector(), we cannot rely on a wrapper object since we do not create any wrapper object. So somewhere we need to hold the reference to keep the cached node list alive. > > Source/WebCore/dom/SelectorQuery.h:67 > > + HashSet<RefPtr<NodeList> > m_cachedNodeList; > > HashSets are large. Having an instance per-query could consumes lots of memory. Where do you think we can add the HashSet?
Comment on attachment 145756 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=145756&action=review >> Source/WebCore/dom/SelectorQuery.cpp:132 >> + m_cachedNodeList.add(nodeList); > > Isn't this only helpful if the same query is executed without document changing between the calls? Of the document mutates the node list will invalid and has to recreated anyway. Optimizing for the same query on the same unmutated document seems more like a benchmark hack than something that help in real world. > > The id case is different as we always maintain a live id map. > > Seems like getElementsByClassName/TagName caching should work without explicitly holding the reference. Is that a bug in those implementations? Oh, I think the point is to keep a reference to getElementsByClassName/getElementsByTagName. If the reference is GC'ed, then we'll remove the cache as well as NodeList.
(In reply to comment #4) > > Seems like getElementsByClassName/TagName caching should work without explicitly holding the reference. Is that a bug in those implementations? > > Oh, I think the point is to keep a reference to getElementsByClassName/getElementsByTagName. If the reference is GC'ed, then we'll remove the cache as well as NodeList. I think that's correct.
Created attachment 145905 [details] Patch
Created attachment 145906 [details] Patch
(In reply to comment #2) > > + HashSet<RefPtr<NodeList> > m_cachedNodeList; > > HashSets are large. Having an instance per-query could consumes lots of memory. Removed the HashSet. Since one SelectorDataList matches just either one ClassNodeList or one TagNodeList, SelectorDataList just needs to keep one m_nodeList.
Created attachment 145908 [details] Patch
Comment on attachment 145908 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=145908&action=review > Source/WebCore/dom/SelectorQuery.cpp:127 > + m_nodeList = match == CSSSelector::Class ? > + rootNode->getElementsByClassName(selector->value()) : > + rootNode->getElementsByTagName(selector->tag().localName()); > + ASSERT(m_nodeList); Using dynamic node lists internally has been causing performance problems before. The invalidation logic is complicated and needs to run on DOM mutations if there are dynamic lists alive. This can significantly slow down mutations. I don't know if that code has improved so that this is no longer a problem. I would really like to have internal versions for class and tag lookups. Using DOM API types can cause all sorts of trouble.
(In reply to comment #10) > > + m_nodeList = match == CSSSelector::Class ? > > + rootNode->getElementsByClassName(selector->value()) : > > + rootNode->getElementsByTagName(selector->tag().localName()); > > + ASSERT(m_nodeList); > > Using dynamic node lists internally has been causing performance problems before. The invalidation logic is complicated and needs to run on DOM mutations if there are dynamic lists alive. This can significantly slow down mutations. I don't know if that code has improved so that this is no longer a problem. > > I would really like to have internal versions for class and tag lookups. Using DOM API types can cause all sorts of trouble. Another approach is to cache node lists only when the same query is repeatedly requested. This approach would make sense because given that the same query is repeatedly requested, the node list management cost will pay. If the query is just one time, a node list is never created. But anyway more investigation seems to be needed. Let me unmark r?.