Summary: | [Performance] Optimize querySelector() by caching SelectorQuery objects | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Kentaro Hara <haraken> | ||||||||||||||||||
Component: | CSS | Assignee: | Kentaro Hara <haraken> | ||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||
Severity: | Normal | CC: | cmarcelo, darin, dglazkov, koivisto, macpherson, menard, rniwa, webkit.review.bot | ||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||
Bug Depends on: | |||||||||||||||||||||
Bug Blocks: | 87625 | ||||||||||||||||||||
Attachments: |
|
Description
Kentaro Hara
2012-05-31 01:42:30 PDT
Created attachment 145027 [details]
Patch
Comment on attachment 145027 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=145027&action=review > Source/WebCore/css/CSSParser.cpp:9631 > + if (!m_querySelectorList.first() || m_querySelectorList.hasUnknownPseudoElements()) { > + ec = SYNTAX_ERR; > + return; > + } Exceptions should usually be generated on DOM interface classes, not on internal types. > Source/WebCore/css/CSSParser.h:497 > +class CSSSelectorCache : public RefCounted<CSSSelectorCache> { This is more like SelectorQueryCacheEntry. > Source/WebCore/css/CSSParser.h:501 > + CSSSelectorCache() { }; > + SelectorQuery& selectorQuery() { return m_selectorQuery; }; > + void parseSelector(const String&, Document*, ExceptionCode&); I think it would be better to just pass in the CSSSelectorList as a constructor parameter instead of having parseSelector() here. > Source/WebCore/css/CSSParser.h:505 > + CSSSelectorList m_querySelectorList; > + SelectorQuery m_selectorQuery; What do you need the m_querySelectorList as a member? > Source/WebCore/dom/Document.cpp:748 > +PassRefPtr<CSSSelectorCache> Document::findCSSSelectorCache(const AtomicString& selectors) > +{ > + Settings* currentSettings = settings(); > + if (currentSettings && currentSettings->cssSelectorCacheInvalidated()) { > + currentSettings->clearCSSSelectorCacheInvalidated(); > + invalidateCSSSelectorCache(); > + return 0; > + } > + > + HashMap<AtomicString, RefPtr<CSSSelectorCache> >::iterator it = m_cssSelectorCacheMap.find(selectors); > + if (it == m_cssSelectorCacheMap.end()) > + return 0; > + return it->second; > +} Bit sad to see all this stuff added to the already-bloated Document. I think you should go with a different factoring. You would have a SelectorQueryCache class that (either per-document or a singleton but better start with per-document) that represents the cache. Most of the code and fields you currently have on Document move this this class. What you now call CSSSelectorCache becomes SelectorQueryCacheEntry (or SelectorQueryCache::Entry). > Source/WebCore/dom/Document.cpp:774 > + invalidateCSSSelectorCache(); Instead of having the fragile invalidation scheme, I think you could just include a copy of CSSParserContext used for parsing the selector along with the cache (or cache entry). When restoring an entry you would check the context is still equal. If not you throw it out. What do you think? Comment on attachment 145027 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=145027&action=review Thanks for reviewing! >> Source/WebCore/css/CSSParser.cpp:9631 >> + } > > Exceptions should usually be generated on DOM interface classes, not on internal types. Yeah... any suggestion about what we should do here? (I just copied the code from the current Node::querySelector() to keep the current behavior as is.) >> Source/WebCore/css/CSSParser.h:497 >> +class CSSSelectorCache : public RefCounted<CSSSelectorCache> { > > This is more like SelectorQueryCacheEntry. Will fix. >> Source/WebCore/css/CSSParser.h:501 >> + void parseSelector(const String&, Document*, ExceptionCode&); > > I think it would be better to just pass in the CSSSelectorList as a constructor parameter instead of having parseSelector() here. Makes sense. Will fix. >> Source/WebCore/css/CSSParser.h:505 >> + SelectorQuery m_selectorQuery; > > What do you need the m_querySelectorList as a member? As described in ChangeLog, m_querySelectorList manages the lifetime of CSSSelector s. SelectorQuery::m_selectors just hold pointers to CSSSelector s. Those CSSSelector s are destructed when m_querySelectorList is destructed. >> Source/WebCore/dom/Document.cpp:748 >> +} > > Bit sad to see all this stuff added to the already-bloated Document. > > I think you should go with a different factoring. You would have a SelectorQueryCache class that (either per-document or a singleton but better start with per-document) that represents the cache. Most of the code and fields you currently have on Document move this this class. What you now call CSSSelectorCache becomes SelectorQueryCacheEntry (or SelectorQueryCache::Entry). Sounds reasonable. Then I can prepare a new file (SelectorQueryCache.{h,cpp}) and put SelectorQueryCache and SelectorQueryCache::Entry in the file. Will fix in the next patch. >> Source/WebCore/dom/Document.cpp:774 >> + invalidateCSSSelectorCache(); > > Instead of having the fragile invalidation scheme, I think you could just include a copy of CSSParserContext used for parsing the selector along with the cache (or cache entry). When restoring an entry you would check the context is still equal. If not you throw it out. > > What do you think? I've tried it but gave up due to performance issue. The approach of copying a CSSParserContext and comparing two CSSParserContext s degrades performance by 90%~ (though I could not remember the exact figure:-). Even by hackily removing KURL and baseURI entries from CSSParserContext to reduce the copy and compare overhead, it still degrades performance by 30%~. So I'd like to adopt the invalidation approach here. (In reply to comment #3) > (From update of attachment 145027 [details]) > As described in ChangeLog, m_querySelectorList manages the lifetime of CSSSelector s. SelectorQuery::m_selectors just hold pointers to CSSSelector s. Those CSSSelector s are destructed when m_querySelectorList is destructed. That is a surprising behavior. Perhaps SelectorQuery should own them? > I've tried it but gave up due to performance issue. The approach of copying a CSSParserContext and comparing two CSSParserContext s degrades performance by 90%~ (though I could not remember the exact figure:-). Even by hackily removing KURL and baseURI entries from CSSParserContext to reduce the copy and compare overhead, it still degrades performance by 30%~. So I'd like to adopt the invalidation approach here. Ok. Just saving/comparing the bits that affect selector parsing might be an option too. Created attachment 145202 [details]
Patch
Created attachment 145204 [details]
Patch
anttik: Uploaded a patch that applies your comments. Thanks. (In reply to comment #4) > (In reply to comment #3) > > (From update of attachment 145027 [details] [details]) > > As described in ChangeLog, m_querySelectorList manages the lifetime of CSSSelector s. SelectorQuery::m_selectors just hold pointers to CSSSelector s. Those CSSSelector s are destructed when m_querySelectorList is destructed. > > That is a surprising behavior. Perhaps SelectorQuery should own them? Maybe. But in that case, to make adoptPtr(new CSSSelector(selectors)) work, we need to implement CSSSelector::CSSSelector(const CSSSelector* o), which would be almost a copy of CSSSelector::CSSSelector(const CSSSelector& o). Since CSSSelectorList is an object to keep the lifetime of CSSSelectors (c.f. CSSSelectorList::deleteSelectors() deletes all CSSSelectors), I guess that simply keeping CSSSelectorList in the cache entry would be a straightforward approach. > > I've tried it but gave up due to performance issue. The approach of copying a CSSParserContext and comparing two CSSParserContext s degrades performance by 90%~ (though I could not remember the exact figure:-). Even by hackily removing KURL and baseURI entries from CSSParserContext to reduce the copy and compare overhead, it still degrades performance by 30%~. So I'd like to adopt the invalidation approach here. > > Ok. Just saving/comparing the bits that affect selector parsing might be an option too. I tried it, but it still has a performance concern. Look at a WIP patch: https://bugs.webkit.org/attachment.cgi?id=145202&action=review The WIP patch saves/compares bits in SelectorQueryCache::findEntry(). The performance of query-selector-first.html is as follows: My original patch: 2077.08 runs/s The WIP patch: 1797.71 runs/s (i.e. 15.5% regression) Since the bottleneck of the save/compare is baseURI, just for experiment, I commented out the code for saving/comparing baseURI. (Consequently, all saving/comparing are conducted by booleans in this experiment.) Even in that case the performance is 2020.71 runs/s (i.e. 2.8% regression). Comment on attachment 145204 [details] Patch Attachment 145204 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/12871390 New failing tests: http/tests/media/media-source/video-media-source-event-attributes.html Created attachment 145235 [details]
Archive of layout-test-results from ec2-cr-linux-02
The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-02 Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'> Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment on attachment 145204 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=145204&action=review Looks better! > Source/WebCore/dom/SelectorQuery.cpp:183 > +SelectorQueryCache::Entry::Entry(const AtomicString& selectors, Document* document, ExceptionCode& ec) > +{ > + CSSParser parser(document); > + parser.parseSelector(selectors, m_querySelectorList); > + > + if (!m_querySelectorList.first() || m_querySelectorList.hasUnknownPseudoElements()) { > + ec = SYNTAX_ERR; > + return; > + } > + > + // throw a NAMESPACE_ERR if the selector includes any namespace prefixes. > + if (m_querySelectorList.selectorsNeedNamespaceResolution()) { > + ec = NAMESPACE_ERR; > + return; > + } > + > + m_selectorQuery.initialize(m_querySelectorList); > +} The parsing should be done in the calling function not the Entry constructor (as parsing can fail and we just constructed something we don't need). The constructor can take CSSSelectorList as argument. > Source/WebCore/dom/SelectorQuery.h:94 > +class Entry : public RefCounted<SelectorQueryCache::Entry> { > +public: > + Entry(const AtomicString&, Document*, ExceptionCode&); > + SelectorQuery& selectorQuery() { return m_selectorQuery; }; > + > +private: > + // m_querySelectorList keeps the lifetime of CSSSelectors in m_selectorQuery. > + // Since m_selectorQuery just holds pointers to CSSSelector objects, > + // m_querySelectorList must not be destructed before m_selectorQuery is destructed. > + CSSSelectorList m_querySelectorList; > + SelectorQuery m_selectorQuery; > +}; Indentation. > Source/WebCore/dom/SelectorQuery.h:99 > + PassRefPtr<SelectorQueryCache::Entry> createNewEntry(const AtomicString&, Document*, ExceptionCode&); > + PassRefPtr<SelectorQueryCache::Entry> findEntry(const AtomicString&, Settings*); A single function (maybe just add() since HashMap::add() has semantics like that) should be sufficient. If the entry does not exists it creates it. This is more efficient since we will only need one hash lookup when adding. SelectorQueryCache::Entry could be a private class, the function could just return SelectorQuery*. > Source/WebCore/dom/SelectorQuery.h:100 > + void invalidateEntries(); Just call it invalidate() > Source/WebCore/page/Settings.h:330 > - void setCSSCustomFilterEnabled(bool enabled) { m_isCSSCustomFilterEnabled = enabled; } > + void setCSSCustomFilterEnabled(bool enabled) { m_cssSelectorCacheInvalidated = true; m_isCSSCustomFilterEnabled = enabled; } > bool isCSSCustomFilterEnabled() const { return m_isCSSCustomFilterEnabled; } > > #if ENABLE(CSS_REGIONS) > - void setCSSRegionsEnabled(bool enabled) { m_cssRegionsEnabled = enabled; } > + void setCSSRegionsEnabled(bool enabled) { m_cssSelectorCacheInvalidated = true; m_cssRegionsEnabled = enabled; } I don't think this invalidation code is needed at all. Flipping these settings is not a user scenario. If someone changes them it is reasonable to expect them to take full effect only on subsequent document loads. The existing code works like this too (we don't reparse stylesheets for example). Comment on attachment 145204 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=145204&action=review > Source/WebCore/dom/SelectorQuery.h:83 > +class Entry : public RefCounted<SelectorQueryCache::Entry> { This doesn't need to be RefCounted. You can use HashMap<AtomicString, OwnPtr<SelectorQueryCache::Entry> > Created attachment 145447 [details]
Patch
Comment on attachment 145204 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=145204&action=review >> Source/WebCore/dom/SelectorQuery.cpp:183 >> +} > > The parsing should be done in the calling function not the Entry constructor (as parsing can fail and we just constructed something we don't need). The constructor can take CSSSelectorList as argument. Done. >> Source/WebCore/dom/SelectorQuery.h:83 >> +class Entry : public RefCounted<SelectorQueryCache::Entry> { > > This doesn't need to be RefCounted. You can use HashMap<AtomicString, OwnPtr<SelectorQueryCache::Entry> > Done. >> Source/WebCore/dom/SelectorQuery.h:94 >> +}; > > Indentation. Done. >> Source/WebCore/dom/SelectorQuery.h:99 >> + PassRefPtr<SelectorQueryCache::Entry> findEntry(const AtomicString&, Settings*); > > A single function (maybe just add() since HashMap::add() has semantics like that) should be sufficient. If the entry does not exists it creates it. This is more efficient since we will only need one hash lookup when adding. > > SelectorQueryCache::Entry could be a private class, the function could just return SelectorQuery*. Done. >> Source/WebCore/dom/SelectorQuery.h:100 >> + void invalidateEntries(); > > Just call it invalidate() Done. >> Source/WebCore/page/Settings.h:330 >> + void setCSSRegionsEnabled(bool enabled) { m_cssSelectorCacheInvalidated = true; m_cssRegionsEnabled = enabled; } > > I don't think this invalidation code is needed at all. Flipping these settings is not a user scenario. If someone changes them it is reasonable to expect them to take full effect only on subsequent document loads. The existing code works like this too (we don't reparse stylesheets for example). Good point. Done. Comment on attachment 145447 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=145447&action=review nice, r=me > Source/WebCore/dom/SelectorQuery.h:80 > +class SelectorQueryCache { > +public: This should have WTF_MAKE_NONCOPYABLE(SelectorQueryCache); WTF_MAKE_FAST_ALLOCATED; > Source/WebCore/dom/SelectorQuery.h:89 > + class Entry { > + public: WTF_MAKE_NONCOPYABLE(Entry); WTF_MAKE_FAST_ALLOCATED; > Source/WebCore/dom/SelectorQuery.h:98 > + // m_querySelectorList must not be destructed before m_selectorQuery is destructed. > + CSSSelectorList m_querySelectorList; > + SelectorQuery m_selectorQuery; If the CSSSelectorList was moved to SelectorQuery then you might be able to get rid of the Entry type entirely and just have OwnPtr<SelectorQuery> as the map value. This doesn't need to be done in this patch though. Created attachment 145454 [details]
patch for landing
Thanks for reviewing. (In reply to comment #15) > This should have > > WTF_MAKE_NONCOPYABLE(SelectorQueryCache); WTF_MAKE_FAST_ALLOCATED; Done. > > Source/WebCore/dom/SelectorQuery.h:89 > > + class Entry { > > + public: > > WTF_MAKE_NONCOPYABLE(Entry); WTF_MAKE_FAST_ALLOCATED; Done. > > Source/WebCore/dom/SelectorQuery.h:98 > > + // m_querySelectorList must not be destructed before m_selectorQuery is destructed. > > + CSSSelectorList m_querySelectorList; > > + SelectorQuery m_selectorQuery; > > If the CSSSelectorList was moved to SelectorQuery then you might be able to get rid of the Entry type entirely and just have OwnPtr<SelectorQuery> as the map value. This doesn't need to be done in this patch though. OK, let me try it in a follow-up patch. Comment on attachment 145454 [details] patch for landing Clearing flags on attachment: 145454 Committed r119330: <http://trac.webkit.org/changeset/119330> This patch broke Windows build: http://build.webkit.org/builders/Chromium%20Win%20Release/builds/44741/steps/compile-webkit/logs/stdio http://build.webkit.org/builders/Windows%20Debug%20%28Build%29/builds/49795/steps/compile-webkit/logs/stdio Nested classes don't work well in Visual Studio. (In reply to comment #19) > This patch broke Windows build: > > Nested classes don't work well in Visual Studio. Ah, sorry. Do you know the quick fix? Maybe we need to make SelectorQueryCache::Entry public? (I don't have win build environment in my home now...) (In reply to comment #20) > (In reply to comment #19) > > This patch broke Windows build: > > > > Nested classes don't work well in Visual Studio. > > Ah, sorry. Do you know the quick fix? Maybe we need to make SelectorQueryCache::Entry public? (I don't have win build environment in my home now...) I'm un-nesting the class at least for now since that'll work for sure. Thank you very much for the fix! Fixed the build in http://trac.webkit.org/changeset/119348. Please re-nest the Entry if you can figure out how to make it compile on Windows :) We have plenty of nested classes so I suppose it can be made work. But as mentioned above, a better fix is to eliminate SelectorQueryCacheEntry. It doesn't actually need to be in the header either, it could go to .cpp. (In reply to comment #25) > It doesn't actually need to be in the header either, it could go to .cpp. Yeah, I'll upload a patch tomorrow:) Comment on attachment 145454 [details] patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=145454&action=review > Source/WebCore/dom/SelectorQuery.h:71 > + SelectorQuery() { }; Two thoughts here: 1) If you have no arguments and want to inline the constructor, you don’t have to define a constructor at all. That’s exactly what the compiler-generated constructor would do. 2) The semicolon after the end brace here is unneeded and incorrect, although it causes no concrete problems. > Source/WebCore/dom/SelectorQuery.h:83 > + SelectorQueryCache() { }; Both above comments apply here. > Source/WebCore/dom/SelectorQuery.h:95 > + SelectorQuery* selectorQuery() { return &m_selectorQuery; }; Semicolon here should go as above. Reopening to attach new patch. Created attachment 145656 [details]
Patch
Thanks darin. (In reply to comment #27) > > Source/WebCore/dom/SelectorQuery.h:71 > > + SelectorQuery() { }; > > Two thoughts here: > > 1) If you have no arguments and want to inline the constructor, you don’t have to define a constructor at all. That’s exactly what the compiler-generated constructor would do. > > 2) The semicolon after the end brace here is unneeded and incorrect, although it causes no concrete problems. > > > Source/WebCore/dom/SelectorQuery.h:95 > > + SelectorQuery* selectorQuery() { return &m_selectorQuery; }; > > Semicolon here should go as above. These issues were resolved in r119399. > > Source/WebCore/dom/SelectorQuery.h:83 > > + SelectorQueryCache() { }; > > Both above comments apply here. Uploaded a patch to fix this. Would you take a look? Comment on attachment 145656 [details] Patch Rejecting attachment 145656 [details] from commit-queue. New failing tests: http/tests/media/media-source/video-media-source-event-attributes.html Full output: http://queues.webkit.org/results/12896475 Created attachment 145706 [details]
Archive of layout-test-results from ec2-cq-01
The attached test failures were seen while running run-webkit-tests on the commit-queue.
Bot: ec2-cq-01 Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'> Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment on attachment 145656 [details] Patch Clearing flags on attachment: 145656 Committed r119478: <http://trac.webkit.org/changeset/119478> All reviewed patches have been landed. Closing bug. |