Bug 87942 - [Performance] Optimize querySelector() by caching SelectorQuery objects
Summary: [Performance] Optimize querySelector() by caching SelectorQuery objects
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Kentaro Hara
URL:
Keywords:
Depends on:
Blocks: 87625
  Show dependency treegraph
 
Reported: 2012-05-31 01:42 PDT by Kentaro Hara
Modified: 2012-06-05 03:40 PDT (History)
8 users (show)

See Also:


Attachments
Patch (19.23 KB, patch)
2012-05-31 02:40 PDT, Kentaro Hara
no flags Details | Formatted Diff | Diff
Patch (13.27 KB, patch)
2012-05-31 20:33 PDT, Kentaro Hara
no flags Details | Formatted Diff | Diff
Patch (19.04 KB, patch)
2012-05-31 20:48 PDT, Kentaro Hara
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ec2-cr-linux-02 (691.59 KB, application/zip)
2012-06-01 01:06 PDT, WebKit Review Bot
no flags Details
Patch (15.03 KB, patch)
2012-06-02 09:46 PDT, Kentaro Hara
no flags Details | Formatted Diff | Diff
patch for landing (15.18 KB, patch)
2012-06-02 10:38 PDT, Kentaro Hara
no flags Details | Formatted Diff | Diff
Patch (1.48 KB, patch)
2012-06-04 17:02 PDT, Kentaro Hara
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ec2-cq-01 (782.29 KB, application/zip)
2012-06-05 00:04 PDT, WebKit Review Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Kentaro Hara 2012-05-31 01:42:30 PDT
Currently Node::querySelector() and Node::querySelectorAll() parse CSS queries every time. By caching the parsed results (i.e. SelectorQuery objects) on a Document, we can optimize the performance.
Comment 1 Kentaro Hara 2012-05-31 02:40:04 PDT
Created attachment 145027 [details]
Patch
Comment 2 Antti Koivisto 2012-05-31 04:30:59 PDT
Comment on attachment 145027 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=145027&action=review

> Source/WebCore/css/CSSParser.cpp:9631
> +    if (!m_querySelectorList.first() || m_querySelectorList.hasUnknownPseudoElements()) {
> +        ec = SYNTAX_ERR;
> +        return;
> +    }

Exceptions should usually be generated on DOM interface classes, not on internal types.

> Source/WebCore/css/CSSParser.h:497
> +class CSSSelectorCache : public RefCounted<CSSSelectorCache> {

This is more like SelectorQueryCacheEntry.

> Source/WebCore/css/CSSParser.h:501
> +    CSSSelectorCache() { };
> +    SelectorQuery& selectorQuery() { return m_selectorQuery; };
> +    void parseSelector(const String&, Document*, ExceptionCode&);

I think it would be better to just pass in the CSSSelectorList as a constructor parameter instead of having parseSelector() here.

> Source/WebCore/css/CSSParser.h:505
> +    CSSSelectorList m_querySelectorList;
> +    SelectorQuery m_selectorQuery;

What do you need the m_querySelectorList as a member?

> Source/WebCore/dom/Document.cpp:748
> +PassRefPtr<CSSSelectorCache> Document::findCSSSelectorCache(const AtomicString& selectors)
> +{
> +    Settings* currentSettings = settings();
> +    if (currentSettings && currentSettings->cssSelectorCacheInvalidated()) {
> +        currentSettings->clearCSSSelectorCacheInvalidated();
> +        invalidateCSSSelectorCache();
> +        return 0;
> +    }
> +
> +    HashMap<AtomicString, RefPtr<CSSSelectorCache> >::iterator it = m_cssSelectorCacheMap.find(selectors);
> +    if (it == m_cssSelectorCacheMap.end())
> +        return 0;
> +    return it->second;
> +}

Bit sad to see all this stuff added to the already-bloated Document.

I think you should go with a different factoring. You would have a SelectorQueryCache class that (either per-document or a singleton but better start with per-document) that represents the cache. Most of the code and fields you currently have on Document move this this class. What you now call CSSSelectorCache becomes SelectorQueryCacheEntry (or SelectorQueryCache::Entry).

> Source/WebCore/dom/Document.cpp:774
> +    invalidateCSSSelectorCache();

Instead of having the fragile invalidation scheme, I think you could just include a copy of CSSParserContext used for parsing the selector along with the cache (or cache entry). When restoring an entry you would check the context is still equal. If not you throw it out.

What do you think?
Comment 3 Kentaro Hara 2012-05-31 07:22:13 PDT
Comment on attachment 145027 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=145027&action=review

Thanks for reviewing!

>> Source/WebCore/css/CSSParser.cpp:9631
>> +    }
> 
> Exceptions should usually be generated on DOM interface classes, not on internal types.

Yeah... any suggestion about what we should do here? (I just copied the code from the current Node::querySelector() to keep the current behavior as is.)

>> Source/WebCore/css/CSSParser.h:497
>> +class CSSSelectorCache : public RefCounted<CSSSelectorCache> {
> 
> This is more like SelectorQueryCacheEntry.

Will fix.

>> Source/WebCore/css/CSSParser.h:501
>> +    void parseSelector(const String&, Document*, ExceptionCode&);
> 
> I think it would be better to just pass in the CSSSelectorList as a constructor parameter instead of having parseSelector() here.

Makes sense. Will fix.

>> Source/WebCore/css/CSSParser.h:505
>> +    SelectorQuery m_selectorQuery;
> 
> What do you need the m_querySelectorList as a member?

As described in ChangeLog, m_querySelectorList manages the lifetime of CSSSelector s. SelectorQuery::m_selectors just hold pointers to CSSSelector s. Those CSSSelector s are destructed when m_querySelectorList is destructed.

>> Source/WebCore/dom/Document.cpp:748
>> +}
> 
> Bit sad to see all this stuff added to the already-bloated Document.
> 
> I think you should go with a different factoring. You would have a SelectorQueryCache class that (either per-document or a singleton but better start with per-document) that represents the cache. Most of the code and fields you currently have on Document move this this class. What you now call CSSSelectorCache becomes SelectorQueryCacheEntry (or SelectorQueryCache::Entry).

Sounds reasonable. Then I can prepare a new file (SelectorQueryCache.{h,cpp}) and put SelectorQueryCache and SelectorQueryCache::Entry in the file. Will fix in the next patch.

>> Source/WebCore/dom/Document.cpp:774
>> +    invalidateCSSSelectorCache();
> 
> Instead of having the fragile invalidation scheme, I think you could just include a copy of CSSParserContext used for parsing the selector along with the cache (or cache entry). When restoring an entry you would check the context is still equal. If not you throw it out.
> 
> What do you think?

I've tried it but gave up due to performance issue. The approach of copying a CSSParserContext and comparing two CSSParserContext s degrades performance by 90%~ (though I could not remember the exact figure:-). Even by hackily removing KURL and baseURI entries from CSSParserContext to reduce the copy and compare overhead, it still degrades performance by 30%~. So I'd like to adopt the invalidation approach here.
Comment 4 Antti Koivisto 2012-05-31 07:52:00 PDT
(In reply to comment #3)
> (From update of attachment 145027 [details])
> As described in ChangeLog, m_querySelectorList manages the lifetime of CSSSelector s. SelectorQuery::m_selectors just hold pointers to CSSSelector s. Those CSSSelector s are destructed when m_querySelectorList is destructed.

That is a surprising behavior. Perhaps SelectorQuery should own them?

> I've tried it but gave up due to performance issue. The approach of copying a CSSParserContext and comparing two CSSParserContext s degrades performance by 90%~ (though I could not remember the exact figure:-). Even by hackily removing KURL and baseURI entries from CSSParserContext to reduce the copy and compare overhead, it still degrades performance by 30%~. So I'd like to adopt the invalidation approach here.

Ok. Just saving/comparing the bits that affect selector parsing might be an option too.
Comment 5 Kentaro Hara 2012-05-31 20:33:36 PDT
Created attachment 145202 [details]
Patch
Comment 6 Kentaro Hara 2012-05-31 20:48:35 PDT
Created attachment 145204 [details]
Patch
Comment 7 Kentaro Hara 2012-05-31 20:49:08 PDT
anttik: Uploaded a patch that applies your comments. Thanks.
Comment 8 Kentaro Hara 2012-05-31 20:59:37 PDT
(In reply to comment #4)
> (In reply to comment #3)
> > (From update of attachment 145027 [details] [details])
> > As described in ChangeLog, m_querySelectorList manages the lifetime of CSSSelector s. SelectorQuery::m_selectors just hold pointers to CSSSelector s. Those CSSSelector s are destructed when m_querySelectorList is destructed.
> 
> That is a surprising behavior. Perhaps SelectorQuery should own them?

Maybe. But in that case, to make adoptPtr(new CSSSelector(selectors)) work, we need to implement CSSSelector::CSSSelector(const CSSSelector* o), which would be almost a copy of CSSSelector::CSSSelector(const CSSSelector& o).

Since CSSSelectorList is an object to keep the lifetime of CSSSelectors (c.f. CSSSelectorList::deleteSelectors() deletes all CSSSelectors), I guess that simply keeping CSSSelectorList in the cache entry would be a straightforward approach.

> > I've tried it but gave up due to performance issue. The approach of copying a CSSParserContext and comparing two CSSParserContext s degrades performance by 90%~ (though I could not remember the exact figure:-). Even by hackily removing KURL and baseURI entries from CSSParserContext to reduce the copy and compare overhead, it still degrades performance by 30%~. So I'd like to adopt the invalidation approach here.
> 
> Ok. Just saving/comparing the bits that affect selector parsing might be an option too.

I tried it, but it still has a performance concern. Look at a WIP patch: https://bugs.webkit.org/attachment.cgi?id=145202&action=review

The WIP patch saves/compares bits in SelectorQueryCache::findEntry(). The performance of query-selector-first.html is as follows:

My original patch: 2077.08 runs/s
The WIP patch: 1797.71 runs/s  (i.e. 15.5% regression)

Since the bottleneck of the save/compare is baseURI, just for experiment, I commented out the code for saving/comparing baseURI. (Consequently, all saving/comparing are conducted by booleans in this experiment.) Even in that case the performance is 2020.71 runs/s (i.e. 2.8% regression).
Comment 9 WebKit Review Bot 2012-06-01 01:06:36 PDT
Comment on attachment 145204 [details]
Patch

Attachment 145204 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/12871390

New failing tests:
http/tests/media/media-source/video-media-source-event-attributes.html
Comment 10 WebKit Review Bot 2012-06-01 01:06:57 PDT
Created attachment 145235 [details]
Archive of layout-test-results from ec2-cr-linux-02

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-02  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment 11 Antti Koivisto 2012-06-01 04:43:41 PDT
Comment on attachment 145204 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=145204&action=review

Looks better!

> Source/WebCore/dom/SelectorQuery.cpp:183
> +SelectorQueryCache::Entry::Entry(const AtomicString& selectors, Document* document, ExceptionCode& ec)
> +{
> +    CSSParser parser(document);
> +    parser.parseSelector(selectors, m_querySelectorList);
> +
> +    if (!m_querySelectorList.first() || m_querySelectorList.hasUnknownPseudoElements()) {
> +        ec = SYNTAX_ERR;
> +        return;
> +    }
> +
> +    // throw a NAMESPACE_ERR if the selector includes any namespace prefixes.
> +    if (m_querySelectorList.selectorsNeedNamespaceResolution()) {
> +        ec = NAMESPACE_ERR;
> +        return;
> +    }
> +    
> +    m_selectorQuery.initialize(m_querySelectorList);
> +}

The parsing should be done in the calling function not the Entry constructor (as parsing can fail and we just constructed something we don't need). The constructor can take CSSSelectorList as argument.

> Source/WebCore/dom/SelectorQuery.h:94
> +class Entry : public RefCounted<SelectorQueryCache::Entry> {
> +public:
> +    Entry(const AtomicString&, Document*, ExceptionCode&);
> +    SelectorQuery& selectorQuery() { return m_selectorQuery; };
> +
> +private:
> +    // m_querySelectorList keeps the lifetime of CSSSelectors in m_selectorQuery.
> +    // Since m_selectorQuery just holds pointers to CSSSelector objects,
> +    // m_querySelectorList must not be destructed before m_selectorQuery is destructed.
> +    CSSSelectorList m_querySelectorList;
> +    SelectorQuery m_selectorQuery;
> +};

Indentation.

> Source/WebCore/dom/SelectorQuery.h:99
> +    PassRefPtr<SelectorQueryCache::Entry> createNewEntry(const AtomicString&, Document*, ExceptionCode&);
> +    PassRefPtr<SelectorQueryCache::Entry> findEntry(const AtomicString&, Settings*);

A single function (maybe just add() since HashMap::add() has semantics like that) should be sufficient. If the entry does not exists it creates it. This is more efficient since we will only need one hash lookup when adding.

SelectorQueryCache::Entry could be a private class, the function could just return SelectorQuery*.

> Source/WebCore/dom/SelectorQuery.h:100
> +    void invalidateEntries();

Just call it invalidate()

> Source/WebCore/page/Settings.h:330
> -        void setCSSCustomFilterEnabled(bool enabled) { m_isCSSCustomFilterEnabled = enabled; }
> +        void setCSSCustomFilterEnabled(bool enabled) { m_cssSelectorCacheInvalidated = true; m_isCSSCustomFilterEnabled = enabled; }
>          bool isCSSCustomFilterEnabled() const { return m_isCSSCustomFilterEnabled; }
>  
>  #if ENABLE(CSS_REGIONS)
> -        void setCSSRegionsEnabled(bool enabled) { m_cssRegionsEnabled = enabled; }
> +        void setCSSRegionsEnabled(bool enabled) { m_cssSelectorCacheInvalidated = true; m_cssRegionsEnabled = enabled; }

I don't think this invalidation code is needed at all. Flipping these settings is not a user scenario. If someone changes them it is reasonable to expect them to take full effect only on subsequent document loads. The existing code works like this too (we don't reparse stylesheets for example).
Comment 12 Antti Koivisto 2012-06-01 05:05:22 PDT
Comment on attachment 145204 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=145204&action=review

> Source/WebCore/dom/SelectorQuery.h:83
> +class Entry : public RefCounted<SelectorQueryCache::Entry> {

This doesn't need to be RefCounted. You can use HashMap<AtomicString, OwnPtr<SelectorQueryCache::Entry> >
Comment 13 Kentaro Hara 2012-06-02 09:46:29 PDT
Created attachment 145447 [details]
Patch
Comment 14 Kentaro Hara 2012-06-02 09:47:54 PDT
Comment on attachment 145204 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=145204&action=review

>> Source/WebCore/dom/SelectorQuery.cpp:183
>> +}
> 
> The parsing should be done in the calling function not the Entry constructor (as parsing can fail and we just constructed something we don't need). The constructor can take CSSSelectorList as argument.

Done.

>> Source/WebCore/dom/SelectorQuery.h:83
>> +class Entry : public RefCounted<SelectorQueryCache::Entry> {
> 
> This doesn't need to be RefCounted. You can use HashMap<AtomicString, OwnPtr<SelectorQueryCache::Entry> >

Done.

>> Source/WebCore/dom/SelectorQuery.h:94
>> +};
> 
> Indentation.

Done.

>> Source/WebCore/dom/SelectorQuery.h:99
>> +    PassRefPtr<SelectorQueryCache::Entry> findEntry(const AtomicString&, Settings*);
> 
> A single function (maybe just add() since HashMap::add() has semantics like that) should be sufficient. If the entry does not exists it creates it. This is more efficient since we will only need one hash lookup when adding.
> 
> SelectorQueryCache::Entry could be a private class, the function could just return SelectorQuery*.

Done.

>> Source/WebCore/dom/SelectorQuery.h:100
>> +    void invalidateEntries();
> 
> Just call it invalidate()

Done.

>> Source/WebCore/page/Settings.h:330
>> +        void setCSSRegionsEnabled(bool enabled) { m_cssSelectorCacheInvalidated = true; m_cssRegionsEnabled = enabled; }
> 
> I don't think this invalidation code is needed at all. Flipping these settings is not a user scenario. If someone changes them it is reasonable to expect them to take full effect only on subsequent document loads. The existing code works like this too (we don't reparse stylesheets for example).

Good point. Done.
Comment 15 Antti Koivisto 2012-06-02 10:31:16 PDT
Comment on attachment 145447 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=145447&action=review

nice, r=me

> Source/WebCore/dom/SelectorQuery.h:80
> +class SelectorQueryCache {
> +public:

This should have

WTF_MAKE_NONCOPYABLE(SelectorQueryCache); WTF_MAKE_FAST_ALLOCATED;

> Source/WebCore/dom/SelectorQuery.h:89
> +    class Entry {
> +    public:

WTF_MAKE_NONCOPYABLE(Entry); WTF_MAKE_FAST_ALLOCATED;

> Source/WebCore/dom/SelectorQuery.h:98
> +        // m_querySelectorList must not be destructed before m_selectorQuery is destructed.
> +        CSSSelectorList m_querySelectorList;
> +        SelectorQuery m_selectorQuery;

If the CSSSelectorList was moved to SelectorQuery then you might be able to get rid of the Entry type entirely and just have OwnPtr<SelectorQuery> as the map value. This doesn't need to be done in this patch though.
Comment 16 Kentaro Hara 2012-06-02 10:38:11 PDT
Created attachment 145454 [details]
patch for landing
Comment 17 Kentaro Hara 2012-06-02 10:39:12 PDT
Thanks for reviewing.

(In reply to comment #15)
> This should have
> 
> WTF_MAKE_NONCOPYABLE(SelectorQueryCache); WTF_MAKE_FAST_ALLOCATED;

Done.

> > Source/WebCore/dom/SelectorQuery.h:89
> > +    class Entry {
> > +    public:
> 
> WTF_MAKE_NONCOPYABLE(Entry); WTF_MAKE_FAST_ALLOCATED;

Done.

> > Source/WebCore/dom/SelectorQuery.h:98
> > +        // m_querySelectorList must not be destructed before m_selectorQuery is destructed.
> > +        CSSSelectorList m_querySelectorList;
> > +        SelectorQuery m_selectorQuery;
> 
> If the CSSSelectorList was moved to SelectorQuery then you might be able to get rid of the Entry type entirely and just have OwnPtr<SelectorQuery> as the map value. This doesn't need to be done in this patch though.

OK, let me try it in a follow-up patch.
Comment 18 WebKit Review Bot 2012-06-02 12:23:44 PDT
Comment on attachment 145454 [details]
patch for landing

Clearing flags on attachment: 145454

Committed r119330: <http://trac.webkit.org/changeset/119330>
Comment 20 Kentaro Hara 2012-06-02 19:34:06 PDT
(In reply to comment #19)
> This patch broke Windows build:
> 
> Nested classes don't work well in Visual Studio.

Ah, sorry. Do you know the quick fix? Maybe we need to make SelectorQueryCache::Entry public? (I don't have win build environment in my home now...)
Comment 21 Ryosuke Niwa 2012-06-02 19:36:15 PDT
(In reply to comment #20)
> (In reply to comment #19)
> > This patch broke Windows build:
> > 
> > Nested classes don't work well in Visual Studio.
> 
> Ah, sorry. Do you know the quick fix? Maybe we need to make SelectorQueryCache::Entry public? (I don't have win build environment in my home now...)

I'm un-nesting the class at least for now since that'll work for sure.
Comment 22 Kentaro Hara 2012-06-02 19:52:22 PDT
Thank you very much for the fix!
Comment 23 Ryosuke Niwa 2012-06-02 19:53:48 PDT
Fixed the build in http://trac.webkit.org/changeset/119348. Please re-nest the Entry if you can figure out how to make it compile on Windows :)
Comment 24 Antti Koivisto 2012-06-03 07:26:52 PDT
We have plenty of nested classes so I suppose it can be made work. But as mentioned above, a better fix is to eliminate SelectorQueryCacheEntry.
Comment 25 Antti Koivisto 2012-06-03 07:28:54 PDT
It doesn't actually need to be in the header either, it could go to .cpp.
Comment 26 Kentaro Hara 2012-06-03 07:30:37 PDT
(In reply to comment #25)
> It doesn't actually need to be in the header either, it could go to .cpp.

Yeah, I'll upload a patch tomorrow:)
Comment 27 Darin Adler 2012-06-04 15:23:55 PDT
Comment on attachment 145454 [details]
patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=145454&action=review

> Source/WebCore/dom/SelectorQuery.h:71
> +    SelectorQuery() { };

Two thoughts here:

1) If you have no arguments and want to inline the constructor, you don’t have to define a constructor at all. That’s exactly what the compiler-generated constructor would do.

2) The semicolon after the end brace here is unneeded and incorrect, although it causes no concrete problems.

> Source/WebCore/dom/SelectorQuery.h:83
> +    SelectorQueryCache() { };

Both above comments apply here.

> Source/WebCore/dom/SelectorQuery.h:95
> +        SelectorQuery* selectorQuery() { return &m_selectorQuery; };

Semicolon here should go as above.
Comment 28 Kentaro Hara 2012-06-04 17:02:08 PDT
Reopening to attach new patch.
Comment 29 Kentaro Hara 2012-06-04 17:02:12 PDT
Created attachment 145656 [details]
Patch
Comment 30 Kentaro Hara 2012-06-04 17:04:22 PDT
Thanks darin.

(In reply to comment #27)
> > Source/WebCore/dom/SelectorQuery.h:71
> > +    SelectorQuery() { };
> 
> Two thoughts here:
> 
> 1) If you have no arguments and want to inline the constructor, you don’t have to define a constructor at all. That’s exactly what the compiler-generated constructor would do.
> 
> 2) The semicolon after the end brace here is unneeded and incorrect, although it causes no concrete problems.
> 
> > Source/WebCore/dom/SelectorQuery.h:95
> > +        SelectorQuery* selectorQuery() { return &m_selectorQuery; };
> 
> Semicolon here should go as above.

These issues were resolved in r119399.

> > Source/WebCore/dom/SelectorQuery.h:83
> > +    SelectorQueryCache() { };
> 
> Both above comments apply here.

Uploaded a patch to fix this. Would you take a look?
Comment 31 WebKit Review Bot 2012-06-05 00:04:35 PDT
Comment on attachment 145656 [details]
Patch

Rejecting attachment 145656 [details] from commit-queue.

New failing tests:
http/tests/media/media-source/video-media-source-event-attributes.html
Full output: http://queues.webkit.org/results/12896475
Comment 32 WebKit Review Bot 2012-06-05 00:04:40 PDT
Created attachment 145706 [details]
Archive of layout-test-results from ec2-cq-01

The attached test failures were seen while running run-webkit-tests on the commit-queue.
Bot: ec2-cq-01  Port: <class 'webkitpy.common.config.ports.ChromiumXVFBPort'>  Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment 33 WebKit Review Bot 2012-06-05 03:40:17 PDT
Comment on attachment 145656 [details]
Patch

Clearing flags on attachment: 145656

Committed r119478: <http://trac.webkit.org/changeset/119478>
Comment 34 WebKit Review Bot 2012-06-05 03:40:25 PDT
All reviewed patches have been landed.  Closing bug.