WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
49876
Optimize matching of descendant selectors
https://bugs.webkit.org/show_bug.cgi?id=49876
Summary
Optimize matching of descendant selectors
Simon Fraser (smfr)
Reported
2010-11-20 21:20:46 PST
Pages with selectors like: .foo .bar * can spend a lot of time in style resolution, because these types of selectors require walking back up the tree when matching every node. A possible optimization here would be to keep track of which parts of the selectors have been matched while walking down the tree in recalcStyle.
Attachments
patch
(14.08 KB, patch)
2011-02-03 16:07 PST
,
Antti Koivisto
hyatt
: review+
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Simon Fraser (smfr)
Comment 1
2010-12-15 10:28:06 PST
<
rdar://problem/8772822
>
Antti Koivisto
Comment 2
2011-02-03 16:07:02 PST
Created
attachment 81133
[details]
patch During style recalculation, maintain a hash of tags, ids and classes seen in ancestor elements. Use the hash to quickly reject descendant and child selectors when doing style matching. This speeds up style recalculations 3-6x on major web sites.
James Robinson
Comment 3
2011-02-03 16:32:51 PST
Does this have overhead when the page isn't using descendant selectors? I don't want to penalize pages who write good selectors just to optimize bad ones. What's the hit/miss rate like on this cache?
Antti Koivisto
Comment 4
2011-02-03 16:56:10 PST
Third rule on google.com stylesheet: .gac_m td{line-height:17px} Also check out our default stylesheets.
Simon Fraser (smfr)
Comment 5
2011-02-03 16:57:37 PST
Do we have good selector test coverage for this this may break? I'm thinking of selectors like: .foo div .foo .bar where greediness may trip you up.
Dave Hyatt
Comment 6
2011-02-03 16:58:29 PST
Dibs. Will review later tonight.
Antti Koivisto
Comment 7
2011-02-03 17:00:44 PST
(In reply to
comment #5
)
> Do we have good selector test coverage for this this may break? I'm thinking of selectors like: > .foo div .foo .bar > where greediness may trip you up.
Judging from the number of failure case it picked up, we have pretty good coverage. Modulo bugs, this shouldn't be tripped by anything.
James Robinson
Comment 8
2011-02-03 17:01:06 PST
(In reply to
comment #4
)
> Third rule on google.com stylesheet: .gac_m td{line-height:17px} > > Also check out our default stylesheets.
I know descendant selectors are common, but that doesn't really answer my question. Consider the size of the DOM for google.com - it's tiny. Pages with significantly larger DOMs also tend to more aggressively avoid bad selectors (assuming the authors are savvy), and we definitely don't want to penalize those pages. On google.com the overhead from this is probably pretty tiny.
Simon Fraser (smfr)
Comment 9
2011-02-03 17:03:19 PST
Comment on
attachment 81133
[details]
patch View in context:
https://bugs.webkit.org/attachment.cgi?id=81133&action=review
> Source/WebCore/css/CSSStyleSelector.cpp:364 > + void collectIdentifiersInDescendantSelectors();
collectDescendantSelectorIdentifiers()?
> Source/WebCore/css/CSSStyleSelector.cpp:370 > + static const unsigned maximumIdentifierCount = 4; > + const unsigned* identifiersInDescendantSelectors() const { return m_identifiersInDescendantSelectors; }
How do you know that '4' is a good number? Maybe use a Vector would be clearer, and a typedef for the unsigned (which I guess is really a hash) would be even clearer.
> Source/WebCore/css/CSSStyleSelector.cpp:652 > +static inline void addContextIdentifiersToVector(Element* element, Vector<unsigned, 4>& contextIdentifiers)
Do you really need the , 4 here?
> Source/WebCore/css/CSSStyleSelector.cpp:654 > + // Mix tags, class names and ids into sort of tasty bouillabaisse.
This comment, though amusing, doesn't really help me to understand the code.
> Source/WebCore/css/CSSStyleSelector.cpp:671 > + // There are all kinds of vacky special cases where the style recalc may temporarily branch to some random elements.
s/vacky/wacky
> Source/WebCore/css/CSSStyleSelector.cpp:672 > + // We could fix up the stack but that seems like a hassle.
Maybe use a FIXME: make this work in more unusual cases (e.g. when....)
> Source/WebCore/css/CSSStyleSelector.cpp:680 > + m_parentStack.append(ParentStackFrame(parent));
I'd like a blank line above this one.
> Source/WebCore/css/CSSStyleSelector.cpp:683 > + size_t size = contextFrame.identifiers.size();
I'd prefer count to size.
> Source/WebCore/dom/Element.cpp:81 > + ~StyleSelectorParentPusher() { if (m_didPush) m_styleSelector->popParent(m_parent); }
Please unwrap this.
Mihai Parparita
Comment 10
2011-02-03 17:07:49 PST
(In reply to
comment #8
)
> I know descendant selectors are common, but that doesn't really answer my question. Consider the size of the DOM for google.com - it's tiny. Pages with significantly larger DOMs also tend to more aggressively avoid bad selectors (assuming the authors are savvy), and we definitely don't want to penalize those pages. On google.com the overhead from this is probably pretty tiny.
Anecdotally, on Google Reader we knew that descendant selectors were bad and we had tooling to calculate their approximate cost and warn if they were out of hand, but we never removed them entirely (i.e. the only thing we outright forbade was *). I believe Gmail and Spreadsheets are the same. So I don't think it's safe to assume that savvy web developers will not have descendant selectors.
Antti Koivisto
Comment 11
2011-02-03 17:17:29 PST
> > Source/WebCore/css/CSSStyleSelector.cpp:370 > > + static const unsigned maximumIdentifierCount = 4; > > + const unsigned* identifiersInDescendantSelectors() const { return m_identifiersInDescendantSelectors; } > > How do you know that '4' is a good number? Maybe use a Vector would be clearer, and a typedef for the unsigned (which I guess is really a hash) would be even clearer.
It was based on looking through a number of stylesheets on popular web sites. 2 or 3 might still give good enough coverage but this can be adjusted later. Much more (or a Vector!) is a bad idea as these data structures are significant portion of our memory usage. We consistently use "unsigned" for hash through the codebase.
> > Source/WebCore/css/CSSStyleSelector.cpp:652 > > +static inline void addContextIdentifiersToVector(Element* element, Vector<unsigned, 4>& contextIdentifiers) > > Do you really need the , 4 here?
That avoids most of the unnecessary mallocs. 4 is enough to fit the tag name and some classes. This is not an important structure in terms of for memory usage, it could be more too.
> This comment, though amusing, doesn't really help me to understand the code.
It is my contribution to the recent important webkit-dev discussions.
Simon Fraser (smfr)
Comment 12
2011-02-03 17:22:29 PST
(In reply to
comment #11
)
> > > Source/WebCore/css/CSSStyleSelector.cpp:370 > > > + static const unsigned maximumIdentifierCount = 4; > > > + const unsigned* identifiersInDescendantSelectors() const { return m_identifiersInDescendantSelectors; } > > > > How do you know that '4' is a good number? Maybe use a Vector would be clearer, and a typedef for the unsigned (which I guess is really a hash) would be even clearer. > > It was based on looking through a number of stylesheets on popular web sites. 2 or 3 might still give good enough coverage but this can be adjusted later. Much more (or a Vector!) is a bad idea as these data structures are significant portion of our memory usage.
A Vector of fixed capacity shouldn't have any additional overhead.
> We consistently use "unsigned" for hash through the codebase. > > > > Source/WebCore/css/CSSStyleSelector.cpp:652 > > > +static inline void addContextIdentifiersToVector(Element* element, Vector<unsigned, 4>& contextIdentifiers) > > > > Do you really need the , 4 here? > > That avoids most of the unnecessary mallocs. 4 is enough to fit the tag name and some classes. This is not an important structure in terms of for memory usage, it could be more too.
But here you're just passing a reference to a Vector.
Antti Koivisto
Comment 13
2011-02-03 17:25:40 PST
> A Vector of fixed capacity shouldn't have any additional overhead.
size_t m_size; <----- Buffer m_buffer;
> But here you're just passing a reference to a Vector.
I prefer my code to compile.
Pratik Solanki
Comment 14
2011-02-03 18:03:15 PST
(In reply to
comment #3
)
> What's the hit/miss rate like on this cache?
Using this simple dtrace script with a debug build sudo dtrace -n 'pid$target:WebCore:WebCore??CSSStyleSelector??fastRejectSelector*:return { @[arg1] = count(); }' -p <pid> I see nytimes.com 0 91425 1 533836 google.com (not logged in) 0 8589 1 2447 reader.google.com (logged into my account) 0 12810 1 31293 engadget.com 0 15017 1 326230 csszengarden.com 0 412 1 279
WebKit Review Bot
Comment 15
2011-02-03 20:34:26 PST
Attachment 81133
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/dom/Element.cpp:81: More than one command on the same line in if [whitespace/parens] [4] Source/WebCore/css/CSSStyleSelector.h:194: The parameter name "ruleData" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 4 files If any of these errors are false positives, please file a bug against check-webkit-style.
Dave Hyatt
Comment 16
2011-02-04 09:38:40 PST
Comment on
attachment 81133
[details]
patch This looks great to me. Address Simon's comments/feedback and r=me.
Antti Koivisto
Comment 17
2011-02-05 03:25:21 PST
http://trac.webkit.org/changeset/77740
(with Simon's comments implemented and some additional cleanups)
Antti Koivisto
Comment 18
2011-02-05 08:54:33 PST
(In reply to
comment #10
)
> Anecdotally, on Google Reader we knew that descendant selectors were bad
Note that the reason descendant selectors are considered "bad" is that browser implementations have sucked. There is nothing inherently wrong with them.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug