Summary: | Optimize matching of descendant selectors | ||||||
---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Simon Fraser (smfr) <simon.fraser> | ||||
Component: | CSS | Assignee: | Antti Koivisto <koivisto> | ||||
Status: | RESOLVED FIXED | ||||||
Severity: | Normal | CC: | alex, hyatt, jamesr, koivisto, mihaip, peter, psolanki, sam, simon.fraser, tonikitoo, tonyg, webkit.review.bot | ||||
Priority: | P2 | Keywords: | InRadar | ||||
Version: | 528+ (Nightly build) | ||||||
Hardware: | PC | ||||||
OS: | OS X 10.5 | ||||||
Attachments: |
|
Description
Simon Fraser (smfr)
2010-11-20 21:20:46 PST
Created attachment 81133 [details]
patch
During style recalculation, maintain a hash of tags, ids and classes seen in ancestor elements. Use the hash to quickly reject descendant and child selectors when doing style matching.
This speeds up style recalculations 3-6x on major web sites.
Does this have overhead when the page isn't using descendant selectors? I don't want to penalize pages who write good selectors just to optimize bad ones. What's the hit/miss rate like on this cache? Third rule on google.com stylesheet: .gac_m td{line-height:17px} Also check out our default stylesheets. Do we have good selector test coverage for this this may break? I'm thinking of selectors like: .foo div .foo .bar where greediness may trip you up. Dibs. Will review later tonight. (In reply to comment #5) > Do we have good selector test coverage for this this may break? I'm thinking of selectors like: > .foo div .foo .bar > where greediness may trip you up. Judging from the number of failure case it picked up, we have pretty good coverage. Modulo bugs, this shouldn't be tripped by anything. (In reply to comment #4) > Third rule on google.com stylesheet: .gac_m td{line-height:17px} > > Also check out our default stylesheets. I know descendant selectors are common, but that doesn't really answer my question. Consider the size of the DOM for google.com - it's tiny. Pages with significantly larger DOMs also tend to more aggressively avoid bad selectors (assuming the authors are savvy), and we definitely don't want to penalize those pages. On google.com the overhead from this is probably pretty tiny. Comment on attachment 81133 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=81133&action=review > Source/WebCore/css/CSSStyleSelector.cpp:364 > + void collectIdentifiersInDescendantSelectors(); collectDescendantSelectorIdentifiers()? > Source/WebCore/css/CSSStyleSelector.cpp:370 > + static const unsigned maximumIdentifierCount = 4; > + const unsigned* identifiersInDescendantSelectors() const { return m_identifiersInDescendantSelectors; } How do you know that '4' is a good number? Maybe use a Vector would be clearer, and a typedef for the unsigned (which I guess is really a hash) would be even clearer. > Source/WebCore/css/CSSStyleSelector.cpp:652 > +static inline void addContextIdentifiersToVector(Element* element, Vector<unsigned, 4>& contextIdentifiers) Do you really need the , 4 here? > Source/WebCore/css/CSSStyleSelector.cpp:654 > + // Mix tags, class names and ids into sort of tasty bouillabaisse. This comment, though amusing, doesn't really help me to understand the code. > Source/WebCore/css/CSSStyleSelector.cpp:671 > + // There are all kinds of vacky special cases where the style recalc may temporarily branch to some random elements. s/vacky/wacky > Source/WebCore/css/CSSStyleSelector.cpp:672 > + // We could fix up the stack but that seems like a hassle. Maybe use a FIXME: make this work in more unusual cases (e.g. when....) > Source/WebCore/css/CSSStyleSelector.cpp:680 > + m_parentStack.append(ParentStackFrame(parent)); I'd like a blank line above this one. > Source/WebCore/css/CSSStyleSelector.cpp:683 > + size_t size = contextFrame.identifiers.size(); I'd prefer count to size. > Source/WebCore/dom/Element.cpp:81 > + ~StyleSelectorParentPusher() { if (m_didPush) m_styleSelector->popParent(m_parent); } Please unwrap this. (In reply to comment #8) > I know descendant selectors are common, but that doesn't really answer my question. Consider the size of the DOM for google.com - it's tiny. Pages with significantly larger DOMs also tend to more aggressively avoid bad selectors (assuming the authors are savvy), and we definitely don't want to penalize those pages. On google.com the overhead from this is probably pretty tiny. Anecdotally, on Google Reader we knew that descendant selectors were bad and we had tooling to calculate their approximate cost and warn if they were out of hand, but we never removed them entirely (i.e. the only thing we outright forbade was *). I believe Gmail and Spreadsheets are the same. So I don't think it's safe to assume that savvy web developers will not have descendant selectors. > > Source/WebCore/css/CSSStyleSelector.cpp:370 > > + static const unsigned maximumIdentifierCount = 4; > > + const unsigned* identifiersInDescendantSelectors() const { return m_identifiersInDescendantSelectors; } > > How do you know that '4' is a good number? Maybe use a Vector would be clearer, and a typedef for the unsigned (which I guess is really a hash) would be even clearer. It was based on looking through a number of stylesheets on popular web sites. 2 or 3 might still give good enough coverage but this can be adjusted later. Much more (or a Vector!) is a bad idea as these data structures are significant portion of our memory usage. We consistently use "unsigned" for hash through the codebase. > > Source/WebCore/css/CSSStyleSelector.cpp:652 > > +static inline void addContextIdentifiersToVector(Element* element, Vector<unsigned, 4>& contextIdentifiers) > > Do you really need the , 4 here? That avoids most of the unnecessary mallocs. 4 is enough to fit the tag name and some classes. This is not an important structure in terms of for memory usage, it could be more too. > This comment, though amusing, doesn't really help me to understand the code. It is my contribution to the recent important webkit-dev discussions. (In reply to comment #11) > > > Source/WebCore/css/CSSStyleSelector.cpp:370 > > > + static const unsigned maximumIdentifierCount = 4; > > > + const unsigned* identifiersInDescendantSelectors() const { return m_identifiersInDescendantSelectors; } > > > > How do you know that '4' is a good number? Maybe use a Vector would be clearer, and a typedef for the unsigned (which I guess is really a hash) would be even clearer. > > It was based on looking through a number of stylesheets on popular web sites. 2 or 3 might still give good enough coverage but this can be adjusted later. Much more (or a Vector!) is a bad idea as these data structures are significant portion of our memory usage. A Vector of fixed capacity shouldn't have any additional overhead. > We consistently use "unsigned" for hash through the codebase. > > > > Source/WebCore/css/CSSStyleSelector.cpp:652 > > > +static inline void addContextIdentifiersToVector(Element* element, Vector<unsigned, 4>& contextIdentifiers) > > > > Do you really need the , 4 here? > > That avoids most of the unnecessary mallocs. 4 is enough to fit the tag name and some classes. This is not an important structure in terms of for memory usage, it could be more too. But here you're just passing a reference to a Vector. > A Vector of fixed capacity shouldn't have any additional overhead. size_t m_size; <----- Buffer m_buffer; > But here you're just passing a reference to a Vector. I prefer my code to compile. (In reply to comment #3) > What's the hit/miss rate like on this cache? Using this simple dtrace script with a debug build sudo dtrace -n 'pid$target:WebCore:WebCore??CSSStyleSelector??fastRejectSelector*:return { @[arg1] = count(); }' -p <pid> I see nytimes.com 0 91425 1 533836 google.com (not logged in) 0 8589 1 2447 reader.google.com (logged into my account) 0 12810 1 31293 engadget.com 0 15017 1 326230 csszengarden.com 0 412 1 279 Attachment 81133 [details] did not pass style-queue:
Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1
Source/WebCore/dom/Element.cpp:81: More than one command on the same line in if [whitespace/parens] [4]
Source/WebCore/css/CSSStyleSelector.h:194: The parameter name "ruleData" adds no information, so it should be removed. [readability/parameter_name] [5]
Total errors found: 2 in 4 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 81133 [details]
patch
This looks great to me. Address Simon's comments/feedback and r=me.
http://trac.webkit.org/changeset/77740 (with Simon's comments implemented and some additional cleanups) (In reply to comment #10) > Anecdotally, on Google Reader we knew that descendant selectors were bad Note that the reason descendant selectors are considered "bad" is that browser implementations have sucked. There is nothing inherently wrong with them. |