Bug 97298 - [Refactoring] Introduce a traversal strategy in SelectorChecker
Summary: [Refactoring] Introduce a traversal strategy in SelectorChecker
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Shinya Kawanaka
URL:
Keywords:
Depends on:
Blocks: 96990
  Show dependency treegraph
 
Reported: 2012-09-20 23:43 PDT by Shinya Kawanaka
Modified: 2012-10-04 21:47 PDT (History)
10 users (show)

See Also:


Attachments
Patch (18.69 KB, patch)
2012-09-21 00:21 PDT, Shinya Kawanaka
no flags Details | Formatted Diff | Diff
Patch (22.05 KB, patch)
2012-09-24 23:27 PDT, Shinya Kawanaka
no flags Details | Formatted Diff | Diff
Patch (22.22 KB, patch)
2012-09-25 00:49 PDT, Shinya Kawanaka
no flags Details | Formatted Diff | Diff
Patch (22.22 KB, patch)
2012-09-25 01:10 PDT, Shinya Kawanaka
no flags Details | Formatted Diff | Diff
WIP including other patches (40.05 KB, patch)
2012-09-25 01:21 PDT, Shinya Kawanaka
no flags Details | Formatted Diff | Diff
Patch (22.22 KB, patch)
2012-09-25 01:24 PDT, Shinya Kawanaka
no flags Details | Formatted Diff | Diff
wip (21.83 KB, patch)
2012-10-03 05:35 PDT, Hayato Ito
no flags Details | Formatted Diff | Diff
Sync with ToT (21.83 KB, patch)
2012-10-03 05:36 PDT, Hayato Ito
no flags Details | Formatted Diff | Diff
WIP: Strategy as a member of SelectorCheckingContext. (23.38 KB, patch)
2012-10-03 12:18 PDT, Dimitri Glazkov (Google)
no flags Details | Formatted Diff | Diff
WIP: Strategy as interface member of SelectorCheckingContext. (21.27 KB, patch)
2012-10-03 13:21 PDT, Dimitri Glazkov (Google)
no flags Details | Formatted Diff | Diff
Fix a performance test and update the benchmark result. (21.81 KB, patch)
2012-10-04 01:11 PDT, Hayato Ito
no flags Details | Formatted Diff | Diff
Fix a performance test. (21.73 KB, patch)
2012-10-04 04:39 PDT, Hayato Ito
no flags Details | Formatted Diff | Diff
Patch for landing (21.62 KB, patch)
2012-10-04 19:14 PDT, Hayato Ito
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Shinya Kawanaka 2012-09-20 23:43:49 PDT
In Bug 96990, we would like to make SelectorChecker work with a node pool which is populated by <content> or <shadow> element.
As the first step, we would like to extract a dom traversal code from SelectorChecker so that we would like to reuse SelectorChecker for a node pool.
Comment 1 Shinya Kawanaka 2012-09-21 00:21:49 PDT
Created attachment 165065 [details]
Patch
Comment 2 Dimitri Glazkov (Google) 2012-09-21 09:45:01 PDT
Comment on attachment 165065 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=165065&action=review

I am still happy with the general approach, just need to keep improving.

> Source/WebCore/ChangeLog:14
> +        Since this code path is very hot, we were very careful not to regress performance.
> +        We used template and inline method, also we did not add a new argument to SelectorChecker::checkOneSelector,
> +        but replaced the SelectorCheckingContext to Strategy. When we added a new argument to checkOneSelector,
> +        we detected 3% ~ 5% performance regression.

This seems like black magic. Why would the performance regress like this?

Also, I am fine with Context containing Strategy, but not the other way around. Strategy is an abstract notion and shouldn't really have state. Context contains specifics of selector-checking. Strategy taking Context as argument also seems fine.

> Source/WebCore/ChangeLog:42
> +        (WebCore::SelectorChecker::DOMTraversalStrategy::isFirstChild): This kind of methods takes arguments
> +        instead of having a member. When we use member, we have detected 10% performance regression.

I don't understand this comment. Maybe remove it? :)

> Source/WebCore/css/SelectorChecker.h:268
> +inline bool SelectorChecker::DOMTraversalStrategy::isFirstOfType(Element* element, const QualifiedName& type) const

Do all these methods have to live in the header? That seems sad.

> PerformanceTests/CSS/pseudoclass-selectors.html:6
> +    <span></span>

I wonder if you would make the tree a bit more complex, you'd see more interesting data. I worry this is too small to give you any meaningful results.
Comment 3 Antti Koivisto 2012-09-21 10:22:59 PDT
Factoring like this makes no sense to me. You are invoking the DOMTraversalStrategy functions with 'this' pointer but none of them actually use it.

The added abstraction just makes the code harder to understand. If there are code sharing opportunities please just factor them into static inline functions.

Generally, I'd like to hear what the goal here is. Refactoring without some measurable end goal (enabling performance improvements or new features for example) does not necessarily lead to good results.
Comment 4 Dimitri Glazkov (Google) 2012-09-21 11:00:42 PDT
(In reply to comment #3)
> Factoring like this makes no sense to me. You are invoking the DOMTraversalStrategy functions with 'this' pointer but none of them actually use it.

That is a really good point.

> The added abstraction just makes the code harder to understand. If there are code sharing opportunities please just factor them into static inline functions.

> Generally, I'd like to hear what the goal here is. Refactoring without some measurable end goal (enabling performance improvements or new features for example) does not necessarily lead to good results.

Here's the larger patch on bug 96990 (that I asked Shinya to break up) to show the context:
https://bugs.webkit.org/attachment.cgi?id=164899&action=review

The trick here is that in Shadow DOM, we need the selector to match a list of nodes that is not (necessarily) the list of children of an element. Some of the nodes are, but some may have been projected as children by an enclosing shadow DOM subtree.
Comment 5 Shinya Kawanaka 2012-09-24 21:08:30 PDT
(In reply to comment #2)
> (From update of attachment 165065 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=165065&action=review
> 
> I am still happy with the general approach, just need to keep improving.
> 
> > Source/WebCore/ChangeLog:14
> > +        Since this code path is very hot, we were very careful not to regress performance.
> > +        We used template and inline method, also we did not add a new argument to SelectorChecker::checkOneSelector,
> > +        but replaced the SelectorCheckingContext to Strategy. When we added a new argument to checkOneSelector,
> > +        we detected 3% ~ 5% performance regression.
> 
> This seems like black magic. Why would the performance regress like this?

The 4th argument is passed with a register, and it will cause a few extra memory operations in the function, as far as I investigated the assembly code.

> > PerformanceTests/CSS/pseudoclass-selectors.html:6
> > +    <span></span>
> 
> I wonder if you would make the tree a bit more complex, you'd see more interesting data. I worry this is too small to give you any meaningful results.

Since we want to measure the overhead of template function call, it seems simpler is better. A simple test will magnify the overhead of this patch if any.
Comment 6 Shinya Kawanaka 2012-09-24 21:44:46 PDT
> > > PerformanceTests/CSS/pseudoclass-selectors.html:6
> > > +    <span></span>
> > 
> > I wonder if you would make the tree a bit more complex, you'd see more interesting data. I worry this is too small to give you any meaningful results.
> 
> Since we want to measure the overhead of template function call, it seems simpler is better. A simple test will magnify the overhead of this patch if any.

Oops, this is not correct so much. checkSelector will be called for all elements which we try to match. So I think complexity does not matter so much. Very simple one is not good, though.
Comment 7 Shinya Kawanaka 2012-09-24 23:27:37 PDT
Created attachment 165529 [details]
Patch
Comment 8 Gyuyoung Kim 2012-09-24 23:56:52 PDT
Comment on attachment 165529 [details]
Patch

Attachment 165529 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/14006377
Comment 9 Shinya Kawanaka 2012-09-25 00:49:27 PDT
Created attachment 165545 [details]
Patch
Comment 10 Shinya Kawanaka 2012-09-25 01:06:00 PDT
Comment on attachment 165545 [details]
Patch

This doesn't pass tests
Comment 11 Shinya Kawanaka 2012-09-25 01:10:20 PDT
Created attachment 165548 [details]
Patch
Comment 12 Shinya Kawanaka 2012-09-25 01:21:30 PDT
Created attachment 165550 [details]
WIP including other patches
Comment 13 Shinya Kawanaka 2012-09-25 01:24:23 PDT
Created attachment 165552 [details]
Patch
Comment 14 Dimitri Glazkov (Google) 2012-09-25 08:43:14 PDT
(In reply to comment #5)

> > > +        but replaced the SelectorCheckingContext to Strategy. When we added a new argument to checkOneSelector,
> > > +        we detected 3% ~ 5% performance regression.
> > 
> > This seems like black magic. Why would the performance regress like this?
> 
> The 4th argument is passed with a register, and it will cause a few extra memory operations in the function, as far as I investigated the assembly code.

Whoa.
Comment 15 Dimitri Glazkov (Google) 2012-09-25 09:17:58 PDT
Comment on attachment 165552 [details]
Patch

Hmm.. The template turducken is not enjoyable. It's clever, but at least to me, it's a decrease in clarity. Let me ponder this...
Comment 16 Dimitri Glazkov (Google) 2012-09-25 09:28:23 PDT
(In reply to comment #14)
> (In reply to comment #5)
> 
> > > > +        but replaced the SelectorCheckingContext to Strategy. When we added a new argument to checkOneSelector,
> > > > +        we detected 3% ~ 5% performance regression.
> > > 
> > > This seems like black magic. Why would the performance regress like this?
> > 
> > The 4th argument is passed with a register, and it will cause a few extra memory operations in the function, as far as I investigated the assembly code.
> 
> Whoa.

BTW, I will reduce function signature to 2 arguments in bug 96445, which I'll finish as soon as I am done with <insert word that rhymes with nerf>.
Comment 17 Dimitri Glazkov (Google) 2012-09-26 09:02:33 PDT
I wonder if we could lower the complexity by making SelectorCheckingContext a template and specializing it _before_ it's passes to checkSelector?
Comment 18 Shinya Kawanaka 2012-09-26 19:09:43 PDT
(In reply to comment #17)
> I wonder if we could lower the complexity by making SelectorCheckingContext a template and specializing it _before_ it's passes to checkSelector?

I've thought the following ideas before.
# I prefer to making SelectorCheckingContext and SelectorCheckingContextForShadow incompatible, since they are not have is-a relation.

(type A)

template<Strategy>
struct SelectorCheckingContext {
    ...
    Strategy strategy;
};

When we should not have Vector<Node*> for strategy, we have to pass SelectorCheckingContext to strategy anyway. Seems strange for me.

(type B) 

template<Strategy>
struct SelectorCheckingContextBase {
    typedef Strategy TraversalStrategy;
};

and

struct SelectCheckingContext : public SelectorCheckingContextBase<DOMTraversalStrategy> {
};

struct SelectCheckingContextForShadow : public SelectorCheckingContextBase<ShadowDOMTraversalStrategy> {
    Vector<Node*> nodes;
    int nth;
};

and calls CheckingContext::TraversalStrategy::isFirstChild(context, element) etc. This is more complicated than now, I think.

(type C)

struct SelectorCheckingContextBase {
    // fields are the same as exiting SelectorCheckingContext.
};

and

struct SelectCheckingContext : public SelectorCheckingContextBase {
    bool isFirstChild(Element*) const;
    ...
};

struct SelectCheckingContextForShadow : public SelectorCheckingContextBase<ShadowDOMTraversalStrategy> {
    bool isFirstChild(Element*) const;
    ...

    Vector<Node*> nodes;
    int nth;
};

SelectorCheckingContext and TraversalStrategy are mixed. I don't like this design.
Comment 19 Shinya Kawanaka 2012-09-26 19:10:41 PDT
(In reply to comment #17)
> I wonder if we could lower the complexity by making SelectorCheckingContext a template and specializing it _before_ it's passes to checkSelector?

I'm not sure what this means, so can you give me some example?
Comment 20 Dimitri Glazkov (Google) 2012-09-26 20:03:58 PDT
(In reply to comment #18)
> (In reply to comment #17)
> > I wonder if we could lower the complexity by making SelectorCheckingContext a template and specializing it _before_ it's passes to checkSelector?
> 
> I've thought the following ideas before.
> # I prefer to making SelectorCheckingContext and SelectorCheckingContextForShadow incompatible, since they are not have is-a relation.
> 
> (type A)
> 
> template<Strategy>
> struct SelectorCheckingContext {
>     ...
>     Strategy strategy;
> };

This is the one I meant. 

> When we should not have Vector<Node*> for strategy, we have to pass SelectorCheckingContext to strategy anyway. Seems strange for me.

Can you explain this a bit? I am not sure I understand.
Comment 21 Shinya Kawanaka 2012-09-27 22:30:26 PDT
> > When we should not have Vector<Node*> for strategy, we have to pass SelectorCheckingContext to strategy anyway. Seems strange for me.
> 
> Can you explain this a bit? I am not sure I understand.

Since we don't want to have state in Strategy, we will have to have Vector<Node*> in SelectorCheckingContext (for Shadow). Strategy for ShadowDOM needs Vector<Node*>, we have to pass SelectorCheckingContext to Strategy as an argument. The method signature of Strategy will be like this.

class Strategy {
    ...
    bool isFirstChild(const SelectorCheckingContext<Strategy>& context, Element*) const;
    ...
};

Though strategy is owned by Context, we have to pass Context to strategy. This seems a bit strange for me. It's OK in some case, but I hesitate to do like this.

# Here, we have to pass Element to isFirstChild not to regress performance. Though context has the same object, reading form a class field needs memory operation in the current compiler.

If it's OK to have Vector<Node*> in Strategy, this discussion does not hold, and I don't have any objection for this, since we don't have to pass SelectorCheckingContext to Strategy.
Comment 22 Dimitri Glazkov (Google) 2012-09-28 08:43:35 PDT
(In reply to comment #21)
> > > When we should not have Vector<Node*> for strategy, we have to pass SelectorCheckingContext to strategy anyway. Seems strange for me.
> > 
> > Can you explain this a bit? I am not sure I understand.
> 
> Since we don't want to have state in Strategy, we will have to have Vector<Node*> in SelectorCheckingContext (for Shadow). Strategy for ShadowDOM needs Vector<Node*>, we have to pass SelectorCheckingContext to Strategy as an argument. The method signature of Strategy will be like this.
> 
> class Strategy {
>     ...
>     bool isFirstChild(const SelectorCheckingContext<Strategy>& context, Element*) const;
>     ...
> };
> 
> Though strategy is owned by Context, we have to pass Context to strategy. This seems a bit strange for me. It's OK in some case, but I hesitate to do like this.
> 
> # Here, we have to pass Element to isFirstChild not to regress performance. Though context has the same object, reading form a class field needs memory operation in the current compiler.
> 
> If it's OK to have Vector<Node*> in Strategy, this discussion does not hold, and I don't have any objection for this, since we don't have to pass SelectorCheckingContext to Strategy.

I am ok with relaxing my view that Strategies should not hold state, if it will make code more readable. Let's do it :)
Comment 23 Dimitri Glazkov (Google) 2012-10-02 09:21:03 PDT
Wanted to check and see how things are going here. I think I am waiting on a new patch?
Comment 24 Hayato Ito 2012-10-03 05:35:00 PDT
Created attachment 166868 [details]
wip
Comment 25 Hayato Ito 2012-10-03 05:36:23 PDT
Created attachment 166869 [details]
Sync with ToT
Comment 26 Hayato Ito 2012-10-03 05:38:13 PDT
I've updated the patch, syncing with ToT. I am afraid that I have not addressed every comments in the discussion. Please let me know if I miss something.

(In reply to comment #23)
> Wanted to check and see how things are going here. I think I am waiting on a new patch?
Comment 27 Dimitri Glazkov (Google) 2012-10-03 12:18:09 PDT
Created attachment 166936 [details]
WIP: Strategy as a member of SelectorCheckingContext.
Comment 28 Dimitri Glazkov (Google) 2012-10-03 12:23:03 PDT
(In reply to comment #27)
> Created an attachment (id=166936) [details]
> WIP: Strategy as a member of SelectorCheckingContext.

In this patch, I tried the approach A from comment 18. Performance stays within the noise range, but the templates are still quite ugly. Let me give this another try.
Comment 29 Ryosuke Niwa 2012-10-03 13:21:04 PDT
Comment on attachment 166869 [details]
Sync with ToT

View in context: https://bugs.webkit.org/attachment.cgi?id=166869&action=review

> Source/WebCore/ChangeLog:33
> +        Test: PerformanceTests/CSS?pseudoclass-selectors.html

Please use CamelCase for the test name.
Comment 30 Dimitri Glazkov (Google) 2012-10-03 13:21:40 PDT
Created attachment 166950 [details]
WIP: Strategy as interface member of SelectorCheckingContext.
Comment 31 Dimitri Glazkov (Google) 2012-10-03 14:09:40 PDT
Comment on attachment 166950 [details]
WIP: Strategy as interface member of SelectorCheckingContext.

View in context: https://bugs.webkit.org/attachment.cgi?id=166950&action=review

Personally, I like this one the best. Running the perf test, I couldn't discern any differences between either of the three approaches. WDYT?

> Source/WebCore/css/SelectorChecker.h:122
> +    bool checkOneSelector(const SelectorCheckingContext&) const;

This can probably move back.
Comment 32 Hayato Ito 2012-10-03 19:26:59 PDT
The code became much cleaner and simpler. I like that.

I am afraid that shinya has intentionally introduced an template approach to avoid the performance regression. I guess shinya noticed the performance regression, but if perf test cannot tell any differences between these approaches, I prefer the clean one.

Let me investigate further.

(In reply to comment #31)
> (From update of attachment 166950 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=166950&action=review
> 
> Personally, I like this one the best. Running the perf test, I couldn't discern any differences between either of the three approaches. WDYT?
> 
> > Source/WebCore/css/SelectorChecker.h:122
> > +    bool checkOneSelector(const SelectorCheckingContext&) const;
> 
> This can probably move back.
Comment 33 Dimitri Glazkov (Google) 2012-10-03 19:34:01 PDT
(In reply to comment #32)

Antti, which one do you prefer?
Comment 34 Hayato Ito 2012-10-03 20:11:48 PDT
(In reply to comment #33)
> (In reply to comment #32)
> 
> Antti, which one do you prefer?

This one. https://bugs.webkit.org/attachment.cgi?id=166950&action=review

Let me rewrite the patch in https://bugs.webkit.org/show_bug.cgi?id=96990 based on attachment 166950 [details] and see how things work.
Comment 35 Hayato Ito 2012-10-04 01:11:35 PDT
Created attachment 167045 [details]
Fix a performance test and update the benchmark result.
Comment 36 Hayato Ito 2012-10-04 01:13:37 PDT
I fixed the performance test and update the benchmark result in the ChangeLog.
It seems there is no performance regression.

(In reply to comment #35)
> Created an attachment (id=167045) [details]
> Fix a performance test and update the benchmark result.
Comment 37 Build Bot 2012-10-04 01:18:13 PDT
Comment on attachment 167045 [details]
Fix a performance test and update the benchmark result.

Attachment 167045 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/14168197
Comment 38 Hayato Ito 2012-10-04 04:39:12 PDT
Created attachment 167073 [details]
Fix a performance test.
Comment 39 Antti Koivisto 2012-10-04 14:11:18 PDT
Comment on attachment 167073 [details]
Fix a performance test.

View in context: https://bugs.webkit.org/attachment.cgi?id=167073&action=review

I'm bit unhappy about having to make this code generic and so harder to understand. The reasons given are about Shadow DOM and I can't evaluate if they are valid or not. The patch itself seems ok so I'll r+ it anyway.

> Source/WebCore/ChangeLog:29
> +        The performance of the added test before using this patch was:
> +          avg 3669.143396776992 runs/s
> +          median 3667.9394112696314 runs/s
> +          stdev 21.12345296601369 runs/s
> +          min 3594.5452776430134 runs/s
> +          max 3707.408251056703 runs/s
> +
> +        When we used this patch, the performance was:
> +          avg 3851.376555590946 runs/s
> +          median 3858.343345260435 runs/s
> +          stdev 25.245683967249494 runs/s
> +          min 3782.510878241754 runs/s
> +          max 3878.4305938425096 runs/s

Can you explain where the performance gains are coming from?
Comment 40 Hayato Ito 2012-10-04 18:36:06 PDT
(In reply to comment #39)
> (From update of attachment 167073 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=167073&action=review
> 
> I'm bit unhappy about having to make this code generic and so harder to understand. The reasons given are about Shadow DOM and I can't evaluate if they are valid or not. The patch itself seems ok so I'll r+ it anyway.

I understand the concern. That's our biggest concern too. We are still seeking and trying various approaches which keep code clean.

Let me land this now and try another approaches as in another bugs.

> 
> > Source/WebCore/ChangeLog:29
> > +        The performance of the added test before using this patch was:
> > +          avg 3669.143396776992 runs/s
> > +          median 3667.9394112696314 runs/s
> > +          stdev 21.12345296601369 runs/s
> > +          min 3594.5452776430134 runs/s
> > +          max 3707.408251056703 runs/s
> > +
> > +        When we used this patch, the performance was:
> > +          avg 3851.376555590946 runs/s
> > +          median 3858.343345260435 runs/s
> > +          stdev 25.245683967249494 runs/s
> > +          min 3782.510878241754 runs/s
> > +          max 3878.4305938425096 runs/s
> 
> Can you explain where the performance gains are coming from?

I picked the benchmark results simply at ramdom (no intention) to be honest.
As a result, unfortunately, the number became confusing since the
benchmark results on my machine are unstable.

Let me update the benchmark result which runs on stable machine. We already tried that. Then land it.
Comment 41 Hayato Ito 2012-10-04 19:14:08 PDT
Created attachment 167226 [details]
Patch for landing
Comment 42 WebKit Review Bot 2012-10-04 21:47:11 PDT
Comment on attachment 167226 [details]
Patch for landing

Clearing flags on attachment: 167226

Committed r130459: <http://trac.webkit.org/changeset/130459>
Comment 43 WebKit Review Bot 2012-10-04 21:47:17 PDT
All reviewed patches have been landed.  Closing bug.