RESOLVED FIXED Bug 97298
[Refactoring] Introduce a traversal strategy in SelectorChecker
https://bugs.webkit.org/show_bug.cgi?id=97298
Summary [Refactoring] Introduce a traversal strategy in SelectorChecker
Shinya Kawanaka
Reported 2012-09-20 23:43:49 PDT
In Bug 96990, we would like to make SelectorChecker work with a node pool which is populated by <content> or <shadow> element. As the first step, we would like to extract a dom traversal code from SelectorChecker so that we would like to reuse SelectorChecker for a node pool.
Attachments
Patch (18.69 KB, patch)
2012-09-21 00:21 PDT, Shinya Kawanaka
no flags
Patch (22.05 KB, patch)
2012-09-24 23:27 PDT, Shinya Kawanaka
no flags
Patch (22.22 KB, patch)
2012-09-25 00:49 PDT, Shinya Kawanaka
no flags
Patch (22.22 KB, patch)
2012-09-25 01:10 PDT, Shinya Kawanaka
no flags
WIP including other patches (40.05 KB, patch)
2012-09-25 01:21 PDT, Shinya Kawanaka
no flags
Patch (22.22 KB, patch)
2012-09-25 01:24 PDT, Shinya Kawanaka
no flags
wip (21.83 KB, patch)
2012-10-03 05:35 PDT, Hayato Ito
no flags
Sync with ToT (21.83 KB, patch)
2012-10-03 05:36 PDT, Hayato Ito
no flags
WIP: Strategy as a member of SelectorCheckingContext. (23.38 KB, patch)
2012-10-03 12:18 PDT, Dimitri Glazkov (Google)
no flags
WIP: Strategy as interface member of SelectorCheckingContext. (21.27 KB, patch)
2012-10-03 13:21 PDT, Dimitri Glazkov (Google)
no flags
Fix a performance test and update the benchmark result. (21.81 KB, patch)
2012-10-04 01:11 PDT, Hayato Ito
no flags
Fix a performance test. (21.73 KB, patch)
2012-10-04 04:39 PDT, Hayato Ito
no flags
Patch for landing (21.62 KB, patch)
2012-10-04 19:14 PDT, Hayato Ito
no flags
Shinya Kawanaka
Comment 1 2012-09-21 00:21:49 PDT
Dimitri Glazkov (Google)
Comment 2 2012-09-21 09:45:01 PDT
Comment on attachment 165065 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=165065&action=review I am still happy with the general approach, just need to keep improving. > Source/WebCore/ChangeLog:14 > + Since this code path is very hot, we were very careful not to regress performance. > + We used template and inline method, also we did not add a new argument to SelectorChecker::checkOneSelector, > + but replaced the SelectorCheckingContext to Strategy. When we added a new argument to checkOneSelector, > + we detected 3% ~ 5% performance regression. This seems like black magic. Why would the performance regress like this? Also, I am fine with Context containing Strategy, but not the other way around. Strategy is an abstract notion and shouldn't really have state. Context contains specifics of selector-checking. Strategy taking Context as argument also seems fine. > Source/WebCore/ChangeLog:42 > + (WebCore::SelectorChecker::DOMTraversalStrategy::isFirstChild): This kind of methods takes arguments > + instead of having a member. When we use member, we have detected 10% performance regression. I don't understand this comment. Maybe remove it? :) > Source/WebCore/css/SelectorChecker.h:268 > +inline bool SelectorChecker::DOMTraversalStrategy::isFirstOfType(Element* element, const QualifiedName& type) const Do all these methods have to live in the header? That seems sad. > PerformanceTests/CSS/pseudoclass-selectors.html:6 > + <span></span> I wonder if you would make the tree a bit more complex, you'd see more interesting data. I worry this is too small to give you any meaningful results.
Antti Koivisto
Comment 3 2012-09-21 10:22:59 PDT
Factoring like this makes no sense to me. You are invoking the DOMTraversalStrategy functions with 'this' pointer but none of them actually use it. The added abstraction just makes the code harder to understand. If there are code sharing opportunities please just factor them into static inline functions. Generally, I'd like to hear what the goal here is. Refactoring without some measurable end goal (enabling performance improvements or new features for example) does not necessarily lead to good results.
Dimitri Glazkov (Google)
Comment 4 2012-09-21 11:00:42 PDT
(In reply to comment #3) > Factoring like this makes no sense to me. You are invoking the DOMTraversalStrategy functions with 'this' pointer but none of them actually use it. That is a really good point. > The added abstraction just makes the code harder to understand. If there are code sharing opportunities please just factor them into static inline functions. > Generally, I'd like to hear what the goal here is. Refactoring without some measurable end goal (enabling performance improvements or new features for example) does not necessarily lead to good results. Here's the larger patch on bug 96990 (that I asked Shinya to break up) to show the context: https://bugs.webkit.org/attachment.cgi?id=164899&action=review The trick here is that in Shadow DOM, we need the selector to match a list of nodes that is not (necessarily) the list of children of an element. Some of the nodes are, but some may have been projected as children by an enclosing shadow DOM subtree.
Shinya Kawanaka
Comment 5 2012-09-24 21:08:30 PDT
(In reply to comment #2) > (From update of attachment 165065 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=165065&action=review > > I am still happy with the general approach, just need to keep improving. > > > Source/WebCore/ChangeLog:14 > > + Since this code path is very hot, we were very careful not to regress performance. > > + We used template and inline method, also we did not add a new argument to SelectorChecker::checkOneSelector, > > + but replaced the SelectorCheckingContext to Strategy. When we added a new argument to checkOneSelector, > > + we detected 3% ~ 5% performance regression. > > This seems like black magic. Why would the performance regress like this? The 4th argument is passed with a register, and it will cause a few extra memory operations in the function, as far as I investigated the assembly code. > > PerformanceTests/CSS/pseudoclass-selectors.html:6 > > + <span></span> > > I wonder if you would make the tree a bit more complex, you'd see more interesting data. I worry this is too small to give you any meaningful results. Since we want to measure the overhead of template function call, it seems simpler is better. A simple test will magnify the overhead of this patch if any.
Shinya Kawanaka
Comment 6 2012-09-24 21:44:46 PDT
> > > PerformanceTests/CSS/pseudoclass-selectors.html:6 > > > + <span></span> > > > > I wonder if you would make the tree a bit more complex, you'd see more interesting data. I worry this is too small to give you any meaningful results. > > Since we want to measure the overhead of template function call, it seems simpler is better. A simple test will magnify the overhead of this patch if any. Oops, this is not correct so much. checkSelector will be called for all elements which we try to match. So I think complexity does not matter so much. Very simple one is not good, though.
Shinya Kawanaka
Comment 7 2012-09-24 23:27:37 PDT
Gyuyoung Kim
Comment 8 2012-09-24 23:56:52 PDT
Shinya Kawanaka
Comment 9 2012-09-25 00:49:27 PDT
Shinya Kawanaka
Comment 10 2012-09-25 01:06:00 PDT
Comment on attachment 165545 [details] Patch This doesn't pass tests
Shinya Kawanaka
Comment 11 2012-09-25 01:10:20 PDT
Shinya Kawanaka
Comment 12 2012-09-25 01:21:30 PDT
Created attachment 165550 [details] WIP including other patches
Shinya Kawanaka
Comment 13 2012-09-25 01:24:23 PDT
Dimitri Glazkov (Google)
Comment 14 2012-09-25 08:43:14 PDT
(In reply to comment #5) > > > + but replaced the SelectorCheckingContext to Strategy. When we added a new argument to checkOneSelector, > > > + we detected 3% ~ 5% performance regression. > > > > This seems like black magic. Why would the performance regress like this? > > The 4th argument is passed with a register, and it will cause a few extra memory operations in the function, as far as I investigated the assembly code. Whoa.
Dimitri Glazkov (Google)
Comment 15 2012-09-25 09:17:58 PDT
Comment on attachment 165552 [details] Patch Hmm.. The template turducken is not enjoyable. It's clever, but at least to me, it's a decrease in clarity. Let me ponder this...
Dimitri Glazkov (Google)
Comment 16 2012-09-25 09:28:23 PDT
(In reply to comment #14) > (In reply to comment #5) > > > > > + but replaced the SelectorCheckingContext to Strategy. When we added a new argument to checkOneSelector, > > > > + we detected 3% ~ 5% performance regression. > > > > > > This seems like black magic. Why would the performance regress like this? > > > > The 4th argument is passed with a register, and it will cause a few extra memory operations in the function, as far as I investigated the assembly code. > > Whoa. BTW, I will reduce function signature to 2 arguments in bug 96445, which I'll finish as soon as I am done with <insert word that rhymes with nerf>.
Dimitri Glazkov (Google)
Comment 17 2012-09-26 09:02:33 PDT
I wonder if we could lower the complexity by making SelectorCheckingContext a template and specializing it _before_ it's passes to checkSelector?
Shinya Kawanaka
Comment 18 2012-09-26 19:09:43 PDT
(In reply to comment #17) > I wonder if we could lower the complexity by making SelectorCheckingContext a template and specializing it _before_ it's passes to checkSelector? I've thought the following ideas before. # I prefer to making SelectorCheckingContext and SelectorCheckingContextForShadow incompatible, since they are not have is-a relation. (type A) template<Strategy> struct SelectorCheckingContext { ... Strategy strategy; }; When we should not have Vector<Node*> for strategy, we have to pass SelectorCheckingContext to strategy anyway. Seems strange for me. (type B) template<Strategy> struct SelectorCheckingContextBase { typedef Strategy TraversalStrategy; }; and struct SelectCheckingContext : public SelectorCheckingContextBase<DOMTraversalStrategy> { }; struct SelectCheckingContextForShadow : public SelectorCheckingContextBase<ShadowDOMTraversalStrategy> { Vector<Node*> nodes; int nth; }; and calls CheckingContext::TraversalStrategy::isFirstChild(context, element) etc. This is more complicated than now, I think. (type C) struct SelectorCheckingContextBase { // fields are the same as exiting SelectorCheckingContext. }; and struct SelectCheckingContext : public SelectorCheckingContextBase { bool isFirstChild(Element*) const; ... }; struct SelectCheckingContextForShadow : public SelectorCheckingContextBase<ShadowDOMTraversalStrategy> { bool isFirstChild(Element*) const; ... Vector<Node*> nodes; int nth; }; SelectorCheckingContext and TraversalStrategy are mixed. I don't like this design.
Shinya Kawanaka
Comment 19 2012-09-26 19:10:41 PDT
(In reply to comment #17) > I wonder if we could lower the complexity by making SelectorCheckingContext a template and specializing it _before_ it's passes to checkSelector? I'm not sure what this means, so can you give me some example?
Dimitri Glazkov (Google)
Comment 20 2012-09-26 20:03:58 PDT
(In reply to comment #18) > (In reply to comment #17) > > I wonder if we could lower the complexity by making SelectorCheckingContext a template and specializing it _before_ it's passes to checkSelector? > > I've thought the following ideas before. > # I prefer to making SelectorCheckingContext and SelectorCheckingContextForShadow incompatible, since they are not have is-a relation. > > (type A) > > template<Strategy> > struct SelectorCheckingContext { > ... > Strategy strategy; > }; This is the one I meant. > When we should not have Vector<Node*> for strategy, we have to pass SelectorCheckingContext to strategy anyway. Seems strange for me. Can you explain this a bit? I am not sure I understand.
Shinya Kawanaka
Comment 21 2012-09-27 22:30:26 PDT
> > When we should not have Vector<Node*> for strategy, we have to pass SelectorCheckingContext to strategy anyway. Seems strange for me. > > Can you explain this a bit? I am not sure I understand. Since we don't want to have state in Strategy, we will have to have Vector<Node*> in SelectorCheckingContext (for Shadow). Strategy for ShadowDOM needs Vector<Node*>, we have to pass SelectorCheckingContext to Strategy as an argument. The method signature of Strategy will be like this. class Strategy { ... bool isFirstChild(const SelectorCheckingContext<Strategy>& context, Element*) const; ... }; Though strategy is owned by Context, we have to pass Context to strategy. This seems a bit strange for me. It's OK in some case, but I hesitate to do like this. # Here, we have to pass Element to isFirstChild not to regress performance. Though context has the same object, reading form a class field needs memory operation in the current compiler. If it's OK to have Vector<Node*> in Strategy, this discussion does not hold, and I don't have any objection for this, since we don't have to pass SelectorCheckingContext to Strategy.
Dimitri Glazkov (Google)
Comment 22 2012-09-28 08:43:35 PDT
(In reply to comment #21) > > > When we should not have Vector<Node*> for strategy, we have to pass SelectorCheckingContext to strategy anyway. Seems strange for me. > > > > Can you explain this a bit? I am not sure I understand. > > Since we don't want to have state in Strategy, we will have to have Vector<Node*> in SelectorCheckingContext (for Shadow). Strategy for ShadowDOM needs Vector<Node*>, we have to pass SelectorCheckingContext to Strategy as an argument. The method signature of Strategy will be like this. > > class Strategy { > ... > bool isFirstChild(const SelectorCheckingContext<Strategy>& context, Element*) const; > ... > }; > > Though strategy is owned by Context, we have to pass Context to strategy. This seems a bit strange for me. It's OK in some case, but I hesitate to do like this. > > # Here, we have to pass Element to isFirstChild not to regress performance. Though context has the same object, reading form a class field needs memory operation in the current compiler. > > If it's OK to have Vector<Node*> in Strategy, this discussion does not hold, and I don't have any objection for this, since we don't have to pass SelectorCheckingContext to Strategy. I am ok with relaxing my view that Strategies should not hold state, if it will make code more readable. Let's do it :)
Dimitri Glazkov (Google)
Comment 23 2012-10-02 09:21:03 PDT
Wanted to check and see how things are going here. I think I am waiting on a new patch?
Hayato Ito
Comment 24 2012-10-03 05:35:00 PDT
Hayato Ito
Comment 25 2012-10-03 05:36:23 PDT
Created attachment 166869 [details] Sync with ToT
Hayato Ito
Comment 26 2012-10-03 05:38:13 PDT
I've updated the patch, syncing with ToT. I am afraid that I have not addressed every comments in the discussion. Please let me know if I miss something. (In reply to comment #23) > Wanted to check and see how things are going here. I think I am waiting on a new patch?
Dimitri Glazkov (Google)
Comment 27 2012-10-03 12:18:09 PDT
Created attachment 166936 [details] WIP: Strategy as a member of SelectorCheckingContext.
Dimitri Glazkov (Google)
Comment 28 2012-10-03 12:23:03 PDT
(In reply to comment #27) > Created an attachment (id=166936) [details] > WIP: Strategy as a member of SelectorCheckingContext. In this patch, I tried the approach A from comment 18. Performance stays within the noise range, but the templates are still quite ugly. Let me give this another try.
Ryosuke Niwa
Comment 29 2012-10-03 13:21:04 PDT
Comment on attachment 166869 [details] Sync with ToT View in context: https://bugs.webkit.org/attachment.cgi?id=166869&action=review > Source/WebCore/ChangeLog:33 > + Test: PerformanceTests/CSS?pseudoclass-selectors.html Please use CamelCase for the test name.
Dimitri Glazkov (Google)
Comment 30 2012-10-03 13:21:40 PDT
Created attachment 166950 [details] WIP: Strategy as interface member of SelectorCheckingContext.
Dimitri Glazkov (Google)
Comment 31 2012-10-03 14:09:40 PDT
Comment on attachment 166950 [details] WIP: Strategy as interface member of SelectorCheckingContext. View in context: https://bugs.webkit.org/attachment.cgi?id=166950&action=review Personally, I like this one the best. Running the perf test, I couldn't discern any differences between either of the three approaches. WDYT? > Source/WebCore/css/SelectorChecker.h:122 > + bool checkOneSelector(const SelectorCheckingContext&) const; This can probably move back.
Hayato Ito
Comment 32 2012-10-03 19:26:59 PDT
The code became much cleaner and simpler. I like that. I am afraid that shinya has intentionally introduced an template approach to avoid the performance regression. I guess shinya noticed the performance regression, but if perf test cannot tell any differences between these approaches, I prefer the clean one. Let me investigate further. (In reply to comment #31) > (From update of attachment 166950 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=166950&action=review > > Personally, I like this one the best. Running the perf test, I couldn't discern any differences between either of the three approaches. WDYT? > > > Source/WebCore/css/SelectorChecker.h:122 > > + bool checkOneSelector(const SelectorCheckingContext&) const; > > This can probably move back.
Dimitri Glazkov (Google)
Comment 33 2012-10-03 19:34:01 PDT
(In reply to comment #32) Antti, which one do you prefer?
Hayato Ito
Comment 34 2012-10-03 20:11:48 PDT
(In reply to comment #33) > (In reply to comment #32) > > Antti, which one do you prefer? This one. https://bugs.webkit.org/attachment.cgi?id=166950&action=review Let me rewrite the patch in https://bugs.webkit.org/show_bug.cgi?id=96990 based on attachment 166950 [details] and see how things work.
Hayato Ito
Comment 35 2012-10-04 01:11:35 PDT
Created attachment 167045 [details] Fix a performance test and update the benchmark result.
Hayato Ito
Comment 36 2012-10-04 01:13:37 PDT
I fixed the performance test and update the benchmark result in the ChangeLog. It seems there is no performance regression. (In reply to comment #35) > Created an attachment (id=167045) [details] > Fix a performance test and update the benchmark result.
Build Bot
Comment 37 2012-10-04 01:18:13 PDT
Comment on attachment 167045 [details] Fix a performance test and update the benchmark result. Attachment 167045 [details] did not pass mac-ews (mac): Output: http://queues.webkit.org/results/14168197
Hayato Ito
Comment 38 2012-10-04 04:39:12 PDT
Created attachment 167073 [details] Fix a performance test.
Antti Koivisto
Comment 39 2012-10-04 14:11:18 PDT
Comment on attachment 167073 [details] Fix a performance test. View in context: https://bugs.webkit.org/attachment.cgi?id=167073&action=review I'm bit unhappy about having to make this code generic and so harder to understand. The reasons given are about Shadow DOM and I can't evaluate if they are valid or not. The patch itself seems ok so I'll r+ it anyway. > Source/WebCore/ChangeLog:29 > + The performance of the added test before using this patch was: > + avg 3669.143396776992 runs/s > + median 3667.9394112696314 runs/s > + stdev 21.12345296601369 runs/s > + min 3594.5452776430134 runs/s > + max 3707.408251056703 runs/s > + > + When we used this patch, the performance was: > + avg 3851.376555590946 runs/s > + median 3858.343345260435 runs/s > + stdev 25.245683967249494 runs/s > + min 3782.510878241754 runs/s > + max 3878.4305938425096 runs/s Can you explain where the performance gains are coming from?
Hayato Ito
Comment 40 2012-10-04 18:36:06 PDT
(In reply to comment #39) > (From update of attachment 167073 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=167073&action=review > > I'm bit unhappy about having to make this code generic and so harder to understand. The reasons given are about Shadow DOM and I can't evaluate if they are valid or not. The patch itself seems ok so I'll r+ it anyway. I understand the concern. That's our biggest concern too. We are still seeking and trying various approaches which keep code clean. Let me land this now and try another approaches as in another bugs. > > > Source/WebCore/ChangeLog:29 > > + The performance of the added test before using this patch was: > > + avg 3669.143396776992 runs/s > > + median 3667.9394112696314 runs/s > > + stdev 21.12345296601369 runs/s > > + min 3594.5452776430134 runs/s > > + max 3707.408251056703 runs/s > > + > > + When we used this patch, the performance was: > > + avg 3851.376555590946 runs/s > > + median 3858.343345260435 runs/s > > + stdev 25.245683967249494 runs/s > > + min 3782.510878241754 runs/s > > + max 3878.4305938425096 runs/s > > Can you explain where the performance gains are coming from? I picked the benchmark results simply at ramdom (no intention) to be honest. As a result, unfortunately, the number became confusing since the benchmark results on my machine are unstable. Let me update the benchmark result which runs on stable machine. We already tried that. Then land it.
Hayato Ito
Comment 41 2012-10-04 19:14:08 PDT
Created attachment 167226 [details] Patch for landing
WebKit Review Bot
Comment 42 2012-10-04 21:47:11 PDT
Comment on attachment 167226 [details] Patch for landing Clearing flags on attachment: 167226 Committed r130459: <http://trac.webkit.org/changeset/130459>
WebKit Review Bot
Comment 43 2012-10-04 21:47:17 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.