Bug 78070 - REGRESSION: Page Cycler Moz regression from presentation attribute changes.
Summary: REGRESSION: Page Cycler Moz regression from presentation attribute changes.
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Andreas Kling
URL:
Keywords:
Depends on: 78220 78497 78590 78604 78798 78891 79195 79304 79468 80370 82816
Blocks: 77861
  Show dependency treegraph
 
Reported: 2012-02-07 19:25 PST by Julien Chaffraix
Modified: 2013-02-16 16:13 PST (History)
14 users (show)

See Also:


Attachments
possible patch (6.13 KB, patch)
2012-03-05 12:21 PST, Antti Koivisto
webkit-ews: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Julien Chaffraix 2012-02-07 19:25:48 PST
Using our integration bots (following WebKit and Chromium ToT) did catch up the regression but we did not hear them (not on WebKit:

http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/moz/report.html?history=150&rev=-1

You can see 2 spikes (the revision number is written on the bottom left side of the graph):
* first one at r120492
* second one at r120514

I have found out which one corresponds to each build:
* first one -> http://build.chromium.org/p/chromium.webkit/builders/Mac10.6%20Perf/builds/6206
* second one -> http://build.chromium.org/p/chromium.webkit/builders/Mac10.6%20Perf/builds/6212

A chromium change could have regressed everything though we doubt it as rolling back WebKit removed the regression.

First spike points at http://trac.webkit.org/changeset/106741.

Second spike seems to point at http://trac.webkit.org/changeset/106756 (there is another change so I am not 100% sure).
Comment 1 Andreas Kling 2012-02-08 05:53:11 PST
Would it be possible to get a CPU profile dump of this test? Best thing would be a way for me to run the test myself, but as I understand it that's not possible. :/
Comment 2 Andreas Kling 2012-02-08 08:58:21 PST
Is there some way you could tell me which page(s) have regressed?
Comment 3 Tony Chang 2012-02-08 10:16:48 PST
(In reply to comment #1)
> Would it be possible to get a CPU profile dump of this test? Best thing would be a way for me to run the test myself, but as I understand it that's not possible. :/

I've sent you an email with the page cycler data.
Comment 4 Andreas Kling 2012-02-08 20:43:38 PST
(In reply to comment #3)
> (In reply to comment #1)
> > Would it be possible to get a CPU profile dump of this test? Best thing would be a way for me to run the test myself, but as I understand it that's not possible. :/
> 
> I've sent you an email with the page cycler data.

Thanks a lot Tony. Using the data, I came up with the patch on bug 78199, which landed in <http://trac.webkit.org/changeset/107173>. I hope this fixes the regression.
Comment 5 Julien Chaffraix 2012-02-09 09:40:19 PST
Unfortunately it doesn't seem to have helped. You can always look at the graphs on Chromium side:

* Mac: http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/moz/report.html?history=150&rev=-1
* Linux: http://build.chromium.org/f/chromium/perf/linux-release-webkit-latest/moz/report.html?history=150&rev=-1
* Win Vista: http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/moz/report.html?history=150&rev=-1

Unfortunately they don't show the WebKit revisions (this has been reported) but they do follow ToT WebKit.
Comment 6 Andreas Kling 2012-02-12 23:19:41 PST
FYI, I am still working on this.
Landed <http://trac.webkit.org/changeset/107484> which re-enables the matched declaration cache optimizations for elements with presentation attributes.
I have a few more things coming, faster style sharing rejection, bypassing CSSParser where possible, etc. If none of this helps I'm going to reinstate the cache for tagname/attrname/attrvalue -> style.
Comment 7 Tony Chang 2012-02-13 16:10:59 PST
We're continuing to get slower as patches land and it's getting harder to determine the cause.  Here's the current graph:
http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/moz/report.html?history=150&rev=121749

We recorded another 2% regression to the moz page cycler between r107445 and r107543: http://trac.webkit.org/log/trunk/Source?rev=107543&stop_rev=107445&verbose=on
Comment 8 Andreas Kling 2012-02-14 00:43:58 PST
(In reply to comment #7)
> We're continuing to get slower as patches land and it's getting harder to determine the cause.  Here's the current graph:
> http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/moz/report.html?history=150&rev=121749
> 
> We recorded another 2% regression to the moz page cycler between r107445 and r107543: http://trac.webkit.org/log/trunk/Source?rev=107543&stop_rev=107445&verbose=on

Did <http://trac.webkit.org/changeset/107573> get rolled in yet? How do I know which WebKit revision the perf builder is at?
Comment 9 Andreas Kling 2012-02-14 01:01:30 PST
(In reply to comment #8)
> (In reply to comment #7)
> > We're continuing to get slower as patches land and it's getting harder to determine the cause.  Here's the current graph:
> > http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/moz/report.html?history=150&rev=121749
> > 
> > We recorded another 2% regression to the moz page cycler between r107445 and r107543: http://trac.webkit.org/log/trunk/Source?rev=107543&stop_rev=107445&verbose=on
> 
> Did <http://trac.webkit.org/changeset/107573> get rolled in yet? How do I know which WebKit revision the perf builder is at?

Never mind, I guess this is what the "Roll webkit deps from r107561 to r107621." commits are about. :)
Comment 10 Tony Chang 2012-02-15 10:32:36 PST
loislo bisected the recent 2% perf regression and narrowed it down to http://trac.webkit.org/changeset/107484.  Andreas, should we open a new bug for that regression?
Comment 11 Tony Chang 2012-02-21 14:01:58 PST
We seem to be holding steady at a 2% page cycler regression:

http://build.chromium.org/f/chromium/perf/xp-release-dual-core/moz/report.html?history=400&rev=122870
Comment 12 Tony Chang 2012-02-21 14:24:12 PST
Sorry, I read the graph wrong.  It's around 9.5% right now on Windows.  Here's the graph:

http://build.chromium.org/f/chromium/perf/xp-release-dual-core/moz/report.html?history=500&rev=122870

The first bump is the initial bug report, the second smaller bump is bug 78798.  Both still exist.

On Mac 10.6, it's only about 6-7%, but that's still pretty bad:
http://build.chromium.org/f/chromium/perf/mac-release-10.6/moz/report.html?history=570&rev=122870

It's been 2 weeks since this bug was filed, should we consider reverting changes?
Comment 13 Andreas Kling 2012-02-21 14:58:31 PST
(In reply to comment #12)
> It's been 2 weeks since this bug was filed, should we consider reverting changes?

Reverting these changes would be a task of unsavory proportions, and I'd prefer that we didn't resort to that.

It's important to note that the content in this particular PLT is ~12 years old, and reflects a very different web where the nearly all styling is done with presentation attributes rather than CSS. To give you some idea, the largest style sheet in the test is 12 kB.

That said, I do intend to make up for this regression.

Much of the remaining delta is due to repeated reparsing of the same CSS values. My next step will be to introduce a String->CSSValue cache in CSSValuePool for use by presentation attributes.
Comment 14 Tony Chang 2012-03-01 16:00:31 PST
How're things progressing?  It looks like we got back the 2%, but we're still a 7% regression.

http://build.chromium.org/f/chromium/perf/xp-release-dual-core/moz/report.html?history=700&rev=124156
Comment 15 Antti Koivisto 2012-03-05 12:21:06 PST
Created attachment 130179 [details]
possible patch
Comment 16 WebKit Review Bot 2012-03-05 12:31:53 PST
Attachment 130179 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/dom/StyledElement.cpp', u'S..." exit_code: 1
Source/WebCore/dom/StyledElement.cpp:52:  This { should be at the end of the previous line  [whitespace/braces] [4]
Source/WebCore/dom/StyledElement.cpp:85:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 2 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 17 Early Warning System Bot 2012-03-05 13:08:54 PST
Comment on attachment 130179 [details]
possible patch

Attachment 130179 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/11818158
Comment 18 WebKit Review Bot 2012-03-05 13:37:41 PST
Comment on attachment 130179 [details]
possible patch

Attachment 130179 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/11817178
Comment 19 Julien Chaffraix 2012-03-05 16:18:11 PST
Comment on attachment 130179 [details]
possible patch

View in context: https://bugs.webkit.org/attachment.cgi?id=130179&action=review

I run Mozilla page cycler locally and it looks like this gives us a 1.8% increase in performance.

> Source/WebCore/dom/StyledElement.cpp:246
> +    PresentationAttributeCache::iterator cacheEntry;
> +    if (!cacheKey.isNull())
> +        cacheEntry = presentationAttributeCache().add(cacheKey, 0).first;

This is what makes Chromium unhappy: cacheEntry can be used later without having been properly initialized. Adding a default initialization to presentationAttributeCache().end() removes the issue.
Comment 20 Julien Chaffraix 2012-03-08 11:51:37 PST
Comment on attachment 130179 [details]
possible patch

View in context: https://bugs.webkit.org/attachment.cgi?id=130179&action=review

> Source/WebCore/dom/StyledElement.cpp:101
> +    static PresentationAttributeCache* cache = new PresentationAttributeCache();

This could also use DEFINE_STATIC_LOCAL.

>> Source/WebCore/dom/StyledElement.cpp:246
>> +        cacheEntry = presentationAttributeCache().add(cacheKey, 0).first;
> 
> This is what makes Chromium unhappy: cacheEntry can be used later without having been properly initialized. Adding a default initialization to presentationAttributeCache().end() removes the issue.

It looks like there is some race conditions here and we can still do a re-hashing (causing an ASSERT in debug). I though this was because we postpone the size() check to the end of the function and that we don't properly cap the size to minimumTableSize (I think it should be >= instead of >). Patching that did not help though.

Strangely enough changing our traits minimumTableSize to 255 (bad as it is not a power of 2) solves the problem. I would have to dig deeper into this to understand why.
Comment 21 Julien Chaffraix 2012-03-13 18:58:26 PDT
The situation on the bots hasn't moved since this bug was filed and it makes no sense.

I could see the performance degradation on my Linux machine (a solid 2% under Chromium's DRT) and on the same setup, Antti's change brought us back to the performance level prior to r106740.

The linux perf bot is very noisy which makes it hard to know if he agrees with me here. I will see if I can reproduce the situation on a Mac SL machine.
Comment 22 Antti Koivisto 2012-03-14 13:04:59 PDT
Yeah, it seems to me the sum of gains from the various fixes roughly matches the original regression. The bot graphs have a general long term trend of creeping up. I wonder if the what we have now is part of that trend, a collection of unrelated smaller regressions, apparently mostly affecting windows platform.

It also doesn't make sense to spend too much engineering effort on these tests. The web that they measure (all simple presentational html, no stylesheet, barely any scripts) doesn't exist anymore and we in any case load 40 of these kinds page in a second on an average laptop.
Comment 23 Tony Chang 2012-03-14 13:45:45 PDT
(In reply to comment #22)
> Yeah, it seems to me the sum of gains from the various fixes roughly matches the original regression. The bot graphs have a general long term trend of creeping up. I wonder if the what we have now is part of that trend, a collection of unrelated smaller regressions, apparently mostly affecting windows platform.

This regression isn't Windows specific, here's the long view on the mac bot (looks like about 5% since this started):
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/moz/report.html?history=700&rev=126700

Also, because it's easy for other perf regressions to creep in, this is why it's better to roll out rather than fix in the tree over the span of weeks.

> It also doesn't make sense to spend too much engineering effort on these tests. The web that they measure (all simple presentational html, no stylesheet, barely any scripts) doesn't exist anymore and we in any case load 40 of these kinds page in a second on an average laptop.

There's similar regressions noticed in the intl1 and intl2 page cyclers, which are from a scrape in 2007.  They're mostly pages from countries other than the US, but it's probably not that different from if we did a scrape of today's top US sites.
http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/intl1/report.html?history=700&rev=126700
http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/intl2/report.html?history=700&rev=126700

Which is to say, the regression may still impact today's pages, but it's hard to tell without a page cycler or some other perf test.
Comment 24 Julien Chaffraix 2012-03-26 11:11:13 PDT
I have more info on the regression. It is related to the matched properties cache in CSSStyleSelector:
* the original change (r106740) disabled matched properties caching for element with presentation attribute change.
* r107484 (bug 78381) re-introduced the caching but it did not improve the hit rate. There is actually a strong correlation here: what we would not cache before just ends up not finding any entries (the numbers are very close).
* other fixes have pumped up the average matched properties cache hit rate up but the case of presentation attribute change is still pathological.

I don't have a good idea why bug 78381 hasn't helped here though.
Comment 25 Julien Chaffraix 2012-04-17 11:32:14 PDT
Before I forget, here are my notes on this regression as I had to move to something else:
* When it comes to matched properties cache hit, there are 2 statistical populations: elements with and without presentation attributes. Element without presentation attributes have hit rate of ~60%, those with presentation attributes ~40%.
* I haven't been able to find the reasons for this big difference (I have investigated the presentation attribute cache, the style sharing algorithm and the matched properties cache).
* The original regression broke 2 levels of caching for presentation attributes (matched properties relies on the presentation attributes cache which was removed in r116740). Bug 80707 re-introduced the presentation attributes cache with a different design. However the new design may miss opportunities for cache hits: the old code would try to re-use CSSMappedAttributeDeclaration between different tags by using some buckets for each presentation attribute [called MappedAttributeEntry]. An early prototype of re-adding those buckets did not lead to any improvements though.
* Most of the matched properties cache misses for elements with presentation attribute are for: td (51% of the records), img (22%), table (11%) and body (4%). The remaining records are below 2%.
* Compared to pre-r116740, we are trying harder to cache matched properties which did help the overall cache hit rate. On intl* page cycler, our top reason for not caching is that the parent style has explicitly inherited properties (according for 38% of the records). This is likely due to table parts having borders related properties inherited by default.
Comment 26 Tony Chang 2012-04-26 13:16:24 PDT
At this point, too much time has passed and no one is actively investigating.  We may as well just close the bug.
Comment 27 Andreas Kling 2013-02-16 16:13:24 PST
(In reply to comment #26)
> At this point, too much time has passed and no one is actively investigating.  We may as well just close the bug.

Indeed.