Bug 69238

Summary: REGRESSION(96070) 25% intl1 PLT regression from scrollbar invalidation
Product: WebKit Reporter: Adam Barth <abarth>
Component: Page LoadingAssignee: Simon Fraser (smfr) <simon.fraser>
Status: RESOLVED FIXED    
Severity: Normal CC: darin, fishd, gregsimon, jamesr, leviw, mitz, sam, simon.fraser
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 69237    
Attachments:
Description Flags
Patch darin: review+

Comment 2 Adam Barth 2011-10-02 22:55:54 PDT
This regression shows up on Linux too:
http://build.chromium.org/f/chromium/perf/linux-release/intl1/report.html?history=300&rev=103677
Comment 3 Adam Barth 2011-10-02 22:57:56 PDT
It's also on Mac, but it's harder to see:
http://build.chromium.org/f/chromium/perf/mac-release-10.5/intl1/report.html?history=500&rev=103677
Comment 4 Darin Fisher (:fishd, Google) 2011-10-02 23:02:20 PDT
I'm pretty sure that the Chromium page cyclers run with GPU accelerated rendering disabled.  That means no accelerated compositor.
Comment 5 Adam Barth 2011-10-02 23:23:54 PDT
I can create a build from around this time frame and bisect.
Comment 6 Adam Barth 2011-10-03 01:20:15 PDT
I'm having trouble reproducing these results locally on 10.6.  It might be easier on Windows, where the change is more pronounced.
Comment 7 Darin Fisher (:fishd, Google) 2011-10-06 16:45:19 PDT
This particular change looks suspiciously like something that could impact performance:
http://trac.webkit.org/changeset/96070/trunk/Source/WebCore/platform/ScrollView.cpp

It is adding invalidations.
Comment 8 Simon Fraser (smfr) 2011-10-06 16:49:29 PDT
Hmm, we could try not doing the invalidations if the document is being destroyed.
Comment 9 Adam Barth 2011-10-06 17:21:27 PDT
The first step is to validate that this is what's causing the issue.  At the risk of making mitz sad, it might be worth temporarily removing the invalidations to see if that heals the bot.
Comment 10 Sam Weinig 2011-10-06 18:50:49 PDT
(In reply to comment #9)
> The first step is to validate that this is what's causing the issue.  At the risk of making mitz sad, it might be worth temporarily removing the invalidations to see if that heals the bot.

Can this not be done locally on someones machine?
Comment 11 Adam Barth 2011-10-06 19:00:18 PDT
> Can this not be done locally on someones machine?

I have not been able to reproduce the issue locally, but I think fishd was try to do that before he went home for the day.

Apparently I can get ssh access to the perf bot if I ask nicely, but I haven't filled out the paperwork yet.
Comment 12 Darin Fisher (:fishd, Google) 2011-10-07 00:40:04 PDT
I have confirmed that this regression was indeed caused by http://trac.webkit.org/changeset/96070

I compared content_shell.exe performance with and without that patch using the intl1 page cycler.

[with]
6365.25 (+/-279.66)
6631.25 (+/-179.05)
6513.50 (+/-184.78)

[without]
5238.25 (+/-142.47)
5158.50 (+/-155.56)
5396.50 (+/-182.12)
Comment 13 Darin Fisher (:fishd, Google) 2011-10-07 00:54:26 PDT
It looks like reverting only http://trac.webkit.org/changeset/96070/trunk/Source/WebCore/platform/ScrollView.cpp is sufficient to resolve the performance regression.
Comment 14 Darin Fisher (:fishd, Google) 2011-10-07 01:04:40 PDT
(In reply to comment #13)
> It looks like reverting only http://trac.webkit.org/changeset/96070/trunk/Source/WebCore/platform/ScrollView.cpp is sufficient to resolve the performance regression.

Narrowing the scope further, this is not caused by the invalidateScrollCornerRect() call.  It is somehow caused by the scrollbar invalidations.
Comment 15 Adam Barth 2011-10-07 08:48:34 PDT
Removing the [Chromium] label because the fix is going to involve cross-platform code.
Comment 16 Adam Barth 2011-10-07 08:49:00 PDT
+darin because he reviewed the patch.
Comment 17 Adam Barth 2011-10-07 09:04:22 PDT
@smfr: It looks like the regression exists on Snow Leopard too, but the magnitude is much smaller:

http://build.chromium.org/f/chromium/perf/mac-release-10.6/intl1/report.html?history=80&rev=102938

Depending on which platform you're using, you might have more or less luck reproducing the issue.
Comment 18 Darin Adler 2011-10-07 09:47:00 PDT
Dan, didn’t you make some code change related to this recently? I don’t mean a change that caused slowness, but a change that sped things up perhaps? Or avoided a flash?
Comment 19 Darin Adler 2011-10-07 09:48:37 PDT
(In reply to comment #8)
> Hmm, we could try not doing the invalidations if the document is being destroyed.

Sounds like something we should try immediately.

I also would not object to a rollout on principle, although it would be even better if we could move forward instead of backward.
Comment 20 Darin Fisher (:fishd, Google) 2011-10-07 09:54:48 PDT
(In reply to comment #19)
> (In reply to comment #8)
> > Hmm, we could try not doing the invalidations if the document is being destroyed.
> 
> Sounds like something we should try immediately.
> 
> I also would not object to a rollout on principle, although it would be even better if we could move forward instead of backward.

Since Apple engineers also have a way to reproduce this regression locally, I'd really like to advocate for a rollout.

Reason:  I find that performance regressions can creep in and be hidden by existing performance regressions.  Once you fix the original regression, you can end up not being back to where you thought you should be.  Then you have a real mystery on your hands in trying to determine what other regression slipped in.  The browser is a multi-threaded beast, and so one regression masking another is a real concern.
Comment 21 Darin Adler 2011-10-07 09:59:13 PDT
(In reply to comment #20)
> [...] I'd really like to advocate for a rollout.
> 
> Reason: [...] [performance regressions hide others, etc.]

Absolutely, the argument you are citing is one of the main WebKit project principles, even though we don’t always follow it religiously.

It’s a great argument for a super-fast resolution. Obviously a rollout can be done faster than a fix.

Simon, can you handle this yourself?
Comment 22 Simon Fraser (smfr) 2011-10-07 10:18:15 PDT
I'll get a patch today to reduce the invalidations.
Comment 23 mitz 2011-10-07 10:21:05 PDT
(In reply to comment #18)
> Dan, didn’t you make some code change related to this recently? I don’t mean a change that caused slowness, but a change that sped things up perhaps? Or avoided a flash?

I fixed the fact that invalidation could (and sometimes did) occur even when scrollbars were suppressed, but it doesn’t look like my fix eliminates any of the invalidation introduced by Simon in r96070.
Comment 24 Simon Fraser (smfr) 2011-10-07 12:18:37 PDT
Is there a way I can get the "intl" PLT suite? I can't replicate the performance regression in the normal suite.
Comment 25 Simon Fraser (smfr) 2011-10-07 13:32:01 PDT
One issue I notice is that SVGImages cause scrollbar invalidation when being torn down; they get to the host window via page->chrome() to do the invalidation. This seems wrong (but is unlikely to be the cause of perf issues).
Comment 26 Darin Adler 2011-10-07 16:32:12 PDT
(In reply to comment #20)
> Since Apple engineers also have a way to reproduce this regression locally

Is this true? Do Apple engineers have a way to reproduce this regression?
Comment 27 Simon Fraser (smfr) 2011-10-07 16:46:39 PDT
I have not been able to reproduce it in normal PLT.
Comment 28 Adam Barth 2011-10-07 17:19:57 PDT
If you mean the moz dataset, this issue doesn't seem to effect the moz dataset, as far as I can tell.
Comment 29 Simon Fraser (smfr) 2011-10-07 17:29:40 PDT
So I don't have a way to reproduce this. Does the "intl" set have a lot of iframes, or form controls?
Comment 30 Darin Fisher (:fishd, Google) 2011-10-07 21:55:03 PDT
(In reply to comment #26)
> (In reply to comment #20)
> > Since Apple engineers also have a way to reproduce this regression locally
> 
> Is this true? Do Apple engineers have a way to reproduce this regression?

Whoops, I misread an earlier comment.


(In reply to comment #29)
> So I don't have a way to reproduce this. Does the "intl" set have a lot of iframes, or form controls?

Let me see what can be done about that.
Comment 31 Simon Fraser (smfr) 2011-10-10 11:20:16 PDT
I have reproduced the regression in WK2 on Mac.
Comment 32 Simon Fraser (smfr) 2011-10-10 12:15:25 PDT
Created attachment 110380 [details]
Patch
Comment 33 Adam Barth 2011-10-10 12:36:30 PDT
> I have reproduced the regression in WK2 on Mac.

Thanks Simon.
Comment 34 Darin Adler 2011-10-10 12:40:56 PDT
Comment on attachment 110380 [details]
Patch

The old code invalidated only when hiding. The new code invalidates on both hiding and showing. Is that intentional?
Comment 35 Darin Adler 2011-10-10 12:45:23 PDT
(In reply to comment #34)
> The old code invalidated only when hiding. The new code invalidates on both hiding and showing. Is that intentional?

I figured it out. The new code does a null check so it only affects hiding.
Comment 36 Simon Fraser (smfr) 2011-10-10 13:04:34 PDT
http://trac.webkit.org/changeset/97079
Comment 37 Adam Barth 2011-10-10 14:13:02 PDT
The perf graph looks good:

http://build.chromium.org/f/chromium/perf/vista-release-webkit-latest/intl1/report.html?history=300&rev=-1

Thanks again.