Bug 62349 - REGRESSION(88260): 10-50% performance regression across many page cyclers
Summary: REGRESSION(88260): 10-50% performance regression across many page cyclers
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: James Robinson
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-06-08 20:24 PDT by James Robinson
Modified: 2011-06-13 12:29 PDT (History)
9 users (show)

See Also:


Attachments
Patch (1.95 KB, patch)
2011-06-08 20:27 PDT, James Robinson
no flags Details | Formatted Diff | Diff
Patch with platform specific values for FontCache inactive constants (1.86 KB, patch)
2011-06-10 17:32 PDT, Michael Saboff
jamesr: review-
Details | Formatted Diff | Diff
Change with chromium recommended value (1.84 KB, patch)
2011-06-11 17:27 PDT, Michael Saboff
jamesr: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description James Robinson 2011-06-08 20:24:47 PDT
REGRESSION(88260): 10-50% performance regression across many page cyclers
Comment 1 James Robinson 2011-06-08 20:27:38 PDT
Created attachment 96536 [details]
Patch
Comment 2 James Robinson 2011-06-08 20:28:46 PDT
Patch for consideration.  Please let me know if you would prefer to retain the old thresholds for some platforms.  We definitely want to change them for Chromium.
Comment 3 Darin Fisher (:fishd, Google) 2011-06-08 21:59:30 PDT
Comment on attachment 96536 [details]
Patch

R=me
Comment 4 WebKit Review Bot 2011-06-08 23:38:21 PDT
Comment on attachment 96536 [details]
Patch

Clearing flags on attachment: 96536

Committed r88428: <http://trac.webkit.org/changeset/88428>
Comment 5 WebKit Review Bot 2011-06-08 23:38:26 PDT
All reviewed patches have been landed.  Closing bug.
Comment 6 Michael Saboff 2011-06-09 11:00:57 PDT
Please make the larger values platform specific.  We do not see a need for these thresholds to be this large.
Comment 7 James Robinson 2011-06-09 11:13:26 PDT
I'm happy to #ifdef values, what platforms and what values do you want?  Given that this regresses performance across the board I imagine most ports will want to keep the old values.
Comment 8 Geoffrey Garen 2011-06-09 12:05:13 PDT
James, Darin, I don't think "svn revert" was a very polite way to raise this technical issue.

But, now that you have raised the issue, let's try to find a solution that preserves rendering speed without a memory footprint regression.

James, you mentioned that the Page Cycler uses up to 105 fonts. Do you think that's representative of typical web browsing, or is your goal here just to keep the whole Page Cycler in cache?

How quickly does the Page Cycler cycle? Would a time-based approach to purging inactive fonts keep the whole Page Cycler in cache?
Comment 9 James Robinson 2011-06-09 12:18:30 PDT
I did not svn revert anything, I restored the previous constants and left the rest of the patch untouched.  I do believe that our page cyclers are representative of user's browsing habits - that's why we have them in the first place - and as I mentioned on the other bug plan to gather some data from users in the field to get some more insight into what the font use looks like more broadly.
Comment 10 Geoffrey Garen 2011-06-09 12:29:58 PDT
> I did not svn revert anything, I restored the previous constants and left the rest of the patch untouched.

OK. I don't think manually reverting a portion of a patch while leaving the rest untouched was a very polite way to raise this technical issue.

> and as I mentioned on the other bug plan to gather some data from users in the field to get some more insight into what the font use looks like more broadly.

Great. How soon do you think you'll have data to evaluate?

In the meantime, do you think a time-based approach would succeed at keeping the whole Page Cycler in cache? How quickly does it cycle between its 105 fonts?
Comment 11 Michael Saboff 2011-06-09 14:56:45 PDT
Let's keep in mind we are not talking about total fonts, we are talking about inactive fonts.  The original changes in <http://trac.webkit.org/changeset/88260> made the previously abandon system fallback fonts be inactive immediately after they are accessed.  This could increase the number of inactive fonts. The change set also moved the deletion check to be after we are done using fonts for layout and painting operations instead of when we request a font from the cache.

This change set did not place or modify any limit on the number of active fonts.   From the testing I did, it appears that there are more active fonts than inactive fonts.  In many cases the only inactive fonts that result from a page are when it is closed.

If a page being accessed / rendered is Latin-1 based there may be little or no inactive fonts due to system fallback font usage as few if any glyphs are needed outside those provided by a specified font.  From my tests the pages with the greatest number of inactive fonts due to system fallback fonts are asian or other non Latin-1 based.  Although we have and use tests with a wide variety of pages with diverse languages, we don't believe that users in the real world actively access pages with more than two or three languages and therefore more than a handful of alphabets.  There may be some inadvertent access to a page foreign to a user, but that is not the general case.  This info coupled with various page load tests results led us to the smaller inactive font constants.

I would prefer that the higher limits be for the chromium platform.  Other platform developers can make a choice of the two sets or add their own values.
Comment 12 James Robinson 2011-06-09 17:37:12 PDT
Performance is still ~50% worse than pre-88260 levels on some of our tests, even with these font cache limits:

http://build.chromium.org/f/chromium/perf/mac-release-10.5/intl1/report.html?history=100&rev=88610

The initial bump is r88620, the small drop is r88428.  Given WebKit's no-regression policy this seems like a strong reason to revert the whole thing until the regression is understood and fixed, otherwise we are likely to introduce even more regressions on top of this one that will be more difficult to identify and fix.
Comment 13 Michael Saboff 2011-06-10 06:18:21 PDT
As I indicated in a prior comment, the change in r88260 fixed reference count leakage for system fallback fonts.  Your tests could be creating far more inactive fonts so that 120 is now too small. Before reverting all of r88260, I suggest that you try the following test.

- Change cMaxInactiveFontData to INT_MAX.
- Rerun your tests to see what that does to performance.
- At the end of the tests, display how many fonts are in the cache with fontDataCount() and inactiveFontDataCount().  If you do this after windows are closed, all or almost all fonts should be inactive.  Better to display these values while some or all windows / tabs are still open.  Or, add code to monitor the inactive count, keeping the highest value before all windows / tabs are closed. This should help you determine the new value for cMaxInactiveFontData to keep all fonts in the cache.

Reply back with the results.
Comment 14 Michael Saboff 2011-06-10 17:32:55 PDT
Created attachment 96827 [details]
Patch with platform specific values for FontCache inactive constants

Proposed patch making Chromium specific values of cMaxInactiveFontData and cTargetInactiveFontData.

Note that set cMaxInactiveFontData to INT_MAX with the expectation that someone from the Chromium team will investigate a better value.
Comment 15 James Robinson 2011-06-10 17:39:00 PDT
Comment on attachment 96827 [details]
Patch with platform specific values for FontCache inactive constants

INT_MAX is a terrible value as well, I'd be much happier with the previous values of 120/100.
Comment 16 Geoffrey Garen 2011-06-10 17:54:14 PDT
> INT_MAX is a terrible value as well, I'd be much happier with the previous values of 120/100.

There's no such thing as "the previous values". Previously, WebCore leaked most fonts. A 120/100 limit in a world where most fonts are leaked doesn't tell us what the limit should be once the leak is fixed.

I think you confirmed this finding when you measured that restoring the 120/100 constants doesn't fully restore performance to the Page Cycler.

Are you saying that you're OK with not immediately restoring Chromium's score on the Page Cycler? 

As an aside, I'm not super interested in deciding what value goes into the #IF PLATFORM(CHROMIUM). You guys can pick any value you want. My goal is to bring memory use under control for the other ports.
Comment 17 James Robinson 2011-06-10 18:20:15 PDT
"leaking most" sounds different from setting it to INT_MAX, although perhaps it isn't - I'm still not fully familiar with how this code works.

it seems like for Chromium-Mac, a value of 250 will be high enough to keep all inactive fonts in the cache for the page cycler data we have.  This value is twice as high as what's needed for Chromium-linux, which is a bit surprising.  Let's try 250 and see what happens.

Longer term we'll have to do more detailed research on what values make sense for our user's browsing habits.  Just doing a simple test of navigating around yahoo.jp, though, shows that the inactive font cache is over 80 after visiting 4 randomly selected pages within a domain.  I'm curious what data led to a value of 50 being chosen.
Comment 18 Michael Saboff 2011-06-11 17:27:35 PDT
Created attachment 96868 [details]
Change with chromium recommended value

Set cMaxInactiveFontData to 250 per comment 17.  Set cTargetInactiveFontData to 200 so that 50 inactive fonts will be cleaned up at a time.
Comment 19 James Robinson 2011-06-11 18:37:27 PDT
Comment on attachment 96868 [details]
Change with chromium recommended value

This looks good for now, thanks.
Comment 20 Michael Saboff 2011-06-11 23:01:49 PDT
Committed r88611: <http://trac.webkit.org/changeset/88611>