Bug 112084

Summary: Add a single character cache to WidthCache
Product: WebKit Reporter: Benjamin Poulain <benjamin>
Component: New BugsAssignee: Benjamin Poulain <benjamin>
Status: RESOLVED FIXED    
Severity: Normal CC: ggaren, kling, mitz
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch none

Description Benjamin Poulain 2013-03-11 17:42:36 PDT
Add a single character cache to WidthCache
Comment 1 Benjamin Poulain 2013-03-11 17:47:34 PDT
Created attachment 192608 [details]
Patch
Comment 2 Andreas Kling 2013-03-11 18:46:33 PDT
Would it make sense to special-case even harder with a 256-entry array for the Latin1 characters?
Comment 3 Benjamin Poulain 2013-03-11 19:21:44 PDT
(In reply to comment #2)
> Would it make sense to special-case even harder with a 256-entry array for the Latin1 characters?

To test your hypothesis, I loaded a bunch of wikipedia pages and dumped the characters in a file.

Here are all the characters used in that case:
'\n', ' ', '"', "'", ')', '(', '\xab', '-', ',', '/', '.', '1', '0', '3', '2', '5', '4', '7', '6', '9', '8', 'E', '\xa0', '[', ']', 'a', 'c', 'e', 'g', 'f', 'i', 'h', 'k', 'j', 'l', 't', '|'

The single space is the most common one, with 2773 out of 4269 chars.

Given that the data is sparse, I think it is not needed to have a special case for Latin1 characters. What do you think?
Comment 4 Geoffrey Garen 2013-03-12 08:58:15 PDT
Comment on attachment 192608 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192608&action=review

r=me

The data here convinces me that a 256-entry fixed array is probably not worth the space. Hashing a UChar will be cheap.

> Source/WebCore/platform/graphics/WidthCache.h:158
> +        float *value;

"float*", please.

> Source/WebCore/platform/graphics/WidthCache.h:161
> +            isNewEntry = addResult.isNewEntry;

I wonder if the single character should influence the cache's ramp-up strategy.

A few reasons we might not want it to:

(1) Single character hits -- especially single space -- will always be common. So, they're not a good indicator that we're laying out lots of duplicated words.

(2) The single character cache has a small, fixed size limit, so its memory tradeoff is much much smaller.

(3) The single character cache has a tiny lookup cost, so its speed tradeoff is smaller.

That change would probably require extensive testing, though.
Comment 5 Benjamin Poulain 2013-03-12 15:41:03 PDT
Comment on attachment 192608 [details]
Patch

Clearing flags on attachment: 192608

Committed r145601: <http://trac.webkit.org/changeset/145601>
Comment 6 Benjamin Poulain 2013-03-12 15:41:04 PDT
All reviewed patches have been landed.  Closing bug.
Comment 7 Benjamin Poulain 2013-03-12 15:44:32 PDT
> I wonder if the single character should influence the cache's ramp-up strategy.
> 
> A few reasons we might not want it to:
> 
> (1) Single character hits -- especially single space -- will always be common. So, they're not a good indicator that we're laying out lots of duplicated words.
> 
> (2) The single character cache has a small, fixed size limit, so its memory tradeoff is much much smaller.
> 
> (3) The single character cache has a tiny lookup cost, so its speed tradeoff is smaller.
> 
> That change would probably require extensive testing, though.

I completely agree with you.

Do you have good test cases I can use to do this analysis?
Comment 8 Geoffrey Garen 2013-03-12 16:09:31 PDT
PerformanceTests/Layout/chapter-reflow*.html and PLT3 are what I would use to test.