Bug 108835 - Optimize GlyphPage for case where all glyphs are available in the same font.
Summary: Optimize GlyphPage for case where all glyphs are available in the same font.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Text (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Andreas Kling
URL:
Keywords: InRadar, Performance
Depends on: 109019 109850
Blocks:
  Show dependency treegraph
 
Reported: 2013-02-04 09:46 PST by Andreas Kling
Modified: 2013-02-26 11:42 PST (History)
11 users (show)

See Also:


Attachments
Sneak preview for EWS (8.68 KB, patch)
2013-02-04 09:53 PST, Andreas Kling
no flags Details | Formatted Diff | Diff
Proposed patch (10.93 KB, patch)
2013-02-04 14:01 PST, Andreas Kling
no flags Details | Formatted Diff | Diff
Proposed patch v2 (11.23 KB, patch)
2013-02-05 14:43 PST, Andreas Kling
koivisto: review+
Details | Formatted Diff | Diff
Patch for landing (11.23 KB, patch)
2013-02-05 18:15 PST, Andreas Kling
no flags Details | Formatted Diff | Diff
Speculative optimization (3.58 KB, patch)
2013-02-17 14:31 PST, Andreas Kling
ojan: review+
Details | Formatted Diff | Diff
Mildly desperate patch (11.79 KB, patch)
2013-02-21 06:13 PST, Andreas Kling
koivisto: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andreas Kling 2013-02-04 09:46:48 PST
We could save large amounts of memory by detecting that all glyphs in a GlyphPage are from the same SimpleFontData* and not storing an array of font data pointers for every glyph in that case.
An initial draft shows a 4.98 MB improvement on Membuster3, though I'm not 100% sure of its correctness.
Comment 1 Andreas Kling 2013-02-04 09:53:57 PST
Created attachment 186404 [details]
Sneak preview for EWS
Comment 2 WebKit Review Bot 2013-02-04 09:57:20 PST
Attachment 186404 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/platform/graphics/GlyphPage.h', u'Source/WebCore/platform/graphics/GlyphPageTreeNode.cpp']" exit_code: 1
Source/WebCore/platform/graphics/GlyphPage.h:35:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Andreas Kling 2013-02-04 14:01:42 PST
Created attachment 186457 [details]
Proposed patch
Comment 4 Antti Koivisto 2013-02-05 10:36:43 PST
Comment on attachment 186457 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=186457&action=review

> Source/WebCore/ChangeLog:14
> +        4.98 MB progression on Membuster3.

Nice!

> Source/WebCore/ChangeLog:24
> +            There are now three ways of constructing a GlyphPage. createZeroed() and createCopy()
> +            are only used when creating the system fallback page.

createCopiedSystemFallbackPage/createZeroedSystemFallbackPage then perhaps?

> Source/WebCore/platform/graphics/GlyphPage.h:138
> +        // -1 is a placeholder before we have an initial SimpleFontData*.
> +        if (m_fontDataForAllGlyphs == reinterpret_cast<const SimpleFontData*>(-1))
> +            m_fontDataForAllGlyphs = fontData;

Uhhuh. At least make it a named constant, unitializeFontData() or something

> Source/WebCore/platform/graphics/GlyphPage.h:207
> +    bool m_allGlyphsAreFromSameFont;
> +    union {
> +        const SimpleFontData* m_fontDataForAllGlyphs;
> +        const SimpleFontData** m_glyphFontData;
> +    };

Using union here doesn't seem to gain much. Without it you wouldn't need m_allGlyphsAreFromSameFont bit and the memory usage would be unchanged. The code would probably be clearer. I suppose it documents these are mutually exclusive...
Comment 5 Radar WebKit Bug Importer 2013-02-05 14:41:19 PST
<rdar://problem/13157042>
Comment 6 Andreas Kling 2013-02-05 14:43:12 PST
Created attachment 186710 [details]
Proposed patch v2

@Antti: Sure thing brah.
Comment 7 Antti Koivisto 2013-02-05 14:57:10 PST
Comment on attachment 186710 [details]
Proposed patch v2

View in context: https://bugs.webkit.org/attachment.cgi?id=186710&action=review

> Source/WebCore/platform/graphics/GlyphPage.h:191
> +        : m_fontDataForAllGlyphs(undefinedFontDataValue())

This may be a stupid question but why can't this just be null?
Comment 8 Andreas Kling 2013-02-05 15:52:27 PST
(In reply to comment #7)
> (From update of attachment 186710 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=186710&action=review
> 
> > Source/WebCore/platform/graphics/GlyphPage.h:191
> > +        : m_fontDataForAllGlyphs(undefinedFontDataValue())
> 
> This may be a stupid question but why can't this just be null?

Good point! I thought I was guarding against a sequence like:

[0] glyph=1234, fontData=0x0
[1] glyph=4567, fontData=0x00031337

But that doesn't actually happen in practice, since only glyph=0 will have a null font data pointer, and we handle that separately. I'll add an assertion for that instead and land without the -1 gunk.

Thanks for reviewing.
Comment 9 Darin Adler 2013-02-05 16:08:15 PST
Comment on attachment 186457 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=186457&action=review

> Source/WebCore/platform/graphics/GlyphPage.h:80
> +    ~GlyphPage()
> +    {
> +        if (!m_allGlyphsAreFromSameFont)
> +            fastFree(m_glyphFontData);
>      }

Is it important to make this inline?

> Source/WebCore/platform/graphics/GlyphPage.h:95
> +        if (UNLIKELY(!glyph))
> +            return GlyphData(0, 0);

Why is this needed? I presume because we got bogus answers when we ask if a font includes a glyph for 0?

> Source/WebCore/platform/graphics/GlyphPage.h:115
>      const SimpleFontData* fontDataForCharacter(UChar32 c) const
>      {
> +        Glyph glyph = m_glyphs[indexForCharacter(c)];
> +        if (UNLIKELY(!glyph))
> +            return 0;
> +        if (m_allGlyphsAreFromSameFont)
> +            return m_fontDataForAllGlyphs;
>          return m_glyphFontData[indexForCharacter(c)];
>      }

Irritating that this repeats half of the glyphDataForIndex function. I suggest we refactor so they can share code in the future.

>> Source/WebCore/platform/graphics/GlyphPage.h:138
>> +        // -1 is a placeholder before we have an initial SimpleFontData*.
>> +        if (m_fontDataForAllGlyphs == reinterpret_cast<const SimpleFontData*>(-1))
>> +            m_fontDataForAllGlyphs = fontData;
> 
> Uhhuh. At least make it a named constant, unitializeFontData() or something

Can’t we use a named constant for this?
Comment 10 Darin Adler 2013-02-05 16:08:43 PST
Sorry about my late comments. I started reviewing yesterday and never hit the submit button.
Comment 11 Andreas Kling 2013-02-05 16:25:06 PST
(In reply to comment #9)
> (From update of attachment 186457 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=186457&action=review
> 
> > Source/WebCore/platform/graphics/GlyphPage.h:80
> > +    ~GlyphPage()
> > +    {
> > +        if (!m_allGlyphsAreFromSameFont)
> > +            fastFree(m_glyphFontData);
> >      }
> 
> Is it important to make this inline?

Not really. I stuck with inline methods since there was no corresponding .cpp file for this class already.

> > Source/WebCore/platform/graphics/GlyphPage.h:95
> > +        if (UNLIKELY(!glyph))
> > +            return GlyphData(0, 0);
> 
> Why is this needed? I presume because we got bogus answers when we ask if a font includes a glyph for 0?

Correct.

> > Source/WebCore/platform/graphics/GlyphPage.h:115
> >      const SimpleFontData* fontDataForCharacter(UChar32 c) const
> >      {
> > +        Glyph glyph = m_glyphs[indexForCharacter(c)];
> > +        if (UNLIKELY(!glyph))
> > +            return 0;
> > +        if (m_allGlyphsAreFromSameFont)
> > +            return m_fontDataForAllGlyphs;
> >          return m_glyphFontData[indexForCharacter(c)];
> >      }
> 
> Irritating that this repeats half of the glyphDataForIndex function. I suggest we refactor so they can share code in the future.

True, it could just be:
return glyphDataForIndex(indexForCharacter(c)).fontData;
Comment 12 Andreas Kling 2013-02-05 18:15:30 PST
Created attachment 186739 [details]
Patch for landing
Comment 13 WebKit Review Bot 2013-02-05 18:51:50 PST
Comment on attachment 186739 [details]
Patch for landing

Clearing flags on attachment: 186739

Committed r141961: <http://trac.webkit.org/changeset/141961>
Comment 14 WebKit Review Bot 2013-02-05 18:51:54 PST
All reviewed patches have been landed.  Closing bug.
Comment 15 Matt Falkenhagen 2013-02-05 23:51:05 PST
It looks like this patch caused assertion failures on the following tests:

svg/W3C-SVG-1.1/fonts-glyph-04-t.svg
svg/W3C-SVG-1.1/fonts-glyph-03-t.svg
svg/custom/glyph-selection-lang-attribute.svg
svg/text/alt-glpyh-on-fallback-font-crash.svg

See for example: http://build.webkit.org/builders/Apple%20Lion%20Debug%20WK2%20%28Tests%29/builds/7165/steps/layout-test/logs/stdio

Sample stack trace:
   ASSERTION FAILED: fontData
   /Volumes/Data/slave/lion-debug/build/Source/WebCore/platform/graphics/GlyphPage.h(139) : void WebCore::GlyphPage::setGlyphDataForIndex(unsigned int, Glyph, const WebCore::SimpleFontData *)
   1   0x108778665 WebCore::GlyphPage::setGlyphDataForIndex(unsigned int, unsigned short, WebCore::SimpleFontData const*)
   2   0x108775e25 WebCore::GlyphPage::setGlyphDataForCharacter(int, unsigned short, WebCore::SimpleFontData const*)
   3   0x1095fb718 WebCore::SVGTextRunRenderingContext::glyphDataForCharacter(WebCore::Font const&, WebCore::TextRun const&, WebCore::WidthIterator&, int, bool, int, unsigned int&)
   4   0x109b64c0d WebCore::WidthIterator::glyphDataForCharacter(int, bool, int, unsigned int&)
   5   0x109b668cb unsigned int WebCore::WidthIterator::advanceInternal<WebCore::SurrogatePairAwareTextIterator>(WebCore::SurrogatePairAwareTextIterator&, WebCore::GlyphBuffer*)
   6   0x109b64d4d WebCore::WidthIterator::advance(int, WebCore::GlyphBuffer*)
   7   0x1095fc40f WebCore::SVGTextMetricsBuilder::advanceSimpleText()
   8   0x1095fc1cc WebCore::SVGTextMetricsBuilder::advance()
   9   0x1095fc925 WebCore::SVGTextMetricsBuilder::measureTextRenderer(WebCore::RenderSVGInlineText*, WebCore::MeasureTextData*)
   10  0x1095fcca8 WebCore::SVGTextMetricsBuilder::walkTree(WebCore::RenderObject*, WebCore::RenderSVGInlineText*, WebCore::MeasureTextData*)
   11  0x1095f6730 WebCore::SVGTextMetricsBuilder::buildMetricsAndLayoutAttributes(WebCore::RenderSVGText*, WebCore::RenderSVGInlineText*, WTF::HashMap<unsigned int, WebCore::SVGCharacterData, WTF::IntHash<unsigned int>, WTF::HashTraits<unsigned int>, WTF::HashTraits<WebCore::SVGCharacterData> >&)
   12  0x1095e4d18 WebCore::SVGTextLayoutAttributesBuilder::buildLayoutAttributesForForSubtree(WebCore::RenderSVGText*)
   13  0x1095e4453 WebCore::RenderSVGText::layout()
   14  0x1095c8b98 WebCore::SVGRenderSupport::layoutChildren(WebCore::RenderObject*, bool)
   15  0x1095c878e WebCore::RenderSVGContainer::layout()
   16  0x1095c8b98 WebCore::SVGRenderSupport::layoutChildren(WebCore::RenderObject*, bool)
   17  0x1095c878e WebCore::RenderSVGContainer::layout()
   18  0x1095c8b98 WebCore::SVGRenderSupport::layoutChildren(WebCore::RenderObject*, bool)
   19  0x1095e0439 WebCore::RenderSVGRoot::layout()
   20  0x1093cc60c WebCore::RenderBlock::layoutBlockChild(WebCore::RenderBox*, WebCore::RenderBlock::MarginInfo&, WebCore::LayoutUnit&, WebCore::LayoutUnit&)
   21  0x1093c2aa7 WebCore::RenderBlock::layoutBlockChildren(bool, WebCore::LayoutUnit&)
   22  0x1093c00d7 WebCore::RenderBlock::layoutBlock(bool, WebCore::LayoutUnit)
   23  0x1093bf0de WebCore::RenderBlock::layout()
   24  0x109695b24 WebCore::RenderView::layoutContent(WebCore::LayoutState const&)
   25  0x109696457 WebCore::RenderView::layout()
   26  0x1087db67d WebCore::FrameView::layout(bool)
   27  0x10848732d WebCore::Document::implicitClose()
   28  0x1087b102b WebCore::FrameLoader::checkCallImplicitClose()
   29  0x1087b0cf3 WebCore::FrameLoader::checkCompleted()
   30  0x1087af9b3 WebCore::FrameLoader::finishedParsing()
   31  0x108491509 WebCore::Document::finishedParsing()
Comment 16 WebKit Review Bot 2013-02-06 00:14:16 PST
Re-opened since this is blocked by bug 109019
Comment 17 Andreas Kling 2013-02-06 05:10:56 PST
Committed r141990: <http://trac.webkit.org/changeset/141990>
Comment 18 Matt Falkenhagen 2013-02-13 23:07:33 PST
It seems this regressed performance on Chromium's intl2 page cycler test:

http://chromium-perf.appspot.com/?tab=mac-release-10.6-webkit-latest&graph=times&trace=t&rev=181260&history=150&master=ChromiumWebkit&testSuite=page_cycler_intl2&details=true

The spikes in the graph correlate to:
r141961: the first landing of the GlyphPage patch
r141976: rollout of the patch
r141990: relanding

I'm not sure it needs fixing as it looks like the patch is a deliberate decision to optimize for cases where there are not multiple fonts or font fallback, which occurs more often on international pages.

Locally, the pages that regressed the most are copies of BBC Arabic and BBC Tamil, though running the cycler on just those pages in isolation does not reproduce the regression.
Comment 19 Ojan Vafai 2013-02-14 11:35:36 PST
(In reply to comment #18)
> I'm not sure it needs fixing as it looks like the patch is a deliberate decision to optimize for cases where there are not multiple fonts or font fallback, which occurs more often on international pages.

A 5% performance regression on something that isn't a micro-benchmark definitely needs fixing. 

> Locally, the pages that regressed the most are copies of BBC Arabic and BBC Tamil, though running the cycler on just those pages in isolation does not reproduce the regression.

I think you misread the output.

Before:http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/intl2/details.html?cl=180871&graph=times&trace=t
After: http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/intl2/details.html?cl=180898&graph=times&trace=t

The pages that regressed the most were:
arabicnews.google.com
bn.wikipedia.org
ml.wikipedia.org
sgkalesh.blogspot.com

sgkalesh.blogspot.com had a nearly 200% regression!

alking, I can't get you a copy of this if you want for digging into it. Ideally you'd rollback first so that we can ensure that the regression is fully resolved and doesn't come back with the fixed up version.
Comment 20 Ojan Vafai 2013-02-14 11:45:31 PST
s/can't/can obviously. :(
Comment 21 Andreas Kling 2013-02-14 11:47:38 PST
Ouch. Sure, I'll roll this out and try to fix the regression. Thanks for the poke.
Oh, and I already have a copy of the page cyclers somewhere :)
Comment 22 WebKit Review Bot 2013-02-14 11:49:32 PST
Re-opened since this is blocked by bug 109850
Comment 23 Andreas Kling 2013-02-16 23:25:10 PST
Looking at the way those sites use this code, it looks like this may be a case of overzealous UNLIKELY() branch prediction hints.
Comment 24 Andreas Kling 2013-02-17 00:11:09 PST
(In reply to comment #23)
> Looking at the way those sites use this code, it looks like this may be a case of overzealous UNLIKELY() branch prediction hints.

Local testing shows that removing the branch prediction hints removes the regression. I'm going to reland with that changed and see how it fares on the bots.
Comment 25 Andreas Kling 2013-02-17 00:19:44 PST
Committed r143125: <http://trac.webkit.org/changeset/143125>
Comment 27 Andreas Kling 2013-02-17 14:16:48 PST
(In reply to comment #26)
> No luck. :(
> 
> http://chromium-perf.appspot.com/?tab=mac-release-10.6-webkit-latest&graph=times&trace=t&rev=183037&history=50&master=ChromiumWebkit&testSuite=page_cycler_intl2&details=true

Ugh. I can't get seem to get stable numbers locally, and it's hard to tell from a profile what's different between to runs. I have a speculative patch that streamlines the per-glyph fontdata paths that I'd like to try before rolling it out. Uploading for review shortly..
Comment 28 Andreas Kling 2013-02-17 14:31:02 PST
Created attachment 188776 [details]
Speculative optimization
Comment 29 Ojan Vafai 2013-02-17 15:24:22 PST
Comment on attachment 188776 [details]
Speculative optimization

Seems plausible.
Comment 30 Andreas Kling 2013-02-17 15:28:09 PST
Committed r143137: <http://trac.webkit.org/changeset/143137>
Comment 32 Andreas Kling 2013-02-18 20:58:19 PST
(In reply to comment #31)
> (In reply to comment #30)
> > Committed r143137: <http://trac.webkit.org/changeset/143137>
> 
> Unfortunately, it looks like this didn't help:
> 
> http://chromium-perf.appspot.com/?tab=mac-release-10.6-webkit-latest&graph=times&trace=t&rev=183149&history=150&master=ChromiumWebkit&testSuite=page_cycler_intl2&details=true

Hm. Confusing. So, just to make sure I'm understanding everything, this is a regression on Chromium/Mac only, and this perf bot is running Snow Leopard?

You'd think that at leasts the 2x regression on sgkalesh.blogspot.com would be reproducible in the applemac port if we're using the same codepaths. Does Chromium/Mac use the same fonts implementation as the Apple Mac port?
Comment 33 Matt Falkenhagen 2013-02-18 22:00:14 PST
(In reply to comment #32)
> Hm. Confusing. So, just to make sure I'm understanding everything, this is a regression on Chromium/Mac only, and this perf bot is running Snow Leopard?

Yes, the intl2 regression we only see on Chromium/Mac, running on Snow Leopard.

There is also a smaller intl1 regression visible on Chromium/Linux:
http://chromium-perf.appspot.com/?tab=linux-release-webkit-latest&graph=times&trace=t&rev=183149&history=150&master=ChromiumWebkit&testSuite=page_cycler_intl1&details=true


> You'd think that at leasts the 2x regression on sgkalesh.blogspot.com would be reproducible in the applemac port if we're using the same codepaths. Does Chromium/Mac use the same fonts implementation as the Apple Mac port?

It looks like unlike other Chromium builds, the Mac one uses the same fonts implementation as in WebCore. Based on comments in WebCore.gyp like: "The Mac currently uses FontCustomPlatformData.cpp from platform/graphics/mac"

One place they can possibly differ is in default per-script font settings, but this seems unlikely since sgkalesh.blogspot.com doesn't set the HTML 'lang' attribute.

I actually could not reproduce this locally either, on Chromium/Mac running on Mountain Lion. The BBC pages seemed to regress, and sgkalesh.blogspot.com actually got a little faster.
Comment 34 Andreas Kling 2013-02-19 00:30:41 PST
I'm gonna try to rewrite this so that GlyphPage is always a single allocation, in case the regression is caused by the indirection in accessing the per-glyph array. I just need some sleep first if I'm gonna get it right. :)
Comment 35 Andreas Kling 2013-02-21 06:13:36 PST
Created attachment 189517 [details]
Mildly desperate patch

Okay, here's a patch that allocates GlyphPage and the per-glyph font data array in a single block instead.
Now that I see the diff, it's actually kind of beautiful. Wish I'd written it this way from the beginning.
Comment 36 Antti Koivisto 2013-02-21 07:28:00 PST
Comment on attachment 189517 [details]
Mildly desperate patch

View in context: https://bugs.webkit.org/attachment.cgi?id=189517&action=review

Nice, r=me

> Source/WebCore/platform/graphics/GlyphPage.h:59
> +#if COMPILER(MSVC)
> +#pragma warning(push)
> +#pragma warning(disable: 4200) // Disable "zero-sized array in struct/union" warning
> +#endif

Could this silly warning be disabled globally?

> Source/WebCore/platform/graphics/GlyphPage.h:158
> +        // This method should only be called on the system fallback page, which is never single-font.
> +        ASSERT(hasPerGlyphFontData());

Maybe add "system fallback page" to the function name?
Comment 37 Andreas Kling 2013-02-21 07:46:00 PST
Committed r143601: <http://trac.webkit.org/changeset/143601>
Comment 38 Andreas Kling 2013-02-22 02:09:42 PST
Committed r143707: <http://trac.webkit.org/changeset/143707>
Comment 39 Andreas Kling 2013-02-22 02:11:56 PST
(In reply to comment #38)
> Committed r143707: <http://trac.webkit.org/changeset/143707>

Slapped ALWAYS_INLINE on the relevant getters to see if this is an inlining problem. webkit-patch added the "Reviewed by Antti Koivisto" line for some reason. Bad webkit-patch.
Comment 40 Andreas Kling 2013-02-22 03:12:05 PST
Sigh. It appears that didn't help either. Would someone be able to grab a profile of this test on a machine where it's reproducible? I'm at a loss here.
Comment 41 Matt Falkenhagen 2013-02-25 19:01:42 PST
Thanks for working on fixing this. Sorry it's such a pain :-(

I don't know if we can get a profile. I haven't been able to reproduce locally on my machine and don't think we have duplicate machines other than the actual one running the test.

Somewhat good news is there is talk of replacing the intl tests, but this won't happen soon: http://crbug.com/175050

Do you think you can rollout the patch? It would be bad if another regression sneaks in while it's in the tree.
Comment 42 Maciej Stachowiak 2013-02-25 20:32:32 PST
We often have trouble with the Intl2 tester when optimizing the glyph cache, the font cache, or other such things, because it uses more different scripts than is remotely likely for a single user. Our own page load speed benchmark which includes some non-Latin script pages but not quite so many different ones doesn't show a regression from this rev.

Historically we've tried to deal with this by dynamically scaling for more fonts or more glyphs on a timer - this makes Intl2 happy without regressing memory use for more realistic cases. Perhaps that would make sense to do here.

Unfortunately, it's hard to do that in this case without a profile or ability to reproduce the benchmark.

If we just roll out the change with no further info, we'll be regressing memory for normal users for the sake of an artificial benchmark, with no obvious path to getting it back, since no one outside Google has a profile or can run the benchmark.
Comment 43 Ryosuke Niwa 2013-02-26 00:30:39 PST
This patch appears to reduce the memory usage by roughly 6% on the same intl1 test:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/intl1/report.html?history=250&rev=184424&graph=commit_charge
Comment 44 Ojan Vafai 2013-02-26 11:42:30 PST
We're chatting about this on the Chromium bug and it sounds like there's general agreement that this is one of the rare times it makes sense to leave a regression in given all the details and that akling has done pretty much all that can be done to try to address it. 

Also, there's talk of trying to replace this test suite with a more realistic workload.

http://crbug.com/175050 if you care.