We could save large amounts of memory by detecting that all glyphs in a GlyphPage are from the same SimpleFontData* and not storing an array of font data pointers for every glyph in that case. An initial draft shows a 4.98 MB improvement on Membuster3, though I'm not 100% sure of its correctness.
Created attachment 186404 [details] Sneak preview for EWS
Attachment 186404 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/platform/graphics/GlyphPage.h', u'Source/WebCore/platform/graphics/GlyphPageTreeNode.cpp']" exit_code: 1 Source/WebCore/platform/graphics/GlyphPage.h:35: Alphabetical sorting problem. [build/include_order] [4] Total errors found: 1 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 186457 [details] Proposed patch
Comment on attachment 186457 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=186457&action=review > Source/WebCore/ChangeLog:14 > + 4.98 MB progression on Membuster3. Nice! > Source/WebCore/ChangeLog:24 > + There are now three ways of constructing a GlyphPage. createZeroed() and createCopy() > + are only used when creating the system fallback page. createCopiedSystemFallbackPage/createZeroedSystemFallbackPage then perhaps? > Source/WebCore/platform/graphics/GlyphPage.h:138 > + // -1 is a placeholder before we have an initial SimpleFontData*. > + if (m_fontDataForAllGlyphs == reinterpret_cast<const SimpleFontData*>(-1)) > + m_fontDataForAllGlyphs = fontData; Uhhuh. At least make it a named constant, unitializeFontData() or something > Source/WebCore/platform/graphics/GlyphPage.h:207 > + bool m_allGlyphsAreFromSameFont; > + union { > + const SimpleFontData* m_fontDataForAllGlyphs; > + const SimpleFontData** m_glyphFontData; > + }; Using union here doesn't seem to gain much. Without it you wouldn't need m_allGlyphsAreFromSameFont bit and the memory usage would be unchanged. The code would probably be clearer. I suppose it documents these are mutually exclusive...
<rdar://problem/13157042>
Created attachment 186710 [details] Proposed patch v2 @Antti: Sure thing brah.
Comment on attachment 186710 [details] Proposed patch v2 View in context: https://bugs.webkit.org/attachment.cgi?id=186710&action=review > Source/WebCore/platform/graphics/GlyphPage.h:191 > + : m_fontDataForAllGlyphs(undefinedFontDataValue()) This may be a stupid question but why can't this just be null?
(In reply to comment #7) > (From update of attachment 186710 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=186710&action=review > > > Source/WebCore/platform/graphics/GlyphPage.h:191 > > + : m_fontDataForAllGlyphs(undefinedFontDataValue()) > > This may be a stupid question but why can't this just be null? Good point! I thought I was guarding against a sequence like: [0] glyph=1234, fontData=0x0 [1] glyph=4567, fontData=0x00031337 But that doesn't actually happen in practice, since only glyph=0 will have a null font data pointer, and we handle that separately. I'll add an assertion for that instead and land without the -1 gunk. Thanks for reviewing.
Comment on attachment 186457 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=186457&action=review > Source/WebCore/platform/graphics/GlyphPage.h:80 > + ~GlyphPage() > + { > + if (!m_allGlyphsAreFromSameFont) > + fastFree(m_glyphFontData); > } Is it important to make this inline? > Source/WebCore/platform/graphics/GlyphPage.h:95 > + if (UNLIKELY(!glyph)) > + return GlyphData(0, 0); Why is this needed? I presume because we got bogus answers when we ask if a font includes a glyph for 0? > Source/WebCore/platform/graphics/GlyphPage.h:115 > const SimpleFontData* fontDataForCharacter(UChar32 c) const > { > + Glyph glyph = m_glyphs[indexForCharacter(c)]; > + if (UNLIKELY(!glyph)) > + return 0; > + if (m_allGlyphsAreFromSameFont) > + return m_fontDataForAllGlyphs; > return m_glyphFontData[indexForCharacter(c)]; > } Irritating that this repeats half of the glyphDataForIndex function. I suggest we refactor so they can share code in the future. >> Source/WebCore/platform/graphics/GlyphPage.h:138 >> + // -1 is a placeholder before we have an initial SimpleFontData*. >> + if (m_fontDataForAllGlyphs == reinterpret_cast<const SimpleFontData*>(-1)) >> + m_fontDataForAllGlyphs = fontData; > > Uhhuh. At least make it a named constant, unitializeFontData() or something Can’t we use a named constant for this?
Sorry about my late comments. I started reviewing yesterday and never hit the submit button.
(In reply to comment #9) > (From update of attachment 186457 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=186457&action=review > > > Source/WebCore/platform/graphics/GlyphPage.h:80 > > + ~GlyphPage() > > + { > > + if (!m_allGlyphsAreFromSameFont) > > + fastFree(m_glyphFontData); > > } > > Is it important to make this inline? Not really. I stuck with inline methods since there was no corresponding .cpp file for this class already. > > Source/WebCore/platform/graphics/GlyphPage.h:95 > > + if (UNLIKELY(!glyph)) > > + return GlyphData(0, 0); > > Why is this needed? I presume because we got bogus answers when we ask if a font includes a glyph for 0? Correct. > > Source/WebCore/platform/graphics/GlyphPage.h:115 > > const SimpleFontData* fontDataForCharacter(UChar32 c) const > > { > > + Glyph glyph = m_glyphs[indexForCharacter(c)]; > > + if (UNLIKELY(!glyph)) > > + return 0; > > + if (m_allGlyphsAreFromSameFont) > > + return m_fontDataForAllGlyphs; > > return m_glyphFontData[indexForCharacter(c)]; > > } > > Irritating that this repeats half of the glyphDataForIndex function. I suggest we refactor so they can share code in the future. True, it could just be: return glyphDataForIndex(indexForCharacter(c)).fontData;
Created attachment 186739 [details] Patch for landing
Comment on attachment 186739 [details] Patch for landing Clearing flags on attachment: 186739 Committed r141961: <http://trac.webkit.org/changeset/141961>
All reviewed patches have been landed. Closing bug.
It looks like this patch caused assertion failures on the following tests: svg/W3C-SVG-1.1/fonts-glyph-04-t.svg svg/W3C-SVG-1.1/fonts-glyph-03-t.svg svg/custom/glyph-selection-lang-attribute.svg svg/text/alt-glpyh-on-fallback-font-crash.svg See for example: http://build.webkit.org/builders/Apple%20Lion%20Debug%20WK2%20%28Tests%29/builds/7165/steps/layout-test/logs/stdio Sample stack trace: ASSERTION FAILED: fontData /Volumes/Data/slave/lion-debug/build/Source/WebCore/platform/graphics/GlyphPage.h(139) : void WebCore::GlyphPage::setGlyphDataForIndex(unsigned int, Glyph, const WebCore::SimpleFontData *) 1 0x108778665 WebCore::GlyphPage::setGlyphDataForIndex(unsigned int, unsigned short, WebCore::SimpleFontData const*) 2 0x108775e25 WebCore::GlyphPage::setGlyphDataForCharacter(int, unsigned short, WebCore::SimpleFontData const*) 3 0x1095fb718 WebCore::SVGTextRunRenderingContext::glyphDataForCharacter(WebCore::Font const&, WebCore::TextRun const&, WebCore::WidthIterator&, int, bool, int, unsigned int&) 4 0x109b64c0d WebCore::WidthIterator::glyphDataForCharacter(int, bool, int, unsigned int&) 5 0x109b668cb unsigned int WebCore::WidthIterator::advanceInternal<WebCore::SurrogatePairAwareTextIterator>(WebCore::SurrogatePairAwareTextIterator&, WebCore::GlyphBuffer*) 6 0x109b64d4d WebCore::WidthIterator::advance(int, WebCore::GlyphBuffer*) 7 0x1095fc40f WebCore::SVGTextMetricsBuilder::advanceSimpleText() 8 0x1095fc1cc WebCore::SVGTextMetricsBuilder::advance() 9 0x1095fc925 WebCore::SVGTextMetricsBuilder::measureTextRenderer(WebCore::RenderSVGInlineText*, WebCore::MeasureTextData*) 10 0x1095fcca8 WebCore::SVGTextMetricsBuilder::walkTree(WebCore::RenderObject*, WebCore::RenderSVGInlineText*, WebCore::MeasureTextData*) 11 0x1095f6730 WebCore::SVGTextMetricsBuilder::buildMetricsAndLayoutAttributes(WebCore::RenderSVGText*, WebCore::RenderSVGInlineText*, WTF::HashMap<unsigned int, WebCore::SVGCharacterData, WTF::IntHash<unsigned int>, WTF::HashTraits<unsigned int>, WTF::HashTraits<WebCore::SVGCharacterData> >&) 12 0x1095e4d18 WebCore::SVGTextLayoutAttributesBuilder::buildLayoutAttributesForForSubtree(WebCore::RenderSVGText*) 13 0x1095e4453 WebCore::RenderSVGText::layout() 14 0x1095c8b98 WebCore::SVGRenderSupport::layoutChildren(WebCore::RenderObject*, bool) 15 0x1095c878e WebCore::RenderSVGContainer::layout() 16 0x1095c8b98 WebCore::SVGRenderSupport::layoutChildren(WebCore::RenderObject*, bool) 17 0x1095c878e WebCore::RenderSVGContainer::layout() 18 0x1095c8b98 WebCore::SVGRenderSupport::layoutChildren(WebCore::RenderObject*, bool) 19 0x1095e0439 WebCore::RenderSVGRoot::layout() 20 0x1093cc60c WebCore::RenderBlock::layoutBlockChild(WebCore::RenderBox*, WebCore::RenderBlock::MarginInfo&, WebCore::LayoutUnit&, WebCore::LayoutUnit&) 21 0x1093c2aa7 WebCore::RenderBlock::layoutBlockChildren(bool, WebCore::LayoutUnit&) 22 0x1093c00d7 WebCore::RenderBlock::layoutBlock(bool, WebCore::LayoutUnit) 23 0x1093bf0de WebCore::RenderBlock::layout() 24 0x109695b24 WebCore::RenderView::layoutContent(WebCore::LayoutState const&) 25 0x109696457 WebCore::RenderView::layout() 26 0x1087db67d WebCore::FrameView::layout(bool) 27 0x10848732d WebCore::Document::implicitClose() 28 0x1087b102b WebCore::FrameLoader::checkCallImplicitClose() 29 0x1087b0cf3 WebCore::FrameLoader::checkCompleted() 30 0x1087af9b3 WebCore::FrameLoader::finishedParsing() 31 0x108491509 WebCore::Document::finishedParsing()
Re-opened since this is blocked by bug 109019
Committed r141990: <http://trac.webkit.org/changeset/141990>
It seems this regressed performance on Chromium's intl2 page cycler test: http://chromium-perf.appspot.com/?tab=mac-release-10.6-webkit-latest&graph=times&trace=t&rev=181260&history=150&master=ChromiumWebkit&testSuite=page_cycler_intl2&details=true The spikes in the graph correlate to: r141961: the first landing of the GlyphPage patch r141976: rollout of the patch r141990: relanding I'm not sure it needs fixing as it looks like the patch is a deliberate decision to optimize for cases where there are not multiple fonts or font fallback, which occurs more often on international pages. Locally, the pages that regressed the most are copies of BBC Arabic and BBC Tamil, though running the cycler on just those pages in isolation does not reproduce the regression.
(In reply to comment #18) > I'm not sure it needs fixing as it looks like the patch is a deliberate decision to optimize for cases where there are not multiple fonts or font fallback, which occurs more often on international pages. A 5% performance regression on something that isn't a micro-benchmark definitely needs fixing. > Locally, the pages that regressed the most are copies of BBC Arabic and BBC Tamil, though running the cycler on just those pages in isolation does not reproduce the regression. I think you misread the output. Before:http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/intl2/details.html?cl=180871&graph=times&trace=t After: http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/intl2/details.html?cl=180898&graph=times&trace=t The pages that regressed the most were: arabicnews.google.com bn.wikipedia.org ml.wikipedia.org sgkalesh.blogspot.com sgkalesh.blogspot.com had a nearly 200% regression! alking, I can't get you a copy of this if you want for digging into it. Ideally you'd rollback first so that we can ensure that the regression is fully resolved and doesn't come back with the fixed up version.
s/can't/can obviously. :(
Ouch. Sure, I'll roll this out and try to fix the regression. Thanks for the poke. Oh, and I already have a copy of the page cyclers somewhere :)
Re-opened since this is blocked by bug 109850
Looking at the way those sites use this code, it looks like this may be a case of overzealous UNLIKELY() branch prediction hints.
(In reply to comment #23) > Looking at the way those sites use this code, it looks like this may be a case of overzealous UNLIKELY() branch prediction hints. Local testing shows that removing the branch prediction hints removes the regression. I'm going to reland with that changed and see how it fares on the bots.
Committed r143125: <http://trac.webkit.org/changeset/143125>
No luck. :( http://chromium-perf.appspot.com/?tab=mac-release-10.6-webkit-latest&graph=times&trace=t&rev=183037&history=50&master=ChromiumWebkit&testSuite=page_cycler_intl2&details=true
(In reply to comment #26) > No luck. :( > > http://chromium-perf.appspot.com/?tab=mac-release-10.6-webkit-latest&graph=times&trace=t&rev=183037&history=50&master=ChromiumWebkit&testSuite=page_cycler_intl2&details=true Ugh. I can't get seem to get stable numbers locally, and it's hard to tell from a profile what's different between to runs. I have a speculative patch that streamlines the per-glyph fontdata paths that I'd like to try before rolling it out. Uploading for review shortly..
Created attachment 188776 [details] Speculative optimization
Comment on attachment 188776 [details] Speculative optimization Seems plausible.
Committed r143137: <http://trac.webkit.org/changeset/143137>
(In reply to comment #30) > Committed r143137: <http://trac.webkit.org/changeset/143137> Unfortunately, it looks like this didn't help: http://chromium-perf.appspot.com/?tab=mac-release-10.6-webkit-latest&graph=times&trace=t&rev=183149&history=150&master=ChromiumWebkit&testSuite=page_cycler_intl2&details=true
(In reply to comment #31) > (In reply to comment #30) > > Committed r143137: <http://trac.webkit.org/changeset/143137> > > Unfortunately, it looks like this didn't help: > > http://chromium-perf.appspot.com/?tab=mac-release-10.6-webkit-latest&graph=times&trace=t&rev=183149&history=150&master=ChromiumWebkit&testSuite=page_cycler_intl2&details=true Hm. Confusing. So, just to make sure I'm understanding everything, this is a regression on Chromium/Mac only, and this perf bot is running Snow Leopard? You'd think that at leasts the 2x regression on sgkalesh.blogspot.com would be reproducible in the applemac port if we're using the same codepaths. Does Chromium/Mac use the same fonts implementation as the Apple Mac port?
(In reply to comment #32) > Hm. Confusing. So, just to make sure I'm understanding everything, this is a regression on Chromium/Mac only, and this perf bot is running Snow Leopard? Yes, the intl2 regression we only see on Chromium/Mac, running on Snow Leopard. There is also a smaller intl1 regression visible on Chromium/Linux: http://chromium-perf.appspot.com/?tab=linux-release-webkit-latest&graph=times&trace=t&rev=183149&history=150&master=ChromiumWebkit&testSuite=page_cycler_intl1&details=true > You'd think that at leasts the 2x regression on sgkalesh.blogspot.com would be reproducible in the applemac port if we're using the same codepaths. Does Chromium/Mac use the same fonts implementation as the Apple Mac port? It looks like unlike other Chromium builds, the Mac one uses the same fonts implementation as in WebCore. Based on comments in WebCore.gyp like: "The Mac currently uses FontCustomPlatformData.cpp from platform/graphics/mac" One place they can possibly differ is in default per-script font settings, but this seems unlikely since sgkalesh.blogspot.com doesn't set the HTML 'lang' attribute. I actually could not reproduce this locally either, on Chromium/Mac running on Mountain Lion. The BBC pages seemed to regress, and sgkalesh.blogspot.com actually got a little faster.
I'm gonna try to rewrite this so that GlyphPage is always a single allocation, in case the regression is caused by the indirection in accessing the per-glyph array. I just need some sleep first if I'm gonna get it right. :)
Created attachment 189517 [details] Mildly desperate patch Okay, here's a patch that allocates GlyphPage and the per-glyph font data array in a single block instead. Now that I see the diff, it's actually kind of beautiful. Wish I'd written it this way from the beginning.
Comment on attachment 189517 [details] Mildly desperate patch View in context: https://bugs.webkit.org/attachment.cgi?id=189517&action=review Nice, r=me > Source/WebCore/platform/graphics/GlyphPage.h:59 > +#if COMPILER(MSVC) > +#pragma warning(push) > +#pragma warning(disable: 4200) // Disable "zero-sized array in struct/union" warning > +#endif Could this silly warning be disabled globally? > Source/WebCore/platform/graphics/GlyphPage.h:158 > + // This method should only be called on the system fallback page, which is never single-font. > + ASSERT(hasPerGlyphFontData()); Maybe add "system fallback page" to the function name?
Committed r143601: <http://trac.webkit.org/changeset/143601>
Committed r143707: <http://trac.webkit.org/changeset/143707>
(In reply to comment #38) > Committed r143707: <http://trac.webkit.org/changeset/143707> Slapped ALWAYS_INLINE on the relevant getters to see if this is an inlining problem. webkit-patch added the "Reviewed by Antti Koivisto" line for some reason. Bad webkit-patch.
Sigh. It appears that didn't help either. Would someone be able to grab a profile of this test on a machine where it's reproducible? I'm at a loss here.
Thanks for working on fixing this. Sorry it's such a pain :-( I don't know if we can get a profile. I haven't been able to reproduce locally on my machine and don't think we have duplicate machines other than the actual one running the test. Somewhat good news is there is talk of replacing the intl tests, but this won't happen soon: http://crbug.com/175050 Do you think you can rollout the patch? It would be bad if another regression sneaks in while it's in the tree.
We often have trouble with the Intl2 tester when optimizing the glyph cache, the font cache, or other such things, because it uses more different scripts than is remotely likely for a single user. Our own page load speed benchmark which includes some non-Latin script pages but not quite so many different ones doesn't show a regression from this rev. Historically we've tried to deal with this by dynamically scaling for more fonts or more glyphs on a timer - this makes Intl2 happy without regressing memory use for more realistic cases. Perhaps that would make sense to do here. Unfortunately, it's hard to do that in this case without a profile or ability to reproduce the benchmark. If we just roll out the change with no further info, we'll be regressing memory for normal users for the sake of an artificial benchmark, with no obvious path to getting it back, since no one outside Google has a profile or can run the benchmark.
This patch appears to reduce the memory usage by roughly 6% on the same intl1 test: http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/intl1/report.html?history=250&rev=184424&graph=commit_charge
We're chatting about this on the Chromium bug and it sounds like there's general agreement that this is one of the rare times it makes sense to leave a regression in given all the details and that akling has done pretty much all that can be done to try to address it. Also, there's talk of trying to replace this test suite with a more realistic workload. http://crbug.com/175050 if you care.