Bug 189722 - Fix crash under FontCache::purgeInactiveFontData() when a memory warning fires
Summary: Fix crash under FontCache::purgeInactiveFontData() when a memory warning fires
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Simon Fraser (smfr)
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2018-09-18 16:13 PDT by Simon Fraser (smfr)
Modified: 2018-10-04 09:40 PDT (History)
6 users (show)

See Also:


Attachments
Patch (2.33 KB, patch)
2018-09-18 16:18 PDT, Simon Fraser (smfr)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Fraser (smfr) 2018-09-18 16:13:58 PDT
Fix crash under FontCache::purgeInactiveFontData() when a memory warning fires
Comment 1 Simon Fraser (smfr) 2018-09-18 16:18:08 PDT
Created attachment 350073 [details]
Patch
Comment 2 Simon Fraser (smfr) 2018-09-18 16:18:10 PDT
<rdar://problem/44182860>
Comment 3 Myles C. Maxfield 2018-09-18 23:45:48 PDT
Comment on attachment 350073 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=350073&action=review

> Source/WebCore/platform/graphics/FontCache.cpp:379
> -        for (auto& font : cachedFonts().values()) {
> +        for (auto font : cachedFonts().values()) {

Right, I see what's happening here. cachedFonts().remove(font->platformData()); is removing the wrong item. This null deref means that if we follow the RefPtr in the cachedFonts() cache to a Font object, and we look up that Font’s inner FontPlatformData, we don’t end up with the same entry in cachedFonts() than we started with. This should be impossible because we maintain the invariant that this cycle should hold.

Copying the RefPtr is the wrong thing to do because it means that the cachedFonts().remove() call won't actually remove the font, which is contrary to the whole point of this function. This is papering over the problem rather than actually fixing it.
Comment 4 Myles C. Maxfield 2018-09-20 01:34:34 PDT
Comment on attachment 350073 [details]
Patch

I can't reproduce this in the simulator, so we can commit this until I get back to Cupertino.
Comment 5 WebKit Commit Bot 2018-09-20 02:00:22 PDT
Comment on attachment 350073 [details]
Patch

Clearing flags on attachment: 350073

Committed r236254: <https://trac.webkit.org/changeset/236254>
Comment 6 WebKit Commit Bot 2018-09-20 02:00:23 PDT
All reviewed patches have been landed.  Closing bug.
Comment 7 Darin Adler 2018-09-20 12:46:29 PDT
This is an extremely important bug to fix, and I am thrilled that it’s resolved!

Since moving out of the hash table value is something we can’t support, I think the best way to change the code to do that is to remove the call to WTFMove (and ideally write a clear comment explaining why we must copy and not move, preventing future people from attempting that optimization). The much more subtle alteration we chose to do, changing from auto& to auto so that the WTFMove won’t move out of the hash map values, does not seem to be as clear.

I also don’t fully understand why we need to do it. Can we instead fix the crash by changing the code to not assume the pointer can never be null?

I also think we need to either find a way to test this, make the code’s rationale much clearer, or both.

And the whole thing about "removing the wrong item" seems like it needs a separate fix -- I assume that it has a symptom even if we don’t crash.
Comment 8 Chris Dumez 2018-09-28 12:11:29 PDT
Comment on attachment 350073 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=350073&action=review

> Source/WebCore/platform/graphics/FontCache.cpp:381
>              if (!font->hasOneRef())

But then how can this ever be false since you have now copied the RefPtr?
Comment 9 Chris Dumez 2018-09-28 12:13:20 PDT
(In reply to Chris Dumez from comment #8)
> Comment on attachment 350073 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=350073&action=review
> 
> > Source/WebCore/platform/graphics/FontCache.cpp:381
> >              if (!font->hasOneRef())
> 
> But then how can this ever be false since you have now copied the RefPtr?

Note that I later re-introduced auto& in http://trac.webkit.org/r236383 (but stopped using WTFMove() later), I wasn't aware of Simon's change until now.
Comment 10 Darin Adler 2018-09-28 15:53:03 PDT
Comment on attachment 350073 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=350073&action=review

>>> Source/WebCore/platform/graphics/FontCache.cpp:381
>>>              if (!font->hasOneRef())
>> 
>> But then how can this ever be false since you have now copied the RefPtr?
> 
> Note that I later re-introduced auto& in http://trac.webkit.org/r236383 (but stopped using WTFMove() later), I wasn't aware of Simon's change until now.

I’m pretty sure that, because of the error you are pointing out here, in the time after *this* patch was landed but before r236383 we had a purgeInactiveFontData function that would not delete any fonts!

This means we need to find a way to add some tests that verify that our purge function actually does what it is supposed to. So next time we break it we notice that we have done so.
Comment 11 Chris Dumez 2018-10-03 10:09:17 PDT
(In reply to Darin Adler from comment #10)
> Comment on attachment 350073 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=350073&action=review
> 
> >>> Source/WebCore/platform/graphics/FontCache.cpp:381
> >>>              if (!font->hasOneRef())
> >> 
> >> But then how can this ever be false since you have now copied the RefPtr?
> > 
> > Note that I later re-introduced auto& in http://trac.webkit.org/r236383 (but stopped using WTFMove() later), I wasn't aware of Simon's change until now.
> 
> I’m pretty sure that, because of the error you are pointing out here, in the
> time after *this* patch was landed but before r236383 we had a
> purgeInactiveFontData function that would not delete any fonts!
> 
> This means we need to find a way to add some tests that verify that our
> purge function actually does what it is supposed to. So next time we break
> it we notice that we have done so.

FYI, we did notice because r236254 caused a ~2% PLT regression.
Comment 12 Myles C. Maxfield 2018-10-03 14:35:40 PDT
(In reply to Chris Dumez from comment #11)
> (In reply to Darin Adler from comment #10)
> > Comment on attachment 350073 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=350073&action=review
> > 
> > >>> Source/WebCore/platform/graphics/FontCache.cpp:381
> > >>>              if (!font->hasOneRef())
> > >> 
> > >> But then how can this ever be false since you have now copied the RefPtr?
> > > 
> > > Note that I later re-introduced auto& in http://trac.webkit.org/r236383 (but stopped using WTFMove() later), I wasn't aware of Simon's change until now.
> > 
> > I’m pretty sure that, because of the error you are pointing out here, in the
> > time after *this* patch was landed but before r236383 we had a
> > purgeInactiveFontData function that would not delete any fonts!
> > 
> > This means we need to find a way to add some tests that verify that our
> > purge function actually does what it is supposed to. So next time we break
> > it we notice that we have done so.
> 
> FYI, we did notice because r236254 caused a ~2% PLT regression.

<rdar://problem/44954219> is for reverting this patch.
Comment 13 Chris Dumez 2018-10-03 15:02:35 PDT
(In reply to Myles C. Maxfield from comment #12)
> (In reply to Chris Dumez from comment #11)
> > (In reply to Darin Adler from comment #10)
> > > Comment on attachment 350073 [details]
> > > Patch
> > > 
> > > View in context:
> > > https://bugs.webkit.org/attachment.cgi?id=350073&action=review
> > > 
> > > >>> Source/WebCore/platform/graphics/FontCache.cpp:381
> > > >>>              if (!font->hasOneRef())
> > > >> 
> > > >> But then how can this ever be false since you have now copied the RefPtr?
> > > > 
> > > > Note that I later re-introduced auto& in http://trac.webkit.org/r236383 (but stopped using WTFMove() later), I wasn't aware of Simon's change until now.
> > > 
> > > I’m pretty sure that, because of the error you are pointing out here, in the
> > > time after *this* patch was landed but before r236383 we had a
> > > purgeInactiveFontData function that would not delete any fonts!
> > > 
> > > This means we need to find a way to add some tests that verify that our
> > > purge function actually does what it is supposed to. So next time we break
> > > it we notice that we have done so.
> > 
> > FYI, we did notice because r236254 caused a ~2% PLT regression.
> 
> <rdar://problem/44954219> is for reverting this patch.

What sure what you mean, http://trac.webkit.org/r236383 basically already reverted it and recovered the regression.
Comment 14 Myles C. Maxfield 2018-10-03 16:11:45 PDT
(In reply to Chris Dumez from comment #13)
> (In reply to Myles C. Maxfield from comment #12)
> > (In reply to Chris Dumez from comment #11)
> > > (In reply to Darin Adler from comment #10)
> > > > Comment on attachment 350073 [details]
> > > > Patch
> > > > 
> > > > View in context:
> > > > https://bugs.webkit.org/attachment.cgi?id=350073&action=review
> > > > 
> > > > >>> Source/WebCore/platform/graphics/FontCache.cpp:381
> > > > >>>              if (!font->hasOneRef())
> > > > >> 
> > > > >> But then how can this ever be false since you have now copied the RefPtr?
> > > > > 
> > > > > Note that I later re-introduced auto& in http://trac.webkit.org/r236383 (but stopped using WTFMove() later), I wasn't aware of Simon's change until now.
> > > > 
> > > > I’m pretty sure that, because of the error you are pointing out here, in the
> > > > time after *this* patch was landed but before r236383 we had a
> > > > purgeInactiveFontData function that would not delete any fonts!
> > > > 
> > > > This means we need to find a way to add some tests that verify that our
> > > > purge function actually does what it is supposed to. So next time we break
> > > > it we notice that we have done so.
> > > 
> > > FYI, we did notice because r236254 caused a ~2% PLT regression.
> > 
> > <rdar://problem/44954219> is for reverting this patch.
> 
> What sure what you mean, http://trac.webkit.org/r236383 basically already
> reverted it and recovered the regression.

That patch says copyRef(), so I don't understand how that fixed it.
Comment 15 Chris Dumez 2018-10-03 16:15:03 PDT
(In reply to Myles C. Maxfield from comment #14)
> (In reply to Chris Dumez from comment #13)
> > (In reply to Myles C. Maxfield from comment #12)
> > > (In reply to Chris Dumez from comment #11)
> > > > (In reply to Darin Adler from comment #10)
> > > > > Comment on attachment 350073 [details]
> > > > > Patch
> > > > > 
> > > > > View in context:
> > > > > https://bugs.webkit.org/attachment.cgi?id=350073&action=review
> > > > > 
> > > > > >>> Source/WebCore/platform/graphics/FontCache.cpp:381
> > > > > >>>              if (!font->hasOneRef())
> > > > > >> 
> > > > > >> But then how can this ever be false since you have now copied the RefPtr?
> > > > > > 
> > > > > > Note that I later re-introduced auto& in http://trac.webkit.org/r236383 (but stopped using WTFMove() later), I wasn't aware of Simon's change until now.
> > > > > 
> > > > > I’m pretty sure that, because of the error you are pointing out here, in the
> > > > > time after *this* patch was landed but before r236383 we had a
> > > > > purgeInactiveFontData function that would not delete any fonts!
> > > > > 
> > > > > This means we need to find a way to add some tests that verify that our
> > > > > purge function actually does what it is supposed to. So next time we break
> > > > > it we notice that we have done so.
> > > > 
> > > > FYI, we did notice because r236254 caused a ~2% PLT regression.
> > > 
> > > <rdar://problem/44954219> is for reverting this patch.
> > 
> > What sure what you mean, http://trac.webkit.org/r236383 basically already
> > reverted it and recovered the regression.
> 
> That patch says copyRef(), so I don't understand how that fixed it.

r236383 fixed the regression introduced by r236254 by reverting it:
auto -> auto& so that hasOneRef() check does the right thing again
r236383 still addresses the crash that r236254 was trying to fix AFAICT, because it uses copyRef() instead of WTFMove(), thus not modifying the original structure.
Comment 16 Darin Adler 2018-10-04 09:40:21 PDT
I believe Chris’s analysis is correct. There is no need to "revert" this patch.