Bug 233747 - move FontCache singleton and instance on WorkerGlobalScope to ThreadGlobalData
Summary: move FontCache singleton and instance on WorkerGlobalScope to ThreadGlobalData
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: WebKit Local Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Cameron McCormack (:heycam)
URL:
Keywords: InRadar
Depends on: 233750
Blocks: 233488 233749
  Show dependency treegraph
 
Reported: 2021-12-01 22:27 PST by Cameron McCormack (:heycam)
Modified: 2021-12-07 21:14 PST (History)
19 users (show)

See Also:


Attachments
Patch (58.91 KB, patch)
2021-12-01 22:56 PST, Cameron McCormack (:heycam)
no flags Details | Formatted Diff | Diff
Patch (61.65 KB, patch)
2021-12-06 16:01 PST, Cameron McCormack (:heycam)
no flags Details | Formatted Diff | Diff
Patch (61.72 KB, patch)
2021-12-06 16:37 PST, Cameron McCormack (:heycam)
no flags Details | Formatted Diff | Diff
Patch (61.85 KB, patch)
2021-12-07 13:51 PST, Cameron McCormack (:heycam)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Cameron McCormack (:heycam) 2021-12-01 22:27:42 PST
There are various places in font code where we need access to a FontCache instance.  Currently we:

* assume that we're on the main thread (and call FontCache::singleton()), or
* get a FontCache off WorkerGlobalScope and pass it down (or store it) where it's needed, then pass it to fontCacheFallingBackToSingleton

Having to thread through, or store, the FontCache pointer is inconvenient, and there are some places where we don't have access to a FontCache pointer, but we're on a worker, and we call fontCacheFallingBackToSingleton and get back the main thread's instance.

I think it would be cleaner if we moved both the main thread FontCache singleton, and the one stored on a WorkerGlobalScope, to ThreadGlobalData, which is easily globally accessible.  Performing a TLS lookup is cheap these days.
Comment 1 Cameron McCormack (:heycam) 2021-12-01 22:56:36 PST
Created attachment 445673 [details]
Patch
Comment 2 Chris Lord 2021-12-02 02:14:44 PST
Funny, this is how I originally had it, I can't remember what the resistance was to this method... Possibly because it reinforces the pattern of singletons that we generally want to move away from(?)

No objections from me though, I think this would just make a lot of things easier.
Comment 3 Chris Lord 2021-12-02 03:26:54 PST
Comment on attachment 445673 [details]
Patch

Fly-by r+ - I like this direction, personally, and this patch looks good to me. Perhaps we could do the same thing for CSSValuePool...
Comment 4 Cameron McCormack (:heycam) 2021-12-02 13:28:47 PST
I'm having seconds thoughts about exactly where to store these caches (now that Simon suggested it).  Maybe I shouldn't move FontCache to ThreadGlobalData, and (in my later patches like bug 233749) move the other thread unsafe caches to hang off FontCache, but instead use either bmalloc::PerThread<T> or WTF::ThreadSpecific<T> (depending on whether it's a per-thread singleton type like FontCache, or something else like a HashMap).

That would avoid mixing everything up in FontCache.  On the other hand, instead of having the one call to m_fontCache->invalidate() like I have in this patch in ThreadGlobalData::destroy, we'd have to have a bunch of calls to the different caches to clear them.  Maybe it's better to be more explicit there anyway?
Comment 5 Cameron McCormack (:heycam) 2021-12-02 13:43:25 PST
Well, there is one more advantage to having the FontCache hanging off ThreadGlobalData -- it means we don't have to have any code to handle the Web thread on iOS needing to have access to the same caches that are created on the main thread.

All this stuff would need to be duplicated for each cache that might be used by the Web thread and the main thread on iOS:

https://webkit-search.igalia.com/webkit/rev/535c2ce176bb30bd14e60940f4bcdb7e7cfcf7ba/Source/WebCore/platform/ThreadGlobalData.cpp#68-105
Comment 6 Darin Adler 2021-12-02 16:40:38 PST
Comment on attachment 445673 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=445673&action=review

What’s the performance impact of fetching thread global data in all these places, where many were accessing a local variable or data member before?

> Source/WebCore/page/SettingsBase.cpp:61
> +    // FIXME: Call invalidateFontCascadeCache() on all workers.
> +    FontCache::forCurrentThread().invalidateFontCascadeCache();

Can we do this through a FontCache::forEachFontCache function, analogous to Page::forEachPage, or do we need to keep the registry of "all" somewhere outside the FontCache class?

One reason I ask is that if we know we are going to use that interface, we could deploy it now while touching each of these call sites even if right now it only did the current thread one. Then we’d have fewer FIXME in this patch.

> Source/WebCore/platform/ThreadGlobalData.cpp:68
> +    if (m_fontCache)
> +        m_fontCache->invalidate();

What destroys the font cache? Does calling invalidate() also null out m_fontCache?

> Source/WebCore/platform/graphics/FontCache.h:286
> +    static std::unique_ptr<FontCache> create();

Not sure we need this function. Typically we just make the constructor public and use make_unique directly for non-reference-counted things.

Or we could try to keep things private and use friend to make it accessible to ThreadGlobalData. Could be the create function or the constructor.

> Source/WebKit/WebProcess/WebProcess.cpp:873
>  #ifndef NDEBUG
>      GCController::singleton().garbageCollectNow();
> -    FontCache::singleton().invalidate();
> +    // FIXME: What about FontCaches on worker threads?
> +    FontCache::forCurrentThread().invalidate();
>      MemoryCache::singleton().setDisabled(true);
>  #endif

Question for whoever originally wrote this: Why is doing this work at termination, but only in debug versions, a good idea? Lack of a comment here makes this mysterious!
Comment 7 Cameron McCormack (:heycam) 2021-12-02 17:15:40 PST
(In reply to Darin Adler from comment #6)
> What’s the performance impact of fetching thread global data in all these
> places, where many were accessing a local variable or data member before?

On iOS, the threadGlobalData() function does this, in the common case of already being initialized:

* Reads a static variable (which doesn't have a complex initializer) that holds a pointer to a ThreadSpecific<RefPtr<ThreadGlobalData>>
* Jumps past the initialization code
* Calls into ThreadSpecific::get on the object it points to, which:
    * calls pthread_getspecific, which is three instructions long:
        _pthread_getspecific:
            1b8c:       68 d0 3b d5     mrs     x8, TPIDRRO_EL0
            1b90:       00 79 60 f8     ldr     x0, [x8, x0, lsl #3]
            1b94:       c0 03 5f d6     ret
* Dereferences the pointer returned by ThreadSpecific::get()
* Jumps past its initialization since it's already been created
* Returns the pointer to the ThreadGlobalData

On macOS, it does similar, though slightly less:

* Reads a static variable (which doesn't have a complex initializer) that holds a pointer to a ThreadSpecific<ThreadGlobalData>
* Jumps past the initialization code
* Calls into ThreadSpecific::get on the object it points to, which:
    * calls pthread_getspecific, which is two instructions long:
        _pthread_getspecific:
            1f29:       65 48 8b 04 fd 00 00 00 00      movq    %gs:(,%rdi,8), %rax
            1f32:       c3      retq
* Returns the pointer returned by ThreadSpecific::get()


So I think it's not much more than the current static singleton code, and also would have to be balanced against the stack pushing and popping to pass FontCache through all of the functions that would need access to it.

> > Source/WebCore/page/SettingsBase.cpp:61
> > +    // FIXME: Call invalidateFontCascadeCache() on all workers.
> > +    FontCache::forCurrentThread().invalidateFontCascadeCache();
> 
> Can we do this through a FontCache::forEachFontCache function, analogous to
> Page::forEachPage, or do we need to keep the registry of "all" somewhere
> outside the FontCache class?
> 
> One reason I ask is that if we know we are going to use that interface, we
> could deploy it now while touching each of these call sites even if right
> now it only did the current thread one. Then we’d have fewer FIXME in this
> patch.

I haven't thought how exactly I'm going to the function that loops over the FontCaches, since they'll each exist in a different thread, and there'll have to be some kind of runnable dispatch to each.  But probably we'd iterate over all of the WorkerOrWorkletThreads (since we already have them all in WorkerOrWorkletThread::workerOrWorkletThreads()).

> > Source/WebCore/platform/ThreadGlobalData.cpp:68
> > +    if (m_fontCache)
> > +        m_fontCache->invalidate();
> 
> What destroys the font cache? Does calling invalidate() also null out
> m_fontCache?

Maybe I have to.  There are some other fields on ThreadGlobalScope that aren't cleared in destroy(), and maybe they should be too.

I have a feeling now that the ThreadGlobalObjects themselves are never destroyed, which makes me wonder how the memory allocated for them is freed?  If we keep creating and destroying threads are we leaking ThreadGlobalObjects?

> > Source/WebKit/WebProcess/WebProcess.cpp:873
> >  #ifndef NDEBUG
> >      GCController::singleton().garbageCollectNow();
> > -    FontCache::singleton().invalidate();
> > +    // FIXME: What about FontCaches on worker threads?
> > +    FontCache::forCurrentThread().invalidate();
> >      MemoryCache::singleton().setDisabled(true);
> >  #endif
> 
> Question for whoever originally wrote this: Why is doing this work at
> termination, but only in debug versions, a good idea? Lack of a comment here
> makes this mysterious!

I don't know either.
Comment 8 Darin Adler 2021-12-02 17:36:27 PST
(In reply to Cameron McCormack (:heycam) from comment #7)
> So I think it's not much more than the current static singleton code, and
> also would have to be balanced against the stack pushing and popping to pass
> FontCache through all of the functions that would need access to it.

I suppose we can do a quick performance test to be sure nothing measurable regressed.
Comment 9 Darin Adler 2021-12-02 17:37:28 PST
(In reply to Cameron McCormack (:heycam) from comment #7)
> > > Source/WebKit/WebProcess/WebProcess.cpp:873
> > >  #ifndef NDEBUG
> > >      GCController::singleton().garbageCollectNow();
> > > -    FontCache::singleton().invalidate();
> > > +    // FIXME: What about FontCaches on worker threads?
> > > +    FontCache::forCurrentThread().invalidate();
> > >      MemoryCache::singleton().setDisabled(true);
> > >  #endif
> > 
> > Question for whoever originally wrote this: Why is doing this work at
> > termination, but only in debug versions, a good idea? Lack of a comment here
> > makes this mysterious!
> 
> I don't know either.

We should try just deleting this code. We know it won’t affect the production build since it’s debug-only, so any problems would have to be seen by people doing development or bots testing debug builds.
Comment 10 Cameron McCormack (:heycam) 2021-12-02 18:10:44 PST
(In reply to Darin Adler from comment #9)
> We should try just deleting this code. We know it won’t affect the
> production build since it’s debug-only, so any problems would have to be
> seen by people doing development or bots testing debug builds.

https://commits.webkit.org/136348@main claims it reduces "LEAK" output lines.  Sounds plausible though I don't think I've never seen any "LEAK" output lines so I don't know whether that still works or if I need to enable it explicitly.
Comment 11 Cameron McCormack (:heycam) 2021-12-02 18:19:54 PST
(In reply to Cameron McCormack (:heycam) from comment #7)
> > > Source/WebCore/platform/ThreadGlobalData.cpp:68
> > > +    if (m_fontCache)
> > > +        m_fontCache->invalidate();
> > 
> > What destroys the font cache? Does calling invalidate() also null out
> > m_fontCache?
> 
> Maybe I have to.  There are some other fields on ThreadGlobalScope that
> aren't cleared in destroy(), and maybe they should be too.
> 
> I have a feeling now that the ThreadGlobalObjects themselves are never
> destroyed, which makes me wonder how the memory allocated for them is freed?
> If we keep creating and destroying threads are we leaking
> ThreadGlobalObjects?

From code inspection, I think ThreadGlobaData objects get destroyed by virtue of having a destructor function registered as part of the pthread_create_key call, which will will be called upon thread exit.

So I'm no longer sure why any of this work is done under ThreadGlobalData::destroy().
Comment 12 Cameron McCormack (:heycam) 2021-12-02 18:22:56 PST
From https://commits.webkit.org/50098@main it sounds like there was a reason to clear things in destroy() -- to ensure that AtomStrings were all destroyed before the atom string table itself is (as part of destroying WTFThreadData).
Comment 13 Darin Adler 2021-12-02 18:45:14 PST
(In reply to Cameron McCormack (:heycam) from comment #7)
> > > Source/WebCore/platform/ThreadGlobalData.cpp:68
> > > +    if (m_fontCache)
> > > +        m_fontCache->invalidate();
> > 
> > What destroys the font cache? Does calling invalidate() also null out
> > m_fontCache?
> 
> Maybe I have to.  There are some other fields on ThreadGlobalScope that
> aren't cleared in destroy(), and maybe they should be too.

It’s entirely possible that the nulling out is only needed for cases where the order of destruction matters.

> I have a feeling now that the ThreadGlobalObjects themselves are never
> destroyed

Even though it was my comment that engendered that feeling, I think it is wrong. Seems like they do because ThreadSpecific::destroy gets called by the platform threading machinery, which then calls delete on the ThreadSpecific's object, which is a ThreadGlobalData.
Comment 14 Darin Adler 2021-12-02 18:45:28 PST
Oh, I see you figured this things out above.
Comment 15 Darin Adler 2021-12-02 18:46:17 PST
(In reply to Cameron McCormack (:heycam) from comment #10)
> https://commits.webkit.org/136348@main claims it reduces "LEAK" output
> lines.  Sounds plausible though I don't think I've never seen any "LEAK"
> output lines so I don't know whether that still works or if I need to enable
> it explicitly.

I think we may have removed the LEAK thing in the last couple of years.
Comment 16 Darin Adler 2021-12-03 08:53:12 PST
(In reply to Cameron McCormack (:heycam) from comment #12)
> From https://commits.webkit.org/50098@main it sounds like there was a reason
> to clear things in destroy() -- to ensure that AtomStrings were all
> destroyed before the atom string table itself is (as part of destroying
> WTFThreadData).

Either you or I should help out future programmers by turning that into a comment in destroy(). In my view, this is exactly what comments are for. Adding the "why" to code. The code itself should be very clear on "what".
Comment 17 Cameron McCormack (:heycam) 2021-12-03 13:17:29 PST
I'll add that comment.

Re performance I ran PLT5 and got no change.
Comment 18 zalan 2021-12-03 13:27:28 PST
(In reply to Cameron McCormack (:heycam) from comment #17)
> I'll add that comment.
> 
> Re performance I ran PLT5 and got no change.
MM might be a better pick to test for pref regression.
Comment 19 Cameron McCormack (:heycam) 2021-12-04 16:13:32 PST
(In reply to zalan from comment #18)
> MM might be a better pick to test for pref regression.

MotionMark shows no change too.
Comment 20 zalan 2021-12-04 19:09:16 PST
(In reply to Cameron McCormack (:heycam) from comment #19)
> (In reply to zalan from comment #18)
> > MM might be a better pick to test for pref regression.
> 
> MotionMark shows no change too.
👍
Comment 21 Cameron McCormack (:heycam) 2021-12-06 16:01:06 PST
Created attachment 446089 [details]
Patch
Comment 22 Cameron McCormack (:heycam) 2021-12-06 16:37:22 PST
Created attachment 446095 [details]
Patch
Comment 23 Myles C. Maxfield 2021-12-07 00:44:19 PST
Comment on attachment 446095 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=446095&action=review

> Source/WebCore/ChangeLog:19
> +        fontCacheFallingBackToSingleton and wrongly get back the main thread's

"incorrectly"

> Source/WebCore/ChangeLog:24
> +        ThreadGlobalData, which is easily globally accessible. Performing a

This sounds like a good idea.
Comment 24 Cameron McCormack (:heycam) 2021-12-07 13:51:51 PST
Created attachment 446234 [details]
Patch
Comment 25 EWS 2021-12-07 15:34:44 PST
Committed r286625 (244938@main): <https://commits.webkit.org/244938@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 446234 [details].
Comment 26 Radar WebKit Bug Importer 2021-12-07 15:35:25 PST
<rdar://problem/86180020>