Summary: | move FontCache singleton and instance on WorkerGlobalScope to ThreadGlobalData | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Cameron McCormack (:heycam) <heycam> | ||||||||||
Component: | Layout and Rendering | Assignee: | Cameron McCormack (:heycam) <heycam> | ||||||||||
Status: | RESOLVED FIXED | ||||||||||||
Severity: | Normal | CC: | bfulgham, cdumez, clord, darin, esprehn+autocc, ews-watchlist, glenn, gyuyoung.kim, hi, joepeck, kangil.han, macpherson, menard, mifenton, mmaxfield, pangle, simon.fraser, webkit-bug-importer, zalan | ||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||
Version: | WebKit Local Build | ||||||||||||
Hardware: | Unspecified | ||||||||||||
OS: | Unspecified | ||||||||||||
Bug Depends on: | 233750 | ||||||||||||
Bug Blocks: | 233488, 233749 | ||||||||||||
Attachments: |
|
Description
Cameron McCormack (:heycam)
2021-12-01 22:27:42 PST
Created attachment 445673 [details]
Patch
Funny, this is how I originally had it, I can't remember what the resistance was to this method... Possibly because it reinforces the pattern of singletons that we generally want to move away from(?) No objections from me though, I think this would just make a lot of things easier. Comment on attachment 445673 [details]
Patch
Fly-by r+ - I like this direction, personally, and this patch looks good to me. Perhaps we could do the same thing for CSSValuePool...
I'm having seconds thoughts about exactly where to store these caches (now that Simon suggested it). Maybe I shouldn't move FontCache to ThreadGlobalData, and (in my later patches like bug 233749) move the other thread unsafe caches to hang off FontCache, but instead use either bmalloc::PerThread<T> or WTF::ThreadSpecific<T> (depending on whether it's a per-thread singleton type like FontCache, or something else like a HashMap). That would avoid mixing everything up in FontCache. On the other hand, instead of having the one call to m_fontCache->invalidate() like I have in this patch in ThreadGlobalData::destroy, we'd have to have a bunch of calls to the different caches to clear them. Maybe it's better to be more explicit there anyway? Well, there is one more advantage to having the FontCache hanging off ThreadGlobalData -- it means we don't have to have any code to handle the Web thread on iOS needing to have access to the same caches that are created on the main thread. All this stuff would need to be duplicated for each cache that might be used by the Web thread and the main thread on iOS: https://webkit-search.igalia.com/webkit/rev/535c2ce176bb30bd14e60940f4bcdb7e7cfcf7ba/Source/WebCore/platform/ThreadGlobalData.cpp#68-105 Comment on attachment 445673 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=445673&action=review What’s the performance impact of fetching thread global data in all these places, where many were accessing a local variable or data member before? > Source/WebCore/page/SettingsBase.cpp:61 > + // FIXME: Call invalidateFontCascadeCache() on all workers. > + FontCache::forCurrentThread().invalidateFontCascadeCache(); Can we do this through a FontCache::forEachFontCache function, analogous to Page::forEachPage, or do we need to keep the registry of "all" somewhere outside the FontCache class? One reason I ask is that if we know we are going to use that interface, we could deploy it now while touching each of these call sites even if right now it only did the current thread one. Then we’d have fewer FIXME in this patch. > Source/WebCore/platform/ThreadGlobalData.cpp:68 > + if (m_fontCache) > + m_fontCache->invalidate(); What destroys the font cache? Does calling invalidate() also null out m_fontCache? > Source/WebCore/platform/graphics/FontCache.h:286 > + static std::unique_ptr<FontCache> create(); Not sure we need this function. Typically we just make the constructor public and use make_unique directly for non-reference-counted things. Or we could try to keep things private and use friend to make it accessible to ThreadGlobalData. Could be the create function or the constructor. > Source/WebKit/WebProcess/WebProcess.cpp:873 > #ifndef NDEBUG > GCController::singleton().garbageCollectNow(); > - FontCache::singleton().invalidate(); > + // FIXME: What about FontCaches on worker threads? > + FontCache::forCurrentThread().invalidate(); > MemoryCache::singleton().setDisabled(true); > #endif Question for whoever originally wrote this: Why is doing this work at termination, but only in debug versions, a good idea? Lack of a comment here makes this mysterious! (In reply to Darin Adler from comment #6) > What’s the performance impact of fetching thread global data in all these > places, where many were accessing a local variable or data member before? On iOS, the threadGlobalData() function does this, in the common case of already being initialized: * Reads a static variable (which doesn't have a complex initializer) that holds a pointer to a ThreadSpecific<RefPtr<ThreadGlobalData>> * Jumps past the initialization code * Calls into ThreadSpecific::get on the object it points to, which: * calls pthread_getspecific, which is three instructions long: _pthread_getspecific: 1b8c: 68 d0 3b d5 mrs x8, TPIDRRO_EL0 1b90: 00 79 60 f8 ldr x0, [x8, x0, lsl #3] 1b94: c0 03 5f d6 ret * Dereferences the pointer returned by ThreadSpecific::get() * Jumps past its initialization since it's already been created * Returns the pointer to the ThreadGlobalData On macOS, it does similar, though slightly less: * Reads a static variable (which doesn't have a complex initializer) that holds a pointer to a ThreadSpecific<ThreadGlobalData> * Jumps past the initialization code * Calls into ThreadSpecific::get on the object it points to, which: * calls pthread_getspecific, which is two instructions long: _pthread_getspecific: 1f29: 65 48 8b 04 fd 00 00 00 00 movq %gs:(,%rdi,8), %rax 1f32: c3 retq * Returns the pointer returned by ThreadSpecific::get() So I think it's not much more than the current static singleton code, and also would have to be balanced against the stack pushing and popping to pass FontCache through all of the functions that would need access to it. > > Source/WebCore/page/SettingsBase.cpp:61 > > + // FIXME: Call invalidateFontCascadeCache() on all workers. > > + FontCache::forCurrentThread().invalidateFontCascadeCache(); > > Can we do this through a FontCache::forEachFontCache function, analogous to > Page::forEachPage, or do we need to keep the registry of "all" somewhere > outside the FontCache class? > > One reason I ask is that if we know we are going to use that interface, we > could deploy it now while touching each of these call sites even if right > now it only did the current thread one. Then we’d have fewer FIXME in this > patch. I haven't thought how exactly I'm going to the function that loops over the FontCaches, since they'll each exist in a different thread, and there'll have to be some kind of runnable dispatch to each. But probably we'd iterate over all of the WorkerOrWorkletThreads (since we already have them all in WorkerOrWorkletThread::workerOrWorkletThreads()). > > Source/WebCore/platform/ThreadGlobalData.cpp:68 > > + if (m_fontCache) > > + m_fontCache->invalidate(); > > What destroys the font cache? Does calling invalidate() also null out > m_fontCache? Maybe I have to. There are some other fields on ThreadGlobalScope that aren't cleared in destroy(), and maybe they should be too. I have a feeling now that the ThreadGlobalObjects themselves are never destroyed, which makes me wonder how the memory allocated for them is freed? If we keep creating and destroying threads are we leaking ThreadGlobalObjects? > > Source/WebKit/WebProcess/WebProcess.cpp:873 > > #ifndef NDEBUG > > GCController::singleton().garbageCollectNow(); > > - FontCache::singleton().invalidate(); > > + // FIXME: What about FontCaches on worker threads? > > + FontCache::forCurrentThread().invalidate(); > > MemoryCache::singleton().setDisabled(true); > > #endif > > Question for whoever originally wrote this: Why is doing this work at > termination, but only in debug versions, a good idea? Lack of a comment here > makes this mysterious! I don't know either. (In reply to Cameron McCormack (:heycam) from comment #7) > So I think it's not much more than the current static singleton code, and > also would have to be balanced against the stack pushing and popping to pass > FontCache through all of the functions that would need access to it. I suppose we can do a quick performance test to be sure nothing measurable regressed. (In reply to Cameron McCormack (:heycam) from comment #7) > > > Source/WebKit/WebProcess/WebProcess.cpp:873 > > > #ifndef NDEBUG > > > GCController::singleton().garbageCollectNow(); > > > - FontCache::singleton().invalidate(); > > > + // FIXME: What about FontCaches on worker threads? > > > + FontCache::forCurrentThread().invalidate(); > > > MemoryCache::singleton().setDisabled(true); > > > #endif > > > > Question for whoever originally wrote this: Why is doing this work at > > termination, but only in debug versions, a good idea? Lack of a comment here > > makes this mysterious! > > I don't know either. We should try just deleting this code. We know it won’t affect the production build since it’s debug-only, so any problems would have to be seen by people doing development or bots testing debug builds. (In reply to Darin Adler from comment #9) > We should try just deleting this code. We know it won’t affect the > production build since it’s debug-only, so any problems would have to be > seen by people doing development or bots testing debug builds. https://commits.webkit.org/136348@main claims it reduces "LEAK" output lines. Sounds plausible though I don't think I've never seen any "LEAK" output lines so I don't know whether that still works or if I need to enable it explicitly. (In reply to Cameron McCormack (:heycam) from comment #7) > > > Source/WebCore/platform/ThreadGlobalData.cpp:68 > > > + if (m_fontCache) > > > + m_fontCache->invalidate(); > > > > What destroys the font cache? Does calling invalidate() also null out > > m_fontCache? > > Maybe I have to. There are some other fields on ThreadGlobalScope that > aren't cleared in destroy(), and maybe they should be too. > > I have a feeling now that the ThreadGlobalObjects themselves are never > destroyed, which makes me wonder how the memory allocated for them is freed? > If we keep creating and destroying threads are we leaking > ThreadGlobalObjects? From code inspection, I think ThreadGlobaData objects get destroyed by virtue of having a destructor function registered as part of the pthread_create_key call, which will will be called upon thread exit. So I'm no longer sure why any of this work is done under ThreadGlobalData::destroy(). From https://commits.webkit.org/50098@main it sounds like there was a reason to clear things in destroy() -- to ensure that AtomStrings were all destroyed before the atom string table itself is (as part of destroying WTFThreadData). (In reply to Cameron McCormack (:heycam) from comment #7) > > > Source/WebCore/platform/ThreadGlobalData.cpp:68 > > > + if (m_fontCache) > > > + m_fontCache->invalidate(); > > > > What destroys the font cache? Does calling invalidate() also null out > > m_fontCache? > > Maybe I have to. There are some other fields on ThreadGlobalScope that > aren't cleared in destroy(), and maybe they should be too. It’s entirely possible that the nulling out is only needed for cases where the order of destruction matters. > I have a feeling now that the ThreadGlobalObjects themselves are never > destroyed Even though it was my comment that engendered that feeling, I think it is wrong. Seems like they do because ThreadSpecific::destroy gets called by the platform threading machinery, which then calls delete on the ThreadSpecific's object, which is a ThreadGlobalData. Oh, I see you figured this things out above. (In reply to Cameron McCormack (:heycam) from comment #10) > https://commits.webkit.org/136348@main claims it reduces "LEAK" output > lines. Sounds plausible though I don't think I've never seen any "LEAK" > output lines so I don't know whether that still works or if I need to enable > it explicitly. I think we may have removed the LEAK thing in the last couple of years. (In reply to Cameron McCormack (:heycam) from comment #12) > From https://commits.webkit.org/50098@main it sounds like there was a reason > to clear things in destroy() -- to ensure that AtomStrings were all > destroyed before the atom string table itself is (as part of destroying > WTFThreadData). Either you or I should help out future programmers by turning that into a comment in destroy(). In my view, this is exactly what comments are for. Adding the "why" to code. The code itself should be very clear on "what". I'll add that comment. Re performance I ran PLT5 and got no change. (In reply to Cameron McCormack (:heycam) from comment #17) > I'll add that comment. > > Re performance I ran PLT5 and got no change. MM might be a better pick to test for pref regression. (In reply to zalan from comment #18) > MM might be a better pick to test for pref regression. MotionMark shows no change too. (In reply to Cameron McCormack (:heycam) from comment #19) > (In reply to zalan from comment #18) > > MM might be a better pick to test for pref regression. > > MotionMark shows no change too. 👍 Created attachment 446089 [details]
Patch
Created attachment 446095 [details]
Patch
Comment on attachment 446095 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=446095&action=review > Source/WebCore/ChangeLog:19 > + fontCacheFallingBackToSingleton and wrongly get back the main thread's "incorrectly" > Source/WebCore/ChangeLog:24 > + ThreadGlobalData, which is easily globally accessible. Performing a This sounds like a good idea. Created attachment 446234 [details]
Patch
Committed r286625 (244938@main): <https://commits.webkit.org/244938@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 446234 [details]. |