RESOLVED FIXED Bug 228240
3.5 MB system-wide footprint impact due to thread-locals in libANGLE
https://bugs.webkit.org/show_bug.cgi?id=228240
Summary 3.5 MB system-wide footprint impact due to thread-locals in libANGLE
Dean Jackson
Reported 2021-07-23 11:28:58 PDT
From the macOS system memory folks: Because of the way thread-locals are implemented in dyld today, they are dirty on process launch, even if the process does not use the functionality provided by your framework at all. /System/Library/PrivateFrameworks/WebCore.framework/Frameworks/libANGLE-shared.dylib has recently added these symbols: 100.0%: __ZN2gl20gCurrentValidContextE 97.5%: __ZN3egl14gCurrentThreadE (gl::gCurrentValidContext and egl::gCurrentThread) Which are thread-local. This is a regression in macOS Monterey. Is it possible to switch to a different mechanism, or to use pthread TLVs manually instead of thread-locals? Because of this, libANGLE-shared used to be dirty in 172 processes and is now dirtied in 355. Overall it is a 3.5MB regression.
Attachments
WIP (8.52 KB, patch)
2021-07-23 11:32 PDT, Dean Jackson
no flags
Patch (10.27 KB, patch)
2021-07-23 17:23 PDT, Dean Jackson
kbr: review+
Dean Jackson
Comment 1 2021-07-23 11:29:19 PDT
Dean Jackson
Comment 2 2021-07-23 11:32:21 PDT
Created attachment 434103 [details] WIP Work in progress to get review. Please ignore the change to WebKit networking code - that was to help a local compile. I'm not confident in this patch because: - I've never used pthread_key_create before - I'm worried about a performance regression in querying state on every access (which happens a lot) - I want a better place to initialise and delete the keys - This is an invasive patch that will need to be applied each time we update ANGLE Eventually we'll fix dyld's thread_local implementation.
EWS Watchlist
Comment 3 2021-07-23 11:33:17 PDT
Note that there are important steps to take when updating ANGLE. See https://trac.webkit.org/wiki/UpdatingANGLE
Kenneth Russell
Comment 4 2021-07-23 12:02:35 PDT
Comment on attachment 434103 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=434103&action=review Looks good overall especially if this significantly reduces binary size, though it's unfortunate we can't use the built-in mechanisms. A few questions / comments, but to unblock this work, r+ now. > Source/ThirdParty/ANGLE/src/libANGLE/Context.cpp:302 > +void FreeCurrentValidContext() This and the function above are a little confusingly named. Could you consider: InitializeCurrentValidContextTLS FreeCurrentValidContextTLS ? I also noticed that FreeCurrentValidContext isn't called from anywhere - which is fine because it can really only be called upon unloading of ANGLE or application shutdown. Should a comment be added about when it could plausibly be used? > Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:49 > + if (!initialized) { Is it necessary to do this double-checking? The documentation indicates that the correct usage is just to call dispatch_once unconditionally. https://developer.apple.com/documentation/dispatch/1447169-dispatch_once > Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:99 > +bool InitializeCurrentThreadTLS() The naming convention between the CurrentThread and CurrentValidContext should match. So maybe here: InitializeCurrentThreadTLSIndex FreeCurrentThreadTLSIndex GetCurrentThreadTLS SetCurrentThreadTLS and use TLSIndex / TLS in the CurrentValidContext functions above. > Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:119 > + if (!initialized) { Same question about the double-checked initialization, though I do see this is on the hot path.
Geoffrey Garen
Comment 5 2021-07-23 12:32:32 PDT
Comment on attachment 434103 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=434103&action=review I appreciate Kenneth's review but I'm gonna be a little more explicit and say r- because I see a bug. > Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:47 > + static dispatch_once_t once = 0; Better to leave off the = 0 and accept whatever the platform says is the default linker-initialized value. >> Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:49 >> + static bool initialized = false; >> + if (!initialized) { > > Is it necessary to do this double-checking? The documentation indicates that the correct usage is just to call dispatch_once unconditionally. > https://developer.apple.com/documentation/dispatch/1447169-dispatch_once Kenneth is right. This is a bug. Because 'initialized' lacks any synchronization or memory barrier, it can read as true when it is still logically false. I suggest removing the initialized variable and always calling dispatch_once. dispatch_once is fast (and inlined) in the "already dispatched" case. >> Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:119 >> + if (!initialized) { > > Same question about the double-checked initialization, though I do see this is on the hot path. Same comments as above. > Source/WebKit/NetworkProcess/webrtc/NetworkRTCUDPSocketCocoa.mm:282 > +#if defined(NW_HAS_SHARE_LISTENER_PORT) && NW_HAS_SHARE_LISTENER_PORT && 0 Revert?
Dean Jackson
Comment 6 2021-07-23 17:07:00 PDT
Comment on attachment 434103 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=434103&action=review >> Source/ThirdParty/ANGLE/src/libANGLE/Context.cpp:302 >> +void FreeCurrentValidContext() > > This and the function above are a little confusingly named. Could you consider: > > InitializeCurrentValidContextTLS > FreeCurrentValidContextTLS > ? > > I also noticed that FreeCurrentValidContext isn't called from anywhere - which is fine because it can really only be called upon unloading of ANGLE or application shutdown. Should a comment be added about when it could plausibly be used? Done! (The reason I prefixed the thread form with TLS was because there was already a method called GetCurrentThread()). >> Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:47 >> + static dispatch_once_t once = 0; > > Better to leave off the = 0 and accept whatever the platform says is the default linker-initialized value. Done >>> Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:49 >>> + if (!initialized) { >> >> Is it necessary to do this double-checking? The documentation indicates that the correct usage is just to call dispatch_once unconditionally. >> https://developer.apple.com/documentation/dispatch/1447169-dispatch_once > > Kenneth is right. This is a bug. Because 'initialized' lacks any synchronization or memory barrier, it can read as true when it is still logically false. > > I suggest removing the initialized variable and always calling dispatch_once. dispatch_once is fast (and inlined) in the "already dispatched" case. Removed (although, question... even if it was read as true, the code would still hit the dispatch_once and sync there right?) >> Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:99 >> +bool InitializeCurrentThreadTLS() > > The naming convention between the CurrentThread and CurrentValidContext should match. So maybe here: > > InitializeCurrentThreadTLSIndex > FreeCurrentThreadTLSIndex > GetCurrentThreadTLS > SetCurrentThreadTLS > > and use TLSIndex / TLS in the CurrentValidContext functions above. Done. >>> Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:119 >>> + if (!initialized) { >> >> Same question about the double-checked initialization, though I do see this is on the hot path. > > Same comments as above. Done. I also moved the initialization call into the Get/Set function to make it less confusing.
Dean Jackson
Comment 7 2021-07-23 17:23:27 PDT
Kenneth Russell
Comment 8 2021-07-23 17:43:08 PDT
Comment on attachment 434148 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434148&action=review This revision looks better to me. Since Geoff had concerns earlier, please sync up with him too. r+ with two small nits. > Source/ThirdParty/ANGLE/src/libANGLE/Context.cpp:297 > +TLSIndex CurrentValidContextIndex = TLS_INVALID_INDEX; Nit: did you want to put this in an anonymous namespace, make it static, etc. to hide the symbol's visibility? > Source/ThirdParty/ANGLE/src/libGLESv2/global_state.cpp:94 > +TLSIndex CurrentThreadIndex = TLS_INVALID_INDEX; Same question here.
Geoffrey Garen
Comment 9 2021-07-26 11:02:37 PDT
Comment on attachment 434148 [details] Patch r=me too; also agree that we should use static for CurrentValidContextIndex and CurrentThreadIndex.
Dean Jackson
Comment 10 2021-07-27 11:06:30 PDT
Comment on attachment 434148 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434148&action=review >> Source/ThirdParty/ANGLE/src/libANGLE/Context.cpp:297 >> +TLSIndex CurrentValidContextIndex = TLS_INVALID_INDEX; > > Nit: did you want to put this in an anonymous namespace, make it static, etc. to hide the symbol's visibility? Yes, thanks!
Dean Jackson
Comment 11 2021-07-27 11:06:31 PDT
Comment on attachment 434148 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=434148&action=review >> Source/ThirdParty/ANGLE/src/libANGLE/Context.cpp:297 >> +TLSIndex CurrentValidContextIndex = TLS_INVALID_INDEX; > > Nit: did you want to put this in an anonymous namespace, make it static, etc. to hide the symbol's visibility? Yes, thanks!
Dean Jackson
Comment 12 2021-07-27 14:36:42 PDT
Note You need to log in before you can comment on or make changes to this bug.