RESOLVED FIXED 233851
[libpas] Set pthread_setspecific with marker in TLS destructor to detect TLS is destroyed
https://bugs.webkit.org/show_bug.cgi?id=233851
Summary [libpas] Set pthread_setspecific with marker in TLS destructor to detect TLS ...
Yusuke Suzuki
Reported 2021-12-03 19:22:41 PST
[libpas] Set pthread_setspecific with marker in TLS destructor to detect TLS is destroyed
Attachments
Patch (22.57 KB, patch)
2021-12-03 19:35 PST, Yusuke Suzuki
no flags
Patch (22.70 KB, patch)
2021-12-03 19:41 PST, Yusuke Suzuki
ews-feeder: commit-queue-
Patch (23.17 KB, patch)
2021-12-03 19:53 PST, Yusuke Suzuki
no flags
Patch (24.77 KB, patch)
2021-12-03 20:52 PST, Yusuke Suzuki
no flags
Patch (15.31 KB, patch)
2021-12-06 13:19 PST, Yusuke Suzuki
no flags
Patch (16.20 KB, patch)
2021-12-06 13:27 PST, Yusuke Suzuki
mark.lam: review+
Yusuke Suzuki
Comment 1 2021-12-03 19:35:24 PST
Yusuke Suzuki
Comment 2 2021-12-03 19:41:55 PST
Yusuke Suzuki
Comment 3 2021-12-03 19:53:47 PST
Yusuke Suzuki
Comment 4 2021-12-03 20:52:51 PST
Mark Lam
Comment 5 2021-12-06 13:06:19 PST
Comment on attachment 445953 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=445953&action=review > Source/bmalloc/libpas/src/libpas/pas_deallocate.h:59 > thread_local_cache = pas_thread_local_cache_try_get(); > - if (!thread_local_cache) { > + if (pas_thread_local_cache_is_null(thread_local_cache)) { It looks like in almost all places, you're calling pas_thread_local_cache_try_get(), and then calling pas_thread_local_cache_is_null() on its result. Is there any reason why you didn't just call pas_thread_local_cache_is_null() in pas_thread_local_cache_try_get(), and return a NULL if pas_thread_local_cache_is_null() is true? That way, a lot of these pieces of client code don't need to change, and it is less error prone (new client code won't be vulnerable to forgetting to call pas_thread_local_cache_is_null())? > Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.c:94 > + static const bool verbose = false; nit: constexpr instead of const?
Yusuke Suzuki
Comment 6 2021-12-06 13:07:51 PST
Comment on attachment 445953 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=445953&action=review >> Source/bmalloc/libpas/src/libpas/pas_deallocate.h:59 >> + if (pas_thread_local_cache_is_null(thread_local_cache)) { > > It looks like in almost all places, you're calling pas_thread_local_cache_try_get(), and then calling pas_thread_local_cache_is_null() on its result. Is there any reason why you didn't just call pas_thread_local_cache_is_null() in pas_thread_local_cache_try_get(), and return a NULL if pas_thread_local_cache_is_null() is true? That way, a lot of these pieces of client code don't need to change, and it is less error prone (new client code won't be vulnerable to forgetting to call pas_thread_local_cache_is_null())? OK, maybe it is better will check >> Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.c:94 >> + static const bool verbose = false; > > nit: constexpr instead of const? Cannot use it since it is C (Not C++).
Yusuke Suzuki
Comment 7 2021-12-06 13:19:58 PST
Yusuke Suzuki
Comment 8 2021-12-06 13:27:16 PST
Mark Lam
Comment 9 2021-12-06 14:19:52 PST
Comment on attachment 446075 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=446075&action=review > Source/bmalloc/ChangeLog:19 > + 1. When destroying TLS, we set a marker (PAS_THREAD_LOCAL_CACHE_DESTROYED in this case) to TLS. /to TLS/in TLS/ > Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.c:101 > + /* If pthread_self_is_exiting_np does not exist, we set PAS_THREAD_LOCAL_CACHE_DESTROYED to the TLS so that /to the TLS/in the TLS/.
Mark Lam
Comment 10 2021-12-06 14:46:27 PST
Comment on attachment 446080 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=446080&action=review Nice. r=me > Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.c:101 > + /* If pthread_self_is_exiting_np does not exist, we set PAS_THREAD_LOCAL_CACHE_DESTROYED to the TLS so that /to the TLS/in the TLS/ > Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.c:103 > + subsequent calls of pas_thread_local_cache_try_get() can detect whether TLS is destroyed. Since it is not > + NULL, this destructor will be called once again later. We repeat invoking this destructor PTHREAD_DESTRUCTOR_ITERATIONS times. */ I suggest rephrasing the last 2 sentences (starting with "Since it is not ...") as follows: Since PAS_THREAD_LOCAL_CACHE_DESTROYED is a non-null value, pthread will call this destructor again (up to PTHREAD_DESTRUCTOR_ITERATIONS times). Each time it does, it will clear the TLS entry. Hence, we need to re-set PAS_THREAD_LOCAL_CACHE_DESTROYED in the TLS each time to continue to indicate that destroy() has already been called once.
Yusuke Suzuki
Comment 11 2021-12-06 14:51:47 PST
Comment on attachment 446080 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=446080&action=review Thank you! >> Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.c:101 >> + /* If pthread_self_is_exiting_np does not exist, we set PAS_THREAD_LOCAL_CACHE_DESTROYED to the TLS so that > > /to the TLS/in the TLS/ Fixed. >> Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.c:103 >> + NULL, this destructor will be called once again later. We repeat invoking this destructor PTHREAD_DESTRUCTOR_ITERATIONS times. */ > > I suggest rephrasing the last 2 sentences (starting with "Since it is not ...") as follows: > Since PAS_THREAD_LOCAL_CACHE_DESTROYED is a non-null value, pthread will call this destructor again (up to PTHREAD_DESTRUCTOR_ITERATIONS times). Each time it does, it will clear the TLS entry. Hence, we need to re-set PAS_THREAD_LOCAL_CACHE_DESTROYED in the TLS each time to continue to indicate that destroy() has already been called once. Nice, changed.
Yusuke Suzuki
Comment 12 2021-12-06 14:54:37 PST
Radar WebKit Bug Importer
Comment 13 2021-12-06 14:55:24 PST
Filip Pizlo
Comment 14 2021-12-06 19:29:04 PST
Comment on attachment 446080 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=446080&action=review This change is so gloriously evil! I love it! > Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.c:106 > +#endif I would have put a comment like "/* !defined(PAS_THREAD_LOCAL_CACHE_CAN_DETECT_THREAD_EXIT) */" after the #endif, just to make it easier to tell what the #endif is for. > Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.h:111 > +static inline pas_thread_local_cache* pas_thread_local_cache_try_get(void) > { > -#ifdef PAS_THREAD_LOCAL_CACHE_CAN_DETECT_THREAD_EXIT > - return true; > -#else > - return false; > + pas_thread_local_cache* cache = pas_thread_local_cache_try_get_impl(); > +#ifndef PAS_THREAD_LOCAL_CACHE_CAN_DETECT_THREAD_EXIT > + if (((uintptr_t)cache) == PAS_THREAD_LOCAL_CACHE_DESTROYED) > + return NULL; > #endif > + return cache; > } I'm a bit worried that this introduces an extra branch on fast paths. Is there any way to accomplish this without using the DESTROYED marker inside the fast TLS? Maybe using a separate key? (If there is a perf problem here then it's probably very very small, so I wouldn't worry about this too much.)
Yusuke Suzuki
Comment 15 2021-12-06 21:58:06 PST
Comment on attachment 446080 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=446080&action=review >> Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.h:111 >> } > > I'm a bit worried that this introduces an extra branch on fast paths. > > Is there any way to accomplish this without using the DESTROYED marker inside the fast TLS? Maybe using a separate key? > > (If there is a perf problem here then it's probably very very small, so I wouldn't worry about this too much.) One way is that, reverting the original change back from https://bugs.webkit.org/attachment.cgi?id=445953&action=review. In the original patch, we are using pas_thread_local_cache_is_null, which implementation is, return (uintptr_t)(cache) <= PAS_THREAD_LOCAL_CACHE_DESTROYED; so, we can filter out null and this destroying case with one check. And we used it instead of old NULL check, so literally, it was just replacing NULL branching with that comparison branching. What do you think of?
Filip Pizlo
Comment 16 2021-12-07 09:48:19 PST
(In reply to Yusuke Suzuki from comment #15) > Comment on attachment 446080 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=446080&action=review > > >> Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.h:111 > >> } > > > > I'm a bit worried that this introduces an extra branch on fast paths. > > > > Is there any way to accomplish this without using the DESTROYED marker inside the fast TLS? Maybe using a separate key? > > > > (If there is a perf problem here then it's probably very very small, so I wouldn't worry about this too much.) > > One way is that, reverting the original change back from > https://bugs.webkit.org/attachment.cgi?id=445953&action=review. > In the original patch, we are using pas_thread_local_cache_is_null, which > implementation is, > > return (uintptr_t)(cache) <= PAS_THREAD_LOCAL_CACHE_DESTROYED; > > so, we can filter out null and this destroying case with one check. And we > used it instead of old NULL check, so literally, it was just replacing NULL > branching with that comparison branching. > What do you think of? That's not bad! What about using a separate (and not fast TLS) key for this? It would work like this: - Creating a thread_local_cache sets that key to 1. - Destroying that key sets it to 2. - is_exiting is emulated by checking if that thing is 2. Thinking about it more, I'm not sure if it's actually better, since it has a loophole: if the thread_local_cache is created for the first time during the last round of pthread's destructor loop, then we'll leak a thread_local_cache. So, I'm not sure I'd actually recommend that. Maybe your way is better.
Yusuke Suzuki
Comment 17 2021-12-07 10:03:27 PST
Comment on attachment 446080 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=446080&action=review >>>> Source/bmalloc/libpas/src/libpas/pas_thread_local_cache.h:111 >>>> } >>> >>> I'm a bit worried that this introduces an extra branch on fast paths. >>> >>> Is there any way to accomplish this without using the DESTROYED marker inside the fast TLS? Maybe using a separate key? >>> >>> (If there is a perf problem here then it's probably very very small, so I wouldn't worry about this too much.) >> >> One way is that, reverting the original change back from https://bugs.webkit.org/attachment.cgi?id=445953&action=review. >> In the original patch, we are using pas_thread_local_cache_is_null, which implementation is, >> >> return (uintptr_t)(cache) <= PAS_THREAD_LOCAL_CACHE_DESTROYED; >> >> so, we can filter out null and this destroying case with one check. And we used it instead of old NULL check, so literally, it was just replacing NULL branching with that comparison branching. >> What do you think of? > > That's not bad! > > What about using a separate (and not fast TLS) key for this? > > It would work like this: > > - Creating a thread_local_cache sets that key to 1. > - Destroying that key sets it to 2. > - is_exiting is emulated by checking if that thing is 2. > > Thinking about it more, I'm not sure if it's actually better, since it has a loophole: if the thread_local_cache is created for the first time during the last round of pthread's destructor loop, then we'll leak a thread_local_cache. So, I'm not sure I'd actually recommend that. Maybe your way is better. I checked all callsites of pas_thread_local_cache_try_get, and all places are doing NULL check after that. So, it means that, if clang is smart enough, clang can combine the current 0 or 1 check into one comparison effectively, and probably, with this patch, it does not introduce any new branching. I'll check it :)
Note You need to log in before you can comment on or make changes to this bug.