Bug 225099

Summary: [WPE][GTK] More correct fixes for stack size issues on musl libc
Product: WebKit Reporter: Daniel Kolesa <dkolesa>
Component: Web Template FrameworkAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: aperez, benjamin, cdumez, clopez, cmarcelo, ews-watchlist, keith_miller, mark.lam, msaboff, pmatos, psaavedra, raj.khem, saam, tzagallo, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Local Build   
Hardware: Other   
OS: Linux   
Bug Depends on:    
Bug Blocks: 187485, 226244    
Attachments:
Description Flags
patch
none
fixed style, typos
mark.lam: review+
use main stack boundary logic on linux universally
none
fixed patch
mark.lam: review+
fixed changelog and other comment typos none

Daniel Kolesa
Reported 2021-04-27 06:27:12 PDT
[WPE][GTK] More correct fixes for stack size issues on musl libc Partial revert https://bugs.webkit.org/show_bug.cgi?id=210068 While the changes in r236306 stopped JSC from crashing outright, they are not correct, since they also make it rather unstable. To counter this, increase stack size for threads on musl to 1M, to match other systems but not be overly greedy like e.g. glibc. While at it, the previous approach to musl thread stack size was breaking use of DEFAULT_THREAD_STACK_SIZE_IN_KB (if defined) as well as not properly taking care of the unused parameter. Move the code to a more appropriate place, which solves these problems. All this is however not enough, since there is still the main thread; on musl, attempting to get the stack size of main thread via the API provided by pthreads results in the value of 128k, which is the reserved kernel size. WebKit then uses this to check the stack size of all threads, including main. That results in small stack bounds for the main thread, which results in JSC thinking it has overflown the stack, and crashes many websites. In order to fix this, we need to do something similar to what is already being done for Drawin - use the resource limits interface for main thread instead. Finally, drop the special casing in JSC options. It is no longer necessary - softReservedZoneSize was causing a crash at a value of 128K because of the main thread stack bounds, and this is now anon-issue. We can keep the maxPerThreadStackUsage at 5M as well; there is no fundamental difference from how things are done on glibc anymore.
Attachments
patch (10.22 KB, patch)
2021-04-27 06:31 PDT, Daniel Kolesa
no flags
fixed style, typos (10.22 KB, patch)
2021-04-27 06:38 PDT, Daniel Kolesa
mark.lam: review+
use main stack boundary logic on linux universally (10.16 KB, patch)
2021-04-27 11:09 PDT, Daniel Kolesa
no flags
fixed patch (10.20 KB, patch)
2021-04-27 11:12 PDT, Daniel Kolesa
mark.lam: review+
fixed changelog and other comment typos (10.12 KB, patch)
2021-04-27 15:46 PDT, Daniel Kolesa
no flags
Daniel Kolesa
Comment 1 2021-04-27 06:31:09 PDT
Created attachment 427141 [details] patch Some comments for potential review: - I considered using WTF::isMainThread() but decided against it; since this piece of code is specific to Linux, I thought it'd be best to use the lowest level API in order to avoid abstractions potentially causing issues and stay robust - Ideally I'd like to change the '#if OS(LINUX) && !defined(__BIONIC__) && !defined(__GLIBC__)' to just '#if OS(LINUX)' in StackBounds.cpp. Common paths are better tested, and this should not hurt non-musl systems; finally, fewer ifdefs are better than more. I included a strict conditional for now to be conservative. I'll leave it up to reviewer which path we'll end up taking.
Daniel Kolesa
Comment 2 2021-04-27 06:38:41 PDT
Created attachment 427144 [details] fixed style, typos
Adrian Perez
Comment 3 2021-04-27 07:15:56 PDT
Comment on attachment 427144 [details] fixed style, typos Nice one, thanks! Patch LGTM, but I also think it would be positive to remove conditional compilation guards and use the same code with the different libcs; to avoid having one code path less exercised than the others. I will try to get some JSC developer to also take a look at this and give the final r+; it seems like a good idea to have another pair of eyes go over this patch. View in context: https://bugs.webkit.org/attachment.cgi?id=427144&action=review > Source/WTF/wtf/StackBounds.cpp:125 > + if (getpid() == static_cast<pid_t>(syscall(SYS_gettid))) { I would be in favor of moving this check into a Linux-specific implementation of WTF::isMainThread(). Currently Linux uses the implementation from MainThreadGeneric.cpp but pthread_main_np() is not available on Linux, so if something attempts to use it before a call to WTF::initializeThreadPlatform() has been made e.g. constructors it will most certainly return “false” because the “mainThread” variable would have not been yet initialized. Nevertheless, I think that could be in a follow-up patch, and unless somebody has a different opinion, should not block landing this.
Daniel Kolesa
Comment 4 2021-04-27 07:20:44 PDT
I should probably adjust the comments a bit more to clarify things - notably, it's not that musl always returns 128k, it actually calculates the currently available size (which happens to be 128k as that's what the initial size is at the point; this was confirmed by the musl author, and that it's intended behavior) so the solutions are really two - either do what i'm doing (but update the comment) or pre-poke the stack (which i don't think is necessary - it's fine that it grows, it's just that the stack boundary is created once on startup and never updated later, so it should represent the maximum rather than current) anyway, after someone involved with JSC confirms it's fine, I will do a final update on the patch along with other potential review changes.
Khem Raj
Comment 5 2021-04-27 07:33:04 PDT
Comment on attachment 427144 [details] fixed style, typos thanks for this patch, I have tried it and seems to work ok in my case so far.
Mark Lam
Comment 6 2021-04-27 10:16:51 PDT
Comment on attachment 427144 [details] fixed style, typos View in context: https://bugs.webkit.org/attachment.cgi?id=427144&action=review r=me with suggestion for USE(MUSL). Feel free to do that as clean up in a follow up patch. > Source/WTF/wtf/StackBounds.cpp:117 > +#if OS(LINUX) && !defined(__BIONIC__) && !defined(__GLIBC__) I see this set of conditionals used in more than 1 place (3 in this patch). Would it be better to define USE_MUSL in PlatformUse.h and check #if USE(MUSL) here (and in the other places)? Is this code only needed for musl?
Mark Lam
Comment 7 2021-04-27 10:23:02 PDT
(In reply to Daniel Kolesa from comment #4) > I should probably adjust the comments a bit more to clarify things - > notably, it's not that musl always returns 128k, it actually calculates the > currently available size (which happens to be 128k as that's what the > initial size is at the point; this was confirmed by the musl author, and > that it's intended behavior) > > so the solutions are really two - either do what i'm doing (but update the > comment) or pre-poke the stack (which i don't think is necessary - it's fine > that it grows, it's just that the stack boundary is created once on startup > and never updated later, so it should represent the maximum rather than > current) > > anyway, after someone involved with JSC confirms it's fine, I will do a > final update on the patch along with other potential review changes. FYI, the StackBounds is expected to be constant. Client code can cache it to avoid continually probing for the bounds. One can expect that whatever it returns the first time is the bounds the code will use for the lifecycle of the process. StackBounds should never lie and claim to have more stack than is actually available. The downside of doing so is that it will crash when the client use stack memory that isn't actually there. And yes, adding the details to the ChangeLog will probably be helpful as future readers of this code will be consulting it for the rationale of your implementation.
Daniel Kolesa
Comment 8 2021-04-27 10:38:27 PDT
> FYI, the StackBounds is expected to be constant. Client code can cache it > to avoid continually probing for the bounds. One can expect that whatever > it returns the first time is the bounds the code will use for the lifecycle > of the process. StackBounds should never lie and claim to have more stack > than is actually available. The downside of doing so is that it will crash > when the client use stack memory that isn't actually there. > > And yes, adding the details to the ChangeLog will probably be helpful as > future readers of this code will be consulting it for the rationale of your > implementation. well, that's the assumption we're working with in this patch - for main thread, it's initialized once, to the size the stack can have at maximum on glibc and most other libc's, pthread_attr_getstack will already return the maximum, while on musl, it returns the current size (it will automatically grow though, unlike thread stacks which never grow further) > I see this set of conditionals used in more than 1 place (3 in this patch). > Would it be better to define USE_MUSL in PlatformUse.h and check #if > USE(MUSL) here (and in the other places)? Is this code only needed for musl? that's the thing - it's technically not needed on glibc/bionic (since the pthread api already returns what we expect), but if we also enabled it for glibc/bionic, it wouldn't hurt anything which is why i'd perhaps like to change the conditional to just '#if OS(LINUX)' and use it everywhere, to avoid needless platform checks that conditional will obviously catch other libc's than musl as well; there's no telling what kind of behavior they will actually have - so we should play it safe for those either way musl has no test macro of its own, intentionally (in order to encourage people to write portable code)
Mark Lam
Comment 9 2021-04-27 10:43:53 PDT
(In reply to Daniel Kolesa from comment #8) > > I see this set of conditionals used in more than 1 place (3 in this patch). > > Would it be better to define USE_MUSL in PlatformUse.h and check #if > > USE(MUSL) here (and in the other places)? Is this code only needed for musl? > > that's the thing - it's technically not needed on glibc/bionic (since the > pthread api already returns what we expect), but if we also enabled it for > glibc/bionic, it wouldn't hurt anything > > which is why i'd perhaps like to change the conditional to just '#if > OS(LINUX)' and use it everywhere, to avoid needless platform checks > > that conditional will obviously catch other libc's than musl as well; > there's no telling what kind of behavior they will actually have - so we > should play it safe for those either way > > musl has no test macro of its own, intentionally (in order to encourage > people to write portable code) Can't you add this in PlatformUse.h: #if OS(LINUX) && !defined(__BIONIC__) && !defined(__GLIBC__) #define USE_MUSL #endif ... and check USE(MUSL) in all the places instead? That is unless you intend to use this code for other than musl. Anyway, I just think it's error prone to always have a long string of conditionals like that that needs to be copy-pasted to multiple places. That's my $.02.
Daniel Kolesa
Comment 10 2021-04-27 10:48:19 PDT
(In reply to Mark Lam from comment #9) > (In reply to Daniel Kolesa from comment #8) > > > I see this set of conditionals used in more than 1 place (3 in this patch). > > > Would it be better to define USE_MUSL in PlatformUse.h and check #if > > > USE(MUSL) here (and in the other places)? Is this code only needed for musl? > > > > that's the thing - it's technically not needed on glibc/bionic (since the > > pthread api already returns what we expect), but if we also enabled it for > > glibc/bionic, it wouldn't hurt anything > > > > which is why i'd perhaps like to change the conditional to just '#if > > OS(LINUX)' and use it everywhere, to avoid needless platform checks > > > > that conditional will obviously catch other libc's than musl as well; > > there's no telling what kind of behavior they will actually have - so we > > should play it safe for those either way > > > > musl has no test macro of its own, intentionally (in order to encourage > > people to write portable code) > > Can't you add this in PlatformUse.h: > > #if OS(LINUX) && !defined(__BIONIC__) && !defined(__GLIBC__) > #define USE_MUSL > #endif > > ... and check USE(MUSL) in all the places instead? That is unless you > intend to use this code for other than musl. Anyway, I just think it's > error prone to always have a long string of conditionals like that that > needs to be copy-pasted to multiple places. That's my $.02. that won't be true, because that doesn't check for musl, it checks for every c library except glibc and bionic (so also e.g. uclibc and so on) as i said, the best thing would probably be to use the the code universally, without checking which libc it is; it should work everywhere as is, so there is no need to check for libc here
Mark Lam
Comment 11 2021-04-27 10:51:32 PDT
(In reply to Daniel Kolesa from comment #10) > (In reply to Mark Lam from comment #9) > > Can't you add this in PlatformUse.h: > > > > #if OS(LINUX) && !defined(__BIONIC__) && !defined(__GLIBC__) > > #define USE_MUSL > > #endif > > > > ... and check USE(MUSL) in all the places instead? That is unless you > > intend to use this code for other than musl. Anyway, I just think it's > > error prone to always have a long string of conditionals like that that > > needs to be copy-pasted to multiple places. That's my $.02. > > that won't be true, because that doesn't check for musl, it checks for every > c library except glibc and bionic (so also e.g. uclibc and so on) > > as i said, the best thing would probably be to use the the code universally, > without checking which libc it is; it should work everywhere as is, so there > is no need to check for libc here SGTM.
Daniel Kolesa
Comment 12 2021-04-27 11:09:13 PDT
Created attachment 427171 [details] use main stack boundary logic on linux universally switched to not checking libc, instead enabling the new code for all Linux universally I kept the old conditional in the thread stack size code, as there is no need to adjust it on those two platforms specifically
Daniel Kolesa
Comment 13 2021-04-27 11:12:38 PDT
Created attachment 427173 [details] fixed patch apologies, i accidentally reverted some changes in the patch
Mark Lam
Comment 14 2021-04-27 11:24:50 PDT
Comment on attachment 427173 [details] fixed patch View in context: https://bugs.webkit.org/attachment.cgi?id=427173&action=review r=me > Source/WTF/ChangeLog:33 > + we allows us to drop the options special-casing that was in place. /we allows/it allows/
Daniel Kolesa
Comment 15 2021-04-27 11:26:45 PDT
i'll let the CI finish, and then i'll fix the typo
Daniel Kolesa
Comment 16 2021-04-27 15:46:49 PDT
Created attachment 427207 [details] fixed changelog and other comment typos fixed the typo and some others in the changelogs and comments (no code changes) CI seems to have passed on the previous run.
EWS
Comment 17 2021-04-28 00:36:53 PDT
Committed r276695 (237109@main): <https://commits.webkit.org/237109@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 427207 [details].
Radar WebKit Bug Importer
Comment 18 2021-04-28 00:37:18 PDT
Note You need to log in before you can comment on or make changes to this bug.