WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
213200
REGRESSION(
r262991
): Broke stress/ensure-crash.js
https://bugs.webkit.org/show_bug.cgi?id=213200
Summary
REGRESSION(r262991): Broke stress/ensure-crash.js
Michael Catanzaro
Reported
2020-06-15 10:53:52 PDT
Moving from
bug #212479
. stress/ensure-crash.js is broken since
r262991
on Red Hat's internal cloop CI: Running stress/ensure-crash.js.default stress/ensure-crash.js.default: ERROR: Unexpected exit code: 134 FAIL: stress/ensure-crash.js.default Running stress/ensure-crash.js.mini-mode stress/ensure-crash.js.mini-mode: ERROR: Unexpected exit code: 134 FAIL: stress/ensure-crash.js.mini-mode 134 - 128 = 6 = SIGABRT on Linux, so the test is failing with SIGABRT. (SIGILL = 4 so if it were to exit with SIGILL its exit status would be 132.) Skimming the commit, I don't immediately see what could be causing a SIGABRT. I thought the most likely problem was that this change assumes HAVE(MACHINE_CONTEXT) is enabled, but looking in PlatformHave.h, that is not true on two of the four architectures we test (ppc64le and s390x). However, the test is failing in the same way on both x86_64 and aarch64, so that must not be the only problem here. I wonder if it might be related to ENABLE(SIGNAL_BASED_VM_TRAPS), which depends on DFG and is therefore disabled in cloop.
Attachments
proposed patch.
(3.33 KB, patch)
2020-06-15 15:04 PDT
,
Mark Lam
mcatanzaro
: review+
Details
Formatted Diff
Diff
patch for landing.
(4.33 KB, patch)
2020-06-15 15:23 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
Patch
(6.66 KB, patch)
2020-06-15 16:29 PDT
,
Michael Catanzaro
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Michael Catanzaro
Comment 1
2020-06-15 11:13:46 PDT
Mark says
bug #212543
might help.
Mark Lam
Comment 2
2020-06-15 13:11:51 PDT
(In reply to Michael Catanzaro from
comment #1
)
> Mark says
bug #212543
might help.
I was wrong. 212542 won't help. But I think I know what's happening. Will provide details after I confirm things.
Mark Lam
Comment 3
2020-06-15 13:26:46 PDT
The root cause of this is because of
https://trac.webkit.org/r238478
i.e. non-darwin ports chooses to implement CRASH() using abort.
Michael Catanzaro
Comment 4
2020-06-15 14:31:35 PDT
(In reply to Mark Lam from
comment #3
)
> The root cause of this is because of
https://trac.webkit.org/r238478
i.e. > non-darwin ports chooses to implement CRASH() using abort.
We want CRASH() to raise SIGABRT because that's the expected way to express an assertion failure crash on Linux. Before that commit, it would have been SIGSEGV (not SIGILL).
Mark Lam
Comment 5
2020-06-15 14:32:31 PDT
(In reply to Michael Catanzaro from
comment #4
)
> (In reply to Mark Lam from
comment #3
) > > The root cause of this is because of
https://trac.webkit.org/r238478
i.e. > > non-darwin ports chooses to implement CRASH() using abort. > > We want CRASH() to raise SIGABRT because that's the expected way to express > an assertion failure crash on Linux. Before that commit, it would have been > SIGSEGV (not SIGILL).
I understand. I'm investigating a fix now.
Mark Lam
Comment 6
2020-06-15 15:04:34 PDT
Created
attachment 401944
[details]
proposed patch.
Michael Catanzaro
Comment 7
2020-06-15 15:10:31 PDT
Comment on
attachment 401944
[details]
proposed patch. Thanks for investigating! The change looks good, but I don't think it will fully fix the crash because HAVE(MACHINE_CONTEXT) is false on two of the four tested architectures.
Mark Lam
Comment 8
2020-06-15 15:14:58 PDT
(In reply to Michael Catanzaro from
comment #7
)
> Comment on
attachment 401944
[details]
> proposed patch. > > Thanks for investigating! > > The change looks good, but I don't think it will fully fix the crash because > HAVE(MACHINE_CONTEXT) is false on two of the four tested architectures.
I think you'll just have to skip the tests that expect crashes (this with `//@ crashOK!` at the top) for those platforms. That is unless they can actually handle signals and for some reason is misclaiming that they !HAVE(MACHINE_CONTEXT).
Mark Lam
Comment 9
2020-06-15 15:23:19 PDT
Created
attachment 401947
[details]
patch for landing. Thanks for the review. I'm also going to unskip the ensure-crash.js test for armv7 and mips. Will let the EWS run on this patch before landing.
Michael Catanzaro
Comment 10
2020-06-15 16:20:37 PDT
(In reply to Mark Lam from
comment #8
)
> I think you'll just have to skip the tests that expect crashes (this with > `//@ crashOK!` at the top) for those platforms. That is unless they can > actually handle signals and for some reason is misclaiming that they > !HAVE(MACHINE_CONTEXT).
I can add per-architecture skips if need be, but hopefully there's a better way. :) TL;DR I think the affected code doesn't actually need to be guarded by HAVE(MACHINE_CONTEXT) at all? Signals work the same on all architectures. I can't imagine how broken things would be if the OS couldn't handle UNIX signals. :D PlatformHave.h has this: #if OS(DARWIN) || OS(FUCHSIA) || ((OS(FREEBSD) || defined(__GLIBC__) || defined(__BIONIC__)) && (CPU(X86) || CPU(X86_64) || CPU(ARM) || CPU(ARM64) || CPU(MIPS))) #define HAVE_MACHINE_CONTEXT 1 #endif mcontext is actually available on all architectures when the C library is glibc; however, since mcontext is architecture-specific, any code that wants to use it needs to be written separately for each desired architecture. I see JSC's MachineContext.h includes platform-specific code for many architectures, but there's no way to make this work for arbitrary architectures. I *personally* only care about a few specific architectures (x86_64, aarch64, ppc64le, and s390x), but since signal handling is not platform-specific and doesn't require mcontext, ideally we wouldn't require mcontext at all. Also, I know Igalia cares about non-glibc too; they use either ulibc or muslc (I can never remember which) on some embedded systems. (I think(?) that mcontext is implemented by both of these, but ucontext is probably not, and our MachineContext.h does depend on ucontext.) Anyway, it looks like the affected code in jsc.cpp does not depend on MachineContext.h at all. It only uses WTF's Signals.h, which is guarded by HAVE(MACHINE_CONTEXT), but I don't see why. Guess: it's guarded by HAVE(MACHINE_CONTEXT) because originally it was only needed for that case? I think we can just remove HAVE(MACHINE_CONTEXT) from the guard... or change it to #if !OS(WINDOWS) or something like that. Your patch still looks good though, don't hesitate to land. I'll cook a patch to change the guards tomorrow.
Michael Catanzaro
Comment 11
2020-06-15 16:22:08 PDT
(In reply to Michael Catanzaro from
comment #10
)
> Your patch still looks good though, don't hesitate to land. I'll cook a > patch to change the guards tomorrow.
Actually let me try uploading an alternate version of your patch with these guards changed and see if it passes EWS.
Michael Catanzaro
Comment 12
2020-06-15 16:29:57 PDT
Created
attachment 401955
[details]
Patch
Michael Catanzaro
Comment 13
2020-06-15 16:30:45 PDT
(In reply to Michael Catanzaro from
comment #12
)
> Created
attachment 401955
[details]
> Patch
Let me know what you think of this. I'll do a local build with HAVE(MACHINE_CONTEXT) disabled now to make sure it actually builds.
Mark Lam
Comment 14
2020-06-15 16:31:47 PDT
(In reply to Michael Catanzaro from
comment #11
)
> (In reply to Michael Catanzaro from
comment #10
) > > Your patch still looks good though, don't hesitate to land. I'll cook a > > patch to change the guards tomorrow. > > Actually let me try uploading an alternate version of your patch with these > guards changed and see if it passes EWS.
I don't think we should remove HAVE(MACHINE_CONTEXT). Even on unixes, there's no guarantee that the struct for the signal context is the same, right? The platform that opts in to HAVE(MACHINE_CONTEXT) must do the work to make it work first. It doesn't make sense to opt all unixes in blindly.
Mark Lam
Comment 15
2020-06-15 16:33:22 PDT
(In reply to Mark Lam from
comment #14
)
> (In reply to Michael Catanzaro from
comment #11
) > > (In reply to Michael Catanzaro from
comment #10
) > > > Your patch still looks good though, don't hesitate to land. I'll cook a > > > patch to change the guards tomorrow. > > > > Actually let me try uploading an alternate version of your patch with these > > guards changed and see if it passes EWS. > > I don't think we should remove HAVE(MACHINE_CONTEXT). Even on unixes, > there's no guarantee that the struct for the signal context is the same, > right? The platform that opts in to HAVE(MACHINE_CONTEXT) must do the work > to make it work first. It doesn't make sense to opt all unixes in blindly.
And please don't just obsolete my patch and replace it with yours. Now, my patch won't get the EWS validation that I put it there for.
Michael Catanzaro
Comment 16
2020-06-15 16:35:47 PDT
(In reply to Mark Lam from
comment #14
)
> I don't think we should remove HAVE(MACHINE_CONTEXT). Even on unixes, > there's no guarantee that the struct for the signal context is the same, > right?
The mcontext struct is going to be different on every architecture, but jsc.cpp and Signals.[cpp,h] don't use it anywhere, so it should be fine. (There is code in JSC that really does depend on mcontext, just not this code.) Sorry for interfering with your EWS run. :(
Michael Catanzaro
Comment 17
2020-06-15 16:43:52 PDT
Comment on
attachment 401955
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=401955&action=review
> Source/WTF/wtf/threads/Signals.h:28 > -#if USE(PTHREADS) && HAVE(MACHINE_CONTEXT) > +#if USE(PTHREADS)
Note this code does not use pthreads either, so really these are not good guards... how about #if !OS(WINDOWS)? I bet the original usage of these classes in JSC depends on pthreads and mcontext; that's probably how these guards were originally selected.
Michael Catanzaro
Comment 18
2020-06-15 16:48:18 PDT
Comment on
attachment 401955
[details]
Patch Obsoleting my patch, Mark asked me to use a separate bug to change the guards.
Michael Catanzaro
Comment 19
2020-06-15 20:01:34 PDT
Comment on
attachment 401947
[details]
patch for landing. View in context:
https://bugs.webkit.org/attachment.cgi?id=401947&action=review
> Source/WTF/wtf/threads/Signals.h:53 > +#if !OS(DARWIN) > + Abort, > +#endif
Looking this over again, the #if !OS(DARWIN) guards aren't really needed in Signals.h. The guards in jsc.cpp should suffice to avoid installing the signal handler on Apple systems. But even then, I guess it wouldn't hurt to install the SIGABRT handler unconditionally, since the code looks like it's trying to handle every possible crash signal, and it seems odd for only SIGABRT to be conspicuously missing. (Linux has a couple more crash signals, but they are truly obscure. Is SIGABRT really not common on macOS?)
Mark Lam
Comment 20
2020-06-15 20:08:31 PDT
Comment on
attachment 401947
[details]
patch for landing. View in context:
https://bugs.webkit.org/attachment.cgi?id=401947&action=review
>> Source/WTF/wtf/threads/Signals.h:53 >> +#endif > > Looking this over again, the #if !OS(DARWIN) guards aren't really needed in Signals.h. The guards in jsc.cpp should suffice to avoid installing the signal handler on Apple systems. But even then, I guess it wouldn't hurt to install the SIGABRT handler unconditionally, since the code looks like it's trying to handle every possible crash signal, and it seems odd for only SIGABRT to be conspicuously missing. (Linux has a couple more crash signals, but they are truly obscure. Is SIGABRT really not common on macOS?)
I added it because we deliberately don't need or want this for OS(DARWIN). There should never be a test that generates a SIGABRT. Only non-DARWIN ports will ever encounter SIGABRT by their own choice. Alternatively, I can make it more restricted to Apple specific platforms. The purpose of the handler is not to catch all crashes, only crashes that we think we will generate deliberately. This is facility for testing that expected crash points will crash.
Mark Lam
Comment 21
2020-06-16 00:56:09 PDT
Comment on
attachment 401947
[details]
patch for landing. armv7 and mips bots are happy. Landing now.
Mark Lam
Comment 22
2020-06-16 00:59:26 PDT
***
Bug 213207
has been marked as a duplicate of this bug. ***
EWS
Comment 23
2020-06-16 01:32:51 PDT
Committed
r263082
: <
https://trac.webkit.org/changeset/263082
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 401947
[details]
.
Radar WebKit Bug Importer
Comment 24
2020-06-16 01:33:19 PDT
<
rdar://problem/64397861
>
Michael Catanzaro
Comment 25
2020-06-17 10:47:07 PDT
(In reply to Michael Catanzaro from
comment #0
)
> Moving from
bug #212479
. stress/ensure-crash.js is broken since
r262991
on > Red Hat's internal cloop CI
I've verified this is now fixed with the combination of
r263082
,
r263074
, and
r263123
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug