Bug 237398 - [macOS] WebContent processes crash with XPC_EXIT_REASON_SIGTERM_TIMEOUT when logging out
Summary: [macOS] WebContent processes crash with XPC_EXIT_REASON_SIGTERM_TIMEOUT when ...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Chris Dumez
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-03-02 16:18 PST by Chris Dumez
Modified: 2022-03-10 16:29 PST (History)
7 users (show)

See Also:


Attachments
Patch (5.23 KB, patch)
2022-03-02 16:23 PST, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (5.31 KB, patch)
2022-03-03 11:48 PST, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (5.07 KB, patch)
2022-03-10 08:08 PST, Chris Dumez
ap: review+
Details | Formatted Diff | Diff
Patch for landing (5.04 KB, patch)
2022-03-10 14:47 PST, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Dumez 2022-03-02 16:18:28 PST
WebContent processes crash with XPC_EXIT_REASON_SIGTERM_TIMEOUT when logging out of macOS.
Comment 1 Chris Dumez 2022-03-02 16:18:39 PST
<rdar://88940229>
Comment 2 Chris Dumez 2022-03-02 16:23:58 PST
Created attachment 453674 [details]
Patch
Comment 3 Alexey Proskuryakov 2022-03-02 19:05:16 PST
Comment on attachment 453674 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=453674&action=review

> Source/WebKit/Shared/EntryPointUtilities/Cocoa/XPCService/XPCServiceEntryPoint.mm:178
> +            exit(0);

Is it safe to call exit from a signal handler? atexit functions could do arbitrary work, so I’d be looking at _exit().
Comment 4 Chris Dumez 2022-03-02 20:18:05 PST
(In reply to Alexey Proskuryakov from comment #3)
> Comment on attachment 453674 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=453674&action=review
> 
> > Source/WebKit/Shared/EntryPointUtilities/Cocoa/XPCService/XPCServiceEntryPoint.mm:178
> > +            exit(0);
> 
> Is it safe to call exit from a signal handler? atexit functions could do
> arbitrary work, so I’d be looking at _exit().

Well, we do want those atexit handlers to run though I think. It minimizes the odds of losing cookies and storage data.

What is the concern exactly? That the atexit handlers take too long and we reach the timeout and still get killed? Even if that’s the case, I would think it is preferable to at least try and run those handlers. Would be sad to lose some cookies or local storage data when logging out.
Comment 5 Alexey Proskuryakov 2022-03-03 10:35:23 PST
My concern is that atexit handlers will be using functions that are unsafe in signal handler context (which is extremely restrictive), so we'll be getting memory corruption, hangs and such.

If we need to run these even after receiving SIGTERM, we need to ignore the signal, and to initiate a clean exit from run loop somehow (I don't know how exactly to do that from signal handler context).
Comment 6 Chris Dumez 2022-03-03 11:34:33 PST
(In reply to Alexey Proskuryakov from comment #5)
> My concern is that atexit handlers will be using functions that are unsafe
> in signal handler context (which is extremely restrictive), so we'll be
> getting memory corruption, hangs and such.
> 
> If we need to run these even after receiving SIGTERM, we need to ignore the
> signal, and to initiate a clean exit from run loop somehow (I don't know how
> exactly to do that from signal handler context).

I may be able to do the following in my signal handler:
1. Clear the OS transaction
2. Reset to the default signal handler
3. raise() the signal again 

I need to validate this to make sure it actually works in practice though.
Comment 7 Chris Dumez 2022-03-03 11:48:36 PST
Created attachment 453770 [details]
Patch
Comment 8 Chris Dumez 2022-03-03 11:49:37 PST
(In reply to Chris Dumez from comment #6)
> (In reply to Alexey Proskuryakov from comment #5)
> > My concern is that atexit handlers will be using functions that are unsafe
> > in signal handler context (which is extremely restrictive), so we'll be
> > getting memory corruption, hangs and such.
> > 
> > If we need to run these even after receiving SIGTERM, we need to ignore the
> > signal, and to initiate a clean exit from run loop somehow (I don't know how
> > exactly to do that from signal handler context).
> 
> I may be able to do the following in my signal handler:
> 1. Clear the OS transaction
> 2. Reset to the default signal handler
> 3. raise() the signal again 
> 
> I need to validate this to make sure it actually works in practice though.

Ok, I implemented this alternative proposal that should be less controversial I hope.
I tested it manually and it seems to work just as well.
Comment 9 Darin Adler 2022-03-03 12:01:15 PST
Comment on attachment 453770 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=453770&action=review

> Source/WebKit/Shared/EntryPointUtilities/Cocoa/XPCService/XPCServiceEntryPoint.mm:180
> +        signal(SIGTERM, [](int) {
> +            globalTransaction.get() = nullptr;
> +            signal(SIGTERM, SIG_DFL);
> +            raise(SIGTERM);
> +        });

This looks great. Really glad Alexey pointed out it was not OK to call exit. I tried to do some more research on "cleanup and re-raise signal" and see if it’s more elegantly done with sigaction instead of signal, but the examples I found are all like what you wrote here.
Comment 10 Alexey Proskuryakov 2022-03-03 12:13:24 PST
Looks good to me too. I'm far from being an expert on writing code that works in signal handlers, but I couldn't find anything against this approach with a few web searches.
Comment 11 Chris Dumez 2022-03-03 12:14:27 PST
(In reply to Alexey Proskuryakov from comment #10)
> Looks good to me too. I'm far from being an expert on writing code that
> works in signal handlers, but I couldn't find anything against this approach
> with a few web searches.

Ok, thanks for pointing out the issue in the original proposal. TIL :)
Comment 12 EWS 2022-03-03 14:40:02 PST
Committed r290795 (248035@main): <https://commits.webkit.org/248035@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 453770 [details].
Comment 13 Chris Dumez 2022-03-10 08:01:22 PST
Reverted r290795 for reason:

Caused crashes <rdar://problem/89970722>

Committed r291110 (248272@trunk): <https://commits.webkit.org/248272@trunk>
Comment 14 Chris Dumez 2022-03-10 08:08:07 PST
Created attachment 454364 [details]
Patch
Comment 15 Chris Dumez 2022-03-10 08:17:15 PST
Comment on attachment 454364 [details]
Patch

Turns out I cannot even release the OS transaction in the signal handler. As a result, I am going back to Alexey's original proposal to call _exit(0).

Eventually we should just stop leaking this transaction (and adopt RunningBoard) but this is quite a bit of work and not risk free.
Comment 16 Alexey Proskuryakov 2022-03-10 14:42:09 PST
Comment on attachment 454364 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=454364&action=review

> Source/WebKit/ChangeLog:14
> +        To address the issue, we now set our own SIGTERM handler that releases the OS transaction and calls _exit(0)

Please update the ChangeLog for updated patch.
Comment 17 Chris Dumez 2022-03-10 14:47:42 PST
Created attachment 454403 [details]
Patch for landing
Comment 18 EWS 2022-03-10 16:29:14 PST
Committed r291137 (248297@main): <https://commits.webkit.org/248297@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 454403 [details].