WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
232603
Terminate unresponsive network process by crashing it
https://bugs.webkit.org/show_bug.cgi?id=232603
Summary
Terminate unresponsive network process by crashing it
Sihui Liu
Reported
2021-11-01 21:58:38 PDT
...
Attachments
Patch
(5.63 KB, patch)
2021-11-01 22:18 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(8.48 KB, patch)
2021-11-02 10:04 PDT
,
Sihui Liu
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(8.52 KB, patch)
2021-11-02 10:56 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch for landing
(8.82 KB, patch)
2021-11-02 11:58 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Sihui Liu
Comment 1
2021-11-01 22:18:31 PDT
Created
attachment 443059
[details]
Patch
Chris Dumez
Comment 2
2021-11-02 07:34:24 PDT
Comment on
attachment 443059
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=443059&action=review
> Source/WebKit/Platform/IPC/Connection.cpp:759 > + if (message->messageName() == MessageName::Terminate) {
I like the idea of generating a crash report on termination. One advice though would be to call a NEVER_INLINE function with a name such as terminateDueToIPCTerminateMessage(). It will make the generated crash logs much easier to interpret and will help distinguish it from other crashes that could happen in processIncomingMessage().
> Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:91 > +static constexpr Seconds unresponsivenessCheckPeriod = 15_s;
The responsiveness timeout is 6_s. So if the network process has to be unresponsive 3 times, it means it will take at least 3*6=18 seconds to happen, no? However, your new logic only triggers if this happens within 15 seconds. Therefore, I am not sure how it can trigger. Also, I am not sure we want such logic at all? Why not be honest and report every network termination as a crash? Then we'd see how much of a problem is it and how urgent it is to address.
Sihui Liu
Comment 3
2021-11-02 09:43:25 PDT
(In reply to Chris Dumez from
comment #2
)
> Comment on
attachment 443059
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=443059&action=review
> > > Source/WebKit/Platform/IPC/Connection.cpp:759 > > + if (message->messageName() == MessageName::Terminate) { > > I like the idea of generating a crash report on termination. > > One advice though would be to call a NEVER_INLINE function with a name such > as terminateDueToIPCTerminateMessage(). It will make the generated crash > logs much easier to interpret and will help distinguish it from other > crashes that could happen in processIncomingMessage().
Sure, will do.
> > > Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:91 > > +static constexpr Seconds unresponsivenessCheckPeriod = 15_s; > > The responsiveness timeout is 6_s. So if the network process has to be > unresponsive 3 times, it means it will take at least 3*6=18 seconds to > happen, no? However, your new logic only triggers if this happens within 15 > seconds. Therefore, I am not sure how it can trigger. > > Also, I am not sure we want such logic at all? Why not be honest and report > every network termination as a crash? Then we'd see how much of a problem is > it and how urgent it is to address.
The timestamp is set when the first unresponsiveness case appears, so it's first case -6s- second case -6s- third case, which is about 12 seconds for 3 (I think there might be delay so I use 15). I worried that will generate many same crash reports on user's device (which might be costly compared to our old behavior, since we know the network process can get terminated a lot by the number of web process crash reports
rdar://84745326
(~20 failed attempts will cause one crash)). The goal is to identify the cause, and with the limit we will still know the magnitude of the issue.
Sihui Liu
Comment 4
2021-11-02 10:04:46 PDT
Created
attachment 443102
[details]
Patch
Sihui Liu
Comment 5
2021-11-02 10:56:57 PDT
Created
attachment 443109
[details]
Patch
Chris Dumez
Comment 6
2021-11-02 10:59:14 PDT
(In reply to Sihui Liu from
comment #3
)
> (In reply to Chris Dumez from
comment #2
) > > Comment on
attachment 443059
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=443059&action=review
> > > > > Source/WebKit/Platform/IPC/Connection.cpp:759 > > > + if (message->messageName() == MessageName::Terminate) { > > > > I like the idea of generating a crash report on termination. > > > > One advice though would be to call a NEVER_INLINE function with a name such > > as terminateDueToIPCTerminateMessage(). It will make the generated crash > > logs much easier to interpret and will help distinguish it from other > > crashes that could happen in processIncomingMessage(). > > Sure, will do. > > > > > > Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:91 > > > +static constexpr Seconds unresponsivenessCheckPeriod = 15_s; > > > > The responsiveness timeout is 6_s. So if the network process has to be > > unresponsive 3 times, it means it will take at least 3*6=18 seconds to > > happen, no? However, your new logic only triggers if this happens within 15 > > seconds. Therefore, I am not sure how it can trigger. > > > > Also, I am not sure we want such logic at all? Why not be honest and report > > every network termination as a crash? Then we'd see how much of a problem is > > it and how urgent it is to address. > > > The timestamp is set when the first unresponsiveness case appears, so it's > first case -6s- second case -6s- third case, which is about 12 seconds for 3 > (I think there might be delay so I use 15).
Ah, indeed. I missed that.
> > I worried that will generate many same crash reports on user's device (which > might be costly compared to our old behavior, since we know the network > process can get terminated a lot by the number of web process crash reports >
rdar://84745326
(~20 failed attempts will cause one crash)). The goal is to > identify the cause, and with the limit we will still know the magnitude of > the issue.
Chris Dumez
Comment 7
2021-11-02 11:13:51 PDT
Comment on
attachment 443109
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=443109&action=review
r=me with comment.
> Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:153 > + if (shouldTerminateNetworkProcessBySendingMessage()) {
Please add a comment to explain that we're doing this to generate a crash report when the network process is killed for being unresponsive several times in a row. It is other not clear why sometimes we're killing by sending an IPC and sometimes not.
Sihui Liu
Comment 8
2021-11-02 11:58:50 PDT
Created
attachment 443118
[details]
Patch for landing
EWS
Comment 9
2021-11-02 12:42:13 PDT
Committed
r285177
(
243810@main
): <
https://commits.webkit.org/243810@main
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 443118
[details]
.
Radar WebKit Bug Importer
Comment 10
2021-11-02 12:43:18 PDT
<
rdar://problem/84942667
>
Simon Fraser (smfr)
Comment 11
2021-11-02 13:23:43 PDT
Did you consider using os_log_fault like we do for MESSAGE_CHECK?
Chris Dumez
Comment 12
2021-11-02 13:25:14 PDT
(In reply to Simon Fraser (smfr) from
comment #11
)
> Did you consider using os_log_fault like we do for MESSAGE_CHECK?
2 things about os_log_fault: 1. It generates a simulated crash, not a real crash (so it doesn't exit the process) 2. It only generates the simulated crash on iOS, not macOS.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug