...
Created attachment 443059 [details] Patch
Comment on attachment 443059 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=443059&action=review > Source/WebKit/Platform/IPC/Connection.cpp:759 > + if (message->messageName() == MessageName::Terminate) { I like the idea of generating a crash report on termination. One advice though would be to call a NEVER_INLINE function with a name such as terminateDueToIPCTerminateMessage(). It will make the generated crash logs much easier to interpret and will help distinguish it from other crashes that could happen in processIncomingMessage(). > Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:91 > +static constexpr Seconds unresponsivenessCheckPeriod = 15_s; The responsiveness timeout is 6_s. So if the network process has to be unresponsive 3 times, it means it will take at least 3*6=18 seconds to happen, no? However, your new logic only triggers if this happens within 15 seconds. Therefore, I am not sure how it can trigger. Also, I am not sure we want such logic at all? Why not be honest and report every network termination as a crash? Then we'd see how much of a problem is it and how urgent it is to address.
(In reply to Chris Dumez from comment #2) > Comment on attachment 443059 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=443059&action=review > > > Source/WebKit/Platform/IPC/Connection.cpp:759 > > + if (message->messageName() == MessageName::Terminate) { > > I like the idea of generating a crash report on termination. > > One advice though would be to call a NEVER_INLINE function with a name such > as terminateDueToIPCTerminateMessage(). It will make the generated crash > logs much easier to interpret and will help distinguish it from other > crashes that could happen in processIncomingMessage(). Sure, will do. > > > Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:91 > > +static constexpr Seconds unresponsivenessCheckPeriod = 15_s; > > The responsiveness timeout is 6_s. So if the network process has to be > unresponsive 3 times, it means it will take at least 3*6=18 seconds to > happen, no? However, your new logic only triggers if this happens within 15 > seconds. Therefore, I am not sure how it can trigger. > > Also, I am not sure we want such logic at all? Why not be honest and report > every network termination as a crash? Then we'd see how much of a problem is > it and how urgent it is to address. The timestamp is set when the first unresponsiveness case appears, so it's first case -6s- second case -6s- third case, which is about 12 seconds for 3 (I think there might be delay so I use 15). I worried that will generate many same crash reports on user's device (which might be costly compared to our old behavior, since we know the network process can get terminated a lot by the number of web process crash reports rdar://84745326 (~20 failed attempts will cause one crash)). The goal is to identify the cause, and with the limit we will still know the magnitude of the issue.
Created attachment 443102 [details] Patch
Created attachment 443109 [details] Patch
(In reply to Sihui Liu from comment #3) > (In reply to Chris Dumez from comment #2) > > Comment on attachment 443059 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=443059&action=review > > > > > Source/WebKit/Platform/IPC/Connection.cpp:759 > > > + if (message->messageName() == MessageName::Terminate) { > > > > I like the idea of generating a crash report on termination. > > > > One advice though would be to call a NEVER_INLINE function with a name such > > as terminateDueToIPCTerminateMessage(). It will make the generated crash > > logs much easier to interpret and will help distinguish it from other > > crashes that could happen in processIncomingMessage(). > > Sure, will do. > > > > > > Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:91 > > > +static constexpr Seconds unresponsivenessCheckPeriod = 15_s; > > > > The responsiveness timeout is 6_s. So if the network process has to be > > unresponsive 3 times, it means it will take at least 3*6=18 seconds to > > happen, no? However, your new logic only triggers if this happens within 15 > > seconds. Therefore, I am not sure how it can trigger. > > > > Also, I am not sure we want such logic at all? Why not be honest and report > > every network termination as a crash? Then we'd see how much of a problem is > > it and how urgent it is to address. > > > The timestamp is set when the first unresponsiveness case appears, so it's > first case -6s- second case -6s- third case, which is about 12 seconds for 3 > (I think there might be delay so I use 15). Ah, indeed. I missed that. > > I worried that will generate many same crash reports on user's device (which > might be costly compared to our old behavior, since we know the network > process can get terminated a lot by the number of web process crash reports > rdar://84745326 (~20 failed attempts will cause one crash)). The goal is to > identify the cause, and with the limit we will still know the magnitude of > the issue.
Comment on attachment 443109 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=443109&action=review r=me with comment. > Source/WebKit/UIProcess/Network/NetworkProcessProxy.cpp:153 > + if (shouldTerminateNetworkProcessBySendingMessage()) { Please add a comment to explain that we're doing this to generate a crash report when the network process is killed for being unresponsive several times in a row. It is other not clear why sometimes we're killing by sending an IPC and sometimes not.
Created attachment 443118 [details] Patch for landing
Committed r285177 (243810@main): <https://commits.webkit.org/243810@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 443118 [details].
<rdar://problem/84942667>
Did you consider using os_log_fault like we do for MESSAGE_CHECK?
(In reply to Simon Fraser (smfr) from comment #11) > Did you consider using os_log_fault like we do for MESSAGE_CHECK? 2 things about os_log_fault: 1. It generates a simulated crash, not a real crash (so it doesn't exit the process) 2. It only generates the simulated crash on iOS, not macOS.