WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
168479
[WK2] Support termination of unresponsive WebContent processes that are in the background
https://bugs.webkit.org/show_bug.cgi?id=168479
Summary
[WK2] Support termination of unresponsive WebContent processes that are in th...
Chris Dumez
Reported
2017-02-16 15:51:29 PST
Support termination of unresponsive WebContent processes that are in the background. This protects us against cases where a background tab is unresponsive and has high CPU usage, thus draining the battery without the user knowing about it.
Attachments
Patch
(31.33 KB, patch)
2017-02-16 16:49 PST
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(30.82 KB, patch)
2017-02-17 09:14 PST
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(30.82 KB, patch)
2017-02-17 10:19 PST
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(30.78 KB, patch)
2017-02-19 16:23 PST
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Patch
(30.78 KB, patch)
2017-02-19 18:02 PST
,
Chris Dumez
no flags
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
Chris Dumez
Comment 1
2017-02-16 15:52:05 PST
<
rdar://problem/30558745
>
Chris Dumez
Comment 2
2017-02-16 16:49:23 PST
Created
attachment 301856
[details]
Patch
WebKit Commit Bot
Comment 3
2017-02-16 16:50:32 PST
Attachment 301856
[details]
did not pass style-queue: ERROR: Source/WebKit2/UIProcess/WebProcessProxy.cpp:106: Code inside a namespace should not be indented. [whitespace/indent] [4] Total errors found: 1 in 12 files If any of these errors are false positives, please file a bug against check-webkit-style.
Keith Rollin
Comment 4
2017-02-16 20:16:22 PST
Comment on
attachment 301856
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=301856&action=review
> Source/WebKit2/UIProcess/BackgroundWebProcessResponsivenessTimer.cpp:34 > +static const std::chrono::seconds maximumInterval { 86400 }; // 24 hours.
How was 86400 decided? My inclination would be to say it's too high. If the wedged webpage is on a laptop or phone, it could drain the battery in an hour or two, if not sooner. If I have a web page loaded and it gets wedged 2 days later, we won't know it for another day (other than by the battery dying). We'd like to catch issues before then. Perhaps a max time of 1 hour? Question from a noob: what happens to the timer and the checking when a device is turned off or MobileSafari is switched out? Does the timer continue to run? What happens when the device is turned back on or MobileSafari is re-entered? Does the timer continue from where it was when MobileSafari was exited? Or does it resume after incorporating the full elapsed time? If the latter, do we want that? In other words, if we were due to check the process in an hour, but then turned off the phone for two hours, what behavior do we want when MobileSafari starts again? This patch was inspired by a web page that responded poorly while in the background, and while Safari was started with this page already in the background. Does this patch do what we want in that case? In particular, I'm thinking that we should perhaps reset the interval back to 20 seconds in order to catch pages that don't awaken in the background politely. (Crib notes on checking) Intervals: 20 40 80 160 320 640 1280 2560 3600 3600 ... Checks: 20 60 140 300 620 1260 2540 5100 8700 12300 ... Or: 20s 1m 2m20s 5min 10m20s 21m 42m20s 1h25m 2h25m 3h25m ...
> Source/WebKit2/UIProcess/BackgroundWebProcessResponsivenessTimer.cpp:70 > + page->terminateProcess(WebPageProxy::TerminationReason::UnresponsiveWhileInBackground);
Are there possible race conditions here? I don't know what calling terminateProcess could lead to. Could the other pages in the process get notification that the process has terminated and then delete themselves before you get a chance to mark them with UnresponsiveWhileInBackground? Could they be removed from m_webProcessProxy.m_pageMap while you're still iterating over it (leading to the iterator becoming invalid, or the pages returned from it being invalid)? Could the WebProcessProxy that m_webProcessProxy refers to be deleted out from under you (you're not doing anything to protect its lifetime)? Or is all that safely deferred until timerFired is done?
Chris Dumez
Comment 5
2017-02-16 20:48:29 PST
Comment on
attachment 301856
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=301856&action=review
>> Source/WebKit2/UIProcess/BackgroundWebProcessResponsivenessTimer.cpp:34 >> +static const std::chrono::seconds maximumInterval { 86400 }; // 24 hours. > > How was 86400 decided? My inclination would be to say it's too high. If the wedged webpage is on a laptop or phone, it could drain the battery in an hour or two, if not sooner. If I have a web page loaded and it gets wedged 2 days later, we won't know it for another day (other than by the battery dying). We'd like to catch issues before then. Perhaps a max time of 1 hour? > > Question from a noob: what happens to the timer and the checking when a device is turned off or MobileSafari is switched out? Does the timer continue to run? What happens when the device is turned back on or MobileSafari is re-entered? Does the timer continue from where it was when MobileSafari was exited? Or does it resume after incorporating the full elapsed time? If the latter, do we want that? In other words, if we were due to check the process in an hour, but then turned off the phone for two hours, what behavior do we want when MobileSafari starts again? > > This patch was inspired by a web page that responded poorly while in the background, and while Safari was started with this page already in the background. Does this patch do what we want in that case? In particular, I'm thinking that we should perhaps reset the interval back to 20 seconds in order to catch pages that don't awaken in the background politely. > > (Crib notes on checking) > Intervals: 20 40 80 160 320 640 1280 2560 3600 3600 ... > Checks: 20 60 140 300 620 1260 2540 5100 8700 12300 ... > Or: 20s 1m 2m20s 5min 10m20s 21m 42m20s 1h25m 2h25m 3h25m ...
1. Let's discuss the max interval with Gavin. I came up with the current value myself and was mostly used as a way not to overflow. 2. I am not 100% sure but I believe that if you close your laptop lid and a timer was scheduled to fire while the system is suspended, then it will fire as soon as the system resumes. 3. The patch works great in cases like <
rdar://problem/29808005
>. I do not understand why you want to reset to 20 seconds, or when you would want to reset to 20 seconds. To be clear, in the case of a session restore, if one of the background tabs that was restored starts mis-behaving right away, then we will notice it after 20 seconds (because the responsiveness timer started as soon as the tab started loading, given that the tab was in the background).
>> Source/WebKit2/UIProcess/BackgroundWebProcessResponsivenessTimer.cpp:70 >> + page->terminateProcess(WebPageProxy::TerminationReason::UnresponsiveWhileInBackground); > > Are there possible race conditions here? I don't know what calling terminateProcess could lead to. Could the other pages in the process get notification that the process has terminated and then delete themselves before you get a chance to mark them with UnresponsiveWhileInBackground? Could they be removed from m_webProcessProxy.m_pageMap while you're still iterating over it (leading to the iterator becoming invalid, or the pages returned from it being invalid)? Could the WebProcessProxy that m_webProcessProxy refers to be deleted out from under you (you're not doing anything to protect its lifetime)? > > Or is all that safely deferred until timerFired is done?
This could indeed be unsafe if the client were to mess with the pages in a delegate. I think I will have to copy the pages before looping.
Chris Dumez
Comment 6
2017-02-17 09:14:59 PST
Created
attachment 301941
[details]
Patch
WebKit Commit Bot
Comment 7
2017-02-17 09:16:32 PST
Attachment 301941
[details]
did not pass style-queue: ERROR: Source/WebKit2/UIProcess/WebProcessProxy.cpp:106: Code inside a namespace should not be indented. [whitespace/indent] [4] Total errors found: 1 in 13 files If any of these errors are false positives, please file a bug against check-webkit-style.
Chris Dumez
Comment 8
2017-02-17 10:19:46 PST
Created
attachment 301951
[details]
Patch
WebKit Commit Bot
Comment 9
2017-02-17 10:22:29 PST
Attachment 301951
[details]
did not pass style-queue: ERROR: Source/WebKit2/UIProcess/WebProcessProxy.cpp:106: Code inside a namespace should not be indented. [whitespace/indent] [4] Total errors found: 1 in 13 files If any of these errors are false positives, please file a bug against check-webkit-style.
Gavin Barraclough
Comment 10
2017-02-17 10:53:28 PST
Comment on
attachment 301941
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=301941&action=review
> Source/WebKit2/UIProcess/BackgroundWebProcessResponsivenessTimer.cpp:34 > +static const std::chrono::seconds maximumInterval { 14400 }; // 4 hours.
Per conversation I like 8hrs. :-)
> Source/WebKit2/UIProcess/WebProcessProxy.cpp:274 > + updateBackgroundResponsivenessTimer();
Is this call not redundant? When a page is created/destroyed, if it changes the visible page count that automatically triggers updateBackgroundResponsivenessTimer. If it doesn't nothing to do. I guess you do need a one time updateBackgroundResponsivenessTimer when the WebProcessProxy is first created (or the first page added) to get the timer going in the first place – maybe that could be from the constructor? Do we need any special handling around process disconnect / reconnect? (If the processes crashes should we be suspending the timer.)
> Source/WebKit2/UIProcess/WebProcessProxy.cpp:282 > + updateBackgroundResponsivenessTimer();
ditto
Chris Dumez
Comment 11
2017-02-17 11:08:44 PST
Comment on
attachment 301941
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=301941&action=review
>> Source/WebKit2/UIProcess/BackgroundWebProcessResponsivenessTimer.cpp:34 >> +static const std::chrono::seconds maximumInterval { 14400 }; // 4 hours. > > Per conversation I like 8hrs. :-)
Yes, this is taken care of in the latest iteration.
>> Source/WebKit2/UIProcess/WebProcessProxy.cpp:274 >> + updateBackgroundResponsivenessTimer(); > > Is this call not redundant? > > When a page is created/destroyed, if it changes the visible page count that automatically triggers updateBackgroundResponsivenessTimer. If it doesn't nothing to do. > > I guess you do need a one time updateBackgroundResponsivenessTimer when the WebProcessProxy is first created (or the first page added) to get the timer going in the first place – maybe that could be from the constructor? > > Do we need any special handling around process disconnect / reconnect? (If the processes crashes should we be suspending the timer.)
This function is called when a page's process has crashed and we reconnect it to a fresh WebProcess. Our policy to start the responsiveness timer is that: 1. There should be pages 2. There should be no-visible pages In the case of a page crash and then reconnect, this method will be called. It is possible that when this is called, the page is not visible and we want to start the responsiveness timer. So I do not think this is redundant.
>> Source/WebKit2/UIProcess/WebProcessProxy.cpp:282 >> + updateBackgroundResponsivenessTimer(); > > ditto
If you remove the last page that was visible, and they are still pages, then we want to start the responsiveness timer. If we remove the last page (i.e. the process has no more pages), we want to stop the responsiveness timer. So I think his call is needed.
Antti Koivisto
Comment 12
2017-02-19 12:29:50 PST
Comment on
attachment 301951
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=301951&action=review
> Source/WebKit2/UIProcess/BackgroundWebProcessResponsivenessTimer.cpp:34 > +static const std::chrono::seconds initialInterval { 20 }; > +static const std::chrono::seconds maximumInterval { 28800 }; // 8 hours.
These are better written as static const std::chrono::seconds initialInterval { 20s }; static const std::chrono::seconds maximumInterval { 8h }; No need for comment.
> Source/WebKit2/UIProcess/BackgroundWebProcessResponsivenessTimer.h:33 > +class BackgroundWebProcessResponsivenessTimer {
It has a timer but is it a timer? Maybe UnresponsiveWebProcessTerminator or similar would describe it better?
> Source/WebKit2/UIProcess/WebPageProxy.cpp:1592 > + m_process->responsivenessTimer().stop();
...especially since we have a ResponsivenessTimer already that is something else.
Chris Dumez
Comment 13
2017-02-19 16:23:06 PST
Created
attachment 302099
[details]
Patch
WebKit Commit Bot
Comment 14
2017-02-19 16:24:41 PST
Attachment 302099
[details]
did not pass style-queue: ERROR: Source/WebKit2/UIProcess/WebProcessProxy.cpp:106: Code inside a namespace should not be indented. [whitespace/indent] [4] Total errors found: 1 in 13 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Commit Bot
Comment 15
2017-02-19 17:03:09 PST
Comment on
attachment 302099
[details]
Patch Rejecting
attachment 302099
[details]
from commit-queue. Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-01', 'validate-changelog', '--check-oops', '--non-interactive', 302099, '--port=mac']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit ChangeLog entry in Source/WebKit2/ChangeLog contains OOPS!. Full output:
http://webkit-queues.webkit.org/results/3156555
Chris Dumez
Comment 16
2017-02-19 18:02:52 PST
Created
attachment 302105
[details]
Patch
WebKit Commit Bot
Comment 17
2017-02-19 18:05:38 PST
Attachment 302105
[details]
did not pass style-queue: ERROR: Source/WebKit2/UIProcess/WebProcessProxy.cpp:106: Code inside a namespace should not be indented. [whitespace/indent] [4] Total errors found: 1 in 13 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Commit Bot
Comment 18
2017-02-19 18:40:27 PST
Comment on
attachment 302105
[details]
Patch Clearing flags on attachment: 302105 Committed
r212619
: <
http://trac.webkit.org/changeset/212619
>
WebKit Commit Bot
Comment 19
2017-02-19 18:40:32 PST
All reviewed patches have been landed. Closing bug.
mitz
Comment 20
2017-02-19 18:44:33 PST
What’s the rationale behind reporting these via the API differently from how termination of a Web process by the OS due to memory pressure is reported to the client?
Chris Dumez
Comment 21
2017-02-19 18:49:32 PST
(In reply to
comment #20
)
> What’s the rationale behind reporting these via the API differently from how > termination of a Web process by the OS due to memory pressure is reported to > the client?
Can you clarify? We do call the processDidCrash client delegate in most cases. The only case we opted not to call this delegate is when the tab has never been visible. We thought it was not worth displaying a crash banner in this case since it is not useful to the user. To the user, it will just look as if the tab started loading when he switched to it, which is not surprising behavior in case of session restore.
mitz
Comment 22
2017-02-19 18:59:32 PST
(In reply to
comment #21
)
> (In reply to
comment #20
) > > What’s the rationale behind reporting these via the API differently from how > > termination of a Web process by the OS due to memory pressure is reported to > > the client? > > Can you clarify? We do call the processDidCrash client delegate in most > cases. The only case we opted not to call this delegate is when the tab has > never been visible.
And that differs from the case where the OS would have killed such a Web content process, I believe, which seems like an arbitrary inconsistency.
> We thought it was not worth displaying a crash banner in > this case since it is not useful to the user.
Displaying crash UI is the client app’s domain. I think it’s best if the framework faithfully, or at least consistently, represents to the client what is happening. It seems like this change also introduces another API curiosity: a navigation (the reload) triggered not by the user nor by the webpage nor by the API. It seems like this could be confusing to the client’s navigation delegate, depending on how it’s reported (what is the navigation action being reported by the API in this case?).
> To the user, it will just look > as if the tab started loading when he switched to it, which is not > surprising behavior in case of session restore.
I do think that what things look like “to the user” is really the domain of the app, not the framework. Admittedly, this new behavior is opt-in, so apps sort of know what they’re signing up for when they opt in, but it seems like a weird contract. Consider an app like an email client, that after initially loading an email message in an off-screen web view performs some operations that read or modify the content. It may not know to repeat those operations after an automatic reload happened. Or, if it naively assumes that email messages don’t reload, then it may erroneously repeat something that shouldn’t be repeated (say: add a calendar event from an email message).
mitz
Comment 23
2017-02-19 19:03:00 PST
(In reply to
comment #22
)
> I do think that what things look like “to the user” is really the domain of > the app, not the framework. Admittedly, this new behavior is opt-in, so apps > sort of know what they’re signing up for when they opt in, but it seems like > a weird contract. Consider an app like an email client, that after initially > loading an email message in an off-screen web view performs some operations > that read or modify the content. It may not know to repeat those operations > after an automatic reload happened. Or, if it naively assumes that email > messages don’t reload, then it may erroneously repeat something that > shouldn’t be repeated (say: add a calendar event from an email message).
Further, consider that the email client executes some JavaScript code when an email message loads (in the offscreen web view) and that it’s this code that causes the Web process to become unresponsive. Now, the email app may have a failsafe mechanism that after some number of crashes of the Web process, it stops trying to reload the message into it. But if it is not aware of these crashes nor of the unresponsiveness that induced them, then it may just repeatedly enter the high-CPU-usage condition.
mitz
Comment 24
2017-02-19 19:24:33 PST
Given that there is already API for reporting to the client app that a Web process is unresponsive, and given that this new API is opt-in, it seems plainly redundant. An app that wants this behavior can just implement it using the existing API. It seems like the issue this change was meant to address is that we only classify a Web process as “unresponsive” if someone is actually expecting a response from it, but off-screen web views may not be expected to respond, and therefor may be “unresponsive” for extended periods of time. It seems like this may also be the case for an on-screen web view, if it’s done loading and no user interaction is taking place. So really, shouldn’t we just address this by expanding the notion of “unresponsive” to include “would have been unresponsive if anyone actually bothered to ask it to do anything”?
Chris Dumez
Comment 25
2017-02-19 19:29:09 PST
(In reply to
comment #22
)
> (In reply to
comment #21
) > > (In reply to
comment #20
) > > > What’s the rationale behind reporting these via the API differently from how > > > termination of a Web process by the OS due to memory pressure is reported to > > > the client? > > > > Can you clarify? We do call the processDidCrash client delegate in most > > cases. The only case we opted not to call this delegate is when the tab has > > never been visible. > > And that differs from the case where the OS would have killed such a Web > content process, I believe, which seems like an arbitrary inconsistency.
To be clear, the part that is inconsistent and when we decide to reload instead of calling processDidCrash, right?
mitz
Comment 26
2017-02-19 19:30:22 PST
(In reply to
comment #25
)
> (In reply to
comment #22
) > > (In reply to
comment #21
) > > > (In reply to
comment #20
) > > > > What’s the rationale behind reporting these via the API differently from how > > > > termination of a Web process by the OS due to memory pressure is reported to > > > > the client? > > > > > > Can you clarify? We do call the processDidCrash client delegate in most > > > cases. The only case we opted not to call this delegate is when the tab has > > > never been visible. > > > > And that differs from the case where the OS would have killed such a Web > > content process, I believe, which seems like an arbitrary inconsistency. > > To be clear, the part that is inconsistent and when we decide to reload > instead of calling processDidCrash, right?
Right.
Chris Dumez
Comment 27
2017-02-19 19:30:54 PST
(In reply to
comment #22
)
> (In reply to
comment #21
) > > (In reply to
comment #20
) > > > What’s the rationale behind reporting these via the API differently from how > > > termination of a Web process by the OS due to memory pressure is reported to > > > the client? > > > > Can you clarify? We do call the processDidCrash client delegate in most > > cases. The only case we opted not to call this delegate is when the tab has > > never been visible. > > And that differs from the case where the OS would have killed such a Web > content process, I believe, which seems like an arbitrary inconsistency. > > > We thought it was not worth displaying a crash banner in > > this case since it is not useful to the user. > > Displaying crash UI is the client app’s domain. I think it’s best if the > framework faithfully, or at least consistently, represents to the client > what is happening.
I think this is a good point. I'll discuss this with Gavin next week and he suggested this.
Chris Dumez
Comment 28
2017-02-19 19:36:43 PST
Note that part of the reason we wanted to do this in WebKit instead of client side is: 1. The next step is to kill processes that use too much CPU in the background too, not just unresponsive ones. 2. I believe doing this on client side is a bit more awkward because I don't think there is an API abstraction for a WebContent process, only WKPages.
mitz
Comment 29
2017-02-19 19:42:45 PST
(In reply to
comment #28
)
> Note that part of the reason we wanted to do this in WebKit instead of > client side is: > 1. The next step is to kill processes that use too much CPU in the > background too, not just unresponsive ones.
I can see this as a natural extension of the operating system’s existing behavior of killing processes when under memory pressure, so it seems OK if the framework does it and reports it to the client the same way.
> 2. I believe doing this on client side is a bit more awkward because I don't > think there is an API abstraction for a WebContent process, only WKPages.
Advanced clients already take process-sharing considerations into account when they respond to unresponsiveness (Safari on macOS is one such client that I know of, and I think the iOS version does that too). Perhaps the API that allows them to do this can be made better. Less advanced clients are probably going to only ever have one web view per process.
Michael Catanzaro
Comment 30
2017-02-20 07:36:45 PST
(In reply to
comment #24
)
> Given that there is already API for reporting to the client app that a Web > process is unresponsive
Really? What's that API? Mildly-related problem: We have an issue in Epiphany where attempting to Stop or close an unresponsive web process takes a very long time (I think 30 seconds). It would be useful for us to have a way to detect this and kill the process immediately. I do think this should really be handled by WebKit, not the application, though: if the web process doesn't respond to Stop or close within a second or two, something's gone wrong and it surely needs to be killed.
mitz
Comment 31
2017-02-20 07:48:02 PST
(In reply to
comment #30
)
> (In reply to
comment #24
) > > Given that there is already API for reporting to the client app that a Web > > process is unresponsive > > Really? What's that API?
_webViewWebProcessDidBecomeUnresponsive: in WKNavigationDelegatePrivate.
Michael Catanzaro
Comment 32
2017-02-20 09:14:59 PST
(In reply to
comment #31
)
> _webViewWebProcessDidBecomeUnresponsive: in WKNavigationDelegatePrivate.
OK thanks,
bug #168604
.
Chris Dumez
Comment 33
2017-02-21 12:42:28 PST
(In reply to
comment #27
)
> (In reply to
comment #22
) > > (In reply to
comment #21
) > > > (In reply to
comment #20
) > > > > What’s the rationale behind reporting these via the API differently from how > > > > termination of a Web process by the OS due to memory pressure is reported to > > > > the client? > > > > > > Can you clarify? We do call the processDidCrash client delegate in most > > > cases. The only case we opted not to call this delegate is when the tab has > > > never been visible. > > > > And that differs from the case where the OS would have killed such a Web > > content process, I believe, which seems like an arbitrary inconsistency. > > > > > We thought it was not worth displaying a crash banner in > > > this case since it is not useful to the user. > > > > Displaying crash UI is the client app’s domain. I think it’s best if the > > framework faithfully, or at least consistently, represents to the client > > what is happening. > > I think this is a good point. I'll discuss this with Gavin next week and he > suggested this.
Gavin is OK always reporting this to the client as a crash.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug