Bug 203046 - [iOS] "Unexpectedly Resumed" process assertion may cause us to get terminated
Summary: [iOS] "Unexpectedly Resumed" process assertion may cause us to get terminated
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Chris Dumez
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-10-16 11:44 PDT by Chris Dumez
Modified: 2019-10-22 09:53 PDT (History)
4 users (show)

See Also:


Attachments
Patch (5.37 KB, patch)
2019-10-16 11:47 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (5.68 KB, patch)
2019-10-16 14:35 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Dumez 2019-10-16 11:44:22 PDT
"Unexpectedly Resumed" process assertion may cause us to get terminated.
Comment 1 Chris Dumez 2019-10-16 11:44:36 PDT
<rdar://problem/56179592>
Comment 2 Chris Dumez 2019-10-16 11:47:37 PDT
Created attachment 381091 [details]
Patch
Comment 3 Geoffrey Garen 2019-10-16 13:49:42 PDT
Comment on attachment 381091 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=381091&action=review

> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:331
> +    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(5 * NSEC_PER_SEC)), dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), releaseAssertion);

I don't think this approach is thread-safe. I guess the use of 'this' is thread-safe because WebProocess is a global singleton. But I don't see anything that guarantees that loads and stores of m_unexpectedlyResumedUIAssertion will be coherent across threads.

I think you need a lock around uses of m_unexpectedlyResumedUIAssertion. Or maybe it would work to make m_unexpectedlyResumedUIAssertion std::atomic<RetainPtr<BKSProcessAssertion>>. (I've never tried that before.)
Comment 4 Chris Dumez 2019-10-16 14:35:29 PDT
Created attachment 381109 [details]
Patch
Comment 5 Chris Dumez 2019-10-17 20:19:19 PDT
Comment on attachment 381109 [details]
Patch

I updated the patch based on Geoff's feedback.
Comment 6 Geoffrey Garen 2019-10-18 14:38:07 PDT
Comment on attachment 381109 [details]
Patch

r=me
Comment 7 WebKit Commit Bot 2019-10-18 15:45:49 PDT
Comment on attachment 381109 [details]
Patch

Clearing flags on attachment: 381109

Committed r251304: <https://trac.webkit.org/changeset/251304>
Comment 8 WebKit Commit Bot 2019-10-18 15:45:51 PDT
All reviewed patches have been landed.  Closing bug.
Comment 9 Geoffrey Garen 2019-10-21 10:32:14 PDT
Comment on attachment 381109 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=381109&action=review

> Source/WebKit/WebProcess/WebProcess.h:540
> +    Lock m_unexpectedlyResumedUIAssertionLock;
> +    RetainPtr<BKSProcessAssertion> m_unexpectedlyResumedUIAssertion;

I think a better name for this condition would be "resumedBeforeUIProcess". It's not really unexpected. It happens all the time. Sometimes it happens because we're playing media in the background, and in that case, only the WebContent process gets a resume message. Sometimes it happens because we're in the foreground, and the WebContent process wins the race to resumption vs the UI process. There's no way for us to know the difference between these two cases, so we always wake up the UI process (and network process by extension) when we wake up first.

> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:333
> +    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(5 * NSEC_PER_SEC)), dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), releaseAssertion);

I think this code is either a no-op or a bug.

In the case where the UI process responds right away, it sends us a resume message and we clear m_unexpectedlyResumedUIAssertion. So, it's a no-op.

In the case where the UI process takes longer than five seconds to respond, and the RunninBoard invalidation timeout is less than five seconds, RunningBoard sends us an invalidation message. So, it's a no-op.

In the case where the UI process takes longer than five seconds to respond, and the RunninBoard invalidation timeout is more than five seconds, we give up on resuming, even though we have. more time available. So, it's a bug. (Although a minor bug, which probably never happens in practice.)

So, I think the code would be a little clearer if we removed this timeout.
Comment 10 Geoffrey Garen 2019-10-21 10:41:37 PDT
Comment on attachment 381109 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=381109&action=review

> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:313
>      if (!m_processIsSuspended)
>          return;
>  

If we just trust ProcessTaskStateObserver::TaskState, we can avoid the race in checking m_processIsSuspended, and unconditionally take an assertion and signal the UI process to wake up. (This what often happens in practice now anyway.)

Then, if the UI process responds with a message to indicate it received ProcessWasUnexpectedlyUnsuspended, we can clear our assertion in the implementation of that message. In theory, we always must hold our assertion until we receive this response; otherwise, we have not woken up yet.
Comment 11 Chris Dumez 2019-10-22 09:53:01 PDT
(In reply to Geoffrey Garen from comment #10)
> Comment on attachment 381109 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=381109&action=review
> 
> > Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:313
> >      if (!m_processIsSuspended)
> >          return;
> >  
> 
> If we just trust ProcessTaskStateObserver::TaskState, we can avoid the race
> in checking m_processIsSuspended, and unconditionally take an assertion and
> signal the UI process to wake up. (This what often happens in practice now
> anyway.)
> 
> Then, if the UI process responds with a message to indicate it received
> ProcessWasUnexpectedlyUnsuspended, we can clear our assertion in the
> implementation of that message. In theory, we always must hold our assertion
> until we receive this response; otherwise, we have not woken up yet.

Implementing suggestions via:
https://bugs.webkit.org/show_bug.cgi?id=203254