Bug 222074 - [macOS] Disabling relaunch on login for the WebContent process is racy
Summary: [macOS] Disabling relaunch on login for the WebContent process is racy
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Per Arne Vollan
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-02-17 14:33 PST by Per Arne Vollan
Modified: 2021-02-22 13:55 PST (History)
4 users (show)

See Also:


Attachments
Patch (4.20 KB, patch)
2021-02-17 14:36 PST, Per Arne Vollan
no flags Details | Formatted Diff | Diff
Patch (5.13 KB, patch)
2021-02-18 08:51 PST, Per Arne Vollan
no flags Details | Formatted Diff | Diff
Patch (1.86 KB, patch)
2021-02-19 09:28 PST, Per Arne Vollan
no flags Details | Formatted Diff | Diff
Patch (1.88 KB, patch)
2021-02-19 09:31 PST, Per Arne Vollan
no flags Details | Formatted Diff | Diff
Patch (4.28 KB, patch)
2021-02-19 16:57 PST, Per Arne Vollan
no flags Details | Formatted Diff | Diff
Patch (4.20 KB, patch)
2021-02-20 07:24 PST, Per Arne Vollan
no flags Details | Formatted Diff | Diff
Patch (4.15 KB, patch)
2021-02-20 12:58 PST, Per Arne Vollan
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Per Arne Vollan 2021-02-17 14:33:08 PST
Delay revoking sandbox extension to Launch Services until Web page is created, since revoking it when initializing the Web process is too early. AppKit may access the daemon after that.
Comment 1 Per Arne Vollan 2021-02-17 14:34:58 PST
<rdar://74230216>
Comment 2 Per Arne Vollan 2021-02-17 14:36:41 PST
Created attachment 420719 [details]
Patch
Comment 3 Geoffrey Garen 2021-02-17 15:04:20 PST
I think the logic here is that +[NSAccessibilityRemoteUIElement remoteTokenForLocalUIElement:] does a one-time call to LaunchServices, and we want to make sure to make that call before closing the sandbox.

I feel like the logic might be more clear if we just wrote it outright: In WebProcess::platformInitializeWebProcess(), can you explicitly call [NSAccessibilityRemoteUIElement remoteTokenForLocalUIElement:[[[WKAccessibilityWebPageObject alloc] init] autorelease]], and then revoke? That would feel more straightforward and self-documenting.

But also: What does this revocation achieve? After we have made the connection to LaunchServices, what action severs that connection?
Comment 4 Per Arne Vollan 2021-02-18 08:51:54 PST
Created attachment 420834 [details]
Patch
Comment 5 Per Arne Vollan 2021-02-18 09:07:11 PST
(In reply to Geoffrey Garen from comment #3)
> I think the logic here is that +[NSAccessibilityRemoteUIElement
> remoteTokenForLocalUIElement:] does a one-time call to LaunchServices, and
> we want to make sure to make that call before closing the sandbox.
> 
> I feel like the logic might be more clear if we just wrote it outright: In
> WebProcess::platformInitializeWebProcess(), can you explicitly call
> [NSAccessibilityRemoteUIElement
> remoteTokenForLocalUIElement:[[[WKAccessibilityWebPageObject alloc] init]
> autorelease]], and then revoke? That would feel more straightforward and
> self-documenting.
> 


From having a breakpoint in the debugger when the connection to Launch Services is being used in WebProcess::platformInitializeWebProcess, it appears that the Launch Services is being accessed on a non-main thread created by dispatch_async of the method -[NSApplication disableRelaunchOnLogin] which is being called when NSApplication is being initialized.

It might be sufficient to revoke and sever the connection from a RunLoop::main dispatch, but it is possibly not guaranteed to happen after the dispatch_async. Would you prefer doing this with a RunLoop::main dispatch instead?

> But also: What does this revocation achieve? After we have made the
> connection to LaunchServices, what action severs that connection?

Ah, great catch! When we delay the revocation of the extension, we should also delay the severing of the Launch Services connection, which is done by calling _LSSetApplicationLaunchServicesServerConnectionStatus with a specified set of flags. This has been fixed in the latest patch.

Thanks for reviewing!
Comment 6 Geoffrey Garen 2021-02-18 16:03:36 PST
> From having a breakpoint in the debugger when the connection to Launch
> Services is being used in WebProcess::platformInitializeWebProcess, it
> appears that the Launch Services is being accessed on a non-main thread
> created by dispatch_async of the method -[NSApplication
> disableRelaunchOnLogin] which is being called when NSApplication is being
> initialized.
> 
> It might be sufficient to revoke and sever the connection from a
> RunLoop::main dispatch, but it is possibly not guaranteed to happen after
> the dispatch_async. Would you prefer doing this with a RunLoop::main
> dispatch instead?

Hmmm... 

Does this mean that this patch as-is is a race against a dispatch_async, and if/when the dispatch_async fires second, it will attempt to use a closed connection to LaunchServices, and crash?

Seems like we need some kind of explicit synchronization to fix that. Or we need LaunchServices to fail silently instead of crashing.

RunLoop::main dispatch would not synchronize with a dispatch_async to an unrelated queue. So, I don't think RunLoop::main dispatch would resolve a race condition.
Comment 7 Per Arne Vollan 2021-02-19 09:19:08 PST
(In reply to Geoffrey Garen from comment #6)
> > From having a breakpoint in the debugger when the connection to Launch
> > Services is being used in WebProcess::platformInitializeWebProcess, it
> > appears that the Launch Services is being accessed on a non-main thread
> > created by dispatch_async of the method -[NSApplication
> > disableRelaunchOnLogin] which is being called when NSApplication is being
> > initialized.
> > 
> > It might be sufficient to revoke and sever the connection from a
> > RunLoop::main dispatch, but it is possibly not guaranteed to happen after
> > the dispatch_async. Would you prefer doing this with a RunLoop::main
> > dispatch instead?
> 
> Hmmm... 
> 
> Does this mean that this patch as-is is a race against a dispatch_async, and
> if/when the dispatch_async fires second, it will attempt to use a closed
> connection to LaunchServices, and crash?
> 

Yes.

> Seems like we need some kind of explicit synchronization to fix that. Or we
> need LaunchServices to fail silently instead of crashing.
> 

That is a good point. What if we disable relaunch on login synchronously while holding the Launch Services extension, and then have Launch Services fail silently? 

> RunLoop::main dispatch would not synchronize with a dispatch_async to an
> unrelated queue. So, I don't think RunLoop::main dispatch would resolve a
> race condition.

Yes, that makes sense.

Thanks for reviewing!
Comment 8 Per Arne Vollan 2021-02-19 09:28:59 PST
Created attachment 420980 [details]
Patch
Comment 9 Per Arne Vollan 2021-02-19 09:31:10 PST
Created attachment 420981 [details]
Patch
Comment 10 Darin Adler 2021-02-19 09:55:54 PST
Comment on attachment 420981 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=420981&action=review

> Source/WebKit/ChangeLog:11
> +        When NSApplication is being intialized, the method -[NSApplication disableRelaunchOnLogin] is dispatched on a non-main thread, which is in a race
> +        with the revocation of the Launch Services sandbox extension. This patch addresses this by setting this information synchronously with Launch
> +        Services while the sandbox extension is being held.

Given this description I would expect code to be removed as well as code to be added. Unclear why code is not removed?

> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:332
> +    // Disable relaunch on login

WebKit coding style uses a period in a sentence style comment.

A more useful comment would be a little clearer on *why*, maybe even a short sentence somehow mentions that this must be done synchronously in this particular place, and why that’s OK from a performance point of view? And if there is another attempt at this elsewhere that is ineffective, might refer to that.
Comment 11 Geoffrey Garen 2021-02-19 13:23:32 PST
> That is a good point. What if we disable relaunch on login synchronously
> while holding the Launch Services extension, and then have Launch Services
> fail silently?

I think this can work.

But what about the NSAccessibilityRemoteUIElement case? (See crash log attached to rdar://problem/74230216.)
Comment 12 Per Arne Vollan 2021-02-19 16:48:33 PST
(In reply to Darin Adler from comment #10)
> Comment on attachment 420981 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=420981&action=review
> 
> > Source/WebKit/ChangeLog:11
> > +        When NSApplication is being intialized, the method -[NSApplication disableRelaunchOnLogin] is dispatched on a non-main thread, which is in a race
> > +        with the revocation of the Launch Services sandbox extension. This patch addresses this by setting this information synchronously with Launch
> > +        Services while the sandbox extension is being held.
> 
> Given this description I would expect code to be removed as well as code to
> be added. Unclear why code is not removed?
> 

This is also being done from -[NSApplication init] by dispatching -[NSApplication disableRelaunchOnLogin] on a non-main thread.

> > Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:332
> > +    // Disable relaunch on login
> 
> WebKit coding style uses a period in a sentence style comment.
> 

Will fix!

> A more useful comment would be a little clearer on *why*, maybe even a short
> sentence somehow mentions that this must be done synchronously in this
> particular place, and why that’s OK from a performance point of view? And if
> there is another attempt at this elsewhere that is ineffective, might refer
> to that.

Will fix in upcoming patch.

Thanks for reviewing!
Comment 13 Per Arne Vollan 2021-02-19 16:49:35 PST
(In reply to Geoffrey Garen from comment #11)
> > That is a good point. What if we disable relaunch on login synchronously
> > while holding the Launch Services extension, and then have Launch Services
> > fail silently?
> 
> I think this can work.
> 
> But what about the NSAccessibilityRemoteUIElement case? (See crash log
> attached to rdar://problem/74230216.)

Ah, great catch, I completely missed that one! I will add the code you suggested from a previous comment.

Thanks for reviewing!
Comment 14 Per Arne Vollan 2021-02-19 16:57:28 PST
Created attachment 421059 [details]
Patch
Comment 15 Per Arne Vollan 2021-02-20 07:24:04 PST
Created attachment 421091 [details]
Patch
Comment 16 Darin Adler 2021-02-20 10:05:31 PST
Comment on attachment 421091 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=421091&action=review

> Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:339
> +    [NSAccessibilityRemoteUIElement remoteTokenForLocalUIElement:[[[WKAccessibilityWebPageObject alloc] init] autorelease]];

Can minimize the use of autorelease here, which is something Chris Dumez is working on across WebKit. Write this, which uses RetainPtr instead:

    [NSAccessibilityRemoteUIElement remoteTokenForLocalUIElement:adoptNS([[WKAccessibilityWebPageObject alloc] init]).get()];
Comment 17 Per Arne Vollan 2021-02-20 12:58:24 PST
Created attachment 421106 [details]
Patch
Comment 18 Per Arne Vollan 2021-02-20 12:59:27 PST
(In reply to Darin Adler from comment #16)
> Comment on attachment 421091 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=421091&action=review
> 
> > Source/WebKit/WebProcess/cocoa/WebProcessCocoa.mm:339
> > +    [NSAccessibilityRemoteUIElement remoteTokenForLocalUIElement:[[[WKAccessibilityWebPageObject alloc] init] autorelease]];
> 
> Can minimize the use of autorelease here, which is something Chris Dumez is
> working on across WebKit. Write this, which uses RetainPtr instead:
> 
>     [NSAccessibilityRemoteUIElement
> remoteTokenForLocalUIElement:adoptNS([[WKAccessibilityWebPageObject alloc]
> init]).get()];

Fixed, I have uploaded a new patch for review.

Thanks for reviewing!
Comment 19 Geoffrey Garen 2021-02-22 12:53:43 PST
Comment on attachment 421106 [details]
Patch

r=me
Comment 20 EWS 2021-02-22 12:59:01 PST
Committed r273270: <https://commits.webkit.org/r273270>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 421106 [details].
Comment 21 Per Arne Vollan 2021-02-22 13:55:24 PST
(In reply to Geoffrey Garen from comment #19)
> Comment on attachment 421106 [details]
> Patch
> 
> r=me

Thank you!