Bug 174751 - [Mac WK2] Fix null dereference in asynchronous NSTextInputClient methods when deallocating a WKWebView
Summary: [Mac WK2] Fix null dereference in asynchronous NSTextInputClient methods when...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Wenson Hsieh
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2017-07-21 23:10 PDT by Wenson Hsieh
Modified: 2017-07-23 14:30 PDT (History)
8 users (show)

See Also:


Attachments
Patch (5.36 KB, patch)
2017-07-21 23:39 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews122 for ios-simulator-wk2 (965.58 KB, application/zip)
2017-07-22 01:04 PDT, Build Bot
no flags Details
Close the page earlier (3.82 KB, patch)
2017-07-22 01:22 PDT, Wenson Hsieh
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Wenson Hsieh 2017-07-21 23:10:04 PDT
<rdar://problem/33132405>
Comment 1 Wenson Hsieh 2017-07-21 23:39:41 PDT
Created attachment 316176 [details]
Patch
Comment 2 Tim Horton 2017-07-21 23:44:34 PDT
Comment on attachment 316176 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=316176&action=review

> Source/WebKit/ChangeLog:30
> +        behavior change since it swaps the order in which key teardown methods are invoked, and as such, incurs
> +        considerably more risk than just adding null checks.

However, that behavior change would fix /all/ callbacks, not just these few... there's nothing that makes this specific to the NSTextInputClient methods, right?
Comment 3 Wenson Hsieh 2017-07-22 00:30:13 PDT
(In reply to Tim Horton from comment #2)
> Comment on attachment 316176 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=316176&action=review
> 
> > Source/WebKit/ChangeLog:30
> > +        behavior change since it swaps the order in which key teardown methods are invoked, and as such, incurs
> > +        considerably more risk than just adding null checks.
> 
> However, that behavior change would fix /all/ callbacks, not just these
> few... there's nothing that makes this specific to the NSTextInputClient
> methods, right?

That is true, if NSTextInputContext were to invoke anything else on our WKWebView that dereferences _impl within a NSTextInputClient completion block, we would still crash. However, from the crash logs in the radar, NSTextInputContext only ends up invoking these asynchronous NSTextInputClient methods in the codepath that's registering these crashy completion blocks. In that sense, this is a very targeted fix to deal with NSTextInputContext behavior.

I'll take a second look at closing the WebPageProxy at an earlier time. I already tried this approach and confirmed that it fixes the crash without any obvious fallout, but...it should be more thoroughly vetted :P.

What do you think?
Comment 4 mitz 2017-07-22 00:32:29 PDT
(In reply to Wenson Hsieh from comment #3)
> I'll take a second look at closing the WebPageProxy at an earlier time. I
> already tried this approach and confirmed that it fixes the crash without
> any obvious fallout, but...it should be more thoroughly vetted :P.

Closing the page earlier is both consistent with what we do on iOS and with what we used to do on macOS prior to the introduction of _impl.
Comment 5 Tim Horton 2017-07-22 00:39:26 PDT
(In reply to Wenson Hsieh from comment #3)
> (In reply to Tim Horton from comment #2)
> > Comment on attachment 316176 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=316176&action=review
> > 
> > > Source/WebKit/ChangeLog:30
> > > +        behavior change since it swaps the order in which key teardown methods are invoked, and as such, incurs
> > > +        considerably more risk than just adding null checks.
> > 
> > However, that behavior change would fix /all/ callbacks, not just these
> > few... there's nothing that makes this specific to the NSTextInputClient
> > methods, right?
> 
> That is true, if NSTextInputContext were to invoke anything else on our

Forget, for a second, about NSTextInputContext. There are lots of other callbacks that use this same mechanism, some which come from the client, all of which could easily call back into something that calls into WebViewImpl. Right?

> I'll take a second look at closing the WebPageProxy at an earlier time. I
> already tried this approach and confirmed that it fixes the crash without
> any obvious fallout, but...it should be more thoroughly vetted :P.
> 
> What do you think?

I think I'm with mitz on this one. I made a mistake with WebViewImpl and we should fix it :)
Comment 6 Wenson Hsieh 2017-07-22 00:41:46 PDT
(In reply to Tim Horton from comment #5)
> (In reply to Wenson Hsieh from comment #3)
> > (In reply to Tim Horton from comment #2)
> > > Comment on attachment 316176 [details]
> > > Patch
> > > 
> > > View in context:
> > > https://bugs.webkit.org/attachment.cgi?id=316176&action=review
> > > 
> > > > Source/WebKit/ChangeLog:30
> > > > +        behavior change since it swaps the order in which key teardown methods are invoked, and as such, incurs
> > > > +        considerably more risk than just adding null checks.
> > > 
> > > However, that behavior change would fix /all/ callbacks, not just these
> > > few... there's nothing that makes this specific to the NSTextInputClient
> > > methods, right?
> > 
> > That is true, if NSTextInputContext were to invoke anything else on our
> 
> Forget, for a second, about NSTextInputContext. There are lots of other
> callbacks that use this same mechanism, some which come from the client, all
> of which could easily call back into something that calls into WebViewImpl.
> Right?
> 
> > I'll take a second look at closing the WebPageProxy at an earlier time. I
> > already tried this approach and confirmed that it fixes the crash without
> > any obvious fallout, but...it should be more thoroughly vetted :P.
> > 
> > What do you think?
> 
> I think I'm with mitz on this one. I made a mistake with WebViewImpl and we
> should fix it :)

Sounds good. I'll update the patch -- thanks, Tim and Dan!
Comment 7 Build Bot 2017-07-22 01:04:34 PDT
Comment on attachment 316176 [details]
Patch

Attachment 316176 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/4166161

New failing tests:
imported/w3c/IndexedDB-private-browsing/idbfactory_open12.html
Comment 8 Build Bot 2017-07-22 01:04:35 PDT
Created attachment 316177 [details]
Archive of layout-test-results from ews122 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews122  Port: ios-simulator-wk2  Platform: Mac OS X 10.12.5
Comment 9 Wenson Hsieh 2017-07-22 01:22:39 PDT
Created attachment 316180 [details]
Close the page earlier
Comment 10 Tim Horton 2017-07-22 01:23:50 PDT
Comment on attachment 316180 [details]
Close the page earlier

View in context: https://bugs.webkit.org/attachment.cgi?id=316180&action=review

> Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm:688
>      _page->close();

I wonder if we still have this bug in WKView now?
Comment 11 Wenson Hsieh 2017-07-22 01:50:50 PDT
(In reply to Tim Horton from comment #10)
> Comment on attachment 316180 [details]
> Close the page earlier
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=316180&action=review
> 
> > Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm:688
> >      _page->close();
> 
> I wonder if we still have this bug in WKView now?

I think so, because in -[WKView dealloc], we clear out the WebViewImpl pointer early on by setting `_data->_impl = nullptr;`, which would trigger ~WebViewImpl and then invoke all pending callbacks, which could potentially do things to the WKView that assume _data->_impl exists. We should be able to address this by doing _data->_impl->page().close() before clearing out the _impl pointer.
Comment 12 Wenson Hsieh 2017-07-22 02:18:29 PDT
(In reply to Wenson Hsieh from comment #11)
> (In reply to Tim Horton from comment #10)
> > Comment on attachment 316180 [details]
> > Close the page earlier
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=316180&action=review
> > 
> > > Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm:688
> > >      _page->close();
> > 
> > I wonder if we still have this bug in WKView now?
> 
> I think so, because in -[WKView dealloc], we clear out the WebViewImpl
> pointer early on by setting `_data->_impl = nullptr;`, which would trigger
> ~WebViewImpl and then invoke all pending callbacks, which could potentially
> do things to the WKView that assume _data->_impl exists. We should be able
> to address this by doing _data->_impl->page().close() before clearing out
> the _impl pointer.

That being said, can we do it in a followup? I'd like to limit the scope of this change to fixing the crash being tracked in <rdar://problem/33132405>. I filed https://bugs.webkit.org/show_bug.cgi?id=174753 to track this.
Comment 13 Darin Adler 2017-07-22 10:48:49 PDT
Comment on attachment 316180 [details]
Close the page earlier

Especially since we cannot create a regression test, we want to go beyond fixing this bug and instead structure the code so the bug is highly unlikely to ever return. I think it’s worth some further thought about how to do that after this fix goes in.
Comment 14 WebKit Commit Bot 2017-07-22 12:57:16 PDT
Comment on attachment 316180 [details]
Close the page earlier

Clearing flags on attachment: 316180

Committed r219765: <http://trac.webkit.org/changeset/219765>
Comment 15 WebKit Commit Bot 2017-07-22 12:57:18 PDT
All reviewed patches have been landed.  Closing bug.
Comment 16 Tim Horton 2017-07-22 13:54:51 PDT
(In reply to Wenson Hsieh from comment #12)

> That being said, can we do it in a followup? I'd like to limit the scope of
> this change to fixing the crash being tracked in <rdar://problem/33132405>.
> I filed https://bugs.webkit.org/show_bug.cgi?id=174753 to track this.

Yes, that definitely makes sense.
Comment 17 Darin Adler 2017-07-23 13:30:28 PDT
I think the change log for this is misleading; sorry I did not spot it when reviewing. The comment says "Tweaks -[WKWebView dealloc] to close the WebPageProxy at an earlier time". But the change here is to start calling WebPageProxy::close on Mac *at* *all*.  Before this patch, -[WKWebView dealloc] called WebPageProxy::close on iOS but not at all on Mac.

As far as I can tell, this has been true since r191907 and I am not sure if the change to omit the call on the Mac was intentional. Tim, you might want to take a look since you made that r191907 change and Anders reviewed it.
Comment 18 mitz 2017-07-23 14:20:55 PDT
(In reply to Darin Adler from comment #17)
> I think the change log for this is misleading; sorry I did not spot it when
> reviewing. The comment says "Tweaks -[WKWebView dealloc] to close the
> WebPageProxy at an earlier time". But the change here is to start calling
> WebPageProxy::close on Mac *at* *all*.  Before this patch, -[WKWebView
> dealloc] called WebPageProxy::close on iOS but not at all on Mac.

Before the patch, close would still be called on macOS by ~WebViewImpl, which was invoked beneath -[WKWebView dealloc] (in the call to super).
Comment 19 Darin Adler 2017-07-23 14:30:47 PDT
OK, I see.