Bug 179123

Summary: REGRESSION(r223718): Leaking WebProcessPool after reconfiguration
Product: WebKit Reporter: Jonathan Bedard <jbedard>
Component: WebKit Misc.Assignee: Chris Dumez <cdumez>
Status: RESOLVED FIXED    
Severity: Normal CC: beidson, buildbot, cdumez, commit-queue, darin, ggaren, rniwa, ryanhaddad, sam, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=208541
Attachments:
Description Flags
Patch
none
Archive of layout-test-results from ews123 for ios-simulator-wk2
none
Patch
none
Patch
none
Patch none

Description Jonathan Bedard 2017-11-01 09:50:21 PDT
Note that when this change was committed, r224124 had not yet been committed, the particular leak would have been very difficult to detect.

This is a big problem when running layout tests, as it means that test runners leak entire processes.
Comment 1 Radar WebKit Bug Importer 2017-11-01 09:50:53 PDT
<rdar://problem/35294685>
Comment 2 Jonathan Bedard 2017-11-01 09:53:48 PDT
Here is the stack trace for the un-released retain (coming from instruments):

   0 libobjc.A.dylib -[NSObject retain]
   1 WebKit API::Object::ref() Source/WebKit/Shared/Cocoa/APIObject.mm:87
   2 WebKit WTF::Ref<WebKit::WebProcessPool>::Ref(WebKit::WebProcessPool&) WebKitBuild/Debug/usr/local/include/wtf/Ref.h:64
   3 WebKit WTF::Ref<WebKit::WebProcessPool>::Ref(WebKit::WebProcessPool&) WebKitBuild/Debug/usr/local/include/wtf/Ref.h:63
   4 WebKit WebKit::WebProcessProxy::WebProcessProxy(WebKit::WebProcessPool&, WebKit::WebsiteDataStore&) Source/WebKit/UIProcess/WebProcessProxy.cpp:102
   5 WebKit WebKit::ServiceWorkerProcessProxy::ServiceWorkerProcessProxy(WebKit::WebProcessPool&, WebKit::WebsiteDataStore&) Source/WebKit/UIProcess/ServiceWorkerProcessProxy.cpp:35
   6 WebKit WebKit::ServiceWorkerProcessProxy::ServiceWorkerProcessProxy(WebKit::WebProcessPool&, WebKit::WebsiteDataStore&) Source/WebKit/UIProcess/ServiceWorkerProcessProxy.cpp:37
   7 WebKit WebKit::ServiceWorkerProcessProxy::create(WebKit::WebProcessPool&, WebKit::WebsiteDataStore&) Source/WebKit/UIProcess/ServiceWorkerProcessProxy.h:37
   8 WebKit WebKit::WebProcessPool::getWorkerContextProcessConnection(WebKit::StorageProcessProxy&) Source/WebKit/UIProcess/WebProcessPool.cpp:601
   9 WebKit WebKit::StorageProcessProxy::getWorkerContextProcessConnection() Source/WebKit/UIProcess/Storage/StorageProcessProxy.cpp:229
  10 WebKit void IPC::callMemberFunctionImpl<WebKit::StorageProcessProxy, void (WebKit::StorageProcessProxy::*)(), std::__1::tuple<> >(WebKit::StorageProcessProxy*, void (WebKit::StorageProcessProxy::*)(), std::__1::tuple<>&&, std::__1::integer_sequence<unsigned long>) Source/WebKit/Platform/IPC/HandleMessage.h:40
  11 WebKit void IPC::callMemberFunction<WebKit::StorageProcessProxy, void (WebKit::StorageProcessProxy::*)(), std::__1::tuple<>, std::__1::integer_sequence<unsigned long> >(std::__1::tuple<>&&, WebKit::StorageProcessProxy*, void (WebKit::StorageProcessProxy::*)()) Source/WebKit/Platform/IPC/HandleMessage.h:46
  12 WebKit void IPC::handleMessage<Messages::StorageProcessProxy::GetWorkerContextProcessConnection, WebKit::StorageProcessProxy, void (WebKit::StorageProcessProxy::*)()>(IPC::Decoder&, WebKit::StorageProcessProxy*, void (WebKit::StorageProcessProxy::*)()) Source/WebKit/Platform/IPC/HandleMessage.h:126
  13 WebKit WebKit::StorageProcessProxy::didReceiveStorageProcessProxyMessage(IPC::Connection&, IPC::Decoder&) WebKitBuild/Debug/DerivedSources/WebKit2/StorageProcessProxyMessageReceiver.cpp:72
  14 WebKit WebKit::StorageProcessProxy::didReceiveMessage(IPC::Connection&, IPC::Decoder&) Source/WebKit/UIProcess/Storage/StorageProcessProxy.cpp:81
  15 WebKit non-virtual thunk to WebKit::StorageProcessProxy::didReceiveMessage(IPC::Connection&, IPC::Decoder&) Source/WebKit/UIProcess/Storage/StorageProcessProxy.cpp:0
Comment 3 Chris Dumez 2017-11-01 10:00:02 PDT
WebProcessPools hold a strong ref to every WebProcess (including the SW WebProces). WebProcessProxy holds a strong ref to the WebProcessPool via m_processPool.
Comment 4 Chris Dumez 2017-11-01 10:15:23 PDT
Created attachment 325584 [details]
Patch
Comment 5 Build Bot 2017-11-01 11:33:13 PDT
Comment on attachment 325584 [details]
Patch

Attachment 325584 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/5063905

Number of test failures exceeded the failure limit.
Comment 6 Build Bot 2017-11-01 11:33:15 PDT
Created attachment 325600 [details]
Archive of layout-test-results from ews123 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews123  Port: ios-simulator-wk2  Platform: Mac OS X 10.12.6
Comment 7 Jonathan Bedard 2017-11-01 11:34:39 PDT
I tested this patch locally, and it fixes the leak on Mac.....but seems like it causes a new set of problems on iOS.
Comment 8 Chris Dumez 2017-11-01 11:47:28 PDT
(In reply to Jonathan Bedard from comment #7)
> I tested this patch locally, and it fixes the leak on Mac.....but seems like
> it causes a new set of problems on iOS.

Do we have the crash trace on iOS? The attached archive does not seem to have it?
Comment 9 Geoffrey Garen 2017-11-01 12:18:38 PDT
Comment on attachment 325584 [details]
Patch

ios-sim EWS seems to be crashing on every test.
Comment 10 Chris Dumez 2017-11-01 12:22:25 PDT
(In reply to Geoffrey Garen from comment #9)
> Comment on attachment 325584 [details]
> Patch
> 
> ios-sim EWS seems to be crashing on every test.

Well, this fixes the leak too :P
Comment 11 Jonathan Bedard 2017-11-01 12:45:05 PDT
(In reply to Geoffrey Garen from comment #9)
> Comment on attachment 325584 [details]
> Patch
> 
> ios-sim EWS seems to be crashing on every test.

It's not quite crashing on every test, (at least according to the logs) looks like we get through nearly 40,000 tests before we early-exit because of too many crashes (30 is the threshold for EWS).

EWS doesn't upload files when we early-exit (although it should), I'm running locally, I'll upload some crash logs.
Comment 12 Chris Dumez 2017-11-01 15:39:04 PDT
Created attachment 325641 [details]
Patch
Comment 13 Chris Dumez 2017-11-01 16:04:52 PDT
Created attachment 325645 [details]
Patch
Comment 14 Chris Dumez 2017-11-01 19:42:29 PDT
EWS is now happy.
Comment 15 Darin Adler 2017-11-01 21:53:04 PDT
Comment on attachment 325645 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=325645&action=review

> Source/WebKit/UIProcess/WebProcessPool.cpp:945
> +    if (m_processes.size() == 1 && m_processes[0] == m_serviceWorkerProcess) {
> +        m_serviceWorkerProcess = nullptr;
> +        m_processes.clear();
> +    }

Why is this the right thing to do? Instead of nulling out m_serviceWorkerProcess and clearing m_processes, shouldn’t this call:

    m_serviceWorkerProcess->requestTermination(RequestedByClient);

or something like that?
Comment 16 Chris Dumez 2017-11-01 22:06:20 PDT
(In reply to Darin Adler from comment #15)
> Comment on attachment 325645 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=325645&action=review
> 
> > Source/WebKit/UIProcess/WebProcessPool.cpp:945
> > +    if (m_processes.size() == 1 && m_processes[0] == m_serviceWorkerProcess) {
> > +        m_serviceWorkerProcess = nullptr;
> > +        m_processes.clear();
> > +    }
> 
> Why is this the right thing to do? Instead of nulling out
> m_serviceWorkerProcess and clearing m_processes, shouldn’t this call:
> 
>     m_serviceWorkerProcess->requestTermination(RequestedByClient);
> 
> or something like that?

Destroying the WebProcessProxy object severs the IPC connection with the WebProcess and causes it to exit. Not sure why requestTermination() would be better since:
- we do not need to notify the client
- we do not need to reuse the proxy object
Comment 17 Darin Adler 2017-11-02 13:02:15 PDT
Every other process gets removed from m_processes when the object itself calls to WebProcessPool to tell it that it’s done. It seems really strange that the service worker process doesn’t do it that way, even if it’s the pool that tells the service worker process that it’s not needed any more.

I guess I am too unfamiliar with this code to understand the relationships here.
Comment 18 Jonathan Bedard 2017-11-03 15:20:07 PDT
Can we get this fix landed soon?

Two weeks ago, a variation of this bug fixed by r224124 crippled EWS for a few days.  We increased the process limit on the machines in question, but I feel like the longer we leave this bug in the tree, the more we are pushing our luck.
Comment 19 Geoffrey Garen 2017-11-03 15:42:13 PDT
Comment on attachment 325645 [details]
Patch

r=me
Comment 20 WebKit Commit Bot 2017-11-03 16:02:19 PDT
Comment on attachment 325645 [details]
Patch

Clearing flags on attachment: 325645

Committed r224438: <https://trac.webkit.org/changeset/224438>
Comment 21 WebKit Commit Bot 2017-11-03 16:02:21 PDT
All reviewed patches have been landed.  Closing bug.
Comment 22 Ryosuke Niwa 2017-11-03 22:17:24 PDT
This seems to have caused a bunch of Inspector tests to hit assertions :(

https://build.webkit.org/builders/Apple%20Sierra%20Debug%20WK2%20(Tests)/
e.g.
https://build.webkit.org/results/Apple%20Sierra%20Debug%20WK2%20(Tests)/r224440%20(3868)/results.html
https://build.webkit.org/results/Apple%20Sierra%20Debug%20WK2%20(Tests)/r224440%20(3868)/http/tests/inspector/dom/disconnect-dom-tree-after-main-frame-navigation-crash-log.txt

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   com.apple.JavaScriptCore      	0x000000010c95bab4 WTFCrash + 36 (Assertions.cpp:270)
1   com.apple.WebKit              	0x000000010fe09a39 WebKit::StorageProcessProxy::getWorkerContextProcessConnection() + 73 (StorageProcessProxy.cpp:226)
2   com.apple.WebKit              	0x000000010fe0d18a void IPC::callMemberFunctionImpl<WebKit::StorageProcessProxy, void (WebKit::StorageProcessProxy::*)(), std::__1::tuple<> >(WebKit::StorageProcessProxy*, void (WebKit::StorageProcessProxy::*)(), std::__1::tuple<>&&, std::__1::integer_sequence<unsigned long>) + 122 (HandleMessage.h:41)
3   com.apple.WebKit              	0x000000010fe0d108 void IPC::callMemberFunction<WebKit::StorageProcessProxy, void (WebKit::StorageProcessProxy::*)(), std::__1::tuple<>, std::__1::integer_sequence<unsigned long> >(std::__1::tuple<>&&, WebKit::StorageProcessProxy*, void (WebKit::StorageProcessProxy::*)()) + 88 (HandleMessage.h:47)
4   com.apple.WebKit              	0x000000010fe0cc49 void IPC::handleMessage<Messages::StorageProcessProxy::GetWorkerContextProcessConnection, WebKit::StorageProcessProxy, void (WebKit::StorageProcessProxy::*)()>(IPC::Decoder&, WebKit::StorageProcessProxy*, void (WebKit::StorageProcessProxy::*)()) + 185 (HandleMessage.h:127)
5   com.apple.WebKit              	0x000000010fe0c5c2 WebKit::StorageProcessProxy::didReceiveStorageProcessProxyMessage(IPC::Connection&, IPC::Decoder&) + 770 (StorageProcessProxyMessageReceiver.cpp:73)
6   com.apple.WebKit              	0x000000010fe06c98 WebKit::StorageProcessProxy::didReceiveMessage(IPC::Connection&, IPC::Decoder&) + 104 (StorageProcessProxy.cpp:84)
7   com.apple.WebKit              	0x000000010fe06d04 non-virtual thunk to WebKit::StorageProcessProxy::didReceiveMessage(IPC::Connection&, IPC::Decoder&) + 52
8   com.apple.WebKit              	0x000000010f8aff23 IPC::Connection::dispatchMessage(IPC::Decoder&) + 51 (Connection.cpp:902)
9   com.apple.WebKit              	0x000000010f8a5578 IPC::Connection::dispatchMessage(std::__1::unique_ptr<IPC::Decoder, std::__1::default_delete<IPC::Decoder> >) + 712 (Connection.cpp:930)
10  com.apple.WebKit              	0x000000010f8b0520 IPC::Connection::dispatchOneMessage() + 1520 (Connection.cpp:959)
11  com.apple.WebKit              	0x000000010f8c86bd IPC::Connection::enqueueIncomingMessage(std::__1::unique_ptr<IPC::Decoder, std::__1::default_delete<IPC::Decoder> >)::$_14::operator()() + 29 (Connection.cpp:896)
12  com.apple.WebKit              	0x000000010f8c8619 WTF::Function<void ()>::CallableWrapper<IPC::Connection::enqueueIncomingMessage(std::__1::unique_ptr<IPC::Decoder, std::__1::default_delete<IPC::Decoder> >)::$_14>::call() + 25 (Function.h:101)
13  com.apple.JavaScriptCore      	0x000000010c99103b WTF::Function<void ()>::operator()() const + 139 (Function.h:56)
14  com.apple.JavaScriptCore      	0x000000010c9b15b3 WTF::RunLoop::performWork() + 211 (RunLoop.cpp:107)
15  com.apple.JavaScriptCore      	0x000000010c9b1e34 WTF::RunLoop::performWork(void*) + 36 (RunLoopCF.cpp:38)
16  com.apple.CoreFoundation      	0x00007fffa62763e1 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
17  com.apple.CoreFoundation      	0x00007fffa625765c __CFRunLoopDoSources0 + 556
18  com.apple.CoreFoundation      	0x00007fffa6256b46 __CFRunLoopRun + 934
19  com.apple.CoreFoundation      	0x00007fffa6256544 CFRunLoopRunSpecific + 420
20  com.apple.Foundation          	0x00007fffa7c874e2 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 277
21  WebKitTestRunner              	0x000000010b36ddb4 WTR::TestController::platformRunUntil(bool&, double) + 260 (TestControllerCocoa.mm:186)
22  WebKitTestRunner              	0x000000010b34bca9 WTR::TestController::runUntil(bool&, double) + 73 (TestController.cpp:1240)
23  WebKitTestRunner              	0x000000010b370f20 WTR::TestInvocation::invoke() + 1344 (TestInvocation.cpp:182)
24  WebKitTestRunner              	0x000000010b352fcd WTR::TestController::runTest(char const*) + 2061 (TestController.cpp:1201)
25  WebKitTestRunner              	0x000000010b354041 WTR::TestController::runTestingServerLoop() + 177 (TestController.cpp:1217)
26  WebKitTestRunner              	0x000000010b346786 WTR::TestController::run() + 54 (TestController.cpp:1225)
27  WebKitTestRunner              	0x000000010b34622a WTR::TestController::TestController(int, char const**) + 1530 (TestController.cpp:123)
28  WebKitTestRunner              	0x000000010b346943 WTR::TestController::TestController(int, char const**) + 35 (TestController.cpp:124)
29  WebKitTestRunner              	0x000000010b3261bf main + 127 (main.mm:66)
30  libdyld.dylib                 	0x00007fffbbf8e235 start + 1
Comment 23 Chris Dumez 2017-11-03 22:19:51 PDT
Reverted r224438 for reason:

Has caused assertions on the bots

Committed r224454: <https://trac.webkit.org/changeset/224454>
Comment 24 Chris Dumez 2017-11-04 15:29:47 PDT
Created attachment 326046 [details]
Patch
Comment 25 WebKit Commit Bot 2017-11-04 16:02:00 PDT
Comment on attachment 326046 [details]
Patch

Clearing flags on attachment: 326046

Committed r224462: <https://trac.webkit.org/changeset/224462>
Comment 26 WebKit Commit Bot 2017-11-04 16:02:01 PDT
All reviewed patches have been landed.  Closing bug.