Bug 241353 - [WPE][GTK] REGRESSION (r294381): WPEWebProcess leak after closing browser
Summary: [WPE][GTK] REGRESSION (r294381): WPEWebProcess leak after closing browser
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Yury Semikhatsky
URL:
Keywords: InRadar
Depends on: 238892
Blocks: 241583
  Show dependency treegraph
 
Reported: 2022-06-06 16:11 PDT by Yury Semikhatsky
Modified: 2022-06-13 19:26 PDT (History)
10 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Yury Semikhatsky 2022-06-06 16:11:05 PDT
Since r294381 we (Playwright) often (every second run or so) see WPEWebProcesses that stay alive after the browser exits. The tests create a few WebsiteDataStore/WebProcessPool instances and then one/two pages in them. When exiting the UIProcess we now routinely see WebsiteDataStore and WebProcessPool with one WebProcess in it whose sessionID is the same as that of the WebsiteDataStore. WebsiteDataStore::processes() is empty. The repro is quite complicated, so I'm going to continue debugging.

Sihui Liu, perhaps you know off the top of your head what may have changed?
Comment 1 Yury Semikhatsky 2022-06-07 16:30:23 PDT
The problem is caused by the change in the behavior of WebProcessCache::canCacheProcess[1]. I see that it was discussed in the code review in https://bugs.webkit.org/show_bug.cgi?id=238892#c10 so maybe Carlos knows what's going on?

Before the change the process would not be cached and now it gets cached. Not sure if this is the desirable behavior for a process that contained pages that use ephemeral datastore. The symptoms as I mentioned above is leaking WPEWebProcess after closing all pages in the browser.  There is also now a hanging WebPageProxy object now. It may well be that the change just exposed an existing bug in how the process cache works in WPE.


[1] https://github.com/WebKit/WebKit/commit/e02402e809515c4c063339ba30f8516778b0888c#diff-cfe68db49d8cf52bf5ad142cc19fb5aacf3616e1d3c71887d6cc28af666542a6L79-L80
Comment 2 Michael Catanzaro 2022-06-07 16:58:45 PDT
Any chance you know how to reproduce this? Even if it's not a reliable reproducer, having a test that sometimes fails would be useful.

(In reply to Yury Semikhatsky from comment #1)
> It may well be that the change just
> exposed an existing bug in how the process cache works in WPE.

Seems likely.
Comment 3 Yury Semikhatsky 2022-06-08 10:32:03 PDT
> Any chance you know how to reproduce this? Even if it's not a reliable
> reproducer, having a test that sometimes fails would be useful.
> 

I have a reliable reproduction but it requires running 76 tests in playwright (didn't manage to isolate it further) after which closing browser hangs because of the leaking process.

The steps would be like this:

git clone https://github.com/yury-s/playwright/tree/wk-1649
git co wk-1649
npm i
npx playwright install
CI=1 taskset 3 npm run wtest -- 'trace-viewer|tracing|video' --reporter=list

This is probably not very helpful.

Now that we know at which lines caused the problem I can try to create a webkit test for that.
Comment 4 Yury Semikhatsky 2022-06-10 15:00:04 PDT
I managed to debug what causes the issue. We call WebPageProxy::forceRepaint, which in turn sends an async message to the web process and captures a strong reference to the WebPageProxy in the async callback. In WPE on a static page the callback is not invoked for a long time and the client closes the page. While the page is closed, the callback is still in asyncReplyHandlerMap and keeps WebProcessPool alive. Since the closed page was last page in the web process the process gets into WebProcessCache and in 30s gets stopped (CachedProcess::suspensionTimerFired). When the browser exists, the web process is still alive an is in the STOPPED state. Since the reference to WebProcessPool (and transitively to WebProcessProxy) still exists the web process is not killed and keeps hanging after the UIProcess exited.

The retention chain looks like this:



  asyncReplyHandlerMap
           │
           ▼
sendWithAsyncReply callback
           │
           ▼
      WebPageProxy
           │
           ▼
  API::PageConfiguration
           │
           ▼
     WebProcessPool
           │
           ▼
    WebProcessCache
           │
           ▼
      CachedProcess
           │
           ▼
     WebProccessProxy


I'll upload a fix for this particular issue shorty but it looks like a bigger problem with stopping web process when there are in-flight IPC messages.
Comment 5 Yury Semikhatsky 2022-06-10 15:06:42 PDT
Pull request: https://github.com/WebKit/WebKit/pull/1455
Comment 6 Yury Semikhatsky 2022-06-13 19:11:30 PDT
Another thing that may be worth implementing is killing stopped web processes if the UI process exits while some of the WebProcessProxy objects are still alive (intentionally due to the embedder logic or unintentionally due to bugs like this). As far as I understand, normally closing the UI process will close one end of the IPC connection and the Web process will exit as a result. However, if the WebProcess is stopped it will not react to the state changes of the IPC pipe and will keep hanging.
Comment 7 EWS 2022-06-13 19:25:15 PDT
Committed r295511 (251516@main): <https://commits.webkit.org/251516@main>

Reviewed commits have been landed. Closing PR #1455 and removing active labels.
Comment 8 Radar WebKit Bug Importer 2022-06-13 19:26:14 PDT
<rdar://problem/95075439>