WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 262794
[WPE][GTK] Web process cache suspend/resume does not work, WebProcessProxy::processIdentifier is not the pid of the actual web process
https://bugs.webkit.org/show_bug.cgi?id=262794
Summary
[WPE][GTK] Web process cache suspend/resume does not work, WebProcessProxy::p...
Michael Catanzaro
Reported
2023-10-06 09:27:13 PDT
In
https://commits.webkit.org/247615@main
we added code to send SIGSTOP and SIGCONT to the web process. However, it's broken because the pid tracked is actually the pid of the bwrap or flatpak-spawn process, not the real web process. The actual web process is never stopped unless the sandbox is disabled using WEBKIT_DISABLE_SANDBOX_THIS_IS_DANGEROUS=1.
Attachments
Add attachment
proposed patch, testcase, etc.
Michael Catanzaro
Comment 1
2023-10-06 09:33:07 PDT
I think every function that uses WebProcessProxy::processIdentifier is broken if it attempts to use the identifier as a pid rather than just a unique identifier. P.S. This explains why I have a huge number of stopped flatpak-spawn processes on my system. :)
Patrick Griffis
Comment 2
2023-10-06 10:06:56 PDT
So for the bubblewrap sandbox we can get the real pid fairly easily. `bwrap` takes `--info-fd` which outputs something like this: { "child-pid": 310369, "cgroup-namespace": 4026533658, "ipc-namespace": 4026533656, "mnt-namespace": 4026533654, "net-namespace": 4026533661, "pid-namespace": 4026533657, "uts-namespace": 4026533655 } We can then use this PID for some operations. I don't think it necessarily makes sense to set `WebProcessProxy::processIdentifier` to this PID. Sometimes we may want to directly do things to the child PID, sometimes we may want to do it to bwrap. We'd just have to audit the current usage. --- For Flatpak this is also easy, but I think it requires we stop using `flatpak-spawn`. It was the easy solution but we need to use the `org.freedesktop.portal.Flatpak` DBus API directly. It exposes a `SpawnSignal()` method. Again I don't think we will have a directly usable PID for everything `WebProcessProxy::processIdentifier` represents but we can send it signals.
Michael Catanzaro
Comment 3
2023-12-15 08:13:27 PST
I'm not sure what to do here. Auditing every single use of WebProcessProxy::processIdentifier is going to be disruptive. To confidently prevent bugs, we would indeed need to change it to ensure it cannot be converted to a pid. It would be *much* easier if we could instead just ensure we have the right pid of the WebKit child process (as seen within the UI process's pid namespace). I doubt we are ever going to want to perform operations on the bwrap process, because that's an implementation detail that the rest of WebKit does not know about. Problem is org.freedesktop.portal.Flatpak is not going to work because the documentation says we cannot assume the returned pid is valid in the caller's pid namespace. So I guess we really do need to either (a) perform this audit and hope we do not actually need real pids except for things that can be handled by org.freedesktop.portal.Flatpak, or (b) change flatpak to provide us a usable pid.
Michael Catanzaro
Comment 4
2023-12-15 08:24:40 PST
Wait, WebProcessProxy::processIdentifier is not a pid. It's just an int that increments each time a new auxiliary process is created. There has been some refactoring since
247615@main
. Now we have a new WTF::ProcessID class to represent pids. We need to audit this instead, not WebProcessProxy::processIdentifier.
Michael Catanzaro
Comment 5
2023-12-15 08:50:30 PST
WebCore::processIdentifier is fine too. I see a *lot* of things using WTF::ProcessID, but it's mostly used after calling getCurrentProcessID(), which is fine. The problem is specifically AuxiliaryProcessProxy::processID, which is where the bogus pids are coming from. Still, I doubt fixing this is realistic. We probably need to change flatpak. Is it really not possible for it to return results in the caller's pid namespace? :/ Until we are able to fix this, I think we should change ProcessLauncherGLib to set m_processID = 0 instead of setting the wrong pid. There are just too many possible bugs otherwise. This will either prevent or expose them all. E.g. the first one I found: ProcessLauncher::terminateProcess will currently only terminate the flatpak-spawn supervisor process and not the actual WebKit child process; presumably that will cause the WebKit process to leak, rather than terminate?
Michael Catanzaro
Comment 6
2023-12-15 09:38:00 PST
Temporary workaround in
bug #266489
Michael Catanzaro
Comment 7
2024-05-16 08:25:58 PDT
We might want to disable web process cache altogether, see
bug #274261
.
Michael Catanzaro
Comment 8
2024-05-23 11:06:05 PDT
(In reply to Michael Catanzaro from
comment #3
)
> Problem is org.freedesktop.portal.Flatpak is not going to work because the > documentation says we cannot assume the returned pid is valid in the > caller's pid namespace.
Actually there is a flag FLATPAK_SPAWN_FLAGS_EXPOSE_PIDS that we should be able to use!
Michael Catanzaro
Comment 9
2024-06-10 09:07:18 PDT
(In reply to Patrick Griffis from
comment #2
)
> So for the bubblewrap sandbox we can get the real pid fairly easily. > > `bwrap` takes `--info-fd` which outputs something like this: > > { > "child-pid": 310369, > "cgroup-namespace": 4026533658, > "ipc-namespace": 4026533656, > "mnt-namespace": 4026533654, > "net-namespace": 4026533661, > "pid-namespace": 4026533657, > "uts-namespace": 4026533655 > } > > We can then use this PID for some operations. I don't think it necessarily > makes sense to set `WebProcessProxy::processIdentifier` to this PID. > Sometimes we may want to directly do things to the child PID, sometimes we > may want to do it to bwrap. We'd just have to audit the current usage.
Carlos Garcia noticed that this pid is actually the pid of the intermediate bwrap process that gets used as pid 1 in the container, not the pid of the actual web process. We might need to add a new way to get the pid of the actual child.
Michael Catanzaro
Comment 10
2024-06-10 10:24:02 PDT
Somebody solved this problem already:
https://github.com/containers/bubblewrap/pull/576
Michael Catanzaro
Comment 11
2024-06-13 09:47:49 PDT
Pull request:
https://github.com/WebKit/WebKit/pull/29787
Michael Catanzaro
Comment 12
2024-06-13 10:23:36 PDT
(In reply to Michael Catanzaro from
comment #8
)
> Actually there is a flag FLATPAK_SPAWN_FLAGS_EXPOSE_PIDS that we should be > able to use!
So this is necessary but not sufficient. It allows the parent process to see pids from the child's pid namespace, but we still need a guarantee that the pid returned is valid in the parent namespace. So the org.freedesktop.portal.Flatpak API is not going to work. I'm out of ideas. It's possible to fix just suspend/resume if we use this pid with the SpawnSignal method call, instead of calling kill() manually. But that's surely not good enough as there could be plenty of other bugs and broken features if WebKit's understanding of the child process pid is not correct. I think we're going to need changes to flatpak to fix this.
Michael Catanzaro
Comment 13
2024-06-13 15:10:59 PDT
What that bwrap pull request does to get the command-pid is send Unix credentials via a socket. The kernel then translates the pid between pid namespaces automatically. We could possibly do that, but it would have to be done by the sandboxed process *after* exec() to avoid requiring a child setup function.
Michael Catanzaro
Comment 14
2024-06-20 13:00:30 PDT
Solution: instead of depending on a new bubblewrap, just have the WebKit subprocess manually send credentials to the parent process via a socket. Problem solved.
Michael Catanzaro
Comment 15
2024-07-14 11:30:11 PDT
Good news: this bug is the cause of our longstanding bwrap and xdg-dbus-proxy process leaks. The subprocesses are not actually being leaked as I had thought; they're just being stopped with SIGSTOP in error. They quit as expected if I manually send SIGCONT. So fixing this should solve that.
EWS
Comment 16
2024-07-29 01:11:09 PDT
Committed
281488@main
(c3c75352f0ee): <
https://commits.webkit.org/281488@main
> Reviewed commits have been landed. Closing PR #29787 and removing active labels.
Diego Pino
Comment 17
2024-08-01 06:23:21 PDT
Re-opening for pull request
https://github.com/webkit/webkit/pull/31596
EWS
Comment 18
2024-08-01 06:39:36 PDT
Committed
281704@main
(04cc3a119676): <
https://commits.webkit.org/281704@main
> Reviewed commits have been landed. Closing PR #31596 and removing active labels.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug