Bug 221560 - Connection clients should be able to obtain the messages as data instead of embedded in function references
Summary: Connection clients should be able to obtain the messages as data instead of e...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2
Version: WebKit Local Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Kimmo Kinnunen
URL:
Keywords: InRadar
Depends on: 221891 221894
Blocks: webglgpup 219641
Reported: 2021-02-08 10:17 PST by Kimmo Kinnunen
Modified: 2021-02-16 10:48 PST
CC: 16 users


Attachments
Patch (93.95 KB, patch)
2021-02-08 13:00 PST, Kimmo Kinnunen
ews-feeder: commit-queue-
Patch (63.37 KB, patch)
2021-02-08 13:07 PST, Kimmo Kinnunen
Patch (71.15 KB, patch)
2021-02-09 02:14 PST, Kimmo Kinnunen
rebase (71.55 KB, patch)
2021-02-09 05:40 PST, Kimmo Kinnunen
rebase2 (71.57 KB, patch)
2021-02-10 12:37 PST, Kimmo Kinnunen
remove examples (37.68 KB, patch)
2021-02-11 00:43 PST, Kimmo Kinnunen
attempt to simplify (41.63 KB, patch)
2021-02-15 04:37 PST, Kimmo Kinnunen
rebase (44.32 KB, patch)
2021-02-16 08:01 PST, Kimmo Kinnunen

Description Kimmo Kinnunen 2021-02-08 10:17:20 PST
Connection clients should be able to obtain the messages as data instead of embedded in function references

Dropping functions is not well-defined, since the abstraction hides what assumptions the function's caller has made about the function.

The WebGL IPC implementation needs to obtain sync and async message data and store it until its turn comes in the message order.
The WebGL IPC implementation also needs to be able to discard these messages if an error occurs before the stored sync/async messages are executed.
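The requirement can be illustrated with a minimal, self-contained sketch (the names MessageStore and Message are hypothetical, not WebKit API): once messages are held as plain data rather than as opaque function references, the client can execute them in order or drop them wholesale after an error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical sketch, not WebKit API: messages held as plain data rather
// than as function references. Because they are data, the client can run
// them in order or drop them wholesale after an error.
struct Message {
    uint64_t destinationID;
    std::string payload; // stands in for the encoded arguments
};

class MessageStore {
public:
    void enqueue(Message&& message) { m_messages.push_back(std::move(message)); }

    // Execute the stored messages, in order, through a handler the client controls.
    template <typename Handler>
    void drainTo(Handler&& handler) {
        for (auto& message : m_messages)
            handler(message);
        m_messages.clear();
    }

    // On error the stored messages can simply be dropped; this is only
    // possible because they are data, not function references.
    void discardAll() { m_messages.clear(); }

    std::size_t size() const { return m_messages.size(); }

private:
    std::deque<Message> m_messages;
};
```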
Comment 1 Kimmo Kinnunen 2021-02-08 13:00:27 PST
Created attachment 419614 [details]
Patch
Comment 2 Kimmo Kinnunen 2021-02-08 13:07:33 PST
Created attachment 419616 [details]
Patch
Comment 3 Kimmo Kinnunen 2021-02-09 02:14:57 PST
Created attachment 419691 [details]
Patch
Comment 4 Kimmo Kinnunen 2021-02-09 05:40:32 PST
Created attachment 419703 [details]
rebase
Comment 5 Kimmo Kinnunen 2021-02-10 12:37:06 PST
Created attachment 419889 [details]
rebase2
Comment 6 Chris Dumez 2021-02-10 15:31:36 PST
Comment on attachment 419703 [details]
rebase

View in context: https://bugs.webkit.org/attachment.cgi?id=419703&action=review

> Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:93
> +    m_workQueue->dispatch([protectedThis = WTFMove(refFromConnection)] {

I don't really understand the purpose of this. If someone had done a workQueue dispatch elsewhere and failed to capture protectedThis in the lambda, then it would be their bug.

> Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.cpp:118
> +    callOnMainRunLoopAndWait([weakRemoteMediaPlayerManagerProxy = m_remoteMediaPlayerManagerProxy, &context, &result, &mediaItem]() {

I spent time moving all rendering work to a background thread, and it seems this is basically reverting part of my work by doing the media rendering back on the main thread.

> Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.h:-143
> -    Ref<GPUConnectionToWebProcess> m_gpuConnectionToWebProcess;

I was ref'ing the GPUConnectionToWebProcess, so that I could access its RemoteMediaPlayerManagerProxy from the background thread.

> Source/WebKit/GPUProcess/graphics/RemoteRenderingBackend.h:153
> +    WeakPtr<RemoteMediaPlayerManagerProxy> m_remoteMediaPlayerManagerProxy;

The issue with WeakPtr is that it does not play nice with threads and this class is doing most of its work on a background thread nowadays.
Comment 7 Chris Dumez 2021-02-10 15:36:46 PST
This patch is really big and does quite a bit of refactoring. I suggest splitting it to facilitate reviewing.
Comment 8 Kimmo Kinnunen 2021-02-11 00:43:24 PST
Created attachment 419958 [details]
remove examples
Comment 9 Chris Dumez 2021-02-11 08:55:37 PST
Comment on attachment 419958 [details]
remove examples

View in context: https://bugs.webkit.org/attachment.cgi?id=419958&action=review

> Source/WebKit/ChangeLog:9
> +        the message receive thread and forwards them to caller as data.

So now we're just running random WebKit code from the IPC receive queue thread? I did not come up with the original design but I thought it was nice that the IPC thread could not be badly impacted by issues in the rest of the WebKit code base (e.g. hangs).

> Source/WebKit/ChangeLog:10
> +        This is important in order to be able to skip messages that are stale

I don't really understand why we need an entirely new IPC concept for this. Our IPC code is already complex and we already have MessageReceiver, WorkQueueMessageReceiver and ThreadMessageReceiver.
I would be really surprised if there wasn't a way to do whatever you need to do with the current IPC infrastructure.

I am not sure I fully understand the issue but it seems to be you could easily pass an ID for the sender as part of the IPC message, which would get passed as parameter to whatever function gets called as a result of the IPC. This would be similar to what we do with the PageID when sending messages to the UIProcess. And you could use this ID to determine if you still care about this sender or not.
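The alternative suggested here, tagging each message with a sender ID and filtering stale ones at processing time, could be sketched as follows (illustrative types only, not WebKit API):

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Illustrative types only, not WebKit API. Each message carries the ID of
// its sender; the receiver keeps the set of senders it still cares about
// and drops messages from anyone else as stale.
struct TaggedMessage {
    uint64_t senderID;
    int payload;
};

class StalenessFilter {
public:
    void registerSender(uint64_t id) { m_liveSenders.insert(id); }
    void unregisterSender(uint64_t id) { m_liveSenders.erase(id); }

    // Returns true if the message was processed, false if dropped as stale.
    bool process(const TaggedMessage& message, std::vector<int>& out) {
        if (!m_liveSenders.count(message.senderID))
            return false;
        out.push_back(message.payload);
        return true;
    }

private:
    std::unordered_set<uint64_t> m_liveSenders;
};
```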

> Source/WebKit/Platform/IPC/FunctionDispatcherMessageReceiveQueue.h:48
> +        m_dispatcher.dispatch([connection = makeRef(connection), message = WTFMove(message), &receiver = m_receiver]() mutable {

Considering m_dispatcher may be a WorkQueue, why is it safe to capture m_receiver by reference? What guarantees that m_receiver is still alive by the time the lambda runs?
Comment 10 Myles C. Maxfield 2021-02-12 09:03:14 PST
Comment on attachment 419958 [details]
remove examples

View in context: https://bugs.webkit.org/attachment.cgi?id=419958&action=review

> Source/WebKit/ChangeLog:100
> -        * GPUProcess/media/RemoteAudioHardwareListenerProxy.cpp: 
> +        * GPUProcess/media/RemoteAudioHardwareListenerProxy.cpp:

We usually don't modify other people's changelog entries.
Comment 11 Kimmo Kinnunen 2021-02-12 10:28:22 PST
(In reply to Chris Dumez from comment #9)
> Comment on attachment 419958 [details]
> remove examples
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=419958&action=review
> 
> > Source/WebKit/ChangeLog:9
> > +        the message receive thread and forwards them to caller as data.
> 
> So now we're just running random WebKit code from the IPC receive queue
> thread? I did not come up with the original design but I thought it was nice
> that the IPC thread could not be badly impacted by issues in the rest of the
> WebKit code base (e.g. hangs).

So by "random WebKit code", you mean a client-overridable virtual function?
That's already there in ThreadMessageReceiver.

> 
> > Source/WebKit/ChangeLog:10
> > +        This is important in order to be able to skip messages that are stale
> 
> I don't really understand why we need an entirely new IPC concept for this.

You cannot get ownership of the Decoder. You are passed a Decoder ref and, possibly, ownership of an Encoder, dispatched through a closure.

This is not good, since with the stream extensions:
1) you don't want an Encoder, since you might reply through the SHM;
2) you don't want a closure, since we want to be able to not hold on to it.



> Our IPC code is already complex and we already have MessageReceiver,
> WorkQueueMessageReceiver and ThreadMessageReceiver.
> I would be really surprised if there wasn't a way to do whatever you need to
> do with the current IPC infrastructure.

The idea would be to simplify the code by removing WorkQueueMessageReceiver, ThreadMessageReceiver and ThreadMessageReceiverRefCounted, since they're one and the same concept.


> Considering m_dispatcher may be a WorkQueue, why is it safe to capture m_receiver by reference? What guarantees that m_receiver is still alive by the time the lambda runs?

Nothing. The dispatcher does not need to be alive when the lambda runs. :)

If you consider the case that a dispatcher would dispatch to main thread:
1) the dispatcher dispatches the tasks to main thread.
2) somebody deletes the dispatcher
3) after 30 mins, the tasks run in main thread.

The dispatcher needs to be alive when the dispatch happens.

The dispatcher is alive when the dispatch happens, because the client calls:

addMessageReceiveQueue()
 .. dispatches start to happen
removeMessageReceiveQueue()
  .. dispatches never happen anymore
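The add/remove contract described above can be sketched in portable C++ (std::function stands in for the real dispatcher, and all names here are hypothetical):

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Sketch of the contract, with hypothetical names: while a queue is
// registered the connection may invoke its dispatcher; after removal it
// never does. The dispatcher's lifetime therefore only has to span the
// add..remove window, and no refcounting is forced on the receiver.
class FakeConnection {
public:
    void addMessageReceiveQueue(std::function<void(std::string)>&& dispatcher) {
        m_dispatcher = std::move(dispatcher);
    }
    void removeMessageReceiveQueue() { m_dispatcher = nullptr; }

    void deliver(std::string message) {
        if (m_dispatcher) // after removal, messages are never dispatched
            m_dispatcher(std::move(message));
    }

private:
    std::function<void(std::string)> m_dispatcher;
};
```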
Comment 12 Kimmo Kinnunen 2021-02-12 10:38:17 PST
> Considering m_dispatcher may be a WorkQueue, why is it safe to capture m_receiver by reference? What guarantees that m_receiver is still alive by the time the lambda runs?

And to expand:

addMessageReceiveQueue() means "I want messages. I guarantee the dispatcher is alive".

So it's confusing to introduce any refcounts into the equation.

removeMessageReceiveQueue() means "I don't want messages."

So in the case of a work queue + "me as the message receiver" code structure, the code would be:

startMessages()
    addMessageReceiveQueue(workQueue, this, ...);


stopMessagesSinceIWillDropMyRefAndYouExistOnlyForMyRequests()
    removeMessageReceiveQueue(workQueue, this, ...);
    workQueue->dispatch([protectedThis = makeRef(*this)]() {
        cleanupIfNeeded();
    });

(This was the example in the patch before the "remove examples" patch)

So essentially to your question:

The modification is deliberately designed to _not_ force the dispatched functions to hold a ref to the message receiver. Forcing that is not very good, since message receivers don't necessarily need to be refcounted.

The caller may very well capture their specific message receiver with a ref. However, for correct operation none of the existing call sites need this.
Comment 13 Kimmo Kinnunen 2021-02-12 10:41:19 PST
> force the dispatched functions hold the ref of the messagereceiver -- since that's not very good

And to back this claim up, you only need to look at the back-and-forth and dead code around ThreadMessageReceiver / ThreadMessageReceiverRefCounted. Forcing something into the middle of an inheritance hierarchy typically causes problems.
Comment 14 Chris Dumez 2021-02-12 10:41:25 PST
(In reply to Kimmo Kinnunen from comment #11)
> (In reply to Chris Dumez from comment #9)
> > Comment on attachment 419958 [details]
> > remove examples
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=419958&action=review
> > 
> > > Source/WebKit/ChangeLog:9
> > > +        the message receive thread and forwards them to caller as data.
> > 
> > So now we're just running random WebKit code from the IPC receive queue
> > thread? I did not come up with the original design but I thought it was nice
> > that the IPC thread could not be badly impacted by issues in the rest of the
> > WebKit code base (e.g. hangs).
> 
> So by "random webkit code", you mean client overridable virtual function?
> It's there already in the ThreadMessageReceiver.

Up until now, the only thing we did on the IPC receive queue was receive the IPC, then dispatch to either the main thread, a WorkQueue or a Thread.
We did not actually process the IPC (i.e. do the operation resulting from the IPC) on the IPC receive queue. Changing this is a pretty big behavior change and I am curious what other experts think about this.
As I said, I did not come up with the original design, but it was very robust to have the IPC receive queue only do the dispatching. If the IPC receive queue starts doing more work, then we open the door to new classes of bugs where a process may no longer process incoming IPC because it is stuck somewhere in WebKit code.

> 
> > 
> > > Source/WebKit/ChangeLog:10
> > > +        This is important in order to be able to skip messages that are stale
> > 
> > I don't really understand why we need an entirely new IPC concept for this.
> 
> You cannot get ownership of the Decoder. You get passed a Decoder ref and a
> possibility of ownership of Encoder, dispatched through a closure.
> 
> This is not good, since with the stream extensions 
> 1) you don't want an Encoder, since you might reply through the SHM. 
> 2) you don't want a closure, since we want to be able to not hold on to it

This has been working well for years. What's new or different now that we need this capability? You can send a shared memory handle via IPC and then later on communicate via that shared memory already. We already do this in other places.

> 
> 
> 
> > Our IPC code is already complex and we already have MessageReceiver,
> > WorkQueueMessageReceiver and ThreadMessageReceiver.
> > I would be really surprised if there wasn't a way to do whatever you need to
> > do with the current IPC infrastructure.
> 
> The idea would be to simplify the code by removing WorkQueueMessageReceiver,
> ThreadMessageReceiver and ThreadMessageReceiverRefCounted, since they're one
> and the same concept.

Those are the concepts we've been using, and they've worked well. I am not convinced the proposed approach is simpler.

> 
> 
> > Considering m_dispatcher may be a WorkQueue, why is it safe to capture m_receiver by reference? What guarantees that m_receiver is still alive by the time the lambda runs?
> 
> Nothing. The dispatcher does not need to be alive when the lambda runs. :)
> 
> If you consider the case that a dispatcher would dispatch to main thread:
> 1) the dispatcher dispatches the tasks to main thread.
> 2) somebody deletes the dispatcher
> 3) after 30 mins, the tasks run in main thread.
> 
> The dispatcher needs to be alive when the dispatch happens.
> 
> The dispatcher is alive when the dispatch happens, because the client calls:
> 
> addMessageReceiveQueue()
>  .. dispatches start to happen
> removeMessageReceiveQueue()
>   .. dispatches never happen anymore

I did not ask about the dispatcher. I asked about the receiver (m_receiver), which you capture by reference in your dispatch lambda and then use in that lambda.
Comment 15 Kimmo Kinnunen 2021-02-12 11:02:19 PST
> What guarantees that m_receiver is still alive by the time the lambda runs?

Ah, I may have answered the question in a roundabout way.

So this is essentially guaranteed by the caller.
If the caller wants the dispatch to go to a reffed receiver, then the caller provides a custom dispatcher that knows how to ref the receiver.

If the caller knows that the receiver is a NeverDestroyed singleton, that by itself guarantees the aliveness.

(and my favourite)
The caller knows that the receiver is alive because the caller is the receiver, so they make sure they're alive for the scheduled functions as well.

However, this particular design point, though I think it's a good one, is very much beside the point of this patch. I can add however many "WorkQueueMessageReceiver/ThreadMessageReceiver/ThreadMessageReceiverRefCounted" classes you think are necessary to Connection.h and dispatch them from the dispatch site with whatever ref semantics you think are needed.
Comment 16 Kimmo Kinnunen 2021-02-12 11:07:49 PST
> Up until now, the only thing we did on the IPC receive queue was receive the IPC, then dispatch to either the main thread, a WorkQueue or a Thread.
> We did not actually process the IPC (i.e. do the operation resulting from the IPC) on the IPC receive queue. Changing this is a pretty big behavior change and I am curious what other experts think about this.

Ah, but this is not what the patch is trying to do. The patch is trying to maintain status quo with this regard.

class ThreadMessageReceiver : public MessageReceiver {
    public:
        virtual void dispatchToThread(WTF::Function<void()>&&) = 0;

vs

+class MessageReceiveQueue {
+public:
+    virtual ~MessageReceiveQueue() = default;
+    // Called with Connection incoming message lock held.
+    // May be called from multiple threads.
+    virtual void enqueueMessage(Connection&, std::unique_ptr<Decoder>&&) = 0;
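To make the comparison concrete, here is a sketch of a dispatch-only MessageReceiveQueue implementation, in portable C++ with std types standing in for WTF ones (the interface shape is taken from the patch excerpt above; everything else is illustrative). Like the old dispatchToThread(), it does no real work on the receive queue itself:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

struct Decoder { int value = 0; }; // stand-in for IPC::Decoder

// Interface shape taken from the patch excerpt; std::function stands in
// for the real dispatcher type.
class MessageReceiveQueue {
public:
    virtual ~MessageReceiveQueue() = default;
    virtual void enqueueMessage(std::unique_ptr<Decoder>&&) = 0;
};

// A dispatch-only implementation: equivalent in spirit to the old
// dispatchToThread(). No message processing happens on the receive queue;
// the decoded message is handed on, as data, to the dispatcher.
class DispatchingQueue final : public MessageReceiveQueue {
public:
    DispatchingQueue(std::function<void(std::function<void()>&&)> dispatcher,
                     std::function<void(Decoder&)> handler)
        : m_dispatcher(std::move(dispatcher))
        , m_handler(std::move(handler))
    { }

    void enqueueMessage(std::unique_ptr<Decoder>&& decoder) override {
        // shared_ptr because std::function requires a copyable callable.
        m_dispatcher([handler = m_handler, d = std::shared_ptr<Decoder>(std::move(decoder))] {
            handler(*d);
        });
    }

private:
    std::function<void(std::function<void()>&&)> m_dispatcher;
    std::function<void(Decoder&)> m_handler;
};
```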
Comment 17 Kimmo Kinnunen 2021-02-12 11:19:28 PST
> What's new or different now that we need this capability? You can send a shared memory handle via IPC and then later on communicate via that shared memory already.

So you can write an int to SHM and then read that int from SHM.
Then you write a string to SHM and then read that string from SHM.

....

The thing I'm trying to implement in the depending patch is to run the existing IPC implementation through the SHM, so that not everybody needs to implement their own IPC protocol whenever the existing WebKit IPC transport is not applicable.

The idea is that there should not be multiple home-baked implementations of IPC. There should be a few widely used and robust implementations.
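As an illustration of the kind of transport this implies (a sketch, not code from the patch), a single-producer/single-consumer ring buffer is the usual primitive a shared-memory message stream is built on:

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Illustrative sketch only: a lock-free single-producer/single-consumer
// byte ring of the kind a shared-memory message stream can be built on.
// Real SHM IPC additionally needs cross-process memory mapping and wake-up
// signalling, which are omitted here.
template <std::size_t Capacity>
class SpscRing {
public:
    bool tryPush(uint8_t byte) {
        std::size_t head = m_head.load(std::memory_order_relaxed);
        std::size_t next = (head + 1) % Capacity;
        if (next == m_tail.load(std::memory_order_acquire))
            return false; // full
        m_data[head] = byte;
        m_head.store(next, std::memory_order_release);
        return true;
    }

    std::optional<uint8_t> tryPop() {
        std::size_t tail = m_tail.load(std::memory_order_relaxed);
        if (tail == m_head.load(std::memory_order_acquire))
            return std::nullopt; // empty
        uint8_t byte = m_data[tail];
        m_tail.store((tail + 1) % Capacity, std::memory_order_release);
        return byte;
    }

private:
    std::array<uint8_t, Capacity> m_data {};
    std::atomic<std::size_t> m_head { 0 };
    std::atomic<std::size_t> m_tail { 0 };
};
```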
Comment 18 Kimmo Kinnunen 2021-02-15 04:37:01 PST
Created attachment 420305 [details]
attempt to simplify
Comment 19 Kimmo Kinnunen 2021-02-15 04:42:51 PST
(In reply to Kimmo Kinnunen from comment #18)
> Created attachment 420305 [details]
> attempt to simplify

I tried to address the claims of:
 - making the code more complex --> remove duplicated code paths
 - ref the receiver ? --> remove the code

How about now? It needs the blocking fixes in bug 221891 and bug 221894 to go green, though.
Comment 20 Radar WebKit Bug Importer 2021-02-15 10:18:47 PST
<rdar://problem/74355065>
Comment 21 Kimmo Kinnunen 2021-02-16 08:01:54 PST
Created attachment 420472 [details]
rebase
Comment 22 Kimmo Kinnunen 2021-02-16 10:06:52 PST
Comment on attachment 420472 [details]
rebase

View in context: https://bugs.webkit.org/attachment.cgi?id=420472&action=review

> Source/WebKit/Platform/IPC/Connection.cpp:1028
> +    }

Had to add this back.
Originally I didn't understand that this hunk is needed because of another overlapping bug, e.g. one where a message might be queued to the SyncState instead of the Connection.
Comment 23 Chris Dumez 2021-02-16 10:26:10 PST
Comment on attachment 420472 [details]
rebase

r=me
Comment 24 EWS 2021-02-16 10:48:01 PST
Committed r272916: <https://commits.webkit.org/r272916>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 420472 [details].