Bug 219240 - Calling waitForAndDispatchImmediately<M> on a loop fails when multiple M messages arrive simultaneously
Summary: Calling waitForAndDispatchImmediately<M> on a loop fails when multiple M mess...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Wenson Hsieh
URL:
Keywords: InRadar
Depends on:
Blocks: 219091
  Show dependency treegraph
 
Reported: 2020-11-21 16:47 PST by Wenson Hsieh
Modified: 2020-11-30 10:44 PST (History)
5 users (show)

See Also:


Attachments
Patch (7.69 KB, patch)
2020-11-23 12:54 PST, Wenson Hsieh
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Wenson Hsieh 2020-11-21 16:47:44 PST
Fix a race condition in IPC logic, wherein a client that calls waitForAndDispatchImmediately<M>() n times may time out when receiving n messages of type M.

...more concretely:

RemoteImageBufferProxy sync-waits for consecutive `RemoteRenderingBackendProxy::DidFlush` messages, and finishes sync-waiting when the DidFlush message containing the expected flush identifier arrives. The following may occur:

(Main thread): waitForAndDispatchImmediately<RemoteRenderingBackendProxy::DidFlush>()

(IPC thread): receives the first DidFlush message
(IPC thread): sees that the message name matches that of m_waitingForMessage, and signals the main thread

(Main thread): sees that m_waitingForMessage now has a decoder, and nulls out m_waitingForMessage
(Main thread): calls into RemoteRenderingBackendProxy::didFlush() and returns from waitForAndDispatchImmediately

(IPC thread): receives the second DidFlush message
(IPC thread): sees that m_waitingForMessage is null

(Main thread): calls into RemoteRenderingBackendProxy::didFlush() and returns from waitForAndDispatchImmediately
(Main thread): sees that the flush is still pending, so we call waitForAndDispatchImmediately<RemoteRenderingBackendProxy::DidFlush>() again
(Main thread): sees that the incoming messages queue doesn't contain DidFlush, so we wait

(IPC thread): enqueues the incoming DidFlush message onto the main thread (via enqueueIncomingMessage)

...and now the main thread will time out waiting for the RemoteRenderingBackendProxy::DidFlush which is currently sitting in the incoming messages queue :|
Comment 1 Wenson Hsieh 2020-11-23 12:54:16 PST
Created attachment 414812 [details]
Patch
Comment 2 Chris Dumez 2020-11-23 19:18:50 PST
Comment on attachment 414812 [details]
Patch

r=me assuming the bots are happy.
Comment 3 Wenson Hsieh 2020-11-26 20:40:34 PST
Comment on attachment 414812 [details]
Patch

Thanks for the review!
Comment 4 EWS 2020-11-26 20:42:56 PST
Committed r270180: <https://trac.webkit.org/changeset/270180>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 414812 [details].
Comment 5 Radar WebKit Bug Importer 2020-11-26 20:43:24 PST
<rdar://problem/71757243>
Comment 6 Geoffrey Garen 2020-11-30 10:23:53 PST
Comment on attachment 414812 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=414812&action=review

> Source/WebKit/Platform/IPC/Connection.cpp:553
> +        if (hasIncomingSynchronousMessage && waitForOptions.contains(WaitForOption::InterruptWaitingIfSyncMessageArrives)) {

Since we needed to hold a lock while computing hasIncomingSynchronousMessage, it seems like it must be a bug to check hasIncomingSynchronousMessage after dropping the lock. Between dropping the lock and checking hasIncomingSynchronousMessage, the true value of hasIncomingSynchronousMessage may have changed (because a new incoming message may have arrived).
Comment 7 Geoffrey Garen 2020-11-30 10:26:45 PST
Comment on attachment 414812 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=414812&action=review

>> Source/WebKit/Platform/IPC/Connection.cpp:553
>> +        if (hasIncomingSynchronousMessage && waitForOptions.contains(WaitForOption::InterruptWaitingIfSyncMessageArrives)) {
> 
> Since we needed to hold a lock while computing hasIncomingSynchronousMessage, it seems like it must be a bug to check hasIncomingSynchronousMessage after dropping the lock. Between dropping the lock and checking hasIncomingSynchronousMessage, the true value of hasIncomingSynchronousMessage may have changed (because a new incoming message may have arrived).

But I guess this is OK because there is only one IPC thread, and we are running on it, and we know that only the IPC thread adds to m_incomingMessages?
Comment 8 Chris Dumez 2020-11-30 10:44:11 PST
(In reply to Geoffrey Garen from comment #7)
> Comment on attachment 414812 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=414812&action=review
> 
> >> Source/WebKit/Platform/IPC/Connection.cpp:553
> >> +        if (hasIncomingSynchronousMessage && waitForOptions.contains(WaitForOption::InterruptWaitingIfSyncMessageArrives)) {
> > 
> > Since we needed to hold a lock while computing hasIncomingSynchronousMessage, it seems like it must be a bug to check hasIncomingSynchronousMessage after dropping the lock. Between dropping the lock and checking hasIncomingSynchronousMessage, the true value of hasIncomingSynchronousMessage may have changed (because a new incoming message may have arrived).
> 
> But I guess this is OK because there is only one IPC thread, and we are
> running on it, and we know that only the IPC thread adds to
> m_incomingMessages?

2 things:
a. This is not new behavior in this patch
b. Yes, it is my understanding that incoming messages only get enqueued by the IPC thread and we are on the IPC thread here. The reason we have a lock is because the main thread dequeues those messages.