Bug 219240

Summary: Calling waitForAndDispatchImmediately<M> on a loop fails when multiple M messages arrive simultaneously
Product: WebKit Reporter: Wenson Hsieh <wenson_hsieh>
Component: WebKit2Assignee: Wenson Hsieh <wenson_hsieh>
Status: RESOLVED FIXED    
Severity: Normal CC: cdumez, ggaren, sabouhallawa, thorton, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 219091    
Attachments:
Description Flags
Patch none

Description Wenson Hsieh 2020-11-21 16:47:44 PST
Fix a race condition in IPC logic, wherein a client that calls waitForAndDispatchImmediately<M>() n times may time out when receiving n messages of type M.

...more concretely:

RemoteImageBufferProxy sync-waits for consecutive `RemoteRenderingBackendProxy::DidFlush` messages, and finishes sync-waiting when the DidFlush message containing the expected flush identifier arrives. The following may occur:

(Main thread): waitForAndDispatchImmediately<RemoteRenderingBackendProxy::DidFlush>()

(IPC thread): receives the first DidFlush message
(IPC thread): sees that the message name matches that of m_waitingForMessage, and signals the main thread

(Main thread): sees that m_waitingForMessage now has a decoder, and nulls out m_waitingForMessage
(Main thread): calls into RemoteRenderingBackendProxy::didFlush() and returns from waitForAndDispatchImmediately

(IPC thread): receives the second DidFlush message
(IPC thread): sees that m_waitingForMessage is null

(Main thread): calls into RemoteRenderingBackendProxy::didFlush() and returns from waitForAndDispatchImmediately
(Main thread): sees that the flush is still pending, so we call waitForAndDispatchImmediately<RemoteRenderingBackendProxy::DidFlush>() again
(Main thread): sees that the incoming messages queue doesn't contain DidFlush, so we wait

(IPC thread): enqueues the incoming DidFlush message onto the main thread (via enqueueIncomingMessage)

...and now the main thread will time out waiting for the RemoteRenderingBackendProxy::DidFlush which is currently sitting in the incoming messages queue :|
Comment 1 Wenson Hsieh 2020-11-23 12:54:16 PST
Created attachment 414812 [details]
Patch
Comment 2 Chris Dumez 2020-11-23 19:18:50 PST
Comment on attachment 414812 [details]
Patch

r=me assuming the bots are happy.
Comment 3 Wenson Hsieh 2020-11-26 20:40:34 PST
Comment on attachment 414812 [details]
Patch

Thanks for the review!
Comment 4 EWS 2020-11-26 20:42:56 PST
Committed r270180: <https://trac.webkit.org/changeset/270180>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 414812 [details].
Comment 5 Radar WebKit Bug Importer 2020-11-26 20:43:24 PST
<rdar://problem/71757243>
Comment 6 Geoffrey Garen 2020-11-30 10:23:53 PST
Comment on attachment 414812 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=414812&action=review

> Source/WebKit/Platform/IPC/Connection.cpp:553
> +        if (hasIncomingSynchronousMessage && waitForOptions.contains(WaitForOption::InterruptWaitingIfSyncMessageArrives)) {

Since we needed to hold a lock while computing hasIncomingSynchronousMessage, it seems like it must be a bug to check hasIncomingSynchronousMessage after dropping the lock. Between dropping the lock and checking hasIncomingSynchronousMessage, the true value of hasIncomingSynchronousMessage may have changed (because a new incoming message may have arrived).
Comment 7 Geoffrey Garen 2020-11-30 10:26:45 PST
Comment on attachment 414812 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=414812&action=review

>> Source/WebKit/Platform/IPC/Connection.cpp:553
>> +        if (hasIncomingSynchronousMessage && waitForOptions.contains(WaitForOption::InterruptWaitingIfSyncMessageArrives)) {
> 
> Since we needed to hold a lock while computing hasIncomingSynchronousMessage, it seems like it must be a bug to check hasIncomingSynchronousMessage after dropping the lock. Between dropping the lock and checking hasIncomingSynchronousMessage, the true value of hasIncomingSynchronousMessage may have changed (because a new incoming message may have arrived).

But I guess this is OK because there is only one IPC thread, and we are running on it, and we know that only the IPC thread adds to m_incomingMessages?
Comment 8 Chris Dumez 2020-11-30 10:44:11 PST
(In reply to Geoffrey Garen from comment #7)
> Comment on attachment 414812 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=414812&action=review
> 
> >> Source/WebKit/Platform/IPC/Connection.cpp:553
> >> +        if (hasIncomingSynchronousMessage && waitForOptions.contains(WaitForOption::InterruptWaitingIfSyncMessageArrives)) {
> > 
> > Since we needed to hold a lock while computing hasIncomingSynchronousMessage, it seems like it must be a bug to check hasIncomingSynchronousMessage after dropping the lock. Between dropping the lock and checking hasIncomingSynchronousMessage, the true value of hasIncomingSynchronousMessage may have changed (because a new incoming message may have arrived).
> 
> But I guess this is OK because there is only one IPC thread, and we are
> running on it, and we know that only the IPC thread adds to
> m_incomingMessages?

2 things:
a. This is not new behavior in this patch
b. Yes, it is my understanding that incoming messages only get enqueued by the IPC thread and we are on the IPC thread here. The reason we have a lock is because the main thread dequeues those messages.