Bug 80982

Summary: [Qt] RunLoopQt is missing reentrancy guards
Product: WebKit Reporter: Simon Hausmann <hausmann>
Component: PlatformAssignee: Simon Hausmann <hausmann>
Severity: Normal CC: abecsi, dinu.jacob, webkit.review.bot, yael
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 81671    
Bug Blocks:    
Description Flags
Patch none

Description Simon Hausmann 2012-03-13 05:09:12 PDT
This is mostly relevant to WebKit 2, where the following scenario takes place:

Setup: The web process runs two event loops, the main event loop and the event loop with the socket notifier that listens on the socket connected to the uiprocess.

    1) The web process receives an incoming message from the ui process to create a new page.

    2) The message is received in the WorkQueue's thread of the connection object. It is read from the socket and queued in the list of incoming messages. RunLoop::dispatch() is called with Connection::dispatchMessages as message to call from the main runloop's thread. This will cause the Qt implementation of RunLoop::wakeUp to post a message to the main event loop, causing it to wake up.

    2) The main event loop's thread wakes up and calls Connection::dispatchMessages() via RunLoop::performWork(). dispatchMessages() runs a while (true) loop to process all new messages from the queue of incoming messages.

    3) In the mean time on the webprocess' message receiving thread, another message is received, one for a specific page - for example loadUrl for the page that was supposed to be created in the previous message. The message is read from the socket, queued in the queue of incoming messages and a message is posted to the main event loop again to wake it up.

    4) The message (create new page) is processed, a new WebCore::Page is created, among other things clients like NetworkStateNotifier are created, which initiated the Qt bearer management. Unfortunately the implementation of the latter in some environments does things like calling QEventLoop::exec() while waiting for some system component to respond, which results in the processing of pending events in the main event loop.

    5) The wake-up scheduled in step (3) is triggered through the nested event loop, causing dispatchMessages() to be called. This will take the next message (loadUrl) from the queue and start processing it, even though we are still in the process of constructing our initial page. As it happens the message is not processed successfully because the corresponding page id is not registered yet.

As a result the start-up sequence is kind of random and sometimes an initially requested url is loaded, sometimes it isn't.

The proposed fix is to prevent the Qt implementation of RunLoop to call performWork() recursively. Due to the nature of message queuing, in the above scenario the message would still be received, read and queued in the list of incoming messages. It will consequently be processed by either the while (true) loop of the first, initial dispatchMessages() call or by a subsequent one.
Comment 1 Simon Hausmann 2012-03-13 05:32:45 PDT
Created attachment 131596 [details]
Comment 2 Dinu Jacob 2012-03-13 07:07:45 PDT
Very well written detailed description :) This solution makes it Qt specific!
Comment 3 Simon Hausmann 2012-03-13 08:43:33 PDT
(In reply to comment #2)
> Very well written detailed description :) This solution makes it Qt specific!

Right, it's a Qt specific solution for what I think is a Qt specific problem at this point.
Comment 4 Tor Arne Vestbø 2012-03-14 06:43:51 PDT
Comment on attachment 131596 [details]

Comment 5 Andras Becsi 2012-03-14 08:02:41 PDT
Comment on attachment 131596 [details]

Clearing flags on attachment: 131596

Committed r110699: <http://trac.webkit.org/changeset/110699>
Comment 6 Andras Becsi 2012-03-14 08:03:20 PDT
All reviewed patches have been landed.  Closing bug.
Comment 7 Simon Hausmann 2012-05-16 07:23:01 PDT
Change http://trac.webkit.org/changeset/117286 rolled out this change, as it broke modal event loop handling. The symptom of this issue is fixed differently, and there's a growing trend in qt to stop doing evil local-event-loop spinning in synchronous APIs (and instead block on worker threads), so I think it's safe to go back to the previous approach for now.

Changing this to WONTFIX in the meantime.