Bug 112729 - [GTK] Web Process crash when the UI process finishes too early
Summary: [GTK] Web Process crash when the UI process finishes too early
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords: Gtk
Depends on:
Blocks:
 
Reported: 2013-03-19 12:00 PDT by Carlos Garcia Campos
Modified: 2013-04-12 10:23 PDT (History)
8 users (show)

See Also:


Attachments
Patch (12.93 KB, patch)
2013-03-19 12:08 PDT, Carlos Garcia Campos
andersca: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Carlos Garcia Campos 2013-03-19 12:00:31 PDT
It finishes with the following error:

(WebKitWebProcess:16101): GLib-GIO-ERROR **: creating GSocket from fd 11: Bad file descriptor

The problem is when creating the GSocket in the WorkQeue for the socket descriptor. GLib considers a programmer error to create a GSocket providing an invalid socket and finishes the process with g_error(). We are actually providing a valid socket when creating the GSocket, but it can be invalidated by the worker thread while the GSocket is being created. This is because registerEventSourceHandler is called from the main thread and unregisterEventSourceHandler can be called from the worker thread. We are currently registering two even handlers, one to read data from the socket and another one to close the wk connection when the socket connection is broken. Every event source registered uses its own GSocket (even if the file descriptor is actually the same), so that when the UI process finishes too early, the first event handler can be executed in the worker thread, closing the socket descriptor, while the main thread is creating the GSocket for the second one. 
We don't really need to use a separate event handler to monitor the connection, because GSocket always notifies when condition is G_IO_HUP and G_IO_ERR even if they haven't been explicitly set in g_socket_create_source(). We can register socket event sources differently, providing also a function to be called when the connection is closed, using a single socket and the same even source.
Comment 1 Carlos Garcia Campos 2013-03-19 12:08:53 PDT
Created attachment 193892 [details]
Patch
Comment 2 Martin Robinson 2013-03-19 13:16:11 PDT
Comment on attachment 193892 [details]
Patch

Looks good to me! Just need an owner stamp. Sam or Anders?
Comment 3 Lauro Moura Maranhao Neto 2013-04-05 10:26:25 PDT
Comment on attachment 193892 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=193892&action=review

> Source/WebKit2/Platform/gtk/WorkQueueGtk.cpp:99
> +        return condition & m_condition;

Shouldn't this function check whether m_cancellable was cancelled before? Calling g_cancellable_cancel will trigger the event source with the current condition.
Comment 4 Lauro Moura Maranhao Neto 2013-04-05 12:00:35 PDT
Also, I've tested this patch with the Nix test I posted on bug #85066 [1] and it works almost fine. Sometimes the Connection's WorkQueue would lock in infinite poll calls when doing series of instant crash/respawn of the WebProcess. I think this happens because ProcessLauncherGtk.cpp::childFinishedFunction can close the socket in another thread while it's being polled by the Connection's WorkQueue.

From the close(2) man page:
"It  is  probably  unwise to close file descriptors while they may be in use by system calls in other threads in the same process.  Since a file descriptor may be reused, there  are  some  obscure race conditions that may cause unintended side effects."

Should I open a separate bug for this? I'll try to trigger this condition in WebKitGtk+ using a modified version of WK2's MouseMoveAfterCrash.cpp.

[1] https://github.com/WebKitNix/webkitnix/blob/master/Tools/TestWebKitAPI/Tests/nix/WebViewWebProcessCrashed.cpp
Comment 5 Carlos Garcia Campos 2013-04-05 23:55:02 PDT
(In reply to comment #3)
> (From update of attachment 193892 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=193892&action=review
> 
> > Source/WebKit2/Platform/gtk/WorkQueueGtk.cpp:99
> > +        return condition & m_condition;
> 
> Shouldn't this function check whether m_cancellable was cancelled before? Calling g_cancellable_cancel will trigger the event source with the current condition.

No, I don't think so, when the cancellable is cancelled, the callback is emitted with no flags, and we silently return FALSE to destroy the source.
Comment 6 Carlos Garcia Campos 2013-04-05 23:59:14 PDT
(In reply to comment #4)
> Also, I've tested this patch with the Nix test I posted on bug #85066 [1] and it works almost fine. Sometimes the Connection's WorkQueue would lock in infinite poll calls when doing series of instant crash/respawn of the WebProcess. I think this happens because ProcessLauncherGtk.cpp::childFinishedFunction can close the socket in another thread while it's being polled by the Connection's WorkQueue.

Yes, I said that part of the patch written for the other bug was already solved by this patch. This fixes the case of the ui process returning too early, but it probably needs changes in the process launcher to fix the other bug. 

> From the close(2) man page:
> "It  is  probably  unwise to close file descriptors while they may be in use by system calls in other threads in the same process.  Since a file descriptor may be reused, there  are  some  obscure race conditions that may cause unintended side effects."
> 
> Should I open a separate bug for this? I'll try to trigger this condition in WebKitGtk+ using a modified version of WK2's MouseMoveAfterCrash.cpp.

If the problem is the same than the mouse move after crash test issue, leave the other bug and make it depende on this one. 

Thanks for your help.
Comment 7 Lauro Moura Maranhao Neto 2013-04-08 11:57:03 PDT
(In reply to comment #6)
> (In reply to comment #4)
> > Also, I've tested this patch with the Nix test I posted on bug #85066 [1] and it works almost fine. Sometimes the Connection's WorkQueue would lock in infinite poll calls when doing series of instant crash/respawn of the WebProcess. I think this happens because ProcessLauncherGtk.cpp::childFinishedFunction can close the socket in another thread while it's being polled by the Connection's WorkQueue.
> 
> Yes, I said that part of the patch written for the other bug was already solved by this patch. This fixes the case of the ui process returning too early, but it probably needs changes in the process launcher to fix the other bug. 

Ok.


> > From the close(2) man page:
> > "It  is  probably  unwise to close file descriptors while they may be in use by system calls in other threads in the same process.  Since a file descriptor may be reused, there  are  some  obscure race conditions that may cause unintended side effects."
> > 
> > Should I open a separate bug for this? I'll try to trigger this condition in WebKitGtk+ using a modified version of WK2's MouseMoveAfterCrash.cpp.
> 
> If the problem is the same than the mouse move after crash test issue, leave the other bug and make it depende on this one. 

Even after fixing the bad file descriptor/lockup issues the mouse move test fails (Both GTK and Nix). So I'll open a new bug about the process launcher fixes and make the mouse move one depend on it. Thanks!
Comment 8 Anders Carlsson 2013-04-12 06:48:37 PDT
Comment on attachment 193892 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=193892&action=review

> Source/WebKit2/Platform/CoreIPC/unix/ConnectionUnix.cpp:430
>  #if PLATFORM(QT)
>      m_socketNotifier = m_connectionQueue->registerSocketEventHandler(m_socketDescriptor, QSocketNotifier::Read, WTF::bind(&Connection::readyReadHandler, this));
>  #elif PLATFORM(GTK)
> -    m_connectionQueue->registerEventSourceHandler(m_socketDescriptor, (G_IO_HUP | G_IO_ERR), WTF::bind(&Connection::connectionDidClose, this));
> -    m_connectionQueue->registerEventSourceHandler(m_socketDescriptor, G_IO_IN, WTF::bind(&Connection::readyReadHandler, this));
> +    m_connectionQueue->registerSocketEventHandler(m_socketDescriptor, G_IO_IN, WTF::bind(&Connection::readyReadHandler, this), WTF::bind(&Connection::connectionDidClose, this));
>  #elif PLATFORM(EFL)
>      m_connectionQueue->registerSocketEventHandler(m_socketDescriptor, WTF::bind(&Connection::readyReadHandler, this));
>  #endif

It's unfortunate that these #ifdefs are here - it feels like something that could be abstracted for ConnectionUnix.

> Source/WebKit2/Platform/WorkQueue.h:82
> +    void registerSocketEventHandler(int, int, const Function<void()>&, const Function<void()>&);

I think you should name both these Function parameters; it's not clear what they do.
Comment 9 Carlos Garcia Campos 2013-04-12 09:13:12 PDT
(In reply to comment #8)
> (From update of attachment 193892 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=193892&action=review

Thanks for the review.

> > Source/WebKit2/Platform/CoreIPC/unix/ConnectionUnix.cpp:430
> >  #if PLATFORM(QT)
> >      m_socketNotifier = m_connectionQueue->registerSocketEventHandler(m_socketDescriptor, QSocketNotifier::Read, WTF::bind(&Connection::readyReadHandler, this));
> >  #elif PLATFORM(GTK)
> > -    m_connectionQueue->registerEventSourceHandler(m_socketDescriptor, (G_IO_HUP | G_IO_ERR), WTF::bind(&Connection::connectionDidClose, this));
> > -    m_connectionQueue->registerEventSourceHandler(m_socketDescriptor, G_IO_IN, WTF::bind(&Connection::readyReadHandler, this));
> > +    m_connectionQueue->registerSocketEventHandler(m_socketDescriptor, G_IO_IN, WTF::bind(&Connection::readyReadHandler, this), WTF::bind(&Connection::connectionDidClose, this));
> >  #elif PLATFORM(EFL)
> >      m_connectionQueue->registerSocketEventHandler(m_socketDescriptor, WTF::bind(&Connection::readyReadHandler, this));
> >  #endif
> 
> It's unfortunate that these #ifdefs are here - it feels like something that could be abstracted for ConnectionUnix.

Yeah, but unfortunately is not that easy, the Qt one returns a qt specific object. For GTK+ and EFL it should be easier to share a common interface.

> > Source/WebKit2/Platform/WorkQueue.h:82
> > +    void registerSocketEventHandler(int, int, const Function<void()>&, const Function<void()>&);
> 
> I think you should name both these Function parameters; it's not clear what they do.

Sure.
Comment 10 Carlos Garcia Campos 2013-04-12 10:23:17 PDT
Committed r148286: <http://trac.webkit.org/changeset/148286>