Bug 115880 - [GTK] Connection issues in repeated WebProcess crash/reloads.
Summary: [GTK] Connection issues in repeated WebProcess crash/reloads.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-09 15:35 PDT by Lauro Moura Maranhao Neto
Modified: 2013-05-28 10:11 PDT (History)
10 users (show)

See Also:


Attachments
Test case. (4.03 KB, patch)
2013-05-09 15:35 PDT, Lauro Moura Maranhao Neto
no flags Details | Formatted Diff | Diff
Tentative patch. (3.28 KB, patch)
2013-05-09 15:37 PDT, Lauro Moura Maranhao Neto
no flags Details | Formatted Diff | Diff
Updated patch including ChangeLog entry (4.26 KB, patch)
2013-05-28 03:38 PDT, Carlos Garcia Campos
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from webkit-ews-14 for mac-mountainlion-wk2 (556.17 KB, application/zip)
2013-05-28 07:27 PDT, Build Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Lauro Moura Maranhao Neto 2013-05-09 15:35:25 PDT
Created attachment 201291 [details]
Test case.

When stressing the WebProcess creation/destruction, WebKitGTK can often run into socket issues like bad file descriptor errors or polling a socket indefinitely.
    
Currently WebKitGTK's has three places where a socket can be closed.
    
- childFinishedFunction (in ProcessLauncherGtk.cpp)
- Connection::platformInvalidate (in ConnectionUnix.cpp)
- WorkQueue EventSource destruction (in WorkQueueGtk.cpp)

After leaving only the latter, the test case works fine although there are potential issues like when the WebProcess crashes early, and the event sources aren't registered properly.
Comment 1 Lauro Moura Maranhao Neto 2013-05-09 15:37:23 PDT
Created attachment 201292 [details]
Tentative patch.

Patch leaving the EventSource in charge of closing the socket.
Comment 2 Carlos Garcia Campos 2013-05-21 02:38:43 PDT
Comment on attachment 201292 [details]
Tentative patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=201292&action=review

I think this patch is correct. It needs a ChangeLog entry, though.

> Source/WebKit2/Platform/CoreIPC/unix/ConnectionUnix.cpp:136
> +// WorkQueueGtk.cpp EventSource is in charge of closing the socket.

I would explain it a bit more here, because it's not obvious. The socket event source is going to be removed below, which will cancel the source. when the source is cancelled, the GSocket is destroyed and the socket is closed.

> Source/WebKit2/UIProcess/Launcher/gtk/ProcessLauncherGtk.cpp:-115
> -    // Monitor the child process, it calls waitpid to prevent the child process from becomming a zombie,
> -    // and it allows us to close the socket when the child process crashes.
> -    g_child_watch_add(m_processIdentifier, childFinishedFunction, GINT_TO_POINTER(sockets[1]));
> -

Right, we added this when we used DGRAM sockets, ti shouldn't bee needed with STREAM sockets, since we are already notified when the other end closes the connection.
Comment 3 Lauro Moura Maranhao Neto 2013-05-21 13:22:14 PDT
(In reply to comment #2)
> (From update of attachment 201292 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=201292&action=review
> 
> I think this patch is correct. It needs a ChangeLog entry, though.
> 
> 
> I would explain it a bit more here, because it's not obvious. The socket event source is going to be removed below, which will cancel the source. when the source is cancelled, the GSocket is destroyed and the socket is closed.
> 

I'm going to update the patch with the comments and ChangeLog entries. Should I add the test (using WK2 API) to the patch? The current version (50 iterations...) takes a while to run on WebKitGTK+ (several seconds) and Nix (3s), though.

About the problem when the WebProcess crashes before the connection is setup, it'll setup the register handlers and trigger connectionDidClose when trying to read.
Comment 4 Carlos Garcia Campos 2013-05-21 23:46:55 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > (From update of attachment 201292 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=201292&action=review
> > 
> > I think this patch is correct. It needs a ChangeLog entry, though.
> > 
> > 
> > I would explain it a bit more here, because it's not obvious. The socket event source is going to be removed below, which will cancel the source. when the source is cancelled, the GSocket is destroyed and the socket is closed.
> > 
> 
> I'm going to update the patch with the comments and ChangeLog entries.

Thanks!

> Should I add the test (using WK2 API) to the patch? The current version (50 iterations...) takes a while to run on WebKitGTK+ (several seconds) and Nix (3s), though.

Yes, I think it's too much for a unit test, and the problem is very specific to the GTK+ implementation. There are already tests to check the process respawn so I don't think it's needed. 

> About the problem when the WebProcess crashes before the connection is setup, it'll setup the register handlers and trigger connectionDidClose when trying to read.

Exactly.
Comment 5 Carlos Garcia Campos 2013-05-28 03:38:10 PDT
Created attachment 203037 [details]
Updated patch including ChangeLog entry

Updated patch to add a ChangeLog entry so that it can be marked for review, since I would like to include this patch in the next release planned for this week.
Comment 6 Build Bot 2013-05-28 07:27:42 PDT
Comment on attachment 203037 [details]
Updated patch including ChangeLog entry

Attachment 203037 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/666089

New failing tests:
editing/pasteboard/4641033.html
Comment 7 Build Bot 2013-05-28 07:27:43 PDT
Created attachment 203050 [details]
Archive of layout-test-results from webkit-ews-14 for mac-mountainlion-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-14  Port: mac-mountainlion-wk2  Platform: Mac OS X 10.8.3
Comment 8 Martin Robinson 2013-05-28 09:26:21 PDT
Comment on attachment 203037 [details]
Updated patch including ChangeLog entry

View in context: https://bugs.webkit.org/attachment.cgi?id=203037&action=review

Looks good to me. This just need an owner to sign off on it.

> Source/WebKit2/Platform/CoreIPC/unix/ConnectionUnix.cpp:140
> +    // In GTK+ platform the socket is closed by the work queue.
> +#if !PLATFORM(GTK)
>      if (m_socketDescriptor != -1)
>          while (close(m_socketDescriptor) == -1 && errno == EINTR) { }
> +#endif

Ideally everyone would close the socket in the same way. I think we should put a TODO here for other ports and open bugs.
Comment 9 Anders Carlsson 2013-05-28 09:42:13 PDT
Comment on attachment 203037 [details]
Updated patch including ChangeLog entry

View in context: https://bugs.webkit.org/attachment.cgi?id=203037&action=review

>> Source/WebKit2/Platform/CoreIPC/unix/ConnectionUnix.cpp:140
>> +#endif
> 
> Ideally everyone would close the socket in the same way. I think we should put a TODO here for other ports and open bugs.

I agree, having the work queue take ownership of the file descriptor is necessary to avoid race conditions.
Comment 10 Carlos Garcia Campos 2013-05-28 10:00:22 PDT
(In reply to comment #9)
> (From update of attachment 203037 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=203037&action=review
> 
> >> Source/WebKit2/Platform/CoreIPC/unix/ConnectionUnix.cpp:140
> >> +#endif
> > 
> > Ideally everyone would close the socket in the same way. I think we should put a TODO here for other ports and open bugs.
> 
> I agree, having the work queue take ownership of the file descriptor is necessary to avoid race conditions.

Thank you both for the review, opened bugs for EFL and Qt:

https://bugs.webkit.org/show_bug.cgi?id=116872
https://bugs.webkit.org/show_bug.cgi?id=116873
Comment 11 WebKit Commit Bot 2013-05-28 10:11:53 PDT
Comment on attachment 203037 [details]
Updated patch including ChangeLog entry

Clearing flags on attachment: 203037

Committed r150808: <http://trac.webkit.org/changeset/150808>
Comment 12 WebKit Commit Bot 2013-05-28 10:11:56 PDT
All reviewed patches have been landed.  Closing bug.