Bug 164220 - NetworkSession: Network process crash when converting main resource to download
Summary: NetworkSession: Network process crash when converting main resource to download
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: WebKit Local Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Carlos Garcia Campos
URL:
Keywords: Soup
: 164237 (view as bug list)
Depends on: 164631
Blocks: 164458
  Show dependency treegraph
 
Reported: 2016-10-31 03:49 PDT by Carlos Garcia Campos
Modified: 2016-11-18 11:42 PST (History)
5 users (show)

See Also:


Attachments
Patch (7.45 KB, patch)
2016-10-31 04:25 PDT, Carlos Garcia Campos
achristensen: review+
Details | Formatted Diff | Diff
Patch (9.55 KB, patch)
2016-11-01 03:20 PDT, Carlos Garcia Campos
achristensen: review+
Details | Formatted Diff | Diff
New patch (37.73 KB, patch)
2016-11-05 04:42 PDT, Carlos Garcia Campos
no flags Details | Formatted Diff | Diff
Patch (38.23 KB, patch)
2016-11-09 19:52 PST, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Carlos Garcia Campos 2016-10-31 03:49:38 PDT
Right after the main resource load is converted to a download, the web process deletes the ResourceLoader which sends the RemoveLoadIdentifier to the network process and the load is aborted. Sometimes it happens that NetworkResourceLoader::abort() is called while the NetworkLoad is still deciding the destination of the download. In such case, NetworkResourceLoader::didConvertToDownload() has already been called, but not NetworkResourceLoader::didBecomeDownload(). In NetworkResourceLoader::abort() we already handle the case of having a NetworkLoad after NetworkResourceLoader::didConvertToDownload() has been called, to avoid canceling the load in such case, however cleanup() is always called unconditionally and the NetworkLoad is deleted before NetworkResourceLoader::didBecomeDownload() is called. When the NetworkLoad is destroyed the NetworkDataTask client becomes nullptr, leaving it in a state where both the client is nullptr and the download hasn't been created yet. That's not expected to happen and when the response completion handler is called in the NetworkDataTask it tries to use either the client or the download and it crashes. In the GTK+ port we have unit test for this use case, but the find destination happens so fasta that RemoveLoadIdentifier message is always received after etworkResourceLoader::didBecomeDownload() is called.
Comment 1 Carlos Garcia Campos 2016-10-31 04:25:12 PDT
Created attachment 293415 [details]
Patch
Comment 2 Michael Catanzaro 2016-10-31 04:30:23 PDT
Comment on attachment 293415 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=293415&action=review

> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestDownloads.cpp:545
> +        if (test->m_shouldDelayDecideDestination)
> +            g_usleep(0.2 * G_USEC_PER_SEC);

There's no way to know when the load has been aborted? :/
Comment 3 Carlos Garcia Campos 2016-10-31 04:40:32 PDT
(In reply to comment #2)
> Comment on attachment 293415 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=293415&action=review
> 
> > Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestDownloads.cpp:545
> > +        if (test->m_shouldDelayDecideDestination)
> > +            g_usleep(0.2 * G_USEC_PER_SEC);
> 
> There's no way to know when the load has been aborted? :/

No, because it's an internal abort, the load is converted to a download, which means that a Download is created to continue with the load, and the ResourceLoader is destroyed. So, the load is not actually aborted, it's the NetworkResourceLoader. The problem is that we were deleting the NetworkResourceLoader before the network load finished to become a download. From the API point of view there isn't any abort, after we decide the load is a download, we use the WebKitDownload object and forget about the loader client.
Comment 4 Carlos Garcia Campos 2016-10-31 10:00:07 PDT
Committed r208154: <http://trac.webkit.org/changeset/208154>
Comment 5 Ryan Haddad 2016-10-31 14:34:12 PDT
Reverted r208154 for reason:

This change caused an assertion failure during API tests on macOS.

Committed r208174: <http://trac.webkit.org/changeset/208174>
Comment 6 Ryan Haddad 2016-10-31 14:34:54 PDT
*** Bug 164237 has been marked as a duplicate of this bug. ***
Comment 7 Carlos Garcia Campos 2016-11-01 03:20:03 PDT
Created attachment 293544 [details]
Patch

Updated patch to fix the assert. The abort is no longer cleaning up the resource loader when becoming a download.
Comment 8 Carlos Garcia Campos 2016-11-02 10:33:43 PDT
Alex, could you review this again please? this is causing network process to crash on every single download started by a web page in epiphany :-P
Comment 9 Alex Christensen 2016-11-02 11:27:38 PDT
http://trac.webkit.org/changeset/208293
Comment 11 Alex Christensen 2016-11-04 15:11:07 PDT
When we make a NetworkResourceLoader become a download, it needs to not be owned by the NetworkConnectionToWebProcess any more if we want it to stay alive when a NetworkConnectionToWebProcess is destroyed.  We need a new container for the NetworkResourceLoaders.
Comment 12 Carlos Garcia Campos 2016-11-05 04:42:45 PDT
Created attachment 293992 [details]
New patch

This is a new approach that ensures the NetworkResourceLoader is cleaned up and destroyed right after becoming a download, but the NetworkLoad keeps alive until the NetworkDataTask becomes a download, by using PendingDownload to adopt the NetworkLoad. I think this also makes the downloads life cycle easier to follow now, since we will always have a PendingDownload, and we no longer need didBecomeDownload() callbacks.
Comment 13 Alex Christensen 2016-11-07 14:12:51 PST
Comment on attachment 293992 [details]
New patch

View in context: https://bugs.webkit.org/attachment.cgi?id=293992&action=review

> Source/WebKit2/NetworkProcess/NetworkLoad.h:134
> -    NetworkLoadClient& m_client;
> +    NetworkLoadClient* m_client;

This should be a std::reference_wrapper, which is reassignable but never null.
Comment 14 Chris Dumez 2016-11-09 19:52:41 PST
Created attachment 294325 [details]
Patch
Comment 15 Chris Dumez 2016-11-09 19:53:55 PST
Committing this as I suspect this will fix the assertion being hit by the patch at Bug 164458.
Comment 16 WebKit Commit Bot 2016-11-09 20:52:14 PST
Comment on attachment 294325 [details]
Patch

Clearing flags on attachment: 294325

Committed r208521: <http://trac.webkit.org/changeset/208521>
Comment 17 WebKit Commit Bot 2016-11-09 20:52:19 PST
All reviewed patches have been landed.  Closing bug.
Comment 18 Carlos Garcia Campos 2016-11-10 00:29:46 PST
(In reply to comment #15)
> Committing this as I suspect this will fix the assertion being hit by the
> patch at Bug 164458.

Perfect, thank you!
Comment 19 Chris Dumez 2016-11-10 07:35:00 PST
Why was this reopened?
Comment 20 Carlos Garcia Campos 2016-11-10 07:43:51 PST
(In reply to comment #19)
> Why was this reopened?

History says it was me, but I really don't know how I did it, sorry.
Comment 21 Chris Dumez 2016-11-18 11:42:50 PST
(In reply to comment #20)
> (In reply to comment #19)
> > Why was this reopened?
> 
> History says it was me, but I really don't know how I did it, sorry.

From C++11 standard:
"""
Additionally, u can, upon request, transfer ownership to another unique pointer u2. Upon completion of such a transfer, the following postconditions hold:
— u2.p is equal to the pre-transfer u.p,
— u.p is equal to nullptr, and
"""

So my understanding is that the unique_ptr member is guaranteed to be null after you WTFMove() it out to another unique_ptr.

There's got to be something else going on here. Maybe the method is called on a non-main-thread and we're accessing m_networkLoad from 2 threads?