Bug 144262 - [GTK] Crash in WebProcess when loading large content with custom URI schemes
Summary: [GTK] Crash in WebProcess when loading large content with custom URI schemes
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mario Sanchez Prada
URL:
Keywords: Gtk
Depends on:
Blocks:
 
Reported: 2015-04-27 10:02 PDT by Mario Sanchez Prada
Modified: 2016-03-01 16:11 PST (History)
8 users (show)

See Also:


Attachments
Test case (C example code) (80.68 KB, text/html)
2015-04-27 10:02 PDT, Mario Sanchez Prada
no flags Details
Test case (Large HTML content) (80.68 KB, text/html)
2015-04-27 10:03 PDT, Mario Sanchez Prada
no flags Details
Test case (C example code) (1.08 KB, text/x-csrc)
2015-04-27 10:05 PDT, Mario Sanchez Prada
no flags Details
Patch proposal (3.38 KB, patch)
2015-04-27 11:10 PDT, Mario Sanchez Prada
no flags Details | Formatted Diff | Diff
Patch proposal (unit test included) (133.26 KB, patch)
2015-04-28 10:25 PDT, Mario Sanchez Prada
no flags Details | Formatted Diff | Diff
Patch proposal (unit test included) (17.31 KB, patch)
2015-05-12 04:08 PDT, Mario Sanchez Prada
cgarcia: review-
Details | Formatted Diff | Diff
Patch proposal (Unit test included) v2 (22.99 KB, patch)
2015-12-09 04:30 PST, Mario Sanchez Prada
cgarcia: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mario Sanchez Prada 2015-04-27 10:02:19 PDT
Created attachment 251750 [details]
Test case (C example code)

In the past weeks, we've been seeing crashes in one WebKitGTK+ based app when the user navigates web content "too quickly", loaded via custom URI schemes (this bit is important), by clicking very fast in links rendered inside large HTML documents, which causes the current load to be cancelled and start a new load, inside the same web view.

This is not a "happens all the time" thing, but depending on the load, the size of the original document containing the links, how fast the user is clicking and other factors alike, it's certainly likely to hit this crash over and over again, so it would be nice to have it fixed upstream, where I could reproduce it without any trouble using a debug build (although we initially detected using webkitgtk 2.4.6 and 2.6.2).

Last, this is possibly related to these two bugs recently reported in RedHat's bugzilla:
  * https://bugzilla.redhat.com/show_bug.cgi?id=1196677
  * https://bugzilla.redhat.com/show_bug.cgi?id=1209130

I'm attaching a minimal example in C code that causes the web process to crash when trying to load a 'crashy.html' file (to be attached later) twice in the same web view, causing the following crash:

$ ./webkitcrash
ASSERTION FAILED: task.get()
../../Source/WebKit2/Shared/Network/CustomProtocols/soup/CustomProtocolManagerImpl.cpp(120) : void WebKit::CustomProtocolManagerImpl::didFailWithError(uint64_t, const WebCore::ResourceError&)
1   0x7f41d220d0e9 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libjavascriptcoregtk-4.0.so.18(WTFCrash+0x1e) [0x7f41d220d0e9]
2   0x7f41d76da965 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(_ZN6WebKit25CustomProtocolManagerImpl16didFailWithErrorEmRKN7WebCore13ResourceErrorE+0xa7) [0x7f41d76da965]
3   0x7f41d76de1b0 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(_ZN6WebKit21CustomProtocolManager16didFailWithErrorEmRKN7WebCore13ResourceErrorE+0x3a) [0x7f41d76de1b0]
4   0x7f41d77d9fb1 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(_ZN3IPC22callMemberFunctionImplIN6WebKit21CustomProtocolManagerEMS2_FvmRKN7WebCore13ResourceErrorEESt5tupleIImS4_EEILm0ELm1EEEEvPT_T0_OT1_St14index_sequenceIIXspT2_EEE+0x9c) [0x7f41d77d9fb1]
5   0x7f41d77d9c5e /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(_ZN3IPC18callMemberFunctionIN6WebKit21CustomProtocolManagerEMS2_FvmRKN7WebCore13ResourceErrorEESt5tupleIImS4_EESt19make_index_sequenceILm2EEEEvOT1_PT_T0_+0x41) [0x7f41d77d9c5e]
6   0x7f41d77d9625 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(_ZN3IPC13handleMessageIN8Messages21CustomProtocolManager16DidFailWithErrorEN6WebKit21CustomProtocolManagerEMS5_FvmRKN7WebCore13ResourceErrorEEEEvRNS_14MessageDecoderEPT0_T1_+0xa3) [0x7f41d77d9625]
7   0x7f41d77d91e4 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(_ZN6WebKit21CustomProtocolManager17didReceiveMessageERN3IPC10ConnectionERNS1_14MessageDecoderE+0xb2) [0x7f41d77d91e4]
8   0x7f41d734eb09 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(_ZN3IPC10Connection39dispatchWorkQueueMessageReceiverMessageERNS0_24WorkQueueMessageReceiverERNS_14MessageDecoderE+0x47) [0x7f41d734eb09]
9   0x7f41d7350552 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(+0x41d8552) [0x7f41d7350552]
10  0x7f41d7352ce3 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(+0x41dace3) [0x7f41d7352ce3]
11  0x7f41d733ac50 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(_ZNKSt8functionIFvvEEclEv+0x32) [0x7f41d733ac50]
12  0x7f41d2256029 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libjavascriptcoregtk-4.0.so.18(_ZN3WTF15GMainLoopSource12voidCallbackEv+0x6d) [0x7f41d2256029]
13  0x7f41d225672d /home/mario/work/WebKit/WebKitBuild/Debug/lib/libjavascriptcoregtk-4.0.so.18(_ZN3WTF15GMainLoopSource18voidSourceCallbackEPS0_+0x23) [0x7f41d225672d]
14  0x7f41cdbc658d /home/mario/work/WebKit/WebKitBuild/DependenciesGTK/Root/lib64/libglib-2.0.so.0(g_main_context_dispatch+0x13d) [0x7f41cdbc658d]
15  0x7f41cdbc6928 /home/mario/work/WebKit/WebKitBuild/DependenciesGTK/Root/lib64/libglib-2.0.so.0(+0x48928) [0x7f41cdbc6928]
16  0x7f41cdbc6c42 /home/mario/work/WebKit/WebKitBuild/DependenciesGTK/Root/lib64/libglib-2.0.so.0(g_main_loop_run+0xc2) [0x7f41cdbc6c42]
17  0x7f41d9289cfc /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(+0x6111cfc) [0x7f41d9289cfc]
18  0x7f41d928a94b /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(+0x611294b) [0x7f41d928a94b]
19  0x7f41d733ac50 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(_ZNKSt8functionIFvvEEclEv+0x32) [0x7f41d733ac50]
20  0x7f41d2220176 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libjavascriptcoregtk-4.0.so.18(+0x14b5176) [0x7f41d2220176]
21  0x7f41d224c8a7 /home/mario/work/WebKit/WebKitBuild/Debug/lib/libjavascriptcoregtk-4.0.so.18(+0x14e18a7) [0x7f41d224c8a7]
22  0x7f41d054052a /lib64/libpthread.so.0(+0x752a) [0x7f41d054052a]
23  0x7f41c809a22d /lib64/libc.so.6(clone+0x6d) [0x7f41c809a22d]


Now see below the backtrace as taken from gdb, while connecting to the crashing process:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffcd880700 (LWP 31538)]
0x00007fffed1ab0ee in WTFCrash () at ../../Source/WTF/wtf/Assertions.cpp:321
321	    *(int *)(uintptr_t)0xbbadbeef = 0;
(gdb) bt
#0  0x00007fffed1ab0ee in WTFCrash () at ../../Source/WTF/wtf/Assertions.cpp:321
#1  0x00007ffff2678965 in WebKit::CustomProtocolManagerImpl::didFailWithError (this=0x682fc0, customProtocolID=1, error=...) at ../../Source/WebKit2/Shared/Network/CustomProtocols/soup/CustomProtocolManagerImpl.cpp:120
#2  0x00007ffff267c1b0 in WebKit::CustomProtocolManager::didFailWithError (this=0x7fffc5ff3000, customProtocolID=1, error=...) at ../../Source/WebKit2/Shared/Network/CustomProtocols/soup/CustomProtocolManagerSoup.cpp:91
#3  0x00007ffff2777fb1 in IPC::callMemberFunctionImpl<WebKit::CustomProtocolManager, void (WebKit::CustomProtocolManager::*)(unsigned long, WebCore::ResourceError const&), std::tuple<unsigned long, WebCore::ResourceError>, 0ul, 1ul>(WebKit::CustomProtocolManager*, void (WebKit::CustomProtocolManager::*)(unsigned long, WebCore::ResourceError const&), std::tuple<unsigned long, WebCore::ResourceError>&&, std::index_sequence<0ul, 1ul>) (object=0x7fffc5ff3000, function=
    (void (WebKit::CustomProtocolManager::*)(WebKit::CustomProtocolManager * const, unsigned long, const WebCore::ResourceError &)) 0x7ffff267c176 <WebKit::CustomProtocolManager::didFailWithError(unsigned long, WebCore::ResourceError const&)>, args=<unknown type in /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37, CU 0xbfa7cb1, DIE 0xbfd4e45>) at ../../Source/WebKit2/Platform/IPC/HandleMessage.h:16
#4  0x00007ffff2777c5e in IPC::callMemberFunction<WebKit::CustomProtocolManager, void (WebKit::CustomProtocolManager::*)(unsigned long, WebCore::ResourceError const&), std::tuple<unsigned long, WebCore::ResourceError>, std::make_index_sequence<2ul> >(std::tuple<unsigned long, WebCore::ResourceError>&&, WebKit::CustomProtocolManager*, void (WebKit::CustomProtocolManager::*)(unsigned long, WebCore::ResourceError const&)) (
    args=<unknown type in /home/mario/work/WebKit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37, CU 0xbfa7cb1, DIE 0xbfd4082>, object=0x7fffc5ff3000, function=
    (void (WebKit::CustomProtocolManager::*)(WebKit::CustomProtocolManager * const, unsigned long, const WebCore::ResourceError &)) 0x7ffff267c176 <WebKit::CustomProtocolManager::didFailWithError(unsigned long, WebCore::ResourceError const&)>) at ../../Source/WebKit2/Platform/IPC/HandleMessage.h:22
#5  0x00007ffff2777625 in IPC::handleMessage<Messages::CustomProtocolManager::DidFailWithError, WebKit::CustomProtocolManager, void (WebKit::CustomProtocolManager::*)(unsigned long, WebCore::ResourceError const&)> (decoder=..., 
    object=0x7fffc5ff3000, function=
    (void (WebKit::CustomProtocolManager::*)(WebKit::CustomProtocolManager * const, unsigned long, const WebCore::ResourceError &)) 0x7ffff267c176 <WebKit::CustomProtocolManager::didFailWithError(unsigned long, WebCore::ResourceError const&)>) at ../../Source/WebKit2/Platform/IPC/HandleMessage.h:92
#6  0x00007ffff27771e4 in WebKit::CustomProtocolManager::didReceiveMessage (this=0x7fffc5ff3000, connection=..., decoder=...) at DerivedSources/WebKit2/CustomProtocolManagerMessageReceiver.cpp:44
#7  0x00007ffff22ecb09 in IPC::Connection::dispatchWorkQueueMessageReceiverMessage (this=0x7fffc57fb000, workQueueMessageReceiver=..., decoder=...) at ../../Source/WebKit2/Platform/IPC/Connection.cpp:306
#8  0x00007ffff22ee552 in IPC::Connection::<lambda()>::operator()(void) const (__closure=0x7fff68001f10) at ../../Source/WebKit2/Platform/IPC/Connection.cpp:678
#9  0x00007ffff22f0ce3 in std::_Function_handler<void(), IPC::Connection::processIncomingMessage(std::unique_ptr<IPC::MessageDecoder>)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
    at /usr/include/c++/4.9.2/functional:2039
#10 0x00007ffff22d8c50 in std::function<void ()>::operator()() const (this=0x7fffcd87f908) at /usr/include/c++/4.9.2/functional:2439
#11 0x00007fffed1f4029 in WTF::GMainLoopSource::voidCallback (this=0x7fffc5fecdc0) at ../../Source/WTF/wtf/gobject/GMainLoopSource.cpp:365
#12 0x00007fffed1f472d in WTF::GMainLoopSource::voidSourceCallback (source=0x7fffc5fecdc0) at ../../Source/WTF/wtf/gobject/GMainLoopSource.cpp:456
#13 0x00007fffe8b6458d in g_main_dispatch (context=0x716470) at gmain.c:3064
#14 g_main_context_dispatch (context=context@entry=0x716470) at gmain.c:3663
#15 0x00007fffe8b64928 in g_main_context_iterate (context=0x716470, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3734
#16 0x00007fffe8b64c42 in g_main_loop_run (loop=0x5cacf0) at gmain.c:3928
#17 0x00007ffff4227cfc in WTF::WorkQueue::<lambda()>::operator()(void) const (__closure=0x8e93e0) at ../../Source/WTF/wtf/gtk/WorkQueueGtk.cpp:59
#18 0x00007ffff422894b in std::_Function_handler<void(), WTF::WorkQueue::platformInitialize(char const*, WTF::WorkQueue::Type, WTF::WorkQueue::QOS)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
    at /usr/include/c++/4.9.2/functional:2039
#19 0x00007ffff22d8c50 in std::function<void ()>::operator()() const (this=0x7fffcd87fb50) at /usr/include/c++/4.9.2/functional:2439
#20 0x00007fffed1be176 in WTF::threadEntryPoint (contextData=0x7fffc5ff50a0) at ../../Source/WTF/wtf/Threading.cpp:58
#21 0x00007fffed1ea8a7 in WTF::wtfThreadEntryPoint (param=0x7fffc5ffb050) at ../../Source/WTF/wtf/ThreadingPthreads.cpp:170
#22 0x00007fffeb4de52a in start_thread (arg=0x7fffcd880700) at pthread_create.c:310
#23 0x00007fffe303822d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Comment 1 Mario Sanchez Prada 2015-04-27 10:03:31 PDT
Created attachment 251752 [details]
Test case (Large HTML content)

Now attaching the aforementioned HTML file that the C example would try to load
Comment 2 Mario Sanchez Prada 2015-04-27 10:05:18 PDT
Created attachment 251753 [details]
Test case (C example code)

(In reply to comment #0)
> [...] 
> I'm attaching a minimal example in C code that causes the web process to
> crash when trying to load a 'crashy.html' file (to be attached later) twice
> in the same web view, causing the following crash:

Of course, I could not create the report without making any mistake, so I uploaded the HTML file instead of the C example :/

Now attaching the right one, sorry for the noise.
Comment 3 Mario Sanchez Prada 2015-04-27 10:35:18 PDT
To keep up with my longest bug report ever, here comes a summary of my investigation so far:

Basically, it seems to me that the problem is that the implementation of CustomProtocolManagerImpl::didFailWithError() in the Web process is assuming that the GTask has not been released yet when invoked, which seems like is not always the case (either because of a bigger design issue, or a failing implementation of this method).

To understand the situation better, I spent some time trying to understand how events seem to develop in this scenario, and this is the summary:

First of all, when the UI process sends the DidLoad message to the CustomProtocolmanager in the Web process for the first time, data->stream is NULL in CustomProtocolManagerImpl::didLoadData() when called, which is perfectly normal at that point. Then, precisely because it's the first time didLoad() gets called, data->stream gets filled for the first time too with data coming from the UI process, so that the loader can start processing it, and the GTask inside the WebSoupRequestAsyncData object gets released, since it's no longer needed (it's purpose seems to be to setup this data->stream object for the first time).

But now, because async reads from the source input stream happen very fast in the UI process, at least a second DidLoad message got probably sent to the Web process, even if the first load gets cancelled very quickly because of the second call to webkit_web_view_load_uri().

When this occurs, the second time CustomProtocolManagerImpl::didLoad() gets executed, the statement in "if (data->requestFailed())" will be true (due to the cancelled loading operation), meaning that a StopLoading message will be sent back to the UI process, to notify it that it should stop sending data.

But when this StopLoading message arrives in the UI process, the GCancellable used in WebKitURISchemeRequest to initiate the asynchronous load of data (via g_input_stream_read_async()) gets cancelled as well, meaning that further calls to the webkitURISchemeRequestReadCallback() callback will get a GError when calling g_input_stream_read_finish(), which will end up getting webkit_uri_scheme_request_finish_error() executed.

And what will webkit_uri_scheme_request_finish_error() do now? Send a DidFailWithError message to the Web process, which will be handled in CustomProtocolManagerImpl::didFailWithError().

Now, effectively, this method is being executed **after** the GTask object were released for a certain value customProtocolID, back when data->stream was set up for the first time. And because the implementation is something like this: 

  void CustomProtocolManagerImpl::didFailWithError(uint64_t customProtocolID, const WebCore::ResourceError& error)
  {
      WebSoupRequestAsyncData* data = m_customProtocolMap.get(customProtocolID);
      ASSERT(data);

      GRefPtr<GTask> task = data->releaseTask();
      ASSERT(task.get());
      g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()),
                              error.errorCode(), "%s", error.localizedDescription().utf8().data());

      m_customProtocolMap.remove(customProtocolID);
  }

... we get the crash, since data->releaseTask() has been called earlier and therefore there's nothing to complete at this point.

So, it looks to me that either the whole logic of how didLoad() and didFailWithError() needs "some adjustment", or the didFailError() needs to be fixed so that it handles this situation gracefully, while still being correct.

Comments?
Comment 4 Mario Sanchez Prada 2015-04-27 11:10:10 PDT
Created attachment 251760 [details]
Patch proposal

Attaching a tentative fix that gets rid of the cash by handling the two possible situations with data->stream (NULL or not NULL) in didFailWithError.

Not actually asking for review yet, since I would need to add an unit test for it (having some trouble with my environment now), but good enough to help with discussion, I suppose.
Comment 5 WebKit Commit Bot 2015-04-27 12:14:59 PDT
Attachment 251760 [details] did not pass style-queue:


ERROR: Source/WebKit2/Shared/Network/CustomProtocols/soup/CustomProtocolManagerImpl.cpp:127:  When wrapping a line, only indent 4 spaces.  [whitespace/indent] [3]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 6 Michael Catanzaro 2015-04-28 09:30:37 PDT
Instead of this:

g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()),
                        error.errorCode(), "%s", error.localizedDescription().utf8().data());

Write this:

g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()),
    error.errorCode(), "%s", error.localizedDescription().utf8().data());
Comment 7 Mario Sanchez Prada 2015-04-28 10:25:04 PDT
Created attachment 251857 [details]
Patch proposal (unit test included)

I'm having quite some trouble to run the style checker today, so I'm uploading the patch that I have now, hoping that the style bot will turn green.

Let's see...
Comment 8 Carlos Garcia Campos 2015-04-29 05:00:05 PDT
Comment on attachment 251857 [details]
Patch proposal (unit test included)

View in context: https://bugs.webkit.org/attachment.cgi?id=251857&action=review

> Source/WebKit2/ChangeLog:9
> +        Consider already initiated loads of content coming out of a custom
> +        URI scheme handler when a failure has been reported from the UI process.

This is not the only case failing, it can also happen that the didFailWithError message is received before the first data chunk is read, and then we would crash in didReceiveResponse() because data is null (was removed from the map in didFailWithError(). We need an early return there when data is NULL, similar to the on in didLoadData.

> Source/WebKit2/Shared/Network/CustomProtocols/soup/CustomProtocolManagerImpl.cpp:123
>      GRefPtr<GTask> task = data->releaseTask();
> -    ASSERT(task.get());
> -    g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()),
> -        error.errorCode(), "%s", error.localizedDescription().utf8().data());
> +    if (data->stream) {
> +        // There won't be any task to complete if the loader has already started
> +        // to read data from the input stream's data sent from the UI process.
> +        ASSERT(!task);

I think we shouldn't even call data->releaseTask() when we have a stream. I would use an early return instead, something like:

if (!data->stream) {
    // Failed while reading the stream, the task was already completed by didLoadData().
    return;
}

GRefPtr<GTask> task = data->releaseTask();
ASSERT(task.get());
g_task_return_new_error....

> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:154
> +    static gboolean webProcessCrashedCallback(WebKitWebView*, gpointer userData)
> +    {
> +        URISchemeTest* test = static_cast<URISchemeTest*>(userData);
> +        test->m_webProcessCrashed = true;
> +        test->quitMainLoop();
> +        return FALSE;
> +    }

We already have a way to detect unexpected web process crashes.

> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:172
> +            // The slower read access to data is, the easier it gets for the crash to,
> +            // happen, so we use a GFileInputStream instead of a GMemoryInputStream here.

This is going to be flaky

> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:213
> +    URISchemeTest()
> +    {
> +        g_signal_connect(m_webView, "web-process-crashed", G_CALLBACK(webProcessCrashedCallback), this);
> +    }
> +
> +    ~URISchemeTest()
> +    {
> +        g_signal_handlers_disconnect_matched(m_webView, G_SIGNAL_MATCH_DATA, 0, 0, 0, 0, this);
> +    }

We don'ty need this, WebViewTest does this automatically.

> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:279
> +    test->wait(1);

This is going to be super flaky :-)

There are actually at least two cases more we want to test:

 - didFailWithError before didReceiveResponse: This is easy. Just call webkit_uri_scheme_request_finish_error() right after webkit_uri_scheme_request_finish(), and wait until load finishes, it will crash.
 - didFailWithError after the first didLoadData: This is more tricky, because we need to ensure, there's at least one chunk read, and we don't finish reading the stream. Maybe we could use a pipe, and create a GUnixInputStream. Then we write something in the pipe, and then call webkit_uri_scheme_request_finish_error(). 

I would use better names than crashy, something like error:before-response and error:after-first-chunk. Since it's a load tracking test, it would also be interesting to wait until load finishes (instead of one second) and check that the loader client callbacks were correctly called.

> Tools/TestWebKitAPI/gtk/WebKit2Gtk/LoadTrackingTest.cpp:30
> +    // We need to reset the value of m_loadFailed or any use of LoadTrackingTest's
> +    // API after a load had failed will be wrongly considered to have failed as well.
> +    if (loadEvent != WEBKIT_LOAD_FINISHED)
> +        test->m_loadFailed = false;

I guess this assumes we always use waitUntilLoadFinishes

> Tools/TestWebKitAPI/gtk/WebKit2Gtk/LoadTrackingTest.cpp:187
> +    if (!m_loadFailed) {
> +        // This assertion only makes sense if the current load has not failed,
> +        // otherwise we might be comparing values relative to separate loads.
> +        g_assert_cmpfloat(m_estimatedProgress, <, progress);
> +    }

Ditto.
Comment 9 Mario Sanchez Prada 2015-04-29 06:18:34 PDT
Comment on attachment 251857 [details]
Patch proposal (unit test included)

View in context: https://bugs.webkit.org/attachment.cgi?id=251857&action=review

Thanks a lot for the review, Carlos. I will address those issues soon enough, but in the meanwhile see my answers to your comments below...

>> Source/WebKit2/ChangeLog:9
>> +        URI scheme handler when a failure has been reported from the UI process.
> 
> This is not the only case failing, it can also happen that the didFailWithError message is received before the first data chunk is read, and then we would crash in didReceiveResponse() because data is null (was removed from the map in didFailWithError(). We need an early return there when data is NULL, similar to the on in didLoadData.

I see. I reported it this way because this is what I've seen myself, but will change the description in a follow-up patch

>> Source/WebKit2/Shared/Network/CustomProtocols/soup/CustomProtocolManagerImpl.cpp:123
>> +        ASSERT(!task);
> 
> I think we shouldn't even call data->releaseTask() when we have a stream. I would use an early return instead, something like:
> 
> if (!data->stream) {
>     // Failed while reading the stream, the task was already completed by didLoadData().
>     return;
> }
> 
> GRefPtr<GTask> task = data->releaseTask();
> ASSERT(task.get());
> g_task_return_new_error....

That was my initial version, but I was unsure about doing it and in the end preferred to do it always and assert task or !task depending on the case.

Anyway, I will change it in the next patch

>> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:154
>> +    }
> 
> We already have a way to detect unexpected web process crashes.

I see, sorry about that. Will remove it.

>> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:172
>> +            // happen, so we use a GFileInputStream instead of a GMemoryInputStream here.
> 
> This is going to be flaky

Well, yeah... thing is that it does not happen always but doing this in my new laptop (with an SSD) I get the crash 8/10 times. If I use a GMemoryInputStream, I can't get it a single time, so I thought it could be acceptable.

>> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:213
>> +    }
> 
> We don'ty need this, WebViewTest does this automatically.

Ok

>> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:279
>> +    test->wait(1);
> 
> This is going to be super flaky :-)
> 
> There are actually at least two cases more we want to test:
> 
>  - didFailWithError before didReceiveResponse: This is easy. Just call webkit_uri_scheme_request_finish_error() right after webkit_uri_scheme_request_finish(), and wait until load finishes, it will crash.
>  - didFailWithError after the first didLoadData: This is more tricky, because we need to ensure, there's at least one chunk read, and we don't finish reading the stream. Maybe we could use a pipe, and create a GUnixInputStream. Then we write something in the pipe, and then call webkit_uri_scheme_request_finish_error(). 
> 
> I would use better names than crashy, something like error:before-response and error:after-first-chunk. Since it's a load tracking test, it would also be interesting to wait until load finishes (instead of one second) and check that the loader client callbacks were correctly called.

Same thing here :). In my case, I reproduce it 8/10 times consistently with this 1sec delay, which I chose because I saw it as the max value used in other tests. I'd rather do something like waitUntilFinishLoading, but can't do it because I risk canceling the wrong thing... but perhaps I was doing it wrong, based on your comment above.

Anyway, about those two cases:
  - didFailWithError before didReceiveResponse: Ok, will follow your suggestion.
  - didFailWithError after the first didLoadData: this is the one that I'm trying to test I'm checking here and which, as I said, I manage to reproduce 8/10 times with the current patch. I'll give a try to your suggestion, though

About the names, sure I can pick better ones, no problem :)

>> Tools/TestWebKitAPI/gtk/WebKit2Gtk/LoadTrackingTest.cpp:30
>> +        test->m_loadFailed = false;
> 
> I guess this assumes we always use waitUntilLoadFinishes

Not sure I understand your comment. Mind to ellaborate?

>> Tools/TestWebKitAPI/gtk/WebKit2Gtk/LoadTrackingTest.cpp:187
>> +    }
> 
> Ditto.

Same thing :)
Comment 10 Mario Sanchez Prada 2015-04-29 07:24:53 PDT
(In reply to comment #9)
> [...]
> > I think we shouldn't even call data->releaseTask() when we have a stream. I would use an early return instead, something like:
> > 
> > if (!data->stream) {
> >     // Failed while reading the stream, the task was already completed by didLoadData().
> >     return;
> > }
> > 
> > GRefPtr<GTask> task = data->releaseTask();
> > ASSERT(task.get());
> > g_task_return_new_error....
> 
> That was my initial version, but I was unsure about doing it and in the end
> preferred to do it always and assert task or !task depending on the case.
> 
> Anyway, I will change it in the next patch

Just a quick heads-up after a conversation on IRC with Carlos: we can't really early return because we still need to remove the data from the m_customProtocolMap hash table, and also, perhaps ASSERT() task and !task depending on the case would not be a too bad thing either... so we kind of settled on the following code for didFailError():

    void CustomProtocolManagerImpl::didFailWithError(uint64_t customProtocolID, const WebCore::ResourceError& error)
    {
        WebSoupRequestAsyncData* data = m_customProtocolMap.get(customProtocolID);
        ASSERT(data);
    
        // Either we haven't started reading the stream yet, in which case we need to complete the
        // task first, or we failed reading it and the task was already completed by didLoadData().
        ASSERT(!data->stream || !data->task);
    
        if (!data->stream) {
            GRefPtr<GTask> task = data->releaseTask();
            ASSERT(task.get());
            g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()),
                error.errorCode(), "%s", error.localizedDescription().utf8().data());
        }
    
        m_customProtocolMap.remove(customProtocolID);
    }

Still, I need to write the unit test in a proper way, but I thought I would share this before, because previous comments could be confusing.
Comment 11 Mario Sanchez Prada 2015-05-12 04:08:04 PDT
Created attachment 252954 [details]
Patch proposal (unit test included)

New patch implementing a slightly different approach, based on the experience writing the unit tests, because the original proposition posed problems when testing other scenarios than the originally described one (see Carlos previous comment). This one also includes more simplified and flakyness-free unit tests while not causing any regression. Please review, thanks
Comment 12 WebKit Commit Bot 2015-05-12 04:10:57 PDT
Thanks for the patch. If this patch contains new public API please make sure it follows the guidelines for new WebKit2 GTK+ API. See http://trac.webkit.org/wiki/WebKitGTK/AddingNewWebKit2API
Comment 13 Carlos Garcia Campos 2015-05-15 01:41:28 PDT
Comment on attachment 252954 [details]
Patch proposal (unit test included)

View in context: https://bugs.webkit.org/attachment.cgi?id=252954&action=review

I'm sorry but this breaks the model. ResourceHandle makes the soup request, and when it finishes it calls didReceiveResponse and starts to read the given stream calling didReceiveBuffer for very chunk read. When it finishes reading the stream, didFinishLoad is called. In case of error at any point didFail is called. What you are doing now is that the request doesn't finish until we have filled the input stream, and then our custom WebKitSoupRequestInputStream is useless, we could use a GMemoryInoutStream instead. The idea was to wait until the first chunk to create the input stream efficiently, using GMemoryInputStream when there's a single chunk or WebKitSoupRequestInputStream to be able to start reading the stream without having to wait for the stream to be filled.

> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:303
> +    g_assert(test->m_loadEvents.contains(LoadTrackingTest::ProvisionalLoadFailed));
> +    g_assert(!test->m_loadEvents.contains(LoadTrackingTest::LoadFailed));

And this proves it, I would expect this to be a LoadFailed, not a ProvisionalLoadFailed, since after the first chunk the load is committed. But our ResourceHandle thinks we are still waiting for the response because the soup request hasn't finished.
Comment 14 Mario Sanchez Prada 2015-05-15 04:30:12 PDT
(In reply to comment #13)
> Comment on attachment 252954 [details]
> Patch proposal (unit test included)
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=252954&action=review
> 
> I'm sorry but this breaks the model. ResourceHandle makes the soup request,
> and when it finishes it calls didReceiveResponse and starts to read the
> given stream calling didReceiveBuffer for very chunk read. When it finishes
> reading the stream, didFinishLoad is called. In case of error at any point
> didFail is called. What you are doing now is that the request doesn't finish
> until we have filled the input stream, and then our custom
> WebKitSoupRequestInputStream is useless, we could use a GMemoryInoutStream
> instead. The idea was to wait until the first chunk to create the input
> stream efficiently, using GMemoryInputStream when there's a single chunk or
> WebKitSoupRequestInputStream to be able to start reading the stream without
> having to wait for the stream to be filled.

Everything makes more sense now, thanks for the explanation.

> > Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:303
> > +    g_assert(test->m_loadEvents.contains(LoadTrackingTest::ProvisionalLoadFailed));
> > +    g_assert(!test->m_loadEvents.contains(LoadTrackingTest::LoadFailed));
> 
> And this proves it, I would expect this to be a LoadFailed, not a
> ProvisionalLoadFailed, since after the first chunk the load is committed.
> But our ResourceHandle thinks we are still waiting for the response because
> the soup request hasn't finished.

I think you're right, even though I feel of course a bit frustrated about this patch being not useful in the end. But worry not, I will try to create a new version of this patch taking into consideration your comments plus all I learned so far while doing the previous patch.

Can't commit to a deadline though, but I'll try not to let this fall into oblivion easily. Feel free to ping me if I fail on that mission, though.

Thanks for the review!
Comment 15 Mario Sanchez Prada 2015-12-09 04:30:41 PST
Created attachment 266992 [details]
Patch proposal (Unit test included) v2

Attached a new patch with all the work done during the hackfest. This should fix the crash, cover all the edge cases and not change the model in the way :)

Please review, thanks!
Comment 16 Mario Sanchez Prada 2015-12-09 06:53:17 PST
Committed r193830: <http://trac.webkit.org/changeset/193830>