Summary: | [GTK] Crash in WebProcess when loading large content with custom URI schemes | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Mario Sanchez Prada <mario> | ||||||||||||||||
Component: | WebKitGTK | Assignee: | Mario Sanchez Prada <mario> | ||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||
Severity: | Normal | CC: | berto, cgarcia, commit-queue, danw, gustavo, mcatanzaro, mrobinson, svillar | ||||||||||||||||
Priority: | P2 | Keywords: | Gtk | ||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||
See Also: |
https://bugzilla.redhat.com/show_bug.cgi?id=1279715 https://bugzilla.redhat.com/show_bug.cgi?id=1196677 https://bugzilla.redhat.com/show_bug.cgi?id=1272315 https://bugzilla.redhat.com/show_bug.cgi?id=1294086 |
||||||||||||||||||
Attachments: |
|
Description
Mario Sanchez Prada
2015-04-27 10:02:19 PDT
Created attachment 251752 [details]
Test case (Large HTML content)
Now attaching the aforementioned HTML file that the C example would try to load
Created attachment 251753 [details] Test case (C example code) (In reply to comment #0) > [...] > I'm attaching a minimal example in C code that causes the web process to > crash when trying to load a 'crashy.html' file (to be attached later) twice > in the same web view, causing the following crash: Of course, I could not create the report without making any mistake, so I uploaded the HTML file instead of the C example :/ Now attaching the right one, sorry for the noise. To keep up with my longest bug report ever, here comes a summary of my investigation so far: Basically, it seems to me that the problem is that the implementation of CustomProtocolManagerImpl::didFailWithError() in the Web process is assuming that the GTask has not been released yet when invoked, which seems like is not always the case (either because of a bigger design issue, or a failing implementation of this method). To understand the situation better, I spent some time trying to understand how events seem to develop in this scenario, and this is the summary: First of all, when the UI process sends the DidLoad message to the CustomProtocolmanager in the Web process for the first time, data->stream is NULL in CustomProtocolManagerImpl::didLoadData() when called, which is perfectly normal at that point. Then, precisely because it's the first time didLoad() gets called, data->stream gets filled for the first time too with data coming from the UI process, so that the loader can start processing it, and the GTask inside the WebSoupRequestAsyncData object gets released, since it's no longer needed (it's purpose seems to be to setup this data->stream object for the first time). But now, because async reads from the source input stream happen very fast in the UI process, at least a second DidLoad message got probably sent to the Web process, even if the first load gets cancelled very quickly because of the second call to webkit_web_view_load_uri(). When this occurs, the second time CustomProtocolManagerImpl::didLoad() gets executed, the statement in "if (data->requestFailed())" will be true (due to the cancelled loading operation), meaning that a StopLoading message will be sent back to the UI process, to notify it that it should stop sending data. But when this StopLoading message arrives in the UI process, the GCancellable used in WebKitURISchemeRequest to initiate the asynchronous load of data (via g_input_stream_read_async()) gets cancelled as well, meaning that further calls to the webkitURISchemeRequestReadCallback() callback will get a GError when calling g_input_stream_read_finish(), which will end up getting webkit_uri_scheme_request_finish_error() executed. And what will webkit_uri_scheme_request_finish_error() do now? Send a DidFailWithError message to the Web process, which will be handled in CustomProtocolManagerImpl::didFailWithError(). Now, effectively, this method is being executed **after** the GTask object were released for a certain value customProtocolID, back when data->stream was set up for the first time. And because the implementation is something like this: void CustomProtocolManagerImpl::didFailWithError(uint64_t customProtocolID, const WebCore::ResourceError& error) { WebSoupRequestAsyncData* data = m_customProtocolMap.get(customProtocolID); ASSERT(data); GRefPtr<GTask> task = data->releaseTask(); ASSERT(task.get()); g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()), error.errorCode(), "%s", error.localizedDescription().utf8().data()); m_customProtocolMap.remove(customProtocolID); } ... we get the crash, since data->releaseTask() has been called earlier and therefore there's nothing to complete at this point. So, it looks to me that either the whole logic of how didLoad() and didFailWithError() needs "some adjustment", or the didFailError() needs to be fixed so that it handles this situation gracefully, while still being correct. Comments? Created attachment 251760 [details]
Patch proposal
Attaching a tentative fix that gets rid of the cash by handling the two possible situations with data->stream (NULL or not NULL) in didFailWithError.
Not actually asking for review yet, since I would need to add an unit test for it (having some trouble with my environment now), but good enough to help with discussion, I suppose.
Attachment 251760 [details] did not pass style-queue:
ERROR: Source/WebKit2/Shared/Network/CustomProtocols/soup/CustomProtocolManagerImpl.cpp:127: When wrapping a line, only indent 4 spaces. [whitespace/indent] [3]
Total errors found: 1 in 2 files
If any of these errors are false positives, please file a bug against check-webkit-style.
Instead of this: g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()), error.errorCode(), "%s", error.localizedDescription().utf8().data()); Write this: g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()), error.errorCode(), "%s", error.localizedDescription().utf8().data()); Created attachment 251857 [details]
Patch proposal (unit test included)
I'm having quite some trouble to run the style checker today, so I'm uploading the patch that I have now, hoping that the style bot will turn green.
Let's see...
Comment on attachment 251857 [details] Patch proposal (unit test included) View in context: https://bugs.webkit.org/attachment.cgi?id=251857&action=review > Source/WebKit2/ChangeLog:9 > + Consider already initiated loads of content coming out of a custom > + URI scheme handler when a failure has been reported from the UI process. This is not the only case failing, it can also happen that the didFailWithError message is received before the first data chunk is read, and then we would crash in didReceiveResponse() because data is null (was removed from the map in didFailWithError(). We need an early return there when data is NULL, similar to the on in didLoadData. > Source/WebKit2/Shared/Network/CustomProtocols/soup/CustomProtocolManagerImpl.cpp:123 > GRefPtr<GTask> task = data->releaseTask(); > - ASSERT(task.get()); > - g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()), > - error.errorCode(), "%s", error.localizedDescription().utf8().data()); > + if (data->stream) { > + // There won't be any task to complete if the loader has already started > + // to read data from the input stream's data sent from the UI process. > + ASSERT(!task); I think we shouldn't even call data->releaseTask() when we have a stream. I would use an early return instead, something like: if (!data->stream) { // Failed while reading the stream, the task was already completed by didLoadData(). return; } GRefPtr<GTask> task = data->releaseTask(); ASSERT(task.get()); g_task_return_new_error.... > Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:154 > + static gboolean webProcessCrashedCallback(WebKitWebView*, gpointer userData) > + { > + URISchemeTest* test = static_cast<URISchemeTest*>(userData); > + test->m_webProcessCrashed = true; > + test->quitMainLoop(); > + return FALSE; > + } We already have a way to detect unexpected web process crashes. > Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:172 > + // The slower read access to data is, the easier it gets for the crash to, > + // happen, so we use a GFileInputStream instead of a GMemoryInputStream here. This is going to be flaky > Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:213 > + URISchemeTest() > + { > + g_signal_connect(m_webView, "web-process-crashed", G_CALLBACK(webProcessCrashedCallback), this); > + } > + > + ~URISchemeTest() > + { > + g_signal_handlers_disconnect_matched(m_webView, G_SIGNAL_MATCH_DATA, 0, 0, 0, 0, this); > + } We don'ty need this, WebViewTest does this automatically. > Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:279 > + test->wait(1); This is going to be super flaky :-) There are actually at least two cases more we want to test: - didFailWithError before didReceiveResponse: This is easy. Just call webkit_uri_scheme_request_finish_error() right after webkit_uri_scheme_request_finish(), and wait until load finishes, it will crash. - didFailWithError after the first didLoadData: This is more tricky, because we need to ensure, there's at least one chunk read, and we don't finish reading the stream. Maybe we could use a pipe, and create a GUnixInputStream. Then we write something in the pipe, and then call webkit_uri_scheme_request_finish_error(). I would use better names than crashy, something like error:before-response and error:after-first-chunk. Since it's a load tracking test, it would also be interesting to wait until load finishes (instead of one second) and check that the loader client callbacks were correctly called. > Tools/TestWebKitAPI/gtk/WebKit2Gtk/LoadTrackingTest.cpp:30 > + // We need to reset the value of m_loadFailed or any use of LoadTrackingTest's > + // API after a load had failed will be wrongly considered to have failed as well. > + if (loadEvent != WEBKIT_LOAD_FINISHED) > + test->m_loadFailed = false; I guess this assumes we always use waitUntilLoadFinishes > Tools/TestWebKitAPI/gtk/WebKit2Gtk/LoadTrackingTest.cpp:187 > + if (!m_loadFailed) { > + // This assertion only makes sense if the current load has not failed, > + // otherwise we might be comparing values relative to separate loads. > + g_assert_cmpfloat(m_estimatedProgress, <, progress); > + } Ditto. Comment on attachment 251857 [details] Patch proposal (unit test included) View in context: https://bugs.webkit.org/attachment.cgi?id=251857&action=review Thanks a lot for the review, Carlos. I will address those issues soon enough, but in the meanwhile see my answers to your comments below... >> Source/WebKit2/ChangeLog:9 >> + URI scheme handler when a failure has been reported from the UI process. > > This is not the only case failing, it can also happen that the didFailWithError message is received before the first data chunk is read, and then we would crash in didReceiveResponse() because data is null (was removed from the map in didFailWithError(). We need an early return there when data is NULL, similar to the on in didLoadData. I see. I reported it this way because this is what I've seen myself, but will change the description in a follow-up patch >> Source/WebKit2/Shared/Network/CustomProtocols/soup/CustomProtocolManagerImpl.cpp:123 >> + ASSERT(!task); > > I think we shouldn't even call data->releaseTask() when we have a stream. I would use an early return instead, something like: > > if (!data->stream) { > // Failed while reading the stream, the task was already completed by didLoadData(). > return; > } > > GRefPtr<GTask> task = data->releaseTask(); > ASSERT(task.get()); > g_task_return_new_error.... That was my initial version, but I was unsure about doing it and in the end preferred to do it always and assert task or !task depending on the case. Anyway, I will change it in the next patch >> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:154 >> + } > > We already have a way to detect unexpected web process crashes. I see, sorry about that. Will remove it. >> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:172 >> + // happen, so we use a GFileInputStream instead of a GMemoryInputStream here. > > This is going to be flaky Well, yeah... thing is that it does not happen always but doing this in my new laptop (with an SSD) I get the crash 8/10 times. If I use a GMemoryInputStream, I can't get it a single time, so I thought it could be acceptable. >> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:213 >> + } > > We don'ty need this, WebViewTest does this automatically. Ok >> Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:279 >> + test->wait(1); > > This is going to be super flaky :-) > > There are actually at least two cases more we want to test: > > - didFailWithError before didReceiveResponse: This is easy. Just call webkit_uri_scheme_request_finish_error() right after webkit_uri_scheme_request_finish(), and wait until load finishes, it will crash. > - didFailWithError after the first didLoadData: This is more tricky, because we need to ensure, there's at least one chunk read, and we don't finish reading the stream. Maybe we could use a pipe, and create a GUnixInputStream. Then we write something in the pipe, and then call webkit_uri_scheme_request_finish_error(). > > I would use better names than crashy, something like error:before-response and error:after-first-chunk. Since it's a load tracking test, it would also be interesting to wait until load finishes (instead of one second) and check that the loader client callbacks were correctly called. Same thing here :). In my case, I reproduce it 8/10 times consistently with this 1sec delay, which I chose because I saw it as the max value used in other tests. I'd rather do something like waitUntilFinishLoading, but can't do it because I risk canceling the wrong thing... but perhaps I was doing it wrong, based on your comment above. Anyway, about those two cases: - didFailWithError before didReceiveResponse: Ok, will follow your suggestion. - didFailWithError after the first didLoadData: this is the one that I'm trying to test I'm checking here and which, as I said, I manage to reproduce 8/10 times with the current patch. I'll give a try to your suggestion, though About the names, sure I can pick better ones, no problem :) >> Tools/TestWebKitAPI/gtk/WebKit2Gtk/LoadTrackingTest.cpp:30 >> + test->m_loadFailed = false; > > I guess this assumes we always use waitUntilLoadFinishes Not sure I understand your comment. Mind to ellaborate? >> Tools/TestWebKitAPI/gtk/WebKit2Gtk/LoadTrackingTest.cpp:187 >> + } > > Ditto. Same thing :) (In reply to comment #9) > [...] > > I think we shouldn't even call data->releaseTask() when we have a stream. I would use an early return instead, something like: > > > > if (!data->stream) { > > // Failed while reading the stream, the task was already completed by didLoadData(). > > return; > > } > > > > GRefPtr<GTask> task = data->releaseTask(); > > ASSERT(task.get()); > > g_task_return_new_error.... > > That was my initial version, but I was unsure about doing it and in the end > preferred to do it always and assert task or !task depending on the case. > > Anyway, I will change it in the next patch Just a quick heads-up after a conversation on IRC with Carlos: we can't really early return because we still need to remove the data from the m_customProtocolMap hash table, and also, perhaps ASSERT() task and !task depending on the case would not be a too bad thing either... so we kind of settled on the following code for didFailError(): void CustomProtocolManagerImpl::didFailWithError(uint64_t customProtocolID, const WebCore::ResourceError& error) { WebSoupRequestAsyncData* data = m_customProtocolMap.get(customProtocolID); ASSERT(data); // Either we haven't started reading the stream yet, in which case we need to complete the // task first, or we failed reading it and the task was already completed by didLoadData(). ASSERT(!data->stream || !data->task); if (!data->stream) { GRefPtr<GTask> task = data->releaseTask(); ASSERT(task.get()); g_task_return_new_error(task.get(), g_quark_from_string(error.domain().utf8().data()), error.errorCode(), "%s", error.localizedDescription().utf8().data()); } m_customProtocolMap.remove(customProtocolID); } Still, I need to write the unit test in a proper way, but I thought I would share this before, because previous comments could be confusing. Created attachment 252954 [details]
Patch proposal (unit test included)
New patch implementing a slightly different approach, based on the experience writing the unit tests, because the original proposition posed problems when testing other scenarios than the originally described one (see Carlos previous comment). This one also includes more simplified and flakyness-free unit tests while not causing any regression. Please review, thanks
Thanks for the patch. If this patch contains new public API please make sure it follows the guidelines for new WebKit2 GTK+ API. See http://trac.webkit.org/wiki/WebKitGTK/AddingNewWebKit2API Comment on attachment 252954 [details] Patch proposal (unit test included) View in context: https://bugs.webkit.org/attachment.cgi?id=252954&action=review I'm sorry but this breaks the model. ResourceHandle makes the soup request, and when it finishes it calls didReceiveResponse and starts to read the given stream calling didReceiveBuffer for very chunk read. When it finishes reading the stream, didFinishLoad is called. In case of error at any point didFail is called. What you are doing now is that the request doesn't finish until we have filled the input stream, and then our custom WebKitSoupRequestInputStream is useless, we could use a GMemoryInoutStream instead. The idea was to wait until the first chunk to create the input stream efficiently, using GMemoryInputStream when there's a single chunk or WebKitSoupRequestInputStream to be able to start reading the stream without having to wait for the stream to be filled. > Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:303 > + g_assert(test->m_loadEvents.contains(LoadTrackingTest::ProvisionalLoadFailed)); > + g_assert(!test->m_loadEvents.contains(LoadTrackingTest::LoadFailed)); And this proves it, I would expect this to be a LoadFailed, not a ProvisionalLoadFailed, since after the first chunk the load is committed. But our ResourceHandle thinks we are still waiting for the response because the soup request hasn't finished. (In reply to comment #13) > Comment on attachment 252954 [details] > Patch proposal (unit test included) > > View in context: > https://bugs.webkit.org/attachment.cgi?id=252954&action=review > > I'm sorry but this breaks the model. ResourceHandle makes the soup request, > and when it finishes it calls didReceiveResponse and starts to read the > given stream calling didReceiveBuffer for very chunk read. When it finishes > reading the stream, didFinishLoad is called. In case of error at any point > didFail is called. What you are doing now is that the request doesn't finish > until we have filled the input stream, and then our custom > WebKitSoupRequestInputStream is useless, we could use a GMemoryInoutStream > instead. The idea was to wait until the first chunk to create the input > stream efficiently, using GMemoryInputStream when there's a single chunk or > WebKitSoupRequestInputStream to be able to start reading the stream without > having to wait for the stream to be filled. Everything makes more sense now, thanks for the explanation. > > Tools/TestWebKitAPI/Tests/WebKit2Gtk/TestWebKitWebContext.cpp:303 > > + g_assert(test->m_loadEvents.contains(LoadTrackingTest::ProvisionalLoadFailed)); > > + g_assert(!test->m_loadEvents.contains(LoadTrackingTest::LoadFailed)); > > And this proves it, I would expect this to be a LoadFailed, not a > ProvisionalLoadFailed, since after the first chunk the load is committed. > But our ResourceHandle thinks we are still waiting for the response because > the soup request hasn't finished. I think you're right, even though I feel of course a bit frustrated about this patch being not useful in the end. But worry not, I will try to create a new version of this patch taking into consideration your comments plus all I learned so far while doing the previous patch. Can't commit to a deadline though, but I'll try not to let this fall into oblivion easily. Feel free to ping me if I fail on that mission, though. Thanks for the review! Created attachment 266992 [details]
Patch proposal (Unit test included) v2
Attached a new patch with all the work done during the hackfest. This should fix the crash, cover all the edge cases and not change the model in the way :)
Please review, thanks!
Committed r193830: <http://trac.webkit.org/changeset/193830> |