85880 – [SOUP] Allow sending URI request data in chunks

RESOLVED FIXED 85880

[SOUP] Allow sending URI request data in chunks

https://bugs.webkit.org/show_bug.cgi?id=85880

Summary [SOUP] Allow sending URI request data in chunks

Carlos Garcia Campos

Reported 2012-05-08 06:33:29 PDT

Currently we have a single IPC message that sends all the uri request data in a single chunk.

Attachments
Patch (21.28 KB, patch) 2012-05-08 06:44 PDT, Carlos Garcia Campos	no flags	Details Formatted Diff Diff
Updated patch (23.18 KB, patch) 2012-05-08 10:19 PDT, Carlos Garcia Campos	no flags	Details Formatted Diff Diff
Updated patch according to review comments (24.92 KB, patch) 2012-05-10 04:12 PDT, Carlos Garcia Campos	mrobinson: review+	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Carlos Garcia Campos

Comment 1 2012-05-08 06:44:33 PDT

Created attachment 140703 [details] Patch

Carlos Garcia Campos

Comment 2 2012-05-08 07:00:26 PDT

It seems g_pollable_source_new_full() is too new.

Dan Winship

Comment 3 2012-05-08 08:12:13 PDT

Comment on attachment 140703 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=140703&action=review > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:43 > + GRefPtr<GSource> baseSource = adoptGRef(g_timeout_source_new(0)); That means the source will always trigger immediately, and so if there's not actually any data available, the reader will spin in a tight loop. When the stream is "empty" but not finished, you have to return a source that won't trigger until after the next time webkitSoupRequestInputStreamAddData() has been called. One way to do that would be to have a "GCancellable *waiting_for_data;" in priv, and create a source with g_cancellable_create_source(). Then do g_cancellable_cancel(priv->waiting_for_data) in webkitSoupRequestInputStreamAddData() and g_cancellable_reset(priv->waiting_for_data) in webkitSoupRequestInputStreamPollableReadNonblocking(). (Using GCancellable for this is semantically backwards, but... it works.) > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:44 > + return g_pollable_source_new_full(stream, baseSource.get(), cancellable); as you noted, g_pollable_source_new_full() is very new. but you can just reimplement it by hand; it's just a small wrapper around g_pollable_source_new(), which is older. > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:82 > +static void webkit_soup_request_input_stream_class_init(WebKitSoupRequestInputStreamClass* requestStreamClass) Since you're not overriding GInputStreamClass->read_fn, it will use GMemoryInputStream's implementation, which won't deal with streaming data correctly. It *can't* deal with it correctly, but it seems like it might be better to have it either return G_IO_ERROR_NOT_SUPPORTED, or else do g_return_val_if_reached(-1); > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:108 > + return stream->priv->bytesAdded == stream->priv->contentLength; Hm... what about for chunked data, when the final length won't be known in advance? What does WebKit2 do in that case?

Carlos Garcia Campos

Comment 4 2012-05-08 08:24:46 PDT

(In reply to comment #3) > (From update of attachment 140703 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=140703&action=review Thanks for looking at it! > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:43 > > + GRefPtr<GSource> baseSource = adoptGRef(g_timeout_source_new(0)); > > That means the source will always trigger immediately, and so if there's not actually any data available, the reader will spin in a tight loop. When the stream is "empty" but not finished, you have to return a source that won't trigger until after the next time webkitSoupRequestInputStreamAddData() has been called. Yeah, I thought about it, but assumed the ui client would send more data soon. I agree we should use a custom GSource to implement it properly. > One way to do that would be to have a "GCancellable *waiting_for_data;" in priv, and create a source with g_cancellable_create_source(). Then do g_cancellable_cancel(priv->waiting_for_data) in webkitSoupRequestInputStreamAddData() and g_cancellable_reset(priv->waiting_for_data) in webkitSoupRequestInputStreamPollableReadNonblocking(). (Using GCancellable for this is semantically backwards, but... it works.) It's a bit tricky, but easier than implementing a new GSource. > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:44 > > + return g_pollable_source_new_full(stream, baseSource.get(), cancellable); > > as you noted, g_pollable_source_new_full() is very new. but you can just reimplement it by hand; it's just a small wrapper around g_pollable_source_new(), which is older. Yes, I was looking at the latest changes in gio. > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:82 > > +static void webkit_soup_request_input_stream_class_init(WebKitSoupRequestInputStreamClass* requestStreamClass) > > Since you're not overriding GInputStreamClass->read_fn, it will use GMemoryInputStream's implementation, which won't deal with streaming data correctly. It *can't* deal with it correctly, but it seems like it might be better to have it either return G_IO_ERROR_NOT_SUPPORTED, or else do g_return_val_if_reached(-1); Yes, I assumed it was not going to be used, since this input stream is not a generic object in the end, and will only be used by ResourceHandleSoup. > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:108 > > + return stream->priv->bytesAdded == stream->priv->contentLength; > > Hm... what about for chunked data, when the final length won't be known in advance? What does WebKit2 do in that case? The API always has to provide the content_length in advance, doesn't SoupRequest also require to set content_length before finish()?

Carlos Garcia Campos

Comment 5 2012-05-08 10:19:21 PDT

Created attachment 140735 [details] Updated patch Use a different approach for our custom stream that should work with previous versions of glib.

Dan Winship

Comment 6 2012-05-08 11:25:59 PDT

(In reply to comment #4) > > Hm... what about for chunked data, when the final length won't be known in advance? What does WebKit2 do in that case? > > The API always has to provide the content_length in advance, doesn't SoupRequest also require to set content_length before finish()? Yes, if it's known. But the docs explicitly say soup_request_get_content_length() can return -1 too.

Martin Robinson

Comment 7 2012-05-09 17:59:23 PDT

Comment on attachment 140735 [details] Updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=140735&action=review This looks pretty good to me! I have a few outstanding questions though, before I feel comfortable giving an r+. > Source/WebKit2/ChangeLog:7 > + Reviewed by NOBODY (OOPS!). > + It's probably a good idea to write a couple sentences describing this change in general, for the sake of future readers. > Source/WebKit2/UIProcess/soup/WebSoupRequestManagerProxy.cpp:78 > if (!m_client.didReceiveURIRequest(this, WebURL::create(uriString).get(), requestID)) > - handleURIRequest(WebData::create(0, 0).get(), String(), requestID); > + handleURIRequest(WebData::create(0, 0).get(), 0, String(), requestID); Hrm. Perhaps this should return an error? > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:45 > + uint64_t bytesAdded; bytesReceivedFromWebProcess or bytesReceived? bytesAdded could easily refer to "bytes added to the output." > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:48 > + Mutex readLock; I'm a little confused as to what exactly this mutex is protecting. Is it protecting just the data in this private structure or any operations at all? > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:81 > + stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); Won't this break if g_input_stream_read_async is called twice in a row? > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:135 > + if (webkitSoupRequestInputStreamFinished(stream)) > + return; Does this need to be protected by the mutex or can the mutex be locked after this early return¿ > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:137 > + if (stream->priv->bytesAdded + dataLength > stream->priv->contentLength) > + dataLength = stream->priv->contentLength - stream->priv->bytesAdded; Dan should confirm, but is it really an error to pass more data to soup than specified by the content length? What about the case where contentLength = -1 that he mentioned above?

Martin Robinson

Comment 8 2012-05-09 18:03:52 PDT

Comment on attachment 140735 [details] Updated patch View in context: https://bugs.webkit.org/attachment.cgi?id=140735&action=review > Source/WebKit2/WebProcess/soup/WebSoupRequestManager.h:58 > + void handleURIRequest(const CoreIPC::DataReference&, uint64_t contentLength, const String& mimeType, uint64_t requestID); > + void addDataToURIRequest(const CoreIPC::DataReference&, uint64_t requestID); Another quick comment here... Perhaps these names should be similar to the names that WebCore uses in these situations, like handleURISchemeResponseReceived and handleURISchemeResponseData or something even better.

Carlos Garcia Campos

Comment 9 2012-05-09 23:52:45 PDT

(In reply to comment #7) > (From update of attachment 140735 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=140735&action=review > > This looks pretty good to me! I have a few outstanding questions though, before I feel comfortable giving an r+. Thanks! > > Source/WebKit2/ChangeLog:7 > > + Reviewed by NOBODY (OOPS!). > > + > > It's probably a good idea to write a couple sentences describing this change in general, for the sake of future readers. Fair enough > > Source/WebKit2/UIProcess/soup/WebSoupRequestManagerProxy.cpp:78 > > if (!m_client.didReceiveURIRequest(this, WebURL::create(uriString).get(), requestID)) > > - handleURIRequest(WebData::create(0, 0).get(), String(), requestID); > > + handleURIRequest(WebData::create(0, 0).get(), 0, String(), requestID); > > Hrm. Perhaps this should return an error? No, this is handled in the web process, and indeed an error is shown, the same error that if you type foo:bar and foo is not handled. > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:45 > > + uint64_t bytesAdded; > > bytesReceivedFromWebProcess or bytesReceived? bytesAdded could easily refer to "bytes added to the output." Ok, I'll think a better name. > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:48 > > + Mutex readLock; > > I'm a little confused as to what exactly this mutex is protecting. Is it protecting just the data in this private structure or any operations at all? GLib will run read_async in a thread, so webkitSoupRequestInputStreamReadAsync() and webkitSoupRequestInputStreamAddData() can be run at the same time. We need to protect bytesAdded, asyncReadData, etc. > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:81 > > + stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); > > Won't this break if g_input_stream_read_async is called twice in a row? You are not supposed to call read_async twice in a row, in such case, our read_async is not called because g_input_stream_read_async() returns early with the error G_IO_ERROR_PENDING. > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:135 > > + if (webkitSoupRequestInputStreamFinished(stream)) > > + return; > > Does this need to be protected by the mutex or can the mutex be locked after this early return¿ That reads bytesAdded that is only written by this method, so it can probably be moved after the early return. > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:137 > > + if (stream->priv->bytesAdded + dataLength > stream->priv->contentLength) > > + dataLength = stream->priv->contentLength - stream->priv->bytesAdded; > > Dan should confirm, but is it really an error to pass more data to soup than specified by the content length? That's not actually an error, the data is simply truncated. > What about the case where contentLength = -1 that he mentioned above? We are not supporting such case, we need to now when more data is expected. We could implement that case by always sending a 0 bytes data message when contentLength is -1 and all data has been read from the input stream.

Carlos Garcia Campos

Comment 10 2012-05-09 23:56:38 PDT

(In reply to comment #8) > (From update of attachment 140735 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=140735&action=review > > > Source/WebKit2/WebProcess/soup/WebSoupRequestManager.h:58 > > + void handleURIRequest(const CoreIPC::DataReference&, uint64_t contentLength, const String& mimeType, uint64_t requestID); > > + void addDataToURIRequest(const CoreIPC::DataReference&, uint64_t requestID); > > Another quick comment here... Perhaps these names should be similar to the names that WebCore uses in these situations, like handleURISchemeResponseReceived and handleURISchemeResponseData or something even better. Yes, I don't like the names either, but I couldn't think of a better name.

Carlos Garcia Campos

Comment 11 2012-05-10 03:54:39 PDT

(In reply to comment #10) > (In reply to comment #8) > > (From update of attachment 140735 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=140735&action=review > > > > > Source/WebKit2/WebProcess/soup/WebSoupRequestManager.h:58 > > > + void handleURIRequest(const CoreIPC::DataReference&, uint64_t contentLength, const String& mimeType, uint64_t requestID); > > > + void addDataToURIRequest(const CoreIPC::DataReference&, uint64_t requestID); > > > > Another quick comment here... Perhaps these names should be similar to the names that WebCore uses in these situations, like handleURISchemeResponseReceived and handleURISchemeResponseData or something even better. > > Yes, I don't like the names either, but I couldn't think of a better name. The names are indeed confusing, the request is handled by the UI process, so sending a message to the web process called HandleFoo makes me think the web process is supposeed to handle it. So, I think DidHandleURIRequest and DidReceiveURIREquestData make more sense.

Carlos Garcia Campos

Comment 12 2012-05-10 04:12:19 PDT

Created attachment 141142 [details] Updated patch according to review comments Addressed all review comments and also added support for unknown content length and mime type for compatibility with soup

Martin Robinson

Comment 13 2012-05-10 11:21:55 PDT

Comment on attachment 141142 [details] Updated patch according to review comments View in context: https://bugs.webkit.org/attachment.cgi?id=141142&action=review Okay! Thanks for addressing my comments above. Before landing please consider the following simplification and clean up. I think it will make the code a lot more readable. I had some small difficulty understanding this partly because (1) I'm an idiot and (2) It's tricky to keep track of "bytes read out" and "bytes received" especially when their names both conjure images of bytes coming in. > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:83 > + // There is data in the memory stream to read. > + if (stream->priv->bytesRead < stream->priv->bytesReceived) { > + webkitSoupRequestInputStreamReadAsyncResultComplete(stream, result.get(), buffer, count, cancellable); > + return; > + } > + > + // We are expecting more data to be added. > + if (!webkitSoupRequestInputStreamFinished(stream)) { > + stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); > + return; > + } I feel that it might be slightly simpler to unconditionally create stream->priv->asyncReadData here. You could do something like: bool moreDataToSendForRead = stream->priv->bytesRead < stream->priv->bytesReceived; if (!moreDataToSendForRead && webkitSoupRequestInputStreamFinished(stream)) { g_simple_async_result_set_op_res_gssize(result.get(), 0); g_simple_async_result_complete_in_idle(result.get()); return; } stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); if (moreDataToSendForRead) webkitSoupRequestInputStreamReadAsyncResultComplete(stream, cancellable); This way webkitSoupRequestInputStreamReadAsyncResultComplete would be responsible for calling stream->priv->asyncReadData.clear(), which seems to make more sense since it consumes it. Also, instead of adding a comment like: // We are expecting more data to be added. if (!webkitSoupRequestInputStreamFinished(stream)) { I think it makes sense to simply rename webkitSoupRequestInputStreamFinished to webkitSoupRequestWaitingForMoreDataToBeAdded or something better. I recommend doing the same for stream->priv->bytesRead < stream->priv->bytesReceived by creating a helper like webkitSoupRequestSentAllDataForPendingRead or something similar. Helper methods with good names are almost always better than comments, I'd say.

Carlos Garcia Campos

Comment 14 2012-05-11 00:40:15 PDT

(In reply to comment #13) > (From update of attachment 141142 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=141142&action=review > > Okay! Thanks for addressing my comments above. Before landing please consider the following simplification and clean up. I think it will make the code a lot more readable. I had some small difficulty understanding this partly because (1) I'm an idiot and (2) It's tricky to keep track of "bytes read out" and "bytes received" especially when their names both conjure images of bytes coming in. > > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:83 > > + // There is data in the memory stream to read. > > + if (stream->priv->bytesRead < stream->priv->bytesReceived) { > > + webkitSoupRequestInputStreamReadAsyncResultComplete(stream, result.get(), buffer, count, cancellable); > > + return; > > + } > > + > > + // We are expecting more data to be added. > > + if (!webkitSoupRequestInputStreamFinished(stream)) { > > + stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); > > + return; > > + } > > I feel that it might be slightly simpler to unconditionally create stream->priv->asyncReadData here. That would mean adding an unnecessary alloc/dealloc in the case where we already have the data ready in the stream to read from. AsyncReadData is to postpone the async result completion until we have more data in the stream to read from. > You could do something like: > > bool moreDataToSendForRead = stream->priv->bytesRead < stream->priv->bytesReceived; > if (!moreDataToSendForRead && webkitSoupRequestInputStreamFinished(stream)) { > g_simple_async_result_set_op_res_gssize(result.get(), 0); > g_simple_async_result_complete_in_idle(result.get()); > return; > } > > stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); > if (moreDataToSendForRead) > webkitSoupRequestInputStreamReadAsyncResultComplete(stream, cancellable); > > > This way webkitSoupRequestInputStreamReadAsyncResultComplete would be responsible for calling stream->priv->asyncReadData.clear(), which seems to make more sense since it consumes it. > > Also, instead of adding a comment like: > // We are expecting more data to be added. > if (!webkitSoupRequestInputStreamFinished(stream)) { > > I think it makes sense to simply rename webkitSoupRequestInputStreamFinished to webkitSoupRequestWaitingForMoreDataToBeAdded or something better. I recommend doing the same for stream->priv->bytesRead < stream->priv->bytesReceived by creating a helper like webkitSoupRequestSentAllDataForPendingRead or something similar. Helper methods with good names are almost always better than comments, I'd say. I'll try to use better names

Martin Robinson

Comment 15 2012-05-11 01:06:36 PDT

(In reply to comment #14) > That would mean adding an unnecessary alloc/dealloc in the case where we already have the data ready in the stream to read from. AsyncReadData is to postpone the async result completion until we have more data in the stream to read from. The allocation in this case is very cheap, but one way to avoid it would be to add a method to AsyncReadData to mark it as invalid and never allocate and deallocate it.

Carlos Garcia Campos

Comment 16 2012-05-11 01:13:58 PDT

Committed r116738: <http://trac.webkit.org/changeset/116738>

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware PC

OS Linux

Product WebKit

Component WebKit2

Assignee

Nobody

Reported

2012-05-08 06:33 PDT

Modified

2012-05-11 01:13 PDT History

CC List

7 users Show

URL

Keywords Gtk, Soup

Depends on

Blocks

84133

Dependencies

tree graph