WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
85880
[SOUP] Allow sending URI request data in chunks
https://bugs.webkit.org/show_bug.cgi?id=85880
Summary
[SOUP] Allow sending URI request data in chunks
Carlos Garcia Campos
Reported
2012-05-08 06:33:29 PDT
Currently we have a single IPC message that sends all the uri request data in a single chunk.
Attachments
Patch
(21.28 KB, patch)
2012-05-08 06:44 PDT
,
Carlos Garcia Campos
no flags
Details
Formatted Diff
Diff
Updated patch
(23.18 KB, patch)
2012-05-08 10:19 PDT
,
Carlos Garcia Campos
no flags
Details
Formatted Diff
Diff
Updated patch according to review comments
(24.92 KB, patch)
2012-05-10 04:12 PDT
,
Carlos Garcia Campos
mrobinson
: review+
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Carlos Garcia Campos
Comment 1
2012-05-08 06:44:33 PDT
Created
attachment 140703
[details]
Patch
Carlos Garcia Campos
Comment 2
2012-05-08 07:00:26 PDT
It seems g_pollable_source_new_full() is too new.
Dan Winship
Comment 3
2012-05-08 08:12:13 PDT
Comment on
attachment 140703
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=140703&action=review
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:43 > + GRefPtr<GSource> baseSource = adoptGRef(g_timeout_source_new(0));
That means the source will always trigger immediately, and so if there's not actually any data available, the reader will spin in a tight loop. When the stream is "empty" but not finished, you have to return a source that won't trigger until after the next time webkitSoupRequestInputStreamAddData() has been called. One way to do that would be to have a "GCancellable *waiting_for_data;" in priv, and create a source with g_cancellable_create_source(). Then do g_cancellable_cancel(priv->waiting_for_data) in webkitSoupRequestInputStreamAddData() and g_cancellable_reset(priv->waiting_for_data) in webkitSoupRequestInputStreamPollableReadNonblocking(). (Using GCancellable for this is semantically backwards, but... it works.)
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:44 > + return g_pollable_source_new_full(stream, baseSource.get(), cancellable);
as you noted, g_pollable_source_new_full() is very new. but you can just reimplement it by hand; it's just a small wrapper around g_pollable_source_new(), which is older.
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:82 > +static void webkit_soup_request_input_stream_class_init(WebKitSoupRequestInputStreamClass* requestStreamClass)
Since you're not overriding GInputStreamClass->read_fn, it will use GMemoryInputStream's implementation, which won't deal with streaming data correctly. It *can't* deal with it correctly, but it seems like it might be better to have it either return G_IO_ERROR_NOT_SUPPORTED, or else do g_return_val_if_reached(-1);
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:108 > + return stream->priv->bytesAdded == stream->priv->contentLength;
Hm... what about for chunked data, when the final length won't be known in advance? What does WebKit2 do in that case?
Carlos Garcia Campos
Comment 4
2012-05-08 08:24:46 PDT
(In reply to
comment #3
)
> (From update of
attachment 140703
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=140703&action=review
Thanks for looking at it!
> > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:43 > > + GRefPtr<GSource> baseSource = adoptGRef(g_timeout_source_new(0)); > > That means the source will always trigger immediately, and so if there's not actually any data available, the reader will spin in a tight loop. When the stream is "empty" but not finished, you have to return a source that won't trigger until after the next time webkitSoupRequestInputStreamAddData() has been called.
Yeah, I thought about it, but assumed the ui client would send more data soon. I agree we should use a custom GSource to implement it properly.
> One way to do that would be to have a "GCancellable *waiting_for_data;" in priv, and create a source with g_cancellable_create_source(). Then do g_cancellable_cancel(priv->waiting_for_data) in webkitSoupRequestInputStreamAddData() and g_cancellable_reset(priv->waiting_for_data) in webkitSoupRequestInputStreamPollableReadNonblocking(). (Using GCancellable for this is semantically backwards, but... it works.)
It's a bit tricky, but easier than implementing a new GSource.
> > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:44 > > + return g_pollable_source_new_full(stream, baseSource.get(), cancellable); > > as you noted, g_pollable_source_new_full() is very new. but you can just reimplement it by hand; it's just a small wrapper around g_pollable_source_new(), which is older.
Yes, I was looking at the latest changes in gio.
> > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:82 > > +static void webkit_soup_request_input_stream_class_init(WebKitSoupRequestInputStreamClass* requestStreamClass) > > Since you're not overriding GInputStreamClass->read_fn, it will use GMemoryInputStream's implementation, which won't deal with streaming data correctly. It *can't* deal with it correctly, but it seems like it might be better to have it either return G_IO_ERROR_NOT_SUPPORTED, or else do g_return_val_if_reached(-1);
Yes, I assumed it was not going to be used, since this input stream is not a generic object in the end, and will only be used by ResourceHandleSoup.
> > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:108 > > + return stream->priv->bytesAdded == stream->priv->contentLength; > > Hm... what about for chunked data, when the final length won't be known in advance? What does WebKit2 do in that case?
The API always has to provide the content_length in advance, doesn't SoupRequest also require to set content_length before finish()?
Carlos Garcia Campos
Comment 5
2012-05-08 10:19:21 PDT
Created
attachment 140735
[details]
Updated patch Use a different approach for our custom stream that should work with previous versions of glib.
Dan Winship
Comment 6
2012-05-08 11:25:59 PDT
(In reply to
comment #4
)
> > Hm... what about for chunked data, when the final length won't be known in advance? What does WebKit2 do in that case? > > The API always has to provide the content_length in advance, doesn't SoupRequest also require to set content_length before finish()?
Yes, if it's known. But the docs explicitly say soup_request_get_content_length() can return -1 too.
Martin Robinson
Comment 7
2012-05-09 17:59:23 PDT
Comment on
attachment 140735
[details]
Updated patch View in context:
https://bugs.webkit.org/attachment.cgi?id=140735&action=review
This looks pretty good to me! I have a few outstanding questions though, before I feel comfortable giving an r+.
> Source/WebKit2/ChangeLog:7 > + Reviewed by NOBODY (OOPS!). > +
It's probably a good idea to write a couple sentences describing this change in general, for the sake of future readers.
> Source/WebKit2/UIProcess/soup/WebSoupRequestManagerProxy.cpp:78 > if (!m_client.didReceiveURIRequest(this, WebURL::create(uriString).get(), requestID)) > - handleURIRequest(WebData::create(0, 0).get(), String(), requestID); > + handleURIRequest(WebData::create(0, 0).get(), 0, String(), requestID);
Hrm. Perhaps this should return an error?
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:45 > + uint64_t bytesAdded;
bytesReceivedFromWebProcess or bytesReceived? bytesAdded could easily refer to "bytes added to the output."
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:48 > + Mutex readLock;
I'm a little confused as to what exactly this mutex is protecting. Is it protecting just the data in this private structure or any operations at all?
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:81 > + stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable));
Won't this break if g_input_stream_read_async is called twice in a row?
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:135 > + if (webkitSoupRequestInputStreamFinished(stream)) > + return;
Does this need to be protected by the mutex or can the mutex be locked after this early return¿
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:137 > + if (stream->priv->bytesAdded + dataLength > stream->priv->contentLength) > + dataLength = stream->priv->contentLength - stream->priv->bytesAdded;
Dan should confirm, but is it really an error to pass more data to soup than specified by the content length? What about the case where contentLength = -1 that he mentioned above?
Martin Robinson
Comment 8
2012-05-09 18:03:52 PDT
Comment on
attachment 140735
[details]
Updated patch View in context:
https://bugs.webkit.org/attachment.cgi?id=140735&action=review
> Source/WebKit2/WebProcess/soup/WebSoupRequestManager.h:58 > + void handleURIRequest(const CoreIPC::DataReference&, uint64_t contentLength, const String& mimeType, uint64_t requestID); > + void addDataToURIRequest(const CoreIPC::DataReference&, uint64_t requestID);
Another quick comment here... Perhaps these names should be similar to the names that WebCore uses in these situations, like handleURISchemeResponseReceived and handleURISchemeResponseData or something even better.
Carlos Garcia Campos
Comment 9
2012-05-09 23:52:45 PDT
(In reply to
comment #7
)
> (From update of
attachment 140735
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=140735&action=review
> > This looks pretty good to me! I have a few outstanding questions though, before I feel comfortable giving an r+.
Thanks!
> > Source/WebKit2/ChangeLog:7 > > + Reviewed by NOBODY (OOPS!). > > + > > It's probably a good idea to write a couple sentences describing this change in general, for the sake of future readers.
Fair enough
> > Source/WebKit2/UIProcess/soup/WebSoupRequestManagerProxy.cpp:78 > > if (!m_client.didReceiveURIRequest(this, WebURL::create(uriString).get(), requestID)) > > - handleURIRequest(WebData::create(0, 0).get(), String(), requestID); > > + handleURIRequest(WebData::create(0, 0).get(), 0, String(), requestID); > > Hrm. Perhaps this should return an error?
No, this is handled in the web process, and indeed an error is shown, the same error that if you type foo:bar and foo is not handled.
> > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:45 > > + uint64_t bytesAdded; > > bytesReceivedFromWebProcess or bytesReceived? bytesAdded could easily refer to "bytes added to the output."
Ok, I'll think a better name.
> > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:48 > > + Mutex readLock; > > I'm a little confused as to what exactly this mutex is protecting. Is it protecting just the data in this private structure or any operations at all?
GLib will run read_async in a thread, so webkitSoupRequestInputStreamReadAsync() and webkitSoupRequestInputStreamAddData() can be run at the same time. We need to protect bytesAdded, asyncReadData, etc.
> > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:81 > > + stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); > > Won't this break if g_input_stream_read_async is called twice in a row?
You are not supposed to call read_async twice in a row, in such case, our read_async is not called because g_input_stream_read_async() returns early with the error G_IO_ERROR_PENDING.
> > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:135 > > + if (webkitSoupRequestInputStreamFinished(stream)) > > + return; > > Does this need to be protected by the mutex or can the mutex be locked after this early return¿
That reads bytesAdded that is only written by this method, so it can probably be moved after the early return.
> > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:137 > > + if (stream->priv->bytesAdded + dataLength > stream->priv->contentLength) > > + dataLength = stream->priv->contentLength - stream->priv->bytesAdded; > > Dan should confirm, but is it really an error to pass more data to soup than specified by the content length?
That's not actually an error, the data is simply truncated.
> What about the case where contentLength = -1 that he mentioned above?
We are not supporting such case, we need to now when more data is expected. We could implement that case by always sending a 0 bytes data message when contentLength is -1 and all data has been read from the input stream.
Carlos Garcia Campos
Comment 10
2012-05-09 23:56:38 PDT
(In reply to
comment #8
)
> (From update of
attachment 140735
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=140735&action=review
> > > Source/WebKit2/WebProcess/soup/WebSoupRequestManager.h:58 > > + void handleURIRequest(const CoreIPC::DataReference&, uint64_t contentLength, const String& mimeType, uint64_t requestID); > > + void addDataToURIRequest(const CoreIPC::DataReference&, uint64_t requestID); > > Another quick comment here... Perhaps these names should be similar to the names that WebCore uses in these situations, like handleURISchemeResponseReceived and handleURISchemeResponseData or something even better.
Yes, I don't like the names either, but I couldn't think of a better name.
Carlos Garcia Campos
Comment 11
2012-05-10 03:54:39 PDT
(In reply to
comment #10
)
> (In reply to
comment #8
) > > (From update of
attachment 140735
[details]
[details]) > > View in context:
https://bugs.webkit.org/attachment.cgi?id=140735&action=review
> > > > > Source/WebKit2/WebProcess/soup/WebSoupRequestManager.h:58 > > > + void handleURIRequest(const CoreIPC::DataReference&, uint64_t contentLength, const String& mimeType, uint64_t requestID); > > > + void addDataToURIRequest(const CoreIPC::DataReference&, uint64_t requestID); > > > > Another quick comment here... Perhaps these names should be similar to the names that WebCore uses in these situations, like handleURISchemeResponseReceived and handleURISchemeResponseData or something even better. > > Yes, I don't like the names either, but I couldn't think of a better name.
The names are indeed confusing, the request is handled by the UI process, so sending a message to the web process called HandleFoo makes me think the web process is supposeed to handle it. So, I think DidHandleURIRequest and DidReceiveURIREquestData make more sense.
Carlos Garcia Campos
Comment 12
2012-05-10 04:12:19 PDT
Created
attachment 141142
[details]
Updated patch according to review comments Addressed all review comments and also added support for unknown content length and mime type for compatibility with soup
Martin Robinson
Comment 13
2012-05-10 11:21:55 PDT
Comment on
attachment 141142
[details]
Updated patch according to review comments View in context:
https://bugs.webkit.org/attachment.cgi?id=141142&action=review
Okay! Thanks for addressing my comments above. Before landing please consider the following simplification and clean up. I think it will make the code a lot more readable. I had some small difficulty understanding this partly because (1) I'm an idiot and (2) It's tricky to keep track of "bytes read out" and "bytes received" especially when their names both conjure images of bytes coming in.
> Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:83 > + // There is data in the memory stream to read. > + if (stream->priv->bytesRead < stream->priv->bytesReceived) { > + webkitSoupRequestInputStreamReadAsyncResultComplete(stream, result.get(), buffer, count, cancellable); > + return; > + } > + > + // We are expecting more data to be added. > + if (!webkitSoupRequestInputStreamFinished(stream)) { > + stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); > + return; > + }
I feel that it might be slightly simpler to unconditionally create stream->priv->asyncReadData here. You could do something like: bool moreDataToSendForRead = stream->priv->bytesRead < stream->priv->bytesReceived; if (!moreDataToSendForRead && webkitSoupRequestInputStreamFinished(stream)) { g_simple_async_result_set_op_res_gssize(result.get(), 0); g_simple_async_result_complete_in_idle(result.get()); return; } stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); if (moreDataToSendForRead) webkitSoupRequestInputStreamReadAsyncResultComplete(stream, cancellable); This way webkitSoupRequestInputStreamReadAsyncResultComplete would be responsible for calling stream->priv->asyncReadData.clear(), which seems to make more sense since it consumes it. Also, instead of adding a comment like: // We are expecting more data to be added. if (!webkitSoupRequestInputStreamFinished(stream)) { I think it makes sense to simply rename webkitSoupRequestInputStreamFinished to webkitSoupRequestWaitingForMoreDataToBeAdded or something better. I recommend doing the same for stream->priv->bytesRead < stream->priv->bytesReceived by creating a helper like webkitSoupRequestSentAllDataForPendingRead or something similar. Helper methods with good names are almost always better than comments, I'd say.
Carlos Garcia Campos
Comment 14
2012-05-11 00:40:15 PDT
(In reply to
comment #13
)
> (From update of
attachment 141142
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=141142&action=review
> > Okay! Thanks for addressing my comments above. Before landing please consider the following simplification and clean up. I think it will make the code a lot more readable. I had some small difficulty understanding this partly because (1) I'm an idiot and (2) It's tricky to keep track of "bytes read out" and "bytes received" especially when their names both conjure images of bytes coming in. > > > Source/WebKit2/WebProcess/soup/WebKitSoupRequestInputStream.cpp:83 > > + // There is data in the memory stream to read. > > + if (stream->priv->bytesRead < stream->priv->bytesReceived) { > > + webkitSoupRequestInputStreamReadAsyncResultComplete(stream, result.get(), buffer, count, cancellable); > > + return; > > + } > > + > > + // We are expecting more data to be added. > > + if (!webkitSoupRequestInputStreamFinished(stream)) { > > + stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); > > + return; > > + } > > I feel that it might be slightly simpler to unconditionally create stream->priv->asyncReadData here.
That would mean adding an unnecessary alloc/dealloc in the case where we already have the data ready in the stream to read from. AsyncReadData is to postpone the async result completion until we have more data in the stream to read from.
> You could do something like: > > bool moreDataToSendForRead = stream->priv->bytesRead < stream->priv->bytesReceived; > if (!moreDataToSendForRead && webkitSoupRequestInputStreamFinished(stream)) { > g_simple_async_result_set_op_res_gssize(result.get(), 0); > g_simple_async_result_complete_in_idle(result.get()); > return; > } > > stream->priv->asyncReadData = adoptPtr(new AsyncReadData(result.get(), buffer, count, cancellable)); > if (moreDataToSendForRead) > webkitSoupRequestInputStreamReadAsyncResultComplete(stream, cancellable); > > > This way webkitSoupRequestInputStreamReadAsyncResultComplete would be responsible for calling stream->priv->asyncReadData.clear(), which seems to make more sense since it consumes it. > > Also, instead of adding a comment like: > // We are expecting more data to be added. > if (!webkitSoupRequestInputStreamFinished(stream)) { > > I think it makes sense to simply rename webkitSoupRequestInputStreamFinished to webkitSoupRequestWaitingForMoreDataToBeAdded or something better. I recommend doing the same for stream->priv->bytesRead < stream->priv->bytesReceived by creating a helper like webkitSoupRequestSentAllDataForPendingRead or something similar. Helper methods with good names are almost always better than comments, I'd say.
I'll try to use better names
Martin Robinson
Comment 15
2012-05-11 01:06:36 PDT
(In reply to
comment #14
)
> That would mean adding an unnecessary alloc/dealloc in the case where we already have the data ready in the stream to read from. AsyncReadData is to postpone the async result completion until we have more data in the stream to read from.
The allocation in this case is very cheap, but one way to avoid it would be to add a method to AsyncReadData to mark it as invalid and never allocate and deallocate it.
Carlos Garcia Campos
Comment 16
2012-05-11 01:13:58 PDT
Committed
r116738
: <
http://trac.webkit.org/changeset/116738
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug