Bug 18391 - return body data incrementally from libsoup backend
Summary: return body data incrementally from libsoup backend
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC Linux
: P2 Normal
Assignee: Nobody
URL:
Keywords: Gtk, Soup
Depends on:
Blocks:
 
Reported: 2008-04-09 11:19 PDT by Dan Winship
Modified: 2008-05-01 10:08 PDT (History)
3 users (show)

See Also:


Attachments
return data to the loader incrementally (4.43 KB, patch)
2008-04-09 11:21 PDT, Dan Winship
no flags Details | Formatted Diff | Diff
incremental patch (5.25 KB, patch)
2008-04-09 15:02 PDT, Luca Bruno
no flags Details | Formatted Diff | Diff
02-limit connections (1.71 KB, patch)
2008-04-09 15:03 PDT, Luca Bruno
no flags Details | Formatted Diff | Diff
revised patch (4.95 KB, patch)
2008-04-09 19:14 PDT, Dan Winship
no flags Details | Formatted Diff | Diff
01-check for redirects (5.35 KB, patch)
2008-04-10 02:22 PDT, Luca Bruno
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dan Winship 2008-04-09 11:19:06 PDT
Currently the libsoup backend returns the response body all at once at the end. This fixes it to return it incrementally.
Comment 1 Dan Winship 2008-04-09 11:21:50 PDT
Created attachment 20434 [details]
return data to the loader incrementally
Comment 2 Luca Bruno 2008-04-09 11:34:33 PDT
It's a pretty good patch, thanks Dan.
In gotHeaders Maybe there might be a problem on didReceiveResponse, this will tell webcore about a new response of the same job for each redirect.
Same thing in gotChunk, this will give webcore further data with didReceiveData on redirects.

I think a SOUP_STATUS_IS_REDIRECTION check must be added to both.
Comment 3 Luca Bruno 2008-04-09 15:02:19 PDT
Created attachment 20443 [details]
incremental patch

Check for redirection.
Comment 4 Luca Bruno 2008-04-09 15:03:32 PDT
Created attachment 20444 [details]
02-limit connections

Using 20 and 2 seems a good thing for my slow connection, also seems more responsive.
Comment 5 Dan Winship 2008-04-09 19:14:21 PDT
Created attachment 20446 [details]
revised patch

checking SOUP_STATUS_IS_REDIRECTION() isn't quite right though, because in the cases where it *doesn't* redirect on a 3xx response, we'd end up with no body at the end...

Error messages should generally be short anyway, so I've just changed it so that it provides the data incrementally for 2xx responses, and all at once at the end still for non-2xx responses.
Comment 6 Luca Bruno 2008-04-10 00:57:18 PDT
Hi, well i think the previous one was still better. This way you will lose many pages from being rendered, such as 404 pages, server errors, ecc.. I'd suggest to check only for a "Location" in the header and for transport error, what do you think about?
Comment 7 Luca Bruno 2008-04-10 02:22:05 PDT
Created attachment 20449 [details]
01-check for redirects

Is right to use "Location"? It seems libsoup is case-sensitive for headers.
Comment 8 Dan Winship 2008-04-10 05:29:05 PDT
(In reply to comment #6)
> Hi, well i think the previous one was still better. This way you will lose many
> pages from being rendered, such as 404 pages, server errors, ecc..

It doesn't lose them; 2xx responses are returned incrementally, but non-2xx responses are returned all at once from finishedCallback() just like they were before this patch. So a 301 redirecting to a 200 would look like:

    gotHeadersCallback
        [msg->status_code == 301]
        return
    gotChunkCallback
        return
    restartedCallback
    gotHeadersCallback
        [msg->status_code == 200]
        didReceiveResponse()
    gotChunkCallback
        didReceiveData()
    [more gotChunkCallbacks...]
    finishedCallback
        didFinishLoading()

but if libsoup didn't follow the redirect (eg, because it was in response to a PUT, or was to an ftp: URL, etc), then instead of "restarted", you'd get "finishedCallback", which would see that the final status was non-2xx, and so it would call didReceiveResponse(), didReceiveData(), and didFinishLoading() all at once to make up for the fact that it hadn't done that before.
Comment 9 Dan Winship 2008-04-10 05:33:58 PDT
Re: 02-limit connections: maybe that should be a separate bug? (Or else we should change the summary here to "fix loading speed problems in libsoup backend".)

It definitely looks like Firefox uses 6 for its equivalent of SOUP_SESSION_MAX_CONNS_PER_HOST.
Comment 10 Luca Bruno 2008-04-10 10:45:23 PDT
(In reply to comment #8)
> It doesn't lose them; 2xx responses are returned incrementally, but non-2xx
> responses are returned all at once from finishedCallback() just like they were
> before this patch.

Yes I'm sorry, I definitively missed the finishedCallback part. But why shouldn't 404, not found, and other errors be loaded incrementally too? Do you think there's a speedup the soup-side?
Comment 11 Dan Winship 2008-04-10 11:13:16 PDT
(In reply to comment #10)
> But why
> shouldn't 404, not found, and other errors be loaded incrementally too?

Incremental loading is mostly important for images and large web pages. Most 404s are tiny, and users would never be able to notice the difference between loading them incrementally and not. And anyway, trying to exhaustively distinguish between status codes that might cause libsoup to requeue the message, and status codes that definitely won't seems like a bad idea.
Comment 12 Luca Bruno 2008-04-10 11:31:30 PDT
(In reply to comment #11)
> Incremental loading is mostly important for images and large web pages. Most
> 404s are tiny, and users would never be able to notice the difference between
> loading them incrementally and not. And anyway, trying to exhaustively
> distinguish between status codes that might cause libsoup to requeue the
> message, and status codes that definitely won't seems like a bad idea.
> 

For instance you mean it's not safe to check for "Location"?
Comment 13 Dan Winship 2008-04-10 12:12:10 PDT
Yeah, just checking 3xx and Location header isn't sufficient.

At the moment, the rule is that if there's a Location header, AND it contains a valid http: or https: URL, AND either (a) the status is 302 or 303, or (b) the status is 301 or 307 AND the method is GET, HEAD, OPTIONS, or PROPFIND, then the message gets automatically redirected.

Once we deal with authentication, then that will add another set of cases in which libsoup requeues the message automatically (when there's a 401 or 407, and it already has a valid password cached).
Comment 14 Luca Bruno 2008-04-11 06:12:22 PDT
Comment on attachment 20449 [details]
01-check for redirects

Ouch, you're patch is better then.
Comment 15 Luca Bruno 2008-04-11 06:12:53 PDT
Comment on attachment 20444 [details]
02-limit connections

Open another bug.
Comment 16 Alp Toker 2008-04-13 10:39:24 PDT
Comment on attachment 20446 [details]
revised patch

r=me

Code is fine. (ChangeLog entry should use eight spaces though for next time)

Thanks
Comment 17 Alp Toker 2008-04-13 10:40:38 PDT
Landed in r31852.
Comment 18 Dan Winship 2008-04-14 05:41:14 PDT
Hm... I realized this weekend while trying to figure out the APIs that I had confused

    ResourceLoader::didReceiveData(const char*, int, long long lengthReceived, bool allAtOnce)

and

    ResourceHandleClient::didReceiveData(ResourceHandle*, const char*, int, int lengthReceived)


here:

+    client->didReceiveData(handle, chunk->data, chunk->length, false);

ie, the last argument should actually be an int, not a bool. I couldn't figure out what that argument is actually supposed to mean, since AFAICT it currently ends up going unused in WebKitGtk. The old code passed "0" there but I don't know if that's right or wrong.
Comment 19 Mark Rowe (bdash) 2008-04-24 17:58:21 PDT
Dan, was the change that was landed incorrect?  If so, I think a new bug should be used to track that.  It's not clear to me why this was reopened.
Comment 20 Mark Rowe (bdash) 2008-04-24 18:02:22 PDT
Comment on attachment 20446 [details]
revised patch

Clearing review flag as this was landed already.
Comment 21 Dan Winship 2008-05-01 10:08:10 PDT
(In reply to comment #19)
> Dan, was the change that was landed incorrect?  If so, I think a new bug should
> be used to track that.  It's not clear to me why this was reopened.

ok, i'll reclose it. (the patch is a teeny bit incorrect; it uses "false" where it should have used "0")