RESOLVED FIXED 18391
return body data incrementally from libsoup backend
https://bugs.webkit.org/show_bug.cgi?id=18391
Summary return body data incrementally from libsoup backend
Dan Winship
Reported 2008-04-09 11:19:06 PDT
Currently the libsoup backend returns the response body all at once at the end. This fixes it to return it incrementally.
Attachments
return data to the loader incrementally (4.43 KB, patch)
2008-04-09 11:21 PDT, Dan Winship
no flags
incremental patch (5.25 KB, patch)
2008-04-09 15:02 PDT, Luca Bruno
no flags
02-limit connections (1.71 KB, patch)
2008-04-09 15:03 PDT, Luca Bruno
no flags
revised patch (4.95 KB, patch)
2008-04-09 19:14 PDT, Dan Winship
no flags
01-check for redirects (5.35 KB, patch)
2008-04-10 02:22 PDT, Luca Bruno
no flags
Dan Winship
Comment 1 2008-04-09 11:21:50 PDT
Created attachment 20434 [details] return data to the loader incrementally
Luca Bruno
Comment 2 2008-04-09 11:34:33 PDT
It's a pretty good patch, thanks Dan. In gotHeaders Maybe there might be a problem on didReceiveResponse, this will tell webcore about a new response of the same job for each redirect. Same thing in gotChunk, this will give webcore further data with didReceiveData on redirects. I think a SOUP_STATUS_IS_REDIRECTION check must be added to both.
Luca Bruno
Comment 3 2008-04-09 15:02:19 PDT
Created attachment 20443 [details] incremental patch Check for redirection.
Luca Bruno
Comment 4 2008-04-09 15:03:32 PDT
Created attachment 20444 [details] 02-limit connections Using 20 and 2 seems a good thing for my slow connection, also seems more responsive.
Dan Winship
Comment 5 2008-04-09 19:14:21 PDT
Created attachment 20446 [details] revised patch checking SOUP_STATUS_IS_REDIRECTION() isn't quite right though, because in the cases where it *doesn't* redirect on a 3xx response, we'd end up with no body at the end... Error messages should generally be short anyway, so I've just changed it so that it provides the data incrementally for 2xx responses, and all at once at the end still for non-2xx responses.
Luca Bruno
Comment 6 2008-04-10 00:57:18 PDT
Hi, well i think the previous one was still better. This way you will lose many pages from being rendered, such as 404 pages, server errors, ecc.. I'd suggest to check only for a "Location" in the header and for transport error, what do you think about?
Luca Bruno
Comment 7 2008-04-10 02:22:05 PDT
Created attachment 20449 [details] 01-check for redirects Is right to use "Location"? It seems libsoup is case-sensitive for headers.
Dan Winship
Comment 8 2008-04-10 05:29:05 PDT
(In reply to comment #6) > Hi, well i think the previous one was still better. This way you will lose many > pages from being rendered, such as 404 pages, server errors, ecc.. It doesn't lose them; 2xx responses are returned incrementally, but non-2xx responses are returned all at once from finishedCallback() just like they were before this patch. So a 301 redirecting to a 200 would look like: gotHeadersCallback [msg->status_code == 301] return gotChunkCallback return restartedCallback gotHeadersCallback [msg->status_code == 200] didReceiveResponse() gotChunkCallback didReceiveData() [more gotChunkCallbacks...] finishedCallback didFinishLoading() but if libsoup didn't follow the redirect (eg, because it was in response to a PUT, or was to an ftp: URL, etc), then instead of "restarted", you'd get "finishedCallback", which would see that the final status was non-2xx, and so it would call didReceiveResponse(), didReceiveData(), and didFinishLoading() all at once to make up for the fact that it hadn't done that before.
Dan Winship
Comment 9 2008-04-10 05:33:58 PDT
Re: 02-limit connections: maybe that should be a separate bug? (Or else we should change the summary here to "fix loading speed problems in libsoup backend".) It definitely looks like Firefox uses 6 for its equivalent of SOUP_SESSION_MAX_CONNS_PER_HOST.
Luca Bruno
Comment 10 2008-04-10 10:45:23 PDT
(In reply to comment #8) > It doesn't lose them; 2xx responses are returned incrementally, but non-2xx > responses are returned all at once from finishedCallback() just like they were > before this patch. Yes I'm sorry, I definitively missed the finishedCallback part. But why shouldn't 404, not found, and other errors be loaded incrementally too? Do you think there's a speedup the soup-side?
Dan Winship
Comment 11 2008-04-10 11:13:16 PDT
(In reply to comment #10) > But why > shouldn't 404, not found, and other errors be loaded incrementally too? Incremental loading is mostly important for images and large web pages. Most 404s are tiny, and users would never be able to notice the difference between loading them incrementally and not. And anyway, trying to exhaustively distinguish between status codes that might cause libsoup to requeue the message, and status codes that definitely won't seems like a bad idea.
Luca Bruno
Comment 12 2008-04-10 11:31:30 PDT
(In reply to comment #11) > Incremental loading is mostly important for images and large web pages. Most > 404s are tiny, and users would never be able to notice the difference between > loading them incrementally and not. And anyway, trying to exhaustively > distinguish between status codes that might cause libsoup to requeue the > message, and status codes that definitely won't seems like a bad idea. > For instance you mean it's not safe to check for "Location"?
Dan Winship
Comment 13 2008-04-10 12:12:10 PDT
Yeah, just checking 3xx and Location header isn't sufficient. At the moment, the rule is that if there's a Location header, AND it contains a valid http: or https: URL, AND either (a) the status is 302 or 303, or (b) the status is 301 or 307 AND the method is GET, HEAD, OPTIONS, or PROPFIND, then the message gets automatically redirected. Once we deal with authentication, then that will add another set of cases in which libsoup requeues the message automatically (when there's a 401 or 407, and it already has a valid password cached).
Luca Bruno
Comment 14 2008-04-11 06:12:22 PDT
Comment on attachment 20449 [details] 01-check for redirects Ouch, you're patch is better then.
Luca Bruno
Comment 15 2008-04-11 06:12:53 PDT
Comment on attachment 20444 [details] 02-limit connections Open another bug.
Alp Toker
Comment 16 2008-04-13 10:39:24 PDT
Comment on attachment 20446 [details] revised patch r=me Code is fine. (ChangeLog entry should use eight spaces though for next time) Thanks
Alp Toker
Comment 17 2008-04-13 10:40:38 PDT
Landed in r31852.
Dan Winship
Comment 18 2008-04-14 05:41:14 PDT
Hm... I realized this weekend while trying to figure out the APIs that I had confused ResourceLoader::didReceiveData(const char*, int, long long lengthReceived, bool allAtOnce) and ResourceHandleClient::didReceiveData(ResourceHandle*, const char*, int, int lengthReceived) here: + client->didReceiveData(handle, chunk->data, chunk->length, false); ie, the last argument should actually be an int, not a bool. I couldn't figure out what that argument is actually supposed to mean, since AFAICT it currently ends up going unused in WebKitGtk. The old code passed "0" there but I don't know if that's right or wrong.
Mark Rowe (bdash)
Comment 19 2008-04-24 17:58:21 PDT
Dan, was the change that was landed incorrect? If so, I think a new bug should be used to track that. It's not clear to me why this was reopened.
Mark Rowe (bdash)
Comment 20 2008-04-24 18:02:22 PDT
Comment on attachment 20446 [details] revised patch Clearing review flag as this was landed already.
Dan Winship
Comment 21 2008-05-01 10:08:10 PDT
(In reply to comment #19) > Dan, was the change that was landed incorrect? If so, I think a new bug should > be used to track that. It's not clear to me why this was reopened. ok, i'll reclose it. (the patch is a teeny bit incorrect; it uses "false" where it should have used "0")
Note You need to log in before you can comment on or make changes to this bug.