Bug 44261

Summary: [GTK] Add HTTP caching support
Product: WebKit Reporter: Sergio Villar Senin <svillar>
Component: WebKitGTKAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: alex, a.renevier, commit-queue, danw, gustavo, hayato, l.slachciak, mpakulavelrutka, mrobinson, rawatgg, rniwa, svillar, webkit.review.bot, xan.lopez
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Linux   
Bug Depends on: 47547    
Bug Blocks: 29299, 44158    
Attachments:
Description Flags
HTTP cache for WebKitGtk+
mrobinson: review-
SoupURILoader stuff as part of webkitgtk+ core network
gustavo: review-
Changes to WebKitGtk+ core network stuff to use the new API from the imported SoupURILoader code
none
New public API to enable clients to use the HTTP cache
none
patch for check-webkit-style to ignore our new stuff
none
Imported SoupURILoader to Webkit. New public API for HTTP cache
gustavo: review-
Changes to WebKitGtk+ core network stuff to use the new API from the imported SoupURILoader code
none
Imported SoupURILoader to Webkit. New public API for HTTP cache
none
WebKitGtk+ to use the new API from the imported SoupURILoader code
mrobinson: review-
Added fast/encoding/percent-escaping.html to the list of Skipped
none
Imported SoupURILoader to Webkit. New public API for HTTP cache
none
WebKitGtk+ to use the new API from the imported SoupURILoader code
none
Imported SoupURILoader to Webkit. New public API for HTTP cache
none
Imported SoupURILoader to Webkit. New public API for HTTP cache
none
WebKitGtk+ to use the new API from the imported SoupURILoader code
mrobinson: review-
Imported SoupURILoader to Webkit. New public API for HTTP cache
none
WebKitGtk+ to use the new API from the imported SoupURILoader code
none
patches to Sergio's patches
none
Imported SoupURILoader to Webkit. New public API for HTTP cache
none
WebKitGtk+ to use the new API from the imported SoupURILoader code
none
Imported SoupURILoader to Webkit. New public API for HTTP cache
none
WebKitGtk+ to use the new API from the imported SoupURILoader code
mrobinson: review-
Imported SoupURILoader to Webkit. New public API for HTTP cache
none
WebKitGtk+ to use the new API from the imported SoupURILoader code
none
WebKitGtk+ to use the new API from the imported SoupURILoader code
mrobinson: review-
WebKitGtk+ to use the new API from the imported SoupURILoader code
mrobinson: review-
Imported SoupURILoader to Webkit. New public API for HTTP cache
mrobinson: review-
Imported SoupURILoader to Webkit. New public API for HTTP cache
none
WebKitGtk+ to use the new API from the imported SoupURILoader code mrobinson: review+

Description Sergio Villar Senin 2010-08-19 07:19:40 PDT
This bug comes from https://bugzilla.gnome.org/show_bug.cgi?id=523100.

Summarizing, a cache for libsoup was implemented using the SoupURILoader stuff that is under development. Dan Winship considered that the SoupURILoader stuff is not ready for trunk https://bugzilla.gnome.org/show_bug.cgi?id=523100#c27. That's why he suggested to move all the cache stuff to webkitgtk while the libsoup stuff does not land upstream. Since now it's purely a webkitgtk+ issue it's better to manage it here.
Comment 1 Sergio Villar Senin 2010-08-19 07:35:29 PDT
Created attachment 64849 [details]
HTTP cache for WebKitGtk+

Basically this patch is made of:

   * changes to ResourceHandleSoup already reviewed in bug 40222
   * libsoup's URI loader code discussed here https://bugzilla.gnome.org/show_bug.cgi?id=557777
   * caching support https://bugzilla.gnome.org/show_bug.cgi?id=523100

Once applied to current trunk all the tests work fine except these two:
   * http/test/history/back-to-post.php: recently changed by http://trac.webkit.org/changeset/65340. Need to review what happens
   * fast/encoding/percent-escaping.html: as stated here https://bugzilla.gnome.org/show_bug.cgi?id=523100#c28 it needs some changes in libsoup to work properly, as libsoup incorrectly escapes file URL's with '%'
Comment 2 Martin Robinson 2010-08-19 11:28:10 PDT
Comment on attachment 64849 [details]
HTTP cache for WebKitGtk+

This patch is huge! I'd really like to see it split up into at least
two parts.

1. The code drop from Soup, which should probably be separated somehow from
the rest of our source, so we can treat it specially (it will go away eventually, right?).
2. The bits that actually implement the cache in WebCore/WebKit.

That being said, I have a few comments on what I presume to be the second part.

> +        GRefPtr<WebKitSoupRequest> m_req;
> +	GRefPtr<WebKitSoupRequester> m_requester;

Please do not abbreviate member names. This could be m_request or, heck,
even m_soupRequest. Someone new will look at the code and know exactly what it is.
The indentation is a little funky here as well.

>  #include "SharedBuffer.h"
>  #include "TextEncoding.h"
>  
> +#include "webkitsouprequesthttp.h"
> +
>  #include <errno.h>
>  #include <fcntl.h>
>  #include <gio/gio.h>

Just remove all spaces in the list of includes per the style guidelines. There is
no need to separate them by include style.

> +    if (m_req) {
> +        g_object_set_data(G_OBJECT(m_req.get()), "webkit-resource", 0);
> +        m_req.clear();

There's no chance that this may be the last reference to m_req here?

> +    GOwnPtr<char> uri(soup_uri_to_string(soup_message_get_uri(msg), false));
> +    String location = String(uri.get());

Even though URL are generally URL-encoded, I don't know if we can rely on that
here. You should use String::fromUTF8.

> +#define BUFFER_SIZE 8192

Why not move this to the top and make the name more descriptive?
INPUT_STREAM_BUFFER_SIZE or something like that.

> +    GInputStream* in = webkitSoupRequestSend(d->m_req.get(), 0, &error.outPtr());

This should be a full-blown variable name like inputStream.

> -    d->m_idleHandler = g_timeout_add(0, parseDataUrl, handle);
> +    d->m_idleHandler = g_idle_add(parseDataUrl, handle);

Why did you switch this to an idle? I think Gustavo is trying to move
away from that because they are often starved. Maybe ping him about this.

>      g_object_set(session,
>                   SOUP_SESSION_MAX_CONNS, MAX_CONNECTIONS,
>                   SOUP_SESSION_MAX_CONNS_PER_HOST, MAX_CONNECTIONS_PER_HOST,
> -                 NULL);
> +                 0);

The style bot should not complain about this, because the missing NULL causes
a compiler warning. Please just use NULL here.

> +
> +    if (d->m_cancellable) {
> +        g_object_unref(d->m_cancellable);
> +        d->m_cancellable = 0;
> +    }

m_cancellable can probably be a GRefPtr in another patch.

> +    if (d->m_inputStream) {
> +        g_object_set_data(G_OBJECT(d->m_inputStream), "webkit-resource", 0);
> +        g_object_unref(d->m_inputStream);
> +        d->m_inputStream = 0;
> +    }

Ditto for m_inputStream.

> +
> +    if (d->m_buffer) {
> +        g_free(d->m_buffer);
> +        d->m_buffer = 0;
> +    }

Ditto for m_buffer, except GOwnPtr.

> +    GInputStream* in = webkitSoupRequestSendFinish(d->m_req.get(), res, &error.outPtr());

inputStream again. :)

> +
> +    if (error) {
> +        if (d->m_msg && SOUP_STATUS_IS_TRANSPORT_ERROR(d->m_msg->status_code)) {
> +            SoupURI* uri = webkitSoupRequestGetUri(d->m_req.get());
> +            GOwnPtr<char> uriStr(soup_uri_to_string(uri, false));

No need to abbreviate the variable name here. uriString is fine.

>  
> -    // Used to set the authentication dialog toplevel; may be NULL
> +    // Used to set the authentication dialog toplevel; may be 0

I think "null" here is clearer and probably won't upset the style gods.

> +        // We have to convert it to UTF-16 due to limitations in KURL
> +        GOwnPtr<gunichar2> unicodeStr(g_utf8_to_utf16(d->m_buffer, bytesRead, 0, 0, 0));
> +        client->didReceiveData(handle.get(), reinterpret_cast<const char*>(unicodeStr.get()), g_utf8_strlen(d->m_buffer, bytesRead) * sizeof(UChar), 0);

Okay, again this is kind of a nit, but unicodeStr is not entirely precise. It should
probably be utf16String.  g_utf8_strlen(d->m_buffer, bytesRead) * sizeof(UChar) seems
unsafe against encoding errors and is a little clunky. I think it's better to do something
like this:

    long wordsWritten;
    GOwnPtr<gunichar2> utf16String(g_utf8_to_utf16(d->m_buffer, bytesRead, 0, &wordsWritten, 0));
    client->didReceiveData(handle.get(), reinterpret_cast<const char*>(utf16String()), wordsWritten * sizeof(UChar), 0);

> +    g_object_set(soupMessage, SOUP_MESSAGE_METHOD, httpMethod().utf8().data(), 0);

NULL here instead of 0.
Comment 3 Sergio Villar Senin 2010-08-20 01:56:48 PDT
(In reply to comment #2)
> (From update of attachment 64849 [details])
> This patch is huge! I'd really like to see it split up into at least
> two parts.
> 
> 1. The code drop from Soup, which should probably be separated somehow from
> the rest of our source, so we can treat it specially (it will go away eventually, right?).
> 2. The bits that actually implement the cache in WebCore/WebKit.

Thx for reviewing Martin.

Well basically the patch consists of:
   1) changes in ResourceHandleSoup/ResourceHandleInternal: basically adaptations to work with the URILoader stuff
   2) WebCore/platform/network/soup/webkitsoup*: all the stuff moved from webkit
   3) WebKit/Wekit/gtk/webkit/webkitsoupcache.[c|h]: the webkitgtk+ API for clients

Now some comments:

> > +    if (m_req) {
> > +        g_object_set_data(G_OBJECT(m_req.get()), "webkit-resource", 0);
> > +        m_req.clear();
> 
> There's no chance that this may be the last reference to m_req here?

What's the problem then?
 
> > +    GOwnPtr<char> uri(soup_uri_to_string(soup_message_get_uri(msg), false));
> > +    String location = String(uri.get());
> 
> Even though URL are generally URL-encoded, I don't know if we can rely on that
> here. You should use String::fromUTF8.

soup_uri_to_string always returns an URL encoded URL, do you think we still need to use ::fromUTF8 ?

> > -    d->m_idleHandler = g_timeout_add(0, parseDataUrl, handle);
> > +    d->m_idleHandler = g_idle_add(parseDataUrl, handle);
> 
> Why did you switch this to an idle? I think Gustavo is trying to move
> away from that because they are often starved. Maybe ping him about this.

As stated here https://bugs.webkit.org/show_bug.cgi?id=40222#c11 there was 1 test that fails with timeout but not with idle. After a long time debugging it, it seems that it's just a matter of timing. All the signals and events are emitted properly but somehow in a different order depending on the function used.
Comment 4 Martin Robinson 2010-08-20 13:29:58 PDT
> > > +    if (m_req) {
> > > +        g_object_set_data(G_OBJECT(m_req.get()), "webkit-resource", 0);
> > > +        m_req.clear();
> > 
> > There's no chance that this may be the last reference to m_req here?
> What's the problem then?

Sorry, misread this one! Looks fine actually.

> > here. You should use String::fromUTF8.
> soup_uri_to_string always returns an URL encoded URL, do you think we still need to use ::fromUTF8 ?

Yes. Since GLib strings are UTF-8, you should still use the proper String constructor. For instance, do you know what Soup's behavior is with IDNs?

> As stated here https://bugs.webkit.org/show_bug.cgi?id=40222#c11
> there was 1 test that fails with timeout but not with idle. After
> a long time debugging it, it seems that it's just a matter of timing.
> All the signals and events are emitted properly but somehow in a
> different order depending on the function used.

Interesting. I'm convinced, but I'd still like to get Gustavo's opinion
on it. He may have some further insights.
Comment 5 Sergio Villar Senin 2010-08-23 01:06:06 PDT
(In reply to comment #4)

> > > here. You should use String::fromUTF8.
> > soup_uri_to_string always returns an URL encoded URL, do you think we still need to use ::fromUTF8 ?
> 
> Yes. Since GLib strings are UTF-8, you should still use the proper String constructor. For instance, do you know what Soup's behavior is with IDNs?

OK will fix that. Regarding IDNs there is no specific code about them in libsoup AFAIK
 
> > As stated here https://bugs.webkit.org/show_bug.cgi?id=40222#c11
> > there was 1 test that fails with timeout but not with idle. After
> > a long time debugging it, it seems that it's just a matter of timing.
> > All the signals and events are emitted properly but somehow in a
> > different order depending on the function used.
> 
> Interesting. I'm convinced, but I'd still like to get Gustavo's opinion
> on it. He may have some further insights.

I don't want to convince you :). Just sharing my conclussions after some investigation.
Comment 6 Sergio Villar Senin 2010-08-23 01:08:13 PDT
Gently added kov and xan to the Cc
Comment 7 Sergio Villar Senin 2010-08-23 09:57:13 PDT
Created attachment 65135 [details]
SoupURILoader stuff as part of webkitgtk+ core network

This patch adds the objects used by SoupURILoader branch of libsoup. This code is required for the HTTP cache implementation. WebKit code style has been used when moving code to WebKit.
Comment 8 Sergio Villar Senin 2010-08-23 09:58:44 PDT
Created attachment 65136 [details]
Changes to WebKitGtk+ core network stuff to use the new API from the imported SoupURILoader code

See also https://bugs.webkit.org/show_bug.cgi?id=40222
Comment 9 Sergio Villar Senin 2010-08-23 09:59:34 PDT
Created attachment 65137 [details]
New public API to enable clients to use the HTTP cache

Public API for the cache
Comment 10 Sergio Villar Senin 2010-08-23 10:00:03 PDT
Patch splitted as suggested by Martin
Comment 11 Gustavo Noronha (kov) 2010-08-23 10:06:33 PDT
Great, I'll take a look at these later today, is there any specific reason why you did not mark the patches for review, though?
Comment 12 Sergio Villar Senin 2010-08-23 10:34:27 PDT
(In reply to comment #11)
> Great, I'll take a look at these later today, is there any specific reason why you did not mark the patches for review, though?

Cool! if you can review them

Regarding the r?, I always forget :/
Comment 13 WebKit Review Bot 2010-08-23 10:44:27 PDT
Attachment 65135 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1
Last 3072 characters of output:
ty/naming] [4]
WebCore/platform/network/soup/webkitsouprequestdata.cpp:178:  request_data_class is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequestdata.cpp:180:  object_class is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequestdata.cpp:181:  request_class is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequestfile.cpp:47:  webkit_soup_request_file_init is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequestfile.cpp:54:  webkit_soup_request_file_finalize is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequestfile.cpp:268:  webkit_soup_request_file_class_init is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequest.h:57:  parent_class is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequest.h:65:  Tab found; better to use spaces  [whitespace/tab] [1]
WebCore/platform/network/soup/webkitsouprequest.h:66:  Tab found; better to use spaces  [whitespace/tab] [1]
WebCore/platform/network/soup/webkitsouprequest.h:67:  Tab found; better to use spaces  [whitespace/tab] [1]
WebCore/platform/network/soup/webkitsouprequest.h:76:  webkit_soup_request_get_type is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequester.cpp:39:  webkit_soup_requester_init is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequester.cpp:57:  webkit_soup_requester_class_init is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequester.cpp:57:  requester_class is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequester.cpp:59:  object_class is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequester.cpp:67:  init_request_types is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequester.cpp:93:  webkit_soup_requester_request is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebCore/platform/network/soup/webkitsouprequester.cpp:109:  webkit_soup_requester_request_uri is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
Total errors found: 934 in 19 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 14 WebKit Review Bot 2010-08-23 10:46:13 PDT
Attachment 65136 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1
WebCore/platform/network/ResourceHandleInternal.h:51:  Alphabetical sorting problem.  [build/include_order] [4]
WebCore/platform/network/ResourceHandleInternal.h:210:  Tab found; better to use spaces  [whitespace/tab] [1]
Total errors found: 2 in 4 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 15 WebKit Review Bot 2010-08-23 10:47:05 PDT
Attachment 65137 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1
Last 3072 characters of output:
kitsoupcache.cpp:107:  One line control clauses should not use braces.  [whitespace/braces] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:108:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:112:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:114:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:114:  Declaration has space between type name and * in GObjectClass *gobject_class  [whitespace/declaration] [3]
WebKit/gtk/webkit/webkitsoupcache.cpp:116:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:118:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:118:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:133:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:136:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:141:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:157:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:159:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:159:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:170:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:172:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:172:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:176:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:178:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:178:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:182:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:184:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:184:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:188:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:189:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:191:  Tab found; better to use spaces  [whitespace/tab] [1]
WebKit/gtk/webkit/webkitsoupcache.cpp:191:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:195:  Extra space before ( in function call  [whitespace/parens] [4]
WebKit/gtk/webkit/webkitsoupcache.cpp:197:  Tab found; better to use spaces  [whitespace/tab] [1]
Total errors found: 111 in 3 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 16 WebKit Review Bot 2010-08-23 12:22:24 PDT
Attachment 65135 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3757605
Comment 17 WebKit Review Bot 2010-08-23 12:47:50 PDT
Attachment 65136 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3725601
Comment 18 WebKit Review Bot 2010-08-23 13:10:40 PDT
Attachment 65137 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3751564
Comment 19 Gustavo Noronha (kov) 2010-08-23 16:45:21 PDT
Comment on attachment 65135 [details]
SoupURILoader stuff as part of webkitgtk+ core network

I have three main concerns:

1. I believe we should import the code without style changes into a subdirectory; I know you have probably been told to change it into webkit's style, but I think it would make backporting fixes into soup's version harder than needed - and it's not something we plan to keep in the tree for long, anyway
2. do we really need to import the TPL one-file library? I guess the answer is yes (though we could use GVariant?), but then we probably need to figure out a way of isolating it from the rest of the build as the build failure suggests it'll be problematic =(
3. There's API mixed in there (like  webkit_soup_http_input_stream_new/webkitSoupHttpInputStreamNew) - I think I would prefer having this code go to WebKit/soup/; the API should be kept in soup/GTK+-style even if we decide to move the rest (which is one more argument in favor of keeping the style)
Comment 20 Martin Robinson 2010-08-23 16:51:18 PDT
I'm also in favor of keeping this in the original style, in the interest of  pushing fixes back into your soup branch. I'd love to see this code isolated into Webkit/gtk/soup.

svillar, can you comment on the performance characteristics of the code which uses tpl? It may make more sense to use GVariant or sqlite, depending on what it does.
Comment 21 Xan Lopez 2010-08-23 22:57:12 PDT
(In reply to comment #20)
> I'm also in favor of keeping this in the original style, in the interest of  pushing fixes back into your soup branch. I'd love to see this code isolated into Webkit/gtk/soup.

If we are going to do this then let's open a bug right away to make the style bots ignore that bunch of code, otherwise this is going to be annoying. That way we'll also have something to point people to when they come asking why is there a ball of code not following the style inside the source tree (hello Mark! :)).

> 
> svillar, can you comment on the performance characteristics of the code which uses tpl? It may make more sense to use GVariant or sqlite, depending on what it does.

I agree with the sentiment. Did you talk with Dan and decided that using TPL made sense here? As suggested GVariant seems to be made exactly for this kind of task, so it would make sense to reuse it here.
Comment 22 Sergio Villar Senin 2010-08-24 00:30:23 PDT
Using Xan's comments for the answer the previous reviews

(In reply to comment #21)
> (In reply to comment #20)
> > I'm also in favor of keeping this in the original style, in the interest of  pushing fixes back into your soup branch. I'd love to see this code isolated into Webkit/gtk/soup.
> 
> If we are going to do this then let's open a bug right away to make the style bots ignore that bunch of code, otherwise this is going to be annoying. That way we'll also have something to point people to when they come asking why is there a ball of code not following the style inside the source tree (hello Mark! :)).

Not really sure how to make bots ignore that, could you help on that?

Provided you all are in favour of keeping the style do you think it still makes sense to add a "webkit_" prefix to all the symbols?

> > 
> > svillar, can you comment on the performance characteristics of the code which uses tpl? It may make more sense to use GVariant or sqlite, depending on what it does.
> 
> I agree with the sentiment. Did you talk with Dan and decided that using TPL made sense here? As suggested GVariant seems to be made exactly for this kind of task, so it would make sense to reuse it here.

Regarding why I chose tpl, there are some reasons:
   * I had previously used it for a personal project
   * The API is quite compact, easy and understandable
   * No external dependencies needed
   * It's supposed to be fast

Most of them are likely valid for the GVariant stuff, I guess I was mostly biased by the first one. I don't know about performance using GVariant, but if it's acceptable I agree it makes sense to use it.
Comment 23 Gustavo Noronha (kov) 2010-08-24 06:17:26 PDT
To make the bots ignore the path you need to add the path, + information about which rules are allowed to be broken on that path to WebKit/WebKitTools/Scripts/webkitpy/style/checker.py. If we agree on the path I can make the patch. WebCore/platform/network/soup/cache for the internal files and WebKit/gtk/soup for the API? Or should we go with WebKit/gtk/soup/ for everything, making this gtk-only, essentially?

As for the webkit_ prefix, I think it makes sense to add the prefix to public API, so it won't clash with the proper soup symbols in the future.

TPL/GVariant, I'd like to go with GVariant. TPL does bring _1_ additional dependency, and that is the actual TPL files, which you then have to keep up-to-date yourself. GVariant on the other hand is a standard part of our platform, and will be maintained by the glib maintainers =).
Comment 24 Lukasz Slachciak 2010-08-24 06:55:51 PDT
(In reply to comment #23)
> If we agree on the path I can make the patch. WebCore/platform/network/soup/cache for the internal files and WebKit/gtk/soup for the API? Or should we go with WebKit/gtk/soup/ for everything, making this gtk-only, essentially?

Please don't make it gtk-only  putting files into WebKit/gtk/soup/. Also EFL port is using soup as network backend, and this can be nice feature for EFL also.
Comment 25 Gustavo Noronha (kov) 2010-08-24 08:23:48 PDT
> Please don't make it gtk-only  putting files into WebKit/gtk/soup/. Also EFL port is using soup as network backend, and this can be nice feature for EFL also.

That is why I initially suggested WebKit/soup/. I think for simplicity's sake we should use WebKit/soup for all files.
Comment 26 Martin Robinson 2010-08-24 09:23:34 PDT
(In reply to comment #25)
> > Please don't make it gtk-only  putting files into WebKit/gtk/soup/. Also EFL port is using soup as network backend, and this can be nice feature for EFL also.
> That is why I initially suggested WebKit/soup/. I think for simplicity's sake we should use WebKit/soup for all files.

I can see why this is a good idea now. Here's another vote for WebKit/soup.
Comment 27 Sergio Villar Senin 2010-08-25 00:03:42 PDT
(In reply to comment #23)
> To make the bots ignore the path you need to add the path, + information about which rules are allowed to be broken on that path to WebKit/WebKitTools/Scripts/webkitpy/style/checker.py. If we agree on the path I can make the patch. WebCore/platform/network/soup/cache for the internal files and WebKit/gtk/soup for the API? Or should we go with WebKit/gtk/soup/ for everything, making this gtk-only, essentially?

I like the former. Basically because most of stuff is only needed by the ResourceHandleSoup. That's why having something under WebCore and the public API under WebKit makes a lot of sense to me. Regarding everything under WebKit/soup, I thought that being WebCore objects they should be under WebCore/ but I could be obviously wrong.
 
> As for the webkit_ prefix, I think it makes sense to add the prefix to public API, so it won't clash with the proper soup symbols in the future.

Agree.
 
> TPL/GVariant, I'd like to go with GVariant. TPL does bring _1_ additional dependency, and that is the actual TPL files, which you then have to keep up-to-date yourself. GVariant on the other hand is a standard part of our platform, and will be maintained by the glib maintainers =).

Yep, I'm moving the serialization/persistence stuff to GVariant and removing the tpl dependency.
Comment 28 Sergio Villar Senin 2010-08-25 07:47:02 PDT
Some extra information regarding the GVariant/tpl stuff. I coded a couple of serialization/deserialization routines both using tpl and GVariant in order to evaluate the performance. For the test the data that I wanted to serialize was an array of structures with a uint32, a boolean and 2 strings (the amount of data in the cache is actually higher but both systems claim to be O(n)).

Then I registered the time for serializing/deserializing arrays of different sizes. The first size I chose was 5000 elements. Basically because if we take a typical cache size of 50MB and an average file size of 10k then that means 5000 cache entries. The other test I did was with 10 times that size (50000 elements), for quite extreme cases.

For the case of tpl I implemented it both using fixed-length arrays and variable-length arrays.

These are the numbers I got (average values after 10 runs):

Serialization
*************

      | tpl (fixed) | tpl     | gvariant |
      ------------------------------------
5000  | 0.02456     | 0.02606 | 0.1098   |
50000 | 0.16678     | 0.23968 | 1.0829   |


Deserialization
***************

      | tpl (fixed) | tpl     | gvariant |
      ------------------------------------
5000  | 0.02456     | 0.02606 | 0.1098   |
50000 | 0.16678     | 0.23968 | 1.0829   |

The most important numbers are the deserialization ones because that happens when the applications start while the serialization is typically performed on application exit.

It's pretty clear that tpl (even using the not-so-fast variable length arrays is way faster (~85%) than gvariant. But if we consider that 5000 is an accurate figure then the difference in time is less than 1/10 sec.

Maybe a figure in between the two I chose fits better with actual numbers. In that case application start will wait during 1/2 second for cache to start. That can be partially avoided if we perform the deserialization in a different thread keeping the cache in a not_available state. Deserialization will most likely finish before the user could even start typing a URL (home page load could be affected by the concurrent deserialization BTW).
Comment 29 Gustavo Noronha (kov) 2010-08-26 08:12:08 PDT
Created attachment 65567 [details]
patch for check-webkit-style to ignore our new stuff
Comment 30 WebKit Review Bot 2010-08-26 08:16:23 PDT
Attachment 65567 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1
WebKitTools/Scripts/webkitpy/style/checker.py:218:  whitespace before ']'  [pep8/E202] [5]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 31 Martin Robinson 2010-08-26 08:56:29 PDT
Comment on attachment 65567 [details]
patch for check-webkit-style to ignore our new stuff

The style error seems like a false positive, so this looks good to me.
Comment 32 Gustavo Noronha (kov) 2010-08-26 12:19:04 PDT
Comment on attachment 65567 [details]
patch for check-webkit-style to ignore our new stuff

Landed as r66120. Turns out that the python style rules require a , after the last item of a list (the original code had it), so I added it, and the style checker is happy with itself again =)
Comment 33 Sergio Villar Senin 2010-08-30 03:36:24 PDT
Created attachment 65900 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

* Added files from libsoup with SoupURILoader stuff
* New API that adds WebKitSoupCache
Comment 34 Sergio Villar Senin 2010-08-30 03:38:44 PDT
Created attachment 65901 [details]
Changes to WebKitGtk+ core network stuff to use the new API from the imported SoupURILoader code

WebKitGtk+ using SoupURILoader
Comment 35 WebKit Review Bot 2010-08-30 04:04:11 PDT
Attachment 65900 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3850142
Comment 36 WebKit Review Bot 2010-08-30 04:44:05 PDT
Attachment 65901 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3877142
Comment 37 Gustavo Noronha (kov) 2010-08-30 07:35:03 PDT
Comment on attachment 65900 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

This will still need a changelog entry, and the build breakage implies we need to either depend on a newer glib or factor this code out if building with an oldish glib. There's a check in place to decide whether to use GSettings. I'd recommend adding a check specifically for GVariant, to decide whether to build this new code or not. I won't bother reviewing the actual code, except for the public APIs. The identation in soup-cache.c is completely broken (no spaces at all in various places, 4-space indentation in others), I think you may want to run indent in all soup cache files to have them match the libsoup style.
Comment 38 Gustavo Noronha (kov) 2010-08-30 07:36:09 PDT
Actually, I think trying to make this buildable with older glibs is going to pull us into hell, so I would go with making us depend on the earliest GVariant-providing glib.
Comment 39 Sergio Villar Senin 2010-08-31 10:15:51 PDT
(In reply to comment #38)
> Actually, I think trying to make this buildable with older glibs is going to pull us into hell, so I would go with making us depend on the earliest GVariant-providing glib.

Agree. I'll add changelog's and update the glib version
Comment 40 Sergio Villar Senin 2010-09-01 03:34:01 PDT
Created attachment 66189 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

Now with ChangeLogs. Files imported from libsoup have now the proper libsoup coding style
Comment 41 Sergio Villar Senin 2010-09-01 03:35:42 PDT
Created attachment 66190 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

g_timeout_add is now back in place after replacing g_idle_add. BTW the test that was working only with g_idle_add is now working with the timeout_add too.
Comment 42 Sergio Villar Senin 2010-09-01 03:38:22 PDT
Created attachment 66191 [details]
Added fast/encoding/percent-escaping.html to the list of Skipped

That test does not work with the new SoupURILoader stuff.

See https://bugzilla.gnome.org/show_bug.cgi?id=523100#c28 for further details. A small patch to libsoup will fix it.
Comment 43 WebKit Review Bot 2010-09-01 06:36:05 PDT
Attachment 66190 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3921019
Comment 44 Sergio Villar Senin 2010-09-06 03:52:05 PDT
Comment on attachment 65901 [details]
Changes to WebKitGtk+ core network stuff to use the new API from the imported SoupURILoader code

Marked as obsolete
Comment 45 Sergio Villar Senin 2010-09-06 09:07:28 PDT
Created attachment 66650 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

A few changes since the last version of the patch:

* glib dependency is now 2.24
* removed several data from cache that is not really needed (as it was saved anyway in the http headers)
Comment 46 WebKit Review Bot 2010-09-06 09:34:55 PDT
Attachment 66650 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3914232
Comment 47 Martin Robinson 2010-09-06 09:51:44 PDT
Comment on attachment 66190 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

This patch is looking good. I'm excited for this to land.

View in context: https://bugs.webkit.org/attachment.cgi?id=66190&action=prettypatch

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:138
> +        m_soupRequest.clear();
m_soupRequest.clear(); is unecessary here, I believe. The destructor will take care of decrementing the smart pointer reference count. If this is needed otherwise there should really be a comment explaining why.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:196
> +    String location = String(uri.get());
Again, I believe this should be ::fromUTF8(...).

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:305
> +#define BUFFER_SIZE 8192
If this is a constant, why do we need to store m_bufferSize? Let's just get rid of that member if possible. Again, I think this should be called READ_BUFFER_SIZE and be at the top of the file too.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:689
> +                            sendRequestCallback, 0);
This newline seems unecessary. I usually split the line when it goes over 120 characters.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:846
> +        client->didReceiveData(handle.get(), reinterpret_cast<const char*>(unicodeStr.get()), g_utf8_strlen(d->m_buffer, bytesRead) * sizeof(UChar), 0);
Would it be possible to simply use WTFString here?

String data = String::fromUTF8(d->m_buffer, bytesRead);
client->didReceiveData(handle.get(), reinterpret_cast<const char*>(data.characters()), data.length() * sizeof(UChar), 0);

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:867
> +    GOwnPtr<char> urlStr;
Please use CString here, if possible. It was made just for this purpose.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:901
> +
Please remove the newline.
Comment 48 Sergio Villar Senin 2010-09-07 01:38:16 PDT
Created attachment 66692 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

New version with fixes to the issues detected by Martin
Comment 49 WebKit Review Bot 2010-09-07 03:09:05 PDT
Attachment 66692 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3938283
Comment 50 Martin Robinson 2010-09-07 21:35:20 PDT
Comment on attachment 66650 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

View in context: https://bugs.webkit.org/attachment.cgi?id=66650&action=prettypatch

My major concern with this patch is exposing a public API that will not be compatible with some further Soup version. Is it possible to keep this API private for now?

> GNUmakefile.am:325
>  
Please keep this list in alphabetical order.

> WebCore/GNUmakefile.am:2575
>  	WebCore/plugins/gtk/PluginDataGtk.cpp \
Please keep the source lists in alphabetical order. I've been ordered them by filtering them through sort.
Comment 51 Gustavo Noronha (kov) 2010-09-09 11:40:27 PDT
(In reply to comment #50)
> (From update of attachment 66650 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=66650&action=prettypatch
> 
> My major concern with this patch is exposing a public API that will not be compatible with some further Soup version. Is it possible to keep this API private for now?

The applications that uses WebKitGTK+ will need that API to setup the caching, so we can't make it private. That concern is why we are adding the WebKit prefix to the API bits, we should be fine.
Comment 52 Sergio Villar Senin 2010-09-10 00:17:49 PDT
(In reply to comment #50)
> (From update of attachment 66650 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=66650&action=prettypatch
> 
> My major concern with this patch is exposing a public API that will not be compatible with some further Soup version. Is it possible to keep this API private for now?

As Gustavo said it's a must for clients.

> > GNUmakefile.am:325
> >  
> Please keep this list in alphabetical order.

It's currently not properly sorted, do you think it makes sense to sort them in this patch?
Comment 53 Sergio Villar Senin 2010-09-10 06:00:27 PDT
Comment on attachment 66692 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Removing the review request as I found a bug that could lead to a crash
Comment 54 Sergio Villar Senin 2010-09-10 15:59:53 PDT
Created attachment 67254 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

Fixed minor issues with sorting reported by Martin.

Apart from that several changes in cache *implementation* (no changes in pure WebKitGtk+ code or API)
   * Fixed crash caused by multiple free
   * Fixed memory leaks
   * LRU is now properly updated
   * Cache resources properly freed on finalize
   * Decreased cache's threads priority to priorize libsoup's and webkit's
   * GVariant does not like non-UTF8 strings. Validate them before serializing.

I won't be available next week so don't expect comments from my side till next Saturday at least :-)
Comment 55 Sergio Villar Senin 2010-09-10 16:01:33 PDT
Comment on attachment 66692 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Added missing r=?
Comment 56 WebKit Review Bot 2010-09-10 18:23:11 PDT
Attachment 67254 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3897376
Comment 57 Martin Robinson 2010-09-17 08:05:17 PDT
Comment on attachment 66191 [details]
Added fast/encoding/percent-escaping.html to the list of Skipped

View in context: https://bugs.webkit.org/attachment.cgi?id=66191&action=prettypatch

> LayoutTests/platform/gtk/Skipped:1138
> +fast/encoding/percent-escaping.html

I think I'd rather see a work-around in our code than a regression. We should chat about this sometime.
Comment 58 Michal Pakula vel Rutka 2010-09-17 09:51:11 PDT
As in WebKit-EFL we want to use your cache implementation could you tell me if it is already finished or are there any features that need to be implemented?
Comment 59 Sergio Villar Senin 2010-09-20 01:28:46 PDT
(In reply to comment #57)
> (From update of attachment 66191 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=66191&action=prettypatch
> 
> > LayoutTests/platform/gtk/Skipped:1138
> > +fast/encoding/percent-escaping.html
> 
> I think I'd rather see a work-around in our code than a regression. We should chat about this sometime.

Unfortunatelly there is no work-around possible from our side without breaking all the new URI loader way of getting resources because the problem is how libsoup handles file: URLs. The fix should be added to libsoup, the patch is really small (see https://bugzilla.gnome.org/show_bug.cgi?id=523100#c28 for more details). I can talk with danw about that before pushing this upstream.
Comment 60 Sergio Villar Senin 2010-09-20 01:29:33 PDT
(In reply to comment #58)
> As in WebKit-EFL we want to use your cache implementation could you tell me if it is already finished or are there any features that need to be implemented?

It's finished but under review process, hopefully will land soon.
Comment 61 Sergio Villar Senin 2010-09-21 01:46:29 PDT
(In reply to comment #59)
> (In reply to comment #57)
> > (From update of attachment 66191 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=66191&action=prettypatch
> > 
> > > LayoutTests/platform/gtk/Skipped:1138
> > > +fast/encoding/percent-escaping.html
> > 
> > I think I'd rather see a work-around in our code than a regression. We should chat about this sometime.
> 
> Unfortunatelly there is no work-around possible from our side without breaking all the new URI loader way of getting resources because the problem is how libsoup handles file: URLs. The fix should be added to libsoup, the patch is really small (see https://bugzilla.gnome.org/show_bug.cgi?id=523100#c28 for more details). I can talk with danw about that before pushing this upstream.

Actually the problem is not with the percent signs but with the blank spaces in the filename. Soup will always try to convert them into "%20" because otherwise is won't be a valid URL. So even if you encode them before passing the url to libsoup you'll be generating a different filename and thus the test will fail. I think the best I can do is to have a little chat about this issue with danw and try to get the libsoup patch upstream.
Comment 62 Martin Robinson 2010-09-21 08:30:09 PDT
(In reply to comment #61)
> Actually the problem is not with the percent signs but with the blank spaces in the filename. Soup will always try to convert them into "%20" because otherwise is won't be a valid URL. So even if you encode them before passing the url to libsoup you'll be generating a different filename and thus the test will fail. I think the best I can do is to have a little chat about this issue with danw and try to get the libsoup patch upstream.

So does this mean filenames with spaces will be inaccessible? That definitely seems like a blocker for merging this.
Comment 63 Sergio Villar Senin 2010-09-22 01:51:15 PDT
Comment on attachment 66191 [details]
Added fast/encoding/percent-escaping.html to the list of Skipped

Removing as the test is working with the newest changes to be uploaded
Comment 64 Sergio Villar Senin 2010-09-22 02:04:42 PDT
Created attachment 68349 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

New version of the patch. Added fix for the test that started to fail
Comment 65 Sergio Villar Senin 2010-09-22 02:07:40 PDT
Created attachment 68350 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

New version of the patch. Rebased against latest changes to the master
Comment 66 Sergio Villar Senin 2010-09-22 02:09:16 PDT
Comment on attachment 66692 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Marking this as obsolete, forgot to do it before
Comment 67 WebKit Review Bot 2010-09-22 02:19:03 PDT
Attachment 68349 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/3996101
Comment 68 WebKit Review Bot 2010-09-22 02:40:51 PDT
Attachment 68350 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4092036
Comment 69 Sergio Villar Senin 2010-09-22 03:50:18 PDT
Comment on attachment 68349 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

Still one issue remaining. Removing the r? temporarily
Comment 70 Sergio Villar Senin 2010-09-23 02:03:48 PDT
Created attachment 68505 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

This is the right version of the (hopefully) final patch.

I managed to fix the issue with the percent-escaping.html test that was failing and the good news is that the fix is part of the imported soup uri loader code. That way no special code is needed by current WebKitGtk+ code, and if it eventually lands on libsoup, every client will take advantage of that.
Comment 71 WebKit Review Bot 2010-09-23 03:16:05 PDT
Attachment 68505 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4070098
Comment 72 Martin Robinson 2010-09-23 16:43:47 PDT
Comment on attachment 68350 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

View in context: https://bugs.webkit.org/attachment.cgi?id=68350&action=review

Looks good! I think we should merge this soon. I just have a few style nits.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:128
> +static void cleanupSoupRequestOperation(ResourceHandle* handle, bool isDestroying);
> +static void sendRequestCallback(GObject* source, GAsyncResult* res, gpointer);
> +static void readCallback(GObject* source, GAsyncResult* res, gpointer);
> +static void closeCallback(GObject* source, GAsyncResult* res, gpointer);

I think all of these parameter names are unnecessary except for isDestroying.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:381
> +    handle->ref();

This really needs a comment describing where the balancing deref() is.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:386
> +    g_input_stream_read_async(d->m_inputStream, d->m_buffer, READ_BUFFER_SIZE,
> +                              G_PRIORITY_DEFAULT, d->m_cancellable,
> +                              readCallback, (isBase64) ? 0 : GINT_TO_POINTER(1));

Perhaps smoosh this down to two lines in the interest of vertical space?

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:403
> +

This blank line seems accidental.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:501
> +            ResourceError resourceError(g_quark_to_string(SOUP_HTTP_ERROR),
> +                                        d->m_msg->status_code,
> +                                        uriStr.get(),
> +                                        String::fromUTF8(d->m_msg->reason_phrase));

Once again, smoosh this down to two lines.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:513
> +            ResourceError resourceError(g_quark_to_string(G_IO_ERROR),
> +                                        error->code,
> +                                        uriStr.get(),
> +                                        error ? String::fromUTF8(error->message) : String());

Ditto.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:574
> +    g_input_stream_read_async(d->m_inputStream, d->m_buffer, READ_BUFFER_SIZE,
> +                              G_PRIORITY_DEFAULT, d->m_cancellable,
> +                              readCallback, 0);

Ditto.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:824
>          ResourceError resourceError(g_quark_to_string(G_IO_ERROR),
>                                      error->code,
> -                                    uri,
> +                                    uriStr.get(),
>                                      error ? String::fromUTF8(error->message) : String());

Ditto.

> WebCore/platform/network/soup/ResourceRequest.h:70
> +        void updateSoupMessage(SoupMessage* soupMessage) const;

Please remove the parameter name.
Comment 73 Sergio Villar Senin 2010-09-24 02:47:35 PDT
Created attachment 68664 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

New version with Martin's style fix requests
Comment 74 WebKit Review Bot 2010-09-24 04:37:54 PDT
Attachment 68664 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4068122
Comment 75 Dan Winship 2010-09-24 11:22:36 PDT
Created attachment 68717 [details]
patches to Sergio's patches

OK, the reason SoupRequester needed a special case to get file URIs working right was because there was already an incorrect special case for file URIs in ResourceHandleSoup::startGio() that was messing the URL up before it got to SoupRequester. So this patch removes both special cases (I didn't bother attaching it in the form of a git commit since it actually needs to be split between both of Sergio's existing patches.).

Additionally, to deal with the problem the startGio() hack was originally attempting to work around, we need to make soup_uri_decode() accept badly-%-encoded URIs rather than returning NULL (which is something we'd already discussed in gnomebug 523100 anyway). There is now a patch for that in https://bugzilla.gnome.org/show_bug.cgi?id=630540 but it's not yet committed because we're in hard code freeze for 2.32.
Comment 76 Martin Robinson 2010-09-24 11:28:42 PDT
(In reply to comment #75)
> OK, the reason SoupRequester needed a special case to get file URIs working right was because there was already an incorrect special case for file URIs in ResourceHandleSoup::startGio() that was messing the URL up before it got to SoupRequester. So this patch removes both special cases (I didn't bother attaching it in the form of a git commit since it actually needs to be split between both of Sergio's existing patches.).

Thank you! Sergio will just need to merge these fixes into his patches before the final r+.

> Additionally, to deal with the problem the startGio() hack was originally attempting to work around, we need to make soup_uri_decode() accept badly-%-encoded URIs rather than returning NULL (which is something we'd already discussed in gnomebug 523100 anyway). There is now a patch for that in https://bugzilla.gnome.org/show_bug.cgi?id=630540 but it's not yet committed because we're in hard code freeze for 2.32.

So, if I understand correctly, to avoid a regression we'll either need to:
1. Re-implement this logic in WebKit GTK+ for older soup versions.
2. Wait to land this until you release a new version of soup with this fix and then depend on that.
Comment 77 Dan Winship 2010-09-24 11:31:10 PDT
I haven't looked at the code in detail, but I think you probably want to rename all of the soup-prefixed stuff to be webkit-prefixed instead so that it doesn't start causing compile and run-time problems when the code lands in libsoup. (Even if it's not visible outside of libwebkit, you still have to worry about GType's type database; bad things will happen if libsoup and webkit both try to register a type with the name "SoupRequest".)

I have not looked at the tpl-using code or the GVariant-using alternative, but it does seem like using GVariant instead of bringing in tpl would be nice... if there are performance issues presumably GVariant can still be optimized a bunch?
Comment 78 Dan Winship 2010-09-24 11:40:00 PDT
(In reply to comment #76)
> So, if I understand correctly, to avoid a regression we'll either need to:
> 1. Re-implement this logic in WebKit GTK+ for older soup versions.
> 2. Wait to land this until you release a new version of soup with this fix and then depend on that.

Yes. For #1, you could make soup-request-file.c use KURL.h's decodeURLEscapeSequences() instead of soup_uri_decode(), assuming it's not too obnoxious to call that from C...

For #2, it will be in libsoup 2.32.1. Are you planning to try to get this in for GNOME 2.32.0 still? I'm pretty sure it's too late to bump external dependency versions at this point...
Comment 79 Sergio Villar Senin 2010-09-25 04:06:51 PDT
(In reply to comment #75)
> Created an attachment (id=68717) [details]
> patches to Sergio's patches
> 
> OK, the reason SoupRequester needed a special case to get file URIs working right was because there was already an incorrect special case for file URIs in ResourceHandleSoup::startGio() that was messing the URL up before it got to SoupRequester. So this patch removes both special cases (I didn't bother attaching it in the form of a git commit since it actually needs to be split between both of Sergio's existing patches.).
> 

Well that's not really the reason. The special case in souprequester needs the special case in startGio. One cannot work without the other. With both of them it works fine.

> Additionally, to deal with the problem the startGio() hack was originally attempting to work around, we need to make soup_uri_decode() accept badly-%-encoded URIs rather than returning NULL (which is something we'd already discussed in gnomebug 523100 anyway). There is now a patch for that in https://bugzilla.gnome.org/show_bug.cgi?id=630540 but it's not yet committed because we're in hard code freeze for 2.32.

That's the real issue IMO. That's why my first thought was to modify the decoding in soup-uri.c although your changes seem indeed better than adding a special case for file: URL's. I also considered removing that fixup parametter but I was not sure about breaking current URI parsing in soup. Note that glib APIs do return NULL when they cannot decode URIs

One last question Dan, did you run the tests with both patches?
Comment 80 Xan Lopez 2010-09-26 07:16:30 PDT
Comment on attachment 68664 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

View in context: https://bugs.webkit.org/attachment.cgi?id=68664&action=review

> WebCore/platform/network/ResourceHandleInternal.h:51
> +#include <GRefPtr.h>

Nitpick, but requester.h already includes request.h

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:196
> +    String location = String::fromUTF8(uri.get());

Can't we just commit this bit now?

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:305
>  // parseDataUrl() is taken from the CURL http backend.

I guess you want to remove this comment now, since the function is wildly different by now (or at least say 'originally' or something).

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:387
>  

GINT_TO_POINTER(!isBase64) ? Could fail?

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:453
> +

Any reason not to use platform ptrs here too?

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:459
> +

And here?

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:515
> +

These two ifs are almost identical, I think they can be trivially merged.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:544
> +    d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE));

Am I the only one worried that we malloc 8k for each outgoing HTTP request? or am I missing something here. Is this just making something explicit that was implicit before or we adding an extra copy to the whole process?
Comment 81 Sergio Villar Senin 2010-09-27 02:12:32 PDT
Comment on attachment 68664 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

View in context: https://bugs.webkit.org/attachment.cgi?id=68664&action=review

>> WebCore/platform/network/ResourceHandleInternal.h:51
>> +#include <GRefPtr.h>
> 
> Nitpick, but requester.h already includes request.h

Indeed. Will remove it.

>> WebCore/platform/network/soup/ResourceHandleSoup.cpp:196
>> +    String location = String::fromUTF8(uri.get());
> 
> Can't we just commit this bit now?

Yes. Martin asked for the change, that's why it's included in this patch

>> WebCore/platform/network/soup/ResourceHandleSoup.cpp:305
>>  // parseDataUrl() is taken from the CURL http backend.
> 
> I guess you want to remove this comment now, since the function is wildly different by now (or at least say 'originally' or something).

True

>> WebCore/platform/network/soup/ResourceHandleSoup.cpp:387
>>  
> 
> GINT_TO_POINTER(!isBase64) ? Could fail?

I just wanted to be ultra-explicit but what you propose would be the same.

>> WebCore/platform/network/soup/ResourceHandleSoup.cpp:453
>> +
> 
> Any reason not to use platform ptrs here too?

They are not platform ptrs right now. Not sure if that change should be included in this patch. I thought it was a general rule not to include changes not directly related to the issue the patch tries to fix.

>> WebCore/platform/network/soup/ResourceHandleSoup.cpp:515
>> +
> 
> These two ifs are almost identical, I think they can be trivially merged.

Indeed they can

>> WebCore/platform/network/soup/ResourceHandleSoup.cpp:544
>> +    d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE));
> 
> Am I the only one worried that we malloc 8k for each outgoing HTTP request? or am I missing something here. Is this just making something explicit that was implicit before or we adding an extra copy to the whole process?

:m really good point here. I think you're right and we're adding an extra copy for http requests when reading from the input stream. With the old code all the didReceiveData calls were using buffers provided by libsoup without any extra copy. I guess we need something better then...
Comment 82 Dan Winship 2010-09-27 07:30:35 PDT
(In reply to comment #80)
> > WebCore/platform/network/soup/ResourceHandleSoup.cpp:544
> > +    d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE));
> 
> Am I the only one worried that we malloc 8k for each outgoing HTTP request? or am I missing something here. Is this just making something explicit that was implicit before or we adding an extra copy to the whole process?

At the moment, given the kludginess of SoupHTTPInputStream, this is an additional malloc and an additional copy. When things are redone so that libsoup uses streams internally, then the malloc here will be replacing a malloc that is currently in soup-message-io.c, and libsoup will be reading directly from the network into your buffers.

At any rate, you can copy the buffer 10 times and it will still be unnoticeable compared to waiting for the network and rendering HTML, so, it's not that awful...
Comment 83 Xan Lopez 2010-09-28 04:24:49 PDT
(In reply to comment #82)
> (In reply to comment #80)
> > > WebCore/platform/network/soup/ResourceHandleSoup.cpp:544
> > > +    d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE));
> > 
> > Am I the only one worried that we malloc 8k for each outgoing HTTP request? or am I missing something here. Is this just making something explicit that was implicit before or we adding an extra copy to the whole process?
> 
> At the moment, given the kludginess of SoupHTTPInputStream, this is an additional malloc and an additional copy. When things are redone so that libsoup uses streams internally, then the malloc here will be replacing a malloc that is currently in soup-message-io.c, and libsoup will be reading directly from the network into your buffers.
> 
> At any rate, you can copy the buffer 10 times and it will still be unnoticeable compared to waiting for the network and rendering HTML, so, it's not that awful...

WebKit in general has a rule of never allowing performance regressions, even if new features are being added. I think it makes sense to enforce this rule in our port too.

If this is going to go away in the future and there's really no reasonably easy to avoid it I think we could make an exception, but it would still be unfortunate. FWIW talking with Holger some weeks ago he told me in QtWebKit the copying of data in the network stack is noticeable performance-wise in general, although in their case the number of copies was larger (something between 3 or 5 IIRC).
Comment 84 Dan Winship 2010-09-28 08:21:07 PDT
(In reply to comment #83)
> WebKit in general has a rule of never allowing performance regressions, even if new features are being added. I think it makes sense to enforce this rule in our port too.

But presumably that means "measured performance regressions", not "theoretical performance regressions". And one would hope that caching is going to increase performance by much more than an extra copy is going to decrease it... :)

> If this is going to go away in the future and there's really no reasonably easy to avoid it

Duh, now that you mention it, there's an API designed exactly for avoiding this extra copy... soup_message_set_chunk_allocator() (added for the gstreamer guys when they were porting their http stream to use libsoup). SoupHTTPInputStream could be modified to use this to get libsoup to read directly into the buffers provided by webkit.
Comment 85 Sergio Villar Senin 2010-09-29 06:49:44 PDT
(In reply to comment #84)
> (In reply to comment #83)
> > WebKit in general has a rule of never allowing performance regressions, even if new features are being added. I think it makes sense to enforce this rule in our port too.
> 
> But presumably that means "measured performance regressions", not "theoretical performance regressions". And one would hope that caching is going to increase performance by much more than an extra copy is going to decrease it... :)
> 
> > If this is going to go away in the future and there's really no reasonably easy to avoid it
> 
> Duh, now that you mention it, there's an API designed exactly for avoiding this extra copy... soup_message_set_chunk_allocator() (added for the gstreamer guys when they were porting their http stream to use libsoup). SoupHTTPInputStream could be modified to use this to get libsoup to read directly into the buffers provided by webkit.

Unfortunately I don't think that will fix anything. First of all because I think the chunk_allocator thing it's not properly implemented in libsoup, because the SoupBuffer (and its data pointer) that comes with the got_chunk signal is not the same than the one returned by the allocator. That's because the decoding routines create a different buffer for each decoder.

Secondly, even if we fix the problem stated above, we'd still have to copy the data returned by the decoders to the caller's buffer (soup_coding_apply_into wouldn't fix anything as input and output buffers must not overlap)
Comment 86 Sergio Villar Senin 2010-09-29 09:54:37 PDT
Created attachment 69205 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

Includes Dan's patch that reverts the file: URIs hack
Comment 87 Sergio Villar Senin 2010-09-29 10:17:55 PDT
Created attachment 69209 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

New version with Dan's patch applied.

It also includes Xan's requested changes:
   * Moved m_inputStream and m_cancellable to platform ptrs
   * Refactored the 2 errors
   * GINT_TO_POINTER(!isBase64)
   * Removed duplicated include file

Apart from that I removed some glib2.21 #ifdef code as the required version for this patch is 2.24
Comment 88 WebKit Review Bot 2010-09-29 11:07:07 PDT
Attachment 69205 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4201011
Comment 89 WebKit Review Bot 2010-09-29 13:07:41 PDT
Attachment 69209 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4225012
Comment 90 Sergio Villar Senin 2010-09-30 03:07:14 PDT
(In reply to comment #77)
> I haven't looked at the code in detail, but I think you probably want to rename all of the soup-prefixed stuff to be webkit-prefixed instead so that it doesn't start causing compile and run-time problems when the code lands in libsoup. (Even if it's not visible outside of libwebkit, you still have to worry about GType's type database; bad things will happen if libsoup and webkit both try to register a type with the name "SoupRequest".)

I think it was kov the one that said that it'd be better not to change them all in order to easen the migration to libsoup.

Anyway I agree that it could be an important issue. I have a patch that changes all those names so if everybody think it's interesting I can upload new versions of the patches.
Comment 91 Sergio Villar Senin 2010-09-30 11:13:32 PDT
Created attachment 69355 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

Renamed all the new Soup* symbols to WebKitSoup*
Comment 92 Sergio Villar Senin 2010-09-30 11:14:50 PDT
Comment on attachment 69205 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

Forgot to mark this as obsolete
Comment 93 Sergio Villar Senin 2010-09-30 11:16:06 PDT
Created attachment 69356 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Renamed new Soup* symbols to WebKitSoup*

Also refactored some duplicated code in parseDataUrl (nice catch Martin)
Comment 94 WebKit Review Bot 2010-09-30 13:04:58 PDT
Attachment 69355 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4167023
Comment 95 Sergio Villar Senin 2010-09-30 13:29:07 PDT
As discussed on IRC, the fix for soup_uri_decode won't land on libsoup till 2.32.1 (mid November). As we want to release this first I'll copy&pastet the soup-uri.c code (uri_decoded_copy function) into the imported SoupURILoader stuff.

We'll remove that code once the 2.32.1 is released.
Comment 96 WebKit Review Bot 2010-09-30 14:34:18 PDT
Attachment 69356 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4219041
Comment 97 Sergio Villar Senin 2010-09-30 14:35:38 PDT
Created attachment 69384 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

New version of the patch:
   * A couple of indentation issues fixed
   * Do not need to decode base64 encoded data URLs
   * Imported uri_decoded_copy() with the fix that will land in libsoup 2.32.1. Use it instead of soup_uri_decode for file: and data: URLs
Comment 98 Martin Robinson 2010-09-30 15:19:34 PDT
Comment on attachment 69356 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

View in context: https://bugs.webkit.org/attachment.cgi?id=69356&action=review

This patch is so very close, but I've found some issues with memory leaks. When constructing a platformRefPtr, you should use adoptPlatformRef, unless you want the reference count to increase (the default constructor calls g_object_ref_sink).

> WebCore/platform/network/ResourceHandleInternal.h:118
> +            , m_soupRequest(0)
> +            , m_requester(0)
>              , m_inputStream(0)
>              , m_cancellable(0)

Since these are now PlatformRefPtrs you don't need to initialize them. They are automatically 0.

> WebCore/platform/network/ResourceHandleInternal.h:140
> +            m_requester = webkit_soup_requester_new();

This should use adoptPlatformRef, see below for an extended note about this.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:357
> +    d->m_soupRequest = webkit_soup_requester_request(d->m_requester.get(), handle->firstRequest().url().string().utf8().data(), session, &error.outPtr());

Ditto.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:365
> +    if (error)
> +        return false;
> +
> +    d->m_inputStream = in;

Careful here. From the soup code it looks like it is the callers responsibility to unref the value that webkit_soup_request_send returns. Constructing a PlatformRefPtr without using adoptPlatformRef has the effect of calling g_object_ref another time on the GInputStream. 

d->m_inputStream = adoptPlatformRef(webkit_soup_request_send(d->m_soupRequest.get(), 0, &error.outPtr()));
if (error) {
    d->m_inputStream = 0;
    return false;
}

and dispense with the intermediate 'in' variable entirely.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:373
> +    d->m_cancellable = g_cancellable_new();

This should be d->m_cancellable = adoptPlatformRef(g_cancellable_new()); as well.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:436
> +static void cleanupSoupRequestOperation(ResourceHandle* handle, bool isDestroying = false)
> +{
> +    ResourceHandleInternal* d = handle->getInternal();
> +
> +    if (d->m_soupRequest) {
> +        g_object_set_data(G_OBJECT(d->m_soupRequest.get()), "webkit-resource", 0);
> +        d->m_soupRequest.clear();
> +    }

We still need to clear the m_cancellable here to ensure that the unref happens, right?

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:519
> +    d->m_inputStream = in;

The same goes for this section. After this the reference count on d->m_inputStream will be 2.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:564
> +    d->m_soupRequest = webkit_soup_requester_request(d->m_requester.get(), url.string().utf8().data(), session, &error.outPtr());

Ditto.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:652
> +    d->m_cancellable = g_cancellable_new();

Same issue here.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:833
>      // GIO doesn't know how to handle refs and queries, so remove them

Shouldn't this comment be changed to reflect the fact that we're using soup now?

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:839
> +    urlStr = url.string().utf8().data();

This might as well be:
CString urlStr = url.string().utf8().data();

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:842
> +    d->m_soupRequest = webkit_soup_requester_request(d->m_requester.get(), urlStr.data(), session, &error.outPtr());

I think this should use adoptPlatformRef as well.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:851
>      d->m_cancellable = g_cancellable_new();

Same issue here.
Comment 99 WebKit Review Bot 2010-09-30 16:18:52 PDT
Attachment 69384 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4141040
Comment 100 Sergio Villar Senin 2010-10-01 02:47:29 PDT
Created attachment 69436 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

New version. Includes:
   * Added adoptPlatformRef where needed (thx Martin)
   * Renamed m_msg to m_soupMsg and converted it to PlatformRefPtr
Comment 101 WebKit Review Bot 2010-10-01 02:48:38 PDT
Attachment 69436 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1
WebCore/platform/network/soup/ResourceHandleSoup.cpp:481:  Declaration has space between type name and * in SoupMessage *soupMsg  [whitespace/declaration] [3]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:578:  Declaration has space between type name and * in SoupMessage *soupMsg  [whitespace/declaration] [3]
Total errors found: 2 in 6 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 102 Sergio Villar Senin 2010-10-01 02:56:38 PDT
Created attachment 69438 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Sorry for the previous one. Fixed the style issues.
Comment 103 WebKit Review Bot 2010-10-01 03:53:32 PDT
Attachment 69436 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4179041
Comment 104 WebKit Review Bot 2010-10-01 04:18:16 PDT
Attachment 69438 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4229039
Comment 105 Martin Robinson 2010-10-01 09:55:20 PDT
Comment on attachment 69438 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

View in context: https://bugs.webkit.org/attachment.cgi?id=69438&action=review

General comment: I'm not as picky about this as others, but some reviewers would have you end all comments with a period. That's something to consider for this patch.

> WebCore/platform/network/ResourceHandleInternal.h:189
> +        PlatformRefPtr<SoupMessage> m_soupMsg;

This should probably be m_soupMessage, as the guideline is to avoid abbreviations in new code.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:-132
> -    if (m_msg) {
> -        g_object_unref(m_msg);
> -        m_msg = 0;
> -    }

Nice change.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:145
> -    if (d->m_msg)
> -        g_signal_handlers_disconnect_matched(d->m_msg, G_SIGNAL_MATCH_DATA,
> +    if (d->m_soupMsg)
> +        g_signal_handlers_disconnect_matched(d->m_soupMsg.get(), G_SIGNAL_MATCH_DATA,

I could be wrong about this, but shouldn't this be in cleanupSoupRequestOperation? Otherwise, we might get responses to messages for a no-longer-valid SoupMessage.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:364
> +    d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE));

My intuition is that this allocation should either be done with the GLib slice allocator or with fastAlloc/fastFree. Perhaps someone else can chime in here.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:445
> +    if (d->m_cancellable)
> +        d->m_cancellable.clear();
> +
> +    if (d->m_soupMsg)
> +        d->m_soupMsg.clear();

No need to check these before clearing them. Just clear them.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:506
> +            // WebCore might have cancelled the job in the while
> +            if (d->m_cancelled)
> +                return;
> +
> +            if (soupMsg->response_body->data)
> +                client->didReceiveData(handle.get(), soupMsg->response_body->data, soupMsg->response_body->length, true);

In the case that the request was cancelled this should be calling cleanupSoupRequestOperation(). So you might change 
    if (soupMsg->response_body->data)
to
    if (!d->m_cancelled && soupMsg->response_body->data)
and remove the early return.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:513
> +        if (d->m_cancelled || !client) {
> +          cleanupSoupRequestOperation(handle.get());
> +          return;
> +        }

The tab width here should be 4.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:525
> +    d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE));

Same issue with the allocation.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:578
> +    SoupMessage* soupMsg = d->m_soupMsg.get();

Avoid abbreviations.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:582
> +    request.updateSoupMessage(soupMsg);
> +
> +    if (!d->m_soupMsg)
>          return false;

It doesn't look like you want to call updateSoupMessage with a null message, so you should probably move this check up to line 578.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:775
> +static void readCallback(GObject* source, GAsyncResult* res, gpointer data)

Please change res to asyncResult.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:826
> +    g_input_stream_read_async(d->m_inputStream.get(), d->m_buffer, READ_BUFFER_SIZE,
> +                              G_PRIORITY_DEFAULT, d->m_cancellable.get(),
> +                              readCallback, data);

This can be two lines.

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:845
> +    CString urlStr = url.string().utf8().data();

This can just be: CString urlStr = url.string().utf8(); I think I suggested .data() originally, sorry.
Comment 106 Sergio Villar Senin 2010-10-04 03:35:40 PDT
Created attachment 69611 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Some more changes to the patch from Martin's comments:
   * renamed soupMsg to soupMessage
   * disconnect signals in cleanupSoupRequestOperation
   * replaced g_malloc by g_slice_alloc
   * some style fixes
   * moved early returns to better locations
Comment 107 WebKit Review Bot 2010-10-04 04:00:53 PDT
Attachment 69611 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4200074
Comment 108 Martin Robinson 2010-10-04 08:07:06 PDT
Comment on attachment 69611 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Great. I'm going to wait for a nod from Kov or Xan to r+ the API changes.
Comment 109 Gustavo Noronha (kov) 2010-10-05 07:08:59 PDT
Comment on attachment 69384 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

View in context: https://bugs.webkit.org/attachment.cgi?id=69384&action=review

> WebCore/platform/network/soup/cache/webkit/soup-cache.c:1595
> +	filename = g_build_filename (priv->cache_dir, WEBKIT_SOUP_CACHE_FILE, NULL);
> +	if (!g_file_get_contents (filename, &contents, &length, NULL)) {
> +		g_free (filename);
> +		return;
> +	}

There are several indentation inconsistencies like this one (2 spaces indentation for the first level, but 4 spaces in second-level indentation) in the soup files. If you could fix those when landing, that'd be great. *nod* otherwise! (I'll leave the r+ for mrobinson)
Comment 110 Xan Lopez 2010-10-05 07:15:49 PDT
Just a reminder, but EWS needs to be updated before this can land.
Comment 111 Sergio Villar Senin 2010-10-05 10:56:58 PDT
(In reply to comment #109)
> (From update of attachment 69384 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=69384&action=review
> 
> > WebCore/platform/network/soup/cache/webkit/soup-cache.c:1595
> > +	filename = g_build_filename (priv->cache_dir, WEBKIT_SOUP_CACHE_FILE, NULL);
> > +	if (!g_file_get_contents (filename, &contents, &length, NULL)) {
> > +		g_free (filename);
> > +		return;
> > +	}
> 
> There are several indentation inconsistencies like this one (2 spaces indentation for the first level, but 4 spaces in second-level indentation) in the soup files. If you could fix those when landing, that'd be great. *nod* otherwise! (I'll leave the r+ for mrobinson)

I think you're cheat by the way bugzilla shows patches with tabs, as libsoup code is using tabs for indentation. I have double checked that the patches are properly indented with 8-space tabs, it's just how they're rendered when reviewed.

Anyway I'll upload a new version with emacs headers for indentation as libsoup uses.
Comment 112 Sergio Villar Senin 2010-10-05 10:59:08 PDT
Created attachment 69810 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

Added emacs indentation headers as the originals in libsoup. I also fixed a couple of typos in the names of some macros
Comment 113 Martin Robinson 2010-10-05 11:05:38 PDT
Comment on attachment 69810 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

I double-checked the patch and it seems it's tabbed properly. We need to update the EWS version of libsoup before we can land this patch.
Comment 114 Gajendra singh 2010-10-12 02:09:43 PDT
Just out of curiosity, Would this cache implementation suffice in multiprocess environment, Say Two processes A and B are linked with webkit and having same Cache location, then how the synchronization would be done for "soup.cache" file, if A and B load same pages.
Comment 115 WebKit Commit Bot 2010-10-12 10:59:39 PDT
Comment on attachment 69611 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Clearing flags on attachment: 69611

Committed r69589: <http://trac.webkit.org/changeset/69589>
Comment 116 Martin Robinson 2010-10-12 11:33:25 PDT
Comment on attachment 69611 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Just realized that these patches don't have complete ChangeLogs. Flipping the review flag until Sergio can fix them.
Comment 117 Sergio Villar Senin 2010-10-13 01:39:03 PDT
(In reply to comment #114)
> Just out of curiosity, Would this cache implementation suffice in multiprocess environment, Say Two processes A and B are linked with webkit and having same Cache location, then how the synchronization would be done for "soup.cache" file, if A and B load same pages.

There wouldn't be any kind of synchronization. The implementation assumes that every process will use it own directory.
Comment 118 Sergio Villar Senin 2010-10-13 10:55:47 PDT
Created attachment 70628 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

Corrected ChangeLogs
Comment 119 Sergio Villar Senin 2010-10-13 10:56:33 PDT
Created attachment 70630 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Updated ChangeLogs
Comment 120 Martin Robinson 2010-10-13 10:58:35 PDT
Comment on attachment 70628 [details]
Imported SoupURILoader to Webkit. New public API for HTTP cache

Great. Thanks. I'll land these patches.
Comment 121 Sergio Villar Senin 2010-10-13 11:08:38 PDT
Comment on attachment 69611 [details]
WebKitGtk+ to use the new API from the imported SoupURILoader code

Obsolete actually
Comment 122 Martin Robinson 2010-10-13 17:33:03 PDT
Committed r69718: <http://trac.webkit.org/changeset/69718>
Comment 123 Ryosuke Niwa 2010-10-13 18:02:49 PDT
http://trac.webkit.org/changeset/69718 seems to have broken Gtk/Qt release builds:

../../WebCore/platform/network/soup/cache/soup-directory-input-stream.c:114: warning: comparison between signed and unsigned integer expressions
../../WebCore/platform/network/soup/cache/soup-directory-input-stream.c:127: warning: comparison between signed and unsigned integer expressions
../../WebCore/platform/network/soup/cache/soup-request-data.c:82: warning: comparison between signed and unsigned integer expressions
../../WebCore/platform/network/soup/cache/soup-request-data.c:82: warning: signed and unsigned type in conditional expression
../../WebCore/platform/network/soup/cache/soup-request-file.c:171: warning: implicit declaration of function ‘soup_uri_get_path’
../../WebCore/platform/network/soup/cache/soup-request-file.c:171: warning: initialization makes pointer from integer without a cast
../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:271: warning: signed and unsigned type in conditional expression
../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:1347: warning: comparison between signed and unsigned integer expressions
../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:1366: warning: comparison between signed and unsigned integer expressions
Comment 124 Martin Robinson 2010-10-13 18:08:25 PDT
(In reply to comment #123)
> http://trac.webkit.org/changeset/69718 seems to have broken Gtk/Qt release builds:


The issue here seems be that the Release bots do not have a new enough version of libsoup to have soup_uri_get_path. Here's the failure line:

make[1]: Entering directory `/var/lib/buildbot/build/gtk-linux-32-release/build/WebKitBuild/Release'
  CC     WebKit/gtk/tests/Programs_unittests_testdomdocument-testdomdocument.o
  CCLD   Programs/unittests/testdomdocument
./.libs/libwebkitgtk-1.0.so: undefined reference to `soup_uri_get_path'


> ../../WebCore/platform/network/soup/cache/soup-directory-input-stream.c:114: warning: comparison between signed and unsigned integer expressions
> ../../WebCore/platform/network/soup/cache/soup-directory-input-stream.c:127: warning: comparison between signed and unsigned integer expressions
> ../../WebCore/platform/network/soup/cache/soup-request-data.c:82: warning: comparison between signed and unsigned integer expressions
> ../../WebCore/platform/network/soup/cache/soup-request-data.c:82: warning: signed and unsigned type in conditional expression
> ../../WebCore/platform/network/soup/cache/soup-request-file.c:171: warning: implicit declaration of function ‘soup_uri_get_path’
> ../../WebCore/platform/network/soup/cache/soup-request-file.c:171: warning: initialization makes pointer from integer without a cast
> ../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:271: warning: signed and unsigned type in conditional expression
> ../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:1347: warning: comparison between signed and unsigned integer expressions
> ../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:1366: warning: comparison between signed and unsigned integer expressions

We should also fix these warnings before we try to land this patch again.
Comment 125 Xan Lopez 2010-10-13 18:15:15 PDT
(In reply to comment #124)
> (In reply to comment #123)
> > http://trac.webkit.org/changeset/69718 seems to have broken Gtk/Qt release builds:
> 
> 
> The issue here seems be that the Release bots do not have a new enough version of libsoup to have soup_uri_get_path. Here's the failure line:
> 
> make[1]: Entering directory `/var/lib/buildbot/build/gtk-linux-32-release/build/WebKitBuild/Release'
>   CC     WebKit/gtk/tests/Programs_unittests_testdomdocument-testdomdocument.o
>   CCLD   Programs/unittests/testdomdocument
> ./.libs/libwebkitgtk-1.0.so: undefined reference to `soup_uri_get_path'

Fix it in tree by accessing path directly (uri->path), the struct is public. The getter was added in 2.32 only for the benefit of the bindings AFAIK.
Comment 126 Martin Robinson 2010-10-13 18:36:55 PDT
(In reply to comment #125)
> Fix it in tree by accessing path directly (uri->path), the struct is public. The getter was added in 2.32 only for the benefit of the bindings AFAIK.

Okay. Fixed in the tree, but we really should fix those warnings.
Comment 127 Martin Robinson 2010-10-13 23:35:08 PDT
Committed r69742: <http://trac.webkit.org/changeset/69742>
Comment 128 Hayato Ito 2010-10-14 01:05:55 PDT
http://trac.webkit.org/changeset/69742 seems to have broken Gtk builds:

http://build.webkit.org/builders/GTK%20Linux%2032-bit%20Release/builds/6599/steps/compile-webkit/logs

../../WebCore/platform/network/soup/ResourceHandleSoup.cpp: In function ‘void WebCore::sendRequestCallback(GObject*, GAsyncResult*, void*)’:
../../WebCore/platform/network/soup/ResourceHandleSoup.cpp:495: error: ‘soupMessage’ was not declared in this scope
make[1]: *** [WebCore/platform/network/soup/libwebkitgtk_1_0_la-ResourceHandleSoup.lo] Error 1
make[1]: Leaving directory `/var/lib/buildbot/build/gtk-linux-32-release/build/WebKitBuild/Release'
make: *** [all] Error 2

(In reply to comment #127)
> Committed r69742: <http://trac.webkit.org/changeset/69742>
Comment 129 Alejandro G. Castro 2010-10-14 02:09:30 PDT
(In reply to comment #128)
> http://trac.webkit.org/changeset/69742 seems to have broken Gtk builds:
> 
> http://build.webkit.org/builders/GTK%20Linux%2032-bit%20Release/builds/6599/steps/compile-webkit/logs
> 
> ../../WebCore/platform/network/soup/ResourceHandleSoup.cpp: In function ‘void WebCore::sendRequestCallback(GObject*, GAsyncResult*, void*)’:
> ../../WebCore/platform/network/soup/ResourceHandleSoup.cpp:495: error: ‘soupMessage’ was not declared in this scope
> make[1]: *** [WebCore/platform/network/soup/libwebkitgtk_1_0_la-ResourceHandleSoup.lo] Error 1
> make[1]: Leaving directory `/var/lib/buildbot/build/gtk-linux-32-release/build/WebKitBuild/Release'
> make: *** [all] Error 2
> 
> (In reply to comment #127)
> > Committed r69742: <http://trac.webkit.org/changeset/69742>

Fixed here, thanks for the report:

http://trac.webkit.org/changeset/69748