This bug comes from https://bugzilla.gnome.org/show_bug.cgi?id=523100. Summarizing, a cache for libsoup was implemented using the SoupURILoader stuff that is under development. Dan Winship considered that the SoupURILoader stuff is not ready for trunk https://bugzilla.gnome.org/show_bug.cgi?id=523100#c27. That's why he suggested to move all the cache stuff to webkitgtk while the libsoup stuff does not land upstream. Since now it's purely a webkitgtk+ issue it's better to manage it here.
Created attachment 64849 [details] HTTP cache for WebKitGtk+ Basically this patch is made of: * changes to ResourceHandleSoup already reviewed in bug 40222 * libsoup's URI loader code discussed here https://bugzilla.gnome.org/show_bug.cgi?id=557777 * caching support https://bugzilla.gnome.org/show_bug.cgi?id=523100 Once applied to current trunk all the tests work fine except these two: * http/test/history/back-to-post.php: recently changed by http://trac.webkit.org/changeset/65340. Need to review what happens * fast/encoding/percent-escaping.html: as stated here https://bugzilla.gnome.org/show_bug.cgi?id=523100#c28 it needs some changes in libsoup to work properly, as libsoup incorrectly escapes file URL's with '%'
Comment on attachment 64849 [details] HTTP cache for WebKitGtk+ This patch is huge! I'd really like to see it split up into at least two parts. 1. The code drop from Soup, which should probably be separated somehow from the rest of our source, so we can treat it specially (it will go away eventually, right?). 2. The bits that actually implement the cache in WebCore/WebKit. That being said, I have a few comments on what I presume to be the second part. > + GRefPtr<WebKitSoupRequest> m_req; > + GRefPtr<WebKitSoupRequester> m_requester; Please do not abbreviate member names. This could be m_request or, heck, even m_soupRequest. Someone new will look at the code and know exactly what it is. The indentation is a little funky here as well. > #include "SharedBuffer.h" > #include "TextEncoding.h" > > +#include "webkitsouprequesthttp.h" > + > #include <errno.h> > #include <fcntl.h> > #include <gio/gio.h> Just remove all spaces in the list of includes per the style guidelines. There is no need to separate them by include style. > + if (m_req) { > + g_object_set_data(G_OBJECT(m_req.get()), "webkit-resource", 0); > + m_req.clear(); There's no chance that this may be the last reference to m_req here? > + GOwnPtr<char> uri(soup_uri_to_string(soup_message_get_uri(msg), false)); > + String location = String(uri.get()); Even though URL are generally URL-encoded, I don't know if we can rely on that here. You should use String::fromUTF8. > +#define BUFFER_SIZE 8192 Why not move this to the top and make the name more descriptive? INPUT_STREAM_BUFFER_SIZE or something like that. > + GInputStream* in = webkitSoupRequestSend(d->m_req.get(), 0, &error.outPtr()); This should be a full-blown variable name like inputStream. > - d->m_idleHandler = g_timeout_add(0, parseDataUrl, handle); > + d->m_idleHandler = g_idle_add(parseDataUrl, handle); Why did you switch this to an idle? I think Gustavo is trying to move away from that because they are often starved. Maybe ping him about this. > g_object_set(session, > SOUP_SESSION_MAX_CONNS, MAX_CONNECTIONS, > SOUP_SESSION_MAX_CONNS_PER_HOST, MAX_CONNECTIONS_PER_HOST, > - NULL); > + 0); The style bot should not complain about this, because the missing NULL causes a compiler warning. Please just use NULL here. > + > + if (d->m_cancellable) { > + g_object_unref(d->m_cancellable); > + d->m_cancellable = 0; > + } m_cancellable can probably be a GRefPtr in another patch. > + if (d->m_inputStream) { > + g_object_set_data(G_OBJECT(d->m_inputStream), "webkit-resource", 0); > + g_object_unref(d->m_inputStream); > + d->m_inputStream = 0; > + } Ditto for m_inputStream. > + > + if (d->m_buffer) { > + g_free(d->m_buffer); > + d->m_buffer = 0; > + } Ditto for m_buffer, except GOwnPtr. > + GInputStream* in = webkitSoupRequestSendFinish(d->m_req.get(), res, &error.outPtr()); inputStream again. :) > + > + if (error) { > + if (d->m_msg && SOUP_STATUS_IS_TRANSPORT_ERROR(d->m_msg->status_code)) { > + SoupURI* uri = webkitSoupRequestGetUri(d->m_req.get()); > + GOwnPtr<char> uriStr(soup_uri_to_string(uri, false)); No need to abbreviate the variable name here. uriString is fine. > > - // Used to set the authentication dialog toplevel; may be NULL > + // Used to set the authentication dialog toplevel; may be 0 I think "null" here is clearer and probably won't upset the style gods. > + // We have to convert it to UTF-16 due to limitations in KURL > + GOwnPtr<gunichar2> unicodeStr(g_utf8_to_utf16(d->m_buffer, bytesRead, 0, 0, 0)); > + client->didReceiveData(handle.get(), reinterpret_cast<const char*>(unicodeStr.get()), g_utf8_strlen(d->m_buffer, bytesRead) * sizeof(UChar), 0); Okay, again this is kind of a nit, but unicodeStr is not entirely precise. It should probably be utf16String. g_utf8_strlen(d->m_buffer, bytesRead) * sizeof(UChar) seems unsafe against encoding errors and is a little clunky. I think it's better to do something like this: long wordsWritten; GOwnPtr<gunichar2> utf16String(g_utf8_to_utf16(d->m_buffer, bytesRead, 0, &wordsWritten, 0)); client->didReceiveData(handle.get(), reinterpret_cast<const char*>(utf16String()), wordsWritten * sizeof(UChar), 0); > + g_object_set(soupMessage, SOUP_MESSAGE_METHOD, httpMethod().utf8().data(), 0); NULL here instead of 0.
(In reply to comment #2) > (From update of attachment 64849 [details]) > This patch is huge! I'd really like to see it split up into at least > two parts. > > 1. The code drop from Soup, which should probably be separated somehow from > the rest of our source, so we can treat it specially (it will go away eventually, right?). > 2. The bits that actually implement the cache in WebCore/WebKit. Thx for reviewing Martin. Well basically the patch consists of: 1) changes in ResourceHandleSoup/ResourceHandleInternal: basically adaptations to work with the URILoader stuff 2) WebCore/platform/network/soup/webkitsoup*: all the stuff moved from webkit 3) WebKit/Wekit/gtk/webkit/webkitsoupcache.[c|h]: the webkitgtk+ API for clients Now some comments: > > + if (m_req) { > > + g_object_set_data(G_OBJECT(m_req.get()), "webkit-resource", 0); > > + m_req.clear(); > > There's no chance that this may be the last reference to m_req here? What's the problem then? > > + GOwnPtr<char> uri(soup_uri_to_string(soup_message_get_uri(msg), false)); > > + String location = String(uri.get()); > > Even though URL are generally URL-encoded, I don't know if we can rely on that > here. You should use String::fromUTF8. soup_uri_to_string always returns an URL encoded URL, do you think we still need to use ::fromUTF8 ? > > - d->m_idleHandler = g_timeout_add(0, parseDataUrl, handle); > > + d->m_idleHandler = g_idle_add(parseDataUrl, handle); > > Why did you switch this to an idle? I think Gustavo is trying to move > away from that because they are often starved. Maybe ping him about this. As stated here https://bugs.webkit.org/show_bug.cgi?id=40222#c11 there was 1 test that fails with timeout but not with idle. After a long time debugging it, it seems that it's just a matter of timing. All the signals and events are emitted properly but somehow in a different order depending on the function used.
> > > + if (m_req) { > > > + g_object_set_data(G_OBJECT(m_req.get()), "webkit-resource", 0); > > > + m_req.clear(); > > > > There's no chance that this may be the last reference to m_req here? > What's the problem then? Sorry, misread this one! Looks fine actually. > > here. You should use String::fromUTF8. > soup_uri_to_string always returns an URL encoded URL, do you think we still need to use ::fromUTF8 ? Yes. Since GLib strings are UTF-8, you should still use the proper String constructor. For instance, do you know what Soup's behavior is with IDNs? > As stated here https://bugs.webkit.org/show_bug.cgi?id=40222#c11 > there was 1 test that fails with timeout but not with idle. After > a long time debugging it, it seems that it's just a matter of timing. > All the signals and events are emitted properly but somehow in a > different order depending on the function used. Interesting. I'm convinced, but I'd still like to get Gustavo's opinion on it. He may have some further insights.
(In reply to comment #4) > > > here. You should use String::fromUTF8. > > soup_uri_to_string always returns an URL encoded URL, do you think we still need to use ::fromUTF8 ? > > Yes. Since GLib strings are UTF-8, you should still use the proper String constructor. For instance, do you know what Soup's behavior is with IDNs? OK will fix that. Regarding IDNs there is no specific code about them in libsoup AFAIK > > As stated here https://bugs.webkit.org/show_bug.cgi?id=40222#c11 > > there was 1 test that fails with timeout but not with idle. After > > a long time debugging it, it seems that it's just a matter of timing. > > All the signals and events are emitted properly but somehow in a > > different order depending on the function used. > > Interesting. I'm convinced, but I'd still like to get Gustavo's opinion > on it. He may have some further insights. I don't want to convince you :). Just sharing my conclussions after some investigation.
Gently added kov and xan to the Cc
Created attachment 65135 [details] SoupURILoader stuff as part of webkitgtk+ core network This patch adds the objects used by SoupURILoader branch of libsoup. This code is required for the HTTP cache implementation. WebKit code style has been used when moving code to WebKit.
Created attachment 65136 [details] Changes to WebKitGtk+ core network stuff to use the new API from the imported SoupURILoader code See also https://bugs.webkit.org/show_bug.cgi?id=40222
Created attachment 65137 [details] New public API to enable clients to use the HTTP cache Public API for the cache
Patch splitted as suggested by Martin
Great, I'll take a look at these later today, is there any specific reason why you did not mark the patches for review, though?
(In reply to comment #11) > Great, I'll take a look at these later today, is there any specific reason why you did not mark the patches for review, though? Cool! if you can review them Regarding the r?, I always forget :/
Attachment 65135 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1 Last 3072 characters of output: ty/naming] [4] WebCore/platform/network/soup/webkitsouprequestdata.cpp:178: request_data_class is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequestdata.cpp:180: object_class is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequestdata.cpp:181: request_class is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequestfile.cpp:47: webkit_soup_request_file_init is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequestfile.cpp:54: webkit_soup_request_file_finalize is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequestfile.cpp:268: webkit_soup_request_file_class_init is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequest.h:57: parent_class is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequest.h:65: Tab found; better to use spaces [whitespace/tab] [1] WebCore/platform/network/soup/webkitsouprequest.h:66: Tab found; better to use spaces [whitespace/tab] [1] WebCore/platform/network/soup/webkitsouprequest.h:67: Tab found; better to use spaces [whitespace/tab] [1] WebCore/platform/network/soup/webkitsouprequest.h:76: webkit_soup_request_get_type is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequester.cpp:39: webkit_soup_requester_init is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequester.cpp:57: webkit_soup_requester_class_init is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequester.cpp:57: requester_class is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequester.cpp:59: object_class is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequester.cpp:67: init_request_types is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequester.cpp:93: webkit_soup_requester_request is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] WebCore/platform/network/soup/webkitsouprequester.cpp:109: webkit_soup_requester_request_uri is incorrectly named. Don't use underscores in your identifier names. [readability/naming] [4] Total errors found: 934 in 19 files If any of these errors are false positives, please file a bug against check-webkit-style.
Attachment 65136 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1 WebCore/platform/network/ResourceHandleInternal.h:51: Alphabetical sorting problem. [build/include_order] [4] WebCore/platform/network/ResourceHandleInternal.h:210: Tab found; better to use spaces [whitespace/tab] [1] Total errors found: 2 in 4 files If any of these errors are false positives, please file a bug against check-webkit-style.
Attachment 65137 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1 Last 3072 characters of output: kitsoupcache.cpp:107: One line control clauses should not use braces. [whitespace/braces] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:108: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:112: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:114: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:114: Declaration has space between type name and * in GObjectClass *gobject_class [whitespace/declaration] [3] WebKit/gtk/webkit/webkitsoupcache.cpp:116: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:118: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:118: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:133: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:136: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:141: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:157: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:159: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:159: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:170: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:172: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:172: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:176: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:178: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:178: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:182: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:184: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:184: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:188: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:189: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:191: Tab found; better to use spaces [whitespace/tab] [1] WebKit/gtk/webkit/webkitsoupcache.cpp:191: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:195: Extra space before ( in function call [whitespace/parens] [4] WebKit/gtk/webkit/webkitsoupcache.cpp:197: Tab found; better to use spaces [whitespace/tab] [1] Total errors found: 111 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Attachment 65135 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3757605
Attachment 65136 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3725601
Attachment 65137 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3751564
Comment on attachment 65135 [details] SoupURILoader stuff as part of webkitgtk+ core network I have three main concerns: 1. I believe we should import the code without style changes into a subdirectory; I know you have probably been told to change it into webkit's style, but I think it would make backporting fixes into soup's version harder than needed - and it's not something we plan to keep in the tree for long, anyway 2. do we really need to import the TPL one-file library? I guess the answer is yes (though we could use GVariant?), but then we probably need to figure out a way of isolating it from the rest of the build as the build failure suggests it'll be problematic =( 3. There's API mixed in there (like webkit_soup_http_input_stream_new/webkitSoupHttpInputStreamNew) - I think I would prefer having this code go to WebKit/soup/; the API should be kept in soup/GTK+-style even if we decide to move the rest (which is one more argument in favor of keeping the style)
I'm also in favor of keeping this in the original style, in the interest of pushing fixes back into your soup branch. I'd love to see this code isolated into Webkit/gtk/soup. svillar, can you comment on the performance characteristics of the code which uses tpl? It may make more sense to use GVariant or sqlite, depending on what it does.
(In reply to comment #20) > I'm also in favor of keeping this in the original style, in the interest of pushing fixes back into your soup branch. I'd love to see this code isolated into Webkit/gtk/soup. If we are going to do this then let's open a bug right away to make the style bots ignore that bunch of code, otherwise this is going to be annoying. That way we'll also have something to point people to when they come asking why is there a ball of code not following the style inside the source tree (hello Mark! :)). > > svillar, can you comment on the performance characteristics of the code which uses tpl? It may make more sense to use GVariant or sqlite, depending on what it does. I agree with the sentiment. Did you talk with Dan and decided that using TPL made sense here? As suggested GVariant seems to be made exactly for this kind of task, so it would make sense to reuse it here.
Using Xan's comments for the answer the previous reviews (In reply to comment #21) > (In reply to comment #20) > > I'm also in favor of keeping this in the original style, in the interest of pushing fixes back into your soup branch. I'd love to see this code isolated into Webkit/gtk/soup. > > If we are going to do this then let's open a bug right away to make the style bots ignore that bunch of code, otherwise this is going to be annoying. That way we'll also have something to point people to when they come asking why is there a ball of code not following the style inside the source tree (hello Mark! :)). Not really sure how to make bots ignore that, could you help on that? Provided you all are in favour of keeping the style do you think it still makes sense to add a "webkit_" prefix to all the symbols? > > > > svillar, can you comment on the performance characteristics of the code which uses tpl? It may make more sense to use GVariant or sqlite, depending on what it does. > > I agree with the sentiment. Did you talk with Dan and decided that using TPL made sense here? As suggested GVariant seems to be made exactly for this kind of task, so it would make sense to reuse it here. Regarding why I chose tpl, there are some reasons: * I had previously used it for a personal project * The API is quite compact, easy and understandable * No external dependencies needed * It's supposed to be fast Most of them are likely valid for the GVariant stuff, I guess I was mostly biased by the first one. I don't know about performance using GVariant, but if it's acceptable I agree it makes sense to use it.
To make the bots ignore the path you need to add the path, + information about which rules are allowed to be broken on that path to WebKit/WebKitTools/Scripts/webkitpy/style/checker.py. If we agree on the path I can make the patch. WebCore/platform/network/soup/cache for the internal files and WebKit/gtk/soup for the API? Or should we go with WebKit/gtk/soup/ for everything, making this gtk-only, essentially? As for the webkit_ prefix, I think it makes sense to add the prefix to public API, so it won't clash with the proper soup symbols in the future. TPL/GVariant, I'd like to go with GVariant. TPL does bring _1_ additional dependency, and that is the actual TPL files, which you then have to keep up-to-date yourself. GVariant on the other hand is a standard part of our platform, and will be maintained by the glib maintainers =).
(In reply to comment #23) > If we agree on the path I can make the patch. WebCore/platform/network/soup/cache for the internal files and WebKit/gtk/soup for the API? Or should we go with WebKit/gtk/soup/ for everything, making this gtk-only, essentially? Please don't make it gtk-only putting files into WebKit/gtk/soup/. Also EFL port is using soup as network backend, and this can be nice feature for EFL also.
> Please don't make it gtk-only putting files into WebKit/gtk/soup/. Also EFL port is using soup as network backend, and this can be nice feature for EFL also. That is why I initially suggested WebKit/soup/. I think for simplicity's sake we should use WebKit/soup for all files.
(In reply to comment #25) > > Please don't make it gtk-only putting files into WebKit/gtk/soup/. Also EFL port is using soup as network backend, and this can be nice feature for EFL also. > That is why I initially suggested WebKit/soup/. I think for simplicity's sake we should use WebKit/soup for all files. I can see why this is a good idea now. Here's another vote for WebKit/soup.
(In reply to comment #23) > To make the bots ignore the path you need to add the path, + information about which rules are allowed to be broken on that path to WebKit/WebKitTools/Scripts/webkitpy/style/checker.py. If we agree on the path I can make the patch. WebCore/platform/network/soup/cache for the internal files and WebKit/gtk/soup for the API? Or should we go with WebKit/gtk/soup/ for everything, making this gtk-only, essentially? I like the former. Basically because most of stuff is only needed by the ResourceHandleSoup. That's why having something under WebCore and the public API under WebKit makes a lot of sense to me. Regarding everything under WebKit/soup, I thought that being WebCore objects they should be under WebCore/ but I could be obviously wrong. > As for the webkit_ prefix, I think it makes sense to add the prefix to public API, so it won't clash with the proper soup symbols in the future. Agree. > TPL/GVariant, I'd like to go with GVariant. TPL does bring _1_ additional dependency, and that is the actual TPL files, which you then have to keep up-to-date yourself. GVariant on the other hand is a standard part of our platform, and will be maintained by the glib maintainers =). Yep, I'm moving the serialization/persistence stuff to GVariant and removing the tpl dependency.
Some extra information regarding the GVariant/tpl stuff. I coded a couple of serialization/deserialization routines both using tpl and GVariant in order to evaluate the performance. For the test the data that I wanted to serialize was an array of structures with a uint32, a boolean and 2 strings (the amount of data in the cache is actually higher but both systems claim to be O(n)). Then I registered the time for serializing/deserializing arrays of different sizes. The first size I chose was 5000 elements. Basically because if we take a typical cache size of 50MB and an average file size of 10k then that means 5000 cache entries. The other test I did was with 10 times that size (50000 elements), for quite extreme cases. For the case of tpl I implemented it both using fixed-length arrays and variable-length arrays. These are the numbers I got (average values after 10 runs): Serialization ************* | tpl (fixed) | tpl | gvariant | ------------------------------------ 5000 | 0.02456 | 0.02606 | 0.1098 | 50000 | 0.16678 | 0.23968 | 1.0829 | Deserialization *************** | tpl (fixed) | tpl | gvariant | ------------------------------------ 5000 | 0.02456 | 0.02606 | 0.1098 | 50000 | 0.16678 | 0.23968 | 1.0829 | The most important numbers are the deserialization ones because that happens when the applications start while the serialization is typically performed on application exit. It's pretty clear that tpl (even using the not-so-fast variable length arrays is way faster (~85%) than gvariant. But if we consider that 5000 is an accurate figure then the difference in time is less than 1/10 sec. Maybe a figure in between the two I chose fits better with actual numbers. In that case application start will wait during 1/2 second for cache to start. That can be partially avoided if we perform the deserialization in a different thread keeping the cache in a not_available state. Deserialization will most likely finish before the user could even start typing a URL (home page load could be affected by the concurrent deserialization BTW).
Created attachment 65567 [details] patch for check-webkit-style to ignore our new stuff
Attachment 65567 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1 WebKitTools/Scripts/webkitpy/style/checker.py:218: whitespace before ']' [pep8/E202] [5] Total errors found: 1 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 65567 [details] patch for check-webkit-style to ignore our new stuff The style error seems like a false positive, so this looks good to me.
Comment on attachment 65567 [details] patch for check-webkit-style to ignore our new stuff Landed as r66120. Turns out that the python style rules require a , after the last item of a list (the original code had it), so I added it, and the style checker is happy with itself again =)
Created attachment 65900 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache * Added files from libsoup with SoupURILoader stuff * New API that adds WebKitSoupCache
Created attachment 65901 [details] Changes to WebKitGtk+ core network stuff to use the new API from the imported SoupURILoader code WebKitGtk+ using SoupURILoader
Attachment 65900 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3850142
Attachment 65901 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3877142
Comment on attachment 65900 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache This will still need a changelog entry, and the build breakage implies we need to either depend on a newer glib or factor this code out if building with an oldish glib. There's a check in place to decide whether to use GSettings. I'd recommend adding a check specifically for GVariant, to decide whether to build this new code or not. I won't bother reviewing the actual code, except for the public APIs. The identation in soup-cache.c is completely broken (no spaces at all in various places, 4-space indentation in others), I think you may want to run indent in all soup cache files to have them match the libsoup style.
Actually, I think trying to make this buildable with older glibs is going to pull us into hell, so I would go with making us depend on the earliest GVariant-providing glib.
(In reply to comment #38) > Actually, I think trying to make this buildable with older glibs is going to pull us into hell, so I would go with making us depend on the earliest GVariant-providing glib. Agree. I'll add changelog's and update the glib version
Created attachment 66189 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache Now with ChangeLogs. Files imported from libsoup have now the proper libsoup coding style
Created attachment 66190 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code g_timeout_add is now back in place after replacing g_idle_add. BTW the test that was working only with g_idle_add is now working with the timeout_add too.
Created attachment 66191 [details] Added fast/encoding/percent-escaping.html to the list of Skipped That test does not work with the new SoupURILoader stuff. See https://bugzilla.gnome.org/show_bug.cgi?id=523100#c28 for further details. A small patch to libsoup will fix it.
Attachment 66190 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3921019
Comment on attachment 65901 [details] Changes to WebKitGtk+ core network stuff to use the new API from the imported SoupURILoader code Marked as obsolete
Created attachment 66650 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache A few changes since the last version of the patch: * glib dependency is now 2.24 * removed several data from cache that is not really needed (as it was saved anyway in the http headers)
Attachment 66650 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3914232
Comment on attachment 66190 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code This patch is looking good. I'm excited for this to land. View in context: https://bugs.webkit.org/attachment.cgi?id=66190&action=prettypatch > WebCore/platform/network/soup/ResourceHandleSoup.cpp:138 > + m_soupRequest.clear(); m_soupRequest.clear(); is unecessary here, I believe. The destructor will take care of decrementing the smart pointer reference count. If this is needed otherwise there should really be a comment explaining why. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:196 > + String location = String(uri.get()); Again, I believe this should be ::fromUTF8(...). > WebCore/platform/network/soup/ResourceHandleSoup.cpp:305 > +#define BUFFER_SIZE 8192 If this is a constant, why do we need to store m_bufferSize? Let's just get rid of that member if possible. Again, I think this should be called READ_BUFFER_SIZE and be at the top of the file too. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:689 > + sendRequestCallback, 0); This newline seems unecessary. I usually split the line when it goes over 120 characters. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:846 > + client->didReceiveData(handle.get(), reinterpret_cast<const char*>(unicodeStr.get()), g_utf8_strlen(d->m_buffer, bytesRead) * sizeof(UChar), 0); Would it be possible to simply use WTFString here? String data = String::fromUTF8(d->m_buffer, bytesRead); client->didReceiveData(handle.get(), reinterpret_cast<const char*>(data.characters()), data.length() * sizeof(UChar), 0); > WebCore/platform/network/soup/ResourceHandleSoup.cpp:867 > + GOwnPtr<char> urlStr; Please use CString here, if possible. It was made just for this purpose. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:901 > + Please remove the newline.
Created attachment 66692 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code New version with fixes to the issues detected by Martin
Attachment 66692 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3938283
Comment on attachment 66650 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache View in context: https://bugs.webkit.org/attachment.cgi?id=66650&action=prettypatch My major concern with this patch is exposing a public API that will not be compatible with some further Soup version. Is it possible to keep this API private for now? > GNUmakefile.am:325 > Please keep this list in alphabetical order. > WebCore/GNUmakefile.am:2575 > WebCore/plugins/gtk/PluginDataGtk.cpp \ Please keep the source lists in alphabetical order. I've been ordered them by filtering them through sort.
(In reply to comment #50) > (From update of attachment 66650 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=66650&action=prettypatch > > My major concern with this patch is exposing a public API that will not be compatible with some further Soup version. Is it possible to keep this API private for now? The applications that uses WebKitGTK+ will need that API to setup the caching, so we can't make it private. That concern is why we are adding the WebKit prefix to the API bits, we should be fine.
(In reply to comment #50) > (From update of attachment 66650 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=66650&action=prettypatch > > My major concern with this patch is exposing a public API that will not be compatible with some further Soup version. Is it possible to keep this API private for now? As Gustavo said it's a must for clients. > > GNUmakefile.am:325 > > > Please keep this list in alphabetical order. It's currently not properly sorted, do you think it makes sense to sort them in this patch?
Comment on attachment 66692 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Removing the review request as I found a bug that could lead to a crash
Created attachment 67254 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache Fixed minor issues with sorting reported by Martin. Apart from that several changes in cache *implementation* (no changes in pure WebKitGtk+ code or API) * Fixed crash caused by multiple free * Fixed memory leaks * LRU is now properly updated * Cache resources properly freed on finalize * Decreased cache's threads priority to priorize libsoup's and webkit's * GVariant does not like non-UTF8 strings. Validate them before serializing. I won't be available next week so don't expect comments from my side till next Saturday at least :-)
Comment on attachment 66692 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Added missing r=?
Attachment 67254 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3897376
Comment on attachment 66191 [details] Added fast/encoding/percent-escaping.html to the list of Skipped View in context: https://bugs.webkit.org/attachment.cgi?id=66191&action=prettypatch > LayoutTests/platform/gtk/Skipped:1138 > +fast/encoding/percent-escaping.html I think I'd rather see a work-around in our code than a regression. We should chat about this sometime.
As in WebKit-EFL we want to use your cache implementation could you tell me if it is already finished or are there any features that need to be implemented?
(In reply to comment #57) > (From update of attachment 66191 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=66191&action=prettypatch > > > LayoutTests/platform/gtk/Skipped:1138 > > +fast/encoding/percent-escaping.html > > I think I'd rather see a work-around in our code than a regression. We should chat about this sometime. Unfortunatelly there is no work-around possible from our side without breaking all the new URI loader way of getting resources because the problem is how libsoup handles file: URLs. The fix should be added to libsoup, the patch is really small (see https://bugzilla.gnome.org/show_bug.cgi?id=523100#c28 for more details). I can talk with danw about that before pushing this upstream.
(In reply to comment #58) > As in WebKit-EFL we want to use your cache implementation could you tell me if it is already finished or are there any features that need to be implemented? It's finished but under review process, hopefully will land soon.
(In reply to comment #59) > (In reply to comment #57) > > (From update of attachment 66191 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=66191&action=prettypatch > > > > > LayoutTests/platform/gtk/Skipped:1138 > > > +fast/encoding/percent-escaping.html > > > > I think I'd rather see a work-around in our code than a regression. We should chat about this sometime. > > Unfortunatelly there is no work-around possible from our side without breaking all the new URI loader way of getting resources because the problem is how libsoup handles file: URLs. The fix should be added to libsoup, the patch is really small (see https://bugzilla.gnome.org/show_bug.cgi?id=523100#c28 for more details). I can talk with danw about that before pushing this upstream. Actually the problem is not with the percent signs but with the blank spaces in the filename. Soup will always try to convert them into "%20" because otherwise is won't be a valid URL. So even if you encode them before passing the url to libsoup you'll be generating a different filename and thus the test will fail. I think the best I can do is to have a little chat about this issue with danw and try to get the libsoup patch upstream.
(In reply to comment #61) > Actually the problem is not with the percent signs but with the blank spaces in the filename. Soup will always try to convert them into "%20" because otherwise is won't be a valid URL. So even if you encode them before passing the url to libsoup you'll be generating a different filename and thus the test will fail. I think the best I can do is to have a little chat about this issue with danw and try to get the libsoup patch upstream. So does this mean filenames with spaces will be inaccessible? That definitely seems like a blocker for merging this.
Comment on attachment 66191 [details] Added fast/encoding/percent-escaping.html to the list of Skipped Removing as the test is working with the newest changes to be uploaded
Created attachment 68349 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache New version of the patch. Added fix for the test that started to fail
Created attachment 68350 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code New version of the patch. Rebased against latest changes to the master
Comment on attachment 66692 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Marking this as obsolete, forgot to do it before
Attachment 68349 [details] did not build on gtk: Build output: http://queues.webkit.org/results/3996101
Attachment 68350 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4092036
Comment on attachment 68349 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache Still one issue remaining. Removing the r? temporarily
Created attachment 68505 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache This is the right version of the (hopefully) final patch. I managed to fix the issue with the percent-escaping.html test that was failing and the good news is that the fix is part of the imported soup uri loader code. That way no special code is needed by current WebKitGtk+ code, and if it eventually lands on libsoup, every client will take advantage of that.
Attachment 68505 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4070098
Comment on attachment 68350 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code View in context: https://bugs.webkit.org/attachment.cgi?id=68350&action=review Looks good! I think we should merge this soon. I just have a few style nits. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:128 > +static void cleanupSoupRequestOperation(ResourceHandle* handle, bool isDestroying); > +static void sendRequestCallback(GObject* source, GAsyncResult* res, gpointer); > +static void readCallback(GObject* source, GAsyncResult* res, gpointer); > +static void closeCallback(GObject* source, GAsyncResult* res, gpointer); I think all of these parameter names are unnecessary except for isDestroying. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:381 > + handle->ref(); This really needs a comment describing where the balancing deref() is. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:386 > + g_input_stream_read_async(d->m_inputStream, d->m_buffer, READ_BUFFER_SIZE, > + G_PRIORITY_DEFAULT, d->m_cancellable, > + readCallback, (isBase64) ? 0 : GINT_TO_POINTER(1)); Perhaps smoosh this down to two lines in the interest of vertical space? > WebCore/platform/network/soup/ResourceHandleSoup.cpp:403 > + This blank line seems accidental. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:501 > + ResourceError resourceError(g_quark_to_string(SOUP_HTTP_ERROR), > + d->m_msg->status_code, > + uriStr.get(), > + String::fromUTF8(d->m_msg->reason_phrase)); Once again, smoosh this down to two lines. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:513 > + ResourceError resourceError(g_quark_to_string(G_IO_ERROR), > + error->code, > + uriStr.get(), > + error ? String::fromUTF8(error->message) : String()); Ditto. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:574 > + g_input_stream_read_async(d->m_inputStream, d->m_buffer, READ_BUFFER_SIZE, > + G_PRIORITY_DEFAULT, d->m_cancellable, > + readCallback, 0); Ditto. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:824 > ResourceError resourceError(g_quark_to_string(G_IO_ERROR), > error->code, > - uri, > + uriStr.get(), > error ? String::fromUTF8(error->message) : String()); Ditto. > WebCore/platform/network/soup/ResourceRequest.h:70 > + void updateSoupMessage(SoupMessage* soupMessage) const; Please remove the parameter name.
Created attachment 68664 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code New version with Martin's style fix requests
Attachment 68664 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4068122
Created attachment 68717 [details] patches to Sergio's patches OK, the reason SoupRequester needed a special case to get file URIs working right was because there was already an incorrect special case for file URIs in ResourceHandleSoup::startGio() that was messing the URL up before it got to SoupRequester. So this patch removes both special cases (I didn't bother attaching it in the form of a git commit since it actually needs to be split between both of Sergio's existing patches.). Additionally, to deal with the problem the startGio() hack was originally attempting to work around, we need to make soup_uri_decode() accept badly-%-encoded URIs rather than returning NULL (which is something we'd already discussed in gnomebug 523100 anyway). There is now a patch for that in https://bugzilla.gnome.org/show_bug.cgi?id=630540 but it's not yet committed because we're in hard code freeze for 2.32.
(In reply to comment #75) > OK, the reason SoupRequester needed a special case to get file URIs working right was because there was already an incorrect special case for file URIs in ResourceHandleSoup::startGio() that was messing the URL up before it got to SoupRequester. So this patch removes both special cases (I didn't bother attaching it in the form of a git commit since it actually needs to be split between both of Sergio's existing patches.). Thank you! Sergio will just need to merge these fixes into his patches before the final r+. > Additionally, to deal with the problem the startGio() hack was originally attempting to work around, we need to make soup_uri_decode() accept badly-%-encoded URIs rather than returning NULL (which is something we'd already discussed in gnomebug 523100 anyway). There is now a patch for that in https://bugzilla.gnome.org/show_bug.cgi?id=630540 but it's not yet committed because we're in hard code freeze for 2.32. So, if I understand correctly, to avoid a regression we'll either need to: 1. Re-implement this logic in WebKit GTK+ for older soup versions. 2. Wait to land this until you release a new version of soup with this fix and then depend on that.
I haven't looked at the code in detail, but I think you probably want to rename all of the soup-prefixed stuff to be webkit-prefixed instead so that it doesn't start causing compile and run-time problems when the code lands in libsoup. (Even if it's not visible outside of libwebkit, you still have to worry about GType's type database; bad things will happen if libsoup and webkit both try to register a type with the name "SoupRequest".) I have not looked at the tpl-using code or the GVariant-using alternative, but it does seem like using GVariant instead of bringing in tpl would be nice... if there are performance issues presumably GVariant can still be optimized a bunch?
(In reply to comment #76) > So, if I understand correctly, to avoid a regression we'll either need to: > 1. Re-implement this logic in WebKit GTK+ for older soup versions. > 2. Wait to land this until you release a new version of soup with this fix and then depend on that. Yes. For #1, you could make soup-request-file.c use KURL.h's decodeURLEscapeSequences() instead of soup_uri_decode(), assuming it's not too obnoxious to call that from C... For #2, it will be in libsoup 2.32.1. Are you planning to try to get this in for GNOME 2.32.0 still? I'm pretty sure it's too late to bump external dependency versions at this point...
(In reply to comment #75) > Created an attachment (id=68717) [details] > patches to Sergio's patches > > OK, the reason SoupRequester needed a special case to get file URIs working right was because there was already an incorrect special case for file URIs in ResourceHandleSoup::startGio() that was messing the URL up before it got to SoupRequester. So this patch removes both special cases (I didn't bother attaching it in the form of a git commit since it actually needs to be split between both of Sergio's existing patches.). > Well that's not really the reason. The special case in souprequester needs the special case in startGio. One cannot work without the other. With both of them it works fine. > Additionally, to deal with the problem the startGio() hack was originally attempting to work around, we need to make soup_uri_decode() accept badly-%-encoded URIs rather than returning NULL (which is something we'd already discussed in gnomebug 523100 anyway). There is now a patch for that in https://bugzilla.gnome.org/show_bug.cgi?id=630540 but it's not yet committed because we're in hard code freeze for 2.32. That's the real issue IMO. That's why my first thought was to modify the decoding in soup-uri.c although your changes seem indeed better than adding a special case for file: URL's. I also considered removing that fixup parametter but I was not sure about breaking current URI parsing in soup. Note that glib APIs do return NULL when they cannot decode URIs One last question Dan, did you run the tests with both patches?
Comment on attachment 68664 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code View in context: https://bugs.webkit.org/attachment.cgi?id=68664&action=review > WebCore/platform/network/ResourceHandleInternal.h:51 > +#include <GRefPtr.h> Nitpick, but requester.h already includes request.h > WebCore/platform/network/soup/ResourceHandleSoup.cpp:196 > + String location = String::fromUTF8(uri.get()); Can't we just commit this bit now? > WebCore/platform/network/soup/ResourceHandleSoup.cpp:305 > // parseDataUrl() is taken from the CURL http backend. I guess you want to remove this comment now, since the function is wildly different by now (or at least say 'originally' or something). > WebCore/platform/network/soup/ResourceHandleSoup.cpp:387 > GINT_TO_POINTER(!isBase64) ? Could fail? > WebCore/platform/network/soup/ResourceHandleSoup.cpp:453 > + Any reason not to use platform ptrs here too? > WebCore/platform/network/soup/ResourceHandleSoup.cpp:459 > + And here? > WebCore/platform/network/soup/ResourceHandleSoup.cpp:515 > + These two ifs are almost identical, I think they can be trivially merged. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:544 > + d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE)); Am I the only one worried that we malloc 8k for each outgoing HTTP request? or am I missing something here. Is this just making something explicit that was implicit before or we adding an extra copy to the whole process?
Comment on attachment 68664 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code View in context: https://bugs.webkit.org/attachment.cgi?id=68664&action=review >> WebCore/platform/network/ResourceHandleInternal.h:51 >> +#include <GRefPtr.h> > > Nitpick, but requester.h already includes request.h Indeed. Will remove it. >> WebCore/platform/network/soup/ResourceHandleSoup.cpp:196 >> + String location = String::fromUTF8(uri.get()); > > Can't we just commit this bit now? Yes. Martin asked for the change, that's why it's included in this patch >> WebCore/platform/network/soup/ResourceHandleSoup.cpp:305 >> // parseDataUrl() is taken from the CURL http backend. > > I guess you want to remove this comment now, since the function is wildly different by now (or at least say 'originally' or something). True >> WebCore/platform/network/soup/ResourceHandleSoup.cpp:387 >> > > GINT_TO_POINTER(!isBase64) ? Could fail? I just wanted to be ultra-explicit but what you propose would be the same. >> WebCore/platform/network/soup/ResourceHandleSoup.cpp:453 >> + > > Any reason not to use platform ptrs here too? They are not platform ptrs right now. Not sure if that change should be included in this patch. I thought it was a general rule not to include changes not directly related to the issue the patch tries to fix. >> WebCore/platform/network/soup/ResourceHandleSoup.cpp:515 >> + > > These two ifs are almost identical, I think they can be trivially merged. Indeed they can >> WebCore/platform/network/soup/ResourceHandleSoup.cpp:544 >> + d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE)); > > Am I the only one worried that we malloc 8k for each outgoing HTTP request? or am I missing something here. Is this just making something explicit that was implicit before or we adding an extra copy to the whole process? :m really good point here. I think you're right and we're adding an extra copy for http requests when reading from the input stream. With the old code all the didReceiveData calls were using buffers provided by libsoup without any extra copy. I guess we need something better then...
(In reply to comment #80) > > WebCore/platform/network/soup/ResourceHandleSoup.cpp:544 > > + d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE)); > > Am I the only one worried that we malloc 8k for each outgoing HTTP request? or am I missing something here. Is this just making something explicit that was implicit before or we adding an extra copy to the whole process? At the moment, given the kludginess of SoupHTTPInputStream, this is an additional malloc and an additional copy. When things are redone so that libsoup uses streams internally, then the malloc here will be replacing a malloc that is currently in soup-message-io.c, and libsoup will be reading directly from the network into your buffers. At any rate, you can copy the buffer 10 times and it will still be unnoticeable compared to waiting for the network and rendering HTML, so, it's not that awful...
(In reply to comment #82) > (In reply to comment #80) > > > WebCore/platform/network/soup/ResourceHandleSoup.cpp:544 > > > + d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE)); > > > > Am I the only one worried that we malloc 8k for each outgoing HTTP request? or am I missing something here. Is this just making something explicit that was implicit before or we adding an extra copy to the whole process? > > At the moment, given the kludginess of SoupHTTPInputStream, this is an additional malloc and an additional copy. When things are redone so that libsoup uses streams internally, then the malloc here will be replacing a malloc that is currently in soup-message-io.c, and libsoup will be reading directly from the network into your buffers. > > At any rate, you can copy the buffer 10 times and it will still be unnoticeable compared to waiting for the network and rendering HTML, so, it's not that awful... WebKit in general has a rule of never allowing performance regressions, even if new features are being added. I think it makes sense to enforce this rule in our port too. If this is going to go away in the future and there's really no reasonably easy to avoid it I think we could make an exception, but it would still be unfortunate. FWIW talking with Holger some weeks ago he told me in QtWebKit the copying of data in the network stack is noticeable performance-wise in general, although in their case the number of copies was larger (something between 3 or 5 IIRC).
(In reply to comment #83) > WebKit in general has a rule of never allowing performance regressions, even if new features are being added. I think it makes sense to enforce this rule in our port too. But presumably that means "measured performance regressions", not "theoretical performance regressions". And one would hope that caching is going to increase performance by much more than an extra copy is going to decrease it... :) > If this is going to go away in the future and there's really no reasonably easy to avoid it Duh, now that you mention it, there's an API designed exactly for avoiding this extra copy... soup_message_set_chunk_allocator() (added for the gstreamer guys when they were porting their http stream to use libsoup). SoupHTTPInputStream could be modified to use this to get libsoup to read directly into the buffers provided by webkit.
(In reply to comment #84) > (In reply to comment #83) > > WebKit in general has a rule of never allowing performance regressions, even if new features are being added. I think it makes sense to enforce this rule in our port too. > > But presumably that means "measured performance regressions", not "theoretical performance regressions". And one would hope that caching is going to increase performance by much more than an extra copy is going to decrease it... :) > > > If this is going to go away in the future and there's really no reasonably easy to avoid it > > Duh, now that you mention it, there's an API designed exactly for avoiding this extra copy... soup_message_set_chunk_allocator() (added for the gstreamer guys when they were porting their http stream to use libsoup). SoupHTTPInputStream could be modified to use this to get libsoup to read directly into the buffers provided by webkit. Unfortunately I don't think that will fix anything. First of all because I think the chunk_allocator thing it's not properly implemented in libsoup, because the SoupBuffer (and its data pointer) that comes with the got_chunk signal is not the same than the one returned by the allocator. That's because the decoding routines create a different buffer for each decoder. Secondly, even if we fix the problem stated above, we'd still have to copy the data returned by the decoders to the caller's buffer (soup_coding_apply_into wouldn't fix anything as input and output buffers must not overlap)
Created attachment 69205 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache Includes Dan's patch that reverts the file: URIs hack
Created attachment 69209 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code New version with Dan's patch applied. It also includes Xan's requested changes: * Moved m_inputStream and m_cancellable to platform ptrs * Refactored the 2 errors * GINT_TO_POINTER(!isBase64) * Removed duplicated include file Apart from that I removed some glib2.21 #ifdef code as the required version for this patch is 2.24
Attachment 69205 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4201011
Attachment 69209 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4225012
(In reply to comment #77) > I haven't looked at the code in detail, but I think you probably want to rename all of the soup-prefixed stuff to be webkit-prefixed instead so that it doesn't start causing compile and run-time problems when the code lands in libsoup. (Even if it's not visible outside of libwebkit, you still have to worry about GType's type database; bad things will happen if libsoup and webkit both try to register a type with the name "SoupRequest".) I think it was kov the one that said that it'd be better not to change them all in order to easen the migration to libsoup. Anyway I agree that it could be an important issue. I have a patch that changes all those names so if everybody think it's interesting I can upload new versions of the patches.
Created attachment 69355 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache Renamed all the new Soup* symbols to WebKitSoup*
Comment on attachment 69205 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache Forgot to mark this as obsolete
Created attachment 69356 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Renamed new Soup* symbols to WebKitSoup* Also refactored some duplicated code in parseDataUrl (nice catch Martin)
Attachment 69355 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4167023
As discussed on IRC, the fix for soup_uri_decode won't land on libsoup till 2.32.1 (mid November). As we want to release this first I'll copy&pastet the soup-uri.c code (uri_decoded_copy function) into the imported SoupURILoader stuff. We'll remove that code once the 2.32.1 is released.
Attachment 69356 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4219041
Created attachment 69384 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache New version of the patch: * A couple of indentation issues fixed * Do not need to decode base64 encoded data URLs * Imported uri_decoded_copy() with the fix that will land in libsoup 2.32.1. Use it instead of soup_uri_decode for file: and data: URLs
Comment on attachment 69356 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code View in context: https://bugs.webkit.org/attachment.cgi?id=69356&action=review This patch is so very close, but I've found some issues with memory leaks. When constructing a platformRefPtr, you should use adoptPlatformRef, unless you want the reference count to increase (the default constructor calls g_object_ref_sink). > WebCore/platform/network/ResourceHandleInternal.h:118 > + , m_soupRequest(0) > + , m_requester(0) > , m_inputStream(0) > , m_cancellable(0) Since these are now PlatformRefPtrs you don't need to initialize them. They are automatically 0. > WebCore/platform/network/ResourceHandleInternal.h:140 > + m_requester = webkit_soup_requester_new(); This should use adoptPlatformRef, see below for an extended note about this. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:357 > + d->m_soupRequest = webkit_soup_requester_request(d->m_requester.get(), handle->firstRequest().url().string().utf8().data(), session, &error.outPtr()); Ditto. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:365 > + if (error) > + return false; > + > + d->m_inputStream = in; Careful here. From the soup code it looks like it is the callers responsibility to unref the value that webkit_soup_request_send returns. Constructing a PlatformRefPtr without using adoptPlatformRef has the effect of calling g_object_ref another time on the GInputStream. d->m_inputStream = adoptPlatformRef(webkit_soup_request_send(d->m_soupRequest.get(), 0, &error.outPtr())); if (error) { d->m_inputStream = 0; return false; } and dispense with the intermediate 'in' variable entirely. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:373 > + d->m_cancellable = g_cancellable_new(); This should be d->m_cancellable = adoptPlatformRef(g_cancellable_new()); as well. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:436 > +static void cleanupSoupRequestOperation(ResourceHandle* handle, bool isDestroying = false) > +{ > + ResourceHandleInternal* d = handle->getInternal(); > + > + if (d->m_soupRequest) { > + g_object_set_data(G_OBJECT(d->m_soupRequest.get()), "webkit-resource", 0); > + d->m_soupRequest.clear(); > + } We still need to clear the m_cancellable here to ensure that the unref happens, right? > WebCore/platform/network/soup/ResourceHandleSoup.cpp:519 > + d->m_inputStream = in; The same goes for this section. After this the reference count on d->m_inputStream will be 2. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:564 > + d->m_soupRequest = webkit_soup_requester_request(d->m_requester.get(), url.string().utf8().data(), session, &error.outPtr()); Ditto. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:652 > + d->m_cancellable = g_cancellable_new(); Same issue here. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:833 > // GIO doesn't know how to handle refs and queries, so remove them Shouldn't this comment be changed to reflect the fact that we're using soup now? > WebCore/platform/network/soup/ResourceHandleSoup.cpp:839 > + urlStr = url.string().utf8().data(); This might as well be: CString urlStr = url.string().utf8().data(); > WebCore/platform/network/soup/ResourceHandleSoup.cpp:842 > + d->m_soupRequest = webkit_soup_requester_request(d->m_requester.get(), urlStr.data(), session, &error.outPtr()); I think this should use adoptPlatformRef as well. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:851 > d->m_cancellable = g_cancellable_new(); Same issue here.
Attachment 69384 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4141040
Created attachment 69436 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code New version. Includes: * Added adoptPlatformRef where needed (thx Martin) * Renamed m_msg to m_soupMsg and converted it to PlatformRefPtr
Attachment 69436 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1 WebCore/platform/network/soup/ResourceHandleSoup.cpp:481: Declaration has space between type name and * in SoupMessage *soupMsg [whitespace/declaration] [3] WebCore/platform/network/soup/ResourceHandleSoup.cpp:578: Declaration has space between type name and * in SoupMessage *soupMsg [whitespace/declaration] [3] Total errors found: 2 in 6 files If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 69438 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Sorry for the previous one. Fixed the style issues.
Attachment 69436 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4179041
Attachment 69438 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4229039
Comment on attachment 69438 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code View in context: https://bugs.webkit.org/attachment.cgi?id=69438&action=review General comment: I'm not as picky about this as others, but some reviewers would have you end all comments with a period. That's something to consider for this patch. > WebCore/platform/network/ResourceHandleInternal.h:189 > + PlatformRefPtr<SoupMessage> m_soupMsg; This should probably be m_soupMessage, as the guideline is to avoid abbreviations in new code. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:-132 > - if (m_msg) { > - g_object_unref(m_msg); > - m_msg = 0; > - } Nice change. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:145 > - if (d->m_msg) > - g_signal_handlers_disconnect_matched(d->m_msg, G_SIGNAL_MATCH_DATA, > + if (d->m_soupMsg) > + g_signal_handlers_disconnect_matched(d->m_soupMsg.get(), G_SIGNAL_MATCH_DATA, I could be wrong about this, but shouldn't this be in cleanupSoupRequestOperation? Otherwise, we might get responses to messages for a no-longer-valid SoupMessage. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:364 > + d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE)); My intuition is that this allocation should either be done with the GLib slice allocator or with fastAlloc/fastFree. Perhaps someone else can chime in here. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:445 > + if (d->m_cancellable) > + d->m_cancellable.clear(); > + > + if (d->m_soupMsg) > + d->m_soupMsg.clear(); No need to check these before clearing them. Just clear them. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:506 > + // WebCore might have cancelled the job in the while > + if (d->m_cancelled) > + return; > + > + if (soupMsg->response_body->data) > + client->didReceiveData(handle.get(), soupMsg->response_body->data, soupMsg->response_body->length, true); In the case that the request was cancelled this should be calling cleanupSoupRequestOperation(). So you might change if (soupMsg->response_body->data) to if (!d->m_cancelled && soupMsg->response_body->data) and remove the early return. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:513 > + if (d->m_cancelled || !client) { > + cleanupSoupRequestOperation(handle.get()); > + return; > + } The tab width here should be 4. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:525 > + d->m_buffer = static_cast<char*>(g_malloc(READ_BUFFER_SIZE)); Same issue with the allocation. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:578 > + SoupMessage* soupMsg = d->m_soupMsg.get(); Avoid abbreviations. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:582 > + request.updateSoupMessage(soupMsg); > + > + if (!d->m_soupMsg) > return false; It doesn't look like you want to call updateSoupMessage with a null message, so you should probably move this check up to line 578. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:775 > +static void readCallback(GObject* source, GAsyncResult* res, gpointer data) Please change res to asyncResult. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:826 > + g_input_stream_read_async(d->m_inputStream.get(), d->m_buffer, READ_BUFFER_SIZE, > + G_PRIORITY_DEFAULT, d->m_cancellable.get(), > + readCallback, data); This can be two lines. > WebCore/platform/network/soup/ResourceHandleSoup.cpp:845 > + CString urlStr = url.string().utf8().data(); This can just be: CString urlStr = url.string().utf8(); I think I suggested .data() originally, sorry.
Created attachment 69611 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Some more changes to the patch from Martin's comments: * renamed soupMsg to soupMessage * disconnect signals in cleanupSoupRequestOperation * replaced g_malloc by g_slice_alloc * some style fixes * moved early returns to better locations
Attachment 69611 [details] did not build on gtk: Build output: http://queues.webkit.org/results/4200074
Comment on attachment 69611 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Great. I'm going to wait for a nod from Kov or Xan to r+ the API changes.
Comment on attachment 69384 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache View in context: https://bugs.webkit.org/attachment.cgi?id=69384&action=review > WebCore/platform/network/soup/cache/webkit/soup-cache.c:1595 > + filename = g_build_filename (priv->cache_dir, WEBKIT_SOUP_CACHE_FILE, NULL); > + if (!g_file_get_contents (filename, &contents, &length, NULL)) { > + g_free (filename); > + return; > + } There are several indentation inconsistencies like this one (2 spaces indentation for the first level, but 4 spaces in second-level indentation) in the soup files. If you could fix those when landing, that'd be great. *nod* otherwise! (I'll leave the r+ for mrobinson)
Just a reminder, but EWS needs to be updated before this can land.
(In reply to comment #109) > (From update of attachment 69384 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=69384&action=review > > > WebCore/platform/network/soup/cache/webkit/soup-cache.c:1595 > > + filename = g_build_filename (priv->cache_dir, WEBKIT_SOUP_CACHE_FILE, NULL); > > + if (!g_file_get_contents (filename, &contents, &length, NULL)) { > > + g_free (filename); > > + return; > > + } > > There are several indentation inconsistencies like this one (2 spaces indentation for the first level, but 4 spaces in second-level indentation) in the soup files. If you could fix those when landing, that'd be great. *nod* otherwise! (I'll leave the r+ for mrobinson) I think you're cheat by the way bugzilla shows patches with tabs, as libsoup code is using tabs for indentation. I have double checked that the patches are properly indented with 8-space tabs, it's just how they're rendered when reviewed. Anyway I'll upload a new version with emacs headers for indentation as libsoup uses.
Created attachment 69810 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache Added emacs indentation headers as the originals in libsoup. I also fixed a couple of typos in the names of some macros
Comment on attachment 69810 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache I double-checked the patch and it seems it's tabbed properly. We need to update the EWS version of libsoup before we can land this patch.
Just out of curiosity, Would this cache implementation suffice in multiprocess environment, Say Two processes A and B are linked with webkit and having same Cache location, then how the synchronization would be done for "soup.cache" file, if A and B load same pages.
Comment on attachment 69611 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Clearing flags on attachment: 69611 Committed r69589: <http://trac.webkit.org/changeset/69589>
Comment on attachment 69611 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Just realized that these patches don't have complete ChangeLogs. Flipping the review flag until Sergio can fix them.
(In reply to comment #114) > Just out of curiosity, Would this cache implementation suffice in multiprocess environment, Say Two processes A and B are linked with webkit and having same Cache location, then how the synchronization would be done for "soup.cache" file, if A and B load same pages. There wouldn't be any kind of synchronization. The implementation assumes that every process will use it own directory.
Created attachment 70628 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache Corrected ChangeLogs
Created attachment 70630 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Updated ChangeLogs
Comment on attachment 70628 [details] Imported SoupURILoader to Webkit. New public API for HTTP cache Great. Thanks. I'll land these patches.
Comment on attachment 69611 [details] WebKitGtk+ to use the new API from the imported SoupURILoader code Obsolete actually
Committed r69718: <http://trac.webkit.org/changeset/69718>
http://trac.webkit.org/changeset/69718 seems to have broken Gtk/Qt release builds: ../../WebCore/platform/network/soup/cache/soup-directory-input-stream.c:114: warning: comparison between signed and unsigned integer expressions ../../WebCore/platform/network/soup/cache/soup-directory-input-stream.c:127: warning: comparison between signed and unsigned integer expressions ../../WebCore/platform/network/soup/cache/soup-request-data.c:82: warning: comparison between signed and unsigned integer expressions ../../WebCore/platform/network/soup/cache/soup-request-data.c:82: warning: signed and unsigned type in conditional expression ../../WebCore/platform/network/soup/cache/soup-request-file.c:171: warning: implicit declaration of function ‘soup_uri_get_path’ ../../WebCore/platform/network/soup/cache/soup-request-file.c:171: warning: initialization makes pointer from integer without a cast ../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:271: warning: signed and unsigned type in conditional expression ../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:1347: warning: comparison between signed and unsigned integer expressions ../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:1366: warning: comparison between signed and unsigned integer expressions
(In reply to comment #123) > http://trac.webkit.org/changeset/69718 seems to have broken Gtk/Qt release builds: The issue here seems be that the Release bots do not have a new enough version of libsoup to have soup_uri_get_path. Here's the failure line: make[1]: Entering directory `/var/lib/buildbot/build/gtk-linux-32-release/build/WebKitBuild/Release' CC WebKit/gtk/tests/Programs_unittests_testdomdocument-testdomdocument.o CCLD Programs/unittests/testdomdocument ./.libs/libwebkitgtk-1.0.so: undefined reference to `soup_uri_get_path' > ../../WebCore/platform/network/soup/cache/soup-directory-input-stream.c:114: warning: comparison between signed and unsigned integer expressions > ../../WebCore/platform/network/soup/cache/soup-directory-input-stream.c:127: warning: comparison between signed and unsigned integer expressions > ../../WebCore/platform/network/soup/cache/soup-request-data.c:82: warning: comparison between signed and unsigned integer expressions > ../../WebCore/platform/network/soup/cache/soup-request-data.c:82: warning: signed and unsigned type in conditional expression > ../../WebCore/platform/network/soup/cache/soup-request-file.c:171: warning: implicit declaration of function ‘soup_uri_get_path’ > ../../WebCore/platform/network/soup/cache/soup-request-file.c:171: warning: initialization makes pointer from integer without a cast > ../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:271: warning: signed and unsigned type in conditional expression > ../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:1347: warning: comparison between signed and unsigned integer expressions > ../../WebCore/platform/network/soup/cache/webkit/soup-cache.c:1366: warning: comparison between signed and unsigned integer expressions We should also fix these warnings before we try to land this patch again.
(In reply to comment #124) > (In reply to comment #123) > > http://trac.webkit.org/changeset/69718 seems to have broken Gtk/Qt release builds: > > > The issue here seems be that the Release bots do not have a new enough version of libsoup to have soup_uri_get_path. Here's the failure line: > > make[1]: Entering directory `/var/lib/buildbot/build/gtk-linux-32-release/build/WebKitBuild/Release' > CC WebKit/gtk/tests/Programs_unittests_testdomdocument-testdomdocument.o > CCLD Programs/unittests/testdomdocument > ./.libs/libwebkitgtk-1.0.so: undefined reference to `soup_uri_get_path' Fix it in tree by accessing path directly (uri->path), the struct is public. The getter was added in 2.32 only for the benefit of the bindings AFAIK.
(In reply to comment #125) > Fix it in tree by accessing path directly (uri->path), the struct is public. The getter was added in 2.32 only for the benefit of the bindings AFAIK. Okay. Fixed in the tree, but we really should fix those warnings.
Committed r69742: <http://trac.webkit.org/changeset/69742>
http://trac.webkit.org/changeset/69742 seems to have broken Gtk builds: http://build.webkit.org/builders/GTK%20Linux%2032-bit%20Release/builds/6599/steps/compile-webkit/logs ../../WebCore/platform/network/soup/ResourceHandleSoup.cpp: In function ‘void WebCore::sendRequestCallback(GObject*, GAsyncResult*, void*)’: ../../WebCore/platform/network/soup/ResourceHandleSoup.cpp:495: error: ‘soupMessage’ was not declared in this scope make[1]: *** [WebCore/platform/network/soup/libwebkitgtk_1_0_la-ResourceHandleSoup.lo] Error 1 make[1]: Leaving directory `/var/lib/buildbot/build/gtk-linux-32-release/build/WebKitBuild/Release' make: *** [all] Error 2 (In reply to comment #127) > Committed r69742: <http://trac.webkit.org/changeset/69742>
(In reply to comment #128) > http://trac.webkit.org/changeset/69742 seems to have broken Gtk builds: > > http://build.webkit.org/builders/GTK%20Linux%2032-bit%20Release/builds/6599/steps/compile-webkit/logs > > ../../WebCore/platform/network/soup/ResourceHandleSoup.cpp: In function ‘void WebCore::sendRequestCallback(GObject*, GAsyncResult*, void*)’: > ../../WebCore/platform/network/soup/ResourceHandleSoup.cpp:495: error: ‘soupMessage’ was not declared in this scope > make[1]: *** [WebCore/platform/network/soup/libwebkitgtk_1_0_la-ResourceHandleSoup.lo] Error 1 > make[1]: Leaving directory `/var/lib/buildbot/build/gtk-linux-32-release/build/WebKitBuild/Release' > make: *** [all] Error 2 > > (In reply to comment #127) > > Committed r69742: <http://trac.webkit.org/changeset/69742> Fixed here, thanks for the report: http://trac.webkit.org/changeset/69748