Bug 40222 - Use the new SoupURILoader API
Summary: Use the new SoupURILoader API
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC Linux
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-06-07 03:32 PDT by Sergio Villar Senin
Modified: 2011-02-16 08:46 PST (History)
5 users (show)

See Also:


Attachments
Http working with the new SoupURILoader API (15.29 KB, patch)
2010-06-07 03:32 PDT, Sergio Villar Senin
no flags Details | Formatted Diff | Diff
WebKitGtk+ working for http:/ftp:/file:/data: protocols (24.80 KB, patch)
2010-06-11 08:24 PDT, Sergio Villar Senin
no flags Details | Formatted Diff | Diff
WebKitGtk+ working for http/ftp/file/data protocols (28.93 KB, patch)
2010-06-24 03:48 PDT, Sergio Villar Senin
no flags Details | Formatted Diff | Diff
WebKitGtk+ working for http/ftp/file/data protocols (28.74 KB, patch)
2010-06-24 04:30 PDT, Sergio Villar Senin
gns: review-
Details | Formatted Diff | Diff
WebKitGtk+ using libsoup SoupURILoader new API (29.08 KB, patch)
2010-07-23 04:45 PDT, Sergio Villar Senin
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sergio Villar Senin 2010-06-07 03:32:39 PDT
Created attachment 58006 [details]
Http working with the new SoupURILoader API

See https://bugzilla.gnome.org/show_bug.cgi?id=557777 for more details about the new SoupURILoader stuff

It seems sensible to handle here the WebKitGtk+ patch.
Comment 1 Sergio Villar Senin 2010-06-11 08:24:03 PDT
Created attachment 58477 [details]
WebKitGtk+ working for http:/ftp:/file:/data: protocols

With this new version of the patch and the changes to libsoup mentioned here https://bugzilla.gnome.org/show_bug.cgi?id=557777 all the tests for WebKitGtk+ are working fine.

With this new version not only http://, but ftp:// data:// and file:// protocols are correctly working with the new SoupURILoader stuff
Comment 2 Sergio Villar Senin 2010-06-24 03:48:12 PDT
Created attachment 59628 [details]
WebKitGtk+ working for http/ftp/file/data protocols

The previous patch still had the old handling code for data:// URLs. Now we use the SoupURILoader stuff also for loading data:.

All the tests work fine with this patch and the libsoup version stored in http://gitorious.org/libsoup-io-cache

libsoup changes can be tracked in https://bugzilla.gnome.org/show_bug.cgi?id=557777
Comment 3 WebKit Review Bot 2010-06-24 03:50:13 PDT
Attachment 59628 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1
WebCore/platform/network/soup/ResourceHandleSoup.cpp:308:  Should have a space between // and comment  [whitespace/comments] [4]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:377:  Declaration has space between type name and * in GInputStream *in  [whitespace/declaration] [3]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:391:  Extra space before ( in function call  [whitespace/parens] [4]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:490:  Extra space before ( in function call  [whitespace/parens] [4]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:492:  Extra space before ( in function call  [whitespace/parens] [4]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:500:  Declaration has space between type name and * in GError *error  [whitespace/declaration] [3]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:571:  Declaration has space between type name and * in GFile *file  [whitespace/declaration] [3]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:571:  Extra space before ( in function call  [whitespace/parens] [4]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:585:  Extra space before ( in function call  [whitespace/parens] [4]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:859:  One line control clauses should not use braces.  [whitespace/braces] [4]
WebCore/platform/network/soup/ResourceHandleSoup.cpp:883:  Declaration has space between type name and * in gchar *urlstr  [whitespace/declaration] [3]
WebCore/platform/network/soup/ResourceRequestSoup.cpp:39:  Extra space before ( in function call  [whitespace/parens] [4]
Total errors found: 12 in 4 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Sergio Villar Senin 2010-06-24 04:30:19 PDT
Created attachment 59635 [details]
WebKitGtk+ working for http/ftp/file/data protocols

Style fixes
Comment 5 WebKit Review Bot 2010-06-24 05:14:46 PDT
Attachment 59628 [details] did not build on gtk:
Build output: http://webkit-commit-queue.appspot.com/results/3304674
Comment 6 WebKit Review Bot 2010-06-24 06:47:03 PDT
Attachment 59635 [details] did not build on gtk:
Build output: http://webkit-commit-queue.appspot.com/results/3342348
Comment 7 Gustavo Noronha (kov) 2010-06-25 05:57:15 PDT
Comment on attachment 59635 [details]
WebKitGtk+ working for http/ftp/file/data protocols

WebCore/platform/network/soup/ResourceHandleSoup.cpp:547
 +          client->didFinishLoading(handle.get());
You are missing checks here for the existance of the client and for cancelling. The didReceiveData above may have cancelled it, we'll get crashes.


WebCore/platform/network/soup/ResourceHandleSoup.cpp:490
 +          gulong gotChunkHandler = GPOINTER_TO_UINT(userData);
Eeww. I don't think this is right. This will do a cast to guint. I would prefer removing this hack, and using a variable inside the ResourceHandleInternal hack instead.

WebCore/platform/network/soup/ResourceHandleSoup.cpp:566
 +      // returns a stream whose outcome is a HTML with a list of files
"whose content is HTML", I think, sounds better; I don't see any special handling for directories, though, should there be? It appears the special handling is for any file: URIs

r- for the reasons above, I'll try to get this installed and start running with it to get it tested =)
Comment 8 Sergio Villar Senin 2010-06-28 02:04:25 PDT
(In reply to comment #7)
> (From update of attachment 59635 [details])
> WebCore/platform/network/soup/ResourceHandleSoup.cpp:547
>  +          client->didFinishLoading(handle.get());
> You are missing checks here for the existance of the client and for cancelling. The didReceiveData above may have cancelled it, we'll get crashes.

True

> 
> WebCore/platform/network/soup/ResourceHandleSoup.cpp:490
>  +          gulong gotChunkHandler = GPOINTER_TO_UINT(userData);
> Eeww. I don't think this is right. This will do a cast to guint. I would prefer removing this hack, and using a variable inside the ResourceHandleInternal hack instead.

Ok

> WebCore/platform/network/soup/ResourceHandleSoup.cpp:566
>  +      // returns a stream whose outcome is a HTML with a list of files
> "whose content is HTML", I think, sounds better; I don't see any special handling for directories, though, should there be? It appears the special handling is for any file: URIs

Thing is that for directories we'll get a SoupDirectoryInputStream instead of a GFileInputStream. If we get that soup stream we'll handle it as a normal stream with HTML content. Maybe we could set the condition as !SOUP_IS_DIRECTORY_INPUT_STREAM
Comment 9 Xan Lopez 2010-07-01 05:05:09 PDT
Comment on attachment 59635 [details]
WebKitGtk+ working for http/ftp/file/data protocols

>-        GFile* m_gfile;
>+        SoupRequest* m_req;

Maybe we can start using GRefPtr for this?

>-    String url = handle->request().url().string();
>-    ASSERT(url.startsWith("data:", false));
>+    String urlStr = handle->request().url().string();
>+    ASSERT(urlStr.startsWith("data:", false));
> 
>-    int index = url.find(',');
>+    int index = urlStr.find(',');
>     if (index == -1) {
>         client->cannotShowURL(handle);
>         return false;
>     }
> 
>-    String mediaType = url.substring(5, index - 5);
>-    String data = url.substring(index + 1);
>+    String mediaType = urlStr.substring(5, index - 5);

This seems to be a bit random? Should go in a different patch at the very least.

>+    SoupSession* session = handle->defaultSession();
>+    GError* error = 0;

Should use GOwnPtr for GError here.

>+    d->m_req = soup_session_request(session, handle->request().url().string().utf8().data(), &error);
>+    if (error) {
>+        g_error_free(error);
>+        return false;
>+    }
>+
>+    GInputStream* in = soup_request_send(d->m_req, 0, &error);
>+    if (error) {
>+        g_error_free(error);
>+        return false;
>+    }
>+
>+    d->m_inputStream = in;
>+    d->m_bufferSize = 8192;
>+    d->m_buffer = static_cast<char*>(g_malloc(d->m_bufferSize));
>+    d->m_total = 0;
>+
>+    g_object_set_data(G_OBJECT(d->m_inputStream), "webkit-resource", handle);
>+    handle->ref();
>+
>+    d->m_cancellable = g_cancellable_new();
>+    g_input_stream_read_async(d->m_inputStream, d->m_buffer, d->m_bufferSize,
>+                              G_PRIORITY_DEFAULT, d->m_cancellable,
>+                              readCallback, (isBase64) ? 0 : GINT_TO_POINTER(1));

I suppose it's good to unify stuff and always go through libsoup for everything, but isn't parsing a simple string asynchronously a bit overkill? /me wonders

> 
>     return false;
> }
>@@ -428,7 +404,7 @@ static bool startData(ResourceHandle* handle, String urlString)
> 
>     // If parseDataUrl is called synchronously the job is not yet effectively started
>     // and webkit won't never know that the data has been parsed even didFinishLoading is called.
>-    d->m_idleHandler = g_timeout_add(0, parseDataUrl, handle);
>+    d->m_idleHandler = g_idle_add(parseDataUrl, handle);

This is not the same thing. Do you need to change it or you just felt it was better? Needs an explanation.

>     return true;
> }
> 
>@@ -469,6 +445,151 @@ static void ensureSessionIsInitialized(SoupSession* session)
>     g_object_set_data(G_OBJECT(session), "webkit-init", reinterpret_cast<void*>(0xdeadbeef));
> }
> 

>+static void sendRequestCallback(GObject* source, GAsyncResult* res, gpointer userData)
>+{
>+    RefPtr<ResourceHandle> handle = static_cast<ResourceHandle*>(g_object_get_data(source, "webkit-resource"));
>+    if (!handle)
>+        return;
>+
>+    ResourceHandleInternal* d = handle->getInternal();
>+    ResourceHandleClient* client = handle->client();
>+
>+    if (userData) {
>+        // No need to call gotChunkHandler anymore. Received data will
>+        // be reported by readCallback
>+        gulong gotChunkHandler = GPOINTER_TO_UINT(userData);
>+        if (g_signal_handler_is_connected(d->m_msg, gotChunkHandler))
>+            g_signal_handler_disconnect(d->m_msg, gotChunkHandler);
>+    }
>+
>+    if (d->m_cancelled || !client) {
>+        cleanupSoupRequestOperation(handle.get());
>+        return;
>+    }
>+
>+    GError* error = 0;

GOwnPtr. This is repeated elsewhere, the same applies.

>+    GInputStream* in = soup_request_send_finish(d->m_req, res, &error);
>+
>+    if (error) {
>+

Extra blank line!

>+        if (d->m_msg && SOUP_STATUS_IS_TRANSPORT_ERROR(d->m_msg->status_code)) {
>+            SoupURI* uri = soup_request_get_uri(d->m_req);
>+            char* uristr = soup_uri_to_string(uri, false);

uriStr? With GOwnPtr. This applies in a bunch of places too.

>+            ResourceError resourceError(g_quark_to_string(SOUP_HTTP_ERROR),
>+                                        d->m_msg->status_code,
>+                                        uristr,
>+                                        String::fromUTF8(d->m_msg->reason_phrase));
>+            g_free(uristr);
>+            g_error_free(error);
>+            cleanupSoupRequestOperation(handle.get());
>+            client->didFail(handle.get(), resourceError);
>+            return;
>+        }
>+
>+        if (error->domain == G_IO_ERROR) {
>+            SoupURI* uri = soup_request_get_uri(d->m_req);
>+            char* uristr = soup_uri_to_string(uri, false);
>+            ResourceError resourceError(g_quark_to_string(G_IO_ERROR),
>+                                        error->code,
>+                                        uristr,
>+                                        error ? String::fromUTF8(error->message) : String());
>+            g_free(uristr);
>+            g_error_free(error);
>+            cleanupSoupRequestOperation(handle.get());
>+            client->didFail(handle.get(), resourceError);
>+            return;
>+        }
>+
>+        g_error_free(error);
>+
>+        if (d->m_msg && d->m_msg->status_code == SOUP_STATUS_UNAUTHORIZED) {
>+            fillResponseFromMessage(d->m_msg, &d->m_response);
>+            client->didReceiveResponse(handle.get(), d->m_response);
>+
>+            // WebCore might have cancelled the job in the while
>+            if (d->m_cancelled)
>+                return;
>+
>+            if (d->m_msg->response_body->data)
>+                client->didReceiveData(handle.get(), d->m_msg->response_body->data, d->m_msg->response_body->length, true);
>+        }
>+
>+        client->didFinishLoading(handle.get());
>+        return;
>+    }
>+
>+    if (d->m_cancelled) {
>+        cleanupSoupRequestOperation(handle.get());
>+        return;
>+    }
>+
>+    d->m_inputStream = in;
>+    d->m_bufferSize = 8192;

This number appears around more than once, should be a #define

>+    d->m_buffer = static_cast<char*>(g_malloc(d->m_bufferSize));
>+    d->m_total = 0;
>+
>+    // readCallback needs it
>+    g_object_set_data(G_OBJECT(d->m_inputStream), "webkit-resource", handle.get());
>+
>+    // We need to check if it's a file: URL and if it is a regular
>+    // file as it could be a directory. In that case Soup properly
>+    // returns a stream whose outcome is a HTML with a list of files
>+    // in the directory
>+    if (equalIgnoringCase(handle->request().url().protocol(), "file")
>+        && G_IS_FILE_INPUT_STREAM(in)) {
>+
>+            GFile* file = soup_request_file_get_file(SOUP_REQUEST_FILE(d->m_req));

GRefPtr.

>+
>+            ResourceResponse response;
>+
>+            response.setURL(handle->request().url());
>+            response.setMimeType(soup_request_get_content_type(d->m_req));
>+            response.setExpectedContentLength(soup_request_get_content_length(d->m_req));
>+            client->didReceiveResponse(handle.get(), response);
>+
>+            if (d->m_cancelled) {
>+                cleanupSoupRequestOperation(handle.get());
>+                return;
>+            }
>+
>+            g_object_unref(file);
>+    }
>+
>+    g_input_stream_read_async(d->m_inputStream, d->m_buffer, d->m_bufferSize,
>+                              G_PRIORITY_DEFAULT, d->m_cancellable,
>+                              readCallback, 0);
>+}
>+

> 
>+

Not extra line.

> bool ResourceHandle::start(Frame* frame)
> {
>     ASSERT(!d->m_msg);
>@@ -591,21 +720,21 @@ bool ResourceHandle::start(Frame* frame)
>         return false;
> 
>     KURL url = request().url();
>-    String urlString = url.string();
>-    String protocol = url.protocol();
> 
>     // Used to set the authentication dialog toplevel; may be NULL
>     d->m_frame = frame;
> 
>-    if (equalIgnoringCase(protocol, "data"))
>-        return startData(this, urlString);
>+    if (url.protocolIs("data"))
>+        return startData(this, url);
> 
>-    if (equalIgnoringCase(protocol, "http") || equalIgnoringCase(protocol, "https")) {
>+    if (url.protocolInHTTPFamily()) {
>         if (startHttp(this))
>             return true;
>     }
> 
>-    if (equalIgnoringCase(protocol, "file") || equalIgnoringCase(protocol, "ftp") || equalIgnoringCase(protocol, "ftps")) {
>+    if (url.protocolIs("file")
>+        || url.protocolIs("ftp")
>+        || url.protocolIs("ftps")) {
>         // FIXME: should we be doing any other protocols here?
>         if (startGio(this, url))
>             return true;

This is all nice but I don't believe it's strictly needed in the patch, you could push it separately and earlier.

>@@ -665,38 +794,6 @@ void ResourceHandle::loadResourceSynchronously(const ResourceRequest& request, S
>     syncLoader.run();
> }
> 

> static void closeCallback(GObject* source, GAsyncResult* res, gpointer)
> {
>     RefPtr<ResourceHandle> handle = static_cast<ResourceHandle*>(g_object_get_data(source, "webkit-resource"));
>@@ -707,7 +804,6 @@ static void closeCallback(GObject* source, GAsyncResult* res, gpointer)
>     ResourceHandleClient* client = handle->client();
> 
>     g_input_stream_close_finish(d->m_inputStream, res, 0);
>-    cleanupGioOperation(handle.get());
> 
>     // The load may have been cancelled, the client may have been
>     // destroyed already. In such cases calling didFinishLoading is a
>@@ -716,9 +812,11 @@ static void closeCallback(GObject* source, GAsyncResult* res, gpointer)
>         return;
> 
>     client->didFinishLoading(handle.get());
>+
>+    cleanupSoupRequestOperation(handle.get());

Any reason to move this?

> }
> 

>@@ -755,141 +854,36 @@ static void readCallback(GObject* source, GAsyncResult* res, gpointer)
>     }
> 
>     d->m_total += bytesRead;
>-    client->didReceiveData(handle.get(), d->m_buffer, bytesRead, d->m_total);
>+    if (G_LIKELY(!data))
>+        client->didReceiveData(handle.get(), d->m_buffer, bytesRead, d->m_total);

So the likely case is that there's no data read? Uh?

>+    else {
>+        // We have to convert it to UTF-16 due to limitations in KURL
>+        GOwnPtr<gunichar2> unicodeStr(g_utf8_to_utf16(d->m_buffer, bytesRead, 0, 0, 0));
>+        client->didReceiveData(handle.get(), reinterpret_cast<const char*>(unicodeStr.get()), g_utf8_strlen(d->m_buffer, bytesRead) * sizeof(UChar), 0);

Mmm, we don't seem to do this in the other didReceiveData calls, we pass UTF8 data and lenght in bytes, not characters. If the tests pass I suppose this is ok, but it seems puzzling.
>+    }
>
Comment 10 Sergio Villar Senin 2010-07-01 07:28:38 PDT
(In reply to comment #9)
> (From update of attachment 59635 [details])
> >-        GFile* m_gfile;
> >+        SoupRequest* m_req;
> 
> Maybe we can start using GRefPtr for this?

Seems a good place indeed
 
> >-    String url = handle->request().url().string();
> >-    ASSERT(url.startsWith("data:", false));
> >+    String urlStr = handle->request().url().string();
> >+    ASSERT(urlStr.startsWith("data:", false));
> > 
> >-    int index = url.find(',');
> >+    int index = urlStr.find(',');
> >     if (index == -1) {
> >         client->cannotShowURL(handle);
> >         return false;
> >     }
> > 
> >-    String mediaType = url.substring(5, index - 5);
> >-    String data = url.substring(index + 1);
> >+    String mediaType = urlStr.substring(5, index - 5);
> 
> This seems to be a bit random? Should go in a different patch at the very least.

Absolutely, I guess I changed it sometime in the past when there was another url variable around.
 
> >+    SoupSession* session = handle->defaultSession();
> >+    GError* error = 0;
> 
> Should use GOwnPtr for GError here.

Ok

> >+    d->m_req = soup_session_request(session, handle->request().url().string().utf8().data(), &error);
> >+    if (error) {
> >+        g_error_free(error);
> >+        return false;
> >+    }
> >+
> >+    GInputStream* in = soup_request_send(d->m_req, 0, &error);
> >+    if (error) {
> >+        g_error_free(error);
> >+        return false;
> >+    }
> >+
> >+    d->m_inputStream = in;
> >+    d->m_bufferSize = 8192;
> >+    d->m_buffer = static_cast<char*>(g_malloc(d->m_bufferSize));
> >+    d->m_total = 0;
> >+
> >+    g_object_set_data(G_OBJECT(d->m_inputStream), "webkit-resource", handle);
> >+    handle->ref();
> >+
> >+    d->m_cancellable = g_cancellable_new();
> >+    g_input_stream_read_async(d->m_inputStream, d->m_buffer, d->m_bufferSize,
> >+                              G_PRIORITY_DEFAULT, d->m_cancellable,
> >+                              readCallback, (isBase64) ? 0 : GINT_TO_POINTER(1));
> 
> I suppose it's good to unify stuff and always go through libsoup for everything, but isn't parsing a simple string asynchronously a bit overkill? /me wonders

I agree that the real benefit for the data: case is only to use a single path for every protocol, as what we do basically is to parse the string as you say. Don't feel like strongly defending any alternative though.

> > 
> >     return false;
> > }
> >@@ -428,7 +404,7 @@ static bool startData(ResourceHandle* handle, String urlString)
> > 
> >     // If parseDataUrl is called synchronously the job is not yet effectively started
> >     // and webkit won't never know that the data has been parsed even didFinishLoading is called.
> >-    d->m_idleHandler = g_timeout_add(0, parseDataUrl, handle);
> >+    d->m_idleHandler = g_idle_add(parseDataUrl, handle);
> 
> This is not the same thing. Do you need to change it or you just felt it was better? Needs an explanation.

I agree, I have to find one first :-). Thing is that somehow tests show some flackiness with the timeout_add while they worked fine everytime for me with the idle stuff.

> >     return true;
> > }
> > 
> >@@ -469,6 +445,151 @@ static void ensureSessionIsInitialized(SoupSession* session)
> >     g_object_set_data(G_OBJECT(session), "webkit-init", reinterpret_cast<void*>(0xdeadbeef));
> > }
> > 
> 
> >+static void sendRequestCallback(GObject* source, GAsyncResult* res, gpointer userData)
> >+{
> >+    RefPtr<ResourceHandle> handle = static_cast<ResourceHandle*>(g_object_get_data(source, "webkit-resource"));
> >+    if (!handle)
> >+        return;
> >+
> >+    ResourceHandleInternal* d = handle->getInternal();
> >+    ResourceHandleClient* client = handle->client();
> >+
> >+    if (userData) {
> >+        // No need to call gotChunkHandler anymore. Received data will
> >+        // be reported by readCallback
> >+        gulong gotChunkHandler = GPOINTER_TO_UINT(userData);
> >+        if (g_signal_handler_is_connected(d->m_msg, gotChunkHandler))
> >+            g_signal_handler_disconnect(d->m_msg, gotChunkHandler);
> >+    }
> >+
> >+    if (d->m_cancelled || !client) {
> >+        cleanupSoupRequestOperation(handle.get());
> >+        return;
> >+    }
> >+
> >+    GError* error = 0;
> 
> GOwnPtr. This is repeated elsewhere, the same applies.

Yep

> >+    GInputStream* in = soup_request_send_finish(d->m_req, res, &error);
> >+
> >+    if (error) {
> >+
> 
> Extra blank line!
> 
> >+        if (d->m_msg && SOUP_STATUS_IS_TRANSPORT_ERROR(d->m_msg->status_code)) {
> >+            SoupURI* uri = soup_request_get_uri(d->m_req);
> >+            char* uristr = soup_uri_to_string(uri, false);
> 
> uriStr? With GOwnPtr. This applies in a bunch of places too.

Fair enough

> >+            ResourceError resourceError(g_quark_to_string(SOUP_HTTP_ERROR),
> >+                                        d->m_msg->status_code,
> >+                                        uristr,
> >+                                        String::fromUTF8(d->m_msg->reason_phrase));
> >+            g_free(uristr);
> >+            g_error_free(error);
> >+            cleanupSoupRequestOperation(handle.get());
> >+            client->didFail(handle.get(), resourceError);
> >+            return;
> >+        }
> >+
> >+        if (error->domain == G_IO_ERROR) {
> >+            SoupURI* uri = soup_request_get_uri(d->m_req);
> >+            char* uristr = soup_uri_to_string(uri, false);
> >+            ResourceError resourceError(g_quark_to_string(G_IO_ERROR),
> >+                                        error->code,
> >+                                        uristr,
> >+                                        error ? String::fromUTF8(error->message) : String());
> >+            g_free(uristr);
> >+            g_error_free(error);
> >+            cleanupSoupRequestOperation(handle.get());
> >+            client->didFail(handle.get(), resourceError);
> >+            return;
> >+        }
> >+
> >+        g_error_free(error);
> >+
> >+        if (d->m_msg && d->m_msg->status_code == SOUP_STATUS_UNAUTHORIZED) {
> >+            fillResponseFromMessage(d->m_msg, &d->m_response);
> >+            client->didReceiveResponse(handle.get(), d->m_response);
> >+
> >+            // WebCore might have cancelled the job in the while
> >+            if (d->m_cancelled)
> >+                return;
> >+
> >+            if (d->m_msg->response_body->data)
> >+                client->didReceiveData(handle.get(), d->m_msg->response_body->data, d->m_msg->response_body->length, true);
> >+        }
> >+
> >+        client->didFinishLoading(handle.get());
> >+        return;
> >+    }
> >+
> >+    if (d->m_cancelled) {
> >+        cleanupSoupRequestOperation(handle.get());
> >+        return;
> >+    }
> >+
> >+    d->m_inputStream = in;
> >+    d->m_bufferSize = 8192;
> 
> This number appears around more than once, should be a #define

Ok
 
> >+    d->m_buffer = static_cast<char*>(g_malloc(d->m_bufferSize));
> >+    d->m_total = 0;
> >+
> >+    // readCallback needs it
> >+    g_object_set_data(G_OBJECT(d->m_inputStream), "webkit-resource", handle.get());
> >+
> >+    // We need to check if it's a file: URL and if it is a regular
> >+    // file as it could be a directory. In that case Soup properly
> >+    // returns a stream whose outcome is a HTML with a list of files
> >+    // in the directory
> >+    if (equalIgnoringCase(handle->request().url().protocol(), "file")
> >+        && G_IS_FILE_INPUT_STREAM(in)) {
> >+
> >+            GFile* file = soup_request_file_get_file(SOUP_REQUEST_FILE(d->m_req));
> 
> GRefPtr.

Sure
 
> >+
> >+            ResourceResponse response;
> >+
> >+            response.setURL(handle->request().url());
> >+            response.setMimeType(soup_request_get_content_type(d->m_req));
> >+            response.setExpectedContentLength(soup_request_get_content_length(d->m_req));
> >+            client->didReceiveResponse(handle.get(), response);
> >+
> >+            if (d->m_cancelled) {
> >+                cleanupSoupRequestOperation(handle.get());
> >+                return;
> >+            }
> >+
> >+            g_object_unref(file);
> >+    }
> >+
> >+    g_input_stream_read_async(d->m_inputStream, d->m_buffer, d->m_bufferSize,
> >+                              G_PRIORITY_DEFAULT, d->m_cancellable,
> >+                              readCallback, 0);
> >+}
> >+
> 
> > 
> >+
> 
> Not extra line.
> 
> > bool ResourceHandle::start(Frame* frame)
> > {
> >     ASSERT(!d->m_msg);
> >@@ -591,21 +720,21 @@ bool ResourceHandle::start(Frame* frame)
> >         return false;
> > 
> >     KURL url = request().url();
> >-    String urlString = url.string();
> >-    String protocol = url.protocol();
> > 
> >     // Used to set the authentication dialog toplevel; may be NULL
> >     d->m_frame = frame;
> > 
> >-    if (equalIgnoringCase(protocol, "data"))
> >-        return startData(this, urlString);
> >+    if (url.protocolIs("data"))
> >+        return startData(this, url);
> > 
> >-    if (equalIgnoringCase(protocol, "http") || equalIgnoringCase(protocol, "https")) {
> >+    if (url.protocolInHTTPFamily()) {
> >         if (startHttp(this))
> >             return true;
> >     }
> > 
> >-    if (equalIgnoringCase(protocol, "file") || equalIgnoringCase(protocol, "ftp") || equalIgnoringCase(protocol, "ftps")) {
> >+    if (url.protocolIs("file")
> >+        || url.protocolIs("ftp")
> >+        || url.protocolIs("ftps")) {
> >         // FIXME: should we be doing any other protocols here?
> >         if (startGio(this, url))
> >             return true;
> 
> This is all nice but I don't believe it's strictly needed in the patch, you could push it separately and earlier.
> 

That's true

> >@@ -665,38 +794,6 @@ void ResourceHandle::loadResourceSynchronously(const ResourceRequest& request, S
> >     syncLoader.run();
> > }
> > 
> 
> > static void closeCallback(GObject* source, GAsyncResult* res, gpointer)
> > {
> >     RefPtr<ResourceHandle> handle = static_cast<ResourceHandle*>(g_object_get_data(source, "webkit-resource"));
> >@@ -707,7 +804,6 @@ static void closeCallback(GObject* source, GAsyncResult* res, gpointer)
> >     ResourceHandleClient* client = handle->client();
> > 
> >     g_input_stream_close_finish(d->m_inputStream, res, 0);
> >-    cleanupGioOperation(handle.get());
> > 
> >     // The load may have been cancelled, the client may have been
> >     // destroyed already. In such cases calling didFinishLoading is a
> >@@ -716,9 +812,11 @@ static void closeCallback(GObject* source, GAsyncResult* res, gpointer)
> >         return;
> > 
> >     client->didFinishLoading(handle.get());
> >+
> >+    cleanupSoupRequestOperation(handle.get());
> 
> Any reason to move this?

I just moved it because I thought it was releasing something too early, but it did not. Agree that it can be moved back to the middle of the function.

> 
> > }
> > 
> 
> >@@ -755,141 +854,36 @@ static void readCallback(GObject* source, GAsyncResult* res, gpointer)
> >     }
> > 
> >     d->m_total += bytesRead;
> >-    client->didReceiveData(handle.get(), d->m_buffer, bytesRead, d->m_total);
> >+    if (G_LIKELY(!data))
> >+        client->didReceiveData(handle.get(), d->m_buffer, bytesRead, d->m_total);
> 
> So the likely case is that there's no data read? Uh?

Maybe the name mislead you. That data is just a gpointer that I used as a flag in order to know whether we should convert the read bytes to UTF-16 or not. The most common case is that we don't have to, as it should be done only for non-bas64 data streams, which I guess are not so frequent as http: or another protocols.

> 
> >+    else {
> >+        // We have to convert it to UTF-16 due to limitations in KURL
> >+        GOwnPtr<gunichar2> unicodeStr(g_utf8_to_utf16(d->m_buffer, bytesRead, 0, 0, 0));
> >+        client->didReceiveData(handle.get(), reinterpret_cast<const char*>(unicodeStr.get()), g_utf8_strlen(d->m_buffer, bytesRead) * sizeof(UChar), 0);
> 
> Mmm, we don't seem to do this in the other didReceiveData calls, we pass UTF8 data and lenght in bytes, not characters. If the tests pass I suppose this is ok, but it seems puzzling.
> >+    }
> >

We're passing here also the length in bytes (that's why we have the sizeof(UChar). Anyway it's true that this is the only place where we do this. If you take a look at the original parseDataUrl method this is only done also for non-base64 data streams. And indeed, tests like it :)

Thx for reviewing!!!
Comment 11 Sergio Villar Senin 2010-07-06 10:22:53 PDT
(In reply to comment #9)
> >@@ -428,7 +404,7 @@ static bool startData(ResourceHandle* handle, String urlString)
> > 
> >     // If parseDataUrl is called synchronously the job is not yet effectively started
> >     // and webkit won't never know that the data has been parsed even didFinishLoading is called.
> >-    d->m_idleHandler = g_timeout_add(0, parseDataUrl, handle);
> >+    d->m_idleHandler = g_idle_add(parseDataUrl, handle);
> 
> This is not the same thing. Do you need to change it or you just felt it was better? Needs an explanation.

There is only 1 test that fails if I keep the timeout_add thing which is 
http/tests/security/dataURL/xss-DENIED-from-data-url-in-foreign-domain-subframe.html

This test (from the resources POV) does the following:
  * loads an HTML which then...
  * loads another HTML which...
  * loads a data URL with some code that finally calls notifyDone

The sequence of calls with the "old" g_timeout_add is

startHttp -> closeCallback -> startHttp -> startData -> parseDataURL -> closeCallback -> closeCallback -> startHttp -> closeCallback

The problem is that parseDataURL is called *before* the last startHttp is issued and thus the test keeps waiting forever for the notifyDone call which is called when parseDataURL finishes.

With the g_idle_add it looks like

startHttp -> closeCallback -> startHttp -> startData -> closeCallback -> startHttp ->  parseDataURL -> closeCallback -> closeCallback

In this case the parseDataURL is called after the last call to startHttp. This is pretty much what happens with master's ResourceHandleSoup. The reason why it currently works in master with a timeout is not clear to me yet but I suspect that with the new libsoup it fails because the amount of asynchronous calls increases. Don't know core guts enough to know why some callbacks are called before than others.

Thing is that the test works fine, I mean, the output of the test is OK, WebKit behaves as it should not allowing access to the root element from a dataURL loaded from another domain, it's just that the test framework don't get the events when it expects them to arrive, that's why it reaches the timeout. So maybe skipping the test is a possibility.
Comment 12 Sergio Villar Senin 2010-07-23 04:45:17 PDT
Created attachment 62413 [details]
WebKitGtk+ using libsoup SoupURILoader new API

New version of the patch based on Xan's and Gustavo's comments
Comment 13 Gustavo Noronha (kov) 2010-08-30 07:44:28 PDT
(In reply to comment #11)
> There is only 1 test that fails if I keep the timeout_add thing which is 
> http/tests/security/dataURL/xss-DENIED-from-data-url-in-foreign-domain-subframe.html
[...]
> Thing is that the test works fine, I mean, the output of the test is OK, WebKit behaves as it should not allowing access to the root element from a dataURL loaded from another domain, it's just that the test framework don't get the events when it expects them to arrive, that's why it reaches the timeout. So maybe skipping the test is a possibility.

I think this test is being fragile by depending on timing or on a specific order of events. I would recommend skipping it, and reenabling it after we figure a way of making it less dependant on the order of the events - though it might be that the order of events is actually specified in the standard. Either way, I am indeed against replacing a timeout with an idle, idles may never run like Martin described.
Comment 14 Sergio Villar Senin 2011-02-16 00:23:06 PST
I guess we should close this right ?
Comment 15 Martin Robinson 2011-02-16 08:39:05 PST
What was the resolution of this issue?
Comment 16 Sergio Villar Senin 2011-02-16 08:45:42 PST
(In reply to comment #15)
> What was the resolution of this issue?

It was actually the first step to land the SoupCache but at the end we did it we did all at the same time. I guess this should be an already fixed
Comment 17 Martin Robinson 2011-02-16 08:46:20 PST
Great. Yeah. We can just close this now then.