RESOLVED FIXED Bug 28687
should not pass URI fragments to libsoup
https://bugs.webkit.org/show_bug.cgi?id=28687
Summary should not pass URI fragments to libsoup
Dan Winship
Reported 2009-08-24 14:49:59 PDT
(pointed out by reinouts on irc) If you click on a link like "http://example.com/#comments" in WebKitGTK, it passes the whole URI (including the fragment) to libsoup. Normally libsoup doesn't pass the "#comments" to the remote server, but if you're using a proxy, then it does. And some servers will return a 404 or 403 in that case because they interpret the "#comments" as being part of the path. The fact that libsoup's behavior changes depending on whether you use a proxy is a libsoup bug, but you could argue about which of the two behaviors is the buggy one. I think WebKit should not be passing fragments to libsoup; WebKit wants the whole page back, and so it should be requesting the whole page from libsoup, not just a portion of it. Also, it could be confusing if messages with URIs that are not soup_uri_equal() would result in identical requests, and this could end up tripping up things like caching, etc, as well. It seems wrong for "soup_message_get_uri()" to return a URI with a fragment in it, since the response body actually contains all fragments. But it would be wrong for SoupMessage to clear the fragment from the URI itself too, since that could trip up the app if it expected the URI to match what it had originally set it to. So anyway, I think ResourceHandleSoup should strip the URI fragment from the URI as part of its process of canonicalizing it before passing it to libsoup.
Attachments
dontpassfragment.patch (2.86 KB, patch)
2009-09-08 08:12 PDT, Xan Lopez
gustavo: review+
xan.lopez: commit-queue-
Xan Lopez
Comment 1 2009-08-26 03:59:27 PDT
So, I've tried something like... diff --git a/WebCore/platform/network/soup/ResourceRequestSoup.cpp b/WebCore/platform/network/soup/ResourceRequestSoup.cpp index f2011bb..eec75e0 100644 --- a/WebCore/platform/network/soup/ResourceRequestSoup.cpp +++ b/WebCore/platform/network/soup/ResourceRequestSoup.cpp @@ -30,9 +30,20 @@ using namespace std; namespace WebCore { +static char* stripFragmentFromURI(char* uri) +{ + char* ptr = g_strrstr(uri, "#"); + if (ptr && *(ptr-1) != '&') + *ptr = '\0'; + + return uri; +} + SoupMessage* ResourceRequest::toSoupMessage() const { - SoupMessage* soupMessage = soup_message_new(httpMethod().utf8().data(), url().string().utf8().data()); + GOwnPtr<char> uri(stripFragmentFromURI(g_strdup(url().string().utf8().data()))); + SoupMessage* soupMessage = soup_message_new(httpMethod().utf8().data(), + const_cast<char*>(uri.get())); if (!soupMessage) return 0; @@ -53,7 +64,7 @@ SoupMessage* ResourceRequest::toSoupMessage() const void ResourceRequest::updateFromSoupMessage(SoupMessage* soupMessage) { SoupURI* soupURI = soup_message_get_uri(soupMessage); - GOwnPtr<gchar> uri(soup_uri_to_string(soupURI, FALSE)); + GOwnPtr<gchar> uri(stripFragmentFromURI(soup_uri_to_string(soupURI, FALSE))); m_url = KURL(KURL(), String::fromUTF8(uri.get())); And all the http tests seem to still pass. Looks reasonable Dan? I first didn't check that the previous character is a & and failed one test that dealt with espaced hex stuff in the URL. Is there any function in soup/glib I should be using to strip the fragment from a string? (Other than going through SoupURI somehow, I suppose).
Dan Winship
Comment 2 2009-08-26 05:24:47 PDT
You probably want to operate on the parsed URL, not the string form. In ResourceHandle::start(), you have: KURL url = request().url(); String urlString = url.string(); and you could just add a call to url.removeFragmentIdentifier() in between. Likewise in updateFromSoupMessage. (Though I'm not sure if you want m_url to contain the fragment or not.)
Xan Lopez
Comment 3 2009-09-08 08:12:40 PDT
Created attachment 39187 [details] dontpassfragment.patch Don't pass fragments in URIs to libsoup.
Gustavo Noronha (kov)
Comment 4 2009-09-08 12:18:24 PDT
Comment on attachment 39187 [details] dontpassfragment.patch Looks good, thanks!
Xan Lopez
Comment 5 2009-09-08 13:07:43 PDT
Landed in r48178, closing.
Note You need to log in before you can comment on or make changes to this bug.