Bug 89788 - [chromium] Introduce way to reload a page using the original request URL
: [chromium] Introduce way to reload a page using the original request URL
Status: RESOLVED FIXED
: WebKit
WebKit Misc.
: 528+ (Nightly build)
: Unspecified Android
: P2 Normal
Assigned To:
:
:
:
: 66687
  Show dependency treegraph
 
Reported: 2012-06-22 14:55 PST by
Modified: 2012-06-28 13:30 PST (History)


Attachments
Patch (5.86 KB, patch)
2012-06-25 18:08 PST, dfalcantara@chromium.org
no flags Review Patch | Details | Formatted Diff | Diff
Patch (6.67 KB, patch)
2012-06-27 17:08 PST, dfalcantara@chromium.org
no flags Review Patch | Details | Formatted Diff | Diff
Patch (6.40 KB, patch)
2012-06-27 18:01 PST, dfalcantara@chromium.org
no flags Review Patch | Details | Formatted Diff | Diff
Patch (8.54 KB, patch)
2012-06-28 03:28 PST, dfalcantara@chromium.org
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2012-06-22 14:55:40 PST
Servers may redirect a browser to a different location based on the user agent being used.  If the user agent is changed to get access to a different version of the page, simply reloading the page will fail because the user agent often isn't checked at the final location.  These situations require reloading the page using a URL from earlier in the chain, before the user agent gets checked.
------- Comment #1 From 2012-06-25 18:08:56 PST -------
Created an attachment (id=149415) [details]
Patch
------- Comment #2 From 2012-06-25 18:09:43 PST -------
Uploaded patch to get context for e-mails.
------- Comment #3 From 2012-06-26 09:50:06 PST -------
Link to the Chromium half of the changes:
https://chromiumcodereview.appspot.com/10667028/
------- Comment #4 From 2012-06-26 11:15:40 PST -------
(From update of attachment 149415 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=149415&action=review

> Source/WebKit/chromium/src/WebFrameImpl.cpp:976
> +    m_frame->loader()->documentLoader()->request().setURL(url);

I know you said that this patch wasn't ready for general review, but this line looks very strange.  It's odd that the WebFrame is reaching this far into the loading machinery to mutate this state.  If you look at the implementations of reload() and loadRequest(), you'll see that they prepare a request and then hand it off to the loader.

I guess I don't fully understand what you're trying to achieve by "reloading" a page with a different URL.  Do you mean that you want to load a document in the frame and replace the current history item (and perhaps preserve the scroll state)?  If so, there might already be a more direct way to do that which we use to implement location.replace(...).
------- Comment #5 From 2012-06-26 13:11:24 PST -------
(In reply to comment #4)
> (From update of attachment 149415 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=149415&action=review
> 
> > Source/WebKit/chromium/src/WebFrameImpl.cpp:976
> > +    m_frame->loader()->documentLoader()->request().setURL(url);
> 
> I know you said that this patch wasn't ready for general review, but this line looks very strange.  It's odd that the WebFrame is reaching this far into the loading machinery to mutate this state.  If you look at the implementations of reload() and loadRequest(), you'll see that they prepare a request and then hand it off to the loader.
>
> I guess I don't fully understand what you're trying to achieve by "reloading" a page with a different URL.  Do you mean that you want to load a document in the frame and replace the current history item (and perhaps preserve the scroll state)?  If so, there might already be a more direct way to do that which we use to implement location.replace(...).

I guess it'd make sense to plop down and summarize my various e-mail threads here:

This is part of upstreaming the "Request desktop site" stuff for Chrome for Android.  When the user agent is overridden, we need to reload the page using a URL earlier in the redirect chain to get around any redirects caused by user agent detection; e.g. cnn.com will redirect m.cnn.com for mobile browsers.  If you do a normal reload using a new desktop user agent, you'll still be served m.cnn.com because that's the URL you're reloading.  This is kind of an odd case because it falls between a reload and a load: you're technically reloading the same navigation, but you might not get the same content.

There are two routes to go here: either the page could be reloaded with the original request URL or the page could be loaded as a new entry.  I put out a couple of feelers on this a month or so ago, and Darin suggested going with the reload route and trying to pull the originalRequest URL from the WebHistoryItem or some other location in WebKit code.  Problematically, the originalRequest URL isn't "properly" saved Chromium side -- the WebHistoryItem stored by a NavigationEntry can be overwritten several times per page with different original request URLs.  Rather than try to preserve the original request URL we needed inside the WebHistoryItem (which would require pickling and de-pickling it every time it was updated), we opted to store it separately (https://chromiumcodereview.appspot.com/9999010/).  This leads to the function prototype you see here, where we explicitly pass in the URL that should be reloaded.

The extra bits for the page scale and scrolling state are there because reloading a page in this way causes the page to zoom back in erroneously; if you load slashdot.org with a mobile UA and switch to a desktop UA, it will retain the same page scale and zoom in.

reloadWithGivenURL() calls reload() at its end, which pulls the request from the same document loader that we change the URL in.  Unfortunately, the only way to get reload() to load up a different URL is if it was unreachable the first time, so we explicitly change it here.  If we avoided calling reload(), then we could get around the issue, but we'd likely have to dive into WebCore.  I'll look into the location.replace() thing, though.
------- Comment #6 From 2012-06-26 13:45:09 PST -------
Thanks for the explanation.  It sounds like what you want to do is very similar to what we're doing on http://trac.webkit.org/browser/trunk/Source/WebCore/loader/FrameLoader.cpp#L1453 when reloading an error page.

Perhaps the best course of action is to introduce FrameLoader::reloadWithOverrideURL (following reloadWithOverrideEncoding) that takes a URL as a parameter.  We can then change FrameLoader:reload to call reloadWithOverrideURL around line 1457 (after it has decided whether to reload the current URL or the unreachableURL).

That ends up being very similar to what you've got in this patch, it just moves the detailed work of talking to the DocumentLoader into the FrameLoader class, which is much more interested in talking directly with the DocumentLoader than WebView is.

I'm less sure what to do precisely with the page scale aspect of this change.  Perhaps we should make this change in two parts, and think about the page scale aspects in the second patch?
------- Comment #7 From 2012-06-26 15:15:05 PST -------
(In reply to comment #6)
> Thanks for the explanation.  It sounds like what you want to do is very similar to what we're doing on http://trac.webkit.org/browser/trunk/Source/WebCore/loader/FrameLoader.cpp#L1453 when reloading an error page.
Yeah, that sounds about right.

> Perhaps the best course of action is to introduce FrameLoader::reloadWithOverrideURL (following reloadWithOverrideEncoding) that takes a URL as a parameter.  We can then change FrameLoader:reload to call reloadWithOverrideURL around line 1457 (after it has decided whether to reload the current URL or the unreachableURL).
>
> That ends up being very similar to what you've got in this patch, it just moves the detailed work of talking to the DocumentLoader into the FrameLoader class, which is much more interested in talking directly with the DocumentLoader than WebView is.
Is there a logical place to stick and set the override URL so that we can pull it out in FrameLoader::reload()?

> I'm less sure what to do precisely with the page scale aspect of this change.  Perhaps we should make this change in two parts, and think about the page scale aspects in the second patch?
Sure, we can push it to another patch.
------- Comment #8 From 2012-06-26 15:23:30 PST -------
(In reply to comment #7)
> (In reply to comment #6)
> > Thanks for the explanation.  It sounds like what you want to do is very similar to what we're doing on http://trac.webkit.org/browser/trunk/Source/WebCore/loader/FrameLoader.cpp#L1453 when reloading an error page.
> Yeah, that sounds about right.
> 
> > Perhaps the best course of action is to introduce FrameLoader::reloadWithOverrideURL (following reloadWithOverrideEncoding) that takes a URL as a parameter.  We can then change FrameLoader:reload to call reloadWithOverrideURL around line 1457 (after it has decided whether to reload the current URL or the unreachableURL).
> >
> > That ends up being very similar to what you've got in this patch, it just moves the detailed work of talking to the DocumentLoader into the FrameLoader class, which is much more interested in talking directly with the DocumentLoader than WebView is.
> Is there a logical place to stick and set the override URL so that we can pull it out in FrameLoader::reload()?
> 
> > I'm less sure what to do precisely with the page scale aspect of this change.  Perhaps we should make this change in two parts, and think about the page scale aspects in the second patch?
> Sure, we can push it to another patch.

Perhaps, instead of reloadWithOverrideURL(), we should consider reloadOriginalRequest() with no parameters? DocumentLoader::originalRequest() should have the desired URL already.
------- Comment #9 From 2012-06-26 15:38:58 PST -------
> Perhaps, instead of reloadWithOverrideURL(), we should consider reloadOriginalRequest() with no parameters? DocumentLoader::originalRequest() should have the desired URL already.

That's what I expected, but it doesn't always seem to.  Chromium appears to overwrite the WebHistoryItem for the same NavigationEntry sometimes, which causes us to replace the original request URL we need with a different one.  I'm pretty sure this happens when a redirect occurs, but I'll double check.
------- Comment #10 From 2012-06-26 15:45:05 PST -------
(In reply to comment #9)
> > Perhaps, instead of reloadWithOverrideURL(), we should consider reloadOriginalRequest() with no parameters? DocumentLoader::originalRequest() should have the desired URL already.
> 
> That's what I expected, but it doesn't always seem to.  Chromium appears to overwrite the WebHistoryItem for the same NavigationEntry sometimes, which causes us to replace the original request URL we need with a different one.  I'm pretty sure this happens when a redirect occurs, but I'll double check.

Note that DocumentLoader::originalRequest() and DocumentLoader::originalRequestCopy() are different in corner cases. To make matters worse:

FrameLoader::initialRequest() == DocumentLoader::originalRequest()
FrameLoader::originalRequest() == DocumentLoader::originalRequestCopy()

I assert that DocumentLoader::originalRequest() (aka FrameLoader::initialRequest()) will never change for the life of the DocumentLoader.
------- Comment #11 From 2012-06-26 16:06:59 PST -------
(In reply to comment #10)
> (In reply to comment #9)
> > > Perhaps, instead of reloadWithOverrideURL(), we should consider reloadOriginalRequest() with no parameters? DocumentLoader::originalRequest() should have the desired URL already.
> > 
> > That's what I expected, but it doesn't always seem to.  Chromium appears to overwrite the WebHistoryItem for the same NavigationEntry sometimes, which causes us to replace the original request URL we need with a different one.  I'm pretty sure this happens when a redirect occurs, but I'll double check.
> 
> Note that DocumentLoader::originalRequest() and DocumentLoader::originalRequestCopy() are different in corner cases. To make matters worse:
> 
> FrameLoader::initialRequest() == DocumentLoader::originalRequest()
> FrameLoader::originalRequest() == DocumentLoader::originalRequestCopy()
> 
> I assert that DocumentLoader::originalRequest() (aka FrameLoader::initialRequest()) will never change for the life of the DocumentLoader.

Changed the function to look like this in my branch:
 987 void WebFrameImpl::reloadWithGivenURL(const WebURL& url)
 988 {
 989     const KURL& initialURL = m_frame->loader()->initialRequest().url();
 990     const KURL& originalURL = m_frame->loader()->originalRequest().url();
 991     fprintf(stderr, "Initial URL: %s\n", initialURL.string().utf8().data());
 992     fprintf(stderr, "Original URL: %s\n", originalURL.string().utf8().data());
 993     fprintf(stderr, "Passed in URL: %s\n", url.spec().data());
 994 
 995     m_frame->loader()->activeDocumentLoader()->request().setURL(originalURL);
 996 
 997     // We're reloading with a different URL than the loader used the last time,
 998     // so ignore the contents of the cache.
 999     viewImpl()->clearPageScaleFactorForReload();
1000     reload(true);
1001 }   
The passed in URL is the original request URL that I manually track in Chromium's NavigationEntries.

ESPN's site is one of the cases that redirects for mobile user agents; when I navigated to http://espn.com on a Chrome for Android build, I ended up with this output when I called the function:
06-26 16:03:03.036 15200 15204 I stderr  : Initial URL: http://m.espn.go.com/wireless/?i=COMR
06-26 16:03:03.036 15200 15204 I stderr  : Original URL: http://m.espn.go.com/wireless/?i=COMR
06-26 16:03:03.036 15200 15204 I stderr  : Passed in URL: http://espn.com/

Am I checking the wrong ones?
------- Comment #12 From 2012-06-26 16:27:03 PST -------
(In reply to comment #11)
> (In reply to comment #10)
> > (In reply to comment #9)
> > > > Perhaps, instead of reloadWithOverrideURL(), we should consider reloadOriginalRequest() with no parameters? DocumentLoader::originalRequest() should have the desired URL already.
> > > 
> > > That's what I expected, but it doesn't always seem to.  Chromium appears to overwrite the WebHistoryItem for the same NavigationEntry sometimes, which causes us to replace the original request URL we need with a different one.  I'm pretty sure this happens when a redirect occurs, but I'll double check.
> > 
> > Note that DocumentLoader::originalRequest() and DocumentLoader::originalRequestCopy() are different in corner cases. To make matters worse:
> > 
> > FrameLoader::initialRequest() == DocumentLoader::originalRequest()
> > FrameLoader::originalRequest() == DocumentLoader::originalRequestCopy()
> > 
> > I assert that DocumentLoader::originalRequest() (aka FrameLoader::initialRequest()) will never change for the life of the DocumentLoader.
> 
> Changed the function to look like this in my branch:
>  987 void WebFrameImpl::reloadWithGivenURL(const WebURL& url)
>  988 {
>  989     const KURL& initialURL = m_frame->loader()->initialRequest().url();
>  990     const KURL& originalURL = m_frame->loader()->originalRequest().url();
>  991     fprintf(stderr, "Initial URL: %s\n", initialURL.string().utf8().data());
>  992     fprintf(stderr, "Original URL: %s\n", originalURL.string().utf8().data());
>  993     fprintf(stderr, "Passed in URL: %s\n", url.spec().data());
>  994 
>  995     m_frame->loader()->activeDocumentLoader()->request().setURL(originalURL);
>  996 
>  997     // We're reloading with a different URL than the loader used the last time,
>  998     // so ignore the contents of the cache.
>  999     viewImpl()->clearPageScaleFactorForReload();
> 1000     reload(true);
> 1001 }   
> The passed in URL is the original request URL that I manually track in Chromium's NavigationEntries.
> 
> ESPN's site is one of the cases that redirects for mobile user agents; when I navigated to http://espn.com on a Chrome for Android build, I ended up with this output when I called the function:
> 06-26 16:03:03.036 15200 15204 I stderr  : Initial URL: http://m.espn.go.com/wireless/?i=COMR
> 06-26 16:03:03.036 15200 15204 I stderr  : Original URL: http://m.espn.go.com/wireless/?i=COMR
> 06-26 16:03:03.036 15200 15204 I stderr  : Passed in URL: http://espn.com/
> 
> Am I checking the wrong ones?


Doh, I bet chrome is swapping out the renderer. So we wouldn't necessarily have the URL in the current renderer already. Bah.

I guess that means we would have to go the route of reloadWithOverrideURL() after all. Sorry for leading you astray :(
------- Comment #13 From 2012-06-26 16:36:49 PST -------
> Doh, I bet chrome is swapping out the renderer. So we wouldn't necessarily have the URL in the current renderer already. Bah.
> 
> I guess that means we would have to go the route of reloadWithOverrideURL() after all. Sorry for leading you astray :(

No worries; I was hoping that you were right ;)  I'll prep a patch with the new function and see how that goes.
------- Comment #14 From 2012-06-27 17:08:34 PST -------
Created an attachment (id=149827) [details]
Patch
------- Comment #15 From 2012-06-27 17:12:21 PST -------
Please wait for approval from abarth@webkit.org, dglazkov@chromium.org, fishd@chromium.org, jamesr@chromium.org or tkent@chromium.org before submitting, as this patch contains changes to the Chromium public API. See also https://trac.webkit.org/wiki/ChromiumWebKitAPI.
------- Comment #16 From 2012-06-27 17:26:21 PST -------
(From update of attachment 149827 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=149827&action=review

If I were writing this patch, I would include a simple API test that showed that reloading a WebFrame with an override URL preserved some state about the page that you cared about (like the scroll position) but ended up at a different URL.  Note: It's not important for the test to involve a redirect---it just needs two URLs.

> Source/WebCore/loader/FrameLoader.cpp:1446
> +    // If the URL is empty, don't bother with reloading.

I would remove this comment because it just states what the code does.

> Source/WebCore/loader/FrameLoader.cpp:1452
> +    ResourceRequest initialRequest = m_documentLoader->request();
> +    initialRequest.setURL(overrideUrl);
> +    reloadInitialRequest(initialRequest, endToEndReload);

nit: I would just call initialRequest "request".  The word "initial" doesn't really mean much here.

> Source/WebCore/loader/FrameLoader.cpp:1478
> +    if (!m_documentLoader)
> +        return;
> +

Given that this is a private method, we don't need to re-check m_documentLoader for being 0.  If you like, you can ASSERT that it's non-0 instead.

> Source/WebCore/loader/FrameLoader.h:118
> +    void reloadWithOverrideURL(const KURL& overrideUrl, bool endToEndReload);

Should endToEndReload default to false, like for reload() ?

> Source/WebCore/loader/FrameLoader.h:348
> +    void reloadInitialRequest(const ResourceRequest&, bool endToEndReload);

Maybe reloadInitialRequest -> reloadWithRequest ?  Again, there's nothing particularly "initial" about the request at this point.

> Source/WebKit/chromium/public/WebFrame.h:323
> +    // Reload the current document, but change the URL to the given one first.
> +    // This is used for situations where we need to avoid a redirect.

I might say something like:

// This is used for situations where we want to reload a different URL because of a redirect.

> Source/WebKit/chromium/src/WebFrameImpl.cpp:981
> +void WebFrameImpl::reloadWithOverrideURL(const WebURL& overrideUrl, bool endToEndReload)

It's slightly odd that the second parameter changed names.
------- Comment #17 From 2012-06-27 17:59:59 PST -------
(From update of attachment 149827 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=149827&action=review

> If I were writing this patch, I would include a simple API test that showed that reloading a WebFrame with an override URL preserved some state about the page that you cared about (like the scroll position) but ended up at a different URL.  Note: It's not important for the test to involve a redirect---it just needs two URLs.

I'll get started on adding one.  I'm uploading a patch in the meantime that addresses your other comments.

I'm slightly confused about why the WebFrameImpl::reload() preserves the document and scroll state when it's reloaded from scratch and the cache is ignored, though.  Is that intentional?

>> Source/WebCore/loader/FrameLoader.cpp:1446
>> +    // If the URL is empty, don't bother with reloading.
> 
> I would remove this comment because it just states what the code does.

Done.

>> Source/WebCore/loader/FrameLoader.cpp:1452
>> +    reloadInitialRequest(initialRequest, endToEndReload);
> 
> nit: I would just call initialRequest "request".  The word "initial" doesn't really mean much here.

Done.

>> Source/WebCore/loader/FrameLoader.cpp:1478
>> +
> 
> Given that this is a private method, we don't need to re-check m_documentLoader for being 0.  If you like, you can ASSERT that it's non-0 instead.

Yeah, I added the if because I didn't realize ASSERTs were available.  Switched it over.

>> Source/WebCore/loader/FrameLoader.h:118
>> +    void reloadWithOverrideURL(const KURL& overrideUrl, bool endToEndReload);
> 
> Should endToEndReload default to false, like for reload() ?

Added in the default.

>> Source/WebCore/loader/FrameLoader.h:348
>> +    void reloadInitialRequest(const ResourceRequest&, bool endToEndReload);
> 
> Maybe reloadInitialRequest -> reloadWithRequest ?  Again, there's nothing particularly "initial" about the request at this point.

Done.

>> Source/WebKit/chromium/public/WebFrame.h:323
>> +    // This is used for situations where we need to avoid a redirect.
> 
> I might say something like:
> 
> // This is used for situations where we want to reload a different URL because of a redirect.

Done.

>> Source/WebKit/chromium/src/WebFrameImpl.cpp:981
>> +void WebFrameImpl::reloadWithOverrideURL(const WebURL& overrideUrl, bool endToEndReload)
> 
> It's slightly odd that the second parameter changed names.

Switched it over; I think it might have been a holdover from one of my previous versions.
------- Comment #18 From 2012-06-27 18:01:26 PST -------
Created an attachment (id=149836) [details]
Patch
------- Comment #19 From 2012-06-27 18:02:59 PST -------
> I'm slightly confused about why the WebFrameImpl::reload() preserves the document and scroll state when it's reloaded from scratch and the cache is ignored, though.  Is that intentional?

Preserving the scroll state makes sense to me.  Even if you "shift-reload" the page, you expect the scroll state to be preserved.  I'm less sure what preserving the document does.  I do think it makes sense for both the reload and reloadWithOverrideURL to do the same thing in these respects.
------- Comment #20 From 2012-06-27 22:31:56 PST -------
(From update of attachment 149836 [details])
Looks good.  Not sure if you want to add an API test before landing.
------- Comment #21 From 2012-06-28 03:28:58 PST -------
Created an attachment (id=149912) [details]
Patch
------- Comment #22 From 2012-06-28 03:37:37 PST -------
Added a test to WebFrameTest.cpp; I ended up reusing a few random HTML filenames from the data directory, but I can add new ones, if necessary.

Passes on my linux box locally, but I'm still trying to set up my Chromium repo with WebKit to try it out on the trybots.
------- Comment #23 From 2012-06-28 08:17:20 PST -------
(From update of attachment 149912 [details])
Looks great, thanks!  By the way, I would have expected EXPECT_EQ rather than ASSERT_EQ, but I'm not a gtest expert.
------- Comment #24 From 2012-06-28 09:05:59 PST -------
(From update of attachment 149912 [details])
Clearing flags on attachment: 149912

Committed r121434: <http://trac.webkit.org/changeset/121434>
------- Comment #25 From 2012-06-28 09:06:06 PST -------
All reviewed patches have been landed.  Closing bug.
------- Comment #26 From 2012-06-28 13:30:51 PST -------
(In reply to comment #23)
> (From update of attachment 149912 [details] [details])
> Looks great, thanks!  By the way, I would have expected EXPECT_EQ rather than ASSERT_EQ, but I'm not a gtest expert.

Yeah, I was debating whether to use the ASSERT or the EXPECT but figured the number of checks in there was small enough.  I'll likely have to revisit the test to tackle the part of this bug that got splintered off, so I can switch over then.

Thanks for landing it!