Bug 146306

Summary: WebView and WebPage URLs not updated after URL is modified by InjectedBundlePageResourceLoadClient::willSendRequestForFrame
Product: WebKit Reporter: Michael Catanzaro <mcatanzaro>
Component: WebKit2Assignee: Michael Catanzaro <mcatanzaro>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, bugzilla, cgarcia, darin, mcatanzaro, mitz, youennf
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Linux   
See Also: https://bugs.webkit.org/show_bug.cgi?id=163436
Bug Depends on:    
Bug Blocks: 163389    
Attachments:
Description Flags
Patch
cgarcia: review-
Patch
none
Updated patch darin: review+, darin: commit-queue-

Description Michael Catanzaro 2015-06-24 19:59:54 PDT
InjectedBundlePageResourceLoadClient::willSendRequestForFrame() allows clients to modify the ResourceRequest passed to it. We rely on this in the GTK+ API in order to allow a web extension to modify URLs before they are loaded. For example, Epiphany uses this to strip out query parameters that are well-known to be used for tracking users. This mostly works fine; the problem is that our WebKitWebPage and WebKitWebView objects' URI properties are not updated when this happens, so applications wind up displaying to users the original URL rather than what actually hit the network.

I will attach one possible solution. I probably should also add a cross-platform test, since it looks like the Mac port allows this as well.
Comment 1 Michael Catanzaro 2015-06-24 20:09:44 PDT
Created attachment 255538 [details]
Patch
Comment 2 Carlos Garcia Campos 2015-06-24 23:59:43 PDT
Comment on attachment 255538 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=255538&action=review

> Source/WebKit2/WebProcess/InjectedBundle/API/gtk/WebKitWebPage.cpp:204
> +    webkitWebPageSetURI(webPage, webkit_uri_request_get_uri(request.get()));

I don't think this is correct, you are setting the page URI as the URL of any subresource. willSendRequestForFrame is not only called for the main resource, but for all resources.

> Source/WebKit2/WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:184
> +    if (request.url() != originalURL) {

The request can be null after willSendRequest, which will cancel the load. We don't want to send the DidChangeProvisionalURLForFrame in that case.

> Source/WebKit2/WebProcess/WebCoreSupport/WebFrameLoaderClient.cpp:188
> +        if (FrameLoader* frameLoader = loader->frameLoader()) {
> +            if (auto documentLoader = static_cast<WebDocumentLoader*>(frameLoader->provisionalDocumentLoader()))
> +                webPage->send(Messages::WebPageProxy::DidChangeProvisionalURLForFrame(m_frame->frameID(), documentLoader->navigationID(), request.url().string()));
> +        }

And the same happens here, you are notifying the UI process for every subresource. Also, I don't think this is the right place nor the right way, since this could not be in sync with the ResourceRequest of the DocumentLoader. I think the message should always be sent with the current value of the DocumentLoader. So, DocumentLoader::startLoadingMainResource() might be the right place, right after calling willSendRequest, using frameLoader()->client().dispatchDidChangeProvisionalURL().
Comment 3 Michael Catanzaro 2016-10-09 14:10:20 PDT
I just added HTTPS Everywhere support to Epiphany today. This is causing odd results, we show e.g. http://www.imgur.com with a secure icon, because it's really displaying https://www.imgur.com and we really are using TLS, but the URI property of the WebKitWebView is set wrong.
Comment 4 Michael Catanzaro 2016-10-09 20:29:48 PDT
(In reply to comment #2)
> So, DocumentLoader::startLoadingMainResource()
> might be the right place, right after calling willSendRequest, using
> frameLoader()->client().dispatchDidChangeProvisionalURL().

I was having trouble with this. Turns out that in my test case ("http://imgur.com") is actually not called by DocumentLoader at all, it's called by SubresourceLoader. Yes, for the main resource! So that's strange.
Comment 5 Bastien Nocera 2016-10-11 21:10:12 PDT
(In reply to comment #4)
> (In reply to comment #2)
> > So, DocumentLoader::startLoadingMainResource()
> > might be the right place, right after calling willSendRequest, using
> > frameLoader()->client().dispatchDidChangeProvisionalURL().
> 
> I was having trouble with this. Turns out that in my test case
> ("http://imgur.com") is actually not called by DocumentLoader at all, it's
> called by SubresourceLoader. Yes, for the main resource! So that's strange.

Is this what also causes the "utm" tracking not being removed from the URLs shown in epiphany?
Comment 6 Michael Catanzaro 2016-10-12 09:31:41 PDT
(In reply to comment #5)
> Is this what also causes the "utm" tracking not being removed from the URLs
> shown in epiphany?

Yes, I will try to fix it soon because it's important now for any site that triggers the HTTPS Everywhere ruleset, but I need to understand the code here more. It's very confusing that the willSendRequest load delegate is called from SubresourceLoader even for the main resource. There is code in DocumentLoader to call it for the main resource, but it's not ever hit when loading http://imgur.com, my test case. There is also a DocumentLoader::willSendRequest function that does get called, and which confused me for a while, but that function doesn't call the load delegate of the same name.
Comment 7 Carlos Garcia Campos 2016-10-13 05:45:21 PDT
Created attachment 291474 [details]
Patch
Comment 8 Carlos Garcia Campos 2016-10-13 06:57:23 PDT
Forgot to say, this patch includes new cross-platform wk2 unit test, but I don't have a mac to add the new files to the xcode file. Fortunately it doesn't break the build, so someone with a mac can add them in a follow up commit.
Comment 9 Michael Catanzaro 2016-10-13 07:46:21 PDT
Comment on attachment 291474 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291474&action=review

Thanks very much for working on this.

> Tools/TestWebKitAPI/Tests/WebKit2/ProvisionalURLAfterWillSendRequestCallback_Bundle.cpp:46
> +    static WKURLRequestRef willSendRequestForFrame(WKBundlePageRef, WKBundleFrameRef frame, uint64_t resourceIdentifier, WKURLRequestRef request, WKURLResponseRef redirectResponse, const void *clientInfo)

The uri property of WebKitWebPage in the GTK+ API was also broken, so can you check in didSendRequestForFrame that the URL of WKBundlePage has actually changed (if it's the main frame)? I guess it could be done with WKBundlePageGetMainFrame -> WKBundleFrameCopyURL.
Comment 10 Carlos Garcia Campos 2016-10-13 08:31:06 PDT
Comment on attachment 291474 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291474&action=review

>> Tools/TestWebKitAPI/Tests/WebKit2/ProvisionalURLAfterWillSendRequestCallback_Bundle.cpp:46
>> +    static WKURLRequestRef willSendRequestForFrame(WKBundlePageRef, WKBundleFrameRef frame, uint64_t resourceIdentifier, WKURLRequestRef request, WKURLResponseRef redirectResponse, const void *clientInfo)
> 
> The uri property of WebKitWebPage in the GTK+ API was also broken, so can you check in didSendRequestForFrame that the URL of WKBundlePage has actually changed (if it's the main frame)? I guess it could be done with WKBundlePageGetMainFrame -> WKBundleFrameCopyURL.

The frame is a parameter of all callbacks, so we can just use that. But here the URL hasn't changed yet, we would need to check it in didCommitLoad callback. Bu the URI property of WebKitWebPage wouldn't be affected by this, we set it manually. So, I think we should connect to didcommitLoad and set it there. It's actually a different bug, so for the GTK+ issue we can file a new bug and I¡'ll submit a new patch with the gtk+ specific unit test.
Comment 11 Carlos Garcia Campos 2016-10-13 08:54:33 PDT
Created attachment 291485 [details]
Updated patch

Check also the injected bundle frame url in the unit test.
Comment 12 Michael Catanzaro 2016-10-13 09:05:56 PDT
(In reply to comment #10)
> The frame is a parameter of all callbacks, so we can just use that. But here
> the URL hasn't changed yet, we would need to check it in didCommitLoad
> callback. Bu the URI property of WebKitWebPage wouldn't be affected by this,
> we set it manually. So, I think we should connect to didcommitLoad and set
> it there. It's actually a different bug, so for the GTK+ issue we can file a
> new bug and I¡'ll submit a new patch with the gtk+ specific unit test.

OK thanks. Bug #163389.
Comment 13 Darin Adler 2016-10-13 10:00:45 PDT
Comment on attachment 291485 [details]
Updated patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291485&action=review

This change seems to be causing the Acid3 test to fail on the bots, so please check into that before landing it.

> Source/WebCore/loader/DocumentLoader.cpp:233
> +    bool notifyProvisionalURLChange = false;

For a boolean we typically don’t want to use a verb phrase. So better names for this would be shouldNotifyAboutProvisionalURLChange or didChangeProvisionalURL.

> Source/WebCore/loader/DocumentLoader.cpp:236
> +    else if (isLoadingMainResource() && !m_request.url().isEmpty() && req.url() != m_request.url())

I must admit I don’t understand why the m_request.url().isEmpty() check is correct.
Comment 14 Darin Adler 2016-10-13 10:03:36 PDT
No, the Acid3 failure seems to be happening before the patch also, and is not due to this patch.
Comment 15 Michael Catanzaro 2016-10-13 10:14:04 PDT
(In reply to comment #13)
> I must admit I don’t understand why the m_request.url().isEmpty() check is
> correct.

If it's empty then the load is going to be canceled. I guess it's pointless to add a new didChangeProvisionalURL callback in that case. Is that right, Carlos?
Comment 16 Michael Catanzaro 2016-10-13 10:16:17 PDT
(In reply to comment #15)
> If it's empty then the load is going to be canceled. I guess it's pointless
> to add a new didChangeProvisionalURL callback in that case. Is that right,
> Carlos?

Nope, I'm wrong as usual. If that was the point of this code, then it would be checking req.url(), not m_request.url().

How could this code ever be reached if m_request.url() is empty?
Comment 17 Carlos Garcia Campos 2016-10-13 23:00:51 PDT
No, the empty check is simply not needed, I forgot to remove it, because I started with a different approach in ResourceRequest where it was needed, and forgot to remove it when I moved it to the DocumentLoader.
Comment 18 Carlos Garcia Campos 2016-10-13 23:32:25 PDT
Committed r207325: <http://trac.webkit.org/changeset/207325>
Comment 19 Carlos Garcia Campos 2016-10-13 23:32:57 PDT
Darin, could you please add the new files to XCode in a follow up commit?
Comment 20 Carlos Garcia Campos 2016-10-14 01:52:12 PDT
Adding unit tests for this in the GTK+ port I've noticed that this still doesn't work when a request is modified by willSendRequest after a redirect. The thing is that DocumentLoader::setRequest is not called after willSendRequest is called for a redirect. The first willSendRequest happens before DocumentLoader::startLoadingMainResource, that calls setRequest, but the second one happens after DocumentLoader::redirectReceived() and then the request is never updated again. I'll file a new bug report.
Comment 21 youenn fablet 2016-10-14 01:57:52 PDT
(In reply to comment #20)
> Adding unit tests for this in the GTK+ port I've noticed that this still
> doesn't work when a request is modified by willSendRequest after a redirect.
> The thing is that DocumentLoader::setRequest is not called after
> willSendRequest is called for a redirect. The first willSendRequest happens
> before DocumentLoader::startLoadingMainResource, that calls setRequest, but
> the second one happens after DocumentLoader::redirectReceived() and then the
> request is never updated again. I'll file a new bug report.

I can update the Xcode files.
Should we wait for 163436 or is it ok to add them now?
Comment 22 Carlos Garcia Campos 2016-10-14 02:04:52 PDT
(In reply to comment #21)
> (In reply to comment #20)
> > Adding unit tests for this in the GTK+ port I've noticed that this still
> > doesn't work when a request is modified by willSendRequest after a redirect.
> > The thing is that DocumentLoader::setRequest is not called after
> > willSendRequest is called for a redirect. The first willSendRequest happens
> > before DocumentLoader::startLoadingMainResource, that calls setRequest, but
> > the second one happens after DocumentLoader::redirectReceived() and then the
> > request is never updated again. I'll file a new bug report.
> 
> I can update the Xcode files.

Thanks! Now that you are on it . . . I also need help with that in bug #160497, hopefully the patch still applies.

> Should we wait for 163436 or is it ok to add them now?

It's ok to do it now, bug #163436 is related but a different bug in the end.