Bug 163103

Summary: CachedResourceLoader should set headers of the HTTP request prior checking for the cache
Product: WebKit Reporter: youenn fablet <youennf>
Component: WebCore Misc.Assignee: youenn fablet <youennf>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, beidson, cdumez, commit-queue, darin, dbates, japhet, koivisto
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
WIP
none
Rebasing
none
Patch
none
Patch for landing
none
Archive of layout-test-results from webkit-cq-02 for mac-yosemite
none
Patch for landing none

Description youenn fablet 2016-10-07 01:22:36 PDT
This would allow to align more clearly with fetch and would remove the need to create a temporary ResourceRequest in css of response with Vary headers.
Comment 1 youenn fablet 2016-10-07 02:11:03 PDT
Created attachment 290916 [details]
Patch
Comment 2 Darin Adler 2016-10-07 12:15:13 PDT
Comment on attachment 290916 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=290916&action=review

Is our test coverage good enough here? All this is just refactoring and doesn’t cause any visible improvement in behavior?

> Source/WebCore/loader/cache/CachedResource.h:242
>      void registerHandle(CachedResourceHandleBase* h);
>      WEBCORE_EXPORT void unregisterHandle(CachedResourceHandleBase* h);

Wow, look at that. Those argument names "h" and the fact that these take a pointer, not a reference.

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:652
> +    FrameLoader& frameLoader = frame->loader();

auto&

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:689
> +    if (outgoingReferrer.isEmpty())
> +        request.clearHTTPReferrer();
> +    else
> +        request.setHTTPReferrer(outgoingReferrer);

Maybe the setHTTPReferrer function should handle the special case for empty string instead of having it be done here at the call site.

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:100
> +    if (!m_origin || m_options.mode == FetchOptions::Mode::SameOrigin)
> +        return;
> +
> +    if (m_resourceRequest.url().protocolIsData() && m_options.sameOriginDataURLFlag == SameOriginDataURLFlag::Set)
> +        return;

Since this relies on m_isCrossOrigin already being false, then maybe we should assert that.

Or, better, change this into a function that returns a boolean and then just write:

    m_isCrossOrigin = computeIsCrossOrigin(xxx);

> Source/WebCore/loader/cache/CachedResourceRequest.h:63
> +    bool isCrossOrigin() const { return m_isCrossOrigin; }

Can we remove the other isCrossOrigin() function, or do we need both now?
Comment 3 youenn fablet 2016-10-11 03:46:15 PDT
Comment on attachment 290916 [details]
Patch

Thanks for the review.

View in context: https://bugs.webkit.org/attachment.cgi?id=290916&action=review

>> Source/WebCore/loader/cache/CachedResource.h:242
>>      WEBCORE_EXPORT void unregisterHandle(CachedResourceHandleBase* h);
> 
> Wow, look at that. Those argument names "h" and the fact that these take a pointer, not a reference.

Probably worth some refactoring.
I guess doing a grep on "(this)" would give us other places where we could do such refactoring.

>> Source/WebCore/loader/cache/CachedResourceLoader.cpp:652
>> +    FrameLoader& frameLoader = frame->loader();
> 
> auto&

ok

>> Source/WebCore/loader/cache/CachedResourceLoader.cpp:689
>> +        request.setHTTPReferrer(outgoingReferrer);
> 
> Maybe the setHTTPReferrer function should handle the special case for empty string instead of having it be done here at the call site.

That could probably be done since referrer value is probably not valid if empty.
I prefer keeping it like this for now as that may be too subtle for other call sites.

>> Source/WebCore/loader/cache/CachedResourceRequest.cpp:100
>> +        return;
> 
> Since this relies on m_isCrossOrigin already being false, then maybe we should assert that.
> 
> Or, better, change this into a function that returns a boolean and then just write:
> 
>     m_isCrossOrigin = computeIsCrossOrigin(xxx);

computeIsCrossOrigin sounds good.

>> Source/WebCore/loader/cache/CachedResourceRequest.h:63
>> +    bool isCrossOrigin() const { return m_isCrossOrigin; }
> 
> Can we remove the other isCrossOrigin() function, or do we need both now?

Do you mean CachedResource::isCrossOrigin()?
If so, we need both. I will change origin() access to hasOrigin().
Comment 4 youenn fablet 2016-10-11 03:55:08 PDT
> Is our test coverage good enough here? All this is just refactoring and
> doesn’t cause any visible improvement in behavior?

I tried to make it so there should be no change of behavior.
This patch aligns the code with fetch algorithm organisation so that we can better reason about it.

The problem with modifying our fetch algorithm is that the spec is still moving and browsers interoperability is not very good yet on corner cases.
In some cases, we could change our behavior to align with the spec, but that may not mean that we will improve our interoperability, at least in the short term.

We dit that for preflight after cross-origin redirections since this is a "new" feature and impact could be easily identified. For deeper changes, this is of course much more difficult.
Comment 5 youenn fablet 2016-10-11 05:57:12 PDT
Created attachment 291251 [details]
WIP
Comment 6 youenn fablet 2016-10-11 05:58:24 PDT
(In reply to comment #5)
> Created attachment 291251 [details]
> WIP

There is a problem with computing isCrossOrigin in CachedResourceRequest, since its request URL can be upgraded or changed through content security blocker.
Comment 7 youenn fablet 2016-10-17 06:43:56 PDT
Created attachment 291813 [details]
Rebasing
Comment 8 Alex Christensen 2016-10-17 08:39:56 PDT
Comment on attachment 291813 [details]
Rebasing

View in context: https://bugs.webkit.org/attachment.cgi?id=291813&action=review

> Source/WebCore/loader/FrameLoader.cpp:2629
> +    if (isMainResource) {

We are moving things into this if statement that weren't there before.  If that is correct and doesn't change behavior, we ought to assert that it already has an Origin header and a User Agent, and make sure it's covered by tests by removing these headers locally and verifying that a test fails.  Otherwise, we should add such tests.

> Source/WebCore/loader/PingLoader.cpp:192
> +    frame.loader().applyUserAgent(request);

Are all pings supposed to have a user agent?  Are there tests for this?

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:73
> +static bool computeIsCrossOrigin(SecurityOrigin*, const URL&, const ResourceLoaderOptions&);

Just put the code here if we go with this approach, which I don't think we should.

> Source/WebCore/loader/cache/CachedResourceRequest.h:94
>      RefPtr<SecurityOrigin> m_origin;
> +    bool m_isCrossOrigin { false };

I don't think the optimization of calculating isCrossOrigin once is worth adding the complexity of caching m_isCrossOrigin.  If we forget to update it somewhere where we update the URL or SecurityOrigin, bad things would happen.  I think we should just move the code from computeIsCrossOrigin to isCrossOrigin.
Comment 9 youenn fablet 2016-10-17 08:56:52 PDT
Comment on attachment 291813 [details]
Rebasing

Thanks for the review.

View in context: https://bugs.webkit.org/attachment.cgi?id=291813&action=review

>> Source/WebCore/loader/FrameLoader.cpp:2629
>> +    if (isMainResource) {
> 
> We are moving things into this if statement that weren't there before.  If that is correct and doesn't change behavior, we ought to assert that it already has an Origin header and a User Agent, and make sure it's covered by tests by removing these headers locally and verifying that a test fails.  Otherwise, we should add such tests.

We are doing so as we now do this directly by callers of this method which are not part of the navigation algorithm.

I'll try to check things locally.
Asserting Origin might not make sense since it is not always present but I can check this locally.

>> Source/WebCore/loader/PingLoader.cpp:192
>> +    frame.loader().applyUserAgent(request);
> 
> Are all pings supposed to have a user agent?  Are there tests for this?

Not sure about tests. I think user-agent is lightly tested but I'll try it.
Ping is supposed to use fetch and fetch is assuming user-agent to be there (it is only a SHOULD though).

>> Source/WebCore/loader/cache/CachedResourceRequest.h:94
>> +    bool m_isCrossOrigin { false };
> 
> I don't think the optimization of calculating isCrossOrigin once is worth adding the complexity of caching m_isCrossOrigin.  If we forget to update it somewhere where we update the URL or SecurityOrigin, bad things would happen.  I think we should just move the code from computeIsCrossOrigin to isCrossOrigin.

There will only be two calls at max to isCrossOrigin for each CachedResourceRequest, so your suggestion sounds good, at least for the moment.
Comment 10 youenn fablet 2016-10-18 08:33:00 PDT
Created attachment 291948 [details]
Patch
Comment 11 youenn fablet 2016-10-18 08:34:01 PDT
(In reply to comment #10)
> Created attachment 291948 [details]
> Patch

I removed the changes to FrameLoader related to user agent and Origin headers.
This should be done as a follow-up patch.
Comment 12 Darin Adler 2016-10-24 10:32:24 PDT
Comment on attachment 291948 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291948&action=review

This looks like an improvement; a little sad for performance to do some extra work even before looking in the cache, but seems like the right thing.

> Source/WebCore/loader/cache/CachedResource.cpp:254
> +    if (type() != CachedResource::MainResource)
> +        frameLoader.addExtraFieldsToSubresourceRequest(m_resourceRequest);
> +
>  

Extra blank line here.

Might need a why comment explaining why there’s no need to do anything here for main resources. Name of function makes it clear why we would not call that function, but *not* why we don’t need any work here for the main resource. Kind of inelegant that they are different in this way, in fact.

> Source/WebCore/loader/cache/CachedResource.h:242
> +    void registerHandle(CachedResourceHandleBase*);
> +    WEBCORE_EXPORT void unregisterHandle(CachedResourceHandleBase*);

These should take references; worth fixing next time you touch this.

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:648
> +    // FIXME: We should reconcile handling of MainResource with other resources.

Ah, look at that, you already had the same thought as the thing I said above.

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:654
> +        auto* frame = this->frame();
> +        if (frame)
> +            request.updateReferrerOriginAndUserAgentHeaders(frame->loader(), document() ? document()->referrerPolicy() : ReferrerPolicy::Default);

Should put the declaration of frame inside the if. Also might want document in a local variable instead of calling the function twice.

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:221
> +    String outgoingReferrer;
> +    String outgoingOrigin;
> +    if (m_resourceRequest.httpReferrer().isNull()) {
> +        outgoingReferrer = frameLoader.outgoingReferrer();
> +        outgoingOrigin = frameLoader.outgoingOrigin();
> +    } else {
> +        outgoingReferrer = m_resourceRequest.httpReferrer();
> +        outgoingOrigin = SecurityOrigin::createFromString(outgoingReferrer)->toString();
> +    }

Could initialize outgoingReferrer and then check it for null, rather than calling m_resourceRequest.httpReferrer() twice.

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:248
> +    if (outgoingReferrer.isEmpty())
> +        m_resourceRequest.clearHTTPReferrer();
> +    else
> +        m_resourceRequest.setHTTPReferrer(outgoingReferrer);

Should we teach this rule to setHTTPReferrer instead of doing it here?

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:254
> +bool isRequestCrossOrigin(SecurityOrigin* origin, const URL& url, const ResourceLoaderOptions& options)

I would name the argument requestURL rather than url since the origin also likely has a URL.

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:260
> +    if (options.mode == FetchOptions::Mode::SameOrigin)
> +        return false;

This probably needs a why comment. Something like:

    // Using same origin mode guarantees the loader will not do a cross-origin load, so we let it take care of it and just return false.

But maybe an even better comment will help more.

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:263
> +    if (url.protocolIsData() && options.sameOriginDataURLFlag == SameOriginDataURLFlag::Set)
> +        return false;

This one might need a why comment too. I don’t fully understand it.

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:268
> +}
> +
> +

Extra blank line here.

> Source/WebCore/loader/cache/CachedResourceRequest.h:49
> +bool isRequestCrossOrigin(SecurityOrigin*, const URL&, const ResourceLoaderOptions&);

I think we should name the URL argument requestURL; maybe it seems obvious, but keep in mind that security origins also often have URLs so there is more than one URL involved here.
Comment 13 youenn fablet 2016-10-24 14:54:47 PDT
Thanks for the review.

> > Source/WebCore/loader/cache/CachedResourceLoader.cpp:648
> > +    // FIXME: We should reconcile handling of MainResource with other resources.
> 
> Ah, look at that, you already had the same thought as the thing I said above.

Note that this task might not be straightforward as the fetch algorithm and the navigation algorithm are not aligned in spec and in code.
We need more tests and be careful before modifying the navigation algorithm.
This is why this version of the patch is more conservative with navigation algorithm changes than a previous version.

But I agree we should definitely go into that direction.
Also the navigation algorithm current code is complex to understand and difficult to maintain.
Improving this would be a healthy task for sure.
Comment 14 youenn fablet 2016-10-25 01:57:42 PDT
Created attachment 292737 [details]
Patch for landing
Comment 15 youenn fablet 2016-10-25 02:03:58 PDT
Thanks for the review.

(In reply to comment #12)
> Comment on attachment 291948 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=291948&action=review
> 
> This looks like an improvement; a little sad for performance to do some
> extra work even before looking in the cache, but seems like the right thing.

I felt the same.
I think we should first try to simplify things to have manageable rules.
Then we could start reoptimizing things.


> > Source/WebCore/loader/cache/CachedResource.cpp:254
> > +    if (type() != CachedResource::MainResource)
> > +        frameLoader.addExtraFieldsToSubresourceRequest(m_resourceRequest);
> > +
> >  
> 
> Extra blank line here.

OK

> Might need a why comment explaining why there’s no need to do anything here
> for main resources. Name of function makes it clear why we would not call
> that function, but *not* why we don’t need any work here for the main
> resource. Kind of inelegant that they are different in this way, in fact.

This is really inelegant and makes the understanding hard.

> > Source/WebCore/loader/cache/CachedResource.h:242
> > +    void registerHandle(CachedResourceHandleBase*);
> > +    WEBCORE_EXPORT void unregisterHandle(CachedResourceHandleBase*);
> 
> These should take references; worth fixing next time you touch this.

OK

> > Source/WebCore/loader/cache/CachedResourceLoader.cpp:648
> > +    // FIXME: We should reconcile handling of MainResource with other resources.
> 
> Ah, look at that, you already had the same thought as the thing I said above.
> 
> > Source/WebCore/loader/cache/CachedResourceLoader.cpp:654
> > +        auto* frame = this->frame();
> > +        if (frame)
> > +            request.updateReferrerOriginAndUserAgentHeaders(frame->loader(), document() ? document()->referrerPolicy() : ReferrerPolicy::Default);
> 
> Should put the declaration of frame inside the if. Also might want document
> in a local variable instead of calling the function twice.

OK
For document(), this is direct accessor to m_document.

> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:221
> > +    String outgoingReferrer;
> > +    String outgoingOrigin;
> > +    if (m_resourceRequest.httpReferrer().isNull()) {
> > +        outgoingReferrer = frameLoader.outgoingReferrer();
> > +        outgoingOrigin = frameLoader.outgoingOrigin();
> > +    } else {
> > +        outgoingReferrer = m_resourceRequest.httpReferrer();
> > +        outgoingOrigin = SecurityOrigin::createFromString(outgoingReferrer)->toString();
> > +    }
> 
> Could initialize outgoingReferrer and then check it for null, rather than
> calling m_resourceRequest.httpReferrer() twice.

OK

> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:248
> > +    if (outgoingReferrer.isEmpty())
> > +        m_resourceRequest.clearHTTPReferrer();
> > +    else
> > +        m_resourceRequest.setHTTPReferrer(outgoingReferrer);
> 
> Should we teach this rule to setHTTPReferrer instead of doing it here?

I prefer the status quo, setHTTPReferrer is used in several places.
This is the only case where we use clearHTTPReferrer.

> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:254
> > +bool isRequestCrossOrigin(SecurityOrigin* origin, const URL& url, const ResourceLoaderOptions& options)
> 
> I would name the argument requestURL rather than url since the origin also
> likely has a URL.

OK

> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:260
> > +    if (options.mode == FetchOptions::Mode::SameOrigin)
> > +        return false;
> 
> This probably needs a why comment. Something like:
> 
>     // Using same origin mode guarantees the loader will not do a
> cross-origin load, so we let it take care of it and just return false.

OK

> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:263
> > +    if (url.protocolIsData() && options.sameOriginDataURLFlag == SameOriginDataURLFlag::Set)
> > +        return false;
> 
> This one might need a why comment too. I don’t fully understand it.

fetch spec is not yet totally settled in data URL handling.
At some point, data URLs were set as cross-origin depending on an option set by clients. This has been simplified but there are still decisions to make in case of redirections. Once done, we should rework data URL handling and remove this flag.

> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:268
> > +}
> > +
> > +
> 
> Extra blank line here.

OK

> > Source/WebCore/loader/cache/CachedResourceRequest.h:49
> > +bool isRequestCrossOrigin(SecurityOrigin*, const URL&, const ResourceLoaderOptions&);
> 
> I think we should name the URL argument requestURL; maybe it seems obvious,
> but keep in mind that security origins also often have URLs so there is more
> than one URL involved here.

OK
Comment 16 WebKit Commit Bot 2016-10-25 03:04:37 PDT
Comment on attachment 292737 [details]
Patch for landing

Rejecting attachment 292737 [details] from commit-queue.

New failing tests:
imported/w3c/web-platform-tests/fetch/api/basic/referrer.html
imported/w3c/web-platform-tests/fetch/api/policies/referrer-origin-worker.html
imported/w3c/web-platform-tests/fetch/api/basic/request-headers-worker.html
http/tests/xmlhttprequest/workers/referer.html
imported/w3c/web-platform-tests/fetch/api/policies/referrer-unsafe-url.html
imported/w3c/web-platform-tests/fetch/api/basic/referrer-worker.html
imported/w3c/web-platform-tests/fetch/api/cors/cors-preflight-referrer-worker.html
imported/w3c/web-platform-tests/fetch/api/basic/request-headers.html
imported/w3c/web-platform-tests/fetch/api/policies/referrer-unsafe-url-worker.html
imported/w3c/web-platform-tests/fetch/api/basic/request-referrer.html
imported/w3c/web-platform-tests/fetch/api/policies/referrer-origin-when-cross-origin-worker.html
imported/w3c/web-platform-tests/fetch/api/cors/cors-preflight-referrer.html
imported/w3c/web-platform-tests/fetch/api/policies/referrer-origin-when-cross-origin.html
imported/w3c/web-platform-tests/fetch/api/policies/referrer-origin.html
Full output: http://webkit-queues.webkit.org/results/2364116
Comment 17 WebKit Commit Bot 2016-10-25 03:04:41 PDT
Created attachment 292742 [details]
Archive of layout-test-results from webkit-cq-02 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the commit-queue.
Bot: webkit-cq-02  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 18 youenn fablet 2016-10-25 06:41:20 PDT
Created attachment 292748 [details]
Patch for landing
Comment 19 WebKit Commit Bot 2016-10-25 07:19:05 PDT
Comment on attachment 292748 [details]
Patch for landing

Clearing flags on attachment: 292748

Committed r207817: <http://trac.webkit.org/changeset/207817>
Comment 20 WebKit Commit Bot 2016-10-25 07:19:10 PDT
All reviewed patches have been landed.  Closing bug.