Bug 63460 - CORS should only deal with request headers set by script authors
Summary: CORS should only deal with request headers set by script authors
Status: UNCONFIRMED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-06-27 08:46 PDT by Per-Erik Brodin
Modified: 2016-06-22 13:44 PDT (History)
10 users (show)

See Also:


Attachments
proposed patch (15.94 KB, patch)
2011-06-27 08:49 PDT, Per-Erik Brodin
no flags Details | Formatted Diff | Diff
Patch (5.76 KB, patch)
2011-08-19 16:18 PDT, David Levin
webkit.review.bot: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Per-Erik Brodin 2011-06-27 08:46:38 PDT
The CORS specification has recently been updated to clarify that only request headers set by script authors, "author request headers", should be matched against the list of simple headers and sent in Access-Control-Request-Headers in preflight requests, etc.  All headers set by XHR are explicitly or implicitly set by authors, but in EventSource there are no author set headers but rather two request headers set by the implementation, Cache-Control and Last-Event-ID.
Comment 1 Per-Erik Brodin 2011-06-27 08:49:02 PDT
Created attachment 98735 [details]
proposed patch
Comment 2 Adam Barth 2011-06-27 09:33:09 PDT
Interesting approach.  I'm going to let ap take a first crack at reviewing this patch.
Comment 3 Alexey Proskuryakov 2011-06-27 13:03:05 PDT
Comment on attachment 98735 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=98735&action=review

The implementation cannot work on Mac, because header status will be lost after a round-trip through NSURLRequest (or any other HTTP library request object). WebKit needs this round-trip because it has delegate methods where it informs client applications about requests that are about to be made.

ResourceRequest is ah HTTP level notion, and it shouldn't have any knowledge of Web platform concepts.

Secondly, when we talk about author vs. implementation, where do headers added by client applications fit? It kind of sounds like it's "implementation", but we can't know if those applications use untrusted content to decide which headers to add.

> LayoutTests/ChangeLog:6
> +        CORS should only deal with request headers set by script authors.
> +        https://bugs.webkit.org/show_bug.cgi?id=63460

That's an interesting idea, although it seems somewhat questionable to me. The design principle use to be to avoid hitting servers with anything they couldn't be hit with through form submission, and the new rule deviates from that pretty far.

> Source/WebCore/platform/network/ResourceRequestBase.h:47
> +    enum ResourceRequestHeaderStatus {

Naming nit: I would have liked something like "origination" or "source" more than "status".
Comment 4 Alexey Proskuryakov 2011-06-27 13:09:21 PDT
One more question: what about Content-Type headers that are not set by authors? Currently XMLHttpRequest defaults to application/xml, which triggers preflight. Should the same logic be applied there?
Comment 5 Per-Erik Brodin 2011-06-28 14:17:03 PDT
(In reply to comment #3)

I appreciate the quick review.

> The implementation cannot work on Mac, because header status will be lost after a round-trip through NSURLRequest (or any other HTTP library request object). WebKit needs this round-trip because it has delegate methods where it informs client applications about requests that are about to be made.
>

I realize that the header status might get lost when platforms deal with the request. However, the header status is only meant to be used by the CORS implementation and that happens before the status is lost. I was not able to build on Mac with the patch applied because of the optional arguments but once that was fixed I achieved the desired behavior on Mac as well.
 
> ResourceRequest is ah HTTP level notion, and it shouldn't have any knowledge of Web platform concepts.
> 

Could you recommend a different approach to fixing this bug? The included test demonstrates a deviation in the implementation from the current CORS spec (Origin being sent in Access-Control-Request-Headers header). To enable CORS in EventSource this patch was going to prevent a preflight from the two headers added by the implementation (thus not in the list of simple headers) but I guess I could solve that with a flag in ThreadableLoaderOptions instead.

> Secondly, when we talk about author vs. implementation, where do headers added by client applications fit? It kind of sounds like it's "implementation", but we can't know if those applications use untrusted content to decide which headers to add.
> 

Can clients invoke CORS? I would say that client added headers are implementation headers and that it is up to clients to safely add headers.

> > LayoutTests/ChangeLog:6
> > +        CORS should only deal with request headers set by script authors.
> > +        https://bugs.webkit.org/show_bug.cgi?id=63460
> 
> That's an interesting idea, although it seems somewhat questionable to me. The design principle use to be to avoid hitting servers with anything they couldn't be hit with through form submission, and the new rule deviates from that pretty far.
> 

I think the reasoning is that headers added by the implementation are safe even for cross origin requests. With this clarification the CORS specification does not have to be updated for every new specification that wants to add a header without causing a preflight request to be made. Also, there is only one list of simple headers so if Last-Event-ID was to be added back then you could use this header in XHR without preflight as well (not sure if this is a problem though, just an interesting consequence).

> > Source/WebCore/platform/network/ResourceRequestBase.h:47
> > +    enum ResourceRequestHeaderStatus {
> 
> Naming nit: I would have liked something like "origination" or "source" more than "status".

Yes, you are right, "source" is much better than "status". Anything related to "origin" should probably be avoided in this context though.

(In reply to comment #4)
> One more question: what about Content-Type headers that are not set by authors? Currently XMLHttpRequest defaults to application/xml, which triggers preflight. Should the same logic be applied there?

The Content-Type headers set by the XHR implementation would fall under the category "implicitly set by authors", see the second paragraph in http://lists.w3.org/Archives/Public/public-webapps/2011AprJun/1222.html
Comment 6 Alexey Proskuryakov 2011-06-28 16:32:08 PDT
> Could you recommend a different approach to fixing this bug?

I don't have a lot of detail to give, but the source of headers would need to be tracked outside of ResourceRequest. Or perhaps the code could be refactored so that implementation headers would be added after deciding whether to make a simple request.

It seems that this change to the spec hasn't really settled yet, it that correct?
Comment 7 Anne van Kesteren 2011-07-08 01:32:19 PDT
Need to figure out how to phrase the requirement on Content-Type adequately. People seemed to think however that this is what the specification should have said all along.
Comment 8 David Levin 2011-08-19 16:18:27 PDT
Created attachment 104586 [details]
Patch
Comment 9 David Levin 2011-08-19 16:19:46 PDT
Added depends link since I wrote this code using the patch from 66340.
Comment 10 WebKit Review Bot 2011-08-19 16:23:24 PDT
Comment on attachment 104586 [details]
Patch

Attachment 104586 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9439468
Comment 11 Gyuyoung Kim 2011-08-19 16:23:53 PDT
Comment on attachment 104586 [details]
Patch

Attachment 104586 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/9435513
Comment 12 Per-Erik Brodin 2011-08-19 16:42:23 PDT
(In reply to comment #8)
> Created an attachment (id=104586) [details]
> Patch

What about request headers set by the implementation that are not in UserAgentHeaderData such as Cache-Control and Last-Event-ID set by EventSource? Those headers should trigger preflight when set by authors using XHR and thus can't be added to the list. That's why I tried to keep track of which headers that are actually set by the implementation and which are set by the author.
Comment 13 Early Warning System Bot 2011-08-19 16:44:57 PDT
Comment on attachment 104586 [details]
Patch

Attachment 104586 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/9438514
Comment 14 Alexey Proskuryakov 2011-08-19 17:01:37 PDT
+    m_userAgentHeaders.add("accept-charset");
+    m_userAgentHeaders.add("accept-encoding");
+    m_userAgentHeaders.add("access-control-request-headers");

Despite what I said in comment 1, I'm not sure if it's the best approach to move headers that are special cased in XMLHttpRequest to some shared location. It's ThreadableLoader client (in this case, XHR) who really knows if JavaScript code called one of those methods that added non-engine headers. Many other clients simply don't provide any way to specify custom headers, so doing checks on these is pointless.

What do you think?
Comment 15 WebKit Review Bot 2011-08-19 17:24:41 PDT
Comment on attachment 104586 [details]
Patch

Attachment 104586 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/9438527
Comment 16 Collabora GTK+ EWS bot 2011-08-19 18:12:52 PDT
Comment on attachment 104586 [details]
Patch

Attachment 104586 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/9438561
Comment 17 David Levin 2011-08-22 11:34:31 PDT
Comment on attachment 104586 [details]
Patch

Withdrawn. I'll leave this bug for others.
Comment 18 David Levin 2011-08-22 11:40:52 PDT
(In reply to comment #12)
> (In reply to comment #8)
> > Created an attachment (id=104586) [details] [details]
> > Patch
> 
> What about request headers set by the implementation that are not in UserAgentHeaderData such as Cache-Control and Last-Event-ID set by EventSource? Those headers should trigger preflight when set by authors using XHR and thus can't be added to the list. That's why I tried to keep track of which headers that are actually set by the implementation and which are set by the author.

I don't think that keeping track of headers sent by the author and headers set by the implementation is the correct way of thinking about the problem.

imo, instead one should think of what headers to whitelist. It really doesn't matter how they are set if the request could do malicious things on the server which seems to be the real purpose of deciding whether to do a preflight request or not.

Regardless, I've withdrawn this patch to allow anyone else to take it up as they see fit.
Comment 19 Alexey Proskuryakov 2011-08-22 11:54:56 PDT
> It really doesn't matter how they are set if the request could do malicious things on the server which seems to be the real purpose of deciding whether to do a preflight request or not.

That's my thinking, too, but the current spec draft disagrees, only talking about how the headers are set.
Comment 20 Alexey Proskuryakov 2016-06-21 23:07:35 PDT
One particularly unfortunate consequence of this is that when Do Not Track is enabled in Safari preferences, all CORS requests become non-simple ones, and start using preflight.
Comment 21 youenn fablet 2016-06-22 01:34:30 PDT
(In reply to comment #20)
> One particularly unfortunate consequence of this is that when Do Not Track
> is enabled in Safari preferences, all CORS requests become non-simple ones,
> and start using preflight.

I don't know where is happening the insertion of DNT header in that case.
I guess that if they are inserted after core preflight checker, this should work nicely.

For WebKit GStreamer (bug 131484), although non-simple headers may be added, I think preflight is never used. This makes sense to me.
Comment 22 Anne van Kesteren 2016-06-22 03:18:47 PDT
(In reply to comment #21)
> I guess that if they are inserted after core preflight checker, this should
> work nicely.

That is definitely how Fetch approaches this. DNT is set with other headers just before the request goes to the network. Notably, this is after service workers. See step 12 of https://fetch.spec.whatwg.org/#concept-http-network-or-cache-fetch. It's a little vague still, but hopefully that will get better over time.

Now, it is a problem that user agents are somehow exempt from the same-origin policy and we keep introducing new headers that we emit across origins and servers might get tripped up by. I don't have a good story for that yet. Nobody seems to really think about it that when they add DNT to all requests, they also violate the implicit agreements around the same-origin policy.
Comment 23 Alexey Proskuryakov 2016-06-22 13:44:31 PDT
The DNT header is inserted by Safari injected bundle code in a client callback.