51638 – Protect path of HTTP Referer Header

https://bugs.webkit.org/show_bug.cgi?id=51638

Summary Protect path of HTTP Referer Header

Robert Hogan

Reported 2010-12-27 04:16:29 PST

From https://bugzilla.mozilla.org/show_bug.cgi?id=587523:

"The browser's http referer header is a source of a significant amount of private data leakage. See http://www.cs.wpi.edu/~cew/papers/wosn08.pdf and http://online.wsj.com/article/NA_WSJ_PUB:SB10001424052748704513104575256701215465596.html as an example of something that was fixed by particular sites (Facebook).

One issue that has not been fixed yet is the fact that users' search terms leak to 3rd party sites via the referer header when they click on results from the search engine results page. An example of this can be seen by searching for 'no knead bread' with Google, and clicking on the 4th search result, which takes you to www.breadtopia.com/basic-no-knead-method/, a page which "helpfully" lets you know that it is aware of the search terms that brought you to the site.

This bug (https://bugzilla.mozilla.org/show_bug.cgi?id=55477) has quite a bit of debate about the idea of stripping some info from the referer header. One of the good ideas in that bug is the idea of a REFERRER_3RDPARTY_NO_PREPATH option, which "Strip[s] off the path from the referrer for 3rd party requests, otherwise leave[s] it alone." Under such a model, a user visiting wikipedia.com and clicking on a link to another page on wikipedia would still have the full referer transmitted. However, a user clicking on a link from Google.com's results page to a 3rd party site would result in the referer of "http://www.google.com" being sent.

Looking through that and other bug reports, the two main positive use cases for the delivery of the referer header to 3rd party sites seem to be:

1. Stopping bandwidth leeching, e.g. stopping other sites from including images from your site, which cost you bandwidth when users visit those sites.
2. Analytics: helping webmasters to figure out where their traffic is coming from.

Item 1 is easy to solve, since even with just the domain in the referer header, it would still be easy to determine that a myspace.com user had embedded your content in their site. You wouldn't know which myspace user had done so, but you could still easily block such requests (or give them a different image).

With regard to item 2, webmasters simply do not have any natural right to know where their users are coming from. Yes, this is how the web has always worked, but it doesn't need to stay this way, particularly when the user has expressed a desire to be private by turning on private browsing mode.

By switching to a model of scrubbing the path, but not the domain, from the referer header, these 3rd party sites would still have a rough idea of where users are coming from, but wouldn't learn the exact page the user is on. For sites that include private info in the URL (for example: http://www.webmd.com/breast-cancer/default.htm), this would lead to a significant improvement in user privacy.

Furthermore, for webmasters that want to find out what search terms are drawing users, Google already offers aggregate stats for individual webmasters, which can be viewed at http://www.google.com/webmaster. These webmasters would merely be denied this info about individual users in real time, and would instead have to make do with aggregate info.

What I propose is adding this option to strip the path from the referer headers sent to third party sites. This option should not be enabled by default, but if a user wishes to go into about:config and enable it, so be it. However, this option would be enabled whenever the user goes into private browsing mode. Once the user leaves private browsing mode, their browser will go back to sending full headers again."
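The path-stripping rule described in the quoted proposal could be sketched roughly as follows. This is only an illustration in Python, not WebKit or Mozilla code. The function names `origin_only` and `referer_for` and the `strip_enabled` flag are hypothetical, and "3rd party" is approximated here by a plain hostname comparison; a real browser would more likely compare registrable domains.

```python
from urllib.parse import urlsplit


def origin_only(url: str) -> str:
    """Return scheme://host[:port] for a URL, dropping path, query and fragment."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    port = f":{parts.port}" if parts.port else ""
    return f"{parts.scheme}://{host}{port}"


def referer_for(source_url: str, target_url: str, strip_enabled: bool = True) -> str:
    """Pick the Referer value for a navigation from source_url to target_url.

    Assumption: a request is treated as third party when the two hostnames
    differ. strip_enabled stands in for the proposed option (for example,
    switched on while in private browsing mode).
    """
    source_host = urlsplit(source_url).hostname
    target_host = urlsplit(target_url).hostname
    if not strip_enabled or source_host == target_host:
        # Same site, or option switched off: send the full referer as today.
        return source_url
    # Third-party request: keep the domain, scrub the path and query.
    return origin_only(source_url)


if __name__ == "__main__":
    print(referer_for(
        "http://www.google.com/search?q=no+knead+bread",
        "http://www.breadtopia.com/basic-no-knead-method/",
    ))
```

With the search example from the report, the third-party site would only see the search engine's domain, while a link between two pages on the same host would still carry the full URL.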

Marco Peereboom

Comment 1 2011-07-11 11:46:28 PDT

I would love a knob for this as well. Something along the lines of "default", "domain_only" and "disabled". Just like cookies.
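One possible reading of the three-way knob suggested in this comment is sketched below in Python. The comment does not define the settings, so the semantics here are assumptions: "default" sends the full referer, "domain_only" sends only scheme and host for every request, and "disabled" sends no referer at all. The names `RefererPolicy` and `apply_policy` are hypothetical.

```python
from enum import Enum
from typing import Optional
from urllib.parse import urlsplit


class RefererPolicy(Enum):
    # Setting names taken from the comment; their meanings are assumed.
    DEFAULT = "default"
    DOMAIN_ONLY = "domain_only"
    DISABLED = "disabled"


def apply_policy(policy: RefererPolicy, source_url: str) -> Optional[str]:
    """Return the Referer value to send, or None to omit the header."""
    if policy is RefererPolicy.DISABLED:
        return None
    if policy is RefererPolicy.DOMAIN_ONLY:
        parts = urlsplit(source_url)
        # Keep scheme and host (plus port, if any); drop path and query.
        return f"{parts.scheme}://{parts.netloc}"
    return source_url


if __name__ == "__main__":
    url = "http://www.webmd.com/breast-cancer/default.htm"
    for policy in RefererPolicy:
        print(policy.value, "->", apply_policy(policy, url))
```

Whether "domain_only" should apply to every request or only to third-party ones is left open by the comment; the sketch applies it everywhere for simplicity.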

Status NEW

Resolution

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component Page Loading

Assignee Nobody

Reported 2010-12-27 04:16 PST

Modified 2024-07-10 13:19 PDT

CC List 8 users

URL

Keywords

Depends on

Blocks 41801
