Bug 26994 - Store non-standard HTTP headers in history
Summary: Store non-standard HTTP headers in history
Status: UNCONFIRMED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Enhancement
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-06 09:43 PDT by Marshall Greenblatt
Modified: 2010-06-10 19:17 PDT (History)
2 users (show)

See Also:


Attachments
Patch #1 to resolve bug 26994. (3.07 KB, patch)
2009-09-23 10:30 PDT, Marshall Greenblatt
darin: review-
Details | Formatted Diff | Diff
Patch #2 to resolve bug 26994. (2.97 KB, patch)
2009-09-23 12:55 PDT, Marshall Greenblatt
no flags Details | Formatted Diff | Diff
Patch #3 to resolve bug 26994 (6.42 KB, patch)
2009-09-23 17:02 PDT, Marshall Greenblatt
abarth: review-
Details | Formatted Diff | Diff
Patch #4 to resolve bug 26994 (4.35 KB, patch)
2009-09-24 09:03 PDT, Marshall Greenblatt
abarth: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Marshall Greenblatt 2009-07-06 09:43:21 PDT
Non-standard HTTP headers, usually prefixed with 'X-', are used by a number of server-based applications. It is currently possible to create an NSMutableURLRequest that contains non-standard HTTP headers and load that request via a call to WebFrame::loadRequest(). If the request is then re-loaded via a call to WebFrame::reload() the non-standard HTTP headers are remembered and re-submitted. However, if the request is loaded by navigating the history then the non-standard HTTP headers will have been lost and will not be re-submitted. To address what I believe is an API omission I propose that WebCore::HistoryItem be modified to store non-standard HTTP headers so that they can be re-submitted when the user navigates back to the request via the history.
Comment 1 Marshall Greenblatt 2009-09-23 10:30:57 PDT
Created attachment 40005 [details]
Patch #1 to resolve bug 26994.
Comment 2 Darin Adler 2009-09-23 10:53:43 PDT
Comment on attachment 40005 [details]
Patch #1 to resolve bug 26994.

> +        if (it->first.find("X-") == 0)

This is not the correct way to check for a prefix. It wastefully searches the entire string for "X-". There's a startsWith function that can be used for this purpose. But also, should this be a case-sensitive check or case-insensitive?

Why keep the extra headers in a map instead of just in a vector of pairs? Do we need efficient key lookup for some reason.

I don't understand the concept of this patch; why are headers the start with "x-" different from others? I suspect this is the wrong approach.
Comment 3 Marshall Greenblatt 2009-09-23 11:48:00 PDT
(In reply to comment #2)
> (From update of attachment 40005 [details])
> > +        if (it->first.find("X-") == 0)
> 
> This is not the correct way to check for a prefix. It wastefully searches the
> entire string for "X-". There's a startsWith function that can be used for this
> purpose. But also, should this be a case-sensitive check or case-insensitive?

I'll make the changes that you suggest. Case-insensitive matching is not a requirement but would be acceptable.

> 
> Why keep the extra headers in a map instead of just in a vector of pairs? Do we
> need efficient key lookup for some reason.

Efficient key lookup is not a priority for this usage case. I chose this particular implementation to be consistent with the ResourceRequest implementation.  Perhaps a better overall approach would be having HistoryItem store a ResourceRequest object internally instead of duplicating many of the same members.

> 
> I don't understand the concept of this patch; why are headers the start with
> "x-" different from others? I suspect this is the wrong approach.

Some web applications use custom headers to extend an HTTP request. The convention is to prefix the header name with X- to indicate that it is non-standard.  This is a reasonably well-documented approach -- see, for instance, section 2.1 of the following (expired) draft specification:

http://www.mnot.net/drafts/draft-nottingham-hdrreg-http/draft-nottingham-hdrreg-http-00.txt

Searching only for headers that begin with X- implicitly excludes all standard headers.  An alternative approach would be explicitly excluding all standard headers that are either stored separately or should not be stored as part of the history such as Host, Cookie, Content-Length, Accept, User-Agent, Referrer, Content-Type, etc.  Either approach would be valid but I think the current implicit approach is more manageable from both an implementation and documentation standpoint.
Comment 4 Marshall Greenblatt 2009-09-23 12:55:55 PDT
Created attachment 40013 [details]
Patch #2 to resolve bug 26994.

This update fixes startsWith issue and ignores case for "X-".
Comment 5 Darin Adler 2009-09-23 13:01:16 PDT
Comment on attachment 40013 [details]
Patch #2 to resolve bug 26994.

This patch is going to need a regression test -- should be easy to make one of our http tests to test this.

Also, it would be good to have an example of a website where the change makes a WebKit-based browser work better with the site.

it->first.startsWith("x-", false) is what you want for case sensitivity.

I don't think that using "x-" prefixed headers only is the right way to go about this. Something that instead involved some sort of whitelist would be better.
Comment 6 Marshall Greenblatt 2009-09-23 13:35:09 PDT
(In reply to comment #5)
> (From update of attachment 40013 [details])
> This patch is going to need a regression test -- should be easy to make one of
> our http tests to test this.

I agree, I'll work on that.

> 
> Also, it would be good to have an example of a website where the change makes a
> WebKit-based browser work better with the site.

Normal web browsing won't cause non-standard HTTP request headers to be sent so this capability would be used mainly by applications embedding WebKit or by extensions loaded into a WebKit-based browser.  A good use-case for this capability would be an extension implementing a WebDAV report viewer. WebDAV uses a number of non-standard HTTP header fields:

http://www.webdav.org/specs/rfc2518.html#http.headers.for.distributed.authoring

The extension would modify the display of the WebDAV report, perhaps adding a search form and/or hyperlinks.  When the user interacts with these elements the extension sets the extra HTTP headers and submits the request.  WebKit can then transparently handle history navigation by storing those extra HTTP header values.

> 
> it->first.startsWith("x-", false) is what you want for case sensitivity.
> 
> I don't think that using "x-" prefixed headers only is the right way to go
> about this. Something that instead involved some sort of whitelist would be
> better.

The WebDAV header fields don't begin with "X-" so that's a good argument for not restricting the use to "X-" headers only.  How would you want to see the whitelist concept implemented?
Comment 7 Marshall Greenblatt 2009-09-23 17:02:51 PDT
Created attachment 40031 [details]
Patch #3 to resolve bug 26994

This patch addresses the HTTP header whitelist issue by offloading the test to the client.  It adds a new FrameLoaderClient::canStoreHTTPHeaderFieldInHistory() method.
Comment 8 Adam Barth 2009-09-23 22:42:11 PDT
Comment on attachment 40031 [details]
Patch #3 to resolve bug 26994

Why would different clients want to store different headers in the history item?  It seems like we should store all the headers except some specific list of headers that are problematic.  (Which are the problematic headers?)
Comment 9 Marshall Greenblatt 2009-09-24 06:57:03 PDT
(In reply to comment #8)
> (From update of attachment 40031 [details])
> Why would different clients want to store different headers in the history
> item?  It seems like we should store all the headers except some specific list
> of headers that are problematic.  (Which are the problematic headers?)

Headers in any of the following categories should not be stored in the history.

1. Transfer and/or content negotiation.

Accept-Ranges, Content-Encoding, Content-Length, Content-Location, Content-MD5, Content-Range, Content-Type, If-Range, Range, TE, Trailer, Transfer-Encoding, Via

2. Managed state information.

Authorization, Cookie, Host, Proxy-Authorization, Referer.

3. Browser capabilities.

Accept, Accept-Charset, Accept-Encoding, Accept-Language, Expect, Upgrade, User-Agent.


Am I missing any?
Comment 10 Marshall Greenblatt 2009-09-24 09:03:43 PDT
Created attachment 40068 [details]
Patch #4 to resolve bug 26994

Use a fixed list of excluded headers in HistoryItem.  Also exclude the Origin header.
Comment 11 Adam Barth 2009-10-13 14:55:26 PDT
Comment on attachment 40068 [details]
Patch #4 to resolve bug 26994

Darin asked for a regression test, but you did not provide one.  Also, this patch is missing a ChangeLog.

In general, this header list seems to overlap significantly with the XMLHttpRequest black list.  Are the lists actually the same?  Is one a subset of the other?  I'd like to see an explanation of any differences.