Bug 26777

Summary: Page cache should allow HTTPS pages taking Cache-control header into account
Product: WebKit
Reporter: Antti Koivisto <koivisto>
Component: Page Loading
Assignee: Brady Eidson <beidson>
Status: RESOLVED FIXED
Severity: Normal
CC: ap, beidson, darin, ddkilzer, emacemac7, matafagafo, mjs, mnot, paulirish, psolanki, sam, skyul
Priority: P2
Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Mac   
OS: OS X 10.5   
Bug Depends on: 78597    
Bug Blocks:    
Attachments:
Description Flags
Patch v1 - Fix + layout test andersca: review+

Description Antti Koivisto 2009-06-27 18:27:58 PDT
HTTP pages that specify "Cache-control: no-store" should not be cacheable to the page cache. Currently they get cached if other factors don't stop it. 

This will make "no-store" work consistently and match other browsers.

On HTTPS pages "Cache-control: no-store" and "Cache-control: no-cache" should prevent caching to the page cache. Otherwise they should be cacheable. Currently HTTPS pages are never allowed in the page cache. 

This will make page caching available on HTTPS pages in a secure manner, improving performance and user experience.

These changes would make the WebKit page cache behave similarly to Firefox 3 (https://bugzilla.mozilla.org/show_bug.cgi?id=441751).
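
The header rules proposed in this description can be sketched roughly as follows. This is only an illustration in Python, not WebKit code: the function names are hypothetical, the header parsing is deliberately simplified, and the "other factors" that can also keep a page out of the page cache are not modeled.

```python
def parse_cache_control(header_value):
    """Return the set of lowercased directive names in a Cache-Control value.

    Simplification (assumption): splits on commas without honoring quoted
    strings, which is enough to detect bare no-store / no-cache directives.
    """
    directives = set()
    for part in (header_value or "").split(","):
        # Keep only the directive name, dropping any "=value" argument.
        name = part.split("=", 1)[0].strip().lower()
        if name:
            directives.add(name)
    return directives


def proposed_page_cache_allowed(scheme, cache_control):
    """Header check as proposed in the description (hypothetical helper)."""
    directives = parse_cache_control(cache_control)
    # Proposed: no-store keeps both HTTP and HTTPS pages out of the page cache.
    if "no-store" in directives:
        return False
    # Proposed: on HTTPS, no-cache also prevents page caching.
    if scheme.lower() == "https" and "no-cache" in directives:
        return False
    return True
```

Under this sketch an HTTP page sent with "Cache-control: no-cache" stays eligible, while the same header on an HTTPS page does not.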
Comment 1 Antti Koivisto 2009-06-27 18:33:35 PDT
<rdar://problem/7013921>
Comment 2 Alexey Proskuryakov 2009-06-30 02:28:58 PDT
Consistency with Firefox is a strong argument - however, the current behavior is by design. To prevent a page from getting into page cache, one can add an onunload handler.
Comment 3 Brady Eidson 2009-09-07 21:23:56 PDT
In addition to the Bugzilla entry, Mozilla has a developer document describing the intended behavior: https://developer.mozilla.org/En/Using_Firefox_1.5_caching

Adding another radar - <rdar://problem/7196487> - which tracks enabling https in the page cache (which seems closely related to this bug)
Comment 4 Alexey Proskuryakov 2009-09-07 22:35:26 PDT
One use case for the current design is the following scenario:
- web search is performed using an engine that returns results with Cache-control: no-store (a totally reasonable thing to do);
- a user clicks some search result, then goes back.

We should be able to return to the search results page without making another request to the server, which is slow and may give different results.
Comment 5 Brady Eidson 2009-09-07 22:47:32 PDT
I agree with Alexey here, for non-https pages.

From a user-experience standpoint, this would be absolutely desirable.

In general, there's been a struggle for browser vendors wanting to make the page cache work for more and more pages, and web developers wanting to opt out of the page cache for application-specific reasons.

I get the impression that "Cache-control: no-cache" was set in stone by mozilla to be "the way to opt out."

From a browser-engineer standpoint, and from the standpoint of someone who uses a web browser heavily, "opting out of the page cache" is the same thing as "opting in to low-performance-with-direct-user-impact mode."

Is it important for us to *let* them opt out?

I think that maybe we should use this bug to track the https work.  That is, we start letting https pages into the page cache, unless they send no-cache or no-store.
Comment 6 Brady Eidson 2009-09-15 15:54:30 PDT
In addition to <rdar://problem/7013921>, the HTTPS specific aspect is also covered by <rdar://problem/7196487>
Comment 7 Mark Nottingham 2009-09-16 19:01:27 PDT
Cache-Control directives are targeted at HTTP caches, and RFC2616 explicitly states that caches and history lists are separate; see http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.13
In particular, see the note at the end.

I'm concerned that while this text is confusing and indeed not very helpful at the moment, re-defining the semantics of existing cache directives in an ad hoc fashion is going to create more confusion and interoperability problems. In other words, we need to document this better, especially if the way that the directives work is going to change.

The HTTPbis Working Group in the IETF is chartered to revise RFC2616, so this is a good opportunity to do that; for example, we could:

  * Rewrite section 13.13 to explicitly define CC: no-store as applying to history lists (which is within HTTPbis' charter), or

  * Define one or more new directives to explicitly control history lists (which isn't in HTTPbis' charter, but we can provide a forum for discussion and feedback), or

  * Refine the text in other ways.

To make that happen, however, we need browser vendors (you) to actively participate in discussions; ideally, making proposals for text changes.

You can see the current drafts at:
  http://tools.ietf.org/wg/httpbis/
and participate on the list at:
  http://lists.w3.org/Archives/Public/ietf-http-wg/
 
I've raised a HTTPbis ticket regarding this particular issue at:
    http://trac.tools.ietf.org/wg/httpbis/trac/ticket/197

(I'll leave it up to you to decide whether to reopen this, open a new bug, etc.)

Mark Nottingham
IETF HTTPbis WG Chair / Part 6 (caching) Editor
Comment 8 Antti Koivisto 2009-09-16 20:35:54 PDT
Right, RFC2616 as-is clearly does not apply to the page cache.

Since this is a browser behavior, the HTML5 specification might be the natural place to standardize it. It already acknowledges the existence of the page cache but does not currently go any further: "This specification does not specify when user agents should discard Document objects and when they should cache them."
Comment 9 Brady Eidson 2009-09-16 20:46:13 PDT
Coincidental to the comments in this bug, I just blogged about recent and planned changes in our page cache behavior:
http://webkit.org/blog/427/webkit-page-cache-i-the-basics/
Comment 10 David Kilzer (:ddkilzer) 2010-11-17 12:00:42 PST
See also Bug 49672.
Comment 11 Brady Eidson 2012-02-13 14:02:59 PST
Updating title to:
"Page cache should allow HTTPS pages taking Cache-control header into account"

We've long ignored Firefox's "cache-control: no-store" disqualification for normal HTTP pages and it hasn't identifiably caused any problems.

But in introducing support for HTTPS it would be prudent to play the conservative part and disqualify both "cache-control: no-store" and "cache-control: no-cache" like they do.

And that's what the patch I'm about to attach does.
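
As a rough illustration of the behavior described in this comment (not the code from the attached patch), the header check might look like the Python sketch below. All names are hypothetical, the parsing ignores quoted-string edge cases, and other page cache eligibility checks are left out.

```python
from urllib.parse import urlsplit

# Directives that disqualify an HTTPS page, per the comment above.
BLOCKING_HTTPS_DIRECTIVES = {"no-store", "no-cache"}


def cache_control_directives(header_value):
    """Lowercased directive names from a Cache-Control value (simplified)."""
    names = set()
    for part in (header_value or "").split(","):
        # Drop any "=value" argument and surrounding whitespace.
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return names


def landed_page_cache_allowed(url, cache_control=None):
    """Hypothetical model of the header check described in this comment."""
    # Plain HTTP pages: Cache-control is still ignored for the page cache.
    if urlsplit(url).scheme != "https":
        return True
    # HTTPS pages: conservative, disqualify on no-store or no-cache.
    return not (cache_control_directives(cache_control) & BLOCKING_HTTPS_DIRECTIVES)
```

The difference from the original proposal is that "no-store" on a plain HTTP page does not affect eligibility here.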
Comment 12 Brady Eidson 2012-02-13 14:03:29 PST
Created attachment 126830 [details]
Patch v1 - Fix + layout test
Comment 13 Brady Eidson 2012-02-13 14:13:06 PST
http://trac.webkit.org/changeset/107607