Bug 13128 - Safari not obeying cache header
Summary: Safari not obeying cache header
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: 420+
Hardware: Mac OS X 10.4
: P2 Normal
Assignee: Dimitri Glazkov (Google)
URL: http://www.sirensclef.com/safari_imag...
Keywords: InRadar
: 25129 (view as bug list)
Depends on:
Blocks:
 
Reported: 2007-03-19 21:24 PDT by Alexander Romanovich
Modified: 2009-11-05 09:29 PST (History)
9 users (show)

See Also:


Attachments
preliminary patch (26.40 KB, patch)
2009-05-29 16:33 PDT, Antti Koivisto
no flags Details | Formatted Diff | Diff
updated patch (41.52 KB, patch)
2009-06-03 16:06 PDT, Antti Koivisto
ddkilzer: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alexander Romanovich 2007-03-19 21:24:54 PDT
Safari does not seem to be obeying the following cache header: "Cache-control: public, must-revalidate, max-age=3600". This is demonstrated with the above url, where you do not match FF's caching behavior (or Mac IE for that matter!). This is with Safari in 10.4.9, as well as nightly build.

Load the url above, which uses PHP to generate a PNG with the current timestamp.
Refresh the page. You will see the timestamp increments, as expected.
Now, press return inside the address bar to cause the page to load again. Again, the timestamp will increment. (This can also be demonstrated by clicking a link on a separate web page, to go to this image script.)

If it had obeyed the cache header, pressing return in the address bar would not have incremented the timestamp because it would have loaded the image from the cache.

Doing this in FF gives the expected results: it only increments on an actual refresh, as opposed to pressing return in the address bar.

The PHP I am using to test this can be found at: http://www.sirensclef.com/safari_image_cache/code.html
Comment 1 Dave Hyatt 2007-03-19 22:01:54 PDT
Hitting return in the URL bar with the same URL displayed is treated as a reload in Safari.  This behavior is by design.

Comment 2 Alexander Romanovich 2007-03-19 22:21:17 PDT
(In reply to comment #1)
> Hitting return in the URL bar with the same URL displayed is treated as a
> reload in Safari.  This behavior is by design.
> 

Fair enough. But then here's an example of two html files that link to each other:
http://www.sirensclef.com/safari_image_cache/pageA.html

Containing:

<html><body>
<img src="index.php"/>
<a href="pageA.html">test</a>
</body></html>

And:

<html><body>
<img src="index.php"/>
<a href="pageB.html">test</a>
</body></html>

This is a normal inclusion of the image via an img tag, and yet the TS increments each time. This also demonstrates how this behavior is a problem for me. I have a PHP script that watermarks photos on the fly. I run *many* images through this, and none of those images get cached with Safari, causing the watermark script to run every time.
Comment 3 David Kilzer (:ddkilzer) 2007-03-20 04:17:38 PDT
Caching is done by NSFoundation classes, so this bug needs a Radar bug to be fixed.  (Note that this bug may eventually be closed as RESOLVED/INVALID, but only because the bug can't be fixed in WebKit itself.)

Comment 4 Alexander Romanovich 2007-03-20 08:34:37 PDT
<rdar://problem/5074795>
Comment 5 David Kilzer (:ddkilzer) 2007-03-20 09:11:52 PDT
(In reply to comment #4)
> <rdar://problem/5074795>

Alexander, if you haven't already done so, please add a link to this Bugzilla bug in Radar.  Thanks for filing the Radar bug!

Comment 6 Alexander Romanovich 2007-03-20 09:21:42 PDT
I included a link to this bug at the start of the radar report. Thanks David.
Comment 7 Alexander Romanovich 2007-04-14 21:40:41 PDT
<rdar://problem/5074795>

15-Apr-2007 12:11 AM Alexander Romanovich:
It seems that a workaround is to use the "Expires" header to achieve the caching behavior. While that works for me as an alternative, this bug still remains and should be fixed. At least this implies that your caching methods are working properly, you're just ignoring a certain type of header that should trigger it.
Working example of workaround header is: header('Expires: '.gmdate('D, d M Y H:i:s',mktime(0,0,0,date('m'),date('d')+1,date('y'))).' GMT');
Comment 8 Alexey Proskuryakov 2007-12-29 10:48:09 PST
This issue is not in WebKit, and it will continue to be tracked in Radar. Closing as INVALID per our process. Thank you for filing this!
Comment 9 Alexander Romanovich 2008-12-21 11:18:53 PST
This bug has been ignored in radar for well over a year, and my request @ radar for an update have also been ignored. It would be nice to fix this in tandem with the new shift-reload stuff, as well as other caching changes the WebKit team has recently been dealing with. Any chance this can get some attention now? The bug is simply that "must-revalidate" always forces the cache to be bypassed on Safari, but not on other browsers (FF, IE).
Comment 10 Alexey Proskuryakov 2008-12-21 12:03:43 PST
Please note that the right way to request status update from Apple is to e-mail devbugs@apple.com.
Comment 11 Steve Clay 2009-02-24 14:14:12 PST
(In reply to comment #7)
> It seems that a workaround is to use the "Expires" header to achieve the

On Windows nightly WebKit-r41128 and Safari 4 beta I still see this bug, even when Expires is sent matching the max-age. I documented this here: http://mrclay.org/index.php/2009/02/24/safari-4-beta-cache-controlmust-revalidate-bug/

The "Expires" workaround *was* working in Safari 3 so this is even a regression.
Comment 12 Alexey Proskuryakov 2009-02-25 00:18:19 PST
> The "Expires" workaround *was* working in Safari 3 so this is even a
> regression.

Could you please file a new bug for this? It may or may not be a WebKit issue - I haven't been able to reproduce it so far, so I can't tell for sure. In any case, being a regression, it should be tracked separately.
Comment 13 Alexey Proskuryakov 2009-04-08 04:23:04 PDT
The new bug is bug 24179.
Comment 14 Antti Koivisto 2009-05-29 16:26:45 PDT
This is a valid bug that can be fixed in WebCore level. Reopening.
Comment 15 Antti Koivisto 2009-05-29 16:33:24 PDT
Created attachment 30796 [details]
preliminary patch

Implement RFC2616 cache expiration calculations in WebCore instead of relying on the networking layer.
Comment 16 Antti Koivisto 2009-06-03 16:06:13 PDT
Created attachment 30930 [details]
updated patch

Test cases, bug fixes and cleanups.
Comment 17 Antti Koivisto 2009-06-03 17:01:02 PDT
Note that this patch fixes the subresource case only, fix for bug 26178 is needed to handle the main resource case.
Comment 18 Antti Koivisto 2009-06-03 17:11:56 PDT
*** Bug 25129 has been marked as a duplicate of this bug. ***
Comment 19 David Kilzer (:ddkilzer) 2009-06-04 08:42:26 PDT
Comment on attachment 30930 [details]
updated patch

>Index: WebCore/ForwardingHeaders/runtime/DateMath.h
>===================================================================
>--- WebCore/ForwardingHeaders/runtime/DateMath.h	(revision 0)
>+++ WebCore/ForwardingHeaders/runtime/DateMath.h	(revision 0)
>@@ -0,0 +1,4 @@
>+#ifndef WebCore_FWD_DateMath_h
>+#define WebCore_FWD_DateMath_h
>+#include <JavaScriptCore/DateMath.h>
>+#endif

It would be nice to have this forwarding header added elsewhere for consistency:

JavaScriptGlue/ForwardingHeaders/runtime
WebKit/mac/ForwardingHeaders/runtime

>Index: WebCore/loader/CachedResource.cpp
>===================================================================
>--- WebCore/loader/CachedResource.cpp	(revision 44396)
>+++ WebCore/loader/CachedResource.cpp	(working copy)
>@@ -26,12 +26,14 @@
> 
> #include "Cache.h"
> #include "CachedResourceHandle.h"
>+#include "CString.h"

Is this still needed?  It was probably added for debugging.

>Index: WebCore/platform/network/ResourceResponseBase.cpp
>===================================================================
>--- WebCore/platform/network/ResourceResponseBase.cpp	(revision 44396)
>+++ WebCore/platform/network/ResourceResponseBase.cpp	(working copy)
>@@ -35,6 +38,34 @@ namespace WebCore {
> 
> static void parseCacheHeader(const String& header, Vector<pair<String, String> >& result);
> 
>+ResourceResponseBase::ResourceResponseBase()  
>+    : m_expectedContentLength(0)
>+    , m_httpStatusCode(0)
>+    , m_isNull(true)
>+    , m_haveParsedCacheControlHeader(false)
>+    , m_haveParsedAgeHeader(false)
>+    , m_haveParsedDateHeader(false)
>+    , m_haveParsedExpiresHeader(false)
>+    , m_haveParsedLastModifiedHeader(false)
>+{
>+}
>+
>+ResourceResponseBase::ResourceResponseBase(const KURL& url, const String& mimeType, long long expectedLength, const String& textEncodingName, const String& filename)
>+    : m_url(url)
>+    , m_mimeType(mimeType)
>+    , m_expectedContentLength(expectedLength)
>+    , m_textEncodingName(textEncodingName)
>+    , m_suggestedFilename(filename)
>+    , m_httpStatusCode(0)
>+    , m_isNull(false)
>+    , m_haveParsedCacheControlHeader(false)
>+    , m_haveParsedAgeHeader(false)
>+    , m_haveParsedDateHeader(false)
>+    , m_haveParsedExpiresHeader(false)
>+    , m_haveParsedLastModifiedHeader(false)
>+{
>+}

Any particular reason the constructors were moved from the header into the source file?  They won't be inlined any longer if they're defined here.

>Index: WebCore/platform/network/ResourceResponseBase.h
>===================================================================
>--- WebCore/platform/network/ResourceResponseBase.h	(revision 44396)
>+++ WebCore/platform/network/ResourceResponseBase.h	(working copy)
>@@ -147,16 +117,26 @@ protected:
>     int m_httpStatusCode;
>     String m_httpStatusText;
>     HTTPHeaderMap m_httpHeaderFields;
>-    time_t m_expirationDate;
>-    time_t m_lastModifiedDate;
>-    bool m_isNull : 1;
> 
>+    bool m_isNull : 1;
>+    
> private:
>     void parseCacheControlDirectives() const;
> 
>-    mutable bool m_haveParsedCacheControl : 1;
>-    mutable bool m_cacheControlContainsMustRevalidate : 1;
>+    mutable bool m_haveParsedCacheControlHeader : 1;
>+    mutable bool m_haveParsedAgeHeader : 1;
>+    mutable bool m_haveParsedDateHeader : 1;
>+    mutable bool m_haveParsedExpiresHeader : 1;
>+    mutable bool m_haveParsedLastModifiedHeader : 1;
>+
>     mutable bool m_cacheControlContainsNoCache : 1;
>+    mutable bool m_cacheControlContainsMustRevalidate : 1;
>+    mutable double m_cacheControlMaxAge;
>+
>+    mutable double m_age;
>+    mutable double m_date;
>+    mutable double m_expires;
>+    mutable double m_lastModified;
> };

Any 64-bit class size considerations here?  I think Simon was moving doubles above bool bitfields in other classes.  Maybe it would be a good idea to group the bool bitfields together even if you have to switch between private and protected.

The only thing that I want to do is to compare the RFC language to the implementation.  (Thanks for adding appropriate comments to the source!)  Otherwise, this patch looks great!
Comment 20 Antti Koivisto 2009-06-04 10:41:24 PDT
(In reply to comment #19)
> Any particular reason the constructors were moved from the header into the
> source file?  They won't be inlined any longer if they're defined here.

To avoid gratuitous inilining. :)

Mostly to avoid triggering large recompiles on modifications. I don't think responses are constructed in sufficient numbers for inlining to be important.

> Any 64-bit class size considerations here?  I think Simon was moving doubles
> above bool bitfields in other classes.  Maybe it would be a good idea to group
> the bool bitfields together even if you have to switch between private and
> protected.

I don't see how reordering would help here. MIcro optimizing here is not massively important  in any case.

> The only thing that I want to do is to compare the RFC language to the
> implementation.  (Thanks for adding appropriate comments to the source!) 
> Otherwise, this patch looks great!

Thanks!
Comment 21 David Kilzer (:ddkilzer) 2009-06-04 16:25:00 PDT
Comment on attachment 30930 [details]
updated patch

(In reply to comment #19)
> The only thing that I want to do is to compare the RFC language to the
> implementation.  (Thanks for adding appropriate comments to the source!) 
> Otherwise, this patch looks great!

I only found one issue with the comments:

>+    // RFC2616 13.3.3
>[...]
>+    // RFC2616 13.3.4

These should be sections 13.2.3 and 13.2.4, respectively.

r=me!
Comment 22 Antti Koivisto 2009-06-04 23:25:09 PDT
http://trac.webkit.org/changeset/44452
Comment 23 Antti Koivisto 2009-06-04 23:29:24 PDT
Landed, with one additional change that gives non-HTTP resources long cache lifetime. This matches existing behavior.
Comment 24 David Kilzer (:ddkilzer) 2009-06-05 06:59:30 PDT
(In reply to comment #23)
> Landed, with one additional change that gives non-HTTP resources long cache
> lifetime. This matches existing behavior.

Was a test also added for non-HTTP resources?
Comment 26 Dimitri Glazkov (Google) 2009-06-05 10:17:52 PDT
Since it uses JSC::parseDate, this patch broke Chromium/V8 build. I am trying to figure out the best way to fix it.

One idea is another Script* abstraction, but it doesn't smell right -- technically, the only JSC-specific part about DateMath is that it uses UString and is located in JavaScriptCore/runtime.
Comment 27 David Kilzer (:ddkilzer) 2009-06-05 10:39:57 PDT
(In reply to comment #26)
> Since it uses JSC::parseDate, this patch broke Chromium/V8 build. I am trying
> to figure out the best way to fix it.
> 
> One idea is another Script* abstraction, but it doesn't smell right --
> technically, the only JSC-specific part about DateMath is that it uses UString
> and is located in JavaScriptCore/runtime.

Can JSC::parseDate() (or DateMath) be moved to JavaScritpCore/wtf?
Comment 28 David Kilzer (:ddkilzer) 2009-06-05 10:50:03 PDT
(In reply to comment #27)
> (In reply to comment #26)
> > Since it uses JSC::parseDate, this patch broke Chromium/V8 build. I am trying
> > to figure out the best way to fix it.
> 
> Can JSC::parseDate() (or DateMath) be moved to JavaScritpCore/wtf?

Or why can't Chromium/V8 compile DateMath.cpp?  Surely there's not a namespace issue?
Comment 29 Dimitri Glazkov (Google) 2009-06-05 10:56:15 PDT
DateMath uses JSC::UString, which makes moving to wtf or compiling in Chromium a bit tricky.

What if we add a ResourceResponseBase::parseDate method, which the implementations can override? This should allow us to hook in date parsing from network stack layer.
Comment 30 Dimitri Glazkov (Google) 2009-06-05 11:01:15 PDT
(In reply to comment #29)
> What if we add a ResourceResponseBase::parseDate method, which the
> implementations can override? This should allow us to hook in date parsing from
> network stack layer.

This seems like a bad idea, scratch that. Thinking ... 

Comment 31 Dimitri Glazkov (Google) 2009-06-05 11:06:17 PDT
Ideally (thinking outloud), date parsing _should_ be handled by the network stack. We don't need to be having multiple code paths for doing the same thing. I am out of great ideas at the moment, but still thinking :)
Comment 32 Dimitri Glazkov (Google) 2009-06-05 11:45:40 PDT
Here's another idea:

Add a parseDate method to HTTPParsers.h, and put an implementation in HTTPParsers.cpp (with !PLATFORM(CHROMIUM) guards). Then I can provide a Chromium impl of this method. WDYT?
Comment 33 Dimitri Glazkov (Google) 2009-06-05 11:56:44 PDT
Per IRC discussion, I am moving JSC::parseDate guts to WTF::parseDate(char*, unsigned).
Comment 34 Dimitri Glazkov (Google) 2009-06-06 14:39:31 PDT
Reopening to fix Chromium build.
Comment 35 David Kilzer (:ddkilzer) 2009-06-06 15:28:15 PDT
(In reply to comment #34)
> Reopening to fix Chromium build.

You should really file a new bug for this, and note the new bug number here.
Comment 36 Dimitri Glazkov (Google) 2009-06-06 22:30:06 PDT
Sure thing. New bug is here https://bugs.webkit.org/show_bug.cgi?id=26238.
Comment 37 Alexey Proskuryakov 2009-06-15 01:19:03 PDT
(In reply to comment #14)
> This is a valid bug that can be fixed in WebCore level. Reopening.

To clarify what Antti said, this bug can be mostly worked around at WebCore level (and this is done now). It is still a bug in CFNetwork that still affects WebCore, but to a lesser degree.

I've added the necessary information to Radar bug for Apple to keep tracking the problem.