Bug 17998

Summary: When a resource is cached locally, WebKit should follow RFC 2616 "Specific end-to-end revalidation" instead of "Unspecified end-to-end revalidation"
Product: WebKit Reporter: Mark J. Hughes <mark.hughes>
Component: Page LoadingAssignee: Antti Koivisto <koivisto>
Status: NEW    
Severity: Major CC: aestes, akwb, alex, ap, beidson, cdumez, darin, ddkilzer, emacemac7, japhet, kieran, koivisto, mihnea, mnot, mrowe
Priority: P2 Keywords: InRadar
Version: 525.x (Safari 3.1)   
Hardware: Mac (Intel)   
OS: OS X 10.5   
Bug Depends on: 49246    
Bug Blocks:    
Attachments:
Description Flags
implement Specific end-to-end revalidation for WebCore memory cache
none
updated patch
darin: review-
more updates
none
WebCore patch for enabling revalidation on reload
none
WebKit patch for enabling revalidation on reload
none
updated WebCore patch none

Mark J. Hughes
Reported 2008-03-21 18:31:10 PDT
Overview: --------- After much research, WebKit seems to ignore Expire / ETag headers in a response, either for static or dynamic content. Using Apache's mod_expires as a basis point, WebKit/Safari seems unable to understand server-side caching headers, and caching accordingly. Ultimately, this is bad on its face, but especially bad for low bandwidth, high latency, and/or resource-limited situations, such as dial-up users or the iPhone. Steps to Reproduce: ------------------- 1) Using Mac OS X builtin web sharing, configure my userdir.conf file (under 10.5/Intel) shell> sudo vi /etc/apache2/users/mhughes.conf <Directory "/Users/mhughes/Sites"> Options Indexes MultiViews AllowOverride None Order allow,deny Allow from all ExpiresActive On ExpiresDefault "access plus 7 days" </Directory> 2) Restart Apache shell> sudo apachectl graceful --or-- "System Preferences" > "Sharing" > Uncheck "Web Sharing" > Wait a few seconds for it to shutdown > Re-check "Web Sharing" ** Confirm the restart by opening http://localhost/~mhughes/ - you should see "Your website here" and such. 3) Confirm Expires and ETag are being sent properly by the webserver (Apache 2.2). Open your terminal window and issue the following command: shell> curl -i --head http://localhost/~mhughes/images/macosxlogo.gif Result should look similar to below: [antbox:~] mhughes% curl -i --head http://localhost/~mhughes/images/macosxlogo.gif HTTP/1.1 200 OK Date: Sat, 22 Mar 2008 00:12:40 GMT Server: Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7l DAV/2 PHP/5.2.4 Last-Modified: Sat, 03 Nov 2007 20:12:50 GMT ETag: "52b87-b0d-43e0be8509480" Accept-Ranges: bytes Content-Length: 2829 Cache-Control: max-age=604800 Expires: Sat, 29 Mar 2008 00:12:40 GMT Content-Type: image/gif ** Note the ETag, Cache-Control, and Expires headers being returned. 3) Open Terminal.app and run the following: shell> sudo tail -n 0 -f /var/log/apache2/access_log | grep macosxlogo ## The -n 0 will ensure no unwelcome previous access lines 4) Open Safari to http://localhost/~mhughes/ 5) Return to your terminal window. Results should look similar to below: ::1 - - [21/Mar/2008:17:06:40 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 200 2829 ** Request return with a 200 response code. This is fine. 6) Return to Safari and reload the page. 7) Return to your terminal window. Results should look similar to below: ::1 - - [21/Mar/2008:17:06:48 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 200 2829 ** Request still have a 200 response code, not 304 (Not Modified) as expected. Expected Results: ----------------- All static content should have, upon subsequent load, been cached via the numerous caching headers. Actual Results: --------------- All content, regardless of cache headers from the server, is re-requested and retransferred from the server. This causes much more bandwidth to be used, which should also increase local resource usage and cache churning. Products affected: ------------------ Safari Version 3.1 (5525.13) 10.5 / Intel Safari Version 3.0.4 (523.15) WinXP SP2 WebKit r31114 10.5 / Intel Additional Builds and Platforms: -------------------------------- Other major browsers, by comparison, perform as expected (below are the Apache logs for relevant browsers) Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12 <---- Running on 10.5.2 / Intel First request: 127.0.0.1 - - [21/Mar/2008:17:13:58 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 200 2829 Second request: 127.0.0.1 - - [21/Mar/2008:17:14:04 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 304 - Opera/9.24 (Macintosh; Intel Mac OS X; U; en) <---- Running on 10.5.2 / Intel First request: 127.0.0.1 - - [21/Mar/2008:17:18:15 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 200 2829 Second request: 127.0.0.1 - - [21/Mar/2008:17:18:31 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 304 - Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en; rv:1.8.1.14pre) Gecko/20080321 Camino/1.6b3pre (like Firefox/2.0.0.14pre) <---- (Camino nightly) Running on 10.5.2 / Intel First request: 127.0.0.1 - - [21/Mar/2008:17:20:04 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 200 2829 Second request: 127.0.0.1 - - [21/Mar/2008:17:20:08 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 304 - Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) <---- Running on WinXP SP2 First request: 10.50.3.140 - - [21/Mar/2008:17:24:53 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 200 2829 Second request: 10.50.3.140 - - [21/Mar/2008:17:25:05 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 304 - Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1) <---- Running on WinXP SP2 First request: 10.50.3.140 - - [21/Mar/2008:17:28:47 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 200 2829 Second request: 10.50.3.140 - - [21/Mar/2008:17:28:49 -0700] "GET /~mhughes/images/macosxlogo.gif HTTP/1.1" 304 - Additional information: ----------------------- I am not familiar with the core of the engine enough to point blame, but I did notice that even with returning: ... Last-Modified: Sat, 03 Nov 2007 20:12:50 GMT ETag: "52b87-b0d-43e0be8509480" ... Cache-Control: max-age=604800 Expires: Sat, 29 Mar 2008 00:12:40 GMT ... The subsequent request(s) from WebKit/Safari will show (by using the web inspector to dump headers): >>> request headers >>> Cache-Control max-age=0 Referer http://localhost/~mhughes/ User-Agent Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_2; en-us) AppleWebKit/526.1+ (KHTML, like Gecko) Version/3.0.4 Safari/523.15 <<< request headers <<< >>> response headers >>> Accept-Ranges bytes Cache-Control max-age=604800 Connection Keep-Alive Content-Length 2829 Content-Type image/gif Date Sat, 22 Mar 2008 00:33:41 GMT Etag "52b87-b0d-43e0be8509480" Expires Sat, 29 Mar 2008 00:33:41 GMT Keep-Alive timeout=5, max=98 Last-Modified Sat, 03 Nov 2007 20:12:50 GMT Server Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7l DAV/2 PHP/5.2.4 <<< response headers <<< Of note is that no matter how many cache cleans or reloads, the request always seems to contain: Cache-Control max-age=0 A good example to test with: ---------------------------- * http://www.apple.com/startpage/ Clear cache > Open Inspector > Load Page. *** According to network panel, all of the resources total ~770KB. Reload page leaving inspector open. *** Same transfer of data. A sampling of the requests show all as pulling max-age=0 and retransferring. Note: As a point of comparison, repeat the same thing in FireFox with FireBug installed. On reload, ONLY the HTML page is reloaded from the network. All other pages trust their cached copies, even though they do confirm with the server that their copies are good.
Attachments
implement Specific end-to-end revalidation for WebCore memory cache (46.00 KB, patch)
2008-08-18 18:09 PDT, Antti Koivisto
no flags
updated patch (46.52 KB, patch)
2008-08-18 20:44 PDT, Antti Koivisto
darin: review-
more updates (51.34 KB, patch)
2008-08-25 15:16 PDT, Antti Koivisto
no flags
WebCore patch for enabling revalidation on reload (27.42 KB, patch)
2008-11-11 02:51 PST, Antti Koivisto
no flags
WebKit patch for enabling revalidation on reload (4.18 KB, patch)
2008-11-11 02:54 PST, Antti Koivisto
no flags
updated WebCore patch (27.58 KB, patch)
2008-11-11 14:30 PST, Antti Koivisto
no flags
Mark Rowe (bdash)
Comment 1 2008-03-21 19:13:46 PDT
Please try a similar experiment but instead of hitting Reload, simply load the page again by hitting Enter in the address bar.
Mark J. Hughes
Comment 2 2008-03-22 15:02:20 PDT
(In reply to comment #1) > Please try a similar experiment but instead of hitting Reload, simply load the > page again by hitting Enter in the address bar. On resubmission, or by following a link to that page, the cache is wholly respected and there is not even a request to validate the content to confirm the page's status - there is no network activity whatsoever. This is great, and even more "aggressive" than other browsers. However, why during a reload are the linked resources themselves not trusted? I could see a reload causing the contents of the "primary" URL to be forceably reloaded/reverifed (Cache-Control: max-age=0), but if that page references resources in the cache already, wouldn't it be more sensible/efficient to let the cache work the same way as if you had reentered the URL? In this case, with Expires/Cache-control: max-age=>0 (freshness) and ETag (validator) returned in the initial response, sending the If-None-Match / If-Modified-Since headers would make the utmost sense. In short, even if reload becomes a condition for the browser to forceably revalidate all of the resources involved, shouldn't the proper outcome of described scenario be a stream of 304s from the server in the ideal situation?
Mark Rowe (bdash)
Comment 3 2008-03-22 15:08:23 PDT
See <http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4>: What you describe for the current behaviour is termed "Unspecified end-to-end revalidation". What you describe for the preferred behaviour is termed "Specific end-to-end revalidation". The paragraph preceding the definitions suggests that specific end-to-end revalidation should be used when a local cached copy of the resource is available, which would appear to be in agreement the approach you suggest. I suspect that the current behaviour that WebKit uses is determined by the CFNetwork stack it uses for HTTP rather than in WebKit itself.
Mark Rowe (bdash)
Comment 4 2008-03-22 15:09:02 PDT
I think the bug should be retitled to reflect the specific problem you mentioned. At the moment it's too general and doesn't accurately describe the situation.
Mark J. Hughes
Comment 5 2008-03-22 15:55:49 PDT
Renamed per your suggestion. You are certainly correct in this, as the file is in the cache already, and therefore not a total unknown.
Mark Rowe (bdash)
Comment 6 2008-03-23 21:44:42 PDT
Alexey Proskuryakov
Comment 7 2008-03-23 23:18:16 PDT
> Note: As a point of comparison, repeat the same thing in FireFox with FireBug > installed. On reload, ONLY the HTML page is reloaded from the network. In general, Safari reloading behavior should be compared to what other browsers do on Shift-reload - since we only have on flavor of reload, it has to be the more thorough one.
Mark J. Hughes
Comment 8 2008-03-24 09:02:24 PDT
> > Note: As a point of comparison, repeat the same thing in FireFox with FireBug > > installed. On reload, ONLY the HTML page is reloaded from the network. > > In general, Safari reloading behavior should be compared to what other browsers > do on Shift-reload - since we only have on flavor of reload, it has to be the > more thorough one. All due respect to the many engineers who have spent their time working on this project, but that doesn't sound like a proper implementation of the mechanism. Most all browsers seem to implement "Specific end to end" on reload, as referenced earlier. That WebKit would forcibly ignore the cache on reload seems counterintuitive, and unfriendly to the types of issues I list above - low bandwidth, high latency, or resource-limited devices - the first two I have spent much time in, and I am made to shy away from WebKit/Safari when I do. That said, what is the rationale for wholesale reloading on "reload?"
Alexey Proskuryakov
Comment 9 2008-03-24 09:22:31 PDT
Please note that my personal opinion is that we should have both reload and shift-reload. But in the absence of the latter, it appears obvious that reload should fulfill the primary purpose of reloading - which is to guarantee freshness of content, regardless of proxy misconfiguration, time zone changes, software bugs and other issues that may necessitate full reload. Saving bandwidth is a very important consideration, but a secondary one. Just as an example, in the past I've personally experienced a situation where I couldn't reload stale content that was stuck in a proxy from within Safari - I had to open the same page in Firefox and use shift-reload; only after that, Safari would display updated content.
Antti Koivisto
Comment 10 2008-03-24 10:10:03 PDT
It is not acceptable to make user experience worse for every user to deal with exteremely rare caching problems. Supporting shift-reload is probably the right way to deal with those cases.
Mark J. Hughes
Comment 11 2008-03-24 10:19:20 PDT
> Please note that my personal opinion is that we should have both reload and > shift-reload. But in the absence of the latter, it appears obvious that reload > should fulfill the primary purpose of reloading - which is to guarantee > freshness of content, regardless of proxy misconfiguration, time zone changes, > software bugs and other issues that may necessitate full reload. Saving > bandwidth is a very important consideration, but a secondary one. So what would the process be to have both options available? Clearly, it's not a situation where the ideal is one or the other, but in fact both. I would also think that there could be a client-side preference to force super-reload when connected to a known bad proxy, or in general (for web dev testing for instance), or to be OS X Network Locaton specific (so that you could say to always err on the side of trust on a cellular WiFi network). I'd like to dig in further to the codebase and see if I could perhaps contribute this. Does anyone know if the work would be in WebKit, WebCore, CFNetwork, or is it not community changeable? > Just as an example, in the past I've personally experienced a situation where I > couldn't reload stale content that was stuck in a proxy from within Safari - I > had to open the same page in Firefox and use shift-reload; only after that, > Safari would display updated content. As both are just browsers, I believe it would then be advisable to research if there are any differences between WebKit SuperReload'ing and FF's SuperReload'ing. If FF can force an upstream to re-cache, and WebKit cannot, that seems like WebKit needs additional headers/something to force upstreams to re-cache/re-validate content.
Alexey Proskuryakov
Comment 12 2008-03-24 15:26:21 PDT
(In reply to comment #11) > Does anyone know if the work would be in WebKit, WebCore, > CFNetwork, or is it not community changeable? Depending on what the UI will be, this may be WebKit or Safari. Web Inspector is in open source WebKit, while Safari UI is closed source. Some supporting changes in lower level frameworks might be necessary, too. > As both are just browsers, I believe it would then be advisable to research if > there are any differences between WebKit SuperReload'ing and FF's > SuperReload'ing. This was a while ago, I believe Safari's behavior has changed, and using Firefox to reset proxy wouldn't have been necessary now.
Alexey Proskuryakov
Comment 13 2008-03-25 01:08:07 PDT
(In reply to comment #12) > This was a while ago, I believe Safari's behavior has changed, and using > Firefox to reset proxy wouldn't have been necessary now. Actually, I take my words back - this part hasn't changed, see bug 7414 comment 6 (the plain reload results mentioned there are obsolete).
Antti Koivisto
Comment 14 2008-08-18 18:09:59 PDT
Created attachment 22872 [details] implement Specific end-to-end revalidation for WebCore memory cache putting this up for comments
Antti Koivisto
Comment 15 2008-08-18 20:44:44 PDT
Created attachment 22874 [details] updated patch Fixes based on comments by Mark. Added code that includes "Cache-Control: max-age=0" in reload case (which is not enabled for now) required for correct "Unspecified end-to-end revalidation" .
Darin Adler
Comment 16 2008-08-24 15:32:07 PDT
Comment on attachment 22874 [details] updated patch This looks great. I have a few questions about it. How can we test this? I assume you did some testing already, but I'm curious how we can prevent changes from causing regressions in this area. I'm also thinking that the division of labor between WebCore and the underlying loader library is getting weaker. Some of the code here should probably be using functions from ResourceHandle; maybe the constant "304" could be coming from there for example and also the "can revalidate". 195 if (resource->resourceToRevalidate()) 196 return; I don't like the way this if statement reads. It's in a function named revalidateResource it says "if there is a resource to revalidate, just return". This shows that the names aren't quite right. I think I understand the case where this would be true, but I'm not sure. Maybe this is a resource that already represents a revalidation? Maybe we need a comment explaining this a little more? 201 String url = resource->url(),; The comma here looks like a syntax error. It seems that we wouldn't be able to compile this. Also might be more efficient to use const& here. 282 // Equivalent of deref() for all clients 283 m_clients.clear(); 284 285 unsigned moveCount = clientsToMove.size(); 286 for (unsigned n = 0; n < moveCount; ++n) 287 m_resourceToRevalidate->addClient(clientsToMove[n]); Does this create a window where the clients could go away altogether or something along those lines? Normally doing a deref before a reref is incorrect. Maybe it's OK since it's not literally deref(). Also, it's possible your use of "deref()" here is out of date, since Hyatt renamed the functions here in the cache since they weren't quite ref/deref. 292 return !m_loading && (!m_response.httpHeaderField("Last-Modified").isEmpty() || !m_response.httpHeaderField("ETag").isEmpty()); This assumes that ETag is implemented. It's not implemented in current versions of CFNetwork! 48 void setResource(CachedResource* res) { We put the braces on a separate line. And maybe a function like this shouldn't be inline. The general rule is that we want to inline functions that are completely trivial, like a single assignment, or ones where there's a clear performance win. But not others. 57 CachedResourceHandleBase& operator=(const CachedResourceHandleBase&) { return *this; } Do we want this function to be defined at all? Who needs to call it? 245 String lastModified = resourceToRevalidate->response().httpHeaderField("Last-Modified"); 246 String eTag = resourceToRevalidate->response().httpHeaderField("ETag"); These can be more efficient if they use const& so they don't have to ref/deref the string. I'm going to say review- because of that comma (at least) and because there seem to be enough questions that at least one might cause you to change the patch.
Mark Rowe (bdash)
Comment 17 2008-08-24 15:42:48 PDT
(In reply to comment #16) > (From update of attachment 22874 [details] [edit]) > > 292 return !m_loading && > (!m_response.httpHeaderField("Last-Modified").isEmpty() || > !m_response.httpHeaderField("ETag").isEmpty()); > > This assumes that ETag is implemented. It's not implemented in current versions > of CFNetwork! The revalidation is handled entirely in WebCore, with Loader::Host::servePendingRequests adding the If-None-Match header if the original response had an ETag header. This should allow us to use ETags to revalidate from the memory cache even when using a network library that does not support revalidation via ETags.
Antti Koivisto
Comment 18 2008-08-24 17:20:45 PDT
(In reply to comment #16) > (From update of attachment 22874 [details] [edit]) > This looks great. I have a few questions about it. > > How can we test this? I assume you did some testing already, but I'm curious > how we can prevent changes from causing regressions in this area. I'm also > thinking that the division of labor between WebCore and the underlying loader > library is getting weaker. I don't have good story for that. I have been running versions of this patch in regular use over a few months but that obviously that obviously does not catch that much. Automated HTTP tests that give code coverage should be possible but good coverage of real world scenarios (involving potentially broken servers, caches, etc) is very difficult. This is true for all our networking code. I don't think division of labor is getting weaker. That overall idea is to treat networking library cache and WebCore cache as two separate levels in HTTP cache hierarchy. The specification is written in multiple cache levels in mind and the interaction between levels is well defined. This allows us to implement functionality (like ETag, below) without caring if it is implemented in other levels. For example if the CFNetwork level includes a cache validator to a request for an item in its own cache we never see the resulting 304 response in WebCore level. It gets transparently turned into 200 response. We only get 304 responses for requests that include a validator from the WebCore level. > Some of the code here should probably be using functions from ResourceHandle; > maybe the constant "304" could be coming from there for example and also the > "can revalidate". > > 195 if (resource->resourceToRevalidate()) > 196 return; > > I don't like the way this if statement reads. It's in a function named > revalidateResource it says "if there is a resource to revalidate, just return". > This shows that the names aren't quite right. I think I understand the case > where this would be true, but I'm not sure. Maybe this is a resource that > already represents a revalidation? Maybe we need a comment explaining this a > little more? I don't like the name either. I changed it a few times and this was the best I came up with. The questions here is along the lines of "Is this a resource that represents a request that includes cache validator for an existing resource". > > 201 String url = resource->url(),; > > The comma here looks like a syntax error. It seems that we wouldn't be able to > compile this. Also might be more efficient to use const& here. Good catch. Suprisingly it compiles fine, as does int a,; No idea if this is a compiler bug or if it is legal C++. > > 282 // Equivalent of deref() for all clients > 283 m_clients.clear(); > 284 > 285 unsigned moveCount = clientsToMove.size(); > 286 for (unsigned n = 0; n < moveCount; ++n) > 287 m_resourceToRevalidate->addClient(clientsToMove[n]); > > Does this create a window where the clients could go away altogether or > something along those lines? Normally doing a deref before a reref is > incorrect. Maybe it's OK since it's not literally deref(). Also, it's possible > your use of "deref()" here is out of date, since Hyatt renamed the functions > here in the cache since they weren't quite ref/deref. Yeah, deref() comment is pre-rename. CachedResources have weird destruction rules that involve multiple conditions (CachedResources::canDelete()), and this patch adds more (for resources in process of being revalidated) so just removing clients don't kill the objects. > > 292 return !m_loading && > (!m_response.httpHeaderField("Last-Modified").isEmpty() || > !m_response.httpHeaderField("ETag").isEmpty()); > > This assumes that ETag is implemented. It's not implemented in current versions > of CFNetwork! The implementation is entirely self contained in WebCore. We can validate WebCore cache with ETag validators even if CFNetwork can not do it with its cache. > > 48 void setResource(CachedResource* res) { > > We put the braces on a separate line. And maybe a function like this shouldn't > be inline. The general rule is that we want to inline functions that are > completely trivial, like a single assignment, or ones where there's a clear > performance win. But not others. Ok. > 57 CachedResourceHandleBase& operator=(const > CachedResourceHandleBase&) { return *this; } > > Do we want this function to be defined at all? Who needs to call it? I made it private to ensure that no one is invoking the default version. The handle went though various iterations, that helped to catch bugs at least in some form. > > 245 String lastModified = > resourceToRevalidate->response().httpHeaderField("Last-Modified"); > 246 String eTag = > resourceToRevalidate->response().httpHeaderField("ETag"); > > These can be more efficient if they use const& so they don't have to ref/deref > the string. > > I'm going to say review- because of that comma (at least) and because there > seem to be enough questions that at least one might cause you to change the > patch. Thanks for the review!
Antti Koivisto
Comment 19 2008-08-25 15:16:18 PDT
Created attachment 22987 [details] more updates Fixes based on review comments, cleanups - added CachedResource::mustRevalidate(CachePolicy) that takes "Cache-Control: no-cache" and "CacheControl: must-revalidate" into account - added CachedResource::isCacheValidator() - renamed CachedResource::canBeRevalidated() to canUseCacheValidator() - added CachedResourceHandle.cpp - sorted headers - random cosmetic fixes
Darin Adler
Comment 20 2008-08-30 19:14:07 PDT
Comment on attachment 22987 [details] more updates Sorry, I can't remember which of these are repeating something I already asked. 213 CachedScript *cs = pendingScripts.first().get(); Would be nice to move the star when you're tweaking this line of code. 225 int delta = static_cast<int>(resource->size()); Does this really require an explicit cast? 103 void revalidationSucceeded(CachedResource* revalidatingResource, const ResourceResponse& response); 104 void revalidationFailed(CachedResource* revalidatingResource); Seems that you don't need the argument names here. 111 return (difftime(now, m_expirationDate) >= 0); Extra parentheses on this line. 88 void checkDelete(); Not a great name. This names it sound like we check deletion, which doesn't make a lot of logical sense. A better name might be deleteIfPossible() or deleteIfNotInCache(). Those are less elegant than checkDelete, but at least a little easier to understand. There's probably an even better name. 301 return !cacheControl.isEmpty() && (cacheControl.contains("no-cache") || (isExpired() && cacheControl.contains("must-revalidate"))); 302 return isExpired() || cacheControl.contains("no-cache"); I think we'd want the case-ignoring version of contains, right? Also, is a substring match really the correct way to read this field? That seems a little sloppy to me. What's the correct algorithm? 134 void setResponse(const ResourceResponse& response); 188 void setResourceToRevalidate(CachedResource* resource); Should omit the argument names in these cases. 31 void CachedResourceHandleBase::setResource(CachedResource* resource) { Brace goes on a separate line. 39 operator bool() const { return get(); } We normally don't use operator bool. I think it would be best to imitate the technique used in our smart pointers such as RefPtr instead. 42 CachedResourceHandleBase() : m_resource(0) {} We use a space in these parentheses. 46 void setResource(CachedResource* res); Should omit the argument names in these cases. None of these are serious concerns, so I'll say review+.
Antti Koivisto
Comment 21 2008-09-05 02:32:04 PDT
Sending WebCore/ChangeLog Sending WebCore/WebCore.pro Sending WebCore/WebCore.vcproj/WebCore.vcproj Sending WebCore/WebCore.xcodeproj/project.pbxproj Sending WebCore/css/CSSFontFaceSource.h Sending WebCore/css/CSSImportRule.h Sending WebCore/dom/Clipboard.h Sending WebCore/dom/ProcessingInstruction.h Sending WebCore/dom/ScriptElement.h Sending WebCore/dom/XMLTokenizer.cpp Sending WebCore/dom/XMLTokenizer.h Sending WebCore/html/HTMLImageLoader.cpp Sending WebCore/html/HTMLImageLoader.h Sending WebCore/html/HTMLLinkElement.h Sending WebCore/html/HTMLTokenizer.cpp Sending WebCore/html/HTMLTokenizer.h Sending WebCore/loader/Cache.cpp Sending WebCore/loader/Cache.h Sending WebCore/loader/CachedResource.cpp Sending WebCore/loader/CachedResource.h Adding WebCore/loader/CachedResourceHandle.cpp Adding WebCore/loader/CachedResourceHandle.h Sending WebCore/loader/DocLoader.cpp Sending WebCore/loader/UserStyleSheetLoader.h Sending WebCore/loader/loader.cpp Sending WebCore/page/EventHandler.cpp Sending WebCore/rendering/RenderImage.cpp Sending WebCore/rendering/RenderImage.h Sending WebCore/rendering/style/RenderStyle.h Sending WebCore/rendering/style/StyleCachedImage.h Sending WebCore/svg/SVGFEImageElement.h Sending WebCore/svg/graphics/filters/SVGFEImage.h Sending WebCore/xml/XSLImportRule.h Transmitting file data ................................. Committed revision 36109.
Antti Koivisto
Comment 22 2008-09-05 02:33:45 PDT
> Also, is a substring match really the correct way to read this field? That > seems a little sloppy to me. What's the correct algorithm? I added a FIXME. It should really be tokenized but substring match works in practice (and some http code i have seen does the same thing).
Antti Koivisto
Comment 23 2008-09-05 02:34:28 PDT
Leaving this open for followup patch to enable it for reloading case.
Eric Seidel (no email)
Comment 24 2008-10-06 18:25:34 PDT
Comment on attachment 22987 [details] more updates Clearing review flag since this has landed.
Antti Koivisto
Comment 25 2008-11-11 02:51:58 PST
Created attachment 25045 [details] WebCore patch for enabling revalidation on reload
Antti Koivisto
Comment 26 2008-11-11 02:54:29 PST
Created attachment 25046 [details] WebKit patch for enabling revalidation on reload This also contains hack that allows WebKit running in shipping Safari to use shift-reload to force end-to-end reload.
Antti Koivisto
Comment 27 2008-11-11 14:30:23 PST
Created attachment 25055 [details] updated WebCore patch Some small updates - set request cache policy in addExtraFieldsToRequest() - some more text in ChangeLog
Mark Rowe (bdash)
Comment 28 2008-11-13 15:48:13 PST
*** Bug 13892 has been marked as a duplicate of this bug. ***
Andrew
Comment 29 2008-11-24 10:42:46 PST
Does this bug need to be accepted as a bug (instead of having the status NEW) in order to get reviewed?
Darin Adler
Comment 30 2008-11-24 11:49:19 PST
(In reply to comment #29) > Does this bug need to be accepted as a bug (instead of having the status NEW) > in order to get reviewed? No. "NEW" just means that the assignee hasn't accepted. It's more the status of the assignee than the status of the bug. We don't use it much on the WebKit project.
Andrew
Comment 31 2008-11-24 12:41:01 PST
So currently the bug is assigned to "Nobody". Should it be assigned to the person in charge of this area of the Webkit engine so that it can be moved along? I'm quite keen to see these patches get reviewed as this bug very much affects my daily usage of Webkit.
Darin Adler
Comment 32 2008-11-24 12:55:36 PST
(In reply to comment #31) > So currently the bug is assigned to "Nobody". Should it be assigned to the > person in charge of this area of the Webkit engine so that it can be moved > along? I'm quite keen to see these patches get reviewed as this bug very much > affects my daily usage of Webkit. This patch is in the review queue. There are about 50 others in the queue too. Assignment of the bug won't affect how soon it's reviewed.
Darin Adler
Comment 33 2008-11-24 13:51:30 PST
Comment on attachment 25046 [details] WebKit patch for enabling revalidation on reload > + // FIXME: This is a hack to make shift-reload work on shipping Safari. Remove when Safari has this functionality. > + if ([[[NSBundle mainBundle] bundleIdentifier] isEqualToString:@"com.apple.Safari"]) Normally we would do a linked-on-or-after check to make this back off automatically when the new Safari ships.
Darin Adler
Comment 34 2008-11-24 13:51:30 PST
Comment on attachment 25046 [details] WebKit patch for enabling revalidation on reload > + // FIXME: This is a hack to make shift-reload work on shipping Safari. Remove when Safari has this functionality. > + if ([[[NSBundle mainBundle] bundleIdentifier] isEqualToString:@"com.apple.Safari"]) Normally we would do a linked-on-or-after check to make this back off automatically when the new Safari ships.
Darin Adler
Comment 35 2008-11-24 13:56:12 PST
Comment on attachment 25055 [details] updated WebCore patch > + CachePolicy cachePolicy = this->cachePolicy(); > + if (cachePolicy == CachePolicyVerify || cachePolicy == CachePolicyCache) { > + CachedResource* existing = cache()->resourceForURL(fullURL.string()); > + if (existing && !existing->isPreloaded() && existing->mustRevalidate(cachePolicy)) { > + cache()->revalidateResource(existing, this); > + m_reloadedURLs.add(fullURL.string()); > + } > + return; > + } > + if (cachePolicy == CachePolicyReload || cachePolicy == CachePolicyRevalidate) { > + CachedResource* existing = cache()->resourceForURL(fullURL.string()); > + if (existing && !existing->isPreloaded()) { > + if (cachePolicy == CachePolicyReload) > + cache()->remove(existing); > + else > + cache()->revalidateResource(existing, this); > + m_reloadedURLs.add(fullURL.string()); > + } > } The above looks like it should be a switch statement. > - ScheduledRedirection(double redirectDelay, const String& redirectURL, bool redirectLockHistory, bool userGesture) > + ScheduledRedirection(double redirectDelay, const String& redirectURL, bool redirectLockHistory, bool userGesture, bool refresh) It's a little confusing to have both the term "reload" and the term "refresh" used. > + ResourceRequest request(url, referrer, refresh ? ReloadIgnoringCacheData : UseProtocolCachePolicy); Two spaces here after the comma. > - scheduleRedirection(new ScheduledRedirection(delay, url, delay <= 1, false)); > + scheduleRedirection(new ScheduledRedirection(delay, url, delay <= 1, false, false)); Yuck, I hate booleans. We should really use enums instead so we can see what "false" means. > + // FIXME: We don't have a mechanism to revalidate the main resource without reloadin at the moment. Typo here. Need a "g" on the end of "reloading". > - void changeLocation(const String& url, const String& referrer, bool lockHistory = true, bool userGesture = false); > - void changeLocation(const KURL&, const String& referrer, bool lockHistory = true, bool userGesture = false); > + void changeLocation(const String& url, const String& referrer, bool lockHistory = true, bool userGesture = false, bool refresh = false); > + void changeLocation(const KURL&, const String& referrer, bool lockHistory = true, bool userGesture = false, bool refresh = false); You added the "= false" default here, but you also added the false boolean explicitly to some call sites. I think you should omit it at the call sites. r=me
Antti Koivisto
Comment 36 2008-12-15 07:14:08 PST
> The above looks like it should be a switch statement. Fixed. > It's a little confusing to have both the term "reload" and the term "refresh" > used. The confusing terminology predates this patch. At least some instances of "refresh" refer to meta refresh header. Cleaning this up left as future exercise. > > + ResourceRequest request(url, referrer, refresh ? ReloadIgnoringCacheData : UseProtocolCachePolicy); > > Two spaces here after the comma. Fixed. > Yuck, I hate booleans. We should really use enums instead so we can see what > "false" means. Agreed. Still keeping this bit of refactoring out from this patch. > Typo here. Need a "g" on the end of "reloading". Fixed. > You added the "= false" default here, but you also added the false boolean > explicitly to some call sites. I think you should omit it at the call sites. Fixed. Also did some slight additional refactoring which probably does not need review.
Antti Koivisto
Comment 37 2008-12-15 07:30:49 PST
Sending WebCore/ChangeLog Sending WebCore/WebCore.base.exp Sending WebCore/loader/CachePolicy.h Sending WebCore/loader/DocLoader.cpp Sending WebCore/loader/DocLoader.h Sending WebCore/loader/DocumentLoader.cpp Sending WebCore/loader/FrameLoader.cpp Sending WebCore/loader/FrameLoader.h Sending WebCore/loader/FrameLoaderTypes.h Sending WebCore/loader/NavigationAction.cpp Sending WebCore/loader/SubresourceLoader.cpp Sending WebCore/loader/loader.cpp Transmitting file data ............ Committed revision 39304.
Antti Koivisto
Comment 38 2008-12-15 07:32:02 PST
Sending WebKit/mac/ChangeLog Sending WebKit/mac/WebView/WebFrame.h Sending WebKit/mac/WebView/WebFrame.mm Sending WebKit/mac/WebView/WebFramePrivate.h Sending WebKit/mac/WebView/WebView.h Sending WebKit/mac/WebView/WebView.mm Transmitting file data ...... Committed revision 39305.
Antti Koivisto
Comment 39 2008-12-15 07:33:48 PST
> Normally we would do a linked-on-or-after check to make this back off > automatically when the new Safari ships. I will add this bit later.
Antti Koivisto
Comment 40 2008-12-15 07:34:31 PST
This bug still remain unresolved for the top level resources.
Mark Rowe (bdash)
Comment 41 2009-01-12 23:36:55 PST
Comment on attachment 25046 [details] WebKit patch for enabling revalidation on reload Clearing review flag as this was landed.
Mark Rowe (bdash)
Comment 42 2009-01-12 23:37:11 PST
Comment on attachment 25055 [details] updated WebCore patch Clearing review flag as this was landed.
Andrew
Comment 43 2009-03-27 08:46:25 PDT
Perhaps this is bad bugzilla-etiquette but can someone review these patches please? They were added to this bug 3.5 months ago now. If I had the skill I would but I don't so I'm afraid I need to ask. Thank you.
Mark Rowe (bdash)
Comment 44 2009-03-27 09:07:57 PDT
None of the patches attached to this bug are waiting for review. The relevant ones have already been landed, as my two previous comments indicate.
Andrew
Comment 45 2009-03-27 09:10:35 PDT
Mark, thanks. So we're awaiting further patches from Antti in order to be able to mark this bug as resolved?
David Kilzer (:ddkilzer)
Comment 46 2010-01-04 16:44:48 PST
(In reply to comment #45) > Mark, thanks. So we're awaiting further patches from Antti in order to be able > to mark this bug as resolved? Correct. Antti? :)
Alexander Romanovich
Comment 47 2010-10-27 09:23:59 PDT
Just to be clear: are you planning to send If-None-Match/If-Modified-Since headers with a regular reload request in WebKit? At the moment WebKit browsers do not do this, while FF and IE do. FF and IE can then benefit from 304 Not Modified responses from the server with no body content and it would be great if WebKit could receive those too, especially in situations where end users are hammering their reload button in order to get content they're expecting to be published in the immediate future.
Alexey Proskuryakov
Comment 48 2010-10-27 10:09:12 PDT
Just to avoid confusion, this is only about top level resources. Subresources do get their conditional headers on reload, and we also have shift-reload now (there is no Cmd+Shift+R, but you can shift-click the reload button). The top level resource being different is still a bug that needs to be fixed.
Antti Koivisto
Comment 49 2010-11-09 04:28:32 PST
I filed bug 49246 specifically for the main resource in the memory cache.
Alexander Romanovich
Comment 50 2013-01-30 06:51:45 PST
Is this any closer to being fixed for main resources now that Nate Chapin landed all the changes for main resources and the memory cache in: Bug 49246 - Main resource should be cached in the memory cache Bug 105667 - Enable reuse of cached main resources
Brady Eidson
Comment 51 2013-04-18 13:47:45 PDT
We assumed that once we cached main resources, main resources would get this for free. Well, we cache main resources now and - as pointed out in 114738 - we haven't gotten it for free. We can finish this up now.
Brady Eidson
Comment 52 2013-04-18 13:53:31 PDT
*** Bug 114738 has been marked as a duplicate of this bug. ***
Alexander Romanovich
Comment 53 2015-10-21 11:09:33 PDT
Hi all. Though there was some initial debate about this 7 years ago, the most recent comments here suggest there isn't any controversy today. Especially based on comments 48 and 51. Are any WebKit developers actively interested in pursuing this?
Note You need to log in before you can comment on or make changes to this bug.