Bug 156048 - We sometimes fail to remove outdated entry from the disk cache after revalidation and when the resource is no longer cacheable
Summary: We sometimes fail to remove outdated entry from the disk cache after revalida...
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: WebKit Nightly Build
Hardware: Mac OS X 10.11
: P2 Normal
Assignee: Chris Dumez
Keywords: InRadar
Depends on:
Reported: 2016-03-30 17:17 PDT by Nolan Lawson
Modified: 2016-04-05 09:50 PDT (History)
8 users (show)

See Also:

Reproducible test case (2.71 KB, application/zip)
2016-03-30 17:17 PDT, Nolan Lawson
no flags Details
Patch (7.79 KB, patch)
2016-04-04 15:59 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Nolan Lawson 2016-03-30 17:17:04 PDT
Created attachment 275241 [details]
Reproducible test case

This bug appears in WebKit Nightly (9.1 11601.5.17.1 r198834), but not the stable version of Safari (9.1 11601.5.17.1).

Steps to repro:

1. Safari performs a GET, server provides a cached response with an ETag
2. Safari requests with if-none-match and the server responds with a 304
3. Safari requests again, but resource has been removed so the server responds with a non-empty 404
4. Safari requests with if-none-match again, so server responds with 304 (this is where the bug occurs!)
5. Every subsequent request is a 304 for the *old* resource rather than the new resource

This is not a niche situation - it happens pretty frequently with CouchDB due to CouchDB re-using the revision identifier (_rev) as the ETag. If a resource is removed and re-created with the same _rev, then Safari will not ask for new content and will remain permanently stuck on old content.

If step 2 is skipped (i.e. Safari doesn't receive the 304 response), then the bug does not occur.

This bug cannot be reproduced in Firefox (which submits if-none-match) or Chrome (which doesn't submit if-none-match).

I've attached a test case as a ZIP file. I've also created a Gist: https://gist.github.com/nolanlawson/6cc2a537d390578bc235c00996f5b938. Steps to reproduce are in the _readme.md.
Comment 1 Nolan Lawson 2016-03-31 06:06:09 PDT
Clarification: by "server provides a cached response" I meant "server provides a cacheable response."

I'll also note that this could be considered a CouchDB bug rather than a WebKit bug, because if the client never gets the 404, they would never be informed that the content had changed, since the ETag never changed. And AFAICT the spec doesn't specify whether 404 should invalidate the ETag: http://tools.ietf.org/html/rfc7232#section-2.3
Comment 2 Nolan Lawson 2016-03-31 06:59:47 PDT
FWIW, I've filed a bug on CouchDB: https://issues.apache.org/jira/browse/COUCHDB-2978

I leave it up to your discretion as to whether you consider this a bug from WebKit's perspective.
Comment 3 Radar WebKit Bug Importer 2016-04-02 23:06:32 PDT
Comment 4 Chris Dumez 2016-04-03 16:06:47 PDT
Disabling disk cache speculative validation does not fix the problem so this is not a regression from this new feature.
Comment 5 Chris Dumez 2016-04-03 16:29:47 PDT
Seems like a disk cache bug. We should probably delete the entry we have in the disk cache in case of failed revalidation, if the new response is not cacheable.
Comment 6 Chris Dumez 2016-04-03 16:48:59 PDT
(In reply to comment #5)
> Seems like a disk cache bug. We should probably delete the entry we have in
> the disk cache in case of failed revalidation, if the new response is not
> cacheable.

Actually, it looks like we already do so at disk cache level. However, what I see is the memory cache making revalidation requests.
Comment 7 Chris Dumez 2016-04-03 17:27:16 PDT
This is a regression from:
Comment 8 Chris Dumez 2016-04-03 17:28:36 PDT
I will try and look into this soon.
Comment 9 Chris Dumez 2016-04-04 12:33:53 PDT
@Antti: So I found the bug.

The issue is that Storage::removeFromPendingWriteOperations(const Key&) only removes the *first* pending write operation with the key instead of *ALL* of them.

So even though, the disk cache try to remove outdated entries from the cache, it can fail if there are several pending writes for this entry.
Comment 10 Antti Koivisto 2016-04-04 12:37:05 PDT
Nice find. Sounds like easy fix.
Comment 11 Chris Dumez 2016-04-04 15:59:40 PDT
Created attachment 275586 [details]
Comment 12 Chris Dumez 2016-04-04 16:00:41 PDT
@Nolan Lawson: Thanks a lot for the bug report and especially for the test case. It helped a lot.
Comment 13 Antti Koivisto 2016-04-05 06:32:29 PDT
Comment on attachment 275586 [details]

View in context: https://bugs.webkit.org/attachment.cgi?id=275586&action=review

> Source/WebKit2/NetworkProcess/cache/NetworkCacheStorage.cpp:529
> +        auto found = m_pendingWriteOperations.findIf([&key](const std::unique_ptr<WriteOperation>& operation) {

Another option would be to  eliminate duplicates from m_pendingWriteOperations by turning it into ListHashSet or something. It is bit silly we may end up writing same entry multiple times (though it didn't feel common enough to optimize just for itself).
Comment 14 WebKit Commit Bot 2016-04-05 09:50:33 PDT
Comment on attachment 275586 [details]

Clearing flags on attachment: 275586

Committed r199061: <http://trac.webkit.org/changeset/199061>
Comment 15 WebKit Commit Bot 2016-04-05 09:50:38 PDT
All reviewed patches have been landed.  Closing bug.