Bug 117007

Summary: Memory-cached file: resources should be revalidated when accessed
Product: WebKit Reporter: Andy Estes <aestes>
Component: Page Loading Assignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Enhancement CC: ap, beidson, darin, kling, koivisto, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 113626, 116906    
Bug Blocks:    

Description Andy Estes 2013-05-29 16:05:30 PDT
Right now we cache file: resources indefinitely in WebCore's memory cache, which means that WebCore will continue to use a cached resource even if the underlying file was changed on disk. An attempt was made to stop doing this in <https://bugs.webkit.org/show_bug.cgi?id=113626>, but simply not caching file: resources caused regressions in our test infrastructure and in third-party WebKit clients. We should find an approach that gives us the benefits of memory caching but does the right thing if the underlying file on disk changes after we've cached it.

Here are two approaches I can think of:

1) Compare file modification time to resource's response time when accessing a cached resource, and reload the resource if the file modification time is newer than the response time. This would be a simple solution, but could cause responsiveness issues if the file is hosted on a slow or non-responsive network filesystem. The typical I/O-on-the-main-thread caveats apply here.

2) Implement a revalidation check similar to what we do for HTTP revalidation. We could assume the cached resource is valid but at the same time queue a validation check on a background thread that will check file modification time. If the resource is still valid then we mark it as such and synthesize resource load callbacks, but if it isn't then we kick off a new load and replace the old cached resource with the new response. We could even paint images while validation is occurring to avoid flickering so long as we don't fire load/error events and resource load delegate callbacks more than once.
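A minimal sketch of approach 1, assuming C++17 std::filesystem rather than WebKit's own file system helpers. CachedFileEntry and needsReload are hypothetical names used only for illustration, not existing WebCore types, and responseTime is assumed to be stored on the same clock as the file modification time.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for a memory-cache entry backed by a file: URL.
struct CachedFileEntry {
    fs::path path;                    // file the entry was loaded from
    fs::file_time_type responseTime;  // when the cached response was produced
};

// Returns true if the entry should be reloaded: the file is missing or
// unreadable, or it was modified after the cached response was produced.
bool needsReload(const CachedFileEntry& entry)
{
    assert(!entry.path.empty());
    std::error_code ec;
    // Blocking call: this is the I/O that would run on the main thread.
    const auto modified = fs::last_write_time(entry.path, ec);
    if (ec)
        return true;  // assumption: treat a failed lookup as stale
    return modified > entry.responseTime;
}
```

The last_write_time call is the part that could stall on a slow or unresponsive network filesystem, which is the caveat noted for this approach.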

There might be other approaches that would work here, too.
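Approach 2 could be sketched along these lines, assuming std::async stands in for whatever background queue WebKit would actually use. CachedFileEntry, ValidationResult, and revalidateInBackground are hypothetical names, not existing WebCore API. The caller would keep using the cached resource until the future resolves, then either mark the entry valid or start a new load.

```cpp
#include <cassert>
#include <filesystem>
#include <future>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

enum class ValidationResult { StillValid, MustReload };

// Hypothetical stand-in for a memory-cache entry backed by a file: URL.
struct CachedFileEntry {
    fs::path path;                    // file the entry was loaded from
    fs::file_time_type responseTime;  // when the cached response was produced
};

// Starts the modification-time check on another thread and returns at once.
// The entry is taken by value so the worker shares no state with the caller.
std::future<ValidationResult> revalidateInBackground(CachedFileEntry entry)
{
    assert(!entry.path.empty());
    return std::async(std::launch::async, [entry = std::move(entry)] {
        std::error_code ec;
        const auto modified = fs::last_write_time(entry.path, ec);
        // Assumption: a failed lookup is treated the same as a changed file.
        if (ec || modified > entry.responseTime)
            return ValidationResult::MustReload;
        return ValidationResult::StillValid;
    });
}
```

Note that a future returned by std::async blocks in its destructor until the task finishes, so a real implementation would more likely post a completion callback back to the main thread than hold a future.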
Comment 1 Andy Estes 2013-05-29 16:05:54 PDT
<rdar://problem/14017743>
Comment 2 Darin Adler 2013-05-30 12:00:49 PDT
On some platforms at least, Mac for example, maybe we could do something based on registering for file system notifications for everything in the cache.

I think the custom protocol issue is also interesting, albeit separate.
Comment 3 Andy Estes 2013-06-04 14:58:41 PDT
(In reply to comment #2)
> On some platforms at least, Mac for example, maybe we could do something based on registering for file system notifications for everything in the cache.

Oh, I like this idea. It looks like this can be done rather easily on platforms that support libdispatch by creating a dispatch source of type DISPATCH_SOURCE_TYPE_VNODE. There are also FSEvents on the Mac, but it can only monitor a directory, whereas DISPATCH_SOURCE_TYPE_VNODE lets you monitor specific files.

> 
> I think the custom protocol issue is also interesting, albeit separate.

I agree, and I have some thoughts there too. I'll file a separate bug.
Comment 4 Darin Adler 2013-06-04 17:55:09 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > On some platforms at least, Mac for example, maybe we could do something based on registering for file system notifications for everything in the cache.
> 
> Oh, I like this idea. It looks like this can be done rather easily on platforms that support libdispatch by creating a dispatch source of type DISPATCH_SOURCE_TYPE_VNODE. There are also FSEvents on the Mac, but it can only monitor a directory, whereas DISPATCH_SOURCE_TYPE_VNODE lets you monitor specific files.

To correctly monitor the “contents of file at this path” you might need to monitor both the contents of the file and the contents of the directories in the path.
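To illustrate which objects would need a watcher under that reading, here is a small sketch that lists the file plus every ancestor directory. It assumes an absolute path and C++17 std::filesystem; pathsToWatch is a hypothetical helper, and the normalization is purely lexical, so symlinks are not resolved.

```cpp
#include <cassert>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Returns the file itself followed by each ancestor directory up to the root.
// A rename or replacement of any of these can change what the path refers to.
std::vector<fs::path> pathsToWatch(const fs::path& file)
{
    assert(file.is_absolute());
    std::vector<fs::path> targets;
    fs::path current = file.lexically_normal();  // lexical only, no symlink lookup
    targets.push_back(current);
    while (current.has_relative_path()) {
        current = current.parent_path();
        targets.push_back(current);
    }
    return targets;
}
```

On a POSIX system, pathsToWatch("/a/b/c.html") yields /a/b/c.html, /a/b, /a, and /. Each entry would then get its own file or directory monitor.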