Bug 113626

Summary: Resources from non-HTTP schemes should not be cached indefinitely
Product: WebKit Reporter: Andy Estes <aestes>
Component: New BugsAssignee: Andy Estes <aestes>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, fpizlo, japhet, rniwa, webkit.review.bot
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 113632    
Bug Blocks: 116259, 117007, 116906    
Attachments:
Description Flags
Patch
none
Patch rniwa: review+

Description Andy Estes 2013-03-29 16:23:21 PDT
Let cached resources from file: schemes expire immediately
Comment 1 Andy Estes 2013-03-29 16:29:57 PDT
Created attachment 195817 [details]
Patch
Comment 2 Andy Estes 2013-03-29 16:30:30 PDT
<rdar://problem/13313769>
Comment 3 Andy Estes 2013-03-29 17:02:51 PDT
Committed r147263: <http://trac.webkit.org/changeset/147263>
Comment 4 WebKit Review Bot 2013-03-29 18:58:31 PDT
Re-opened since this is blocked by bug 113632
Comment 5 Andy Estes 2013-03-29 20:19:23 PDT
(In reply to comment #4)
> Re-opened since this is blocked by bug 113632

This failure is expected given how the test is written. This needs to be an http test to verify caching behavior. I'll fix it.
Comment 6 Andy Estes 2013-05-15 18:23:21 PDT
Created attachment 201905 [details]
Patch
Comment 7 Ryosuke Niwa 2013-05-15 19:38:38 PDT
Comment on attachment 201905 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=201905&action=review

Why don’t you use the short format: https://webkit.org/b/116199

> Source/WebCore/ChangeLog:12
> +        Writing a test for this is blocked on <https://bugs.webkit.org/show_bug.cgi?id=116199>.

Why don’t you use the short format: https://webkit.org/b/116199 and we don’t need <> around the url.
Comment 8 Ryosuke Niwa 2013-05-15 19:41:08 PDT
Comment on attachment 201905 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=201905&action=review

> Source/WebCore/platform/SchemeRegistry.cpp:130
> +#if PLATFORM(MAC)
> +        indefinitelyCacheableSchemes.add("applewebdata");
> +#endif
> +        indefinitelyCacheableSchemes.add("blob");
> +        indefinitelyCacheableSchemes.add("data");

Since there are only 3 items, it’s probably faster to string compare instead as in scheme == “applewebdata” || scheme == “blob” || scheme == “data”.
Comment 9 Andy Estes 2013-05-15 19:53:30 PDT
Committed r150169: <http://trac.webkit.org/changeset/150169>
Comment 10 Alexey Proskuryakov 2013-05-15 21:40:05 PDT
Comment on attachment 201905 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=201905&action=review

> Source/WebCore/platform/SchemeRegistry.cpp:129
> +        indefinitelyCacheableSchemes.add("blob");

This is not correct. When a blob is backed by a file, and the file changes, loading form the blob should return an error.
Comment 11 Filip Pizlo 2013-05-25 18:39:10 PDT
This is a *MASSIVE* regression in script caching performance.  SunSpider, V8Spider, and JSBench all get a lot slower.  I've narrowed it down to this revision.
Comment 12 Ryosuke Niwa 2013-05-25 18:44:28 PDT
(In reply to comment #11)
> This is a *MASSIVE* regression in script caching performance.  SunSpider, V8Spider, and JSBench all get a lot slower.  I've narrowed it down to this revision.

Oops!
Comment 13 Filip Pizlo 2013-05-25 18:45:37 PDT
(In reply to comment #12)
> (In reply to comment #11)
> > This is a *MASSIVE* regression in script caching performance.  SunSpider, V8Spider, and JSBench all get a lot slower.  I've narrowed it down to this revision.
> 
> Oops!

Wait, I'm still confirming.  It looks like it regresses performance with how we measure in DumpRenderTree, but it's not clearly a regression in browser.

I'm investigating...
Comment 14 Filip Pizlo 2013-05-25 18:51:56 PDT
(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > > This is a *MASSIVE* regression in script caching performance.  SunSpider, V8Spider, and JSBench all get a lot slower.  I've narrowed it down to this revision.
> > 
> > Oops!
> 
> Wait, I'm still confirming.  It looks like it regresses performance with how we measure in DumpRenderTree, but it's not clearly a regression in browser.
> 
> I'm investigating...

Yeah, this is DRT-only.
Comment 15 Filip Pizlo 2013-05-25 18:52:55 PDT
Comment on attachment 201905 [details]
Patch

View in context: http://bugs.webkit.org/attachment.cgi?id=201905&action=review

> Source/WebCore/loader/cache/CachedResource.cpp:423
> +    // Don't cache other non-HTTP resources since we can't check for freshness.
> +    if (!m_response.url().protocolIsInHTTPFamily())
> +        return 0;

What effect does this have on file://?
Comment 16 Andy Estes 2013-05-25 21:53:07 PDT
(In reply to comment #15)
> (From update of attachment 201905 [details])
> View in context: http://bugs.webkit.org/attachment.cgi?id=201905&action=review
> 
> > Source/WebCore/loader/cache/CachedResource.cpp:423
> > +    // Don't cache other non-HTTP resources since we can't check for freshness.
> > +    if (!m_response.url().protocolIsInHTTPFamily())
> > +        return 0;
> 
> What effect does this have on file://?

The effect is that file: resources in the memory cache expire immediately, so subsequent loads will hit the filesystem.

This change was made to mitigate the compatibility fallout from caching main resources. Some WebKit clients that we can't break update their UI by overwriting a file on disk and reloading their WebView, so we can't assume that file: main resources can be cached indefinitely. And since many custom schemes are backed by files, we can't assume if for those either.

However, I think this change has had a significant impact on our testing infrastructure (there's also been some additional test flakiness) and so we might want to consider a different approach.

One solution would be to implement a freshness check for files, but that would involve calling stat() on each memory cache load, which isn't cheap and might even hang the main thread on a network filesystem.

The other solution might be to limit this change to the specific apps that we know need it for correctness.
Comment 17 Ryosuke Niwa 2013-05-25 21:59:00 PDT
(In reply to comment #16)
> One solution would be to implement a freshness check for files, but that would involve calling stat() on each memory cache load, which isn’t cheap and might even hang the main thread on a network filesystem.

Can’t we do this only for local (file protocol) resources?

> The other solution might be to limit this change to the specific apps that we know need it for correctness.

Maybe we can add a setting to turn this on/off?
Comment 18 Filip Pizlo 2013-05-25 22:16:43 PDT
(In reply to comment #16)
> (In reply to comment #15)
> > (From update of attachment 201905 [details] [details])
> > View in context: http://bugs.webkit.org/attachment.cgi?id=201905&action=review
> > 
> > > Source/WebCore/loader/cache/CachedResource.cpp:423
> > > +    // Don't cache other non-HTTP resources since we can't check for freshness.
> > > +    if (!m_response.url().protocolIsInHTTPFamily())
> > > +        return 0;
> > 
> > What effect does this have on file://?
> 
> The effect is that file: resources in the memory cache expire immediately, so subsequent loads will hit the filesystem.
> 
> This change was made to mitigate the compatibility fallout from caching main resources. Some WebKit clients that we can't break update their UI by overwriting a file on disk and reloading their WebView, so we can't assume that file: main resources can be cached indefinitely. And since many custom schemes are backed by files, we can't assume if for those either.
> 
> However, I think this change has had a significant impact on our testing infrastructure (there's also been some additional test flakiness) and so we might want to consider a different approach.

What about giving the test infrastructure a testFile:// protocol that works like file:// did before?

> 
> One solution would be to implement a freshness check for files, but that would involve calling stat() on each memory cache load, which isn't cheap and might even hang the main thread on a network filesystem.
> 
> The other solution might be to limit this change to the specific apps that we know need it for correctness.
Comment 19 Andy Estes 2013-05-25 23:30:08 PDT
(In reply to comment #17)
> (In reply to comment #16)
> > One solution would be to implement a freshness check for files, but that would involve calling stat() on each memory cache load, which isn’t cheap and might even hang the main thread on a network filesystem.
> 
> Can’t we do this only for local (file protocol) resources?

Well, file: URLs can still reference network resources (e.g. file:///Volumes/SomeAFPMount/...).

> 
> > The other solution might be to limit this change to the specific apps that we know need it for correctness.
> 
> Maybe we can add a setting to turn this on/off?

Yeah, this is a good idea. That lets us turn the new behavior on/off per-client, and it sounds like DRT/WKTR/TestWebKitAPI will want to turn it off.
Comment 20 Andy Estes 2013-05-25 23:35:38 PDT
(In reply to comment #18)
> (In reply to comment #16)
> > (In reply to comment #15)
> > > (From update of attachment 201905 [details] [details] [details])
> > > View in context: http://bugs.webkit.org/attachment.cgi?id=201905&action=review
> > > 
> > > > Source/WebCore/loader/cache/CachedResource.cpp:423
> > > > +    // Don't cache other non-HTTP resources since we can't check for freshness.
> > > > +    if (!m_response.url().protocolIsInHTTPFamily())
> > > > +        return 0;
> > > 
> > > What effect does this have on file://?
> > 
> > The effect is that file: resources in the memory cache expire immediately, so subsequent loads will hit the filesystem.
> > 
> > This change was made to mitigate the compatibility fallout from caching main resources. Some WebKit clients that we can't break update their UI by overwriting a file on disk and reloading their WebView, so we can't assume that file: main resources can be cached indefinitely. And since many custom schemes are backed by files, we can't assume if for those either.
> > 
> > However, I think this change has had a significant impact on our testing infrastructure (there's also been some additional test flakiness) and so we might want to consider a different approach.
> 
> What about giving the test infrastructure a testFile:// protocol that works like file:// did before?

I actually really like this idea, because it makes better sense to think about all this in terms of schemes. However, I don't think we have the concept of aliasing file: to another scheme in WebCore, so it'd require a port-by-port effort to implement the testFile: scheme handler.
Comment 21 Ryosuke Niwa 2013-05-25 23:49:19 PDT
(In reply to comment #20)
> (In reply to comment #18)
> > (In reply to comment #16)
> >
> > What about giving the test infrastructure a testFile:// protocol that works like file:// did before?
> 
> I actually really like this idea, because it makes better sense to think about all this in terms of schemes. However, I don't think we have the concept of aliasing file: to another scheme in WebCore, so it'd require a port-by-port effort to implement the testFile: scheme handler.

For that reason, I prefer adding settings because we can easily override settings via internals.settings.*.
Comment 22 Andy Estes 2013-05-25 23:59:10 PDT
(In reply to comment #21)
> (In reply to comment #20)
> > (In reply to comment #18)
> > > (In reply to comment #16)
> > >
> > > What about giving the test infrastructure a testFile:// protocol that works like file:// did before?
> > 
> > I actually really like this idea, because it makes better sense to think about all this in terms of schemes. However, I don't think we have the concept of aliasing file: to another scheme in WebCore, so it'd require a port-by-port effort to implement the testFile: scheme handler.
> 
> For that reason, I prefer adding settings because we can easily override settings via internals.settings.*.

I agree. I can do this now.
Comment 23 Alexey Proskuryakov 2013-05-26 00:11:56 PDT
> Some WebKit clients that we can't break update their UI by overwriting a file on disk and reloading their WebView

Are they actually doing a reload? Because supporting that does not necessarily mean never caching any non-http resources. We can differentiate between normal loads and reloads.
Comment 24 Andy Estes 2013-05-29 13:50:36 PDT
(In reply to comment #23)
> > Some WebKit clients that we can't break update their UI by overwriting a file on disk and reloading their WebView
> 
> Are they actually doing a reload? Because supporting that does not necessarily mean never caching any non-http resources. We can differentiate between normal loads and reloads.

No, not an actual reload in the API sense. Just loading the same main resource again.
Comment 25 Andy Estes 2013-05-29 13:52:06 PDT
(In reply to comment #11)
> This is a *MASSIVE* regression in script caching performance.  SunSpider, V8Spider, and JSBench all get a lot slower.  I've narrowed it down to this revision.

<http://trac.webkit.org/changeset/150863> should have fixed the regression.
Comment 26 Ryosuke Niwa 2013-05-30 12:37:11 PDT
http://trac.webkit.org/changeset/150863 fixed this problem.