Bug 61225

Summary: Cache XHRs (and other ThreadableLoaderClients)
Product: WebKit Reporter: Nate Chapin <japhet>
Component: WebCore Misc.Assignee: Nate Chapin <japhet>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, ap, dglazkov, dpranke, gustavo.noronha, gustavo, japhet, kbr, koivisto, webkit.review.bot, xan.lopez
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 61318, 65330, 66018, 69790, 70388    
Bug Blocks:    
Attachments:
Description Flags
work in progress
none
work in progress, Aug 10
none
First reviewable attempt
webkit-ews: commit-queue-
patch - merged to trunk
webkit-ews: commit-queue-
Fix compile issues
webkit.review.bot: commit-queue-
Address comments + add a CachedRawResourceClient
gustavo.noronha: commit-queue-
Another merge to trunk
none
Patch for landing
none
Patch for landing
none
Address issues from revert none

Description Nate Chapin 2011-05-20 16:43:57 PDT
Created attachment 94288 [details]
work in progress

I've been experimenting with making DocumentThreadableLoader a CachedResourceClient instead of a SubresourceLoaderClient.  I have a mostly-working patch (still failing a couple layout tests), but it at least outlines what I'm thinking.
Comment 1 Adam Barth 2011-05-20 16:51:16 PDT
Comment on attachment 94288 [details]
work in progress

Wow.  This patch does so many good things.  It's even got some useful infrastructure for Bug 61223.
Comment 2 Nate Chapin 2011-08-10 14:31:11 PDT
Created attachment 103534 [details]
work in progress, Aug 10

I started looking at this again, here's what I have so far.

Note that this is refactor only, and any resource loaded through a DocumentThreadableLoader will always be reloaded.

Barring objections, I'm going to start breaking pieces off of this WIP and get them properly reviewed in digestible chunks.
Comment 3 Nate Chapin 2011-10-04 13:45:12 PDT
Created attachment 109674 [details]
First reviewable attempt

This handles the plumbing of making DocumentThreadableLoader a CachedResourceClient, but still always reloads from the network and ignores the MemoryCache.

I don't see any obvious ways to split this into smaller patches, but I would love to do so if someone else sees something self-contained.
Comment 4 WebKit Review Bot 2011-10-04 13:47:19 PDT
Attachment 109674 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/http..." exit_code: 1

Source/WebCore/loader/DocumentThreadableLoader.h:46:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Total errors found: 1 in 26 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Early Warning System Bot 2011-10-04 14:00:14 PDT
Comment on attachment 109674 [details]
First reviewable attempt

Attachment 109674 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/9958016
Comment 6 Nate Chapin 2011-10-07 12:33:19 PDT
Created attachment 110194 [details]
patch - merged to trunk
Comment 7 WebKit Review Bot 2011-10-07 12:36:45 PDT
Attachment 110194 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/http..." exit_code: 1

Source/WebCore/loader/DocumentThreadableLoader.h:46:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Total errors found: 1 in 19 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Early Warning System Bot 2011-10-07 13:17:54 PDT
Comment on attachment 110194 [details]
patch - merged to trunk

Attachment 110194 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/10000302
Comment 9 Gyuyoung Kim 2011-10-07 13:41:13 PDT
Comment on attachment 110194 [details]
patch - merged to trunk

Attachment 110194 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/10005325
Comment 10 Nate Chapin 2011-10-07 13:42:02 PDT
Created attachment 110210 [details]
Fix compile issues
Comment 11 WebKit Review Bot 2011-10-07 13:45:34 PDT
Attachment 110210 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/http..." exit_code: 1

Source/WebCore/loader/DocumentThreadableLoader.h:46:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
Total errors found: 1 in 19 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 12 WebKit Review Bot 2011-10-07 16:57:15 PDT
Comment on attachment 110210 [details]
Fix compile issues

Attachment 110210 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/10015182

New failing tests:
http/tests/media/remove-while-loading.html
http/tests/media/video-cancel-load.html
http/tests/media/video-play-progress.html
http/tests/media/reload-after-dialog.html
http/tests/media/video-error-abort.html
http/tests/media/video-cookie.html
http/tests/security/xssAuditor/cookie-injection.html
http/tests/media/video-served-as-text.html
fast/forms/input-spinbutton-capturing.html
fast/forms/input-number-large-padding.html
fast/forms/input-number-events.html
fast/forms/implicit-submission.html
http/tests/security/contentSecurityPolicy/media-src-allowed.html
http/tests/media/video-referer.html
Comment 13 Antti Koivisto 2011-10-09 10:49:14 PDT
Looks good to me. I'm not at all familiar with DocumentThreadableLoader though.
Comment 14 Adam Barth 2011-10-09 11:05:40 PDT
Comment on attachment 110210 [details]
Fix compile issues

View in context: https://bugs.webkit.org/attachment.cgi?id=110210&action=review

> Source/WebKit/chromium/ChangeLog:4
> +        DocumentThreadableLoader no longer exposes setDefersLoading(),
> +        so don't try to call it in AssociatedURLLoader.

Does that cause a problem for folks who use AssociatedURLLoader ?

> Source/WebCore/platform/network/BlobResourceHandle.cpp:541
> +        RefPtr<BlobResourceHandle> protect(this);

Are all callers of this method aware that |this| can be deleted?

> Source/WebCore/loader/chromium/CachedResourceRequestChromium.cpp:44
> +void CachedResourceRequest::didDownloadData(SubresourceLoader* loader, int length)
> +{
> +    ASSERT_UNUSED(loader, loader == m_loader);
> +    m_resource->didDownloadData(length);
> +}
> +

This is for Blob support in XHR?

> Source/WebCore/loader/cache/CachedRawResource.h:45
> +#if PLATFORM(CHROMIUM)
> +    virtual void didDownloadData(int);
> +#endif

I suspect this should have some sort of USE-guard, but that's something we can solve in a future patch.

> Source/WebCore/loader/cache/CachedRawResource.h:53
> +

nit: this line seems unneeded.

> Source/WebCore/loader/cache/CachedResourceClient.h:62
> +        virtual void dataSent(CachedResource*, unsigned long long /* bytesSent */, unsigned long long /* totalBytesToBeSent */) { }
> +        virtual void responseReceived(CachedResource*, const ResourceResponse&) { }
> +        virtual void dataReceived(CachedResource*, const char* /* data */, int /* length */) { }
> +        virtual void redirectReceived(CachedResource*, ResourceRequest&, const ResourceResponse&) { }
> +#if PLATFORM(CHROMIUM)
> +        virtual void dataDownloaded(CachedResource*, int) { }
> +#endif

It seems slightly unfortunate to expose all this detailed information on CachedResourceClient.  It seems like lower-level stuff, somehow.  Maybe we should break it off into a separate interface?  I'm not entirely sure.

I guess it's not entirely clear to me how these methods work on a cache hit.

> Source/WebCore/loader/cache/CachedResourceRequest.cpp:251
> +    if (!protect->isLoading())

Usually we'd use m_resource here.

> Source/WebCore/loader/cache/CachedRawResource.cpp:50
> +        int previousDataLength = (m_options.shouldBufferData == BufferData) ? m_dataLength : 0;

lengths of buffers should be in size_t.  I'm not sure if this is a pervasive problem in the loader.

> LayoutTests/ChangeLog:6
> +        Test update for https://bugs.webkit.org/show_bug.cgi?id=61225.
> +        Due to changes in the implementation of DocumentThreadableLoader (and
> +        therefore XHR), this test will have empty (rather than null) content
> +        if we go over the length limit.

Is this just an internal data structure, or is this exposed to web sites?
Comment 15 Adam Barth 2011-10-09 11:07:01 PDT
This is really cool.  My only concern is that we're exposing a bunch of seemingly low-level callbacks on CachedResourceClient.  Most clients won't care about that (e.g., if they just want to fetch images).  I guess they can ignore them...
Comment 16 Nate Chapin 2011-10-10 08:55:46 PDT
(In reply to comment #15)
> This is really cool.  My only concern is that we're exposing a bunch of seemingly low-level callbacks on CachedResourceClient.  Most clients won't care about that (e.g., if they just want to fetch images).  I guess they can ignore them...

Yeah, that's the main thing I didn't like about how I ended up writing this patch.  I hadn't thought about your concerns with callbacks in the case of a cache hit, so I'll need to look at that more.

If I can't figure out a way to work around adding these callbacks, maybe there should be something like a subclass CachedResourceClientWithLoadCallbacks (but with a better name)?
Comment 17 Antti Koivisto 2011-10-10 09:34:57 PDT
We should probably eliminate the general purpose CachedResourceClient (and perhaps make it the common base) and have subtype specific client interfaces instead. There is no point in having imageChanged() method in a CachedScript client…
Comment 18 Nate Chapin 2011-10-10 11:50:18 PDT
(In reply to comment #17)
> We should probably eliminate the general purpose CachedResourceClient (and perhaps make it the common base) and have subtype specific client interfaces instead. There is no point in having imageChanged() method in a CachedScript client…

I like this idea.  I'm spending some time today experimenting with it.
Comment 19 Nate Chapin 2011-10-11 11:20:30 PDT
Some general replies to comments:

> > Source/WebCore/loader/chromium/CachedResourceRequestChromium.cpp:44
> > +void CachedResourceRequest::didDownloadData(SubresourceLoader* loader, int length)
> > +{
> > +    ASSERT_UNUSED(loader, loader == m_loader);
> > +    m_resource->didDownloadData(length);
> > +}
> > +
> 
> This is for Blob support in XHR?
> 
> > Source/WebCore/loader/cache/CachedRawResource.h:45
> > +#if PLATFORM(CHROMIUM)
> > +    virtual void didDownloadData(int);
> > +#endif
> 
> I suspect this should have some sort of USE-guard, but that's something we can solve in a future patch.

This is for AssociatedURLLoader only for the foreseeable future, hence the chromium-specific #if and the file names ending in Chromium.

> 
> > Source/WebCore/loader/cache/CachedResourceRequest.cpp:251
> > +    if (!protect->isLoading())
> 
> Usually we'd use m_resource here.

Unfortunately, protect needs to be used, because by the time this line is reached, this might be deleted.  I should probably comment this more. (This is a temporary and hideous hack until CachedResourceRequest and SubresourceLoader get merged, when we can just hold a local RefPtr).

> > LayoutTests/ChangeLog:6
> > +        Test update for https://bugs.webkit.org/show_bug.cgi?id=61225.
> > +        Due to changes in the implementation of DocumentThreadableLoader (and
> > +        therefore XHR), this test will have empty (rather than null) content
> > +        if we go over the length limit.
> 
> Is this just an internal data structure, or is this exposed to web sites?

It should be inspector-only.
Comment 20 Nate Chapin 2011-10-11 11:36:23 PDT
(In reply to comment #16)
> (In reply to comment #15)
> > This is really cool.  My only concern is that we're exposing a bunch of seemingly low-level callbacks on CachedResourceClient.  Most clients won't care about that (e.g., if they just want to fetch images).  I guess they can ignore them...
> 
> Yeah, that's the main thing I didn't like about how I ended up writing this patch.  I hadn't thought about your concerns with callbacks in the case of a cache hit, so I'll need to look at that more.

Re: the cache hits, here's what I'm thinking:

dataSent() : If this is relevant, cache should be forbidden.
dataDownloaded() : This is only used by AssociatedURLLoader, which will never reuse cache entries from what I'm told, so I think we can punt on this one.

responseReceived: Should be replayed for cache hit.
dataReceived: Should be replayed for cache hit (all at once, rather than incremental, of course).
notifyFinished: Should be replayed for cache hit (already done for all CachedResource types).

redirectReceived: This is the hard one.  Under the current behavior, the resource will be stored in MemoryCache with a key of its original url (i.e., the url before any redirects).  This means that DocumentThreadableLoader won't get a chance to do any meaningful redirect checks on the intermediate steps unless we replay all the redirects (not that DocumentThreadableLoader does proper redirect checking anyway).  I think this a reasonable case to wait on and figure out how to handle properly when we actually start reusing resources.
Comment 21 Nate Chapin 2011-10-11 12:09:07 PDT
Created attachment 110556 [details]
Address comments + add a CachedRawResourceClient
Comment 22 Nate Chapin 2011-10-11 12:16:02 PDT
> > Source/WebCore/platform/network/BlobResourceHandle.cpp:541
> > +        RefPtr<BlobResourceHandle> protect(this);
> 
> Are all callers of this method aware that |this| can be deleted?

Oops, forgot to comment on this one.....
I'm pretty sure it's safe as-is.  It appears all callers either immediately exit the now-deleted object or are sync-specific (and this case is only dangerous for async loads).
Comment 23 Alexey Proskuryakov 2011-10-11 12:21:03 PDT
How do XHRs share the cache with document resources having the same URL? Will loading an image with XHR kick it from cache, making <img> reload it again next time?

See bug 51286 for one example of expected interaction between XHR and document resource cache.
Comment 24 Nate Chapin 2011-10-11 12:35:44 PDT
(In reply to comment #23)
> How do XHRs share the cache with document resources having the same URL? Will loading an image with XHR kick it from cache, making <img> reload it again next time?
> 
> See bug 51286 for one example of expected interaction between XHR and document resource cache.

Yes, I think it will kick the img from the cache.  Currently if you try to (In reply to comment #23)
> How do XHRs share the cache with document resources having the same URL? Will loading an image with XHR kick it from cache, making <img> reload it again next time?
> 
> See bug 51286 for one example of expected interaction between XHR and document resource cache.

As written, this patch will evict the image.  So it will fix 51286, but it will be too aggressive in evicting.

This comes down to a known issue: We need to come up with a sane way to convert CachedResources between types.  I'm inclined to say that refactoring the MemoryCache to handle this case is a separate yak shave, but I'm willing to be convinced otherwise.
Comment 25 Nate Chapin 2011-10-11 12:37:17 PDT
> > How do XHRs share the cache with document resources having the same URL? Will loading an image with XHR kick it from cache, making <img> reload it again next time?
> > 
> > See bug 51286 for one example of expected interaction between XHR and document resource cache.
> 
> As written, this patch will evict the image.  So it will fix 51286, but it will be too aggressive in evicting.
> 
> This comes down to a known issue: We need to come up with a sane way to convert CachedResources between types.  I'm inclined to say that refactoring the MemoryCache to handle this case is a separate yak shave, but I'm willing to be convinced otherwise.

Sorry, I left some cruft in my previous comment.  The replied text in this post is what was intended.
Comment 26 Collabora GTK+ EWS bot 2011-10-11 21:53:16 PDT
Comment on attachment 110556 [details]
Address comments + add a CachedRawResourceClient

Attachment 110556 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/10026777
Comment 27 Daniel Bates 2011-10-16 13:11:40 PDT
Comment on attachment 110556 [details]
Address comments + add a CachedRawResourceClient

Attachment 110556 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/10073737
Comment 28 Daniel Bates 2011-10-16 13:47:41 PDT
Comment on attachment 110556 [details]
Address comments + add a CachedRawResourceClient

Attachment 110556 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/10079614
Comment 29 Nate Chapin 2011-10-17 09:58:11 PDT
Created attachment 111278 [details]
Another merge to trunk
Comment 30 WebKit Review Bot 2011-10-17 09:59:51 PDT
Attachment 111278 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/update-webkit', '--chromium']" exit_code: 2

Updating OpenSource
Current branch master is up to date.
Updating chromium port dependencies using gclient...
Error: Can't switch the checkout to http://v8.googlecode.com/svn/branches/3.6@9637; UUID don't match and there is local changes in /mnt/git/webkit-style-queue/Source/WebKit/chromium/v8. Delete the directory and try again.
Re-trying 'depot_tools/gclient sync'
Error: Can't switch the checkout to http://v8.googlecode.com/svn/branches/3.6@9637; UUID don't match and there is local changes in /mnt/git/webkit-style-queue/Source/WebKit/chromium/v8. Delete the directory and try again.
Re-trying 'depot_tools/gclient sync'
Error: Can't switch the checkout to http://v8.googlecode.com/svn/branches/3.6@9637; UUID don't match and there is local changes in /mnt/git/webkit-style-queue/Source/WebKit/chromium/v8. Delete the directory and try again.
Error: 'depot_tools/gclient sync' failed 3 tries and returned 256 at Tools/Scripts/update-webkit-chromium line 107.
Re-trying 'depot_tools/gclient sync'
No such file or directory at Tools/Scripts/update-webkit line 104.


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 31 Antti Koivisto 2011-10-17 10:07:54 PDT
Comment on attachment 111278 [details]
Another merge to trunk

r=me

The bot is whining something so cq-.
Comment 32 Nate Chapin 2011-10-17 16:42:43 PDT
Created attachment 111346 [details]
Patch for landing
Comment 33 WebKit Review Bot 2011-10-17 20:56:47 PDT
Comment on attachment 111346 [details]
Patch for landing

Rejecting attachment 111346 [details] from commit-queue.

Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 2

Last 500 characters of output:
ce::didDownloadData(int)':
Source/WebCore/loader/chromium/CachedRawResourceChromium.cpp:41: error: missing template arguments before 'w'
Source/WebCore/loader/chromium/CachedRawResourceChromium.cpp:41: error: expected ';' before 'w'
Source/WebCore/loader/chromium/CachedRawResourceChromium.cpp:42: error: 'w' was not declared in this scope
make: *** [out/Debug/obj.target/webcore_remaining/Source/WebCore/loader/chromium/CachedRawResourceChromium.o] Error 1
make: *** Waiting for unfinished jobs....

Full output: http://queues.webkit.org/results/10126088
Comment 34 Nate Chapin 2011-10-18 08:49:03 PDT
Created attachment 111448 [details]
Patch for landing
Comment 35 Nate Chapin 2011-10-18 08:49:45 PDT
(In reply to comment #34)
> Created an attachment (id=111448) [details]
> Patch for landing

The only change between this and the previous patch is that this updates CachedRawResourceChromium.cpp to use a templated CachedResourceClientWalker.
Comment 36 WebKit Review Bot 2011-10-18 09:30:41 PDT
Comment on attachment 111448 [details]
Patch for landing

Clearing flags on attachment: 111448

Committed r97765: <http://trac.webkit.org/changeset/97765>
Comment 37 WebKit Review Bot 2011-10-18 09:30:49 PDT
All reviewed patches have been landed.  Closing bug.
Comment 38 Dirk Pranke 2011-10-18 14:39:15 PDT
It looks like this patch may have broken webkit_unit_tests?

http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Linux/builds/15032
Comment 39 Nate Chapin 2011-10-24 14:46:38 PDT
Created attachment 112252 [details]
Address issues from revert

Diff from last version:

CachedRawResources default to ResourceLoadPriorityMedium.  This matches the existing behavior for DocumentThreadableLoaders.

CachedRawResource::data() protects the CachedRawResource.  The crash described in https://bugs.webkit.org/show_bug.cgi?id=70388 was caused by the CachedRawResource being deleted prematurely by the dataReceived() call.
Comment 40 Adam Barth 2011-10-25 11:42:30 PDT
Comment on attachment 112252 [details]
Address issues from revert

I'm mostly forwarding Antti's R+.  The change Nate has made to this patch since it landed last time sound correct.
Comment 41 WebKit Review Bot 2011-10-25 12:49:34 PDT
Comment on attachment 112252 [details]
Address issues from revert

Clearing flags on attachment: 112252

Committed r98380: <http://trac.webkit.org/changeset/98380>
Comment 42 WebKit Review Bot 2011-10-25 12:49:43 PDT
All reviewed patches have been landed.  Closing bug.