Bug 49008

Summary: Unbounded memory growth for long lived pages with many requests
Product: WebKit Reporter: Sam Magnuson <smagnuso>
Component: WebCore Misc.Assignee: Nobody <webkit-unassigned>
Status: RESOLVED WONTFIX    
Severity: Normal CC: abarth, ap, bnason, buildbot, dglazkov, eric, gustavo, japhet, koivisto, loislo, pfeldman, thomas, tonikitoo, webkit-ews, webkit.review.bot, xan.lopez
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Attachments:
Description Flags
Demonstration of unbounded growth
none
Patch
none
Patch
none
Patch
none
Patch
none
merge
none
Patch abarth: review-

Sam Magnuson
Reported 2010-11-04 10:44:01 PDT
Created attachment 72960 [details] Demonstration of unbounded growth I have a long lived web app that changes images on screen to unique URL's over a period of time. I have found that the memory grows in an unbounded way (probably not considered a leak due to the fact that the pointers/references to the data is not lost only that its growth is unbounded). I discussed the issue with Adam Barth with respect to the three places I found such behaviour and it was agreed that I would submit a patch for approval that would bound these references to references that remain in the in-memory cache. In this way you can limit the size of your cache and the reference to the URL's or resources will be limited as well. I have done this with a new class called SubCache (which is a subset of the current global cache). Anything referenced can be inserted into the SubCache (which is referenced only by string) and will be evicted from the SubCache once the Cache no longer references it. I have identified three spots where this kind of unbounded growth can use the SubCache and will submit the three fixes in addition to the new SubCache as separate patches. I've attached a small test case that demonstrates the unbounded growth as well, which can be seen by watching the process.
Attachments
Demonstration of unbounded growth (1.66 KB, text/html)
2010-11-04 10:44 PDT, Sam Magnuson
no flags
Patch (5.56 KB, patch)
2010-11-04 10:51 PDT, Sam Magnuson
no flags
Patch (6.67 KB, patch)
2010-11-04 10:55 PDT, Sam Magnuson
no flags
Patch (1.20 KB, patch)
2010-11-04 10:58 PDT, Sam Magnuson
no flags
Patch (1.50 KB, patch)
2010-11-04 11:01 PDT, Sam Magnuson
no flags
merge (12.41 KB, patch)
2010-11-05 12:44 PDT, Sam Magnuson
no flags
Patch (13.13 KB, patch)
2010-11-10 10:58 PST, Sam Magnuson
abarth: review-
Sam Magnuson
Comment 1 2010-11-04 10:51:19 PDT
Sam Magnuson
Comment 2 2010-11-04 10:55:56 PDT
Sam Magnuson
Comment 3 2010-11-04 10:58:54 PDT
Adam Barth
Comment 4 2010-11-04 10:58:55 PDT
Comment on attachment 72962 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=72962&action=review > WebCore/loader/Cache.cpp:802 > +SubCache::SubCache() : m_cache(cache()) The ": m_cache(cache())" part should be on its own line and indented four spaces. > WebCore/loader/Cache.cpp:813 > +CachedResource* > +SubCache::resourceForURL(const String &url) const The return type should be on the same line as the function declaration. Also, I don't see where this function is called. > WebCore/loader/Cache.cpp:825 > +void > +SubCache::add(const String &url) > +{ > + if (m_cache->resourceForURL(url)) // It must be in the cache for it to go into the subcache. > + m_resources.add(url); > +} Where is this function called? > WebCore/loader/Cache.h:241 > +class SubCache { This should be in its own file. Also, we should think of a more descriptive name. > WebCore/loader/Cache.h:243 > +public: > + No blank line here.
Adam Barth
Comment 5 2010-11-04 10:59:11 PDT
Oh, i see there are more patches.
Adam Barth
Comment 6 2010-11-04 11:01:28 PDT
I appreciate your writing small patches, but I think this would be easier to review as one patch. It's easier to understand what's going on if you can see the implementation and the caller at the same time.
Sam Magnuson
Comment 7 2010-11-04 11:01:32 PDT
Early Warning System Bot
Comment 8 2010-11-04 11:08:44 PDT
Early Warning System Bot
Comment 9 2010-11-04 11:20:59 PDT
Alexey Proskuryakov
Comment 10 2010-11-04 11:40:02 PDT
Is this a duplicate of bug 31253? I guess the main question here is how this affects Activity Window and Web Inspector functionality.
Antti Koivisto
Comment 11 2010-11-04 16:32:19 PDT
Please combine the patches for a single patch for readability. There shouldn't be any unbounded growth. The memory cache has a maximum size which will eventually be reached. At that point unreferenced items will be kicked out. You will fill the cache in any sufficiently long browsing session too so this is nothing particularly alarming. It would be useful to try to recognize resources that are unlikely to be referenced ever again and kick them out early. Doing that well is difficult. I don't understand how these patches solve the problem, neither from description nor from looking at the code.
Build Bot
Comment 12 2010-11-04 22:19:24 PDT
WebKit Review Bot
Comment 13 2010-11-04 22:30:07 PDT
WebKit Review Bot
Comment 14 2010-11-04 23:12:07 PDT
WebKit Review Bot
Comment 15 2010-11-04 23:20:31 PDT
WebKit Review Bot
Comment 16 2010-11-05 06:57:22 PDT
WebKit Review Bot
Comment 17 2010-11-05 07:08:39 PDT
Sam Magnuson
Comment 18 2010-11-05 12:44:38 PDT
Sam Magnuson
Comment 19 2010-11-05 12:48:42 PDT
(In reply to comment #11) > Please combine the patches for a single patch for readability. > Ok. I thought it might be nice to have the subcache mods in separate patches to discuss the merit of using the data structure in each place but I'll merge together into a single patch for this review. > There shouldn't be any unbounded growth. The memory cache has a maximum size which will eventually be reached. At that point unreferenced items will be kicked out. You will fill the cache in any sufficiently long browsing session too so this is nothing particularly alarming. It would be useful to try to recognize resources that are unlikely to be referenced ever again and kick them out early. Doing that well is difficult. > This doesn't really augment the in-memory cache. This is to add a datastructure (called SubCache at the moment) which is a subset of the in-memory cache. It only references them by keyname (url) and when they get evicted from the global in-memory cache they fall out of the SubCache. WebKit has the pattern in a number of places to remember a set of url's (a HashSet) that it has seen before and the growth of that map is unbounded. In my tests I have a number of unique url's that each get book-kept and thus the memory continues to grow. Using the SubCache caps those to only things in the Cache which at least leaves a single place to tune memory requirements. > I don't understand how these patches solve the problem, neither from description nor from looking at the code. Hopefully this is more clear.
Sam Magnuson
Comment 20 2010-11-05 16:04:07 PDT
(In reply to comment #10) > Is this a duplicate of bug 31253? > It might be, however I believe 31253's patch is only attempting to address the m_documentResources in CachedResourceLoader.cpp (nee DocLoader.cpp). The patch I've attached here address that as well as the two "known" URL HashSet (one in CacheResourceLoader.cpp one in DocumentLoader.cpp). > I guess the main question here is how this affects Activity Window and Web Inspector functionality. I tested with the Inspector and a very very small cache and it seems to work as expected. It looks like the Inspector is not using the table that used to live in DocumentLoader.cpp anymore (in fact InspectorResource.cpp is been removed in lieu of InspectorResourceAgent.cpp in trunk).
Adam Barth
Comment 21 2010-11-05 22:09:42 PDT
Who is using the set in DocumentLoader? Maybe it's orphaned?
Alexey Proskuryakov
Comment 22 2010-11-05 23:47:31 PDT
Activity window, Web Archives and Mac API, I think.
Sam Magnuson
Comment 23 2010-11-10 10:58:29 PST
Sam Magnuson
Comment 24 2010-11-10 10:59:25 PST
I redid the patch due to the Cache->MemoryCache rename.
Antonio Gomes
Comment 25 2011-01-17 19:07:03 PST
Progresses here?
Pavel Feldman
Comment 26 2011-01-17 23:09:20 PST
Let me dump Web Inspector assumptions here: - When inspector is opened, it traverses Frame tree and collects frame resources using CachedResourceLoaders (cachedResourceLoader()->allCachedResources() calls). - Once opened, it updates this tree of resources to keep it up-to-date based on the network events. - At any time user wants to see the resource content, we either use frameLoader->documentLoader()->mainResourceData() (for main resource) or use its CachedResourceLoader again. It is important for us to show the entire tree of resources to the user. Aggressive clearing of the cachedresourceloader will probably regress our functionality. Worst case we should mark CachedResources as "emptied" in order to re-create the resource tree structure properly. Unfortunately there are no tests that would regress, but if we make the wipe threshold configurable, it would not be hard to come up with one.
Adam Barth
Comment 27 2011-04-26 15:57:58 PDT
Comment on attachment 73509 [details] Patch This does not appear to be a big problem in practice. It's unclear whether the added complexity from this patch is justified by the benefits.
Note You need to log in before you can comment on or make changes to this bug.