Bug 49008 - Unbounded memory growth for long lived pages with many requests
Summary: Unbounded memory growth for long lived pages with many requests
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-11-04 10:44 PDT by Sam Magnuson
Modified: 2011-04-26 15:58 PDT (History)
16 users (show)

See Also:


Attachments
Demonstration of unbounded growth (1.66 KB, text/html)
2010-11-04 10:44 PDT, Sam Magnuson
no flags Details
Patch (5.56 KB, patch)
2010-11-04 10:51 PDT, Sam Magnuson
no flags Details | Formatted Diff | Diff
Patch (6.67 KB, patch)
2010-11-04 10:55 PDT, Sam Magnuson
no flags Details | Formatted Diff | Diff
Patch (1.20 KB, patch)
2010-11-04 10:58 PDT, Sam Magnuson
no flags Details | Formatted Diff | Diff
Patch (1.50 KB, patch)
2010-11-04 11:01 PDT, Sam Magnuson
no flags Details | Formatted Diff | Diff
merge (12.41 KB, patch)
2010-11-05 12:44 PDT, Sam Magnuson
no flags Details | Formatted Diff | Diff
Patch (13.13 KB, patch)
2010-11-10 10:58 PST, Sam Magnuson
abarth: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sam Magnuson 2010-11-04 10:44:01 PDT
Created attachment 72960 [details]
Demonstration of unbounded growth

I have a long lived web app that changes images on screen to unique URL's over a period of time. I have found that the memory grows in an unbounded way (probably not considered a leak due to the fact that the pointers/references to the data is not lost only that its growth is unbounded).

I discussed the issue with Adam Barth with respect to the three places I found such behaviour and it was agreed that I would submit a patch for approval that would bound these references to references that remain in the in-memory cache. In this way you can limit the size of your cache and the reference to the URL's or resources will be limited as well. 

I have done this with a new class called SubCache (which is a subset of the current global cache). Anything referenced can be inserted into the SubCache (which is referenced only by string) and will be evicted from the SubCache once the Cache no longer references it. I have identified three spots where this kind of unbounded growth can use the SubCache and will submit the three fixes in addition to the new SubCache as separate patches.

I've attached a small test case that demonstrates the unbounded growth as well, which can be seen by watching the process.
Comment 1 Sam Magnuson 2010-11-04 10:51:19 PDT
Created attachment 72962 [details]
Patch
Comment 2 Sam Magnuson 2010-11-04 10:55:56 PDT
Created attachment 72964 [details]
Patch
Comment 3 Sam Magnuson 2010-11-04 10:58:54 PDT
Created attachment 72965 [details]
Patch
Comment 4 Adam Barth 2010-11-04 10:58:55 PDT
Comment on attachment 72962 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=72962&action=review

> WebCore/loader/Cache.cpp:802
> +SubCache::SubCache() :  m_cache(cache())

The ": m_cache(cache())" part should be on its own line and indented four spaces.

> WebCore/loader/Cache.cpp:813
> +CachedResource*
> +SubCache::resourceForURL(const String &url) const

The return type should be on the same line as the function declaration.  Also, I don't see where this function is called.

> WebCore/loader/Cache.cpp:825
> +void
> +SubCache::add(const String &url)
> +{
> +    if (m_cache->resourceForURL(url)) // It must be in the cache for it to go into the subcache.
> +        m_resources.add(url);
> +}

Where is this function called?

> WebCore/loader/Cache.h:241
> +class SubCache {

This should be in its own file.  Also, we should think of a more descriptive name.

> WebCore/loader/Cache.h:243
> +public:
> +

No blank line here.
Comment 5 Adam Barth 2010-11-04 10:59:11 PDT
Oh, i see there are more patches.
Comment 6 Adam Barth 2010-11-04 11:01:28 PDT
I appreciate your writing small patches, but I think this would be easier to review as one patch.  It's easier to understand what's going on if you can see the implementation and the caller at the same time.
Comment 7 Sam Magnuson 2010-11-04 11:01:32 PDT
Created attachment 72967 [details]
Patch
Comment 8 Early Warning System Bot 2010-11-04 11:08:44 PDT
Attachment 72964 [details] did not build on qt:
Build output: http://queues.webkit.org/results/5108011
Comment 9 Early Warning System Bot 2010-11-04 11:20:59 PDT
Attachment 72967 [details] did not build on qt:
Build output: http://queues.webkit.org/results/5135019
Comment 10 Alexey Proskuryakov 2010-11-04 11:40:02 PDT
Is this a duplicate of bug 31253?

I guess the main question here is how this affects Activity Window and Web Inspector functionality.
Comment 11 Antti Koivisto 2010-11-04 16:32:19 PDT
Please combine the patches for a single patch for readability.

There shouldn't be any unbounded growth. The memory cache has a maximum size which will eventually be reached. At that point unreferenced items will be kicked out. You will fill the cache in any sufficiently long browsing session too so this is nothing particularly alarming. It would be useful to try to recognize resources that are unlikely to be referenced ever again and kick them out early. Doing that well is difficult.

I don't understand how these patches solve the problem, neither from description nor from looking at the code.
Comment 12 Build Bot 2010-11-04 22:19:24 PDT
Attachment 72967 [details] did not build on win:
Build output: http://queues.webkit.org/results/5285009
Comment 13 WebKit Review Bot 2010-11-04 22:30:07 PDT
Attachment 72964 [details] did not build on win:
Build output: http://queues.webkit.org/results/5345006
Comment 14 WebKit Review Bot 2010-11-04 23:12:07 PDT
Attachment 72964 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/5294011
Comment 15 WebKit Review Bot 2010-11-04 23:20:31 PDT
Attachment 72967 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/5333006
Comment 16 WebKit Review Bot 2010-11-05 06:57:22 PDT
Attachment 72964 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/5319011
Comment 17 WebKit Review Bot 2010-11-05 07:08:39 PDT
Attachment 72967 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/5248016
Comment 18 Sam Magnuson 2010-11-05 12:44:38 PDT
Created attachment 73102 [details]
merge
Comment 19 Sam Magnuson 2010-11-05 12:48:42 PDT
(In reply to comment #11)
> Please combine the patches for a single patch for readability.
> 

Ok. I thought it might be nice to have the subcache mods in separate patches to discuss the merit of using the data structure in each place but I'll merge together into a single patch for this review.

> There shouldn't be any unbounded growth. The memory cache has a maximum size which will eventually be reached. At that point unreferenced items will be kicked out. You will fill the cache in any sufficiently long browsing session too so this is nothing particularly alarming. It would be useful to try to recognize resources that are unlikely to be referenced ever again and kick them out early. Doing that well is difficult.
> 

This doesn't really augment the in-memory cache. This is to add a datastructure (called SubCache at the moment) which is a subset of the in-memory cache. It only references them by keyname (url) and when they get evicted from the global in-memory cache they fall out of the SubCache. WebKit has the pattern in a number of places to remember a set of url's (a HashSet) that it has seen before and the growth of that map is unbounded. In my tests I have a number of unique url's that each get book-kept and thus the memory continues to grow. Using the SubCache caps those to only things in the Cache which at least leaves a single place to tune memory requirements.

> I don't understand how these patches solve the problem, neither from description nor from looking at the code.

Hopefully this is more clear.
Comment 20 Sam Magnuson 2010-11-05 16:04:07 PDT
(In reply to comment #10)
> Is this a duplicate of bug 31253?
> 

It might be, however I believe 31253's patch is only attempting to address the m_documentResources in CachedResourceLoader.cpp (nee DocLoader.cpp). The patch I've attached here address that as well as the two "known" URL HashSet (one in CacheResourceLoader.cpp one in DocumentLoader.cpp).

> I guess the main question here is how this affects Activity Window and Web Inspector functionality.

I tested with the Inspector and a very very small cache and it seems to work as expected. It looks like the Inspector is not using the table that used to live in DocumentLoader.cpp anymore (in fact InspectorResource.cpp is been removed in lieu of InspectorResourceAgent.cpp in trunk).
Comment 21 Adam Barth 2010-11-05 22:09:42 PDT
Who is using the set in DocumentLoader?  Maybe it's orphaned?
Comment 22 Alexey Proskuryakov 2010-11-05 23:47:31 PDT
Activity window, Web Archives and Mac API, I think.
Comment 23 Sam Magnuson 2010-11-10 10:58:29 PST
Created attachment 73509 [details]
Patch
Comment 24 Sam Magnuson 2010-11-10 10:59:25 PST
I redid the patch due to the Cache->MemoryCache rename.
Comment 25 Antonio Gomes 2011-01-17 19:07:03 PST
Progresses here?
Comment 26 Pavel Feldman 2011-01-17 23:09:20 PST
Let me dump Web Inspector assumptions here:

- When inspector is opened, it traverses Frame tree and collects frame resources using CachedResourceLoaders (cachedResourceLoader()->allCachedResources() calls).
- Once opened, it updates this tree of resources to keep it up-to-date based on the network events.
- At any time user wants to see the resource content, we either use frameLoader->documentLoader()->mainResourceData() (for main resource) or use its CachedResourceLoader again.

It is important for us to show the entire tree of resources to the user. Aggressive clearing of the cachedresourceloader will probably regress our functionality. Worst case we should mark CachedResources as "emptied" in order to re-create the resource tree structure properly. Unfortunately there are no tests that would regress, but if we make the wipe threshold configurable, it would not be hard to come up with one.
Comment 27 Adam Barth 2011-04-26 15:57:58 PDT
Comment on attachment 73509 [details]
Patch

This does not appear to be a big problem in practice.  It's unclear whether the added complexity from this patch is justified by the benefits.