Description
Eric Seidel (no email)
2011-05-17 17:17:28 PDT
Created attachment 93850 [details]
test cases that loads images via new Image
This test case loads 50 512x512 images via new Image().src = "data:...". This causes a memory growth of ~400MB.
Since the image is 512x512x4 bytes/pixel, the data URL size is 512*512*4 * 4/3 (base64) * 2 (utf-16 encoding) = 2666kb/image. Given 50 loads this should be a memory use of 133MB, which means that we are leaking 3 copies of the URL per load. I imagine that we're doing deep string copies in a few places that perhaps we don't mean to, in addition to leaking it.
Looking at a memory profile more closely I think we're actually leaking 2 KURLs, each of which has one WTF::String which holds the data url as utf-16 data and holds on to one native representation of the string, which is stored in an encoding that uses one byte/character for this data (which is all ascii). In chromium that's a UTF-8 encoded CString, on the mac this is some sort of NSUrl or CF networking data structure. 2 UTF-16 copies + 2 utf-8/ascii copies = the 6x memory blowup we're seeing. I think fixing the CachedResource leak is likely to fix both of these. There is no "leak". The resources are freed when the document goes away. The bug is that if page does dynamic adding and removing of resources, the removed resources won't be freed until the document is. Re-titled appropriately. OK. I believe this is still a regression, and still seems like it would affect basically every site. :) The good news is that because our decoded data disposal is good, we don't really notice that we leak every dynamic resource. We only notice when the URLs for those resources are huge. Related bug: https://bugs.webkit.org/show_bug.cgi?id=61604 Created attachment 95710 [details]
Patch
Comment on attachment 95710 [details]
Patch
Whoops, no ChangeLog...
Created attachment 95711 [details]
Patch
Comment on attachment 95711 [details]
Patch
Makes sense to me. I'm not sure why the "owner" was ever just a pointer, since multiple loaders can be loading the same resource at the same time.
Comment on attachment 95711 [details]
Patch
Can we give a loader expert (Antti) a chance to comment before landing?
(or Brady) +cc Brady Sure, I'd appreciate more input from anyone who feels familiar with this code. And CC'ing Antti, he really needs to look at this. Also, I'd be somewhat surprised if it was *NOT* possible to write a test case for this. Note that making another test pass is great, but if there's no test to actually test the new behavior intended by this patch directly, it could still regress back later. (In reply to comment #15) > Also, I'd be somewhat surprised if it was *NOT* possible to write a test case for this. > > Note that making another test pass is great, but if there's no test to actually test the new behavior intended by this patch directly, it could still regress back later. Can you suggest what sort of test you'd like to see? I don't expect any direct change in behavior, just a smaller memory footprint. I don't know how to write a layout test for that. (In reply to comment #16) > (In reply to comment #15) > > Also, I'd be somewhat surprised if it was *NOT* possible to write a test case for this. > > > > Note that making another test pass is great, but if there's no test to actually test the new behavior intended by this patch directly, it could still regress back later. > > Can you suggest what sort of test you'd like to see? I don't expect any direct change in behavior, just a smaller memory footprint. I don't know how to write a layout test for that. Hmmm I suppose the only thing missing is a callback that a cached resource is actually destroyed. But using resource load callbacks it might be possible to tell if a resource is loaded from the network layer or the WebCore memory cache. If that's possible, I would start exploring with a test involving a subresource in a dynamic document, adding and/or removing dynamically to try and trigger it being loaded from the network layer again, instead of from the WebCore memory cache. Thinking about it more, I would no longer be surprised if a test wasn't possible. But I'd still *not* be surprised if a test *is* possible. (In reply to comment #17) > (In reply to comment #16) > > (In reply to comment #15) > > > Also, I'd be somewhat surprised if it was *NOT* possible to write a test case for this. > > > > > > Note that making another test pass is great, but if there's no test to actually test the new behavior intended by this patch directly, it could still regress back later. > > > > Can you suggest what sort of test you'd like to see? I don't expect any direct change in behavior, just a smaller memory footprint. I don't know how to write a layout test for that. > > Hmmm I suppose the only thing missing is a callback that a cached resource is actually destroyed. > > But using resource load callbacks it might be possible to tell if a resource is loaded from the network layer or the WebCore memory cache. If that's possible, I would start exploring with a test involving a subresource in a dynamic document, adding and/or removing dynamically to try and trigger it being loaded from the network layer again, instead of from the WebCore memory cache. > I'm pretty sure I'm not changing the semantics of the memory cache - no CachedResource should be deleted while it's in the cache, so this will only allow freeing CachedResources that are already evicted from the memory cache so loads will have to hit the network. > Thinking about it more, I would no longer be surprised if a test wasn't possible. > > But I'd still *not* be surprised if a test *is* possible. I'd be much happier with a test as well, of course, but I can't think of one :) (In reply to comment #18) > (In reply to comment #17) > > (In reply to comment #16) > > > (In reply to comment #15) > > > > Also, I'd be somewhat surprised if it was *NOT* possible to write a test case for this. > > > > > > > > Note that making another test pass is great, but if there's no test to actually test the new behavior intended by this patch directly, it could still regress back later. > > > > > > Can you suggest what sort of test you'd like to see? I don't expect any direct change in behavior, just a smaller memory footprint. I don't know how to write a layout test for that. > > > > Hmmm I suppose the only thing missing is a callback that a cached resource is actually destroyed. > > > > But using resource load callbacks it might be possible to tell if a resource is loaded from the network layer or the WebCore memory cache. If that's possible, I would start exploring with a test involving a subresource in a dynamic document, adding and/or removing dynamically to try and trigger it being loaded from the network layer again, instead of from the WebCore memory cache. > > > > I'm pretty sure I'm not changing the semantics of the memory cache - no CachedResource should be deleted while it's in the cache, so this will only allow freeing CachedResources that are already evicted from the memory cache so loads will have to hit the network. > If I understand the bug you reported that your patch purports to fix, then: Before your patch: - A document has a resource in it, and that document is the only one with a reference to that resource. - Removing that resource from the document will not immediately free the associated CachedResource. - Only later - when the document itself is destroyed - will this CachedResource actually be freed. - So before that Document is destroyed, if you then loaded a *new* document with this same resource, the resource would presumably be loaded from the memory Cache. After your patch: - A document has a resource in it, and that document is the only one with a reference to that resource. - Removing that resource from the document *will* immediately destroy the CachedResource, even if the Document hangs around forever. - If you then load a *new* document with this same resource, the resource would presumably be loaded from the network layer. Is this description correct? If so, it should be possible to test with resource load callbacks. There is a resource load callback in WebKit that specifies a resource loaded from the memory cache, as opposed to a normal network load (WebFrameLoaderClient::dispatchDidLoadResourceFromMemoryCache) DumpRenderTree might not listen to this right now, but it could easily. And your patch would cause a change in behavior for this scenario that could be tested. *HOWEVER* - In writing out the above description, I noticed something I don't like. If a resource is removed from a document, then immediately freed, then later added back in to that document, that'd be extremely wasteful - including for these huge data URLs we're concerned with. (In reply to comment #19) > (In reply to comment #18) > > (In reply to comment #17) > > > (In reply to comment #16) > > > > (In reply to comment #15) > > > > > Also, I'd be somewhat surprised if it was *NOT* possible to write a test case for this. > > > > > > > > > > Note that making another test pass is great, but if there's no test to actually test the new behavior intended by this patch directly, it could still regress back later. > > > > > > > > Can you suggest what sort of test you'd like to see? I don't expect any direct change in behavior, just a smaller memory footprint. I don't know how to write a layout test for that. > > > > > > Hmmm I suppose the only thing missing is a callback that a cached resource is actually destroyed. > > > > > > But using resource load callbacks it might be possible to tell if a resource is loaded from the network layer or the WebCore memory cache. If that's possible, I would start exploring with a test involving a subresource in a dynamic document, adding and/or removing dynamically to try and trigger it being loaded from the network layer again, instead of from the WebCore memory cache. > > > > > > > I'm pretty sure I'm not changing the semantics of the memory cache - no CachedResource should be deleted while it's in the cache, so this will only allow freeing CachedResources that are already evicted from the memory cache so loads will have to hit the network. > > > > > If I understand the bug you reported that your patch purports to fix, then: > > Before your patch: > - A document has a resource in it, and that document is the only one with a reference to that resource. Which implies that the resource is not in the memory cache, since while it is in the memory cache then the DocumentLoader cannot be the sole reference. > - Removing that resource from the document will not immediately free the associated CachedResource. > - Only later - when the document itself is destroyed - will this CachedResource actually be freed. Correct. > - So before that Document is destroyed, if you then loaded a *new* document with this same resource, the resource would presumably be loaded from the memory Cache. Not quite! Presence in the memory cache is not the same thing as existence of the CachedResource. > > After your patch: > - A document has a resource in it, and that document is the only one with a reference to that resource. > - Removing that resource from the document *will* immediately destroy the CachedResource, even if the Document hangs around forever. > - If you then load a *new* document with this same resource, the resource would presumably be loaded from the network layer. > > Is this description correct? > > If so, it should be possible to test with resource load callbacks. There is a resource load callback in WebKit that specifies a resource loaded from the memory cache, as opposed to a normal network load (WebFrameLoaderClient::dispatchDidLoadResourceFromMemoryCache) > > DumpRenderTree might not listen to this right now, but it could easily. And your patch would cause a change in behavior for this scenario that could be tested. > > *HOWEVER* - In writing out the above description, I noticed something I don't like. If a resource is removed from a document, then immediately freed, then later added back in to that document, that'd be extremely wasteful - including for these huge data URLs we're concerned with. That's not what the patch intends - if the CachedResource is held in the memory cache before this patch then it still should be held alive after this patch. This patch ensures that the CachedResource can be freed after the resource is evicted from the memory cache even if the DocumentLoader is still around. A CachedResource can stick around in the memory cache even if there aren't any DocumentLoaders referencing it. Okay, think I understand now. This is only about the URLs in CachedResource potentially being huge, not at all about the resources themselves. Antti still needs to take a look at this. Yup, agree on both points. Nate pointed out I have a spurious protocolIsData() check in here as well, I'll update the patch shortly. Created attachment 95800 [details]
merge up to ToT and remove bogus protocolIsData check from CachedResourceLoader::requestResource
Friendly ping The pointer help by CachedResourceHandle is updated automatically to point from the proxy resource to the validated one when a cache validation (304 response) succeeds (by CachedResource::switchClientsToRevalidatedResource()). By using plain pointers you will lose this mechanism and I don't see anything here that achieves the same effect. This probably means that m_documentResources will end up missing revalidated resources. To fix this the revalidated resources could be inserted to the m_documentResources manually or maybe CachedResourceHandle could be added a flag that makes it weak. I attempted to refine this patch to update m_documentResources for the 304 path. However, if the map is updated from CachedResource::switchClientsToRevalidatedResource, then ~CachedResource for the proxy fails because it expects to find a pointer to itself in the map. I tried deferring the update until after the proxy is destroyed, but it would need to be deferred until after the request is complete as the proxy !canDelete() until then (which is pretty far back up the stack). And, it would need to save off and pass back a Set of CachedResourceLoaders as well as the resource to replace with, which seems ugly. Naive next guess at a fix might be to flag the proxy resources as such so they don't try to remove themselves? Antti's suggestion of instead adding a weak flag to CachedResourceHandle may work better, but it also looks like it would need to touch a lot of places that might retrieve an invalidated weak pointer. The most interesting resources from a memory use perspective are large data: URLs, which never validate/304. Is it possible to come up with a simpler solution for resources that we know will never have to validate? Created attachment 100818 [details]
Patch
Add additional handling for 304 by tracking when a CachedResource is used as a proxy
Reviews/comments much wanted. This patch at ToT now reduces memory usage in the test case jamesr was using, but unfortunately there's unrelated code that's still hanging on to things for too long. Usage is around 250MB now, rather than 400, but it still shouldn't be that high. Comment on attachment 100818 [details] Patch Attachment 100818 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/9059279 New failing tests: http/tests/inspector/resource-tree/resource-tree-non-unique-url.html http/tests/inspector/network/network-cachedresources-with-same-urls.html Created attachment 100835 [details]
Archive of layout-test-results from ec2-cr-linux-03
The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-03 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Created attachment 100980 [details]
Update to fix recently added inspector tests by updating m_documentResources in cache with all new responses
Comment on attachment 100980 [details] Update to fix recently added inspector tests by updating m_documentResources in cache with all new responses Attachment 100980 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/9099257 New failing tests: http/tests/inspector/resource-tree/resource-tree-reload.html http/tests/inspector/network/network-content-replacement-xhr.html Created attachment 100989 [details]
Archive of layout-test-results from ec2-cr-linux-01
The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-01 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
This looks generally ok to me, though it is not nice to increase the complexity of the already convoluted code. However it occurred to me that it might be easier to simply garbage collect m_documentResources occasionally (everything held alive by m_documentResources only is garbage). Did you already reject that approach for some reason? Thanks. I agree the patch does not make the code simpler to understand. GC sounds like a reasonable idea. I guess the only obvious drawback is that it might take "a while" before the resources get freed. It would need only check the refcount of the CachedResource in m_documentResources and decref if 1, correct? I guess the only tricky thing might be finding a suitable time to run that, to make sure there wasn't someone holding only a raw ptr. Could anyone suggest a suitable place to trigger that from? Created attachment 101045 [details]
run a garbage collection on m_documentResources when a load finishes
This is much simpler and more localized, but a bit hacky (what if someone else also has a pseudo-weak map of CachedResourceHandles?). But, overall it's probably more straightforward and is a much smaller patch. The only other concern I have is the cost of iterating the map, which might be expensive if large. Perhaps there is a better place to trigger this from less frequently than loadDone(). This patch doesn't include the unrelated change of CachedResource::m_owningCachedResourceLoader to be a set, I'll patch that separately if it makes sense. You should run this with zero-duration (or a bit longer) timer to avoid O(n^2). Other than that this is much better. Comment on attachment 101045 [details] run a garbage collection on m_documentResources when a load finishes View in context: https://bugs.webkit.org/attachment.cgi?id=101045&action=review > Source/WebCore/loader/cache/CachedResource.h:185 > bool canDelete() const { return !hasClients() && !m_request && !m_preloadCount && !m_handleCount && !m_resourceToRevalidate && !m_proxyResource; } > + unsigned getHandleCount() const { return m_handleCount; } get* naming is not according to coding style. Please replace this with something like hasOneHandle(), no need to expose the count. > Source/WebCore/loader/cache/CachedResourceLoader.cpp:565 > + garbageCollectDocumentResources(); This should start a timer (zero duration or more) so we don't get to O(n^2). > Source/WebCore/loader/cache/CachedResourceLoader.cpp:577 > + for (DocumentResourceMap::iterator it = m_documentResources.begin(); it != m_documentResources.end(); ++it) { > + if (it->second->getHandleCount() == 1) { > + m_documentResources.remove(it); > + // Iterators invalidated. We don't want to copy keys, because they > + // maybe be very large 'data:...'. Rather than restart traversal, > + // simply only remove the first free-able resource, and get > + // another next time. > + return; This is unnecessary. Map keys here are Strings which are really just pointers to StringImpls. Copying a String is just a pointer copy. You should garbage collect the entire map. Also add a comment explaining why this is done and how it could be improved (weak handles). Created attachment 101160 [details]
corrections per comments
Thanks for your comments, fixed all of the things you pointed out. r=me Comment on attachment 101160 [details] corrections per comments Clearing flags on attachment: 101160 Committed r91384: <http://trac.webkit.org/changeset/91384> After this change I’m seeing assertion failures in Mail on Lion when navigating away from some messages. They’re below CachedResourceLoader::garbageCollectDocumentResourcesTimerFired, and it’s the assertion in CachedResource::unregisterHandle that m_handleCount > 0 that’s being hit. I’ll write up a new bug as soon as I can find a reproducible case that I can share. I'm tracking a crash that sounds similar but was only reported when cache was disabled in inspector. I'm going to roll back this behaviour for now, if it's happening more broadly. Created attachment 101892 [details]
rollback, pending crash investigation
Created attachment 101905 [details]
rollback, pending crash investigation
Comment on attachment 101905 [details] rollback, pending crash investigation Clearing flags on attachment: 101905 Committed r91725: <http://trac.webkit.org/changeset/91725> Created attachment 102018 [details]
GC patch updated with test and fix for crash
This patch is identical to https://bugs.webkit.org/attachment.cgi?id=101160 except for the addition of nulling m_owningCachedResourceLoader when the handles are GCd. This avoids ~CachedResource causing a double deletion via removeCachedResource. There's also a test added for this crash. Antti, do you have time to review again? Comment on attachment 102018 [details] GC patch updated with test and fix for crash View in context: https://bugs.webkit.org/attachment.cgi?id=102018&action=review > LayoutTests/http/tests/inspector/network/disabled-cache-crash.html:31 > + NetworkAgent.setCacheDisabled(true, step1); We need to enable memory cache back again before calling completeTest(). Please note, that we've decided that inspector's disable cache option should only clear memory cache, not disable it (https://bugs.webkit.org/show_bug.cgi?id=65184). Since disabling memory cache proved to be the easiest way to reproduce the crash, I suggest that you use window.internals approach to allow disabling memory cache from layout tests (see https://bugs.webkit.org/show_bug.cgi?id=30080 for an example). Created attachment 102178 [details]
GC patch updated with fix and test using window.internals
Fixed comment from vsevik re: re-enabling cache at end of test, and modified the test to use window.internals to disable the memory cache now that the functionality is different in inspector. Comment on attachment 102178 [details] GC patch updated with fix and test using window.internals Attachment 102178 [details] did not pass gtk-ews (gtk): Output: http://queues.webkit.org/results/9260084 Created attachment 102195 [details]
Patch, the same as the previous one for mac bot.
Comment on attachment 102195 [details] Patch, the same as the previous one for mac bot. Attachment 102195 [details] did not pass gtk-ews (gtk): Output: http://queues.webkit.org/results/9247981 Created attachment 102222 [details]
further update to export symbols for test harness
Comment on attachment 102222 [details] further update to export symbols for test harness Attachment 102222 [details] did not pass gtk-ews (gtk): Output: http://queues.webkit.org/results/9265109 Created attachment 102266 [details]
fix export symbols for test harness
Comment on attachment 102266 [details] fix export symbols for test harness Attachment 102266 [details] did not pass gtk-ews (gtk): Output: http://queues.webkit.org/results/9269169 Created attachment 102294 [details]
fix sources.filter for gtk port
I believe this is really, finally (!) a patch worth looking at. Despite all the patches, the only code change since the last r+ is in nulling m_owningCachedResourceLoader on deletion. The rest of the file changes are for adding a test and the various build file and test harness changes required. Comment on attachment 102294 [details]
fix sources.filter for gtk port
r=me
Comment on attachment 102294 [details] fix sources.filter for gtk port Rejecting attachment 102294 [details] from commit-queue. Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=ec2-cq-02', '--port..." exit_code: 2 Last 500 characters of output: 1 succeeded at 154 with fuzz 1 (offset 4 lines). patching file Source/WebKit2/win/WebKit2CFLite.def Hunk #1 succeeded at 148 with fuzz 2 (offset 6 lines). Hunk #2 FAILED at 153. 1 out of 2 hunks FAILED -- saving rejects to file Source/WebKit2/win/WebKit2CFLite.def.rej patching file Source/autotools/symbols.filter Hunk #1 succeeded at 35 with fuzz 2 (offset 4 lines). Failed to run "[u'/mnt/git/webkit-commit-queue/Tools/Scripts/svn-apply', u'--reviewer', u'Antti Koivisto', u'--f..." exit_code: 1 Full output: http://queues.webkit.org/results/9284452 Comment on attachment 102294 [details] fix sources.filter for gtk port View in context: https://bugs.webkit.org/attachment.cgi?id=102294&action=review > Source/WebCore/loader/cache/CachedResourceLoader.cpp:591 > + Vector<String, 10> toDelete; Nit: It would be nice to make Vector<String, 10> a typedef so you don't have to duplicate the type below. Nit: toDelete is kind of generic. What are we about to delete? Maybe urlsToDelete? resourcesToDelete? resourceURLsToDelete? resourcesToDelete? > Source/WebCore/loader/cache/CachedResourceLoader.cpp:601 > + for (Vector<String, 10>::const_iterator idel = toDelete.begin(); idel != toDelete.end(); ++idel) > + m_documentResources.remove(*idel); Nit: idel is a strange iterator name. Why not just use it? It's scoped to the for() loop above, so you can "reuse" it here. Created attachment 102529 [details]
update based on comments
(In reply to comment #67) > (From update of attachment 102294 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=102294&action=review > > > Source/WebCore/loader/cache/CachedResourceLoader.cpp:591 > > + Vector<String, 10> toDelete; > > Nit: It would be nice to make Vector<String, 10> a typedef so you don't have to duplicate the type below. Thanks, fixed. > Nit: toDelete is kind of generic. What are we about to delete? Maybe urlsToDelete? resourcesToDelete? resourceURLsToDelete? resourcesToDelete? Thanks, fixed as resourcesToDelete. > > > Source/WebCore/loader/cache/CachedResourceLoader.cpp:601 > > + for (Vector<String, 10>::const_iterator idel = toDelete.begin(); idel != toDelete.end(); ++idel) > > + m_documentResources.remove(*idel); > > Nit: idel is a strange iterator name. Why not just use it? It's scoped to the for() loop above, so you can "reuse" it here. Burned too many times by ancient compilers that used the old scoping rules. Thanks, fixed. Comment on attachment 102529 [details] update based on comments Clearing flags on attachment: 102529 Committed r92143: <http://trac.webkit.org/changeset/92143> All reviewed patches have been landed. Closing bug. |