DocumentLoader keeps a reference to all URL strings ever loaded leading to lots of memory waste
This set is used in FrameLoader::loadedResourceFromMemoryCache() to control whether the client notifications are sent for loads that go through this codepath. Possible fixes for the memory waste are: 1.) Preserve current behavior exactly and use a more compact data structure to determine whether a given URL is in the set of URLs the client knows about, for example a HashSet of MD5 hashes. 2.) Decide that we don't need the current behavior for data: URLs and exclude them from the set since these URLs are generally large and contribute more to WebKit's excessive memory use than other types of URLs. The client call this set effects is dispatchDidLoadResourceFromMemoryCache, which is notImplemented() in every in-tree port except for the Mac port (not WebKit2) and Chromium.
Created attachment 95708 [details] Patch
Patch posted for consideration. If someone more familiar with the details of the mac port's use of this client API can suggest a better solution for that port that would be excellent, but in the meantime I think it's a clear win to fix this leak for everyone else.
Comment on attachment 95708 [details] Patch LGTM. You should file a separate bug for the mac port, or possibly repurpose this one.
Bradee has historically worked on the mac port's loader.
Filed https://bugs.webkit.org/show_bug.cgi?id=61904 for Mac.
Comment on attachment 95708 [details] Patch Especially since we don't even need it in WK2, we should be able to get away from this in WK1 too. But clearly we can't simply remove this concept until someone explores the Mac port's usage of it. I'd suggest a better patch for now that would be a greater win for WebCore and all ports as a whole, in readability and prepping for future removal of this code: - File bugzilla for the Mac port exploration as Eric suggested - In DocumentLoader.h, put didTellClientAboutLoad, haveToldClientAboutLoad, and m_resourcesClientKnowsAbout in a single #if PLATFORM(MAC) block. Note that this means for non-Mac platforms, they will be compiled out and not exist. - Put an #if PLATFORM(MAC) block at the call sites of the two methods. Without looking at code, I bet this will only be a few places. (you probably already know) - Take these 3 or 4 #if PLATFORM(MAC) sites with a FIXME that includes the bugzilla you filed.
Heh. Apparently I can't clear the commit-queue flag even though I r-'ed, and now I can't even change the review or commit-queue flags. We'll see if the commit-queue lands an r-'ed patch :) (BTW, I did think it was worth an r- because of how much fracture this adds to DocumentLoader.h)
(In reply to comment #7) > (From update of attachment 95708 [details]) > Especially since we don't even need it in WK2, we should be able to get away from this in WK1 too. > > But clearly we can't simply remove this concept until someone explores the Mac port's usage of it. I've explored it to the extent that I'm able, but all I can tell is that it's not needed for anything except for Mac WK1 and the Mac WK1 usage is black magic to me. > > I'd suggest a better patch for now that would be a greater win for WebCore and all ports as a whole, in readability and prepping for future removal of this code: > - File bugzilla for the Mac port exploration as Eric suggested Done - https://bugs.webkit.org/show_bug.cgi?id=61904 > - In DocumentLoader.h, put didTellClientAboutLoad, haveToldClientAboutLoad, and m_resourcesClientKnowsAbout in a single #if PLATFORM(MAC) block. Note that this means for non-Mac platforms, they will be compiled out and not exist. > - Put an #if PLATFORM(MAC) block at the call sites of the two methods. Without looking at code, I bet this will only be a few places. (you probably already know) > - Take these 3 or 4 #if PLATFORM(MAC) sites with a FIXME that includes the bugzilla you filed. These sound nice but the resolution depends on how https://bugs.webkit.org/show_bug.cgi?id=61904 is resolved, which isn't something I can't really handle personally. I hope that we can just delete this all. In the meantime, however, it seems unfair to penalize all of the other ports (and WK2) with this.
Comment on attachment 95708 [details] Patch clearin' cq
Created attachment 95722 [details] guards entire functions + callsites
Added ifdefs to the callsites as suggested. It's more #ifdef-age, but I agree that it's a bit cleaner.
(In reply to comment #9) > (In reply to comment #7) > > (From update of attachment 95708 [details] [details]) > > Especially since we don't even need it in WK2, we should be able to get away from this in WK1 too. > > > > But clearly we can't simply remove this concept until someone explores the Mac port's usage of it. > > I've explored it to the extent that I'm able, but all I can tell is that it's not needed for anything except for Mac WK1 and the Mac WK1 usage is black magic to me. I meant "until someone familiar with it explores the Mac port's usage", not you :) > > > - In DocumentLoader.h, put didTellClientAboutLoad, haveToldClientAboutLoad, and m_resourcesClientKnowsAbout in a single #if PLATFORM(MAC) block. Note that this means for non-Mac platforms, they will be compiled out and not exist. > > - Put an #if PLATFORM(MAC) block at the call sites of the two methods. Without looking at code, I bet this will only be a few places. (you probably already know) > > - Take these 3 or 4 #if PLATFORM(MAC) sites with a FIXME that includes the bugzilla you filed. > > These sound nice but the resolution depends on how https://bugs.webkit.org/show_bug.cgi?id=61904 is resolved, which isn't something I can't really handle personally. I hope that we can just delete this all. In the meantime, however, it seems unfair to penalize all of the other ports (and WK2) with this. I think you misunderstood me here, too. I am fully in support of cleanup now that removes the penalty for all the other ports. I, too, hope we can just delete this all. The patch I suggested moves all of the code that needs to be removed to exactly one place in DocumentLoader.h, and tags the call sites that can be simplified later with the bug # that will simplify them.
Comment on attachment 95722 [details] guards entire functions + callsites I guess this is simply enough that we don't really need to tag the callsites with the bug # :) Thanks for refining it (I hate branched #ifdefs!)
Comment on attachment 95722 [details] guards entire functions + callsites Clearing flags on attachment: 95722 Committed r87901: <http://trac.webkit.org/changeset/87901>
All reviewed patches have been landed. Closing bug.
Reverted r87901 for reason: Might Committed r87990: <http://trac.webkit.org/changeset/87990>
Created attachment 101188 [details] generate large images via data urls also at www/~scottmg/imgmemory.html inside Google
Created attachment 101189 [details] exclude data urls from m_resourcesClientKnowsAbout
Attached test doesn't leak any longer. There's still a pretty high high-watermark due to image cache, but can run indefinitely now.
Oops, should have added discussion to comments. Previous attempt at a fix by removing the cache caused a performance regression of around 3% in Chromium page_cycler tests. New patch instead simply excludes data urls from the cache. This takes care of the majority of unwanted memory usage, while maintaining performance characteristics.
Created attachment 101735 [details] exclude data urls from m_resourcesClientKnowsAbout (except on MAC pending bug 61904)
Same as previous patch, but exclude PLATFORM(MAC) until someone has time to look at https://bugs.webkit.org/show_bug.cgi?id=61904 Now only affects Chromium based on previous discussion.
Comment on attachment 101735 [details] exclude data urls from m_resourcesClientKnowsAbout (except on MAC pending bug 61904) R=me
Comment on attachment 101735 [details] exclude data urls from m_resourcesClientKnowsAbout (except on MAC pending bug 61904) Clearing flags on attachment: 101735 Committed r91600: <http://trac.webkit.org/changeset/91600>