The patch from http://trac.webkit.org/changeset/39309 caused a memory regression on the Mozilla membuster tests. The specific site with the regression is www.vodcars.com. After loading the site, RSIZE jumps 10-15 MB.
This message is also printed to the terminal several times:
Thu Dec 18 20:33:36 slewis.apple.com Safari <Error>: CGImageSourceUpdateData image source was already finalized
<rdar://problem/6457900> REGRESSION: Memory regression loading www.vodcars.com
That error message seems strange. I will try to take a first look at this; may need Hyatt's eyes too.
I bet I know what this is, or at least part of it. Now that we're not deleting the whole decoder after every frame, we shouldn't be calling setData() on it either. I haven't had time to test the effect of removing this yet (typing this on my phone) and don't know how to run the memory test locally anyway, but I bet this will solve it.
I also suspect this may help rid the CG decoder of some of the perf issues I still hadn't figured out, since setData() resets the whole decoder state.
Will whip up a patch when I get into the office in half an hour.
Created attachment 26148 [details]
This ought to fix the error message, and if the memory regression is caused by the same problem it should fix that too. I'm not sure how to test that locally so I'll be interested in the performance of this in the test suite.
In short, the issue was that in the old code, we always deleted the decoder, so we always called setData() after it. Now that we don't delete the decoder all the time, we shouldn't always call setData() again either. Because BitmapImage seems like the wrong place to know how ImageSource is handling a clear() call internally, I moved this logic into ImageSource::clear() itself.
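To illustrate the shape of that change, here is a minimal, self-contained sketch (not WebKit's actual code; the class layout, `Decoder` stand-in, and method signatures are hypothetical) of moving the "only call setData() if we actually destroyed the decoder" decision into ImageSource::clear() itself. Calling setData() on a decoder that has already seen the final bytes is what produces the "image source was already finalized" error from the real CG decoder:

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the platform decoder (CGImageSource on Mac).
struct Decoder {
    bool finalized = false;
    int setDataCalls = 0;
    void setData(bool allDataReceived) { ++setDataCalls; finalized = allDataReceived; }
};

struct ImageSource {
    std::unique_ptr<Decoder> m_decoder;

    // clear() now owns the decision: only recreate the decoder and re-feed it
    // data when we actually destroyed it.  If the caller merely wants decoded
    // frames dropped, the decoder (and its finalized state) survives untouched
    // and setData() is never re-invoked.
    void clear(bool destroyAll, const char* data, bool allDataReceived) {
        if (!destroyAll)
            return;  // keep the existing decoder; no setData()
        m_decoder = std::make_unique<Decoder>();
        if (data)
            m_decoder->setData(allDataReceived);
    }
};
```

The point of the design is that BitmapImage no longer needs to know how ImageSource handles a clear() internally; it just states its intent via the `destroyAll` flag.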
The patch cleared up the error, but not the memory regression.
Created attachment 26216 [details]
OK, this patch goes one small step further by reverting the CG ImageSource to its former behavior of destroying the decoder after every large frame. This should fix the memory issues (please verify).
I'm not totally sure why this is necessary. On the images I looked at it looked as if the CG decoders were dropping frames that no one referenced (in fact, so aggressively that they continued to redecode frames unnecessarily). But I don't have the CG source code so I can't look into this or implement a "drop unnecessary frames" call like the other platforms now have, which is IMO the correct way to address this issue.
I just realized this doesn't quite revert to the previous behavior. It's actually more aggressive than that -- it destroys the decoder even when we were merely invalidating the last frame because we'd gotten new data. Not sure how big of a problem that is. I'm inclined to not worry about it unless it actually leads to CPU usage problems, since solving that issue alone will require yet more hacks; the "right" solution (IMO), to implement selective destruction of frames like the other platforms, ought to take care of this too.
Also, note that checking this in will re-cause bug 22108 :(
Comment on attachment 26216 [details]
Clearing r? on this; I'm not willing to regress another bug to fix this one. We should try to get the first patch in at least, since it's an unqualified win, and then figure out what the memory usage here is actually caused by.
ObjectAlloc shows many more live CGImages due to r39309. They are held by the CGImageSource. Prior to r39309, ImageSource::clear() always released the CGImageSource, which led to the CGImages being destroyed.
Yes. Normally both the CGImageSource and the BitmapImage hold references to CGImages. BitmapImage is designed to release its references for large images, and tell the CGImageSource it can do the same. Unfortunately there's currently no API (that I know of) to tell the CGImageSource that it can release certain frames selectively, and it seems to be overly aggressive about hanging onto everything even when no one else has a ref, so the old code forcibly destroyed it to keep memory usage low. Unfortunately this trashes other useful data like repeat counts, leading to problems with animated images.
The best solution is to find some way to communicate which frames can be released, a la the open-source GIFImageDecoder.cpp. Another possibility is to return to a hack like the old code while preserving metadata like the repeat count on the side, to avoid regressing bug 22108; this is a lot more fragile.
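For comparison, here is a hedged sketch (hypothetical names throughout; this is a model of the idea, not the actual GIFImageDecoder interface) of what "selective frame release" looks like in the open-source decoder: the caller frees just the decoded pixel buffers it no longer needs, while metadata like the repetition count stays intact:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical frame cache modeling the open-source GIFImageDecoder's
// selective-release approach: decoded pixels for individual frames can be
// freed on request, without destroying the decoder or its metadata.
struct FrameCache {
    struct Frame { std::vector<unsigned char> pixels; };
    std::vector<Frame> frames;
    int repetitionCount = 0;  // metadata survives frame release

    // Release decoded pixel data for all frames before |end|, keeping the
    // Frame entries themselves so per-frame metadata remains queryable.
    void clearFramesBefore(size_t end) {
        for (size_t i = 0; i < end && i < frames.size(); ++i) {
            frames[i].pixels.clear();
            frames[i].pixels.shrink_to_fit();
        }
    }
};
```

This is precisely the capability the CG decoder lacks: there is no per-frame release call, so the only lever is destroying the whole CGImageSource, which takes the metadata down with it.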
Finally, in fairness to the CGImageSource, it's not totally clear to me that it's always wasting memory in what images it holds references to. Perhaps in some cases it's holding onto prior frames to prevent redecoding, which means that it would use more memory now, but legitimately so (i.e. this is an intentional "regression" in memory usage in a few cases in order to save CPU and gain correctness). OTOH, testing seems to show that the CGImageSource decodes too frequently anyway, so if this is the reason, then the code isn't very good.
I need someone with access to the closed source decoders, like Hyatt, to help solve this problem. I have no visibility into the decoder :(
This memory use regression is far more serious than the original problem in terms of effect on Safari. That problem meant that certain GIFs wouldn't animate, but this causes memory leaks and makes the browser slower and slower over time.
I don't see any memory leaks here. There may be increased memory usage for certain GIFs but AFAIK no leaks (which would worry me very much). Am I missing something?
The patch does not cause memory leaks, just a very large memory use regression. The Mozilla membuster test is one of the primary benchmarks we are tracking for memory use. The memory regression needs to be fixed soon, or the original patch will need to be reverted. We can't have a huge memory regression in the tree for this long.
What information do you feel you are missing that you could gather from looking at the CG sources? Perhaps I can help.
Basically, I need to understand the criteria it uses to decide how to hold references to frames, so that I can understand
(a) Which frames are getting held here that weren't before,
(b) Whether that's actually accruing any advantage to the decoder (i.e. is this a pure regression, or is this a tradeoff?), and
(c) How I can ask the decoder to release frames -- especially to release frames selectively a la the interface that got added recently (see the open-source GIFImageDecoder.cpp code).
The dumbest thing for the decoder to do would be to just always hold refs to all frames, and not have any API to release them. Better would be either a smart algorithm that drops references unless they're really needed, or an API that callers can use to instruct the decoder to drop refs, or even better, both.
(In reply to comment #15)
> (c) How I can ask the decoder to release frames -- especially to release frames
> selectively a la the interface that got added recently (see the open-source
> GIFImageDecoder.cpp code).
ImageIO does not have API or any other way that I could see to selectively release images that a CGImageSource has retained. The way to release those images is to release the CGImageSource. That is what the code used to do prior to r39309, and I strongly recommend going back to that behavior.
OK, but _why_ has it retained these images? It clearly doesn't retain all the images all the time -- most of the time, it discards them. Why does it retain the ones on this particular site? Everyone keeps assuming this is just completely wasted memory, but that's not obvious to me until I understand what's different about this case than the common ones.
If I simply revert r39309, I'll break animation in a lot of cases, since the old code destroyed the repeat count. Instead, if we want to go back to the solution of destroying the whole decoder, we're going to need to put more complex logic into caching this value inside the ImageSource. I can probably try to write up such a patch. Can someone please review the patch that's already on this bug, though?
Has anyone given any thought to simply using the open-source GIF decoder? At this point it is significantly faster and uses less memory than the GIF decoder in ImageIO. I have been operating under the assumption that this suggestion wouldn't even be considered, but it seems worth making.
(In reply to comment #17)
> OK, but _why_ has it retained these images? It clearly doesn't retain all the
> images all the time -- most of the time, it discards them.
What makes you say this? From the source, it looks like the only time it discards images is when the CGImageSource is destroyed.
> ... Why does it retain
> the ones on this particular site? Everyone keeps assuming this is just
> completely wasted memory, but that's not obvious to me until I understand
> what's different about this case than the common ones.
We are not looking to justify an increase in memory use, we are trying to fix a regression.
> If I simply revert r39309, I'll break animation in a lot of cases, since the
> old code destroyed the repeat count.
Broken animations were an earlier regression. Before that, we had no significant issues in this area.
> Instead, if we want to go back to the
> solution of destroying the whole decoder, we're going to need to put more
> complex logic into caching this value inside the ImageSource.
Alternatively, we can consider reverting more of the changes that led up to the regression r39309 was meant to solve.
> I can probably
> try to write up such a patch. Can someone please review the patch that's
> already on this bug, though?
That patch does not address the memory issue.
(In reply to comment #18)
> (In reply to comment #17)
> > OK, but _why_ has it retained these images? It clearly doesn't retain all the
> > images all the time -- most of the time, it discards them.
> What makes you say this? From the source, it looks like the only time it
> discards images is when the CGImageSource is destroyed.
That's strange. That ought to mean that not destroying the CGImageSource regresses the memory usage for any "large" GIF (where "large" means "big enough decoded size to trip the heuristics in the old code"). But on most of my test images I see no real difference in memory usage in Safari for Windows. Also, for very large images, we ought to see memory usage growing accordingly as the image frames are decoded, which is precisely the behavior I could reproduce on the Cairo/Chromium builds when I tested taking out all frame-freeing entirely. But instead memory usage on these images seems capped. I don't know how to square all of this with your examination of the source code :(
> > ... Why does it retain
> > the ones on this particular site? Everyone keeps assuming this is just
> > completely wasted memory, but that's not obvious to me until I understand
> > what's different about this case than the common ones.
> We are not looking to justify an increase in memory use, we are trying to fix a
> regression.
I'm not denying memory usage has increased on this site. I certainly want to fix that assuming it doesn't cause an even worse tradeoff. I merely want to understand what that tradeoff is, if any. The data looks odd and this code is hairy, and I'd rather not cause an endless series of regressions by failing to completely understand what's going on.
> > If I simply revert r39309, I'll break animation in a lot of cases, since the
> > old code destroyed the repeat count.
> Broken animations were an earlier regression. Before that, we had no
> significant issues in this area.
Not true. Safari has always suffered from major CPU usage issues decoding certain GIFs (fixed in Cairo/Chromium but not in Safari, partly due to the lack of a way to tell the CG decoder to discard certain frames), and animated GIFs far too slowly (unboundedly slowly, actually), with GIFs taking anywhere from 10% to 200% longer to animate than they should have (an issue which _is_ fixed). Additionally, the Cairo/Chromium ports suffered from unbounded memory usage (spikes of several hundred megs on large images), which is also fixed. Image animations also didn't sync correctly across tab switching, scroll events, etc.; this is also fixed.
> Alternatively, we can consider reverting more of the changes that led up to the
> regression r39309 was meant to solve.
The only path backward is probably to back out everything all the way to the beginning, which would reintroduce all the problems I mentioned above.
> > I can probably
> > try to write up such a patch. Can someone please review the patch that's
> > already on this bug, though?
> That patch does not address the memory issue.
But it _does_ address the other errors, is correct, and is a necessary precursor to the patch I plan to write to have CGImageSource.cpp destroy the decoder, since there seems to be no other way to release these image references.
> But it _does_ address the other errors, is correct, and is a necessary
> precursor to the patch I plan to write to have CGImageSource.cpp destroy the
> decoder, since there seems to be no other way to release these image
This regression has been in the tree long enough that unless you are certain you can fix this issue in the next few days, we really will want to roll out the series of patches that led up to here. For that reason, I'm not sure it makes sense to land additional partial fixes on top of it.
Comment on attachment 26148 [details]
After hearing from Peter that he thinks he can fix the memory regression in a timely manner (end of the week?), I am going to r+ this. The ChangeLog needs to be changed to note that it does not actually fix the regression.
Comment on attachment 26148 [details]
Clearing the review flag to keep this patch off the "waiting for commit" list, since I've landed this patch but am not yet closing the bug.
Created attachment 26542 [details]
memory regression patch v1
My fix for bug 22108 was helpful for non-CG backends, but further testing shows it was totally useless for CG, which apparently scans as much of the GIF as it needs to determine the repetition count. Since that's true, it won't make animation any flakier to go ahead and destroy the decoder after each frame if the image is large.
In theory, the CG decoder could use the frames it has retained to speed decoding of later frames, which means this change would trade lower memory usage for higher CPU usage; and if the decoder weren't extremely aggressive about throwing away frames internally during decoding, even the memory savings wouldn't be real savings, since memory would transiently spike that high anyway. However, the CG decoder doesn't seem to use these frames that way, so keeping them doesn't save anything.
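A minimal sketch of the approach the patch takes (hypothetical names and threshold; WebKit's actual cutoff constant and call sites differ): after a frame of a "large" image is consumed, the whole platform decoder is destroyed, which is the only way to release every CGImage the source retained. This is safe for animation because, per the above, the CG decoder re-derives the repetition count by rescanning the data:

```cpp
#include <cstddef>
#include <memory>

// Assumed size threshold above which decoded frames are considered "large";
// the real cutoff value is an implementation detail not shown here.
constexpr size_t kLargeAnimationCutoff = 5 * 1024 * 1024;

struct Decoder { /* stands in for a CGImageSourceRef and its retained frames */ };

struct ImageSource {
    std::unique_ptr<Decoder> m_decoder = std::make_unique<Decoder>();
    void destroyDecoder() { m_decoder.reset(); }
    bool hasDecoder() const { return m_decoder != nullptr; }
};

// Called after a frame's decoded data has been handed to the caller.  For
// large frames, destroying the decoder frees every frame it retained; for
// small ones, keeping it avoids redundant redecoding.
void didConsumeFrame(ImageSource& source, size_t decodedFrameBytes) {
    if (decodedFrameBytes >= kLargeAnimationCutoff)
        source.destroyDecoder();
}
```

The tradeoff discussed above is visible in the condition: below the cutoff the decoder (and any CPU savings from its retained frames) survives; above it, memory wins.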
Comment on attachment 26542 [details]
memory regression patch v1
This does indeed address the memory issue. Thanks for tracking this down. r=me.
Fixed in r39734.