Bug 89901 - [chromium] Respect memory needed for RenderSurfaces when reserving contents textures
Summary: [chromium] Respect memory needed for RenderSurfaces when reserving contents textures
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Importance: P2 Normal
Assignee: Dana Jansens
URL:
Keywords:
Depends on: 84308 89793
Blocks:
Reported: 2012-06-25 13:00 PDT by Dana Jansens
Modified: 2012-07-04 13:51 PDT
CC: 9 users

See Also:


Attachments
Patch (19.65 KB, patch) - 2012-06-25 13:49 PDT, Dana Jansens
Patch (26.36 KB, patch) - 2012-06-25 17:13 PDT, Dana Jansens
Patch (32.60 KB, patch) - 2012-06-25 17:35 PDT, Dana Jansens
Patch (26.39 KB, patch) - 2012-06-26 07:37 PDT, Dana Jansens
Patch (112.97 KB, patch) - 2012-06-27 12:31 PDT, Dana Jansens
Patch (117.39 KB, patch) - 2012-06-29 11:20 PDT, Dana Jansens
Patch (109.33 KB, patch) - 2012-06-29 13:25 PDT, Dana Jansens
Patch (108.13 KB, patch) - 2012-07-04 07:52 PDT, Dana Jansens
Patch (108.02 KB, patch) - 2012-07-04 08:21 PDT, Dana Jansens
Patch (108.11 KB, patch) - 2012-07-04 11:48 PDT, Dana Jansens

Description Dana Jansens 2012-06-25 13:00:14 PDT
[chromium] Respect memory needed for RenderSurfaces when reserving contents textures
Comment 1 Dana Jansens 2012-06-25 13:49:20 PDT
Created attachment 149344 [details]
Patch
Comment 2 Dana Jansens 2012-06-25 13:50:23 PDT
Once we do this, we can allocate RenderSurface textures in LRC freely.
Comment 3 Eric Penner 2012-06-25 15:23:47 PDT
Comment on attachment 149344 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:494
>      // otherwise it will always push us towards the maximum limit.

I think with this change this comment doesn't apply anymore. With this change, we would actually want the contents limit to push towards the maximum limit (after render surfaces have been subtracted). So we could just compute the render surface usage and set the maximum once at the top here and be done with preferred etc.

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:568
> +        reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), requiredBytes);

Rather than calling this function several times inside here, could we just have another function like ::computeRenderSurfaceMemoryBytes() that is called once before any painting even starts?
Comment 4 Eric Penner 2012-06-25 15:34:01 PDT
I personally like this. We should still set a limit on render surfaces (minimally 100% of the total limit, but maybe 50% or 33% would make more sense?). I also like contents using the "remainder" rather than surfaces, since contents will just grow to fill any limit, whereas hopefully render surfaces will be pretty reasonable. If surfaces were using 50% of our total limit, it seems the page would be almost completely dead already just rendering all those surfaces.
Comment 5 James Robinson 2012-06-25 15:35:38 PDT
Can we add some histograms about RS usage in the wild? I'd be surprised if we were hitting or even getting close to the RS limits on most pages
Comment 6 Dana Jansens 2012-06-25 15:51:39 PDT
Comment on attachment 149344 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review

>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:494
>>      // otherwise it will always push us towards the maximum limit.
> 
> I think with this change this comment doesn't apply anymore. With this change, we would actually want the contents limit to push towards the maximum limit (after render surfaces have been subtracted). So we could just compute the render surface usage and set the maximum once at the top here and be done with preferred etc.

Oh, was the preferred limit meant to help with RenderSurfaces? Then yeah, we could remove it now :)

>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:568
>> +        reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), requiredBytes);
> 
> Rather than calling this function several times inside here, could we just have another function like ::computeRenderSurfaceMemoryBytes() that is called once before any painting even starts?

I wanted to only reserve one surface at a time, as its contents are reserved, so that when we OOM we allocate surfaces that will have contents, and don't crowd out contents with surfaces that won't have any allocated contents drawing into them.
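
A minimal sketch of this incremental scheme, using hypothetical names (ReservationBudget, Layer) rather than the real CCLayerTreeHost/TextureManager types: a target surface is charged the first time a layer drawing into it reserves contents, so an OOM never leaves a surface reserved without any allocated contents behind it.

#include <cstddef>
#include <set>

struct Layer {
    int targetSurfaceId; // surface this layer's contents draw into
    size_t contentsBytes; // memory needed for the layer's contents
    size_t targetSurfaceBytes; // memory needed for that target surface
};

struct ReservationBudget {
    size_t limitBytes;
    size_t usedBytes = 0;
    std::set<int> reservedSurfaces;

    bool reserve(size_t bytes)
    {
        if (usedBytes + bytes > limitBytes)
            return false;
        usedBytes += bytes;
        return true;
    }

    // Charge the layer's target surface (once) together with its contents, so a
    // surface is only reserved if at least one layer drawing into it fits.
    bool reserveContentsWithSurface(const Layer& layer)
    {
        bool surfaceAlreadyReserved = reservedSurfaces.count(layer.targetSurfaceId) > 0;
        size_t neededBytes = layer.contentsBytes + (surfaceAlreadyReserved ? 0 : layer.targetSurfaceBytes);
        if (!reserve(neededBytes))
            return false;
        reservedSurfaces.insert(layer.targetSurfaceId);
        return true;
    }
};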
Comment 7 Dana Jansens 2012-06-25 15:53:17 PDT
(In reply to comment #4)
> I personally like this. We should still set a limit on render surfaces (minimally 100% of the total limit, but maybe 50% or 33% would make more sense?). I also like contents using the "remainder" rather than surfaces, since contents will just grow to fill any limit, whereas hopefully render surfaces will be pretty reasonable. If surfaces were using 50% of our total limit, it seems the page would be almost completely dead already just rendering all those surfaces.

Personally I think 100% is fine, especially since this is reserving space for each surface as it reserves contents that draw into it.

(In reply to comment #5)
> Can we add some histograms about RS usage in the wild? I'd be surprised if we were hitting or even getting close to the RS limits on most pages

Yes!
Comment 8 Dana Jansens 2012-06-25 17:13:13 PDT
Created attachment 149399 [details]
Patch
Comment 9 Dana Jansens 2012-06-25 17:14:21 PDT
Added histograms, and stopped using the preferred limit as per epenner's suggestion - instead just using the max limit after subtracting render surfaces.
Comment 10 Dana Jansens 2012-06-25 17:35:29 PDT
Created attachment 149406 [details]
Patch

- Little change to the test to clear the rootLayer pointer, as the others do.
Comment 11 Eric Penner 2012-06-25 17:47:57 PDT
Comment on attachment 149344 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review

>>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:568
>>> +        reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), requiredBytes);
>> 
>> Rather than calling this function several times inside here, could we just have another function like ::computeRenderSurfaceMemoryBytes() that is called once before any painting even starts?
> 
> I wanted to only reserve one at a time, as its contents are reserved, so that when we OOM, we allocate surfaces that will have contents, and don't prevent contents with surfaces that won't have any contents allocated that draw into them.

I see. This is going to be more difficult to support in the model where we request all textures with priorities in advance (rushes to submit patch first ;) ). I'm hoping it will be acceptable to just request all surfaces and subtract that from contents memory. All render surfaces are visible, so we need them to render the page correctly... So this will only really "help" in the ~50%-100% memory usage range, and by "help" I mean render the page "less incorrect".
Comment 12 Adrienne Walker 2012-06-25 18:23:33 PDT
Comment on attachment 149406 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149406&action=review

> Source/WebCore/ChangeLog:80
> +        Here we stop doing this, and create RenderSurfaces on both threads
> +        under the same conditions, so main thread has surfaces only if the
> +        layers in its current frame demand them. While this may reduce
> +        paint culling within an animating subtree, this seems like an edge
> +        case and knowing the amount of surface memory needed for the frame
> +        is important.

How does that work? You're not really creating render surfaces in the same conditions, since on the main thread these animations may not have even started yet.  Would it make more sense to reserve as if the animation had started rather than potentially underestimating?

> Source/WebCore/platform/graphics/chromium/RenderSurfaceChromium.cpp:82
> +    if (!m_backgroundFilters.isEmpty())
> +        bytes += TextureManager::memoryUseBytes(m_contentRect.size(), backgroundTextureFormat());

Is this also just meant to be an estimate, since m_contentRect is not the size of the background texture?

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:579
> +        bool isRootRenderSurface = it.targetRenderSurfaceLayer() != renderSurfaceLayerList[0];

You could just stick the root render surface in allocatedSurfaces...

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:581
> +            reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), occlusionTracker.overdrawMetrics(), target->requiredTextureBytes());

So the idea is to allocate all render surfaces simultaneously and just cache all surfaces?
Comment 13 Eric Penner 2012-06-25 19:05:10 PDT
Comment on attachment 149344 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review

>>>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:568
>>>> +        reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), requiredBytes);
>>> 
>>> Rather than calling this function several times inside here, could we just have another function like ::computeRenderSurfaceMemoryBytes() that is called once before any painting even starts?
>> 
>> I wanted to only reserve one at a time, as its contents are reserved, so that when we OOM, we allocate surfaces that will have contents, and don't prevent contents with surfaces that won't have any contents allocated that draw into them.
> 
> I see. This is going to be more difficult to support in the model where we request all textures with priorities in advance (rushes to submit patch first ;) ). I'm hoping it will be acceptable to just request all surfaces and subtract that from contents memory. All renders surfaces are visible, so we need them to render the page correctly... So this will only really "help" in the ~50%-100% memory usage range, and by "help" I mean render the page "less incorrect".

And just to be clear I'm joking above and I like this; I'm just not sure whether we should add a lot of code to support custom priorities for each surface once that CL lands. Currently we don't need layer ordering.
Comment 14 Dana Jansens 2012-06-25 19:24:19 PDT
Comment on attachment 149406 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149406&action=review

>> Source/WebCore/ChangeLog:80
>> +        is important.
> 
> How does that work? You're not really creating render surfaces in the same conditions, since on the main thread these animations may not have even started yet.  Would it make more sense to reserve as if the animation had started rather than potentially underestimating?

Oh crap that's supposed to be 89793, I uploaded them together *^_^*.

My thinking is that it seems fine to go over the limit a little here and there. If we assume all animations are surfaces we will greatly overestimate surface memory (most animations are not surfaces), and evict contents. Instead, I favoured going over the limit for a bit. The main thread animates too (at a slower rate) and will realize these animating things are a render surface and adjust, if and when it sees that.

>> Source/WebCore/platform/graphics/chromium/RenderSurfaceChromium.cpp:82
>> +        bytes += TextureManager::memoryUseBytes(m_contentRect.size(), backgroundTextureFormat());
> 
> Is this also just meant to be an estimate, since m_contentRect is not the size of the background texture?

m_contentRect is the size of the background texture, but not the size of the readback texture. The readback texture is <= the root surface so it's accounted for.

>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:579
>> +        bool isRootRenderSurface = it.targetRenderSurfaceLayer() != renderSurfaceLayerList[0];
> 
> You could just stick the root render surface in allocatedSurfaces...

Oh, yeah, that's true. I will do that!

>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:581
>> +            reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), occlusionTracker.overdrawMetrics(), target->requiredTextureBytes());
> 
> So the idea is to allocate all render surfaces simultaneously and just cache all surfaces?

Yeah.
Comment 15 Dana Jansens 2012-06-25 20:54:02 PDT
(In reply to comment #13)
> (From update of attachment 149344 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review
> And just to be clear I'm joking above and I like this; I'm just not sure whether we should add a lot of code to support custom priorities for each surface once that CL lands. Currently we don't need layer ordering.

Ah I see. With PTM it seems like it would make sense to just compute the sum total of surfaces at the start. I am just worried about how it might be abused. For any memory limit, I can construct a legitimate webpage that would use arbitrarily close to 100% of the available memory for surfaces, and a small amount for content, without going outside our memory limit. And I think we should be able to display it, since it is within our bounds. So I don't like limiting surfaces to some arbitrary < 100%. Also, if we do that, then which surfaces do we drop? The impl side suddenly has to agree again. Right now it has a very clear cue about which surfaces to draw (they have a non-empty quad list, i.e. at least one contributing texture). We would need to add a flag of some sort to RenderSurface in order to drop/skip the RenderPass when creating quads.

It's a bit awkward though, I mean, if we reserve all the surface memory we could easily prevent any contents from being drawn. Is that okay? I've been assuming it's not.


Currently PTM puts the cutoff so that no textures above the cutoff fail to be allocated. Then it just grabs more-or-less arbitrary textures to fill in the remaining memory during paint.

What if we decided which textures will be drawn during prioritization instead of during paint? I.e. once we run out of memory, we pick an arbitrary set of textures with priority equal to the cutoff that stays below the limit, and mark them as above the cutoff (the others are below). And then during paint we only allow textures strictly marked as above.

Then, what if we did prioritization such that visible textures drawing to the root surface get visible+0.5, visible textures drawing to the next surface get visible+0.4, visible textures drawing to the next get visible+0.3, etc.

Then, when we sort, the visible things will be sorted according to target surface. As we walk down the priority list, at priority visible+0.5 we must reduce available memory by the size of the root render surface. At visible+0.4, we reduce available memory by the next render surface.

If we reach the end of the visible priorities, we'll have reserved space for all the render surfaces. If we do not, we will have reserved space for any surfaces with visible textures that will be available.


I could write this change against the PTM patch, with intent to land it after the PTM patch does.
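
A rough sketch of the banded scheme above; the band spacing, names, and data layout are all hypothetical (this is not the CCPriorityCalculator or CCPrioritizedTextureManager API), but it shows how each surface's own memory gets charged exactly when the walk first reaches its band.

#include <algorithm>
#include <cstddef>
#include <vector>

struct TextureRequest {
    size_t bytes;
    int targetSurfaceDepth; // 0 = root surface, 1 = next surface, ...
};

// Visible priority bonus per target surface: +0.5 for the root, less for deeper
// surfaces, so sorting groups visible textures by their target surface.
float visibleBonusForDepth(int depth)
{
    return 0.5f - 0.1f * depth; // hypothetical spacing
}

// Walk the requests in priority order; charge a surface's own bytes the first
// time its band is reached, then grant contents that still fit under the limit.
size_t grantVisibleMemory(std::vector<TextureRequest> requests, const std::vector<size_t>& surfaceBytesByDepth, size_t memoryLimitBytes)
{
    std::sort(requests.begin(), requests.end(), [](const TextureRequest& a, const TextureRequest& b) {
        return visibleBonusForDepth(a.targetSurfaceDepth) > visibleBonusForDepth(b.targetSurfaceDepth);
    });

    size_t usedBytes = 0;
    int lastChargedDepth = -1;
    for (const TextureRequest& request : requests) {
        while (lastChargedDepth < request.targetSurfaceDepth) {
            ++lastChargedDepth;
            usedBytes += surfaceBytesByDepth[lastChargedDepth]; // assumes one entry per depth
        }
        if (usedBytes > memoryLimitBytes)
            break; // this surface (and anything deeper) no longer fits
        if (usedBytes + request.bytes <= memoryLimitBytes)
            usedBytes += request.bytes;
    }
    return usedBytes;
}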
Comment 16 Dana Jansens 2012-06-26 07:37:31 PDT
Created attachment 149530 [details]
Patch

- Without 89793 included.
Comment 17 Dana Jansens 2012-06-26 08:41:08 PDT
(In reply to comment #15)
> What if we decided which textures would be drawn during prioritization instead of during paint. Ie once we run out of memory, we pick an arbitrary set of textures with priority = the cutoff that stay below the limit, and mark them as above (others are below). And then during paint we only allow textures strictly marked as above.

Right... and occlusion culling throws a wrench in here.
Comment 18 Dana Jansens 2012-06-27 12:31:07 PDT
Created attachment 149783 [details]
Patch

This CL is now rebased onto the PrioritizedTextureManager.
Comment 19 Dana Jansens 2012-06-29 11:20:21 PDT
Created attachment 150220 [details]
Patch
Comment 20 Eric Penner 2012-06-29 11:41:51 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)

Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.

It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
Comment 21 Dana Jansens 2012-06-29 11:45:07 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
> 
> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
> 
> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.

The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
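
For concreteness, a much-simplified two-pass sketch of that ordering; the structure and names here are assumptions, not the actual CCPrioritizedTextureManager::prioritizeTextures() implementation.

#include <cstddef>
#include <vector>

struct ContentsRequest {
    size_t bytes;
    bool visibleInRootSurface;
};

// Returns the total bytes granted to contents textures under the limit.
size_t prioritizeSketch(const std::vector<ContentsRequest>& requestsSortedByPriority, size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes, size_t memoryLimitBytes)
{
    size_t usedBytes = rootRenderSurfaceMemoryBytes; // the root surface is always drawn
    // Pass 1: visible contents drawing into the root surface come first.
    for (const ContentsRequest& request : requestsSortedByPriority) {
        if (request.visibleInRootSurface && usedBytes + request.bytes <= memoryLimitBytes)
            usedBytes += request.bytes;
    }
    // Child surfaces are charged only after the root surface's visible contents,
    // so surfaces can never push the root's visible contents out.
    if (usedBytes + childRenderSurfacesMemoryBytes > memoryLimitBytes)
        return usedBytes; // child surfaces, and contents drawing into them, are dropped
    usedBytes += childRenderSurfacesMemoryBytes;
    // Pass 2: everything else, still in priority order.
    for (const ContentsRequest& request : requestsSortedByPriority) {
        if (!request.visibleInRootSurface && usedBytes + request.bytes <= memoryLimitBytes)
            usedBytes += request.bytes;
    }
    return usedBytes;
}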
Comment 22 Eric Penner 2012-06-29 11:54:57 PDT
(In reply to comment #21)
> (From update of attachment 150220 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review
> 
> >> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
> >> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
> > 
> > Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
> > 
> > It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
> 
> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.

I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.

As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
Comment 23 Eric Penner 2012-06-29 11:59:35 PDT
(In reply to comment #22)
> (In reply to comment #21)
> > (From update of attachment 150220 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review
> > 
> > >> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
> > >> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
> > > 
> > > Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
> > > 
> > > It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
> > 
> > The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
> 
> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
> 
> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.

Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).
Comment 24 Dana Jansens 2012-06-29 12:03:51 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

>>>>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
>>>>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
>>>> 
>>>> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
>>>> 
>>>> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
>>> 
>>> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
>> 
>> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
>> 
>> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
> 
> Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).

I don't think we can just request them like normal memory until we get rid of requestLate. Once that happens we could though!

If all the root surface textures don't fit in memory, we'll prioritize nothing, and then we'll walk front-to-back calling requestLate, taking all the memory for textures in child surfaces, and nothing will appear again. See the 3 lines in requestLate for the important bit.
Comment 25 Eric Penner 2012-06-29 12:20:23 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

>>>>>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
>>>>>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
>>>>> 
>>>>> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
>>>>> 
>>>>> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
>>>> 
>>>> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
>>> 
>>> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
>>> 
>>> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
>> 
>> Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).
> 
> I don't think we can just request them like normal memory until we get rid of requestLate. Once that happens we could though!
> 
> If all the root surface texture don't fit in memory, we'll prioritize nothing, and then we'll walk front-to-back requestLate and taking all the memory for textures in child surfaces, and nothing will appear again. See the 3 lines in requestLate for the important bit.

Okay, so because we paint from front-to-back, requestLate will allow the other memory instead of the root render surface memory like you want. Is that correct?

We can fix that by only allowing requestLate to succeed when a texture's priority is equal to the cutoff priority (rather than letting any texture succeed). I meant to add that before, actually. Would that help? requestLate is meant to disambiguate rather than provide a totally different priority order (it just didn't matter when "visible" was the only priority we needed to disambiguate).

I would also love to just remove requestLate. It will probably perform very poorly, as it can result in cascading cache misses like the old LRU. Android clearly doesn't need it since we don't even have occlusion... Not sure how to decide if we can remove it.
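
A minimal sketch of the requestLate() restriction proposed here, with hypothetical types (not the real CCPrioritizedTexture API); only the shape of the cutoff check mirrors the discussion.

#include <cstddef>

struct TextureSketch {
    int requestPriority; // smaller value = higher priority in this sketch
    size_t memorySizeBytes;
};

struct TextureManagerSketch {
    int priorityCutoff;
    size_t memoryAvailableBytes;
    size_t memoryAboveCutoffBytes = 0;

    bool requestLate(const TextureSketch& texture)
    {
        // Let requestLate disambiguate at the cutoff only; a texture whose
        // priority is lower than the cutoff is refused regardless of free space.
        if (texture.requestPriority > priorityCutoff)
            return false;
        size_t newMemoryBytes = memoryAboveCutoffBytes + texture.memorySizeBytes;
        if (newMemoryBytes > memoryAvailableBytes)
            return false;
        memoryAboveCutoffBytes = newMemoryBytes;
        return true;
    }
};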
Comment 26 Dana Jansens 2012-06-29 12:23:41 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

>>>>>>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
>>>>>>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
>>>>>> 
>>>>>> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
>>>>>> 
>>>>>> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
>>>>> 
>>>>> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
>>>> 
>>>> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
>>>> 
>>>> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
>>> 
>>> Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).
>> 
>> I don't think we can just request them like normal memory until we get rid of requestLate. Once that happens we could though!
>> 
>> If all the root surface texture don't fit in memory, we'll prioritize nothing, and then we'll walk front-to-back requestLate and taking all the memory for textures in child surfaces, and nothing will appear again. See the 3 lines in requestLate for the important bit.
> 
> Okay, so because we paint from front-to-back, requestLate will allow the other memory instead of the root render surface memory like you want. Is that correct?
> 
> We can fix that by only allowing requestLate to succeed when the textures are equal to the cutoff priority (rather than letting any texture succeed). I mean to add that before actually. Would that help? Request late is meant to disambiguate rather than providing a totally different priority order (it just didn't matter when "visible" was the only priority we needed disambiguate).
> 
> I would also love to just remove requestLate. It's probably will perform very poorly as it can result in cascading cache misses like the old LRU. Android clearly doesn't need it since we don't even have occlusions.. Not sure how to decide if we can remove it.

Oh, hm. Well I think we definitely can't remove it for now, until we have occlusion information in the prioritization step. Or we will just not use available memory for visible textures that we could have.

But yes, making it only allocate things equal to the cutoff...I can't think of any reason why that wouldn't work.
Comment 27 Eric Penner 2012-06-29 12:38:10 PDT
(In reply to comment #26)
> (From update of attachment 150220 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review
> 
> >>>>>>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
> >>>>>>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
> >>>>>> 
> >>>>>> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
> >>>>>> 
> >>>>>> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
> >>>>> 
> >>>>> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
> >>>> 
> >>>> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
> >>>> 
> >>>> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
> >>> 
> >>> Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).
> >> 
> >> I don't think we can just request them like normal memory until we get rid of requestLate. Once that happens we could though!
> >> 
> >> If all the root surface texture don't fit in memory, we'll prioritize nothing, and then we'll walk front-to-back requestLate and taking all the memory for textures in child surfaces, and nothing will appear again. See the 3 lines in requestLate for the important bit.
> > 
> > Okay, so because we paint from front-to-back, requestLate will allow the other memory instead of the root render surface memory like you want. Is that correct?
> > 
> > We can fix that by only allowing requestLate to succeed when the textures are equal to the cutoff priority (rather than letting any texture succeed). I mean to add that before actually. Would that help? Request late is meant to disambiguate rather than providing a totally different priority order (it just didn't matter when "visible" was the only priority we needed disambiguate).
> > 
> > I would also love to just remove requestLate. It's probably will perform very poorly as it can result in cascading cache misses like the old LRU. Android clearly doesn't need it since we don't even have occlusions.. Not sure how to decide if we can remove it.
> 
> Oh, hm. Well I think we definitely can't remove it for now, until we have occlusion information in the prioritization step. Or we will just not use available memory for visible textures that we could have.
> 
> But yes, making it only allocate things equal to the cutoff...I can't think of any reason why that wouldn't work.

Actually, thinking about it more, requestLate might not be that bad after all. The cascading cache misses in the LRU came from attempting recycling inside protectTexture(). Since we defer recycling now, it shouldn't be a problem. Nonetheless, it would be good to get rid of requestLate. If you have some thoughts on how to persist occlusions (for the prioritization step), I'd love to hear them. That would be pretty great!
Comment 28 Dana Jansens 2012-06-29 13:25:18 PDT
Created attachment 150246 [details]
Patch
Comment 29 Dana Jansens 2012-06-29 13:33:05 PDT
Simplified the patch a bit. requestLate() can now only request things == the priority cutoff. This makes for less non-obvious code, yay.

I convinced myself the root surface will always be 0 bytes under ubercomp. (The root surface is always axis-aligned, opaque, etc. The host compositor might use a surface for drawing the quads, but it will choose that and own the memory for it.)

Discussed with Eric some and we decided to go with the custom RenderSurface request approach as it is here. When we make this a more global resource manager, we'll have placeholder requests and surfaces can fit into that model. Added a FIXME to this effect.
Comment 30 Eric Penner 2012-06-29 13:59:57 PDT
Since we are removing the preferred limit (yay!) I'm okay with adding some custom logic for render surfaces for now.

However, I think to remove the preferred limit safely we need to fix this code:

void LayerRendererChromium::decideRenderPassAllocationsForFrame(const CCRenderPassList& renderPassesInDrawOrder)
{
    // FIXME: Get this memory limit from GPU Memory Manager
    size_t contentsMemoryUseBytes = m_contentsTextureAllocator->currentMemoryUseBytes();
    size_t maxLimitBytes = TextureManager::highLimitBytes(viewportSize());
    size_t memoryLimitBytes = maxLimitBytes - contentsMemoryUseBytes > 0u ? maxLimitBytes - contentsMemoryUseBytes : 0u;

    m_implTextureManager->setMaxMemoryLimitBytes(memoryLimitBytes);
}

I added back the preferred limit to be 3/4 of max in the PTM so that this limit will never be zero (it would happily go to zero before, I think!). The PTM is much more likely to make it go to zero, as it will more likely use all available memory up to its limit. Should we just set this limit to the limit we calculated on the main thread * some bonus multiplier?
Comment 31 Eric Penner 2012-06-29 14:04:57 PDT
(In reply to comment #30)
> Since we are removing the preferred limit (yay!) I'm okay with adding some custom logic for render surfaces for now.
> 
> However, I think to remove the preferred limit safely we need to fix this code:
> 
> void LayerRendererChromium::decideRenderPassAllocationsForFrame(const CCRenderPassList& renderPassesInDrawOrder)
> {
>     // FIXME: Get this memory limit from GPU Memory Manager
>     size_t contentsMemoryUseBytes = m_contentsTextureAllocator->currentMemoryUseBytes();
>     size_t maxLimitBytes = TextureManager::highLimitBytes(viewportSize());
>     size_t memoryLimitBytes = maxLimitBytes - contentsMemoryUseBytes > 0u ? maxLimitBytes - contentsMemoryUseBytes : 0u;
> 
>     m_implTextureManager->setMaxMemoryLimitBytes(memoryLimitBytes);
> }
> 
> I added back the preferred limit to be 3/4 of max in the PTM so that this limit will never be zero (it would happily go to zero before I think!). The PTM is much more likely to make it go to zero as it will more likely use all available memory up to it's limit. Should we just set this limit to the limit we calculated on the main thread * some bonus multiplier?

I guess to fit with this CL, we would want to set it to max - "contents in root surface"?
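
As a hedged sketch of that suggestion (the names are hypothetical, and the clamp is only there so the unsigned subtraction can never wrap):

#include <cstddef>

// Impl-side (render pass) texture budget: whatever the overall maximum leaves
// after the contents actually used by the root surface.
size_t implTextureLimitBytes(size_t maxLimitBytes, size_t rootSurfaceContentsBytes)
{
    if (rootSurfaceContentsBytes >= maxLimitBytes)
        return 0;
    return maxLimitBytes - rootSurfaceContentsBytes;
}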
Comment 32 Dana Jansens 2012-06-29 14:08:45 PDT
(In reply to comment #30)
> Since we are removing the preferred limit (yay!) I'm okay with adding some custom logic for render surfaces for now.
> 
> However, I think to remove the preferred limit safely we need to fix this code:
> 
> void LayerRendererChromium::decideRenderPassAllocationsForFrame(const CCRenderPassList& renderPassesInDrawOrder)
> {
>     // FIXME: Get this memory limit from GPU Memory Manager
>     size_t contentsMemoryUseBytes = m_contentsTextureAllocator->currentMemoryUseBytes();
>     size_t maxLimitBytes = TextureManager::highLimitBytes(viewportSize());
>     size_t memoryLimitBytes = maxLimitBytes - contentsMemoryUseBytes > 0u ? maxLimitBytes - contentsMemoryUseBytes : 0u;
> 
>     m_implTextureManager->setMaxMemoryLimitBytes(memoryLimitBytes);
> }
> 
> I added back the preferred limit to be 3/4 of max in the PTM so that this limit will never be zero (it would happily go to zero before I think!). The PTM is much more likely to make it go to zero as it will more likely use all available memory up to it's limit. Should we just set this limit to the limit we calculated on the main thread * some bonus multiplier?

That limit is kinda meaningless atm. We don't respect it at all during drawing, it just sets a limit that we drop down to at the end of the frame. At any rate, this CL as it is should strictly improve the current situation, by letting us keep more surfaces across frames.

I think we want to set this to maxint though basically. (I have another CL in progress to completely remove the implTextureManager mess for RenderSurfaces so it'll take care of that also.)
Comment 33 Eric Penner 2012-06-29 14:39:06 PDT
(In reply to comment #32)
> (In reply to comment #30)
> > Since we are removing the preferred limit (yay!) I'm okay with adding some custom logic for render surfaces for now.
> > 
> > However, I think to remove the preferred limit safely we need to fix this code:
> > 
> > void LayerRendererChromium::decideRenderPassAllocationsForFrame(const CCRenderPassList& renderPassesInDrawOrder)
> > {
> >     // FIXME: Get this memory limit from GPU Memory Manager
> >     size_t contentsMemoryUseBytes = m_contentsTextureAllocator->currentMemoryUseBytes();
> >     size_t maxLimitBytes = TextureManager::highLimitBytes(viewportSize());
> >     size_t memoryLimitBytes = maxLimitBytes - contentsMemoryUseBytes > 0u ? maxLimitBytes - contentsMemoryUseBytes : 0u;
> > 
> >     m_implTextureManager->setMaxMemoryLimitBytes(memoryLimitBytes);
> > }
> > 
> > I added back the preferred limit to be 3/4 of max in the PTM so that this limit will never be zero (it would happily go to zero before I think!). The PTM is much more likely to make it go to zero as it will more likely use all available memory up to it's limit. Should we just set this limit to the limit we calculated on the main thread * some bonus multiplier?
> 
> That limit is kinda meaningless atm. We don't respect it at all during drawing, it just sets a limit that we drop down to at the end of the frame. At any rate, this CL as it is should strictly improve the current situation, by letting us keep more surfaces across frames.
> 
> I think we want to set this to maxint though basically. (I have another CL in progress to completely remove the implTextureManager mess for RenderSurfaces so it'll take care of that also.)

I think zero is a special case where anything that needs a render surface will disappear (but maybe I'm misunderstanding). If it's irrelevant, why don't we just set it to 1000MB until we can fix that code? Or if I'm wrong then just ignore me ;)
Comment 34 WebKit Review Bot 2012-06-29 16:07:16 PDT
Comment on attachment 150246 [details]
Patch

Attachment 150246 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13114520
Comment 35 Dana Jansens 2012-06-29 17:31:28 PDT
Yep, when there's not enough space for surfaces, we also don't prioritize/allocate any textures that draw into surfaces. So then we won't need to allocate the surfaces.
Comment 36 Eric Penner 2012-07-02 13:58:39 PDT
(In reply to comment #35)
> Yep, when there's not enough space for surfaces, we also don't prioritize/allocate any textures that draw into surfaces. So then we won't need to allocate the surfaces.

Okay, so instead of the preferred limit we now have render surface memory, which will also prevent us from passing 0 to implTextureManager. SG
Comment 37 Dana Jansens 2012-07-04 07:52:30 PDT
Created attachment 150794 [details]
Patch

rebase
Comment 38 WebKit Review Bot 2012-07-04 07:58:03 PDT
Comment on attachment 150794 [details]
Patch

Attachment 150794 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13129840
Comment 39 Dana Jansens 2012-07-04 08:21:02 PDT
Created attachment 150801 [details]
Patch

fix uninit errors
Comment 40 Adrienne Walker 2012-07-04 11:20:19 PDT
Comment on attachment 150801 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150801&action=review

Am I right that this current patch will only render the root (and its UI) if it has too many render surfaces to render? That seems like really poor OOM behavior.  I felt a lot more comfortable when this patch was reserving render surfaces incrementally.  I agree with jamesr that RS usage in the wild is probably pretty low on average, but I need a little bit of convincing to feel like we're not going to break some small number of existing sites that use them heavily.

Have you run this patch on sites that generate a lot of render surfaces (or have previously hit texture limits and caused us bugs) to make sure they still render the same with and without this patch that reserves all render surfaces simultaneously? For example, http://www.apple.com/iphone/ and http://www.kraftrecipes.com/recipes/sandwich/index.aspx and http://acko.net/blog/this-is-your-brain-on-css/.

> Source/WebCore/ChangeLog:19
> +        Since surface memory can no longer be taken by contents, we remove the
> +        preferred memory limit from the texture manager.

I love that this can go away.

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:531
> +    for (size_t i = 1; i < updateList.size(); ++i) {

Can you add a comment for why this starts at 1?

> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:162
> +    if (CCPriorityCalculator::priorityIsLower(texture->requestPriority(), m_priorityCutoff))
> +        return false;
> +
>      size_t newMemoryBytes = m_memoryAboveCutoffBytes + texture->memorySizeBytes();
> -    if (newMemoryBytes > m_maxMemoryLimitBytes)
> +    if (newMemoryBytes > m_memoryAvailableBytes)
>          return false;

What's wrong with letting other textures do a requestLate? For example, say you had a scrollable non-root content layer whose prepainted textures didn't make the priority cutoff, but due to occlusion, there was room for prepainting.  Would we not want to give them memory?
Comment 41 Dana Jansens 2012-07-04 11:43:35 PDT
(In reply to comment #40)
> (From update of attachment 150801 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=150801&action=review
> 
> Am I right that this current patch will only render the root (and its UI) if it has too many render surfaces to render? That seems like really poor OOM behavior.  I felt a lot more comfortable when this patch was reserving render surfaces incrementally.  I agree with jamesr that RS usage in the wild is probably pretty low on average, but I need a little bit of convincing to feel like we're not going to break some small number of existing sites that use them heavily.

It will only render the root render surface (which isn't the same as the root layer). In order to reach this situation we have:
a) a very low memory limit
b) contents in the root surface taking up a lot of memory, or
c) ridiculous render surface usage.

I expect that displaying the root surface will be enough to tell that the page is attempting to display, and probably enough to use it.

> Have you run this patch on sites that generate a lot of render surfaces (or have previously hit texture limits and caused us bugs) to make sure they still render the same with and without this patch that reserves all render surfaces simultaneously? For example, http://www.apple.com/iphone/ and http://www.kraftrecipes.com/recipes/sandwich/index.aspx and http://acko.net/blog/this-is-your-brain-on-css/.

The pages all seem to function the same with/without the patch, FWIW. I don't actually know of any website that would use up all the contents budget with surfaces.

The iPhone page uses a couple of surfaces, but doesn't OOM us. The other two don't seem to use any surfaces. They have OOMed us in the past with layers that draw into the root surface. If they did use a render surface, then now all layers drawing into it would disappear first. Previously, the back-most layer would disappear first, and we might allocate all the contents memory to surfaces which were then too large to allocate, and show nothing.

> > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:531
> > +    for (size_t i = 1; i < updateList.size(); ++i) {
> 
> Can you add a comment for why this starts at 1?

Yup! (It's skipping the root surface.)

> > Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:162
> > +    if (CCPriorityCalculator::priorityIsLower(texture->requestPriority(), m_priorityCutoff))
> > +        return false;
> > +
> >      size_t newMemoryBytes = m_memoryAboveCutoffBytes + texture->memorySizeBytes();
> > -    if (newMemoryBytes > m_maxMemoryLimitBytes)
> > +    if (newMemoryBytes > m_memoryAvailableBytes)
> >          return false;
> 
> What's wrong with letting other textures do a requestLate? For example, say you had a scrollable non-root content layer whose prepainted textures didn't make the priority cutoff, but due to occlusion, there was room for prepainting.  Would we not want to give them memory?

Because requestLate() does not happen in priority order. The goal is the following:

If a contents texture is allocated, we also have space for all of its target textures up to the root. If we don't have space for surface textures, then we should not allocate any contents that will draw into non-root surfaces. This way we can allocate all surfaces that have contents drawing into them in LRC, and we don't need to keep around extra state for which surfaces are to be allocated.

If requestLate() happened in priority order, or even in surface dependency order, we could do something with it. But as it is, it happens in front-to-back order (requirement for occlusion). We'd need to do a "request space for surfaces on demand" approach.

I went with this because I feel the on-demand case is just overkill for what we need. It would complicate the texture manager, requiring it to know/use the layer/surface tree structure in some way. This method allows it to know/use a single integer. This is the trade-off as I see it.
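
A small sketch of that invariant, again with hypothetical types rather than the real layer/surface classes: every content texture that is granted memory must have its whole chain of target surfaces, up to the root, reserved as well.

#include <cassert>
#include <map>
#include <set>
#include <vector>

// parentOfSurface maps each surface id to its parent's id (-1 above the root).
void checkSurfaceReservationInvariant(const std::map<int, int>& parentOfSurface, const std::set<int>& reservedSurfaceIds, const std::vector<int>& targetSurfaceOfEachGrantedTexture)
{
    for (int surfaceId : targetSurfaceOfEachGrantedTexture) {
        for (int id = surfaceId; id != -1; id = parentOfSurface.at(id))
            assert(reservedSurfaceIds.count(id) && "every target surface up to the root must be reserved");
    }
}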
Comment 42 Dana Jansens 2012-07-04 11:48:27 PDT
Created attachment 150828 [details]
Patch

- added comment to for loop
Comment 43 Adrienne Walker 2012-07-04 12:46:28 PDT
Comment on attachment 150828 [details]
Patch

Thanks for testing on those pages. They've OOMed us in the past (when we had lower limits) and have anecdotally had a lot of RS usage. If they don't really have much at all, then all the better.

Re: root surface vs. root layer.  Ah, quite right.  I misread that.  Just rendering the root surface is an entirely different thing.  That makes sense that you wouldn't want to reserve anything but the root surface in that case.  I am totally swayed by arguments about simplicity here.

Re: requestLate.  Sure, it doesn't happen in priority order and I buy that for render surface OOM, you wouldn't want anything else.  But for content OOM + occlusion, maybe you would, even if you found out that you could at the wrong time to make an optimal choice? Maybe that's just too much complication.  In the future, I'm hoping impl-side painting will simplify this because we can have SkPicture and occlusion information before we allocate or prioritize textures.

R=me.  Let's see how this goes.
Comment 44 Dana Jansens 2012-07-04 12:48:43 PDT
(In reply to comment #43)
> (From update of attachment 150828 [details])
> Thanks for testing on those pages.  They've OOMed us in the past (when we had lower limits) and have had anecdotally had a lot of RS usage.  If they don't really have much at all, then all the better.
> 
> Re: root surface vs. root layer.  Ah, quite right.  I misread that.  Just rendering the root surface is an entirely different thing.  That makes sense that you wouldn't want to reserve anything but the root surface in that case.  I am totally swayed by arguments about simplicity here.
> 
> Re: requestLate.  Sure, it doesn't happen in priority order and I buy that for render surface OOM, you wouldn't want anything else.  But for content OOM + occlusion, maybe you would, even if you found out that you could at the wrong time to make an optimal choice? Maybe that's just too much complication.  In the future, I'm hoping impl-side painting will simplify this because we can have SkPicture and occlusion information before we allocate or prioritize textures.

Yeah, that. I'm expecting us to improve the prioritization a lot with occlusion, and we can remove requestLate entirely at that point.

> R=me.  Let's see how this goes.

Thanks!
Comment 45 WebKit Review Bot 2012-07-04 13:51:12 PDT
Comment on attachment 150828 [details]
Patch

Clearing flags on attachment: 150828

Committed r121870: <http://trac.webkit.org/changeset/121870>
Comment 46 WebKit Review Bot 2012-07-04 13:51:19 PDT
All reviewed patches have been landed.  Closing bug.