Bug 89901 - [chromium] Respect memory needed for RenderSurfaces when reserving contents textures
Summary: [chromium] Respect memory needed for RenderSurfaces when reserving contents textures
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Importance: P2 Normal
Assignee: Dana Jansens
URL:
Keywords:
Depends on: 84308 89793
Blocks:
Reported: 2012-06-25 13:00 PDT by Dana Jansens
Modified: 2012-07-04 13:51 PDT
CC: 9 users

See Also:


Attachments
Patch (19.65 KB, patch) - 2012-06-25 13:49 PDT, Dana Jansens
Patch (26.36 KB, patch) - 2012-06-25 17:13 PDT, Dana Jansens
Patch (32.60 KB, patch) - 2012-06-25 17:35 PDT, Dana Jansens
Patch (26.39 KB, patch) - 2012-06-26 07:37 PDT, Dana Jansens
Patch (112.97 KB, patch) - 2012-06-27 12:31 PDT, Dana Jansens
Patch (117.39 KB, patch) - 2012-06-29 11:20 PDT, Dana Jansens
Patch (109.33 KB, patch) - 2012-06-29 13:25 PDT, Dana Jansens
Patch (108.13 KB, patch) - 2012-07-04 07:52 PDT, Dana Jansens
Patch (108.02 KB, patch) - 2012-07-04 08:21 PDT, Dana Jansens
Patch (108.11 KB, patch) - 2012-07-04 11:48 PDT, Dana Jansens

Description Dana Jansens 2012-06-25 13:00:14 PDT
[chromium] Respect memory needed for RenderSurfaces when reserving contents textures
Comment 1 Dana Jansens 2012-06-25 13:49:20 PDT
Created attachment 149344 [details]
Patch
Comment 2 Dana Jansens 2012-06-25 13:50:23 PDT
Once we do this, we can allocate RenderSurface textures in LRC freely.
Comment 3 Eric Penner 2012-06-25 15:23:47 PDT
Comment on attachment 149344 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:494
>      // otherwise it will always push us towards the maximum limit.

I think with this change this comment doesn't apply anymore. With this change, we would actually want the contents limit to push towards the maximum limit (after render surfaces have been subtracted). So we could just compute the render surface usage and set the maximum once at the top here and be done with preferred etc.

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:568
> +        reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), requiredBytes);

Rather than calling this function several times inside here, could we just have another function like ::computeRenderSurfaceMemoryBytes() that is called once before any painting even starts?
Comment 4 Eric Penner 2012-06-25 15:34:01 PDT
I personally like this. We should still set a limit on render surfaces (minimally 100% of the total limit, but maybe 50% or 33% would make more sense?). I also like contents using the "remainder" rather than surfaces, since contents will just grow to fill any limit, whereas hopefully render surfaces will be pretty reasonable. If surfaces were using 50% of our total limit, it seems the page would be almost completely dead already just rendering all those surfaces.
Comment 5 James Robinson 2012-06-25 15:35:38 PDT
Can we add some histograms about RS usage in the wild? I'd be surprised if we were hitting or even getting close to the RS limits on most pages
Comment 6 Dana Jansens 2012-06-25 15:51:39 PDT
Comment on attachment 149344 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review

>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:494
>>      // otherwise it will always push us towards the maximum limit.
> 
> I think with this change this comment doesn't apply anymore. With this change, we would actually want the contents limit to push towards the maximum limit (after render surfaces have been subtracted). So we could just compute the render surface usage and set the maximum once at the top here and be done with preferred etc.

Oh, was the preferred limit meant to help with RenderSurfaces? Then yeah, we could remove it now :)

>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:568
>> +        reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), requiredBytes);
> 
> Rather than calling this function several times inside here, could we just have another function like ::computeRenderSurfaceMemoryBytes() that is called once before any painting even starts?

I wanted to only reserve one surface at a time, as its contents are reserved, so that when we OOM we allocate surfaces that will have contents, and don't crowd out contents with surfaces that won't have any allocated contents drawing into them.
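
A minimal sketch of this incremental scheme, using hypothetical names (ReservationBudget, Layer) rather than the real CCLayerTreeHost/TextureManager types: a target surface is charged the first time a layer drawing into it reserves contents, so an OOM never leaves a surface reserved without any allocated contents behind it.

#include <cstddef>
#include <set>

struct Layer {
    int targetSurfaceId; // surface this layer's contents draw into
    size_t contentsBytes; // memory needed for the layer's contents
    size_t targetSurfaceBytes; // memory needed for that target surface
};

struct ReservationBudget {
    size_t limitBytes;
    size_t usedBytes = 0;
    std::set<int> reservedSurfaces;

    bool reserve(size_t bytes)
    {
        if (usedBytes + bytes > limitBytes)
            return false;
        usedBytes += bytes;
        return true;
    }

    // Charge the layer's target surface (once) together with its contents, so a
    // surface is only reserved if at least one layer drawing into it fits.
    bool reserveContentsWithSurface(const Layer& layer)
    {
        bool surfaceAlreadyReserved = reservedSurfaces.count(layer.targetSurfaceId) > 0;
        size_t neededBytes = layer.contentsBytes + (surfaceAlreadyReserved ? 0 : layer.targetSurfaceBytes);
        if (!reserve(neededBytes))
            return false;
        reservedSurfaces.insert(layer.targetSurfaceId);
        return true;
    }
};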
Comment 7 Dana Jansens 2012-06-25 15:53:17 PDT
(In reply to comment #4)
> I personally like this. We should still set a limit on render surfaces (minimally 100% of the total limit, but maybe 50% or 33% would make more sense?). I also like contents using the "remainder" rather than surfaces, since contents will just grow to fill any limit, whereas hopefully render surfaces will be pretty reasonable. If surfaces were using 50% of our total limit, it seems the page would be almost completely dead already just rendering all those surfaces.

Personally I think 100% is fine, especially since this is reserving space for each surface as it reserves contents that draw into it.

(In reply to comment #5)
> Can we add some histograms about RS usage in the wild? I'd be surprised if we were hitting or even getting close to the RS limits on most pages

Yes!
Comment 8 Dana Jansens 2012-06-25 17:13:13 PDT
Created attachment 149399 [details]
Patch
Comment 9 Dana Jansens 2012-06-25 17:14:21 PDT
Added histograms, and stopped using the preferred limit as per epenner's suggestion - instead just using the max limit after subtracting render surfaces.
Comment 10 Dana Jansens 2012-06-25 17:35:29 PDT
Created attachment 149406 [details]
Patch

- Little change to the test to clear the rootLayer pointer, as the others do.
Comment 11 Eric Penner 2012-06-25 17:47:57 PDT
Comment on attachment 149344 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review

>>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:568
>>> +        reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), requiredBytes);
>> 
>> Rather than calling this function several times inside here, could we just have another function like ::computeRenderSurfaceMemoryBytes() that is called once before any painting even starts?
> 
> I wanted to only reserve one at a time, as its contents are reserved, so that when we OOM, we allocate surfaces that will have contents, and don't prevent contents with surfaces that won't have any contents allocated that draw into them.

I see. This is going to be more difficult to support in the model where we request all textures with priorities in advance (rushes to submit patch first ;) ). I'm hoping it will be acceptable to just request all surfaces and subtract that from contents memory. All render surfaces are visible, so we need them to render the page correctly... So this will only really "help" in the ~50%-100% memory usage range, and by "help" I mean render the page "less incorrect".
Comment 12 Adrienne Walker 2012-06-25 18:23:33 PDT
Comment on attachment 149406 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149406&action=review

> Source/WebCore/ChangeLog:80
> +        Here we stop doing this, and create RenderSurfaces on both threads
> +        under the same conditions, so main thread has surfaces only if the
> +        layers in its current frame demand them. While this may reduce
> +        paint culling within an animating subtree, this seems like an edge
> +        case and knowing the amount of surface memory needed for the frame
> +        is important.

How does that work? You're not really creating render surfaces in the same conditions, since on the main thread these animations may not have even started yet.  Would it make more sense to reserve as if the animation had started rather than potentially underestimating?

> Source/WebCore/platform/graphics/chromium/RenderSurfaceChromium.cpp:82
> +    if (!m_backgroundFilters.isEmpty())
> +        bytes += TextureManager::memoryUseBytes(m_contentRect.size(), backgroundTextureFormat());

Is this also just meant to be an estimate, since m_contentRect is not the size of the background texture?

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:579
> +        bool isRootRenderSurface = it.targetRenderSurfaceLayer() != renderSurfaceLayerList[0];

You could just stick the root render surface in allocatedSurfaces...

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:581
> +            reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), occlusionTracker.overdrawMetrics(), target->requiredTextureBytes());

So the idea is to allocate all render surfaces simultaneously and just cache all surfaces?
Comment 13 Eric Penner 2012-06-25 19:05:10 PDT
Comment on attachment 149344 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review

>>>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:568
>>>> +        reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), requiredBytes);
>>> 
>>> Rather than calling this function several times inside here, could we just have another function like ::computeRenderSurfaceMemoryBytes() that is called once before any painting even starts?
>> 
>> I wanted to only reserve one at a time, as its contents are reserved, so that when we OOM, we allocate surfaces that will have contents, and don't prevent contents with surfaces that won't have any contents allocated that draw into them.
> 
> I see. This is going to be more difficult to support in the model where we request all textures with priorities in advance (rushes to submit patch first ;) ). I'm hoping it will be acceptable to just request all surfaces and subtract that from contents memory. All renders surfaces are visible, so we need them to render the page correctly... So this will only really "help" in the ~50%-100% memory usage range, and by "help" I mean render the page "less incorrect".

And just to be clear I'm joking above and I like this; I'm just not sure whether we should add a lot of code to support custom priorities for each surface once that CL lands. Currently we don't need layer ordering.
Comment 14 Dana Jansens 2012-06-25 19:24:19 PDT
Comment on attachment 149406 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=149406&action=review

>> Source/WebCore/ChangeLog:80
>> +        is important.
> 
> How does that work? You're not really creating render surfaces in the same conditions, since on the main thread these animations may not have even started yet.  Would it make more sense to reserve as if the animation had started rather than potentially underestimating?

Oh crap that's supposed to be 89793, I uploaded them together *^_^*.

My thinking is that it seems fine to go over the limit a little here and there. If we assume all animations are surfaces we will greatly overestimate surface memory (most animations are not surfaces), and evict contents. Instead, I favoured going over the limit for a bit. The main thread animates too (at a slower rate) and will realize these animating things are a render surface and adjust, if and when it sees that.

>> Source/WebCore/platform/graphics/chromium/RenderSurfaceChromium.cpp:82
>> +        bytes += TextureManager::memoryUseBytes(m_contentRect.size(), backgroundTextureFormat());
> 
> Is this also just meant to be an estimate, since m_contentRect is not the size of the background texture?

m_contentRect is the size of the background texture, but not the size of the readback texture. The readback texture is <= the root surface so it's accounted for.

>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:579
>> +        bool isRootRenderSurface = it.targetRenderSurfaceLayer() != renderSurfaceLayerList[0];
> 
> You could just stick the root render surface in allocatedSurfaces...

Oh, yeah, that's true. I will do that!

>> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:581
>> +            reserveRenderSurfaceMemoryBytes(*m_contentsTextureManager.get(), occlusionTracker.overdrawMetrics(), target->requiredTextureBytes());
> 
> So the idea is to allocate all render surfaces simultaneously and just cache all surfaces?

Yeah.
Comment 15 Dana Jansens 2012-06-25 20:54:02 PDT
(In reply to comment #13)
> (From update of attachment 149344 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=149344&action=review
> And just to be clear I'm joking above and I like this; I'm just not sure whether we should add a lot of code to support custom priorities for each surface once that CL lands. Currently we don't need layer ordering.

Ah I see. With PTM it seems like it would make sense to just compute the sum total of surfaces at the start. I am just worried about how it might be abused. For any memory limit, I can construct a legitimate webpage that would use arbitrarily close to 100% of the available memory for surfaces, and a small amount for content, without going outside our memory limit. And I think we should be able to display it, since it is within our bounds. So I don't like limiting surfaces to some arbitrary < 100%. Also, if we do that, then which surfaces do we drop? The impl side suddenly has to agree again. Right now it has a very clear cue about which surfaces to draw (they have a non-empty quad list, i.e. at least one contributing texture). We would need to add a flag of some sort to RenderSurface in order to drop/skip the RenderPass when creating quads.

It's a bit awkward though, I mean, if we reserve all the surface memory we could easily prevent any contents from being drawn. Is that okay? I've been assuming it's not.


Currently PTM puts the cutoff so that no textures above the cutoff fail to be allocated. Then it just grabs more-or-less arbitrary textures to fill in the remaining memory during paint.

What if we decided which textures will be drawn during prioritization instead of during paint? I.e. once we run out of memory, we pick an arbitrary set of textures with priority equal to the cutoff that stays below the limit, and mark them as above the cutoff (the others are below). And then during paint we only allow textures strictly marked as above.

Then, what if we did prioritization such that visible textures drawing to the root surface get visible+0.5, visible textures drawing to the next surface get visible+0.4, visible textures drawing to the next get visible+0.3, etc.

Then, when we sort, the visible things will be sorted according to target surface. As we walk down the priority list, at priority visible+0.5 we must reduce available memory by the size of the root render surface. At visible+0.4, we reduce available memory by the next render surface.

If we reach the end of the visible priorities, we'll have reserved space for all the render surfaces. If we do not, we will have reserved space for any surfaces with visible textures that will be available.


I could write this change against the PTM patch, with intent to land it after the PTM patch does.
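
A rough sketch of the banded scheme above; the band spacing, names, and data layout are all hypothetical (this is not the CCPriorityCalculator or CCPrioritizedTextureManager API), but it shows how each surface's own memory gets charged exactly when the walk first reaches its band.

#include <algorithm>
#include <cstddef>
#include <vector>

struct TextureRequest {
    size_t bytes;
    int targetSurfaceDepth; // 0 = root surface, 1 = next surface, ...
};

// Visible priority bonus per target surface: +0.5 for the root, less for deeper
// surfaces, so sorting groups visible textures by their target surface.
float visibleBonusForDepth(int depth)
{
    return 0.5f - 0.1f * depth; // hypothetical spacing
}

// Walk the requests in priority order; charge a surface's own bytes the first
// time its band is reached, then grant contents that still fit under the limit.
size_t grantVisibleMemory(std::vector<TextureRequest> requests, const std::vector<size_t>& surfaceBytesByDepth, size_t memoryLimitBytes)
{
    std::sort(requests.begin(), requests.end(), [](const TextureRequest& a, const TextureRequest& b) {
        return visibleBonusForDepth(a.targetSurfaceDepth) > visibleBonusForDepth(b.targetSurfaceDepth);
    });

    size_t usedBytes = 0;
    int lastChargedDepth = -1;
    for (const TextureRequest& request : requests) {
        while (lastChargedDepth < request.targetSurfaceDepth) {
            ++lastChargedDepth;
            usedBytes += surfaceBytesByDepth[lastChargedDepth]; // assumes one entry per depth
        }
        if (usedBytes > memoryLimitBytes)
            break; // this surface (and anything deeper) no longer fits
        if (usedBytes + request.bytes <= memoryLimitBytes)
            usedBytes += request.bytes;
    }
    return usedBytes;
}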
Comment 16 Dana Jansens 2012-06-26 07:37:31 PDT
Created attachment 149530 [details]
Patch

- Without 89793 included.
Comment 17 Dana Jansens 2012-06-26 08:41:08 PDT
(In reply to comment #15)
> What if we decided which textures would be drawn during prioritization instead of during paint. Ie once we run out of memory, we pick an arbitrary set of textures with priority = the cutoff that stay below the limit, and mark them as above (others are below). And then during paint we only allow textures strictly marked as above.

Right... and occlusion culling throws a wrench in here.
Comment 18 Dana Jansens 2012-06-27 12:31:07 PDT
Created attachment 149783 [details]
Patch

This CL is now rebased onto the PrioritizedTextureManager.
Comment 19 Dana Jansens 2012-06-29 11:20:21 PDT
Created attachment 150220 [details]
Patch
Comment 20 Eric Penner 2012-06-29 11:41:51 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)

Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.

It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
Comment 21 Dana Jansens 2012-06-29 11:45:07 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
> 
> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
> 
> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.

The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
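
For concreteness, a much-simplified two-pass sketch of that ordering; the structure and names here are assumptions, not the actual CCPrioritizedTextureManager::prioritizeTextures() implementation.

#include <cstddef>
#include <vector>

struct ContentsRequest {
    size_t bytes;
    bool visibleInRootSurface;
};

// Returns the total bytes granted to contents textures under the limit.
size_t prioritizeSketch(const std::vector<ContentsRequest>& requestsSortedByPriority, size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes, size_t memoryLimitBytes)
{
    size_t usedBytes = rootRenderSurfaceMemoryBytes; // the root surface is always drawn
    // Pass 1: visible contents drawing into the root surface come first.
    for (const ContentsRequest& request : requestsSortedByPriority) {
        if (request.visibleInRootSurface && usedBytes + request.bytes <= memoryLimitBytes)
            usedBytes += request.bytes;
    }
    // Child surfaces are charged only after the root surface's visible contents,
    // so surfaces can never push the root's visible contents out.
    if (usedBytes + childRenderSurfacesMemoryBytes > memoryLimitBytes)
        return usedBytes; // child surfaces, and contents drawing into them, are dropped
    usedBytes += childRenderSurfacesMemoryBytes;
    // Pass 2: everything else, still in priority order.
    for (const ContentsRequest& request : requestsSortedByPriority) {
        if (!request.visibleInRootSurface && usedBytes + request.bytes <= memoryLimitBytes)
            usedBytes += request.bytes;
    }
    return usedBytes;
}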
Comment 22 Eric Penner 2012-06-29 11:54:57 PDT
(In reply to comment #21)
> (From update of attachment 150220 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review
> 
> >> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
> >> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
> > 
> > Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
> > 
> > It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
> 
> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.

I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.

As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
Comment 23 Eric Penner 2012-06-29 11:59:35 PDT
(In reply to comment #22)
> (In reply to comment #21)
> > (From update of attachment 150220 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review
> > 
> > >> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
> > >> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
> > > 
> > > Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
> > > 
> > > It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
> > 
> > The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
> 
> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
> 
> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.

Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).
Comment 24 Dana Jansens 2012-06-29 12:03:51 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

>>>>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
>>>>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
>>>> 
>>>> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
>>>> 
>>>> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
>>> 
>>> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
>> 
>> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
>> 
>> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
> 
> Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).

I don't think we can just request them like normal memory until we get rid of requestLate. Once that happens we could though!

If all the root surface textures don't fit in memory, we'll prioritize nothing, and then we'll walk front-to-back calling requestLate, taking all the memory for textures in child surfaces, and nothing will appear again. See the 3 lines in requestLate for the important bit.
Comment 25 Eric Penner 2012-06-29 12:20:23 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

>>>>>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
>>>>>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
>>>>> 
>>>>> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
>>>>> 
>>>>> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
>>>> 
>>>> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
>>> 
>>> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
>>> 
>>> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
>> 
>> Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).
> 
> I don't think we can just request them like normal memory until we get rid of requestLate. Once that happens we could though!
> 
> If all the root surface texture don't fit in memory, we'll prioritize nothing, and then we'll walk front-to-back requestLate and taking all the memory for textures in child surfaces, and nothing will appear again. See the 3 lines in requestLate for the important bit.

Okay, so because we paint from front-to-back, requestLate will allow the other memory instead of the root render surface memory like you want. Is that correct?

We can fix that by only allowing requestLate to succeed when a texture's priority is equal to the cutoff priority (rather than letting any texture succeed). I meant to add that before, actually. Would that help? requestLate is meant to disambiguate rather than provide a totally different priority order (it just didn't matter when "visible" was the only priority we needed to disambiguate).

I would also love to just remove requestLate. It will probably perform very poorly, as it can result in cascading cache misses like the old LRU. Android clearly doesn't need it since we don't even have occlusion... Not sure how to decide if we can remove it.
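
A minimal sketch of the requestLate() restriction proposed here, with hypothetical types (not the real CCPrioritizedTexture API); only the shape of the cutoff check mirrors the discussion.

#include <cstddef>

struct TextureSketch {
    int requestPriority; // smaller value = higher priority in this sketch
    size_t memorySizeBytes;
};

struct TextureManagerSketch {
    int priorityCutoff;
    size_t memoryAvailableBytes;
    size_t memoryAboveCutoffBytes = 0;

    bool requestLate(const TextureSketch& texture)
    {
        // Let requestLate disambiguate at the cutoff only; a texture whose
        // priority is lower than the cutoff is refused regardless of free space.
        if (texture.requestPriority > priorityCutoff)
            return false;
        size_t newMemoryBytes = memoryAboveCutoffBytes + texture.memorySizeBytes;
        if (newMemoryBytes > memoryAvailableBytes)
            return false;
        memoryAboveCutoffBytes = newMemoryBytes;
        return true;
    }
};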
Comment 26 Dana Jansens 2012-06-29 12:23:41 PDT
Comment on attachment 150220 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review

>>>>>>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
>>>>>>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
>>>>>> 
>>>>>> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
>>>>>> 
>>>>>> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
>>>>> 
>>>>> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
>>>> 
>>>> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
>>>> 
>>>> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
>>> 
>>> Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).
>> 
>> I don't think we can just request them like normal memory until we get rid of requestLate. Once that happens we could though!
>> 
>> If all the root surface texture don't fit in memory, we'll prioritize nothing, and then we'll walk front-to-back requestLate and taking all the memory for textures in child surfaces, and nothing will appear again. See the 3 lines in requestLate for the important bit.
> 
> Okay, so because we paint from front-to-back, requestLate will allow the other memory instead of the root render surface memory like you want. Is that correct?
> 
> We can fix that by only allowing requestLate to succeed when the textures are equal to the cutoff priority (rather than letting any texture succeed). I mean to add that before actually. Would that help? Request late is meant to disambiguate rather than providing a totally different priority order (it just didn't matter when "visible" was the only priority we needed disambiguate).
> 
> I would also love to just remove requestLate. It's probably will perform very poorly as it can result in cascading cache misses like the old LRU. Android clearly doesn't need it since we don't even have occlusions.. Not sure how to decide if we can remove it.

Oh, hm. Well I think we definitely can't remove it for now, until we have occlusion information in the prioritization step. Or we will just not use available memory for visible textures that we could have.

But yes, making it only allocate things equal to the cutoff...I can't think of any reason why that wouldn't work.
Comment 27 Eric Penner 2012-06-29 12:38:10 PDT
(In reply to comment #26)
> (From update of attachment 150220 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=150220&action=review
> 
> >>>>>>> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:59
> >>>>>>> +void CCPrioritizedTextureManager::prioritizeTextures(size_t rootRenderSurfaceMemoryBytes, size_t childRenderSurfacesMemoryBytes)
> >>>>>> 
> >>>>>> Could we possibly just change the limit instead of passing this custom information in here? Ideally this class could be generic with the exception of the priority calculator. What's wrong with just making render surfaces universally higher priority and just making sure there is some memory left for the root render surface contents.
> >>>>>> 
> >>>>>> It looks like once these are on the same thread we could actually just request the render surfaces like any other memory. Possibly as stop-gap we could add a custom format that never gets allocated. Then we could just add a big texture request for all the other render surfaces, which would get prioritized with everything else.
> >>>>> 
> >>>>> The reason I pass in the render surface limit is it gets applied after the visible textures in the root render surface. This is to handle OOM the way we talked about - visible things in the root render surface will still appear. Otherwise, if we just decrease the limit, and surfaces OOM us, we display nothing at all.
> >>>> 
> >>>> I think I understand the end behavior you want. I thought you were going to just not let render surfaces OOM us by not letting them use all of our memory. Given that this is just to provide some user feedback that the page is OOM, I don't know how optimal this needs to be.
> >>>> 
> >>>> As an alternative, would using a texture format of -1 work? Then we could put in a normal request for the render surface memory rather than a custom bytes value.
> >>> 
> >>> Hmm, since this is already implemented, I'll leave it up to you and Enne to decide what is best. If we keep it this way, just add a FIXME in prioritizeTextures that notes how we could remove the custom behavior (once everything is on the same thread and surfaces can be requested like other textures).
> >> 
> >> I don't think we can just request them like normal memory until we get rid of requestLate. Once that happens we could though!
> >> 
> >> If all the root surface texture don't fit in memory, we'll prioritize nothing, and then we'll walk front-to-back requestLate and taking all the memory for textures in child surfaces, and nothing will appear again. See the 3 lines in requestLate for the important bit.
> > 
> > Okay, so because we paint from front-to-back, requestLate will allow the other memory instead of the root render surface memory like you want. Is that correct?
> > 
> > We can fix that by only allowing requestLate to succeed when the textures are equal to the cutoff priority (rather than letting any texture succeed). I mean to add that before actually. Would that help? Request late is meant to disambiguate rather than providing a totally different priority order (it just didn't matter when "visible" was the only priority we needed disambiguate).
> > 
> > I would also love to just remove requestLate. It's probably will perform very poorly as it can result in cascading cache misses like the old LRU. Android clearly doesn't need it since we don't even have occlusions.. Not sure how to decide if we can remove it.
> 
> Oh, hm. Well I think we definitely can't remove it for now, until we have occlusion information in the prioritization step. Or we will just not use available memory for visible textures that we could have.
> 
> But yes, making it only allocate things equal to the cutoff...I can't think of any reason why that wouldn't work.

Actually, thinking about it more, requestLate might not be that bad after all. The cascading cache misses in the LRU came from attempting recycling inside protectTexture(). Since we defer recycling now, it shouldn't be a problem. Nonetheless, it would be good to get rid of requestLate. If you have some thoughts on how to persist occlusions (for the prioritization step), I'd love to hear them. That would be pretty great!
Comment 28 Dana Jansens 2012-06-29 13:25:18 PDT
Created attachment 150246 [details]
Patch
Comment 29 Dana Jansens 2012-06-29 13:33:05 PDT
Simplified the patch a bit. requestLate() can now only request things == the priority cutoff. This makes for less non-obvious code, yay.

I convinced myself the root surface will always be 0 bytes under ubercomp. (The root surface is always axis-aligned, opaque, etc. The host compositor might use a surface for drawing the quads, but it will choose that and own the memory for it.)

Discussed with Eric some and we decided to go with the custom RenderSurface request approach as it is here. When we make this a more global resource manager, we'll have placeholder requests and surfaces can fit into that model. Added a FIXME to this effect.
Comment 30 Eric Penner 2012-06-29 13:59:57 PDT
Since we are removing the preferred limit (yay!) I'm okay with adding some custom logic for render surfaces for now.

However, I think to remove the preferred limit safely we need to fix this code:

void LayerRendererChromium::decideRenderPassAllocationsForFrame(const CCRenderPassList& renderPassesInDrawOrder)
{
    // FIXME: Get this memory limit from GPU Memory Manager
    size_t contentsMemoryUseBytes = m_contentsTextureAllocator->currentMemoryUseBytes();
    size_t maxLimitBytes = TextureManager::highLimitBytes(viewportSize());
    size_t memoryLimitBytes = maxLimitBytes - contentsMemoryUseBytes > 0u ? maxLimitBytes - contentsMemoryUseBytes : 0u;

    m_implTextureManager->setMaxMemoryLimitBytes(memoryLimitBytes);
}

I added back the preferred limit to be 3/4 of max in the PTM so that this limit will never be zero (it would happily go to zero before, I think!). The PTM is much more likely to make it go to zero, as it will more likely use all available memory up to its limit. Should we just set this limit to the limit we calculated on the main thread * some bonus multiplier?
Comment 31 Eric Penner 2012-06-29 14:04:57 PDT
(In reply to comment #30)
> Since we are removing the preferred limit (yay!) I'm okay with adding some custom logic for render surfaces for now.
> 
> However, I think to remove the preferred limit safely we need to fix this code:
> 
> void LayerRendererChromium::decideRenderPassAllocationsForFrame(const CCRenderPassList& renderPassesInDrawOrder)
> {
>     // FIXME: Get this memory limit from GPU Memory Manager
>     size_t contentsMemoryUseBytes = m_contentsTextureAllocator->currentMemoryUseBytes();
>     size_t maxLimitBytes = TextureManager::highLimitBytes(viewportSize());
>     size_t memoryLimitBytes = maxLimitBytes - contentsMemoryUseBytes > 0u ? maxLimitBytes - contentsMemoryUseBytes : 0u;
> 
>     m_implTextureManager->setMaxMemoryLimitBytes(memoryLimitBytes);
> }
> 
> I added back the preferred limit to be 3/4 of max in the PTM so that this limit will never be zero (it would happily go to zero before I think!). The PTM is much more likely to make it go to zero as it will more likely use all available memory up to it's limit. Should we just set this limit to the limit we calculated on the main thread * some bonus multiplier?

I guess to fit with this CL, we would want to set it to max - "contents in root surface"?
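
As a hedged sketch of that suggestion (the names are hypothetical, and the clamp is only there so the unsigned subtraction can never wrap):

#include <cstddef>

// Impl-side (render pass) texture budget: whatever the overall maximum leaves
// after the contents actually used by the root surface.
size_t implTextureLimitBytes(size_t maxLimitBytes, size_t rootSurfaceContentsBytes)
{
    if (rootSurfaceContentsBytes >= maxLimitBytes)
        return 0;
    return maxLimitBytes - rootSurfaceContentsBytes;
}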
Comment 32 Dana Jansens 2012-06-29 14:08:45 PDT
(In reply to comment #30)
> Since we are removing the preferred limit (yay!) I'm okay with adding some custom logic for render surfaces for now.
> 
> However, I think to remove the preferred limit safely we need to fix this code:
> 
> void LayerRendererChromium::decideRenderPassAllocationsForFrame(const CCRenderPassList& renderPassesInDrawOrder)
> {
>     // FIXME: Get this memory limit from GPU Memory Manager
>     size_t contentsMemoryUseBytes = m_contentsTextureAllocator->currentMemoryUseBytes();
>     size_t maxLimitBytes = TextureManager::highLimitBytes(viewportSize());
>     size_t memoryLimitBytes = maxLimitBytes - contentsMemoryUseBytes > 0u ? maxLimitBytes - contentsMemoryUseBytes : 0u;
> 
>     m_implTextureManager->setMaxMemoryLimitBytes(memoryLimitBytes);
> }
> 
> I added back the preferred limit to be 3/4 of max in the PTM so that this limit will never be zero (it would happily go to zero before I think!). The PTM is much more likely to make it go to zero as it will more likely use all available memory up to it's limit. Should we just set this limit to the limit we calculated on the main thread * some bonus multiplier?

That limit is kinda meaningless atm. We don't respect it at all during drawing, it just sets a limit that we drop down to at the end of the frame. At any rate, this CL as it is should strictly improve the current situation, by letting us keep more surfaces across frames.

I think we want to set this to maxint though basically. (I have another CL in progress to completely remove the implTextureManager mess for RenderSurfaces so it'll take care of that also.)
Comment 33 Eric Penner 2012-06-29 14:39:06 PDT
(In reply to comment #32)
> (In reply to comment #30)
> > Since we are removing the preferred limit (yay!) I'm okay with adding some custom logic for render surfaces for now.
> > 
> > However, I think to remove the preferred limit safely we need to fix this code:
> > 
> > void LayerRendererChromium::decideRenderPassAllocationsForFrame(const CCRenderPassList& renderPassesInDrawOrder)
> > {
> >     // FIXME: Get this memory limit from GPU Memory Manager
> >     size_t contentsMemoryUseBytes = m_contentsTextureAllocator->currentMemoryUseBytes();
> >     size_t maxLimitBytes = TextureManager::highLimitBytes(viewportSize());
> >     size_t memoryLimitBytes = maxLimitBytes - contentsMemoryUseBytes > 0u ? maxLimitBytes - contentsMemoryUseBytes : 0u;
> > 
> >     m_implTextureManager->setMaxMemoryLimitBytes(memoryLimitBytes);
> > }
> > 
> > I added back the preferred limit to be 3/4 of max in the PTM so that this limit will never be zero (it would happily go to zero before I think!). The PTM is much more likely to make it go to zero as it will more likely use all available memory up to it's limit. Should we just set this limit to the limit we calculated on the main thread * some bonus multiplier?
> 
> That limit is kinda meaningless atm. We don't respect it at all during drawing, it just sets a limit that we drop down to at the end of the frame. At any rate, this CL as it is should strictly improve the current situation, by letting us keep more surfaces across frames.
> 
> I think we want to set this to maxint though basically. (I have another CL in progress to completely remove the implTextureManager mess for RenderSurfaces so it'll take care of that also.)

I think zero is a special case where anything that needs a render surface will disappear (but maybe I'm misunderstanding). If it's irrelevant, why don't we just set it to 1000MB until we can fix that code? Or if I'm wrong then just ignore me ;)
Comment 34 WebKit Review Bot 2012-06-29 16:07:16 PDT
Comment on attachment 150246 [details]
Patch

Attachment 150246 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13114520
Comment 35 Dana Jansens 2012-06-29 17:31:28 PDT
Yep, when there's not enough space for surfaces, we also don't prioritize/allocate any textures that draw into surfaces. So then we won't need to allocate the surfaces.
Comment 36 Eric Penner 2012-07-02 13:58:39 PDT
(In reply to comment #35)
> Yep, when there's not enough space for surfaces, we also don't prioritize/allocate any textures that draw into surfaces. So then we won't need to allocate the surfaces.

Okay, so instead of the preferred limit we now have render surface memory, which will also prevent us from passing 0 to implTextureManager. SG
Comment 37 Dana Jansens 2012-07-04 07:52:30 PDT
Created attachment 150794 [details]
Patch

rebase
Comment 38 WebKit Review Bot 2012-07-04 07:58:03 PDT
Comment on attachment 150794 [details]
Patch

Attachment 150794 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13129840
Comment 39 Dana Jansens 2012-07-04 08:21:02 PDT
Created attachment 150801 [details]
Patch

fix uninit errors
Comment 40 Adrienne Walker 2012-07-04 11:20:19 PDT
Comment on attachment 150801 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=150801&action=review

Am I right that this current patch will only render the root (and its UI) if it has too many render surfaces to render? That seems like really poor OOM behavior.  I felt a lot more comfortable when this patch was reserving render surfaces incrementally.  I agree with jamesr that RS usage in the wild is probably pretty low on average, but I need a little bit of convincing to feel like we're not going to break some small number of existing sites that use them heavily.

Have you run this patch on sites that generate a lot of render surfaces (or have previously hit texture limits and caused us bugs) to make sure they still render the same with and without this patch that reserves all render surfaces simultaneously? For example, http://www.apple.com/iphone/ and http://www.kraftrecipes.com/recipes/sandwich/index.aspx and http://acko.net/blog/this-is-your-brain-on-css/.

> Source/WebCore/ChangeLog:19
> +        Since surface memory can no longer be taken by contents, we remove the
> +        preferred memory limit from the texture manager.

I love that this can go away.

> Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:531
> +    for (size_t i = 1; i < updateList.size(); ++i) {

Can you add a comment for why this starts at 1?

> Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:162
> +    if (CCPriorityCalculator::priorityIsLower(texture->requestPriority(), m_priorityCutoff))
> +        return false;
> +
>      size_t newMemoryBytes = m_memoryAboveCutoffBytes + texture->memorySizeBytes();
> -    if (newMemoryBytes > m_maxMemoryLimitBytes)
> +    if (newMemoryBytes > m_memoryAvailableBytes)
>          return false;

What's wrong with letting other textures do a requestLate? For example, say you had a scrollable non-root content layer whose prepainted textures didn't make the priority cutoff, but due to occlusion, there was room for prepainting.  Would we not want to give them memory?
Comment 41 Dana Jansens 2012-07-04 11:43:35 PDT
(In reply to comment #40)
> (From update of attachment 150801 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=150801&action=review
> 
> Am I right that this current patch will only render the root (and its UI) if it has too many render surfaces to render? That seems like really poor OOM behavior.  I felt a lot more comfortable when this patch was reserving render surfaces incrementally.  I agree with jamesr that RS usage in the wild is probably pretty low on average, but I need a little bit of convincing to feel like we're not going to break some small number of existing sites that use them heavily.

It will only render the root render surface (which isn't the same as the root layer). In order to reach this situation we have:
a) a very low memory limit
b) contents in the root surface taking up a lot of memory, or
c) ridiculous render surface usage.

I expect that displaying the root surface will be enough to tell that the page is attempting to display, and probably enough to use it.

> Have you run this patch on sites that generate a lot of render surfaces (or have previously hit texture limits and caused us bugs) to make sure they still render the same with and without this patch that reserves all render surfaces simultaneously? For example, http://www.apple.com/iphone/ and http://www.kraftrecipes.com/recipes/sandwich/index.aspx and http://acko.net/blog/this-is-your-brain-on-css/.

The pages all seem to function the same with/without the patch, FWIW. I don't actually know of any website that would use up all the contents budget with surfaces.

The iPhone page uses a couple of surfaces, but doesn't OOM us. The other two don't seem to use any surfaces. They have OOMed us in the past with layers that draw into the root surface. If they did use a render surface, then now all layers drawing into it would disappear first. Previously, the back-most layer would disappear first, and we might allocate all the contents memory to surfaces which were then too large to allocate, and show nothing.

> > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHost.cpp:531
> > +    for (size_t i = 1; i < updateList.size(); ++i) {
> 
> Can you add a comment for why this starts at 1?

Yup! (It's skipping the root surface.)

> > Source/WebCore/platform/graphics/chromium/cc/CCPrioritizedTextureManager.cpp:162
> > +    if (CCPriorityCalculator::priorityIsLower(texture->requestPriority(), m_priorityCutoff))
> > +        return false;
> > +
> >      size_t newMemoryBytes = m_memoryAboveCutoffBytes + texture->memorySizeBytes();
> > -    if (newMemoryBytes > m_maxMemoryLimitBytes)
> > +    if (newMemoryBytes > m_memoryAvailableBytes)
> >          return false;
> 
> What's wrong with letting other textures do a requestLate? For example, say you had a scrollable non-root content layer whose prepainted textures didn't make the priority cutoff, but due to occlusion, there was room for prepainting.  Would we not want to give them memory?

Because requestLate() does not happen in priority order. The goal is the following:

If a contents texture is allocated, we also have space for all of its target textures up to the root. If we don't have space for surface textures, then we should not allocate any contents that will draw into non-root surfaces. This way we can allocate all surfaces that have contents drawing into them in LRC, and we don't need to keep around extra state for which surfaces are to be allocated.

If requestLate() happened in priority order, or even in surface dependency order, we could do something with it. But as it is, it happens in front-to-back order (requirement for occlusion). We'd need to do a "request space for surfaces on demand" approach.

I went with this because I feel the on-demand case is just overkill for what we need. It would complicate the texture manager, requiring it to know/use the layer/surface tree structure in some way. This method allows it to know/use a single integer. This is the trade-off as I see it.
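
A small sketch of that invariant, again with hypothetical types rather than the real layer/surface classes: every content texture that is granted memory must have its whole chain of target surfaces, up to the root, reserved as well.

#include <cassert>
#include <map>
#include <set>
#include <vector>

// parentOfSurface maps each surface id to its parent's id (-1 above the root).
void checkSurfaceReservationInvariant(const std::map<int, int>& parentOfSurface, const std::set<int>& reservedSurfaceIds, const std::vector<int>& targetSurfaceOfEachGrantedTexture)
{
    for (int surfaceId : targetSurfaceOfEachGrantedTexture) {
        for (int id = surfaceId; id != -1; id = parentOfSurface.at(id))
            assert(reservedSurfaceIds.count(id) && "every target surface up to the root must be reserved");
    }
}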
Comment 42 Dana Jansens 2012-07-04 11:48:27 PDT
Created attachment 150828 [details]
Patch

- added comment to for loop
Comment 43 Adrienne Walker 2012-07-04 12:46:28 PDT
Comment on attachment 150828 [details]
Patch

Thanks for testing on those pages. They've OOMed us in the past (when we had lower limits) and have anecdotally had a lot of RS usage. If they don't really have much at all, then all the better.

Re: root surface vs. root layer.  Ah, quite right.  I misread that.  Just rendering the root surface is an entirely different thing.  That makes sense that you wouldn't want to reserve anything but the root surface in that case.  I am totally swayed by arguments about simplicity here.

Re: requestLate.  Sure, it doesn't happen in priority order and I buy that for render surface OOM, you wouldn't want anything else.  But for content OOM + occlusion, maybe you would, even if you found out that you could at the wrong time to make an optimal choice? Maybe that's just too much complication.  In the future, I'm hoping impl-side painting will simplify this because we can have SkPicture and occlusion information before we allocate or prioritize textures.

R=me.  Let's see how this goes.
Comment 44 Dana Jansens 2012-07-04 12:48:43 PDT
(In reply to comment #43)
> (From update of attachment 150828 [details])
> Thanks for testing on those pages.  They've OOMed us in the past (when we had lower limits) and have had anecdotally had a lot of RS usage.  If they don't really have much at all, then all the better.
> 
> Re: root surface vs. root layer.  Ah, quite right.  I misread that.  Just rendering the root surface is an entirely different thing.  That makes sense that you wouldn't want to reserve anything but the root surface in that case.  I am totally swayed by arguments about simplicity here.
> 
> Re: requestLate.  Sure, it doesn't happen in priority order and I buy that for render surface OOM, you wouldn't want anything else.  But for content OOM + occlusion, maybe you would, even if you found out that you could at the wrong time to make an optimal choice? Maybe that's just too much complication.  In the future, I'm hoping impl-side painting will simplify this because we can have SkPicture and occlusion information before we allocate or prioritize textures.

Yeah, that. I'm expecting us to improve the prioritization a lot with occlusion, and we can remove requestLate entirely at that point.

> R=me.  Let's see how this goes.

Thanks!
Comment 45 WebKit Review Bot 2012-07-04 13:51:12 PDT
Comment on attachment 150828 [details]
Patch

Clearing flags on attachment: 150828

Committed r121870: <http://trac.webkit.org/changeset/121870>
Comment 46 WebKit Review Bot 2012-07-04 13:51:19 PDT
All reviewed patches have been landed.  Closing bug.