70533 – [Chromium] Cull occluded tiles in tiled layers

Dana Jansens

Reported 2011-10-20 12:04:13 PDT

Avoids drawing of fully occluded tiles in CCTiledLayerImpl. Simple first proof of concept - only considers tiles completely occluded by any single child. This can be expanded to consider the union of children and siblings. Currently does not work well as opaque() is true for layers even when it should not be. That is something that can be addressed in parallel. I'd like this to be reviewed primarily for the direction it is going. Is this how we'd like to avoid overdraw within a layer, is it in the right place?

Dana Jansens

Comment 1 2011-10-20 12:12:46 PDT

Created attachment 111820 [details] Patch

Dana Jansens

Comment 2 2011-10-20 12:19:50 PDT

http://code.google.com/p/chromium/issues/detail?id=99522

Antoine Labour

Comment 3 2011-10-20 12:23:50 PDT

(In reply to comment #0) > Avoids drawing of fully occluded tiles in CCTiledLayerImpl. > > Simple first proof of concept - only considers tiles completely occluded by any single child. This can be expanded to consider the union of children and siblings. > > Currently does not work well as opaque() is true for layers even when it should not be. That is something that can be addressed in parallel. The default for LayerChromium::m_opaque is wrong (different from GraphicsLayerChromium's default of false, but GrahpicsLayerChromium doesn't set it on the LayerChromium until it changes - and it never does). https://bugs.webkit.org/show_bug.cgi?id=70085 fixes it. But it's a trivial change that you can include in this patch. > > I'd like this to be reviewed primarily for the direction it is going. Is this how we'd like to avoid overdraw within a layer, is it in the right place?

Antoine Labour

Comment 4 2011-10-20 12:47:57 PDT

On the overall approach, I'll let James/Nat/Enne/Shawn pitch in. I could imagine an approach that is less "stateful" where the logic goes to the LayerRendererChromium, keeping a list of occluders in drawLayersInternal, that we'd pass to drawLayer, and then possibly to CCLayerImpl::draw. The advantage is that it wouldn't leak to LayerChromium, you would have access to a flattened list of children, and I think it would be closer to where you'd need to implement a fuller approach (e.g. not limited to tiled content layers, possibly scissoring the tiles to eliminate more pixels). It may also integrate well with Shawn's scissoring patch. That being said, there's possibly issues I'm not seeing with that approach, so take it with a grain of salt. Also, you'll probably won't want to consider as an occluder the layers that have a transform that is more than a x+y translation and scale, since their enclosing rect is a strict superset of the layer itself.

Adrienne Walker

Comment 5 2011-10-20 13:21:50 PDT

Comment on attachment 111820 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=111820&action=review > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHostCommon.cpp:300 > + if (child->opaque() && child->opacity() >= 0.9999 && childRenderSurface->drawOpacity() >= 0.9999) > + layer->addOccludedArea(enclosingIntRect(childRenderSurface->drawableContentRect())); > } else { > IntRect drawableContentRect = layer->drawableContentRect(); > drawableContentRect.unite(child->drawableContentRect()); > layer->setDrawableContentRect(drawableContentRect); > + > + if (child->opaque() && child->opacity() >= 0.9999 && child->drawOpacity() >= 0.9999) I am not sure about this logic. Am I reading this code correctly that only a layer's direct child can occlude that layer, but not any other layers? Shouldn't it be the case that a layer is a potential occluder for all layers that draw before it so long as all of those layers are going into the same render surface? So, later siblings can occlude earlier siblings and children can occlude grandparents?

Dana Jansens

Comment 6 2011-10-20 13:27:22 PDT

(In reply to comment #5) > (From update of attachment 111820 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=111820&action=review > > > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHostCommon.cpp:300 > > + if (child->opaque() && child->opacity() >= 0.9999 && childRenderSurface->drawOpacity() >= 0.9999) > > + layer->addOccludedArea(enclosingIntRect(childRenderSurface->drawableContentRect())); > > } else { > > IntRect drawableContentRect = layer->drawableContentRect(); > > drawableContentRect.unite(child->drawableContentRect()); > > layer->setDrawableContentRect(drawableContentRect); > > + > > + if (child->opaque() && child->opacity() >= 0.9999 && child->drawOpacity() >= 0.9999) > > I am not sure about this logic. Am I reading this code correctly that only a layer's direct child can occlude that layer, but not any other layers? Shouldn't it be the case that a layer is a potential occluder for all layers that draw before it so long as all of those layers are going into the same render surface? So, later siblings can occlude earlier siblings and children can occlude grandparents? Yes you're right. It's a todo to pass occlusion up to parents if the current layer is non-opaque. In the opaque case the parent will already occlude all its children. Siblings can occlude too and is a todo also. I'm hoping we can just assume we're considering all the correct layers for occlusion and discuss other aspects of the code for this patch. But your comments are right on point.

Dana Jansens

Comment 7 2011-10-20 14:29:42 PDT

Created attachment 111850 [details] Updated to ignore non-axis-aligned transformations

Adrienne Walker

Comment 8 2011-10-20 14:45:21 PDT

(In reply to comment #6) > In the opaque case the parent will already occlude all its children. I don't believe this is true. The layer tree is not a bounding box hierarchy. An opaque parent may or may not occlude its children. > I'm hoping we can just assume we're considering all the correct layers for occlusion and discuss other aspects of the code for this patch. There aren't false positives here, but given the number of empty layers that get created in WebKit, I'd really worry about all the potential occluders (mostly child -> grand*parent relationships) that we're missing. Maybe I'm overstating this worry? I'm interested in Antoine's previous comment about where this might need to go in the future to support those TODOs. If you constructed a flattened list of layers that get drawn into a render surface, the set of potential occluders for a layer in that list is all the (opaque) layers that come after it in the list. If you didn't have such an ordered list, it seems like it'd be O(n^2) to set all the potential occluders directly on each layer.

Nat Duca

Comment 9 2011-10-21 08:48:21 PDT

Comment on attachment 111820 [details] Patch Dana, do you have a sense of what the stateless approach would be like, both in terms of complexity-to-implement as well as complexity to maintain? I definitely like just folding in "visibility culling" as a postprocess --- there are maintainability wins and also testing wins to be had from such an approach. If it helps, we have little indication that the time spent in the compositor drawing is a bottleneck. That is to say, an inefficient implementation thats maintainable is vastly preferred to the fastest version.

Dana Jansens

Comment 10 2011-10-21 08:51:05 PDT

(In reply to comment #9) > (From update of attachment 111820 [details]) > Dana, do you have a sense of what the stateless approach would be like, both in terms of complexity-to-implement as well as complexity to maintain? Will get back with any questions about it. Haven't dug into what it all means yet as I've been looking at setting our friend the opaque flag.

W. James MacLean

Comment 11 2011-10-28 08:57:59 PDT

Created attachment 112872 [details] Patch

W. James MacLean

Comment 12 2011-10-28 09:00:32 PDT

This is a first-attempt at addressing Antoine's desire to make the culling process less stateful. It does also attempt to consider all subsequent layers in the draw order when culling a tile, but does not attempt to draw partial tiles (although the code is designed to allow this). Missing: the ability to actually draw a partial tile (i.e. a partial texture): looking for feedback on this.

Adrienne Walker

Comment 13 2011-10-28 09:33:45 PDT

Comment on attachment 112872 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=112872&action=review I'm liking this approach. :) My biggest sadness is that the complexity of drawTiles is getting a little bit out of hand. If there's some way to encapsulate the culling more into different functions or otherwise break up that monolithic function, I think that'd be excellent. > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:200 > +static bool isAxisAligned(const TransformationMatrix& m) Out of curiosity, are you only considering axis-aligned rects so that you can use IntRect::contains rather than some more complicated occlusion test? > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:310 > + if (subsequentLayer->opacity() >= 0.9999 > + && subsequentLayer->drawOpacity() >= 0.9999 Can these checks get wrapped into some sort of drawsOpaque function? I also think you're going to need to check opaque() on the layer itself here. > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:322 > + continue; // WJM - verify this jumps out of the for loop with no ill effect ... I think this is wrong. This will not set prevEdgeX (see bottom of the loop). See also: this function is too complex for its own good. > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:405 > + // FIXME: this only works at present because we know tileRectsVisible contains at most > + // one full-sized tile. When we need to process a list of sub-rects, then we need > + // a way to draw a partial tile. Not sure if this is best done through a new > + // shader (since the draw geometry is fixed) or what. I guess you mention this, but I'm a little confused why you have this loop inside of the loop. It seems to me that you either need to collect up {tilerect, quad} or maybe {tilerect, i, j} pairs (so that you could defer all the quad calculations only for visible tiles) and then do all the occlusion culling and drawing in a later loop. Alternatively, you need to just occlusion cull one tile at a time inside of this loop right here and not have a vector of visible tile rects. Also, the draw geometry isn't fixed anymore for tiles. The "tileRect" variable ends up storing the surface-space rectangle to draw from. If you modify that rect to be smaller, it will only draw a partial tile.

W. James MacLean

Comment 14 2011-10-28 09:52:36 PDT

(In reply to comment #13) > (From update of attachment 112872 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=112872&action=review > > I'm liking this approach. :) > > My biggest sadness is that the complexity of drawTiles is getting a little bit out of hand. If there's some way to encapsulate the culling more into different functions or otherwise break up that monolithic function, I think that'd be excellent. I agree ... this is here (in monolithic form) mostly as 'proof of concept', but I'm happy to factor it out to simplify drawTiles(). > > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:200 > > +static bool isAxisAligned(const TransformationMatrix& m) > > Out of curiosity, are you only considering axis-aligned rects so that you can use IntRect::contains rather than some more complicated occlusion test? Basically, yes. We're going for the low-hanging fruit first, although in subsequent iterations can consider more sophisticated overlaps. > > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:310 > > + if (subsequentLayer->opacity() >= 0.9999 > > + && subsequentLayer->drawOpacity() >= 0.9999 > > Can these checks get wrapped into some sort of drawsOpaque function? I also think you're going to need to check opaque() on the layer itself here. Yes, and yes (I thought I had checked opaqueness for this layer, but turns out I just check axis-aligned-ness ...). > > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:322 > > + continue; // WJM - verify this jumps out of the for loop with no ill effect ... > > I think this is wrong. This will not set prevEdgeX (see bottom of the loop). See also: this function is too complex for its own good. I suspected that might be a problem, but wasn't entirely sure so far. Will remove. > > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:405 > > + // FIXME: this only works at present because we know tileRectsVisible contains at most > > + // one full-sized tile. When we need to process a list of sub-rects, then we need > > + // a way to draw a partial tile. Not sure if this is best done through a new > > + // shader (since the draw geometry is fixed) or what. > > I guess you mention this, but I'm a little confused why you have this loop inside of the loop. It seems to me that you either need to collect up {tilerect, quad} or maybe {tilerect, i, j} pairs (so that you could defer all the quad calculations only for visible tiles) and then do all the occlusion culling and drawing in a later loop. Alternatively, you need to just occlusion cull one tile at a time inside of this loop right here and not have a vector of visible tile rects. I had imagined that occlusion-testing will return a list of sub-rects within the current tile, in case (for example) the visible portion of the tile was L-shaped, or split down the middle or some such thing. Not worth considering too many special cases though, as beyond rect or L-shaped it likely becomes more effort than it's worth. > Also, the draw geometry isn't fixed anymore for tiles. The "tileRect" variable ends up storing the surface-space rectangle to draw from. If you modify that rect to be smaller, it will only draw a partial tile. Cool! I'll likely have more questions about this, but will first go back an re-read the relevant code to understand it better.

Adrienne Walker

Comment 15 2011-10-28 10:19:10 PDT

Comment on attachment 112872 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=112872&action=review >>> Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:405 >>> + // shader (since the draw geometry is fixed) or what. >> >> I guess you mention this, but I'm a little confused why you have this loop inside of the loop. It seems to me that you either need to collect up {tilerect, quad} or maybe {tilerect, i, j} pairs (so that you could defer all the quad calculations only for visible tiles) and then do all the occlusion culling and drawing in a later loop. Alternatively, you need to just occlusion cull one tile at a time inside of this loop right here and not have a vector of visible tile rects. >> >> Also, the draw geometry isn't fixed anymore for tiles. The "tileRect" variable ends up storing the surface-space rectangle to draw from. If you modify that rect to be smaller, it will only draw a partial tile. > > I had imagined that occlusion-testing will return a list of sub-rects within the current tile, in case (for example) the visible portion of the tile was L-shaped, or split down the middle or some such thing. Not worth considering too many special cases though, as beyond rect or L-shaped it likely becomes more effort than it's worth. Oh, I see where you're going. I have mixed feelings about whether that'd be useful. I know on some platforms, John Bates has already talked about draw call overhead being expensive and wanting to make tile sizes bigger to help with that. I guess I'm trying to say that this won't be a win in all cases.

W. James MacLean

Comment 16 2011-10-28 10:41:11 PDT

(In reply to comment #15) > (From update of attachment 112872 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=112872&action=review > > >>> Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:405 > >>> + // shader (since the draw geometry is fixed) or what. > >> > >> I guess you mention this, but I'm a little confused why you have this loop inside of the loop. It seems to me that you either need to collect up {tilerect, quad} or maybe {tilerect, i, j} pairs (so that you could defer all the quad calculations only for visible tiles) and then do all the occlusion culling and drawing in a later loop. Alternatively, you need to just occlusion cull one tile at a time inside of this loop right here and not have a vector of visible tile rects. > >> > >> Also, the draw geometry isn't fixed anymore for tiles. The "tileRect" variable ends up storing the surface-space rectangle to draw from. If you modify that rect to be smaller, it will only draw a partial tile. > > > > I had imagined that occlusion-testing will return a list of sub-rects within the current tile, in case (for example) the visible portion of the tile was L-shaped, or split down the middle or some such thing. Not worth considering too many special cases though, as beyond rect or L-shaped it likely becomes more effort than it's worth. > > Oh, I see where you're going. > > I have mixed feelings about whether that'd be useful. I know on some platforms, John Bates has already talked about draw call overhead being expensive and wanting to make tile sizes bigger to help with that. I guess I'm trying to say that this won't be a win in all cases. That had occurred to us too ... If it's not too difficult to setup we can try it, and if it's not a win abandon it. That being said, I wouldn't imagine going below two subrects for a tile, although if we can draw *one* smaller rect there may be some benefit.

Nat Duca

Comment 17 2011-10-28 10:52:54 PDT

(In reply to comment #15) Maybe we should do some whiteboarding on this topic next week. Personally, I think we should start with an architecturally clean approach to this, even if it may cost us a little bit in performance. DrawLayers et all is a pasta-monster already --- scissoring, culling are just making it worse. What do people think of the following flow *per renderlayer*: class TileDrawCommand { transform; textureId // material; } vector<TileDrawCommand> tilesToDraw(walkTreeAndComputeTilesToDraw()); if (settings().cullingEnabled) { vector<TileDrawCommand> culledTiles = cullTiles(tilesToDraw); drawTiles(culledTiles); } else { drawTiles(culledTIles); } If that makes sense, would someone on this list be willing to sign up for step 0, which is refactoring the code into a design like above? In parallel, we can start writing a cullTiles that takes a list of TileDrawCommands in culls it. But, for such an algorithm, I'd really like to see it come with unit tests, quality tests [cull ratios, etc] and come with high level explanation of the genaeology of its algorithm -- e.g. rasterization, KD-trees, BSP, etc.

W. James MacLean

Comment 18 2011-10-28 10:59:10 PDT

(In reply to comment #17) > (In reply to comment #15) > > Maybe we should do some whiteboarding on this topic next week. > > Personally, I think we should start with an architecturally clean approach to this, even if it may cost us a little bit in performance. DrawLayers et all is a pasta-monster already --- scissoring, culling are just making it worse. > > What do people think of the following flow *per renderlayer*: > class TileDrawCommand { > transform; > textureId // material; > } > vector<TileDrawCommand> tilesToDraw(walkTreeAndComputeTilesToDraw()); > if (settings().cullingEnabled) { > vector<TileDrawCommand> culledTiles = cullTiles(tilesToDraw); > drawTiles(culledTiles); > } else { > drawTiles(culledTIles); > } > > If that makes sense, would someone on this list be willing to sign up for step 0, which is refactoring the code into a design like above? > > In parallel, we can start writing a cullTiles that takes a list of TileDrawCommands in culls it. But, for such an algorithm, I'd really like to see it come with unit tests, quality tests [cull ratios, etc] and come with high level explanation of the genaeology of its algorithm -- e.g. rasterization, KD-trees, BSP, etc. Yes, I very much like the sounds of this. And, as you say, it has the advantage of simplifying the code structure, which will be a great win for both maintainability as well as implementation. +1

Daniel Bates

Comment 19 2011-10-28 13:21:49 PDT

Comment on attachment 112872 [details] Patch Attachment 112872 [details] did not pass mac-ews (mac): Output: http://queues.webkit.org/results/10237319

Dana Jansens

Comment 20 2011-10-31 06:47:15 PDT

(In reply to comment #17) > What do people think of the following flow *per renderlayer*: > class TileDrawCommand { > transform; > textureId // material; > } > vector<TileDrawCommand> tilesToDraw(walkTreeAndComputeTilesToDraw()); > if (settings().cullingEnabled) { > vector<TileDrawCommand> culledTiles = cullTiles(tilesToDraw); > drawTiles(culledTiles); > } else { > drawTiles(culledTIles); > } > > If that makes sense, would someone on this list be willing to sign up for step 0, which is refactoring the code into a design like above? I like this also. If I finish layout tests or otherwise block on the opaque flag stuff before James gets to this I will take it on. > In parallel, we can start writing a cullTiles that takes a list of TileDrawCommands in culls it. But, for such an algorithm, I'd really like to see it come with unit tests, quality tests [cull ratios, etc] and come with high level explanation of the genaeology of its algorithm -- e.g. rasterization, KD-trees, BSP, etc. backer@ and I came up with a plan for this also, it's much simpler than KD-trees etc.

W. James MacLean

Comment 21 2011-10-31 06:59:42 PDT

(In reply to comment #20) > (In reply to comment #17) > > What do people think of the following flow *per renderlayer*: > > class TileDrawCommand { > > transform; > > textureId // material; > > } > > vector<TileDrawCommand> tilesToDraw(walkTreeAndComputeTilesToDraw()); > > if (settings().cullingEnabled) { > > vector<TileDrawCommand> culledTiles = cullTiles(tilesToDraw); > > drawTiles(culledTiles); > > } else { > > drawTiles(culledTIles); > > } > > > > If that makes sense, would someone on this list be willing to sign up for step 0, which is refactoring the code into a design like above? > > I like this also. If I finish layout tests or otherwise block on the opaque flag stuff before James gets to this I will take it on. > > > In parallel, we can start writing a cullTiles that takes a list of TileDrawCommands in culls it. But, for such an algorithm, I'd really like to see it come with unit tests, quality tests [cull ratios, etc] and come with high level explanation of the genaeology of its algorithm -- e.g. rasterization, KD-trees, BSP, etc. > > backer@ and I came up with a plan for this also, it's much simpler than KD-trees etc. I'm also happy to undertake the refactoring work. Dana, if you have a plan for cullTiles, perhaps it would make sense for you to put spare cycles into that effort?

Dana Jansens

Comment 22 2011-10-31 07:02:21 PDT

(In reply to comment #21) > I'm also happy to undertake the refactoring work. Dana, if you have a plan for cullTiles, perhaps it would make sense for you to put spare cycles into that effort? Yup, sounds good.

W. James MacLean

Comment 23 2011-11-07 11:18:09 PST

Created attachment 113897 [details] Patch

Dana Jansens

Comment 24 2011-11-07 11:25:58 PST

> Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:297 > + if (tile && tile->textureId() && tileCuller.nextTileToDraw() == tile) { Why not just pass in a list of tiles? I think the plan was to not have any optimization-specific code in here, something like a Vector (which all optimations would take in and return), rather than optimization objects. Else we're going to have to keep passing in more and more optimization objects and changing this code every time a new optimization is added.

W. James MacLean

Comment 25 2011-11-07 11:29:29 PST

(In reply to comment #23) > Created an attachment (id=113897) [details] > Patch This is a more modular version of the previous patch, taking into account our discussions last week. The main weakness I see is a fair bit of redundancy in determining whether a tile even draws in the first place, although some of this can be addresses through re-factoring the checks once we're OK with the approach. Unit tests are coming next. Enne, I think I gardened this through the solid-colour change OK, but please let me know if I got anything wrong.

W. James MacLean

Comment 26 2011-11-07 11:30:36 PST

(In reply to comment #24) > > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:297 > > + if (tile && tile->textureId() && tileCuller.nextTileToDraw() == tile) { > > Why not just pass in a list of tiles? I think the plan was to not have any optimization-specific code in here, something like a Vector (which all optimations would take in and return), rather than optimization objects. Else we're going to have to keep passing in more and more optimization objects and changing this code every time a new optimization is added. Since CCTileCuller just maintains a single list, not one list per layer, we need some way for each layer to know where is starts in the list.

Dana Jansens

Comment 27 2011-11-07 11:33:21 PST

(In reply to comment #26) > (In reply to comment #24) > > > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:297 > > > + if (tile && tile->textureId() && tileCuller.nextTileToDraw() == tile) { > > > > Why not just pass in a list of tiles? I think the plan was to not have any optimization-specific code in here, something like a Vector (which all optimations would take in and return), rather than optimization objects. Else we're going to have to keep passing in more and more optimization objects and changing this code every time a new optimization is added. > > Since CCTileCuller just maintains a single list, not one list per layer, we need some way for each layer to know where is starts in the list. Yeh, definitely. But this is something very specific to a single optimization (culling), whereas I think we want to pass something global to all future optimizations to this function, so that we don't end up passing N optimization-specific things to it.

W. James MacLean

Comment 28 2011-11-07 14:43:18 PST

(In reply to comment #27) > (In reply to comment #26) > > (In reply to comment #24) > > > > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:297 > > > > + if (tile && tile->textureId() && tileCuller.nextTileToDraw() == tile) { > > > > > > Why not just pass in a list of tiles? I think the plan was to not have any optimization-specific code in here, something like a Vector (which all optimations would take in and return), rather than optimization objects. Else we're going to have to keep passing in more and more optimization objects and changing this code every time a new optimization is added. > > > > Since CCTileCuller just maintains a single list, not one list per layer, we need some way for each layer to know where is starts in the list. > > Yeh, definitely. But this is something very specific to a single optimization (culling), whereas I think we want to pass something global to all future optimizations to this function, so that we don't end up passing N optimization-specific things to it. It's easy to create a separate tile list support class to do this, and this can be done as part of this CL, but I'd like to get some buy-in on the overall approach before further refinements.

Nat Duca

Comment 29 2011-11-07 15:32:54 PST

Comment on attachment 113897 [details] Patch Can we split this into a refactoring and feature patch? I think I see here a patch that redirects tile drawing into some new classs that you're passin garound (CCRenderer instead of CCTileCuller?). That that class has a tileculler is something you could add later. Also, I don't think we need the suffix on TileCuller. We really only need the impl suffix for classes where one exists on both sides of the fence.

Adrienne Walker

Comment 30 2011-11-07 16:47:34 PST

Comment on attachment 113897 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=113897&action=review I feel similarly to nduca that this patch is trying to do too much. This patch should either be just culling changes (maybe where you hand a culler and an iterator to the draw function) or it should be a refactoring to use a CCRenderer and a list of tiles to draw, but not both. > Source/WebCore/platform/graphics/chromium/cc/CCTileCullerImpl.cpp:37 > +bool CCTileCullerFunctor::isAxisAligned(const TransformationMatrix& m) This function and drawsOpaque should both be marked static. > Source/WebCore/platform/graphics/chromium/cc/CCTileCullerImpl.cpp:75 > +void CCTileCullerImpl::createDrawTileList(const CCLayerList& layerList, CCRenderSurface* targetSurface, const TransformationMatrix& prefixMatrix) > +{ > + for (CCLayerList::const_iterator it = layerList.begin(); it != layerList.end(); ++it) { > + CCLayerImpl* layer = it->get(); > + if (layer->isTiledLayer()) { This whole function feels like a layering violation (pun intended). I think I'd prefer a model where we iterate through all the layers and ask them for all the quads they might draw, rather than reach in and get those tiles out. The fragility of checking what tile is next and culling using that is a little bit worrisome too. > Source/WebCore/platform/graphics/chromium/cc/CCTileCullerImpl.cpp:85 > + const IntRect contentRect = CCLayerTreeHostCommon::calculateVisibleLayerRect<CCLayerImpl>(layer); Ugh. I believe that the visible rect could be set as a part of calculateDrawTransformsAndVisibility after recursing through a layer's children. If you could write a small patch to fix that (both in paint and in draw), that'd be an excellent cleanup. > Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:297 > + if (tile && tile->textureId() && tileCuller.nextTileToDraw() == tile) { I think this if and the else if conditions would be clearer if you just had an early out for culling right at the top of the loop.

James Robinson

Comment 31 2011-11-07 17:33:34 PST

Comment on attachment 113897 [details] Patch Agreeing with what everyone else has said - this needs to be broken up. Also, what's the test strategy? Ideally I think you should be able to test CCTiledCullerImpl (or whatever its name ends up being) in isolation, but it looks like that would be pretty difficult to do with its current design.

W. James MacLean

Comment 32 2011-11-08 10:08:06 PST

Created attachment 114103 [details] Patch

W. James MacLean

Comment 33 2011-11-08 10:24:51 PST

(In reply to comment #29) > Can we split this into a refactoring and feature patch? Yes. > I think I see here a patch that redirects tile drawing into some new classs > that you're passin garound (CCRenderer instead of CCTileCuller?). That that > class has a tileculler is something you could add later. Hmm, not this intention. All tile drawing is still done in CCTiledLayerImpl, all I'm doing is passing in a list of tiles that (according to the culler) are worth drawing, and modifying drawTiles() to only draw those tiles. In the new patch I've simplified the input to draw() to reflect that it is just a list of tile pointers (as unique id's)(as per Dana's suggestion). > Also, I don't think we need the suffix on TileCuller. We really only need the > impl suffix for classes where one exists on both sides of the fence. Removed. I wasn't sure about how the naming conventions would apply there. (In reply to comment #30) > I feel similarly to nduca that this patch is trying to do too much. This > patch should either be just culling changes (maybe where you hand a culler > and an iterator to the draw function) or it should be a refactoring to use a > CCRenderer and a list of tiles to draw, but not both. See comments above. >> Source/WebCore/platform/graphics/chromium/cc/CCTileCullerImpl.cpp:37 >> +bool CCTileCullerFunctor::isAxisAligned(const TransformationMatrix& m) > >This function and drawsOpaque should both be marked static. Ooops, just now saw this. Will add to next patch. >> Source/WebCore/platform/graphics/chromium/cc/CCTileCullerImpl.cpp:75 >> +void CCTileCullerImpl::createDrawTileList(const CCLayerList& layerList, >> CCRenderSurface* targetSurface, const TransformationMatrix& prefixMatrix) >> +{ >> + for (CCLayerList::const_iterator it = layerList.begin(); it != >>layerList.end(); ++it) { >> + CCLayerImpl* layer = it->get(); >> + if (layer->isTiledLayer()) { > >This whole function feels like a layering violation (pun intended). I think >I'd prefer a model where we iterate through all the layers and ask them for >all the quads they might draw, rather than reach in and get those tiles out. OK, I can try a build something like this. Again, I was trying to touch CCTiledLayerImpl as little as possible. >The fragility of checking what tile is next and culling using that is a little >bit worrisome too. Yeah, I worried about this also. That's why I included the checks at the end to make sure everything got drawn, but it would be nice if it wasn't order-dependent. Would we be better putting the tiles to draw into a hash table of some sort? > Source/WebCore/platform/graphics/chromium/cc/CCTileCullerImpl.cpp:85 > + const IntRect contentRect = CCLayerTreeHostCommon::calculateVisibleLayerRect<CCLayerImpl>(layer); > Ugh. I believe that the visible rect could be set as a part of > calculateDrawTransformsAndVisibility after recursing through a layer's > children. If you could write a small patch to fix that (both in paint and in > draw), that'd be an excellent cleanup. Will do this before proceeding. I was trying to change as little of existing infrastructure as possible until I got some feedback. >> Source/WebCore/platform/graphics/chromium/cc/CCTiledLayerImpl.cpp:297 >> + if (tile && tile->textureId() && tileCuller.nextTileToDraw() == tile) { > >I think this if and the else if conditions would be clearer if you just had an >early out for culling right at the top of the loop. Will try to improve this. I had a certain amount of stuff that needed to be computed whether or not we drew the tile (edgeY, edgeX, etc.) but will see if I can tighten this up. (In reply to comment #31) > (From update of attachment 113897 [details]) > Agreeing with what everyone else has said - this needs to be broken up. I'm quite happy to break it up, but would like to get agreement on the final concept first. > Also, what's the test strategy? Ideally I think you should be able to test CCTiledCullerImpl (or whatever its name ends up being) in isolation, but it looks like that would be pretty difficult to do with its current design. In addition to not breaking any existing layout tests, I plan to add unit tests to create a small stack of tiled layers, let it "draw", and make sure occluded tiles are not drawn (similar to http://trac.webkit.org/changeset/98567, thanks to Enne for referring this to me). I haven't written this yet, but I'm hoping it should be too difficult.

W. James MacLean

Comment 34 2011-11-08 10:35:13 PST

(In reply to comment #32) > Created an attachment (id=114103) [details] > Patch Ugh, ignore this patch ... it's incomplete.

Adrienne Walker

Comment 35 2011-11-08 10:44:16 PST

> (In reply to comment #30) > > >> Source/WebCore/platform/graphics/chromium/cc/CCTileCullerImpl.cpp:75 > >> +void CCTileCullerImpl::createDrawTileList(const CCLayerList& layerList, > >> CCRenderSurface* targetSurface, const TransformationMatrix& prefixMatrix) > >> +{ > >> + for (CCLayerList::const_iterator it = layerList.begin(); it != > >>layerList.end(); ++it) { > >> + CCLayerImpl* layer = it->get(); > >> + if (layer->isTiledLayer()) { > > > >This whole function feels like a layering violation (pun intended). I think > >I'd prefer a model where we iterate through all the layers and ask them for > >all the quads they might draw, rather than reach in and get those tiles out. > > OK, I can try a build something like this. Again, I was trying to touch CCTiledLayerImpl as little as possible. If you're trying to optimize for "least intrusive", passing the culler and an iterator to TiledLayerChromium and just calling cull() there rather than pre-generating a list seems like one way to do it. > >The fragility of checking what tile is next and culling using that is a little > >bit worrisome too. > > Yeah, I worried about this also. That's why I included the checks at the end to make sure everything got drawn, but it would be nice if it wasn't order-dependent. > > Would we be better putting the tiles to draw into a hash table of some sort? The "order-dependent" comment wasn't about how the list of tiles to draw is stored, since that is already in a well-defined order. I was worried more about the order-dependent fragility of checking whether a tile was culled or not by testing against the iterator. > > Source/WebCore/platform/graphics/chromium/cc/CCTileCullerImpl.cpp:85 > > + const IntRect contentRect = CCLayerTreeHostCommon::calculateVisibleLayerRect<CCLayerImpl>(layer); > > > Ugh. I believe that the visible rect could be set as a part of > > calculateDrawTransformsAndVisibility after recursing through a layer's > > children. If you could write a small patch to fix that (both in paint and in > draw), that'd be an excellent cleanup. > > Will do this before proceeding. I was trying to change as little of existing infrastructure as possible until I got some feedback. Thanks! :)

WebKit Review Bot

Comment 36 2011-11-08 11:22:24 PST

Comment on attachment 114103 [details] Patch Attachment 114103 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/10376144

W. James MacLean

Comment 37 2011-11-16 06:39:44 PST

(In reply to comment #35) > > (In reply to comment #30) > > > > >This whole function feels like a layering violation (pun intended). I think > > >I'd prefer a model where we iterate through all the layers and ask them for > > >all the quads they might draw, rather than reach in and get those tiles out. > > > > OK, I can try a build something like this. Again, I was trying to touch CCTiledLayerImpl as little as possible. > > If you're trying to optimize for "least intrusive", passing the culler and an iterator to TiledLayerChromium and just calling cull() there rather than pre-generating a list seems like one way to do it. Hmm, OK, but then that doesn't seem like the architecture of creating a tile list and passing it through multiple filters (culling, squashing, etc.). > > >The fragility of checking what tile is next and culling using that is a little > > >bit worrisome too. > > > > Yeah, I worried about this also. That's why I included the checks at the end to make sure everything got drawn, but it would be nice if it wasn't order-dependent. > > > > Would we be better putting the tiles to draw into a hash table of some sort? > > The "order-dependent" comment wasn't about how the list of tiles to draw is stored, since that is already in a well-defined order. I was worried more about the order-dependent fragility of checking whether a tile was culled or not by testing against the iterator. > I suppose one could test by doing a search in the list, although that seems expensive, as the list is presumably unsorted (w.r.t. the pointer values we store). Ultimately we'd only draw the tiles that were in the list, in the order they appear in the list, but that complicates computing the edge indices - but it's probably doable. Let me ask *one* question here: do we prefer 1) the cull-in-place option (with culling logic provided by an external object), or do we prefer 2) "provide a list of all drawable tiles to the culler, but only draw whatever subset of them that is returned by the culler"? I'm happy to prepare a sample patch for either, but would prefer not to code both :-)

Adrienne Walker

Comment 38 2011-11-16 09:43:59 PST

(In reply to comment #37) > Let me ask *one* question here: do we prefer > > 1) the cull-in-place option (with culling logic provided by an external object), > > or do we prefer > > 2) "provide a list of all drawable tiles to the culler, but only draw whatever subset of them that is returned by the culler"? > > I'm happy to prepare a sample patch for either, but would prefer not to code both :-) I think I prefer #1. You're right that this doesn't seem like the architecture of creating a tile list. However, I think that architectural change (#2) should be done in another patch, probably after this one. Additionally, I think that the architecture needs to be that tiles are collected from layers and then drawn elsewhere, without going back to the layer classes. So, I'm picturing a lot of refactoring in what happens during drawing a CCTiledLayerImpl. The more encapsulated the culler is, the easier it will be to test and to move it to its proper place once that exists.

W. James MacLean

Comment 39 2011-11-17 11:05:49 PST

Created attachment 115622 [details] Patch

W. James MacLean

Comment 40 2011-11-17 11:06:33 PST

This new patch should be closer to what we're looking for, let me know.

Dana Jansens

Comment 41 2011-11-17 14:52:17 PST

Tried this out just now, nothing in a tab is rendering, and while it seems to have the right idea which tiles to cull, in some way, it seems to be skipping the opaque occluding tiles also. Wanted to get some fps numbers with/without this, but can't even see an fps counter at the moment since tab contents aren't rendering.

WebKit Review Bot

Comment 42 2011-11-18 01:34:40 PST

Comment on attachment 115622 [details] Patch Attachment 115622 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/10517040 New failing tests: compositing/geometry/video-opacity-overlay.html compositing/reflections/load-video-in-reflection.html compositing/overflow/overflow-compositing-descendant.html fast/replaced/width100percent-textarea.html compositing/layers-inside-overflow-scroll.html compositing/geometry/video-fixed-scrolling.html compositing/video-page-visibility.html compositing/self-painting-layers.html compositing/overflow/scroll-ancestor-update.html compositing/geometry/clipped-video-controller.html

David Reveman

Comment 43 2011-11-22 09:15:56 PST

Comment on attachment 115622 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=115622&action=review Sorry for being a bit late in the game on this issue. I apologize if I bring something up that's already been discussed. The patch's algorithmic complexity is higher than it needs to be. It will run at O(n^2). This can be done in O(n) including the check for any combination of subsequent layers. I think it's also worth keeping in mind that we probably want to support isOpaque() on a per-tile basis in the future. > Source/WebCore/platform/graphics/chromium/cc/CCTileCuller.cpp:46 > + && !m.m14() && !m.m24() && !m.m43() && m.m44(); This also needs to snap to the pixel grid to be correct. Instead of analyzing the transform, I think we should use the same logic as the anti-aliasing code, which is: transformedQuad.isRectilinear() && transformedQuad.boundingBox().isExpressibleAsIntRect() > Source/WebCore/platform/graphics/chromium/cc/CCTileCuller.cpp:52 > + return layer->opaque() && layer->opacity() >= 0.9999 && layer->drawOpacity() >= 0.9999; LayerRendererChromium::drawLayer uses "layer->drawOpacity() == 1" to determine if a layer is opaque. I prefer if we were consistent. Maybe add a isOpaque() function? > Source/WebCore/platform/graphics/chromium/cc/CCTileCuller.cpp:68 > +bool CCTileCullerSimple::operator()(const IntRect& tileRect) Yes, we really want the covered by any combination of subsequent layers functionality here as you pointed out.

Antoine Labour

Comment 44 2011-11-22 15:31:46 PST

How about making a first patch that defines the CCTileFilter that gets passed to all the CCLayerImpl::draw + the change to CCTiledLayerImpl::drawTiles to apply it, and a separate patch to implement the tile culler. The first one would be a small change with very little impact. That way, Adrienne can work on the refactoring while James/Dana can work on the culler, and they should mostly be unaffected by each other.

David Reveman

Comment 45 2011-11-22 22:11:57 PST

Though I like the flexibility of a CCTileFilter getting passed to all the CCLayerImpl::draw calls, I'm not sure this is the best approach here. Afaik, to do this culling efficiently we would walk all layers in top-to-bottom order while creating an occlusion tile-set (TilingData class seem sufficient) from the opaque tiles we've passed, which at the same time can be used to determine the tiles that are visible. So there's basically two problems to solve: 1. Walk tiles in reverse order to how they are drawn. 2. We need a data structure that matches the compositor tree hierarchy to store information used to determine the tiles to draw in the bottom-to-top rendering pass. The CCTileFilter approach seem to make both of these things more complicated. Number 2 for example could instead be easily solved by just adding an m_occluded member to the tile class. General tree traversal on a per-tile basis is introduced in the following patch and I think that would be useful for solving problem number 1: https://bugs.webkit.org/show_bug.cgi?id=72192

W. James MacLean

Comment 46 2011-11-23 06:12:16 PST

(In reply to comment #44) > How about making a first patch that defines the CCTileFilter that gets passed to all the CCLayerImpl::draw + the change to CCTiledLayerImpl::drawTiles to apply it, and a separate patch to implement the tile culler. The first one would be a small change with very little impact. > That way, Adrienne can work on the refactoring while James/Dana can work on the culler, and they should mostly be unaffected by each other. OK, although given that the TileCuller will likely need to be changed (possibly a lot) to match the refactoring, perhaps this should wait until there is at least an API spec available for the re-factoring.

W. James MacLean

Comment 47 2011-11-23 06:17:46 PST

(In reply to comment #43) > (From update of attachment 115622 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=115622&action=review > > Sorry for being a bit late in the game on this issue. I apologize if I bring something up that's already been discussed. > > The patch's algorithmic complexity is higher than it needs to be. It will run at O(n^2). This can be done in O(n) including the check for any combination of subsequent layers. I think it's also worth keeping in mind that we probably want to support isOpaque() on a per-tile basis in the future. "isOpaque()" is being implemented in another patch, in parallel. As for O(n^2), this was previously discussed, and is the eventual high-road goal. However, for this patch, which we had originally hoped to land for the upcoming release, we decided O(n^2) would suffice, especially as n is small enough that the additional computation is still small with the cost of overdraw. > > Source/WebCore/platform/graphics/chromium/cc/CCTileCuller.cpp:46 > > + && !m.m14() && !m.m24() && !m.m43() && m.m44(); > > This also needs to snap to the pixel grid to be correct. Instead of analyzing the transform, I think we should use the same logic as the anti-aliasing code, which is: transformedQuad.isRectilinear() && transformedQuad.boundingBox().isExpressibleAsIntRect() Excellent suggestion, thanks! > > Source/WebCore/platform/graphics/chromium/cc/CCTileCuller.cpp:52 > > + return layer->opaque() && layer->opacity() >= 0.9999 && layer->drawOpacity() >= 0.9999; > > LayerRendererChromium::drawLayer uses "layer->drawOpacity() == 1" to determine if a layer is opaque. Sure thing. > I prefer if we were consistent. Maybe add a isOpaque() function? > > > Source/WebCore/platform/graphics/chromium/cc/CCTileCuller.cpp:68 > > +bool CCTileCullerSimple::operator()(const IntRect& tileRect) > > Yes, we really want the covered by any combination of subsequent layers functionality here as you pointed out. Yes, in a future, improved version.

W. James MacLean

Comment 48 2011-11-23 06:22:17 PST

(In reply to comment #45) > Though I like the flexibility of a CCTileFilter getting passed to all the CCLayerImpl::draw calls, I'm not sure this is the best approach here. The final intention of this work has always been that we would operate on a single list of all tiles to be drawn, to be collected across layers. This of course requires substantial refactoring in CCTiledLayerImpl, and so is a ways off. Such a list allows for other optimizations, such as tile squashing. The approach in this patch is just an intermediate step on our way to the final version. > Afaik, to do this culling efficiently we would walk all layers in top-to-bottom order while creating an occlusion tile-set (TilingData class seem sufficient) from the opaque tiles we've passed, which at the same time can be used to determine the tiles that are visible. Yes, this is the eventual plan. > So there's basically two problems to solve: > > 1. Walk tiles in reverse order to how they are drawn. Easily solve once tiles are exported to a draw list in the CCTiledLayerImpl refactoring. > 2. We need a data structure that matches the compositor tree hierarchy to store information used to determine the tiles to draw in the bottom-to-top rendering pass. > > The CCTileFilter approach seem to make both of these things more complicated. Number 2 for example could instead be easily solved by just adding an m_occluded member to the tile class. It was desired to avoid putting too much extra state into the tiles themselves. > General tree traversal on a per-tile basis is introduced in the following patch and I think that would be useful for solving problem number 1: > > https://bugs.webkit.org/show_bug.cgi?id=72192

David Reveman

Comment 49 2011-11-23 10:56:40 PST

(In reply to comment #48) > (In reply to comment #45) > > Though I like the flexibility of a CCTileFilter getting passed to all the CCLayerImpl::draw calls, I'm not sure this is the best approach here. > > The final intention of this work has always been that we would operate on a single list of all tiles to be drawn, to be collected across layers. This of course requires substantial refactoring in CCTiledLayerImpl, and so is a ways off. Such a list allows for other optimizations, such as tile squashing. > > The approach in this patch is just an intermediate step on our way to the final version. > > > Afaik, to do this culling efficiently we would walk all layers in top-to-bottom order while creating an occlusion tile-set (TilingData class seem sufficient) from the opaque tiles we've passed, which at the same time can be used to determine the tiles that are visible. > > Yes, this is the eventual plan. > > > So there's basically two problems to solve: > > > > 1. Walk tiles in reverse order to how they are drawn. > > Easily solve once tiles are exported to a draw list in the CCTiledLayerImpl refactoring. > > > 2. We need a data structure that matches the compositor tree hierarchy to store information used to determine the tiles to draw in the bottom-to-top rendering pass. > > > > The CCTileFilter approach seem to make both of these things more complicated. Number 2 for example could instead be easily solved by just adding an m_occluded member to the tile class. > > It was desired to avoid putting too much extra state into the tiles themselves. I agree that it would be nice to not add extra state to the tiles themselves. It's just so simple compared to the alternatives :) Anyhow, the m_occluded approach would be awkward in the current version of the compositor as there's no general per tile/texture data structure. This will change once 72192 lands and the m_occluded approach would be straight forward so I just wanted to make sure you're aware of that. You seem to have your head wrapped around this. Sorry if I was repeating things that had already been discussed. Just wanted to make sure we were all on the same page.

W. James MacLean

Comment 50 2011-11-23 11:27:10 PST

(In reply to comment #49) > > It was desired to avoid putting too much extra state into the tiles themselves. > > I agree that it would be nice to not add extra state to the tiles themselves. It's just so simple compared to the alternatives :) Anyhow, the m_occluded approach would be awkward in the current version of the compositor as there's no general per tile/texture data structure. This will change once 72192 lands and the m_occluded approach would be straight forward so I just wanted to make sure you're aware of that. Thanks for the head's up, I will keep an eye on that. > You seem to have your head wrapped around this. Sorry if I was repeating things that had already been discussed. Just wanted to make sure we were all on the same page. NP ... I appreciate your taking the time to make the suggestions (all of them good!). It's not always easy to know where discussions have been offline :-)

Dana Jansens

Comment 51 2011-11-24 11:21:43 PST

View in context: https://bugs.webkit.org/attachment.cgi?id=115622&action=review > Source/WebCore/platform/graphics/chromium/cc/CCTileCuller.cpp:74 > + IntRect layerRect = layer->drawableContentRect(); Can construct a Region from these content rects, and then test if the region contains tileRect. See comment on bug #70751 regarding this. It would not make the code much larger, and I think we could drop #70751 and just do it here. Here's code sample for testing contains: https://bugs.webkit.org/attachment.cgi?id=116403&action=diff#a/Source/WebCore/rendering/RenderLayerBacking.cpp_sec2 And you can just use Region::unite() to construct the Region.

W. James MacLean

Comment 52 2011-11-30 07:32:52 PST

(In reply to comment #51) > View in context: https://bugs.webkit.org/attachment.cgi?id=115622&action=review > > > Source/WebCore/platform/graphics/chromium/cc/CCTileCuller.cpp:74 > > + IntRect layerRect = layer->drawableContentRect(); > > Can construct a Region from these content rects, and then test if the region contains tileRect. See comment on bug #70751 regarding this. It would not make the code much larger, and I think we could drop #70751 and just do it here. > > Here's code sample for testing contains: https://bugs.webkit.org/attachment.cgi?id=116403&action=diff#a/Source/WebCore/rendering/RenderLayerBacking.cpp_sec2 > > And you can just use Region::unite() to construct the Region. Cool, I didn't know about Region. This will simplify things greatly!

Nat Duca

Comment 53 2011-11-30 08:41:43 PST

Parachuting into a conversation thread late, but please please please no ephemeral state on persistent objects. m_occluded on a tile is an example of badness I really want to avoid. Occlusion is an ephemeral state --- m_occluded is only valid during draw. Once draw returns, m_occluded may be invalid. Putting it on a tile, which is a persisted resource, leads to badness.

David Reveman

Comment 54 2011-11-30 11:53:44 PST

(In reply to comment #53) > Parachuting into a conversation thread late, but please please please no ephemeral state on persistent objects. > > m_occluded on a tile is an example of badness I really want to avoid. Occlusion is an ephemeral state --- m_occluded is only valid during draw. Once draw returns, m_occluded may be invalid. Putting it on a tile, which is a persisted resource, leads to badness. Yes, I perfectly right about that. I think that rather then having some form of tile filter call-back in the existing drawing code. The occlusion code should be the only code actually doing a pass over the tree of persistent resources. The drawing code should instead operate on a draw list generated by the occlusion pass. I assume this is what's planned once enne's tile drawing re-factoring has landed. Btw, the general per tile data structure I mentioned earlier on this thread is unrelated to all this. That's only for texture uploads. Sorry about the confusion there.

W. James MacLean

Comment 55 2011-12-22 07:27:57 PST

Created attachment 120319 [details] Patch

WebKit Review Bot

Comment 56 2011-12-22 07:38:44 PST

Comment on attachment 120319 [details] Patch Attachment 120319 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/10986505

W. James MacLean

Comment 57 2011-12-22 07:54:36 PST

Created attachment 120322 [details] Patch

W. James MacLean

Comment 58 2011-12-22 10:28:58 PST

Created attachment 120339 [details] Patch

W. James MacLean

Comment 59 2011-12-22 10:30:08 PST

Revised to remove commented-out code that was being retained pending whether the Windows bot would accept the patch.

Adrienne Walker

Comment 60 2011-12-22 11:18:43 PST

Comment on attachment 120339 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=120339&action=review Other than a few small things, this looks great and so much more encapsulated. Thanks for waiting until I could land that refactoring patch. > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHostImpl.cpp:193 > + // FIXME: if there is useful information to share between render passes to > + // assist in optimization, plumb it here. Small quibble: I tend to use FIXME only when I'm knowingly implementing something suboptimal or that makes some assumptions that I'd like to remove later. I don't think it's useful for a "here's where future features can go". Future features can sort that out themselves, and (in this case) it may turn out that this is not the best place to do such a thing and it requires some larger refactoring. > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:42 > +// Specialize for OwnPtr<CCDrawQuad> since Vector doesn't know how to reverse a Vector of OwnPtr<T> in general. > +template<> void swap(WTF::OwnPtr<WebCore::CCDrawQuad>&a, WTF::OwnPtr<WebCore::CCDrawQuad>& b) { a.swap(b); } I think this should go in wtf/OwnPtr.h and not be specialized. > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:78 > + for (int i = quadList.size() -1; i >= 0; --i) { nit: space between '-' and '1'. > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:89 > + // FIXME: have all quads treated the same! > + switch (drawQuad->material()) { > + case CCDrawQuad::CustomLayer: > + quadRect = drawQuad->toCustomLayerDrawQuad()->layer()->drawableContentRect(); > + break; > + default: > + quadRect = drawQuad->quadTransform().mapRect(drawQuad->quadRect()); > + } This is unexpected. Does the default case for custom layers not return the right information? I had hoped it would. Maybe we should change it so it does? > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:99 > + else > + quadList[i].clear(); // This releases the quad permanently. Do you need to do this? Doesn't the vector destroy them when it gets destroyed?

W. James MacLean

Comment 61 2011-12-22 11:37:20 PST

(In reply to comment #60) > (From update of attachment 120339 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=120339&action=review > > Other than a few small things, this looks great and so much more encapsulated. Thanks for waiting until I could land that refactoring patch. No problem, then end-product is much nicer ... > > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHostImpl.cpp:193 > > + // FIXME: if there is useful information to share between render passes to > > + // assist in optimization, plumb it here. > > Small quibble: I tend to use FIXME only when I'm knowingly implementing something suboptimal or that makes some assumptions that I'd like to remove later. I don't think it's useful for a "here's where future features can go". Future features can sort that out themselves, and (in this case) it may turn out that this is not the best place to do such a thing and it requires some larger refactoring. OK, will remove. > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:42 > > +// Specialize for OwnPtr<CCDrawQuad> since Vector doesn't know how to reverse a Vector of OwnPtr<T> in general. > > +template<> void swap(WTF::OwnPtr<WebCore::CCDrawQuad>&a, WTF::OwnPtr<WebCore::CCDrawQuad>& b) { a.swap(b); } > > I think this should go in wtf/OwnPtr.h and not be specialized. I agree, except I'm not sure partial-specialization is allowed for functions (I couldn't get it to compile with template<typename T> void swap(WTF::OwnPtr<T>&a, WTF::OwnPtr<T>& b) and also there seem to be issues with overloading std::swap ... I'm open to suggestions. > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:78 > > + for (int i = quadList.size() -1; i >= 0; --i) { > > nit: space between '-' and '1'. Fixed. > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:89 > > + // FIXME: have all quads treated the same! > > + switch (drawQuad->material()) { > > + case CCDrawQuad::CustomLayer: > > + quadRect = drawQuad->toCustomLayerDrawQuad()->layer()->drawableContentRect(); > > + break; > > + default: > > + quadRect = drawQuad->quadTransform().mapRect(drawQuad->quadRect()); > > + } > > This is unexpected. Does the default case for custom layers not return the right information? I had hoped it would. Maybe we should change it so it does? Sure, sounds good ... want me to do that? > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:99 > > + else > > + quadList[i].clear(); // This releases the quad permanently. > > Do you need to do this? Doesn't the vector destroy them when it gets destroyed? I guess it would ... will remove this.

W. James MacLean

Comment 62 2011-12-22 11:53:55 PST

(In reply to comment #60) > > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:89 > > + // FIXME: have all quads treated the same! > > + switch (drawQuad->material()) { > > + case CCDrawQuad::CustomLayer: > > + quadRect = drawQuad->toCustomLayerDrawQuad()->layer()->drawableContentRect(); > > + break; > > + default: > > + quadRect = drawQuad->quadTransform().mapRect(drawQuad->quadRect()); > > + } > > This is unexpected. Does the default case for custom layers not return the right information? I had hoped it would. Maybe we should change it so it does? > Sorry, forgot to answer your question. I found, which testing with Aura, that the custome layer quadRect together with the quadTransform gave a different location on the screen from where the layer was actually being drawn.

W. James MacLean

Comment 63 2011-12-22 12:23:56 PST

Created attachment 120362 [details] Patch

W. James MacLean

Comment 64 2011-12-22 12:28:46 PST

I took care of all the comments save one: I took another look at putting a partial swap specialization into OwnPtr.h, namely namespace std { template<typename T> void swap(WTF::OwnPtr<T>&a, WTF::OwnPtr<T>& b) { a.swap(b); } } but my template-fu is too weak. Many other headers complained I was re-defining "swap" (e.g. HashTable.h +273), but Vector specifically uses std::swap. I did read somewhere that you're not allowed to *overload* swap, yet partial specialization of a function isn't allowed. If there is a way, I'm happy to make the change ...

Adrienne Walker

Comment 65 2011-12-22 12:38:40 PST

(In reply to comment #64) > I took care of all the comments save one: > > I took another look at putting a partial swap specialization into OwnPtr.h, namely > > namespace std { > > template<typename T> void swap(WTF::OwnPtr<T>&a, WTF::OwnPtr<T>& b) > { > a.swap(b); > } > } > > but my template-fu is too weak. Many other headers complained I was re-defining "swap" (e.g. HashTable.h +273), but Vector specifically uses std::swap. I did read somewhere that you're not allowed to *overload* swap, yet partial specialization of a function isn't allowed. > > If there is a way, I'm happy to make the change ... No, you're quite right. I think your original solution is the best you can do here.

Adrienne Walker

Comment 66 2011-12-22 13:46:08 PST

Comment on attachment 120362 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=120362&action=review > Source/WebCore/platform/graphics/chromium/cc/CCLayerImpl.cpp:133 > + IntRect quadRect(quadTransform().inverse().mapRect(drawableContentRect())); > + quadList.append(CCCustomLayerDrawQuad::create(sharedQuadState, quadRect, this)); drawableContentRect() is not what you want. That's the enclosed IntRect of the transformed layer. So, if the layer is rotated, it'll be much larger than the actual layer (and not fully opaque within its bounds). I think you need a layer rect that's something more like CCLayerTreeHostCommon.cpp:223, where you offset it by half the width and height. I'm not totally sure how to handle the FloatRect/IntRect issue. Quads are really all IntRect in size, so it'd be sad to have to move to float to handle a drawing offset. One alternative might be to bake that half-width/half-height offset into the transform for the custom layer quad instead?

W. James MacLean

Comment 67 2011-12-28 13:11:38 PST

Created attachment 120696 [details] Patch

W. James MacLean

Comment 68 2011-12-28 13:13:31 PST

Revised computation of quadRect in CCLayerImpl. Compensated for non-int-ness by adding correction into quadTransform().

Adrienne Walker

Comment 69 2011-12-28 14:37:39 PST

Comment on attachment 120696 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=120696&action=review > Source/WebCore/platform/graphics/chromium/cc/CCLayerImpl.cpp:133 > + IntRect quadRect(-0.5 * bounds().width(), -0.5 * bounds().height(), bounds().width(), bounds().height()); > + quadList.append(CCCustomLayerDrawQuad::create(sharedQuadState, quadRect, this)); Sorry to add one last nitpick, but is it possible to do this offset entirely in the quad transform? It's a little magic to have this implicit flooring here and correction elsewhere.

W. James MacLean

Comment 70 2011-12-29 08:11:39 PST

Created attachment 120750 [details] Patch

W. James MacLean

Comment 71 2011-12-29 08:13:28 PST

Moved additional float calculations into quadTransform(). This is certainly better, I just wasn't sure what other expectations there might have been for the quadRect values themselves.

Adrienne Walker

Comment 72 2011-12-29 09:51:55 PST

(In reply to comment #71) > Moved additional float calculations into quadTransform(). This is certainly better, I just wasn't sure what other expectations there might have been for the quadRect values themselves. Thanks. This patch looks good to me. For what it's worth, my expectation is that transforming the quadRect by the quadTransform should always give the transformed rect in target surface space. The shaders themselves might care more about what's in the rect vs. the transform, but as more layers get converted to not use CCDrawQuad::CustomLayer, they can override quadTransform and appendQuads to provide the values they want, as CCTiledLayerImpl does.

W. James MacLean

Comment 73 2011-12-29 09:56:48 PST

(In reply to comment #72) > (In reply to comment #71) > > Moved additional float calculations into quadTransform(). This is certainly better, I just wasn't sure what other expectations there might have been for the quadRect values themselves. > > Thanks. This patch looks good to me. Thanks! > For what it's worth, my expectation is that transforming the quadRect by the quadTransform should always give the transformed rect in target surface space. The shaders themselves might care more about what's in the rect vs. the transform, but as more layers get converted to not use CCDrawQuad::CustomLayer, they can override quadTransform and appendQuads to provide the values they want, as CCTiledLayerImpl does. Yeah, that was sort of what I expected, although I wasn't sure if there might be other uses for the drawQuad values that it might affect.

James Robinson

Comment 74 2012-01-02 20:07:02 PST

Comment on attachment 120750 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=120750&action=review > Source/WebCore/platform/graphics/chromium/cc/CCLayerImpl.cpp:179 > + TransformationMatrix qt = drawTransform(); we don't use abbreviations for variables like this in WebKit, can you please give this a descriptive name (or at least one that's a word)? > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHostImpl.cpp:193 > + passes[i].get()->optimizeQuads(); just "passes[i]->optimizeQuads();" please > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHostImpl.h:137 > + void optimizeRenderPasses(Vector<OwnPtr<CCRenderPass> >&); nit: i'd slightly prefer this function's declaration and definition be after calculateRenderPasses()es, to make it a bit easier to read through the code in order can we get a typedef for Vector<OwnPtr<CCRenderPass> > please? it's a mouthful > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:42 > +template<> void swap(WTF::OwnPtr<WebCore::CCDrawQuad>&a, WTF::OwnPtr<WebCore::CCDrawQuad>& b) { a.swap(b); } we don't normally use WTF:: prefixes for OwnPtr because there's a using declaration in OwnPtr.h putting WTF::OwnPtr into the global namespace. do you definitely need those here or will it compile without it? space before the "a" in the param list please. some newlines, please! i'd put one after "template<>" and space out the function like a normal function > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:35 > + CCQuadCuller() { }; nit: no trailing ; > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:36 > + virtual void operator()(CCQuadList&) = 0; we very rarely use operator overloading in WebKit or chromium - are you really sure you need it here? i think i would prefer a descriptively-named virtual functions > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:40 > +class CCQuadCullerSimple : public CCQuadCuller { > +public: do you want the compiler-generated copy and assignment c'tors? we normally put WTF_MAKE_NONCOPYABLE on every class unless there's a specific need for it to be copyable > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:44 > + virtual void operator()(CCQuadList&); > +}; are CCQuadCullers expected to have state between calls? > Source/WebKit/chromium/tests/CCQuadCullerTest.cpp:72 > +TEST_F(CCQuadCullerTest, verifyQuadsAreCulledProperly) Why are you using a test fixture here if you aren't using any state fro mthe CCQuadCullerTest class? you could just declare this TEST() > Source/WebKit/chromium/tests/CCQuadCullerTest.cpp:85 > + // Test culling of centre tile in root layer. > + { please break these out into separate TEST()s with descriptive names. you can share the setup code in a SetUp() function if you'd like. having these bunched together risks sharing state between tests and makes failures harder to diagnose

James Robinson

Comment 75 2012-01-02 20:10:05 PST

Comment on attachment 120750 [details] Patch I don't understand where you are going with a CCQuadCuller base class and virtual implementations - it seems to be designed for some use case that isn't in effect here and isn't obvious to me. If you want to have potentially different cullers, why aren't they just different static functions? with just one implementation, do you need a base class at all? a single static function should do just fine here. this looks overengineered.

W. James MacLean

Comment 76 2012-01-03 11:40:27 PST

Created attachment 120973 [details] Patch

W. James MacLean

Comment 77 2012-01-03 11:46:21 PST

(In reply to comment #74) > (From update of attachment 120750 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=120750&action=review > > > Source/WebCore/platform/graphics/chromium/cc/CCLayerImpl.cpp:179 > > + TransformationMatrix qt = drawTransform(); > > we don't use abbreviations for variables like this in WebKit, can you please give this a descriptive name (or at least one that's a word)? Done. > > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHostImpl.cpp:193 > > + passes[i].get()->optimizeQuads(); > > just "passes[i]->optimizeQuads();" please Done. > > Source/WebCore/platform/graphics/chromium/cc/CCLayerTreeHostImpl.h:137 > > + void optimizeRenderPasses(Vector<OwnPtr<CCRenderPass> >&); > > nit: i'd slightly prefer this function's declaration and definition be after calculateRenderPasses()es, to make it a bit easier to read through the code in order Done. > can we get a typedef for Vector<OwnPtr<CCRenderPass> > please? it's a mouthful Done. > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:42 > > +template<> void swap(WTF::OwnPtr<WebCore::CCDrawQuad>&a, WTF::OwnPtr<WebCore::CCDrawQuad>& b) { a.swap(b); } > > we don't normally use WTF:: prefixes for OwnPtr because there's a using declaration in OwnPtr.h putting WTF::OwnPtr into the global namespace. do you definitely need those here or will it compile without it? Done. > space before the "a" in the param list please. Done. > some newlines, please! i'd put one after "template<>" and space out the function like a normal function Done. > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:35 > > + CCQuadCuller() { }; > > nit: no trailing ; Oops, done. > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:36 > > + virtual void operator()(CCQuadList&) = 0; > > we very rarely use operator overloading in WebKit or chromium - are you really sure you need it here? i think i would prefer a descriptively-named virtual functions Done. > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:40 > > +class CCQuadCullerSimple : public CCQuadCuller { > > +public: > > do you want the compiler-generated copy and assignment c'tors? we normally put WTF_MAKE_NONCOPYABLE on every class unless there's a specific need for it to be copyable Done. > > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:44 > > + virtual void operator()(CCQuadList&); > > +}; > > are CCQuadCullers expected to have state between calls? Yes, that was the ultimate architecutre we decided on earlier. QuadCullerSimple doesn't keep state, but in general state may be desired, e.g. when culling several RenderPassLists with one QuadCuller object. > > Source/WebKit/chromium/tests/CCQuadCullerTest.cpp:72 > > +TEST_F(CCQuadCullerTest, verifyQuadsAreCulledProperly) > > Why are you using a test fixture here if you aren't using any state fro mthe CCQuadCullerTest class? you could just declare this TEST() Done. > > Source/WebKit/chromium/tests/CCQuadCullerTest.cpp:85 > > + // Test culling of centre tile in root layer. > > + { > > please break these out into separate TEST()s with descriptive names. you can share the setup code in a SetUp() function if you'd like. having these bunched together risks sharing state between tests and makes failures harder to diagnose Done. > I don't understand where you are going with a CCQuadCuller base class and > virtual implementations - it seems to be designed for some use case that > isn't in effect here and isn't obvious to me. If you want to have potentially > different cullers, why aren't they just different static functions? with just > one implementation, do you need a base class at all? > a single static function should do just fine here. this looks overengineered. I have made it a single static function in the CCQuadCuller class, as requested, although I think the hope was to (ultimately) be able to compare different culling approaches "on the fly", in which case using multiple derived classes overriding a virtual function makes the most sense.

James Robinson

Comment 78 2012-01-05 14:07:37 PST

Comment on attachment 120973 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=120973&action=review > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:54 > +static bool regionContainsRect(const Region& region, const IntRect& rect) static is redundant with the anonymous namespace. i think i would just drop the anonymous namespace here > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:97 > +} // Namespace WebCore nit: it's normally capitalized 'namespace WebCore' > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:2 > + * Copyright (C) 2011 Google Inc. All rights reserved. it's 2012 now > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.h:34 > + WTF_MAKE_NONCOPYABLE(CCQuadCuller); even better would be making this class not instantiable (private c'tor) since it's just for namespacing and isn't really an object

W. James MacLean

Comment 79 2012-01-05 14:53:10 PST

Created attachment 121336 [details] Patch

W. James MacLean

Comment 80 2012-01-05 14:54:40 PST

(In reply to comment #79) > Created an attachment (id=121336) [details] > Patch Sorry for the delay, repo woes! If this is worthy of r+, can you hit c+ too? Thanks.

James Robinson

Comment 81 2012-01-05 17:18:53 PST

Comment on attachment 121336 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=121336&action=review R=me. I'd like to see test coverage for a couple more cases, but otherwise we're good to go. > Source/WebCore/platform/graphics/chromium/cc/CCQuadCuller.cpp:81 > + if (keepQuad && drawQuad->drawsOpaque() && drawQuad->isLayerAxisAlignedIntRect()) can you please add some tests for non-axis aligned quads to make sure that we (a) don't consider them as generating opaque regions and (b) only cull when it's definitely safe to do so? > Source/WebKit/chromium/tests/CCQuadCullerTest.cpp:2 > + * Copyright (C) 2011 Google Inc. All rights reserved. 2012

W. James MacLean

Comment 82 2012-01-06 06:49:10 PST

Created attachment 121429 [details] Patch

W. James MacLean

Comment 83 2012-01-06 06:51:56 PST

Created attachment 121430 [details] Patch

W. James MacLean

Comment 84 2012-01-06 06:52:42 PST

(In reply to comment #83) > Created an attachment (id=121430) [details] > Patch Tests added, copyright date corrected.

W. James MacLean

Comment 85 2012-01-06 13:49:19 PST

Created attachment 121486 [details] Patch for landing

WebKit Review Bot

Comment 86 2012-01-06 18:39:34 PST

Comment on attachment 121486 [details] Patch for landing Clearing flags on attachment: 121486 Committed r104368: <http://trac.webkit.org/changeset/104368>

WebKit Review Bot

Comment 87 2012-01-06 18:39:43 PST

All reviewed patches have been landed. Closing bug.

Attachments
Patch (21.55 KB, patch) 2011-10-20 12:12 PDT, Dana Jansens	no flags	Details Formatted Diff Diff
Updated to ignore non-axis-aligned transformations (23.60 KB, patch) 2011-10-20 14:29 PDT, Dana Jansens	no flags	Details Formatted Diff Diff
Patch (22.82 KB, patch) 2011-10-28 08:57 PDT, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (38.06 KB, patch) 2011-11-07 11:18 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (28.45 KB, patch) 2011-11-08 10:08 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (22.34 KB, patch) 2011-11-17 11:05 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (20.50 KB, patch) 2011-12-22 07:27 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (20.50 KB, patch) 2011-12-22 07:54 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (20.20 KB, patch) 2011-12-22 10:28 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (20.71 KB, patch) 2011-12-22 12:23 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (21.21 KB, patch) 2011-12-28 13:11 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (21.03 KB, patch) 2011-12-29 08:11 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (21.74 KB, patch) 2012-01-03 11:40 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (21.88 KB, patch) 2012-01-05 14:53 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (23.63 KB, patch) 2012-01-06 06:49 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch (23.63 KB, patch) 2012-01-06 06:51 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Patch for landing (23.68 KB, patch) 2012-01-06 13:49 PST, W. James MacLean	no flags	Details Formatted Diff Diff
Show Obsolete (16) View All Add attachment proposed patch, testcase, etc.