|Component:||WebCore Misc.||Assignee:||Glenn Hartmann <hartmanng>|
|Severity:||Normal||CC:||dglazkov, eric, esprehn+autocc, hartmanng, ojan.autocc, simon.fraser, vollick, webkit.review.bot|
|Version:||528+ (Nightly build)|
|Bug Depends on:||109966|
Description vollick 2013-02-12 08:57:36 PST
Currently, we do not check visibility when setting the m_hasOutOfFlowPositionedDescendant flag, but we should. We should also dirty the flag when visibility changes.
Comment 2 Simon Fraser (smfr) 2013-02-12 09:16:39 PST
Comment on attachment 187880 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=187880&action=review > Source/WebCore/ChangeLog:10 > + Currently, we do not check visibility when setting the > + m_hasOutOfFlowPositionedDescendant flag, but we should. With this flag > + we do. We also dirty the flag when visibility changes. I'm a little concerned that we're spending time maintaining the m_hasOutOfFlowPositionedDescendant when most of the time it's unused. > Source/WebCore/rendering/RenderLayer.cpp:1010 > + bool childIsOutOfFlowPositioned = child->renderer() && child->renderer()->isOutOfFlowPositioned() && child->isVisuallyNonEmpty(); isVisuallyNonEmpty() is a bit expensive to call here. I think you should just check the visibility flags (hasVisibleContent/hasVisibleDescendant). > LayoutTests/compositing/overflow/do-not-opt-in-with-out-of-flow-descendant.html:2 > +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" > + "http://www.w3.org/TR/html4/loose.dtd"> Just <!DOCTYPE html> please > LayoutTests/compositing/overflow/do-not-opt-in-with-out-of-flow-descendant.html:4 > +<html lang="en"> No need for lang. > LayoutTests/compositing/overflow/do-not-opt-in-with-out-of-flow-descendant.html:6 > + <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> No need for this. > LayoutTests/compositing/overflow/do-not-opt-in-with-out-of-flow-descendant.html:64 > + var layerTree = window.internals.layerTreeAsText(document); Doesn't layerTreeAsText() update layout already?
Comment 3 vollick 2013-02-12 21:08:32 PST
Created attachment 188002 [details] Patch (In reply to comment #2) > (From update of attachment 187880 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=187880&action=review > > > Source/WebCore/ChangeLog:10 > > + Currently, we do not check visibility when setting the > > + m_hasOutOfFlowPositionedDescendant flag, but we should. With this flag > > + we do. We also dirty the flag when visibility changes. > > I'm a little concerned that we're spending time maintaining the m_hasOutOfFlowPositionedDescendant when most of the time it's unused. I've updated the code so we do much less work in updateDescendantDependentFlags if the opt-in feature is disabled. This seemed cleaner than doing the check at every spot where we muck with the dirty bit for this property. Is this better? > > > Source/WebCore/rendering/RenderLayer.cpp:1010 > > + bool childIsOutOfFlowPositioned = child->renderer() && child->renderer()->isOutOfFlowPositioned() && child->isVisuallyNonEmpty(); > > isVisuallyNonEmpty() is a bit expensive to call here. I think you should just check the visibility flags (hasVisibleContent/hasVisibleDescendant). Done. > > > LayoutTests/compositing/overflow/do-not-opt-in-with-out-of-flow-descendant.html:2 > > +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" > > + "http://www.w3.org/TR/html4/loose.dtd"> > > Just <!DOCTYPE html> please Fixed. > > > LayoutTests/compositing/overflow/do-not-opt-in-with-out-of-flow-descendant.html:4 > > +<html lang="en"> > > No need for lang. Removed. > > > LayoutTests/compositing/overflow/do-not-opt-in-with-out-of-flow-descendant.html:6 > > + <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> > > No need for this. Removed. > > > LayoutTests/compositing/overflow/do-not-opt-in-with-out-of-flow-descendant.html:64 > > + var layerTree = window.internals.layerTreeAsText(document); > > Doesn't layerTreeAsText() update layout already? Yep. Removed.
Comment 4 vollick 2013-03-13 08:09:32 PDT
Comment on attachment 188002 [details] Patch Need to ditch this approach. Glenn and I discussed the following approach which seems like the right way forward. - get rid of m_hasOutOfFlowEtcEtc - instead, on the fly in updateNeedsCompositedEtc, we'll - check if we have a descendant - if so, check if its containing block is us - if so, we form a containing block and don't need to worry about out-of-flow positioned descendants - if not, go through the list of positioned descendants maintained in RenderBlock and if any of them have containing block that is an ancestor of us, we can't opt in.
Comment 5 vollick 2013-03-15 10:29:36 PDT
(In reply to comment #4) > (From update of attachment 188002 [details]) > Need to ditch this approach. Glenn and I discussed the following approach which seems like the right way forward. > - get rid of m_hasOutOfFlowEtcEtc > - instead, on the fly in updateNeedsCompositedEtc, we'll > - check if we have a descendant > - if so, check if its containing block is us > - if so, we form a containing block and don't need to worry about out-of-flow positioned descendants > - if not, go through the list of positioned descendants maintained in RenderBlock and if any of them have containing block that is an ancestor of us, we can't opt in. Glenn pointed out that this won't work. According to the spec (http://www.w3.org/TR/CSS2/visudet.html), the position attribute of an element can change which element forms its containing block. This means that the question "is this layer a containing block?" doesn't actually make any sense; it depends on the descendant. So the revised (but naive) plan is: 1 Determine-if-we-have-an-out-of-flow-positioned-descendant-whose-containing-block-is-not-also-our-descendant: 2 for each ancestor that could be a containing block for our descendants*: 3 for each of its positioned descendants** 4 if the positioned descendant is also our descendant, don't opt in. * It turns out that there are at most two of these ancestors we need to care about, because: a) We will always opt out if we have fixed-pos descendants (this will be dealt with in a separate, blocking bug), and b) We trivially opt in if we're the root layer. So, according to the containing block definition in the spec (http://www.w3.org/TR/CSS2/visudet.html) we only care about conditions 2 and 4. That is, the containing block for: a) relative and static positioned descendants, and b) absolute positioned descendants. Also, if either of these containing blocks turn out to be us, then we don't enter the 'for each' loop at line 2 of the algorithm; we already know we will clip those descendants. ** Conveniently, RenderBlocks already maintain a list of positioned descendants for us to consult. If we use the above algorithm in updateNeedsCompositedScrolling, I believe we are guaranteed to be correct. But this function gets called a LOT, and the algorithm's complexity is O(n*d), where n = # of positioned descendants to consider, d = tree height (because we have to check if a layer is a descendant of another). I believe that the algorithm as presented here is a non starter -- it's just too expensive -- but I think that there are ways to speed it up considerably. First, we could dramatically reduce n if on line 3 of the algorithm we only had to process those positioned descendants that had been added or removed from the list since we last checked. And this isn't too hard to accomplish. In RenderBlock.cpp, alongside gPositionedDescendantMap, we could store a list of deltas. Every time we modify gPositionedDescendantMap, we can add a delta to the list, and on line 3 of the algorithm we only need to process those deltas we haven't seen yet. There are two problems with the deltas approach: 1. The list will grow without bound. A simple way to deal with this is to put a small limit on the number of deltas we'll keep and clear the list periodically. If we keep track of the number of times the list has been cleared, we'll be able to tell that on line 3 of the algorithm we have to fallback to processing the whole list again. This saves us from having to push a notification to all layers that might care; we'll essentially be pulling the 'freshness token' only for layers that could opt in. At first we can keep it simple and clear the list of deltas whenever it passes the size threshold, but in future, we could be smart and only clear when we go idle. 2. If a 'problematic' layer is removed in one of the deltas, how do we tell if we can opt in? We're obviously sunk if we just store a boolean (m_hasOutOfFlowPositionDescendant, say), but if instead we keep track of all positioned descendants that are preventing us from opting in (in a global map, just like in RenderBlock), we know it's safe to opt in whenever this collection is empty. Here is the revised algorithm: 1 Determine-if-we-have-an-out-of-flow-positioned-descendant-whose-containing-block-is-not-also-our-descendant: 2 for each ancestor that could be a containing block for our descendants: 3 if ancestor has cleared its delta list 4 clear our collection, P, of 'problematic' descendants 5 for each of its positioned descendants 6 if the positioned descendant is also our descendant, add it to P 7 else 8 for each unprocessed delta: 9 if delta.layer is our descendant 10 if delta.removal 11 remove the layer from P 12 else 13 add the layer to P 14 return !P.isEmpty In the expected case, this algorithm is O(m*d), where m is the number of unprocessed deltas. In the worst case it could be O(n*d), but that should be very infrequent provided we don't clear the delta lists too often. As the page stabilizes, m will approach zero. We will need to instrument this function and measure its cost on heavy pages in the wild, but my intuition is that this will be inexpensive. As for storage, this requires O(d + m + s) storage, where * d is all of the 'problematic' descendants we're keeping track of (but remember, we only take this hit for overflow:scroll divs that cannot opt in. * m is the size of the delta lists we're maintaining, and * s is all the state we need to hang onto to determine which deltas we have yet to process and if we need to fallback to worst case. This should be sizeof(size_t)*2*number-of-candidates-for-opt-in. We will also need to add instrumentation to allow us keep track of these sizes (see RenderBlock::reportMemoryUsage).
Comment 6 Simon Fraser (smfr) 2013-03-15 11:09:54 PDT
Is this all really worth the benefit?
Comment 7 vollick 2013-03-15 11:32:13 PDT
(In reply to comment #6) > Is this all really worth the benefit? I hear you. This is fairly complex, and I really wish it wasn't. That said, I am entirely convinced that it's worth it. We've seen massive performance gains on the comp-scroll path, both because of the painting we avoid, and also because it allows us to process scroll updates off the main thread. The more we can opt in, the better.. provided we don't break the web. And I don't think we can guarantee we won't without a reliable check for descendants that could extend beyond our clip. The test I suggested above is most efficient I could think of, but I'd be very eager to hear of any alternatives you might have. I'd also love to know if I've got any flaws in my assumptions or reasoning above --if you notice any, please pass them along.
Comment 8 vollick 2013-04-02 16:28:07 PDT
Created attachment 196250 [details] Patch (In reply to comment #6) > Is this all really worth the benefit? Ack, I'm sorry. I think I misunderstood you. I think you were asking if all of these optimizations were really worth the benefit. Turns out they weren't. I went ahead and implemented them, but I found that even with the naive approach, we only spend about 0.3ms in ::hasOutOfFlowPositionedDescendant while loading gmail on my desktop machine. This simplifies things a lot. This patch includes the changes for 109966. It doesn't perform correctly without the fixes in that patch.
Comment 9 WebKit Review Bot 2013-04-03 02:44:49 PDT
Comment 10 WebKit Review Bot 2013-04-03 02:44:53 PDT
Created attachment 196308 [details] Archive of layout-test-results from gce-cr-linux-05 for chromium-linux-x86_64 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: gce-cr-linux-05 Port: chromium-linux-x86_64 Platform: Linux-3.3.8-gcg-201212281604-x86_64-with-GCEL-10.04-gcel_10.04