Bug 79267

Summary: [chromium] Reset damage tracker on visibility change.
Product: WebKit Reporter: Jonathan Backer <backer>
Component: New BugsAssignee: Jonathan Backer <backer>
Status: RESOLVED FIXED    
Severity: Normal CC: cc-bugs, commit-queue, jamesr, mmocny, nduca, shawnsingh, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch none

Jonathan Backer
Reported 2012-02-22 12:31:18 PST
[chromium] Reset damage tracker on visibility change.
Attachments
Patch (1.36 KB, patch)
2012-02-22 12:32 PST, Jonathan Backer
no flags
Patch (1.83 KB, patch)
2012-02-28 05:27 PST, Jonathan Backer
no flags
Patch (5.54 KB, patch)
2012-02-28 14:12 PST, Jonathan Backer
no flags
Jonathan Backer
Comment 1 2012-02-22 12:32:44 PST
Jonathan Backer
Comment 2 2012-02-22 12:40:37 PST
This is mostly for Aura Chromium, but I think it may translate to Win 7 Chromium as well. When a tab goes into the background, we reclaim the back buffer which disassociates the front and back buffer. When we go to the foreground, we get a new back buffer/front buffer pair, whose front buffer is in a undefined state. So we should push a full frame to the front buffer on the next composite cycle. On Aura chrome, the tab does not get an expose event, so we can't rely on a the damage tracker reset to be propagated from the browser via RenderWidget::OnMsgRepaint. The two reset paths are complementary (we need both for Aura --- one for vterm switch and one for tab switch).
Jonathan Backer
Comment 3 2012-02-22 13:10:18 PST
Just a few additional thoughts: we probably want something like this for when we drop front buffers too (when we have too many tabs open). We could definitely do this for OSX, Win7, and Aura --- possibly other platforms.
Michal Mocny
Comment 4 2012-02-22 13:17:33 PST
We certainly want to do this for dropped frontbuffers as well. From the looks of the CL it looks like this change will affect all platforms, but your comments hint at the opposite. May you confirm one way or the other?
Jonathan Backer
Comment 5 2012-02-22 13:38:48 PST
(In reply to comment #4) > We certainly want to do this for dropped frontbuffers as well. From the looks of the CL it looks like this change will affect all platforms, but your comments hint at the opposite. May you confirm one way or the other? I think that we want to do this for all platforms. It's conservative and I think that it will serve us well in the future. That said, this is motivated by a bug found in the wild for Aura (early adopter of partial swaps): http://code.google.com/p/chromium/issues/detail?id=115280
Nat Duca
Comment 6 2012-02-22 15:14:47 PST
iirw, we already route the GpuMemoryManager's decision to drop a frontbuffer/backbuffer to the stub and then up to the RendererGLContext. We should modify the WebGraphicsContext3D callback and the Extensions3DChromium callbacks to take a struct as well [I'm not clear why we didn't do that in the first place]. In that struct, we should pass whether a back/front was allocated. The compositor should then hook up to the callback, and in the callback, take the right actions when, for example, a backbuffer goes away.
Michal Mocny
Comment 7 2012-02-23 08:47:15 PST
The changes to the callback are easy, and likely the right way forward, so I'll get started on a patch. However, I worry about a possible race when rapid tab switching between two tabs, where the callback may not be delivered in a timely enough manner (we would be expecting to receive the callback before the next visibility change, which isn't guaranteed). The current model is that the memory allocations are asynchronous hints and may reflect a previous state-of-the-world. One solution may be to add a wait-for-memory-allocation in the renderer, if and only if there is a subsequent visibility change and the current memory allocation is not yet received. Once/if visibility changes start going to the gpu process directly from the browser, things will change - may become more complicated.
Nat Duca
Comment 8 2012-02-23 13:30:42 PST
Okay, I think I see the underlying problem here. Its that the GPU process is doing the drop of the backbuffer on its own. That is what is getting us into hot water. If instead the GPU process just sends a new allocation to the renderer with backbuffer=false, and we add API to GL to drop the backbuffer, then we don't get this race. I dont think latency would be impacted negatively either... a few milliseconds at most are added to the flow. WDYT?
Michal Mocny
Comment 9 2012-02-23 14:14:05 PST
sgtm, it seemed inevitable that we would need to synchronize dropping buffers between renderer and gpu and that seems like a very clean way to do it. Though I think the memory allocation struct will warrant a renaming to shouldHave*Buffer, and the renderer may need to track current buffer allocation state or else send needless IPC on each memory allocation change (which may be acceptable now, but will become more frequent than just on tab switch). fwiw, we discussed a simpler fix for the original issue here: currently on aura we dont force a paint on tab foregrounding (we do on (all?) other platforms). Adding that would fix the damage tracker issue, as well as a similar issue which will arise when dropping frontbuffers. Both issues can be solved using a MemoryAllocationChangedCallback, and I'm not sure if we want both solutions.
Nat Duca
Comment 10 2012-02-23 14:49:56 PST
(In reply to comment #9) > sgtm, it seemed inevitable that we would need to synchronize dropping buffers between renderer and gpu and that seems like a very clean way to do it. Though I think the memory allocation struct will warrant a renaming to shouldHave*Buffer, and the renderer may need to track current buffer allocation state or else send needless IPC on each memory allocation change (which may be acceptable now, but will become more frequent than just on tab switch). Great point about naming. This underscores the value in making the callbacks on webkit side take a struct as an argument. I dont see a lot of risk in sending memory managment callbacks every frame, even. IPCs are not themselves that heavy. > fwiw, we discussed a simpler fix for the original issue here: currently on aura we dont force a paint on tab foregrounding (we do on (all?) other platforms). Adding that would fix the damage tracker issue, as well as a similar issue which will arise when dropping frontbuffers. Totally, but this is a hack, right? I think that'd make fast tab switching heavier because you'd end up repainting all the time when you switch tabs, too. If we're under OMGOMGFIXRIGHTAWAY pressure, I'm personally fine with a temporary fix for it, but I think the underlying issue is probably also worth working-toward.
Jonathan Backer
Comment 11 2012-02-24 07:57:38 PST
> Totally, but this is a hack, right? I think that'd make fast tab switching heavier because you'd end up repainting all the time when you switch tabs, too. This patch is not a hack. It is the right thing to do. a) we shouldn't release any buffers (front or back) unless that tab is not visible regardless of any co-ordination scheme you can come up with but more importantly b) once the tab is made visible, we should not make any assumptions about the buffer state, regardless of buffer management, because any assumptions would be platform specific (e.g. on desktop linux the front buffer is in a bad state and needs to be completely redrawn --- this particular example is mitigated by the XExpose event being propagated from the browser and causing a full frame damage). This change is safe. The only drawback that I see, is that it isn't obvious why it is needed. I think that could be resolved with a comment/renaming, or perhaps putting a check that we're using partial swaps before forcing the damage reset. > If we're under OMGOMGFIXRIGHTAWAY pressure, I'm personally fine with a temporary fix for it, but I think the underlying issue is probably also worth working-toward. We've got about 2 weeks for fix this. The sooner the better.
James Robinson
Comment 12 2012-02-24 19:11:23 PST
To rephrase Nat's concern (which I agree with completely, if I understand it correctly) our world will be a lot simpler if all decisions about a context's resources are made by the compositor and not externally to it. If we want to drop the back buffer, we should tell the compositor and let it issue a command into the GL command stream to do so. If we want it to drop some textures, tell the compositor and let it issue the appropriate glDelete*() calls. That way the compositor is aware at all times of what it does and does not have and the compositor's GL command stream is always internally consistent with the resources it has available in the GPU process. Is that a fair summary? If there are cases where in the GPU process we absolutely must free up resources and we cannot afford to wait for a compositor to respond to an async message, then we should kill the context outright and let the lost context mechanism handle recovery. You're right, Jonathan, that front buffer damage on a non-composited WM is something that is out of our control completely. It's not triggered by the GPU process or the compositor's actions and so we have to react asynchronously to it. I think this is a unique special case and not a sign of a more general issue. The backbuffer in particular is a GL resource that our code has complete control over.
James Robinson
Comment 13 2012-02-24 19:12:29 PST
Comment on attachment 128265 [details] Patch R- since I'm concerned about very hard to deal with flashes resulting from this approach, see the preceding few comments for what I think would be a more workable approach.
Jonathan Backer
Comment 14 2012-02-27 06:02:57 PST
> To rephrase Nat's concern (which I agree with completely, if I understand it correctly) our world will be a lot simpler if all decisions about a context's resources are made by the compositor and not externally to it. If we want to drop the back buffer, we should tell the compositor and let it issue a command into the GL command stream to do so. If we want it to drop some textures, tell the compositor and let it issue the appropriate glDelete*() calls. That way the compositor is aware at all times of what it does and does not have and the compositor's GL command stream is always internally consistent with the resources it has available in the GPU process. Is that a fair summary? That's a fair summary, and I don't disagree with it --- within limits. Let's go back to the Linux desktop example: it's the browser that determines the visibility of the tab contents window; when the window is hidden, the the front buffer is trashed; neither the renderer nor the GPU process get any say in this decision (and they shouldn't); but we've got an async message sent to the renderer for the visibility change --- I think that we should respond to this by resetting the damage tracker. Note that in this example, the front buffer is not necessarily freed (I think that this is driver dependent --- I'd want to resize that unmapped window to guarantee it). Allowing the renderer to drive this decision is fine (and consistent with the position that Nat advocates). (In reply to comment #13) > (From update of attachment 128265 [details]) > R- since I'm concerned about very hard to deal with flashes resulting from this approach, see the preceding few comments for what I think would be a more workable approach. This patch isn't about addressing flashing. Flashing is a bad user experience (but somewhat bearable). With --enable-partial-swap, this is about most of the screen not being drawn, and fragments being drawn as we mouse over --- a horrific user experience. We need to reset the damage tracker. It needs to be reset based on visibility change (somehow). This problem is mitigated on desktop Linux where we get a full frame damage based on XExpose (there's a race but it looks like a flash to the user). But we don't get that XExpose event on Aura (only the root window in an X window on Aura --- everything else is a composited texture). Hence, why I want to do a conservative, platform-independent reset on visibility change. Alternatively, I can force a reset in Aura on the Chromium side via the same mechanism used by desktop Linux with XExpose --- but I think this is the wrong approach (too platform specific --- I want to code partial swaps/scissoring to the lowest common denominator). @jamesr: Please reconsider.
James Robinson
Comment 15 2012-02-27 09:48:55 PST
Is the reason we need to reset on visibility change in Aura because the OS is doing something outside our control, or because we're dropping a resource?
Jonathan Backer
Comment 16 2012-02-27 10:39:54 PST
(In reply to comment #15) > Is the reason we need to reset on visibility change in Aura because the OS is doing something outside our control, or because we're dropping a resource? Hmm... I think that in this case, it's in our control. AFAIK, it is due to dropping the backbuffer on GLX image transport. That said, Aura now has 4 different backends for texture transport: GLX, EGL, OSMesa, and GPU process (behind a --ui-use-gpu-process flag). Eventually we will move to just the later (yes!!!!). I think that this bug only affects the GLX backend... not the others. But this is my point --- if we don't do this, we're making platform specific decisions. Without the XExpose on desktop linux, we'd have the same problem there. The performance hit on a full frame damage is negligible and we really shouldn't make any assumptions about the state of the buffers. I take to heart that there's a danger that we paper over other bugs. But I just don't think that's the case here.
Nat Duca
Comment 17 2012-02-27 13:50:08 PST
(In reply to comment #16) > (In reply to comment #15) What are the semantics of the frontbuffer with the post-sub extension?
Jonathan Backer
Comment 18 2012-02-28 05:27:01 PST
Jonathan Backer
Comment 19 2012-02-28 05:30:54 PST
Comment on attachment 129233 [details] Patch Updated as per offline discussion and filed tracking bug (link in comments) for future work.
James Robinson
Comment 20 2012-02-28 10:54:25 PST
Comment on attachment 129233 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=129233&action=review > Source/WebCore/platform/graphics/chromium/LayerRendererChromium.cpp:296 > + // process on visibility change. is it damaged every time visibility changes? I thought it was only damaged on the visible -> not visible transition.
James Robinson
Comment 21 2012-02-28 10:56:50 PST
Also can you add tests for this? We have CCDamageTrackerTest.cpp, this might fit in there.
Jonathan Backer
Comment 22 2012-02-28 14:12:57 PST
Jonathan Backer
Comment 23 2012-02-28 14:17:37 PST
Comment on attachment 129328 [details] Patch (In reply to comment #20) > (From update of attachment 129233 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=129233&action=review > > > Source/WebCore/platform/graphics/chromium/LayerRendererChromium.cpp:296 > > + // process on visibility change. > > is it damaged every time visibility changes? I thought it was only damaged on the visible -> not visible transition. Correct. It will be damaged when made invisible. I've updated the CL to damage on the transition to visible because I want the damage clear at this point (we're about to push a new frame). I should work just as well by clearing on the change to invisible. Your call. (In reply to comment #21) > Also can you add tests for this? We have CCDamageTrackerTest.cpp, this might fit in there. Added a unit test to CCLayerTreeHostImpl.cpp by adding a test similar to an existing test there.
James Robinson
Comment 24 2012-02-28 14:31:31 PST
Comment on attachment 129328 [details] Patch Awesome, R=me. I think either behavior for visibility transitions is fine so long as we're clear about it and have tests to make sure we don't break it, since we never draw when !visible.
WebKit Review Bot
Comment 25 2012-02-28 16:35:22 PST
Comment on attachment 129328 [details] Patch Clearing flags on attachment: 129328 Committed r109168: <http://trac.webkit.org/changeset/109168>
WebKit Review Bot
Comment 26 2012-02-28 16:35:28 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.