Bug 79267 - [chromium] Reset damage tracker on visibility change.
Summary: [chromium] Reset damage tracker on visibility change.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Jonathan Backer
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-02-22 12:31 PST by Jonathan Backer
Modified: 2012-02-28 16:35 PST (History)
7 users (show)

See Also:


Attachments
Patch (1.36 KB, patch)
2012-02-22 12:32 PST, Jonathan Backer
no flags Details | Formatted Diff | Diff
Patch (1.83 KB, patch)
2012-02-28 05:27 PST, Jonathan Backer
no flags Details | Formatted Diff | Diff
Patch (5.54 KB, patch)
2012-02-28 14:12 PST, Jonathan Backer
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Backer 2012-02-22 12:31:18 PST
[chromium] Reset damage tracker on visibility change.
Comment 1 Jonathan Backer 2012-02-22 12:32:44 PST
Created attachment 128265 [details]
Patch
Comment 2 Jonathan Backer 2012-02-22 12:40:37 PST
This is mostly for Aura Chromium, but I think it may translate to Win 7 Chromium as well.

When a tab goes into the background, we reclaim the back buffer which disassociates the front and back buffer. When we go to the foreground, we get a new back buffer/front buffer pair, whose front buffer is in a undefined state. So we should push a full frame to the front buffer on the next composite cycle.

On Aura chrome, the tab does not get an expose event, so we can't rely on a the damage tracker reset to be propagated from the browser via RenderWidget::OnMsgRepaint. The two reset paths are complementary (we need both for Aura --- one for vterm switch and one for tab switch).
Comment 3 Jonathan Backer 2012-02-22 13:10:18 PST
Just a few additional thoughts: we probably want something like this for when we drop front buffers too (when we have too many tabs open). We could definitely do this for OSX, Win7, and Aura --- possibly other platforms.
Comment 4 Michal Mocny 2012-02-22 13:17:33 PST
We certainly want to do this for dropped frontbuffers as well.  From the looks of the CL it looks like this change will affect all platforms, but your comments hint at the opposite.  May you confirm one way or the other?
Comment 5 Jonathan Backer 2012-02-22 13:38:48 PST
(In reply to comment #4)
> We certainly want to do this for dropped frontbuffers as well.  From the looks of the CL it looks like this change will affect all platforms, but your comments hint at the opposite.  May you confirm one way or the other?

I think that we want to do this for all platforms. It's conservative and I think that it will serve us well in the future.

That said, this is motivated by a bug found in the wild for Aura (early adopter of partial swaps): http://code.google.com/p/chromium/issues/detail?id=115280
Comment 6 Nat Duca 2012-02-22 15:14:47 PST
iirw, we already route the GpuMemoryManager's decision to drop a frontbuffer/backbuffer to the stub and then up to the RendererGLContext.

We should modify the WebGraphicsContext3D callback and the Extensions3DChromium callbacks to take a struct as well [I'm not clear why we didn't do that in the first place]. In that struct, we should pass whether a back/front was allocated. The compositor should then hook up to the callback, and in the callback, take the right actions when, for example, a backbuffer goes away.
Comment 7 Michal Mocny 2012-02-23 08:47:15 PST
The changes to the callback are easy, and likely the right way forward, so I'll get started on a patch.

However, I worry about a possible race when rapid tab switching between two tabs, where the callback may not be delivered in a timely enough manner (we would be expecting to receive the callback before the next visibility change, which isn't guaranteed).  The current model is that the memory allocations are asynchronous hints and may reflect a previous state-of-the-world.

One solution may be to add a wait-for-memory-allocation in the renderer, if and only if there is a subsequent visibility change and the current memory allocation is not yet received.

Once/if visibility changes start going to the gpu process directly from the browser, things will change - may become more complicated.
Comment 8 Nat Duca 2012-02-23 13:30:42 PST
Okay, I think I see the underlying problem here. Its that the GPU process is doing the drop of the backbuffer on its own. That is what is getting us into hot water.

If instead the GPU process just sends a new allocation to the renderer with backbuffer=false, and we add API to GL to drop the backbuffer, then we don't get this race.

I dont think latency would be impacted negatively either... a few milliseconds at most are added to the flow.

WDYT?
Comment 9 Michal Mocny 2012-02-23 14:14:05 PST
sgtm, it seemed inevitable that we would need to synchronize dropping buffers between renderer and gpu and that seems like a very clean way to do it.  Though I think the memory allocation struct will warrant a renaming to shouldHave*Buffer, and the renderer may need to track current buffer allocation state or else send needless IPC on each memory allocation change (which may be acceptable now, but will become more frequent than just on tab switch).

fwiw, we discussed a simpler fix for the original issue here: currently on aura we dont force a paint on tab foregrounding (we do on (all?) other platforms).  Adding that would fix the damage tracker issue, as well as a similar issue which will arise when dropping frontbuffers.  Both issues can be solved using a MemoryAllocationChangedCallback, and I'm not sure if we want both solutions.
Comment 10 Nat Duca 2012-02-23 14:49:56 PST
(In reply to comment #9)
> sgtm, it seemed inevitable that we would need to synchronize dropping buffers between renderer and gpu and that seems like a very clean way to do it.  Though I think the memory allocation struct will warrant a renaming to shouldHave*Buffer, and the renderer may need to track current buffer allocation state or else send needless IPC on each memory allocation change (which may be acceptable now, but will become more frequent than just on tab switch).

Great point about naming. This underscores the value in making the callbacks on webkit side take a struct as an argument.

I dont see a lot of risk in sending memory managment callbacks every frame, even. IPCs are not themselves that heavy.


> fwiw, we discussed a simpler fix for the original issue here: currently on aura we dont force a paint on tab foregrounding (we do on (all?) other platforms).  Adding that would fix the damage tracker issue, as well as a similar issue which will arise when dropping frontbuffers. 

Totally, but this is a hack, right? I think that'd make fast tab switching heavier because you'd end up repainting all the time when you switch tabs, too.

If we're under OMGOMGFIXRIGHTAWAY pressure, I'm personally fine with a temporary fix for it, but I think the underlying issue is probably also worth working-toward.
Comment 11 Jonathan Backer 2012-02-24 07:57:38 PST
> Totally, but this is a hack, right? I think that'd make fast tab switching heavier because you'd end up repainting all the time when you switch tabs, too.

This patch is not a hack. It is the right thing to do.

a) we shouldn't release any buffers (front or back) unless that tab is not visible regardless of any co-ordination scheme you can come up with

but more importantly

b) once the tab is made visible, we should not make any assumptions about the buffer state, regardless of buffer management, because any assumptions would be platform specific (e.g. on desktop linux the front buffer is in a bad state and needs to be completely redrawn --- this particular example is mitigated by the XExpose event being propagated from the browser and causing a full frame damage).

This change is safe. The only drawback that I see, is that it isn't obvious why it is needed. I think that could be resolved with a comment/renaming, or perhaps putting a check that we're using partial swaps before forcing the damage reset.

> If we're under OMGOMGFIXRIGHTAWAY pressure, I'm personally fine with a temporary fix for it, but I think the underlying issue is probably also worth working-toward.

We've got about 2 weeks for fix this. The sooner the better.
Comment 12 James Robinson 2012-02-24 19:11:23 PST
To rephrase Nat's concern (which I agree with completely, if I understand it correctly) our world will be a lot simpler if all decisions about a context's resources are made by the compositor and not externally to it.  If we want to drop the back buffer, we should tell the compositor and let it issue a command into the GL command stream to do so.  If we want it to drop some textures, tell the compositor and let it issue the appropriate glDelete*() calls.  That way the compositor is aware at all times of what it does and does not have and the compositor's GL command stream is always internally consistent with the resources it has available in the GPU process.  Is that a fair summary?

If there are cases where in the GPU process we absolutely must free up resources and we cannot afford to wait for a compositor to respond to an async message, then we should kill the context outright and let the lost context mechanism handle recovery.

You're right, Jonathan, that front buffer damage on a non-composited WM is something that is out of our control completely.  It's not triggered by the GPU process or the compositor's actions and so we have to react asynchronously to it.  I think this is a unique special case and not a sign of a more general issue.  The backbuffer in particular is a GL resource that our code has complete control over.
Comment 13 James Robinson 2012-02-24 19:12:29 PST
Comment on attachment 128265 [details]
Patch

R- since I'm concerned about very hard to deal with flashes resulting from this approach, see the preceding few comments for what I think would be a more workable approach.
Comment 14 Jonathan Backer 2012-02-27 06:02:57 PST
> To rephrase Nat's concern (which I agree with completely, if I understand it correctly) our world will be a lot simpler if all decisions about a context's resources are made by the compositor and not externally to it.  If we want to drop the back buffer, we should tell the compositor and let it issue a command into the GL command stream to do so.  If we want it to drop some textures, tell the compositor and let it issue the appropriate glDelete*() calls.  That way the compositor is aware at all times of what it does and does not have and the compositor's GL command stream is always internally consistent with the resources it has available in the GPU process.  Is that a fair summary?

That's a fair summary, and I don't disagree with it --- within limits. Let's go back to the Linux desktop example: it's the browser that determines the visibility of the tab contents window; when the window is hidden, the the front buffer is trashed; neither the renderer nor the GPU process get any say in this decision (and they shouldn't); but we've got an async message sent to the renderer for the visibility change --- I think that we should respond to this by resetting the damage tracker.

Note that in this example, the front buffer is not necessarily freed (I think that this is driver dependent --- I'd want to resize that unmapped window to guarantee it). Allowing the renderer to drive this decision is fine (and consistent with the position that Nat advocates).

(In reply to comment #13)
> (From update of attachment 128265 [details])
> R- since I'm concerned about very hard to deal with flashes resulting from this approach, see the preceding few comments for what I think would be a more workable approach.

This patch isn't about addressing flashing. Flashing is a bad user experience (but somewhat bearable). With --enable-partial-swap, this is about most of the screen not being drawn, and fragments being drawn as we mouse over --- a horrific user experience. We need to reset the damage tracker. It needs to be reset based on visibility change (somehow). This problem is mitigated on desktop Linux where we get a full frame damage based on XExpose (there's a race but it looks like a flash to the user). But we don't get that XExpose event on Aura (only the root window in an X window on Aura --- everything else is a composited texture). Hence, why I want to do a conservative, platform-independent reset on visibility change. Alternatively, I can force a reset in Aura on the Chromium side via the same mechanism used by desktop Linux with XExpose --- but I think this is the wrong approach (too platform specific --- I want to code partial swaps/scissoring to the lowest common denominator).

@jamesr: Please reconsider.
Comment 15 James Robinson 2012-02-27 09:48:55 PST
Is the reason we need to reset on visibility change in Aura because the OS is doing something outside our control, or because we're dropping a resource?
Comment 16 Jonathan Backer 2012-02-27 10:39:54 PST
(In reply to comment #15)
> Is the reason we need to reset on visibility change in Aura because the OS is doing something outside our control, or because we're dropping a resource?

Hmm... I think that in this case, it's in our control. AFAIK, it is due to dropping the backbuffer on GLX image transport.

That said, Aura now has 4 different backends for texture transport: GLX, EGL, OSMesa, and GPU process (behind a --ui-use-gpu-process flag). Eventually we will move to just the later (yes!!!!). I think that this bug only affects the GLX backend... not the others.

But this is my point --- if we don't do this, we're making platform specific decisions. Without the XExpose on desktop linux, we'd have the same problem there. The performance hit on a full frame damage is negligible and we really shouldn't make any assumptions about the state of the buffers.

I take to heart that there's a danger that we paper over other bugs. But I just don't think that's the case here.
Comment 17 Nat Duca 2012-02-27 13:50:08 PST
(In reply to comment #16)
> (In reply to comment #15)
What are the semantics of the frontbuffer with the post-sub extension?
Comment 18 Jonathan Backer 2012-02-28 05:27:01 PST
Created attachment 129233 [details]
Patch
Comment 19 Jonathan Backer 2012-02-28 05:30:54 PST
Comment on attachment 129233 [details]
Patch

Updated as per offline discussion and filed tracking bug (link in comments) for future work.
Comment 20 James Robinson 2012-02-28 10:54:25 PST
Comment on attachment 129233 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=129233&action=review

> Source/WebCore/platform/graphics/chromium/LayerRendererChromium.cpp:296
> +        // process on visibility change.

is it damaged every time visibility changes? I thought it was only damaged on the visible -> not visible transition.
Comment 21 James Robinson 2012-02-28 10:56:50 PST
Also can you add tests for this? We have CCDamageTrackerTest.cpp, this might fit in there.
Comment 22 Jonathan Backer 2012-02-28 14:12:57 PST
Created attachment 129328 [details]
Patch
Comment 23 Jonathan Backer 2012-02-28 14:17:37 PST
Comment on attachment 129328 [details]
Patch

(In reply to comment #20)
> (From update of attachment 129233 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=129233&action=review
> 
> > Source/WebCore/platform/graphics/chromium/LayerRendererChromium.cpp:296
> > +        // process on visibility change.
> 
> is it damaged every time visibility changes? I thought it was only damaged on the visible -> not visible transition.

Correct. It will be damaged when made invisible. I've updated the CL to damage on the transition to visible because I want the damage clear at this point (we're about to push a new frame). I should work just as well by clearing on the change to invisible. Your call.

(In reply to comment #21)
> Also can you add tests for this? We have CCDamageTrackerTest.cpp, this might fit in there.

Added a unit test to CCLayerTreeHostImpl.cpp by adding a test similar to an existing test there.
Comment 24 James Robinson 2012-02-28 14:31:31 PST
Comment on attachment 129328 [details]
Patch

Awesome, R=me.  I think either behavior for visibility transitions is fine so long as we're clear about it and have tests to make sure we don't break it, since we never draw when !visible.
Comment 25 WebKit Review Bot 2012-02-28 16:35:22 PST
Comment on attachment 129328 [details]
Patch

Clearing flags on attachment: 129328

Committed r109168: <http://trac.webkit.org/changeset/109168>
Comment 26 WebKit Review Bot 2012-02-28 16:35:28 PST
All reviewed patches have been landed.  Closing bug.