Bug 51262

Summary: WebPageProxy should delete its backing store after not painting for a while
Product: WebKit Reporter: Adam Roben (:aroben) <aroben>
Component: Layout and RenderingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: andersca, ggaren, sam
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 54907, 54911, 54916, 55382, 55417, 55715, 55730    
Bug Blocks: 55733    
Attachments:
Description Flags
Throw away DrawingAreaProxyImpl's backing store after not painting for 5 seconds andersca: review+

Description Adam Roben (:aroben) 2010-12-17 09:15:17 PST
Classic WebKit (on Windows, anyway) has an optimization where it deletes its backing store after not painting for a while (Classic WebKit uses a 5-second delay). This frees up memory in use by background tabs, e.g. WebKit2 should do this, too.
Comment 1 Adam Roben (:aroben) 2010-12-17 09:35:33 PST
<rdar://problem/8782537>
Comment 2 Adam Roben (:aroben) 2011-02-10 08:44:49 PST
Some info from discussions outside of Bugzilla:

Anders:
The design I imagined for this was that we have a maximum cap of backing stores and use a LRU list to figure out which backing store to evict. This could also be combined with a timer.

When a WKView is asked to paint and it doesn't have a backing store, it needs to send a request to the WebProcess and block for a little while so we won't get flashes.

Adam:
An LRU cache doesn't seem required to fix this bug.

Aren't we effectively triple-buffering on OS X (as we are on Windows with the DWM enabled)? I wonder if we can get rid of the extra buffer on OS X. Maybe the fact that we draw into a CGLayer prevents the triple-buffering? Again, not required to fix this bug.

Geoff:
One thing I'll mention about it: I think it's worth doing a basic measurement to figure out how long a full paint, starting with no backing store, takes on reasonably modern hardware. That way, we can come up with a time delay that has some methodology to it. For example, if a full paint takes 50ms, a 5s delay ensures a cost of only 50/5000 = 1%. But if a full paint takes only, say, 10ms, I'd make the timer even shorter.

Adam:
I guess this is a worst-case cost calculation. I.e., if we set the timer to 5s, and WebPageProxy has to paint every 5.000001s, then we'll spend 1% of the time recreating the backing store and painting into it. Of course, the other 99% of the time is spent doing nothing at all.
Comment 3 Geoffrey Garen 2011-02-10 08:49:49 PST
> Aren't we effectively triple-buffering

I think we are. My understanding is that the reason for the triple-buffering is that we don't have an API for direct access to the other two buffers. Or something.
Comment 4 Adam Roben (:aroben) 2011-02-10 12:55:21 PST
(In reply to comment #2)
> Geoff:
> One thing I'll mention about it: I think it's worth doing a basic measurement to figure out how long a full paint, starting with no backing store, takes on reasonably modern hardware.

In a Debug build on Windows XP on an 8-core 2.8GHz Mac Pro, it takes ~60ms. I'm of course going to try a Release build next.
Comment 5 Adam Roben (:aroben) 2011-02-10 13:31:14 PST
(In reply to comment #4)
> (In reply to comment #2)
> > Geoff:
> > One thing I'll mention about it: I think it's worth doing a basic measurement to figure out how long a full paint, starting with no backing store, takes on reasonably modern hardware.
> 
> In a Debug build on Windows XP on an 8-core 2.8GHz Mac Pro, it takes ~60ms. I'm of course going to try a Release build next.

I was testing by going to <http://www.apple.com/startpage/>, waiting for it to load, then dragging another window in front of the WKView (which caused it to have to redraw). I modified WebKit2 to fetch an entirely new backing store on each paint.

I decided to try some other sites, too. Here's what I see in a Release build on the same machine:

<about:blank>: ~10ms
<https://encrypted.google.com/>: ~18ms
<http://webkit.org/>: ~20ms
<http://tivofaq.com/>: ~30ms
<http://www.cnn.com/>: ~40ms
<http://www.apple.com/startpage/>: Mostly ~50ms, but sometimes spikes up to 150+ms

I haven't yet found a site that takes longer than 50ms, though I'm sure we could find some eventually.
Comment 6 Adam Roben (:aroben) 2011-02-10 13:37:36 PST
(In reply to comment #5)
> I haven't yet found a site that takes longer than 50ms, though I'm sure we could find some eventually.

Here we go:

<http://www.funnyordie.com/>: ~150ms
Comment 7 Adam Roben (:aroben) 2011-02-10 13:40:49 PST
Geoff, since the repaint time is so content-dependent, how would you suggest we pick a timer interval?
Comment 8 Adam Roben (:aroben) 2011-02-10 13:43:19 PST
(In reply to comment #6)
> (In reply to comment #5)
> > I haven't yet found a site that takes longer than 50ms, though I'm sure we could find some eventually.
> 
> Here we go:
> 
> <http://www.funnyordie.com/>: ~150ms

James Robinson suggested that a lot of this time may be spent calling into windowless plugins (i.e., Flash).
Comment 9 Geoffrey Garen 2011-02-10 14:11:11 PST
Straw-person proposal:

1. Begin with a timer interval of 5s.
2. After throwing away a backing store, record the time interval required for the next paint.
3. Update the timer interval to 100 * the recorded time interval in 2.

For fast-painting sites, this will throw backing stores away quickly. For slow-painting sites, this will approximate a maximum thrash cost of 1%.
Comment 10 Anders Carlsson 2011-02-10 14:13:16 PST
(In reply to comment #8)
> (In reply to comment #6)
> > (In reply to comment #5)
> > > I haven't yet found a site that takes longer than 50ms, though I'm sure we could find some eventually.
> > 
> > Here we go:
> > 
> > <http://www.funnyordie.com/>: ~150ms
> 
> James Robinson suggested that a lot of this time may be spent calling into windowless plugins (i.e., Flash).

Did you test on Mac or Windows? On Mac, painting plug-ins doesn't call into the plug-in
Comment 11 Adam Roben (:aroben) 2011-02-10 14:15:40 PST
(In reply to comment #10)
> (In reply to comment #8)
> > (In reply to comment #6)
> > > (In reply to comment #5)
> > > > I haven't yet found a site that takes longer than 50ms, though I'm sure we could find some eventually.
> > > 
> > > Here we go:
> > > 
> > > <http://www.funnyordie.com/>: ~150ms
> > 
> > James Robinson suggested that a lot of this time may be spent calling into windowless plugins (i.e., Flash).
> 
> Did you test on Mac or Windows? On Mac, painting plug-ins doesn't call into the plug-in

I tested on Windows, as I said in comment 4 and comment 5.
Comment 12 Adam Roben (:aroben) 2011-02-10 14:29:21 PST
(In reply to comment #9)
> Straw-person proposal:
> 
> 1. Begin with a timer interval of 5s.
> 2. After throwing away a backing store, record the time interval required for the next paint.
> 3. Update the timer interval to 100 * the recorded time interval in 2.
> 
> For fast-painting sites, this will throw backing stores away quickly. For slow-painting sites, this will approximate a maximum thrash cost of 1%.

A slight refinement to this would be to change step 3 to use an exponentially weighted moving average to update the timer interval. That would prevent an outlier repaint that took a vastly different amount of time from skewing the timer too much.
Comment 13 Adam Roben (:aroben) 2011-02-10 14:50:47 PST
I'm going to test on Mac, too.
Comment 14 Adam Roben (:aroben) 2011-02-10 15:34:12 PST
My testing in comment 4 and comment 5 was with a WKView size 1200x1050. That's the size of a WebView in Safari 5.0.3 on Windows in a full-height (but not full-width) window with the status bar showing on a 23" Cinema Display.
Comment 15 Sam Weinig 2011-02-10 15:37:01 PST
(In reply to comment #8)
> (In reply to comment #6)
> > (In reply to comment #5)
> > > I haven't yet found a site that takes longer than 50ms, though I'm sure we could find some eventually.
> > 
> > Here we go:
> > 
> > <http://www.funnyordie.com/>: ~150ms
> 
> James Robinson suggested that a lot of this time may be spent calling into windowless plugins (i.e., Flash).

I believe a huge amount of time is spent drawing inset shadows on this site.
Comment 16 Adam Roben (:aroben) 2011-02-11 06:35:24 PST
Testing on a 4-core 2.66GHz Mac Pro on Mac OS X shows roughly similar performance to the numbers in comment 5. Interestingly, funnyordie.com only takes ~90ms on Mac, perhaps because we aren't having to call into plugins as Anders said in comment 10.
Comment 17 Adam Roben (:aroben) 2011-02-11 11:49:25 PST
Some discussion from #webkit yesterday:

<aroben>	ggaren: othermaciej: I haven't found any [Chromium] code yet that deletes backing stores after some amount of inactivity
<aroben>	ggaren: othermaciej: I've just seen an MRU cache for backing stores
<othermaciej>	http://src.chromium.org/svn/trunk/src/chrome/browser/renderer_host/backing_store_manager.cc
<othermaciej>	MRU implies the least recently used aren't in there
<ggaren>	aroben: indeed -- it seems like chromium only evicts one backing store to make room for another
<aroben>	ggaren: othermaciej: of course, an MRU cache would eventually lead to deleting backing stores after inactivity, assuming *someone* is painting somewhere
<aroben>	jinx-ish
<othermaciej>	MRU means you take a fixed cost, not an unbounded per-tab cost, I guess
<othermaciej>	that might be better than a timeout approach, hard to say
<ggaren>	i think it would be better in cases of fast switching between many tabs on a limited system
<ggaren>	not sure if we care to optimize for that case
<othermaciej>	I'm not sure I follow your logic of what case it would be good for
<othermaciej>	the pros and cons seem to be:
<othermaciej>	- capped high-water mark (so you can't get up to insane memory use just by cycling through a lot of tabs once)
<othermaciej>	- maybe somewhat slower if you cycle through all your tabs quickly twice
<othermaciej>	- higher low-water mark, since it won't purge below the cap
<othermaciej>	I suspect once you store enough backing stores to cause system memory pressure, you are not getting a speed benefit from having them around
<othermaciej>	I guess the MRU system also means that if you switch to another tab for a decent while (like a minute) to read something, switching back is very fast
<othermaciej>	one could imagine a system in between with high-water and low-water marks that are not equal
<othermaciej>	and when above the low-water mark, you evict on a timer
<othermaciej>	I could also imagine incorporating a value metric in addition to just size
<othermaciej>	note that having extra backing stores does more than just optimize tab switching in the normal case, it also makes it smoother when the web process is out to lunch
<ggaren>	othermaciej: what does "smoother" mean?
<othermaciej>	it means, you can switch tabs when the web process is busy or hung and see the contents of that tab
<ggaren>	where otherwise you would see white or something?
<othermaciej>	instead of contents of previously viewed tab or blank white
<evmar>	the killer with backing stores is that if they get swapped out, it's generally way faster to rerender the page than to wait for the image to come back from the disk.  (depends on where you keep backing stores though, on windows we keep them in plain ol' system ram)
<othermaciej>	(not sure which would happen, but those would be the options)
Comment 18 Adam Roben (:aroben) 2011-02-11 12:33:55 PST
Let's compare an MRU cache that allows N concurrent backing stores to exist, vs. a per-WebPageProxy timer that throws away that page's backing store when it fires.

One case where the MRU cache would result in lower memory usage is if you had X windows open that were all animating, where X > N and the animation frequency is higher than the timer frequency. With the MRU cache, you'd have only N backing stores, while with the timer you'd have X backing stores.

One case where the timer would result in lower memory usage is if you had X windows open, then closed some of them to leave Y windows open, where X > N and Y < N. With the MRU cache, you'd still have N backing stores, while with the timer you'd only have Y backing stores, even if all the windows are animating.

Another case where the timer results in lower memory usage is if you have many windows open but none of them are needing to paint. The timer can get you down to 0 backing stores, while the MRU cache will leave you with N backing stores.
Comment 19 Adam Roben (:aroben) 2011-02-11 13:00:40 PST
I think I need to generalize the m_lastDidSetSizeSequenceNumber logic in DrawingAreaProxyImpl. Instead of only setting it when based on when the size changes, we'll also need to set it based on when a new backing store is created.
Comment 20 Adam Roben (:aroben) 2011-02-11 13:09:18 PST
I think the process for throwing away a backing store and creating a new one should be:

1) When we decide to throw away the backing store, do so and send the DrawingAreaImpl an InvalidateEverything message (suggestions for a better name are welcome!).
2) When the DrawingAreaImpl receives the InvalidateEverything message, set its dirty region to the page's bounds.
3) The next time DrawingAreaProxyImpl::paint is called, if we still don't have a backing store and we're not waiting for a DidSetSize message, send the DrawingAreaImpl a DisplayNow message (again, better names are welcome!) and use the waitForAndDispatchImmediately logic to block for up to some small amount of time until we receive the next Update message. Then proceed with painting.
Comment 21 Adam Roben (:aroben) 2011-02-11 13:09:47 PST
Based on the strategy in comment 20, I no longer think I need to change the m_lastDidSetSizeSequenceNumber logic (like I said I would have to in comment 19).
Comment 22 Adam Roben (:aroben) 2011-02-11 13:21:26 PST
One case I didn't cover in comment 20:

If DrawingAreaImpl sends an Update message between (1) and (2), DrawingAreaProxyImpl won't have a backing store to draw it into. But just creating a new backing store at that point won't be good enough, as the UpdateInfo will only contain images of parts of the page (whatever was invalid just before the Update was sent). We'll need to ignore that Update and send a DisplayNow message to the DrawingAreaImpl so it will send us another Update after it gets the InvalidateEverything message.
Comment 23 Adam Roben (:aroben) 2011-02-11 13:22:42 PST
(In reply to comment #22)
> One case I didn't cover in comment 20:
> 
> If DrawingAreaImpl sends an Update message between (1) and (2), DrawingAreaProxyImpl won't have a backing store to draw it into. But just creating a new backing store at that point won't be good enough, as the UpdateInfo will only contain images of parts of the page (whatever was invalid just before the Update was sent). We'll need to ignore that Update and send a DisplayNow message to the DrawingAreaImpl so it will send us another Update after it gets the InvalidateEverything message.

If we detect that the UpdateInfo does in fact contain an image of the entire page, we should just use it and skip the DisplayNow message.
Comment 24 Adam Roben (:aroben) 2011-02-11 14:04:21 PST
I think discarding the backing store could lead to partial updates appearing on screen. Imagine this scenario:

1) DrawingAreaProxyImpl throws away its backing store
2) DrawingAreaImpl finds out, invalidates everything
3) Some part of the web page changes, so DrawingAreaImpl sets its display timer
4) Part of the WKView's view/HWND gets invalidated for some reason (e.g., someone drags a window across the WKView's HWND on Windows)
5) WKView asks DrawingAreaProxyImpl to paint into the invalidated region
6) DrawingAreaProxyImpl asks DrawingAreaImpl to repaint
7) DrawingAreaProxyImpl paints into WKView's invalid region

At this point, WKView's invalid region has been painted with an updated version of the page. But the rest of WKView still shows the old version of the page!
Comment 25 Adam Roben (:aroben) 2011-02-11 14:06:41 PST
(In reply to comment #20)
> 3) The next time DrawingAreaProxyImpl::paint is called, if we still don't have a backing store and we're not waiting for a DidSetSize message, send the DrawingAreaImpl a DisplayNow message (again, better names are welcome!) and use the waitForAndDispatchImmediately logic to block for up to some small amount of time until we receive the next Update message. Then proceed with painting.

This could be problematic. If there's already an Update message in the queue from before the InvalidateEverything message reached DrawingAreaImpl, the Update won't contain an image of the entire view. Maybe we need to use a different message (i.e., DidDisplayNow). If we do that, we *will* need to generalize the m_lastDidSetSizeSequenceNumber logic.
Comment 26 Geoffrey Garen 2011-02-11 14:43:32 PST
> This could be problematic. If there's already an Update message in the queue from before the InvalidateEverything message reached DrawingAreaImpl, the Update won't contain an image of the entire view. Maybe we need to use a different message (i.e., DidDisplayNow). If we do that, we *will* need to generalize the m_lastDidSetSizeSequenceNumber logic.

Another option -- a little simpler but possibly less clear -- is for the message sender to keep an internal flag that says "I just sent an InvalidateEverything message, so from now on, ignore all update messages until I see an update message that covers the whole drawing region, at which point I will unset this flag."
Comment 27 Geoffrey Garen 2011-02-11 14:45:59 PST
> Another option -- a little simpler but possibly less clear -- is for the message sender to keep an internal flag that says "I just sent an InvalidateEverything message, so from now on, ignore all update messages until I see an update message that covers the whole drawing region, at which point I will unset this flag."

Basically, I'm saying that any update that covers the whole drawing region can act as an implicit DidDisplayNow.
Comment 28 Adam Roben (:aroben) 2011-02-11 14:47:02 PST
(In reply to comment #26)
> > This could be problematic. If there's already an Update message in the queue from before the InvalidateEverything message reached DrawingAreaImpl, the Update won't contain an image of the entire view. Maybe we need to use a different message (i.e., DidDisplayNow). If we do that, we *will* need to generalize the m_lastDidSetSizeSequenceNumber logic.
> 
> Another option -- a little simpler but possibly less clear -- is for the message sender to keep an internal flag that says "I just sent an InvalidateEverything message, so from now on, ignore all update messages until I see an update message that covers the whole drawing region, at which point I will unset this flag."

That would work except when we're actually asked to paint by the native view. At that point we need to wait for an Update message and dispatch it immediately. But we don't currently have code that allows us to wait for a specific instance of a given message, only code that allows us to wait for the first instance of a given message.

We do however need logic like what you described, for the reasons stated in comments 22 and 23. But instead of a separate flag, I think we can just use "m_backingStore is null".
Comment 29 Adam Roben (:aroben) 2011-02-11 14:49:32 PST
(In reply to comment #28)
> But we don't currently have code that allows us to wait for a specific instance of a given message, only code that allows us to wait for the first instance of a given message.

We could of course add such code. But it turns out to be convenient for other reasons to use a separate message. For instance, using a separate message will let us handle the case where DisplayNow put us into accelerated compositing mode. We'll end up sharing quite a bit of code with the existing SetSize/DidSetSize messages.
Comment 30 Adam Roben (:aroben) 2011-02-11 14:53:26 PST
(In reply to comment #29)
> We'll end up sharing quite a bit of code with the existing SetSize/DidSetSize messages.

Most of the SetSize logic is really for handling the case where the UI process commands the web process to display, rather than the web process deciding to display on its own. Luckily, that's exactly the point of the DisplayNow message, so a lot of code can be shared.

Similarly, most of the DidSetSize logic is really for handling the case where we're painting into a brand new backing store, but may have switched into accelerated compositing mode in the process. Luckily, that's exactly the point of the DidDisplayNow message, so a lot of code can be shared.
Comment 31 Adam Roben (:aroben) 2011-02-11 16:39:22 PST
(In reply to comment #27)
> > Another option -- a little simpler but possibly less clear -- is for the message sender to keep an internal flag that says "I just sent an InvalidateEverything message, so from now on, ignore all update messages until I see an update message that covers the whole drawing region, at which point I will unset this flag."
> 
> Basically, I'm saying that any update that covers the whole drawing region can act as an implicit DidDisplayNow.

I now think we may need to follow this approach. Consider the following scenario:

1) DrawingAreaProxyImpl throws away its backing store
2) DrawingAreaImpl finds out, invalidates everything
3) Some part of the web page changes, so DrawingAreaImpl sets its display timer
4) DrawingAreaImpl's display timer fires, so it displays and sends an Update for the entire view back to the UI process
5) WKView asks DrawingAreaProxyImpl to paint into the invalidated region
6) DrawingAreaProxyImpl sees it has no backing store (because it hasn't received the Update message yet), so sends a DisplayNow message and waits for a DidDisplayNow response
7) DrawingAreaImpl receives DisplayNow, but since it just displayed it has no dirty region to paint

I.e., the display we want has already happened!

I guess another option is to always send DidDisplayNow instead of Update for the first display after an InvalidateEverything. That would fix the bug, too.
Comment 32 Adam Roben (:aroben) 2011-02-11 16:48:37 PST
I should note all these message names are subject to change.
Comment 33 Adam Roben (:aroben) 2011-02-12 08:34:15 PST
Here's my latest thinking on the policy-less implementation:

DrawingAreaProxyImpl has some state that, when it changes, causes all operations performed before that state change to become invalid. When painting, we need to have performed at least one Update for the most recent state; otherwise, we'll be painting old/incorrect bits into the view.

Currently, this state consists of a single piece of data: the view's size. Whenever the view's size changes, we tell the web process, and discard any operations that occur until the web process lets us know that it has updated to match the new state (i.e., size). This is accomplished via a few means:

1) The Messages::DrawingArea::SetSize message
This is how we tell the web process when the state (i.e., size) changes.

2) The Messages::DrawingAreaProxy::DidSetSize message
This is how the web process tells us it has updated to match the new state (i.e., size).

3) The DrawingAreaProxyImpl::m_isWaitingForDidSetSize member
This is set to true when we send SetSize, and set back to false when we receive DidSetSize. This is used for two purposes:
  a) When DrawingAreaProxyImpl is asked to paint into the native view, if m_isWaitingForDidSetSize is true then we know we need to wait until the web process has updated to match the new state (i.e., size) before painting. This is accomplished via the waitForAndDispatchDidSetSize function. (If the web process takes too long to respond, we'll just paint whatever old data we have.)
  b) When the view's size changes, if m_isWaitingForDidSetSize is true then we don't send a new SetSize message. Instead, we send a new SetSize message when we receive DidSetSize, if needed. This prevents us from sending many redundant SetSize messages (if the view is being resized faster than the web process can respond).

4) The sequence numbers passed from the web process to the UI process, and the DrawingAreaProxyImpl::m_lastDidSetSizeSequenceNumber member
This is used to ignore messages that are for an old state (i.e., size). Every time the web process sends a message to DrawingAreaProxyImpl, it includes a monotonically increasing sequence number. When a DidSetSize message is received, DrawingAreaProxyImpl::m_lastDidSetSizeSequenceNumber is updated to match the DidSetSize message's sequence number. If DrawingAreaProxyImpl receives a message with a sequence number less than m_lastDidSetSizeSequenceNumber, it knows that it is for an old state (i.e., size) and ignores it. This is needed because waitForAndDispatchDidSetSize allows us to receive messages out of order.

I'm going to be adding a new piece of data to the state: the backing store. Whenever the backing store is thrown away, we now have entered a new state where any old operations are no longer valid. Here's how I think we can accomplish this:

I) Transform the "sequence number" into a "state ID", and move control of the state ID to the UI process
DrawingAreaProxyImpl would be in charge of incrementing the state ID whenever the state (i.e., size) changes. The new state ID would be passed to the web process in the SetSize message, and DrawingAreaImpl would store the state ID. DrawingAreaImpl would then pass its stored state ID back to the UI process the same way the sequence number is currently passed.

II) Perform renames to match the "sequence number" -> "state ID" transformation:
  a) Messages::DrawingArea::SetSize -> UpdateState
  b) Messages::DrawingAreaProxy::DidSetSize -> DidUpdateState
  c) DrawingAreaProxyImpl::m_isWaitingForDidSetSize -> m_isWaitingForDidUpdateState
  d) DrawingAreaProxyImpl::waitForAndDispatchDidSetSize -> waitForAndDispatchDidUpdateState
  e) DrawingAreaImpl::m_inSetSize -> m_inUpdateState

III) Add a boolean argument to the UpdateState message that specifies whether the update needs to happen immediately or can be deferred until the next paint
DrawingAreaImpl would always store the last state ID it sent to the UI process in a data member. For deferred state updates, it would only update the current state ID, not the last sent state ID. The next time DrawingAreaImpl does something that would end up sending a message to the UI process, it checks whether the last sent state ID matches the current state ID. If it matches, it proceeds as normal, but if it doesn't, it sends a DidUpdateState message instead of whatever message it was going to send. (Implicit in this is the requirement that DidUpdateState contain a superset of the information it was going to send in the normal message. This is currently the case: DidSetSize includes all the information that any other message can contain, and more.)

With this groundwork in place, I think throwing away a backing store becomes this:

A) When DrawingAreaProxyImpl throws away the backing store, update the state ID and send a deferred UpdateState message to the web process.
B) When DrawingAreaProxyImpl is asked to paint into the native view, if m_isWaitingForDidUpdateState is true, send a non-deferred UpdateState message to the web process without incrementing the state ID before calling waitForAndDispatchDidUpdateState.
C) When DrawingAreaImpl gets an UpdateState message, if the state ID in the message matches DrawingAreaImpl's current state ID, ignore the message (because we've already updated to match this state).

I believe this addresses all the problems mentioned so far, except for comment 24. I also think the generalization of the [Did]SetSize logic means that adding the logic to deal with throwing away a backing store can be added to DrawingArea[Proxy]Impl without making those classes harder to understand.
Comment 34 Adam Roben (:aroben) 2011-02-12 09:16:21 PST
(In reply to comment #33)
> A) When DrawingAreaProxyImpl throws away the backing store, update the state ID and send a deferred UpdateState message to the web process.

We need to set m_isWaitingForDidUpdateState to true here, too.
Comment 35 Adam Roben (:aroben) 2011-02-12 11:32:34 PST
The approach outlined in comment 33 seems to be working. I should be able to put some patches up for review on Monday. Then we just need to decide what the right policy is!
Comment 36 Adam Roben (:aroben) 2011-02-14 07:52:56 PST
(In reply to comment #33)
> I) Transform the "sequence number" into a "state ID", and move control of the state ID to the UI process
> DrawingAreaProxyImpl would be in charge of incrementing the state ID whenever the state (i.e., size) changes. The new state ID would be passed to the web process in the SetSize message, and DrawingAreaImpl would store the state ID. DrawingAreaImpl would then pass its stored state ID back to the UI process the same way the sequence number is currently passed.

When resized, DrawingAreaProxyImpl currently keeps around the old backing store until the web process sends back a DidSetSize message. This is beneficial in the following circumstance:

1) View is resized in the UI process
2) DrawingAreaProxyImpl sends a SetSize message to the web process
3) DrawingAreaProxyImpl processes some already-sent Update messages from the web process (this step is optional)
4) The view asks DrawingAreaProxyImpl to paint something
5) DrawingAreaProxyImpl waits 0.5 seconds for the web process to send a DidSetSize message, doesn't get one, and draws the contents of the old backing store into the view

Keeping around the old backing store allowed us to draw it into the view when the web process was taking a long time to respond, and to update it with new content that was drawn by the web process before it received our SetSize message.

We should probably preserve this behavior, which means that we should send m_stateID+1 in the UpdateState message, and then set m_stateID to the ID passed back to us in the DidUpdateState message.
Comment 37 Adam Roben (:aroben) 2011-02-21 14:22:10 PST
(In reply to comment #33)
> I) Transform the "sequence number" into a "state ID", and move control of the state ID to the UI process
> DrawingAreaProxyImpl would be in charge of incrementing the state ID whenever the state (i.e., size) changes. The new state ID would be passed to the web process in the SetSize message, and DrawingAreaImpl would store the state ID. DrawingAreaImpl would then pass its stored state ID back to the UI process the same way the sequence number is currently passed.

This is now bug 54907, though I left all the terminology unchanged for now.
Comment 38 Adam Roben (:aroben) 2011-02-21 15:07:02 PST
(In reply to comment #33)
> II) Perform renames to match the "sequence number" -> "state ID" transformation:
>   a) Messages::DrawingArea::SetSize -> UpdateState
>   b) Messages::DrawingAreaProxy::DidSetSize -> DidUpdateState
>   c) DrawingAreaProxyImpl::m_isWaitingForDidSetSize -> m_isWaitingForDidUpdateState
>   d) DrawingAreaProxyImpl::waitForAndDispatchDidSetSize -> waitForAndDispatchDidUpdateState
>   e) DrawingAreaImpl::m_inSetSize -> m_inUpdateState

This is bug 54911.
Comment 39 Adam Roben (:aroben) 2011-02-21 16:31:38 PST
(In reply to comment #33)
> B) When DrawingAreaProxyImpl is asked to paint into the native view, if m_isWaitingForDidUpdateState is true, send a non-deferred UpdateState message to the web process without incrementing the state ID before calling waitForAndDispatchDidUpdateState.

The first part of this is bug 54916.
Comment 40 Adam Roben (:aroben) 2011-02-21 16:33:47 PST
(In reply to comment #33)
> C) When DrawingAreaImpl gets an UpdateState message, if the state ID in the message matches DrawingAreaImpl's current state ID, ignore the message (because we've already updated to match this state).

This isn't quite right. We still need to send a DidUpdateState message in this case. We just shouldn't do anything else.
Comment 41 Adam Roben (:aroben) 2011-02-28 10:45:09 PST
(In reply to comment #9)
> 2. After throwing away a backing store, record the time interval required for the next paint.

It's a little tricky to define "the time interval required for the next paint". Here's the general sequence (with "<time passes>" used to break up blocks of synchronous execution): 

A) UI process deallocates backing store
B) UI process sends a message to the web process telling it this
<time passes>
C) CoreIPC sends the message from the UI process to the web process
<time passes>
D) Web process receives the message
E) Web process records the backing store was thrown away
<time passes>
F) UI process is asked to paint, sends message to the web process telling it to do a full backing store update
<time passes>
G) CoreIPC sends the message from the UI process to the web process
<time passes>
H) Web process receives the message
I) Web process does a full paint
J) Web process sends a message to the UI process with the updated bits
<time passes>
K) CoreIPC sends the message from the web process to the UI process
<time passes>
L) UI process receives the message
M) UI process allocates a new backing store
N) UI process blits the updated bits into the new backing store
O) UI process blits the backing store into the native view

However, if the web process decides it needs to paint after the backing store has been thrown away but before the UI process decides *it* needs to paint, steps F, G, and H would go away and these new steps would be added between N and O:

N1) UI process invalidates the native view
<time passes>
N2) OS asks the native view to paint

So.....it's complicated.
Comment 42 Adam Roben (:aroben) 2011-02-28 10:47:36 PST
And of course it gets even more complicated if both the UI process and the web process decide they need to paint around the same time.
Comment 43 Adam Roben (:aroben) 2011-02-28 15:19:40 PST
Turns out bug 55417 will keep the timer from working correctly on Windows.
Comment 44 Adam Roben (:aroben) 2011-03-01 11:14:12 PST
I have a patch that seems to work pretty well, but I'm hitting an assertion I added when running the WebKit2/ResizeViewWhileHidden API test.
Comment 45 Adam Roben (:aroben) 2011-03-03 11:41:49 PST
I'm looking into the assertion failure now.
Comment 46 Adam Roben (:aroben) 2011-03-03 13:03:09 PST
The WebKit2/ResizeViewWhileHidden API test asks the view to resize while hidden (duh). When this happens, the web process ends up sending back a DidUpdateBackingStoreState message with an empty update area (since painting is suspended). When the UI process receives this messsage, it updates its backing store state ID, but doesn't allocate a new backing store (since there's nothing to blit into it). This later confuses us when we don't have a backing store but our backing store state ID is up-to-date.
Comment 47 Adam Roben (:aroben) 2011-03-03 13:05:55 PST
When the assertion fires, the UI process is handling an Update message from the web process. The Update message seems to have been sent before the view was hidden when the view had a size of 802x602. But now the view has been hidden, resized, an unhidden, and its size is 800x600.
Comment 48 Adam Roben (:aroben) 2011-03-03 13:09:27 PST
Hm, but the Update message has the state ID from after the view was resized back to 800x600. How did the size and the state ID get out of sync?
Comment 49 Adam Roben (:aroben) 2011-03-03 13:10:59 PST
I see; the Update's viewSize *is* the right size. It's the updateRectBounds that is some other (larger) size.
Comment 50 Adam Roben (:aroben) 2011-03-03 13:12:29 PST
Ah, DrawingAreaImpl::updateBackingStoreState was uniting the current dirty region with the new bounds. Instead it should just set the dirty region to the new bounds.
Comment 51 Adam Roben (:aroben) 2011-03-03 15:25:58 PST
(In reply to comment #50)
> Ah, DrawingAreaImpl::updateBackingStoreState was uniting the current dirty region with the new bounds. Instead it should just set the dirty region to the new bounds.

This is now bug 55715.
Comment 52 Adam Roben (:aroben) 2011-03-03 17:17:13 PST
> III) Add a boolean argument to the UpdateState message that specifies whether the update needs to happen immediately or can be deferred until the next paint

This is now bug 55730.
Comment 53 Adam Roben (:aroben) 2011-03-03 17:28:56 PST
Created attachment 84662 [details]
Throw away DrawingAreaProxyImpl's backing store after not painting for 5 seconds
Comment 54 Adam Roben (:aroben) 2011-03-03 17:37:55 PST
Committed r80307: <http://trac.webkit.org/changeset/80307>
Comment 55 Adam Roben (:aroben) 2011-03-03 17:38:45 PST
(In reply to comment #9)
> Straw-person proposal:
> 
> 1. Begin with a timer interval of 5s.
> 2. After throwing away a backing store, record the time interval required for the next paint.
> 3. Update the timer interval to 100 * the recorded time interval in 2.
> 
> For fast-painting sites, this will throw backing stores away quickly. For slow-painting sites, this will approximate a maximum thrash cost of 1%.

I punted steps 2 and 3 out to bug 55733.