Bug 60008 - [chromium] Use mapTexSubImage2D for tile uploads if available
Summary: [chromium] Use mapTexSubImage2D for tile uploads if available
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Other OS X 10.5
: P2 Normal
Assignee: Nat Duca
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-05-02 22:03 PDT by Nat Duca
Modified: 2011-05-11 19:17 PDT (History)
6 users (show)

See Also:


Attachments
Patch (8.65 KB, patch)
2011-05-02 22:04 PDT, Nat Duca
no flags Details | Formatted Diff | Diff
Remove webview instrumentation. (7.10 KB, patch)
2011-05-02 22:07 PDT, Nat Duca
no flags Details | Formatted Diff | Diff
Remove tracing, use map in hud. (10.11 KB, patch)
2011-05-03 14:29 PDT, Nat Duca
no flags Details | Formatted Diff | Diff
Nits. (9.58 KB, patch)
2011-05-09 18:53 PDT, Nat Duca
no flags Details | Formatted Diff | Diff
Check for mapsub once. (11.29 KB, patch)
2011-05-11 16:06 PDT, Nat Duca
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Nat Duca 2011-05-02 22:03:43 PDT
[chromium] Use mapTexSubImage2D for tile uploads if available
Comment 1 Nat Duca 2011-05-02 22:04:04 PDT
Created attachment 92043 [details]
Patch
Comment 2 Nat Duca 2011-05-02 22:07:24 PDT
Created attachment 92044 [details]
Remove webview instrumentation.
Comment 3 James Robinson 2011-05-02 22:37:59 PDT
Comment on attachment 92044 [details]
Remove webview instrumentation.

View in context: https://bugs.webkit.org/attachment.cgi?id=92044&action=review

> Source/WebCore/platform/graphics/chromium/LayerTilerChromium.cpp:281
> +        TRACE_EVENT("LayerTilerChromium::update::paint", this, String::format("%i %i", m_paintRect.width(), m_paintRect.height()).utf8().data());

nit: does this mean we do a string format + utf16->utf8 conversion for each paint call, or does the macro only evaluate its arguments when tracing is actually enabled?  Also, does this actually put the string contents in the trace?  Could we just pass two ints?

> Source/WebCore/platform/graphics/chromium/LayerTilerChromium.cpp:361
> +                    memcpy(pixelDest, &paintPixels[4 * paintOffset.y() * paintRect.width()], paintRect.width() * destRect.height() * 4);

is memcpy()ing into a mapped buffer more efficient than passing the value to texSubImage2D directly?  there's a higher synchronization overhead for mapped buffers than for the normal transfer buffer, iirc

> Source/WebCore/platform/graphics/chromium/LayerTilerChromium.cpp:379
> +                        memcpy(&m_tilePixels[destRect.width() * 4 * row],

in the mapped case it seems like we want to avoid allocating m_tilePixels at all - it's an extra 256x256x4 = 262,144 byte overhead for each composited layer
Comment 4 Adrienne Walker 2011-05-03 07:50:49 PDT
(In reply to comment #3)
> (From update of attachment 92044 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=92044&action=review
> 
> > Source/WebCore/platform/graphics/chromium/LayerTilerChromium.cpp:281
> > +        TRACE_EVENT("LayerTilerChromium::update::paint", this, String::format("%i %i", m_paintRect.width(), m_paintRect.height()).utf8().data());
> 
> nit: does this mean we do a string format + utf16->utf8 conversion for each paint call, or does the macro only evaluate its arguments when tracing is actually enabled?  Also, does this actually put the string contents in the trace?  Could we just pass two ints?

I disagree.  I think the trace output should be *more* verbose (so long as it doesn't get executed except when tracing).  I had in my mind that we could eventually pass JSON through trace events, so we could do something like { paintedPixels: x, paintedTiles: y, usingMapSubImage: false }.  Then (insert hand waving here) you could select an event (or a set of events) on the trace and you could see aggregated values of these arguments.

It might make sense to just push this part of the change into a smaller "improving tracing" patch anyway.
Comment 5 Nat Duca 2011-05-03 14:29:57 PDT
Created attachment 92132 [details]
Remove tracing, use map in hud.
Comment 6 Nat Duca 2011-05-04 12:02:31 PDT
+vangelis for the overall discussion of should we even commit this change.

Upside: smooths out scrolling
Downside: uses memory because we don't have a memory manager for the allocations created by the mapBuffer system

Thoughts?
Comment 7 Vangelis Kokkevis 2011-05-05 10:25:10 PDT
Comment on attachment 92132 [details]
Remove tracing, use map in hud.

View in context: https://bugs.webkit.org/attachment.cgi?id=92132&action=review

Nat is right that I'm concerned about possibly over-allocating memory if we use mapped textures for the tiler (which means we'll be using mapped textures for the root and every content layer). 
A couple of questions that I'd be interested in getting answered:
1. Do we get a new shared memory block every time we map a new texture?  
2. What's the overhead of creating the shared memory? 
3. When does the memory allocated by the mapped tiles get freed?  Does it get freed when the texture is deleted? 

I agree that we should try to get the visible tab as many resources as there are available to speed it up.  Maybe all we need is to add an additional method to the tiler to force de-allocation of all the shared memory resources for mapped tiles and call it whenever the tab associated with the layer renderer goes to the background (or only keep some of them alive via an LRU scheme).

> Source/WebCore/platform/graphics/chromium/LayerTilerChromium.cpp:59
> +    m_useMapSubForUploads = layerRendererContext()->getExtensions()->supports("GL_CHROMIUM_map_sub");

I think it would be useful to make that a creation time option for the tiler so that we can decide which tilers use mapped textures.  For example we could use mapped textures for root tiles but not for content layer tiles to control memory usage.

> Source/WebCore/platform/graphics/chromium/LayerTilerChromium.cpp:91
> +    if (m_useMapSubForUploads)

shouldn't this be if (!m_useMapSubForUploads)  ?

> Source/WebCore/platform/graphics/chromium/cc/CCHeadsUpDisplay.h:72
> +    static const int kBeginFrameHistorySize = 64;

This change doesn't seem to belong to this patch.
Comment 8 Antoine Labour 2011-05-05 10:41:10 PDT
For your questions 1 2 and 3. Currently the shm is allocated on demand, used for the duration of the map, and then returned to a pool where it gets reused. There's a couple of sync IPCs to create the shm the first time (could in theory be reduced to 1 on mac and 0 on windows/linux) but after that the overhead is virtually null. There is no size cap and the shm isn't freed until the context is freed.

So in steady state for reasonable content the shm pool grows to about the amount of transfer data you need per frame or 2, and stays there, there is no synchronization and you save a copy per transfer.

Malicious content could grow the pool to an arbitrary large size. It should be easy to add a cap (which implies synchronization once the cap is reached).
To reduce memory usage, it is also possible (and not hard) to reclaim unused shm, possibly aggressively for off-screen content, at the cost of some synchronization when the content comes back on-screen.
Comment 9 Vangelis Kokkevis 2011-05-05 12:02:50 PDT
(In reply to comment #8)
> For your questions 1 2 and 3. Currently the shm is allocated on demand, used for the duration of the map, and then returned to a pool where it gets reused. There's a couple of sync IPCs to create the shm the first time (could in theory be reduced to 1 on mac and 0 on windows/linux) but after that the overhead is virtually null. There is no size cap and the shm isn't freed until the context is freed.
> 
> So in steady state for reasonable content the shm pool grows to about the amount of transfer data you need per frame or 2, and stays there, there is no synchronization and you save a copy per transfer.
> 
> Malicious content could grow the pool to an arbitrary large size. It should be easy to add a cap (which implies synchronization once the cap is reached).
> To reduce memory usage, it is also possible (and not hard) to reclaim unused shm, possibly aggressively for off-screen content, at the cost of some synchronization when the content comes back on-screen.

Thanks, Antoine.

We have several memory "managers" in place and we need to centralize them as a separate task definitely.  So I'm warming up to the idea of putting this patch in and dealing with memory management issues after that.
Comment 10 Nat Duca 2011-05-09 18:42:10 PDT
jamesr, enne, any comments?
Comment 11 Nat Duca 2011-05-09 18:51:13 PDT
(In reply to comment #7)
> I think it would be useful to make that a creation time option for the tiler so that we can decide which tilers use mapped textures.  For example we could use mapped textures for root tiles but not for content layer tiles to control memory usage.

I tried this on my local build and while its a simple change, right now, having this is a benefit 99% of the time. This is basically because our texImage2d implementation always blocks.

My suggestion is that we defer this to when we make a memory manager change.
Comment 12 Nat Duca 2011-05-09 18:53:53 PDT
Created attachment 92905 [details]
Nits.
Comment 13 Adrienne Walker 2011-05-10 10:38:19 PDT
(In reply to comment #10)
> jamesr, enne, any comments?

The patch itself looks good to me.  The memory issues definitely need to get solved, but I think our context-limiting code will help mitigate global memory usage in the short term.
Comment 14 James Robinson 2011-05-10 10:51:50 PDT
Comment on attachment 92905 [details]
Nits.

View in context: https://bugs.webkit.org/attachment.cgi?id=92905&action=review

See if you can consolidate the extension query+enable, but otherwise this looks fine.

> Source/WebCore/platform/graphics/chromium/LayerTilerChromium.cpp:61
> +    m_useMapSubForUploads = layerRendererContext()->getExtensions()->supports("GL_CHROMIUM_map_sub");
> +    if (m_useMapSubForUploads)
> +        layerRendererContext()->getExtensions()->ensureEnabled("GL_CHROMIUM_map_sub");

Rather than querying+enabling this once per LayerTilerChromium can we do this on the LRC to just do it once per compositor context?  It seems awfully wasteful to have to do this once per layer.

> Source/WebCore/platform/graphics/chromium/cc/CCHeadsUpDisplay.cpp:76
> +    m_useMapSubForUploads = m_layerRenderer->context()->getExtensions()->supports("GL_CHROMIUM_map_sub");
> +    if (m_useMapSubForUploads)
> +        m_layerRenderer->context()->getExtensions()->ensureEnabled("GL_CHROMIUM_map_sub");

This could reuse the bit on LRC if it existed there.
Comment 15 Nat Duca 2011-05-11 12:57:49 PDT
(In reply to comment #14)
> See if you can consolidate the extension query+enable, but otherwise this looks fine.

I liked this suggestion when I first saw it, because I agree its lame to query this for every WebGL layer. We'd always have to enable it on every context, though.... that's the GL way.

Then when I was coding this, I realized that we're looking for this extension on the WebGL layer context, not the compositor context. Right now, the presence of the extension on one implies presence on the other, but that's a pretty fragile assumption. Consider, e.g., the threading work where the compositor context isn't available on the main thread, or the quirky platforms we've been talking about where WebGL contexts may be out of process but compositor contexts may be in-process.

I'm leaning toward committing as-is because the suggested optimization is too fragile for my taste.

Thoughts?
Comment 16 James Robinson 2011-05-11 13:19:48 PDT
(In reply to comment #15)
> (In reply to comment #14)
> > See if you can consolidate the extension query+enable, but otherwise this looks fine.
> 
> I liked this suggestion when I first saw it, because I agree its lame to query this for every WebGL layer. We'd always have to enable it on every context, though.... that's the GL way.
> 
> Then when I was coding this, I realized that we're looking for this extension on the WebGL layer context, not the compositor context. Right now, the presence of the extension on one implies presence on the other, but that's a pretty fragile assumption. Consider, e.g., the threading work where the compositor context isn't available on the main thread, or the quirky platforms we've been talking about where WebGL contexts may be out of process but compositor contexts may be in-process.
> 
> I'm leaning toward committing as-is because the suggested optimization is too fragile for my taste.
> 
> Thoughts?

This patch enables the extension once for each LayerTilerChromium, but all the queries and enables are on the same context (the compositor context).  I don't think there are any WebGL contexts at play.
Comment 17 Nat Duca 2011-05-11 14:33:40 PDT
> This patch enables the extension once for each LayerTilerChromium, but all the queries and enables are on the same context (the compositor context).  I don't think there are any WebGL contexts at play.

Unless I read the code wrong [which is totally possible] this patch uses m_context --- m_context points at the WebGL context, not at layerRenderer()->context().
Comment 18 James Robinson 2011-05-11 14:38:16 PDT
(In reply to comment #17)
> > This patch enables the extension once for each LayerTilerChromium, but all the queries and enables are on the same context (the compositor context).  I don't think there are any WebGL contexts at play.
> 
> Unless I read the code wrong [which is totally possible] this patch uses m_context --- m_context points at the WebGL context, not at layerRenderer()->context().

I think we're talking about different patches - this is the extension code I'm referring to (from LayerTilerChromium.cpp):

 59    m_useMapSubForUploads = m_layerRenderer->context()->getExtensions()->supports("GL_CHROMIUM_map_sub");
 60    if (m_useMapSubForUploads)
 61        m_layerRenderer->context()->getExtensions()->ensureEnabled("GL_CHROMIUM_map_sub");

and (from CCHeadsUpDisplay.cpp):

 74    m_useMapSubForUploads = m_layerRenderer->context()->getExtensions()->supports("GL_CHROMIUM_map_sub");
 75    if (m_useMapSubForUploads)
 76        m_layerRenderer->context()->getExtensions()->ensureEnabled("GL_CHROMIUM_map_sub");
Comment 19 Nat Duca 2011-05-11 14:40:27 PDT
(In reply to comment #18)
> (In reply to comment #17)
Ugh I suck. My bad.
Comment 20 Nat Duca 2011-05-11 16:06:54 PDT
Created attachment 93205 [details]
Check for mapsub once.
Comment 21 James Robinson 2011-05-11 16:11:56 PDT
Comment on attachment 93205 [details]
Check for mapsub once.

View in context: https://bugs.webkit.org/attachment.cgi?id=93205&action=review

R=me

> Source/WebCore/platform/graphics/chromium/LayerRendererChromium.cpp:125
>      m_contextSupportsLatch = m_context->getExtensions()->supports("GL_CHROMIUM_latch");
> +    m_contextSupportsMapSub = m_context->getExtensions()->supports("GL_CHROMIUM_map_sub");
> +    if (m_contextSupportsMapSub)
> +        m_context->getExtensions()->ensureEnabled("GL_CHROMIUM_map_sub");

hmm, looks like I should go ask John Bates about why we aren't calling ensureEnabled("GL_CHROMIUM_latch").
Comment 22 WebKit Commit Bot 2011-05-11 19:16:56 PDT
Comment on attachment 93205 [details]
Check for mapsub once.

Clearing flags on attachment: 93205

Committed r86301: <http://trac.webkit.org/changeset/86301>
Comment 23 WebKit Commit Bot 2011-05-11 19:17:01 PDT
All reviewed patches have been landed.  Closing bug.