Bug 93606 - [Qt] UpdateAtlas is wasting memory
Summary: [Qt] UpdateAtlas is wasting memory
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Allan Sandfeld Jensen
URL:
Keywords: Qt, QtTriaged
Depends on:
Blocks:
 
Reported: 2012-08-09 05:20 PDT by Balazs Kelemen
Modified: 2012-08-22 15:21 PDT (History)
8 users (show)

See Also:


Attachments
debug code (1.24 KB, patch)
2012-08-09 05:25 PDT, Balazs Kelemen
no flags Details | Formatted Diff | Diff
fix candidate 1 (5.38 KB, patch)
2012-08-09 05:33 PDT, Balazs Kelemen
no flags Details | Formatted Diff | Diff
fix candidate 2 (10.56 KB, patch)
2012-08-09 05:51 PDT, Balazs Kelemen
no flags Details | Formatted Diff | Diff
QAreaAllocator-Experiment (39.46 KB, patch)
2012-08-10 04:51 PDT, Allan Sandfeld Jensen
no flags Details | Formatted Diff | Diff
Patch (27.84 KB, patch)
2012-08-13 03:44 PDT, Allan Sandfeld Jensen
no flags Details | Formatted Diff | Diff
Patch (27.82 KB, patch)
2012-08-13 03:55 PDT, Allan Sandfeld Jensen
no flags Details | Formatted Diff | Diff
Patch (27.75 KB, patch)
2012-08-14 02:19 PDT, Allan Sandfeld Jensen
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Balazs Kelemen 2012-08-09 05:20:26 PDT
After spending some time reading the code of UpdateAtlas and trying to understand it I have added some printf's because my suspicion was that it don't use the whole surface it allocates. The suspicion has been trusted. I am going to upload the debug code and the results and the patches I made and the results with them. Actually I made two and I'm not sure which is the better :)
Comment 1 Balazs Kelemen 2012-08-09 05:25:36 PDT
Created attachment 157444 [details]
debug code

With this patch I see that:
Atlas cache fail; requiredDim=512, bufsize=4000000, usedBytes=1572864
a lot of times. So if I'm not wrong we use only ~1.5 MB from the ~4 MB surface we allocate (maybe a bit more for smaller tiles but those are rare so it does not make it much better).
Comment 2 Balazs Kelemen 2012-08-09 05:33:36 PDT
Created attachment 157449 [details]
fix candidate 1

This is a fix for the original logic (at least what I get from it). I believe we can possibly use the whole surface area with this.
This is the output with the patch:
"Atlas cache fail; Atlas cache fail; requiredDim=512, bufsize=4000000, usedBytes=2359296"
a lot of times, but less than before. What makes me worry is that we still use only about the half of the surface.
The use case I do is simple: just loading google.com, zooming as possible and scrolling.
Comment 3 Jocelyn Turcotte 2012-08-09 05:42:38 PDT
Allan did some work and had a pretty efficiently working patch using existing Qt code.
I looked at it briefly before, and the problem as it seemed to me is that we ONLY use a layouted sub-area of the atlas for an update size that fits in the chunk.

So while doing full page renderings (e.g. when loading), if my atlas is 2048x2048 and I need to send over 4 tiles of 512x512, then only 2 of them will be successfully allocated. The other half of the buffer is reserved for smaller updates, which won't happen for this frame and the space is wasted.

I didn't look at it very deeply so I might be wrong about this, but I at least did observe wasted space too.
Comment 4 Balazs Kelemen 2012-08-09 05:51:03 PDT
Created attachment 157457 [details]
fix candidate 2

This one divides the surface to equal size sub-buffers, so we can effectively use the whole surface. Atlases are created with a size class, so there can be an atlas for 512x512 requests, one for 256x256, and so on. I did not add the debug code here because it is evident that if there is a cache fail than the whole atlas is used.
Comment 5 Balazs Kelemen 2012-08-09 05:53:42 PDT
(In reply to comment #3)
> Allan did some work and had a pretty efficiently working patch using existing Qt code.
> I looked at it briefly before, and the problem as it seemed to me is that we ONLY use a layouted sub-area of the atlas for an update size that fits in the chunk.
> 
> So while doing full page renderings (e.g. when loading), if my atlas is 2048x2048 and I need to send over 4 tiles of 512x512, then only 2 of them will be successfully allocated. The other half of the buffer is reserved for smaller updates, which won't happen for this frame and the space is wasted.
> 
> I didn't look at it very deeply so I might be wrong about this, but I at least did observe wasted space too.

Yes. This is the problem I am trying to solve in candidate 2. (The other problem is that the current code is just buggy, candidate 1 if fixing it.)
Comment 6 Balazs Kelemen 2012-08-09 09:07:00 PDT
story continues as bugzilla resurrected...

Let's see how many atlases are allocated. For this reason I printed in the update atlas ctor with the dimension (which is contant in trunk and candidate 1). Results (all of them with simply zooming in google.com and scrolling):

ref
UpdateAtlas: 2000
UpdateAtlas: 2000
UpdateAtlas: 2000
UpdateAtlas: 2000
UpdateAtlas: 2000
UpdateAtlas: 2000
UpdateAtlas: 2000
UpdateAtlas: 2000
UpdateAtlas: 2000

candidate 1
UpdateAtlas: 2000
UpdateAtlas: 2000
UpdateAtlas: 2000
UpdateAtlas: 2000

candidate 2
UpdateAtlas: 2048
UpdateAtlas: 1024
UpdateAtlas: 128
UpdateAtlas: 512
UpdateAtlas: 2048
UpdateAtlas: 2048
UpdateAtlas: 128
UpdateAtlas: 512
UpdateAtlas: 1024

So, candidate 1 needs less allocation, but according to my calculation it still use 2.5Mb more memory.

Allan, could you upload your work, so we can compare with these?
Comment 7 Noam Rosenthal 2012-08-09 09:15:45 PDT
I'm good with either direction - the original UpdateAtlas was supposed to make memory allocations more predictable, but it can be fine tuned in many ways and we should pick one :)

I think I like (2) better because the smaller atlases allow us to lock GraphicsSurfaces, e.g. on Mac, without locking everything.
Comment 8 Allan Sandfeld Jensen 2012-08-09 09:36:58 PDT
(In reply to comment #6)
> 
> Allan, could you upload your work, so we can compare with these?

I will try to find it, but essentially it was just an experiment, moving the textureatlas from qt3d and hooking it in as replacement for the existing atlas. It worked just fine, but I never got around to reformating the textureatlas to WebKit style, or make qt3d export it.
Comment 9 Balazs Kelemen 2012-08-09 10:57:03 PDT
(In reply to comment #7)
> I'm good with either direction - the original UpdateAtlas was supposed to make memory allocations more predictable, but it can be fine tuned in many ways and we should pick one :)
> 
> I think I like (2) better because the smaller atlases allow us to lock GraphicsSurfaces, e.g. on Mac, without locking everything.

Cool. However, Jocelyn mentioned that we could probably stay in one simple atlas with a dynamic layout, so now I'm investigating with this.
Comment 10 Allan Sandfeld Jensen 2012-08-10 04:51:21 PDT
Created attachment 157705 [details]
QAreaAllocator-Experiment
Comment 11 Allan Sandfeld Jensen 2012-08-10 04:52:46 PDT
(In reply to comment #10)
> Created an attachment (id=157705) [details]
> QAreaAllocator-Experiment

This replaces updateatlas with hooks to QAreaAllocator which has been imported from Qt3D.
Comment 12 Allan Sandfeld Jensen 2012-08-10 05:02:47 PDT
(In reply to comment #6)
> story continues as bugzilla resurrected...
> 
> Let's see how many atlases are allocated. For this reason I printed in the update atlas ctor with the dimension (which is contant in trunk and candidate 1). Results (all of them with simply zooming in google.com and scrolling):
> 
> ref
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> 
> candidate 1
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> UpdateAtlas: 2000
> 
> candidate 2
> UpdateAtlas: 2048
> UpdateAtlas: 1024
> UpdateAtlas: 128
> UpdateAtlas: 512
> UpdateAtlas: 2048
> UpdateAtlas: 2048
> UpdateAtlas: 128
> UpdateAtlas: 512
> UpdateAtlas: 1024
> 
> So, candidate 1 needs less allocation, but according to my calculation it still use 2.5Mb more memory.
> 
> Allan, could you upload your work, so we can compare with these?

Apparently only two allocations:

WebKit::UpdateAtlas::UpdateAtlas(int, WebKit::ShareableBitmap::Flags) 2048 
WebKit::UpdateAtlas::UpdateAtlas(int, WebKit::ShareableBitmap::Flags) 2048
Comment 13 Balazs Kelemen 2012-08-10 05:28:47 PDT
I'm working on something like QGeneralAllocator, but maybe it will be never be as complete. In this case the best solution could be to convert this code to webkit style and using it.
Comment 14 Kenneth Rohde Christiansen 2012-08-10 05:35:34 PDT
(In reply to comment #13)
> I'm working on something like QGeneralAllocator, but maybe it will be never be as complete. In this case the best solution could be to convert this code to webkit style and using it.

It might be better to commit it as it is and then do the clean ups in separate patches which are easier to review
Comment 15 Kenneth Rohde Christiansen 2012-08-10 05:43:15 PDT
Comment on attachment 157705 [details]
QAreaAllocator-Experiment

View in context: https://bugs.webkit.org/attachment.cgi?id=157705&action=review

> Source/WebKit2/WebProcess/WebPage/UpdateAtlas.cpp:53
> +    delete m_areaAllocator;
> +    m_areaAllocator = 0;

OwnPtr? it has a clear() ? no?

> Source/WebKit2/WebProcess/WebPage/UpdateAtlas.cpp:76
> +   IntRect intrect(IntPoint::zero(), size);
> +   if (flags() & ShareableBitmap::SupportsAlpha) {
>          graphicsContext->setCompositeOperation(CompositeCopy);
> -        graphicsContext->fillRect(rect, Color::transparent, ColorSpaceDeviceRGB);
> +        graphicsContext->fillRect(intrect, Color::transparent, ColorSpaceDeviceRGB);

wrong indentation,

Why not just fillRect(IntRect(IntPoint::zero(), size)), ...
Comment 16 Noam Rosenthal 2012-08-10 08:21:00 PDT
(In reply to comment #14)
> (In reply to comment #13)
> > I'm working on something like QGeneralAllocator, but maybe it will be never be as complete. In this case the best solution could be to convert this code to webkit style and using it.
> 
> It might be better to commit it as it is and then do the clean ups in separate patches which are easier to review

That would probably break EFL, if they already use COORDINATED_GRAPHICS.
Comment 17 Kenneth Rohde Christiansen 2012-08-10 09:00:40 PDT
(In reply to comment #16)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > I'm working on something like QGeneralAllocator, but maybe it will be never be as complete. In this case the best solution could be to convert this code to webkit style and using it.
> > 
> > It might be better to commit it as it is and then do the clean ups in separate patches which are easier to review
> 
> That would probably break EFL, if they already use COORDINATED_GRAPHICS.

The class can be added without being enabled for that reason
Comment 18 Noam Rosenthal 2012-08-10 09:19:10 PDT
(In reply to comment #17)
> (In reply to comment #16)
> > (In reply to comment #14)
> > > (In reply to comment #13)
> > > > I'm working on something like QGeneralAllocator, but maybe it will be never be as complete. In this case the best solution could be to convert this code to webkit style and using it.
> > > 
> > > It might be better to commit it as it is and then do the clean ups in separate patches which are easier to review
> > 
> > That would probably break EFL, if they already use COORDINATED_GRAPHICS.
> 
> The class can be added without being enabled for that reason

OK then :)
Comment 19 Allan Sandfeld Jensen 2012-08-13 03:44:44 PDT
Created attachment 157952 [details]
Patch
Comment 20 Allan Sandfeld Jensen 2012-08-13 03:55:13 PDT
Created attachment 157953 [details]
Patch

Fix a couple of leftover Qtisms
Comment 21 Jocelyn Turcotte 2012-08-13 04:09:55 PDT
Comment on attachment 157953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=157953&action=review

I didn't look at the AreaAllocator code, but here are some comments for the rest.

> Source/WebKit2/Target.pri:706
> +    WebProcess/WebPage/AreaAllocator.cpp \

Are we still the only ones building LayerTreeCoordinator? It seems so in my tree but it's quite old.

> Source/WebKit2/WebProcess/WebPage/AreaAllocator.cpp:97
> +void GeneralAreaAllocator::expand(const WebCore::IntSize& size)
> +{

I'm wondering if it's better to keep the code as similar to the original as possible, or to remove code that we're not using?
I'd be more in favor of the later personally.

> Source/WebKit2/WebProcess/WebPage/CoordinatedGraphics/LayerTreeCoordinator.cpp:636
> -    static const int ScratchBufferDimension = 2000;
> -    m_updateAtlases.append(UpdateAtlas(ScratchBufferDimension, flags));
> -    return m_updateAtlases.last().beginPaintingOnAvailableBuffer(handle, size, offset);
> +    m_updateAtlases.append(adoptPtr(new UpdateAtlas(size, flags)));
> +    return m_updateAtlases.last()->beginPaintingOnAvailableBuffer(handle, size, offset);

You're now using the size of the needed update to determine the size of the constructed atlas, is this intended?
Maybe it would be nice not to change the behavior of the size used for the atlas in this patch to segment possible regressions. You could keep the hard-coded value to 2000 or 2048 for now.

> Source/WebKit2/WebProcess/WebPage/UpdateAtlas.cpp:43
> +UpdateAtlas::~UpdateAtlas()
>  {

It's empty, please check if it's necessary. If it's just so that the OwnPtr destructor knows its type, then you can remove the AreaAllocator.h include from UpdateAtlas.h.
Comment 22 Kenneth Rohde Christiansen 2012-08-13 08:41:38 PDT
> I'm wondering if it's better to keep the code as similar to the original as possible, or to remove code that we're not using?
> I'd be more in favor of the later personally.


I would go with the latter. But I suggest doing these cleans up in future patches to avoid regressions.
Comment 23 Kenneth Rohde Christiansen 2012-08-13 08:45:23 PDT
Comment on attachment 157953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=157953&action=review

> Source/WebKit2/WebProcess/WebPage/AreaAllocator.cpp:102
> +    if (m_root->rect.size() == m_size)
> +        return; // No change.
> +    if (!m_root->left && m_root->largestFree.width() > 0) {

I would suggest a newline after the return

> Source/WebKit2/WebProcess/WebPage/AreaAllocator.cpp:116
> +    if (m_size.width() >= m_size.height())
> +        split = SplitOnX;
> +    else
> +        split = SplitOnY;
> +    while (m_root->rect.size() != m_size) {

also here a newline before while

> Source/WebKit2/WebProcess/WebPage/AreaAllocator.cpp:150
> +static inline bool fitsWithin(const WebCore::IntSize &size1, const WebCore::IntSize &size2)

source, target?

> Source/WebKit2/WebProcess/WebPage/AreaAllocator.h:104
> +    Node *splitNode(Node*, Split);

style
Comment 24 Noam Rosenthal 2012-08-13 18:05:24 PDT
Comment on attachment 157953 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=157953&action=review

>> Source/WebKit2/Target.pri:706
>> +    WebProcess/WebPage/AreaAllocator.cpp \
> 
> Are we still the only ones building LayerTreeCoordinator? It seems so in my tree but it's quite old.

EFL are building LayerTreeCoordinator as well.

> Source/WebKit2/WebProcess/WebPage/AreaAllocator.cpp:39
> +    m_size.expandedTo(size);

This statement would have no effect.
Comment 25 Allan Sandfeld Jensen 2012-08-14 02:02:06 PDT
(In reply to comment #24)
> > Source/WebKit2/WebProcess/WebPage/AreaAllocator.cpp:39
> > +    m_size.expandedTo(size);
> 
> This statement would have no effect.

Are you sure? I thought it would expand m_size to the minimum of m_size and size.
Comment 26 Allan Sandfeld Jensen 2012-08-14 02:19:18 PDT
Created attachment 158265 [details]
Patch
Comment 27 Noam Rosenthal 2012-08-14 03:49:02 PDT
(In reply to comment #25)
> (In reply to comment #24)
> > > Source/WebKit2/WebProcess/WebPage/AreaAllocator.cpp:39
> > > +    m_size.expandedTo(size);
> > 
> > This statement would have no effect.
> 
> Are you sure? I thought it would expand m_size to the minimum of m_size and size.

Yes, if I read the code correctly you need to assign the result, e.g. m_size = m_size.expandedTo(size). IntSize::expandedTo is const.
Comment 28 Allan Sandfeld Jensen 2012-08-14 04:22:24 PDT
Committed r125540: <http://trac.webkit.org/changeset/125540>
Comment 29 Eric Seidel (no email) 2012-08-22 15:21:05 PDT
Comment on attachment 158265 [details]
Patch

Cleared review? from attachment 158265 [details] so that this bug does not appear in http://webkit.org/pending-review.  If you would like this patch reviewed, please attach it to a new bug (or re-open this bug before marking it for review again).