Bug 102994

Summary: [META][Qt] Implement Threaded Coordinated Graphics in WebKit1
Product: WebKit Reporter: Jae Hyun Park <jaepark>
Component: Layout and RenderingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED INVALID    
Severity: Normal CC: abecsi, d.nomiyama, dongseong.hwang, hausmann, jturcotte, laszlo.gombos, luiz, noam, ostap73, sergio, skyul, yoon, zeno
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 103854, 100819, 103046, 103233, 103864, 103983, 104360, 108149, 108899, 109661    
Bug Blocks: 100341    

Description Jae Hyun Park 2012-11-21 17:10:16 PST
Apply Coordinated Graphics in WebKit1.
Comment 1 Jae Hyun Park 2012-12-05 01:42:40 PST
The design document for this meta bug is in the following document.
https://docs.google.com/document/pub?id=1u1Oif9lDmVt_MkJLxaEYbTlOuZWVJcKrM1vAxAvFrZk

Any comments/discussions are appreciated.
Comment 2 Simon Hausmann 2012-12-17 05:44:31 PST
How would rendering work in this model?

I see two threads, the main thread and a separate thread for coordinate graphics. Which tasks fall into which thread?
Comment 3 Dongseong Hwang 2012-12-17 13:33:39 PST
(In reply to comment #2)
> How would rendering work in this model?
> 
> I see two threads, the main thread and a separate thread for coordinate graphics. Which tasks fall into which thread?

Hi, I'm glad for you to interest this project.

Currently, coordinated graphics has two kind of GraphicsLayer tree: CoordinatedGraphicsLayer tree on the main thread in Web Process, GraphicsLayerTextureMapper tree on the rendering thread in UI Process.
After this meta bug completes, we can get GraphicsLayerTextureMapper tree on the rendering thread in WK1 also. It means Texture Mapper runs off the main thread.

You can get more information from our design document. https://docs.google.com/document/pub?id=1UoI1zk-6nTUFtz8i4evURM8aQIjkDRC8boO1zPdMMBg
Comment 4 Simon Hausmann 2012-12-18 01:52:55 PST
(In reply to comment #3)
> (In reply to comment #2)
> > How would rendering work in this model?
> > 
> > I see two threads, the main thread and a separate thread for coordinate graphics. Which tasks fall into which thread?
> 
> Hi, I'm glad for you to interest this project.
> 
> Currently, coordinated graphics has two kind of GraphicsLayer tree: CoordinatedGraphicsLayer tree on the main thread in Web Process, GraphicsLayerTextureMapper tree on the rendering thread in UI Process.
> After this meta bug completes, we can get GraphicsLayerTextureMapper tree on the rendering thread in WK1 also. It means Texture Mapper runs off the main thread.
> 
> You can get more information from our design document. https://docs.google.com/document/pub?id=1UoI1zk-6nTUFtz8i4evURM8aQIjkDRC8boO1zPdMMBg

I'm sorry, it's still not entirely clear to me :(

What exactly is it that the compositor thread does in terms of graphics? I understand that you want to spin timers there to update animation states. But what does it do in term of rendering?

The actual layers themselves cannot be rendered from within the compositor thread without blocking the main thread. Or do you populate the layers from within the main thread via the ShareableSurface? 

Assuming you do that - populate the layers in the main thread, maintain the layer composition in the compositor thread - then how do you integrate that with the rendering in Qt? In WK1 the rendering happens in the main thread, when ::paintEvent() is called. In your scenario, how would the ::paintEvent() (or QGraphicsWidget::paint) implementation look like conceptually?
Comment 5 Dongseong Hwang 2012-12-18 03:01:35 PST
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > How would rendering work in this model?
> > > 
> > > I see two threads, the main thread and a separate thread for coordinate graphics. Which tasks fall into which thread?
> > 
> > Hi, I'm glad for you to interest this project.
> > 
> > Currently, coordinated graphics has two kind of GraphicsLayer tree: CoordinatedGraphicsLayer tree on the main thread in Web Process, GraphicsLayerTextureMapper tree on the rendering thread in UI Process.
> > After this meta bug completes, we can get GraphicsLayerTextureMapper tree on the rendering thread in WK1 also. It means Texture Mapper runs off the main thread.
> > 
> > You can get more information from our design document. https://docs.google.com/document/pub?id=1UoI1zk-6nTUFtz8i4evURM8aQIjkDRC8boO1zPdMMBg
> 
> I'm sorry, it's still not entirely clear to me :(
> 
> What exactly is it that the compositor thread does in terms of graphics? I understand that you want to spin timers there to update animation states. But what does it do in term of rendering?
> 
> The actual layers themselves cannot be rendered from within the compositor thread without blocking the main thread. Or do you populate the layers from within the main thread via the ShareableSurface? 
> 
> Assuming you do that - populate the layers in the main thread, maintain the layer composition in the compositor thread - then how do you integrate that with the rendering in Qt? In WK1 the rendering happens in the main thread, when ::paintEvent() is called. In your scenario, how would the ::paintEvent() (or QGraphicsWidget::paint) implementation look like conceptually?

QT WK2 coordinated graphics already draws contents using TextureMapper off the main thread in UI Process.
You can see Vyacheslav's explanation : http://lists.webkit.org/pipermail/webkit-dev/2012-November/022924.html
QtQuick 5.0 url is changed : http://qt-project.org/doc/qt-5.0/qtquick/qquickitem.html#updatePaintNode

Coordinated Graphics maintains duplicated layer tree in the rendering thread, and the rendering thread paints contents by 60fps if animation exists.

Our way about separating threads would be the similar to current coordinated graphics' way. If webview is backed by qquickitem, the rendering thread draws UI Widgets as well as web contents, and synchronizes with the main thread only when updating states.
I don't decide yet whether to use qquickitem like WK2 to use the rendering thread or to use something like RunLoop in WK2. We can not just use thread because we need to run a timer in the separated thread. If we use RunLoop, we draws contents to texture (FBO) off the main thread, and blits texture to webview's surface on the main thread. The later way is hard to implement but a platform independent approach. We need to discuss whether is proper.
Comment 6 Simon Hausmann 2012-12-18 04:16:39 PST
(In reply to comment #5)
[...]
> > What exactly is it that the compositor thread does in terms of graphics? I understand that you want to spin timers there to update animation states. But what does it do in term of rendering?
> > 
> > The actual layers themselves cannot be rendered from within the compositor thread without blocking the main thread. Or do you populate the layers from within the main thread via the ShareableSurface? 
> > 
> > Assuming you do that - populate the layers in the main thread, maintain the layer composition in the compositor thread - then how do you integrate that with the rendering in Qt? In WK1 the rendering happens in the main thread, when ::paintEvent() is called. In your scenario, how would the ::paintEvent() (or QGraphicsWidget::paint) implementation look like conceptually?
> 
> QT WK2 coordinated graphics already draws contents using TextureMapper off the main thread in UI Process.
> You can see Vyacheslav's explanation : http://lists.webkit.org/pipermail/webkit-dev/2012-November/022924.html
> QtQuick 5.0 url is changed : http://qt-project.org/doc/qt-5.0/qtquick/qquickitem.html#updatePaintNode
> 
> Coordinated Graphics maintains duplicated layer tree in the rendering thread, and the rendering thread paints contents by 60fps if animation exists.
> 
> Our way about separating threads would be the similar to current coordinated graphics' way. If webview is backed by qquickitem, the rendering thread draws UI Widgets as well as web contents, and synchronizes with the main thread only when updating states.

"_If_ webview is backed by qquickitem" - That is exactly the point. With WebKit1 we do not have that situation, there is currently no support for using WebKit1 with QQuickItem (QtQuick2).

> I don't decide yet whether to use qquickitem like WK2 to use the rendering thread or to use something like RunLoop in WK2. We can not just use thread because we need to run a timer in the separated thread. If we use RunLoop, we draws contents to texture (FBO) off the main thread, and blits texture to webview's surface on the main thread. The later way is hard to implement but a platform independent approach. We need to discuss whether is proper.

Yeah, I think we do need to discuss this. In my opinion in boils down to the question of where we want to go with WebKit1 in the Qt port. Right now WebKit1 is used to integrate with QWidget and QGraphicsView and WebKit2 is used for the QtQuick2 integration.

Technically we have three rendering code paths to maintain right now:

    (1) The coordinated graphics way with WebKit2 and a tiny bit of private Qt API that allows us to integrate with the QtQuick2 scene graph. This implies the availability of OpenGL and rendering happens in a secondary thread (Qt's render thread that finishes off the frame with a swapbuffers call).

    (2) The QWidget (and regular QGraphicsView) way, where we know we're rendering in software into a given QPainter. On the WebKit side that's done using the software texture mapper and WebKit1. Rendering is required to happen in the main thread and ends up in the (image) backing store of the QWindow/Widget.

    (3) The way when using QGraphicsView with a QGLWidget as viewport, where we're again required to render in the main thread but we can use OpenGL and therefore use the GL texture mapper - with WebKit1.


We consciously promote option (1) over the others and would like to keep the effort required to maintain (2) and (3) at a minimum. The way (2) and (3) are tied to the main thread is indeed one of the main reasons for favoring (1).

This bug is effectively suggesting to add a fourth way of rendering that is similar to option (1) but is based on WebKit1 and would potentially be based on a QtQuick2 integration. I would like to see some compelling arguments why we should add this fourth way to the list of supported and maintained rendering code paths for the Qt port.

In other words: Why would somebody choose a potential WebKit1 based QtQuick2 integration over the existing WebKit2 based one?
Comment 7 Dongseong Hwang 2012-12-18 05:06:57 PST
Wow, thank you for explanation. Our team does not have a Qt Graphics expert like you, so we need more information like your comment.

(In reply to comment #6)
> > Our way about separating threads would be the similar to current coordinated graphics' way. If webview is backed by qquickitem, the rendering thread draws UI Widgets as well as web contents, and synchronizes with the main thread only when updating states.
> 
> "_If_ webview is backed by qquickitem" - That is exactly the point. With WebKit1 we do not have that situation, there is currently no support for using WebKit1 with QQuickItem (QtQuick2).

As I read your comment, QtQuick2 is not option of coordinated graphics on WK1, because we need to support qt4.8 also.

> 
> > I don't decide yet whether to use qquickitem like WK2 to use the rendering thread or to use something like RunLoop in WK2. We can not just use thread because we need to run a timer in the separated thread. If we use RunLoop, we draws contents to texture (FBO) off the main thread, and blits texture to webview's surface on the main thread. The later way is hard to implement but a platform independent approach. We need to discuss whether is proper.
> 
> Yeah, I think we do need to discuss this. In my opinion in boils down to the question of where we want to go with WebKit1 in the Qt port. Right now WebKit1 is used to integrate with QWidget and QGraphicsView and WebKit2 is used for the QtQuick2 integration.
> 
> Technically we have three rendering code paths to maintain right now:
> 
>     (1) The coordinated graphics way with WebKit2 and a tiny bit of private Qt API that allows us to integrate with the QtQuick2 scene graph. This implies the availability of OpenGL and rendering happens in a secondary thread (Qt's render thread that finishes off the frame with a swapbuffers call).
> 
>     (2) The QWidget (and regular QGraphicsView) way, where we know we're rendering in software into a given QPainter. On the WebKit side that's done using the software texture mapper and WebKit1. Rendering is required to happen in the main thread and ends up in the (image) backing store of the QWindow/Widget.

I can understand now where TextureMapperImagebuffer is used.

>     (3) The way when using QGraphicsView with a QGLWidget as viewport, where we're again required to render in the main thread but we can use OpenGL and therefore use the GL texture mapper - with WebKit1.
> 
> 
> We consciously promote option (1) over the others and would like to keep the effort required to maintain (2) and (3) at a minimum. The way (2) and (3) are tied to the main thread is indeed one of the main reasons for favoring (1).
> 
> This bug is effectively suggesting to add a fourth way of rendering that is similar to option (1) but is based on WebKit1 and would potentially be based on a QtQuick2 integration. I would like to see some compelling arguments why we should add this fourth way to the list of supported and maintained rendering code paths for the Qt port.
> 
> In other words: Why would somebody choose a potential WebKit1 based QtQuick2 integration over the existing WebKit2 based one?

As I read your comment, it is obvious that we need to use something like RunLoop to run AC on the rendering thread.
This plan would not create (4) option. This plan improve (2) and (3) option.

In the (2) option, we will composite content into a imageBuffer on the rendering thread using TextureMapperImagebuffer, and QWidget will paint the result into a given QPainter.

In the (3) option, we will composite content into a texture on the rendering thread usint TextureMapperGL, and QGraphicsView  will paint the result texture into QGLWidget's surface.

You can ask why we make big changes in (2) and (3) options although we decide to support minimally.

There are two reasons.
1. we will not make new implementation. we just reuse coordinated graphics in WK1. It is refactoring rather than implementation.

2. It make it easy to support (2) and (3) option by minimal effort.
We maintained some legacy code that only (2) and (3) option use.
For example, only (2) and (3) use TextureMapperTiledBackingStore. It causes Bug 105158, but we can not fix by minimal effort. 
There are so big gaps between (1) and (2)/(3) so it is hard to maintain (2) and (3) options. When we add new feature to (1), it is burden to consider how to handle (2) and (3).

This project makes (2) and (3) reuse coordinated graphics for (1).
After complete, we will maintain only coordinated graphics. All legacy code will be removed.

Although we need some effort to refactor, we can actually offer minimal support to (2) and (3) after that.

In addition, this project is not for only improving (2) and (3) in Qt.
After that, we apply threaded coordinated graphics to GTK WK1 and WK2.

I need your feedback :-)
Comment 8 Simon Hausmann 2012-12-18 08:09:50 PST
(In reply to comment #7)
> Wow, thank you for explanation. Our team does not have a Qt Graphics expert like you, so we need more information like your comment.
> 
> (In reply to comment #6)
> > > Our way about separating threads would be the similar to current coordinated graphics' way. If webview is backed by qquickitem, the rendering thread draws UI Widgets as well as web contents, and synchronizes with the main thread only when updating states.
> > 
> > "_If_ webview is backed by qquickitem" - That is exactly the point. With WebKit1 we do not have that situation, there is currently no support for using WebKit1 with QQuickItem (QtQuick2).
> 
> As I read your comment, QtQuick2 is not option of coordinated graphics on WK1, because we need to support qt4.8 also.

Why is that? WebKit trunk does not support Qt 4.8 and instead requires Qt 5.0.

> > > I don't decide yet whether to use qquickitem like WK2 to use the rendering thread or to use something like RunLoop in WK2. We can not just use thread because we need to run a timer in the separated thread. If we use RunLoop, we draws contents to texture (FBO) off the main thread, and blits texture to webview's surface on the main thread. The later way is hard to implement but a platform independent approach. We need to discuss whether is proper.
> > 
> > Yeah, I think we do need to discuss this. In my opinion in boils down to the question of where we want to go with WebKit1 in the Qt port. Right now WebKit1 is used to integrate with QWidget and QGraphicsView and WebKit2 is used for the QtQuick2 integration.
> > 
> > Technically we have three rendering code paths to maintain right now:
> > 
> >     (1) The coordinated graphics way with WebKit2 and a tiny bit of private Qt API that allows us to integrate with the QtQuick2 scene graph. This implies the availability of OpenGL and rendering happens in a secondary thread (Qt's render thread that finishes off the frame with a swapbuffers call).
> > 
> >     (2) The QWidget (and regular QGraphicsView) way, where we know we're rendering in software into a given QPainter. On the WebKit side that's done using the software texture mapper and WebKit1. Rendering is required to happen in the main thread and ends up in the (image) backing store of the QWindow/Widget.
> 
> I can understand now where TextureMapperImagebuffer is used.

Exactly :)

> >     (3) The way when using QGraphicsView with a QGLWidget as viewport, where we're again required to render in the main thread but we can use OpenGL and therefore use the GL texture mapper - with WebKit1.
> > 
> > 
> > We consciously promote option (1) over the others and would like to keep the effort required to maintain (2) and (3) at a minimum. The way (2) and (3) are tied to the main thread is indeed one of the main reasons for favoring (1).
> > 
> > This bug is effectively suggesting to add a fourth way of rendering that is similar to option (1) but is based on WebKit1 and would potentially be based on a QtQuick2 integration. I would like to see some compelling arguments why we should add this fourth way to the list of supported and maintained rendering code paths for the Qt port.
> > 
> > In other words: Why would somebody choose a potential WebKit1 based QtQuick2 integration over the existing WebKit2 based one?
> 
> As I read your comment, it is obvious that we need to use something like RunLoop to run AC on the rendering thread.
> This plan would not create (4) option. This plan improve (2) and (3) option.
> 
> In the (2) option, we will composite content into a imageBuffer on the rendering thread using TextureMapperImagebuffer, and QWidget will paint the result into a given QPainter.
> 
> In the (3) option, we will composite content into a texture on the rendering thread usint TextureMapperGL, and QGraphicsView  will paint the result texture into QGLWidget's surface.

While I agree that this would work I'm not really sure what the benefit of that is, apart from code sharing.

The heavy lifting of populating the layers happens in the main thread, the main application rendering is happening there, too. Only composing a few layers would happen in the thread. Is that worth it? Especially in terms of OpenGL it means that you have to do swapbuffers twice, once for the FBO rendering in the thread (layer composition) and the as part of the main rendering cycle.

And just to make it clear: I am indeed very interested in making (2) and (3) easier to maintain, provided we don't regress in performance.

> You can ask why we make big changes in (2) and (3) options although we decide to support minimally.
> 
> There are two reasons.
> 1. we will not make new implementation. we just reuse coordinated graphics in WK1. It is refactoring rather than implementation.
> 
> 2. It make it easy to support (2) and (3) option by minimal effort.
> We maintained some legacy code that only (2) and (3) option use.
> For example, only (2) and (3) use TextureMapperTiledBackingStore. It causes Bug 105158, but we can not fix by minimal effort. 
> There are so big gaps between (1) and (2)/(3) so it is hard to maintain (2) and (3) options. When we add new feature to (1), it is burden to consider how to handle (2) and (3).
> 
> This project makes (2) and (3) reuse coordinated graphics for (1).
> After complete, we will maintain only coordinated graphics. All legacy code will be removed.
> 
> Although we need some effort to refactor, we can actually offer minimal support to (2) and (3) after that.

I do see the advantage of getting rid of TextureMapperTiledBackingStore, but I do wonder if the result is slower in the end and maybe more complex. No'am, what is your take on all this?

> In addition, this project is not for only improving (2) and (3) in Qt.
> After that, we apply threaded coordinated graphics to GTK WK1 and WK2.

How do you plan to integrate with their application rendering? Also through an intermediate surface where just the layer composition happens in a separate thread?
Comment 9 Noam Rosenthal 2012-12-18 08:23:31 PST
(In reply to comment #8)
> (In reply to comment #7)
> The heavy lifting of populating the layers happens in the main thread, the main application rendering is happening there, too. Only composing a few layers would happen in the thread. Is that worth it? Especially in terms of OpenGL it means that you have to do swapbuffers twice, once for the FBO rendering in the thread (layer composition) and the as part of the main rendering cycle.

I think later maybe we can do something like paint in the main thread to a QPaintBuffer / cairo_recording_surface, and do the actual painting to the buffer in the compositor thread.
But I agree that right now in the Qt context we don't really offload too much to the compositor thread. also I'm not sure if in order to achieve this background painting we have to put 

> I do see the advantage of getting rid of TextureMapperTiledBackingStore, but I do wonder if the result is slower in the end and maybe more complex. No'am, what is your take on all this?
> 
From what I understood this project's goal is more about GTK, and making the compositor work in GTK the same way it does in Chromium, and as a byproduct the code would be more maintainable for Qt as well.
If this project was just for code maintainability I would rather avoid this refactor.
Comment 10 Dongseong Hwang 2012-12-18 19:03:17 PST
(In reply to comment #8)
> Why is that? WebKit trunk does not support Qt 4.8 and instead requires Qt 5.0.

Oops, you are right :)

> > As I read your comment, it is obvious that we need to use something like RunLoop to run AC on the rendering thread.
> > This plan would not create (4) option. This plan improve (2) and (3) option.
> > 
> > In the (2) option, we will composite content into a imageBuffer on the rendering thread using TextureMapperImagebuffer, and QWidget will paint the result into a given QPainter.
> > 
> > In the (3) option, we will composite content into a texture on the rendering thread usint TextureMapperGL, and QGraphicsView  will paint the result texture into QGLWidget's surface.
> 
> While I agree that this would work I'm not really sure what the benefit of that is, apart from code sharing.
> 
> The heavy lifting of populating the layers happens in the main thread, the main application rendering is happening there, too. Only composing a few layers would happen in the thread. Is that worth it? Especially in terms of OpenGL it means that you have to do swapbuffers twice, once for the FBO rendering in the thread (layer composition) and the as part of the main rendering cycle.

nits: we don't need swapbuffers twice. Rendering thread does not need swapbuffer. Rendering thread just draw into FBO.

> And just to make it clear: I am indeed very interested in making (2) and (3) easier to maintain, provided we don't regress in performance.

> > Although we need some effort to refactor, we can actually offer minimal support to (2) and (3) after that.
> 
> I do see the advantage of getting rid of TextureMapperTiledBackingStore, but I do wonder if the result is slower in the end and maybe more complex. No'am, what is your take on all this?
> 

I'm sure it is easier to maintain (2) and (3) after refactoring.

But you can concern performance issue. If refactoring affects performance badly, there can be major two reasons.
1. implementation issue of coordinated graphics
2. thread communication overhead

In the #1 case, we will solve because coordinated graphics is so important.
In the #2 case, we can implement single thread fallback like chromium. We can switch easily rendering loop between RunLoop and just Timer. (It can help debugging.)

IMO, coordinated graphics can be slower at the first painting time, but zooming, scrolling and animations are absolutely faster than WK1 TextureMapepr. I think we does not need to be afraid about performance.


> > In addition, this project is not for only improving (2) and (3) in Qt.
> > After that, we apply threaded coordinated graphics to GTK WK1 and WK2.
> 
> How do you plan to integrate with their application rendering? Also through an intermediate surface where just the layer composition happens in a separate thread?

Now I expect we composite into an intermediate surface (FBO) in a separate thread, and then blit the intermediate surface into Window's surface in the main thread.

(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > The heavy lifting of populating the layers happens in the main thread, the main application rendering is happening there, too. Only composing a few layers would happen in the thread. Is that worth it? Especially in terms of OpenGL it means that you have to do swapbuffers twice, once for the FBO rendering in the thread (layer composition) and the as part of the main rendering cycle.
> 
> I think later maybe we can do something like paint in the main thread to a QPaintBuffer / cairo_recording_surface, and do the actual painting to the buffer in the compositor thread.

Absolutely, it is our goal as well as chromium folk's goal.

> But I agree that right now in the Qt context we don't really offload too much to the compositor thread. also I'm not sure if in order to achieve this background painting we have to put 
> 
> > I do see the advantage of getting rid of TextureMapperTiledBackingStore, but I do wonder if the result is slower in the end and maybe more complex. No'am, what is your take on all this?
> > 
> From what I understood this project's goal is more about GTK, and making the compositor work in GTK the same way it does in Chromium, and as a byproduct the code would be more maintainable for Qt as well.
> If this project was just for code maintainability I would rather avoid this refactor.

Very valid concern. This project had started for GTK. and now for QT and GTK. I agree that the benefit for QT is a bit little and controversial.

We will continue to refactor coordinated graphics until the refactoring has its own value.
When we reach at the point that refactoring makes code more complex, we will discuss with you how to work.
Comment 11 Gwang Yoon Hwang 2012-12-18 20:08:02 PST
(In reply to comment #10)
> (In reply to comment #8)
> > Why is that? WebKit trunk does not support Qt 4.8 and instead requires Qt 5.0.
> 
> Oops, you are right :)
> 
> > > As I read your comment, it is obvious that we need to use something like RunLoop to run AC on the rendering thread.
> > > This plan would not create (4) option. This plan improve (2) and (3) option.
> > > 
> > > In the (2) option, we will composite content into a imageBuffer on the rendering thread using TextureMapperImagebuffer, and QWidget will paint the result into a given QPainter.
> > > 
> > > In the (3) option, we will composite content into a texture on the rendering thread usint TextureMapperGL, and QGraphicsView  will paint the result texture into QGLWidget's surface.
> > 
> > While I agree that this would work I'm not really sure what the benefit of that is, apart from code sharing.
> > 
> > The heavy lifting of populating the layers happens in the main thread, the main application rendering is happening there, too. Only composing a few layers would happen in the thread. Is that worth it? Especially in terms of OpenGL it means that you have to do swapbuffers twice, once for the FBO rendering in the thread (layer composition) and the as part of the main rendering cycle.
> 
> nits: we don't need swapbuffers twice. Rendering thread does not need swapbuffer. Rendering thread just draw into FBO.
> 
> > And just to make it clear: I am indeed very interested in making (2) and (3) easier to maintain, provided we don't regress in performance.
> 
> > > Although we need some effort to refactor, we can actually offer minimal support to (2) and (3) after that.
> > 
> > I do see the advantage of getting rid of TextureMapperTiledBackingStore, but I do wonder if the result is slower in the end and maybe more complex. No'am, what is your take on all this?
> > 
> 
> I'm sure it is easier to maintain (2) and (3) after refactoring.
> 
> But you can concern performance issue. If refactoring affects performance badly, there can be major two reasons.
> 1. implementation issue of coordinated graphics
> 2. thread communication overhead
> 
> In the #1 case, we will solve because coordinated graphics is so important.
> In the #2 case, we can implement single thread fallback like chromium. We can switch easily rendering loop between RunLoop and just Timer. (It can help debugging.)
> 
> IMO, coordinated graphics can be slower at the first painting time, but zooming, scrolling and animations are absolutely faster than WK1 TextureMapepr. I think we does not need to be afraid about performance.
> 
> 
> > > In addition, this project is not for only improving (2) and (3) in Qt.
> > > After that, we apply threaded coordinated graphics to GTK WK1 and WK2.
> > 
> > How do you plan to integrate with their application rendering? Also through an intermediate surface where just the layer composition happens in a separate thread?
> 
> Now I expect we composite into an intermediate surface (FBO) in a separate thread, and then blit the intermediate surface into Window's surface in the main thread.
> 
> (In reply to comment #9)
> > (In reply to comment #8)
> > > (In reply to comment #7)
> > > The heavy lifting of populating the layers happens in the main thread, the main application rendering is happening there, too. Only composing a few layers would happen in the thread. Is that worth it? Especially in terms of OpenGL it means that you have to do swapbuffers twice, once for the FBO rendering in the thread (layer composition) and the as part of the main rendering cycle.
> > 
> > I think later maybe we can do something like paint in the main thread to a QPaintBuffer / cairo_recording_surface, and do the actual painting to the buffer in the compositor thread.
> 
> Absolutely, it is our goal as well as chromium folk's goal.
> 
> > But I agree that right now in the Qt context we don't really offload too much to the compositor thread. also I'm not sure if in order to achieve this background painting we have to put 
> > 
> > > I do see the advantage of getting rid of TextureMapperTiledBackingStore, but I do wonder if the result is slower in the end and maybe more complex. No'am, what is your take on all this?
> > > 
> > From what I understood this project's goal is more about GTK, and making the compositor work in GTK the same way it does in Chromium, and as a byproduct the code would be more maintainable for Qt as well.
> > If this project was just for code maintainability I would rather avoid this refactor.
> 
> Very valid concern. This project had started for GTK. and now for QT and GTK. I agree that the benefit for QT is a bit little and controversial.
> 
> We will continue to refactor coordinated graphics until the refactoring has its own value.
> When we reach at the point that refactoring makes code more complex, we will discuss with you how to work.

Our goal is implementing off-the-thread compositing on WebKitGTK, and implementing Coordinated Graphics in QtWebKit1 is half way of this project.

I think we can omit implementing Coordinated Graphics in QtWebKit1, if Qt does not need this feature. But I think refactoring should be in-progress to achieve the main goal of this project.
Comment 12 Jocelyn Turcotte 2013-01-02 04:32:28 PST
(In reply to comment #9)
> I think later maybe we can do something like paint in the main thread to a QPaintBuffer / cairo_recording_surface, and do the actual painting to the buffer in the compositor thread.
> But I agree that right now in the Qt context we don't really offload too much to the compositor thread. also I'm not sure if in order to achieve this background painting we have to put 

Can you explain what is the ultimate goal regarding what will be offloaded to the secondary thread? This is still not clear.

From my understanding, the reason that Chrome has a compositor thread is because they have set the dependency in this order, fastest to slowest:
1) The user-visible widget/window and the GPU process. This is the one needing maximal responsiveness.
2) The web process compositor thread. Gets scroll events before the main thread does to allow recompositing even if the main thread is busy and sends the resulting FBO to (1)
3) The web process main thread. This is the slow one, the source of latency that we want to make sure won't block (1)

See http://www.chromium.org/developers/design-documents/compositor-thread-architecture

This is what we have with Qt5/WK2:
a) The QtQuick rendering thread. We are currently doing the above's (1) regarding rendering, plus (2) since we didn't want to create an extra shared GL context to render the FBO. The delegatesScrolling() logic allows us to scroll the main layer without having to go through (c).
b) The Qt application's main thread.
c) The WK2 web process. Same as (3) above.

This is how I see where we are trying to head for WK1 with this design:
y) The main application thread. Does the above (1) and (3)
z) The AC thread. Does the above (2)

Now this is what doesn't make sense to me, why would we try to offload AC to a secondary thread while our main thread, also responsible for the final rendering to the user, is already depending on the highest sources of latency in the architecture: loading, script execution and layouting?
Even if the AC thread is free as the wind to render the composited FBO, this speed won't have any advantage since we will have to wait until the main thread is ready before the FBO can be shown to the user. Same for scrolling events, they won't be able to make it to the AC thread efficiently since they have to go through the slowest entity first.
The only advantage I see of all of this regarding performance is that we will be able to compute a few matrix operations in parallel, but I'm not even sure if this will be enough to par with the thread overhead.

The initial need of GTK in bug #100341 was about WK2, why is WK1 having anything to do with this?
Comment 13 Jocelyn Turcotte 2013-01-02 05:39:39 PST
(In reply to comment #12)
Thought about it a bit further during lunch and I see that WK1 is mainly considered as a first step and then to share code, as stated in comment #11.

It is going to bring a few things like AC layer tiling to WK1, but beside this I think that GraphicsLayer is currently a better abstraction point regarding code simplicity and maintenance.
"Making WK1 as complex as WK2" isn't really a fair goal regarding shared code, WK2 is already using in the UI process almost all the AC code that WK1 is using.

If somebody is going to care about WK1 and make sure that AC tiling is properly implemented for it and well tested for regressions, I think that this is worth it.
But if WK1 is only considered as a step stone to make debugging easier to later mainly focus on WK2, which I'm guessing is the case, then I don't think that it is worth it.
Comment 14 Jocelyn Turcotte 2013-01-02 08:11:17 PST
(In reply to comment #13)
Relevant noamr response:
17:19 < noamr> I thought it was valuable to try to unify the code bases; I'm always struggling with not breaking wk1 when I make coordinated graphics 
               changes
17:20 < jturcotte> noamr: Ok, why kind of change is usually breaking WK1 for example?
17:20 < noamr> for example stuff relating to backing stores
17:21 < noamr> there are always 10 places where I have to make the change
17:22 < noamr> right now WK1 AC has some issues
17:22 < noamr> direct composited images I don't think work correctly
17:22 < noamr> WebGL definitely doesn't on Mac
17:23 < noamr> Masks are not in the best shape; but that's on WK2 as well
17:23 < noamr> you can get out-of-memory for large tiles of course
17:25 < noamr> also since nobody is doing GL-based pixel-test gardening for webkit1, other things might get broken constantly

According to this, I agree that using CoordinatedGraphics in WK1 would straighten the behaviors between WebCore and TextureMapper and make it easier to work in this area. The IPC/Threading interface defined in [1] can't be avoided anyway if this is the way we're going for AC with GTK+WK2.

Hopefully my previous girly comments will have brought a bit clarity to compensate for the objecting tone.

[1] https://docs.google.com/document/pub?id=1UoI1zk-6nTUFtz8i4evURM8aQIjkDRC8boO1zPdMMBg
Comment 15 Jae Hyun Park 2013-01-02 15:39:09 PST
(In reply to comment #14)
> (In reply to comment #13)
> Relevant noamr response:
> 17:19 < noamr> I thought it was valuable to try to unify the code bases; I'm always struggling with not breaking wk1 when I make coordinated graphics 
>                changes
> 17:20 < jturcotte> noamr: Ok, why kind of change is usually breaking WK1 for example?
> 17:20 < noamr> for example stuff relating to backing stores
> 17:21 < noamr> there are always 10 places where I have to make the change
> 17:22 < noamr> right now WK1 AC has some issues
> 17:22 < noamr> direct composited images I don't think work correctly
> 17:22 < noamr> WebGL definitely doesn't on Mac
> 17:23 < noamr> Masks are not in the best shape; but that's on WK2 as well
> 17:23 < noamr> you can get out-of-memory for large tiles of course
> 17:25 < noamr> also since nobody is doing GL-based pixel-test gardening for webkit1, other things might get broken constantly
> 
> According to this, I agree that using CoordinatedGraphics in WK1 would straighten the behaviors between WebCore and TextureMapper and make it easier to work in this area. The IPC/Threading interface defined in [1] can't be avoided anyway if this is the way we're going for AC with GTK+WK2.
> 
> Hopefully my previous girly comments will have brought a bit clarity to compensate for the objecting tone.
> 
> [1] https://docs.google.com/document/pub?id=1UoI1zk-6nTUFtz8i4evURM8aQIjkDRC8boO1zPdMMBg

Thank you for your interest. 

We'll be looking forward to your comments on related patches.
Comment 16 Jocelyn Turcotte 2014-02-03 03:23:37 PST
=== Bulk closing of Qt bugs ===

If you believe that this bug report is still relevant for a non-Qt port of webkit.org, please re-open it and remove [Qt] from the summary.

If you believe that this is still an important QtWebKit bug, please fill a new report at https://bugreports.qt-project.org and add a link to this issue. See http://qt-project.org/wiki/ReportingBugsInQt for additional guidelines.