When PSONing, a web page proxy listens to two DrawingAreaProxy. While this may not be an issue when everything is working and rendering is properly suspended in WebProcess, this is fragile design, as shown by regression in https://bugs.webkit.org/show_bug.cgi?id=208918. It seems that, ideally, when the provisional DrawingAreaProxy starts receiving some messages, WebPageProxy should stop listening to the old one. Or maybe do some processing that ensures that rendering is swapped at the exact time draw area proxies are swapped.
(In reply to youenn fablet from comment #0) > When PSONing, a web page proxy listens to two DrawingAreaProxy. > While this may not be an issue when everything is working and rendering is > properly suspended in WebProcess, this is fragile design, as shown by > regression in https://bugs.webkit.org/show_bug.cgi?id=208918. > > It seems that, ideally, when the provisional DrawingAreaProxy starts > receiving some messages, WebPageProxy should stop listening to the old one. This would not be right. While the provisional page exists, the WebPageProxy is still the one displayed on screen and interactive. It definitely should keep listening to IPC from its DrawingArea. At any point, the provisional load can be cancelled and it would be like nothing had happened.
Also, the WebPageProxy does not really have 2 DrawingAreaProxy objects. The WebPageProxy has one and the ProvisionalPageProxy has one. Both proxies use different identifiers for their IPC.
(In reply to Chris Dumez from comment #1) > (In reply to youenn fablet from comment #0) > > When PSONing, a web page proxy listens to two DrawingAreaProxy. > > While this may not be an issue when everything is working and rendering is > > properly suspended in WebProcess, this is fragile design, as shown by > > regression in https://bugs.webkit.org/show_bug.cgi?id=208918. > > > > It seems that, ideally, when the provisional DrawingAreaProxy starts > > receiving some messages, WebPageProxy should stop listening to the old one. > > This would not be right. While the provisional page exists, the WebPageProxy > is still the one displayed on screen and interactive. It definitely should > keep listening to IPC from its DrawingArea. At any point, the provisional > load can be cancelled and it would be like nothing had happened. Currently, both drawing areas are linked to the same WebPageProxy. It they both receive EnterAcceleratedCompositingMode IPC messages roughly at the same time, they will call WebPageProxy:: enterAcceleratedCompositingMode and the last one will win. It would be nice to fix this. There is nothing preventing a buggy or bad-behaving web process to send this IPC message and keep the control of what is being displayed on the supposed-to-be process-swapped tab. It makes sense to me that the drawing areas would forward their messages to their WebPageProxy when they are set to WebPageProxy::m_drawingArea but not otherwise. In the current design, that would probably require buffering some state. I could check again but I believe enterAcceleratedCompositingMode messages are processed for the provisional drawing area before it is assigned to WebPageProxy::m_drawingArea.
Created attachment 393955 [details] Patch
Comment on attachment 393955 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=393955&action=review > Source/WebKit/WebProcess/WebPage/WebPage.cpp:5854 > + unfreezeLayerTree(LayerTreeFreezeReason::ProcessSwap); I think this will likely introduce flashing when navigating cross-site. This is the reason we unfroze the layer tree in didCompletePageTransition(), because we knew that at that point there was something meaningful to paint in the new drawing area.
(In reply to Chris Dumez from comment #5) > Comment on attachment 393955 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=393955&action=review > > > Source/WebKit/WebProcess/WebPage/WebPage.cpp:5854 > > + unfreezeLayerTree(LayerTreeFreezeReason::ProcessSwap); > > I think this will likely introduce flashing when navigating cross-site. This > is the reason we unfroze the layer tree in didCompletePageTransition(), > because we knew that at that point there was something meaningful to paint > in the new drawing area. What is causing the flashing? How can I check that? Is it mostly iOS? I added a timer to delay by 1 second the unfreezing and I did not see it.
Comment on attachment 393955 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=393955&action=review >>> Source/WebKit/WebProcess/WebPage/WebPage.cpp:5854 >>> + unfreezeLayerTree(LayerTreeFreezeReason::ProcessSwap); >> >> I think this will likely introduce flashing when navigating cross-site. This is the reason we unfroze the layer tree in didCompletePageTransition(), because we knew that at that point there was something meaningful to paint in the new drawing area. > > What is causing the flashing? How can I check that? Is it mostly iOS? > I added a timer to delay by 1 second the unfreezing and I did not see it. I don't remember if it is for macOS or iOS. I think Antti wrote this code so I cc'd him. Maybe it is OK to drop the ProcessSwap freeze reason here since we hopefully still have the "PageTransition" freeze reason until didCompletePageTransition(). I am not sure but I am hoping Antti knows. Basically, to test this, you'd want to navigate cross-site (e.g. clicking search results links on google) and make sure you do not see flashes to white (or black if dark mode) when navigating. It may be more or less obvious depending on the sites so try a lot of links.
Also swipe back and forward cross-site, with and without page cache disabled.
I tried on MacOS and it seems fine. Will try on iOS as well.
Comment on attachment 393955 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=393955&action=review >>>> Source/WebKit/WebProcess/WebPage/WebPage.cpp:5854 >>>> + unfreezeLayerTree(LayerTreeFreezeReason::ProcessSwap); >>> >>> I think this will likely introduce flashing when navigating cross-site. This is the reason we unfroze the layer tree in didCompletePageTransition(), because we knew that at that point there was something meaningful to paint in the new drawing area. >> >> What is causing the flashing? How can I check that? Is it mostly iOS? >> I added a timer to delay by 1 second the unfreezing and I did not see it. > > I don't remember if it is for macOS or iOS. I think Antti wrote this code so I cc'd him. Maybe it is OK to drop the ProcessSwap freeze reason here since we hopefully still have the "PageTransition" freeze reason until didCompletePageTransition(). I am not sure but I am hoping Antti knows. Basically, to test this, you'd want to navigate cross-site (e.g. clicking search results links on google) and make sure you do not see flashes to white (or black if dark mode) when navigating. It may be more or less obvious depending on the sites so try a lot of links. The problem was that we would unfreeze the initial empty document on a newly created process, which would then paint and produce white (or black) flash. The code exist to prevent this initial empty document from unfreezing. Is there now something else that stops this now? I would think both didCommitLoad and didCompletePageTransition do get called on initial empty document. I suspect that all the process prewarming we do might be making the case harder to hit but we still shouldn't regress.
Moving the code should be fine in itself but maybe it still needs to test for the initial empty document?
> Is there now something else that stops this now? I would think both > didCommitLoad and didCompletePageTransition do get called on initial empty > document. In the old code, we were checking for m_mainFrame to protect from unfreezing too early. This check was active within WebPage constructor. WebFrameLoaderClient::dispatchDidCommitLoad is not called within WebPage constructor as FrameLoader::dispatchDidCommitLoad has a similar check to m_mainFrame (checking for creatingInitialEmptyDocument state). This patch seems similar to the old code from that point of view. This might be different for page cache but, doing some testing, I do not see any white flash. A potential issue with the patch is that we delay a bit the unfreezing of the rendering tree in the new process. That could lead to a small delay in first paint.
Created attachment 394072 [details] Patch
Ok, makes sense
Could we fix the title according to comment #2?
(In reply to Simon Fraser (smfr) from comment #15) > Could we fix the title according to comment #2? Do you have any suggestion? Before the patch, both drawing areas were listening to IPC. This patch makes it so that either one of them will listen, not both.
Comment on attachment 394072 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394072&action=review > Source/WebKit/ChangeLog:3 > + On PSON, WebPageProxy listens to two DrawAreaProxy at the same time DrawingAreaProxies Maybe "Explicitly activate the new DrawingAreaProxy on PSON navigations" > Source/WebKit/UIProcess/DrawingAreaProxy.h:61 > + void activate(); This name is a bit generic. Maybe becomeActiveDrawingAreaForPage() > Source/WebKit/WebProcess/WebPage/WebPage.cpp:5854 > + auto& coreFrame = *frame->coreFrame(); This should be a Ref<>
Comment on attachment 394072 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394072&action=review >> Source/WebKit/ChangeLog:3 >> + On PSON, WebPageProxy listens to two DrawAreaProxy at the same time > > DrawingAreaProxies > > Maybe "Explicitly activate the new DrawingAreaProxy on PSON navigations" OK >> Source/WebKit/UIProcess/DrawingAreaProxy.h:61 >> + void activate(); > > This name is a bit generic. Maybe becomeActiveDrawingAreaForPage() m_drawingArea->activate() reads better to me in WebPageProxy than m_drawingArea->becomeActiveDrawingAreaForPage(). Could change it to startListening or something related to starting receiving IPC messages. >> Source/WebKit/WebProcess/WebPage/WebPage.cpp:5854 >> + auto& coreFrame = *frame->coreFrame(); > > This should be a Ref<> Let's revert this change actually. There is no perf gain and, calling frame->coreFrame() makes sure that we will nullptr crash instead of a potential UAF. I'll keep the ASSERT though.
Comment on attachment 394072 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=394072&action=review >>> Source/WebKit/UIProcess/DrawingAreaProxy.h:61 >>> + void activate(); >> >> This name is a bit generic. Maybe becomeActiveDrawingAreaForPage() > > m_drawingArea->activate() reads better to me in WebPageProxy than m_drawingArea->becomeActiveDrawingAreaForPage(). > Could change it to startListening or something related to starting receiving IPC messages. I also don't like activate(), I am no idea what it means to "activate" a DrawingAreaProxy.
Created attachment 394164 [details] Patch for landing
Comment on attachment 394164 [details] Patch for landing Some name options: startDrawing startReceivingMessages didInstallInPage Side note: Is it still OK for the DrawingArea destructor to unconditionally call removeMessageReceiver(), even though it can no longer guarantee that it will call addMessageReceiver()? Maybe a better design would not assign WebPageProxy or WebProcessProxy at all in the constructor, and then this function would be called "setPage", and it would be clearer why we delay adding messages receiver until we have a webpage that we are installed into.
(In reply to Geoffrey Garen from comment #21) > Comment on attachment 394164 [details] > Patch for landing > > Some name options: > > startDrawing > startReceivingMessages > didInstallInPage > > Side note: Is it still OK for the DrawingArea destructor to unconditionally > call removeMessageReceiver(), even though it can no longer guarantee that it > will call addMessageReceiver()? Right, we might hit a debug assert. > Maybe a better design would not assign > WebPageProxy or WebProcessProxy at all in the constructor, and then this > function would be called "setPage", and it would be clearer why we delay > adding messages receiver until we have a webpage that we are installed into. Sounds good. My initial idea was to remove the concept of a provisional DrawingAreaProxy, but I decided to not go there since this is used in ViewGestureController. Maybe some further clean-up could be done in that area as well.
> > Maybe a better design would not assign > > WebPageProxy or WebProcessProxy at all in the constructor, and then this > > function would be called "setPage", and it would be clearer why we delay > > adding messages receiver until we have a webpage that we are installed into. > > Sounds good. > > My initial idea was to remove the concept of a provisional DrawingAreaProxy, > but I decided to not go there since this is used in ViewGestureController. > Maybe some further clean-up could be done in that area as well. Well, we still want the DrawingAreaProxy to send some IPC messages. For now, I'll just ensure we do not remove the listener if we did not add it in the first place. I'll try to look at whether we can remove the concept of a provisional drawing area proxy once we know for sure this patch does not trigger any PLT/flash regression.
Created attachment 394247 [details] Patch
Committed r258829: <https://trac.webkit.org/changeset/258829> All reviewed patches have been landed. Closing bug and clearing flags on attachment 394247 [details].
<rdar://problem/60765704>
Created attachment 394341 [details] simple example https://trac.webkit.org/r258829 broke requestAnimationFrame on GTK. In the attached example click on the button to trigger cross-domain navigation. There is expected to be alert fired from requestAnimationFrame which is not happening after the change.
Reopening because of the GTK breakage.
Maybe we should revert the patch then. Carlos, could you take a quick look at the reason we are breaking GTK here?
(In reply to youenn fablet from comment #29) > Maybe we should revert the patch then. > Carlos, could you take a quick look at the reason we are breaking GTK here? Sure, I'll check.
So, the web process drawing area sends an update message to the UI process before the drawing area proxy starts receiving messages and keeps waiting for the DidUpdate forever. I need to investigate more how to fix this. I thought painting was disabled in the drawing area at that point. I'll investigate tomorrow.
Thanks Carlos for the GTK fix, closing this bug.