Bug 218177 - REGRESSION (r268386): Flashes of inverted color when zooming the map on windy.com
Summary: REGRESSION (r268386): Flashes of inverted color when zooming the map on windy...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebGL (show other bugs)
Version: WebKit Local Build
Hardware: Mac All
: P2 Normal
Assignee: Kimmo Kinnunen
URL:
Keywords: InRadar
: 218282 (view as bug list)
Depends on: 212461 217212
Blocks: 218282
  Show dependency treegraph
 
Reported: 2020-10-26 05:01 PDT by Kimmo Kinnunen
Modified: 2020-10-29 01:14 PDT (History)
9 users (show)

See Also:


Attachments
Patch (35.29 KB, patch)
2020-10-26 05:25 PDT, Kimmo Kinnunen
no flags Details | Formatted Diff | Diff
Patch (36.06 KB, patch)
2020-10-26 11:54 PDT, Kimmo Kinnunen
no flags Details | Formatted Diff | Diff
Patch (36.75 KB, patch)
2020-10-26 13:22 PDT, Kimmo Kinnunen
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Kimmo Kinnunen 2020-10-26 05:01:55 PDT
REGRESSION (r268386): Flashes of inverted color when zooming the map on windy.com

Due to the WebGL using uninitialised IOSurfaces when the compositor does not display the WebGL layer.

Made worse by r268386 due to change where the iosurfaces are not reused if they're in use by the CA compositor.
Comment 1 Kimmo Kinnunen 2020-10-26 05:02:31 PDT
<rdar://70418565>
Comment 2 Radar WebKit Bug Importer 2020-10-26 05:03:05 PDT
<rdar://problem/70676037>
Comment 3 Kimmo Kinnunen 2020-10-26 05:25:14 PDT
Created attachment 412309 [details]
Patch
Comment 4 Kimmo Kinnunen 2020-10-26 11:54:19 PDT
Created attachment 412342 [details]
Patch
Comment 5 Dean Jackson 2020-10-26 12:00:57 PDT
Comment on attachment 412342 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=412342&action=review

> LayoutTests/ChangeLog:14
> +        Test case for WebGL which is drawn to a canvas that is not visible. This should still
> +        adhere to preserveDrawingBuffer == false contract of clearing the drawing buffer
> +        correctly.
> +
> +        Case that failed at the time of writing was the case where after draw, the element
> +        would be removed by setting display:none.

Why is it failing on the iOS simulator?
Comment 6 Myles C. Maxfield 2020-10-26 12:39:01 PDT
Comment on attachment 412342 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=412342&action=review

> Source/WebCore/platform/graphics/cocoa/GraphicsContextGLOpenGLCocoa.mm:-337
> -        m_webGLLayer = adoptNS([[WebGLLayer alloc] initWithClient:this devicePixelRatio:attrs.devicePixelRatio contentsOpaque:!attrs.alpha]);

It looks like this eventually ends up calling WebGLRenderingContextBase::didComposite() whenever the composition happens, which seems to affect the web inspector and eventually will call GraphicsContextGLOpenGL::prepareForDisplay(). This functions seems important, and I don't see anything in the ChangeLog that explains why it's okay to stop running that function for each composition.
Comment 7 Kimmo Kinnunen 2020-10-26 12:50:54 PDT
Comment on attachment 412342 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=412342&action=review

>> Source/WebCore/platform/graphics/cocoa/GraphicsContextGLOpenGLCocoa.mm:-337
>> -        m_webGLLayer = adoptNS([[WebGLLayer alloc] initWithClient:this devicePixelRatio:attrs.devicePixelRatio contentsOpaque:!attrs.alpha]);
> 
> It looks like this eventually ends up calling WebGLRenderingContextBase::didComposite() whenever the composition happens, which seems to affect the web inspector and eventually will call GraphicsContextGLOpenGL::prepareForDisplay(). This functions seems important, and I don't see anything in the ChangeLog that explains why it's okay to stop running that function for each composition.

Could you rephrase? I did not understand:
- what ends up calling did composite? The constructor does not.
- what calls prepareForDisplay? didComposite does not call prepareForDisplay()

The old sequence is:

1) webgl draw (like clear)
2) put element to prepare list
3) for each context of  prepare list call prepareForDisplay
4) swap buffers
<--> bad stuff can happen here, especially if below is not done
5) ca commits
6) ca commit calls WebGLLayer display
7) GraphicsContextGLOpenGL::didDisplay()
7) WebGLRenderingContextBase::didComposite()


New sequence is:
1) webgl draw (like clear)
2) put element to prepare list
3) for each context of  prepare list call prepareForDisplay
4) swap buffers
5) WebGLRenderingContextBase::didComposite()
<--> no bad stuff can happen
5) ca commits
6) ca commit calls WebGLLayer display


     The fix in this commit fixes the case where WebGL context is drawn to
     but the CA does not display the contents. Draw would cause preparation
     of the drawing buffer for display, along with the contract that drawing
     buffer might be uninitialized. However, the clear of the drawing buffer
     was marked needed only during display.

>> LayoutTests/ChangeLog:14
>> +        would be removed by setting display:none.
> 
> Why is it failing on the iOS simulator?

readpixels has a bug, test uses read pixels
the linked bug in the expectation fixes read pixels
Comment 8 Kimmo Kinnunen 2020-10-26 13:22:05 PDT
Created attachment 412349 [details]
Patch
Comment 9 Kenneth Russell 2020-10-26 13:51:37 PDT
Comment on attachment 412349 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=412349&action=review

This change is tied to several recent changes dino@ made in Bug 212461 and follow-on bugs, as well as Bug 217212.

dino@ should probably review this as the author of that rearchitecture, but I'll think about it more too. I'm not 100% sure that it's correct to consider prepareForDisplay() as the trigger which consumes the WebGL texture (the previous call to display()).

> Source/WebCore/ChangeLog:12
> +        draw on top of the IOSurface even if CA was using it.

It's not clear to me that prior to r268386 / Bug 217212, that this was the case. WebGLLayer previously contained three IOSurfaces and used a triple-buffering scheme among them.
Comment 10 Myles C. Maxfield 2020-10-26 14:03:12 PDT
Comment on attachment 412342 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=412342&action=review

>>> Source/WebCore/platform/graphics/cocoa/GraphicsContextGLOpenGLCocoa.mm:-337
>>> -        m_webGLLayer = adoptNS([[WebGLLayer alloc] initWithClient:this devicePixelRatio:attrs.devicePixelRatio contentsOpaque:!attrs.alpha]);
>> 
>> It looks like this eventually ends up calling WebGLRenderingContextBase::didComposite() whenever the composition happens, which seems to affect the web inspector and eventually will call GraphicsContextGLOpenGL::prepareForDisplay(). This functions seems important, and I don't see anything in the ChangeLog that explains why it's okay to stop running that function for each composition.
> 
> Could you rephrase? I did not understand:
> - what ends up calling did composite? The constructor does not.
> - what calls prepareForDisplay? didComposite does not call prepareForDisplay()
> 
> The old sequence is:
> 
> 1) webgl draw (like clear)
> 2) put element to prepare list
> 3) for each context of  prepare list call prepareForDisplay
> 4) swap buffers
> <--> bad stuff can happen here, especially if below is not done
> 5) ca commits
> 6) ca commit calls WebGLLayer display
> 7) GraphicsContextGLOpenGL::didDisplay()
> 7) WebGLRenderingContextBase::didComposite()
> 
> 
> New sequence is:
> 1) webgl draw (like clear)
> 2) put element to prepare list
> 3) for each context of  prepare list call prepareForDisplay
> 4) swap buffers
> 5) WebGLRenderingContextBase::didComposite()
> <--> no bad stuff can happen
> 5) ca commits
> 6) ca commit calls WebGLLayer display
> 
> 
>      The fix in this commit fixes the case where WebGL context is drawn to
>      but the CA does not display the contents. Draw would cause preparation
>      of the drawing buffer for display, along with the contract that drawing
>      buffer might be uninitialized. However, the clear of the drawing buffer
>      was marked needed only during display.

Right, this makes sense. I was just asking about -[CALayer display]. As I understand it, the platform is allowed to call this at any point (perhaps from layer tree snapshotting?) I see that, in this patch, -[WebGLLayer update] still sets self.contents, which is good, but it looks like it no longer calls _client->didDisplay(), which would have eventually bubbled up to the Document and do inspector stuff and EGL stuff, if I'm reading the code correctly.

I don't really know what I'm talking about (since I'm not an expert in OpenGL) but it seems like such a change should be explicitly intentional. I just don't see anything in the ChangeLog describing why it's okay for -[CALayer display] to no longer call through to GraphicsContextGLOpenGL::prepareForDisplay().
Comment 11 Kenneth Russell 2020-10-26 14:29:05 PDT
(In reply to Myles C. Maxfield from comment #10)
> Comment on attachment 412342 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=412342&action=review
> 
> >>> Source/WebCore/platform/graphics/cocoa/GraphicsContextGLOpenGLCocoa.mm:-337
> >>> -        m_webGLLayer = adoptNS([[WebGLLayer alloc] initWithClient:this devicePixelRatio:attrs.devicePixelRatio contentsOpaque:!attrs.alpha]);
> >> 
> >> It looks like this eventually ends up calling WebGLRenderingContextBase::didComposite() whenever the composition happens, which seems to affect the web inspector and eventually will call GraphicsContextGLOpenGL::prepareForDisplay(). This functions seems important, and I don't see anything in the ChangeLog that explains why it's okay to stop running that function for each composition.
> > 
> > Could you rephrase? I did not understand:
> > - what ends up calling did composite? The constructor does not.
> > - what calls prepareForDisplay? didComposite does not call prepareForDisplay()
> > 
> > The old sequence is:
> > 
> > 1) webgl draw (like clear)
> > 2) put element to prepare list
> > 3) for each context of  prepare list call prepareForDisplay
> > 4) swap buffers
> > <--> bad stuff can happen here, especially if below is not done
> > 5) ca commits
> > 6) ca commit calls WebGLLayer display
> > 7) GraphicsContextGLOpenGL::didDisplay()
> > 7) WebGLRenderingContextBase::didComposite()
> > 
> > 
> > New sequence is:
> > 1) webgl draw (like clear)
> > 2) put element to prepare list
> > 3) for each context of  prepare list call prepareForDisplay
> > 4) swap buffers
> > 5) WebGLRenderingContextBase::didComposite()
> > <--> no bad stuff can happen
> > 5) ca commits
> > 6) ca commit calls WebGLLayer display
> > 
> > 
> >      The fix in this commit fixes the case where WebGL context is drawn to
> >      but the CA does not display the contents. Draw would cause preparation
> >      of the drawing buffer for display, along with the contract that drawing
> >      buffer might be uninitialized. However, the clear of the drawing buffer
> >      was marked needed only during display.
> 
> Right, this makes sense. I was just asking about -[CALayer display]. As I
> understand it, the platform is allowed to call this at any point (perhaps
> from layer tree snapshotting?) I see that, in this patch, -[WebGLLayer
> update] still sets self.contents, which is good, but it looks like it no
> longer calls _client->didDisplay(), which would have eventually bubbled up
> to the Document and do inspector stuff and EGL stuff, if I'm reading the
> code correctly.
> 
> I don't really know what I'm talking about (since I'm not an expert in
> OpenGL) but it seems like such a change should be explicitly intentional. I
> just don't see anything in the ChangeLog describing why it's okay for
> -[CALayer display] to no longer call through to
> GraphicsContextGLOpenGL::prepareForDisplay().

Slight correction to these two comments: Dean's changes from Bug 212461 refactored things so that "prepare for display" is triggered from the DOM thread and does all of the OpenGL work, and "display" can be called from a different thread. "display" doesn't call back to prepareForDisplay, and mustn't do any OpenGL / EGL work. It did however reset some boolean fields and call into the Inspector, under the cover of the WebThreadLock in UIWebView.
Comment 12 Kimmo Kinnunen 2020-10-26 23:41:34 PDT
To explicitly answer (same what kbr wrote):

> I just don't see anything in the ChangeLog describing why it's okay for -[CALayer 
display] to no longer call through to GraphicsContextGLOpenGL::prepareForDisplay().

It's not described by ChangeLog because that's not how it ever worked. This is explained by the call graph I marked up. 

It's exactly the opposite: prepareForDisplay is the "reason" for -[CALayer display]. That's in the name: "prepare for calayer display". 

It's only "conceptually the reason", since CALayer display is done for every commit, regardless if anything changed. But conceptually: when we have something to display, we prepare for display. When we have prepared for display, we display.
Comment 13 Kimmo Kinnunen 2020-10-26 23:56:13 PDT
> I'm not 100% sure that it's correct to consider prepareForDisplay() as the trigger which consumes the WebGL texture (the previous call to display()).

I think you need to elaborate, especially what you mean by "the previous call to display()".

I think it's relatively good strategy to freeze the tree in a specific time point.
It might swap in cases where the eventual compositor does not use the contents, e.g. redundantly, e.g. the observer never gets to view the results. Few examples in the test. However, I don't think it's in violation of the WebGL contract. In every case where the compositor is not locked with the engine this will (and should) happen.

The downside is that the visibility check is harder, and for example in WebKit implementation not done due to that. So theoretical cases will suffer a bit due to the redundant swapping.


>> Source/WebCore/ChangeLog:12
>> +        draw on top of the IOSurface even if CA was using it.


>It's not clear to me that prior to r268386 / Bug 217212, that this was the case. >WebGLLayer previously contained three IOSurfaces and used a triple-buffering scheme >among them.

According to my reading, the previous pseudocode was:

GraphicsContextGLOpenGL::prepareForDisplay() {
  swap(m_drawingBuffer, m_displayBuffer);
  if (m_drawingBuffer->isInUse()) 
    swap(m_drawingBuffer, m_spareBuffer);
}

So if we ended up using spareBuffer, we would use it regardless if it's in use or not.
Comment 14 Dean Jackson 2020-10-27 00:31:29 PDT
Comment on attachment 412342 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=412342&action=review

>>>>> Source/WebCore/platform/graphics/cocoa/GraphicsContextGLOpenGLCocoa.mm:-337
>>>>> -        m_webGLLayer = adoptNS([[WebGLLayer alloc] initWithClient:this devicePixelRatio:attrs.devicePixelRatio contentsOpaque:!attrs.alpha]);
>>>> 
>>>> It looks like this eventually ends up calling WebGLRenderingContextBase::didComposite() whenever the composition happens, which seems to affect the web inspector and eventually will call GraphicsContextGLOpenGL::prepareForDisplay(). This functions seems important, and I don't see anything in the ChangeLog that explains why it's okay to stop running that function for each composition.
>>> 
>>> Could you rephrase? I did not understand:
>>> - what ends up calling did composite? The constructor does not.
>>> - what calls prepareForDisplay? didComposite does not call prepareForDisplay()
>>> 
>>> The old sequence is:
>>> 
>>> 1) webgl draw (like clear)
>>> 2) put element to prepare list
>>> 3) for each context of  prepare list call prepareForDisplay
>>> 4) swap buffers
>>> <--> bad stuff can happen here, especially if below is not done
>>> 5) ca commits
>>> 6) ca commit calls WebGLLayer display
>>> 7) GraphicsContextGLOpenGL::didDisplay()
>>> 7) WebGLRenderingContextBase::didComposite()
>>> 
>>> 
>>> New sequence is:
>>> 1) webgl draw (like clear)
>>> 2) put element to prepare list
>>> 3) for each context of  prepare list call prepareForDisplay
>>> 4) swap buffers
>>> 5) WebGLRenderingContextBase::didComposite()
>>> <--> no bad stuff can happen
>>> 5) ca commits
>>> 6) ca commit calls WebGLLayer display
>>> 
>>> 
>>>      The fix in this commit fixes the case where WebGL context is drawn to
>>>      but the CA does not display the contents. Draw would cause preparation
>>>      of the drawing buffer for display, along with the contract that drawing
>>>      buffer might be uninitialized. However, the clear of the drawing buffer
>>>      was marked needed only during display.
>> 
>> Right, this makes sense. I was just asking about -[CALayer display]. As I understand it, the platform is allowed to call this at any point (perhaps from layer tree snapshotting?) I see that, in this patch, -[WebGLLayer update] still sets self.contents, which is good, but it looks like it no longer calls _client->didDisplay(), which would have eventually bubbled up to the Document and do inspector stuff and EGL stuff, if I'm reading the code correctly.
>> 
>> I don't really know what I'm talking about (since I'm not an expert in OpenGL) but it seems like such a change should be explicitly intentional. I just don't see anything in the ChangeLog describing why it's okay for -[CALayer display] to no longer call through to GraphicsContextGLOpenGL::prepareForDisplay().
> 
> Slight correction to these two comments: Dean's changes from Bug 212461 refactored things so that "prepare for display" is triggered from the DOM thread and does all of the OpenGL work, and "display" can be called from a different thread. "display" doesn't call back to prepareForDisplay, and mustn't do any OpenGL / EGL work. It did however reset some boolean fields and call into the Inspector, under the cover of the WebThreadLock in UIWebView.

Yep. I think it just wasn't clear from the patch that markLayerComposited has moved into prepareForDisplay, so the Inspector is still being told something had happened.
Comment 15 Dean Jackson 2020-10-27 00:38:53 PDT
Comment on attachment 412349 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=412349&action=review

>> Source/WebCore/ChangeLog:12
>> +        draw on top of the IOSurface even if CA was using it.
> 
> It's not clear to me that prior to r268386 / Bug 217212, that this was the case. WebGLLayer previously contained three IOSurfaces and used a triple-buffering scheme among them.

Yeah. The reason for the third buffer was due to the call to display happening when the current buffer was still locked, and thus we were unable to swap it into the slot for rendering the next frame. That issue may have gone away when we moved to doing "prepareForDisplay" or when Kimmo cleaned up the buffer ownership.

The effect was pretty clear - lots of blank or corrupt frames, so I'm willing to trust the outcome of this patch if it is stable. Please check on heavy workloads though. Fullscreen YouTube 360 video is a good one.
Comment 16 Dean Jackson 2020-10-27 00:41:43 PDT
(In reply to Kimmo Kinnunen from comment #13)
> > I'm not 100% sure that it's correct to consider prepareForDisplay() as the trigger which consumes the WebGL texture (the previous call to display()).

I think it is ok, since nothing triggered by the page (i.e JS or events) should happen between prepareForDisplay and the actual call to display.
Comment 17 EWS 2020-10-27 01:01:47 PDT
Committed r269025: <https://trac.webkit.org/changeset/269025>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 412349 [details].
Comment 18 Kimmo Kinnunen 2020-10-29 01:14:58 PDT
*** Bug 218282 has been marked as a duplicate of this bug. ***