293683 – Flickering in WebGL

https://bugs.webkit.org/show_bug.cgi?id=293683

github

Reported 2025-05-28 06:02:58 PDT

When using a specific scene in Babylon.js (see https://forum.babylonjs.com/t/needdepthprepass-creates-flickering-in-8-6-2/58421/7), the output flickers. The original poster and other people reproduced this on macOS 15.4 and iOS 18.4. I could reproduce it on my iPhone SE running iOS 18.5.

A workaround is to add the following code to the shader:

    bool bbb = any(isnan(position));
    if (bbb) { }

It seems that what is passed to isnan doesn't matter; using another variable works as well.

Note that the workaround is now live, so if you want to reproduce, use this link: https://playground.babylonjs.com/?version=8.9.0#RNWZOX#44 (you will have to rotate the scene to see the flickering).

Link with the workaround: https://playground.babylonjs.com/#RNWZOX#44

The workaround is a bit of voodoo, so it would be helpful to have more information about the cause of the problem...

Kimmo Kinnunen

Comment 1 2025-05-30 05:34:19 PDT

Thank you for the report. I can reproduce the issue.

Referring to NaNs or infs causes the shader to be compiled with the Metal "fast math" feature off. With fast math on:
- infs and NaNs are not expected
- math operations are aggressively reordered and fused, which may cause differences in results compared to not doing this

The trigger "scene.useRightHandedSystem = true // TRIGGER 1" only adds one -1.0 multiplication in both the VS and the FS. It is possible that this interacts with fast math, fusing the multiplications differently from the 1.0 case. However, this seems improbable?

Would it be possible that the content tries to write -infinities to the depth buffer when useRightHandedSystem == true? In that case it would perhaps be prudent to change the algorithm to not do this, and maybe it would then work as-is?

Otherwise a somewhat simpler reproduction case would be useful.

pbr.needDepthPrePass = true causes the framework to emit one extra depth prepass shader. Toggling this shader inactive with Web Inspector causes the glitches to go away, but my eyes cannot discern any other effect. In other words, I don't understand what the depth prepass step contributes to the image when it works correctly.
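To illustrate why fusing operations can change results, here is a minimal sketch that emulates float32 arithmetic with Math.fround. It is only an illustration: the function names and input values are made up for this example, and it does not reproduce what the Metal compiler actually emits for the Babylon.js shaders.

```javascript
// a * b is rounded to float32 first, then the sum is rounded again.
function mulThenAdd(a, b, c) {
  return Math.fround(Math.fround(a * b) + c);
}

// Approximation of a fused multiply-add: the product keeps its extra bits
// (the double-precision product of two float32 values is exact) and only
// the final sum is rounded to float32.
function fusedMulAdd(a, b, c) {
  return Math.fround(a * b + c);
}

// Illustrative inputs chosen so that the product falls exactly halfway
// between two float32 values.
const a = Math.fround(1 + 2 ** -12);
const b = a;
const c = -1;

const separate = mulThenAdd(a, b, c); // 2^-11
const fused = fusedMulAdd(a, b, c);   // 2^-11 + 2^-24

console.log(separate, fused, separate === fused);
```

The two results differ in the low bits. A difference of that size in gl_Position between two pipelines would be enough to make an exact depth comparison fail.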

Kimmo Kinnunen

Comment 2 2025-05-30 06:00:13 PDT

I think the issue reproduces also with useRightHandedSystem == false. To reproduce, set the mesh position z to 0.01:

    x.position = new Vector3(-0.001, 0, 0.01) // TRIGGER 3

I don't know if that makes sense, though.

In ANGLE, this should fix it:

    options.get().mathMode = MTLMathModeRelaxed;
    options.get().mathFloatingPointFunctions = MTLMathFloatingPointFunctionsFast;

So the difference is that infs/nans work with MTLMathModeRelaxed. So it would seem that the content likely uses infinities.

github

Comment 3 2025-06-02 07:10:02 PDT

Thanks for looking into this!

I tried to check if there were any infinities in our calculation, but I couldn't prove it: https://playground.babylonjs.com/#RNWZOX#71

This PG adds these lines of code to the end of the vertex shader:

    vMainUV1 = vec2(0.8, 0.2);
    if (any(isnan(position))) vMainUV1 = vec2(0.2, 0.2);
    if (any(isnan(vPositionW))) vMainUV1 = vec2(0.2, 0.2);
    if (any(isnan(vNormalW))) vMainUV1 = vec2(0.2, 0.2);
    if (any(isnan(viewDirectionW))) vMainUV1 = vec2(0.2, 0.2);
    if (any(isnan(vEyePosition))) vMainUV1 = vec2(0.2, 0.2);
    if (any(isnan(gl_Position))) vMainUV1 = vec2(0.2, 0.2);

Some parts should turn black if isnan() returns true (the texture coordinates vec2(0.2, 0.2) are black, as you can see here: https://playground.babylonjs.com/#RNWZOX#70).

So is it possible that the problem is caused by the reordering of operations?

> pbr.needDepthPrePass = true causes the framework to emit an additional depth pre-pass shader. Disabling this shader in Web Inspector makes the anomalies disappear, but I don't see any other effects. In other words, I don't understand the point of the depth pre-pass step when it works correctly.

Yes, needDepthPrePass = true generates a pre-pass where the object is first drawn to the depth buffer before being drawn normally.

Is it possible that the reordering of operations is different in this pre-pass compared to the normal pass? In the pre-pass, the color mask is set to false for all channels, so the compiler probably optimizes the vertex shader differently compared to the normal pass (?)

Our code relies on the fact that the depth calculated (and written to the depth buffer) by the pre-pass operation is exactly the same as in the normal pass, but if the operations are reordered differently, it is possible that the result will be slightly different, leading to a difference in the calculated depth and causing flickering.

How can we verify this hypothesis and prevent this from happening in the web context if this is the problem?

Radar WebKit Bug Importer

Comment 4 2025-06-04 06:03:19 PDT

<rdar://problem/152572384>

Kimmo Kinnunen

Comment 5 2025-06-06 07:35:06 PDT

> Some parts should turn black if isnan() returns true (the texture coordinates vec2(0.2, 0.2) are black

The effect cannot be observed like that. If the shader uses isnan or isinf, the fast math optimization is turned off and the effect goes away.

When the shader does not use infs or NaNs, the math operations are not expected to produce infs or NaNs. If the operations did produce them, the shader would behave in an undefined manner.

> So is it possible that the problem is caused by the reordering of operations?

Fast math changes the order of the math operations and their precision. But what you are probably asking is whether the operations are evaluated differently for the depth prepass and the regular pass. This is a possibility, but it's unclear whether it happens.

> Our code relies on the fact that the depth calculated (and written to the depth buffer) by the pre-pass operation is exactly the same as in the normal pass, but if the operations are reordered differently, it is possible that the result will be slightly different, leading to a difference in the calculated depth and causing flickering.

In OpenGL, the shader author is expected to mark the outputs as invariant. That is, in all shaders whose outputs you expect to match, mark the output as "invariant".

See "4.6 Variance and the Invariant Qualifier" in https://registry.khronos.org/OpenGL/specs/es/3.0/GLSL_ES_Specification_3.00.pdf.

The WebKit WebGL Metal backend does support this, but in my brief testing I didn't see this solving the issue you had.

> How can we verify this hypothesis and prevent this from happening in the web context if this is the problem?

I think you could read the depth buffer out and save it as an image in both cases. E.g. save the prepass render depth buffer to image A and the single-pass render depth buffer to image B. That could at least give a hint.

>> I don't understand the point of the depth pre-pass step when it works correctly.
> Yes, needDepthPrePass = true generates a pre-pass where the object is first drawn to the depth buffer before being drawn normally.

Yeah, this part I get. However, when I tried to force-disable the prepass shader when the engine thought the prepass would be drawn, I could not immediately observe any difference in the rendering. I'd imagine the prepass would somehow affect the output when the engine is in prepass mode? Or is it purely a perf thing?

github

Comment 6 2025-06-13 03:53:52 PDT

(In reply to Kimmo Kinnunen from comment #5)
> When the shader is not using infs, nans, then the math operations are not
> expected to produce infs or nans. If the operations would produce such, then
> the shader would behave in undefined manner.

That's what I wanted to test, whether we produce nans or infs, which doesn't seem to be the case.

> > Our code relies on the fact that the depth calculated (and written to the depth buffer) by the pre-pass operation is exactly the same as in the normal pass, but if the operations are reordered differently, it is possible that the result will be slightly different, leading to a difference in the calculated depth and causing flickering.
>
> In OpenGL, the shader author is expected to mark the outputs as invariant.
> E.g. in all shaders that you expect the outputs to match, mark the output as
> "invariant".
>
> See "4.6 Variance and the Invariant Qualifier" in
> https://registry.khronos.org/OpenGL/specs/es/3.0/GLSL_ES_Specification_3.00.pdf.
>
> WebKit WebGL Metal backend does support this, but with my brief testing I
> didn't see this solving the issue you had.

Thanks for the reference to the doc. I'm not sure what output you flagged as "invariant" in your tests, though, if the problem concerns the depth value calculated by the GPU?

> > How can we verify this hypothesis and prevent this from happening in the web context if this is the problem?
>
> I think you could read the depth buffer out and save it as image in both
> cases. E.g. save the prepass render depth buffer to image A and single-pass
> render depth buffer to B. That could at least give a hint.

Thanks for the suggestion! That's what I did in this PG: https://playground.babylonjs.com/#RNWZOX#89

Note that we can't read pixels from the depth buffer in WebGL, which is why I first copy them to a RED float 32 texture.

After one second, the clear color turns red if the two buffers are not identical, and that's what happens when I test on my iPhone (I get 969 different values in the pop-up window)! The clear color remains blue on my desktop, so it seems that this is the problem: the depth written by the pre-pass is different from that written by the normal pass.

Since we don't write the depth ourselves, I think this confirms that it is a reordering of operations that is causing the problem when fast math mode is enabled? The question now is how to fix it...

> Yeah, this part I get. However, when I tried to force disable the prepass
> shader when the engine thought the prepass would be drawn, I could not
> immediately observe any difference to the rendering. I'd imagine the prepass
> would somehow affect the output when the engine is in the prepass mode? Or
> is it pure perf thing?

In this case, it doesn't change anything; it was just for reproduction purposes. In other circumstances, it can help manage transparent objects.
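The comparison described above can be sketched roughly as follows. This is not the playground's code: countDepthMismatches and the sample arrays are hypothetical, and the sketch assumes both depth buffers have already been copied to R32F textures and read back as Float32Array values.

```javascript
// Hypothetical helper: counts the texels whose depth differs between two
// readbacks. Both arguments are assumed to be Float32Array readbacks of the
// R32F copies of the prepass and normal-pass depth buffers.
function countDepthMismatches(prepassDepth, mainPassDepth) {
  if (prepassDepth.length !== mainPassDepth.length) {
    throw new Error("depth readbacks must have the same size");
  }
  // Compare the raw bit patterns so that even a one-ulp difference counts.
  const bitsA = new Uint32Array(
    prepassDepth.buffer, prepassDepth.byteOffset, prepassDepth.length);
  const bitsB = new Uint32Array(
    mainPassDepth.buffer, mainPassDepth.byteOffset, mainPassDepth.length);
  let mismatches = 0;
  for (let i = 0; i < bitsA.length; i++) {
    if (bitsA[i] !== bitsB[i]) {
      mismatches++;
    }
  }
  return mismatches;
}

// Made-up sample data: the second texel differs by one float32 ulp.
const prepassSample = new Float32Array([0.5, 0.25, 0.75, 1.0]);
const mainPassSample = new Float32Array([0.5, 0.25 + 2 ** -25, 0.75, 1.0]);

console.log(countDepthMismatches(prepassSample, mainPassSample));
```

Comparing bit patterns rather than using a tolerance matches the requirement stated earlier in the thread: the two passes must write exactly the same depth.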

Kimmo Kinnunen

Comment 7 2025-06-16 05:18:51 PDT

> That's what I wanted to test, whether we produce nans or infs, which doesn't seem to be the case.

Yes, but to observe the problem, one would need to observe it without referring to infs/NaNs. E.g. one would need to write the value to the depth buffer, then look at the values in the buffer and try to understand whether any such values were written.

> I'm not sure what output you flagged as "invariant" in your tests, though, if the problem concerns the depth value calculated by the GPU?

The vertex shader outputs that need to match should be marked invariant. The depth is determined by gl_Position, so you would need at least the following:

    invariant gl_Position;

Consider:
- prepass vertex shader, prepass fragment shader
- real vertex shader, real fragment shader

So you want the prepass vertex shader to produce the same values for gl_Position as the real vertex shader, because the result of the prepass is the depth buffer values, which are determined by gl_Position. The same may need to be done for other outputs that must not vary between pipelines, but I don't know if your example has any others.

> Since we don't write the depth ourselves, I think this confirms that it is a reordering of operations that is causing the problem when fast math mode is enabled?
> The question now is how to fix it...

I think the presence of the error is some sort of symptom of fast math; it's unclear whether it is specifically the fact that the prepass gets optimised differently from the real shader. If adding invariant fixed the issue, that would indicate that the divergence in shader optimisation is the reason.

You could still try to add that invariant declaration. That is the semantic you want anyway. If it doesn't help, this can be worked around by referring to inf in any form. Maybe code of the form

    if (false) { isinf(3.); }

or similar.
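As a rough sketch of how the invariant declaration could be applied to both vertex shaders from JavaScript, the hypothetical helper below inserts the line into a GLSL source string. addInvariantPosition and the sample shader are assumptions for illustration and not part of Babylon.js or WebKit. The helper assumes that any #version and #extension directives sit at the top of the source and are not wrapped in #if blocks.

```javascript
// Hypothetical helper: returns the vertex shader source with
// "invariant gl_Position;" declared at global scope.
function addInvariantPosition(vertexSource) {
  // Leave the source untouched if the declaration is already present.
  if (/\binvariant\s+gl_Position\s*;/.test(vertexSource)) {
    return vertexSource;
  }
  const lines = vertexSource.split("\n");
  // Directives such as #version and #extension must stay ahead of any
  // declaration, so insert after the last one found.
  let insertAt = 0;
  lines.forEach((line, index) => {
    const trimmed = line.trim();
    if (trimmed.startsWith("#version") || trimmed.startsWith("#extension")) {
      insertAt = index + 1;
    }
  });
  lines.splice(insertAt, 0, "invariant gl_Position;");
  return lines.join("\n");
}

// Made-up minimal vertex shader used only to exercise the helper.
const sampleVertexSource = [
  "#version 300 es",
  "in vec3 position;",
  "uniform mat4 worldViewProjection;",
  "void main() {",
  "  gl_Position = worldViewProjection * vec4(position, 1.0);",
  "}",
].join("\n");

const patchedVertexSource = addInvariantPosition(sampleVertexSource);
console.log(patchedVertexSource);
```

The same helper would have to run on both the prepass and the real vertex shader, since the qualifier only guarantees matching results between shaders that both declare it.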

github

Comment 8 2025-06-16 07:18:53 PDT

Adding "invariant gl_Position;" does work!

PG without the line: https://playground.babylonjs.com/#RNWZOX#98
PG with the line: https://playground.babylonjs.com/#RNWZOX#99

Thank you for your help with this issue. I believe we can now close it.

Status: NEW
Priority: P2
Severity: Normal
Classification: Unclassified
Version: Safari 18
Hardware: Unspecified
OS: Unspecified
Product: WebKit
Component: WebGL
Assignee: Nobody
Reported: 2025-05-28 06:02 PDT
Modified: 2025-06-16 07:18 PDT
Keywords: InRadar