Bug 236130 - Low fps when mesh instances are scaled to 0
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebGL
Version: Safari 15
Hardware: Mac (Intel) macOS 12
Importance: P2 Normal
Assignee: Nobody
URL: https://playground.babylonjs.com/#SM5...
Keywords: InRadar
Depends on:
Blocks: anglemetalregr
Reported: 2022-02-04 00:55 PST by gbalasis87
Modified: 2022-03-01 02:19 PST
Description gbalasis87 2022-02-04 00:55:58 PST
I noticed a significant WebGL performance drop in the latest WebKit versions. It appears under these specific circumstances:

- Import a geometry and create several instances of it
- Add the instances under a parent (e.g. a box geometry)
- Scale the parent to 0

This makes the fps drop significantly. Here is a playground from BabylonJS where the issue is replicated:
https://playground.babylonjs.com/#SM5RNH#24

Here is a playground where the issue is not replicated. The difference between this and the previous one is the scaling of the parent. This time it's not zero:
https://playground.babylonjs.com/#SM5RNH#23

I would expect the opposite result: since the scaling is 0 and nothing is rendered, the fps should go up, but on a 2019 MacBook Pro it drops from 60fps to 3-5fps.

Here is the forum of BabylonJS where I had initially reported the issue:
https://forum.babylonjs.com/t/framerate-drop-on-ios15-with-scaling-0/25928

The issue cannot be replicated on Chrome.
Comment 1 Radar WebKit Bug Importer 2022-02-04 16:14:51 PST
<rdar://problem/88513232>
Comment 2 Kenneth Russell 2022-02-07 16:43:55 PST
Possibly related to the switch to ANGLE's Metal backend?

Note that there exist surprising pathological geometry cases for GPUs. One I'm aware of from years back was long, skinny, almost-vertical triangles, which ran much more slowly on some vendors' GPUs than on others.

It might be the case that the tiny geometry is causing lots of overlapping fragments to be rendered, and maybe serialized, on the GPU.

Please indicate what type of Mac this is reproduced on.
Comment 3 gbalasis87 2022-02-08 00:37:10 PST
I have reproduced it on:
- MacBook Pro (15-inch, 2019), macOS Monterey v12.1 with Intel UHD Graphics 630
- iPhone XS Max, iOS 15.1

Both while using Safari. Colleagues have managed to reproduce it on more iPhones that have a recent version of iOS.
Comment 4 Kyle Piddington 2022-02-08 10:19:22 PST
Can confirm this is a Metal-ANGLE regression. Switching to GL runs the first sample at 60 fps.
Comment 5 Kyle Piddington 2022-02-08 14:27:41 PST
Strangely enough, this doesn't appear to be a degenerate GPU performance case, but rather an encoding issue. When capturing draw commands, I'm seeing far more render passes (64!) when we are drawing the zero-scale skulls, vs when we are drawing content (2).
Comment 6 Kyle Piddington 2022-02-08 14:36:02 PST
Examining the draw calls encoded in our slow case in detail, it appears that when drawing with a scaled size of 0, we're not actually drawing instanced primitives. I assume that's why we're experiencing this slowdown. In addition, the backend appears to be updating the mapping on some buffer, and waiting on the CPU for each skull to finish drawing individually, when doing that buffer readback.
Comment 7 Kyle Piddington 2022-02-08 15:22:21 PST
Alright, I think I've got a potential answer. I'm not sure how much we can do to fix this one, but here's what I've got from a quick look. Would love a BabylonJS developer to correct me if I've got any of this wrong.

1) Where did the instancing go?
Looking into the BabylonJS source, it looks like the code that renders instanced meshes has to handle whether or not a mesh is visible. Given that we're setting the scale of these meshes to zero, I assume BabylonJS marks all instances as 'not visible.'

Assuming we're using the prototype 

 Mesh.prototype._renderWithInstances = function (subMesh, fillMode, batch, effect, engine)... 

When we go to render an instanced mesh, we may end up getting an instance count of 1.
(instancesCount = (renderSelf ? 1 : 0) + visibleInstances.length;)
This causes us to render each skull one at a time. Since we're rendering zero pixels, this should be a fast operation. But...
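
A minimal sketch of the difference described above, under stated assumptions. It is not BabylonJS source: `planDraws` and the shape of the objects it returns are hypothetical, and the count of 50 skulls is arbitrary. It only contrasts one instanced draw covering every skull with the suspected fallback, where each skull becomes its own draw with an instance count of 1.

```javascript
// Illustrative model only. `planDraws` and its fields are hypothetical
// and do not mirror BabylonJS internals.
function planDraws(skullCount, batched) {
  if (batched) {
    // One instanced draw covers all skulls at once.
    return [{ instancesCount: skullCount }];
  }
  // Suspected fallback: every skull becomes its own draw with count 1.
  return Array.from({ length: skullCount }, () => ({ instancesCount: 1 }));
}

const batchedDraws = planDraws(50, true);
const unbatchedDraws = planDraws(50, false);
console.log(batchedDraws.length, unbatchedDraws.length); // 1 50
```

The number of draws on its own should be cheap when nothing reaches the screen; it matters here because of what happens between the draws.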

2) Why are we waiting for each skull to render?

Here's where it gets trickier. In between each mesh render, it seems that we get a call to bufferSubData to set up a uniform buffer for the render. Mapping the underlying buffer to send data, at the moment, causes a full render to occur. This means sending a command to the GPU and waiting for a callback. I'm assuming, and to an extent observing, that this wait is what's causing our problems.
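
The stall can be pictured with a toy model. This is not ANGLE or WebKit code: `ToyBackend` and `renderFrame` are hypothetical names, and the rule that a buffer update with draws pending forces a submit-and-wait is an assumption taken from the description above. Under that assumption, the number of GPU round trips per frame grows with the number of separate draws.

```javascript
// Toy model of the stall described above; not ANGLE or WebKit code.
class ToyBackend {
  constructor() {
    this.pendingDraws = 0; // draws encoded but not yet submitted
    this.submits = 0;      // times we submitted work and waited on the GPU
  }
  bufferSubData() {
    // Assumption: updating a buffer while draws are pending forces a
    // submit-and-wait before the new data can be written.
    if (this.pendingDraws > 0) this.flush();
  }
  draw() {
    this.pendingDraws += 1;
  }
  flush() {
    if (this.pendingDraws === 0) return;
    this.submits += 1;
    this.pendingDraws = 0;
  }
}

function renderFrame(backend, drawCount) {
  for (let i = 0; i < drawCount; i++) {
    backend.bufferSubData(); // per-draw uniform upload
    backend.draw();
  }
  backend.flush(); // end of frame
  return backend.submits;
}

console.log(renderFrame(new ToyBackend(), 1));  // 1
console.log(renderFrame(new ToyBackend(), 50)); // 50
```

In this model a single instanced draw costs one round trip per frame, while 50 individual draws cost 50, which is the same order of difference as the 2 versus 64 render passes seen in the capture.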

My questions for the BabylonJS developers:
1) Am I reading this code correctly re: instances? If none are visible, they're drawn one at a time?
2) If so, is this a bug worth fixing in BabylonJS? We can continue analyzing slowdown here, but it seems like this might be undesired behavior.
Comment 8 gbalasis87 2022-02-08 15:31:33 PST
I will ping the BabylonJS devs
Comment 9 gbalasis87 2022-02-09 08:16:06 PST
A dev from the BabylonJS team has confirmed that there is an issue with scaling instances to 0, and I have created an issue in their GitHub repo. Thanks for spotting it.

https://forum.babylonjs.com/t/framerate-drop-on-ios15-with-scaling-0/25928/9
https://github.com/BabylonJS/Babylon.js/issues/11952