Bug 234926 - Dynamic indexing into a fixed-size uniform array with GLSL makes the Safari tab hang
Summary: Dynamic indexing into a fixed-size uniform array with GLSL makes the Safari t...
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebGL (show other bugs)
Version: Safari 15
Hardware: Mac (Apple Silicon) macOS 12
: P1 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-01-06 10:34 PST by Rob Swain
Modified: 2022-04-07 23:21 PDT (History)
9 users (show)

See Also:


Attachments
3d scene example that reproduces the problem (11.10 MB, application/zip)
2022-01-07 03:06 PST, Rob Swain
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Rob Swain 2022-01-06 10:34:13 PST
I am running on a 2021 16" M1 Max MacBook Pro, with Safari 15.2 (same behaviour with technology preview 137) with the default browser configuration (i.e. none of the experimental features were changed.) This same issue does _not_ occur on an Intel x86_64 MacBook Pro with the exact same code.

The GLSL code I want to execute is:
```
struct ClusterOffsetsAndCounts {
    uvec4 data[1024];
};

uniform ClusterOffsetsAndCounts_block_3Fragment { ClusterOffsetsAndCounts _group_0_binding_8_fs; };

ClusterOffsetAndCount unpack_offset_and_count(uint cluster_index) {
    ClusterOffsetAndCount output_;
    uint offset_and_count = _group_0_binding_8_fs.data[(cluster_index >> 2u)][(cluster_index & ((1u << 2u) - 1u))];
    output_.offset = ((offset_and_count >> 8u) & ((1u << 24u) - 1u));
    output_.offset = 0u;
    output_.count = (offset_and_count & ((1u << 8u) - 1u));
    output_.count = 0u;
    ClusterOffsetAndCount _e68 = output_;
    return _e68;
}
```

The code may look a bit strangely-written because it is parsed from WGSL by the Rust naga library, which then converts it to the above GLSL when the target platform is WebGL2.

Through repeated testing, I have managed to isolate that the problematic line of code is:

```
uint offset_and_count = _group_0_binding_8_fs.data[(cluster_index >> 2u)][(cluster_index & ((1u << 2u) - 1u))];
```

If I use constants to index into the array and uvec4 components, it works fine. i.e. this works fine:
```
uint offset_and_count = _group_0_binding_8_fs.data[0u][0u];
```

If I use a variable containing a constant value for the first index into the array, it also works fine. i.e. this works fine:
```
uint item_index = 0u;
uint offset_and_count = _group_0_binding_8_fs.data[item_index][0u];
```

However, if I use a variable containing an index that is dynamically calculated, the page just kind of hangs, nothing is ever displayed in the canvas, and if I go into the developer tools to inspect the GLSL program source it just never loads. It seems like the WebGL2 stuff just gets stuck. i.e. this does not work:
```
uint item_index = cluster_index / 4u; // I first suspected bitwise operators were the problem so I tried switching to division and modulo instead
uint offset_and_count = _group_0_binding_8_fs.data[item_index][0u];
```

In my mind, that this only happens on an M1 running macOS, and not on Intel, suggests that the Intel iGPU/AMD Radeon drivers have no problem with dynamic indexing into fixed-size uniform arrays, and that the M1 GPU drivers do. I don't know if dynamic indexing into fixed-size uniform arrays _should_ be supported here, but it works in Firefox 95.0.2 and Chrome 97.0.4692.71 and I thought all three browsers were using ANGLE to support WebGL2 but leverage Metal for rendering on macOS. I'd be interested to understand where the problem lies. :)

I thought it could be possibly related to this issue: https://bugs.webkit.org/show_bug.cgi?id=214393
Comment 1 Rob Swain 2022-01-06 10:41:16 PST
I'm bumping this to P1 because the bug reporting guidelines say that if the issue is something that worked in previous versions of WebKit and it has regressed, then bumping to P1 is OK. So I hope that is indeed OK. :)

I tested this with "WebGL via Metal" and my understanding is that Safari 15.1 introduced using ANGLE for supporting WebGL on top of Metal.
Comment 2 Rob Swain 2022-01-06 10:43:59 PST
(In reply to Rob Swain from comment #1)
> I tested this with "WebGL via Metal" and my understanding is that Safari
> 15.1 introduced using ANGLE for supporting WebGL on top of Metal.

Sorry, that was supposed to read: I tested this with "WebGL via Metal" **disabled** and the problematic code worked fine.
Comment 3 Kenneth Russell 2022-01-06 14:33:09 PST
Please provide a self-contained test case demonstrating the problem. Without one, it will be impossible to investigate this further.
Comment 4 Rob Swain 2022-01-07 03:06:54 PST
Created attachment 448578 [details]
3d scene example that reproduces the problem

The problem I am experiencing is implemented as WGSL shader code in the open source bevy game engine. As mentioned, the naga library is parsing WGSL and outputting GLSL for consumption by WebGL2 APIs. I am not currently writing any WebGL2/GLSL code myself, rather I am writing Rust/WGSL and compiling it to wasm.

However, I tried to implement a small test case which used a fullscreen triangle and the uniform buffer, structs, and function as described above, and I was unable to reproduce but I'm not sure I'm doing everything in the same way as in the code where I can reproduce the problem 100% consistently.

In the interest of providing you all a way of reproducing the problem, I have attached a zip file containing a directory containing some html, js, and wasm that is a simple scene with a ground plane, a cube, camera, and a light. If you unpack this somewhere, and serve the directory, then open the 3d_scene.html in Safari 15.2 with "WebGL via Metal" enabled (i.e. default configuration) then you should only see a white background with diagonal black lines (some CSS style that makes it easier to see when the canvas is actually being rendered to) and the tab hangs. If I disabled "WebGL via Metal", close Safari (I don't know if this is necessary when changing the configuration but I thought it might be), open it again and open the page, after maybe 15s or so, the scene renders correctly.

Let me know if that is enough or if you need something more. Hopefully it suffices. Thank you for the support!
Comment 5 Radar WebKit Bug Importer 2022-01-13 10:35:27 PST
<rdar://problem/87558302>
Comment 6 Kyle Piddington 2022-03-02 14:47:15 PST
This is certainly an issue with the translator, I think the uniform struct might be too big post-saturation modification. I'm not sure why it's causing a full tab-stall though...
Comment 7 Rob Swain 2022-03-16 07:33:48 PDT
(In reply to Kyle Piddington from comment #6)
> This is certainly an issue with the translator, I think the uniform struct
> might be too big post-saturation modification. I'm not sure why it's causing
> a full tab-stall though...

What is the maximum size of the uniform struct through WebGL2 in Safari and how much overhead does the 'post-saturation modification' add? I can test this.
Comment 8 Rob Swain 2022-03-16 08:40:33 PDT
I tried reducing the binding size to 16000 bytes, and even to 8000 bytes, and that did not help. It still hangs.
Comment 9 Kevin Rogovin 2022-04-07 23:18:26 PDT
I have the same issue on both an M1 and AMD GPU on MacOS in a complicated application when WebGL2 is configured to be translated to Metal. I work-around the issue by using usampler2D(then the application works, but less efficiently than it would with a uniform buffer object). In my application the UBO's in question are of the form:

```
uniform UBO
{
  uvec4 data[N];
};
```

and access is dynamic on the index into `data`. 

As a side note, Safari from around a year ago did work with this code (but I think I needed to configure it to use the GL backend, but the GL backend is even more broken because even with the work around the tab crashes).
Comment 10 Kevin Rogovin 2022-04-07 23:21:01 PDT
(In reply to Kevin Rogovin from comment #9)
> I have the same issue on both an M1 and AMD GPU on MacOS in a complicated
> application when WebGL2 is configured to be translated to Metal. I
> work-around the issue by using usampler2D(then the application works, but
> less efficiently than it would with a uniform buffer object). In my
> application the UBO's in question are of the form:
> 
> ```
> uniform UBO
> {
>   uvec4 data[N];
> };
> ```
> 
> and access is dynamic on the index into `data`. 
> 
> As a side note, Safari from around a year ago did work with this code (but I
> think I needed to configure it to use the GL backend, but the GL backend is
> even more broken because even with the work around the tab crashes).

One more thing, which I forgot to mention, the value of N is made so that the size of the UBO does not exceed GL_MAX_UNIFORM_BLOCK_SIZE