WebKit Bugzilla
RESOLVED FIXED
302711
WebGPU Crash on iOS with Time-Varying Mesh Access using instancing vertex buffers
https://bugs.webkit.org/show_bug.cgi?id=302711
s.todchuk
Reported
2025-11-18 06:08:27 PST
Created attachment 477420
HTML file demonstrating the crash (takes around 70 sec)

Rendering with instancing vertex buffers crashes after ~70s when mesh access order varies per frame. Works fine with fixed order or storage buffers.

# WebGPU Crash on iOS with Time-Varying Mesh Access Patterns

## Summary

Safari on iOS crashes when rendering multiple meshes with **mesh access order that varies per frame** while using instancing vertex buffers for transforms. The crash is **time-dependent** (accumulates over ~70 seconds), **scales with mesh count and object count**, indicating a memory corruption or resource tracking bug in iOS WebGPU's Metal backend.

**Does NOT crash on macOS Safari** - iOS-specific bug.

## Environment

- **Device**: iPhone 15 Pro/16 Pro Max, iOS 26.0/26.1
- **Browser**: Safari (WebKit Metal backend)
- **Cross-platform**: Does NOT crash on Chrome/Edge/Firefox (Windows/Android/macOS) and Safari macOS

## Reproduction Steps

### Test Case: `iPhoneWGPUCrash.html`

Standalone HTML file demonstrating the crash.
Error after ~70 seconds: `"InvalidStateError: GPUCommandEncoder.finish: Unable to finish."`

**Minimum crash conditions:**

- 1000+ unique meshes, 10000+ object instances
- Instancing vertex buffer for per-object transforms
- **Mesh access order that changes every frame** (random shuffle OR random start offset)

**Config:** `MESH_COUNT`, `GRID_SIZE`, `VARY_RQ_HEAD`, `DO_RQ_SHUFFLE` (see code comments)

### Key Findings

**What triggers the crash:**

- **Time-varying mesh access patterns** - order changes per frame (whether via `firstIndex` OR buffer binding)
- Instancing vertex buffer for transforms
- Scales with mesh count (min: 1000) and object count (min: 4096 on iPhone 15 Pro)
- Time-dependent failure (~70s), not immediate validation error

**What does NOT affect crash:**

- Mesh complexity or vertex format
- `baseVertex` parameter
- Buffer layout (shared vs separate buffers)
- Draw call count (batching improves FPS but doesn't prevent crash)

### Workarounds Tested

| Workaround | Result |
|------------|--------|
| Bake vertex offsets, `baseVertex=0` | ❌ Crashes |
| Separate buffers per mesh | ❌ Crashes |
| Batch draws via instancing | ❌ Crashes |
| Static transform buffers (no updates) | ❌ Crashes |
| Indirect drawing (`drawIndexedIndirect`) | ❌ Crashes |
| **Fixed mesh order per frame** (sequential OR constant random) | ✅ Works |
| **Storage/constant buffer for transforms** (instead of instancing vertex buffer) | ✅ Works |

**Root cause:** Instancing vertex buffer + time-varying mesh access order = crash. Fixed order (even if non-sequential) works fine.
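
To make the distinction between fixed and per-frame-varying order concrete, here is a minimal sketch of how a render queue might be ordered in each mode. The function `buildDrawOrder` and its mode names are hypothetical and are not taken from the attached test case; the attachment's `VARY_RQ_HEAD` and `DO_RQ_SHUFFLE` flags presumably select comparable behavior.

```javascript
// Hypothetical helper: returns the order in which mesh indices are drawn in one frame.
// mode "fixed"   -> identical order every frame (reported as working)
// mode "rotate"  -> same cyclic sequence, random start offset each frame
// mode "shuffle" -> fresh random permutation each frame
function buildDrawOrder(meshCount, mode, random = Math.random) {
  const order = Array.from({ length: meshCount }, (_, i) => i);
  if (mode === "shuffle") {
    // Fisher-Yates shuffle over the mesh indices.
    for (let i = meshCount - 1; i > 0; i--) {
      const j = Math.floor(random() * (i + 1));
      [order[i], order[j]] = [order[j], order[i]];
    }
    return order;
  }
  if (mode === "rotate") {
    const head = Math.floor(random() * meshCount);
    return order.map((_, i) => (head + i) % meshCount);
  }
  return order;
}
```

Calling `buildDrawOrder(meshCount, "shuffle")` or `buildDrawOrder(meshCount, "rotate")` once per frame gives the time-varying pattern described above; calling it once at startup and reusing the result gives the "constant random" order from the workaround table.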
## Impact on Real-World Applications

This bug **blocks all standard 3D engine techniques** that vary rendering order per frame:

- **Frustum Culling** - rendering only visible objects
- **Depth Sorting** - transparent object ordering
- **Material Batching** - grouping by material/shader
- **LOD Systems** - dynamic mesh detail switching
- **Dynamic Scenes** - adding/removing objects

**Result:** iOS WebGPU is effectively unusable for production 3D applications.

## Business Impact

**Blocking delivery to enterprise customers:** ConocoPhillips and AkerBP (major oil & gas companies) are waiting for our 3D engine product, which uses these exact rendering patterns. We cannot ship a product that crashes on iOS.

**Storage buffer workaround limitations:**

- Requires major architectural changes
- Reduces rendering efficiency (alignment overhead, suboptimal memory access)
- Limits hardware/browser compatibility (stricter size limits)

## Request

Please investigate this memory corruption/resource tracking bug in iOS WebGPU's Metal backend. This is a **critical, reproducible issue** blocking legitimate 3D rendering techniques and enterprise product deliveries.
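
For reference, the two ways of supplying per-object transforms that this report contrasts can be sketched as WebGPU descriptor objects. This is an illustration under assumptions, not the test case's code: it assumes one 4x4 float matrix (64 bytes) per object, and the shader locations and binding number are arbitrary.

```javascript
// Assumption: each object's transform is a 4x4 float32 matrix (4 columns x 16 bytes).
const MAT4_BYTES = 64;

// Path A (crashes in this report): transforms read from a per-instance vertex buffer.
// A mat4 is passed as four float32x4 attributes, one per column.
const instanceTransformLayout = {
  arrayStride: MAT4_BYTES,
  stepMode: "instance",
  attributes: [0, 1, 2, 3].map((col) => ({
    shaderLocation: 4 + col, // arbitrary locations chosen for this sketch
    offset: col * 16,
    format: "float32x4",
  })),
};

// Path B (workaround): transforms live in a read-only storage buffer and the
// vertex shader indexes the array itself (for example by instance index).
const GPU_SHADER_STAGE_VERTEX = 0x1; // numeric value of GPUShaderStage.VERTEX
const storageTransformEntry = {
  binding: 0, // arbitrary binding chosen for this sketch
  visibility: GPU_SHADER_STAGE_VERTEX,
  buffer: { type: "read-only-storage" },
};
```

`instanceTransformLayout` would go in the `buffers` array of a render pipeline's vertex state, while `storageTransformEntry` would go in the `entries` of a bind group layout.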
Attachments
HTML file demonstrating the crash (takes around 70 sec) (27.66 KB, text/html), 2025-11-18 06:08 PST, s.todchuk
patch from April (26.53 KB, patch), 2025-11-20 10:03 PST, Mike Wyrzykowski
s.todchuk
Comment 1
2025-11-18 06:31:08 PST
https://webgpu.github.io/webgpu-samples/sample/animometer/
also crashes after some time (~2 minutes) with numTriangles=20000, renderBundles=false, dynamicOffsets=true
Radar WebKit Bug Importer
Comment 2
2025-11-18 22:09:08 PST
<rdar://problem/165023230>
Mike Wyrzykowski
Comment 3
2025-11-18 22:21:28 PST
Thank you for the repro case. The main difference between iOS and macOS is that iOS will terminate the process due to memory pressure, whereas macOS will not until it reaches much higher thresholds. In the repro, memory starts at 2GB and climbs to 3GB in WebKit's GPU process on macOS as well, so the retain issue is reproducible on macOS. It seems a large amount of memory is being retained when it should not be; I will take a look.
Mike Wyrzykowski
Comment 4
2025-11-18 22:29:25 PST
Checking memgraphs for the com.apple.WebKit.GPU and com.apple.WebKit.WebKit processes.
Mike Wyrzykowski
Comment 5
2025-11-18 22:30:00 PST
For reference, memory usage on Chrome seems to be about half or less.
Mike Wyrzykowski
Comment 6
2025-11-18 22:32:21 PST
My guess, based on the report details, is that this repro constantly triggers vertex buffer validation and that we have a retain issue there.
Mike Wyrzykowski
Comment 7
2025-11-19 14:59:36 PST
Skipping vertex buffer validation keeps memory usage stable at around 500MB over the same 70-second period. It certainly appears we have unbounded memory growth due to buffer validation.
Mike Wyrzykowski
Comment 8
2025-11-19 15:00:00 PST
Great bug report by the way, thank you.
Mike Wyrzykowski
Comment 9
2025-11-19 15:35:33 PST
So it appears our cache quickly approaches ~5 million elements
https://github.com/WebKit/WebKit/blob/7d08e130dc4395638075edb553966d2b4a6659b9/Source/WebGPU/WebGPU/Buffer.h#L183
Either we are incorrectly missing the cache or we need to clear it.
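
Illustration only (this is not WebKit's code, and the key format is made up): one generic way to keep such a cache from growing without bound is to cap its size and evict the least recently used entry. Tracking hits and misses also helps tell "we keep missing the cache" apart from "we never clear it".

```javascript
// Hypothetical size-capped cache with least-recently-used eviction.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map(); // a Map iterates in insertion order, oldest entry first
    this.hits = 0;
    this.misses = 0;
  }

  get(key) {
    if (!this.map.has(key)) {
      this.misses++;
      return undefined;
    }
    // Re-insert the entry so it becomes the most recently used one.
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value);
    this.hits++;
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // Evict the least recently used entry (the first key in iteration order).
      this.map.delete(this.map.keys().next().value);
    }
  }
}
```

With a cap in place, a workload whose keys change every frame shows up as a high miss count rather than as unbounded memory growth.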
Mike Wyrzykowski
Comment 10
2025-11-19 15:56:30 PST
Performance-wise, buffer validation has a severe negative impact here. With buffer validation disabled I observe ~45fps; with it enabled I observe ~9fps. Chrome on the same Mac is ~30fps. There is more going on than just an out-of-control cache: even with the cache size limited to 10 elements, memory usage is still 1.6GB.
Mike Wyrzykowski
Comment 11
2025-11-20 09:06:26 PST
The memory usage appears to originate from the Vertex : Vertex memory barriers we emit, of which there are several thousand per frame.
Mike Wyrzykowski
Comment 12
2025-11-20 09:31:10 PST
We can emit a single memory barrier by switching to MTLParallelRenderCommandEncoder. I.e., perform vertex buffer validation for all draws first, then proceed with standard rendering. I will try migrating to MTLParallelRenderCommandEncoder
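
Illustration only: the reordering described in this comment can be shown with a toy model that logs encoded steps. It does not model Metal or `MTLParallelRenderCommandEncoder`; the function names and log strings are made up, and it only shows why validating every draw up front needs one barrier instead of one per draw.

```javascript
// Toy model: each draw is validated immediately before it is issued,
// so a barrier separates every validation from its draw.
function encodeInterleaved(draws) {
  const log = [];
  for (const d of draws) {
    log.push(`validate ${d}`, "barrier", `draw ${d}`);
  }
  return log;
}

// Toy model: all validations run first, a single barrier follows,
// then the draws proceed as standard rendering.
function encodeValidateFirst(draws) {
  const log = draws.map((d) => `validate ${d}`);
  if (draws.length > 0) {
    log.push("barrier");
  }
  for (const d of draws) {
    log.push(`draw ${d}`);
  }
  return log;
}
```

For N draws the first ordering emits N barriers and the second emits one, which is the reduction this comment is aiming for.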
Mike Wyrzykowski
Comment 13
2025-11-20 10:03:54 PST
Created attachment 477453: patch from April
I wrote a patch in April to switch to parallel command encoding for this purpose, but there were some bugs and it was deemed too risky. I'm going to try to clean it up so we remove all but one of the memory barriers.
Mike Wyrzykowski
Comment 14
2025-11-20 14:48:12 PST
MTLParallelRenderCommandEncoder and limiting the cache size resolve the memory growth. Performance is still not great; maybe there is something easy that resolves that too.
Mike Wyrzykowski
Comment 15
2025-11-20 15:21:25 PST
Oh nice, using the ring buffer allocator gets us ~35 fps, and Chrome is ~33 fps on the same Mac, so virtually identical. Going to make an iOS build to ensure the issue is fully addressed.
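
Illustration only: a ring buffer allocator, in general terms, hands out offsets from one fixed-size buffer and wraps back to the start instead of creating a new allocation for each request. The sketch below is a generic version of that idea, not WebKit's allocator; the class name, default alignment, and API are assumptions.

```javascript
// Hypothetical ring sub-allocator over a single fixed-capacity buffer.
// A real GPU allocator must also ensure the GPU has finished reading a
// region before that region is reused; this sketch omits that tracking.
class RingAllocator {
  constructor(capacity, alignment = 16) {
    this.capacity = capacity;
    this.alignment = alignment;
    this.head = 0; // next free byte offset
  }

  // Returns the byte offset of a region of at least `size` bytes.
  allocate(size) {
    const aligned = Math.ceil(size / this.alignment) * this.alignment;
    if (aligned > this.capacity) {
      throw new RangeError("request larger than ring capacity");
    }
    if (this.head + aligned > this.capacity) {
      this.head = 0; // wrap around and reuse the start of the buffer
    }
    const offset = this.head;
    this.head += aligned;
    return offset;
  }
}
```

Because offsets are recycled, per-draw scratch data can share one long-lived buffer rather than triggering a new allocation each time.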
Mike Wyrzykowski
Comment 16
2025-11-20 15:31:25 PST
Pull request:
https://github.com/WebKit/WebKit/pull/54279
Mike Wyrzykowski
Comment 17
2025-11-24 12:30:26 PST
Seems fine on an iPhone 13 mini, no crashes after several minutes
EWS
Comment 18
2025-12-04 16:20:06 PST
Committed 303942@main (df6c49376568): <https://commits.webkit.org/303942@main>
Reviewed commits have been landed. Closing PR #54279 and removing active labels.