Bug 200606 - [WebGPU] Improve GPUBindGroup performance using one device-shared argument MTLBuffer
Summary: [WebGPU] Improve GPUBindGroup performance using one device-shared argument MT...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Justin Fan
URL:
Keywords: InRadar
: 198683 (view as bug list)
Depends on:
Blocks: 200657 200658
  Show dependency treegraph
 
Reported: 2019-08-09 22:26 PDT by Justin Fan
Modified: 2019-08-15 16:35 PDT (History)
7 users (show)

See Also:


Attachments
Patch (58.45 KB, patch)
2019-08-09 22:45 PDT, Justin Fan
no flags Details | Formatted Diff | Diff
Patch (58.59 KB, patch)
2019-08-12 13:07 PDT, Justin Fan
no flags Details | Formatted Diff | Diff
Patch (58.81 KB, patch)
2019-08-12 16:50 PDT, Justin Fan
no flags Details | Formatted Diff | Diff
Patch for landing (58.91 KB, patch)
2019-08-12 21:30 PDT, Justin Fan
no flags Details | Formatted Diff | Diff
Patch (58.33 KB, patch)
2019-08-13 12:48 PDT, Justin Fan
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Justin Fan 2019-08-09 22:26:55 PDT
[WebGPU] Improve GPUBindGroup performance using one device-shared argument MTLBuffer
Comment 1 Justin Fan 2019-08-09 22:45:00 PDT
Created attachment 375999 [details]
Patch
Comment 2 Myles C. Maxfield 2019-08-10 18:46:52 PDT
Comment on attachment 375999 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=375999&action=review

> Source/WebCore/ChangeLog:9
> +        Vastly improves GPUProgrammablePassEncoder.setBindGroup performance.

By how much? On which benchmark?
Comment 3 Myles C. Maxfield 2019-08-12 11:14:36 PDT
Justin and I had a long conversation this morning about allocation strategies. We should do some more investigation to determine whether the strategy in this patch is acceptable to commit or not.
Comment 4 Justin Fan 2019-08-12 13:07:09 PDT
Created attachment 376087 [details]
Patch
Comment 5 Justin Fan 2019-08-12 16:50:09 PDT
Created attachment 376106 [details]
Patch
Comment 6 Justin Fan 2019-08-12 16:53:33 PDT
Talked with Geoff Garen about memory allocation strategies. A vector-like growth model with a minimum size should be fine for now, especially since it appears that shared MTLBuffers have a minimum allocation size (4KB on a tested Mac) way larger than the average argument buffer.

For future, we should free unused argument buffer memory. This requires:
1) Compaction logic before deciding whether to allocate new memory for a new GPUBindGroup, and
2) Defragmentation logic whenever a GPUBindGroup is released. 

I'll file bugs to address these two issues.
Comment 7 Justin Fan 2019-08-12 17:24:54 PDT
To back up my "average argument buffer" size claim, here are some numbers:

Tier 1 Argument Buffer support in Metal on iOS devices is the minimum. Each argument buffer can encode:

31 buffers, which require 32 bytes each (16 bytes for pointer and 16 bytes for size constant)
31 textures, which require 32 bytes each
16 samplers, which require 16 bytes each

And other than the constant data we encode for buffer bounds checking, we don't support encoding extra constant data.

This means each argument buffer we support has a maximum possible encodedLength of 2240 B. A GPUBindGroup uses one argument buffer per pipeline stage type, so for the three of vertex, fragment, and compute, worst case argument buffer size for a single GPUBindGroup is 6720 B.
Comment 8 Myles C. Maxfield 2019-08-12 19:21:59 PDT
Comment on attachment 376106 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=376106&action=review

r=me assuming there are links to follow-up bugs about compaction and shrinking.

> Source/WebCore/platform/graphics/gpu/cocoa/GPUBindGroupAllocatorMetal.mm:83
> +        // Based off mimimum allocation for a memory-shared MTLBuffer on macOS.

don't you mean "shared memory"

> Source/WebCore/platform/graphics/gpu/cocoa/GPUBindGroupAllocatorMetal.mm:127
> +

Seems like this could benefit from a FIXME with a link to a bug about compaction
Comment 9 Justin Fan 2019-08-12 21:30:43 PDT
Created attachment 376137 [details]
Patch for landing
Comment 11 WebKit Commit Bot 2019-08-13 10:13:29 PDT
Comment on attachment 376137 [details]
Patch for landing

Rejecting attachment 376137 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-03', 'apply-attachment', '--no-update', '--non-interactive', 376137, '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit

Logging in as commit-queue@webkit.org...
Fetching: https://bugs.webkit.org/attachment.cgi?id=376137&action=edit
Fetching: https://bugs.webkit.org/show_bug.cgi?id=200606&ctype=xml&excludefield=attachmentdata
Processing 1 patch from 1 bug.
Processing patch 376137 from bug 200606.
Fetching: https://bugs.webkit.org/attachment.cgi?id=376137
Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Parsed 25 diffs from patch file(s).
patching file Source/WebCore/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file Source/WebCore/Modules/webgpu/WebGPUDevice.cpp
patching file Source/WebCore/SourcesCocoa.txt
patching file Source/WebCore/WebCore.xcodeproj/project.pbxproj
Hunk #1 succeeded at 13934 (offset 2 lines).
Hunk #2 FAILED at 17088.
Hunk #3 succeeded at 18275 (offset 2 lines).
Hunk #4 succeeded at 26066 (offset 4 lines).
Hunk #5 succeeded at 28609 (offset 4 lines).
Hunk #6 succeeded at 30824 (offset 4 lines).
1 out of 6 hunks FAILED -- saving rejects to file Source/WebCore/WebCore.xcodeproj/project.pbxproj.rej
patching file Source/WebCore/platform/graphics/gpu/GPUBindGroup.h
patching file Source/WebCore/platform/graphics/gpu/GPUBindGroupAllocator.h
patching file Source/WebCore/platform/graphics/gpu/GPUBindGroupLayout.h
patching file Source/WebCore/platform/graphics/gpu/GPUBuffer.h
patching file Source/WebCore/platform/graphics/gpu/GPUComputePassEncoder.h
patching file Source/WebCore/platform/graphics/gpu/GPUDevice.cpp
patching file Source/WebCore/platform/graphics/gpu/GPUDevice.h
patching file Source/WebCore/platform/graphics/gpu/GPUProgrammablePassEncoder.h
patching file Source/WebCore/platform/graphics/gpu/GPURenderPassEncoder.h
patching file Source/WebCore/platform/graphics/gpu/GPUTexture.h
patching file Source/WebCore/platform/graphics/gpu/cocoa/GPUBindGroupAllocatorMetal.mm
patching file Source/WebCore/platform/graphics/gpu/cocoa/GPUBindGroupMetal.mm
patching file Source/WebCore/platform/graphics/gpu/cocoa/GPUBufferMetal.mm
patching file Source/WebCore/platform/graphics/gpu/cocoa/GPUComputePassEncoderMetal.mm
patching file Source/WebCore/platform/graphics/gpu/cocoa/GPUDeviceMetal.mm
patching file Source/WebCore/platform/graphics/gpu/cocoa/GPUProgrammablePassEncoderMetal.mm
patching file Source/WebCore/platform/graphics/gpu/cocoa/GPURenderPassEncoderMetal.mm
patching file Source/WebCore/platform/graphics/gpu/cocoa/GPUTextureMetal.mm
patching file LayoutTests/ChangeLog
Hunk #1 succeeded at 1 with fuzz 3.
patching file LayoutTests/webgpu/bind-groups-expected.txt
patching file LayoutTests/webgpu/bind-groups.html

Failed to run "[u'/Volumes/Data/EWS/WebKit/Tools/Scripts/svn-apply', '--force']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

Full output: https://webkit-queues.webkit.org/results/12905201
Comment 12 Justin Fan 2019-08-13 12:48:32 PDT
Created attachment 376203 [details]
Patch
Comment 13 Justin Fan 2019-08-13 12:49:45 PDT
Comment on attachment 376203 [details]
Patch

Clearing flags on attachment: 376203

Committed r248606: <https://trac.webkit.org/changeset/248606>
Comment 14 Justin Fan 2019-08-13 12:49:46 PDT
All reviewed patches have been landed.  Closing bug.
Comment 15 Radar WebKit Bug Importer 2019-08-13 12:50:28 PDT
<rdar://problem/54269779>
Comment 16 Justin Fan 2019-08-15 16:35:11 PDT
*** Bug 198683 has been marked as a duplicate of this bug. ***