Bug 238166

Summary: [WebGPU] Limit the number of MTLCommandQueue objects created
Product: WebKit
Reporter: Myles C. Maxfield <mmaxfield>
Component: New Bugs
Assignee: Myles C. Maxfield <mmaxfield>
Status: RESOLVED WONTFIX
Severity: Normal
CC: dino, djg, kkinnunen
Priority: P2
Version: WebKit Nightly Build
Hardware: Unspecified
OS: Unspecified
Bug Depends on:
Bug Blocks: 238164
Attachments:
Description: Patch
Flags: ews-feeder: commit-queue-

Description Myles C. Maxfield 2022-03-21 16:35:53 PDT
[WebGPU] Limit the number of MTLCommandQueue objects created
Comment 1 Myles C. Maxfield 2022-03-21 16:45:41 PDT
Created attachment 455300
Patch
Comment 2 Kimmo Kinnunen 2022-03-22 05:07:17 PDT
I don't know enough about the API.

Is this a standard technique? The only alternative I can immediately see is limiting the number of devices, which is probably not an option if the test suite creates thousands of devices.

Can there be ordering problems if a flush on one device induces a flush of another device's commands?

// device A tasks
command1
command2
signalSemaphore(s1)

// device B tasks
waitForSemaphore(s1)
commandZZ

// A non-linear programmer writes them as
a.command1
a.command2
b.waitForSemaphore(s1)
b.commandZZ
b.flush
a.signalSemaphore(s1)
a.flush


// The shared command buffer implementation would contain
command1
command2
waitForSemaphore(s1)  // deadlock
commandZZ
signalSemaphore(s1)



// whereas an implementation with individual command buffers would contain

// command buffer 1
command1
command2
signalSemaphore(s1)

// command buffer 2
waitForSemaphore(s1) // No deadlock?
commandZZ
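
The scenario above can be sketched as a small simulation. This is only an illustrative model, not Metal or WebGPU code: the `run` function, the `(op, arg)` tuples, and the assumption that each queue executes strictly in order are all hypothetical.

```python
from collections import deque

def run(queues):
    """Run each queue strictly in order; return (executed, deadlocked).

    Hypothetical model: a "wait" at the head of a queue blocks that queue
    until the matching "signal" has executed on some queue.
    """
    queues = [deque(q) for q in queues]
    signaled = set()
    executed = []
    progress = True
    while progress:
        progress = False
        for q in queues:
            while q:
                op, arg = q[0]
                if op == "wait" and arg not in signaled:
                    break  # head of this queue is blocked
                if op == "signal":
                    signaled.add(arg)
                executed.append((op, arg))
                q.popleft()
                progress = True
    # Anything left over can never run: every remaining queue is blocked.
    deadlocked = any(q for q in queues)
    return executed, deadlocked

# One shared queue, in the order the "non-linear programmer" flushed.
shared = [[
    ("cmd", "command1"), ("cmd", "command2"),
    ("wait", "s1"), ("cmd", "commandZZ"),
    ("signal", "s1"),
]]

# One queue per device.
separate = [
    [("cmd", "command1"), ("cmd", "command2"), ("signal", "s1")],
    [("wait", "s1"), ("cmd", "commandZZ")],
]

print(run(shared)[1])    # True: the wait sits ahead of its own signal
print(run(separate)[1])  # False: device A's queue signals independently
```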
Comment 3 Myles C. Maxfield 2022-03-22 12:15:02 PDT
You’re right that deadlocks would naturally be a problem. However, we have 2 mitigating factors:

1. We don’t have fences / semaphores / barriers in the API yet. In fact, we don’t even have multiple queues yet (for a single device) with which to use semaphores. See https://github.com/gpuweb/gpuweb/pull/1217
2. Even when we did have them, we solved this deadlock problem by validating that:
      A) Signal values only ever increase over time
      B) A command to signal the value being waited on has already been submitted

(This is only possible because we don’t have multi-threading yet, so a single thread sees every signal/wait command produced by the app.)
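
A minimal sketch of what validation rules A and B could look like, assuming a single thread observes every signal and wait in submission order. The `SignalWaitValidator` class, its method names, and the initial value of 0 are hypothetical and are not taken from the WebGPU specification or the patch.

```python
class SignalWaitValidator:
    """Hypothetical single-threaded validator shared by all devices."""

    def __init__(self):
        # Semaphore name -> highest signal value submitted so far.
        self._submitted = {}

    def signal(self, semaphore, value):
        # Rule A: signal values only ever increase over time.
        if value <= self._submitted.get(semaphore, 0):
            raise ValueError(f"signal value {value} does not increase")
        self._submitted[semaphore] = value

    def wait(self, semaphore, value):
        # Rule B: a signal for the awaited value must already be submitted.
        if value > self._submitted.get(semaphore, 0):
            raise ValueError(f"no submitted signal reaches {value}")

validator = SignalWaitValidator()
try:
    # The ordering from comment 2: b waits before a has signaled.
    validator.wait("s1", 1)
except ValueError as error:
    print("rejected:", error)

validator.signal("s1", 1)  # a.signalSemaphore(s1)
validator.wait("s1", 1)    # now accepted
```

Under these assumptions, the problematic ordering is rejected at submission time, so a wait can never be enqueued ahead of the signal that satisfies it.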
Comment 4 Myles C. Maxfield 2022-03-23 16:34:31 PDT
Comment on attachment 455300
Patch

I should see if the test suite calls GPUDevice.destroy().
Comment 5 Myles C. Maxfield 2022-03-24 16:47:39 PDT
(In reply to Myles C. Maxfield from comment #4)
> Comment on attachment 455300
> Patch
> 
> I should see if the test suite calls GPUDevice.destroy().

It totally does.