Bug 238722 - [WebGPU] Implement missing validity checks
Summary: [WebGPU] Implement missing validity checks
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebGPU (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Myles C. Maxfield
URL:
Keywords: InRadar
Depends on: 238720
Blocks: 238723
  Show dependency treegraph
 
Reported: 2022-04-04 00:18 PDT by Myles C. Maxfield
Modified: 2022-04-11 17:14 PDT (History)
4 users (show)

See Also:


Attachments
Patch (14.12 KB, patch)
2022-04-04 00:20 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Patch (16.73 KB, patch)
2022-04-04 00:23 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Rebased (16.05 KB, patch)
2022-04-07 16:09 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Rebased (15.14 KB, patch)
2022-04-08 20:16 PDT, Myles C. Maxfield
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Myles C. Maxfield 2022-04-04 00:18:37 PDT
.
Comment 1 Myles C. Maxfield 2022-04-04 00:20:41 PDT
Created attachment 456537 [details]
Patch
Comment 2 Myles C. Maxfield 2022-04-04 00:23:31 PDT
Created attachment 456538 [details]
Patch
Comment 3 Kimmo Kinnunen 2022-04-07 03:32:26 PDT
Comment on attachment 456538 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=456538&action=review

> Source/WebGPU/WebGPU/CommandEncoder.mm:102
> +    if (!source.isValidToUseWith(*this))

My mental model is:
All functions do:
 1) `this` is validated with isValid()
 2) all inputs are validated with isValidToUseWith(*this)

so this seems to lack the 1)

> Source/WebGPU/WebGPU/CommandEncoder.mm:167
>  

Note: here we have a contrasting internal functions:

validateCopyBufferToBuffer(const Buffer& source, uint64_t sourceOffset, const Buffer& destination, uint64_t destinationOffset, uint64_t size)
vs
static bool validateImageCopyBuffer(const WGPUImageCopyBuffer& imageCopyBuffer)

It is kind of easier to follow validateCopyBufferToBuffer because all the parameters are converted to "business-as-usual" internal parameters (as opposed to weird-and-cumbersome-C-API-types)

OTOH validateImageCopyBuffer codepaths are weird and unusual as they leak the API types to the internal

So example:
Where is copyTextureToBuffer "isValidToUseWith" checks?

I know it's cumbersome if there's a descriptor with 100 fields, and you don't want to 
a) add a duplicated descriptor but with internal types
b) add 100 params

I don't think there's good options here.

It's a bit hard to understand that all these have 
1) isVAlid check
2) isValidToUseWith check for each resource referenced

    RefPtr<ComputePassEncoder> beginComputePass(const WGPUComputePassDescriptor&);
    RefPtr<RenderPassEncoder> beginRenderPass(const WGPURenderPassDescriptor&);
    void copyBufferToBuffer(const Buffer& source, uint64_t sourceOffset, const Buffer& destination, uint64_t destinationOffset, uint64_t size);
    void copyBufferToTexture(const WGPUImageCopyBuffer& source, const WGPUImageCopyTexture& destination, const WGPUExtent3D& copySize);
    void copyTextureToBuffer(const WGPUImageCopyTexture& source, const WGPUImageCopyBuffer& destination, const WGPUExtent3D& copySize);
    void copyTextureToTexture(const WGPUImageCopyTexture& source, const WGPUImageCopyTexture& destination, const WGPUExtent3D& copySize);
    void clearBuffer(const Buffer&, uint64_t offset, uint64_t size);
    RefPtr<CommandBuffer> finish(const WGPUCommandBufferDescriptor&);
    void insertDebugMarker(String&& markerLabel);
    void popDebugGroup();
    void pushDebugGroup(String&& groupLabel);
    void resolveQuerySet(const QuerySet&, uint32_t firstQuery, uint32_t queryCount, const Buffer& destination, uint64_t destinationOffset);
    void writeTimestamp(const QuerySet&, uint32_t queryIndex);
    void setLabel(String&&);

In  my mind reviewing this, I'm thinking: "In the diff, I expect to see isValid() check for each function. Why is there not?"
Comment 4 Myles C. Maxfield 2022-04-07 15:06:04 PDT
Comment on attachment 456538 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=456538&action=review

>> Source/WebGPU/WebGPU/CommandEncoder.mm:102
>> +    if (!source.isValidToUseWith(*this))
> 
> My mental model is:
> All functions do:
>  1) `this` is validated with isValid()
>  2) all inputs are validated with isValidToUseWith(*this)
> 
> so this seems to lack the 1)

Coincidentally I had the same thought when I was implementing this. I previously emailed one of the spec editors about it; here's the relevant part of his reply:

> This is at intentional (though that doesn't mean it's the right choice). I think the thinking was these checks were supposed to be specific to the object in question (`source`/`destination`), and there should be an explicit check for whether `this` is valid, which may just be missing in a lot of places.
> 
> In this particular case though, it's inconsequential whether the encoder is valid. IIRC, the only effect from copyBufferToBuffer validation failing is (supposed to be) that the encoder becomes invalid. So it doesn't matter to fail if the encoder was already invalid.

I think the right answer here is for me to make a PR to the spec to add a bunch of "`this` must be valid" checks, and then once that lands, to update our implementation to add the checks.

>> Source/WebGPU/WebGPU/CommandEncoder.mm:167
>>  
> 
> Note: here we have a contrasting internal functions:
> 
> validateCopyBufferToBuffer(const Buffer& source, uint64_t sourceOffset, const Buffer& destination, uint64_t destinationOffset, uint64_t size)
> vs
> static bool validateImageCopyBuffer(const WGPUImageCopyBuffer& imageCopyBuffer)
> 
> It is kind of easier to follow validateCopyBufferToBuffer because all the parameters are converted to "business-as-usual" internal parameters (as opposed to weird-and-cumbersome-C-API-types)
> 
> OTOH validateImageCopyBuffer codepaths are weird and unusual as they leak the API types to the internal
> 
> So example:
> Where is copyTextureToBuffer "isValidToUseWith" checks?
> 
> I know it's cumbersome if there's a descriptor with 100 fields, and you don't want to 
> a) add a duplicated descriptor but with internal types
> b) add 100 params
> 
> I don't think there's good options here.
> 
> It's a bit hard to understand that all these have 
> 1) isVAlid check
> 2) isValidToUseWith check for each resource referenced
> 
>     RefPtr<ComputePassEncoder> beginComputePass(const WGPUComputePassDescriptor&);
>     RefPtr<RenderPassEncoder> beginRenderPass(const WGPURenderPassDescriptor&);
>     void copyBufferToBuffer(const Buffer& source, uint64_t sourceOffset, const Buffer& destination, uint64_t destinationOffset, uint64_t size);
>     void copyBufferToTexture(const WGPUImageCopyBuffer& source, const WGPUImageCopyTexture& destination, const WGPUExtent3D& copySize);
>     void copyTextureToBuffer(const WGPUImageCopyTexture& source, const WGPUImageCopyBuffer& destination, const WGPUExtent3D& copySize);
>     void copyTextureToTexture(const WGPUImageCopyTexture& source, const WGPUImageCopyTexture& destination, const WGPUExtent3D& copySize);
>     void clearBuffer(const Buffer&, uint64_t offset, uint64_t size);
>     RefPtr<CommandBuffer> finish(const WGPUCommandBufferDescriptor&);
>     void insertDebugMarker(String&& markerLabel);
>     void popDebugGroup();
>     void pushDebugGroup(String&& groupLabel);
>     void resolveQuerySet(const QuerySet&, uint32_t firstQuery, uint32_t queryCount, const Buffer& destination, uint64_t destinationOffset);
>     void writeTimestamp(const QuerySet&, uint32_t queryIndex);
>     void setLabel(String&&);
> 
> In  my mind reviewing this, I'm thinking: "In the diff, I expect to see isValid() check for each function. Why is there not?"

I think you're bringing up a few different points. Let's try to tease them apart:

1. Functions inside `namespace WebGPU` interrogate `WGPU` struct types. You're right that the only solutions I've though of to this problem are to either create duplicate-yet-slightly-different descriptors or to flatten the structs into a (potentially very long) parameter list. I think both of those medicines are worse than the sickness, though, for a few reasons:
    A) I'm already trying to get rid of the various copies of all the deeply nested structs inside our IPC code
    B) We've already decided that the WGSL compiler will be using these WGSL struct types directly, instead of copying them into WGSL-specific copies of the structs. See https://a1391192.slack.com/archives/GMHCJ75MK/p1645336902048319. So we're already operating under the assumption that there won't be a layering boundary between WGPU types and the internal implementation
    C) It would be kind of unfortunate to create duplicated-yet-slightly-different structs for some WGPU types, but not others, depending on whether there happens to be a resource transitively somewhere within the type. Both because it will be hard to remember which types get a copy, and also because if someone sticks a resource into some struct in a future version of the spec, that could potentially cause a large set of structs to have to be duplicated (because anything that transitively contains a resource would be duplicated).

2. Where is copyTextureToBuffer "isValidToUseWith" checks? I think, for validation purposes where there isn't significant performance impact, matching the spec's language/layout is probably the best answer to this kind of question. I took inspiration from the design of LFC, which tries to name / organize concepts to match the terms-of-art in the relevant specs. (Of course, matching the organization of the spec is much less important than performance, so if performance is a concern, this design goes out the window. I don't think, in this situation here, though, that performance is a concern. At least not yet - we'll definitely get to the point where we're measuring various benchmarks, and then we can revisit this.)

3. I expect to see isValid() check for each function. Why is there not? Like above, I think the best path forward here is for me to add these checks to the spec first, and then update our implementation accordingly. I'll put that on my to-do list.
Comment 5 Myles C. Maxfield 2022-04-07 16:09:17 PDT
Created attachment 456981 [details]
Rebased
Comment 6 Myles C. Maxfield 2022-04-08 01:24:56 PDT
Kimmo and I discussed this offline. The conclusion we reached was that I should write a code generate to generate WebGPU:: versions of all the structs. These new structs will hold references to other WebGPU:: objects rather than WGPU objects.

We can also use the code generator to generate IPC structs too.

We don't want to mark any existing patches as blocked on that (somewhat large) code generator project.
Comment 7 Myles C. Maxfield 2022-04-08 20:16:25 PDT
Created attachment 457138 [details]
Rebased
Comment 8 Radar WebKit Bug Importer 2022-04-11 00:19:14 PDT
<rdar://problem/91550917>
Comment 9 EWS 2022-04-11 15:05:05 PDT
Committed r292728 (249513@main): <https://commits.webkit.org/249513@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 457138 [details].
Comment 10 Myles C. Maxfield 2022-04-11 17:14:59 PDT
Committed r292744 (249528@trunk): <https://commits.webkit.org/249528@trunk>