Bug 199215 - [WHLSL] Matrix memory layout should match HLSL by laying out columns linearly
Summary: [WHLSL] Matrix memory layout should match HLSL by laying out columns linearly
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebGPU (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P1 Normal
Assignee: Saam Barati
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-06-25 18:25 PDT by Justin Fan
Modified: 2019-07-15 20:54 PDT (History)
4 users (show)

See Also:


Attachments
patch (16.08 KB, patch)
2019-07-12 19:11 PDT, Saam Barati
mmaxfield: review+
Details | Formatted Diff | Diff
patch for landing (15.80 KB, patch)
2019-07-15 16:11 PDT, Saam Barati
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Justin Fan 2019-06-25 18:25:21 PDT
From MSL docs: 

>// This sets the 4th element of the 3rd column to 3.0. 
>m[2][3] = 3.0f;

This is opposite of HLSL, which is what WHLSL should be compatible with. I don't think we are taking this into account when compiling WHLSL.

From Microsoft docs:

>A 4x4 matrix is accessed with the following indices:

>[0][0], [0][1], [0][2], [0][3]
>[1][0], [1][1], [1][2], [1][3]
>[2][0], [2][1], [2][2], [2][3]
>[3][0], [3][1], [3][2], [3][3]
Comment 1 Justin Fan 2019-06-25 18:49:48 PDT
Myles' comments:

> 1) make the WHLSL functions round-trip. So if you invoke a bunch of different constructors and then run a bunch of getters, you get the expected thing.
> 2) make the memory layout match HLSL. If you upload 16 floats in a buffer and then tell the shader that the buffer contains a matrix, the getters should also return the expected thing
Comment 2 Myles C. Maxfield 2019-07-01 20:57:24 PDT
In Direct3D, if you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] in a buffer and tell the shader that the buffer is a matrix, and then ask the matrix for [0][1], the result is 5.

If you call float4x4(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16) and then you ask it for [0][1], the result is 2.

If you call float4x4(float4(1, 2, 3, 4), float4(5, 6, 7, 8), float4(9, 10, 11, 12), float4(13, 14, 15, 16)) and then you ask it for [0][1], the result is 2.
Comment 3 Myles C. Maxfield 2019-07-03 14:09:00 PDT
In Direct3D, if you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32] in a buffer, and tell the shader that it's two float4x4s next to each other, then you mul(buffer1[0], buffer1[1]) and put the result in another buffer and then read the buffer on the CPU side, the results are:

538
612
686
760
650
740
830
920
762
868
974
1080
874
996
1118
1240

Thus, the two matrices are
[1 5  9 13]   [17 21 25 29]
[2 6 10 14] * [18 22 26 30]
[3 7 11 15]   [19 23 27 31]
[4 8 12 16]   [20 24 28 32]
Comment 4 Myles C. Maxfield 2019-07-03 14:10:15 PDT
(In reply to Myles C. Maxfield from comment #3)
> In Direct3D, if you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
> 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32] in a
> buffer, and tell the shader that it's two float4x4s next to each other, then
> you mul(buffer1[0], buffer1[1]) and put the result in another buffer and
> then read the buffer on the CPU side, the results are:
> 
> 538
> 612
> 686
> 760
> 650
> 740
> 830
> 920
> 762
> 868
> 974
> 1080
> 874
> 996
> 1118
> 1240
> 
> Thus, the two matrices are
> [1 5  9 13]   [17 21 25 29]
> [2 6 10 14] * [18 22 26 30]
> [3 7 11 15]   [19 23 27 31]
> [4 8 12 16]   [20 24 28 32]

and buffer1[0][0] is [1 5 9 13]
Comment 5 Justin Fan 2019-07-03 18:29:23 PDT
For Metal Shading Language:

There is no difference between supplying [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] as a buffer to a shader and telling it to treat it as a float4x4, or using the float4x4() constructor with those values. Result[0] is [1, 2, 3, 4].

If you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32] in a buffer, and tell the shader that it's two float4x4s next to each other, the results of buffer1[0] * buffer1[1] are:

538
612
686
760
650
740
830
920
762
868
974
1080
874
996
1118
1240

Thus, the two matrices are
[1 5  9 13]   [17 21 25 29]
[2 6 10 14] * [18 22 26 30]
[3 7 11 15]   [19 23 27 31]
[4 8 12 16]   [20 24 28 32]

and buffer1[0][0] is [1, 2, 3, 4].

This makes sense, as MSL docs imply matrices are built AND stored as Arrays Of Columns Vectors.

Thus the difference remains: HLSL (and thus WHLSL) indexes matrices as Arrays of Row Vectors.
Comment 6 Justin Fan 2019-07-03 18:33:50 PDT
The difference is that HLSL's float4x4 constructor and indexing operators behave in row-major order, but in memory in both languages, the values are in column-major order.
Comment 7 Justin Fan 2019-07-03 19:18:27 PDT
To make things even more fun, currently in WHLSL, if you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32] in a buffer, and tell the shader that it's two float4x4s next to each other, the results of mul(buffer1[0], buffer1[1]) are:

250
260
270
280
618
644
670
696
986
1028
1070
1112
1354
1412
1470
1528

Implying that the matrices are:
[1   2  3  4]   [17 18 19 20]
[5   6  7  8] * [21 22 23 24]
[9  10 11 12]   [25 26 27 28]
[13 14 15 16]   [29 30 31 32]
Comment 8 Justin Fan 2019-07-03 19:28:25 PDT
This means: this patch must ensure that WHLSL indexing behavior matches HLSL, if we are going with HLSL source-compatibility. The underlying memory layout should not need change.
Comment 9 Justin Fan 2019-07-03 19:29:20 PDT
(In reply to Justin Fan from comment #8)
> This means: this patch must ensure that WHLSL indexing behavior matches
> HLSL, if we are going with HLSL source-compatibility. The underlying memory
> layout should not need change.

e.g. in WHLSL, result[0][0] gives [1, 2, 3, 4] when it should return [1, 5, 9, 13].
Comment 10 Saam Barati 2019-07-12 17:32:16 PDT
patch forthcoming
Comment 11 Saam Barati 2019-07-12 19:11:37 PDT
Created attachment 374068 [details]
patch
Comment 12 Justin Fan 2019-07-12 19:19:24 PDT
Comment on attachment 374068 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review

> LayoutTests/webgpu/whlsl-matrix-memory-layout.html:5
> +<script src="../resources/js-test-pre.js"></script>

Any reason why you didn't use the WHLSL harness to write this test?
Comment 13 Myles C. Maxfield 2019-07-12 20:04:02 PDT
Comment on attachment 374068 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review

> Source/WebCore/ChangeLog:11
> +        [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]

This is an odd way to structure this sentence. Why not say the memory is 1, 2, 3?

> Source/WebCore/ChangeLog:20
> +        contents of a matrix. So the matrix float4x3 will now be an array<float, 12>

Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing?
Comment 14 Myles C. Maxfield 2019-07-12 22:04:54 PDT
Comment on attachment 374068 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review

I’ll wait for Saam to answer the question about why we’re using arrays instead of matrixes as the container before finishing the review.

> Source/WebCore/Modules/webgpu/WHLSL/Metal/WHLSLNativeFunctionWriter.cpp:261
> +    auto numberOfMatrixColumns = [&] () -> unsigned {

Can we share some code?
Comment 15 Saam Barati 2019-07-12 22:23:45 PDT
Comment on attachment 374068 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review

>> Source/WebCore/ChangeLog:11
>> +        [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]
> 
> This is an odd way to structure this sentence. Why not say the memory is 1, 2, 3?

I’m confused what you’re suggesting. I’m just trying to show what the backing memory of the below matrix is. Im not sure why how other backing store of memory as an example is any clearer. Maybe you have something in mind?

>> Source/WebCore/ChangeLog:20
>> +        contents of a matrix. So the matrix float4x3 will now be an array<float, 12>
> 
> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing?

I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing

>> Source/WebCore/Modules/webgpu/WHLSL/Metal/WHLSLNativeFunctionWriter.cpp:261
>> +    auto numberOfMatrixColumns = [&] () -> unsigned {
> 
> Can we share some code?

Yup. Can fix

>> LayoutTests/webgpu/whlsl-matrix-memory-layout.html:5
>> +<script src="../resources/js-test-pre.js"></script>
> 
> Any reason why you didn't use the WHLSL harness to write this test?

The main reason was just to manually control the buffer explicitly in the test. But it could be written using the harness
Comment 16 Myles C. Maxfield 2019-07-12 22:49:32 PDT
Comment on attachment 374068 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review

>>> Source/WebCore/ChangeLog:11
>>> +        [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]
>> 
>> This is an odd way to structure this sentence. Why not say the memory is 1, 2, 3?
> 
> I’m confused what you’re suggesting. I’m just trying to show what the backing memory of the below matrix is. Im not sure why how other backing store of memory as an example is any clearer. Maybe you have something in mind?

This sentence is of the form “A [X] with the form [some complicated thing] should be the same as [this simple thing]”. I just mean that these sentences are usually of the form ”A [X] with the form [this simple thing] should be the same as [some complicated thing].”

>>> Source/WebCore/ChangeLog:20
>>> +        contents of a matrix. So the matrix float4x3 will now be an array<float, 12>
>> 
>> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing?
> 
> I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing

Oh, I meant: why are WHLSL matrices represented as MSL nested arrays instead of being represented by MSL matrices? That seem more natural.
Comment 17 Saam Barati 2019-07-12 23:39:17 PDT
Comment on attachment 374068 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review

>>>> Source/WebCore/ChangeLog:11
>>>> +        [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]
>>> 
>>> This is an odd way to structure this sentence. Why not say the memory is 1, 2, 3?
>> 
>> I’m confused what you’re suggesting. I’m just trying to show what the backing memory of the below matrix is. Im not sure why how other backing store of memory as an example is any clearer. Maybe you have something in mind?
> 
> This sentence is of the form “A [X] with the form [some complicated thing] should be the same as [this simple thing]”. I just mean that these sentences are usually of the form ”A [X] with the form [this simple thing] should be the same as [some complicated thing].”

I can change it.

>>>> Source/WebCore/ChangeLog:20
>>>> +        contents of a matrix. So the matrix float4x3 will now be an array<float, 12>
>>> 
>>> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing?
>> 
>> I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing
> 
> Oh, I meant: why are WHLSL matrices represented as MSL nested arrays instead of being represented by MSL matrices? That seem more natural.

they’re not nested arrays. just a flat array.
Comment 18 Myles C. Maxfield 2019-07-13 17:54:50 PDT
Comment on attachment 374068 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review

>>>>> Source/WebCore/ChangeLog:20
>>>>> +        contents of a matrix. So the matrix float4x3 will now be an array<float, 12>
>>>> 
>>>> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing?
>>> 
>>> I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing
>> 
>> Oh, I meant: why are WHLSL matrices represented as MSL nested arrays instead of being represented by MSL matrices? That seem more natural.
> 
> they’re not nested arrays. just a flat array.

Oh, sorry, my bad. I meant “why represent a WHLSL matrix as a MSL array instead of as an MSL matrix?”
Comment 19 Saam Barati 2019-07-14 13:30:49 PDT
Comment on attachment 374068 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review

>>>>>> Source/WebCore/ChangeLog:20
>>>>>> +        contents of a matrix. So the matrix float4x3 will now be an array<float, 12>
>>>>> 
>>>>> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing?
>>>> 
>>>> I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing
>>> 
>>> Oh, I meant: why are WHLSL matrices represented as MSL nested arrays instead of being represented by MSL matrices? That seem more natural.
>> 
>> they’re not nested arrays. just a flat array.
> 
> Oh, sorry, my bad. I meant “why represent a WHLSL matrix as a MSL array instead of as an MSL matrix?”

I think it’s less understandable to take a metal matrix and treat it as a bag of elements that needs to be accessed in a bizarre way. It’s simpler just to have our own bag of bytes and define our own way of indexing into it. It’s one level of weirdness instead of two.
Comment 20 Saam Barati 2019-07-15 16:11:43 PDT
Created attachment 374162 [details]
patch for landing
Comment 21 WebKit Commit Bot 2019-07-15 20:50:32 PDT
Comment on attachment 374162 [details]
patch for landing

Clearing flags on attachment: 374162

Committed r247468: <https://trac.webkit.org/changeset/247468>
Comment 22 WebKit Commit Bot 2019-07-15 20:50:33 PDT
All reviewed patches have been landed.  Closing bug.
Comment 23 Radar WebKit Bug Importer 2019-07-15 20:54:01 PDT
<rdar://problem/53136686>