199215 – [WHLSL] Matrix memory layout should match HLSL by laying out columns linearly

RESOLVED FIXED 199215

[WHLSL] Matrix memory layout should match HLSL by laying out columns linearly

https://bugs.webkit.org/show_bug.cgi?id=199215

Summary [WHLSL] Matrix memory layout should match HLSL by laying out columns linearly

Justin Fan

Reported 2019-06-25 18:25:21 PDT

From MSL docs: >// This sets the 4th element of the 3rd column to 3.0. >m[2][3] = 3.0f; This is opposite of HLSL, which is what WHLSL should be compatible with. I don't think we are taking this into account when compiling WHLSL. From Microsoft docs: >A 4x4 matrix is accessed with the following indices: >[0][0], [0][1], [0][2], [0][3] >[1][0], [1][1], [1][2], [1][3] >[2][0], [2][1], [2][2], [2][3] >[3][0], [3][1], [3][2], [3][3]

Attachments
patch (16.08 KB, patch) 2019-07-12 19:11 PDT, Saam Barati	mmaxfield: review+	Details Formatted Diff Diff
patch for landing (15.80 KB, patch) 2019-07-15 16:11 PDT, Saam Barati	no flags	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

Justin Fan

Comment 1 2019-06-25 18:49:48 PDT

Myles' comments: > 1) make the WHLSL functions round-trip. So if you invoke a bunch of different constructors and then run a bunch of getters, you get the expected thing. > 2) make the memory layout match HLSL. If you upload 16 floats in a buffer and then tell the shader that the buffer contains a matrix, the getters should also return the expected thing

Myles C. Maxfield

Comment 2 2019-07-01 20:57:24 PDT

In Direct3D, if you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] in a buffer and tell the shader that the buffer is a matrix, and then ask the matrix for [0][1], the result is 5. If you call float4x4(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16) and then you ask it for [0][1], the result is 2. If you call float4x4(float4(1, 2, 3, 4), float4(5, 6, 7, 8), float4(9, 10, 11, 12), float4(13, 14, 15, 16)) and then you ask it for [0][1], the result is 2.

Myles C. Maxfield

Comment 3 2019-07-03 14:09:00 PDT

In Direct3D, if you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32] in a buffer, and tell the shader that it's two float4x4s next to each other, then you mul(buffer1[0], buffer1[1]) and put the result in another buffer and then read the buffer on the CPU side, the results are: 538 612 686 760 650 740 830 920 762 868 974 1080 874 996 1118 1240 Thus, the two matrices are [1 5 9 13] [17 21 25 29] [2 6 10 14] * [18 22 26 30] [3 7 11 15] [19 23 27 31] [4 8 12 16] [20 24 28 32]

Myles C. Maxfield

Comment 4 2019-07-03 14:10:15 PDT

(In reply to Myles C. Maxfield from comment #3) > In Direct3D, if you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, > 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32] in a > buffer, and tell the shader that it's two float4x4s next to each other, then > you mul(buffer1[0], buffer1[1]) and put the result in another buffer and > then read the buffer on the CPU side, the results are: > > 538 > 612 > 686 > 760 > 650 > 740 > 830 > 920 > 762 > 868 > 974 > 1080 > 874 > 996 > 1118 > 1240 > > Thus, the two matrices are > [1 5 9 13] [17 21 25 29] > [2 6 10 14] * [18 22 26 30] > [3 7 11 15] [19 23 27 31] > [4 8 12 16] [20 24 28 32] and buffer1[0][0] is [1 5 9 13]

Justin Fan

Comment 5 2019-07-03 18:29:23 PDT

For Metal Shading Language: There is no difference between supplying [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] as a buffer to a shader and telling it to treat it as a float4x4, or using the float4x4() constructor with those values. Result[0] is [1, 2, 3, 4]. If you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32] in a buffer, and tell the shader that it's two float4x4s next to each other, the results of buffer1[0] * buffer1[1] are: 538 612 686 760 650 740 830 920 762 868 974 1080 874 996 1118 1240 Thus, the two matrices are [1 5 9 13] [17 21 25 29] [2 6 10 14] * [18 22 26 30] [3 7 11 15] [19 23 27 31] [4 8 12 16] [20 24 28 32] and buffer1[0][0] is [1, 2, 3, 4]. This makes sense, as MSL docs imply matrices are built AND stored as Arrays Of Columns Vectors. Thus the difference remains: HLSL (and thus WHLSL) indexes matrices as Arrays of Row Vectors.

Justin Fan

Comment 6 2019-07-03 18:33:50 PDT

The difference is that HLSL's float4x4 constructor and indexing operators behave in row-major order, but in memory in both languages, the values are in column-major order.

Justin Fan

Comment 7 2019-07-03 19:18:27 PDT

To make things even more fun, currently in WHLSL, if you put [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32] in a buffer, and tell the shader that it's two float4x4s next to each other, the results of mul(buffer1[0], buffer1[1]) are: 250 260 270 280 618 644 670 696 986 1028 1070 1112 1354 1412 1470 1528 Implying that the matrices are: [1 2 3 4] [17 18 19 20] [5 6 7 8] * [21 22 23 24] [9 10 11 12] [25 26 27 28] [13 14 15 16] [29 30 31 32]

Justin Fan

Comment 8 2019-07-03 19:28:25 PDT

This means: this patch must ensure that WHLSL indexing behavior matches HLSL, if we are going with HLSL source-compatibility. The underlying memory layout should not need change.

Justin Fan

Comment 9 2019-07-03 19:29:20 PDT

(In reply to Justin Fan from comment #8) > This means: this patch must ensure that WHLSL indexing behavior matches > HLSL, if we are going with HLSL source-compatibility. The underlying memory > layout should not need change. e.g. in WHLSL, result[0][0] gives [1, 2, 3, 4] when it should return [1, 5, 9, 13].

Saam Barati

Comment 10 2019-07-12 17:32:16 PDT

patch forthcoming

Saam Barati

Comment 11 2019-07-12 19:11:37 PDT

Created attachment 374068 [details] patch

Justin Fan

Comment 12 2019-07-12 19:19:24 PDT

Comment on attachment 374068 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review > LayoutTests/webgpu/whlsl-matrix-memory-layout.html:5 > +<script src="../resources/js-test-pre.js"></script> Any reason why you didn't use the WHLSL harness to write this test?

Myles C. Maxfield

Comment 13 2019-07-12 20:04:02 PDT

Comment on attachment 374068 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review > Source/WebCore/ChangeLog:11 > + [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15] This is an odd way to structure this sentence. Why not say the memory is 1, 2, 3? > Source/WebCore/ChangeLog:20 > + contents of a matrix. So the matrix float4x3 will now be an array<float, 12> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing?

Myles C. Maxfield

Comment 14 2019-07-12 22:04:54 PDT

Comment on attachment 374068 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review I’ll wait for Saam to answer the question about why we’re using arrays instead of matrixes as the container before finishing the review. > Source/WebCore/Modules/webgpu/WHLSL/Metal/WHLSLNativeFunctionWriter.cpp:261 > + auto numberOfMatrixColumns = [&] () -> unsigned { Can we share some code?

Saam Barati

Comment 15 2019-07-12 22:23:45 PDT

Comment on attachment 374068 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review >> Source/WebCore/ChangeLog:11 >> + [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15] > > This is an odd way to structure this sentence. Why not say the memory is 1, 2, 3? I’m confused what you’re suggesting. I’m just trying to show what the backing memory of the below matrix is. Im not sure why how other backing store of memory as an example is any clearer. Maybe you have something in mind? >> Source/WebCore/ChangeLog:20 >> + contents of a matrix. So the matrix float4x3 will now be an array<float, 12> > > Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing? I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing >> Source/WebCore/Modules/webgpu/WHLSL/Metal/WHLSLNativeFunctionWriter.cpp:261 >> + auto numberOfMatrixColumns = [&] () -> unsigned { > > Can we share some code? Yup. Can fix >> LayoutTests/webgpu/whlsl-matrix-memory-layout.html:5 >> +<script src="../resources/js-test-pre.js"></script> > > Any reason why you didn't use the WHLSL harness to write this test? The main reason was just to manually control the buffer explicitly in the test. But it could be written using the harness

Myles C. Maxfield

Comment 16 2019-07-12 22:49:32 PDT

Comment on attachment 374068 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review >>> Source/WebCore/ChangeLog:11 >>> + [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15] >> >> This is an odd way to structure this sentence. Why not say the memory is 1, 2, 3? > > I’m confused what you’re suggesting. I’m just trying to show what the backing memory of the below matrix is. Im not sure why how other backing store of memory as an example is any clearer. Maybe you have something in mind? This sentence is of the form “A [X] with the form [some complicated thing] should be the same as [this simple thing]”. I just mean that these sentences are usually of the form ”A [X] with the form [this simple thing] should be the same as [some complicated thing].” >>> Source/WebCore/ChangeLog:20 >>> + contents of a matrix. So the matrix float4x3 will now be an array<float, 12> >> >> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing? > > I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing Oh, I meant: why are WHLSL matrices represented as MSL nested arrays instead of being represented by MSL matrices? That seem more natural.

Saam Barati

Comment 17 2019-07-12 23:39:17 PDT

Comment on attachment 374068 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review >>>> Source/WebCore/ChangeLog:11 >>>> + [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15] >>> >>> This is an odd way to structure this sentence. Why not say the memory is 1, 2, 3? >> >> I’m confused what you’re suggesting. I’m just trying to show what the backing memory of the below matrix is. Im not sure why how other backing store of memory as an example is any clearer. Maybe you have something in mind? > > This sentence is of the form “A [X] with the form [some complicated thing] should be the same as [this simple thing]”. I just mean that these sentences are usually of the form ”A [X] with the form [this simple thing] should be the same as [some complicated thing].” I can change it. >>>> Source/WebCore/ChangeLog:20 >>>> + contents of a matrix. So the matrix float4x3 will now be an array<float, 12> >>> >>> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing? >> >> I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing > > Oh, I meant: why are WHLSL matrices represented as MSL nested arrays instead of being represented by MSL matrices? That seem more natural. they’re not nested arrays. just a flat array.

Myles C. Maxfield

Comment 18 2019-07-13 17:54:50 PDT

Comment on attachment 374068 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review >>>>> Source/WebCore/ChangeLog:20 >>>>> + contents of a matrix. So the matrix float4x3 will now be an array<float, 12> >>>> >>>> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing? >>> >>> I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing >> >> Oh, I meant: why are WHLSL matrices represented as MSL nested arrays instead of being represented by MSL matrices? That seem more natural. > > they’re not nested arrays. just a flat array. Oh, sorry, my bad. I meant “why represent a WHLSL matrix as a MSL array instead of as an MSL matrix?”

Saam Barati

Comment 19 2019-07-14 13:30:49 PDT

Comment on attachment 374068 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=374068&action=review >>>>>> Source/WebCore/ChangeLog:20 >>>>>> + contents of a matrix. So the matrix float4x3 will now be an array<float, 12> >>>>> >>>>> Why not just consider the matrix to be a container, and make the accessors/mutators do the right thing? >>>> >>>> I’m confused what the alternative is. This patch is making it a container (specifically array) and we change the accessors to do the right thing >>> >>> Oh, I meant: why are WHLSL matrices represented as MSL nested arrays instead of being represented by MSL matrices? That seem more natural. >> >> they’re not nested arrays. just a flat array. > > Oh, sorry, my bad. I meant “why represent a WHLSL matrix as a MSL array instead of as an MSL matrix?” I think it’s less understandable to take a metal matrix and treat it as a bag of elements that needs to be accessed in a bizarre way. It’s simpler just to have our own bag of bytes and define our own way of indexing into it. It’s one level of weirdness instead of two.

Saam Barati

Comment 20 2019-07-15 16:11:43 PDT

Created attachment 374162 [details] patch for landing

WebKit Commit Bot

Comment 21 2019-07-15 20:50:32 PDT

Comment on attachment 374162 [details] patch for landing Clearing flags on attachment: 374162 Committed r247468: <https://trac.webkit.org/changeset/247468>

WebKit Commit Bot

Comment 22 2019-07-15 20:50:33 PDT

All reviewed patches have been landed. Closing bug.

Radar WebKit Bug Importer

Comment 23 2019-07-15 20:54:01 PDT

<rdar://problem/53136686>

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P1

Severity Normal

Classification Unclassified

Version Other

Hardware Unspecified

OS Unspecified

Product WebKit

Component WebGPU

Assignee

Saam Barati

Reported

2019-06-25 18:25 PDT

Modified

2019-07-15 20:54 PDT History

CC List

4 users Show

URL

Keywords InRadar

Depends on

Blocks