Some non square matrix definitions are incorrect #74
Comments
Hi @mlangerak, this is expected and by design. The reason is that even if the physical layout changes, the logical layout still works. Are you finding inconsistencies when using these?
Conceptually, a float4x3 consists of 4 rows (or columns, for column major) of float3s, so you'd expect the matrix to hold 4 SIMD vectors internally. That would be the most natural and efficient mapping for matrix operations such as constructing from 4 rows (or columns), accessing a row (or column), etc.
That is true. I went with the other model because it is more space-efficient. Would you also expect a 4x2 matrix to be stored as 4 SIMD vectors? Changing it now to work the way you need would be a huge amount of work, even though I understand the rationale for it. All the matrix operations are written with the current layout in mind.
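To make the space trade-off concrete, here is a minimal sketch of the two candidate layouts for a float4x3. `Vec128`, `Float4x3RowPerVector` and `Float4x3Packed` are hypothetical stand-ins invented for this illustration; they are not hlslpp types.

```cpp
#include <cstddef>

// Hypothetical stand-in for one 128-bit SIMD register (n128 in this thread).
struct alignas(16) Vec128 { float lanes[4]; };

// Layout argued for in the issue: one register per row of the 4x3 matrix,
// leaving the fourth lane of every register unused.
struct Float4x3RowPerVector { Vec128 rows[4]; };

// More compact layout: three registers of four floats, every lane used.
struct Float4x3Packed { Vec128 vecs[3]; };

static_assert(sizeof(Float4x3RowPerVector) == 64, "4 registers * 16 bytes");
static_assert(sizeof(Float4x3Packed) == 48, "3 registers * 16 bytes");

// Bytes spent on unused lanes in the row-per-vector layout
// (total size minus the 12 floats a 4x3 matrix actually needs).
constexpr std::size_t wastedBytes() {
    return sizeof(Float4x3RowPerVector) - 12 * sizeof(float);
}
```

Under the same assumptions, a 4x2 matrix stored as 4 registers would use 64 bytes for 8 floats, which is the case the question above is pointing at.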
I've been looking at the code. There seems to be no "right" solution for representing non-4x4 matrices with fixed-width vectors. The advantages and disadvantages to a hypothetical change I can see are: Advantages Disadvantages This of course ignores the large disadvantage of having to change the inner workings of all these matrices, verify results, etc. I don't have as much unit test coverage as I would like; testing was originally more manual. I think I can see both sides of this. It took me a lot of time to get the matrix code working and the results verified (even if there are bugs), and changing it now would cause a lot more issues than it solves. If you have other ideas or suggestions, I'm open to hearing your perspective or how you'd solve it.
There are pros and cons to either layout, but I would argue for the layout that causes the least surprise, even if it is less optimal in some respects. By least surprise I mean whichever layout is more common in practice. For instance, I think (I did not double check this) that when sharing constant data with the GPU, the DXC compiler will use 4 float3s for a float4x3 in row-major mode. In that case, it would be "least surprise" if you could memcpy a hlslpp::float4x3 directly when pushing constant data to the GPU. The float2x2 case is fortunate, since there DXC uses the same packing into a single float4 as hlslpp already does, so it is already consistent with DXC. [edit] struct Matrix4x3{...} // storage for 4 float3's, implicitly row major It would get confusing really quickly though, so unless there is a good reason to support both row and column major, I would always default to row major since it is more natural IMO; indeed, I am using hlslpp and DXC in row-major mode. Incidentally, I also need to support the Metal Shading Language, which is sadly column major, and there seems to be no shader compiler switch to change that. To compensate, I implemented a mul intrinsic for MSL which hides this column/row major distinction:
This works well to hide the row/column major confusion, and so far I've been able to write all my MSL pretending it is row-major.
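One identity a wrapper of this kind can rely on: reading 16 contiguous floats as a row-major matrix and computing row-vector times matrix gives the same result as reading the same memory as column-major and computing matrix times column-vector. The C++ sketch below only illustrates that identity. It is not the MSL code referred to above, and `Vec4`, `Mat4` and both function names are hypothetical.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<float, 16>; // raw storage; its meaning depends on the convention

// Row-major reading: element (r, c) lives at m[r * 4 + c].
// Computes v * M, with v treated as a row vector.
inline Vec4 mulRowVectorRowMajor(const Vec4& v, const Mat4& m) {
    Vec4 out{};
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            out[c] += v[r] * m[r * 4 + c];
    return out;
}

// Column-major reading of the same memory: element (r, c) lives at m[c * 4 + r].
// Computes M * v, with v treated as a column vector.
inline Vec4 mulColumnMajorColumnVector(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out[r] += m[c * 4 + r] * v[c];
    return out;
}
```

Because both functions produce the same vector from the same bytes, a `mul(v, M)` helper on a column-major platform could forward to matrix times vector and let calling code keep its row-major reading. Whether that matches the wrapper described above is an assumption.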
There is a common misunderstanding that this library is meant to interface with HLSL when uploading to the GPU; I'll refer you to #58 for more discussion. While that would be convenient, there are all sorts of cases where the layouts don't match. Just as a few examples:
cbuffer ExampleCB
{
    float3 a;   // Offset: 0
    float b;    // Offset: 12
    float2x2 m; // Offset: 16
    float4 v;   // Offset: 48
}
Same applies to float1 and float2, with their respective paddings.
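The offsets in ExampleCB can be mirrored on the CPU side by spelling out every padding gap. This is a minimal sketch: `ExampleCBMirror` and its member names are hypothetical, and the layout simply follows the offsets listed above, with the float2x2 taking two 16-byte registers of which only the first two lanes are used.

```cpp
#include <cstddef>

// CPU-side mirror of ExampleCB with the padding written out explicitly.
struct ExampleCBMirror {
    float a[3];       // offset 0
    float b;          // offset 12, fills the last lane of a's register
    float m_vec0[2];  // offset 16, first row/column of the float2x2
    float m_pad0[2];  // offset 24, unused lanes
    float m_vec1[2];  // offset 32, second row/column of the float2x2
    float m_pad1[2];  // offset 40, unused because v cannot straddle a register
    float v[4];       // offset 48
};

static_assert(offsetof(ExampleCBMirror, b) == 12, "b packs after a");
static_assert(offsetof(ExampleCBMirror, m_vec0) == 16, "m starts a new register");
static_assert(offsetof(ExampleCBMirror, m_vec1) == 32, "second vector of m");
static_assert(offsetof(ExampleCBMirror, v) == 48, "v starts a new register");
static_assert(sizeof(ExampleCBMirror) == 64, "four 16-byte registers in total");
```

Note that m is not stored as a single float4 here: it spans offsets 16 to 40, which is why v lands at 48 rather than 32.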
The main takeaway is that there is no least-surprise behavior: these things are surprising no matter what you do. Even if you had no SIMD vectors, C++'s packing rules can come in and do unexpected things. It is not HLSL++'s aim to solve these problems. That said, I use a little interface to cater for this use case in my codebase. As it is still incomplete it hasn't made its way here yet, but it can be done without much work using the store() family of functions. The other approach I have seen in lots of codebases is to declare only aligned types, i.e. don't use anything other than float4, and maybe row_major float3x4 and float4x4. Hopefully that helps.
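As a rough illustration of what such a conversion layer might do, the sketch below repacks a compact three-vector 4x3 matrix into four padded rows for upload. It does not use hlslpp or its store() functions. `PackedFloat4x3`, `GpuFloat4x3` and `toGpuLayout` are hypothetical names, and both the column-per-vector CPU storage and the padding of every GPU row to 16 bytes are assumptions.

```cpp
// Assumed CPU-side storage: three 4-float vectors, each holding one column
// of a 4x3 matrix (4 rows, 3 columns).
struct PackedFloat4x3 { float cols[3][4]; };

// Assumed GPU-side layout: four rows of float3, each padded to 16 bytes.
struct GpuFloat4x3 { float rows[4][4]; };

// Copies element (r, c) from column storage into row storage.
// The fourth lane of every destination row stays zero.
inline GpuFloat4x3 toGpuLayout(const PackedFloat4x3& src) {
    GpuFloat4x3 dst{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 3; ++c)
            dst.rows[r][c] = src.cols[c][r];
    return dst;
}
```

Keeping the conversion in one explicit function means the CPU-side math types can stay compact while the upload path owns the GPU packing rules.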
There seems to be an issue with the matrix definitions for some non-square matrices. For example, float4x3 has a constructor taking 3 n128 and internally consists of 3 n128 (row or column) vectors. It should have a constructor taking 4 n128 and hold 4 n128 (row or column) vectors internally. There are other cases too, e.g. float3x2.