-
Notifications
You must be signed in to change notification settings - Fork 766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][DOC] Update SPV_INTEL_joint_matrix #12497
base: sycl
Are you sure you want to change the base?
Conversation
The PR adds checked load/store and construct instructions Signed-off-by: Sidorov, Dmitry <[email protected]>
author: Levytskyy, Vyacheslav Signed-off-by: Sidorov, Dmitry <[email protected]>
Signed-off-by: Sidorov, Dmitry <[email protected]>
c687190
to
e132331
Compare
There were incorrectly named and had incorrect operands. See intel#12497 Signed-off-by: Sidorov, Dmitry <[email protected]>
There were incorrectly named and had incorrect operands. See #12497 --------- Signed-off-by: Sidorov, Dmitry <[email protected]>
Signed-off-by: Sidorov, Dmitry <[email protected]>
This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days. |
The spec is available here: intel/llvm#12497 The PR doesn't add OpCooperativeMatrixApplyFunctionINTEL instruction as it's still experimental and not properly tested E2E. The PR also fixes few bugs in the related code: 1. CooperativeMatrixMulAddKHR optional operand must be literal, not a constant; 2. Fixed available capabilities table creation for a case, when a single extension adds few capabilities, that occupy not contiguous op codes. Signed-off-by: Sidorov, Dmitry <[email protected]>
The spec is available here: intel/llvm#12497 The PR doesn't add OpCooperativeMatrixApplyFunctionINTEL instruction as it's still experimental and not properly tested E2E. The PR also fixes few bugs in the related code: 1. CooperativeMatrixMulAddKHR optional operand must be literal, not a constant; 2. Fixed available capabilities table creation for a case, when a single extension adds few capabilities, that occupy not contiguous op codes. --------- Signed-off-by: Sidorov, Dmitry <[email protected]>
This pull request was closed because it has been stalled for 30 days with no activity. |
Signed-off-by: Sidorov, Dmitry <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a minor comment
@@ -419,4 +710,6 @@ Revision History | |||
|13|2023-09-25|Dmitry Sidorov|Add convertion instructions for tf32 and bf16 | |||
|14|2023-10-11|Dmitry Sidorov|Add matrix prefetch instruction | |||
|15|2023-11-06|Dmitry Sidorov|Put deprecation note on OpCooperativeMatrixGetElementCoordINTEL | |||
|16|2023-11-06|Dmitry Sidorov|Add checked load, store and construct instructions | |||
|17|2024-12-16|Dounia Khaldi|Add and store with offset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you missed "load" here --> "Add load and store with offset instructions".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The meaning of "checked" and "offset" isn't immediately apparent, so consider updating the Overview section to describe these new instructions.
Note that many of the comments for OpCooperativeMatrixLoadCheckedINTEL also apply to OpCooperativeMatrixStoreCheckedINTEL.
instructions. + | ||
+ | ||
| *{main_capability_name}* + | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're missing a description of CooperativeMatrixOffsetInstructionsINTEL here.
Load a cooperative matrix through a pointer. Global matrix size might be not multiple the size of | ||
the two-dimentional region that is being loaded, in this case the out-of-bounds elements are | ||
set to 0. + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be useful to define what is meant by the "global matrix" somewhere.
It's not clear to me why being a multiple of the size of the 2D region being loaded is relevant (or not, in the case of this instruction). Isn't this really about going "off the edge", which could happen even if the block size evenly divides the matrix size, if the X offset or Y offset set appropriately.
Also, there's a minor typo here:
Load a cooperative matrix through a pointer. Global matrix size might be not multiple the size of | |
the two-dimentional region that is being loaded, in this case the out-of-bounds elements are | |
set to 0. + | |
Load a cooperative matrix through a pointer. Global matrix size might be not multiple the size of | |
the two-dimensional region that is being loaded, in this case the out-of-bounds elements are | |
set to 0. + |
'X offset' must be a scalar 32-bit integer type. It specifies offset in number of elements | ||
along X axis from the 'Pointer' where the loaded memory region starts from. + | ||
+ | ||
'Y offset' must be a scalar 32-bit integer type. It specifies offset in number of elements | ||
along Y axis from the 'Pointer' where the loaded memory region starts from. + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: the "offset" instructions use different names for these operands ("Rows Offset" and "Columns Offset"). More importantly, for the "offset" instructions the "Rows Offset" comes first, so I think the meaning is transposed compared to these "checked" instructions. There's no requirement that they match, but I think it would be less confusing to use a similar convention.
'Pointer' is a pointer. Its type must be an *OpTypePointer* whose 'Type' operand | ||
is a scalar or vector type. If the *Shader* capability was declared, 'Pointer' | ||
must point into an array and any *ArrayStride* decoration on 'Pointer' is ignored. + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Consider updating this text so it works with untyped pointers also (assuming that's what we want to do).
'Width' is the width (number of columns of a big matrix) of the two-dimensional | ||
region to load the matrix from. It must be a scalar 'integer type'. + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should describe the units for the width, specifically whether it is in units of bytes or elements (I think it's bytes?).
Add a new form of load/store operations for cooperative matrices that accepts two separate arguments: the row index and the column index. Unlike the original approach requiring a pointer to the matrix base, this new form of load/store operations is expected to yield better optimized code on 2dblock read/write instructions on PVC. CapabilityCooperativeMatrixOffsetInstructionsINTEL = 6238 OpCooperativeMatrixLoadOffsetINTEL = 6239 OpCooperativeMatrixStoreOffsetINTEL = 6240 Spec: intel/llvm#12497
Add a new form of load/store operations for cooperative matrices that accepts two separate arguments: the row index and the column index. Unlike the original approach requiring a pointer to the matrix base, this new form of load/store operations is expected to yield better optimized code on 2dblock read/write instructions on PVC. CapabilityCooperativeMatrixOffsetInstructionsINTEL = 6238 OpCooperativeMatrixLoadOffsetINTEL = 6239 OpCooperativeMatrixStoreOffsetINTEL = 6240 Spec: #12497 Original commit: KhronosGroup/SPIRV-LLVM-Translator@193661c352de3ca
Add a new form of load/store operations for cooperative matrices that accepts two separate arguments: the row index and the column index. Unlike the original approach requiring a pointer to the matrix base, this new form of load/store operations is expected to yield better optimized code on 2dblock read/write instructions on PVC. CapabilityCooperativeMatrixOffsetInstructionsINTEL = 6238 OpCooperativeMatrixLoadOffsetINTEL = 6239 OpCooperativeMatrixStoreOffsetINTEL = 6240 Spec: #12497 Original commit: KhronosGroup/SPIRV-LLVM-Translator@193661c352de3ca
Add a new form of load/store operations for cooperative matrices that accepts two separate arguments: the row index and the column index. Unlike the original approach requiring a pointer to the matrix base, this new form of load/store operations is expected to yield better optimized code on 2dblock read/write instructions on PVC. CapabilityCooperativeMatrixOffsetInstructionsINTEL = 6238 OpCooperativeMatrixLoadOffsetINTEL = 6239 OpCooperativeMatrixStoreOffsetINTEL = 6240 Spec: #12497 Original commit: KhronosGroup/SPIRV-LLVM-Translator@193661c352de3ca
The PR adds checked load/store and construct instructions