In the 2D block IO lowering, we compensate for a base address that is not 64-byte aligned by folding the misalignment into OffsetX and BaseWidth.
However, OffsetX has an additional restriction: it must be 4-byte aligned.
When OffsetX is not 4-byte aligned, we need to fall back to a gather load.
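The decision described above can be sketched roughly as follows. This is only an illustrative model in plain Python, not the actual lowering code: the names `plan_block_io` and `BlockIOPlan` are hypothetical, and it assumes OffsetX and BaseWidth are both tracked in bytes, which may not match how the real lowering represents them.

```python
from dataclasses import dataclass

BASE_ALIGN_BYTES = 64     # base alignment required by 2D block IO (from the description above)
OFFSET_X_ALIGN_BYTES = 4  # OffsetX alignment restriction (from the description above)


@dataclass
class BlockIOPlan:
    """Hypothetical result type: which path to take and with which parameters."""
    use_block_io: bool
    base: int
    offset_x_bytes: int
    base_width_bytes: int


def plan_block_io(base: int, offset_x_bytes: int, base_width_bytes: int) -> BlockIOPlan:
    """Fold the base misalignment into OffsetX/BaseWidth, or fall back to gather.

    Assumption: all quantities are expressed in bytes.
    """
    misalign = base % BASE_ALIGN_BYTES
    aligned_base = base - misalign

    # Compensate: shift the window right by the bytes removed from the base.
    new_offset_x = offset_x_bytes + misalign
    new_base_width = base_width_bytes + misalign

    if new_offset_x % OFFSET_X_ALIGN_BYTES != 0:
        # OffsetX restriction violated: keep the original values and use gather load.
        return BlockIOPlan(False, base, offset_x_bytes, base_width_bytes)

    return BlockIOPlan(True, aligned_base, new_offset_x, new_base_width)
```

For example, under these assumptions a base that is 8 bytes past a 64-byte boundary can still use block IO, while a base that is 2 bytes past the boundary (with an OffsetX of 0) cannot and takes the gather path.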
Describe the bug
In the FlexDecoding test case, we found that block IO returns incorrect matrix values when the base address is not aligned.
Inductor generates code like this:
It adds the offset directly into the base pointer.
Environment details
Triton XPU: Latest