@causten (Collaborator) commented Nov 24, 2025

Motivation

  • Improve MXFP4 by pulling in updated rocMLIR
  • Minor fixes to docs

Technical Details

Changelog Category

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

github-actions bot and others added 19 commits November 17, 2025 10:41
Generate a raw mask from each mask_index number used for right padding.
… Elapsed (#4424)

The parallel stage view on BlueOcean is not showing the correct total time after the whole parallel step is complete
Remove the GroupQueryAttention ref op and use equivalent ops in its place.
rocMLIR has implemented split kv and GQA, which enables us to implement flash decoding.
Updates since PR 4190 for clamping
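
For reviewers unfamiliar with the mask-generation commit listed above, the idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in this PR. It assumes each `mask_index` entry is the number of valid (unpadded) tokens in that batch row, and that the raw mask uses 1 for valid positions and 0 for right padding. The function name `raw_mask_from_mask_index` and the `total_seq_len` parameter are hypothetical.

```python
from typing import List


def raw_mask_from_mask_index(mask_index: List[int], total_seq_len: int) -> List[List[int]]:
    """Expand per-row valid lengths into a 0/1 mask for right-padded sequences.

    Assumption (not confirmed by the PR): mask_index[b] is the count of
    valid tokens in batch row b, and padding sits on the right.
    """
    mask = []
    for valid_len in mask_index:
        if not 0 <= valid_len <= total_seq_len:
            raise ValueError(
                f"valid length {valid_len} is outside [0, {total_seq_len}]"
            )
        # Valid positions come first, followed by the right-side padding.
        mask.append([1] * valid_len + [0] * (total_seq_len - valid_len))
    return mask
```

Under these assumptions, `raw_mask_from_mask_index([2, 4], 4)` gives one row with two valid positions followed by two padded ones, and one fully valid row.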
@causten requested a review from a team as a code owner November 24, 2025 16:28