[QNN-EP] Support non-last axis TopK. #24881

Merged

minfhong-quic
Contributor

Description

In the TopK op builder, insert a Transpose before TopK to move the target axis to the last position, and a second Transpose after TopK to restore the original layout.
Additionally, TopK's second output holds indices, which may have INT64 dtype. When that output is a graph output, add a Cast to convert the transformed INT32 indices back to INT64.
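
To illustrate the Transpose → TopK → Transpose pattern described above, here is a minimal pure-Python sketch on a 2-D input. It is not the op builder's C++ code: the helper names (`swap_with_last_perm`, `topk_last_axis`, `topk_axis0_via_transpose`) are hypothetical, and swapping the target axis with the last one is just one possible choice of permutation, picked here because it is its own inverse.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def swap_with_last_perm(rank: int, axis: int) -> List[int]:
    """Permutation that swaps `axis` with the last dimension.

    Assumption for illustration: a swap is used because applying it twice
    restores the original order, so one perm serves both Transposes.
    """
    perm = list(range(rank))
    perm[axis], perm[rank - 1] = perm[rank - 1], perm[axis]
    return perm


def transpose_2d(m: Matrix) -> Matrix:
    """Transpose a 2-D nested list (perm [1, 0])."""
    return [list(col) for col in zip(*m)]


def topk_last_axis(m: Matrix, k: int) -> Tuple[Matrix, Matrix]:
    """Top-k largest values along the last axis, with their indices."""
    values, indices = [], []
    for row in m:
        # Sort positions by descending value; ties go to the lower index.
        order = sorted(range(len(row)), key=lambda i: (-row[i], i))[:k]
        values.append([row[i] for i in order])
        indices.append(order)
    return values, indices


def topk_axis0_via_transpose(m: Matrix, k: int) -> Tuple[Matrix, Matrix]:
    """TopK along axis 0, using only a last-axis TopK plus two transposes."""
    values_t, indices_t = topk_last_axis(transpose_2d(m), k)
    return transpose_2d(values_t), transpose_2d(indices_t)


data = [[1, 9, 3],
        [7, 2, 8],
        [4, 6, 5]]
values, indices = topk_axis0_via_transpose(data, 2)
print(values)   # [[7, 9, 8], [4, 6, 5]]
print(indices)  # [[1, 0, 1], [2, 2, 2]]
```

Both outputs pass through the second transpose, so the indices keep referring to positions along the original axis. The INT32-to-INT64 Cast mentioned above would apply only to the indices output and is not modeled in this sketch.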

Motivation and Context

QNN only accepts TopK on the last axis, but ONNX/ORT's TopK has an axis attribute. This change extends the TopK op builder so that a TopK on a non-last axis no longer falls back to CPU.
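
As a rough sketch of the decision this implies, the check below normalizes an ONNX-style `axis` (which may be negative and defaults to -1 in ONNX TopK) and reports whether the surrounding Transposes would be needed. The function names are hypothetical and do not come from the op builder.

```python
def normalize_axis(axis: int, rank: int) -> int:
    """Map an ONNX-style axis in [-rank, rank - 1] to a non-negative index."""
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is out of range for rank {rank}")
    return axis + rank if axis < 0 else axis


def needs_transpose(axis: int, rank: int) -> bool:
    """True when TopK targets a non-last axis and must be wrapped in Transposes."""
    return normalize_axis(axis, rank) != rank - 1


print(needs_transpose(-1, 4))  # False: already the last axis
print(needs_transpose(1, 4))   # True: axis 1 of a rank-4 tensor
```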

@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

Azure Pipelines successfully started running 5 pipeline(s).

@HectorSVC HectorSVC added the ep:QNN issues related to QNN execution provider label May 28, 2025
@HectorSVC
Contributor

There was a fix for the Web CI pipeline; please merge in the code from the latest main branch.

QNN only accepts TopK on the last axis, but ONNX/ORT's TopK has an axis
attribute. In the TopK op builder, insert a Transpose before TopK to move
the target axis to the last position and another after it to restore the
original layout.
Additionally, since TopK's second output holds indices that may have INT64
dtype, add a Cast to convert the transformed INT32 indices back to INT64
when that output is a graph output.
@minfhong-quic minfhong-quic force-pushed the dev/minfhong/non-last-axis-topk branch from 4c5c656 to bf1aab0 Compare June 3, 2025 01:36
@minfhong-quic
Contributor Author

There was a fix for the Web CI pipeline, please merge the code from latest main branch.

Thanks for the reminder. I've rebased the PR onto the latest main branch.

@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

Azure Pipelines successfully started running 5 pipeline(s).

@minfhong-quic
Contributor Author

A gentle reminder that the CI pipelines have passed. Please help merge this PR if there are no further issues.
Thanks!

@minfhong-quic
Contributor Author

Hi @HectorSVC,
Could you please review and merge the PR at your earliest convenience? I appreciate your help!
Thank you very much!

Contributor

@HectorSVC HectorSVC left a comment

:shipit:

@HectorSVC HectorSVC merged commit f390eb5 into microsoft:main Jun 11, 2025
82 checks passed
javier-intel pushed a commit to intel/onnxruntime that referenced this pull request Jun 15, 2025
@minfhong-quic minfhong-quic deleted the dev/minfhong/non-last-axis-topk branch June 27, 2025 03:48
Labels
ep:QNN issues related to QNN execution provider
2 participants