[QNN-EP] Support non-last axis TopK. #24881
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
There was a fix for the Web CI pipeline. Please merge the latest main branch into this PR.
QNN only accepts TopK on the last axis, but ONNX/ORT's TopK has an axis attribute. In the TopK op builder, wrap TopK in a pair of Transpose ops: the first permutes the target axis to the last position, and the second permutes it back afterwards. Additionally, TopK's second output holds indices, which may have INT64 dtype, so add a Cast to convert the transformed INT32 indices back to INT64 when that output is a graph output.
4c5c656 to bf1aab0
Thanks for the reminder. I've rebased the PR onto the latest main branch.
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
A gentle reminder that the CI pipelines have passed. Please help merge this PR if there are no further issues.
Hi @HectorSVC,
Description
In the TopK op builder, wrap TopK in a pair of Transpose ops: the first permutes the target axis to the last position, and the second permutes it back afterwards.
Additionally, TopK's second output holds indices, which may have INT64 dtype. Add a Cast to convert the transformed INT32 indices back to INT64 when that output is a graph output.
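To illustrate the Transpose / TopK / Transpose pattern described above, here is a minimal pure-Python sketch. It is not the op builder code from this PR: the helper names (`perm_to_last`, `inverse_perm`, `topk_last_axis_2d`, `topk_axis0_2d`) are hypothetical, and the choice to move the axis to the end while keeping the other axes in order is an assumption, since the PR may build its permutation differently. The INT32-to-INT64 Cast is omitted because Python integers carry no fixed width.

```python
from typing import List, Tuple


def perm_to_last(rank: int, axis: int) -> List[int]:
    """Permutation that moves `axis` to the last position (assumed scheme)."""
    axis = axis % rank  # normalize a negative axis
    return [i for i in range(rank) if i != axis] + [axis]


def inverse_perm(perm: List[int]) -> List[int]:
    """Permutation that undoes `perm`, used for the Transpose after TopK."""
    inv = [0] * len(perm)
    for dst, src in enumerate(perm):
        inv[src] = dst
    return inv


def transpose_2d(m: List[List[int]]) -> List[List[int]]:
    """Swap the two axes of a 2-D nested list."""
    return [list(col) for col in zip(*m)]


def topk_last_axis_2d(m: List[List[int]], k: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Largest-k values and their indices along the last axis only."""
    values, indices = [], []
    for row in m:
        # sorted() is stable, so ties keep the lower index first.
        order = sorted(range(len(row)), key=lambda i: -row[i])[:k]
        values.append([row[i] for i in order])
        indices.append(order)
    return values, indices


def topk_axis0_2d(m: List[List[int]], k: int) -> Tuple[List[List[int]], List[List[int]]]:
    """TopK along axis 0, built only from a last-axis TopK plus transposes."""
    t = transpose_2d(m)                 # axis 0 becomes the last axis
    v, i = topk_last_axis_2d(t, k)      # the only TopK the backend supports
    return transpose_2d(v), transpose_2d(i)  # permute both outputs back


values, indices = topk_axis0_2d([[1, 9], [7, 3], [5, 4]], 2)
print(values)   # [[7, 9], [5, 4]]
print(indices)  # [[1, 0], [2, 2]]
```

For a rank-4 input with `axis=1`, `perm_to_last(4, 1)` gives `[0, 2, 3, 1]` and `inverse_perm` of that gives `[0, 3, 1, 2]`, which would be the permutations for the Transpose before and after TopK under this scheme.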
Motivation and Context
QNN only accepts TopK on the last axis, but ONNX/ORT's TopK has an axis attribute. Extend the TopK op builder so that non-last-axis TopK no longer falls back to CPU.