@cccclai (Contributor) commented on Oct 30, 2025

There was some refactoring, and it looks like the custom op example was not covered in CI.

Run:

python3 examples/qualcomm/custom_op/custom_ops_1.py --build_folder build-android -s R3CY50HEGYM -m SM8750 --op_package_dir examples/qualcomm/custom_op/example_op_package_htp/ExampleOpPackage --build_op_package 

and the output is:

Output log
Quantizing(PTQ) the model...
WARNING:root:Op aten.unbind.int was requested for preservation by partitioner.  This request is ignored because it is in a blocklist.
WARNING:root:Op aten.unbind.int was requested for preservation by partitioner.  This request is ignored because it is in a blocklist.
[INFO] [Qnn ExecuTorch]: create QNN Logger with log_level 1
[INFO] [Qnn ExecuTorch]: Initialize Qnn backend parameters for Qnn executorch backend type 2
[INFO] [Qnn ExecuTorch]: Caching: Caching is in SAVE MODE.
[INFO] [Qnn ExecuTorch]: Running level=3 optimization.
[QNN Partitioner Op Support]: my_ops.mul3.default | True
[INFO] [Qnn ExecuTorch]: Destroy Qnn backend parameters
[INFO] [Qnn ExecuTorch]: Destroy Qnn context
[INFO] [Qnn ExecuTorch]: Destroy Qnn device
[INFO] [Qnn ExecuTorch]: Destroy Qnn backend
[INFO] [Qnn ExecuTorch]: Destroy Qnn backend parameters
INFO:executorch.backends.qualcomm.partition.qnn_partitioner:Qnn partitioner will delegate torch mutable buffer with the same I/O address during the runtime, so if your model contains mutable buffer, then you can get the better performance with skip_mutable_buffer=False. If you encounter accuracy issue during the runtime, then please set `skip_mutable_buffer=True` and try again.
[INFO] [Qnn ExecuTorch]: create QNN Logger with log_level 1
[INFO] [Qnn ExecuTorch]: Initialize Qnn backend parameters for Qnn executorch backend type 2
[INFO] [Qnn ExecuTorch]: Caching: Caching is in SAVE MODE.
[INFO] [Qnn ExecuTorch]: Running level=3 optimization.
INFO:executorch.backends.qualcomm.qnn_preprocess:Processing Method(0): (1/1)
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: my_ops_mul3_default, my_ops.mul3.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_dequantize_per_tensor_tensor, quantized_decomposed.dequantize_per_tensor.tensor

====== DDR bandwidth summary ======
spill_bytes=0
fill_bytes=0
write_total_bytes=131584
read_total_bytes=125440

[INFO] [Qnn ExecuTorch]: Destroy Qnn backend parameters
[INFO] [Qnn ExecuTorch]: Destroy Qnn context
[INFO] [Qnn ExecuTorch]: Destroy Qnn device
[INFO] [Qnn ExecuTorch]: Destroy Qnn backend
[INFO] [Qnn ExecuTorch]: Destroy Qnn backend parameters
WARNING:root:Op aten.unbind.int was requested for preservation by partitioner.  This request is ignored because it is in a blocklist.
./custom_op/custom_qnn.pte: 1 file pushed, 0 skipped. 166.7 MB/s (31652 bytes in 0.000s)
/home/chenlai/fbsource/third-party/qualcomm/qnn/qnn-2.34/lib/aarch64-android/libQnnHtp.so: 1 file pushed, 0 skipped. 302.9 MB/s (2193976 bytes in 0.007s)
/home/chenlai/fbsource/third-party/qualcomm/qnn/qnn-2.34/lib/hexagon-v79/unsigned/libQnnHtpV79Skel.so: 1 file pushed, 0 skipped. 384.9 MB/s (9087648 bytes in 0.023s)
/home/chenlai/fbsource/third-party/qualcomm/qnn/qnn-2.34/lib/aarch64-android/libQnnHtpV79Stub.so: 1 file pushed, 0 skipped. 263.7 MB/s (477208 bytes in 0.002s)
/home/chenlai/fbsource/third-party/qualcomm/qnn/qnn-2.34/lib/aarch64-android/libQnnHtpPrepare.so: 1 file pushed, 0 skipped. 373.1 MB/s (52389040 bytes in 0.134s)
/home/chenlai/fbsource/third-party/qualcomm/qnn/qnn-2.34/lib/aarch64-android/libQnnSystem.so: 1 file pushed, 0 skipped. 266.2 MB/s (2497656 bytes in 0.009s)
build-android/examples/qualcomm/executor_runner/qnn_executor_runner: 1 file pushed, 0 skipped. 444.4 MB/s (45963304 bytes in 0.099s)
build-android/backends/qualcomm/libqnn_executorch_backend.so: 1 file pushed, 0 skipped. 240.5 MB/s (646624 bytes in 0.003s)
/home/chenlai/fbsource/third-party/qualcomm/qnn/qnn-2.34/lib/aarch64-android/libQnnModelDlc.so: 1 file pushed, 0 skipped. 303.0 MB/s (2430512 bytes in 0.008s)
/data/users/chenlai/executorch/custom_op/input_list.txt: 1 file pushed, 0 skipped. 0.1 MB/s (14 bytes in 0.000s)
/data/users/chenlai/executorch/custom_op/input_0_0.raw: 1 file pushed, 0 skipped. 288.0 MB/s (100352 bytes in 0.000s)
examples/qualcomm/custom_op/example_op_package_htp/ExampleOpPackage/build/hexagon-v79/libQnnExampleOpPackage_HTP.so: 1 file pushed, 0 skipped. 110.0 MB/s (177136 bytes in 0.002s)
examples/qualcomm/custom_op/example_op_package_htp/ExampleOpPackage/build/aarch64-android/libQnnExampleOpPackage.so: 1 file pushed, 0 skipped. 340.1 MB/s (874888 bytes in 0.002s)
I 00:00:00.000608 executorch:qnn_executor_runner.cpp:232] Model file custom_qnn.pte is loaded.
I 00:00:00.000707 executorch:qnn_executor_runner.cpp:242] Using method forward
I 00:00:00.000718 executorch:qnn_executor_runner.cpp:289] Setting up planned buffer 0, size 200704.
[INFO] [Qnn ExecuTorch]: Deserializing processed data using QnnContextCustomProtocol
[INFO] [Qnn ExecuTorch]: create QNN Logger with log_level 1
[INFO] [Qnn ExecuTorch]: Initialize Qnn backend parameters for Qnn executorch backend type 2
[INFO] [Qnn ExecuTorch]: Caching: Caching is in RESTORE MODE.
[INFO] [Qnn ExecuTorch]: QnnContextCustomProtocol expected magic number: 0x5678abcd but get: 0x2000000
[INFO] [Qnn ExecuTorch]: Running level=1 optimization.
I 00:00:00.283815 executorch:qnn_executor_runner.cpp:313] Method loaded.
E 00:00:00.284038 executorch:method.cpp:1274] Output 0 is memory planned, or is a constant. Cannot override the existing data pointer.
I 00:00:00.284055 executorch:qnn_executor_runner.cpp:373] ignoring error from set_output_data_ptr(): 0x2
I 00:00:00.284061 executorch:qnn_executor_runner.cpp:376] Inputs prepared.
I 00:00:00.284115 executorch:qnn_executor_runner.cpp:382] Number of inputs: 1
I 00:00:00.284290 executorch:qnn_executor_runner.cpp:490] Perform 10 inference for warming up
I 00:00:04.366009 executorch:qnn_executor_runner.cpp:496] Start inference (0)
I 00:00:04.781286 executorch:qnn_executor_runner.cpp:514] 1 inference took 415.036000 ms, avg 415.036000 ms
I 00:00:04.782228 executorch:qnn_executor_runner.cpp:550] Total 1 inference took 415.036000 ms, avg 415.036000 ms
I 00:00:04.782429 executorch:qnn_executor_runner.cpp:615] Write etdump to /data/local/tmp/executorch/custom_qnn/etdump.etdp, Size = 1984
[INFO] [Qnn ExecuTorch]: Destroy Qnn backend parameters
[INFO] [Qnn ExecuTorch]: Destroy Qnn context
[INFO] [Qnn ExecuTorch]: Destroy Qnn device
[INFO] [Qnn ExecuTorch]: Destroy Qnn backend
[WARNING] [Qnn ExecuTorch]: QnnDsp <W> Function not called, PrepareLib isn't loaded!

/data/local/tmp/executorch/custom_qnn/outputs/: 1 file pulled, 0 skipped. 0.9 MB/s (100352 bytes in 0.104s)
is_close? True
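For reference, the final `is_close? True` line indicates that the outputs pulled back from the device matched the eager-mode reference within tolerance. The actual check in `custom_ops_1.py` likely uses `torch.allclose` with its own tolerances; the sketch below (hypothetical helper and data, stdlib only) just illustrates the idea for the example op, which computes `x * 3`:

```python
import math

def is_close(device_out, reference, rel_tol=1e-3, abs_tol=1e-3):
    # Element-wise closeness check between device outputs and the CPU
    # reference, mirroring the "is_close? True" log line above.
    # (Hypothetical helper; the real script likely uses torch.allclose.)
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(device_out, reference)
    )

# The example op (my_ops.mul3) multiplies its input by 3, so the
# reference for each input element is simply that element times three.
inputs = [0.5, -1.25, 2.0]
device_out = [1.5, -3.75, 6.0]       # illustrative values "from the HTP backend"
reference = [x * 3 for x in inputs]  # eager-mode reference
print("is_close?", is_close(device_out, reference))  # -> is_close? True
```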

@pytorch-bot

pytorch-bot bot commented Oct 30, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15483

Note: Links to docs will display an error until the docs builds have been completed.

❌ 18 New Failures, 1 Unrelated Failure

As of commit 50b850f with merge base 11f752c:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the "CLA Signed" label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Oct 30, 2025
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@cccclai
Contributor Author

cccclai commented Nov 6, 2025

Can I get a review on this?

@cccclai
Contributor Author

cccclai commented Nov 14, 2025

Address comments
