Loading of libamdhip64.so.7 fails when release/2.7 branch is built by TheRock using ROCm 7.0 #2411

@lamikr

Description

🐛 Describe the bug

When we do the CI build of PyTorch and then install it in order to build PyTorch vision and audio, importing torch fails because libamdhip64.so.7 cannot be loaded.

A fix is available in the main branch from PR pytorch#158889,
and I have tested that it fixes the following build error when I backport it to the release/2.7 branch:


2025-07-24T19:49:47.6864980Z Successfully installed torch-2.7.1+rocm7.0.0.dev0.515115ea2cb85a0b71b5507ce56a627d14c7ae73
2025-07-24T19:49:48.0371194Z Traceback (most recent call last):
2025-07-24T19:49:48.0372026Z File "/__w/TheRock/TheRock/external-builds/pytorch/pytorch_audio/setup.py", line 9, in
2025-07-24T19:49:48.0373089Z import torch
2025-07-24T19:49:48.0373755Z File "/opt/python/cp311-cp311/lib/python3.11/site-packages/torch/init.py", line 424, in
2025-07-24T19:49:48.0374545Z from torch._C import * # noqa: F403
2025-07-24T19:49:48.0374948Z ^^^^^^^^^^^^^^^^^^^^^^
2025-07-24T19:49:48.0375552Z ImportError: libamdhip64.so.7: cannot open shared object file: No such file or directory
2025-07-24T19:49:48.0822449Z Traceback (most recent call last):
2025-07-24T19:49:48.0823527Z ++ Exec [/__w/TheRock/TheRock]$ /opt/python/cp311-cp311/bin/python -m pip cache remove rocm_sdk --cache-dir /tmp/pipcache
2025-07-24T19:49:48.0824910Z File "/__w/TheRock/TheRock/./external-builds/pytorch/build_prod_wheels.py", line 794, in <module>
2025-07-24T19:49:48.0825793Z main(sys.argv[1:])
2025-07-24T19:49:48.0826534Z File "/__w/TheRock/TheRock/./external-builds/pytorch/build_prod_wheels.py", line 790, in main
2025-07-24T19:49:48.0828941Z ++ Exec [/__w/TheRock/TheRock]$ /opt/python/cp311-cp311/bin/python -m pip install --force-reinstall --pre --index-url https://d25kgig7rdsyks.cloudfront.net/v2/gfx94X-dcgpu/ --cache-dir /tmp/pipcache --cache-dir /tmp/pipcache 'rocm[libraries,devel]==7.0.0.dev0+515115ea2cb85a0b71b5507ce56a627d14c7ae73'
2025-07-24T19:49:48.0831487Z Installed version: 7.0.0.dev0+515115ea2cb85a0b71b5507ce56a627d14c7ae73

The only thing I needed to drop from the original patch was the sha256 checksum change for aotriton 0.9.
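
To check outside of the torch import whether the dynamic loader can resolve the library at all, something like the following minimal sketch may help. It is not part of the patch; the helper names `can_load` and `report` are hypothetical, and the second soname (`libamdhip64.so.6`) is only an assumption about what an older ROCm install might provide.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open `soname` (hypothetical helper)."""
    try:
        # CDLL asks the platform loader to open the library, using the same
        # search rules (rpath, LD_LIBRARY_PATH, ld.so cache) as a normal import.
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False
    return True


def report(sonames):
    """Map each soname to whether it could be loaded (hypothetical helper)."""
    return {name: can_load(name) for name in sonames}


if __name__ == "__main__":
    # libamdhip64.so.7 is the name from the ImportError above; the .so.6
    # entry is an assumed older soname, listed only for comparison.
    for name, ok in report(["libamdhip64.so.7", "libamdhip64.so.6"]).items():
        print(f"{name}: {'loadable' if ok else 'NOT loadable'}")
```

Running this with the same Python interpreter and environment as the failing build step would show whether the library is visible to the loader before torch is imported.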

Versions

Not relevant; the error happens on the CI machine.
