[bug]: Invoke refuses to use my RX 7600 XT GPU #7574

Open
1 task done
mcondarelli opened this issue Jan 20, 2025 · 7 comments
Labels
bug Something isn't working

Comments

@mcondarelli

mcondarelli commented Jan 20, 2025

Is there an existing issue for this problem?

  • I have searched the existing issues

Operating system

Linux

GPU vendor

AMD (ROCm)

GPU model

RX 7600 XT

GPU VRAM

16GB

Version number

5.5.0

Browser

Firefox 134.0

Python dependencies

{
  "accelerate": "1.0.1",
  "compel": "2.0.2",
  "cuda": null,
  "diffusers": "0.31.0",
  "numpy": "1.26.3",
  "opencv": "4.9.0.80",
  "onnx": "1.16.1",
  "pillow": "10.2.0",
  "python": "3.11.11",
  "torch": "2.4.1+rocm6.1",
  "torchvision": "0.19.1+rocm6.1",
  "transformers": "4.46.3",
  "xformers": null
}

What happened

Every time I try to generate an image I get this error:

Server Error
RuntimeError: HIP error: invalid device function HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing AMD_SERIALIZE_KERNEL=3 Compile with `TORCH_USE_HIP_DSA` to...

What you expected to happen

I expected image generation to start.

How to reproduce the problem

In my setup all image generation attempts produce this error.
Using a CPU-only, no-GPU configuration works as expected... and, as expected, is very slow.
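
A note for readers hitting the same message: "invalid device function" usually means the installed PyTorch build contains no compiled kernels for the GPU's architecture. The sketch below shows one way to check that. The function name `arch_is_supported` and the example inputs are made up for illustration. On a real system the GPU target would probably come from `rocminfo` and the build list from `torch.cuda.get_arch_list()`; both are assumptions about your setup and are not taken from this report.

```bash
#!/bin/bash
# Hypothetical helper: return 0 if the GPU target ($1) appears in the
# space-separated list of architectures a PyTorch build was compiled for ($2),
# and 1 otherwise.
arch_is_supported() {
    local gpu_arch="$1" built_for="$2" arch
    for arch in $built_for; do
        # Drop a feature suffix such as ":xnack-", if present, before comparing.
        if [ "${arch%%:*}" = "$gpu_arch" ]; then
            return 0
        fi
    done
    return 1
}

# Example with made-up inputs (not measured on the reporter's machine).
if arch_is_supported "gfx1102" "gfx900 gfx906 gfx1030 gfx1100"; then
    echo "kernels available for this GPU"
else
    echo "no kernels for this GPU: 'invalid device function' is likely"
fi
```

If the check fails, the usual options are a PyTorch build that includes the target, or an override such as `HSA_OVERRIDE_GFX_VERSION`, which the start script later in this thread uses.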

Additional context

I have seen several bug reports mentioning ROCm, but I didn't find anything really comparable.
Note that I'm a complete newbie at AI hosting, so I might be missing something pretty basic.

Full specs of my server are:

root@ikea:~# lshw -short
H/W path              Device          Class          Description
================================================================
                                      system         MS-7C91 (To be filled by O.E.M.)
/0                                    bus            MPG B550 GAMING EDGE WIFI (MS-7C91)
/0/0                                  memory         64KiB BIOS
/0/10                                 memory         32GiB System Memory
/0/10/0                               memory         2667 MHz (0.4 ns) [empty]
/0/10/1                               memory         16GiB DIMM DDR4 Synchronous Unbuffered (Unregistered) 2667 MHz (0.4 ns)
/0/10/2                               memory         2667 MHz (0.4 ns) [empty]
/0/10/3                               memory         16GiB DIMM DDR4 Synchronous Unbuffered (Unregistered) 2667 MHz (0.4 ns)
/0/13                                 memory         1MiB L1 cache
/0/14                                 memory         8MiB L2 cache
/0/15                                 memory         64MiB L3 cache
/0/16                                 processor      AMD Ryzen 9 5950X 16-Core Processor
/0/100                                bridge         Starship/Matisse Root Complex
/0/100/0.2                            generic        Starship/Matisse IOMMU
/0/100/1.1                            bridge         Starship/Matisse GPP Bridge
/0/100/1.1/0          /dev/nvme0      storage        CT2000P2SSD8
/0/100/1.1/0/0        hwmon0          disk           NVMe disk
/0/100/1.1/0/2        /dev/ng0n1      disk           NVMe disk
/0/100/1.1/0/1        /dev/nvme0n1    disk           2TB NVMe disk
/0/100/1.1/0/1/1      /dev/nvme0n1p1  volume         511MiB Windows FAT volume
/0/100/1.1/0/1/2      /dev/nvme0n1p2  volume         201GiB EXT4 volume
/0/100/1.1/0/1/3      /dev/nvme0n1p3  volume         1023MiB Linux swap volume
/0/100/1.1/0/1/4      /dev/nvme0n1p4  volume         1660GiB EXT4 volume
/0/100/1.2                            bridge         Starship/Matisse GPP Bridge
/0/100/1.2/0                          bus            500 Series Chipset USB 3.1 XHCI Controller
/0/100/1.2/0/0        usb1            bus            xHCI Host Controller
/0/100/1.2/0/0/2                      bus            USB2.0 Hub
/0/100/1.2/0/0/8      input6          input          MSI MYSTIC LIGHT
/0/100/1.2/0/0/9                      communication  AX200 Bluetooth
/0/100/1.2/0/1        usb2            bus            xHCI Host Controller
/0/100/1.2/0.1                        storage        500 Series Chipset SATA Controller
/0/100/1.2/0.2                        bridge         500 Series Chipset Switch Upstream Port
/0/100/1.2/0.2/8                      bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/1.2/0.2/8/0    wlo1            network        Wi-Fi 6 AX200
/0/100/1.2/0.2/9                      bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/1.2/0.2/9/0    enp42s0         network        RTL8125 2.5GbE Controller
/0/100/3.1                            bridge         Starship/Matisse GPP Bridge
/0/100/3.1/0                          bridge         Navi 10 XL Upstream Port of PCI Express Switch
/0/100/3.1/0/0        /dev/fb0        bridge         Navi 10 XL Downstream Port of PCI Express Switch
/0/100/3.1/0/0/0      /dev/fb0        display        Navi 33 [Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600]
/0/100/3.1/0/0/0.1    card0           multimedia     Navi 31 HDMI/DP Audio
/0/100/3.1/0/0/0.1/0  input10         input          HDA ATI HDMI HDMI/DP,pcm=3
/0/100/3.1/0/0/0.1/1  input11         input          HDA ATI HDMI HDMI/DP,pcm=7
/0/100/3.1/0/0/0.1/2  input12         input          HDA ATI HDMI HDMI/DP,pcm=8
/0/100/3.1/0/0/0.1/3  input13         input          HDA ATI HDMI HDMI/DP,pcm=9
/0/100/7.1                            bridge         Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
/0/100/7.1/0                          generic        Starship/Matisse PCIe Dummy Function
/0/100/8.1                            bridge         Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
/0/100/8.1/0                          generic        Starship/Matisse Reserved SPP
/0/100/8.1/0.1                        generic        Starship/Matisse Cryptographic Coprocessor PSPCPP
/0/100/8.1/0.3                        bus            Matisse USB 3.0 Host Controller
/0/100/8.1/0.3/0      usb3            bus            xHCI Host Controller
/0/100/8.1/0.3/0/1    input0          input          CX 2.4G Receiver System Control
/0/100/8.1/0.3/1      usb4            bus            xHCI Host Controller
/0/100/8.1/0.4        card1           multimedia     Starship/Matisse HD Audio Controller
/0/100/8.1/0.4/0      input14         input          HDA Digital PCBeep
/0/100/8.1/0.4/1      input15         input          HD-Audio Generic Rear Mic
/0/100/8.1/0.4/2      input16         input          HD-Audio Generic Front Mic
/0/100/8.1/0.4/3      input17         input          HD-Audio Generic Line
/0/100/8.1/0.4/4      input18         input          HD-Audio Generic Line Out Front
/0/100/8.1/0.4/5      input19         input          HD-Audio Generic Line Out Surround
/0/100/8.1/0.4/6      input20         input          HD-Audio Generic Line Out CLFE
/0/100/8.1/0.4/7      input21         input          HD-Audio Generic Front Headphone
/0/100/14                             bus            FCH SMBus Controller
/0/100/14.3                           bridge         FCH LPC Bridge
/0/100/14.3/0                         system         PnP device PNP0c01
/0/100/14.3/1                         system         PnP device PNP0c02
/0/100/14.3/2                         system         PnP device PNP0b00
/0/100/14.3/3                         system         PnP device PNP0c02
/0/100/14.3/4                         system         PnP device PNP0c02
/0/101                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/102                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/103                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/104                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/105                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/106                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/107                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/108                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 0
/0/109                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 1
/0/10a                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 2
/0/10b                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 3
/0/10c                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 4
/0/10d                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 5
/0/10e                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 6
/0/10f                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 7
/1                    input7          input          Power Button
/2                    input8          input          Power Button
/3                    input9          input          PC Speaker
root@ikea:~# 

Discord username

mcon

@mcondarelli mcondarelli added the bug Something isn't working label Jan 20, 2025
@SherLock707

I have the same issue with my RX 6700 XT on Arch.

[180412:0202/233234.609809:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 1 times!
[180412:0202/233242.280897:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 2 times!
[180412:0202/233242.281545:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 3 times!
Starting up...

Started Invoke process with PID: 180577

amdgpu.ids: No such file or directory

Could not load bitsandbytes native library: 'NoneType' object has no attribute 'split'
Traceback (most recent call last):
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cextension.py", line 85, in <module>
    lib = get_native_library()
          ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cextension.py", line 64, in get_native_library
    cuda_specs = get_cuda_specs()
                 ^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cuda_specs.py", line 39, in get_cuda_specs
    cuda_version_string=(get_cuda_version_string()),
                         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cuda_specs.py", line 29, in get_cuda_version_string
    major, minor = get_cuda_version_tuple()
                   ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cuda_specs.py", line 24, in get_cuda_version_tuple
    major, minor = map(int, torch.version.cuda.split("."))
                            ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'split'

CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues


>> patchmatch.patch_match: ERROR - patchmatch failed to load or compile (libvtkFiltersTexture.so.1: cannot open shared object file: No such file or directory).
>> patchmatch.patch_match: INFO - Refer to https://invoke-ai.github.io/InvokeAI/installation/060_INSTALL_PATCHMATCH/ for installation instructions.

[2025-02-02 23:33:15,760]::[InvokeAI]::INFO --> Patchmatch not loaded (nonfatal)

[2025-02-02 23:33:16,528]::[InvokeAI]::INFO --> Using torch device: AMD Radeon Graphics

[2025-02-02 23:33:16,665]::[InvokeAI]::INFO --> cuDNN version: 3001000

[2025-02-02 23:33:16,784]::[InvokeAI]::INFO --> InvokeAI version 5.6.0
[2025-02-02 23:33:16,784]::[InvokeAI]::INFO --> Root directory = /run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI

[2025-02-02 23:33:16,785]::[InvokeAI]::INFO --> Initializing database at /run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/databases/invokeai.db

[2025-02-02 23:33:16,818]::[ModelManagerService]::INFO --> [MODEL CACHE] Calculated model RAM cache size: 9200.00 MB. Heuristics applied: [1, 3].

[2025-02-02 23:33:16,905]::[InvokeAI]::INFO --> Pruned 1 finished queue items

[2025-02-02 23:33:19,957]::[InvokeAI]::INFO --> Cleaned database (freed 0.04MB)
[2025-02-02 23:33:19,957]::[InvokeAI]::INFO --> Invoke running on http://127.0.0.1:9090 (Press CTRL+C to quit)

[2025-02-02 23:33:19,961]::[InvokeAI]::INFO --> Executing queue item 2, session 57837bd5-451a-4b7d-98cf-77af221ee952

[2025-02-02 23:33:57,539]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '907a4c90-54e0-467d-9346-879f2c70d47a:unet' (UNet2DConditionModel) onto cuda device in 32.53s. Total model size: 4897.05MB, VRAM: 4897.05MB (100.0%)

[2025-02-02 23:33:57,924]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '907a4c90-54e0-467d-9346-879f2c70d47a:scheduler' (DDPMScheduler) onto cuda device in 0.00s. Total model size: 0.00MB, VRAM: 0.00MB (0.0%)

[2025-02-02 23:33:58,448]::[InvokeAI]::ERROR --> Error while invoking session 57837bd5-451a-4b7d-98cf-77af221ee952, invocation d372c6e3-d7e1-4f1f-8f27-3a277ceba8a6 (denoise_latents): HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

[2025-02-02 23:33:58,448]::[InvokeAI]::ERROR --> Traceback (most recent call last):
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
    output = invocation.invoke_internal(context=context, services=self._services)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/baseinvocation.py", line 300, in invoke_internal
    output = self.invoke(context)
             ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 824, in invoke
    return self._old_invoke(context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/itachi/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 1078, in _old_invoke
    timesteps, init_timestep, scheduler_step_kwargs = self.init_scheduler(
                                                      ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 729, in init_scheduler
    t_start_idx = len(list(filter(lambda ts: ts >= t_start_val, _timesteps)))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 729, in <lambda>
    t_start_idx = len(list(filter(lambda ts: ts >= t_start_val, _timesteps)))
                                             ^^^^^^^^^^^^^^^^^
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

.............


[2025-02-02 23:35:12,417]::[InvokeAI]::INFO --> Executing queue item 5, session a3cea2be-230e-47a3-a75b-07fd01150a82

[2025-02-02 23:35:12,447]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '907a4c90-54e0-467d-9346-879f2c70d47a:unet' (UNet2DConditionModel) onto cuda device in 0.00s. Total model size: 4897.05MB, VRAM: 4897.05MB (100.0%)

[2025-02-02 23:35:12,449]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '907a4c90-54e0-467d-9346-879f2c70d47a:scheduler' (DDPMScheduler) onto cuda device in 0.00s. Total model size: 0.00MB, VRAM: 0.00MB (0.0%)

[2025-02-02 23:35:12,459]::[InvokeAI]::ERROR --> Error while invoking session a3cea2be-230e-47a3-a75b-07fd01150a82, invocation 2dfa2473-3dca-46d9-a2be-288795f10772 (denoise_latents): HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

[2025-02-02 23:35:12,459]::[InvokeAI]::ERROR --> Traceback (most recent call last):
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
    output = invocation.invoke_internal(context=context, services=self._services)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/baseinvocation.py", line 300, in invoke_internal
    output = self.invoke(context)
             ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 824, in invoke
    return self._old_invoke(context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/itachi/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 1078, in _old_invoke
    timesteps, init_timestep, scheduler_step_kwargs = self.init_scheduler(
                                                      ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 729, in init_scheduler
    t_start_idx = len(list(filter(lambda ts: ts >= t_start_val, _timesteps)))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 729, in <lambda>
    t_start_idx = len(list(filter(lambda ts: ts >= t_start_val, _timesteps)))
                                             ^^^^^^^^^^^^^^^^^
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.



[2025-02-02 23:35:12,818]::[InvokeAI]::INFO --> Graph stats: a3cea2be-230e-47a3-a75b-07fd01150a82
                          Node   Calls   Seconds  VRAM Used
             sdxl_model_loader       1    0.000s     4.881G
            sdxl_compel_prompt       2    0.001s     4.881G
                       collect       2    0.001s     4.881G
                         noise       1    0.016s     4.881G
               denoise_latents       1    0.015s     4.882G
TOTAL GRAPH EXECUTION TIME:   0.032s
TOTAL GRAPH WALL TIME:   0.035s
RAM used by InvokeAI process: 5.91G (+0.000G)
RAM used to load models: 4.78G
VRAM in use: 4.881G
RAM cache statistics:
   Model cache hits: 2
   Model cache misses: 0
   Models cached: 4
   Models cleared from cache: 0
   Cache high water mark: 6.31/0.00G
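
About the bitsandbytes traceback in the log above: it fails at `torch.version.cuda.split(".")`, which is consistent with `torch.version.cuda` being `None` on a ROCm build (the dependency list in the original report also shows `"cuda": null`). A quick way to see which backend a wheel targets is to look at the local version suffix. This is only a sketch; `torch_backend` is a hypothetical helper, and the suffix convention (`+rocm…`, `+cu…`, `+cpu`) is assumed from how PyTorch wheels are commonly named.

```bash
#!/bin/bash
# Hypothetical helper: classify a torch version string by its local suffix.
torch_backend() {
    case "$1" in
        *+rocm*) echo "rocm" ;;     # e.g. 2.4.1+rocm6.1
        *+cu*)   echo "cuda" ;;     # e.g. 2.4.1+cu121
        *+cpu*)  echo "cpu" ;;      # e.g. 2.4.1+cpu
        *)       echo "unknown" ;;  # no recognizable suffix
    esac
}

# Version string taken from the dependency list in the original report.
torch_backend "2.4.1+rocm6.1"
```

On a "rocm" build, a bitsandbytes package that only probes for CUDA would be expected to fail the way the log shows.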

@reversesh3ll

Same exact issue also but with a RX 6900XT...

@mcondarelli
Author

> Same exact issue also but with a RX 6900XT...

I solved (somehow) my problem installing InvokeAI and THEN:

  • removing torch, torchvision and bitsandbytes
  • installing the three (plus pytorch-triton-rocm) from Pytorch site.

This is my full start script (adjust for your GPU):

#!/bin/bash
set -x -e

script_path=$(readlink -f "$0" 2>/dev/null || realpath "$0" 2>/dev/null || echo "$0")
sdir="$(dirname "${script_path}")"
here="$(cd "$sdir" && pwd)"
echo "The path of this script is: $script_path ($here)"
user=$(ls -ld "$script_path" | awk '{print $3}')
home=$(getent passwd "$user" | cut -d: -f6)
echo "Home directory of $user is $home"

VENV="invoke"

# Check InvokeAI is installed in virtual environment
if [ -x "$VENV/bin/invokeai-web" ]
then
    echo "InvokeAI is already installed, skipping..."
else
    # check Virtual Environment exists
    if [ -x "$VENV/bin/python" ]
    then
        echo "Virtual Environment at '$VENV' already present, skipping..."
    else
        echo "Creating basic Virtual Environment at '$VENV'..."

        PYTHON="python3.11"
        CACHE="$here"
        # prepare environment
        $PYTHON -m venv $VENV
    fi

    # Activate virtual environment
    source "$VENV/bin/activate"

    # Install InvokeAI in Virtual Environment
    echo "Installing InvokeAI in Virtual Environment at '$VENV'..."
    REPO=https://download.pytorch.org/whl/nightly/rocm6.3
    $VENV/bin/pip install --extra-index-url $REPO invokeai

    # restore right version of pytorch-triton-rocm, torch and torchvision
    pip uninstall pytorch-triton-rocm torch torchvision bitsandbytes --yes
    pip install pytorch-triton-rocm torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm6.3

    # install multi-backend "bitsandbytes"
    if [ -d "$here/bitsandbytes" ]
    then
        echo "Multi-backend 'bitsandbytes' already present, skipping..."
    else
        echo "Compiling Multi-backend 'bitsandbytes'..."
        (
            cd "$here"
            # Install bitsandbytes from source
            # Clone bitsandbytes repo, ROCm backend is currently enabled on multi-backend-refactor branch
            git clone -b multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/

            # Install dependencies
            pip install ".[dev]"  # quoted so the shell cannot glob-expand the brackets

            # Compile & install
            #sudo apt-get install -y build-essential cmake  # install build tools dependencies, unless present
            cmake -DCOMPUTE_BACKEND=hip -S .  # Use -DBNB_ROCM_ARCH="gfx90a;gfx942" to target specific gpu arch
            make
        )
    fi
    echo "Installing Multi-backend 'bitsandbytes'..."
    pip install "$here/bitsandbytes"  # add `-e` for an "editable" install when developing BNB (otherwise leave it out)
fi

# start InvokeAI
export PYTORCH_ROCM_ARCH=gfx1102
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export PYTORCH_HIP_ALLOC_CONF=expandable_segments:True
export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
export INVOKEAI_ROOT=~/invokeai
export GPU_DRIVER=rocm

$VENV/bin/invokeai-web
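
For readers adjusting the `export` lines above to another card: the script sets `HSA_OVERRIDE_GFX_VERSION=11.0.0` for a `gfx1102` GPU, which amounts to keeping the major and minor parts of the target and zeroing the last digit. The helper below is only a sketch of that pattern. `gfx_override_candidate` is a made-up name, and whether the resulting value actually works on a given card is an assumption that has to be tested.

```bash
#!/bin/bash
# Hypothetical helper: derive a candidate HSA_OVERRIDE_GFX_VERSION value from
# a gfx target name by keeping major and minor and zeroing the stepping.
gfx_override_candidate() {
    local target="${1#gfx}"            # "gfx1102" -> "1102"
    local len=${#target}
    if [ "$len" -lt 3 ]; then
        echo "unrecognized target: $1" >&2
        return 1
    fi
    local major="${target:0:len-2}"    # everything except the last two characters
    local minor="${target:len-2:1}"    # second-to-last character
    echo "${major}.${minor}.0"
}

# Reproduces the value used in the script above for this thread's GPU.
gfx_override_candidate "gfx1102"
```

Usage would be along the lines of `export HSA_OVERRIDE_GFX_VERSION="$(gfx_override_candidate gfx1102)"`, with the target name read from your own system.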

@SherLock707

@mcondarelli , are you able to use all the features in Invoke?

@mcondarelli
Author

> @mcondarelli , are you able to use all the features in Invoke?

I am very new to InvokeAI so I have NO idea about "all the features", but I can do a lot of things with no errors, at least:

  • generate images from a prompt with SD1.x, SDXL and FLUX
  • do simple image-to-image
  • use and modify workflows
  • train a simple SD1.5 LoRA

I haven't tried upscaling yet.

Things that definitely do not work:

  • train SDXL LoRA

I opened a few tickets against ROCm and bitsandbytes so not "everything is working".
If you need more info you should be more specific.
I am fully willing to make tests on my setup and share results.

@Asherathe

Asherathe commented Mar 24, 2025

The official installer, for some reason, installs a version of bitsandbytes that doesn't support ROCm as a backend. I've been swapping it out for ROCm's fork of bitsandbytes, which of course does. But, since I built it myself and my distro is on ROCm 6.3, I then have to switch torch, torchvision, and pytorch-triton-rocm to the version compatible with ROCm 6.3. Basically, the same thing mcondarelli is doing. Haven't figured out how to get patchmatch working with it. Hope this gets fixed soon.
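
As a rough illustration of matching the wheels to the installed ROCm release: the start script earlier in this thread installs from `https://download.pytorch.org/whl/nightly/rocm6.3`. The sketch below builds that kind of URL from a ROCm version string. `rocm_index_url` is a hypothetical helper, and it is an assumption that an index exists for any particular version, so check before relying on it.

```bash
#!/bin/bash
# Hypothetical helper: build a PyTorch nightly index URL (same pattern as the
# start script in this thread) from a ROCm version such as "6.3" or "6.3.1".
rocm_index_url() {
    local version="$1"
    case "$version" in
        *.*) ;;                                   # needs at least major.minor
        *)   echo "expected major.minor, got: $version" >&2; return 1 ;;
    esac
    local major="${version%%.*}"                  # "6.3.1" -> "6"
    local rest="${version#*.}"                    # "6.3.1" -> "3.1"
    local minor="${rest%%.*}"                     # "3.1"   -> "3"
    echo "https://download.pytorch.org/whl/nightly/rocm${major}.${minor}"
}

rocm_index_url "6.3.1"
```

The printed URL could then be passed to `pip install --index-url ...` when reinstalling torch, torchvision and pytorch-triton-rocm, as the script does.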

@daxime

daxime commented Apr 9, 2025

> I solved (somehow) my problem installing InvokeAI and THEN:
>
>   • removing torch, torchvision and bitsandbytes
>   • installing the three (plus pytorch-triton-rocm) from Pytorch site.

Thank you, it worked for me :)

@daxime daxime mentioned this issue Apr 9, 2025
1 task