[sdkit.Context] Fallback to cpu device on Windows with AMD GPU #64

Open
wants to merge 2 commits into main
Conversation

st1vms

@st1vms st1vms commented Oct 31, 2023

When trying to load a model with load_model() on Windows with an AMD GPU (RX 7900 XT),
the calling program crashes with the error "Torch not compiled with CUDA enabled" instead of falling back to the cpu device.

Tested with the torch packages installed via:
pip3 install torch torchvision torchaudio

This patch ensures that sdkit.Context() is initialized with the cpu device when CUDA is not available.
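The fallback described above can be sketched as follows. This is a minimal sketch, not the actual patch code; `pick_default_device` is a hypothetical helper name, and in sdkit the decision would be driven by `torch.cuda.is_available()`:

```python
def pick_default_device(cuda_available: bool) -> str:
    """Return the device string a fresh Context should default to."""
    # Fall back to "cpu" whenever the installed torch build cannot use
    # CUDA, instead of unconditionally defaulting to "cuda:0".
    return "cuda:0" if cuda_available else "cpu"

# In sdkit this would look roughly like:
#   self._device = pick_default_device(torch.cuda.is_available())
print(pick_default_device(False))  # cpu
print(pick_default_device(True))   # cuda:0
```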

@st1vms
Author

st1vms commented Oct 31, 2023

Also gave the suggested install commands for the CUDA 12 / ROCm nightly torch builds a try.

Apparently Windows still doesn't have a torch build with ROCm support.

I don't know if you can reproduce this error, but apparently it is related to sdkit.Context having self._device = "cuda:0" by default.

This is the relevant PoC:

"""PR#64 sdkit"""
from os import path as ospath
from os import getcwd
from sdkit import Context
from sdkit.models import load_model

MODEL_TYPE = "stable-diffusion"
MODEL_PATH = ospath.join(getcwd(), MODEL_TYPE, "v1-5-pruned-emaonly.ckpt")
CONTEXT = Context()
CONTEXT.model_paths[MODEL_TYPE] = MODEL_PATH
load_model(CONTEXT, MODEL_TYPE)

This script throws AssertionError: Torch not compiled with CUDA enabled
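Until the patch lands, the same crash can be worked around from user code by forcing the device before calling load_model(). A minimal sketch with a stand-in `Context` class (the real sdkit.Context exposes a `device` attribute, as the traceback below shows):

```python
class Context:
    """Stand-in for sdkit.Context, which defaults its device to "cuda:0"."""

    def __init__(self):
        self.device = "cuda:0"   # sdkit default, per this report
        self.model_paths = {}

ctx = Context()
ctx.device = "cpu"  # force cpu so a CPU-only torch build can load the model
print(ctx.device)   # cpu
```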

These are the torch verification steps in the REPL:

>>> import torch
>>> print(torch.__version__)
2.1.0+cpu
>>> print(torch.rand(5, 3))
tensor([[0.4543, 0.9270, 0.4408],
        [0.0567, 0.1613, 0.0919],
        [0.9389, 0.8564, 0.9848],
        [0.7152, 0.0204, 0.2193],
        [0.5769, 0.1634, 0.2768]])
>>> torch.cuda.is_available()
False

@st1vms
Author

st1vms commented Oct 31, 2023

Full stack trace:

PS C:\Users\username\Desktop\Folder> .\venv\Scripts\activate
(venv) PS C:\Users\username\Desktop\Folder>
(venv) PS C:\Users\username\Desktop\Folder> cd .\models\
(venv) PS C:\Users\username\Desktop\Folder\models> py .\poc.py
00:15:51.623 INFO MainThread loading stable-diffusion model from C:\Users\username\Desktop\Folder\models\stable-diffusion\v1-5-pruned-emaonly.ckpt to device: cuda:0
No module 'xformers'. Proceeding without it.
00:15:53.710 INFO MainThread loading on diffusers
00:15:53.710 INFO MainThread using config: C:\Users\username\Desktop\Folder\venv\lib\site-packages\sdkit\models\models_db\configs\v1-inference.yaml
00:15:53.718 INFO MainThread using attn_precision: fp16
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
Traceback (most recent call last):
  File "C:\Users\username\Desktop\Folder\models\poc.py", line 11, in <module>
    load_model(CONTEXT, MODEL_TYPE)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\sdkit\models\model_loader\__init__.py", line 54, in load_model
    context.models[model_type] = get_loader_module(model_type).load_model(context, **kwargs)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\sdkit\models\model_loader\stable_diffusion\__init__.py", line 86, in load_model
    return load_diffusers_model(
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\sdkit\models\model_loader\stable_diffusion\__init__.py", line 293, in load_diffusers_model
    default_pipe = default_pipe.to(context.device, torch.float16)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 727, in to
    module.to(torch_device, torch_dtype)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\transformers\modeling_utils.py", line 2065, in to
    return super().to(*args, **kwargs)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\torch\nn\modules\module.py", line 1160, in to
    return self._apply(convert)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\torch\nn\modules\module.py", line 810, in _apply
    module._apply(fn)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\torch\nn\modules\module.py", line 810, in _apply
    module._apply(fn)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\torch\nn\modules\module.py", line 810, in _apply
    module._apply(fn)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\torch\nn\modules\module.py", line 833, in _apply
    param_applied = fn(param)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\torch\nn\modules\module.py", line 1158, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "C:\Users\username\Desktop\Folder\venv\lib\site-packages\torch\cuda\__init__.py", line 289, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

With this patch applied, I can go through with the cpu device.

EDIT:
Apparently my original commit on this PR caused a circular import error when running the proof of concept, so I replaced the utils.log function with a getLogger call against the 'sdkit' logger. Waiting for feedback.
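The logger swap described above can be sketched like this. A sketch of the fix under the assumption that sdkit configures a logger named 'sdkit'; the call site shown is illustrative:

```python
import logging

# Before (created an import cycle when utils itself imports this module):
#   from .utils import log
# After: look the logger up by name instead of importing it.
log = logging.getLogger("sdkit")

log.info("falling back to cpu device")  # example call site
```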

Circular import fix `from .utils import log` when running proof of concept.