cuda not compiled #9

Closed
SHoogstad opened this issue Jan 3, 2025 · 8 comments

Comments

@SHoogstad

SHoogstad commented Jan 3, 2025

Hey, I am trying to get this Docker container to work on Unraid with my B580. It installs everything, but when I try to launch it, it says Torch is not compiled with CUDA. Any reason for this?

@simonlui
Owner

simonlui commented Jan 4, 2025

Huh, it might be too old but not sure. Technically not supported by Intel still. However, I can try and see what is going on. Can you run the following multi-command sanity check for me and tell me what it outputs? You can exit out with Ctrl + D afterwards.

docker run -it --device /dev/dri --entrypoint bash --rm  localhost/ipex-arc-comfy:latest
source /opt/intel/oneapi/setvars.sh
sycl-ls

It would also help if you posted the ComfyUI log output when trying to run the container. Thanks.

@SHoogstad
Author

SHoogstad commented Jan 4, 2025

Here is the log:

No command to use ipexrun to launch ComfyUI. Launching normally.
/deps/venv/lib/python3.11/site-packages/torch/xpu/__init__.py:57: UserWarning: XPU device count is zero! (Triggered internally at /build/pytorch/c10/xpu/XPUFunctions.cpp:50.)
  return torch._C._xpu_getDeviceCount()
/deps/venv/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libpng16.so.16: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
Traceback (most recent call last):
  File "/ComfyUI/main.py", line 136, in <module>
    import execution
  File "/ComfyUI/execution.py", line 13, in <module>
    import nodes
  File "/ComfyUI/nodes.py", line 22, in <module>
    import comfy.diffusers_load
  File "/ComfyUI/comfy/diffusers_load.py", line 3, in <module>
    import comfy.sd
  File "/ComfyUI/comfy/sd.py", line 6, in <module>
    from comfy import model_management
  File "/ComfyUI/comfy/model_management.py", line 166, in <module>
    total_vram = get_total_memory(get_torch_device()) / (1024 * 1024)
                                  ^^^^^^^^^^^^^^^^^^
  File "/ComfyUI/comfy/model_management.py", line 129, in get_torch_device
    return torch.device(torch.cuda.current_device())
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/deps/venv/lib/python3.11/site-packages/torch/cuda/__init__.py", line 778, in current_device
    _lazy_init()
  File "/deps/venv/lib/python3.11/site-packages/torch/cuda/__init__.py", line 284, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Also, for that command, I can't find the /dev/dri directory. For some reason the drivers refuse to be downloaded.

Edit: I got it to run. This is the output:

   bash: BASH_VERSION = 5.1.16(1)-release
   args: Using "$@" for setvars.sh arguments: 
:: compiler -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: tbb -- latest
:: oneAPI environment initialized ::
 
[opencl:cpu][opencl:0] Intel(R) OpenCL, AMD Ryzen 3 2200G with Radeon Vega Graphics     OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]


@simonlui
Owner

simonlui commented Jan 4, 2025

/deps/venv/lib/python3.11/site-packages/torch/xpu/__init__.py:57: UserWarning: XPU device count is zero! (Triggered internally at /build/pytorch/c10/xpu/XPUFunctions.cpp:50.)

This is not good. It means your B580 isn't being recognized, so PyTorch sees no GPU at all. The fact that you had trouble finding /dev/dri suggests the drivers may not be installed correctly, since that directory should exist once they are.
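A quick way to confirm the /dev/dri point on the host is to look for DRM render nodes. The following is a minimal sketch, not part of this repository: the helper name `find_render_nodes` is hypothetical, and it assumes the usual Linux convention that a working GPU kernel driver exposes device files named `renderD*` (for example `renderD128`) under /dev/dri.

```python
from pathlib import Path


def find_render_nodes(dri_dir="/dev/dri"):
    """Return the sorted names of renderD* entries under dri_dir.

    Returns an empty list when the directory does not exist, which is
    the situation described above (no /dev/dri on the host).
    """
    root = Path(dri_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob("renderD*"))


if __name__ == "__main__":
    nodes = find_render_nodes()
    if nodes:
        print("Render nodes found:", ", ".join(nodes))
    else:
        print("No render nodes under /dev/dri; passing --device /dev/dri to docker will not help yet.")
```

If this prints no render nodes on the host, the container cannot see the GPU either, regardless of what is installed inside the image.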

[opencl:cpu][opencl:0] Intel(R) OpenCL, AMD Ryzen 3 2200G with Radeon Vega Graphics     OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]

You should be seeing more lines. I have an A770 and a few extra lines, but you should be seeing something like this:

[opencl:cpu][opencl:0] Intel(R) OpenCL, AMD Ryzen 9 5950X 16-Core Processor             OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO  [24.26.30049.6]
[opencl:cpu][opencl:2] Intel(R) OpenCL, AMD Ryzen 9 5950X 16-Core Processor             OpenCL 3.0 (Build 0) [2024.18.10.0.08_160000]
[level_zero:gpu][level_zero:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.30049]

You need that level_zero line because that is what is being used by IPEX to run things. I think your drivers aren't installed correctly. Did you follow https://dgpu-docs.intel.com/driver/client/overview.html on your host Linux OS?
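The check described here (is there a level_zero GPU line in the sycl-ls output?) can be automated. This is a rough sketch under the assumption that every device line starts with a `[backend:kind]` prefix, as in the two outputs quoted in this thread; the function names are made up for illustration and the sample strings are copied from the comments above.

```python
import re

# Matches the leading "[backend:kind]" of a sycl-ls line,
# e.g. "[level_zero:gpu][level_zero:0] Intel(R) Level-Zero, ...".
_PREFIX_RE = re.compile(r"^\[(?P<backend>\w+):(?P<kind>\w+)\]")

# Output posted by the reporter (B580 not detected).
REPORTER_OUTPUT = (
    "[opencl:cpu][opencl:0] Intel(R) OpenCL, AMD Ryzen 3 2200G with Radeon "
    "Vega Graphics     OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]\n"
)

# Two of the lines posted by the maintainer (A770 detected).
MAINTAINER_OUTPUT = (
    "[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 "
    "Graphics OpenCL 3.0 NEO  [24.26.30049.6]\n"
    "[level_zero:gpu][level_zero:0] Intel(R) Level-Zero, Intel(R) Arc(TM) "
    "A770 Graphics 1.3 [1.3.30049]\n"
)


def list_backends(sycl_ls_output):
    """Return one (backend, device_kind) pair per recognized device line."""
    pairs = []
    for line in sycl_ls_output.splitlines():
        match = _PREFIX_RE.match(line.strip())
        if match:
            pairs.append((match.group("backend"), match.group("kind")))
    return pairs


def has_level_zero_gpu(sycl_ls_output):
    """True if at least one Level Zero GPU device is listed."""
    return ("level_zero", "gpu") in list_backends(sycl_ls_output)


if __name__ == "__main__":
    print("reporter:", has_level_zero_gpu(REPORTER_OUTPUT))
    print("maintainer:", has_level_zero_gpu(MAINTAINER_OUTPUT))
```

To use it on a live system, pipe the output of sycl-ls into a file or a string and pass that to `has_level_zero_gpu`.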

@SHoogstad
Author

I did some research. Intel does not release separate drivers for their cards on Linux, as they are built into the newest Linux kernel, 6.12 (and 6.11, although that one still needs some extra drivers to be downloaded). My Unraid server (beta 7 rc2) runs on Linux kernel 6.6.66. I will close this issue, as this is not an issue with your Docker container. Thanks for the help!
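For anyone else landing here, the kernel comparison can be scripted. This is only a sketch: the 6.12 threshold is taken from the comment above and has not been verified against Intel's documentation, and the helper names are hypothetical.

```python
import re


def kernel_version(release):
    """Parse the leading 'major.minor' from a release string such as '6.6.66'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release, minimum=(6, 12)):
    """True if the release is at or above the minimum (major, minor).

    The default of (6, 12) is an assumption based on this thread.
    """
    return kernel_version(release) >= minimum
```

On a running host, `kernel_at_least(platform.release())` (with `import platform`) gives the answer for the current kernel; for the Unraid kernel mentioned above, `kernel_at_least("6.6.66")` is false.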

@simonlui
Owner

simonlui commented Jan 5, 2025

Okay. I went and checked and it seems like support for Battlemage GPUs only got added in IPEX 2.5.10+xpu. I am a tad split on what I want to do going forward and conversion is a bit messy but hopefully I will have something by the end of this weekend or the next one.

@SHoogstad
Author

If you want to support IPEX 2.5, as far as I know you only need to download those packages, right? Or make a branch with that support.

@simonlui
Owner

simonlui commented Jan 5, 2025

If you want to support IPEX 2.5, as far as I know you only need to download those packages, right? Or make a branch with that support.

No, it now installs all of the runtime dependencies through pip. On one hand, that does make life simpler for most people, but for this project specifically, it means I need to go and see what I need to remove or retain to keep everything minimal. It also calls into question whether this project is needed anymore. Technically, someone can just write a requirements.txt, and that covers 99% of the use cases this repository used to.

@SHoogstad
Author

I did contact Intel about support for IPEX on Battlemage in this issue. Support requires driver-level updates, for which there is no ETA at the moment, so I would hold off on trying to implement it.
