
Basic support for Intel XPU (Arc Graphics) #409

Merged
merged 6 commits into comfyanonymous:master on Apr 7, 2023

Conversation

kwaa (Contributor) commented Apr 5, 2023

Closes #387

Note:

You need to install the oneAPI Base Toolkit first; on Arch Linux: paru -S intel-oneapi-basekit intel-compute-runtime-bin

Use the AUR's intel-compute-runtime-bin instead of intel-compute-runtime to avoid the Assertion '__n < this->size()' failed error.

It may also require the oneAPI AI Analytics Toolkit; I'm not sure.

Then run:

# setvars
source /opt/intel/oneapi/setvars.sh
# install dependencies
pip install torch==1.13.0a0 torchvision==0.14.1a0 intel_extension_for_pytorch==1.13.10+xpu -f https://developer.intel.com/ipex-whl-stable-xpu
pip install -r requirements.txt
# run
python main.py --use-split-cross-attention

If --use-split-cross-attention is not used, the output is noise.

Verify XPU availability from the ComfyUI folder where the dependencies were installed:

[user@host ComfyUI]$ python
>>> import torch
>>> import intel_extension_for_pytorch
>>> torch.xpu.is_available()
True
>>> torch.xpu.get_device_properties('xpu')
_DeviceProperties(name='Intel(R) Arc(TM) A770 Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu, support_fp64=0, total_memory=15473MB, max_compute_units=512)

Known issues:

  • Noise occurs when the batch size is larger than one, even with --use-split-cross-attention
    • There is about a 50% probability of generating noise
    • Without --use-split-cross-attention, it is 100%
  • Some samplers and schedulers cannot be used

@kwaa kwaa changed the title Basic support for Intel XPU Basic support for Intel XPU (Arc Graphics) Apr 5, 2023
kwaa (Contributor, Author) commented Apr 5, 2023

PS: If torch.xpu.optimize is used on the model, it hits the same error as intel/intel-extension-for-pytorch#319:

elif vram_state == XPU:
+   real_model = torch.xpu.optimize(real_model)
    real_model.to("xpu")
    pass

comfyanonymous (Owner) commented:

By default should XPU have priority over CUDA if both are present?

kwaa (Contributor, Author) commented Apr 5, 2023

By default should XPU have priority over CUDA if both are present?

It first checks whether intel_extension_for_pytorch is installed; if it is not, CUDA is used.
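A minimal sketch of that selection logic (illustrative only; not the exact ComfyUI code, and the helper name here is made up):

import torch

try:
    import intel_extension_for_pytorch  # noqa: F401  (registers torch.xpu)
    xpu_available = torch.xpu.is_available()
except ImportError:
    xpu_available = False

def pick_torch_device():
    # Prefer XPU when IPEX is installed and a device is present,
    # otherwise fall back to CUDA, then CPU.
    if xpu_available:
        return torch.device("xpu")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")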

comfyanonymous (Owner) commented:

I'm asking because some computers have an integrated Intel GPU alongside an Nvidia GPU, and I'm wondering if that could cause issues.

kwaa (Contributor, Author) commented Apr 5, 2023

I'm asking because some computers have an integrated Intel GPU alongside an Nvidia GPU, and I'm wondering if that could cause issues.

Probably not, because PyTorch does not support XPU by default; you need to install intel_extension_for_pytorch, and installing it can be taken as intent to use the XPU first.

kwaa (Contributor, Author) commented Apr 6, 2023

@comfyanonymous Is it possible to merge this PR, or is there something else wrong with it?

comfy/model_management.py (review thread: outdated, resolved)
kwaa (Contributor, Author) commented Apr 6, 2023

Update: I noticed the Experimental Codeless Optimization (ipexrun), but it seems to be CPU-only for now.

Manually adding torch.xpu.optimize, torch.xpu.amp.autocast, torch.xpu.empty_cache, etc. would be very cumbersome and hard to maintain, so ipexrun seems like a good solution.
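For context, a hedged sketch of what that manual integration would look like at each call site (illustrative only; it assumes an IPEX XPU build where these helpers exist and an XPU device is present):

import torch
import intel_extension_for_pytorch  # noqa: F401  (provides torch.xpu)

model = torch.nn.Linear(16, 16).to("xpu").eval()
model = torch.xpu.optimize(model)     # per-model optimization call
x = torch.randn(1, 16, device="xpu")

with torch.xpu.amp.autocast():        # per-inference autocast context
    y = model(x)

torch.xpu.empty_cache()               # explicit cache management afterwards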

Waiting for v2.0.0+xpu to be released...

comfyanonymous (Owner) commented:

After looking at this pull request a bit, I don't think XPU should be treated as another VRAM state. It makes sense for CPU and MPS because they don't have any VRAM, but XPU actually has VRAM, so it would be good if --lowvram and --highvram worked.

kwaa (Contributor, Author) commented Apr 6, 2023

After looking at this pull request a bit, I don't think XPU should be treated as another VRAM state. It makes sense for CPU and MPS because they don't have any VRAM, but XPU actually has VRAM, so it would be good if --lowvram and --highvram worked.

That may require deeper changes, but I do agree.
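A minimal sketch of how the XPU's reported memory could feed such a VRAM-state decision (illustrative only; the threshold is arbitrary and total_memory is assumed to be in bytes, as it is for CUDA):

import torch
import intel_extension_for_pytorch  # noqa: F401

props = torch.xpu.get_device_properties("xpu")
total_mb = props.total_memory // (1024 * 1024)  # assumed bytes

# e.g. pick a low-VRAM code path automatically on small cards
vram_state = "LOW_VRAM" if total_mb < 4096 else "NORMAL_VRAM"
print(vram_state, total_mb, "MB")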

@kwaa kwaa requested a review from comfyanonymous April 6, 2023 06:25
kwaa (Contributor, Author) commented Apr 6, 2023

Ah, there are some problems with --lowvram, please wait.

It looks related to this issue: #39
After running pip install accelerate --upgrade, everything works fine.

Tested:

  • LOW_VRAM ✅
  • NORMAL_VRAM ✅
  • HIGH_VRAM ✅
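For reference, the commands these modes correspond to (the run without a VRAM flag presumably being the NORMAL_VRAM default):

python main.py --use-split-cross-attention --lowvram
python main.py --use-split-cross-attention
python main.py --use-split-cross-attention --highvram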

@kwaa kwaa requested a review from comfyanonymous April 7, 2023 01:12
@comfyanonymous comfyanonymous merged commit 28a7205 into comfyanonymous:master Apr 7, 2023
comfyanonymous (Owner) commented:

I did a small refactor, so if you can confirm it still works, that would be great: bceccca

kwaa (Contributor, Author) commented Apr 7, 2023

I did a small refactor, so if you can confirm it still works, that would be great: bceccca

No problem, it still works.

kotx commented Apr 16, 2023

Hi, I am unable to get any sort of generation working on my A770 by following the instructions in the first post:

Traceback (most recent call last):
  File "/home/kot/Documents/ComfyUI/execution.py", line 184, in execute
    executed += recursive_execute(self.server, prompt, self.outputs, x, extra_data)
  File "/home/kot/Documents/ComfyUI/execution.py", line 60, in recursive_execute
    executed += recursive_execute(server, prompt, outputs, input_unique_id, extra_data)
  File "/home/kot/Documents/ComfyUI/execution.py", line 60, in recursive_execute
    executed += recursive_execute(server, prompt, outputs, input_unique_id, extra_data)
  File "/home/kot/Documents/ComfyUI/execution.py", line 69, in recursive_execute
    outputs[unique_id] = getattr(obj, obj.FUNCTION)(**input_data_all)
  File "/home/kot/Documents/ComfyUI/nodes.py", line 768, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
  File "/home/kot/Documents/ComfyUI/nodes.py", line 699, in common_ksampler
    comfy.model_management.load_model_gpu(model)
  File "/home/kot/Documents/ComfyUI/comfy/model_management.py", line 168, in load_model_gpu
    real_model.to(get_torch_device())
  File "/home/kot/Documents/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 987, in to
    return self._apply(convert)
  File "/home/kot/Documents/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 639, in _apply
    module._apply(fn)
  File "/home/kot/Documents/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 639, in _apply
    module._apply(fn)
  File "/home/kot/Documents/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 639, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "/home/kot/Documents/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 662, in _apply
    param_applied = fn(param)
  File "/home/kot/Documents/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 985, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: Native API failed. Native API returns: -997 (The plugin has emitted a backend specific error) -997 (The plugin has emitted a backend specific error)

A quick Google search shows me this issue: intel/compute-runtime#617, so it seems it's failing to allocate more than 8GB VRAM.
Is there any fix for it? I'm on Manjaro with kernel 6.1.23-1, libva-intel-driver 2.4.1-2 (unsure if it is relevant).
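One hypothetical way to check whether the per-allocation limit is the culprit (illustrative only, not from the original report; the error type may differ):

import torch
import intel_extension_for_pytorch  # noqa: F401

# Try a single ~10 GiB float32 allocation on the XPU.
n = 10 * 1024**3 // 4
try:
    t = torch.empty(n, dtype=torch.float32, device="xpu")
    print("allocated", t.numel() * 4 / 1024**3, "GiB")
except RuntimeError as e:
    print("allocation failed:", e)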

kwaa (Contributor, Author) commented Apr 17, 2023

Hi, I am unable to get any sort of generation working on my A770 by following the instructions in the first post:

A quick Google search shows me this issue: intel/compute-runtime#617, so it seems it's failing to allocate more than 8GB VRAM. Is there any fix for it? I'm on Manjaro with kernel 6.1.23-1, libva-intel-driver 2.4.1-2 (unsure if it is relevant).

I have not encountered this problem. Are you using AUR's intel-compute-runtime-bin?

kotx commented Apr 17, 2023

Are you using AUR's intel-compute-runtime-bin?

Yes, I used the paru command in the first post.

kwaa (Contributor, Author) commented Apr 17, 2023

Yes, I used the paru command in the first post.

Perhaps you could try installing the oneAPI AI Kit; see #476

Or LOW_VRAM mode: python main.py --use-split-cross-attention --lowvram

kotx commented Apr 18, 2023

Sadly, no luck with the AI Kit or --lowvram. Does it matter that I'm using my own Python instead of the Intel Python included in the AI Kit?

kwaa (Contributor, Author) commented Apr 18, 2023

Sadly, no luck with the AI Kit or --lowvram. Does it matter that I'm using my own Python instead of the Intel Python included in the AI Kit?

If you are not using the Intel Python that comes with the AI Kit, installing the Kit will make no noticeable difference.

kotx commented Apr 19, 2023

If you are not using the Intel Python that comes with the AI Kit, installing the Kit will make no noticeable difference.

I recreated the venv with Intel Python:

(venv) [kot@rin ComfyUI]$ python -V
Python 3.9.15 :: Intel Corporation

But I get the same error as before. Model is Counterfeit v2.5.

kwaa (Contributor, Author) commented Apr 19, 2023

But I get the same error as before. Model is Counterfeit v2.5.

Hmm... this is a bit tricky. Can you post the output of the sycl-ls command, and of torch.xpu.is_available() and torch.xpu.get_device_properties('xpu') in Python?

kotx commented Apr 20, 2023

sycl-ls:

[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2022.15.12.0.01_081451]
[opencl:cpu:1] Intel(R) OpenCL, AMD Ryzen 5 3600 6-Core Processor               3.0 [2022.15.12.0.01_081451]
[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) Arc(TM) A770 Graphics 3.0 [23.05.25593.11]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.25593]

Python:

Python 3.9.15 (main, Nov 11 2022, 13:58:57) 
[GCC 11.2.0] :: Intel Corporation on linux
Type "help", "copyright", "credits" or "license" for more information.
Intel(R) Distribution for Python is brought to you by Intel Corporation.
Please check out: https://software.intel.com/en-us/python-distribution
>>> import torch
>>> import intel_extension_for_pytorch
[W OperatorEntry.cpp:150] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: torchvision::nms
    no debug info
  dispatch key: CPU
  previous kernel: registered at /build/intel-pytorch-extension/csrc/cpu/aten/TorchVisionNms.cpp:47
       new kernel: registered at /opt/workspace/vision/torchvision/csrc/ops/cpu/nms_kernel.cpp:112 (function registerKernel)
/home/kot/Documents/ComfyUI/venv/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 
  warn(f"Failed to load image Python extension: {e}")
>>> torch.xpu.is_available()
True
>>> torch.xpu.get_device_properties('xpu')
_DeviceProperties(name='Intel(R) Arc(TM) A770 Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu, support_fp64=0, total_memory=15473MB, max_compute_units=512)
>>> 

kwaa (Contributor, Author) commented Apr 21, 2023

That looks normal; I probably don't have a proper workaround.

But I didn't get this warning:

/home/kot/Documents/ComfyUI/venv/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
   warn(f"Failed to load image Python extension: {e}")


Successfully merging this pull request may close these issues.

[Feature Request] Support Intel Extension for PyTorch (IPEX)