
IMPLICIT: No CUDA extension has been built, can't train on GPU #718

Open
ngianni opened this issue Jul 16, 2024 · 2 comments

Comments


ngianni commented Jul 16, 2024

Hi! I'm trying to run the ALS model on GPU, but I get the following error:

ValueError: No CUDA extension has been built, can't train on GPU.

I also tried to run it in Google Colab, but got the same error. It seems that implicit.gpu.HAS_CUDA is always returning False. Any ideas?

I'm running on Debian 11, and this is the nvidia-smi output:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.06 Driver Version: 555.42.06 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla T4 On | 00000000:00:04.0 Off | 0 |
| N/A 38C P8 10W / 70W | 1MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
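For reference, a minimal sketch of the check involved (assuming `implicit` is installed; the model-constructor line is left as a comment so the snippet runs anywhere). The library consults `implicit.gpu.HAS_CUDA` before GPU training, so you can inspect it yourself and fall back to the CPU solver instead of hitting the ValueError:

```python
# Minimal sketch: inspect the flag implicit checks before GPU training and
# fall back to the CPU solver instead of raising ValueError.
try:
    import implicit
    use_gpu = implicit.gpu.HAS_CUDA  # False when the CUDA extension is missing
except ImportError:
    use_gpu = False  # implicit itself is not installed

print("GPU available:", use_gpu)
# model = implicit.als.AlternatingLeastSquares(factors=64, use_gpu=use_gpu)
```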


j-svensmark commented Aug 17, 2024

I had a similar issue when trying to use CUDA 12; CUDA 11 works for me, though.

I tried editing these lines https://github.com/benfred/implicit/blob/main/implicit/gpu/__init__.py#L16-L17 to something like

except ImportError as e:
    print(f"{e}")

And got this error when importing implicit: ImportError: libcublas.so.11: cannot open shared object file: No such file or directory.
Looks like the CUDA extension is specifically linked against CUDA 11.
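To confirm this kind of linkage problem without editing the package, you can probe which libcublas major versions the dynamic loader can actually resolve. A small diagnostic sketch using only the standard library (nothing implicit-specific; Linux library names assumed):

```python
# Diagnostic sketch: probe which libcublas major versions the dynamic loader
# can resolve, to check whether the CUDA 11 runtime the extension appears to
# link against is present on the system.
import ctypes

def cublas_available(major: int) -> bool:
    """Return True if libcublas.so.<major> can be dlopen'ed."""
    try:
        ctypes.CDLL(f"libcublas.so.{major}")
        return True
    except OSError:
        return False

for major in (11, 12):
    status = "found" if cublas_available(major) else "missing"
    print(f"libcublas.so.{major}: {status}")
```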


win845 commented Oct 14, 2024

Confirmed: implicit can't find CUDA 12 (but finds CUDA 11).

Quick way to reproduce in Docker:

# Dockerfile
FROM nvidia/cuda:12.6.1-cudnn-runtime-ubuntu24.04
WORKDIR /app
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=Europe/Berlin
ENV PATH="/opt/venv/bin:$PATH"

RUN gpg --keyserver keyserver.ubuntu.com --recv-keys F23C5A6CF475977595C89F51BA6932366A755776 && \
    gpg --export F23C5A6CF475977595C89F51BA6932366A755776 | tee /usr/share/keyrings/deadsnakes.gpg > /dev/null && \
    echo "deb [signed-by=/usr/share/keyrings/deadsnakes.gpg] https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) main" | tee /etc/apt/sources.list.d/deadsnakes.list

RUN apt update && \
    apt-get install -y --no-install-recommends  \
    curl libgomp1 \
    python3.11 python3.11-dev python3.11-venv 

RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
    python3.11 get-pip.py && \
    python3.11 -m venv /opt/venv && \
    rm get-pip.py

RUN pip install implicit
CMD python -c "import implicit; print(implicit.gpu.HAS_CUDA)"

# On the host:
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
docker build -t implicit -f Dockerfile .
docker run --gpus all -it implicit
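If the published wheels really do link against the CUDA 11 runtime, one hypothetical workaround (untested sketch; the `nvidia-cublas-cu11` pip wheel name and its install layout are assumptions on my part) is to install the CUDA 11 runtime wheel and dlopen its libcublas before importing implicit, so the extension's libcublas.so.11 dependency resolves:

```python
# Hypothetical workaround sketch (assumes `pip install nvidia-cublas-cu11`
# was run): dlopen the CUDA 11 cuBLAS library shipped in the pip wheel with
# RTLD_GLOBAL before importing implicit, so its symbols are visible to the
# extension module. Returns False harmlessly if the wheel is not installed.
import ctypes
import glob
import os
import site

def preload_cu11_cublas() -> bool:
    """Try to load libcublas.so.* from the nvidia pip wheels; True on success."""
    for base in site.getsitepackages():
        pattern = os.path.join(base, "nvidia", "cublas", "lib", "libcublas.so.*")
        for path in glob.glob(pattern):
            try:
                ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
                return True
            except OSError:
                continue
    return False

preload_cu11_cublas()
# then: import implicit
```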
