
IMPLICIT: No CUDA extension has been built, can't train on GPU #718

Open
ngianni opened this issue Jul 16, 2024 · 2 comments

Comments


ngianni commented Jul 16, 2024

Hi! I'm trying to run the ALS model on GPU, but I get the following error:

ValueError: No CUDA extension has been built, can't train on GPU.

I also tried to run it in Google Colab, but got the same error. It seems that implicit.gpu.HAS_CUDA is always returning False. Any ideas?

I'm running on Debian 11, and this is the nvidia-smi output:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.06 Driver Version: 555.42.06 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla T4 On | 00000000:00:04.0 Off | 0 |
| N/A 38C P8 10W / 70W | 1MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
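For reference, a minimal sketch of the check involved (assuming `implicit` is installed; the model-constructor line is left as a comment so the snippet runs anywhere). The library consults `implicit.gpu.HAS_CUDA` before GPU training, so you can inspect it yourself and fall back to the CPU solver instead of hitting the ValueError:

```python
# Minimal sketch: inspect the flag implicit checks before GPU training and
# fall back to the CPU solver instead of raising ValueError.
try:
    import implicit
    use_gpu = implicit.gpu.HAS_CUDA  # False when the CUDA extension is missing
except ImportError:
    use_gpu = False  # implicit itself is not installed

print("GPU available:", use_gpu)
# model = implicit.als.AlternatingLeastSquares(factors=64, use_gpu=use_gpu)
```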


j-svensmark commented Aug 17, 2024

I had a similar issue when trying to use CUDA 12; CUDA 11 works for me, though.

I tried editing these lines https://github.com/benfred/implicit/blob/main/implicit/gpu/__init__.py#L16-L17 to something like

except ImportError as e:
    print(f"{e}")

And got this error when importing implicit: ImportError: libcublas.so.11: cannot open shared object file: No such file or directory.
Looks like the CUDA extension is specifically linked against CUDA 11.
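To confirm this kind of linkage problem without editing the package, you can probe which libcublas major versions the dynamic loader can actually resolve. A small diagnostic sketch using only the standard library (nothing implicit-specific; Linux library names assumed):

```python
# Diagnostic sketch: probe which libcublas major versions the dynamic loader
# can resolve, to check whether the CUDA 11 runtime the extension appears to
# link against is present on the system.
import ctypes

def cublas_available(major: int) -> bool:
    """Return True if libcublas.so.<major> can be dlopen'ed."""
    try:
        ctypes.CDLL(f"libcublas.so.{major}")
        return True
    except OSError:
        return False

for major in (11, 12):
    status = "found" if cublas_available(major) else "missing"
    print(f"libcublas.so.{major}: {status}")
```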


win845 commented Oct 14, 2024

Confirmed: implicit can't find CUDA 12 (but finds CUDA 11).

Quick way to reproduce in Docker:

# Dockerfile
FROM nvidia/cuda:12.6.1-cudnn-runtime-ubuntu24.04
WORKDIR /app
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=Europe/Berlin
ENV PATH="/opt/venv/bin:$PATH"

RUN gpg --keyserver keyserver.ubuntu.com --recv-keys F23C5A6CF475977595C89F51BA6932366A755776 && \
    gpg --export F23C5A6CF475977595C89F51BA6932366A755776 | tee /usr/share/keyrings/deadsnakes.gpg > /dev/null && \
    echo "deb [signed-by=/usr/share/keyrings/deadsnakes.gpg] https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) main" | tee /etc/apt/sources.list.d/deadsnakes.list

RUN apt update && \
    apt-get install -y --no-install-recommends  \
    curl libgomp1 \
    python3.11 python3.11-dev python3.11-venv 

RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
    python3.11 get-pip.py && \
    python3.11 -m venv /opt/venv && \
    rm get-pip.py

RUN pip install implicit
CMD python -c "import implicit; print(implicit.gpu.HAS_CUDA)"

# On the host:
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
docker build -t implicit -f Dockerfile .
docker run --gpus all -it implicit
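If the published wheels really do link against the CUDA 11 runtime, one hypothetical workaround (untested sketch; the `nvidia-cublas-cu11` pip wheel name and its install layout are assumptions on my part) is to install the CUDA 11 runtime wheel and dlopen its libcublas before importing implicit, so the extension's libcublas.so.11 dependency resolves:

```python
# Hypothetical workaround sketch (assumes `pip install nvidia-cublas-cu11`
# was run): dlopen the CUDA 11 cuBLAS library shipped in the pip wheel with
# RTLD_GLOBAL before importing implicit, so its symbols are visible to the
# extension module. Returns False harmlessly if the wheel is not installed.
import ctypes
import glob
import os
import site

def preload_cu11_cublas() -> bool:
    """Try to load libcublas.so.* from the nvidia pip wheels; True on success."""
    for base in site.getsitepackages():
        pattern = os.path.join(base, "nvidia", "cublas", "lib", "libcublas.so.*")
        for path in glob.glob(pattern):
            try:
                ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
                return True
            except OSError:
                continue
    return False

preload_cu11_cublas()
# then: import implicit
```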
