Poetry startup uses CPU instead of GPU #235

Open
thb10086 opened this issue Jul 14, 2024 · 3 comments

Comments

@thb10086

[three images attached; contents not preserved]

@ayancey
Collaborator

ayancey commented Aug 10, 2024

Need more info.

@sandi2382

I have the same problem, using Docker on WSL. I created the container with the command docker run -d --gpus all -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest-gpu and get the following log:

2024-08-21 14:01:00 
2024-08-21 14:01:00 ==========
2024-08-21 14:01:00 == CUDA ==
2024-08-21 14:01:00 ==========
2024-08-21 14:01:00 
2024-08-21 14:01:00 CUDA Version 11.8.0
2024-08-21 14:01:00 
2024-08-21 14:01:00 Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2024-08-21 14:01:00 
2024-08-21 14:01:00 This container image and its contents are governed by the NVIDIA Deep Learning Container License.
2024-08-21 14:01:00 By pulling and using the container, you accept the terms and conditions of this license:
2024-08-21 14:01:00 https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
2024-08-21 14:01:00 
2024-08-21 14:01:00 A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
2024-08-21 14:01:00 
2024-08-21 14:01:00 [2024-08-21 12:01:00 +0000] [27] [INFO] Starting gunicorn 22.0.0
2024-08-21 14:01:00 [2024-08-21 12:01:00 +0000] [27] [INFO] Listening at: http://0.0.0.0:9000 (27)
2024-08-21 14:01:00 [2024-08-21 12:01:00 +0000] [27] [INFO] Using worker: uvicorn.workers.UvicornWorker
2024-08-21 14:01:00 [2024-08-21 12:01:00 +0000] [28] [INFO] Booting worker with pid: 28
2024-08-21 14:01:01 /app/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py:88: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 500: named symbol not found (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
2024-08-21 14:01:01   return torch._C._cuda_getDeviceCount() > 0
2024-08-21 14:01:17 [2024-08-21 12:01:17 +0000] [28] [INFO] Started server process [28]
2024-08-21 14:01:17 [2024-08-21 12:01:17 +0000] [28] [INFO] Waiting for application startup.
2024-08-21 14:01:17 [2024-08-21 12:01:17 +0000] [28] [INFO] Application startup complete.

The log shows that the CUDA 11.8.0 container image starts, but CUDA initialization fails during the startup procedure. Notice

2024-08-21 14:01:01 /app/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py:88: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 500: named symbol not found (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
2024-08-21 14:01:01   return torch._C._cuda_getDeviceCount() > 0

in the log.
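For anyone comparing logs, the warning can be picked out of a container log programmatically. This is only a rough sketch: the regular expressions and the function name find_cuda_init_warnings are my own assumptions, written against the single warning line quoted above, so other PyTorch versions may word it differently.

```python
import re

# Both patterns are assumptions derived from the one warning quoted in this thread.
CUDA_INIT_WARNING = re.compile(r"UserWarning: CUDA initialization: (?P<detail>.*)")
CUDA_ERROR_CODE = re.compile(r"Error (?P<code>\d+): (?P<message>[^(]+)")

# The warning line from the log above, reproduced as sample input.
SAMPLE_LOG = (
    "2024-08-21 14:01:01 /app/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py:88: "
    "UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). "
    "Did you run some cuda functions before calling NumCudaDevices() that might have "
    "already set an error? Error 500: named symbol not found "
    "(Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)"
)


def find_cuda_init_warnings(log_text):
    """Return one (error_code, message) tuple per CUDA initialization warning."""
    findings = []
    for line in log_text.splitlines():
        warning = CUDA_INIT_WARNING.search(line)
        if not warning:
            continue
        detail = warning.group("detail")
        error = CUDA_ERROR_CODE.search(detail)
        if error:
            findings.append((int(error.group("code")), error.group("message").strip()))
        else:
            # No numeric code found: keep the raw detail so nothing is lost.
            findings.append((None, detail.strip()))
    return findings


if __name__ == "__main__":
    for code, message in find_cuda_init_warnings(SAMPLE_LOG):
        print(f"CUDA init failed: code={code}, message={message}")
```

An empty result only means the warning text was not matched. It does not prove that the GPU is actually being used.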

If I open a shell in the container with docker exec -it <container hash> /bin/bash
and run nvidia-smi, I get the expected nvidia-smi output with the GPU listed, which suggests the host drivers etc. are set up correctly.

Wed Aug 21 12:11:57 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.58.02              Driver Version: 556.12         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2080 Ti     On  |   00000000:01:00.0  On |                  N/A |
<...>
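Since nvidia-smi only shows that the driver is reachable from the container, while the warning comes from PyTorch's own CUDA initialization, it may help to ask PyTorch directly what it sees. Below is a minimal sketch of such a check. The function name describe_torch_cuda is hypothetical, and I am assuming the script is run with the interpreter from the /app/.venv environment that appears in the warning's file path.

```python
def describe_torch_cuda():
    """Collect what PyTorch reports about CUDA; also works if torch is missing."""
    report = {"torch_installed": False}
    try:
        import torch
    except ImportError:
        # Not running inside the application's environment.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version PyTorch was built against; None for CPU-only builds.
    report["built_with_cuda"] = torch.version.cuda
    # False when CUDA initialization fails, as in the warning above.
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False while nvidia-smi still lists the GPU, the problem is more likely between the CUDA runtime and the WSL driver than in the docker run flags, though that is a guess rather than a confirmed diagnosis.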

@ayancey
Collaborator

ayancey commented Aug 21, 2024

Have you updated your NVIDIA drivers on the host Windows machine?
