
The NVIDIA driver on your system is too old (found version 11080). #20

Closed
candowu opened this issue Nov 29, 2023 · 2 comments
Assignees
Labels
good first issue Good for newcomers

Comments


candowu commented Nov 29, 2023

NVIDIA-SMI 520.61.05 Driver Version: 520.61.05 CUDA Version: 11.8

export MODEL_PATH=Yi-34B-Chat-4bits
export MODEL_ID=01-ai/Yi-34B-Chat-4bits
docker run -it --gpus=all --net=host --shm-size=1g \
  -v $MODEL_PATH:$MODEL_PATH \
  -e DEVICE=cuda:1 \
  -e NCCL_DEBUG=INFO \
  docker.io/vectorchai/scalellm:latest --logtostderr --model_path=$MODEL_PATH --model_id=$MODEL_ID --model_type=Yi
I20231129 08:13:34.992501 7 main.cpp:135] Using devices: cuda:1
W20231129 08:13:34.993809 7 args_overrider.cpp:132] Overwriting model_type from llama to Yi
I20231129 08:13:34.993916 7 engine.cpp:91] Initializing model from: /data4/candowu/modelscope/01ai/Yi-34B-Chat-4bits
W20231129 08:13:34.993944 7 model_loader.cpp:162] Failed to find tokenizer.json, use tokenizer.model instead. Please consider using fast tokenizer for better performance.
I20231129 08:13:35.245934 7 engine.cpp:98] Initializing model with dtype: Half
I20231129 08:13:35.245993 7 engine.cpp:107] Initializing model with ModelArgs: [model_type: Yi, dtype: float16, hidden_size: 7168, hidden_act: silu, intermediate_size: 20480, n_layers: 60, n_heads: 56, n_kv_heads: 8, vocab_size: 64000, rms_norm_eps: 1e-05, layer_norm_eps: 0, rotary_dim: 0, rope_theta: 5e+06, rope_scaling: 1, rotary_pct: 1, max_position_embeddings: 4096, bos_token_id: 1, eos_token_id: 2, use_parallel_residual: 0, attn_qkv_clip: 0, attn_qk_ln: 0, attn_alibi: 0, alibi_bias_max: 0, no_bias: 0, residual_post_layernorm: 0], QuantArgs: [quant_method: awq, bits: 4, group_size: 128, desc_act: 0, true_sequential: 0]
terminate called after throwing an instance of 'c10::Error'
what(): The NVIDIA driver on your system is too old (found version 11080). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
Exception raised from device_count_impl at ../c10/cuda/CUDAFunctions.cpp:53 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f2c0dc6e38b in /app/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xbf (0x7f2c0dc68f3f in /app/lib/libc10.so)
frame #2: c10::cuda::device_count_ensure_non_zero() + 0x18c (0x7f2c0e0535dc in /app/lib/libc10_cuda.so)

guocuimi (Collaborator) commented Nov 29, 2023

Thank you for reporting this issue. You will need to upgrade your NVIDIA driver to version 525.* or newer: our image was built with PyTorch 2.* and CUDA 12.1, which requires a minimum driver version of 525.*.
Please note that the CUDA version reported by nvidia-smi (11.8 in your case) is not the problem here; the Docker image bundles its own CUDA runtime, so only the driver version matters. Upgrading your NVIDIA driver should resolve the issue.

FYI: here is the CUDA version and minimum required driver version.
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

CUDA version | Minimum Linux x86_64 driver version
CUDA 12.3.x  | >= 525.60.13
CUDA 12.2.x  | >= 525.60.13
CUDA 12.1.x  | >= 525.60.13
CUDA 12.0.x  | >= 525.60.13
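As a quick way to check whether an installed driver meets this minimum before pulling the image, something like the following sketch could be used. The `version_ge` helper is hypothetical (not part of ScaleLLM), and the script assumes `nvidia-smi` and GNU `sort -V` are available:

```shell
#!/bin/sh
# Sketch only: check the installed NVIDIA driver against the 525.60.13
# minimum required by CUDA 12.x builds of PyTorch.
version_ge() {
    # True if $1 >= $2 when both are compared as dotted version numbers.
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

MIN_DRIVER=525.60.13

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask the driver itself for its version, one line per GPU.
    DRIVER=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
    if version_ge "$DRIVER" "$MIN_DRIVER"; then
        echo "Driver $DRIVER is new enough for the CUDA 12.1 image."
    else
        echo "Driver $DRIVER is older than $MIN_DRIVER; upgrade it or use a CUDA 11.8 image."
    fi
else
    echo "nvidia-smi not found; is the NVIDIA driver installed?"
fi
```

With the reporter's driver 520.61.05, the comparison fails against 525.60.13, matching the error above.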

@guocuimi guocuimi self-assigned this Nov 29, 2023
@guocuimi guocuimi added the good first issue Good for newcomers label Nov 29, 2023
guocuimi (Collaborator) commented Dec 3, 2023

We are thrilled to share that ScaleLLM has expanded its compatibility to include both CUDA 11.8 and CUDA 12.1. I've just released a new version specifically for this purpose. You can check it out here: New Release for CUDA 11.8 Support.

Currently, the Docker image is being built, and you can track its progress here: Docker Image Build Progress. Once completed, the new image tailored for CUDA 11.8 will be available in this repository: Docker Repository for CUDA 11.8 Image.

To update your Docker image, follow this example command:

docker run -it --gpus=all --net=host \
  -v $HOME/.cache/huggingface/hub:/models \
  -e HF_MODEL_ID=TheBloke/Llama-2-7B-chat-AWQ \
  -e DEVICE=cuda:0 \
  docker.io/vectorchai/scalellm_cu118:latest --logtostderr

@guocuimi guocuimi closed this as completed Mar 3, 2024