FlashAttn error when launching from docker image #4779

Closed
hzhaoy opened this issue Jul 11, 2024 · 0 comments · Fixed by #4781
Labels
solved This problem has been already solved

Comments

@hzhaoy
Contributor

hzhaoy commented Jul 11, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

System: Ubuntu 20.04.2 LTS
GPU: NVIDIA A100-SXM4-80GB
Docker: 24.0.0
Docker Compose: v2.17.3
llamafactory: 0.8.3.dev0

Reproduction

Dockerfile: https://github.com/hiyouga/LLaMA-Factory/blob/67040f149c0b3fbae443ba656ed0dcab0ebaf730/docker/docker-cuda/Dockerfile

Build Command:

docker build -f ./Dockerfile \
    --build-arg INSTALL_BNB=true \
    --build-arg INSTALL_VLLM=true \
    --build-arg INSTALL_DEEPSPEED=true \
    --build-arg INSTALL_FLASHATTN=true \
    --build-arg PIP_INDEX=https://pypi.tuna.tsinghua.edu.cn/simple \
    -t llamafactory:latest .
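
A quick sanity check (not in the original report): the import error below usually points at an ABI mismatch between flash-attn and torch, so it can help to list the versions baked into the freshly built image. This is a minimal sketch assuming bash and pip are present in the image, which the linked Dockerfile provides:

# List the torch and flash-attn packages installed in the built image
docker run --rm llamafactory:latest bash -c 'pip list | grep -Ei "torch|flash"'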

Launch Command:

docker run -dit --gpus=all \
    -v ./hf_cache:/root/.cache/huggingface \
    -v ./ms_cache:/root/.cache/modelscope \
    -v ./data:/app/data \
    -v ./output:/app/output \
    -p 7860:7860 \
    -p 8000:8000 \
    --shm-size 16G \
    --name llamafactory \
    llamafactory:latest

docker exec -it llamafactory bash

llamafactory-cli webui
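
The failure can also be reproduced without the web UI by triggering the import directly inside the running container (a diagnostic step, not part of the original report):

# Importing flash_attn alone raises the same ImportError if the extension is mismatched
docker exec -it llamafactory python -c "import flash_attn"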

Error:

Traceback (most recent call last):
  File "/usr/local/bin/llamafactory-cli", line 5, in <module>
    from llamafactory.cli import main
  File "/app/src/llamafactory/__init__.py", line 17, in <module>
    from .cli import VERSION
  File "/app/src/llamafactory/cli.py", line 22, in <module>
    from . import launcher
  File "/app/src/llamafactory/launcher.py", line 15, in <module>
    from llamafactory.train.tuner import run_exp
  File "/app/src/llamafactory/train/tuner.py", line 26, in <module>
    from ..model import load_model, load_tokenizer
  File "/app/src/llamafactory/model/__init__.py", line 15, in <module>
    from .loader import load_config, load_model, load_tokenizer
  File "/app/src/llamafactory/model/loader.py", line 28, in <module>
    from .patcher import patch_config, patch_model, patch_tokenizer, patch_valuehead_model
  File "/app/src/llamafactory/model/patcher.py", line 30, in <module>
    from .model_utils.longlora import configure_longlora
  File "/app/src/llamafactory/model/model_utils/longlora.py", line 25, in <module>
    from transformers.models.llama.modeling_llama import (
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 53, in <module>
    from flash_attn import flash_attn_func, flash_attn_varlen_func
  File "/usr/local/lib/python3.10/dist-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/usr/local/lib/python3.10/dist-packages/flash_attn/flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: /usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
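
The missing symbol _ZN3c104cuda9SetDeviceEi demangles to c10::cuda::SetDevice(int), which is exported by PyTorch's libc10_cuda library. An undefined-symbol error at import time therefore typically means the installed flash-attn binary was compiled against a different torch build than the one in the image. A common workaround, sketched here as one option and not necessarily the fix applied in #4781, is to rebuild flash-attn against the torch that is actually installed:

# Remove the mismatched binary, then compile against the installed torch
pip uninstall -y flash-attn
pip install flash-attn --no-build-isolation

Compiling from source can take a long time; installing a prebuilt wheel that matches the image's torch and CUDA versions from the flash-attention GitHub releases page is a faster alternative.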

Expected behavior

llamafactory-cli webui starts successfully inside the container.

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Jul 11, 2024
@hzhaoy hzhaoy mentioned this issue Jul 11, 2024
@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jul 13, 2024
xtchen96 pushed a commit to xtchen96/LLaMA-Factory that referenced this issue Jul 17, 2024