Running vllm (0.4.3) with the latest code raises an error: undefined symbol: _ZN2at4_ops5zeros4ca... #4264

Closed
1 task done
camposs1979 opened this issue Jun 13, 2024 · 1 comment
Labels
solved This problem has been already solved

Comments

@camposs1979

Reminder

  • I have read the README and searched the existing issues.

System Info

(base) root@I1a8f02256d00501f7a:/hy-tmp/LLaMA-Factory-main/src# python3.10 -m pip list
Package Version
------- -------
accelerate 0.30.1
addict 2.4.0
aiofiles 23.2.1
aiohttp 3.9.3
aiosignal 1.3.1
aliyun-python-sdk-core 2.15.0
aliyun-python-sdk-kms 2.16.2
altair 5.2.0
annotated-types 0.6.0
anyio 4.3.0
async-timeout 4.0.3
attrs 23.2.0
auto_gptq 0.7.1
bitsandbytes 0.43.0
certifi 2019.11.28
cffi 1.16.0
chardet 3.0.4
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
cmake 3.29.2
coloredlogs 15.0.1
contourpy 1.2.0
crcmod 1.7
cryptography 42.0.5
cupy-cuda12x 12.1.0
cycler 0.12.1
datasets 2.18.0
dbus-python 1.2.16
deepspeed 0.14.0
dill 0.3.8
diskcache 5.6.3
distro 1.9.0
distro-info 0.23ubuntu1
docstring_parser 0.16
einops 0.7.0
exceptiongroup 1.2.0
fastapi 0.110.0
fastrlock 0.8.2
ffmpy 0.3.2
filelock 3.13.3
fire 0.6.0
flash-attn 2.5.9.post1
fonttools 4.50.0
frozenlist 1.4.1
fsspec 2024.2.0
galore-torch 1.0
gast 0.5.4
gekko 1.0.7
gradio 4.29.0
gradio_client 0.16.1
h11 0.14.0
hjson 3.1.0
httpcore 1.0.4
httptools 0.6.1
httpx 0.27.0
huggingface-hub 0.23.3
humanfriendly 10.0
idna 2.8
importlib_metadata 7.1.0
importlib_resources 6.4.0
interegular 0.3.3
Jinja2 3.1.3
jmespath 0.10.0
joblib 1.3.2
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lark 1.1.9
llvmlite 0.42.0
lm-format-enforcer 0.10.1
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.8.3
mdurl 0.1.2
modelscope 1.13.3
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
multiprocess 0.70.16
nest-asyncio 1.6.0
networkx 3.2.1
ninja 1.11.1.1
numba 0.59.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-ml-py 12.555.43
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.4.99
nvidia-nvtx-cu12 12.1.105
openai 1.34.0
optimum 1.16.0
orjson 3.9.15
oss2 2.18.4
outlines 0.0.34
packaging 24.0
pandas 2.2.1
peft 0.11.1
pillow 10.2.0
pip 24.0
platformdirs 4.2.0
prometheus_client 0.20.0
prometheus-fastapi-instrumentator 7.0.0
protobuf 3.20.3
psutil 5.9.8
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyarrow-hotfix 0.6
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.4
pydantic_core 2.16.3
pydub 0.25.1
Pygments 2.17.2
PyGObject 3.36.0
pynvml 11.5.0
pyparsing 3.1.2
python-apt 2.0.1+ubuntu0.20.4.1
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
ray 2.10.0
referencing 0.34.0
regex 2023.12.25
requests 2.31.0
requests-unixsocket 0.2.0
rich 13.7.1
rouge 1.0.1
rpds-py 0.18.0
ruff 0.4.6
safetensors 0.4.2
scipy 1.12.0
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 69.2.0
shellingham 1.5.4
shtab 1.7.1
simplejson 3.19.2
six 1.14.0
sniffio 1.3.1
sortedcontainers 2.4.0
sse-starlette 2.0.0
ssh-import-id 5.10
starlette 0.36.3
sympy 1.12
termcolor 2.4.0
tiktoken 0.6.0
tokenizers 0.19.1
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.3.0+cu121
tqdm 4.66.2
transformers 4.41.2
triton 2.3.0
trl 0.8.6
typer 0.12.3
typing_extensions 4.10.0
tyro 0.7.3
tzdata 2024.1
unattended-upgrades 0.1
unsloth 2024.5
urllib3 2.2.1
uvicorn 0.29.0
uvloop 0.19.0
vllm 0.4.3
vllm-flash-attn 2.5.8.post2
watchfiles 0.21.0
websockets 11.0.3
wheel 0.43.0
xformers 0.0.26.post1
xxhash 3.4.1
yapf 0.40.2
yarl 1.9.4
zipp 3.18.1

Reproduction

The script is configured as follows:
#!/bin/bash

CUDA_VISIBLE_DEVICES=0,1,2,3 python3.10 webui.py \
    --model_name_or_path ../model/qwen/Qwen2-72B-Instruct-GPTQ-Int4 \
    --template qwen \
    --use_fast_tokenizer True \
    --repetition_penalty 1.03 \
    --infer_backend vllm \
    --cutoff_len 3072 \
    --vllm_maxlen 3072

The system reports the following error:
(base) root@I1a8f02256d00501f7a:/hy-tmp/LLaMA-Factory-main/src# export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:64
(base) root@I1a8f02256d00501f7a:/hy-tmp/LLaMA-Factory-main/src# bash web_demo.sh
Traceback (most recent call last):
File "/hy-tmp/LLaMA-Factory-main/src/webui-old.py", line 3, in
from llamafactory.webui.interface import create_ui
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/init.py", line 3, in
from .cli import VERSION
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/cli.py", line 7, in
from . import launcher
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/launcher.py", line 1, in
from llamafactory.train.tuner import run_exp
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/train/tuner.py", line 10, in
from ..model import load_model, load_tokenizer
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/model/init.py", line 1, in
from .loader import load_config, load_model, load_tokenizer
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/model/loader.py", line 13, in
from .patcher import patch_config, patch_model, patch_tokenizer, patch_valuehead_model
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/model/patcher.py", line 16, in
from .model_utils.longlora import configure_longlora
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/model/model_utils/longlora.py", line 6, in
from transformers.models.llama.modeling_llama import (
File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 54, in
from flash_attn import flash_attn_func, flash_attn_varlen_func
File "/usr/local/lib/python3.10/dist-packages/flash_attn/init.py", line 3, in
from flash_attn.flash_attn_interface import (
File "/usr/local/lib/python3.10/dist-packages/flash_attn/flash_attn_interface.py", line 10, in
import flash_attn_2_cuda as flash_attn_cuda
_ImportError: /usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
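For context: an undefined-symbol failure in flash_attn_2_cuda typically means the installed flash-attn wheel was compiled against a different libtorch ABI than the torch build currently in the environment. A minimal diagnostic sketch (an assumption-laden suggestion, not part of the original report; it presumes python3.10 and binutils' c++filt are on PATH):

# Sketch: print the installed torch build, then demangle the missing symbol
# to show it is a libtorch operator, i.e. a torch/flash-attn ABI mismatch.
python3.10 -c "import torch; print(torch.__version__)"   # expected: 2.3.0+cu121
echo '_ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE' | c++filt
# c++filt renders this as at::_ops::zeros::call(...), a symbol exported by
# libtorch, so the extension was built against a different torch version.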

Expected behavior

I have already consulted issues #4242 (pulled the latest code) and #4145 (updated vllm to 0.4.3 and torch to 2.3.0+cu121), but neither resolved the problem.
I would like to be able to run Qwen2-72B-Instruct.

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Jun 13, 2024
@hiyouga
Owner

hiyouga commented Jun 13, 2024

pip uninstall flash_attn
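
If flash-attn is still needed afterwards (for vLLM inference it is not; the environment above already has the bundled vllm-flash-attn 2.5.8.post2), one possible follow-up, sketched here under the assumption that a source build against the installed torch 2.3.0+cu121 is acceptable:

# Remove the ABI-mismatched build first, as suggested above.
pip uninstall -y flash_attn
# Optionally rebuild against the torch already in the environment so the ABI
# matches; --no-build-isolation makes pip compile against that torch instead
# of pulling a different one into an isolated build environment.
pip install flash-attn --no-build-isolation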

@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jun 13, 2024
@hiyouga hiyouga closed this as completed Jun 13, 2024