Running vllm (0.4.3) with the latest code raises an error: undefined symbol: _ZN2at4_ops5zeros4ca... #4264

Closed
1 task done
camposs1979 opened this issue Jun 13, 2024 · 1 comment
Labels
solved This problem has been already solved

Comments

@camposs1979

Reminder

  • I have read the README and searched the existing issues.

System Info

(base) root@I1a8f02256d00501f7a:/hy-tmp/LLaMA-Factory-main/src# python3.10 -m pip list
Package Version
------- -------
accelerate 0.30.1
addict 2.4.0
aiofiles 23.2.1
aiohttp 3.9.3
aiosignal 1.3.1
aliyun-python-sdk-core 2.15.0
aliyun-python-sdk-kms 2.16.2
altair 5.2.0
annotated-types 0.6.0
anyio 4.3.0
async-timeout 4.0.3
attrs 23.2.0
auto_gptq 0.7.1
bitsandbytes 0.43.0
certifi 2019.11.28
cffi 1.16.0
chardet 3.0.4
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
cmake 3.29.2
coloredlogs 15.0.1
contourpy 1.2.0
crcmod 1.7
cryptography 42.0.5
cupy-cuda12x 12.1.0
cycler 0.12.1
datasets 2.18.0
dbus-python 1.2.16
deepspeed 0.14.0
dill 0.3.8
diskcache 5.6.3
distro 1.9.0
distro-info 0.23ubuntu1
docstring_parser 0.16
einops 0.7.0
exceptiongroup 1.2.0
fastapi 0.110.0
fastrlock 0.8.2
ffmpy 0.3.2
filelock 3.13.3
fire 0.6.0
flash-attn 2.5.9.post1
fonttools 4.50.0
frozenlist 1.4.1
fsspec 2024.2.0
galore-torch 1.0
gast 0.5.4
gekko 1.0.7
gradio 4.29.0
gradio_client 0.16.1
h11 0.14.0
hjson 3.1.0
httpcore 1.0.4
httptools 0.6.1
httpx 0.27.0
huggingface-hub 0.23.3
humanfriendly 10.0
idna 2.8
importlib_metadata 7.1.0
importlib_resources 6.4.0
interegular 0.3.3
Jinja2 3.1.3
jmespath 0.10.0
joblib 1.3.2
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lark 1.1.9
llvmlite 0.42.0
lm-format-enforcer 0.10.1
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.8.3
mdurl 0.1.2
modelscope 1.13.3
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
multiprocess 0.70.16
nest-asyncio 1.6.0
networkx 3.2.1
ninja 1.11.1.1
numba 0.59.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-ml-py 12.555.43
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.4.99
nvidia-nvtx-cu12 12.1.105
openai 1.34.0
optimum 1.16.0
orjson 3.9.15
oss2 2.18.4
outlines 0.0.34
packaging 24.0
pandas 2.2.1
peft 0.11.1
pillow 10.2.0
pip 24.0
platformdirs 4.2.0
prometheus_client 0.20.0
prometheus-fastapi-instrumentator 7.0.0
protobuf 3.20.3
psutil 5.9.8
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyarrow-hotfix 0.6
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.4
pydantic_core 2.16.3
pydub 0.25.1
Pygments 2.17.2
PyGObject 3.36.0
pynvml 11.5.0
pyparsing 3.1.2
python-apt 2.0.1+ubuntu0.20.4.1
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
ray 2.10.0
referencing 0.34.0
regex 2023.12.25
requests 2.31.0
requests-unixsocket 0.2.0
rich 13.7.1
rouge 1.0.1
rpds-py 0.18.0
ruff 0.4.6
safetensors 0.4.2
scipy 1.12.0
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 69.2.0
shellingham 1.5.4
shtab 1.7.1
simplejson 3.19.2
six 1.14.0
sniffio 1.3.1
sortedcontainers 2.4.0
sse-starlette 2.0.0
ssh-import-id 5.10
starlette 0.36.3
sympy 1.12
termcolor 2.4.0
tiktoken 0.6.0
tokenizers 0.19.1
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.3.0+cu121
tqdm 4.66.2
transformers 4.41.2
triton 2.3.0
trl 0.8.6
typer 0.12.3
typing_extensions 4.10.0
tyro 0.7.3
tzdata 2024.1
unattended-upgrades 0.1
unsloth 2024.5
urllib3 2.2.1
uvicorn 0.29.0
uvloop 0.19.0
vllm 0.4.3
vllm-flash-attn 2.5.8.post2
watchfiles 0.21.0
websockets 11.0.3
wheel 0.43.0
xformers 0.0.26.post1
xxhash 3.4.1
yapf 0.40.2
yarl 1.9.4
zipp 3.18.1

Reproduction

The script is configured as follows:
#!/bin/bash

CUDA_VISIBLE_DEVICES=0,1,2,3 python3.10 webui.py \
    --model_name_or_path ../model/qwen/Qwen2-72B-Instruct-GPTQ-Int4 \
    --template qwen \
    --use_fast_tokenizer True \
    --repetition_penalty 1.03 \
    --infer_backend vllm \
    --cutoff_len 3072 \
    --vllm_maxlen 3072

The system reports the following error:
(base) root@I1a8f02256d00501f7a:/hy-tmp/LLaMA-Factory-main/src# export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:64
(base) root@I1a8f02256d00501f7a:/hy-tmp/LLaMA-Factory-main/src# bash web_demo.sh
Traceback (most recent call last):
File "/hy-tmp/LLaMA-Factory-main/src/webui-old.py", line 3, in
from llamafactory.webui.interface import create_ui
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/init.py", line 3, in
from .cli import VERSION
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/cli.py", line 7, in
from . import launcher
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/launcher.py", line 1, in
from llamafactory.train.tuner import run_exp
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/train/tuner.py", line 10, in
from ..model import load_model, load_tokenizer
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/model/init.py", line 1, in
from .loader import load_config, load_model, load_tokenizer
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/model/loader.py", line 13, in
from .patcher import patch_config, patch_model, patch_tokenizer, patch_valuehead_model
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/model/patcher.py", line 16, in
from .model_utils.longlora import configure_longlora
File "/hy-tmp/LLaMA-Factory-main/src/llamafactory/model/model_utils/longlora.py", line 6, in
from transformers.models.llama.modeling_llama import (
File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 54, in
from flash_attn import flash_attn_func, flash_attn_varlen_func
File "/usr/local/lib/python3.10/dist-packages/flash_attn/init.py", line 3, in
from flash_attn.flash_attn_interface import (
File "/usr/local/lib/python3.10/dist-packages/flash_attn/flash_attn_interface.py", line 10, in
import flash_attn_2_cuda as flash_attn_cuda
_ImportError: /usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
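For context: an undefined-symbol failure in flash_attn_2_cuda typically means the installed flash-attn wheel was compiled against a different libtorch ABI than the torch build currently in the environment. A minimal diagnostic sketch (an assumption-laden suggestion, not part of the original report; it presumes python3.10 and binutils' c++filt are on PATH):

# Sketch: print the installed torch build, then demangle the missing symbol
# to show it is a libtorch operator, i.e. a torch/flash-attn ABI mismatch.
python3.10 -c "import torch; print(torch.__version__)"   # expected: 2.3.0+cu121
echo '_ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE' | c++filt
# c++filt renders this as at::_ops::zeros::call(...), a symbol exported by
# libtorch, so the extension was built against a different torch version.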

Expected behavior

I have already consulted issues #4242 (pulled the latest code) and #4145 (updated vllm to 0.4.3 and torch to 2.3.0+cu121), but neither resolved the problem.
I would like to be able to run Qwen2-72B-Instruct.

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Jun 13, 2024
@hiyouga
Owner

hiyouga commented Jun 13, 2024

pip uninstall flash_attn
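
If flash-attn is still needed afterwards (for vLLM inference it is not; the environment above already has the bundled vllm-flash-attn 2.5.8.post2), one possible follow-up, sketched here under the assumption that a source build against the installed torch 2.3.0+cu121 is acceptable:

# Remove the ABI-mismatched build first, as suggested above.
pip uninstall -y flash_attn
# Optionally rebuild against the torch already in the environment so the ABI
# matches; --no-build-isolation makes pip compile against that torch instead
# of pulling a different one into an isolated build environment.
pip install flash-attn --no-build-isolation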

@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jun 13, 2024
@hiyouga hiyouga closed this as completed Jun 13, 2024