
Why does vLLM here limit the input length to 2048 tokens (Qwen natively supports 32k tokens), and why is there no parameter to cap GPU memory? #2782

Closed
sunjunlishi opened this issue Mar 11, 2024 · 5 comments
Labels
solved This problem has been already solved

Comments

@sunjunlishi

Reminder

  • I have read the README and searched the existing issues.

Reproduction

python src/web_demo.py \
    --model_name_or_path ../../../workspace/Llama/Qwen-14B-Chat-Int4 \
    --template qwen \
    --infer_backend vllm --enforce_eager

Expected behavior

(screenshot of the error attached in the original issue)

System Info

ValueError: Some specified arguments are not used by the HfArgumentParser: ['--enforce_eager']
Issues when using vLLM (it does speed up inference): Problem 1, there is no parameter to adapt or cap GPU memory usage. Problem 2, the max_seq_len parameter is fixed at 2048 tokens and cannot be changed either.

Others

No response

@sunjunlishi changed the title on Mar 11, 2024 to fix a typo in the Chinese title (千文 → 千问, i.e. Qwen)
@hiyouga
Owner

hiyouga commented Mar 11, 2024

Use the --vllm_maxlen argument to change it.
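
For example, the reproduction command above could be rerun as follows (a sketch: 32768 assumes Qwen's 32k context window, and larger values need correspondingly more KV-cache memory):

python src/web_demo.py \
    --model_name_or_path ../../../workspace/Llama/Qwen-14B-Chat-Int4 \
    --template qwen \
    --infer_backend vllm \
    --vllm_maxlen 32768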

@KelleyYin

"Use the --vllm_maxlen argument to change it."

Does vLLM support dynamic context extension methods such as NTK scaling?

@sunjunlishi
Author

sunjunlishi commented Mar 11, 2024

@hiyouga That argument works, and editing the code works too. As for the GPU memory limit, I changed vllm_engine.py directly:
require_version("vllm>=0.3.3", "To fix: pip install vllm>=0.3.3")
self.can_generate = finetuning_args.stage == "sft"
engine_args = AsyncEngineArgs(
    model=model_args.model_name_or_path,
    trust_remote_code=True,
    max_model_len=model_args.vllm_maxlen,
    tensor_parallel_size=get_device_count(),
    disable_log_stats=True,
    disable_log_requests=True,
    enforce_eager=True,           # added: skip CUDA graph capture to save memory
    gpu_memory_utilization=0.95,  # added: cap the fraction of GPU memory vLLM reserves
)

I added enforce_eager=True and gpu_memory_utilization=0.95.
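
For reference, the same two options can be tried against vLLM directly, outside of LLaMA-Factory (a minimal sketch assuming vllm>=0.3.3 and the model path from the reproduction above):

from vllm import LLM, SamplingParams

# enforce_eager=True disables CUDA graph capture, trading some speed for lower memory use;
# gpu_memory_utilization caps the fraction of GPU memory vLLM pre-allocates
# for weights and the KV cache.
llm = LLM(
    model="../../../workspace/Llama/Qwen-14B-Chat-Int4",
    trust_remote_code=True,
    max_model_len=32768,
    enforce_eager=True,
    gpu_memory_utilization=0.95,
)
print(llm.generate(["你好"], SamplingParams(max_tokens=16)))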

@hiyouga hiyouga added the solved This problem has been already solved label Mar 11, 2024
hiyouga added a commit that referenced this issue Mar 12, 2024
@hiyouga
Owner

hiyouga commented Mar 12, 2024

These two arguments have now been added.

tybalex pushed a commit to sanjay920/LLaMA-Factory that referenced this issue Mar 15, 2024
@yecphaha

"These two arguments have now been added."

How do I pass these two arguments when using vLLM?
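
Presumably via command-line flags mirroring the engine arguments. A sketch (the names --vllm_enforce_eager and --vllm_gpu_util are assumptions following the existing vllm_ prefix; check src/llmtuner/hparams/model_args.py in the commit referenced above for the exact names):

python src/web_demo.py \
    --model_name_or_path ../../../workspace/Llama/Qwen-14B-Chat-Int4 \
    --template qwen \
    --infer_backend vllm \
    --vllm_maxlen 32768 \
    --vllm_enforce_eager \
    --vllm_gpu_util 0.95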
