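System Info
The setup runs on an internal network, so for now I can only provide screenshots.
Reproduction
I checked the GPU memory figures published on the official Qwen2 site, and the model does need more than 40 GB. Does the AWQ-quantized version not support multi-GPU inference?
The machine has four A100 40G GPUs, and the container was started with only two of them assigned. At runtime only one of the two is used (the program terminates immediately because it runs out of GPU memory), and the other GPU is never touched.
Expected behavior
I would like a parameter that lets the AWQ-quantized model, or the regular model, run inference successfully. I have searched the issues and the documentation and could not find how to specify multi-GPU inference.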
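For readers with the same question, here is a minimal sketch of one common way to spread a model over several GPUs. It assumes the model is loaded through Hugging Face `transformers` with `accelerate` installed, which the issue does not confirm, and it is not necessarily what the fix in this project does. The helper names `build_multi_gpu_kwargs` and `load_sharded`, the GPU indices, and the per-GPU memory budget are all hypothetical and chosen only for illustration.

```python
import os


def build_multi_gpu_kwargs(gpu_ids, per_gpu_budget_gib):
    """Build keyword arguments asking transformers/accelerate to shard a model.

    gpu_ids: physical GPU indices to expose to this process, e.g. [0, 1].
    per_gpu_budget_gib: memory cap per GPU in GiB (a user-chosen, hypothetical value).
    """
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    # Restrict the process to the chosen GPUs. This only takes effect if it is
    # set before CUDA is initialised in the process.
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    # Inside the process, the visible GPUs are renumbered 0..n-1.
    max_memory = {i: f"{per_gpu_budget_gib}GiB" for i in range(len(gpu_ids))}
    # device_map="auto" lets accelerate place layers across the visible GPUs.
    return {"device_map": "auto", "max_memory": max_memory}


def load_sharded(model_path, gpu_ids, per_gpu_budget_gib):
    """Load a causal LM split across the given GPUs (assumes transformers + accelerate)."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import AutoModelForCausalLM

    kwargs = build_multi_gpu_kwargs(gpu_ids, per_gpu_budget_gib)
    return AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", **kwargs)
```

With two 40G cards assigned to the container, a call such as `load_sharded(model_path, [0, 1], 36)` would cap each card slightly below its capacity to leave headroom for activations. Whether an AWQ checkpoint shards cleanly this way depends on the installed quantization backend, so treat this as a starting point, not a guaranteed fix.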
Others
No response
e2665e7
fixed