Is there an int4-quantized version, and what GPU configuration does int4-quantized inference require? #244
Comments
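Thanks. That would require 7 A100 GPUs, then, which is quite expensive.
GPTQ results may depend on the calibration dataset, so I feel AWQ might be the better choice?
NVIDIA's newly announced Project DIGITS has 128 GB of unified VRAM; 4 units would be enough to deploy it, at $3,000 per unit.
OK, thanks.
Would the requirements be somewhat lower with a quantized version of v2?
@YMMF007 How is the inference speed with this configuration? Roughly how long does each token take?
I don't have enough GPUs on my side, so I didn't run it.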
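For anyone who wants to sanity-check device counts like the ones mentioned above, here is a rough back-of-the-envelope sketch. It only counts weight memory; the parameter count, the overhead factor, and the per-device memory are all placeholder assumptions (the thread does not state the model size or which A100 variant is meant), so plug in your own numbers.

```python
import math


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def devices_needed(num_params: float, bits_per_weight: float,
                   device_mem_gb: float, overhead: float = 1.2) -> int:
    """Rough device count: weight memory times an assumed overhead factor
    (KV cache, activations, quantization scales), divided by per-device memory."""
    total_gb = weight_memory_gb(num_params, bits_per_weight) * overhead
    return math.ceil(total_gb / device_mem_gb)


if __name__ == "__main__":
    # Hypothetical 100B-parameter model; NOT the model discussed in this issue.
    params = 100e9
    print(weight_memory_gb(params, 4))       # int4 weights only
    print(devices_needed(params, 4, 80))     # assumed 80 GB devices
    print(devices_needed(params, 16, 80))    # same model at 16-bit for comparison
```

This ignores real-world factors such as tensor-parallel layout constraints and framework overhead, so treat the result as a lower bound rather than a deployment plan.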