
Merging the LoRA adapter and quantizing the model with AutoGPTQ: inference with the quantized model fails with ValueError: torch.bfloat16 is not supported for quantization method gptq. Supported dtypes: [torch.float16] #4677

Closed
1 task done
caijx168 opened this issue Jul 4, 2024 · 1 comment
Labels
solved This problem has been already solved

Comments

@caijx168
caijx168 commented Jul 4, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

1

Reproduction

1

Expected behavior

Merge the LoRA adapter and quantize the model:
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
Quantize the model with AutoGPTQ:
llamafactory-cli export examples/merge_lora/llama3_gptq.yaml
I used these two official commands to merge the adapter into my fine-tuned qwen2 model and quantize it. The quantized model's config.json has more than 200,000 lines. When I then run inference with the old api_demo.py, it fails with the following error:
ValueError: torch.bfloat16 is not supported for quantization method gptq. Supported dtypes: [torch.float16]
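A minimal workaround sketch, not necessarily the code path api_demo.py takes: the error message says the GPTQ backend only supports torch.float16, so loading the exported model with an explicit float16 dtype (instead of the bfloat16 recorded in its config.json) should avoid the ValueError. The path models/qwen2_gptq below is a hypothetical stand-in for the export_dir produced by the export command.

```python
# Hedged sketch: load the GPTQ-quantized export in float16 with plain transformers.
# "models/qwen2_gptq" is a hypothetical export directory, not a path from the issue.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "models/qwen2_gptq"  # hypothetical export_dir

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # GPTQ only supports float16, per the error message
    device_map="auto",
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```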

Others

Please help resolve this issue.

@github-actions github-actions bot added the pending This problem is yet to be addressed label Jul 4, 2024
@hiyouga
Owner

hiyouga commented Jul 4, 2024

Update the code.
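For readers who cannot update LLaMA-Factory right away, a hedged stop-gap sketch (this is an assumption, not the maintainer's fix): rewrite the exported config.json so that its recorded torch_dtype is float16 rather than bfloat16, which is what the error message complains about. The path below is hypothetical.

```python
# Hedged stop-gap sketch: patch the exported config.json to record float16.
# "models/qwen2_gptq/config.json" is a hypothetical path, not one from the issue.
import json

config_path = "models/qwen2_gptq/config.json"  # hypothetical export path

with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)

config["torch_dtype"] = "float16"  # GPTQ inference only supports float16

with open(config_path, "w", encoding="utf-8") as f:
    json.dump(config, f, ensure_ascii=False, indent=2)
```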

@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jul 4, 2024
@hiyouga hiyouga closed this as completed in 1e27e8c Jul 4, 2024
xtchen96 pushed a commit to xtchen96/LLaMA-Factory that referenced this issue Jul 17, 2024