Merging the LoRA adapter and quantizing with AutoGPTQ: inference with the quantized model fails with ValueError: torch.bfloat16 is not supported for quantization method gptq. Supported dtypes: [torch.float16] #4677
Labels
solved
Reminder
System Info
1
Reproduction
1
Expected behavior
Merge the LoRA adapter with the model
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
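For context, the merge config referenced above (examples/merge_lora/llama3_lora_sft.yaml) typically looks roughly like the sketch below. This is a hedged sketch, not the exact file from this report: key names may differ across LLaMA-Factory versions, and the Qwen2-specific values shown in comments are assumptions.

```yaml
### model (sketch of a merge_lora config; exact keys may vary by version)
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct   # for Qwen2, e.g. Qwen/Qwen2-7B-Instruct (assumed)
adapter_name_or_path: saves/llama3-8b/lora/sft             # LoRA checkpoint produced by fine-tuning (assumed path)
template: llama3                                           # use the qwen template for a Qwen2 model
finetuning_type: lora

### export
export_dir: models/llama3_lora_sft    # merged full-precision model is written here
export_size: 2
export_device: cpu
export_legacy_format: false
```

The merged model written to export_dir is what the GPTQ step below consumes.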
Quantize the model with AutoGPTQ
llamafactory-cli export examples/merge_lora/llama3_gptq.yaml
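The GPTQ export config (examples/merge_lora/llama3_gptq.yaml) then points at the merged model and adds the quantization settings. Again a hedged sketch: the bit width and calibration dataset shown are typical example values, not confirmed from this report.

```yaml
### model
model_name_or_path: models/llama3_lora_sft       # merged model from the previous step (assumed path)
template: llama3                                  # qwen for a Qwen2 model

### export
export_dir: models/llama3_gptq
export_quantization_bit: 4                        # GPTQ bit width
export_quantization_dataset: data/c4_demo.json    # calibration dataset used by AutoGPTQ
export_size: 2
export_device: cpu
export_legacy_format: false
```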
I ran these two official commands to merge the adapter into my fine-tuned Qwen2 model and then quantize it. The quantized model's config.json comes out at over 200,000 lines, and running inference with the old api_demo.py then fails with the following error:
ValueError: torch.bfloat16 is not supported for quantization method gptq. Supported dtypes: [torch.float16]
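The error indicates that a GPTQ-quantized checkpoint can only be loaded in torch.float16, while the old api_demo.py is loading it in torch.bfloat16. A minimal workaround sketch follows, assuming the installed LLaMA-Factory version exposes the infer_dtype model argument (this is an assumption; if it does not, loading the model with torch_dtype=torch.float16 directly via transformers achieves the same thing). The file name is hypothetical.

```yaml
# qwen2_gptq_infer.yaml (hypothetical name), e.g. for `llamafactory-cli api qwen2_gptq_infer.yaml`
model_name_or_path: models/llama3_gptq   # path to the GPTQ-quantized model (assumed)
template: llama3                          # qwen for a Qwen2 model
infer_dtype: float16                      # force fp16 so the GPTQ checkpoint loads (assumed argument name)
```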
Others
Please help resolve this issue.