How to do inference with an 8bit model #1462
Labels
solved
This problem has already been solved
Comments
Install transformers==4.33.2
Does this requirement mean exactly 4.33.2, or ...
And does LlamaFactory's code need to be updated?
Exactly equal.
The test result is:
Does this mean transformers can no longer be upgraded (for now)?
Yes.
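Since the fix above is an exact pin (installable with `pip install transformers==4.33.2`), a small sketch of checking the installed version at startup may help catch mismatches early. The helper name `check_transformers_pin` is illustrative and not part of LLaMA-Factory:

```python
# Hedged sketch: fail fast when transformers is not installed at the
# exact version required here. The helper name is illustrative only.
from importlib.metadata import PackageNotFoundError, version

REQUIRED = "4.33.2"

def check_transformers_pin(required: str = REQUIRED) -> bool:
    """Return True only when transformers is installed at exactly `required`."""
    try:
        return version("transformers") == required
    except PackageNotFoundError:
        # Package missing entirely also counts as a failed pin check.
        return False
```

A caller can then refuse to start (or print a warning) when `check_transformers_pin()` returns `False`, instead of failing later inside the Trainer.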
Hello, I want to quantize the codellama model to 8 bits and then use it for inference.
The parameters I used are as follows:
{
"stage": "sft",
"model_name_or_path": "/workspace/model/CodeLlama-34b-Instruct-hf",
"do_train": false,
"do_predict": true,
"dataset": "test_no_answer",
"template": "llama2",
"finetuning_type": "lora",
"quantization_bit": 8,
"bf16": true,
"lora_target": "all",
"output_dir": "/workspace/34b_baseline",
"cutoff_len": 16384,
"per_device_eval_batch_size": 1,
"max_samples": 100,
"predict_with_generate": true
}
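For reference, the same argument dict can also be passed on the command line, assuming the `src/train_bash.py` entry point that LLaMA-Factory shipped at the time of this issue (every flag below mirrors a key from the JSON above; adjust paths to your checkout):

```shell
# Hedged sketch of the equivalent CLI invocation; entry-point path is an
# assumption about the LLaMA-Factory layout at this commit.
python src/train_bash.py \
    --stage sft \
    --model_name_or_path /workspace/model/CodeLlama-34b-Instruct-hf \
    --do_predict \
    --dataset test_no_answer \
    --template llama2 \
    --finetuning_type lora \
    --quantization_bit 8 \
    --bf16 \
    --lora_target all \
    --output_dir /workspace/34b_baseline \
    --cutoff_len 16384 \
    --per_device_eval_batch_size 1 \
    --max_samples 100 \
    --predict_with_generate
```

`do_train: false` is expressed simply by omitting `--do_train`.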
I ran into this error:
File "/workspace/task_entry.py", line 30, in training_task
run_exp(args_dict)
File "/workspace/src/llmtuner/tuner/tune.py", line 65, in run_exp
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/workspace/src/llmtuner/tuner/sft/workflow.py", line 49, in run_sft
trainer = CustomSeq2SeqTrainer(
File "/usr/local/lib/python3.9/dist-packages/transformers/trainer_seq2seq.py", line 56, in __init__
super().__init__(
File "/usr/local/lib/python3.9/dist-packages/transformers/trainer.py", line 412, in __init__
raise ValueError(
Message:
User error.
How should I resolve this?
Thanks!