
Why does fine-tuning Qwen2 default to multi-GPU? I am clearly training on a single GPU, and I also tried single-GPU in the web UI, but it still defaults to multi-GPU #4137

Closed
1 task done
yxl23 opened this issue Jun 7, 2024 · 7 comments
Labels
solved This problem has been already solved

Comments

@yxl23

yxl23 commented Jun 7, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

[screenshot of system info attached]

Reproduction

llamafactory-cli train examples/lora_single_gpu/llama3_lora_sft.yaml

Expected behavior

No response

Others

No response

@frozenarctic

frozenarctic commented Jun 7, 2024

On line 80 of cli.py, change the condition to
if (not disable_torchrun) and (get_device_count() > 1):
then reinstall from the repository root:
pip install -e '.[torch,metrics]'
After that it defaults to running on a single local GPU.
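
For context, a minimal sketch of the launch decision this condition controls (the if-condition is the one quoted above; the function, argument names, and the torchrun/python fallback are illustrative assumptions, not the actual contents of cli.py):

import subprocess
import sys

def launch(train_cmd: list[str], disable_torchrun: bool, device_count: int) -> None:
    # Use torchrun (distributed, multi-GPU) only when it is not disabled
    # AND more than one device is visible; otherwise start a plain
    # single-process, single-GPU run.
    if (not disable_torchrun) and (device_count > 1):
        subprocess.run(
            ["torchrun", "--nproc_per_node", str(device_count), *train_cmd],
            check=True,
        )
    else:
        subprocess.run([sys.executable, *train_cmd], check=True)

With only one visible GPU, device_count > 1 is false and the single-process branch is taken, which matches the behavior the reply describes after reinstalling.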

@hiyouga hiyouga added the solved This problem has been already solved label Jun 7, 2024
@hiyouga hiyouga closed this as completed in 8bf9da6 Jun 7, 2024
@hiyouga
Owner

hiyouga commented Jun 7, 2024

Fixed.

@yxl23
Author

yxl23 commented Jun 8, 2024

Running llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_bitsandbytes.yaml for 8-bit quantized fine-tuning of Qwen2, I get:
{'loss': 0.0, 'grad_norm': nan, 'learning_rate': 0.0, 'epoch': 0.51}
{'loss': 0.0, 'grad_norm': nan, 'learning_rate': 0.0, 'epoch': 1.02}
{'loss': 0.0, 'grad_norm': nan, 'learning_rate': 0.0, 'epoch': 1.53}

@hiyouga
Owner

hiyouga commented Jun 8, 2024

Use bf16.

@yxl23
Author

yxl23 commented Jun 8, 2024

Where do I add that? My config is:

### model
model_name_or_path: E:\LLaMA-Factory\qwen\Qwen2-7B-Instruct
quantization_bit: 8

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset
dataset: xunlian
template: qwen
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/qwen2-7b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 100.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
fp16: true

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

@hiyouga
Owner

hiyouga commented Jun 8, 2024

Change fp16 to bf16.
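
Concretely, in the train section of the config posted above, the precision flag becomes (same values otherwise; shown here only to illustrate the suggested change):

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 100.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true

Note that bf16 generally requires hardware with bfloat16 support (e.g. Ampere or newer NVIDIA GPUs); the zero loss and nan grad_norm above are consistent with fp16 overflow during quantized training, which is presumably why bf16 is recommended here.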

@yxl23
Author

yxl23 commented Jun 8, 2024

OK, thank you.
