
Error when using PPO: KeyError: 'device_map' #4198

Closed
1 task done
ZachcZhang opened this issue Jun 11, 2024 · 1 comment
Labels
solved This problem has been already solved

Comments

@ZachcZhang

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.8.1.dev0
  • Platform: Linux-5.4.0-155-generic-x86_64-with-glibc2.31
  • Python version: 3.10.14
  • PyTorch version: 2.1.2+cu121 (GPU)
  • Transformers version: 4.41.2
  • Datasets version: 2.18.0
  • Accelerate version: 0.31.0
  • PEFT version: 0.11.1
  • TRL version: 0.9.4
  • GPU type: NVIDIA A800-SXM4-80GB
  • DeepSpeed version: 0.14.0
  • Bitsandbytes version: 0.43.0
  • vLLM version: 0.4.0.post1

Reproduction

Command:

llamafactory-cli train \
    --stage ppo \
    --do_train True \
    --model_name_or_path saves/LLaMA3-8B/full/train_2024-06-10-00-24-00 \
    --preprocessing_num_workers 16 \
    --finetuning_type full \
    --template default \
    --flash_attn auto \
    --dataset_dir data \
    --dataset Taiyi_test \
    --cutoff_len 1024 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/LLaMA3-8B/full/train_2024-06-10-16-43-48 \
    --fp16 True \
    --plot_loss True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --reward_model saves/LLaMA3-8B/full/train_2024-06-10-00-24-00 \
    --reward_model_type full \
    --deepspeed cache/ds_z3_offload_config.json \
    --top_k 0 \
    --top_p 0.6

Error:

06/10/2024 13:46:22 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
Traceback (most recent call last):
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 40, in run_ppo
    reward_model = create_reward_model(model, model_args, finetuning_args)
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/train/utils.py", line 151, in create_reward_model
    reward_model = load_model(
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/model/loader.py", line 116, in load_model
    patch_config(config, tokenizer, model_args, init_kwargs, is_trainable)
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/model/patcher.py", line 82, in patch_config
    if init_kwargs["device_map"] == "auto":
KeyError: 'device_map'
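The failure mode itself is just plain dict indexing: `init_kwargs["device_map"]` raises `KeyError` whenever `"device_map"` was never set (as in this PPO reward-model path). A minimal standalone reproduction, with a hypothetical `init_kwargs` for illustration:

```python
# Hypothetical init_kwargs that never received a "device_map" entry,
# mirroring the state that reaches patcher.py line 82 in this report.
init_kwargs = {"torch_dtype": "float16"}

try:
    # Direct indexing raises KeyError when the key is absent.
    if init_kwargs["device_map"] == "auto":
        pass
except KeyError as err:
    print(f"KeyError: {err}")  # KeyError: 'device_map'

# dict.get returns None for a missing key, so this comparison is safe.
print(init_kwargs.get("device_map") == "auto")  # False
```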

Fix:

(screenshot attached: 20240611-150802)

Expected behavior

Running PPO raises KeyError: 'device_map' at LLaMA-Factory/src/llamafactory/model/patcher.py, line 82.
A guard condition probably needs to be added:

if "device_map" in init_kwargs and init_kwargs["device_map"] == "auto":
    init_kwargs["offload_folder"] = model_args.offload_folder

If this part involves special logic, please adjust it according to that logic.
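The suggested guard can be sketched as a small standalone helper. This is only an illustration of the proposed membership check, not the repository's actual patch; the function name `patch_device_map` is hypothetical, and `init_kwargs` / `offload_folder` mirror the names in the traceback above:

```python
def patch_device_map(init_kwargs: dict, offload_folder: str) -> None:
    # Only touch offload_folder when device_map is present AND set to "auto";
    # a missing key short-circuits the check instead of raising KeyError.
    if "device_map" in init_kwargs and init_kwargs["device_map"] == "auto":
        init_kwargs["offload_folder"] = offload_folder

# Key present and "auto": the offload folder is attached.
kwargs = {"device_map": "auto"}
patch_device_map(kwargs, "offload")
print(kwargs)  # {'device_map': 'auto', 'offload_folder': 'offload'}

# Key absent (the PPO reward-model case): a safe no-op, no KeyError.
kwargs = {}
patch_device_map(kwargs, "offload")
print(kwargs)  # {}
```

An equivalent one-liner is `if init_kwargs.get("device_map") == "auto":`, since `dict.get` returns `None` for a missing key.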

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Jun 11, 2024
@hiyouga
Copy link
Owner

hiyouga commented Jun 11, 2024

fixed

@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jun 11, 2024