
Error when using PPO: KeyError: 'device_map' #4198

Closed
1 task done
ZachcZhang opened this issue Jun 11, 2024 · 1 comment
Labels
solved This problem has been already solved

Comments

@ZachcZhang

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.8.1.dev0
  • Platform: Linux-5.4.0-155-generic-x86_64-with-glibc2.31
  • Python version: 3.10.14
  • PyTorch version: 2.1.2+cu121 (GPU)
  • Transformers version: 4.41.2
  • Datasets version: 2.18.0
  • Accelerate version: 0.31.0
  • PEFT version: 0.11.1
  • TRL version: 0.9.4
  • GPU type: NVIDIA A800-SXM4-80GB
  • DeepSpeed version: 0.14.0
  • Bitsandbytes version: 0.43.0
  • vLLM version: 0.4.0.post1

Reproduction

Command:

llamafactory-cli train \
    --stage ppo \
    --do_train True \
    --model_name_or_path saves/LLaMA3-8B/full/train_2024-06-10-00-24-00 \
    --preprocessing_num_workers 16 \
    --finetuning_type full \
    --template default \
    --flash_attn auto \
    --dataset_dir data \
    --dataset Taiyi_test \
    --cutoff_len 1024 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/LLaMA3-8B/full/train_2024-06-10-16-43-48 \
    --fp16 True \
    --plot_loss True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --reward_model saves/LLaMA3-8B/full/train_2024-06-10-00-24-00 \
    --reward_model_type full \
    --deepspeed cache/ds_z3_offload_config.json \
    --top_k 0 \
    --top_p 0.6

Error:

06/10/2024 13:46:22 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
Traceback (most recent call last):
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 40, in run_ppo
    reward_model = create_reward_model(model, model_args, finetuning_args)
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/train/utils.py", line 151, in create_reward_model
    reward_model = load_model(
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/model/loader.py", line 116, in load_model
    patch_config(config, tokenizer, model_args, init_kwargs, is_trainable)
  File "/hpc2hdd/home/czhangcn/llm/LLaMA-Factory/src/llamafactory/model/patcher.py", line 82, in patch_config
    if init_kwargs["device_map"] == "auto":
KeyError: 'device_map'
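The failure mode itself is just plain dict indexing: `init_kwargs["device_map"]` raises `KeyError` whenever `"device_map"` was never set (as in this PPO reward-model path). A minimal standalone reproduction, with a hypothetical `init_kwargs` for illustration:

```python
# Hypothetical init_kwargs that never received a "device_map" entry,
# mirroring the state that reaches patcher.py line 82 in this report.
init_kwargs = {"torch_dtype": "float16"}

try:
    # Direct indexing raises KeyError when the key is absent.
    if init_kwargs["device_map"] == "auto":
        pass
except KeyError as err:
    print(f"KeyError: {err}")  # KeyError: 'device_map'

# dict.get returns None for a missing key, so this comparison is safe.
print(init_kwargs.get("device_map") == "auto")  # False
```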

Fix:

(screenshot attached: 20240611-150802)

Expected behavior

Running PPO raises KeyError: 'device_map' at LLaMA-Factory/src/llamafactory/model/patcher.py, line 82.
A guard condition probably needs to be added:

if "device_map" in init_kwargs and init_kwargs["device_map"] == "auto":
    init_kwargs["offload_folder"] = model_args.offload_folder

If this part involves special logic, please adjust it according to that logic.
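The suggested guard can be sketched as a small standalone helper. This is only an illustration of the proposed membership check, not the repository's actual patch; the function name `patch_device_map` is hypothetical, and `init_kwargs` / `offload_folder` mirror the names in the traceback above:

```python
def patch_device_map(init_kwargs: dict, offload_folder: str) -> None:
    # Only touch offload_folder when device_map is present AND set to "auto";
    # a missing key short-circuits the check instead of raising KeyError.
    if "device_map" in init_kwargs and init_kwargs["device_map"] == "auto":
        init_kwargs["offload_folder"] = offload_folder

# Key present and "auto": the offload folder is attached.
kwargs = {"device_map": "auto"}
patch_device_map(kwargs, "offload")
print(kwargs)  # {'device_map': 'auto', 'offload_folder': 'offload'}

# Key absent (the PPO reward-model case): a safe no-op, no KeyError.
kwargs = {}
patch_device_map(kwargs, "offload")
print(kwargs)  # {}
```

An equivalent one-liner is `if init_kwargs.get("device_map") == "auto":`, since `dict.get` returns `None` for a missing key.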

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Jun 11, 2024
@hiyouga
Copy link
Owner

hiyouga commented Jun 11, 2024

fixed

@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jun 11, 2024