PPO example run fails with error: value should be one of int, float, str, bool, or torch.Tensor #4458
Closed
1 task done
Labels
solved
This problem has been already solved
Reminder
System Info
llamafactory version: 0.8.3.dev0
Reproduction
Run on Colab:
!llamafactory-cli train examples/train_lora/llama3_lora_reward.yaml # runs normally
!llamafactory-cli train examples/train_lora/llama3_lora_ppo.yaml # this step raises the error
Traceback (most recent call last):
  File "/usr/local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/cli.py", line 110, in main
    run_exp()
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/tuner.py", line 54, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 58, in run_ppo
    ppo_trainer = CustomPPOTrainer(
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 118, in __init__
    PPOTrainer.__init__(
  File "/usr/local/lib/python3.10/dist-packages/trl/trainer/ppo_trainer.py", line 227, in __init__
    self.accelerator.init_trackers(
  File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 685, in _inner
    return PartialState().on_main_process(function)(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 2586, in init_trackers
    tracker.store_init_configuration(config)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 79, in execute_on_main_process
    return PartialState().on_main_process(function)(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 211, in store_init_configuration
    self.writer.add_hparams(values, metric_dict={})
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/summary.py", line 341, in add_hparams
    exp, ssi, sei = hparams(hparam_dict, metric_dict, hparam_domain_discrete)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/summary.py", line 316, in hparams
    raise ValueError(
ValueError: value should be one of int, float, str, bool, or torch.Tensor
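The traceback suggests that the config passed to the TensorBoard tracker contains at least one value (for example `None` or a list) that `add_hparams` does not accept. As a rough illustration only, and not the fix applied in LLaMA-Factory, the sketch below coerces such values to strings before logging. The helper name `sanitize_hparams` and the example keys are hypothetical, and `torch.Tensor` is left out of the allowed types so the snippet runs without torch installed.

```python
from typing import Any, Dict

# Types named in the error message; torch.Tensor is omitted here on purpose
# so that this sketch has no third-party dependencies.
_ALLOWED_TYPES = (int, float, str, bool)


def sanitize_hparams(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` whose values are all int, float, str, or bool.

    Hypothetical helper: any other value (None, list, dict, ...) is stringified.
    """
    clean: Dict[str, Any] = {}
    for key, value in config.items():
        if isinstance(value, _ALLOWED_TYPES):
            clean[key] = value
        else:
            clean[key] = str(value)
    return clean


if __name__ == "__main__":
    # Made-up config values, chosen only to show the coercion.
    example = {"learning_rate": 1e-5, "reward_model": None, "lora_target": ["q_proj", "v_proj"]}
    print(sanitize_hparams(example))
```

A dictionary cleaned this way contains only types listed in the error message, so it should pass the type check that raised the `ValueError` above. Whether stringified values are useful in the TensorBoard hparams view is a separate question.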
Expected behavior
No response
Others
No response