CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control

Project Page | arXiv | Video

[Teaser figure]

BibTeX

If you find this code useful in your research, please cite:

@article{tevet2024closd,
  title={CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control},
  author={Tevet, Guy and Raab, Sigal and Cohan, Setareh and Reda, Daniele and Luo, Zhengyi and Peng, Xue Bin and Bermano, Amit H and van de Panne, Michiel},
  journal={arXiv preprint arXiv:2410.03441},
  year={2024}
}

Getting Started

  • The code was tested on Ubuntu 20.04.5 with Python 3.8.19.
  • Running CLoSD requires a single GPU with ~4GB of memory and a monitor.
  • Training and evaluation require a single GPU with ~50GB of memory (a monitor is not required).
  • You only need to set up the Python environment. All dependencies (data, checkpoints, etc.) will be downloaded and cached automatically on the first run!
Setup env
  • Create a Conda env and set up the requirements:
conda create -n closd python=3.8
conda activate closd
pip install -r requirements.txt
python -m spacy download en_core_web_sm
  • Download Isaac Gym and install it into your env:
conda activate closd
cd <ISAAC_GYM_DIR>/python
pip install -e .
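  • To verify the installation, a minimal sanity check (assuming a standard Isaac Gym Preview install, which ships examples under python/examples):
python -c "import isaacgym"
cd <ISAAC_GYM_DIR>/python/examples
python joint_monkey.py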
Copyright notes

The code will automatically download cached versions of several third-party datasets and models. You must adhere to their terms of use!

Run CLoSD

Multi-task
python closd/run.py \
  learning=im_big robot=smpl_humanoid \
  epoch=-1 test=True no_virtual_display=True \
  headless=False env.num_envs=9 \
  env=closd_multitask exp_name=CLoSD_multitask_finetune
Sequence of tasks
python closd/run.py \
  learning=im_big robot=smpl_humanoid \
  epoch=-1 test=True no_virtual_display=True \
  headless=False env.num_envs=9 \
  env=closd_sequence exp_name=CLoSD_multitask_finetune
Text-to-motion
python closd/run.py \
  learning=im_big robot=smpl_humanoid \
  epoch=-1 test=True no_virtual_display=True \
  headless=False env.num_envs=9 \
  env=closd_t2m exp_name=CLoSD_t2m_finetune
  • To run the model without fine-tuning, use exp_name=CLoSD_no_finetune, as in the example below.
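For instance, text-to-motion with the non-fine-tuned checkpoint (identical to the command above except for exp_name):
python closd/run.py \
  learning=im_big robot=smpl_humanoid \
  epoch=-1 test=True no_virtual_display=True \
  headless=False env.num_envs=9 \
  env=closd_t2m exp_name=CLoSD_no_finetune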

Evaluate

Multi-task success rate
  • To reproduce Table 1 in the paper.
python closd/run.py \
 learning=im_big env=closd_multitask robot=smpl_humanoid \
 exp_name=CLoSD_multitask_finetune \
 epoch=-1 \
 env.episode_length=500 \
 env.dip.cfg_param=7.5 \
 env.num_envs=4096 \
 test=True \
 no_virtual_display=True \
 headless=True \
 closd_eval=True
Text-to-motion
  • The evaluation process runs on pre-recorded data and reproduces Table 3 in the paper.
  • The raw results are available at https://huggingface.co/guytevet/CLoSD/blob/main/evaluation/closd/eval.log; this code should reproduce them.
python -m closd.diffusion_planner.eval.eval_humanml --external_results_file closd/diffusion_planner/saved_motions/closd/CloSD.pkl --do_unique
  • To log results to WandB, add:
 --train_platform_type WandBPlatform --eval_name <wandb_exp_name>
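For example (keep <wandb_exp_name> as a name of your choice):
python -m closd.diffusion_planner.eval.eval_humanml \
 --external_results_file closd/diffusion_planner/saved_motions/closd/CloSD.pkl --do_unique \
 --train_platform_type WandBPlatform --eval_name <wandb_exp_name>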

Train your own CLoSD

Tracking controller (PHC-based)
python closd/run.py \
 learning=im_big env=im_single_prim robot=smpl_humanoid \
 env.cycle_motion=True epoch=-1 \
 exp_name=my_CLoSD_no_finetune
  • Train for 62K epochs
Fine-tune for Multi-task
python closd/run.py \
 learning=im_big env=closd_multitask robot=smpl_humanoid \
 learning.params.load_checkpoint=True \
 learning.params.load_path=output/CLoSD/my_CLoSD_no_finetune/Humanoid.pth \
 env.dip.cfg_param=2.5 env.num_envs=3072 \
 has_eval=False epoch=-1 \
 exp_name=my_CLoSD_multitask_finetune
  • Train for 4K epochs
Fine-tune for Text-to-motion
python closd/run.py \
 learning=im_big env=closd_t2m robot=smpl_humanoid \
 learning.params.load_checkpoint=True \
 learning.params.load_path=output/CLoSD/my_CLoSD_no_finetune/Humanoid.pth \
 env.dip.cfg_param=2.5 env.num_envs=3072 \
 has_eval=False epoch=-1 \
 exp_name=my_CLoSD_t2m_finetune
  • Train for 1K epochs
  • For a debug run, use learning=im_toy and add no_log=True env.num_envs=4 (see the example below).
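For instance, a debug version of the multi-task fine-tuning command above (the exp_name here is just an illustrative placeholder):
python closd/run.py \
 learning=im_toy env=closd_multitask robot=smpl_humanoid \
 learning.params.load_checkpoint=True \
 learning.params.load_path=output/CLoSD/my_CLoSD_no_finetune/Humanoid.pth \
 env.dip.cfg_param=2.5 env.num_envs=4 \
 no_log=True has_eval=False epoch=-1 \
 exp_name=my_CLoSD_debug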

DiP

  • Diffusion Planner (DiP) is a real-time autoregressive diffusion model that serves as the planner for the CLoSD agent.
  • Instead of running it as part of CLoSD, you can also run DiP in stand-alone mode, fed by its own generated motions.
  • The following sections detail how to sample, evaluate, and train DiP in stand-alone mode.

Generate Motion with the Stand-alone DiP

Full autoregressive generation (without target):

python -m closd.diffusion_planner.sample.generate \
 --model_path closd/diffusion_planner/save/DiP_no-target_10steps_context20_predict40/model000200000.pt \
 --num_repetitions 1 --autoregressive

Prefix completion with target trajectory:

python -m closd.diffusion_planner.sample.generate \
 --model_path closd/diffusion_planner/save/DiP_multi-target_10steps_context20_predict40/model000300000.pt \
 --num_repetitions 1 --sampling_mode goal \
 --target_joint_names "traj,heading" --target_joint_source data
  • To sample with a random joint target (which is more challenging than sampling it from the data), use --target_joint_source random, as in the example after the list below.
  • Other 'legal' joint conditions are:
--target_joint_names [traj,heading | pelvis,heading | right_wrist,heading | left_wrist,heading | right_foot,heading | left_foot,heading]
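For example, conditioning the right wrist on a randomly sampled target (all flag values taken from the options above):
python -m closd.diffusion_planner.sample.generate \
 --model_path closd/diffusion_planner/save/DiP_multi-target_10steps_context20_predict40/model000300000.pt \
 --num_repetitions 1 --sampling_mode goal \
 --target_joint_names "right_wrist,heading" --target_joint_source random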
Stand-alone Evaluation
  • Evaluate DiP fed by its own predictions (without the CLoSD framework).
  • This reproduces Tables 2 and 3 (the DiP entry) in the paper.
python -m closd.diffusion_planner.eval.eval_humanml \
 --guidance_param 7.5 \
 --model_path closd/diffusion_planner/save/DiP_no-target_10steps_context20_predict40/model000600343.pt \
 --autoregressive
Train your own DiP

The following will reproduce the DiP used in the paper:

python -m closd.diffusion_planner.train.train_mdm \
 --save_dir closd/diffusion_planner/save/my_DiP \
 --dataset humanml --arch trans_dec --text_encoder_type bert \
 --diffusion_steps 10 --context_len 20 --pred_len 40 \
 --mask_frames --eval_during_training --gen_during_training \
 --overwrite --use_ema --autoregressive --train_platform_type WandBPlatform

To train DiP without target conditioning, add --lambda_target_loc 0, as in the example below.
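For instance (the save_dir name here is just an illustrative placeholder):
python -m closd.diffusion_planner.train.train_mdm \
 --save_dir closd/diffusion_planner/save/my_DiP_no_target \
 --dataset humanml --arch trans_dec --text_encoder_type bert \
 --diffusion_steps 10 --context_len 20 --pred_len 40 \
 --mask_frames --eval_during_training --gen_during_training \
 --overwrite --use_ema --autoregressive --train_platform_type WandBPlatform \
 --lambda_target_loc 0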

Acknowledgments

This code stands on the shoulders of giants. We thank the following projects, on which our code is based:

MDM, PHC, MotionCLIP, text-to-motion, actor, joints2smpl, MoDi.

License

This code is distributed under the MIT License.