The PyTorch implementation of the paper GeneOH Diffusion, presenting a generalizable HOI denoising model designed to curate high-quality interaction data.
*(Teaser video: `teaser_github_trimed.mp4`)*
The repository contains:
- Pre-trained models and example usage (on GRAB, GRAB (Beta), TACO, and HOI4D);
- Evaluation processes on GRAB, GRAB (Beta), and HOI4D;
- Training pipelines.
This code was tested on Ubuntu 20.04.5 LTS and requires:
- Python 3.8.13
- conda3 or miniconda3
- CUDA-capable GPU (one is enough)
Create a virtual environment:

```bash
conda create -n geneoh-diffusion python==3.8.13
conda activate geneoh-diffusion
```
Install torch 2.2.0 with CUDA 12.1:

```bash
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121
```
Install `torch_cluster`:

```bash
cd whls
pip install torch_cluster-1.6.3+pt22cu121-cp38-cp38-linux_x86_64.whl
cd ..
```
Install the remaining dependencies:

```bash
pip install -r requirements.txt --no-cache
```
**Important:** Install `manopth`:

```bash
cd manopth
pip install -e .
cd ..
```
Please note that the MANO layer used in our project deviates slightly from the original official release. It is essential to install the `manopth` package from this project; failing to do so may result in abnormal denoised outcomes from the model.
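As a quick way to confirm the installation, the following sketch instantiates the MANO layer and runs a zero-pose forward pass. It follows the original manopth API (`manopth.manolayer.ManoLayer`); argument names may differ slightly in this repository's modified version, and the MANO model files themselves must be obtained separately:

```python
# Minimal sanity check that the bundled manopth installed correctly.
# This follows the original manopth API; the layer shipped with this repo
# deviates slightly, so argument names may differ. MANO model files
# (e.g., MANO_RIGHT.pkl) must be obtained separately from the MANO website.
import torch
from manopth.manolayer import ManoLayer

mano_layer = ManoLayer(
    mano_root='mano/models',   # path to the MANO .pkl files (assumption)
    use_pca=True, ncomps=24, flat_hand_mean=False)

batch_size = 1
pose = torch.zeros(batch_size, 24 + 3)  # PCA pose coefficients + global rotation
betas = torch.zeros(batch_size, 10)     # shape parameters
hand_verts, hand_joints = mano_layer(pose, betas)
print(hand_verts.shape, hand_joints.shape)  # (1, 778, 3), (1, 21, 3)
```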
Download models from this link and place them in the `./ckpts` folder.
1. **GRAB** (for GRAB and GRAB (Beta) test sets)
   - Download the preprocessed data (object, test split) and extract it into a data folder for GRAB preprocessed object data (e.g., `./data/grab/GRAB_processed`).
   - Download the GRAB object meshes and unzip the obtained `object_meshes.zip` into the folder `./data/grab`.
   - Download the preprocessed data (hand, test split) and extract it into a data folder for GRAB preprocessed subject data (e.g., `./data/grab/GRAB_processed_wsubj`).
2. **HOI4D**
   - Download object CAD models and extract them into a data folder for HOI4D object CAD models (e.g., `./data/hoi4d/HOI4D_CAD_Model_for_release`).
   - Download the preprocessed data (per-category, rigid) and unzip the obtained `CAT_NM.zip` for each category into the folder for HOI4D rigid preprocessed data (e.g., `./data/hoi4d/HOI_Processed_Data_Rigid`).
   - Download the preprocessed data (per-category, articulated) and unzip the obtained `CAT_NM.zip` for each category into the folder for HOI4D articulated preprocessed data (e.g., `./data/hoi4d/HOI_Processed_Data_Arti`).
3. **TACO**

   Besides the test datasets mentioned in the paper, we have also evaluated our model on the recent TACO dataset. Data samples for testing purposes are included in the folder `./data/taco/source_data`. More data will be incorporated soon.
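If you want a quick look at what one of these samples contains before running the pipeline, a small inspection sketch like the one below can help; the internal layout of the `.pkl` file is an assumption here, so treat the printed keys as the source of truth:

```python
# Peek at a bundled TACO sample. The internal structure of the .pkl is an
# assumption, so we just print whatever keys/arrays it contains.
import pickle

with open('data/taco/source_data/20231104_017.pkl', 'rb') as f:
    seq = pickle.load(f)

if isinstance(seq, dict):
    for key, val in seq.items():
        print(key, getattr(val, 'shape', type(val)))
else:
    print(type(seq))
```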
### Example

Here's an example of cleaning an input trajectory (sequence 14 of GRAB's test split) perturbed with Gaussian noise. The input noisy trajectory is constructed by adding Gaussian noise to the trajectory `data/grab/source_data/14.npy`. Two different denoised samples are shown below.
*(Videos: Input | Result 1 | Result 2)*
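The perturbation itself is applied inside the prediction scripts, but a minimal numpy sketch of the idea looks like this; the dict layout of the `.npy` file and the noise scale below are assumptions for illustration:

```python
# Illustrative construction of a noisy input: add Gaussian noise to a clean
# GRAB sequence. The perturbation actually used at test time is applied
# inside the prediction scripts; the dict layout and noise scale here are
# assumptions for illustration.
import numpy as np

data = np.load('data/grab/source_data/14.npy', allow_pickle=True).item()
noise_std = 0.01  # hypothetical noise scale, in the data's coordinate units

noisy = {
    k: v + np.random.normal(0.0, noise_std, v.shape).astype(v.dtype)
    if isinstance(v, np.ndarray) and np.issubdtype(v.dtype, np.floating)
    else v
    for k, v in data.items()
}
np.save('data/grab/source_data/14_noisy.npy', noisy)
```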
To reproduce the above result, follow the steps below:
- **Denoising**

  Ten random seeds will be utilized for prediction. The predicted results will be saved in the folder `./data/grab/result`.

  ```bash
  bash scripts/val_examples/predict_grab_rndseed_14.sh
  #### After completing the above command ####
  bash scripts/val_examples/predict_grab_rndseed_spatial_14.sh
  ```

- **Mesh reconstruction**

  Results will be saved under the same folder as in the previous step.

  ```bash
  bash scripts/val_examples/reconstruct_grab_14.sh
  ```
- **Extracting results and visualization**

  Adjust the camera pose in the viewer given the first frame. Figures capturing all frames will then be saved under the root folder of the project. Use your favorite tool to compose them into a video (a minimal sketch follows this list).

  ```bash
  python visualize/vis_grab_example_14.py
  ```
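As one option for the composition step, here is a minimal sketch using `imageio` (with its ffmpeg plugin installed); the glob pattern and output name are assumptions, so adjust them to the filenames the visualization script actually writes:

```python
# Compose the per-frame figures written by the visualization script into a
# video. Requires imageio with ffmpeg support (pip install imageio-ffmpeg);
# the glob pattern and output name are assumptions.
import glob
import imageio

frame_paths = sorted(glob.glob('./*.png'))
with imageio.get_writer('denoised_grab_14.mp4', fps=30) as writer:
    for path in frame_paths:
        writer.append_data(imageio.imread(path))
```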
### Evaluate on the test split
- **Update data and experimental paths in the `.sh` scripts**

  For the GRAB testing scripts, including `scripts/val/predict_grab_rndseed.sh`, `scripts/val/predict_grab_rndseed_spatial.sh`, and `scripts/val/reconstruct_grab.sh`, edit the data- and experiment-path arguments in those scripts so that they point to the paths where the downloaded data is saved. For instance:

  ```bash
  ################# [Edit here] Set to your paths #################
  #### Data and exp folders ####
  export seq_root="data/grab/GRAB_processed/test"
  export grab_path="data/grab/GRAB_extracted"
  export save_dir="exp/grab/eval_save"
  export grab_processed_dir="data/grab/GRAB_processed"
  ```
- **Denoising**

  ```bash
  bash scripts/val/predict_grab_rndseed.sh
  #### After completing the above command ####
  bash scripts/val/predict_grab_rndseed_spatial.sh
  ```
- **Mesh reconstruction**

  To use the script `scripts/val/reconstruct_grab.sh` to reconstruct a single sequence, set `single_seq_path` and `test_tag` in the script before running it.

  ```bash
  bash scripts/val/reconstruct_grab.sh
  ```
### Denoising a full sequence

The evaluation setting for GRAB denoises the first 60 frames of a sequence. To denoise a full sequence, divide the input into several overlapping 60-frame clips, denoise each clip independently, and then reconstruct the mesh sequence from all clips together (a minimal clip-splitting sketch follows the scripts below).
For example, taking `data/grab/source_data/14.npy`, the following scripts will add artificial Gaussian noise to it and denoise the full sequence:

```bash
##### Denoising #####
bash scripts/val/predict_grab_fullseq_rndseed.sh
##### Denoising #####
bash scripts/val/predict_grab_fullseq_rndseed_spatial.sh
##### Reconstructing #####
bash scripts/val/reconstruct_grab_fullseq.sh
```
The `single_seq_path` parameter in each script specifies the sequence to denoise.
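To make the overlapping-clip idea concrete, here is a minimal sketch; the actual splitting and stitching are handled inside the `fullseq` scripts, and the window and stride values here are illustrative assumptions:

```python
# Minimal sketch of the overlapping-clip idea described above. The actual
# splitting and stitching are handled inside the fullseq scripts; the window
# and stride values here are illustrative.
def split_into_clips(num_frames, clip_len=60, stride=30):
    """Return (start, end) frame indices of overlapping clips covering the sequence."""
    starts = list(range(0, max(num_frames - clip_len, 0) + 1, stride))
    if starts[-1] + clip_len < num_frames:  # make sure the tail is covered
        starts.append(num_frames - clip_len)
    return [(s, s + clip_len) for s in starts]

# Each clip is denoised independently; overlapping frames can then be blended
# (e.g., averaged) when reconstructing the full mesh sequence.
print(split_into_clips(150))  # [(0, 60), (30, 90), (60, 120), (90, 150)]
```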
### Example

Here's an example of cleaning an input trajectory perturbed with noise drawn from a Beta distribution (the GRAB (Beta) setting). The input noisy trajectory is constructed by adding Beta-distributed noise to the trajectory `data/grab/source_data/14.npy`. Two different denoised samples are shown below.
*(Videos: Input | Result 1 | Result 2)*
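For intuition, a zero-centered Beta perturbation can be sketched as follows; the shape parameters and scale are illustrative assumptions, and the pipeline's actual perturbation is selected via the `pert_type` argument:

```python
# Sketch of a zero-centered Beta perturbation, analogous to the Gaussian case
# above. Shape parameters and scale are illustrative assumptions; the actual
# perturbation type is selected via the `pert_type` argument in the scripts.
import numpy as np

rng = np.random.default_rng(0)

def beta_noise(shape, a=8.0, b=2.0, scale=0.02):
    """Skewed, zero-mean noise drawn from a Beta(a, b) distribution."""
    samples = rng.beta(a, b, size=shape)     # values in (0, 1), mean a/(a+b)
    return scale * (samples - a / (a + b))   # shift to zero mean, then scale

print(beta_noise((3,)))
```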
To reproduce this result, use the scripts located in the `scripts/val_examples` directory. Please note that the `pert_type` argument in each `.sh` file should be set to `beta`.
### Evaluate on the test split

To run the evaluation process on all GRAB test sequences, follow the same steps as outlined in the previous section. Please note that the `pert_type` argument in each `.sh` file should be set to `beta`.
### Denoising a full sequence

Follow the same steps as outlined in the previous section. Don't forget to set the `pert_type` argument in each `.sh` file to `beta`.
### Example

Here's an example of cleaning an input noisy trajectory `data/taco/source_data/20231104_017.pkl`.
Below are the input, result, and overlaid videos.

*(Videos: Input | Result | Overlaid)*
To reproduce the above result, follow the steps below:
- **Denoising**

  Ten random seeds will be utilized for prediction, and the predicted results will be saved in the folder `./data/taco/result`.

  ```bash
  bash scripts/val_examples/predict_taco_rndseed_spatial_20231104_017.sh
  ```

- **Mesh reconstruction**

  Results will be saved in the same folder as in the previous step.

  ```bash
  bash scripts/val_examples/reconstruct_taco_20231104_017.sh
  ```
- **Extracting results and visualization**

  Adjust the camera pose in the viewer based on the first frame. Figures of all frames will be captured and saved in the root folder of the project. Finally, use your preferred tool to compile these figures into a video.

  ```bash
  python visualize/vis_taco_example_20231104_017.py
  ```
### Example

Here's an example of cleaning an input noisy trajectory `data/hoi4d/source_data/ToyCar/case3/merged_data.npy`.
Below are the input, result, and overlaid videos.

*(Videos: Input | Result | Overlaid)*
To reproduce the above result, follow the steps below:
- **Denoising**

  Ten random seeds will be utilized for prediction, and the predicted results will be saved in the folder `./data/hoi4d/result/ToyCar`.

  ```bash
  bash scripts/val_examples/predict_hoi4d_rndseed_toycar_inst3.sh
  #### After completing the above command ####
  bash scripts/val_examples/predict_hoi4d_rndseed_toycar_inst3_spatial.sh
  ```

- **Mesh reconstruction**

  Results will be saved in the same folder as in the previous step.

  ```bash
  bash scripts/val_examples/reconstruct_hoi4d_toycar_inst3.sh
  ```
- **Extracting results and visualization**

  Adjust the camera pose in the viewer based on the first frame. Figures of all frames will be captured and saved in the root folder of the project. Finally, use your preferred tool to compile these figures into a video.

  ```bash
  python visualize/vis_hoi4d_example_toycar_inst3.py
  ```
### Per-Category Evaluation (on rigid categories)
- **Update data and experimental paths in the `.sh` scripts**

  To evaluate on all sequences of a category `CAT_NM`, modify the following parameter settings in the files `scripts/val/predict_hoi4d_rndseed.sh`, `scripts/val/predict_hoi4d_rndseed_spatial.sh`, and `scripts/val/reconstruct_hoi4d_category.sh`: set `hoi4d_cad_model_root` to the path where you downloaded `HOI4D_CAD_Model_for_release` (e.g., `data/hoi4d/HOI4D_CAD_Model_for_release`), `hoi4d_data_root` to the path where you downloaded `HOI_Processed_Data_Rigid` (for a rigid category) or `HOI_Processed_Data_Arti` (for an articulated category), `hoi4d_category_name` to `CAT_NM`, `hoi4d_eval_st_idx` to the minimum sequence index, and `hoi4d_eval_ed_idx` to the maximum sequence index.

  ```bash
  export hoi4d_cad_model_root="data/hoi4d/HOI4D_CAD_Model_for_release"
  export hoi4d_data_root="data/hoi4d/HOI_Processed_Data_Rigid"
  export hoi4d_category_name="ToyCar"
  export hoi4d_eval_st_idx=0
  export hoi4d_eval_ed_idx=250
  ```
  Additionally, specify your experiment folders in the above-mentioned `.sh` files by modifying the following argument(s):

  ```bash
  export save_dir="data/hoi4d/result"
  ```
- **Denoising**

  ```bash
  bash scripts/val/predict_hoi4d_rndseed.sh
  #### After completing the above command ####
  bash scripts/val/predict_hoi4d_rndseed_spatial.sh
  ```
- **Mesh reconstruction**

  ```bash
  bash scripts/val/reconstruct_hoi4d_category.sh
  ```
### Per-Category Evaluation (on articulated categories)
Follow the above instructions, but use the following three scripts for articulated categories instead: `scripts/val/predict_hoi4d_arti_rndseed.sh`, `scripts/val/predict_hoi4d_arti_rndseed_spatial.sh`, and `scripts/val/reconstruct_hoi4d_arti_category.sh`.
In each of them, set the argument `hoi4d_data_root` to the root folder where you store the preprocessed articulated data (e.g., `data/hoi4d/HOI_Processed_Data_Arti`). You can vary the value of the argument `select_part_idx` to select which part serves as the base part for providing object points.
After setting the necessary arguments, run the denoising and mesh reconstruction steps as follows:
- **Denoising**

  ```bash
  bash scripts/val/predict_hoi4d_arti_rndseed.sh
  #### After completing the above command ####
  bash scripts/val/predict_hoi4d_arti_rndseed_spatial.sh
  ```
- **Mesh reconstruction**

  ```bash
  bash scripts/val/reconstruct_hoi4d_arti_category.sh
  ```
Results will be saved in the folder `${save_dir}/${hoi4d_category_name}`.
### Training

Download the preprocessed data (train split) and extract it under the folder for preprocessed GRAB data (e.g., `data/grab/GRAB_processed`).
Download the preprocessed data (hand, train split) and extract it under the folder for GRAB preprocessed subject data (e.g., `./data/grab/GRAB_processed_wsubj`).
Set the argument `grab_processed_dir` in `scripts/train/train_motion_diff.sh` and `scripts/train/train_spatial_diff.sh` to the path where you downloaded and saved the preprocessed data in the previous steps. For instance:

```bash
export grab_processed_dir="data/grab/GRAB_processed"
```
Run the following scripts to train the motion diffusion model and the spatial diffusion model:

```bash
bash scripts/train/train_motion_diff.sh
bash scripts/train/train_spatial_diff.sh
```
They can be executed in parallel. Please note that the second training stage requires at least 42 GB of GPU memory; our experiments for this part were conducted on an NVIDIA A40 GPU.
You can use your trained checkpoints from `train_motion_diff.sh` and `train_spatial_diff.sh` to replace our provided pretrained checkpoints `ckpts/model.pt` and `ckpts/model_spatial.pt`, respectively, for inference.
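Before swapping a checkpoint in, a quick load check like the one below can catch a corrupted or mismatched file early; this assumes standard `torch` serialization, and the exact contents of the checkpoint dict depend on the training scripts:

```python
# Quick check that a freshly trained checkpoint loads before swapping it in
# for a provided one. Assumes standard torch serialization; the exact keys in
# the checkpoint depend on the training scripts.
import torch

state = torch.load('ckpts/model.pt', map_location='cpu')
if isinstance(state, dict):
    print(f'{len(state)} top-level entries, e.g.:', list(state)[:5])
```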
### TODO

- Example usage, evaluation process and pre-trained models
- HOI4D example usage
- Evaluation process on HOI4D (Rigid, Articulated)
- Data: HOI4D (Rigid, Articulated)
- Training procedure
- Evaluation process on ARCTIC
- Data: ARCTIC, and more examples on TACO
Please contact [email protected] or create a GitHub issue if you have any questions.
If you find this code useful in your research, please cite:
```bibtex
@inproceedings{liu2024geneoh,
  title={GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion},
  author={Liu, Xueyi and Yi, Li},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024}
}
```
This code stands on the shoulders of giants. We want to thank the following projects that our code is based on: motion-diffusion-model and guided-diffusion.
This code is distributed under the MIT License.