This repository contains the code for our work on Self-Supervised Viewpoint Learning from Image Collections (SSV), accepted at CVPR 2020. SSV is a framework for learning object viewpoint estimation from images alone, without the need for ground-truth viewpoint annotations.
We used PyTorch 1.0 with CUDA 10 and cuDNN 7.4.1 on Ubuntu 16.04.
All the dependencies are provided in requirements.txt.
A similar environment can be created using:
conda create --name ssv --file requirements.txt
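After creating the environment, a quick check that the key packages are importable can save time later. The script below is only a sketch: the module names `torch` and `lmdb` are assumptions inferred from the tools mentioned in this README, not a list read from requirements.txt, so adjust them to your setup.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names (not taken from requirements.txt); edit as needed.
    required = ["torch", "lmdb"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All checked packages are importable.")
```

Run it inside the activated `ssv` environment; anything reported as missing should be installed before preprocessing or training.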
Please download MTCNN-Pytorch from here and install it in the 'data_preprocessing' folder. It is required for preprocessing the datasets.
The 300W-LP dataset can be downloaded from here. To preprocess it, run:
python preprocess_data.py --src-dir <path_to_300wlp_dataset> --dst-dir <path_to_processed_300wlp> --dataset 300WLP
Create an lmdb of the preprocessed 300W-LP data by running:
python prepare_lmdb.py <path_to_processed_300wlp> --out <path_to_300wlp_lmdb>
The BIWI head pose estimation dataset can be obtained by writing to the authors of 'Fanelli, G. and Dantone, M. and Gall, J. and Fossati, A. and van Gool, L., Random Forests for Real Time 3D Face Analysis, International Journal of Computer Vision, 2013.'
To preprocess it, run:
python preprocess_data.py --src-dir <path_to_biwi_dataset> --dst-dir <path_to_processed_biwi> --dataset BIWI
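The preprocessing steps for both datasets can be chained from one small driver script. The sketch below only assembles the commands shown in this README (using the `--dataset` flag as in the BIWI command); the function names, the `dry_run` switch, and the example paths are hypothetical and not part of this repository.

```python
import subprocess


def preprocessing_commands(src_dir, dst_dir, dataset, lmdb_out=None):
    """Build the preprocessing command and, optionally, the lmdb command."""
    cmds = [[
        "python", "preprocess_data.py",
        "--src-dir", src_dir,
        "--dst-dir", dst_dir,
        "--dataset", dataset,
    ]]
    if lmdb_out is not None:
        # This README builds an lmdb only for the 300W-LP training data.
        cmds.append(["python", "prepare_lmdb.py", dst_dir, "--out", lmdb_out])
    return cmds


def run_all(cmds, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    # Placeholder paths (assumptions); replace with your own locations.
    steps = preprocessing_commands("raw/300wlp", "proc/300wlp", "300WLP",
                                   lmdb_out="lmdb/300wlp")
    steps += preprocessing_commands("raw/biwi", "proc/biwi", "BIWI")
    run_all(steps, dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to check the paths before starting a long preprocessing job.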
To obtain the pretrained model, please send an email here.
To evaluate the pretrained model on the processed BIWI data, run the following:
python test_vpnet.py --data_dir <path_to_processed_biwi> --model_path <path_to_pretrained_model>
Run the following to visualize head pose predictions on some samples from the BIWI dataset:
python ssv_demo.py
The plots are saved in 'demo_images/plots'.
The following command produces some sample synthesized images, which are saved in 'synth_images'.
python test_synthesis.py
The gif looks similar to the one shown below.
To train SSV from scratch, run the following:
python3 train.py --exp_name SSV --data_path <path_to_300wlp_lmdb> --num_workers 4 --exp_root <path_to_experiments_dir> --save_interval 5000 --sample_interval 500 --batch_size 2 --lr 0.0005 --code_size 64 --z_recn_weight 0.8 --vp_recn_weight 0.8 --img_recn_weight 0.4 --flip_cons_weight 0.4 --flipc_recn_weight_G 0.5 --az_range 1.4 --el_range 1.2 --ct_range 0.75
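Because the training command carries many flags, it can be easier to keep the hyperparameters in one place and generate the command from them. The following is a minimal sketch: the flag names and default values are copied from the command above, while `DEFAULT_HPARAMS`, `build_train_command`, and the example paths are hypothetical helpers, not part of this repository.

```python
import shlex

# Values copied from the training command in this README.
DEFAULT_HPARAMS = {
    "num_workers": 4,
    "save_interval": 5000,
    "sample_interval": 500,
    "batch_size": 2,
    "lr": 0.0005,
    "code_size": 64,
    "z_recn_weight": 0.8,
    "vp_recn_weight": 0.8,
    "img_recn_weight": 0.4,
    "flip_cons_weight": 0.4,
    "flipc_recn_weight_G": 0.5,
    "az_range": 1.4,
    "el_range": 1.2,
    "ct_range": 0.75,
}


def build_train_command(data_path, exp_root, exp_name="SSV", overrides=None):
    """Return the train.py invocation as a list of arguments."""
    params = dict(DEFAULT_HPARAMS)
    for key, value in (overrides or {}).items():
        if key not in params:
            # Guard against typos in flag names.
            raise ValueError("unknown hyperparameter: " + key)
        params[key] = value
    cmd = ["python3", "train.py", "--exp_name", exp_name,
           "--data_path", data_path, "--exp_root", exp_root]
    for key, value in params.items():
        cmd += ["--" + key, str(value)]
    return cmd


if __name__ == "__main__":
    # Placeholder paths (assumptions); replace with your own locations.
    cmd = build_train_command("lmdb/300wlp", "experiments",
                              overrides={"batch_size": 4})
    print(" ".join(shlex.quote(part) for part in cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` directly. Rejecting unknown keys in `overrides` catches misspelled flags before a long run is launched.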
Please cite our paper if you find this code useful for your research.
@inproceedings{mustikovelaCVPR20,
title = {Self-Supervised Viewpoint Learning From Image Collections},
author = {Mustikovela, Siva Karthik and Jampani, Varun and De Mello, Shalini and Liu, Sifei and Iqbal, Umar and Rother, Carsten and Kautz, Jan},
booktitle = {IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
month = jun,
year = {2020}
}