Generate raw audio from dance videos
`movenet` is a research project for generating music from dance. The idea is to turn the human body into an instrument, converting sequences of images into raw audio waveforms.
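To make that input/output contract concrete, below is a minimal, hypothetical sketch of a frames-to-waveform mapping in PyTorch (which the project's trainer already relies on). None of the class names, layers, or dimensions come from the actual movenet code; they are placeholders for illustration.

```python
import torch
import torch.nn as nn


class FramesToWaveform(nn.Module):
    """Toy model: a sequence of video frames in, a raw audio waveform out.

    Purely illustrative -- NOT the actual movenet architecture; every layer
    choice and dimension here is made up for the example.
    """

    def __init__(self, frame_channels=3, hidden_dim=64, samples_per_frame=1600):
        super().__init__()
        # Per-frame encoder: collapse each image to a single feature vector.
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(frame_channels, hidden_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Decode each frame embedding into a chunk of raw audio samples.
        self.audio_decoder = nn.Linear(hidden_dim, samples_per_frame)

    def forward(self, frames):
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.frame_encoder(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        # waveform: (batch, time * samples_per_frame)
        return self.audio_decoder(feats).reshape(b, -1)


if __name__ == "__main__":
    model = FramesToWaveform()
    clips = torch.randn(2, 8, 3, 64, 64)  # 2 clips of 8 RGB frames at 64x64
    print(model(clips).shape)  # torch.Size([2, 12800])
```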
This project trains the dance-to-audio generation model on the Kinetics dataset, chosen largely because it comes with a downloader that supports fetching both video and audio.
This repo uses miniconda to manage its virtual environment:

```bash
conda create -n movenet python=3.9
conda activate movenet
conda env update -n movenet -f env.yml
```
Install `youtube-dl` using whichever method is appropriate for your system.
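For example, it is distributed on PyPI (`pip install youtube-dl`) and through most OS package managers.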
- Dance Video Datasets for Artificial Intelligence: A Medium post pointing to several relevant datasets.
- WaveNet: DeepMind model that generates raw audio waveforms.
- Unsupervised speech representation learning using WaveNet autoencoders
- Dance Revolution: Long-term Dance Generation With Music Via Curriculum Learning
- Dancing to Music
- Music-oriented Dance Video Synthesis with Pose Perceptual Loss
- Feel the Music: Automatically Generating A Dance For An Input Song
- Everybody Dance Now
- Learning to Dance: A graph convolutional adversarial network to generate realistic dance motions from audio
- Weakly-Supervised Deep Recurrent Neural Networks for Basic Dance Step Generation
- MagnaTagATune Dataset
- OpenAI Jukebox
- OpenAI Musenet
Clone the downloader:

```bash
git clone https://github.com/hai-labs/kinetics-downloader
```
If you want to build a fresh copy of the dataset, download it with:

```bash
cd kinetics-downloader
python download.py --categories "dancing" --num-workers <NUM_WORKERS> -v
cd ..
cp -R kinetics-downloader/dataset datasets/kinetics
```
You can also download the datasets from Google Drive. For example, you can dump the `kinetics_debug` directory into `datasets/kinetics_debug`.
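Before training, it can be worth sanity-checking that the dataset directory actually contains media files. The snippet below is a rough sketch, not part of the movenet codebase; the file extensions it looks for are assumptions about how the downloaded clips are stored.

```python
# Illustrative sanity check before training (not part of movenet itself).
# Adjust the extensions below to match how your clips are actually stored.
from pathlib import Path

dataset_dir = Path("datasets/kinetics_debug")
video_files = [p for p in dataset_dir.rglob("*") if p.suffix in {".mp4", ".mkv", ".webm"}]
audio_files = [p for p in dataset_dir.rglob("*") if p.suffix in {".wav", ".mp3", ".m4a"}]
print(f"{len(video_files)} video files and {len(audio_files)} audio files under {dataset_dir}")
```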
Run a quick training job on the debug dataset:

```bash
python movenet/pytorch_lightning_trainer.py --dataset datasets/kinetics_debug --n_epochs 1
```
The `experiments` directory contains makefiles for running jobs over various experimental setups.

```bash
source env/gridai
make -f experiments/<makefile> <target>
```