Ryan Burgert1,3, Yuancheng Xu1,4, Wenqi Xian1, Oliver Pilarski1, Pascal Clausen1, Mingming He1, Li Ma1,
Yitong Deng2,5, Lingxiao Li2, Mohsen Mousavi1, Michael Ryoo3, Paul Debevec1, Ning Yu1†
1Netflix Eyeline Studios, 2Netflix, 3Stony Brook University, 4University of Maryland, 5Stanford University
†Project Lead
Go-with-the-Flow is an easy and efficient way to control the motion patterns of video diffusion models. It lets a user decide how the camera and objects in a scene will move, and can even let you transfer motion patterns from one video to another.
We simply fine-tune a base model, with no changes to the original pipeline or architecture except one: instead of pure i.i.d. Gaussian noise, we use warped noise. Inference has exactly the same computational cost as running the base model.
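To give some intuition for why the noise has to be warped carefully rather than resampled like an ordinary image, here is a toy illustration. It is not the project's warping algorithm, which is not described in this README. It only assumes, for illustration, that a warped pixel aggregates `k` independent unit-variance Gaussian samples, and compares plain averaging with dividing the sum by `sqrt(k)`. The function name and parameters are hypothetical.

```python
import math
import random
import statistics


def sample_variances(k=4, n=20000, seed=0):
    """Return (variance after averaging, variance after sqrt(k) rescaling).

    Toy setup (an assumption for illustration): each output value combines
    k independent N(0, 1) samples.
    """
    rng = random.Random(seed)
    averaged, rescaled = [], []
    for _ in range(n):
        total = sum(rng.gauss(0.0, 1.0) for _ in range(k))
        averaged.append(total / k)             # interpolation-style average
        rescaled.append(total / math.sqrt(k))  # variance-preserving rescale
    return statistics.pvariance(averaged), statistics.pvariance(rescaled)


if __name__ == "__main__":
    v_avg, v_res = sample_variances()
    # Averaging shrinks the variance to roughly 1/k; rescaling keeps it near 1.
    print(f"averaged: {v_avg:.3f}  rescaled: {v_res:.3f}")
```

Averaged noise no longer has unit variance, so it stops resembling the i.i.d. Gaussian noise a diffusion model expects as input. That is the kind of problem any noise-warping scheme has to avoid.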
If you like this project, please give it a ★!
Cut-and-drag motion control lets you take an image and create a video by cutting out different parts of that image and dragging them around.
Cut-and-drag motion control has two parts: a GUI for creating a crude animation (no GPU needed), and a diffusion script that turns that crude animation into a polished one (requires a GPU).
- Clone this repo, then cd into it.
- Install local requirements:
  pip install -r requirements_local.txt
- Run the GUI:
  python cut_and_drag_gui.py
- Follow the instructions shown in the GUI.
After completion, an MP4 file will be generated. You'll need to move this file to a computer with a decent GPU to continue.
- Clone this repo on the machine with the GPU, then cd into it.
- Install requirements:
  pip install -r requirements.txt
- Warp the noise (replace <PATH TO VIDEO OR URL> accordingly):
  python make_warped_noise.py <PATH TO VIDEO OR URL> --output_folder noise_warp_output_folder
- Run inference:
  python cut_and_drag_inference.py noise_warp_output_folder \
      --prompt "A duck splashing" \
      --output_mp4_path "output.mp4" \
      --device "cuda" \
      --num_inference_steps 5
Adjust folder paths, prompts, and other hyperparameters as needed. The output will be saved as output.mp4.
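The two GPU-side steps (warping the noise, then running inference) can be chained from one small Python wrapper. This is a sketch, not part of the repository: `build_commands` and `run_pipeline` are hypothetical helper names, and the only script names and flags used are the ones shown in the commands above. It assumes it is run from the repository root with the requirements installed.

```python
import shlex
import subprocess


def build_commands(video, noise_dir="noise_warp_output_folder",
                   prompt="A duck splashing", output_mp4="output.mp4",
                   device="cuda", steps=5):
    """Build the argv lists for the noise-warping and inference steps."""
    warp = ["python", "make_warped_noise.py", video,
            "--output_folder", noise_dir]
    infer = ["python", "cut_and_drag_inference.py", noise_dir,
             "--prompt", prompt,
             "--output_mp4_path", output_mp4,
             "--device", device,
             "--num_inference_steps", str(steps)]
    return [warp, infer]


def run_pipeline(video, dry_run=True, **kwargs):
    """Print each command; execute it too unless dry_run is True."""
    for argv in build_commands(video, **kwargs):
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the pipeline if a step fails.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "my_animation.mp4" is a placeholder for the MP4 produced by the GUI.
    run_pipeline("my_animation.mp4", dry_run=True)
```

With `dry_run=True` the wrapper only prints the commands, which is a cheap way to check paths and prompts before committing GPU time. Pass `dry_run=False` to actually run both steps.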
- Upload All Models
- Upload Cut-And-Drag Inference Code
- Release to Arxiv
- Depth-Warping Inference Code
- T2V Motion Transfer Code
- ComfyUI Node
- Replicate Instance
- Fine-Tuning Code
If you use this in your research, please consider citing:
@misc{burgert2025gowiththeflowmotioncontrollablevideodiffusion,
title={Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise},
author={Ryan Burgert and Yuancheng Xu and Wenqi Xian and Oliver Pilarski and Pascal Clausen and Mingming He and Li Ma and Yitong Deng and Lingxiao Li and Mohsen Mousavi and Michael Ryoo and Paul Debevec and Ning Yu},
year={2025},
eprint={2501.08331},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2501.08331},
}