Tianhao Qi*, Shancheng Fang, Yanze Wu✝, Hongtao Xie✉, Jiawei Liu,
Lang Chen, Qian He, Yongdong Zhang
(*Work done during an internship at ByteDance, ✝Project Lead, ✉Corresponding author)
From University of Science and Technology of China and ByteDance.
TL;DR: We propose DEADiff, a generic method for synthesizing novel images that embody the style of a given reference image while adhering to text prompts.
- [2024.4.3]: 🔥🔥 Release the inference code and pretrained checkpoint.
- [2024.3.5]: 🔥🔥 Release the project page.
TODO:
- [x] Release the inference code.
- [ ] Release the training data.
```bash
# Create and activate the conda environment
conda create -n deadiff python=3.9.2
conda activate deadiff

# Install PyTorch 2.0.0 with CUDA 11.8
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia

# Install LAVIS (BLIP-Diffusion branch) and the remaining dependencies
pip install git+https://github.com/salesforce/LAVIS.git@20230801-blip-diffusion-edit
pip install -r requirements.txt
pip install -e .
```
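If the environment resolves correctly, an optional sanity check like the one below (a minimal sketch, nothing DEADiff-specific) should report the versions pinned above:

```python
# Optional sanity check: confirm the pinned versions from the install step.
import torch
import torchvision

print(torch.__version__)          # expect 2.0.0
print(torchvision.__version__)    # expect 0.15.0
print(torch.cuda.is_available())  # True if the CUDA 11.8 runtime is visible
```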
- Download the pretrained model from Hugging Face and place it under `./pretrained/` (see the download sketch at the end of this section).
- Run the following command in the terminal.
```bash
python3 scripts/app.py
```
The Gradio app allows you to transfer the style of a reference image to newly generated images; try it out to explore the details.
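If you prefer to fetch the checkpoint programmatically instead of downloading it by hand, the following is a minimal sketch using `huggingface_hub`'s `snapshot_download`; the repository id is a placeholder, since the exact Hugging Face repo name is not stated here.

```python
# A minimal sketch for fetching the checkpoint with huggingface_hub.
# NOTE: the repo_id below is a placeholder -- substitute the actual
# DEADiff repository name on Hugging Face.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="<org>/<deadiff-checkpoint>",  # hypothetical id
    local_dir="./pretrained",
)
```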
We developed this repository for RESEARCH purposes, so it may only be used for personal, research, or other non-commercial purposes.
```bibtex
@article{qi2024deadiff,
  title={DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations},
  author={Qi, Tianhao and Fang, Shancheng and Wu, Yanze and Xie, Hongtao and Liu, Jiawei and Chen, Lang and He, Qian and Zhang, Yongdong},
  journal={arXiv preprint arXiv:2403.06951},
  year={2024}
}
```
If you have any comments or questions, feel free to contact [email protected].