This is the official repo of the paper Perceptual Quality Improvement in Videoconferencing using Keyframes-based GAN.
In this work we propose a novel GAN architecture for compression artifact reduction in videoconferencing. In this context, the speaker is typically in front of the camera and remains the same for the entire duration of the transmission. Under this assumption, we can maintain a set of reference keyframes of the person, taken from the higher-quality I-frames that are transmitted within the video stream. First, we extract multi-scale features from the compressed and reference frames. Then, these features are combined progressively with Adaptive Spatial Feature Fusion blocks based on facial landmarks and with Spatial Feature Transform blocks. This makes it possible to restore the high-frequency details lost during video compression.
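To make the idea of a bounded set of reference keyframes more concrete, here is a minimal sketch in Python. It is not the code used in this repo: the class `KeyframeSet` and its methods `add`, `use` and `ids` are hypothetical names, and the least-frequently-used eviction policy is only an assumption suggested by the `LFU` component of the example inference path further down.

```python
from typing import Any, Dict, Hashable, List


class KeyframeSet:
    """Bounded set of reference keyframes (hypothetical helper, not the repo's code).

    Assumption: when the set is full, the least frequently used keyframe is evicted.
    """

    def __init__(self, max_keyframes: int = 5) -> None:
        if max_keyframes < 1:
            raise ValueError("max_keyframes must be at least 1")
        self.max_keyframes = max_keyframes
        self._frames: Dict[Hashable, Any] = {}  # keyframe id -> frame data
        self._uses: Dict[Hashable, int] = {}    # keyframe id -> times used as reference

    def add(self, frame_id: Hashable, frame: Any) -> None:
        """Store a new keyframe, evicting the least used one if the set is full."""
        if frame_id not in self._frames and len(self._frames) >= self.max_keyframes:
            # min() returns the first minimum, and dicts keep insertion order,
            # so ties are broken in favour of evicting the oldest keyframe.
            victim = min(self._uses, key=self._uses.get)
            del self._frames[victim]
            del self._uses[victim]
        self._frames[frame_id] = frame
        self._uses.setdefault(frame_id, 0)

    def use(self, frame_id: Hashable) -> Any:
        """Return a keyframe and record that it served as a reference."""
        self._uses[frame_id] += 1
        return self._frames[frame_id]

    def ids(self) -> List[Hashable]:
        """Ids of the keyframes currently kept, in insertion order."""
        return list(self._frames)
```

With `max_keyframes=2`, adding keyframes `0` and `30`, using `0` once, and then adding `60` would evict `30`, since it is the least used entry at that point.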
- Clone the repo
git clone https://github.com/LorenzoAgnolucci/Keyframes-GAN.git
- Create a virtual env and install all the dependencies with
pip install -r requirements.txt
- Even if it is not required, we strongly recommend installing dlib with GPU support
- For metrics computation, you need to run
pip install -e pybrisque/
- Download the pretrained models and move them inside the pretrained_models folder
For testing, you need one or more HQ mp4 videos. These videos will be compressed with a given CRF. The face in each frame will be cropped, aligned, and then restored by our model, which exploits the HQ keyframes.
- Move the HQ videos under a directory named {BASE_PATH}/original/
- Run
python preprocessing.py --base_path {BASE_PATH} --crf 42
where crf is the Constant Rate Factor used for compression (default 42)
- Run
python video_inference.py --base_path {BASE_PATH} --crf 42 --max_keyframes 5
where crf must match the value used in the preprocessing step and max_keyframes is the maximum cardinality of the set of keyframes (default 5)
- If needed, run
python compute_metrics.py --gt_path {BASE_PATH}/original --inference_path inference/DMSASFFNet/max_keyframes_5/LFU
where gt_path is the directory that contains the HQ videos and inference_path is the directory that contains the restored frames
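Since preprocessing and inference must share the same CRF, it can be convenient to build all the commands from a single set of parameters. The following is a minimal sketch, assuming it is run from the repo root. The function `build_test_commands` and the path `/data/videoconf` are hypothetical; only the script names and flags come from the steps above.

```python
import shlex
from typing import List


def build_test_commands(
    base_path: str,
    crf: int = 42,
    max_keyframes: int = 5,
    inference_path: str = "inference/DMSASFFNet/max_keyframes_5/LFU",
    with_metrics: bool = False,
) -> List[List[str]]:
    """Build the argv lists for the testing steps (hypothetical helper).

    The same crf value is passed to preprocessing and inference,
    so the two steps cannot get out of sync.
    """
    commands = [
        ["python", "preprocessing.py", "--base_path", base_path, "--crf", str(crf)],
        [
            "python", "video_inference.py",
            "--base_path", base_path,
            "--crf", str(crf),
            "--max_keyframes", str(max_keyframes),
        ],
    ]
    if with_metrics:
        commands.append(
            [
                "python", "compute_metrics.py",
                "--gt_path", f"{base_path}/original",
                "--inference_path", inference_path,
            ]
        )
    return commands


if __name__ == "__main__":
    # "/data/videoconf" is a placeholder for {BASE_PATH}.
    for cmd in build_test_commands("/data/videoconf", crf=42, with_metrics=True):
        print(shlex.join(cmd))
        # To actually execute each step, replace the print with:
        #   subprocess.run(cmd, check=True)   (requires `import subprocess`)
```

The default `inference_path` is the one from the example above. If you change `max_keyframes`, check the directory actually produced by inference before computing metrics, since this sketch does not try to guess it.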
- Modify the file BasicSR/options/train/DMSASFFNet/train_DMSASFFNet.yml to indicate the paths of your training and validation datasets
- Start training by running the following command with BasicSR as the current working directory:
python basicsr/train.py -opt options/train/DMSASFFNet/train_DMSASFFNet.yml
Please refer to BasicSR for more information on the fields of the options file.
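If you prefer to launch training from a script rather than changing directory by hand, the working-directory requirement can be handled explicitly. This is a minimal sketch, not part of the repo: `training_command` and `launch_training` are hypothetical helpers, and `sys.executable` is used in place of `python` so that the active virtual env is picked up.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Tuple

# Options file path, relative to the BasicSR directory (taken from the command above).
OPTIONS = "options/train/DMSASFFNet/train_DMSASFFNet.yml"


def training_command(repo_root: str = ".") -> Tuple[List[str], Path]:
    """Return the training argv and the directory it must be run from."""
    workdir = Path(repo_root) / "BasicSR"
    argv = [sys.executable, "basicsr/train.py", "-opt", OPTIONS]
    return argv, workdir


def launch_training(repo_root: str = ".") -> int:
    """Run training with BasicSR as the current working directory."""
    argv, workdir = training_command(repo_root)
    options_file = workdir / OPTIONS
    if not options_file.is_file():
        # Fail early with a clear message instead of inside the training script.
        raise FileNotFoundError(f"Options file not found: {options_file}")
    return subprocess.run(argv, cwd=workdir, check=True).returncode
```

Calling `launch_training()` from the repo root is then intended to be equivalent to running the command above from inside BasicSR.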
@article{agnolucci2023perceptual,
title={Perceptual quality improvement in videoconferencing using keyframes-based {GAN}},
author={Agnolucci, Lorenzo and Galteri, Leonardo and Bertini, Marco and Del Bimbo, Alberto},
journal={IEEE Transactions on Multimedia},
volume={26},
pages={339--352},
year={2023},
publisher={IEEE}
}
We rely on BasicSR for the implementation of our model and for metrics computation.