Hi, thank you for your great work! I have a question about the FVD evaluation. I intend to build on this work, but I ran into some problems when evaluating FVD. (The other quantitative results are consistent with the paper.)
When I check the configs of the videos generated from the GIFs (in tool/metrics/utils.py, 'DatasetFVDVideoResize'), I find that the video has shape [128, 112, 112, 3], although the GIF has only 16 frames. The log of the ffmpeg call at tool/metrics/utils.py line 358 suggests that it converts the 16-frame GIF into a 128-frame video (which is then split into 8 segments according to the num_seg parameter).
If I set the fps in gen_eval.sh to 25 (so that the video has 16 frames), the FVD-3DRN50 becomes 96.15 (from More TikTok-Style Training Data (FID-FVD: 15.7)).
Even if I leave the fps unchanged (at 3), the FVD-3DRN50 is 20.34, which differs from the paper.
In Training/Validation Data Split #27, the authors say the evaluation uses videos 335-340 and 5 OL videos, but the provided new10val_TiktokDance-poses-masks.yaml outputs 337/338/201/202/203. Would the correct yaml reproduce the paper's FVD results?
Thank you!
Hi @fwbx529, could you please check my responses here? We can reproduce the results reported in the paper.
Btw, the output frame IDs (201, 202, 203) are NOT video IDs from the original TikTok data; they come from the collected online videos. There is no leakage between training and testing data.
Hi, thank you for your great work! I have a question about the FVD evaluation. I intend to build on this work, but I ran into some problems when evaluating FVD. (The other quantitative results are consistent with the paper.)
When I check the configs of the videos generated from the GIFs (in tool/metrics/utils.py, 'DatasetFVDVideoResize'), I find that the video has shape [128, 112, 112, 3], although the GIF has only 16 frames. So I checked the ffmpeg call at tool/metrics/utils.py line 358:
out, _ = (ffmpeg.input(path).output('pipe:', format='rawvideo', pix_fmt='rgb24').run(capture_stdout=True, quiet=False))
It prints a log like the one below, which suggests that it converts the 16-frame GIF into a 128-frame video (which is then split into 8 segments according to the num_seg parameter):
Input #0, gif, from '/root/autodl-tmp/DisCo/run_test/exp/tiktok_ft/outputs//pred_gs1.5_scale-cond1.0-ref1.0_gif/TiktokDance_00337_0010png.gif':
  Duration: 00:00:05.28, start: 0.000000, bitrate: 866 kb/s
    Stream #0:0: Video: gif, bgra, 256x256, 3.03 fps, 24.25 tbr, 100 tbn, 100 tbc
Output #0, rawvideo, to 'pipe:':
  Metadata:
    encoder : Lavf58.29.100
    Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 256x256, q=2-31, 38141 kb/s, 24.25 fps, 24.25 tbn, 24.25 tbc
    Metadata:
      encoder : Lavc58.54.100 rawvideo
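For reference, the numbers in this log seem consistent with simple arithmetic: a 5.28 s clip emitted at the output rate of 24.25 fps gives about 128 frames, and 128 frames split into 8 segments gives 16 frames each. The sketch below only reproduces that arithmetic and shows one way to count frames in a raw rgb24 buffer. The helper names are hypothetical (they are not from tool/metrics/utils.py), and the assumption that ffmpeg fills the output to a constant frame rate is inferred from the log, not confirmed from the code.

```python
# Minimal sketch, assuming the raw output is emitted at a constant rate
# equal to the "fps" shown on the Output stream line of the ffmpeg log.
# Both helper names are hypothetical and do not exist in the repository.

def expected_output_frames(duration_s: float, output_fps: float) -> int:
    """Approximate number of frames for a clip of the given duration."""
    return round(duration_s * output_fps)


def frames_in_raw_rgb24(num_bytes: int, width: int, height: int) -> int:
    """Count frames in a raw rgb24 buffer (3 bytes per pixel)."""
    frame_bytes = width * height * 3
    if num_bytes % frame_bytes != 0:
        raise ValueError("buffer length is not a whole number of frames")
    return num_bytes // frame_bytes


# Values read from the log: Duration 00:00:05.28, output at 24.25 fps.
decoded_frames = expected_output_frames(5.28, 24.25)

# With 8 segments (the value quoted for num_seg), frames per segment:
frames_per_segment = decoded_frames // 8

# Checking a decoded buffer of 256x256 frames by its byte length.
# The buffer length here is synthetic, standing in for len(out).
synthetic_len = decoded_frames * 256 * 256 * 3
counted_frames = frames_in_raw_rgb24(synthetic_len, 256, 256)

print(decoded_frames, frames_per_segment, counted_frames)
```

In practice, len(out) from the ffmpeg call could be passed to the byte-count helper to check how many frames were actually decoded for a given GIF.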
If I set the fps in gen_eval.sh to 25 (so that the video has 16 frames), the FVD-3DRN50 becomes 96.15 (from More TikTok-Style Training Data (FID-FVD: 15.7)).
Even if I leave the fps unchanged (at 3), the FVD-3DRN50 is 20.34, which differs from the paper.
So I have 3 questions on this evaluation:
Thank you!