About the reproduction of VCR experiment results #8
Comments
Thanks for your interest in the code. On the leaderboard (https://visualcommonsense.com/leaderboard/), Entry #18 appears to be a third party who ran this code to reproduce the results. Since you only have 1 GPU, please change the hyper-parameter "gradient_accumulation_steps" from 5 to 20 and try again. Let me know how it goes; hopefully the performance will catch up.
Thanks for your reply! I'll modify the configuration file, rerun the experiment, and report back with the results.
Hi, this strategy worked. After I adjusted the hyper-parameter, I got the following results when training reached 90%.
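The suggested change from 5 to 20 is consistent with keeping the effective batch size constant: the config file name (train-vcr-base-4gpu-adv.json) suggests it was tuned for 4 GPUs, and 4 GPUs x 5 accumulation steps equals 1 GPU x 20 accumulation steps. Below is a minimal sketch of that arithmetic. It assumes the per-GPU batch size is left unchanged, and `rescale_grad_accum` is a hypothetical helper written for illustration, not part of the repository.

```python
def rescale_grad_accum(config: dict, tuned_gpus: int, available_gpus: int) -> dict:
    """Return a copy of `config` with gradient_accumulation_steps rescaled so that
    (number of GPUs) x (accumulation steps) stays the same.

    Hypothetical helper: assumes the per-GPU batch size is not changed.
    """
    # Total micro-batches accumulated per optimizer step in the original setup.
    total = config["gradient_accumulation_steps"] * tuned_gpus
    if total % available_gpus != 0:
        raise ValueError(
            f"{total} micro-batches cannot be split evenly across {available_gpus} GPU(s)"
        )
    new_config = dict(config)  # shallow copy; the input dict is left untouched
    new_config["gradient_accumulation_steps"] = total // available_gpus
    return new_config


# Only the key named in this thread is shown; a real config has more fields.
four_gpu_config = {"gradient_accumulation_steps": 5}
one_gpu_config = rescale_grad_accum(four_gpu_config, tuned_gpus=4, available_gpus=1)
print(one_gpu_config["gradient_accumulation_steps"])
```

With the values from this thread (4 GPUs, 5 steps, 1 GPU available) the helper yields 20, matching the suggestion above. The trade-off is wall-clock time: each optimizer step now processes four times as many micro-batches on a single device.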
Hi,
Thanks for your great work!
When I use the following command to train a model, it doesn't seem to reach the results reported in the paper.
horovodrun -np 1 python train_vcr_adv.py --config config/train-vcr-base-4gpu-adv.json \
    --output_dir vcr/output_base
Using only one GPU, I got these results:
100%|##########| 8000/8000 [4:58:12<00:00, 1.98s/it]
[1,0]<stderr>:09/10/2021 08:48:59 - INFO - __main__ - ============Step 8000=============
[1,0]<stderr>:09/10/2021 08:48:59 - INFO - __main__ - 1280000 examples trained at 71 ex/s
[1,0]<stderr>:09/10/2021 08:48:59 - INFO - __main__ - ===========================================
[1,0]<stderr>:
[1,0]<stderr>:09/10/2021 08:48:59 - INFO - __main__ - start running validation...
[1,0]<stderr>:
[1,0]<stderr>:09/10/2021 08:54:06 - INFO - __main__ - validation finished in 307 seconds, score_qa: 72.28 score_qar: 75.06 score: 54.35
I'm confused because this result is a few percentage points off from the one reported in the paper.
What should I do? Thanks in advance!